Tony’s Take, November 2025

This month’s topics:

The five topological ages of man.

A new study of 4,216 subjects aged zero to 90 identifies five distinct epochs of brain development, using topological criteria. These epochs are punctuated by turning points at ages 9, 32, 66 and 83. “Topological turning points across the human lifespan,” published in Nature Communications, was picked up by the BBC, NBC News, the Washington Post, and (most recently) the Wall Street Journal.

Following research going back to the 19th century, the study models the human brain as a network of nodes (anatomically or functionally distinct regions of the brain). The authors of the Nature Communications article—Alexa Mousley, Richard Bethlehem and Duncan Astle of the University of Cambridge and Fang-Cheng Yeh of the University of Pittsburgh—use magnetic resonance diffusion imaging scans to infer which nodes are linked. Using data from the Human Connectome Project and other previous research, they recorded thirteen different measurements meant to capture the topological structure of each subject’s brain. Some examples:

  • The global efficiency measures how easy it is, on average, to travel between nodes in the network. The calculation is elementary: for each node we tally an efficiency (E) that counts $1$ for each other node one link away, $\frac{1}{2}$ for each one two links away, and in general $\frac{1}{d}$ for each one $d$ links away, so that the more close neighbors a node has, the higher its total; for a node in an $n$-node network, the largest possible E-value is $n-1$. The global efficiency is the average of E, taken over all the nodes in the network.

    a. Graph of the vertices of a tetrahedron. b. Graph with four nodes linked in a line.
    In the 4-node network a, each node has $E=3$ and the global efficiency is the maximum possible, $3$. In b, the blue nodes have $E=1+1+\frac{1}{2}=2\frac{1}{2}$ whereas for the others, $E=1+\frac{1}{2}+\frac{1}{3}=1\frac{5}{6}$; the global efficiency is the
    average, $\frac{13}{6}$. Image credit: Tony Phillips.

  • The clustering coefficient measures how likely it is for two links that share a node to be part of a triangle. If a node has $d$ incident edges, let’s call its triangle likelihood number (TLN) the number of triangles it is in, divided by the number of triangles it could possibly be in, which is $\frac{d(d-1)}{2}$ (the number of pairs of incident edges).  The clustering coefficient of the network is the average of all its TLNs.

    The blue node has 4 incident edges and 2 incident triangles, so its TLN is $\frac{1}{3}$; similarly the two red nodes have TLN$=0$ and the others have TLN=$\frac{1}{3}$, giving a clustering coefficient for this network of $\frac{5}{21}$.
    Image credit: Tony Phillips.
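Both measures come down to short computations on an adjacency structure. Here is a minimal Python sketch; the efficiency examples are the tetrahedron and four-in-a-line graphs from the first figure, while the clustering example uses a hypothetical triangle-with-a-pendant graph, not the seven-node network pictured above:

```python
from itertools import combinations

def make_adj(nodes, edges):
    """Adjacency sets for an undirected graph."""
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def shortest_path_lengths(adj, source):
    """Breadth-first search: link-count distance from source to every reachable node."""
    dist = {source: 0}
    frontier = [source]
    while frontier:
        nxt = []
        for u in frontier:
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    nxt.append(w)
        frontier = nxt
    return dist

def global_efficiency(nodes, edges):
    """Average over nodes of E = sum of 1/(distance to each other node)."""
    adj = make_adj(nodes, edges)
    es = []
    for source in adj:
        dist = shortest_path_lengths(adj, source)
        es.append(sum(1 / d for v, d in dist.items() if v != source))
    return sum(es) / len(es)

def clustering_coefficient(nodes, edges):
    """Average over nodes of (# triangles at the node) / (d(d-1)/2)."""
    adj = make_adj(nodes, edges)
    tlns = []
    for v in adj:
        d = len(adj[v])
        if d < 2:
            tlns.append(0.0)  # no pair of incident edges, so TLN = 0
            continue
        triangles = sum(1 for u, w in combinations(adj[v], 2) if w in adj[u])
        tlns.append(triangles / (d * (d - 1) / 2))
    return sum(tlns) / len(tlns)

# (a) tetrahedron: every pair of 4 nodes linked
tetra = list(combinations(range(4), 2))
# (b) four nodes linked in a line
line = [(0, 1), (1, 2), (2, 3)]
print(global_efficiency(range(4), tetra))  # 3.0, the maximum possible
print(global_efficiency(range(4), line))   # 2.1666... = 13/6

# hypothetical example: triangle 0-1-2 with a pendant node 3 attached to 0
pendant = [(0, 1), (1, 2), (2, 0), (0, 3)]
print(clustering_coefficient(range(4), pendant))  # (1/3 + 1 + 1 + 0)/4 = 7/12
```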

The authors used the dimensionality-reduction algorithm UMAP (Uniform Manifold Approximation and Projection) to consolidate the data into a curve, and PCA (Principal Component Analysis, as detailed below) to find a 3-dimensional projection of the data that preserved 76.6% of the variance.
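The PCA step itself is elementary linear algebra: center the data, then read off the directions of largest variance from a singular value decomposition. A minimal sketch on synthetic stand-in data (the 500 × 13 matrix below is hypothetical, built to mimic thirteen measurements dominated by a few directions of variation; it is not the study’s data):

```python
import numpy as np

rng = np.random.default_rng(42)
# hypothetical stand-in: 500 "subjects" x 13 topological measures,
# generated so that three latent directions dominate the variation
data = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 13))
data += 0.3 * rng.normal(size=(500, 13))  # small measurement noise

centered = data - data.mean(axis=0)              # PCA step 1: center the columns
u, s, vt = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / np.sum(s**2)                  # fraction of variance per component

projection = centered @ vt[:3].T                 # 3-dimensional principal-component scores
print(projection.shape)                          # (500, 3)
print(explained[:3].sum())                       # variance preserved by 3 components
```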

PCA loadings. How the various topological measures contribute to the three principal components, PC1, PC2 and PC3. Global efficiency contributes almost entirely to PC1, and the clustering coefficient to PC2. Image from Fig. 6, Nature Communications 16 Article number 10055, used under Creative Commons Attribution 4.0 International License.

Applying that projection to the UMAP curve gives us a graphic representation of the evolution of brain topology during the human lifetime. This curve turns out to have four “turning points” where the evolution changes direction.

The non-monotonicity of brain topology evolution during human lifetime. The projection of the UMAP curve into principal component space, tracing that evolution from age 0 to 90, has four significant turning points: at ages 9, 32, 66 and 83. Image from Fig. 6, Nature Communications 16 Article number 10055, used under Creative Commons Attribution 4.0 International License.

The authors interpret these turning points as defining five “lifetime epochs.” These are infancy into childhood, adolescence, adulthood, early aging and late aging. As they explain, these topologically derived epochs correlate closely with stages of anatomical and behavioral development. For instance, during the infancy-into-childhood epoch, the global efficiency of the brain’s network decreases. This can be matched with the anatomical “competitive elimination of synapses.” Another example: during adulthood, the brain experiences a “period of network stability,” which corresponds behaviorally to a “plateau in intelligence and personality.”

The study’s definition of adolescence as lasting from ages 9 to 32 is unusual, a point that media coverage emphasized. (It does match the ancient Roman concept of adulescens.) The article treats the question gingerly, noting that all of the subjects were from the United Kingdom and the United States. What was not reported in the media was the distinction between the sexes: in their Supplementary Information Figure 5 (“Sex-stratified turning points”), the authors show the turning points framing adolescence as 11 and 37 for males, but 9 and 33 for females.

Beyond networks: topology and information theory.

The research in the previous item, like many applications of topology to the life sciences, abstracts the living system to a network and analyzes it using graph theory. Thomas F. Varley, Alice Patania and Josh Bongard (University of Vermont) and Pedro Mediano (Imperial College London) argue that this isn’t always appropriate. In “The Topology of Synergy” (PLOS Computational Biology, November 13) they remark that the basic building block of a network—a link between two nodes—may not adequately represent real-world systems. Even the simple exclusive-OR (XOR) logical gate, which outputs 1 if there is an odd number of 1s among the inputs, and 0 otherwise, is fundamentally non-dyadic.

exclusive-OR gate
The XOR gate gives an example of a non-dyadic relation. In this case, the output depends on all three inputs, and there is no correlation between any two of them. Image credit: Tony Phillips.
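A quick sanity check of this claim: for a three-input parity gate with independent, equally likely inputs, every pair of variables (the three inputs and the output) has zero correlation, even though the output is completely determined by the inputs taken together.

```python
from itertools import product

# truth table of the 3-input XOR (parity) gate, all 8 input patterns equally likely
rows = [(a, b, c, a ^ b ^ c) for a, b, c in product([0, 1], repeat=3)]

def correlation(i, j):
    """Pearson correlation of columns i and j over the equally likely rows."""
    xs = [r[i] for r in rows]
    ys = [r[j] for r in rows]
    n = len(rows)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

# every pair of the four variables is uncorrelated...
print([round(correlation(i, j), 6) for i in range(4) for j in range(i + 1, 4)])
# → [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
# ...yet knowing all three inputs fixes the output exactly
```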

The XOR gate is an example of what the authors call a “higher-order relationship.” They discuss two structural alternatives to networks, which they believe can better capture higher-order relationships. These are topological data analysis and multivariate information theory.

Topological data analysis (a systematic introduction is here) treats a data set as a cloud of points in some (usually high-dimensional) Euclidean space, so there is a well-defined notion of which points are close to each other and which are far away, and concepts from topology can be used to analyze the structure of this cloud. A topological tool often used is persistent homology, which was recently described in this column.
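Full persistent homology needs specialized software (Ripser and GUDHI are standard choices), but the 0-dimensional part of the idea, tracking how the connected components of the point cloud merge as the distance scale grows, fits in a few lines of Python. The point cloud below is a hypothetical example with two well-separated clusters:

```python
from math import dist

def components_at_scale(points, eps):
    """Number of connected components when points within distance eps are
    linked -- the 0-dimensional picture that persistent homology tracks
    across all scales at once."""
    parent = list(range(len(points)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= eps:
                parent[find(i)] = find(j)  # merge the two components
    return len({find(i) for i in range(len(points))})

# hypothetical cloud: two clusters of three points each in the plane
cloud = [(0, 0), (0.1, 0.1), (0.2, 0), (5, 5), (5.1, 5.2), (5.2, 5)]
# at small scale every point is isolated; then the clusters form; then all merges
print([components_at_scale(cloud, e) for e in (0.05, 0.5, 10)])  # [6, 2, 1]
```

The fact that the “2 components” picture survives over a wide range of scales, while the “6 components” picture disappears almost immediately, is exactly the persistence that distinguishes real structure from noise.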

One-dimensional information theory was also discussed on this site in the context of the mathematics of communication. The question here is, roughly speaking, if you know $n$ things $T_1, \dots, T_n$ about some phenomenon, and you are told an $(n+1)$st thing $T_{n+1}$, how much have you learned? If $T_{n+1}$ is identical to one of your $T_i$, or a logical consequence of some combination of them, you have learned nothing: the new information is completely redundant. But if $T_{n+1}$ is somewhere between known and completely unsuspected, information theory gives a way of measuring how “much” new information it contains.

The mathematical definition of information is narrower than the usual understanding of the term. You start with a probability distribution $\mathbf{p}$, which generates values $a_1, \dots, a_n$ each with a certain probability $p_i$ (so $\sum_i p_i =1$). If any of the $p_j=1$ (so all the other $p_i=0$) then $\mathbf{p}$ is concentrated entirely on $a_j$. In that case we know the answer before measuring, and gain no information. On the other hand, the maximum uncertainty we can have about the measurement is when all the $p_i$ are equal (and so equal to $\frac{1}{n}$). To interpolate between these two extremes, in 1948 Claude Shannon defined information as$$H(\mathbf{p})= -\sum_{i=1}^n p_i \log_2(p_i),$$where we take as usual $0\log_2(0) = 0$. The information is measured in bits. When all the $p_i$ are equal, then$$H(\mathbf{p}) = -(\frac{1}{n}\log_2(\frac{1}{n}) + \cdots + \frac{1}{n}\log_2(\frac{1}{n})) = \log_2(n) \text{ bits.}$$
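In code, with the convention that zero-probability outcomes contribute nothing to the sum:

```python
from math import log2

def shannon_information(p):
    """H(p) = -sum_i p_i log2(p_i), in bits; terms with p_i = 0 contribute 0."""
    return sum(-pi * log2(pi) for pi in p if pi > 0)

print(shannon_information([1, 0, 0, 0]))      # 0.0: the outcome is certain
print(shannon_information([0.25] * 4))        # 2.0 = log2(4): maximum uncertainty
print(shannon_information([0.5, 0.25, 0.25])) # 1.5: in between
```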

Suppose we have two probability distributions on the same $n$ possible values, $\mathbf{p}$ with probabilities $p_1, \dots, p_n$ and $\mathbf{q}$ with $q_1, \dots, q_n$. The quantity$$H(\mathbf{q} \mid \mathbf{p}) = \sum_{i=1}^n q_i \log_2\left(\frac{q_i}{p_i}\right),$$measures how much additional information comes from $\mathbf{q}$ if we already know $\mathbf{p}$. This is called the relative information (in the literature, the Kullback–Leibler divergence); it is always non-negative.
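A minimal implementation of the relative information, in its standard non-negative convention (which is zero exactly when the two distributions agree):

```python
from math import log2

def relative_information(q, p):
    """Kullback-Leibler divergence D(q || p) = sum_i q_i log2(q_i / p_i), in bits.
    Assumes p_i > 0 wherever q_i > 0; terms with q_i = 0 contribute 0."""
    return sum(qi * log2(qi / pi) for qi, pi in zip(q, p) if qi > 0)

uniform = [0.25] * 4
skewed = [0.5, 0.25, 0.125, 0.125]
print(relative_information(uniform, uniform))  # 0.0: same distribution, nothing new
print(relative_information(skewed, uniform))   # 0.25 bits of additional information
```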

Note that if the two distributions are the same, i.e. $q_i=p_i$, then the relative information is 0. This is maximum redundancy, when $\mathbf{q}$ gives no additional information. For a system of more than two distributions $\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3, \dots$ the authors use a related measure $\mathcal{O}(\mathbf{p}_1, \dots, \mathbf{p}_n)$ to quantify the redundancy in the system: a larger positive $\mathcal{O}$-score means the system is more redundant, while a more negative $\mathcal{O}$-score means the system is richer in information. This richness is the synergy referred to in their title.
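For small discrete systems the $\mathcal{O}$-score can be computed directly; one common form of the O-information is Ω = (total correlation) − (dual total correlation). A sketch of that form, using the XOR triple as the synergy example and three copies of one bit as the redundancy example (this illustrates the general measure, not the authors’ estimator for continuous data):

```python
from itertools import product
from math import log2

def H(joint, keep):
    """Entropy (bits) of the marginal of `joint` over the coordinates in `keep`."""
    marginal = {}
    for state, prob in joint.items():
        key = tuple(state[i] for i in keep)
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(p * log2(p) for p in marginal.values() if p > 0)

def o_information(joint, n):
    """O-information = total correlation - dual total correlation.
    Positive: redundancy-dominated; negative: synergy-dominated."""
    full = list(range(n))
    tc = sum(H(joint, [i]) for i in full) - H(joint, full)
    dtc = H(joint, full) - sum(
        H(joint, full) - H(joint, [j for j in full if j != i]) for i in full
    )
    return tc - dtc

# XOR triple (A, B, A xor B), with A and B independent fair bits: pure synergy
xor_joint = {(a, b, a ^ b): 0.25 for a, b in product([0, 1], repeat=2)}
print(o_information(xor_joint, 3))   # -1.0

# three copies of one fair bit: pure redundancy
copy_joint = {(a, a, a): 0.5 for a in [0, 1]}
print(o_information(copy_joint, 3))  # 1.0
```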

The authors point out an intriguing possible relation between information and topological complexity, which they illustrate by using clouds of points in 3-dimensional $(X_1, X_2, X_3)$-space to stand for geometrical figures. This allows them to reframe a geometrical image as a set of data points being sampled from three different probability distributions: the $X_1$-coordinates of the points are sampled from distribution $\mathbf{p}_1$, the $X_2$-coordinates from distribution $\mathbf{p}_2$, and the $X_3$-coordinates from $\mathbf{p}_3$. They can then apply information theory and examine $\mathcal{O}(\mathbf{p}_1,\mathbf{p}_2, \mathbf{p}_3)$.

They give examples showing a correlation of topology and information theory. First, 10,000 points are randomly sampled from the surface of a sphere. The $\mathcal{O}$-score shows significant synergy, but when points are sampled from the solid ball, the $\mathcal{O}$-score is almost zero. The authors argue that the topology of the sphere explains the higher-order information.

Left: Hollow sphere with O=-1.368 nat. Right: Solid ball with O=-0.04 nat.
Analysis of a spherical surface (left) and a solid ball (right) of the same radius. The authors credit the non-trivial topology (the “empty cavity”) of the surface for its higher synergy. Here and below, “nat” means that the $\mathcal{O}$ scores were calculated using the more convenient natural logarithms. Image credit: adapted from The topology of synergy, Fig. 4, in the open access journal PLOS Computational Biology. Used under CC by 4.0 license.

The authors repeated the experiment with a 2-dimensional toroidal surface and with the 3-dimensional solid torus that the surface bounds. The surface has significant synergy while the solid is “not significantly different from the ball.” This last observation is disappointing, because it suggests that the information-theoretical measurement doesn’t register the solid torus’ non-trivial topology (it is homotopy equivalent to a circle, whereas the solid ball is homotopy equivalent to a point).

Left: Hollow toroidal surface with O = -1.079 nat. Right: Solid torus with O = -0.057 nat.
Analysis of a toroidal surface (left) and a solid torus (right) of the same dimensions. As before, the surface with its empty cavity has higher synergy. But this application of information theory does not seem to pick up the 1-dimensional topology of the solid. Image credit: adapted from The topology of synergy, Fig. 4, in the open access journal PLOS Computational Biology. Used under CC by 4.0 license.

The authors also investigate the $\mathcal{O}$-score for data collected during fMRI studies of the brain. They observe a correlation between “the number of voids” and measurements of synergy, which suggests that “in a fundamental way, topological data analysis and synergistic information theory are looking at the same kind of underlying structure.” This is intuitively appealing, but as they say, the underlying mathematics is “as-yet undiscovered.”

—Tony Phillips, Stony Brook University