Six Impossible Things: Issue 1

January 17, 2024. In the first issue of 6IT: the mathematics of supervising children; unreliably clairvoyant demons; why the sky is blue (and why it isn’t); a Poisson process for primes; synchronizing fireflies; and quantum error correction from Penrose tilings.

“There's no use trying,” she said. “One can't believe impossible things.” “I daresay you haven't had much practice,” said the Queen. “When I was your age, I always did it for half-an-hour a day. Why, sometimes I've believed as many as six impossible things before breakfast.”
— Lewis Carroll, Through the Looking Glass

Inspired by one of my heroes, John Baez, I’m planning to blog about “finds” in the hard sciences on a semi-regular basis. These may be papers, favourite theorems, new discoveries, or shower thoughts; I’m hoping to be brief and nerdily enthusiastic rather than thorough and technically deep. Either way, I hope it’s fun to read. Let’s see how it goes!

Contents

  1. The obstructed supervision theorem
  2. Zermelo’s demon and the axiom of choice
  3. Why the sky is blue
  4. A game of primes
  5. The firefly effect
  6. Quantum error correction from Penrose tilings

1. The obstructed supervision theorem

Suppose you have to guard an art gallery with $n$ straight walls. How many guards do you need to ensure the artworks are visible at all times? There is a beautiful proof, due to Steve Fisk and presented in Proofs from THE BOOK (a collection of demonstrations of which God themself might approve; whether an omniscient being needs proofs is another question!), that $\lfloor n/3\rfloor$ guards are sufficient, and examples show this number is in general necessary. We’ll start with the examples. Take $m$ triangles and connect them with a corridor, forming a gallery of $n=3m$ walls. For $m = 3$, this gives:

Each triangle requires a guard to see the recessed tip, but one of these guards can survey the whole corridor from the bottom left. Thus, $m = n/3$ guards are needed. For $n$ not a multiple of $3$, we can simply remove one or two walls from the above construction. Now let’s turn to Fisk’s argument for sufficiency.

Art Gallery Theorem. For a polygonal gallery of $n$ walls, $\lfloor n/3 \rfloor$ guards are sufficient.
Proof.$\,$ (In short: triangulate, 3-color, and place guards at one color.)

Consider the graph formed by the $n$ walls of the gallery, with walls as edges and the points they meet as vertices. Triangulate this graph, i.e. introduce $n - 3$ new straight edges (diagonals) that split the interior into $n - 2$ triangles. We will show that we can color the vertices of this graph with three colors so that no two adjacent vertices share a color.

Note that, since there are no internal vertices, we can assign $0$ and $1$ to the triangular faces so that faces sharing an edge get different bits. If we could not alternate bits this way, it would mean an odd number of triangular faces meet around a point, which is only possible with internal vertices. We then color the vertices of each face assigned $0$ with three colors (G, R, B) in clockwise order, and the vertices of each face assigned $1$ in counterclockwise order, with each face's coloring determined by its already-colored neighbour, and so on. The coloring clearly exists and is unique up to the choice of triangulation and ordering on the first face.
Running around the polygon, we have $n$ vertices of three colors, the rarest of which is assigned to at most $\lfloor n/3 \rfloor$ vertices. If it was assigned to more, there would be more than $n$ vertices!
We choose this rare color to be the location of our guards. Since each triangle has a vertex of this color, each triangle has a guard watching it, and we are done. The required triangulation doesn't obviously exist for non-convex polygons. To show it does, first we need the existence of a vertex with convex interior angle (less than $\pi$), like the leftmost corner above. Without such a vertex, the polygon could never close! So, let's draw this vertex:
We can either triangulate by drawing an edge between the two adjacent vertices, or if a vertex pokes inside that triangle, to the offending poker. Then we split off the triangle and use induction on the rest!
$\blacksquare$
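To make the argument concrete, here is a minimal Python sketch of Fisk's recipe: ear-clip a simple polygon (given as a counterclockwise list of vertices, no holes), propagate a 3-coloring through the triangulation, and station guards at the rarest color. The helper names and the example gallery are my own, not from the theorem.

```python
# A minimal sketch of Fisk's argument, assuming a simple polygon given as a
# counterclockwise list of vertices with no holes. Ear clipping triangulates,
# colors propagate across shared edges, and guards sit at the rarest color.
from math import floor

def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def point_in_triangle(p, a, b, c):
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    return (d1 >= 0 and d2 >= 0 and d3 >= 0) or (d1 <= 0 and d2 <= 0 and d3 <= 0)

def triangulate(poly):
    """Ear-clipping triangulation; returns triangles as index triples."""
    idx = list(range(len(poly)))
    tris = []
    while len(idx) > 3:
        for k in range(len(idx)):
            i, j, l = idx[k - 1], idx[k], idx[(k + 1) % len(idx)]
            a, b, c = poly[i], poly[j], poly[l]
            if cross(a, b, c) <= 0:          # reflex vertex, not an ear
                continue
            if any(point_in_triangle(poly[m], a, b, c)
                   for m in idx if m not in (i, j, l)):
                continue                      # another vertex pokes inside
            tris.append((i, j, l))
            idx.pop(k)
            break
    tris.append(tuple(idx))
    return tris

def fisk_guards(poly):
    """3-color the triangulation and return guard positions (rarest color)."""
    tris = triangulate(poly)
    color = {v: c for v, c in zip(tris[0], (0, 1, 2))}   # seed the first triangle
    changed = True
    while changed:
        changed = False
        for t in tris:
            known = [v for v in t if v in color]
            if len(known) == 2 and len(set(t)) == 3:
                missing = next(v for v in t if v not in color)
                color[missing] = 3 - sum(color[v] for v in known)
                changed = True
    rarest = min(range(3), key=lambda c: sum(1 for v in color if color[v] == c))
    return [poly[v] for v in color if color[v] == rarest]

# An 8-wall gallery with a notch: floor(8/3) = 2 guards suffice.
gallery = [(0, 0), (6, 0), (6, 4), (4, 4), (4, 2), (2, 2), (2, 4), (0, 4)]
guards = fisk_guards(gallery)
print(len(guards), "<=", floor(len(gallery) / 3), ":", guards)
```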


I became interested in the art gallery theorem because of a real-world problem: supervision in out-of-school-hours (OSH) care. Instead of guards watching over paintings, there are educators watching over children, which, in contrast to paintings, are mobile and unpredictable. The requirement of full visibility makes more sense! My sister runs an OSH facility with an almost comically elaborate layout. The standard art gallery theorem does not apply due to the presence of internal polygons. Let us call these internal walls or polygons holes. With a little thought, one can extend the art gallery theorem to the following:

Obstructed Supervision Theorem. For a polygonal layout of $n$ walls and $g$ holes, a total of $$ \left\lfloor \frac{n+2g}{3}\right\rfloor $$ educators are sufficient to supervise the whole space.
Proof.$\,$ (In short: connect a hole to the outside with two walls.)

For each hole, draw an edge from one of its vertices to a vertex on the outer polygon.

If each such edge is viewed as two walls, we now have a (non-convex) polygon of $n + 2g$ walls, and we can apply the original art gallery theorem. $\blacksquare$


This was first proved by O’Rourke (see Chapter 5 of his book). Examples of necessity are much harder to come by. We can modify our earlier examples by adding internal walls, which we view as degenerate polygons of $2$ walls. For instance, the following layout, with $n = 7$ and $g = 1$, requires $(n + 2g)/3 = 3$ educators:

Adding more slits and recesses, we can extend this to higher values of $n$ and $g$. But these degenerate polygons seem like cheating (after all, they can only hide a child of measure zero from a well-positioned supervisor), and if we require genuine polygons, it becomes much harder to saturate the bound; for general $n$ and $g$, the problem is unsolved. There is a full characterization of such polygons for a single hole (due to Shermer) that is very fiddly, and gives rise to constructions like the following ($n=40$ walls and $14$ guards):

This would not look out of place in a basilica or medieval fortress. It is remarkable that a few walls make the task so much harder; then again, perhaps we shouldn’t be surprised that trying to manage groups of children gives rise to unsolvable problems.


2. Zermelo’s demon and the axiom of choice

A recent paper by Baumeler, Dakić, and Del Santo introduced me to a curious consequence of the axiom of choice (AC). Recall that the AC states that, for any collection of nonempty sets $\{S_i: i\in I\}$, we can choose a set of representatives $\{s_i: i\in I\}$ where each $s_i \in S_i$, i.e. one element from each set. Though the AC is very intuitive—we can choose elements from nonempty sets—it has very counterintuitive consequences, such as the Banach-Tarski paradox.

This splits a perfect sphere into five wildly complicated pieces, and reassembles them to form two spheres of the same size! In his autobiography, Richard Feynman refutes Banach-Tarski (or as he mockingly calls it, “So-and-so’s theorem of immeasurable measure”) on the grounds that no physical sphere, composed of atoms, can be split up this way. The atoms prevent the arbitrarily fine divisions needed to make the pieces non-measurable! One could view this as an argument against the AC, but I think the point is subtler: since we cannot carry out the Banach-Tarski “experiment” in reality, we cannot use physics to reason about the axioms of set theory. The AC could be true, it could be false; physics probably has nothing to do with it.

Baumeler, Dakić, and Del Santo disagree. They argue that the AC may have physical implications, using a bizarre corollary discovered by Gabay and O’Connor in 2004:

Infinite Hat Theorem. Suppose a countable number of prisoners are placed on the number line, one at each natural number, and independently assigned a blue or red hat with equal probability. We suppose they face forward, and can see all hats at higher values of $n$.
If the prisoners meet beforehand, can use the AC, and know their position on the number line, they can independently guess the color of their hats so that only a finite number are wrong.
Proof.$\,$ (Prisoners see eventually identical sequences, and guess backwards with AC.)

Consider the set $S = \{\text{R}, \text{B}\}^{\mathbb{N}}$ of infinite sequences of red and blue hats. Two sequences $s, s' \in S$ are eventually equal if they have the form $$ s = s_0 t, \quad s' = s'_0 t $$ for some finite sequences $s_0, s_0'$ of equal length, and $t \in S$. We let $s(n)$ denote the $n$th element of sequence $s$. It's easy to see that being eventually equal is an equivalence relation, with equivalence classes $S_i$ for some indexing set $i \in I$. By the AC, the prisoners can select a representative $s_i$ from each class.

Once placed on the number line and assigned hats, any two prisoners will see sequences that are eventually equal, with $t$ the sequence seen by the prisoner further along (see above). All prisoners can thus determine the equivalence class $S_i$ of the actual sequence of hats, choose the corresponding representative $s_i$, and output the guess $s_i(n)$ for their position $n$. Since the actual sequence and the guess $s_i$ are eventually equal, at most finitely many guesses are wrong. $\blacksquare$
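The full strategy needs the axiom of choice, but we can at least sanity-check the mechanics on a sub-case where representatives are available explicitly: hat sequences that are eventually all red, whose class representative can simply be taken to be the all-red sequence. A toy Python sketch (the prefix length and truncation are arbitrary assumptions):

```python
# A toy sketch of the Gabay-O'Connor strategy, restricted to hat sequences
# that are eventually all red. For this sub-case the representative of the
# equivalence class can be chosen without AC: the all-red sequence works.
import random

PREFIX = 50                       # assumed: hats are arbitrary up to here, red after
hats = [random.choice("RB") for _ in range(PREFIX)] + ["R"] * 1000

# Prisoner n sees hats[n+1:], recognizes the class "eventually all red",
# and guesses the representative's value at their own position, namely "R".
guesses = ["R"] * len(hats)

wrong = [n for n in range(len(hats)) if guesses[n] != hats[n]]
print(f"{len(wrong)} wrong guesses, all at positions < {PREFIX}: {wrong}")
```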


There are already a number of Feynman-like objections we could make here; for instance, the prisoners need to have an infinitely large memory to store the representatives $s_i$. But even if we grant them these powers, what can they do with them? Infinite memory is reminiscent of Maxwell’s demon and relatives, so let us replace the prisoners with Zermelo’s demon, named after Ernst Zermelo, originator of the AC. Prior to his work on set theory, Zermelo invented a demon to sit around and wait for Poincaré recurrences. These recurrences reset a system to its initial state, thereby spontaneously reducing entropy and violating the second law of thermodynamics. We suppose Zermelo’s demon got bored—a recurrence takes much longer than the age of the universe—and decided, like Zermelo, to turn its efforts to mathematics.

The demon uses AC and has a set of lucky coins; any time it flips one, it learns the infinitely long history of prior coin outcomes (analogous to prisoners looking backwards) and the number of flips until the coin disintegrates (analogous to a prisoner learning their number). Like the prisoners, the demon can determine the equivalence class $S_i$ of the history of flips, and choose the representative $s_i$ with the AC. But since any sequence of remaining flips gives rise to the same $s_i$, it’s clear that the demon learns nothing about them!

The fact that Zermelo’s demon cannot predict future coin flips makes the AC less physically disturbing. But there is another variant due to Hardin and Taylor which shows the demon can look into the future. I follow this nice exposition by Eric Moorhouse.

The Hardin-Taylor Theorem. Given an arbitrary function $f:\mathbb{R} \to \mathbb{R}$, Zermelo's demon can use the behaviour of $f$ on $(-\infty, t)$ to correctly predict the behaviour on $(-\infty, t')$ for some $t' > t$, except for $t$ in a countable set.
Proof.$\,$ (The demon uses the prisoner's trick with functions rather than sequences.)

To prove this, the demon first constructs a well-ordering $(\prec, \mathbb{R}^{\mathbb{R}})$ on the set of real functions using the AC. (A well-ordering is a linear order $(\prec, A)$ such that every nonempty subset of $A$ has a least element; we use the AC to pick the least element, then its successor, and so on.) Consider the equivalence relation on functions $f: \mathbb{R}\to \mathbb{R}$ given by $$ f \sim_t g \quad \Longleftrightarrow \quad f(x) = g(x) \text{ for all } x < t. $$ The well-ordering allows the demon to define the representative $ \hat{f}_t = \min_{\prec} \{g : f \sim_t g\}. $ The demon then guesses that $\hat{f}_t$ is the extension of $f$.

This extrapolation will fail on some set of “unpredictable” times, $$ U = \{t \in \mathbb{R} : \hat{f}_t \neq \hat{f}_{t'} \text{ for all } t' > t\}. $$ These are times $t$ for which the guess $\hat{f}_t$ fails to agree with $f$ on $(-\infty, t')$ for every $t' > t$, or equivalently, differs from every $\hat{f}_{t'}$. Our goal now is to show the set $U$ is countable. Observe that $U$ cannot contain an infinite decreasing sequence of times, or equivalently a decreasing sequence of functions, $$t_1> t_2 > \cdots \quad \Longleftrightarrow \quad \hat{f}_{t_1} \succ \hat{f}_{t_2} \succ \cdots, $$ since $\hat{f}_{t_{i}} \sim_{t_{i+1}} f$, but $\hat{f}_{t_{i+1}}$ is defined as the minimal function satisfying this, so we must have $ \hat{f}_{t_{i}} \succ \hat{f}_{t_{i+1}} $. Such an infinite descending chain is impossible, since $\prec$ is a well-ordering: the set of functions $\{\hat{f}_{t_i}\}$ would have no $\prec$-least element. Since $U$ has no infinite decreasing sequence of times, for any $t \in U$, the set $U_t = U \cap (t, \infty)$ either has a least element $\min U_t$, or is empty, in which case $t = \max U$.
In the first case, we can find a rational number $\iota(t)$ which is between $t$ and $\min U_t$; in the second, we just choose some rational $\iota(t) > t$. The map $\iota: U \to \mathbb{Q}$ is an injection, and since $\mathbb{Q}$ is countable, $U$ is countable. $\blacksquare$


The demon is clairvoyant on almost all intervals $(-\infty, t)$, but with no guarantees about how long that clairvoyance can last. It would be unwise to follow its investment advice! As with the lucky coins, the demon’s powers are useless or unreliable. In fact, all the variants discussed in Hardin and Taylor’s monograph have some similar fine print; this illustrates the fundamentally nonconstructive nature of the AC, where we can build something, but we can’t say when, where, or for how long. You might ask what it’s building. I think the simplest answer is a sort of error-correcting code, robust against any finite (or countable, or “scattered” as per Definition 7.1.1 of Hardin and Taylor) number of errors. It is not a useful code, since what it reliably transmits or predicts is the equivalence class representative chosen by the AC, not the particular member of the class we actually care about. I think physics remains safe from Zermelo’s demon.


3. Why the sky is blue

Why is the sky blue? The conventional answer is that light bounces off air molecules, and high frequencies bounce more. Blue has a high frequency, so voilà! But actually, violet has a higher frequency, so why isn’t the sky violet? It turns out the sun emits more blue light, so surely that does it? Do the calculation and you find it doesn’t: the sky should be green. So what gives? We are missing physiology. It is the eye, and not the sky, that is blue.

To carefully determine the perceived colour, we need to go and look up luminous efficiency functions and do some integrals. This is hard work, and I am lazy. Luckily, there is a quick hack to get the right answer! Thanks to Jude Arnold for collaborating on an earlier version of this calculation. The idea is to assume the eye adapted to receive sunlight, so we can take its sensitivity curve $V$ to match the spectral intensity $I_{\odot}$ of the sun. Direct sunlight makes up about $85\%$ of the total power at midday, with the rest being the scattered or diffuse light itself, of intensity $I_{\text{diff}}$. So a better approximation is \(V = \alpha I_{\odot} + (1-\alpha) I_{\text{diff}}\) for $\alpha=0.85$. Let’s see what color this predicts!

Spectral Adaptation. If the eye adapts to a mixture of direct and diffuse sunlight with weight $\alpha$, the peak frequency is \[ h\nu^* = (7 - 2 \alpha)(1- e^{2\alpha - 7}) k_\text{B}T_{\odot}, \] where $h$ is Planck's constant, $k_\text{B}$ is Boltzmann's constant, and $T_\odot$ is the sun's temperature. Setting $\alpha = 0.85$, this corresponds to a wavelength of $\lambda^* = 473 \text{ nm}.$ (Identifying this peak with the color of the sky is a bit of a cheat, since the “peak” of a spectrum depends on how we parametrize it; it's different if we use wavelength, and really we should integrate against each cone's sensitivity curve.)
Calculation.$\,$ (Take a product of spectra, differentiate, and set to zero.)

For frequency $\nu$ and temperature $T$, the blackbody spectrum is proportional to \[ I_{\odot}(\nu) \propto \frac{\nu^{3}}{e^{h\nu/k_\text{B}T_{\odot}} - 1} \propto \frac{x^3}{e^x-1} \] for a dimensionless variable $x = h\nu/k_\text{B}T_{\odot}$. Rayleigh scattering increases with frequency as $\nu^{4}$, so $I_{\text{diff}}(\nu) \propto \nu^4 I_{\odot}(\nu)$. Thus, for weight $\alpha$, the sensitivity curve goes as \[ V(\nu) = \alpha I_{\odot}(\nu) + (1- \alpha)I_{\text{diff}}(\nu) \propto \left[\alpha + (1-\alpha) x^4\right] \frac{x^3}{e^x - 1}, \] since using the dimensionless variable ensures the constants of proportionality are equal. The perceived spectrum of the sky is $V\cdot I_{\text{diff}}$. To find the peak of this spectrum, we differentiate with respect to $x$ and set to zero, yielding the equation \[ (x - \beta)e^{x} +\beta = 0 \] for $\beta = 7 - 2\alpha$. This is transcendental, but has approximate solution $x \approx \beta(1- e^{-\beta})$. The corresponding frequency and wavelength peaks are then \[ h\nu^* = \beta (1- e^{-\beta}) k_\text{B}T_{\odot} \quad \Longrightarrow \quad \lambda^* = \frac{c}{\nu^*} = \frac{hc}{\beta (1- e^{-\beta}) k_\text{B}T_{\odot}}. \] If we now plug in $hc/k_\text{B} = 14.39 \text{ mm K}$, $\alpha = 0.85$, and $T_{\odot} = 5772 \text{ K}$ for the surface temperature of the sun, we get \[ \lambda^* = \frac{14.39 \text{ mm}}{\beta (1- e^{-\beta}) \cdot 5772} \approx 473 \text{ nm}, \] as claimed. $\blacksquare$
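If you'd like to check the arithmetic, here is a small Python sketch that solves the transcendental equation above by fixed-point iteration and compares it with the closed-form approximation; the constants are the ones quoted in the calculation.

```python
# A quick numerical check of the peak, assuming the transcendental equation
# (x - beta) e^x + beta = 0 derived above, i.e. x = beta (1 - exp(-x)).
import numpy as np

hc_over_kB = 14.39e-3         # hc / k_B in m*K, as quoted above
T_sun = 5772.0                # surface temperature of the sun, K
alpha = 0.85
beta = 7 - 2 * alpha

x = beta
for _ in range(50):           # fixed-point iteration converges quickly here
    x = beta * (1 - np.exp(-x))

x_approx = beta * (1 - np.exp(-beta))           # the one-step approximation
lam = lambda xx: hc_over_kB / (xx * T_sun)      # wavelength of the peak

print(f"iterated x = {x:.4f}, approx x = {x_approx:.4f}")
print(f"lambda* = {lam(x) * 1e9:.1f} nm (approx {lam(x_approx) * 1e9:.1f} nm)")
```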


For reference, the real answer is $\lambda = 475 \text{ nm}$. Although the adaptive hypothesis is wrong (the photopic luminous efficiency function falls off much faster than a blackbody distribution; the evolutionary story isn’t bad, but it ignores the physiological constraints on the eye, which favor a narrowband design), it produces an estimate astoundingly close to the real color of the sky: the mark of a good hack! To get a sense of the discrepancy, we can map these frequencies to color space using the CIE 1931 model, which implements all that ocular physiology we ignore. It’s close, but a few nanometers can still make a perceptible difference.

This concludes our discussion of daylight. But the sky isn’t only blue during the day; it is sometimes blue as twilight falls, a fact I noticed a few weeks ago on an evening drive. The blue lies below the pure black of the night sky, and above the orange and yellow of sunset. The black makes sense because, far enough from the horizon, all the light has been scattered. Of course, go further and the sky is dark simply because it is in the earth’s shadow. Sunset is yellow and orange because most of the blue light has been scattered out by the time it arrives at the eye. But the band in between is a puzzle!

The explanation turns out to involve new physics: the absorption of red and yellow light by atmospheric ozone ($\text{O}_3$), a process discovered by James Chappuis in 1880 and applied to twilight by Edward Hulburt in the 1950s. Ozone molecules absorb photons, get excited and start wobbling, then split into oxygen molecules ($\text{O}_2$) and atoms ($\text{O}$). Wobbling is a continuous phenomenon, leading to a band of frequencies that can be absorbed, with two absorption peaks, at $\lambda = 575 \text{ nm}$ and $\lambda = 603 \text{ nm}$:

No wonder the sunset colors disappear so quickly! They’re eaten up by the ozone, and what’s left over is the blue light not yet scattered. You might wonder if the hole in the ozone layer leads to less blue in the antarctic twilight. As Lee, Meyer and Hoeppe show from painstaking observation at the south pole, it does not, which raises yet more questions… (Their paper also has a wonderful potted history of the optics of twilight.) They conclude that, although ozone is the main factor, aerosol levels, haze, and high-altitude meteorology are also relevant, particularly for the color of the zenith. With atmospheric optics, something is always up in the air.


4. A game of primes

I’m working on a research problem where, the more prime factors a number has, the harder the problem becomes. So I began to wonder: how many prime factors does a large number have, on average? Suppose $n$ can be factorized as

\[n = p_1^{m_1} p_2^{m_2} \cdots p_{T}^{m_{T}},\]

where each $p_j$ is a distinct prime, $m_j > 0$, and $T$ is the total number of distinct prime factors. Define

\[\omega(n) = T, \quad \Omega(n) = \sum_{j=1}^T m_j.\]

We’ll focus on the second function, and try to understand its behaviour. A good place to start is numerics. Let’s plot a histogram of the distribution of $\Omega(n)$ in a window (of width $\min\{n, 10^3\}$) around $n_a = 10^{2a}$, for integers $1 \leq a \leq 5$:

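Here is a rough Python sketch of these numerics, assuming sympy's factorint for the factorizations; it tabulates the empirical distribution in each window alongside the Poisson fit discussed below (agreement is only approximate at these sizes).

```python
# A rough reproduction of the numerics: Omega(m) over windows of width
# min(n, 10^3) around n = 10^(2a), compared against Poiss(log log n).
from collections import Counter
from math import exp, factorial, log
from sympy import factorint

def big_omega(m):
    return sum(factorint(m).values())   # prime factors counted with multiplicity

for a in range(1, 6):
    n = 10 ** (2 * a)
    width = min(n, 10 ** 3)
    window = range(n - width // 2, n + width // 2)
    counts = Counter(big_omega(m) for m in window)
    lam = log(log(n))                   # Poisson parameter log log n
    mode = max(counts, key=counts.get)
    print(f"n = 10^{2 * a}: mode = {mode}, lambda = {lam:.2f}")
    for k in sorted(counts):
        empirical = counts[k] / len(window)
        poisson = exp(-lam) * lam ** k / factorial(k)
        print(f"  k = {k}: empirical {empirical:.3f} vs Poisson {poisson:.3f}")
```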
We’ve marked the mode on the $x$ axis, and although the distribution spreads out to the right, it does so very, very slowly. As Carl Pomerance said of a related function, it “has been proved to go to infinity but has never been observed to do so”. The distribution itself looks well-behaved; after inspecting a few skewed possibilities, the best visual match is the Poisson distribution:

\[X \sim \text{Poiss}(\lambda), \quad \mathbb{P}(X=k) = e^{-\lambda}\frac{\lambda^k}{k!}.\]

This distribution counts the number of times an event occurs, where each occurrence is independent and the average number of occurrences is $\lambda$. It’s tempting to view a prime $p$ as being Poisson distributed $\sim \text{Poiss}(\lambda_p)$ over each $n$, with $k$ counting how many times $p$ divides $n$. Since a prime $p$ divides every $p$th number, we expect that $\lambda_p = 1/p$. If these different prime factors are independent of each other, then the total number of factors, on average, is

\[\mathbb{E}_{m\leq n}[\Omega(m)] = \sum_{p\leq n} \lambda_p = \sum_{p\leq n} \frac{1}{p}.\]

We can do this sum assuming a few results from analytic number theory.

Sum of Reciprocal Primes. The sum of reciprocal primes less than $n$ is \[ \sum_{p\leq n} \frac{1}{p} \sim \log \log n, \] where $f(n) \sim g(n)$ means $\lim_{n\to\infty} f(n)/g(n)=1$.
Proof.$\,$ (Integrate by parts and use the prime number theorem.)

For a continuously differentiable function $f(x)$ on $[\alpha, \beta]$, and a sequence $c_n$, we can integrate by parts to obtain Abel's Partial Summation Formula \[ \sum_{\alpha \leq n \leq \beta} c_n f(n) = C(\beta)f(\beta) - \int_\alpha^\beta \text{d}x \, C(x)f'(x), \tag{1}\label{abel} \] where $C(x) = \sum_{\alpha \leq n\leq x} c_n$. The Prime Number Theorem (PNT) states that the number of primes less than $n$, $\pi(n)$, has asymptotic approximation \[ \pi(x) = \sum_{2 \leq n \leq x} \delta_{\mathbb{P}}(n) \sim \frac{x}{\log x} \tag{2}\label{PNT} \] where $\delta_{\mathbb{P}}(n)$ is the indicator function on primes. Setting $c_n = \delta_{\mathbb{P}}(n)$, $f(n) = 1/n$, and noting that $C(x) = \pi(x)$, Abel summation applied to $(\ref{PNT})$ gives \begin{align*} \sum_{p \leq n} \frac{1}{p} & = \frac{\pi(n)}{n} + \int_2^n \text{d}x \, \frac{\pi(x)}{x^2} \sim \frac{1}{\log(n)} + \int_2^n \,\frac{\text{d}x}{x\log x} \sim \log \log n, \end{align*} discarding $1/\log n - \log\log 2 = o(\log\log n)$. $\blacksquare$
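As a quick numerical sanity check, here is a Python sketch assuming a simple sieve (the cutoff $10^7$ is an arbitrary choice); the partial sums creep toward $\log\log n$ offset by Mertens' constant $\approx 0.26$, so the ratio approaches $1$ extremely slowly.

```python
# A numerical check of sum 1/p ~ log log n, using a basic sieve of Eratosthenes.
from math import log

def primes_up_to(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(sieve[p * p :: p]))
    return [p for p in range(2, n + 1) if sieve[p]]

n = 10 ** 7
s = sum(1 / p for p in primes_up_to(n))
# Mertens' theorem: the difference tends to a constant (about 0.2615), so the
# ratio only creeps toward 1 as n grows.
print(f"sum 1/p = {s:.4f},  log log n = {log(log(n)):.4f}")
```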


So, we expect that if we average $\Omega(m)$ over a large enough interval $[2, n]$, we should obtain $\log \log n$. In fact, a sum of independent Poisson-distributed variables is again Poisson: each variable counts events with a given expected number, and both the counts and the expectations add. This means that

\[\Omega(n) \sim \sum_{p \leq n}\text{Poiss}\left(\frac{1}{p}\right) = \text{Poiss}(\log \log n).\]

Plotting the probability density over the top of the empirical curves, we see excellent agreement, even at relatively small numbers! It looks as if the primes are Poisson random.

If true, this has a remarkable consequence. As $n$ becomes large, $\text{Poiss}(nx)$ can be viewed as a sum of $n$ i.i.d. variables $\text{Poiss}(x)$, since, as we said, a sum of independent Poisson variables is Poisson. By the Central Limit Theorem (CLT), this will tend to a normal distribution $\mathcal{N}(\mu, \sigma^2)$ of mean and variance $\mu=\sigma^2=nx$. For our prime divisor-counting function $\Omega(n)$, this suggests that, as $n \to \infty$,

\[\Omega(m) \sim \mathcal{N}(\log\log n, \log\log n),\]

when $m \leq n$. The number of factors is Gaussian!

Let’s turn now to what can be proven using cold, hard math. The PNT $(\ref{PNT})$ was rigorously derived in 1896 (by Hadamard and de la Vallée Poussin independently) but in 1903, Edmund Landau gave a simpler proof and made the following generalization:

Poisson Primes. Let $\pi_k(n)$ count the numbers $m$, $1 \leq m\leq n$ with $k$ prime factors (with multiplicity), i.e. such that $\Omega(m) = k$. Then \[ \pi_k(n) \sim \frac{n}{\log n}\frac{(\log\log n)^{k-1}}{(k-1)!}. \tag{3} \label{PNT+} \] So $\Omega(m)$ for $m \leq n$ is asymptotically Poisson distributed with parameter $\lambda = \log\log n$.
Proof.$\,$ (Remove a prime from the set, use induction, and do a horrible integral.)

For a prime $p \leq n$, the number of sets of $k$ primes whose product is less than $n$ is given by the number of sets of $k-1$ primes whose product is less than $n/p$: $$ \pi_{k}(n) = \sum_{p \leq n} \pi_{k-1}\left(\frac{n}{p}\right). $$ I'll give an original but not entirely rigorous argument by induction; for a more careful proof, see Hardy and Wright (Theorem 437). The base case $k=1$ is the PNT $(\ref{PNT})$. Assume it is true for $k$; we will use this to show $k+1$. Our goal is to count all sets of $k+1$ primes whose product is less than $n$. If $p$ is in some of these sets, the remaining primes must multiply to less than $n/p$. Hence, \begin{align*} \pi_{k+1}(n) = \sum_{p \leq n} \pi_{k}\left(\frac{n}{p}\right) = \sum_{p \leq n} \frac{1}{(k-1)!}\frac{n/p}{\log(n/p)}[\log\log (n/p)]^{k-1}. \end{align*} Using Abel summation $(\ref{abel})$ in a similar fashion to earlier, we can replace summing over primes with integration weighted by $1/\log x$: \begin{align*} \pi_{k+1}(n) & \sim \frac{1}{(k-1)!}\int_2^{n/2} \text{d}x\, \frac{n/x}{\log x\log(n/x)}[\log\log (n/x)]^{k-1}\\ & = -\frac{1}{(k-1)!}\frac{n}{\log n}\int_{\log\log 2}^{\log\log (n/2)} \text{d}t\, \frac{t^{k-1}}{e^t/\log n - 1}, \end{align*} where we sum only up to $n/2$ in the first integral (since if $x > n/2$, then $n/x < 2$ and hence $\pi_k(n/x) = 0$), and make the substitution $t = \log \log (n/x)$ in the second. This can be evaluated, using computer algebra or by hand, in terms of the polylogarithm $\text{Li}_j(x)$. Slogging through, we get \begin{align*} \pi_{k+1}(n) & \sim \frac{n}{\log n} \sum_{j =1}^{k}\frac{1}{(k-j)!} \left[\text{Li}_{j}\left(\frac{\log n}{\log (n/2)}\right) - \text{Li}_{j}\left(\frac{\log n}{\log 2}\right)\right]. \end{align*} The second term, at highest $j$, dominates, with \[ \pi_{k+1}(n) \sim -\frac{n}{\log n} \text{Li}_{k}\left(\frac{\log n}{\log 2}\right) \sim \frac{n}{\log n} \frac{(\log\log n)^k}{k!} \] from the large argument asymptotics of $\text{Li}_k$. This is what we wanted to show! $\blacksquare$
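For the curious, here is a short Python sketch that tabulates $\pi_k(n)$ with a smallest-prime-factor sieve and compares it to Landau's formula; since the asymptotics converge in $\log\log n$, expect only rough agreement at computable sizes (the cutoff $10^6$ is my choice).

```python
# An empirical check of Landau's formula (3), using a smallest-prime-factor
# sieve to compute Omega(m) for every m <= n. Agreement is rough at this n.
from collections import Counter
from math import factorial, log

def omega_counts(n):
    spf = list(range(n + 1))                 # smallest prime factor of each m
    for p in range(2, int(n ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, n + 1, p):
                if spf[m] == m:
                    spf[m] = p
    counts = Counter()
    big_omega = [0] * (n + 1)
    for m in range(2, n + 1):
        big_omega[m] = big_omega[m // spf[m]] + 1   # strip one prime factor
        counts[big_omega[m]] += 1
    return counts

n = 10 ** 6
counts = omega_counts(n)
for k in range(1, 8):
    landau = n / log(n) * log(log(n)) ** (k - 1) / factorial(k - 1)
    print(f"k = {k}: pi_k(n) = {counts[k]:7d},  Landau ~ {landau:9.0f}")
```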


Now that we have an asymptotic Poisson distribution, we can show that at large $n$, the values of $\Omega(m)$ for $m \leq n$ are normally distributed. The rigorous proof of this result is called the Erdős–Kac Theorem, and its genesis is legendary. In Mark Kac’s words:

I first stated the conjecture during a lecture in Princeton in March 1939. Fortunately for me and possibly for mathematics, Erdős was in the audience and he immediately perked up. Before the lecture was over he had completed the proof.

Erdős and Kac’s derivation uses complicated sieve methods, and although simpler proofs now exist, we won’t bother. Instead, I’ll finish with the heuristic argument Kac may have presented at Princeton in 1939, since it uses Landau’s Poisson primes result. (It’s heuristic because we haven’t kept track of the errors in Landau’s formula for $\pi_k(n)$.) This is one of the many gems in Kac’s book on probabilistic methods.

The Erdős–Kac Theorem. Let $K_n(\alpha, \beta)$ denote the number of integers $1 \leq m\leq n$ such that \[ \log \log n +\alpha\sqrt{\log \log n} < \Omega(m) < \log \log n +\beta\sqrt{\log \log n}. \] Dividing $K_n(\alpha, \beta)$ by $n$ gives the empirical probability that $\Omega(m)$ is within these bounds. Then \[ \lim_{n\to\infty} \frac{1}{n}K_n(\alpha, \beta) = \frac{1}{\sqrt{2\pi}}\int_{\alpha}^\beta \text{d}t \, e^{-t^2/2}, \] so that for $m \leq n$, $\Omega(m)$ is normally distributed with mean and variance $\log\log n$.
Proof.$\,$ (Apply the CLT to Landau's Poisson distribution.)

Let's start by stating the CLT more precisely for a Poisson variable $X \sim \text{Poiss}(x)$. As $x \to \infty$, the CLT tells us that $Y = (X - x)/\sqrt{x} \sim \mathcal{N}(0, 1)$, and hence \[ \lim_{x\to \infty} \mathbb{P}_{X \sim \text{Poiss}}(x + \alpha\sqrt{x} < X <x + \beta\sqrt{x} ) =\mathbb{P}_{Y \sim \mathcal{N}}(\alpha < Y < \beta), \] or more explicitly, \[ \lim_{x\to\infty} \sum_{x + \alpha\sqrt{x} < k < x+ \beta\sqrt{x}} e^{-x}\frac{x^k}{k!} =\frac{1}{\sqrt{2\pi}}\int_\alpha^\beta \text{d}t\, e^{-t^2/2}. \label{CLT} \tag{5} \] We can split the count of $\Omega(m)$ into values it assumes, i.e. the number of $m \leq n$ with $\Omega(m) = k$, which is given by $\pi_k(n)$. Applying $(\ref{CLT})$ to Landau's formula $(\ref{PNT+})$, with $x=\log\log n$, we deduce \begin{align*} \lim_{n\to \infty}\frac{1}{n}K_n(\alpha, \beta) & = \lim_{n\to \infty}\sum_{x + \alpha \sqrt{x} < k < x + \beta \sqrt{x}} \frac{\pi_k(n)}{n} \\ & = \lim_{x\to\infty} \sum_{x + \alpha \sqrt{x} < k < x + \beta \sqrt{x}} e^{-x}\frac{x^{k-1}}{(k-1)!} \\ & = \frac{1}{\sqrt{2\pi}}\int_\alpha^\beta \text{d}t\, e^{-t^2/2}. \end{align*} Thus, $K_n \sim \mathcal{N}(x, x)$, and the values of $\Omega(m)$ for $m \leq n$ are normally distributed. $\blacksquare$


Kac said, rather poetically, that the primes play a game of chance. Landau’s formula implies that they play the same game as gum on the sidewalk, which is less poetic, but no less true.


5. The firefly effect

For a long time, I’ve been itching to make a firefly simulator, and finally got round to it! More specifically, I wanted to simulate the famous phase synchronization of lightning bugs (Photinus carolinus) in the Great Smoky Mountains of North Carolina and Tennessee. It happens during mating season, once a year, and is so popular there is a lottery for park entrance permits.

The mathematics is also very beautiful. If we think of the fireflies as oscillators, with brightness varying sinusoidally, then some sort of interaction or coupling between oscillators must cause them to synchronize. (Real fireflies do more than oscillate at a single frequency: a “bright” period of $5$ to $8$ flashes is followed by roughly $8$ seconds of darkness. This isn’t so hard to add, but I won’t, for simplicity.) Kuramoto (1975) introduced a cute model for coupled oscillators which does the trick. Say we have $N$ fireflies, $i=1, 2, \ldots, N$, with brightness controlled by a phase $\theta_i$. If firefly $i$ oscillates with natural frequency $\omega_i$, its phase changes as

\[\frac{\mathrm{d}\theta_i}{\mathrm{d}t} = \dot{\theta}_i= \omega_i.\]

To measure how synchronized our fireflies are, we define the average phasor

\[re^{i\psi} = \frac{1}{N}\sum_{j=1}^N e^{i\theta_j}\]

with coherence $r$ and mean phase $\psi$. Phases add constructively if $\theta_j = \theta$ is the same for all $j$, with $r = 1$, but otherwise there is some destructive interference, and in the extreme case $r = 0$. We draw some cartoon phasors below:

Let’s now think about how to couple the oscillating bugs. If firefly $i$ aligns its phase with firefly $j$, then $\theta_i$ will slow down if it is ahead of $\theta_j$ ($\theta_j - \theta_i < 0$), and speed up if it lags behind ($\theta_j - \theta_i > 0$). We can implement such a coupling by adding a term to $\dot{\theta}_i$ proportional to the difference of phases:

\[\dot{\theta}_i = \omega_i + k (\theta_j - \theta_i),\]

for a coupling constant $k > 0$. This has the same mathematical form as friction, but unfortunately it doesn’t respect the $2\pi$ periodicity of the phase. However, $\sin(\theta_j - \theta_i)$ does respect periodicity, and approximates the linear coupling when $\theta_j \approx \theta_i$. Making this replacement, and summing over all fireflies, we obtain the Kuramoto model:

\[\dot{\theta}_i = \omega_i + \frac{K}{N}\sum_{j=1}^N\sin(\theta_j - \theta_i)\]

for a normalized coupling constant $K = Nk$. Below, we provide a firefly simulator, with natural frequencies $\omega_i$ uniformly randomized over $[-0.01, 0.01]$ and initial phases $\theta_i$ over $[-\pi, \pi)$. Fireflies also wander randomly in space, but this is uncorrelated with flashing. You can control the normalized coupling $K$ by clicking and dragging the mouse up and down, and the number of fireflies $N$ with the left and right arrow keys.
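If you'd rather run the dynamics offline, here is a minimal Python sketch using forward Euler and the mean-field form of the update; the frequency and phase ranges match those above, while the time step, duration, and population size are my own choices.

```python
# A minimal offline Kuramoto sketch: Euler-integrate the phases and report the
# final coherence r. Frequencies and initial phases match the ranges quoted
# in the text; dt, steps, and N are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def simulate(N=200, K=0.02, dt=1.0, steps=5000):
    omega = rng.uniform(-0.01, 0.01, N)          # natural frequencies
    theta = rng.uniform(-np.pi, np.pi, N)        # initial phases
    for _ in range(steps):
        z = np.mean(np.exp(1j * theta))          # mean phasor r e^{i psi}
        r, psi = np.abs(z), np.angle(z)
        # mean-field form of the model: dtheta_i/dt = omega_i + K r sin(psi - theta_i)
        theta += dt * (omega + K * r * np.sin(psi - theta))
    return np.abs(np.mean(np.exp(1j * theta)))

for K in [0.0, 0.01, 0.013, 0.02, 0.05]:
    print(f"K = {K:.3f}: final coherence r = {simulate(K=K):.2f}")
```

Finite-size fluctuations keep $r$ from vanishing exactly below threshold, but the jump in coherence around the critical coupling is clearly visible.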

At zero coupling, fireflies flash in an uncorrelated way and the coherence vanishes. Above some critical threshold, we begin to see partial synchronization, and eventually $r = 1$, with all fireflies blinking in unison. You can see this in the simulation. If you’re interested, I’ve included details about the analytic solution in the large-$N$ limit below, following Acebrón et al.

Partial Synchronization in the Large-$N$ Limit. As $N \to \infty$, there are two populations of firefly: wanderers, with $|\omega| > Kr$ and uniformly random phase; and followers, satisfying \[ \frac{\omega}{K r} = \sin(\theta - \psi), \] where $re^{i\psi}$ is the mean phasor. If the distribution of natural frequencies $g(\omega) = g(-\omega)$ is even, then followers only appear above a critical coupling $K_c = 2/[\pi g(0)]$.
Calculation.$\,$ (Couple to the mean field, turn sums to integrals, solve Fokker-Planck.)

So far, we've only used the amplitude of the phasor $r e^{i\psi}$, but the mean phase $\psi$ also has an important role to play. In fact, it provides a mean field description of the oscillators. To see this, multiply the phasor by $e^{-i\theta_i}$ and take the imaginary part: \begin{align} \Im[re^{i(\psi - \theta_i)}] & = r\sin(\psi - \theta_i) = \frac{1}{N}\sum_{j=1}^N \sin(\theta_j - \theta_i) = K^{-1}(\dot{\theta}_i - \omega_i) \\ \Longrightarrow \quad \dot{\theta}_i & = \omega_i + Kr\sin(\psi - \theta_i). \tag{5.1} \label{kuramoto-mf} \end{align} Thus, each oscillator tries to align itself to the mean phase $\psi$, with a coupling determined not only by $K$ but the coherence $r$. Since the phasor is an average, we can explicitly view it as an expectation value with respect to a discrete pdf $\rho$: \[ r e^{i\psi} = \mathbb{E}_{\rho}\left[e^{i\theta}\right], \quad \rho(\theta, t) = \int_{-\infty}^\infty \mathrm{d}\omega\,\rho(\theta, t | \omega) \sum_{j=1}^N \frac{1}{N} \delta(\omega - \omega_j), \] where $\rho(\theta, t | \omega)$ governs the probability distribution of oscillators with natural frequency $\omega$; the initial phase remains random. It's clear that if $\theta_\omega(t)$ is the solution to $(\ref{kuramoto-mf})$ with initial phase $\theta = 0$, then for some function $f$, \[ \rho(\theta, t | \omega) = f(\theta - \theta_\omega(t)) \quad \Longrightarrow \quad \frac{\partial \rho}{\partial t} = -\frac{\partial}{\partial\theta} \big\{\big[\omega + Kr \sin(\psi-\theta)\big]\rho\big\}. \tag{5.2} \label{kuramoto-fp} \] Taking the $N\to\infty$ limit, we can replace the sum over delta functions by a continuous distribution $g(\omega)$, but everything else remains the same. Then $(\ref{kuramoto-fp})$ is a diffusion-free Fokker-Planck equation. It has two solutions. The first is coherent, with \[ \rho(\theta, t | \omega) = \delta\left(\theta - \psi - \sin^{-1}[\omega/K r]\right) \] for $|\omega| \leq Kr$ and hence $|\theta -\psi| \leq \pi/2$. These individual oscillators are phase locked, with $\dot{\theta}=0$. The second solution to $(\ref{kuramoto-fp})$ is obtained from assuming a stationary state, \[ 0 = \frac{\partial}{\partial\theta} \big\{\big[\omega + Kr \sin(\psi-\theta)\big]\rho\big\} \quad \Longrightarrow \quad \rho(\theta, t | \omega) = \frac{C}{|\omega + Kr \sin(\psi-\theta)|} \] for some constant $C$ determined by normalization at fixed $\omega$. This proves our first claim.
For the second, assume that $g(\omega)=g(-\omega)$ is even. We can calculate $r$ in the large-$N$ limit, performing a separate sum over wanderers and followers: \begin{align} r = \mathbb{E}_\rho\left[e^{i(\theta-\psi)}\right] & = I_1 + I_2 \\ I_1 & = \int_{-\pi/2}^{\pi/2} \mathrm{d}\theta \int_{-\infty}^\infty \mathrm{d}\omega \,\, e^{i(\theta-\psi)}\delta\left(\theta - \psi - \sin^{-1}[\omega/K r]\right) g(\omega) \\ I_2 & = \int_{-\pi}^{\pi} \mathrm{d}\theta \int_{|\omega| > Kr} \mathrm{d}\omega \,\, e^{i(\theta-\psi)}\frac{Cg(\omega)}{|\omega + Kr \sin(\psi-\theta)|}. \end{align} From evenness, the integral $I_2$ vanishes, while the first reduces to \begin{align} r & = \int_{|\omega| \leq Kr} \mathrm{d}\omega \,\, e^{i\sin^{-1}[\omega/Kr]} g(\omega) \\ & = \int_{|\omega| \leq Kr} \mathrm{d}\omega \,\, \cos\left(\sin^{-1}[\omega/Kr]\right) g(\omega) \\ & = Kr \int_{-\pi/2}^{\pi/2} \mathrm{d} \phi \,\, \cos^2\phi \, g(Kr \sin\phi), \end{align} where on the last line we made the change of variable $\omega = Kr \sin\phi$. This has the trivial, incoherent solution $r = 0$, and for $r > 0$, the nontrivial solution \[ 1 = K \int_{-\pi/2}^{\pi/2} \mathrm{d} \phi \,\, \cos^2\phi \, g(Kr \sin\phi). \] Taking $r \to 0$ in the above, the integral gives $\pi/2$, so the minimal value of coupling at which the partial synchronization becomes possible is $2/[\pi g(0)]$, as claimed. $\blacksquare$


In our simulation, frequencies are uniformly distributed according to $g \sim \mathcal{U}(-0.01, 0.01)$. Thus, $g(0) = 1/0.02$, and for large $N$, the onset of partial synchronization should occur at $K_c = 0.04/\pi \sim 13 \times 10^{-3}$. This is reflected in the simulation! For a general uniform interval $\mathcal{U}(-L, L)$, $K_c = 4L/\pi$. We can also calculate $r$. Setting $J = Kr$ and $M = \min\{L, J\}$, we have

\[r = \int_{|\omega| \leq M} \frac{1}{2L}\mathrm{d}\omega \,\, \sqrt{1 - (\omega/J)^2} = \frac{M}{2L}\sqrt{1 - \left(\frac{M}{J}\right)^2} + \frac{J}{2L} \, \sin^{-1}\left(\frac{M}{J}\right). \tag{5.3} \label{kuramoto-r}\]

When $M = J$, the relation reduces to $r = (\pi K/4L)\, r$, which forces $r = 0$ unless $K = K_c$ exactly. But at large coupling, $J \gg L = M$, keeping only first-order terms in $L/J$ gives

\[r \approx \frac{1}{2} + \frac{J}{2L} \sin^{-1} \left(\frac{L}{J}\right) \approx \frac{1}{2} + \frac{1}{2} = 1.\]

This is what we expect! In general, when $M = L$ we can numerically solve $(\ref{kuramoto-r})$ for $r$. For instance, when $K = 20 \times 10^{-3}$, $r \approx 0.95$. This doesn’t quite match the simulation because of finite $N$ corrections. You can read more about these in Strogatz and Mirollo (1991). They also add noise to the model and do a stability analysis, but I won’t go into that here.
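Here is a small Python sketch that solves the self-consistency relation (5.3) numerically by fixed-point iteration, for the uniform distribution used above (the starting point and iteration count are arbitrary choices); at $K = 20 \times 10^{-3}$ it reproduces $r \approx 0.95$.

```python
# Solving the self-consistency relation (5.3) for r at a given coupling K,
# assuming the uniform frequency distribution U(-L, L) from the simulation.
import numpy as np

def coherence(K, L=0.01):
    def rhs(r):
        J = K * r
        M = min(L, J)
        if J == 0:
            return 0.0
        return (M / (2 * L)) * np.sqrt(max(0.0, 1 - (M / J) ** 2)) \
               + (J / (2 * L)) * np.arcsin(M / J)
    r = 1.0                                   # start from full coherence
    for _ in range(200):                      # fixed-point iteration
        r = rhs(r)
    return r

for K in [0.015, 0.02, 0.05]:
    print(f"K = {K:.3f}: r = {coherence(K):.3f}")
```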

The usual image of chaos is the butterfly effect, where systems become unpredictable due to sensitivity on initial conditions. Put differently, we start in a small patch of phase space, and time evolution spreads this out to a large patch (when suitably coarse grained). The “firefly effect” is a very different aspect of chaos: predictable behaviour emerges spontaneously, and the emergent behaviour is largely insensitive to initial conditions. This takes us from a large patch to a small patch, so it’s like the butterfly effect in reverse!


6. Quantum error correction from Penrose tilings

Penrose tilings are deliciously weird tessellations that cover the plane, but aperiodically: they never repeat themselves. They are built out of two rhombi called a “kite” and a “dart” (purple and blue below), with thin angles $2\pi/5$ and $\pi/5$ respectively. To build a tiling, we join kites and darts according to a matching rule for edges, indicated by a single or double arrowhead below:

There are uncountably many ways to perform the matching, since most of the time we can join either a kite or a dart to a given edge. Despite this, Penrose tilings are indistinguishable unless we look at an infinitely large area! We can also recover the pattern in any finite “hole”. Formally:

  1. (Indistinguishability) Any finite tiling patch $T$ is contained in every Penrose tiling. This result can be strengthened to the statement that the relative spatial frequency of a patch $T$ is also independent of the tiling.
  2. (Recoverability) The pattern in a finite region $K$ is uniquely determined by the pattern in $K^c$.

Both results are straightforward to prove using the toolkit of aperiodic tilings.

Penrose Tilings. Penrose tilings are locally indistinguishable and recoverable.
Proof.$\,$ (Dualize matching edges to bars, and iteratively group/ungroup rhombi.)

We will first show recoverability, i.e. that the pattern in a finite region $K$ can be recovered from the pattern in $K^c$. The first step is to replace the edge matching with a "dual" matching of bars, as depicted below right. It's easy to see contiguity of bars is equivalent to matching edges. However, since the bars "reflect" off edges, when joined they form straight lines through the tiling called Ammann lines.

The pattern can be uniquely reconstructed from the Ammann lines. To see this, note that the bars make an angle of $\pi/10$ at the midpoint of the double-arrow edges, so the Ammann lines will make an angle of $\pi/5$ here and only here, since no lines meet in the interior at this angle. Similarly, two double-arrow edges meet at $4\pi/5$ just in case they belong to the same thin rhombus. We draw these angles below (in red and orange). Thus, when two Ammann lines meet at $\pi/5$, we bisect them with double-arrow lines, and when these new lines intersect at $4\pi/5$, we produce darts. Kites fill in the rest of the space automatically.
We now prove local indistinguishability, namely that a finite patch of tiling $T$ is contained in every full Penrose tiling. The basic idea is to use inflation and deflation. Inflation is a way of splitting the kite and dart into smaller half-kites and half-darts. We then complete these half-rhombi into full rhombi, as below, and rescale by $(1+\sqrt{5})/2$ to obtain a new, and unique, Penrose tiling. You can check from the edge matching rules that adjacent rhombi are inflated in a consistent way. Although we won't prove it, this process is reversible, so any tiling may be uniquely deflated by running the dissection backwards.
This gives an easy proof of indistinguishability. Given any finite patch of tiling $T$, we can deflate it $d$ times until it is contained in a single vertex configuration $v$, meaning a vertex and the rhombi that come together at that vertex. There are only a finite number of such vertex arrangements, and each appears in any valid Penrose tiling $T'$. In particular, if we start with a valid tiling and deflate $d$ times, it contains $v$, and hence on inflation $d$ times, we conclude that $T'$ contains the finite patch $T$. This argument can be extended to show that the relative spatial frequency of any pattern $T$ is independent of the full Penrose tiling, a fact we will use below. $\blacksquare$

For further details I've omitted, including proof that deflation works, and the analysis of vertex arrangements, check out this site.


These properties tell us that information about the tiling is encoded globally, but in a redundant way. Li and Boyle (2023) observe that this looks a lot like a quantum error-correcting code (QECC), and take the further step of turning it into one! If it QECCs like a duck… A QECC is a way of redundantly encoding quantum information in a larger Hilbert space so that some ways of corrupting the state can be corrected. More precisely, if $\mathcal{H}_0$ is a Hilbert space of logical states, we embed it as a code subspace $\mathcal{C}$ inside a larger “physical” Hilbert space $\mathcal{H}$.

To turn logical states into physical states we need an encoding unitary $\mathcal{U}: \mathcal{H}_0 \to \mathcal{C}$, some error channels $\mathcal{E}_i: \mathcal{H} \to \mathcal{H}$ which corrupt the states, and a recovery channel $\mathcal{R}: \mathcal{H} \to \mathcal{H}$ which undoes errors. A simpler way to say this is that logical states remain distinguishable after errors. In quantum mechanics, distinguishability is measured by the inner product, so for any code states $|u\rangle, |v\rangle \in \mathcal{C}$ and any correctable error channels $\mathcal{E}_i, \mathcal{E}_j$, we must have (using the trace, since errors may make pure states mixed)

\[\mbox{Tr}[(\mathcal{E}_i|u\rangle\langle u|)^\dagger \mathcal{E}_j |v\rangle\langle v|] = \langle u | v\rangle C_{ij},\]

where the constant $C_{ij}$ depends only on the errors and not the states. With all this setup, we can finally show that Penrose tilings naturally give rise to an unusual QECC.

Penrose Tilings as a QECC. Let $\mathcal{H}$ be the Hilbert space spanned by orthonormal states $|T\rangle$, where $T$ are inequivalent Penrose tilings. Let $\mathcal{C} \subseteq \mathcal{H}$ be spanned by states of the form \[ |\overline{T}\rangle = \int \mathrm{d}g \, |gT\rangle, \] where $\mathrm{d}g$ is the Haar measure over rigid motions of the plane. Then $\mathcal{C}$ is the code space of a QECC that can correct arbitrary errors in any finite region $K$.
Proof.$\,$ (Take an overlap of states in the code space and use recoverability/indistinguishability.)


Consider a state in the subspace, decomposed over $K^c \otimes K$: \[ |\overline{T}\rangle = \int \mathrm{d}g |gT\rangle_{K^c}|gT\rangle_K, \quad \mathcal{E}|\overline{T}\rangle = \int \mathrm{d}g |gT\rangle_{K^c}\mathcal{E}|gT\rangle_K. \] The overlap of two states corrupted on $K$ is \[ \langle\overline{T}|\mathcal{E}^\dagger \mathcal{E}'|\overline{T}'\rangle = \int \mathrm{d}g' \, \mathrm{d}g\, \langle g T| g'T'\rangle_{K^c} \langle gT|_K \mathcal{E}^\dagger \mathcal{E}' |g'T'\rangle_K. \] By recoverability, $gT$ and $g'T'$ are determined by patterns on $K^c$, so the first overlap becomes an indicator function $\langle g T| g'T'\rangle_{K^c} = \delta_{gT, g'T'} = \delta_{T, g^{-1}g'T'}$. By indistinguishability, the states $|gT\rangle_K$ in the second overlap simply index patterns on $K$, with relative frequency independent of $T$ and $T'$. Thus, setting $g'' = g^{-1}g'$, we get \[ \langle\overline{T}|\mathcal{E}^\dagger \mathcal{E}'|\overline{T}'\rangle = \int \mathrm{d}g'' \, \delta_{T, g''T'} \int \mathrm{d}g \, \langle gT|_K \mathcal{E}^\dagger \mathcal{E}' |gT'\rangle_K = \langle \overline{T}|\overline{T}'\rangle C(\mathcal{E}, \mathcal{E}'), \] where $C(\mathcal{E}, \mathcal{E}')$ is a constant depending on the errors but not the states. Thus, the Penrose tiling is a QECC. $\blacksquare$


This is very cool! But it raises the question: can we go backwards, and obtain a tiling from an erasure code? To make this more concrete, consider the smallest quantum erasure code, introduced by Grassl, Beth and Pellizzari (1996). This encodes two logical qubits using four qubits. We can write

\[\begin{align*} \mathcal{U}|00\rangle &= \frac{1}{\sqrt{2}}(|0000\rangle + |1111\rangle) \\ \mathcal{U}|01\rangle &= \frac{1}{\sqrt{2}}(|0110\rangle + |1001\rangle) \\ \mathcal{U}|10\rangle &= \frac{1}{\sqrt{2}}(|1010\rangle + |0101\rangle )\\ \mathcal{U}|11\rangle & = \frac{1}{\sqrt{2}}(|1100\rangle + |0011\rangle). \end{align*}\]

This can correct an arbitrary error on a single qubit at a known location. In this case, there is a natural geometric interpretation. Notice that each code state consists of a $4$-qubit basis state, like $|0000\rangle$ and its “antipodal” opposite, $|1111\rangle$. These are opposite corners of a tesseract. These corners are always at least two edges away from each other, since the coordinates have even parity. Erasing a qubit corresponds to erasing one dimension of the tesseract, but the missing information can always be recovered by an antipodal flip, providing we know which part was erased. I suspect this is related to tilings by Minkowski’s conjecture.
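To see the error-correction condition in action, here is a short numpy sketch that checks it for the four code states above, for errors confined to any single (known) qubit. Since one-qubit operators are spanned by the Paulis, it suffices to verify that each single-qubit Pauli acts as a multiple of the identity on the code space. The helper functions are my own.

```python
# A numpy check that the four-qubit code above satisfies the error-correction
# condition <u_a| E_i^dag E_j |u_b> = C_ij <u_a|u_b> for errors supported on a
# single known qubit: products of one-qubit Paulis are again one-qubit Paulis,
# so checking each Pauli on each qubit is enough.
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1, -1]).astype(complex)

def basis_state(bits):
    v = np.zeros(2 ** len(bits), dtype=complex)
    v[int(bits, 2)] = 1
    return v

code = np.array([
    (basis_state("0000") + basis_state("1111")) / np.sqrt(2),
    (basis_state("0110") + basis_state("1001")) / np.sqrt(2),
    (basis_state("1010") + basis_state("0101")) / np.sqrt(2),
    (basis_state("1100") + basis_state("0011")) / np.sqrt(2),
])

def single_qubit_op(P, q, n=4):
    out = np.array([[1.0 + 0j]])
    for k in range(n):
        out = np.kron(out, P if k == q else I2)
    return out

ok = True
for q, P in itertools.product(range(4), [X, Y, Z]):
    M = code.conj() @ single_qubit_op(P, q) @ code.T   # matrix of <u_a|P_q|u_b>
    ok = ok and np.allclose(M, M[0, 0] * np.eye(4))
print("Error-correction condition holds on every single qubit:", ok)
```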

This construction also connects to an approach to quantum gravity called “holography” or the AdS/CFT correspondence. It has been known for some time that, in many ways, AdS/CFT behaves like a QECC, with physical information encoded redundantly on the asymptotic boundary of spacetime. In addition, the physical objects in gravity are diffeomorphism invariant, so $|T\rangle$ should be treated the same way as $|gT\rangle$. This is implemented directly by our code space.

For Penrose tilings, we don’t even need the full complement $K^c$ to determine $K$; any boundary neighbourhood $\partial K$ of $K$ (red above) will do! We can extend the Ammann lines and infer the tiling, as before. It’s cute that this theory is “holographic” in the sense that what happens inside $K$ is determined by the boundary. Moreover, inflation and deflation look a lot like renormalization, also expected to play a role in quantum gravity. From recreational mathematics to quasicrystals to quantum gravity, Penrose tilings continue to surprise!


Written on January 17, 2024