Computational free will

June 7, 2026. If we live in a simulation, could the simulators control our minds? Probably not! I explain how complexity theory hands us a computational notion of free will.

Introduction

Suppose we are agents in a simulation. A disturbing possibility is that simulators have decided to influence or even control our thoughts, not by violent, large-scale interventions—this compromises the integrity of the simulation and is inconsistent with our experience—but more subtly, by nudging us: microscopic changes at the neuronal or electrocortical level that cascade into decision making or other deliberative processes. Simulators might do this because they are A/B-testing our behaviour, playing us in some sort of video game (think Being John Malkovich), or because they want to guide us towards cheaper regions of simulation space.

Whatever the reason, this would be a direct assault on the notion of free will. The goal of this post is to show that computational complexity theory actually provides a notion of “computational free will”, up to some reasonable assumptions. These include that simulators are subject to the same mathematical laws as we are (not unreasonable if we are necessitarians about mathematical truth) but also that their computational resources are bounded (polynomial in the size of the simulation).

Full disclosure: I don’t think we live in a simulation. Part of my goal is to reduce the number of interesting things that can be done with a high-fidelity simulation, and thus reduce the posterior likelihood that such a big, useless, and expensive program would be run in the first place. But your posterior will of course depend on your prior, aka your mileage will vary.

Chaos

A simple barrier to controlling decisions is chaos. In its Lyapunov form, it says that an initial uncertainty $\delta x_0$ in some parameter $x$ grows exponentially in time:

\[\delta x(t) \approx e^{\lambda t}\delta x_0\]

for Lyapunov exponent $\lambda$. Brains are complex systems with intricate dynamics that are plausibly chaotic in this sense. If so, even short-term behaviour, on timescales beyond the Lyapunov time $\lambda^{-1}$, is essentially unpredictable (since errors grow exponentially), so simulators cannot have analytic control over brain states, at least if initial conditions are specified with finite precision. If the dynamics cannot be predicted, they presumably cannot be controlled.
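To make the exponential growth concrete, here is a minimal sketch using the logistic map as a stand-in chaotic system. This is purely an illustrative assumption: nothing here models a brain, and the names `logistic`, `separations`, and `lyapunov_estimate` are made up for the example. It tracks how a tiny initial gap $\delta x_0$ grows, and estimates $\lambda$ as the trajectory average of $\log|f'(x)|$.

```python
import math

def logistic(x, r=4.0):
    """One step of the logistic map, a standard toy chaotic system (not a brain model)."""
    return r * x * (1.0 - x)

def separations(x0, delta0, steps):
    """Distance between two trajectories started delta0 apart, recorded after each step."""
    a, b = x0, x0 + delta0
    out = []
    for _ in range(steps):
        a, b = logistic(a), logistic(b)
        out.append(abs(a - b))
    return out

def lyapunov_estimate(x0, steps=2000, r=4.0):
    """Estimate lambda as the average of log|f'(x)| along a trajectory, f'(x) = r(1 - 2x)."""
    x, total, used = x0, 0.0, 0
    for _ in range(steps):
        slope = abs(r * (1.0 - 2.0 * x))
        if slope > 0.0:  # skip the single point x = 1/2, where the log diverges
            total += math.log(slope)
            used += 1
        x = logistic(x, r)
    return total / used

# A gap of 1e-10 grows to order one within a few dozen steps.
seps = separations(x0=0.2, delta0=1e-10, steps=60)
lam = lyapunov_estimate(0.2)
print(f"gap after 1 step: {seps[0]:.2e}, largest gap in 60 steps: {max(seps):.2e}")
print(f"estimated Lyapunov exponent: {lam:.3f}, Lyapunov time: {1 / lam:.2f} steps")
```

For this particular map (with $r = 4$) the exact exponent is $\ln 2$, so the Lyapunov time is under two steps. A real neural system would have its own, presumably very different, $\lambda$.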

The problem with this argument is that simulators know brain states exactly, so they may well be able to predict future brain states. Worse, the sensitivity to initial conditions suggests that microscopic adjustments can quickly grow into large-scale changes. This seems like a slam dunk! But there is a difference between evolving a brain state and guiding it towards a target outcome. Random seeds lead to random outcomes; unless the simulators can solve the inverse problem of determining the seed from the outcome, they are restricted to brute-force search (via analytics or simulation) over the space of small perturbations.

The space of small perturbations is continuous, but even if discretized, it is exponentially large in the number of neurons $N$. If “small” means at most $n = \alpha N$ neurons are perturbed (for $\alpha \ll 1$ but $\alpha N \gg 1$) and there are $\ell$ ways to perturb a neuron, the number of perturbations is

\[\sum_{k=0}^n \ell^k \binom{N}{k} \sim \frac{\ell^n N!}{n!(N - n)!} = \mathcal{O}\left[\left(\frac{1}{1-\alpha}\right)^N\left(\frac{\ell(1-\alpha)}{\alpha}\right)^n\right]\]

using Stirling’s formula. This is exponential in both $N$ and $n$, which, for $N \sim 10^{10}$ neurons, is plausibly outside the computational reach of our simulators unless $n = \mathcal{O}(1)$. Even if they manage to reduce the search space to something reasonable, how do they know the target state is reachable after a perturbation?
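As a sanity check on the counting, here is a small sketch that evaluates $\sum_{k \leq n} \ell^k \binom{N}{k}$ in log space (the raw integers are far too large to print) and compares it with the Stirling bound above. The function names are hypothetical, and the parameter values in the usage lines are arbitrary choices for illustration.

```python
import math

def exact_count(N, n, ell):
    """Exact number of perturbations touching at most n of N neurons, ell options each."""
    return sum(ell**k * math.comb(N, k) for k in range(n + 1))

def log10_count(N, n, ell):
    """log10 of the same sum, using lgamma and log-sum-exp so large N does not overflow."""
    logs = [
        k * math.log(ell)
        + math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
        for k in range(n + 1)
    ]
    m = max(logs)
    return (m + math.log(sum(math.exp(x - m) for x in logs))) / math.log(10)

def log10_stirling_bound(N, n, ell):
    """log10 of (1/(1-alpha))^N * (ell*(1-alpha)/alpha)^n with alpha = n/N, 0 < n < N."""
    alpha = n / N
    ln_bound = N * math.log(1 / (1 - alpha)) + n * math.log(ell * (1 - alpha) / alpha)
    return ln_bound / math.log(10)

# Small case: the log-space version agrees with exact integer arithmetic.
print(exact_count(20, 3, 2), 10 ** log10_count(20, 3, 2))

# Larger case: number of decimal digits in the count versus the Stirling bound.
print(log10_count(1000, 100, 2), log10_stirling_bound(1000, 100, 2))
```

Even at $N = 1000$ the count runs to well over a hundred digits. Evaluating `log10_count` at brain scale is possible but slow in pure Python when $n$ is large, since it loops over every $k \leq n$.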

Neural chaos. Perturbations grow exponentially in time, and there are an exponential number of them when we perturb some $\alpha = \mathcal{O}(1)$ fraction of neurons.

Reachability

Let’s treat this reachability problem formally. We model the network of neurons as a graph $G = (V, E)$, where each neuron is a node $v \in V$ and neural connections are undirected edges $\{v, w\} \in E$. Each neuron has a state $\sigma_v(t) \in \Sigma$ at time $t$, with a local update rule $f_v: \Sigma^{\delta_v} \to \Sigma$, solely in terms of states of neighbours of $v$, with $\delta_v = |N(v)|$ the degree of $v$. The state of the whole graph at time $t$ is $\sigma(t) \in \Sigma^{|V|}$, and the global update rule is $f$, so $f[\sigma(t)] = \sigma(t + 1)$, obtained by applying local update rules in some fixed order. Because we apply rules in sequence, this model is called a sequential dynamical system (SDS).

The reachability problem for an SDS is simply the question of whether the initial global state $\sigma_0$ ever evolves to some target global state $\sigma$, i.e., whether

\[\exists t : f^{(t)}[\sigma_0] = \sigma.\]

For simulators, $\sigma$ is a target brain state, e.g., where we make a different decision, and $\sigma_0$ is a perturbed initial brain state. Now, this isn’t a perfect model. Unlike the SDS, real neural networks are time-dependent and stochastic. First, although time-independence is unrealistic on the timescale of learning, the network is effectively fixed over the decision timescale, so this isn’t an issue. Second, stochastic behaviour is a coarse-graining of some (more complex) deterministic behaviour, so we lose nothing by restricting to deterministic models (for instance, discrete Hopfield networks).

For highly symmetric update rules, the problem of reachability can be efficiently solved. But as the update rules become more complex, it quickly becomes intractable! We define a threshold update rule over a binary state space $\Sigma = \{0, 1\}$, updating via

\[\sigma_v(t+1) = \left[ \sum_{w\in N(v)} b_{vw} \sigma_w(t) \geq \theta \right]\]

where $[\cdot]$ is the Iverson bracket, the $b_{vw}$ are weights, and $\theta$ is a fixed constant. This is a reasonably accurate model of excitatory processes in the brain where activation is passed along neural connections in a gated way. Here’s the punchline. Barrett et al. (2003) show that if weights are asymmetric, meaning that for some nodes $v, w \in V$ we have

\[b_{vw} \neq b_{wv},\]

then the general problem of deciding if $\sigma_0$ ever reaches $\sigma$ is $\textsf{PSPACE}$-complete. The intuition is that it can take an exponentially long time for interesting things to happen and we can’t “fast forward” the dynamics beyond running the SDS and seeing what happens. More formally, $\textsf{PSPACE}$ is contained in $\textsf{EXPTIME}$, so we can solve the problem in exponential time, but we’ve assumed this exceeds the compute budget of our simulators. A final note: asymmetric weights are biologically plausible for neurons since neural information tends to flow in one direction along the network.
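Here is a minimal sketch of the threshold SDS and of the "just run it and see" approach to reachability. The three-node network, its weights, and all names (`sds_step`, `reaches`, `NEIGHBOURS`, and so on) are hypothetical and chosen only for illustration. I assume nodes are updated one at a time in a fixed order, each seeing the updates made earlier in the same sweep, as in the SDS description above.

```python
def sds_step(state, neighbours, weights, theta, order):
    """One global update f: apply the threshold rule node by node, in a fixed order."""
    s = list(state)
    for v in order:
        total = sum(weights[(v, w)] * s[w] for w in neighbours[v])
        s[v] = 1 if total >= theta else 0  # Iverson bracket [total >= theta]
    return tuple(s)

def reaches(sigma0, target, neighbours, weights, theta, order):
    """Does sigma0 ever evolve into target? Iterate f until a state repeats.

    The dynamics are deterministic on a finite state space, so the trajectory
    must eventually cycle. Storing visited states can take memory exponential
    in the number of nodes; this sketch does not try to be space-efficient.
    """
    seen = set()
    state = sigma0
    while state not in seen:
        if state == target:
            return True
        seen.add(state)
        state = sds_step(state, neighbours, weights, theta, order)
    return False

# Toy network (an assumption for illustration): asymmetric since b_12 != b_21.
NEIGHBOURS = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
WEIGHTS = {(0, 1): 1, (0, 2): 0, (1, 0): 1, (1, 2): 0, (2, 0): 0, (2, 1): 1}
THETA = 1
ORDER = [0, 1, 2]

print(reaches((0, 1, 0), (1, 1, 1), NEIGHBOURS, WEIGHTS, THETA, ORDER))
print(reaches((0, 1, 0), (0, 0, 0), NEIGHBOURS, WEIGHTS, THETA, ORDER))
```

On three nodes this finishes instantly, but the loop in `reaches` may visit up to $2^{|V|}$ states before a repeat, which is the exponential cost discussed above.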

SDS reachability. It is computationally intractable to tell if one brain state eventually evolves into another, assuming brains can be modelled by an SDS with asymmetric threshold update.

Bounded horizon control

This argument is nice, but it overlooks an important fact: simulators only care about bounded horizon reachability, where the time is bounded by some total number of steps $T$ that correspond to the decision timescale (or indeed any neurally plausible timescale):

\[\exists t \leq T : f^{(t)}[\sigma_0] = \sigma.\]

It’s plausible to take $T = \mathcal{O}(1)$ or at most polynomial in $\vert V\vert$, so there is a polynomial time algorithm for checking if states evolve into other states: just run the update rules. As mentioned above, the $\textsf{PSPACE}$-completeness only arises when we allow for run times which are exponential in $\vert V\vert$, so on the order of $2^{10^{10}}$ time steps, which is much longer than the age of the universe when converted to neuronal time scales (milliseconds per step).
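Both points are easy to sketch. The first function below is the "just run the update rules" check for any global update `step` (a hypothetical callable standing in for $f$). The arithmetic after it compares $2^{10^{10}}$ steps with the age of the universe, assuming one millisecond per step and an age of roughly 13.8 billion years. The `rotate` map is a made-up toy used only to exercise the function.

```python
import math

def reaches_within(step, sigma0, target, T):
    """Bounded-horizon reachability: is f^(t)[sigma0] == target for some 0 <= t <= T?"""
    state = sigma0
    for _ in range(T + 1):
        if state == target:
            return True
        state = step(state)
    return False

def rotate(state):
    """Toy update rule (illustration only): cyclically shift the tuple left by one."""
    return state[1:] + state[:1]

print(reaches_within(rotate, (1, 0, 0), (0, 1, 0), T=2))  # found at t = 2
print(reaches_within(rotate, (1, 0, 0), (0, 1, 0), T=1))  # horizon too short

# Timescale comparison, in powers of ten.
N_NEURONS = 10**10
MS_PER_STEP = 1.0              # assumed neuronal timescale
AGE_UNIVERSE_YEARS = 1.38e10   # approximate
SECONDS_PER_YEAR = 3.156e7

log10_exponential_steps = N_NEURONS * math.log10(2)  # log10 of 2^(10^10)
age_universe_ms = AGE_UNIVERSE_YEARS * SECONDS_PER_YEAR * 1000
log10_available_steps = math.log10(age_universe_ms / MS_PER_STEP)

print(f"2^(10^10) has about {log10_exponential_steps:.3g} decimal digits")
print(f"steps since the Big Bang: about 10^{log10_available_steps:.1f}")
```

The bounded check costs $T$ applications of the update rule, which is polynomial whenever $T$ is. The exponential horizon, by contrast, is a number with billions of digits, against roughly $10^{21}$ millisecond steps available so far.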

So, does that make the problem of thought control easy? No, because we have neglected the whole element of control! Really, the simulators are looking for a perturbation $\delta$ such that

\[\exists t \leq T : f^{(t)}[\sigma_0 + \delta] = \sigma\]

for their desired $\sigma$ and the “natural” initial neural state $\sigma_0$. This is a bounded horizon control problem. I don’t know how hard this problem is. But I do know that it is easy if $f$ is invertible, since then we simply compute

\[\Delta = f^{(-t)}[\sigma] - \sigma_0\]

for each $t \leq T$, and if $\Delta$ is small enough, we keep it. Thankfully for us, the update rule is not generally invertible (in particular, for the asymmetric threshold update) so this strategy doesn’t work, and the simulators are probably forced to search $\delta$-space as described above. The problem is in the $\textsf{NP}$ complexity class, since we can check a putative solution in polynomial time, and we conjecture it is $\textsf{NP}$-complete. The proof will have to await another blog post.
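To illustrate both the failure of invertibility and the fallback to search, here is a sketch of brute-force bounded-horizon control on a toy three-node threshold network. Everything specific is an assumption made for the example: the network and weights, the names `threshold_step`, `step`, and `find_perturbation`, and the reading of $\sigma_0 + \delta$ as flipping a small set of bits in a binary state.

```python
from itertools import combinations

# Toy asymmetric threshold network (hypothetical, for illustration only).
NEIGHBOURS = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
WEIGHTS = {(0, 1): 1, (0, 2): 0, (1, 0): 1, (1, 2): 0, (2, 0): 0, (2, 1): 1}
THETA = 1
ORDER = [0, 1, 2]

def threshold_step(state, neighbours, weights, theta, order):
    """One sequential sweep of the threshold update rule."""
    s = list(state)
    for v in order:
        total = sum(weights[(v, w)] * s[w] for w in neighbours[v])
        s[v] = 1 if total >= theta else 0
    return tuple(s)

def step(state):
    """Global update f for the toy network above."""
    return threshold_step(state, NEIGHBOURS, WEIGHTS, THETA, ORDER)

def find_perturbation(sigma0, target, T, max_flips):
    """Search all bit-flip sets of size <= max_flips for one that hits target within T steps.

    Returns (flipped_nodes, t) for the first hit, or None. The outer loops visit
    sum_k C(N, k) candidates, which is the exponential search space.
    """
    n_nodes = len(sigma0)
    for k in range(max_flips + 1):
        for flips in combinations(range(n_nodes), k):
            state = tuple(b ^ 1 if i in flips else b for i, b in enumerate(sigma0))
            for t in range(T + 1):
                if state == target:
                    return flips, t
                state = step(state)
    return None

# Non-invertibility: two distinct states share the same successor.
print(step((1, 0, 0)), step((0, 0, 0)))

# Control: which small perturbation of the resting state leads to all-ones?
print(find_perturbation((0, 0, 0), (1, 1, 1), T=2, max_flips=1))
```

In this toy case, flipping node 1 alone drives the network to the all-ones state in one step, while flipping node 0 or node 2 just decays back to rest. The search finds that only by trying candidates in turn, and the number of candidates is what blows up with $N$.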

Horizon-bounded control. We conjecture that finding a perturbation to cascade into a target state is computationally intractable, or equivalently, that the dynamics is hard to invert.

Conclusion

Telling if one state updates to another (eventually) is wickedly hard, and telling which (if any) perturbation flows to a target state is probably only solvable by brute force over the perturbation search space, which is also (most likely) wickedly hard. It’s possible the update rules for neurons look different (though they can’t look dramatically different) or that our brains are for some reason much easier than the worst-case scenario which conventional complexity deals with (a more legitimate objection). Unfortunately, average-case complexity is much more finicky and requires an input probability distribution, which in turn requires a precise input model for the biological substrate. These are well beyond the scope of this modest blog post.

But I hope I’ve persuaded you that, even if we live in a simulation, it is highly unlikely our thoughts are in any way controlled by the simulators. They could perturb us, certainly; but they have no reliable way of telling what large-scale changes those perturbations will induce, other than by simulating and seeing what happens. This is far more severe than simple chaotic unpredictability: even with perfect knowledge of the forward process, the inverse problem is intractable. This is what I mean by computational free will, namely, a complexity-theoretic obstruction to external interference with the exercise of the will.

Computational free will. Mind control is computationally intractable.

I suspect that both chaos and non-invertibility are relevant to free will in the traditional sense. If microscopic changes can lead chaotically to decisional shifts, the “could have done otherwise” has some phenomenological grounding, since at a coarse-grained level you could have done otherwise and felt more or less the same. On the other hand, non-invertibility provides a way for free will to transcend determinism, i.e., your decisions may be determined in advance, but there is no way for a computationally bounded observer to know what those will be. I find that comforting! Perhaps it’s not so much freedom as privacy of will, an idea closely related to cryptography, one-way functions, and related notions I’ll explore another time. For now, rest easy in the knowledge that your thoughts are (probably) your own.

Written on June 7, 2026