Computational free will

June 7, 2026. If we live in a simulation, it seems like simulators could nudge us towards outcomes they choose in advance. I explain how a form of “computational free will” is ensured by complexity theory.

Introduction

Suppose we are agents in a simulation. A disturbing possibility is that simulators have decided to influence or even control our thoughts. Not by violent, large-scale interventions, which would compromise the integrity of the simulation, but more subtly, by nudging us: microscopic changes at the neuronal or electrocortical level that cascade into changes in decision making or other deliberative processes. Simulators might do this because they are A/B-testing our behaviour, playing us in some sort of video game (think Being John Malkovich), or because they want to guide us towards cheaper regions of simulation space.

Whatever the reason, this would be a direct assault on the notion of free will. The goal of this post is to show that computational complexity theory actually guarantees a notion of “computational free will”, up to some reasonable assumptions. These include the assumption that simulators are subject to the same mathematical laws as we are (not unreasonable if we are necessitarians about mathematical truth) but also that their computational resources are bounded (polynomial in the size of the simulation, more questionable).

By the way, full disclosure: I don’t think we live in a simulation. Part of the point of my argument is to establish that simulators have less freedom to interfere than we might expect, so there is less incentive for them to run big, expensive, useless simulations in the first place. But your posterior will of course depend on your prior.

Chaos

A simple reason to think controlling decisions is hard is chaos. In its Lyapunov form, it says that an initial uncertainty $\delta x_0$ in a parameter $x$ grows exponentially in time:

\[\delta x(t) \approx e^{\lambda t}\delta x_0\]

for Lyapunov exponent $\lambda > 0$. This shows that even fairly short-term dynamics, beyond a few multiples of the Lyapunov time $\lambda^{-1}$, are essentially unpredictable (since errors grow exponentially), so simulators cannot have analytic control over brain states for initial conditions specified with finite precision. If dynamics cannot be predicted, they presumably cannot be controlled.
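To make the exponential growth concrete, here is a minimal sketch using the logistic map at $r = 4$ as a stand-in chaotic system. The map, the starting point and the function names are my own illustrative choices and have nothing to do with neurons; the Lyapunov exponent of this particular map is known to be $\ln 2$.

```python
import math

def logistic(x, r=4.0):
    """One step of the logistic map x -> r x (1 - x), chaotic at r = 4."""
    return r * x * (1.0 - x)

def separation(x0, delta0, steps):
    """Record |x(t) - x'(t)| for two orbits that start delta0 apart."""
    a, b = x0, x0 + delta0
    gaps = []
    for _ in range(steps):
        gaps.append(abs(a - b))
        a, b = logistic(a), logistic(b)
    return gaps

def lyapunov_estimate(x0, steps=2000):
    """Estimate lambda as the orbit average of log|f'(x)|, with f'(x) = 4(1 - 2x)."""
    x, total, used = x0, 0.0, 0
    for _ in range(steps):
        slope = abs(4.0 * (1.0 - 2.0 * x))
        if slope > 0.0:  # skip the measure-zero point x = 1/2 where log diverges
            total += math.log(slope)
            used += 1
        x = logistic(x)
    return total / used

gaps = separation(x0=0.2, delta0=1e-12, steps=60)
lam = lyapunov_estimate(0.2)
print(f"initial gap {gaps[0]:.1e}, largest gap {max(gaps):.1e}, lambda ~ {lam:.3f}")
```

An initial error of $10^{-12}$ needs only about $\ln(10^{12})/\lambda \approx 40$ steps to become order one, which is the sense in which finite-precision prediction fails.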

The problem with this argument is that simulators know brain states exactly, so they may well be able to predict future brain states. Worse, the sensitivity to initial conditions suggests that microscopic adjustments can quickly grow into large-scale changes. This seems like a slam dunk for the simulators! But there is a difference between evolving a brain state and guiding it towards a target outcome. Random seeds lead to random outcomes; unless the simulators can solve the inverse problem of determining the seed from the outcome, they are restricted to brute-force search (via analytics or simulation) over the space of small perturbations.

Our focus now will be to understand the inverse problem and characterize its difficulty. The space of small perturbations is continuous, but even if discretized, it is exponentially large in the number of neurons $N$. If “small” means at most $n = \alpha N$ neurons are perturbed (for $\alpha \ll 1$ but $\alpha N \gg 1$) and there are $\ell$ ways to perturb a neuron, the number of perturbations is

\[\sum_{k=0}^n \ell^k \binom{N}{k} \sim \frac{\ell^n N!}{n!(N - n)!} = \mathcal{O}\left[\left(\frac{1}{1-\alpha}\right)^N\left(\frac{\ell(1-\alpha)}{\alpha}\right)^n\right]\]

using Stirling’s formula. This is exponential in both $N$ and $n$, which for $N \sim 10^{10}$ neurons, is plausibly outside the computational reach of our simulators unless $n = \mathcal{O}(1)$. But even if they manage to reduce the search space to something reasonable, how do they know the target state is reachable after a perturbation?
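As a sanity check on the counting, the sketch below compares the logarithm of the exact sum with the logarithm of the leading-order Stirling estimate. The function names and the sample values of $N$, $n$ and $\ell$ are mine, chosen only so the exact sum is cheap to evaluate; working in logs avoids overflow.

```python
import math

def log_binom(N, k):
    """Natural log of the binomial coefficient C(N, k), via log-gamma."""
    return math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)

def log_count_exact(N, n, ell):
    """log of sum_{k=0}^{n} ell^k C(N, k), computed with a log-sum-exp."""
    terms = [k * math.log(ell) + log_binom(N, k) for k in range(n + 1)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def log_count_stirling(N, alpha, ell):
    """log of (1/(1-alpha))^N * (ell (1-alpha)/alpha)^n with n = alpha N."""
    n = alpha * N
    return N * math.log(1.0 / (1.0 - alpha)) + n * math.log(ell * (1.0 - alpha) / alpha)

# Illustrative values only: 10^4 "neurons", 1% of them perturbed, 2 options each.
N, n, ell = 10_000, 100, 2
exact = log_count_exact(N, n, ell)
approx = log_count_stirling(N, n / N, ell)
print(f"log(count): exact {exact:.1f}, Stirling {approx:.1f}")
```

The two agree up to the subleading $\tfrac{1}{2}\log(2\pi n(1-\alpha))$ correction that the leading-order estimate drops, so the estimate slightly overcounts but gets the exponent right.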

Reachability

Let’s treat the reachability problem formally. We model the network of neurons as a graph $G = (V, E)$, where each neuron is a node $v \in V$ and neural connections are undirected edges $\{v, w\} \in E$. Each neuron has a state $\sigma_v(t) \in \Sigma$ at time $t$, with a local update rule $f_v: \Sigma^{\delta_v} \to \Sigma$, solely in terms of states of neighbours of $v$, with $\delta_v = |N(v)|$ the degree of $v$. The state of the whole graph at time $t$ is $\sigma(t) \in \Sigma^{|V|}$, and the global update rule is $f$, so $f[\sigma(t)] = \sigma(t + 1)$, obtained by applying local update rules in some fixed order. This is called a sequential dynamical system (SDS).
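A minimal sketch of one global SDS update may help fix the definitions. The names `sds_step`, `neighbours`, `rules` and `order`, and the toy OR rule on a three-node path, are my own illustrative choices.

```python
from typing import Callable, Dict, List, Sequence, Tuple

State = Tuple[int, ...]
LocalRule = Callable[[Sequence[int]], int]

def sds_step(state: State,
             neighbours: Dict[int, List[int]],
             rules: Dict[int, LocalRule],
             order: Sequence[int]) -> State:
    """Apply the local rules one vertex at a time, in the given fixed order.

    Vertices later in the order see the already-updated values of earlier
    ones, which is what makes the system sequential rather than synchronous.
    """
    new = list(state)
    for v in order:
        inputs = [new[w] for w in neighbours[v]]
        new[v] = rules[v](inputs)
    return tuple(new)

# Toy example: path graph 0 - 1 - 2, every vertex takes the OR of its neighbours.
neighbours = {0: [1], 1: [0, 2], 2: [1]}
rules = {v: (lambda xs: int(any(xs))) for v in neighbours}

forward = sds_step((1, 0, 0), neighbours, rules, order=[0, 1, 2])
backward = sds_step((1, 0, 0), neighbours, rules, order=[2, 1, 0])
print(forward, backward)
```

The same local rules give different global maps $f$ for different orders, so the order is part of the specification of the system.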

The reachability problem for an SDS is simply the question of whether the initial global state $\sigma_0$ ever evolves to some target global state $\sigma$, i.e., whether

\[\exists t : f^{(t)}[\sigma_0] = \sigma.\]

For simulators, $\sigma$ is a target brain state, e.g., where we make a different decision, and $\sigma_0$ is a perturbed initial brain state. Now, this isn’t a perfect model. Unlike the SDS, neural networks are time-dependent and stochastic. First, although time-independence is unrealistic on the time scale of learning, the network is fixed over the decision time scale, so this isn’t an issue. Second, stochastic behaviour coarse-grains some (more complex) deterministic behaviour (for instance discrete Hopfield networks), so we lose nothing with this restriction.

For highly symmetric update rules, the problem of reachability can be efficiently solved. But it quickly becomes impossible! For concreteness, fix a threshold update rule as follows. We use a binary state space $\Sigma = \{0, 1\}$ and define

\[\sigma_v(t+1) = \left[ \sum_{w\in N(v)} b_{vw} \sigma_w(t) \geq \theta \right]\]

where $[\cdot]$ is the Iverson bracket, $b_{vw}$ is a set of weights, and $\theta$ is a fixed constant. This is a reasonably accurate model of excitatory processes in the brain where activation is passed along neural connections in a gated way. Here’s the punchline. Barrett et al. (2003) show that if weights are asymmetric, meaning that for some nodes $v, w \in V$ we have

\[b_{vw} \neq b_{wv},\]

then the general problem of deciding if $\sigma_0$ ever reaches $\sigma$ is $\textsf{PSPACE}$-complete. (Since $\textsf{NP} \subseteq \textsf{PSPACE}$ and the containment is widely believed to be strict, this is likely much harder than $\textsf{NP}$-complete problems.) Asymmetric weights are biologically plausible for neurons since neural information tends to flow in one direction along the network. $\textsf{PSPACE}$ is contained in $\textsf{EXPTIME}$, meaning that we can solve the problem in exponential time, but we’ve assumed this exceeds the compute budget of our simulators.
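Here is a rough sketch of the threshold rule and the naive exponential-time reachability check. I assume vertices are updated sequentially in index order, and the three-node network with one-way weights is a made-up example, not a model of anything biological.

```python
def threshold_step(state, weights, theta):
    """One sequential sweep of sigma_v <- [sum_w b_vw sigma_w >= theta].

    weights[v] maps each neighbour w to b_vw; since b_vw and b_wv are
    stored separately, they are free to differ (asymmetric weights).
    Assumption: vertices are updated in index order 0, 1, 2, ...
    """
    new = list(state)
    for v in range(len(new)):
        total = sum(b * new[w] for w, b in weights[v].items())
        new[v] = 1 if total >= theta else 0
    return tuple(new)

def reaches(start, target, weights, theta):
    """Iterate until the target appears or the orbit revisits a state.

    The dynamics is deterministic on a finite state space, so every orbit
    is eventually periodic. In the worst case this visits 2^|V| states.
    """
    seen = set()
    state = start
    while state not in seen:
        if state == target:
            return True
        seen.add(state)
        state = threshold_step(state, weights, theta)
    return False

# Hypothetical 3-node network: activation flows 2 -> 0 -> 1 -> 2 only.
weights = [{1: 0, 2: 1}, {0: 1, 2: 0}, {0: 0, 1: 1}]
theta = 1

print(reaches((0, 0, 1), (1, 1, 1), weights, theta))
print(reaches((1, 0, 0), (1, 1, 1), weights, theta))
```

The `seen` set makes this sketch exponential in memory as well as time. A polynomial-space version would replace it with a step counter running up to $2^{|V|}$, which needs only $|V|$ bits and is why the problem sits inside $\textsf{PSPACE}$.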

Bounded horizon control

This argument is nice, but it overlooks an important fact: simulators only care about bounded horizon reachability, where the time is bounded by some total number of steps $T$:

\[\exists t \leq T : f^{(t)}[\sigma_0] = \sigma.\]

Here, $T$ is some fixed, decisional time scale, plausibly $T = \mathcal{O}(1)$ or at most polynomial in $\vert V\vert$, so there is a polynomial time algorithm for checking if states evolve into other states: just run the update rules. The $\textsf{PSPACE}$-completeness only arises when we allow for run times which are exponential in $\vert V\vert$, so on the order of $10^{10^{10}}$ time steps, which is much longer than the age of the universe when converted to neuronal time scales.
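In code, "just run the update rules" is a loop of at most $T$ steps. The sketch below takes the global update as an arbitrary function `step`; the cyclic shift used to exercise it is a hypothetical stand-in for $f$, chosen only because its orbits are easy to read off.

```python
def reaches_within(start, target, step, T):
    """Return the first t <= T with f^(t)[start] == target, or None.

    Cost is at most T applications of `step`, so the check is polynomial
    whenever T and the cost of one update are polynomial in |V|.
    """
    state = start
    for t in range(T + 1):
        if state == target:
            return t
        state = step(state)
    return None

def rotate(state):
    """Hypothetical stand-in for f: cyclically shift the state by one place."""
    return state[-1:] + state[:-1]

print(reaches_within((1, 0, 0), (0, 0, 1), rotate, T=2))
print(reaches_within((1, 0, 0), (0, 0, 1), rotate, T=1))
```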

So, does that make the problem of thought control easy? No, because we have neglected the whole element of control! Really, the simulators are looking for a perturbation $\delta$ such that

\[\exists t \leq T : f^{(t)}[\sigma_0 + \delta] = \sigma\]

for their desired $\sigma$ and a neurally fixed $\sigma_0$. This is a bounded horizon control problem. I don’t know how hard this problem is. But I do know that it is easy if $f$ is invertible, since then we simply compute

\[\Delta = f^{(-t)}[\sigma] - \sigma_0\]

for each $t \leq T$, and if $\Delta$ is small enough, we keep it. Thankfully, the update rule is not generally invertible (in particular, for the asymmetric threshold update) so this strategy doesn’t work, and the simulators are forced to search $\delta$-space as above. I conjecture this is $\textsf{NP}$-complete but the proof will have to await another blog post.
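The brute-force search over $\delta$-space can be sketched as follows. I make several simplifying assumptions here: states are binary, a perturbation flips a subset of at most `max_flips` bits (so $\sigma_0 + \delta$ becomes a bitwise XOR and $\ell = 1$), updates are sequential in index order, and the tiny network is hypothetical.

```python
from itertools import combinations

def threshold_step(state, weights, theta):
    """Sequential sweep of the threshold rule, vertices in index order."""
    new = list(state)
    for v in range(len(new)):
        total = sum(b * new[w] for w, b in weights[v].items())
        new[v] = 1 if total >= theta else 0
    return tuple(new)

def find_perturbation(sigma0, target, weights, theta, T, max_flips):
    """Try every set of at most max_flips bit flips, smallest sets first.

    Returns (flipped_vertices, t) for the first perturbation that hits the
    target within T steps, or None. The outer loops enumerate
    sum_k C(N, k) candidates, which is the exponential part.
    """
    N = len(sigma0)
    for k in range(max_flips + 1):
        for flips in combinations(range(N), k):
            state = tuple(1 - s if i in flips else s for i, s in enumerate(sigma0))
            for t in range(T + 1):
                if state == target:
                    return flips, t
                state = threshold_step(state, weights, theta)
    return None

# Hypothetical 3-node network: activation flows 2 -> 0 -> 1 -> 2 only.
weights = [{1: 0, 2: 1}, {0: 1, 2: 0}, {0: 0, 1: 1}]
result = find_perturbation((1, 0, 0), (1, 1, 1), weights, theta=1, T=3, max_flips=1)
print(result)
```

Each candidate is cheap to test, which is what one would expect of a problem in $\textsf{NP}$: a proposed $\delta$ is a certificate that can be verified by running the forward dynamics for $T$ steps. The cost is all in the enumeration.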

Conclusion

Telling if one state updates to another (eventually) is wickedly hard, and telling which (if any) perturbation updates to a target state is probably only solvable by brute force over the perturbation search space. It’s possible the update rules for neurons look different (though they can’t look dramatically different) or that our brains are for some reason much easier than the worst-case scenario which conventional complexity deals with (a more legitimate objection). Unfortunately, average-case complexity is much harder and requires an input probability distribution, which in turn requires a more precise input model for the biological substrate. These are well beyond the scope of this modest blog post.

But I hope I’ve persuaded you that, even if we live in a simulation, it is highly unlikely our thoughts are in any way controlled by the simulators. They could perturb us, certainly; but they have no reliable way of telling what large-scale changes those perturbations will induce, other than by simulating and seeing what happens. This is far more severe than simple chaotic unpredictability; even with perfect knowledge of the forward process, the inverse problem is intractable. This is what I mean by computational free will: complexity-theoretic obstructions to external interference with the exercise of the will.

Does this have anything to do with free will in the traditional sense? I suspect that both chaos and non-invertibility are relevant. If microscopic changes can lead to decisional shifts, the “could have done otherwise” has some phenomenological grounding. On the other hand, non-invertibility provides a way for free will to transcend determinism, i.e., your decisions may be determined in advance, but there is no way for a computationally bounded observer to know what those will be. I find that comforting! Perhaps it’s not so much freedom as privacy of will, an idea closely related to cryptography, average-case complexity, and other such notions I’ll explore another time. For now, rest easy in the knowledge that your thoughts are (probably) yours.

Written on June 7, 2026