
Think about a day in your week where you were maximally productive. You shipped a feature, fixed three bugs, refactored a module. Now think carefully about where the hours actually went. A generous estimate: maybe forty percent was spent on the thing. The rest went into getting to the place where the thing could be done, or getting back to that place after something threw you out of it. This essay is about that sixty percent. It’s not a new observation — every engineer has felt it — but the costs compound in a way that’s rarely reckoned with, and the reason they persist is that no one built a primitive to make them go away. That’s changing. It’s worth spelling out the problem clearly before talking about the solution.

The setup tax

Every task in software has a setup cost and a doing cost. The doing cost is the change you care about: fix this bug, add this endpoint, write this migration. The setup cost is everything else you have to do before the change becomes possible:
  • Clone the repo. Check out the branch.
  • Install dependencies. (Did the lockfile change? Did a native module break?)
  • Boot the database. Seed it with fixtures.
  • Start the backend. Start the frontend. Start the background worker.
  • Navigate the UI to the screen you’re working on.
  • Sign in. Reach the correct account. Get past the paywall.
  • Reproduce the bug once, as a sanity check.
None of this is the work. All of it is the conditions under which the work can happen. For a small change, the setup might take fifteen minutes. For a mature codebase with a dozen services, it can take an hour. For a task that requires landing on a specific failure mode — “the bug only happens after the second login on a Tuesday” — it can take most of the day.

The naive defense is “we only pay this once per task.” That’s false in practice. We pay it:
  • Every time we switch branches to review a colleague’s PR
  • Every time we rebase and a dependency changes
  • Every time Docker decides to nuke its cache
  • Every time a test run leaves things in a broken state
  • Every time we come back to a task the next morning and have to reconstruct where we were
  • Every time a flaky test sends us back to step one
The setup isn’t paid once. It’s paid over and over, and every payment is work that produces nothing except the ability to start doing the work.

The debugging tax

Debugging is the purest form of the problem. A bug report arrives with a list of steps, and the goal is to get into the state where the bug reproduces. You follow the steps. It doesn’t reproduce. You look closer. One of the steps assumed an account that was created three weeks ago. You create the account. Now another step fails because its fixture data isn’t loaded. You reseed the database. Now the bug’s environment differs from production in ways you don’t yet know. By the time you have a repro, the interesting part of debugging — inspecting the state that produces the bug — is thirty minutes of work. The setup took two hours.

Then you try a fix. It fails. The state is now corrupt from the attempted fix. You have to reset everything and start over. You try another fix. Same thing. After four iterations your local environment is a graveyard of half-applied experiments.

The loop isn’t “form hypothesis, test hypothesis, learn.” It’s “spend an hour recovering, form hypothesis, test hypothesis, spend an hour recovering.” And the cost isn’t linear in the number of attempts — it’s quadratic, because each recovery also reintroduces small variations that require re-validating the environment before you can trust the next result.
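The quadratic claim can be made concrete with a toy cost model. The numbers here are hypothetical, chosen only to show the shape of the curve, not measured from any real workflow:

```python
# Toy model: attempt i pays a fixed recovery cost plus a re-validation
# cost that grows with the number of small variations accumulated so far.
def total_cost(attempts, recovery=60, revalidate_per_variation=10):
    cost = 0
    for i in range(attempts):
        cost += recovery + i * revalidate_per_variation
    return cost

print(total_cost(2))   # 130 minutes
print(total_cost(4))   # 300 minutes
print(total_cost(8))   # 760 minutes: each doubling more than doubles the cost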

The experimentation tax

Software engineering claims to be an empirical discipline. We form hypotheses, run experiments, update our beliefs. But we don’t actually run many experiments, because each experiment is expensive in exactly the way described above. Imagine you want to test three different approaches to a performance problem. In principle, you’d try all three in parallel and pick the best. In practice:
  • Approach A: implement, run benchmark, record number. (20 min setup, 5 min work)
  • Approach B: revert A, implement B, run benchmark, record number. (25 min setup because reverting is never clean, 5 min work)
  • Approach C: revert B, implement C, run benchmark. (30 min setup because something from A or B leaked, 5 min work)
Three experiments produce fifteen minutes of actual signal and seventy-five minutes of rebuilding state. Faced with that math, engineers stop running experiments. They commit to one approach, ship it, and rationalize. The cost of comparing alternatives has priced alternatives out of the practice. This is why most production systems have exactly one implementation of each component, even when three alternatives are plausible and the correct choice isn’t obvious. Not because the team decided A is better than B and C. Because trying B and C was too expensive to bother with.
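The arithmetic above can be checked directly, using the essay's own illustrative per-step minutes:

```python
# (setup_minutes, work_minutes) per experiment; setup grows because
# reverting the previous attempt is never perfectly clean.
experiments = [(20, 5), (25, 5), (30, 5)]

setup = sum(s for s, _ in experiments)
signal = sum(w for _, w in experiments)

print(f"setup: {setup} min, signal: {signal} min")   # setup: 75 min, signal: 15 min
print(f"overhead ratio: {setup / signal:.0f}x")      # overhead ratio: 5x
```

Five minutes of rebuilding for every minute of signal, and the ratio worsens with each additional variant.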

The collaboration tax

You pair with someone on a problem. They’ve spent an hour getting into a useful state: repros running, debug server attached, three relevant log streams tailing in three terminals. You sit down to take a shift. Can they hand you their state? No. Not really. They describe it to you. They walk you through the steps. You re-do the setup, approximately, on your own machine, and produce an approximately equivalent state. But the repro is different. The log contents differ. You’re not really debugging the same thing; you’re debugging a thing shaped like the same thing.

Pair programming is famously valuable. One of the reasons it’s valuable is that you share state — both people are literally looking at the same screen, same terminal, same repro. The moment that shared context ends, each of you has to rebuild the other’s state to resume alone.

The reviewer tax

A colleague sends you a PR. To review it well, you need to see the code running. To see the code running, you need:
  • The PR branch checked out locally
  • Dependencies up to date with whatever that branch requires
  • A database in a state compatible with any migrations in the PR
  • Environment variables matching whatever the PR expects
  • The exact scenario in which the PR’s code path fires
For a thoughtful review of a medium-sized change, this is 30-60 minutes of setup per review. Most reviewers, honestly, don’t pay it. They review the diff, trust the CI, and move on. The sophistication of the review is bounded by the cost of reproducing the PR’s environment. This is why CI preview environments were a breakthrough when they arrived at platforms like Vercel and Netlify. “Every PR has a running version, click the link” is the simplest possible answer to the reviewer tax. It’s also notable that a decade into modern web development, only a handful of deployment targets have solved this, and mostly for frontend web apps. For backend services with databases and long-running migrations, the reviewer tax is still crushing.

The pattern

All of these are the same tax. You spent expensive effort to get to a state where useful work can happen. The state is ephemeral — it lives in a running process, a populated filesystem, a warm cache, a loaded dataset. Any interruption — a reboot, a branch switch, a broken experiment, a second person needing the state — costs you the full price to rebuild.

The deep reason is that state isn’t a first-class value in our infrastructure. Code is a value: we version it, diff it, merge it, ship it around. Data is a value: we back it up, snapshot it, replicate it. But the state of a running system, the thing that took you forty-five minutes to create — that’s not a value anywhere. You can’t hand it to someone. You can’t store it. You can’t branch it. The most you can do is describe it in words and hope the next reconstruction comes out close enough.

The right question isn’t “how do we work around the setup tax?” It’s “why isn’t the running state of a machine a first-class value we can fork, commit, and restore like code?”

What changes when state becomes a value

The cost disappears. Not shrinks — disappears. Once the setup cost has been paid, the state can be committed, branched, and restored in microseconds, and every future operation starts from the committed state instead of rebuilding it. Concretely:
  • Debugging. Commit your environment the moment you reach a repro. Every experimental fix goes on a branch of that commit. Each failed attempt is free to throw away — you restore instantly. You run five fixes in the time the old loop ran one.
  • Reviews. The author commits their PR’s state. You restore it in under a second. You’re debugging their repro, not an approximation.
  • Pair work. The other engineer commits. You branch from their commit. You inherit their exact running state, live — server processes, terminal history, warm caches, open connections.
  • Experimentation. Commit the baseline once. Branch for each variant. Run them in parallel. Compare in seconds. Discard losers without remorse, because there was nothing to lose.
  • CI and reviews. Preview environments for free, because every PR’s state is just a commit the reviewer can restore.
  • Recovery. Crashes stop being crises. The state before the crash is a commit; you restore and keep going.
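A minimal sketch of the mental model, not the actual vers API: machine state as an immutable, committable value. All names here (`StateStore`, `commit`, `restore`) are hypothetical, and `deepcopy` merely stands in for the copy-on-write memory and disk snapshots a real implementation would use to keep these operations constant-time regardless of state size:

```python
# Hypothetical sketch: running state as a first-class value.
# Commits are immutable; restore hands back a fresh copy, so a
# corrupted experiment never touches the baseline.
import copy

class StateStore:
    def __init__(self):
        self.commits = {}   # commit id -> frozen state
        self.next_id = 0

    def commit(self, state):
        """Freeze the current state and return a handle to it."""
        cid = self.next_id
        self.next_id += 1
        self.commits[cid] = copy.deepcopy(state)
        return cid

    def restore(self, cid):
        """Return a fresh, mutable copy of a committed state."""
        return copy.deepcopy(self.commits[cid])

# Pay the setup cost once...
store = StateStore()
env = {"db": "seeded", "server": "running", "repro": "reached"}
baseline = store.commit(env)

# ...then every experiment branches from the same commit.
fix_a = store.restore(baseline)
fix_a["patch"] = "approach-a"      # failed attempt? just throw it away

fix_b = store.restore(baseline)    # instant, uncorrupted baseline
assert "patch" not in fix_b
```

The design point is that `commit` and `restore` are pointer-cheap in a real system: the expensive thing (the hour of setup) is paid once, and every branch after that inherits it.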
Each of these is a day-to-day pain that engineers have absorbed because the primitive didn’t exist. Once the primitive exists, the pain goes away because you can cheaply inherit state, not because anyone sped up the work itself.

The economics

If sixty percent of engineering time is rebuilding state, and a primitive that makes state branchable and restorable in microseconds turns that sixty percent into near-zero, the implication is that engineering productivity has a lot of slack in it. Not because anyone is slacking — because the cost has been hiding in setup, in reruns, in careful avoidance of experiments people would have run if they were cheap.

This is the unsexy, boring-looking, enormously consequential thing about branchable VMs: they don’t make the doing faster. They make the setup free. And in any workflow where the doing is a tenth of the effort, making the other ninety percent free is a step-function improvement in what a team can get done in a week.

The inside joke of building this kind of primitive is that the first reaction is always “okay, cool, why do I need that?” and the second reaction, a month later, is “oh, I can’t work without that anymore.” That’s how it goes with primitives that remove costs people have stopped seeing.

Further reading

Why stateless compute ran out

The same problem, applied to autonomous agents: context isn’t a message array, and throwing it away every restart is catastrophic.

Time-travel debugging

What debugging looks like when commits are restoration points and branches are experiment tracks.

Parallel scenario testing

The experimentation tax removed: branch once, try every scenario in parallel.

Core concepts

Projects, VMs, HEAD, branches, commits — the mental model behind the primitive.