In November 1969, Ken Thompson was writing the first version of Unix on a PDP-7 at Bell Labs. He needed a way for one process to create another. The obvious design — common in operating systems of the era — was a call that took the name of a program and started it running:
newProc("/bin/sh"). Multics had something like this. DEC’s systems had something like this. Nearly every operating system written before or around 1969 had something like this.
Thompson and Dennis Ritchie chose something different, and that choice turned out to be the one that made Unix generative in the way it was.
What came before
Before Unix, the dominant model for creating a new process was specify the program, create the process. You pointed the OS at an executable, asked for it to be loaded, and the OS returned a handle to a process that had been built from scratch around that program. The new process had no relationship to the creator beyond being allocated by the same operating system.

This model has an obvious appeal: it’s clean. The parent process says what it wants (a running instance of program X) and the OS does the thing. It’s also restrictive in ways that take a while to notice. You can’t pass in-memory state from parent to child. You can’t hand over open file handles or network connections without explicit OS primitives for each one. You can’t easily set up a child’s environment to be a modified version of the parent’s — everything has to be specified up front. Pipelines, in particular, are awkward: to run A | B, you have to plumb a pipe through the OS-level process-creation API for both programs.
Multics had new_proc. RSTS/E had RUN. VM/CMS had EXEC. They all worked the same way: you tell the OS what to run, it builds a fresh process around the executable. Reasonable, and limiting.
The Unix move
Thompson’s fork() inverted the order. Instead of creating a new process by specifying a program, fork() duplicated the calling process — same memory, same file descriptors, same execution state — and returned twice. The parent got the child’s process ID; the child got zero. From that moment the two processes ran independently, sharing nothing but the state they had at the moment of the fork.
On its own, fork() just gave you two identical processes. That’s not useful. The second piece of the design was exec(), which replaced the running program’s image with a different executable while keeping the process’s identity, file descriptors, and environment. The typical Unix pattern became:
fork() to duplicate; exec() to replace. Most uses want both; some want only fork(), when the child’s job is small enough not to need its own program. The separation is what made the design generative.
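In C, the two-step pattern is a few lines (a minimal sketch; /bin/ls stands in for whatever program the parent wants to launch):

```c
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();              /* duplicate this process */
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {
        /* child: replace our image; fds, cwd, and env survive the exec */
        execl("/bin/ls", "ls", "-l", (char *)NULL);
        perror("execl");             /* reached only if exec failed */
        _exit(127);
    }
    /* parent: fork returned the child's pid, so wait for it */
    int status;
    waitpid(pid, &status, 0);
    printf("child %d exited with status %d\n", (int)pid, WEXITSTATUS(status));
    return 0;
}
```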
What did you get from this design that the newProc(program) style didn’t give you?
Inherited file descriptors. The child begins life holding every file and socket the parent had open at the fork. This is how Unix pipelines work: the shell creates the pipe, forks once per command, and the descriptors flow naturally because each child inherited them (sketched in code a little further down).
Inherited environment. Environment variables, current working directory, signal handlers, resource limits — all inherited. The child doesn’t have to be told; it already has them.
Inherited address space. Every byte of the parent’s memory is available to the child at the fork. The child can read the parent’s loaded data structures, the parent’s computed state, the parent’s open database connection. Then, if it wants a completely fresh start, it calls exec() and replaces everything. If it doesn’t, it keeps the parent’s state and diverges.
A natural shape for servers. A server that wants one worker per connection does accept() in a loop, fork()s on each accepted connection, and hands the connection to the child. The child inherits the connection’s file descriptor; the parent goes back to accepting. This is the classic pre-epoll server structure, and it’s cleanly expressible because fork() hands off state.
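A compressed sketch of that structure, with error handling mostly elided (the port number is arbitrary):

```c
#include <netinet/in.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    signal(SIGCHLD, SIG_IGN);              /* let the kernel reap children */

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);
    bind(lfd, (struct sockaddr *)&addr, sizeof addr);
    listen(lfd, 128);

    for (;;) {
        int cfd = accept(lfd, NULL, NULL); /* one connection per iteration */
        if (cfd < 0)
            continue;
        if (fork() == 0) {                 /* child: inherits cfd */
            close(lfd);                    /* the child never accepts */
            const char msg[] = "hello from the worker\n";
            write(cfd, msg, sizeof msg - 1);
            close(cfd);
            _exit(0);
        }
        close(cfd);                        /* parent: drop its copy, keep accepting */
    }
}
```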
You can see why Unix’s authors didn’t feel the need to add many other primitives for process communication: if you control what state is present at the fork, you’ve already communicated whatever you needed to communicate.
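Pipelines make that concrete. Below is roughly what a shell does for ls | wc -l, sketched without error handling: the pipe is created before the forks, so each child inherits exactly the descriptors it needs.

```c
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int p[2];
    pipe(p);                           /* p[0] = read end, p[1] = write end */

    if (fork() == 0) {                 /* left side: ls */
        dup2(p[1], STDOUT_FILENO);     /* stdout -> pipe */
        close(p[0]);
        close(p[1]);
        execlp("ls", "ls", (char *)NULL);
        _exit(127);
    }
    if (fork() == 0) {                 /* right side: wc -l */
        dup2(p[0], STDIN_FILENO);      /* stdin <- pipe */
        close(p[0]);
        close(p[1]);
        execlp("wc", "wc", "-l", (char *)NULL);
        _exit(127);
    }
    close(p[0]);                       /* parent keeps neither end */
    close(p[1]);
    wait(NULL);
    wait(NULL);
    return 0;
}
```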
What copy-on-write added
The original fork() had a performance problem. Duplicating a process meant duplicating its entire address space — every page of memory, copied. For a large process this was slow, and it was often wasted work when the child immediately called exec() and discarded the duplicated memory.
Berkeley’s response was vfork(), introduced in 3BSD around 1979 — a variant that didn’t copy the address space, on the promise that the child would exec() before touching any of it. vfork() worked but was brittle, and any programmer who violated the “don’t touch memory before exec” contract got mysterious corruption.
The better answer was copy-on-write. Introduced roughly concurrently in Mach and in later Unixes, copy-on-write made fork() cheap: instead of duplicating pages eagerly, the OS marked them read-only and shared between parent and child. A page was physically copied only when one of the two processes wrote to it. In the common case where the child quickly called exec(), almost no pages were ever copied at all.
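The semantics are easy to observe: after a fork, a write on one side never shows up on the other, even though no copying happened up front. A small demonstration (the buffer size is arbitrary):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    /* 64 MiB the parent has already touched; with copy-on-write, fork()
     * shares these pages read-only instead of duplicating them. */
    size_t n = 64 * 1024 * 1024;
    char *buf = malloc(n);
    memset(buf, 'A', n);

    if (fork() == 0) {
        buf[0] = 'B';                         /* write fault: this one page is copied */
        printf("child sees:  %c\n", buf[0]);  /* B */
        _exit(0);
    }
    wait(NULL);
    printf("parent sees: %c\n", buf[0]);      /* still A: the copies diverged */
    free(buf);
    return 0;
}
```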
With copy-on-write, fork() became a genuinely cheap primitive — fast enough to run in tight loops, which is exactly the shape of the patterns people had already developed (shell pipelines, one-fork-per-accept servers). The mental model didn’t change; the performance finally caught up to the model.
Linux generalizes it
When Linus Torvalds was implementing Linux in the early 1990s, he needed fork() for Unix compatibility but he also wanted threads. Threads and forks are related — a thread is basically a process that shares address space with its peers — and Linux’s answer was clone(), a generalized version of fork() that took flags specifying exactly what the child should share with the parent. CLONE_VM to share memory (a thread). CLONE_FILES to share file descriptors. CLONE_SIGHAND to share signal handlers. Combine flags to get different flavors of child process, from “heavyweight process” (no sharing, like fork()) to “lightweight thread” (share everything, like pthreads).
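A sketch using the glibc clone() wrapper; the flag choice here (CLONE_VM | CLONE_FILES) produces a thread-like child, and dropping those flags would produce a fork()-like one:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

static int shared_counter = 0;

static int child_fn(void *arg) {
    (void)arg;
    shared_counter = 42;   /* visible to the parent: the address space is shared */
    return 0;
}

int main(void) {
    /* clone() needs an explicit child stack; pass the top, since the
     * stack grows downward on the usual architectures. */
    size_t stack_size = 1024 * 1024;
    char *stack = malloc(stack_size);

    /* CLONE_VM shares memory, CLONE_FILES shares the fd table; SIGCHLD
     * makes the child waitable like a forked process. Namespace flags
     * (CLONE_NEWPID, CLONE_NEWNS, ...) would go here too; that's the
     * container path described below. */
    pid_t pid = clone(child_fn, stack + stack_size,
                      CLONE_VM | CLONE_FILES | SIGCHLD, NULL);
    waitpid(pid, NULL, 0);
    printf("shared_counter = %d\n", shared_counter);  /* prints 42 */
    free(stack);
    return 0;
}
```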
Linux’s clone() is still the basis for thread creation on Linux today, and it’s the underlying primitive behind container namespacing — CLONE_NEWPID creates a child in a new PID namespace, CLONE_NEWNS in a new mount namespace, and so on. Every Linux container you’ve ever run was created by clone(), under the hood. The design traces back to Thompson’s 1969 choice: instead of newProc(program), duplicate the running process and then optionally diverge.
The critique
fork() is also controversial in its details. The 2019 HotOS paper “A fork() in the road” by Baumann, Appavoo, Krieger, and Roscoe argues that fork() has become incompatible with modern systems: it doesn’t compose with threads (forking a multithreaded process is a minefield), it doesn’t compose with shared memory segments (children inherit the mappings but the semantics get weird), it doesn’t compose with accelerators (GPU state, locked memory), it doesn’t compose with kernel-managed resources that have server-side state (tracked by pid, which the child now duplicates). The authors suggest that the Unix community should move past fork() toward more explicit primitives.
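The threads point is easy to reproduce: fork() duplicates only the calling thread, so a lock held by any other thread at fork time stays locked forever in the child. A minimal demonstration (compile with -pthread; the child deadlocks by design):

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *hold_lock(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    sleep(10);                       /* hold the lock across the fork */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, hold_lock, NULL);
    sleep(1);                        /* let the helper thread take the lock */

    if (fork() == 0) {
        /* Only the forking thread exists in the child, but the mutex is
         * still marked as held by a thread that was never duplicated. */
        pthread_mutex_lock(&lock);   /* blocks forever */
        printf("never reached\n");
        _exit(0);
    }
    pthread_join(t, NULL);
    return 0;
}
```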
The paper is right about the details. fork() is a hack in many specific ways, and if we were designing Unix from scratch today we probably wouldn’t reach for its exact shape. But the idea — duplicate a running state, diverge — has been so generative that it keeps getting re-invented at different altitudes:
- Process-level: fork()/clone().
- Filesystem-level: Btrfs and ZFS snapshots. “Fork this filesystem.” Copy-on-write at the block level.
- Image-level: Docker image layers. “Fork this base image.” Copy-on-write at the filesystem layer.
- Container-level: docker commit. “Fork this running container into an image.”
- Language-level: Erlang’s spawn (actors inherit nothing, but the pattern of “spawn, diverge, communicate” is the same).
- VM-level, historically: VMware snapshots, KVM live migration. Expensive, rarely used as a hot-path primitive, but conceptually the same move.
The lesson is not that fork()’s idea is wrong; it’s that fork()’s specific implementation has accreted enough edge cases that it doesn’t scale cleanly to modern systems. The idea keeps getting re-invented because the idea is correct.
What’s missing at the top of the stack
Notice the pattern in the list above. Process-level fork() is cheap (microseconds on Linux). Filesystem-level snapshots are cheap (hundreds of milliseconds). Container image layering is cheap (milliseconds to seconds, with layer caching). And then at the VM level — the top of the granularity ladder, where you have a whole running machine to duplicate — the primitive has historically been expensive.
VMware snapshots take seconds to tens of seconds. Cloud provider snapshots take tens of seconds to minutes, and their restore times are similar. KVM live migration is fast but its target isn’t “two running copies” — it’s “move one running VM to another host.” The pattern of “fork a running VM, diverge, keep both” has been available but rarely practical, because the latency priced it out of the hot path.
This is exactly the moment in the fork() story where vfork/CoW arrived — the primitive existed but was too slow to use widely, so people designed around its absence. The unlock happens when the latency drops by enough orders of magnitude that you stop designing around it and start designing with it.
Vers in the lineage
Vers is fork() at the VM granularity, with a latency low enough that the shape of the software changes. Branch p50 is 258.3µs; branch p99 is 329.2µs. That’s the same order of magnitude as process-level fork() on Linux. The primitive is finally fast enough, at whole-machine scale, to live in the same tight loops that process-level fork() has lived in for fifty-five years.
When that shift happens, you get the same generative effects:
- One-fork-per-task becomes cheap. Just as Unix servers could fork per connection, agent infrastructure can fork per trajectory. Each task gets an isolated world and the cost of creating that world is negligible.
- Pipelines at machine granularity. The shell’s A | B | C was possible because fork() was cheap enough to chain. Whole-machine pipelines — branch, run step, commit, branch from that commit, run next step — become expressible when VM branching sits at microsecond latency.
- Inheritance of state is the delivery mechanism. In Unix, you communicate with a child by setting up state before the fork. In Vers, you communicate with a branched VM by reaching a state before the branch. Same move, bigger unit.
- Copy-on-write at the memory-page level does the same job for running machines that Berkeley’s CoW did for processes: the fork is physically cheap because divergent pages are only copied when actually written.
You can read fork() as an accidental design by Thompson that turned out to be the most generative primitive in the history of operating systems. Or you can read it as an inevitable choice: the shape that preserves the most state across process creation must win, because state is what makes processes useful, and rebuilding state from scratch is always the cost you’re trying to avoid.
Either reading lands in the same place. The primitive wasn’t done when Thompson stopped typing in 1970. It had more altitudes to reach.
Further reading
Content-addressable everything
The git equivalent of this essay: another primitive that turned out to have altitudes left to reach.
Every primitive avoids a cost
Why fork() stuck: it made the “rebuild the state from scratch” cost go away.
Architecture
How Vers implements the fork() idea at whole-machine altitude — copy-on-write memory, overlay filesystems, content-addressed commits.
Agent swarms tutorial
The “one-fork-per-task” pattern in practice — N parallel agents from a golden image.