Modern operating systems generally use one of two approaches to create a new process. The first and conceptually simplest is to call a CreateProcess function that accepts the path to the executable and some parameters to control what is inherited and other modifiers. This approach is used in the Windows operating systems, where the CreateProcess function accepts no fewer than 10 parameters.
Unix-like operating systems traditionally use another method: they create a copy of the parent process that is identical except for one return value, and the child then replaces its entire address space with that of an executable. This is profoundly simple, though it feels unnecessary to create a copy just to replace it soon thereafter. The copy is made with the fork system call, which accepts not even a single parameter, and returns the process id of the child process, or zero if you are the child.
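The fork-then-replace pattern can be sketched in a few lines of C. This is a minimal sketch using the standard POSIX calls fork, execv and waitpid; the helper name spawn_and_wait is my own invention for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at `path` in a child process and return its exit status,
 * or -1 if the child could not be created or did not exit normally.
 * (spawn_and_wait is a hypothetical helper name, not a standard function.) */
int spawn_and_wait(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t pid = fork();              /* fork takes no arguments at all */
    if (pid < 0)
        return -1;                   /* fork failed: no child was created */

    if (pid == 0) {
        /* Child: fork returned zero. Replace this copy with the program. */
        execv(path, argv);
        _exit(127);                  /* only reached if execv itself failed */
    }

    /* Parent: fork returned the process id of the child. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with "/bin/sh" and the arguments "sh", "-c", "exit 7" should return 7 on a system that has a shell at that path.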
Back when Unix was created, the overhead of creating a full copy of a process wasn’t very critical, as computers had little memory. That meant when the system ran a program, it often had to swap it in from disk first. In that case, creating a copy of a process was just as expensive as running a process in the first place. Modern Unix-like kernels instead use a technique called copy-on-write, which delays copying a piece of memory until either the parent or the child modifies it. Yet, even though the CreateProcess model may be more efficient today, the fork model remained in use for a few reasons.
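Copy-on-write itself is invisible to programs; what a program can observe is only that the parent and the child each see their own private memory after the fork. A small sketch of that observable behavior, using a hypothetical helper name:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int value = 1;

/* Fork a child that overwrites its copy of `value`, then return the
 * parent's view of `value`. Returns -1 if fork or waitpid fails.
 * (parent_value_after_child_write is a hypothetical helper name.) */
int parent_value_after_child_write(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* On a copy-on-write kernel, this first write is the point where
         * the page holding `value` would actually be copied. */
        value = 2;
        _exit(value == 2 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);

    return value;   /* the parent's copy was never touched by the child */
}
```

The function returns 1 in the parent even though the child stored 2, whether the kernel copied the memory eagerly at fork time or lazily on the first write.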
The primary reason was that it was trivial to modify the child process: fork works by letting you run code in the child. If that code modifies the current process, it is modifying the child process. It also provided a convenient programming model in which processes could easily cooperate over shared file descriptors (such as pipes). Threads were added much later to Unix systems, and yet the model of forked processes that cooperate remained in use because it provided a solid and decoupled framework. In contrast, threads run in the same process, can directly access each other's memory, and can be very hard to understand and debug.
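To illustrate cooperation over a shared file descriptor, here is a minimal sketch in which the child inherits a pipe from the parent and sends a message through it. It uses only the standard pipe, fork, read, write and waitpid calls; the helper name receive_from_child is made up for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes `msg` into a pipe, and read it back in the parent.
 * Stores a NUL-terminated string in `buf` and returns the number of bytes
 * read, or -1 if the pipe or the child could not be created.
 * (receive_from_child is a hypothetical helper name.) */
ssize_t receive_from_child(const char *msg, char *buf, size_t buflen)
{
    assert(msg != NULL && buf != NULL && buflen > 0);

    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: it inherited both ends of the pipe; keep only the write end. */
        close(fds[0]);
        size_t len = strlen(msg);
        ssize_t written = write(fds[1], msg, len);
        _exit(written == (ssize_t)len ? 0 : 1);
    }

    /* Parent: close the write end so read reports EOF once the child exits. */
    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < buflen - 1 &&
           (n = read(fds[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';

    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Neither process needs to know anything about the other beyond the pipe, which is the decoupling that makes this model attractive.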
If the kernel developers decide to add a new aspect to the process model in the system, then a process may wish to modify that aspect in its child process. CreateProcess would perhaps require yet another parameter and grow in complexity. In contrast, on a Unix-like system, you would only need to add a new system call to modify the aspect of the current process and then have the forked child call it before it loads the target executable. Note that the Unix program need only invoke the system call if it actually cares (otherwise it gets nice defaults), whereas you need to pass yet another NULL to CreateProcess.
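As a sketch of this pattern, take the current directory as a stand-in for "an aspect of the process": the ordinary chdir system call changes it for the calling process, so the forked child simply calls it before loading the target executable. The helper name run_in_dir is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `path` with working directory `dir` and return its exit status,
 * or -1 on failure. The parent's own working directory is not changed.
 * (run_in_dir is a hypothetical helper name.) */
int run_in_dir(const char *dir, const char *path, char *const argv[])
{
    assert(dir != NULL && path != NULL && argv != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: adjust this one aspect of the current process, which is
         * the child, and leave every other aspect at its inherited default. */
        if (chdir(dir) < 0)
            _exit(126);
        execv(path, argv);
        _exit(127);                  /* only reached if execv failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A program that doesn't care about the working directory just leaves out the chdir call; nothing else in the interface has to change.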
However, as time passed, the developers of the Plan 9 operating system found that fork wasn’t fine-grained enough. The problem is that fork creates a complete copy of the parent process, but sometimes you may wish to share some aspects of a process with the child. Perhaps you wish to create a child process, but share the pointer to the current directory, the address space, the process id, or the set of file descriptors. To do this, they added the rfork system call, which accepts a single parameter: a bitmask that selects which aspects you wish to share and which you wish to get a copy of. Indeed, if you decide to share everything except the stack, then you essentially create a new thread, not a process. Later on, rfork was adopted into the BSD systems, and it inspired the clone system call on Linux, which is used to create both processes and threads.
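The effect of such a sharing bitmask can be observed on Linux with the glibc clone wrapper. The sketch below is Linux-specific and assumes glibc; CLONE_VM is the real flag that shares the address space, while the helper names and the 64 KiB stack size are choices made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

/* Entry point of the child: store 42 through the pointer it was given. */
static int child_main(void *arg)
{
    *(int *)arg = 42;
    return 0;
}

/* Create a child that writes 42 to a variable, then return what the parent
 * sees in that variable. With share_memory set, the child runs in the
 * parent's address space; otherwise it gets a private copy, as with fork.
 * Returns -1 on failure. (value_seen_by_parent is a hypothetical name.) */
int value_seen_by_parent(int share_memory)
{
    static int cell;
    cell = 0;

    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* SIGCHLD makes the child waitable like a forked process. */
    int flags = SIGCHLD | (share_memory ? CLONE_VM : 0);

    /* The stack grows downwards here, so pass the top of the allocation. */
    pid_t pid = clone(child_main, stack + CHILD_STACK_SIZE, flags, &cell);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited < 0)
        return -1;
    assert(WIFEXITED(status));

    return cell;
}
```

With the address space shared, the parent sees the child's store; without it, the parent's variable stays at zero, exactly as it would after a plain fork.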
I decided to go with the rfork model in Sortix, which is my homemade operating system. However, I had a distaste for the exact semantics of rfork. In particular, the call copied some parts of the process by default. For simplicity, I’d much prefer that sharing an aspect is the default, and that you need to set a bit to copy it. I therefore renamed rfork to sfork and changed the semantics ever so slightly. Adding a wrapper for BSD rfork compatibility shouldn’t be hard. The nice thing is that you can use sfork to create processes, threads, and hybrids in between. The fork function is now implemented as sfork(SFFORK), where SFFORK is a constant that expands to the bits that provide traditional fork semantics. Should you wish to do a normal fork, except share the current directory pointer, you can simply call sfork(SFFORK & ~SFCWD).
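The bit arithmetic behind sfork(SFFORK & ~SFCWD) can be sketched without any kernel support. Only the names SFFORK and SFCWD come from the text above; the names SFMEM and SFFD, all of the bit values, and the helper child_copies are hypothetical and chosen purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit assignments; the real values may well differ. */
enum {
    SFMEM  = 1 << 0,               /* hypothetical: copy the address space */
    SFFD   = 1 << 1,               /* hypothetical: copy the file descriptors */
    SFCWD  = 1 << 2,               /* copy the current directory pointer */
    SFFORK = SFMEM | SFFD | SFCWD  /* every copy bit set: traditional fork */
};

/* Under the sfork convention, a set bit means the child gets its own copy
 * of that aspect, and a clear bit means the aspect is shared with the parent.
 * (child_copies is a hypothetical helper, not part of any system.) */
bool child_copies(int flags, int aspect)
{
    assert(aspect != 0);
    return (flags & aspect) != 0;
}
```

With these assumed values, SFFORK & ~SFCWD keeps the copy bits for memory and file descriptors but clears the one for the current directory, so that aspect falls back to the default of being shared. A flags value of zero shares everything.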
Actually, for internal kernel reasons, I didn’t find sfork flexible enough. To solve that, I created the system call tfork (notice a pattern here?), which accepts both a bitmask and a structure that contains the registers of the child thread. This allows you to allocate a big memory chunk, and then tfork a thread that uses it as its stack by simply setting its stack register. You can even implement sfork using tfork by putting the current registers into the structure and calling tfork (and taking care not to call tfork again in the child).
In case you were wondering, I didn’t implement the Linux clone system call, as it bears resemblance to CreateProcess in being over-engineered.