Processes
The Operating System virtualizes the CPU to promote the illusion of a nearly-endless supply of CPUs1
You can check your process limits with
❯ ulimit -u
62133 # calculated based on available ram i think❯ sysctl kernel.pid_max
kernel.pid_max = 32768 # legacy default value❯ sysctl kernel.threads-max
kernel.threads-max = 124266 # physical upper limit i imaginepid_max and the ulimit are simply pid number limits, whereas threads-max is the max number of entries in the task table/list (usually a combo of hashtable + doubly linked list in linux)
Process API
A bare minimum interface for managing processes must include the following:
- create a process
- destroy a process forcefully
- wait for a process to finish running
- suspend pause a process
- status get info about state or how long it has run
UNIX API
UNIX gives you 4 routines to work with processes
fork()2,exec(),exit(), andwait()
Process Creation

first load the program → i.e. its code and static data (initted variables) into RAM, the address space of the process (virtual memory)
loading can be eager (entire program) or lazy (pieces of program) (paging/swapping etc)
allocate memory for running the program:
- stack
- heap (optional)
Interesting to Note
- the os is responsible for populating argc/argv etc
- the os is responsible for adding a canary before the return value in the stack3
allocates file descriptors to process? (i/o setup)
jumps to main() and transfers control of the cpu to the process
Process States
The exact states used by linux are
- Running:
- The process is either running
- or ready to run
- Waiting:
- The process is waiting for an event or resource
- It may be interruptible or uninterruptible by hardware signals
- Stopped:
- process has been stopped, usually by receiving a signal
- a debugged process may be in a stopped state
- Zombie:
- A halted process which still has a
task_structin thetaskvector - a dead process
- A halted process which still has a
/* get_task_state(): */
#define TASK_REPORT (TASK_RUNNING | TASK_INTERRUPTIBLE | \
TASK_UNINTERRUPTIBLE | __TASK_STOPPED | \
__TASK_TRACED | EXIT_DEAD | EXIT_ZOMBIE | \
TASK_PARKED)Process Structures
- process list / task list
- process
struct4 5 6 / Process Control Block (PCB) / process descriptor - register context
// the registers xv6 will save and restore
// to stop and subsequently restart a process
struct context {
int eip;
int esp;
int ebx;
int ecx;
int edx;
int esi;
int edi;
int ebp;
};
// the different states a process can be in
enum proc_state { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };
// the information xv6 tracks about each process
// including its register context and state
struct proc {
char *mem; // Start of process memory
uint sz; // Size of process memory
char *kstack; // Bottom of kernel stack
// for this process
enum proc_state state; // Process state
int pid; // Process ID
struct proc *parent; // Parent process
void *chan; // If !zero, sleeping on chan
int killed; // If !zero, has been killed
struct file *ofile[NOFILE]; // Open files
struct inode *cwd; // Current directory
struct context context; // Switch here to run process
struct trapframe *tf; // Trap frame for the
// current interrupt
};fork() syscall
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> // POSIX API
int main(int argc, char *argv[]) {
printf("hello (pid:%d)\n", (int) getpid());
int rc = fork();
if (rc < 0) {
// fork failed
fprintf(stderr, "fork failed\n");
exit(1);
} else if (rc == 0) {
// child (new process)
printf("child (pid:%d)\n", (int) getpid());
} else {
// parent goes down this path (main)
printf("parent of %d (pid:%d)\n", rc, (int) getpid());
}
return 0;
}fork() is a way to create a new process which is an almost exact copy of the calling process.
- the child continues from the
fork()call, and not frommain() - the child has its own copy of the address space etc
- the child returns 0 from
fork()whereas the parent returns the PID of the child
wait() syscall
The wait() system call suspends execution of the calling thread until one of its children terminates.
Caution
wait()waits for any child to terminate (not all children either). to deterministically wait for a specific child you usewaitpid()
This basically causes a parent to block until its child terminates and adds determinism to the forking process
else {
// parent goes down this path (main)
int status;
int wc = wait(&status); // wc = pid of child that just terminated
// status now contains interesting information about how the child terminated
printf("parent of %d (wc:%d) (pid:%d)\n", rc, wc, (int) getpid());
if (WIFEXITED(status)) {
printf("child EXIT_CODE %d\n", WEXITSTATUS(status));
}
}check out this
man 2 waitOrphan Processes
If the parent terminates before the child, the child is orphaned and immediately adopted by
initand reapedWe sometimes want to intentionally orphan a child, usually in the case of daemons. In those cases you can
fork()twice and terminate the child, letting the grandchild be adopted byinitand run indefinitely
exec() syscall
execute the program file given to it with specified arguments as a new process
| Function | Signature | Argument Format | Environment | Path Resolution |
|---|---|---|---|---|
| execv | (path, argv[]) | Vector (Array) | Inherited | Literal Path |
| execl | (path, arg0, ..., NULL) | List (Variadic) | Inherited | Literal Path |
| execve | (path, argv[], envp[]) | Vector (Array) | Explicit | Literal Path |
| fexecve | (fd, argv[], envp[]) | Vector (Array) | Explicit | File Descriptor |
| execle | (path, arg0, ..., NULL, envp[]) | List (Variadic) | Explicit | Literal Path |
| execvp | (file, argv[]) | Vector (Array) | Inherited | $PATH Search |
| execlp | (file, arg0, ..., NULL) | List (Variadic) | Inherited | $PATH Search |
given the name of an executable (e.g., wc), and some arguments (e.g., p3.c), it loads code (and static data) from that executable and overwrites its current code segment (and current static data) with it; the heap and stack and other parts of the memory space of the program are re-initialized. Then the OS simply runs that program, passing in any arguments as the argv of that process.
Warning: Although theoretically you can create as many processes as you want, systems always have some limits. Therefore, always check to see if the returned value of fork() is negative and report the result. If this does happen, try to reduce the number of child processes, or re-organize your program.
exec()never returns!It does not create a new process; rather, it transforms the currently running program (formerly
p3) into a different running program (wc). After theexec()in the child, it is almost as ifp3.cnever ran a successful call toexec()never returns.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>
int main(int argc, char *argv[]) {
printf("hello (pid:%d)\n", (int) getpid());
int rc = fork();
if (rc < 0) { // fork failed; exit
fprintf(stderr, "fork failed\n");
exit(1);
} else if (rc == 0) { // child (new process)
printf("child (pid:%d)\n", (int) getpid());
char *myargs[3];
myargs[0] = strdup("wc"); // program: "wc"
myargs[1] = strdup("p3.c"); // arg: input file
myargs[2] = NULL; // mark end of array
execvp(myargs[0], myargs); // runs word count
printf("this shouldn’t print out");
} else { // parent goes down this path
int rc_wait = wait(NULL);
printf("parent of %d (rc_wait:%d) (pid:%d)\n",
rc, rc_wait, (int) getpid());
}
return 0;
}❯ ./a.out
hello (pid:31450)
child (pid:31451)
26 108 819 p3.c
parent of 31451 (rc_wait:31451) (pid:31450)and indeed, the child does nothing except the exec() call
How your Shell runs processes
- type command to shell
forkprocessexeccommandwaitfor child- return to shell
the gap between fork and exec lets you do cool stuff like piping and redirection
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/wait.h>
int main(int argc, char *argv[]) {
int rc = fork();
if (rc < 0) {
// fork failed
fprintf(stderr, "fork failed\n");
exit(1);
} else if (rc == 0) {
// child: redirect standard output to a file
close(STDOUT_FILENO);
open("./p4.output", O_CREAT|O_WRONLY|O_TRUNC, S_IRWXU);
// now exec "wc"...
char *myargs[3];
myargs[0] = strdup("wc"); // program: wc
myargs[1] = strdup("p4.c"); // arg: file to count
myargs[2] = NULL; // mark end of array
execvp(myargs[0], myargs); // runs word count
} else {
// parent goes down this path (main)
int rc_wait = wait(NULL);
}
return 0;
}process control and users
- the
kill()syscall sends signals to processes - the
signal()syscall is used by the process to “catch” signals (sigaction()is recommended instead) - further your shell is hardwired to send
SIGINT(interrupt) on CTRL+C: usually terminating the processSIGSTP(stop) on CTRL+Z: pausing the process mid-execution
Linux and most BSDs support 31 signals
/* ISO C99 signals. */
#define SIGINT 2 /* Interactive attention signal. */
#define SIGILL 4 /* Illegal instruction. */
#define SIGABRT 6 /* Abnormal termination. */
#define SIGFPE 8 /* Erroneous arithmetic operation. */
#define SIGSEGV 11 /* Invalid access to storage. */
#define SIGTERM 15 /* Termination request. */
/* Historical signals specified by POSIX. */
#define SIGHUP 1 /* Hangup. */
#define SIGQUIT 3 /* Quit. */
#define SIGTRAP 5 /* Trace/breakpoint trap. */
#define SIGKILL 9 /* Killed. */
#define SIGPIPE 13 /* Broken pipe. */
#define SIGALRM 14 /* Alarm clock. */
/* Adjustments and additions to the signal number constants for
most Linux systems. */
#define SIGSTKFLT 16 /* Stack fault (obsolete). */
#define SIGPWR 30 /* Power failure imminent. */
/* Historical signals specified by POSIX. */
#define SIGBUS 7 /* Bus error. */
#define SIGSYS 31 /* Bad system call. */
/* New(er) POSIX signals (1003.1-2008, 1003.1-2013). */
#define SIGURG 23 /* Urgent data is available at a socket. */
#define SIGSTOP 19 /* Stop, unblockable. */
#define SIGTSTP 20 /* Keyboard stop. */
#define SIGCONT 18 /* Continue. */
#define SIGCHLD 17 /* Child terminated or stopped. */
#define SIGTTIN 21 /* Background read from control terminal. */
#define SIGTTOU 22 /* Background write to control terminal. */
#define SIGPOLL 29 /* Pollable event occurred (System V). */
#define SIGXFSZ 25 /* File size limit exceeded. */
#define SIGXCPU 24 /* CPU time limit exceeded. */
#define SIGVTALRM 26 /* Virtual timer expired. */
#define SIGPROF 27 /* Profiling timer expired. */
#define SIGUSR1 10 /* User-defined signal 1. */
#define SIGUSR2 12 /* User-defined signal 2. */
/* Nonstandard signals found in all modern POSIX systems
(including both BSD and Linux). */
#define SIGWINCH 28 /* Window size change (4.3 BSD, Sun). */the concept of process control begets the concept of users and clarifying who in fact can control the process
While our passion for the UNIX process API remains strong, we should also note that such positivity is not uniform. For example, a recent paper by systems researchers from Microsoft, Boston University, and ETH in Switzerland details some problems with fork(), and advocates for other, simpler process creation APIs such as spawn() 2. Read it, and the related work it refers to, to understand this different vantage point
Footnotes
-
https://www.man7.org/linux//man-pages/man5/proc_pid_limits.5.html and https://linuxvox.com/blog/maximum-number-of-processes-in-linux/ ↩
-
https://www.microsoft.com/en-us/research/wp-content/uploads/2019/04/fork-hotos19.pdf ↩ ↩2
-
https://www.kernel.org/doc/html/v4.19/security/self-protection.html#stack-buffer-overflow → kernel self protection mechanisms ↩
-
https://tldp.org/LDP/tlk/kernel/processes.html → Linux implementation of processes ↩
-
https://tldp.org/LDP/lki/lki-2.html → linux
task_struct↩ -
https://github.com/torvalds/linux/blob/master/include/linux/sched.h#L819 → exact
task_structdefinition (fucking huge and complex) ↩