Limited Direct Execution
to virtualise the cpu (time sharing, context switching, etc.), the os needs to stay in control of how a process executes
can’t just jump to main() and not care what the program does; the os has to be a strict parent. this matters for security too
it also has to be efficient: staying in control can’t come at the cost of performance
Direct Execution
| OS | Program |
|---|---|
| Create entry for process list | |
| Allocate memory for program | |
| Load program into memory | |
| Set up stack with argc/argv | |
| Clear registers | |
| Execute call main() | |
| | Run main() |
| | Execute return from main |
| Free memory of process | |
| Remove from process list | |
two problems with direct execution
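- how to make sure the program doesn’t do anything it shouldn’t and mess up your laptop (linux moment), while still running it efficiently
- how to stop a running process and switch to another one, i.e. how to implement time sharing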
Problem #1: restricted operations
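the hardware provides two processor modes: user mode and kernel mode
code running in user mode can not issue i/o requests directly (trying to raises an exception, and the os will likely kill the process). to do anything privileged it must go through syscalls.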
“To execute a system call, a program must execute a special trap instruction. This instruction simultaneously jumps into the kernel and raises the privilege level to kernel mode; once in the kernel, the system can now perform whatever privileged operations are needed (if allowed), and thus do the required work for the calling process. When finished, the OS calls a special return-from-trap instruction, which, as you might expect, returns into the calling user program while simultaneously reducing the privilege level back to user mode.”
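to see the "syscall = number + trap" idea from user space, here is a minimal sketch. it assumes linux with glibc, where the `syscall()` library function takes a syscall number plus arguments and executes the trap for you. `raw_getpid` and `raw_write_stdout` are names made up for this note, not real apis.

```c
#define _GNU_SOURCE          /* exposes syscall() in glibc headers */
#include <sys/syscall.h>     /* SYS_getpid, SYS_write */
#include <unistd.h>          /* syscall(), getpid() */

/* Ask the kernel for our pid by syscall number instead of the getpid() wrapper. */
long raw_getpid(void) {
    return syscall(SYS_getpid);
}

/* Write len bytes to stdout (fd 1) by syscall number.
 * Returns the number of bytes written, or -1 on error. */
long raw_write_stdout(const char *buf, unsigned long len) {
    return syscall(SYS_write, 1, buf, len);
}
```

`raw_getpid()` should agree with the normal `getpid()` wrapper, since both end up in the same kernel handler; the wrapper just hides the number.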
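entering the kernel must not clobber the user context, otherwise return-from-trap has nothing to return to. on x86 the processor pushes the program counter, flags, and a few other registers onto a per-process kernel stack when entering the trap; return-from-trap pops them back off and resumes the user program
still gotta do this safely: user code can’t be allowed to jump to arbitrary kernel addresses. so at boot the os sets up a trap table that tells the hardware where the trap handlers live (setting this up is itself a privileged operation). the hardware remembers these addresses and jumps to them when a trap happens. user code only ever names a syscall by its number, which the os checks inside the trap handler, so user input can never choose the address that gets executed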
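the "user only passes a number" part can be sketched as a toy dispatch table in plain c. this is a simulation, not kernel code: the three handlers, `NSYSCALLS`, and `dispatch` are all hypothetical names invented here.

```c
#define NSYSCALLS 3   /* hypothetical number of syscalls in this toy table */

typedef long (*handler_t)(long arg);

/* Toy handlers standing in for real kernel routines. */
static long sys_double(long arg) { return arg * 2; }
static long sys_negate(long arg) { return -arg; }
static long sys_ident(long arg)  { return arg; }

/* Filled in once "at boot"; user code never sees these addresses. */
static const handler_t trap_table[NSYSCALLS] = {
    sys_double, sys_negate, sys_ident
};

/* What the trap handler does with a user-supplied number:
 * validate it first, and only then index the table.
 * Returns 0 and stores the result in *out, or -1 for an unknown number. */
int dispatch(long num, long arg, long *out) {
    if (num < 0 || num >= NSYSCALLS)
        return -1;                 /* reject instead of jumping anywhere */
    *out = trap_table[num](arg);
    return 0;
}
```

the bounds check is the whole point: the only addresses reachable from user input are the ones the os put in the table.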
[1] https://www.youtube.com/watch?v=fLS99zJDHOc
prompt> man syscalls
prompt> info syscall
Limited Direct Execution Protocol
| OS @ boot (kernel mode) | Hardware | Program (user mode) |
|---|---|---|
| initialize trap table | | |
| | remember address of syscall handler | |

| OS @ run (kernel mode) | Hardware | Program (user mode) |
|---|---|---|
| Create entry for process list | | |
| Allocate memory for program | | |
| Load program into memory | | |
| Set up user stack with argc/argv | | |
| Fill kernel stack with reg/PC | | |
| return-from-trap | | |
| | restore regs (from kernel stack) | |
| | move to user mode | |
| | jump to main | |
| | | run main() |
| | | call system call |
| | | trap into OS |
| | save regs (to kernel stack) | |
| | move to kernel mode | |
| | jump to trap handler | |
| handle trap | | |
| do work of syscall | | |
| return-from-trap | | |
| | restore regs (from kernel stack) | |
| | move to user mode | |
| | jump to PC after trap | |
| | | return from main() |
| | | trap (via exit()) |
| Free memory of process | | |
| Remove from process list | | |
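the hardware steps in the protocol above (save regs, switch mode, jump; then the reverse) can be mimicked with a toy cpu struct. everything here is made up for illustration: a three-register "cpu", a one-slot "kernel stack", and a pc that advances by 1 per instruction.

```c
enum mode { USER_MODE, KERNEL_MODE };

/* Tiny hypothetical register file. */
struct regs { long pc; long sp; long a0; };

struct cpu {
    enum mode   mode;
    struct regs regs;     /* live registers */
    struct regs kstack;   /* one-slot stand-in for the process's kernel stack */
};

/* Hardware side of a trap: save user regs, raise privilege, jump to handler. */
void trap_enter(struct cpu *c, long handler_pc) {
    c->kstack  = c->regs;        /* save regs (to kernel stack) */
    c->mode    = KERNEL_MODE;    /* move to kernel mode */
    c->regs.pc = handler_pc;     /* jump to trap handler */
}

/* return-from-trap: restore regs, drop privilege, resume after the trap. */
void trap_return(struct cpu *c, long result) {
    c->regs     = c->kstack;     /* restore regs (from kernel stack) */
    c->regs.a0  = result;        /* hand the syscall result back in a register */
    c->regs.pc += 1;             /* "PC after trap" in this toy encoding */
    c->mode     = USER_MODE;     /* move to user mode */
}
```

note that the user's sp survives the round trip untouched: whatever the handler did to the live registers is thrown away when the saved copy is restored.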
Problem #2: switching between processes
if a process is running ⇒ the os is not running. so how does the os stop the process to start the next one?
cooperative (trusting) approach → trust that the process will make syscalls at regular intervals
- every syscall (or trap caused by an illegal operation) gives control back to the os, which can then choose to run a different process
- such systems usually include an explicit yield syscall that does nothing except give control back to the os
- this is over-trusting: if a process gets stuck in an infinite loop and never makes a syscall, nothing can be done short of a reboot
non-cooperative approach → don’t trust the process. the os starts a hardware timer at boot; the timer raises an interrupt every few milliseconds, the os’s interrupt handler runs, and the os has control again
- on each timer interrupt the os chooses whether to switch processes or keep running the current one; that decision belongs to the scheduler (algo-1 trauma, i still can’t do greedy problems)
- if it chooses to switch, it executes a context switch. reminder: the context is basically the state of the process’s registers at that moment
  - it saves, for the current process, the
    - general purpose registers
    - pc
    - kernel sp
  - and restores the same for the process it is switching to
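a user-level analogy for the timer interrupt: a posix interval timer delivers SIGALRM to a loop that never yields, and the handler runs anyway. this is only a sketch assuming a posix system (`sigaction`, `setitimer`); `on_tick` and `spin_until` are hypothetical names, and a signal is not a real hardware interrupt.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t ticks = 0;

/* Plays the role of the timer interrupt handler:
 * it runs no matter what the loop below is doing. */
static void on_tick(int signo) {
    (void)signo;
    ticks++;
}

/* Spin in a loop that never yields; return once `want` ticks have fired.
 * Returns the tick count seen, or -1 if the timer could not be set up. */
int spin_until(int want) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* Fire every 10 ms, starting 10 ms from now. */
    struct itimerval every_10ms = {
        .it_interval = { 0, 10000 },
        .it_value    = { 0, 10000 }
    };
    ticks = 0;
    if (setitimer(ITIMER_REAL, &every_10ms, NULL) != 0)
        return -1;

    while (ticks < want) {
        /* busy loop: this "process" never gives up the cpu on its own */
    }

    struct itimerval off = { { 0, 0 }, { 0, 0 } };
    setitimer(ITIMER_REAL, &off, NULL);   /* stop the timer */
    return (int)ticks;
}
```

without the timer this loop would spin forever, which is exactly the failure mode of the cooperative approach.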
Limited Direct Execution Protocol (with timer interrupt)
| OS @ boot (kernel mode) | Hardware | Program (user mode) |
|---|---|---|
| initialize trap table | | |
| | remember addresses of syscall handler and timer handler | |
| start interrupt timer | | |
| | start timer | |
| | interrupt CPU in X ms | |

| OS @ run (kernel mode) | Hardware | Program (user mode) |
|---|---|---|
| | | Process A |
| | | … |
| | timer interrupt | |
| | save regs(A) → k-stack(A) | |
| | move to kernel mode | |
| | jump to trap handler | |
| handle the trap | | |
| call switch() routine: save regs(A) → proc_t(A), restore regs(B) ← proc_t(B), switch to k-stack(B) | | |
| return-from-trap (into B) | | |
| | restore regs(B) ← k-stack(B) | |
| | move to user mode | |
| | jump to B’s PC | |
| | | Process B |
| | | … |
```
# void swtch(struct context *old, struct context *new);
#
# Save current register context in old
# and then load register context from new.
.globl swtch
swtch:
    # Save old registers
    movl 4(%esp), %eax     # put old ptr into eax
    popl 0(%eax)           # save the old IP
    movl %esp, 4(%eax)     # and stack
    movl %ebx, 8(%eax)     # and other registers
    movl %ecx, 12(%eax)
    movl %edx, 16(%eax)
    movl %esi, 20(%eax)
    movl %edi, 24(%eax)
    movl %ebp, 28(%eax)

    # Load new registers
    movl 4(%esp), %eax     # put new ptr into eax
    movl 28(%eax), %ebp    # restore other registers
    movl 24(%eax), %edi
    movl 20(%eax), %esi
    movl 16(%eax), %edx
    movl 12(%eax), %ecx
    movl 8(%eax), %ebx
    movl 4(%eax), %esp     # stack is switched here
    pushl 0(%eax)          # return addr put in place
    ret                    # finally return into new ctxt
```
xv6 context switch code
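the same "save my registers, load someone else’s" idea can be tried from user space with the `ucontext` functions (`getcontext`, `makecontext`, `swapcontext`) that glibc provides. this is only an analogue of swtch, not how xv6 does it; `task_a`, `task_b`, `note`, and `run_tasks` are hypothetical names, and the 64 KiB stack size is an arbitrary assumption.

```c
#include <string.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)   /* arbitrary size for each task's stack */

static ucontext_t main_ctx, a_ctx, b_ctx;
static char stack_a[STACK_SIZE], stack_b[STACK_SIZE];
static char trace[16];           /* records which task ran, in order */
static int  trace_len;

static void note(char c) {
    if (trace_len < (int)sizeof trace - 1)
        trace[trace_len++] = c;
}

static void task_a(void) {
    note('A');
    swapcontext(&a_ctx, &b_ctx);   /* save A's registers, load B's */
    note('A');
    swapcontext(&a_ctx, &b_ctx);   /* switch to B again; A is never resumed */
}

static void task_b(void) {
    note('B');
    swapcontext(&b_ctx, &a_ctx);   /* save B's registers, load A's */
    note('B');
    /* returning here resumes uc_link, i.e. main_ctx */
}

/* Run the two tasks, switching back and forth; returns the order they ran in. */
const char *run_tasks(void) {
    memset(trace, 0, sizeof trace);
    trace_len = 0;

    getcontext(&a_ctx);
    a_ctx.uc_stack.ss_sp   = stack_a;
    a_ctx.uc_stack.ss_size = sizeof stack_a;
    a_ctx.uc_link          = &main_ctx;
    makecontext(&a_ctx, task_a, 0);

    getcontext(&b_ctx);
    b_ctx.uc_stack.ss_sp   = stack_b;
    b_ctx.uc_stack.ss_size = sizeof stack_b;
    b_ctx.uc_link          = &main_ctx;
    makecontext(&b_ctx, task_b, 0);

    swapcontext(&main_ctx, &a_ctx);   /* start A; we come back when B returns */
    return trace;
}
```

each task has its own stack, and each `swapcontext` call switches stacks just like the `movl 4(%eax), %esp` line in swtch. the difference is that here the tasks switch voluntarily, so this is the cooperative flavour of a context switch.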
[2] https://hovav.net/ucsd/dist/geometry.pdf → paper on ROP (return-oriented programming)
[3] https://www.usenix.org/legacy/publications/library/proceedings/sd96/full_papers/mcvoy.pdf → paper on benchmarking tools (lmbench)