Fair Share Schedulers: give every process lottery tickets; more tickets mean higher priority.
Guarantee that each process obtains a certain percentage of CPU time
Does not optimise for turnaround or response time
TIP
if you need a mechanism to represent a proportion of ownership, try using tickets
Every so often hold a lottery to decide which job runs next. Higher priority jobs should be given more chances to win (more tickets)
Note
random algs often avoid strange edge cases
random algs are usually lightweight, and maintain little to no state
random algs are useful when very fast decisions are required, so long as generating a random number is fast
Ticket Mechanisms
ticket currency: allow a user with a set of tickets to allocate tickets among their own jobs in whatever currency they like
User A -> 500 (A’s currency) to A1 -> 50 (global currency)
User A -> 500 (A’s currency) to A2 -> 50 (global currency)
User B -> 10 (B’s currency) to B1 -> 100 (global currency)
The system converts local currencies to global and holds its lottery among the global ones (50+50+100=200)
ticket transfer: a process can transfer its tickets temporarily. useful for client-server cases: the client asks the server to do some task and transfers its tickets (it can’t do work in that time anyway)
ticket inflation: if a process needs more cpu time it temporarily boosts its ticket value (only sensible among mutually trusting processes)
Implementation
```c
/* example implementation */
node_t* dispatcher(node_t *head) {
    int counter = 0;
    int winning_ticket = rand() % TOTAL_TICKETS;
    node_t *current = head;
    while (current) {
        counter += current->tickets;
        if (counter > winning_ticket) {
            break; /* found winner */
        }
        current = current->next;
    }
    /* current won: schedule it */
    return current;
}
```
good process list data structure (sorted perhaps?)
total number of tickets
simple ^_^
TIP
knowing which data structure to choose is a hallmark of good engineering. carefully consider its access patterns and frequency of usage
WARNING
we have not discussed how to assign tickets i.e. how to assign priority to a process. this is an open problem
Lottery scheduling can be shown to be very fair over longer job run times, since the randomness averages out across more lotteries
Stride Scheduling
a deterministic approach to fair-share scheduling
basically run the process that has made the least progress so far. progress increments are the inverse of the ticket count
stride: n/tickets, calculated for each process (where n is some large constant)
pass value: progress tracker (incremented by the stride every time the process is run)
```c
curr = remove_min(queue); /* client with min pass value */
schedule(curr);
curr->pass = curr->pass + curr->stride;
insert(queue, curr);
```
NOTE
lottery scheduling has a nice property in that there is no global state. therefore no extra memory, and it makes adding new processes much easier (this is good). if a new process were added to a stride scheduler with its pass value = 0, it would monopolise the cpu (this is bad)
OSX and Windows use some forms of MLFQs3 (unix too if i’m not mistaken4) but Linux sets those aside and chooses to prioritise fairness by trying to model an “ideal, precise multi-tasking CPU”. On an “ideal CPU” two processes would simultaneously consume exactly 50% of the power and run at equal speed.
Linux offers user policies (SCHED_OTHER, SCHED_BATCH, SCHED_IDLE) and real-time policies (SCHED_FIFO, SCHED_RR; these make use of MLFQs). Each thread has a policy associated with it.
The user policies use slightly different versions of the Completely Fair Scheduler and have an associated priority of 0 (real-time policies have priorities from 1-99); instead they make use of nice values to mimic priority
Since a real CPU can only run one task at a time, Linux introduces the concept of vruntime, which tracks the time a task has run so far in nanoseconds
Basic Operation
divide the cpu fairly and evenly among all competing processes, using a counting-based technique called virtual runtime
as each process runs it accumulates vruntime. Each process’s vruntime increases at the same rate, in proportion with real time. When a scheduling decision occurs, CFS picks the process with the lowest vruntime to run next.
Pre-empting
We have discussed how the scheduler chooses which task to run, but when do we return control to the OS? How will the scheduler know when to context switch? How do we define a time quantum?
CFS maintains various control parameters
sched_latency: used to determine how long a process should run before considering a switch. basically creates a dynamic time slice, usually computed as sched_latency/#processes where sched_latency = 48ms. Now what if #processes → ∞? that’s too small a time slice
min_granularity: minimum time slice that can be assigned, typically min_granularity = 6ms
(i don’t know how outdated this data is exactly)
NOTE
time slices only determine when context switches happen; the scheduler still relies on separate periodic timer interrupts from the hardware to precisely update vruntime
EEVDF ( Earliest Eligible Virtual Deadline First)
As of linux 6.6 the kernel developers have begun migrating from CFS to EEVDF5, which is supposedly a slight improvement over CFS. It operates very similarly but introduces the concepts of lag and virtual deadlines
anyway that’s the reason i can’t show the exact values of sched_latency and min_granularity. Be content with what you can see instead
```
# essentially the ideal value of sched_latency/nr_processes
# i.e. the ideal/min time slice
❯ sudo cat /sys/kernel/debug/sched/base_slice_ns
2800000 # 2.8ms
```
(there’s also a guy that hates both of these and wants to use the Brain Fuck Scheduler)6
Weighting (Niceness)
CFS enables control over process priority. nice levels range from −20 to +19, with a default of 0. Each nice value is mapped to a weight that allows us to compute the effective time slice of a process
(i am not sure how true this is, my understanding was that the time slice was simply #processessched_latency though i suppose if they all had the same niceness it would be the same thing)
weight of −20 = 88761, weight of +19 = 15 ⇒ ratio of roughly 5900
thus vruntime is also updated as
vruntime_i = vruntime_i + (weight_0 / weight_i) × runtime_i
Weights
the map mentioned above probably exists for implementation-speed reasons; there is an easier way to remember the weight mappings.
Linux treats 1024 as the base weight (nice 0), and a nice value of n corresponds to a weight of 1024/(1.25)^n
Notice that positive nice values lead to larger vruntimes and the opposite for negative ones
Using Red-Black Trees
Unlike other systems, the CFS run queue is maintained as a Red Black Tree7, which is a balanced binary search tree (like the AVL tree8)
```c
/* for example */
struct t_red_black_node {
    enum { red, black } colour;
    task_struct *process;
    struct t_red_black_node *left, *right, *parent;
};
```
The key for each node is the vruntime, and each time the scheduler has to make a decision it dequeues the minimum-vruntime task, which is the leftmost node (it maintains a special pointer to that location). Once the task has run for its time slice, its vruntime is updated and it is enqueued back on the tree
Here you notice that a higher nice value leads to a higher vruntime which pushes it further back in the RB tree and a lower one keeps it closer to the front
Dealing with I/O and Sleeping Processes
if a process sleeps for a long time, then when it wakes up it will monopolise the CPU :((
instead of letting it keep its same vruntime CFS will instead assign it the vruntime of the minimum process in the RB tree (which only contains running processes) preventing starvation
however, frequently sleeping processes often do not get their fair share of the CPU