Priority Scheduling¹
A good scheduler now needs to do two things:
- minimise response time when necessary
- minimise turnaround time, i.e. get the most processes done per unit of time

and it has to do this without any of the assumptions we made previously. Most importantly, we don't have knowledge of the job length, so we have to estimate the runtime using knowledge of previous runtimes.
Learning From History
The MLFQ is an example of a system that learns from the past to predict the future. Other examples of such systems are hardware branch predictors² and caching algorithms.
Basic Rules
The MLFQ has different queues, each assigned a different priority level. A job on a higher-priority queue runs first; if multiple jobs are on the same queue, i.e. same priority, then round robin (or potentially some other alg) is used among them. I.e.
RULE 1: If Priority(A) > Priority(B), A runs (B doesn’t)
RULE 2: If Priority(A) = Priority(B), A and B run in RR (or potentially some other alg)
Managing priority is key for MLFQ. The scheduler may want to analyse the processes and pass judgement:
- If a job has a lot of I/O ⇒ it frees up the CPU often, so it keeps its high priority
- If it is an intensive background process, its priority is reduced
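Strict priority-first selection (no RR, no preemption, all jobs arriving at t=0) can be sketched as a toy model; the `schedule_by_priority` helper below is illustrative and not part of any real scheduler:

```python
# Toy sketch: run jobs strictly in priority order (higher number = higher
# priority), each to completion. All jobs arrive at time 0, so
# response = start time and turnaround = finish time.

def schedule_by_priority(jobs):
    """jobs: list of (name, run_time, priority) tuples.
    Returns {name: (response, turnaround)}."""
    t = 0
    stats = {}
    for name, run_time, _prio in sorted(jobs, key=lambda j: -j[2]):
        stats[name] = (t, t + run_time)  # (response, turnaround)
        t += run_time
    return stats

# The same job list as the trivial example in these notes.
jobs = [("job0", 10, 2), ("job1", 1, 4), ("job2", 2, 1),
        ("job3", 1, 0), ("job4", 5, 3)]
stats = schedule_by_priority(jobs)
# job1 (priority 4) runs first; job3 (priority 0) runs last.
```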
Consider the trivial example below (no RR)
Job List:
Job 0: startTime 0 - runTime 10 - priority 2
Job 1: startTime 0 - runTime 1 - priority 4
Job 2: startTime 0 - runTime 2 - priority 1
Job 3: startTime 0 - runTime 1 - priority 0
Job 4: startTime 0 - runTime 5 - priority 3
Q4: 1
Q3: 44444
Q2: 0000000000
Q1: 22
Q0: 3
Final statistics:
Job 0: startTime 0 - response 6 - turnaround 16
Job 1: startTime 0 - response 0 - turnaround 1
Job 2: startTime 0 - response 16 - turnaround 18
Job 3: startTime 0 - response 18 - turnaround 19
Job 4: startTime 0 - response 1 - turnaround 6
Avg 4: startTime n/a - response 8.20 - turnaround 12.00

Attempt #1: Allotted Runtime
Create something called an allotment → the time allotted to a process at a certain priority level before it is shifted down³
RULE 3: When a job enters the system, it is placed at the highest priority
RULE 4: If a job uses up its allotment at a given level, its priority is reduced (regardless of how many times it gives up the CPU)
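Rules 3 and 4 amount to a small accounting routine. A minimal sketch, with an illustrative job class and a single global allotment (per-level allotments are also possible):

```python
# Sketch of Rules 3 and 4: new jobs enter the top queue; a job that uses
# up its allotment at a level is demoted one level, regardless of how
# often it gave up the CPU in between. All names here are illustrative.

NUM_LEVELS = 3
ALLOTMENT = 10  # time units a job may consume at one level

class Job:
    def __init__(self, name):
        self.name = name
        self.level = NUM_LEVELS - 1  # Rule 3: start at the highest priority
        self.used = 0                # CPU time consumed at the current level

    def charge(self, ticks):
        """Account CPU time; demote once the allotment is exhausted (Rule 4)."""
        self.used += ticks
        if self.used >= ALLOTMENT and self.level > 0:
            self.level -= 1
            self.used = 0

j = Job("compute-bound")
for _ in range(25):   # 25 ticks of pure CPU use
    j.charge(1)
# After 10 ticks the job drops a level, after 20 it reaches the bottom.
```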
Our MLFQ is still incomplete though, and leaves potential for developers to game the system into giving them more resources.
Starvation
Starvation occurs when there are “too many” high-priority tasks running; this causes them to consume all the CPU time, and lower-priority processes end up never running. Rumor has it that when they shut down the IBM 7094 at MIT in 1973, they found a process submitted in 1967 that had still not been run.
Secure Scheduling
A scheduling policy is also a security concern: if a process is allowed to increase its priority, it can cause a denial-of-service attack and not let anything else run. Linux only allows a non-root process to decrease its priority. Suppose our policy said interactive processes stay at the same level: if attackers simply made their process look more interactive, they could theoretically stay at the highest priority indefinitely and starve your other processes.
Attempt #2: Aging (The Priority Boost)
To avoid starvation, what can we do to help CPU-bound jobs make progress?
RULE 5: The priority boost. After some time period S, move all the jobs in the system to the topmost queue
This guarantees that processes won’t starve: they will eventually run to completion. It also ensures that if a process becomes interactive later on, once boosted it will be treated with the priority it deserves.
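The boost itself is trivial to express. A sketch of Rule 5, where `S`, the job structure, and the `maybe_boost` helper are all illustrative:

```python
# Sketch of Rule 5: every S time units, move every job back to the top
# queue so long-running CPU-bound jobs cannot starve.

S = 100  # boost period (the notoriously hard-to-tune parameter)

def maybe_boost(now, last_boost, jobs, top_level):
    """If at least S time units have passed, boost all jobs to the top
    queue and return the new last-boost time; otherwise return the old one."""
    if now - last_boost >= S:
        for job in jobs:
            job["level"] = top_level
        return now
    return last_boost

jobs = [{"name": "a", "level": 0}, {"name": "b", "level": 1}]
last = maybe_boost(now=150, last_boost=0, jobs=jobs, top_level=2)
# 150 >= S, so both jobs are boosted to level 2.
```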
WARNING
What should the value of S be? This is a difficult question with no simple method to determine a correct answer.
Tuning MLFQ and Other Issues
Some enhancements/variations can be added:

- VARIANT 1: make your time slices longer at lower priority levels
Schedulers often have scheduler classes for different types of processes. The standard user scheduler class is the Time-Sharing Class. Solaris⁴ maintains details of this class in a dispatch table that the user can configure.⁵
# ts_quantum  ts_tqexp  ts_slpret  ts_maxwait  ts_lwait   # PRIORITY LEVEL
       200         0        50          0         50      # 0
       200         0        50          0         50      # 1
       200         0        50          0         50      # 2
       200         0        50          0         50      # 3
       200         0        50          0         50      # 4
       200         0        50          0         50      # 5
       200         0        50          0         50      # 6
       200         0        50          0         50      # 7
       200         0        50          0         50      # 8
       200         0        50          0         50      # 9
       160         0        51          0         51      # 10
       ...
        40        39        58          0         59      # 49
        40        40        58          0         59      # 50
        40        41        58          0         59      # 51
        40        42        58          0         59      # 52
        40        43        58          0         59      # 53
        40        44        58          0         59      # 54
        40        45        58          0         59      # 55
        40        46        58          0         59      # 56
        40        47        58          0         59      # 57
        40        48        58          0         59      # 58
        20        49        59      32000         59      # 59
- ts_quantum: the time slice at each priority level (ms)
- ts_tqexp: the priority level to jump to when the time quantum expires
- ts_slpret: the priority level to jump to when the process returns from sleeping
- ts_maxwait: the maximum time spent in the ready queue before increasing priority (s)
- ts_lwait: the priority level to jump to when aging
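The dispatch table is just a lookup keyed by priority level. A sketch encoding a few rows of the table above as a Python dict (the helper names are illustrative, not Solaris APIs):

```python
# A few rows of the Solaris TS dispatch table, as tuples of
# (ts_quantum ms, ts_tqexp, ts_slpret, ts_maxwait s, ts_lwait),
# keyed by priority level. Values copied from the table above.

TS_TABLE = {
    0:  (200, 0, 50, 0, 50),
    10: (160, 0, 51, 0, 51),
    50: (40, 40, 58, 0, 59),
    59: (20, 49, 59, 32000, 59),
}

def quantum_ms(level):
    """Time slice granted at this level."""
    return TS_TABLE[level][0]

def on_quantum_expiry(level):
    """Quantum ran out: jump to the ts_tqexp level (demotion)."""
    return TS_TABLE[level][1]

def on_sleep_return(level):
    """Process woke from sleeping: jump to the ts_slpret level (promotion)."""
    return TS_TABLE[level][2]
```

Note how low levels get long quanta (200 ms) while high levels get short ones (20 ms), and how sleeping always bounces a process back up near the top.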
Fairness of the Solaris Scheduler
The Solaris scheduler approximates fair allocations by decreasing priority the more a process is scheduled, and aging it the longer it waits
However, due to the default configuration of the table, ts_maxwait = 0, which means every second all processes will have their priority raised. This leads to compute-bound processes being given an unfair advantage.
VARIANT 2: FreeBSD uses a formula to calculate the current priority of a job based on how much CPU the process has used⁶
Based on the niceness and previous CPU time of a process, FreeBSD calculates the new priority of a process. Every 4 clock ticks it is updated as

p_usrpri = PUSER + p_cpu / 4 + 2 * p_nice

where p_cpu is incremented every time the system clock ticks while the process is running. However, we also introduce a decay filter⁷ to essentially give less weight to older CPU runtimes and focus more on the recent CPU utilisation; once per second,

p_cpu = (2 * load / (2 * load + 1)) * p_cpu + p_nice

where load is the average length of the run queue in the last minute. However, interactive processes cannot accumulate CPU time as above, so the scheduler recomputes their priority when they are awoken:

p_cpu = (2 * load / (2 * load + 1))^p_sleeptime * p_cpu

where p_sleeptime is the time the process slept in seconds.
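These formulas can be evaluated directly. A sketch with illustrative numbers (load average 1, niceness 0); PUSER, the base user-mode priority, is 50 in 4.3BSD per the referenced text, and is an assumption here:

```python
# Evaluating the 4.3BSD-style priority/decay formulas with toy numbers.

PUSER = 50  # base user-mode priority (4.3BSD value; assumption here)

def user_priority(p_cpu, p_nice):
    """p_usrpri = PUSER + p_cpu/4 + 2*p_nice (recomputed every 4 ticks)."""
    return PUSER + p_cpu // 4 + 2 * p_nice

def decay(p_cpu, load, p_nice):
    """Once-per-second decay: older CPU usage is forgotten geometrically."""
    return (2 * load) / (2 * load + 1) * p_cpu + p_nice

def wakeup_decay(p_cpu, load, p_sleeptime):
    """Applied on wakeup: one decay step per second slept."""
    return ((2 * load) / (2 * load + 1)) ** p_sleeptime * p_cpu

p = 90.0
for _ in range(3):               # three seconds of decay at load 1
    p = decay(p, load=1, p_nice=0)
# p is now 90 * (2/3)^3 ≈ 26.7: recent usage dominates, old usage fades.
```

Note the decay factor 2·load/(2·load + 1): the busier the machine, the closer it is to 1, so CPU history is forgotten more slowly under heavy load.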
VARIANT 3: some schedulers reserve the topmost levels only for OS processes
VARIANT 4: many allow users to adjust the “niceness” of a process
This type of system takes user advice as a factor when making decisions⁸
❯ man nice
❯ nice --help
Usage: nice [OPTION] [COMMAND [ARG]...]
Run COMMAND with an adjusted niceness, which affects process scheduling.
With no COMMAND, print the current niceness. Niceness values range from
-20 (most favorable to the process) to 19 (least favorable to the process).
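The same knob is reachable programmatically. A small sketch using Python's `os.nice` (Unix only); passing an increment of 0 simply reads the current niceness:

```python
import os

# os.nice(increment) raises this process's niceness by `increment` and
# returns the new value; an increment of 0 just reads the current niceness.
current = os.nice(0)

# Unprivileged processes may only increase niceness (i.e. lower their own
# priority); lowering it back requires privileges, as with the Linux
# restriction discussed earlier.
```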
Summary
Consider this example (same jobs as in the beginning) on an MLFQ with 3 queues, the first with a quantum length of 1, the second with 5, and the third being a standard FIFO queue, with aging disabled (I am also ignoring I/O)
❯ python mlfq.py -M 0 -c -Q 1,5,999 -l 0,10:0,1:0,4:0,5:0,2 -n 3
Job List:
Job 0: startTime 0 - runTime 10 - ioFreq 0
Job 1: startTime 0 - runTime 1 - ioFreq 0
Job 2: startTime 0 - runTime 4 - ioFreq 0
Job 3: startTime 0 - runTime 5 - ioFreq 0
Job 4: startTime 0 - runTime 2 - ioFreq 0
Q2: 01234
Q1: 0000022233334
Q0: 0000
Final statistics:
Job 0: startTime 0 - response 0 - turnaround 22
Job 1: startTime 0 - response 1 - turnaround 2
Job 2: startTime 0 - response 2 - turnaround 13
Job 3: startTime 0 - response 3 - turnaround 17
Job 4: startTime 0 - response 4 - turnaround 18
Avg 4: startTime n/a - response 2.00 - turnaround 14.40

Here’s another example with a priority boost enabled to demonstrate aging. There is a priority boost every 10 units
❯ python mlfq.py -M 0 -c -Q 1,5,999 -l 0,15:0,1:0,6:0,5:0,12 -n 3 -B 10
Job List:
Job 0: startTime 0 - runTime 15 - ioFreq 0
Job 1: startTime 0 - runTime 1 - ioFreq 0
Job 2: startTime 0 - runTime 6 - ioFreq 0
Job 3: startTime 0 - runTime 5 - ioFreq 0
Job 4: startTime 0 - runTime 12 - ioFreq 0
Q2: 01234 0234 0234 4
Q1: 00000 000002 002233 44444
Q0: 444
Final statistics:
Job 0: startTime 0 - response 0 - turnaround 26 # notice the improvement in this process
Job 1: startTime 0 - response 1 - turnaround 2
Job 2: startTime 0 - response 2 - turnaround 28 # at the cost of this
Job 3: startTime 0 - response 3 - turnaround 30 # and this
Job 4: startTime 0 - response 4 - turnaround 39
Avg 4: startTime n/a - response 2.00 - turnaround 25.00

Compared to a priority boost every 25 units
Q2: 01234 04
Q1: 0000022222333344444 0000044444
Q0: 0 00
Final statistics:
Job 0: startTime 0 - response 0 - turnaround 39
Job 1: startTime 0 - response 1 - turnaround 2
Job 2: startTime 0 - response 2 - turnaround 15
Job 3: startTime 0 - response 3 - turnaround 19
Job 4: startTime 0 - response 4 - turnaround 37
Avg 4: startTime n/a - response 2.00 - turnaround 22.40

Versus the non-priority-boosted version
Q2: 01234
Q1: 0000022222333344444
Q0: 000000000444444
Final statistics:
Job 0: startTime 0 - response 0 - turnaround 33
Job 1: startTime 0 - response 1 - turnaround 2
Job 2: startTime 0 - response 2 - turnaround 15
Job 3: startTime 0 - response 3 - turnaround 19
Job 4: startTime 0 - response 4 - turnaround 39
Avg 4: startTime n/a - response 2.00 - turnaround 21.60

Our processes are not particularly interactive, so they don’t benefit from being in the highest queue with a small quantum, leading to higher turnaround times.
- Rule 1: If Priority(A) > Priority(B), A runs (B doesn’t).
- Rule 2: If Priority(A) = Priority(B), A & B run in round-robin fashion using the time slice (quantum length) of the given queue.
- Rule 3: When a job enters the system, it is placed at the highest priority (the topmost queue).
- Rule 4: Once a job uses up its time allotment at a given level (regardless of how many times it has given up the CPU), its priority is reduced (i.e., it moves down one queue).
- Rule 5: After some time period S, move all the jobs in the system to the topmost queue.
Footnotes
1. https://www.youtube.com/watch?v=57r9OCN1EfA → video for everything discussed here ↩
2. https://medium.com/@s.ruban2000/branch-prediction-the-silent-clairvoyant-living-inside-the-processor-959557b0fbf5 ↩
3. different queues/priority levels can have their own different allotted times ↩
4. https://pages.cs.wisc.edu/~remzi/solaris-notes.pdf → Solaris scheduling implementation ↩
5. https://pages.cs.wisc.edu/~dusseau/Classes/CS537-F04/Questions/q7.html → some questions ↩
6. https://www.crystallabs.io/unix-books-papers-videos/The-Design-and-Implementation-of-the-4.3BSD-Unix-OS.pdf pg 87 ↩
7. https://dl.acm.org/doi/pdf/10.1145/223587.223597 → more decay filters for nerds ↩
8. https://www.cs.columbia.edu/~nieh/teaching/e6118_s00/papers/p79-patterson.pdf → an argument for making the OS listen to the user more: “informed prefetching and caching” ↩