OS

Priority Scheduling1

a good scheduler now needs to do two things:

  1. minimise response time when necessary
  2. minimise turnaround time, i.e. get the most processes done per unit of time, and do all this without any of the assumptions we made previously

Most importantly, we no longer have knowledge of the job length. So what we have to do is estimate a job's runtime using knowledge of its previous runtimes.
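one common way to do this (a sketch, not something specified in these notes; the alpha parameter and the initial guess are my own assumptions) is an exponentially weighted average of a job's past runtimes, so recent runs count more than old ones

```python
def update_estimate(prev_estimate, last_runtime, alpha=0.5):
    # exponentially weighted average: alpha controls how fast old history fades
    return alpha * last_runtime + (1 - alpha) * prev_estimate

est = 10.0                      # initial guess, chosen arbitrarily
for observed in [8, 6, 4, 4]:   # the job's last few observed runtimes
    est = update_estimate(est, observed)
# est is now 4.875, dominated by the recent short runtimes
```

with alpha = 1 the estimate is just the most recent runtime; with alpha = 0 the estimate never changes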

Learning From History

The MLFQ is an example of a system that learns from the past to predict the future. Other examples of such systems are hardware branch predictors2 and caching algorithms

Basic Rules

different queues are each assigned a different priority level, and higher-priority jobs run first. if multiple jobs are on the same queue, i.e. at the same priority, then round robin (or potentially some other alg) is used between them. i.e.

RULE 1: if Priority(A) > Priority(B), A runs (B doesn’t)

RULE 2: If Priority(A) = Priority(B), A and B run in RR (or potentially some other alg)

managing priority is key for mlfq.

it may want to analyse the processes and pass judgement:

  • If a job does a lot of i/o and so frees up the cpu often, it keeps its high priority
  • If it is a compute-intensive background process, its priority is reduced
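a minimal sketch of how Rules 1 and 2 might be implemented (the queue layout and names here are illustrative assumptions, not taken from a real scheduler):

```python
from collections import deque

NUM_LEVELS = 5
queues = [deque() for _ in range(NUM_LEVELS)]  # index = priority; higher index wins

def pick_next():
    # Rule 1: the highest non-empty queue runs first
    for level in range(NUM_LEVELS - 1, -1, -1):
        if queues[level]:
            job = queues[level].popleft()
            queues[level].append(job)  # Rule 2: rotate to the back (round robin)
            return job
    return None

queues[4].append("B")   # high priority
queues[2].append("A")   # lower priority
first = pick_next()     # "B": the higher-priority job runs while "A" waits
```

note that as long as anything sits in a higher queue, "A" never runs at all, which is exactly the starvation problem discussed later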

Consider the trivial example below (no RR)

Job List:
  Job  0: startTime   0 - runTime  10 - priority   2
  Job  1: startTime   0 - runTime   1 - priority   4
  Job  2: startTime   0 - runTime   2 - priority   1
  Job  3: startTime   0 - runTime   1 - priority   0
  Job  4: startTime   0 - runTime   5 - priority   3
 
Q4: 1
Q3:  44444
Q2:       0000000000
Q1:                 22
Q0:                   3
 
Final statistics:
  Job  0: startTime   0 - response   6 - turnaround  16
  Job  1: startTime   0 - response   0 - turnaround   1
  Job  2: startTime   0 - response  16 - turnaround  18
  Job  3: startTime   0 - response  18 - turnaround  19
  Job  4: startTime   0 - response   1 - turnaround   6
 
  Avg  4: startTime n/a - response 8.20 - turnaround 12.00

Attempt #1: Allotted Runtime

create something called an allotment: the time allotted to a process at a certain priority level before it is shifted down3

RULE 3: When a job enters the system, it is placed at the highest priority

RULE 4: If a job uses up its allotment at a given level, its priority is reduced (regardless of how many times it gives up the CPU)
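Rule 4’s allotment bookkeeping could be sketched like this (the ALLOTMENT value and the job representation are made up for illustration):

```python
ALLOTMENT = 10  # time units a job may use at one level (an assumed value)

def charge(job, used):
    # Rule 4: charge CPU time against the allotment; demote once it is spent,
    # no matter how many separate slices it took to spend it
    job["used"] += used
    if job["used"] >= ALLOTMENT and job["level"] > 0:
        job["level"] -= 1
        job["used"] = 0  # fresh allotment at the new level

job = {"level": 2, "used": 0}
for slice_len in [4, 4, 4]:  # three short slices still add up to a demotion
    charge(job, slice_len)
```

tracking total usage rather than per-slice usage is what stops a process gaming the scheduler by yielding just before its slice expires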

our mlfq is still incomplete though, and leaves potential for developers to game the system into giving their processes more resources than they deserve

Starvation

Starvation occurs when there are “too many” high-priority tasks running; they consume all the CPU time, and lower-priority processes end up never running. Rumour has it that when they shut down the IBM 7094 at MIT in 1973, they found a process submitted in 1967 that had still not been run

Secure Scheduling

a scheduling policy is also a security concern: if a process is allowed to increase its own priority, it can mount a denial-of-service attack and not let anything else run. Linux only allows a non-root process to decrease its priority. Suppose we said interactive processes stay at the same level in our policy: if attackers simply made their process look more interactive, they could theoretically stay at the highest priority indefinitely and starve your other processes

Attempt #2: Aging (The Priority Boost)

to avoid starvation, what can we do to help CPU-bound jobs make progress?

RULE 5: The priority boost. After some time period S, move all the jobs in the system to the topmost queue

this guarantees that processes won’t starve: they will eventually run to completion. It also ensures that if a process becomes interactive later on, once boosted it will be treated with the priority it deserves
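Rule 5 itself is simple; a sketch (queue layout assumed as before, highest index = top priority):

```python
queues = [["C"], ["B"], ["A"]]  # index 2 is the topmost queue

def boost(queues):
    # Rule 5: every S time units, move every job back to the topmost queue
    top = queues[-1]
    for q in queues[:-1]:
        top.extend(q)
        q.clear()

boost(queues)
# all jobs now sit in queues[-1]; the lower queues are empty
```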

WARNING

What should the value of S be? this is a difficult question with no simple method to determine a correct answer

Tuning MLFQ and Other Issues

some enhancements/variations can be added

  • VARIANT 1: make your time slices longer at lower priority levels

    schedulers often have scheduler classes for different types of processes. The standard user scheduler class is the Time-Sharing Class. Solaris4 maintains details of this class in a dispatch table that the user can configure.5

    # ts_quantum ts_tqexp ts_slpret ts_maxwait ts_lwait  # PRIORITY LEVEL
            200      0        50        0          50    #    0
            200      0        50        0          50    #    1
            200      0        50        0          50    #    2
            200      0        50        0          50    #    3
            200      0        50        0          50    #    4
            200      0        50        0          50    #    5
            200      0        50        0          50    #    6
            200      0        50        0          50    #    7
            200      0        50        0          50    #    8
            200      0        50        0          50    #    9
            160      0        51        0          51    #    10
            ...
            40      39        58        0          59    #    49
            40      40        58        0          59    #    50
            40      41        58        0          59    #    51
            40      42        58        0          59    #    52
            40      43        58        0          59    #    53
            40      44        58        0          59    #    54
            40      45        58        0          59    #    55
            40      46        58        0          59    #    56
            40      47        58        0          59    #    57
            40      48        58        0          59    #    58
            20      49        59      32000        59    #    59
    • ts_quantum: the time slice at each priority level (ms)

    • ts_tqexp: the priority level to jump to when the time quantum expires

    • ts_slpret: the priority level to jump to when the process returns from sleeping

    • ts_maxwait: the maximum time spent in the ready queue before increasing priority (s)

    • ts_lwait: the priority level to jump to when aging

    Fairness of the Solaris Scheduler

    The Solaris scheduler approximates fair allocations by decreasing priority the more a process is scheduled, and aging it the longer it waits

    However, due to the default configuration of the table, ts_maxwait = 0, which means every second all processes have their priority raised. This leads to compute-bound processes being given an unfair advantage
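to make the table’s semantics concrete, here is a hypothetical lookup over two rows copied from the listing above:

```python
# (quantum_ms, ts_tqexp, ts_slpret, ts_maxwait_s, ts_lwait), keyed by level
table = {
    0:  (200, 0, 50, 0, 50),
    50: (40, 40, 58, 0, 59),
}

def on_quantum_expired(level):
    return table[level][1]  # ts_tqexp: where to go after burning a full quantum

def on_wakeup(level):
    return table[level][2]  # ts_slpret: where to go after returning from sleep
```

so a level-50 job that burns its 40 ms quantum drops to level 40, while a level-0 job that wakes from sleep jumps straight to level 50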

  • VARIANT 2: FreeBSD uses a formula to calculate the current priority of a job based on how much CPU the process has used6

    Based on the niceness and previous CPU time of a process, FreeBSD calculates its new priority. Every 4 clock ticks it is updated as

        p_usrpri = PUSER + p_cpu/4 + 2 * p_nice

    where p_cpu is incremented every time the system clock ticks while the process is running. However, we also introduce a decay filter7 to give less weight to older CPU runtimes and focus more on recent CPU utilisation: once per second,

        p_cpu = ((2 * load) / (2 * load + 1)) * p_cpu + p_nice

    where load is the average length of the run queue over the last minute. Interactive processes cannot accumulate CPU time as above while they sleep, so the scheduler recomputes their priority when they are awoken:

        p_cpu = ((2 * load) / (2 * load + 1))^p_sleeptime * p_cpu

    where p_sleeptime is the time the process slept in seconds
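the update rules above translate almost directly into code (PUSER is the base user-mode priority; the value 50 here is a placeholder, and the integer-truncation details of the real C implementation are glossed over):

```python
PUSER = 50  # base user-mode priority (placeholder value)

def priority(p_cpu, p_nice):
    # recomputed every 4 clock ticks while the process runs
    return PUSER + p_cpu // 4 + 2 * p_nice

def decay(p_cpu, load, p_nice):
    # applied once per second: older CPU usage fades away
    return (2 * load) / (2 * load + 1) * p_cpu + p_nice

def wakeup(p_cpu, load, p_sleeptime):
    # applied when a process wakes: one decay step per second slept
    return ((2 * load) / (2 * load + 1)) ** p_sleeptime * p_cpu
```

with load = 1 each decay step keeps two thirds of the accumulated CPU time, so a burst of CPU usage is mostly forgotten within a few seconds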

  • VARIANT 3: some systems reserve the topmost priority levels for OS processes only

  • VARIANT 4: many allow users to adjust the “niceness” of a process

    This type of system takes user advice as a factor when making decisions8

     man nice
     nice --help
    Usage: nice [OPTION] [COMMAND [ARG]...]
    Run COMMAND with an adjusted niceness, which affects process scheduling.
    With no COMMAND, print the current niceness.  Niceness values range from
    -20 (most favorable to the process) to 19 (least favorable to the process).
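the same knob is reachable from Python via os.nice, which adds an increment to the current niceness and returns the new value (Unix only):

```python
import os

before = os.nice(0)   # increment of 0: just read the current niceness
after = os.nice(5)    # ask the scheduler to deprioritise us by 5
# an unprivileged process could not nice itself back down afterwards
```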

Summary

Consider this example on an MLFQ with 3 queues: the first with a quantum length of 1, the second with 5, and the third a standard FIFO queue, with aging disabled (i am also ignoring i/o)

 python mlfq.py -M 0 -c -Q 1,5,999 -l 0,10:0,1:0,4:0,5:0,2 -n 3
Job List:
  Job  0: startTime   0 - runTime  10 - ioFreq   0
  Job  1: startTime   0 - runTime   1 - ioFreq   0
  Job  2: startTime   0 - runTime   4 - ioFreq   0
  Job  3: startTime   0 - runTime   5 - ioFreq   0
  Job  4: startTime   0 - runTime   2 - ioFreq   0
 
Q2: 01234
Q1:      0000022233334
Q0:                   0000
 
Final statistics:
  Job  0: startTime   0 - response   0 - turnaround  22
  Job  1: startTime   0 - response   1 - turnaround   2
  Job  2: startTime   0 - response   2 - turnaround  13
  Job  3: startTime   0 - response   3 - turnaround  17
  Job  4: startTime   0 - response   4 - turnaround  18
 
  Avg  4: startTime n/a - response 2.00 - turnaround 14.40

Here’s another example with a priority boost enabled to demonstrate aging. There is a priority boost every 10 units

 python mlfq.py -M 0 -c -Q 1,5,999 -l 0,15:0,1:0,6:0,5:0,12 -n 3 -B 10
Job List:
  Job  0: startTime   0 - runTime  15 - ioFreq   0
  Job  1: startTime   0 - runTime   1 - ioFreq   0
  Job  2: startTime   0 - runTime   6 - ioFreq   0
  Job  3: startTime   0 - runTime   5 - ioFreq   0
  Job  4: startTime   0 - runTime  12 - ioFreq   0
 
Q2: 01234     0234      0234      4
Q1:      00000    000002    002233 44444
Q0:                                     444
 
Final statistics:
  Job  0: startTime   0 - response   0 - turnaround  26 # notice the improvement in this process
  Job  1: startTime   0 - response   1 - turnaround   2
  Job  2: startTime   0 - response   2 - turnaround  28 # at the cost of this
  Job  3: startTime   0 - response   3 - turnaround  30 # and this
  Job  4: startTime   0 - response   4 - turnaround  39
 
  Avg  4: startTime n/a - response 2.00 - turnaround 25.00

Compared to a priority boost every 25 units

Q2: 01234                    04
Q1:      0000022222333344444   0000044444
Q0:                         0            00
 
Final statistics:
  Job  0: startTime   0 - response   0 - turnaround  39
  Job  1: startTime   0 - response   1 - turnaround   2
  Job  2: startTime   0 - response   2 - turnaround  15
  Job  3: startTime   0 - response   3 - turnaround  19
  Job  4: startTime   0 - response   4 - turnaround  37
 
  Avg  4: startTime n/a - response 2.00 - turnaround 22.40

Versus the version without a priority boost

Q2: 01234
Q1:      0000022222333344444
Q0:                         000000000444444
 
Final statistics:
  Job  0: startTime   0 - response   0 - turnaround  33
  Job  1: startTime   0 - response   1 - turnaround   2
  Job  2: startTime   0 - response   2 - turnaround  15
  Job  3: startTime   0 - response   3 - turnaround  19
  Job  4: startTime   0 - response   4 - turnaround  39
 
  Avg  4: startTime n/a - response 2.00 - turnaround 21.60

Our processes are not particularly interactive, so they don’t benefit from being in the highest queue with a small quantum, leading to higher turnaround times


  • Rule 1: If Priority(A) > Priority(B), A runs (B doesn’t).
  • Rule 2: If Priority(A) = Priority(B), A & B run in round-robin fashion using the time slice (quantum length) of the given queue.
  • Rule 3: When a job enters the system, it is placed at the highest priority (the topmost queue).
  • Rule 4: Once a job uses up its time allotment at a given level (regardless of how many times it has given up the CPU), its priority is reduced (i.e., it moves down one queue).
  • Rule 5: After some time period S, move all the jobs in the system to the topmost queue.

Footnotes

  1. https://www.youtube.com/watch?v=57r9OCN1EfA video for everything discussed here

  2. https://medium.com/@s.ruban2000/branch-prediction-the-silent-clairvoyant-living-inside-the-processor-959557b0fbf5

  3. different queues/priority levels can have their own different allotted times

  4. https://pages.cs.wisc.edu/~remzi/solaris-notes.pdf Solaris scheduling implementation

  5. https://pages.cs.wisc.edu/~dusseau/Classes/CS537-F04/Questions/q7.html some questions

  6. https://www.crystallabs.io/unix-books-papers-videos/The-Design-and-Implementation-of-the-4.3BSD-Unix-OS.pdf pg 87

  7. https://dl.acm.org/doi/pdf/10.1145/223587.223597 more decay filters for nerds

  8. https://www.cs.columbia.edu/~nieh/teaching/e6118_s00/papers/p79-patterson.pdf an argument for making the os listen to the user more. “informed prefetching and caching”