OS

Concurrency: Introduction

So far we’ve seen processes and how they are isolated with their own virtual address spaces. We now introduce threads: a multi-threaded process has multiple points of execution, i.e. more than one program counter. Unlike separate processes, all the threads of a process share the same address space

Threads

The state of a single thread is very similar to a process, each thread has its own

  • program counter
  • set of registers
  • stack
  • and a Thread Control Block (TCB) where this state is saved on every context switch

Since every thread has its own stack, per-thread “global” variables, such as the stack address, must be maintained as well; these live in Thread Local Storage (TLS)

Why

The two main purposes threads serve are

  1. Parallelisation: on multiprocessor systems, considerable speedup can be achieved by splitting your program into threads, each running concurrently on a different processor
  2. Non-blocking I/O: blocking calls such as those for I/O can be mitigated with threads. While one thread waits, the others can continue working.

Creation

#include <pthread.h>
#include <stdio.h>
 
void *mythread(void *arg) {
  printf("%s\n", (char *)arg);
  return NULL;
}
 
int main(int argc, char **argv) {
  pthread_t p1, p2;
  char a[] = "p1: A";
  char b[] = "p2: B";
 
  printf("main: begin\n");
 
  pthread_create(&p1, NULL, mythread, a);
  pthread_create(&p2, NULL, mythread, b);
 
  pthread_join(p1, NULL);
  pthread_join(p2, NULL);
 
  printf("main: end\n");
 
  return 0;
}

The main program creates two threads, both of which run the subroutine mythread with different arguments. Once a thread is created it may run immediately or be placed in a ready queue, depending on the scheduler; the two threads may even be placed on different processors and run simultaneously

The main thread then calls pthread_join(), which blocks until the given thread has completed; only after both joins return does main continue

 build create.c thread
Compilation successful: thread
main: begin
p2: B
p1: A
main: end
 
 build create.c thread
Compilation successful: thread
main: begin
p1: A
p2: B
main: end

As you can see, which thread runs first is not deterministic; that is entirely up to the scheduler

A possible execution trace could be:

main                  thread 1              thread 2
create T1
create T2
wait for T1
                      runs and returns
wait for T2
                                            runs and returns
end

Another possibility could be:

main                  thread 1              thread 2
create T1
create T2
                                            runs and returns
                      runs and returns
wait for T1
wait for T2
end

Or:

main                  thread 1              thread 2
create T1
                      runs and returns
create T2
                                            runs and returns
wait for T1
wait for T2
end

Shared Data

Consider two threads incrementing a shared variable 10000000 times each

#include <pthread.h>
#include <stdio.h>
 
static volatile int amount = 0;
 
void *counter(void *id) {
  char *c_id = (char *)id;
  printf("%s: begin\n", c_id);
  for (int i = 0; i < 1e7; i++) {
    amount = amount + 1;
  }
  printf("%s: end\n", c_id);
  return NULL;
}
 
int main(int argc, char **argv) {
  pthread_t p1, p2;
  char a[] = "p1";
  char b[] = "p2";
 
  printf("main: begin\n");
  printf("main: amount = %d\n", amount);
 
  pthread_create(&p1, NULL, counter, a);
  pthread_create(&p2, NULL, counter, b);
 
  pthread_join(p1, NULL);
  pthread_join(p2, NULL);
 
  printf("main: end\n");
  printf("main: amount = %d\n", amount);
 
  return 0;
}

However, running this gives us strange results

 build counter.c thread
Compilation successful: thread
main: begin
main: amount = 0
p1: begin
p2: begin
p1: end
p2: end
main: end
main: amount = 10414799
 
 build counter.c thread
Compilation successful: thread
main: begin
main: amount = 0
p1: begin
p2: begin
p2: end
p1: end
main: end
main: amount = 10261149
 
 build counter.c thread
Compilation successful: thread
main: begin
main: amount = 0
p1: begin
p2: begin
p1: end
p2: end
main: end
main: amount = 11618395

We get massively incorrect, and different, answers each run.

Preemptive Scheduling

The reason for this is that the simple increment statement amount = amount + 1 actually compiles to three machine instructions

mov eax, dword [obj.amount] ; counter.c:10:14    amount = amount + 1; ; [0x403c:4]=0
add eax, 1                  ; counter.c:10:21    amount = amount + 1;
mov dword [obj.amount], eax ; counter.c:10:12    amount = amount + 1; ; [0x403c:4]=0

First the value of amount is loaded into the eax register, then the register is incremented, and finally the value is stored back into amount

If the scheduler performs a context switch just before the register is written back into amount, any increments the other thread makes in the meantime are overwritten when the stale value is finally stored. These lost updates are why we consistently get a result less than 20000000

We can see that we always want these three machine instructions to run together, safe from scheduler preemption. Such a stretch of code, which accesses a shared resource and must not be executed by more than one thread at a time, is called a Critical Section. When multiple threads run it simultaneously, the result depends on the exact timing of their interleaving, a situation called a Race Condition.

When we have two threads that could potentially run simultaneously, we want to achieve Mutual Exclusion: a guarantee that at most one thread is inside the critical section at any time. Note that this does not forbid preemption itself, only another thread entering the critical section while the first is midway through it.

Atomicity

What we need, to achieve mutual exclusion, is Atomicity, i.e. an all-or-nothing guarantee where either the whole increment happens or none of it does.

A single machine instruction is truly atomic: at any given time it has either executed or it has not, and it cannot be stopped midway. Our increment, however, spans three instructions, which hints that to achieve true atomicity we need hardware support, such as special instructions that perform an entire read-modify-write in one step.

Based on that support we can build our Synchronisation Primitives to solve the general mutual exclusion problem

Waiting

We have seen the problems associated with shared memory; however, there are other ways for threads to interact too. For example, one thread may need to wait for another before proceeding, as when one thread performs disk I/O while the others do productive work in the meantime