Concurrency: Introduction
So far we’ve seen processes and how they are isolated in their own virtual address spaces. We now introduce threads, which give a process multiple points of execution, i.e. more than one program counter. All the threads of a process share the same address space
Threads
The state of a single thread is very similar to that of a process; each thread has its own
- program counter
- set of registers
- stack
- and a Thread Control Block (TCB), where this state is saved on every context switch
Because each thread has its own stack, per-thread global variables, such as the stack address, must be maintained separately for each thread; this is done in Thread Local Storage (TLS)
Why
The main purposes threads serve are
- Parallelisation: on multiprocessor systems, considerable speedup can be achieved by running your program as different threads each running concurrently on a different processor
- Non-Blocking I/O: blocking calls, such as those for I/O, can be mitigated with threads. While one thread waits, the others can continue working.
Creation
```c
#include <pthread.h>
#include <stdio.h>

void *mythread(void *arg) {
    printf("%s\n", (char *)arg);
    return NULL;
}

int main(int argc, char **argv) {
    pthread_t p1, p2;
    char a[] = "p1: A";
    char b[] = "p2: B";
    printf("main: begin\n");
    pthread_create(&p1, NULL, mythread, a);
    pthread_create(&p2, NULL, mythread, b);
    pthread_join(p1, NULL);
    pthread_join(p2, NULL);
    printf("main: end\n");
    return 0;
}
```
The main program creates two threads, both of which run the subroutine mythread with different arguments. Once a thread is created, it may run immediately or be put in a ready queue, depending on the scheduler. Multiple threads may even be put on different processors and run simultaneously
It then waits for both threads to complete, using pthread_join(), before the main thread continues
```
❯ build create.c thread
Compilation successful: thread
main: begin
p2: B
p1: A
main: end
❯ build create.c thread
Compilation successful: thread
main: begin
p1: A
p2: B
main: end
```
As you can see, it is not deterministic which thread runs first; that’s entirely up to the scheduler
A possible execution trace could be:
| main | thread 1 | thread 2 |
|---|---|---|
| create T1 | | |
| create T2 | | |
| wait for T1 | | |
| | runs and returns | |
| wait for T2 | | |
| | | runs and returns |
| end | | |
Another possibility could be:
| main | thread 1 | thread 2 |
|---|---|---|
| create T1 | | |
| create T2 | | |
| | | runs and returns |
| | runs and returns | |
| wait for T1 | | |
| wait for T2 | | |
| end | | |
Or:
| main | thread 1 | thread 2 |
|---|---|---|
| create T1 | | |
| | runs and returns | |
| create T2 | | |
| | | runs and returns |
| wait for T1 | | |
| wait for T2 | | |
| end | | |
Shared Data
Consider two threads incrementing a shared variable 10000000 times each
```c
#include <pthread.h>
#include <stdio.h>

static volatile int amount = 0;

void *counter(void *id) {
    char *c_id = (char *)id;
    printf("%s: begin\n", c_id);
    for (int i = 0; i < 10000000; i++) {
        amount = amount + 1;
    }
    printf("%s: end\n", c_id);
    return NULL;
}

int main(int argc, char **argv) {
    pthread_t p1, p2;
    char a[] = "p1";
    char b[] = "p2";
    printf("main: begin\n");
    printf("main: amount = %d\n", amount);
    pthread_create(&p1, NULL, counter, a);
    pthread_create(&p2, NULL, counter, b);
    pthread_join(p1, NULL);
    pthread_join(p2, NULL);
    printf("main: end\n");
    printf("main: amount = %d\n", amount);
    return 0;
}
```
However, running this gives us strange results
```
❯ build counter.c thread
Compilation successful: thread
main: begin
main: amount = 0
p1: begin
p2: begin
p1: end
p2: end
main: end
main: amount = 10414799
❯ build counter.c thread
Compilation successful: thread
main: begin
main: amount = 0
p1: begin
p2: begin
p2: end
p1: end
main: end
main: amount = 10261149
❯ build counter.c thread
Compilation successful: thread
main: begin
main: amount = 0
p1: begin
p2: begin
p1: end
p2: end
main: end
main: amount = 11618395
```
We get massively incorrect, and different, answers each run.
Preemptive Scheduling
The reason for this is that the simple increment statement amount = amount + 1 actually compiles to three machine instructions
```asm
mov eax, dword [obj.amount]   ; load amount into eax
add eax, 1                    ; increment the register
mov dword [obj.amount], eax   ; store eax back into amount
```
First the value of amount is loaded into the eax register, then the register is incremented, and finally the value is stored back into amount
If the scheduler performs a context switch after a thread has loaded and incremented the register but before it stores the result back, the other thread can update amount in between; when the first thread finally writes its register back, it overwrites that update. These lost increments are why we consistently get a result less than 20000000
We can see that we always want these three machine instructions to run together, safe from scheduler preemption. We call such code a Critical Section. When multiple threads run this same code simultaneously, the result depends on the order in which they happen to be scheduled; this is called a Race Condition, and it is what makes the section critical.
When we have two threads that could potentially run simultaneously, we want to achieve Mutual Exclusion, i.e. guarantee that their critical sections never execute at the same time.
Atomicity
What we need, to achieve mutual exclusion, is Atomicity, i.e. an all-or-nothing guarantee: either the increment completes in full or it has no effect at all.
Individual machine instructions are truly atomic: at any given time an instruction has either executed or it has not; it cannot be stopped in the middle of execution. This hints that to make a multi-instruction sequence atomic we need some hardware support.
Based on that support we can build our Synchronisation Primitives to solve the general mutual exclusion problem
Waiting
We have seen the problems associated with shared memory, but threads interact in other ways too: one thread may need to wait for another before proceeding. For example, while one thread performs disk I/O, another can do other productive work.