= Mutexes and process suspension in kernels
Kamil Kowalczyk
2026-01-30
:jbake-type: post
:jbake-tags: MOP2 osdev
:jbake-status: published

Hello!

In this blog post I would like to share my thoughts and some implementation details about the *mutex*
mechanism in *MOP3*!

WAIT! MOP *3*?

Yes! MOP2 is undergoing a huge rewrite and redesign right now, because I wasn't happy with **some**
design choices of the system. This is mostly about writing SMP-first/SMP-friendly code. The rewrite
turned out to be so big that it was just easier to start a new project, while reusing big portions
of older MOP2 code - and thus MOP3 was born. Expect more interesting updates in the future!

New GIT repository: https://git.kamkow1lair.pl/kamkow1/MOP3

== Multitasking is delicate

=== Preface

This portion of the article explains critical sections and why they're so fragile. If you're an
advanced reader, you can easily skip it.

Writing multitasked/parallel code is notoriously difficult and can lead to bizarre, awful and nondeterministic
bugs. Keep in mind that multitasked code is not something you can reliably debug with normal developer tools
like `gdb` or `lldb` or some other crash analyzers and code inspectors. Instead, such code should be
proven to be safe/correct, either by carefully reading the code or by writing formal proofs.

Formal proofs can be written using the Promela modeling language: https://spinroot.com/spin/whatispin.html

=== The problem

Imagine this problem in a multitasked application:

.A disaster happens here
[source,c]
----
struct my_list_node {
    struct my_list_node *next, *prev;
    int value;
};

/* These are complex operations on multiple pointers */
#define LIST_ADD(list, x) ...
#define LIST_REMOVE(list, x) ...

struct my_list_node* my_list = ...;

void thread_A (void) {
    struct my_list_node* new_node = malloc (sizeof (*new_node));
    new_node->value = 67;

    LIST_ADD (my_list, new_node);
}

void thread_B (void) {
    LIST_REMOVE (my_list, my_list); /* remove head */
}
----

The issue here is that, while `thread_A` is adding a new node, `thread_B` can try to remove it. This will
lead to pointer corruption, since `LIST_REMOVE` and `LIST_ADD` will try to re-link the list's nodes at the
same time.

Such fragile sections of code are called "critical sections", meaning that one has to be VERY careful when
tasks enter them. So where are they in the code?

[source,c]
----
void thread_A (void) {
    /* NOT CRITICAL - new_node is not yet visible to other tasks */
    struct my_list_node* new_node = malloc (sizeof (*new_node));
    new_node->value = 67;

    /* CRITICAL */
    LIST_ADD (my_list, new_node);
}

void thread_B (void) {
    /* CRITICAL */
    LIST_REMOVE (my_list, my_list); /* remove head */
}
----

=== So how do we solve this?

This specific problem can be easily solved by what's called "Mutual Exclusion", or "MutEx" for short.

Why "mutual exclusion"? As you can see, to modify the list, a task needs *exclusive* access to it, ie.
nobody else can be using the list at the same time. Kind of like a bathroom - imagine someone happily walks
in while you're sitting on the toilet and tries to fit right beside you.

==== A poor man's mutex and atomicity

Let's park it here and move to a slightly different, but related, topic - atomicity. What is it?

To be atomic is to be indivisible. An atomic read or write is an operation that can't be split up
and has to be completed from start to finish without being interrupted.

While atomic operations are mutually exclusive, they work on the level of individual reads/writes of integers
of various sizes. One can atomically read/write a byte or a uint64_t. This of course won't solve our problem,
because we're operating on many integers (here pointers) simultaneously - struct my_list_node's prev and next
pointers - but despite that, we can leverage atomicity to implement higher-level synchronization constructs.

BTW, documentation on atomic operations in C can be found here: https://en.cppreference.com/w/c/atomic.html.
The functions/macros are implemented internally by the compiler, eg. GCC atomics: https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html.
They have to be implemented internally, because they're very platform dependent and each architecture may handle
atomicity slightly differently, although the same rules apply everywhere.

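To make the "individual integer" part concrete, here is a tiny standalone example (not MOP3 code - just plain C with pthreads, assuming a POSIX userspace) of the kind of operation that IS atomic: a read-modify-write of a single integer. Two threads hammer the same counter, yet no increment is ever lost:

[source,c]
----
/* Illustrative only - assumes a POSIX system with pthreads. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int counter = 0;

static void* worker (void* arg) {
    (void)arg;
    for (int i = 0; i < 1000000; i++)
        atomic_fetch_add (&counter, 1); /* indivisible read-modify-write */
    return NULL;
}

int main (void) {
    pthread_t a, b;
    pthread_create (&a, NULL, worker, NULL);
    pthread_create (&b, NULL, worker, NULL);
    pthread_join (a, NULL);
    pthread_join (b, NULL);

    /* always 2000000; with a plain int and counter++ you'd likely see less */
    printf ("%d\n", atomic_load (&counter));
    return 0;
}
----
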
Now that we've discussed C atomics, let's implement the poor man's mutex!

What needs to be done? We've established that only one task can be using the list at a time. We'd probably want
some sort of flag to know if it's our turn to use the list. If the flag is set, it means that someone is
currently using the list and thus we have to wait. If clear, we're good to go.

Obviously, the flag has to be atomic itself. Otherwise one task could read the flag while it's being updated,
leading to a thread entering the critical section when it's not its turn.

[source,c]
----
#include <stdatomic.h>

#define SPIN_LOCK_INIT ATOMIC_FLAG_INIT

typedef atomic_flag spin_lock_t;

void spin_lock (spin_lock_t* sl) {
    while (atomic_flag_test_and_set_explicit (sl, memory_order_acquire))
        ;
}

void spin_unlock (spin_lock_t* sl) {
    atomic_flag_clear_explicit (sl, memory_order_release);
}

// List same as before...

// the lock
spin_lock_t the_lock = SPIN_LOCK_INIT;

void thread_A (void) {
    struct my_list_node* new_node = malloc (sizeof (*new_node));
    new_node->value = 67;

    spin_lock (&the_lock);
    LIST_ADD (my_list, new_node);
    spin_unlock (&the_lock);
}

void thread_B (void) {
    spin_lock (&the_lock);
    LIST_REMOVE (my_list, my_list); /* remove head */
    spin_unlock (&the_lock);
}
----

As you can see, if a task tries to modify the list, it must first check whether it can do so. If not, it waits
until such an operation is possible. Pretty easy, eh?

There are some big and small improvements to be made, so let's pick up a magnifying glass and analyze!

===== Small inefficiency
When we fail to lock the spin lock, we're stuck in a spin-loop. This is really bad
for performance, because we're just eating up CPU cycles for nothing. A better spin lock would "relax the
CPU", ie. tell the processor that it's stuck in a spin-loop, so it can consume less power or
give resources to the sibling hardware thread.

On x86 we have the `pause` instruction: https://www.felixcloutier.com/x86/pause.
ARM has the `yield` instruction: https://developer.arm.com/documentation/dui0802/b/A32-and-T32-Instructions/SEV--SEVL--WFE--WFI--and-YIELD.

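A minimal sketch of what that could look like on x86 (GCC/Clang inline assembly - the fairness example later in this post uses the same trick; the `spin_lock_relaxed` name is just for illustration):

[source,c]
----
/* Same spin_lock as above, but "relaxed" - the pause hint tells the CPU
 * we're in a busy-wait loop, so it can save power and stop hammering
 * the contended cache line quite so hard. */
void spin_lock_relaxed (spin_lock_t* sl) {
    while (atomic_flag_test_and_set_explicit (sl, memory_order_acquire))
        __asm__ volatile ("pause");
}
----
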
===== Big inefficiency
Despite the `pause` or `yield` instructions and such, we still have an underlying flaw in our spin lock.
The issue is that we're just spinning when we fail to acquire the lock, when we could hand over control
to another task - we can't have access ourselves, so let someone else take the precious CPU time and get
a little bit of work done.

Despite this "inefficiency", spin locks are the most primitive-yet-useful way of locking. This example shows
a simple solution to a simple problem, but in the real world one would use spin locks to implement
higher-level, smarter ways of providing task synchronization.

=== The better mutex and its building blocks

So if spinning in place is "bad" for performance, what do we do then? Here come the "suspension queues".

==== Scheduler states

If we fail to acquire a mutex, it means that we're stuck waiting for it. If we're stuck, we shouldn't be
scheduled, ie. given CPU time (there's no work to be done). To denote this, we have to introduce another
process state:

[source,c]
----
/* process states */
#define PROC_READY     0
#define PROC_DEAD      1
#define PROC_SUSPENDED 2 /* NEW */
----

This new state describes a new situation in our scheduler - a process IS alive, but CANNOT be scheduled.
Before we only had alive (schedulable) processes and dead (unschedulable) processes.

Now our scheduler won't pick up suspended processes:

[source,c]
----
// ...

/* Walk the CPU's run queue round-robin, starting right after the currently
 * running process, and return the first PROC_READY process (suspended and
 * dead processes are skipped). */
static struct proc* proc_find_sched (struct cpu* cpu) {
    if (!cpu->proc_run_q)
        return NULL;

    struct list_node_link *current, *start;

    if (cpu->proc_current)
        current = cpu->proc_current->cpu_run_q_link.next;
    else
        current = cpu->proc_run_q;

    if (!current)
        current = cpu->proc_run_q;

    start = current;

    do {
        struct proc* proc = list_entry (current, struct proc, cpu_run_q_link);

        if (atomic_load (&proc->state) == PROC_READY)
            return proc;

        current = current->next ? current->next : cpu->proc_run_q;
    } while (current != start);

    return NULL;
}

// ...

void proc_sched (void) {
    spin_lock_ctx_t ctxcpu;

    int s_cycles = atomic_fetch_add (&sched_cycles, 1);

    /* every SCHED_REAP_FREQ scheduling cycles, clean up dead processes */
    if (s_cycles % SCHED_REAP_FREQ == 0)
        proc_reap ();

    struct proc* next = NULL;
    struct cpu* cpu = thiscpu;

    spin_lock (&cpu->lock, &ctxcpu);

    next = proc_find_sched (cpu);

    if (next) {
        cpu->proc_current = next;

        do_sched (next, &cpu->lock, &ctxcpu);
    } else {
        /* nothing runnable on this CPU - idle until the next scheduling event */
        cpu->proc_current = NULL;
        spin_unlock (&cpu->lock, &ctxcpu);

        spin ();
    }
}
----

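A quick note on `list_entry`, since it does a lot of heavy lifting above: MOP3's actual definition isn't shown in this post, but it's assumed to be the classic `container_of`-style macro that recovers the enclosing structure from a pointer to an embedded link, roughly like this sketch:

[source,c]
----
#include <stddef.h> /* offsetof */

/* Sketch of an assumed container_of-style helper: given a pointer to the
 * `member` field embedded inside a struct of type `type`, step back by the
 * member's offset to get a pointer to the whole struct. */
#define list_entry(node, type, member) \
    ((type*)((char*)(node) - offsetof (type, member)))
----
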
==== Suspension queues

A suspension queue is a little tricky - it's not just a list of suspended processes. We need to establish
a many-to-many relationship (kinda like in databases). A process can be a member of many suspension queues
(imagine a process which waits on many mutexes) and a single suspension queue has many member processes.
In order to achieve that, we can do some pointer trickery!

[source,c]
----
struct proc {
    int pid;
    struct rb_node_link proc_tree_link;
    struct rb_node_link procgroup_memb_tree_link;
    struct list_node_link cpu_run_q_link;
    struct list_node_link reap_link;
    struct list_node_link* sq_entries; /* list of intermediate structs */
    struct procgroup* procgroup;
    struct proc_platformdata pdata;
    uint32_t flags;
    spin_lock_t lock;
    struct cpu* cpu;
    atomic_int state;
    uintptr_t uvaddr_argument;
};

struct proc_suspension_q {
    struct list_node_link* proc_list; /* list of processes via proc_sq_entry */
    spin_lock_t lock;
};

/* the intermediate struct */
struct proc_sq_entry {
    struct list_node_link sq_link;   /* link into the suspension queue's list */
    struct list_node_link proc_link; /* link into the process's list */
    struct proc* proc;               /* pointer back to the process */
    struct proc_suspension_q* sq;    /* pointer back to the suspension queue */
};
----

Now we can talk about the suspend/resume algorithms.

[source,c]
----
bool proc_sq_suspend (struct proc* proc, struct proc_suspension_q* sq, spin_lock_t* resource_lock,
                      spin_lock_ctx_t* ctxrl) {
    spin_lock_ctx_t ctxpr, ctxcpu, ctxsq;
    struct cpu* cpu = proc->cpu;

    /* allocate intermediate struct */
    struct proc_sq_entry* sq_entry = malloc (sizeof (*sq_entry));
    if (!sq_entry) {
        spin_unlock (resource_lock, ctxrl);
        return PROC_NO_RESCHEDULE;
    }

    /* set links on both ends */
    sq_entry->proc = proc;
    sq_entry->sq = sq;

    /* lock in compliance with the lock hierarchy */
    spin_lock (&cpu->lock, &ctxcpu);
    spin_lock (&proc->lock, &ctxpr);
    spin_lock (&sq->lock, &ctxsq);

    spin_unlock (resource_lock, ctxrl);

    /* transition the state to PROC_SUSPENDED */
    atomic_store (&proc->state, PROC_SUSPENDED);

    /* append to sq's list */
    list_append (sq->proc_list, &sq_entry->sq_link);

    /* append to proc's list */
    list_append (proc->sq_entries, &sq_entry->proc_link);

    /* remove from CPU's run list and decrement CPU load counter */
    list_remove (cpu->proc_run_q, &proc->cpu_run_q_link);
    atomic_fetch_sub (&cpu->proc_run_q_count, 1);

    if (cpu->proc_current == proc)
        cpu->proc_current = NULL;

    /* the process is now unschedulable, so it doesn't need a CPU */
    proc->cpu = NULL;

    /* release the locks */
    spin_unlock (&sq->lock, &ctxsq);
    spin_unlock (&proc->lock, &ctxpr);
    spin_unlock (&cpu->lock, &ctxcpu);

    return PROC_NEED_RESCHEDULE;
}

bool proc_sq_resume (struct proc* proc, struct proc_sq_entry* sq_entry) {
    spin_lock_ctx_t ctxsq, ctxpr, ctxcpu;
    struct cpu* cpu = cpu_find_lightest (); /* find the least-loaded CPU */
    struct proc_suspension_q* sq = sq_entry->sq;

    /* acquire necessary locks */
    spin_lock (&cpu->lock, &ctxcpu);
    spin_lock (&proc->lock, &ctxpr);
    spin_lock (&sq->lock, &ctxsq);

    /* remove from sq's list */
    list_remove (sq->proc_list, &sq_entry->sq_link);

    /* remove from proc's list */
    list_remove (proc->sq_entries, &sq_entry->proc_link);

    /* give the CPU to the process */
    proc->cpu = cpu;

    /*
     * update process state to PROC_READY, but only if it's not waiting inside
     * another suspension queue.
     */
    if (proc->sq_entries == NULL)
        atomic_store (&proc->state, PROC_READY);

    /* attach to CPU's run list and increment load counter */
    list_append (cpu->proc_run_q, &proc->cpu_run_q_link);
    atomic_fetch_add (&cpu->proc_run_q_count, 1);

    /* unlock */
    spin_unlock (&sq->lock, &ctxsq);
    spin_unlock (&proc->lock, &ctxpr);
    spin_unlock (&cpu->lock, &ctxcpu);

    /* intermediate struct is no longer needed, so free it */
    free (sq_entry);

    return PROC_NEED_RESCHEDULE;
}
----

As you can see, this algorithm is actually quite simple - it's just a matter of safely
appending/removing links to/from the right lists. One subtle detail: `proc_sq_suspend` takes over the
resource lock (`resource_lock`/`ctxrl`) and only drops it once the suspension queue's lock is held, so an
unlock racing on another CPU can't scan the queue before we're actually enqueued and miss the wakeup.

==== The mutex itself

The mutex is quite simple too.

[source,c]
----
struct proc_mutex {
    struct proc_resource* resource; /* pointer back to the resource */

    bool locked;                           /* taken or free */
    struct proc_suspension_q suspension_q; /* suspension queue of waiting processes */
    struct proc* owner;                    /* current owner, ie. the process which can enter the critical section */
};
----

And now the lock and unlock algorithms.

[source,c]
----
bool proc_mutex_lock (struct proc* proc, struct proc_mutex* mutex) {
    spin_lock_ctx_t ctxmt;

    spin_lock (&mutex->resource->lock, &ctxmt);

    /* we can enter if: the mutex is NOT locked, or we're already the owner of the mutex */
    if (!mutex->locked || mutex->owner == proc) {
        mutex->locked = true;
        mutex->owner = proc;
        spin_unlock (&mutex->resource->lock, &ctxmt);
        return PROC_NO_RESCHEDULE;
    }

    /* we failed to become the owner, thus we have to free up the CPU and suspend ourselves */
    return proc_sq_suspend (proc, &mutex->suspension_q, &mutex->resource->lock, &ctxmt);
}

bool proc_mutex_unlock (struct proc* proc, struct proc_mutex* mutex) {
    spin_lock_ctx_t ctxmt, ctxsq;

    spin_lock (&mutex->resource->lock, &ctxmt);

    /*
     * we can't unlock a mutex which we don't own,
     * otherwise we could just steal someone else's mutex.
     */
    if (mutex->owner != proc) {
        spin_unlock (&mutex->resource->lock, &ctxmt);
        return PROC_NO_RESCHEDULE;
    }

    spin_lock (&mutex->suspension_q.lock, &ctxsq);

    struct list_node_link* node = mutex->suspension_q.proc_list;

    /*
     * This is a little tricky. What we're doing here is a "direct handoff" - we
     * transfer ownership while keeping the mutex locked, instead of just unlocking
     * and resuming everyone.
     */
    if (node) {
        struct proc_sq_entry* sq_entry = list_entry (node, struct proc_sq_entry, sq_link);
        struct proc* resumed_proc = sq_entry->proc;

        /* transfer ownership */
        mutex->owner = resumed_proc;
        mutex->locked = true;

        spin_unlock (&mutex->suspension_q.lock, &ctxsq);
        spin_unlock (&mutex->resource->lock, &ctxmt);

        /* resume the new owner - it can enter the critical section now */
        return proc_sq_resume (resumed_proc, sq_entry);
    }

    /* no possible future owner found, so we just unlock */
    mutex->locked = false;
    mutex->owner = NULL;

    spin_unlock (&mutex->suspension_q.lock, &ctxsq);
    spin_unlock (&mutex->resource->lock, &ctxmt);

    return PROC_NEED_RESCHEDULE;
}
----

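What the snippets above don't show is how the boolean return value gets used. A plausible call site - purely a sketch, the `sys_mutex_lock`/`sys_mutex_unlock` names are hypothetical and MOP3's real syscall layer may look different - would consume the reschedule hint like this:

[source,c]
----
/* Hypothetical syscall-handler glue, for illustration only. If the calling
 * process gave up the CPU (it suspended itself waiting for the mutex, or it
 * handed the mutex to a waiter), ask the scheduler to pick the next process. */
void sys_mutex_lock (struct proc* proc, struct proc_mutex* mutex) {
    if (proc_mutex_lock (proc, mutex) == PROC_NEED_RESCHEDULE)
        proc_sched ();
}

void sys_mutex_unlock (struct proc* proc, struct proc_mutex* mutex) {
    if (proc_mutex_unlock (proc, mutex) == PROC_NEED_RESCHEDULE)
        proc_sched ();
}
----
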
===== Direct handoff and "The Thundering Herd"

image::/img/running-bison.jpg["A thundering herd of bison"]
~https://www.eurowildlife.org/wp-content/uploads/2013/11/European-Wildlife-American-Bison-Hybrididisation-Risk.jpg~

Why the direct handoff? Why not just unlock the mutex?

The thundering herd problem is very common in multitasking and synchronization.

We free the mutex, wake up all waiters and now anyone can take it. The issue here is that ONLY ONE
task will eventually get to acquire the mutex and enter the critical section. If only one gets to
take the mutex, why bother waking up everyone? The solution here is to pass down the ownership
to the first encountered waiter, then the next one, and so on. Waking everyone would also cause
a performance penalty, because we'd be wasting CPU cycles just to be told not to execute any further.

====== Two birds, one stone

Via direct handoff, we've also just eliminated a fairness problem.

What does it mean for a mutex to be "fair"?

Fairness in synchronization means that each task at some point will get a chance to work in the
critical section.

Fairness example problem:

[source,c]
----
#include <stdatomic.h>

#define SPIN_LOCK_INIT ATOMIC_FLAG_INIT

typedef atomic_flag spin_lock_t;

void spin_lock (spin_lock_t* sl) {
    while (atomic_flag_test_and_set_explicit (sl, memory_order_acquire))
        __asm__ volatile ("pause");
}

void spin_unlock (spin_lock_t* sl) {
    atomic_flag_clear_explicit (sl, memory_order_release);
}

// the lock
spin_lock_t the_lock = SPIN_LOCK_INIT;

void thread_A (void) {
    for (;;) {
        spin_lock (&the_lock);
        {
            /* critical section goes here */
        }
        spin_unlock (&the_lock);
    }
}

void thread_B (void) {
    for (;;) {
        spin_lock (&the_lock);
        {
            /* critical section goes here */
        }
        spin_unlock (&the_lock);
    }
}
----

Let's take `thread_B` for example. `thread_B` locks, does its thing and unlocks. Because it's in an infinite
loop, there's a chance that it may re-lock too quickly, before `thread_A` gets the chance. This causes a
fairness issue, because `thread_A` is getting starved - it can't enter the critical section, even though it
plays by the rules, synchronizing properly with `thread_B` via a spinlock. On real CPUs (ie. real hardware
or QEMU with -enable-kvm), there will always be some tiny delays/"clog ups" that mitigate this problem,
but in theory a task can starve.

Our algorithm solves this by always selecting the next owner explicitly - the waiter at the front of the
suspension queue - instead of unlocking and letting the new owner be decided by whoever happens to grab the
mutex first. This makes the approach more deterministic/predictable.

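As an aside (this is not MOP3 code): the same fairness-by-ordering idea exists at the pure spin-lock level too, in the form of a classic ticket lock - each waiter takes a ticket and the lock serves tickets strictly in arrival order. A minimal sketch using C11 atomics:

[source,c]
----
#include <stdatomic.h>

/* Illustrative ticket lock. Each waiter grabs a ticket; the lock admits
 * tickets in FIFO order, so no waiter can be starved by a faster rival. */
typedef struct {
    atomic_uint next_ticket; /* ticket handed to the next arriving waiter */
    atomic_uint now_serving; /* ticket currently allowed into the critical section */
} ticket_lock_t;

void ticket_lock (ticket_lock_t* tl) {
    unsigned me = atomic_fetch_add_explicit (&tl->next_ticket, 1, memory_order_relaxed);
    while (atomic_load_explicit (&tl->now_serving, memory_order_acquire) != me)
        ; /* could also "pause"/"yield" here */
}

void ticket_unlock (ticket_lock_t* tl) {
    atomic_fetch_add_explicit (&tl->now_serving, 1, memory_order_release);
}
----

The trade-off is the same as before though: a ticket lock still spins, so inside a kernel we'd rather suspend the waiting process, exactly as shown above.
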
== Final result

This is the final result of our work! Each task locks the mutex, prints its letter three times and unlocks,
so in the output the letters should always appear in uninterrupted groups of three.

[source,c]
----
#include <limits.h>
#include <proc/local.h>
#include <proc/proc.h>
#include <stddef.h>
#include <stdint.h>
#include <string/string.h>

/* ID of the mutex resource */
#define MUTEX 2000

/* TLS (thread-local storage) support was added recently ;)! */
LOCAL volatile char letter = 'x';

void app_proc (void) {
    char arg_letter = (char)(uintptr_t)argument_ptr ();

    letter = arg_letter;

    for (;;) {
        /* enter the critical section, print the _letter_ 3 times and leave */

        mutex_lock (MUTEX);

        for (int i = 0; i < 3; i++)
            test (letter);

        mutex_unlock (MUTEX);
    }

    process_quit ();
}

void app_main (void) {
    mutex_create (MUTEX);

    letter = 'a';

    process_spawn (&app_proc, (void*)'b');
    process_spawn (&app_proc, (void*)'c');
    process_spawn (&app_proc, (void*)'d');

    for (;;) {
        mutex_lock (MUTEX);

        for (int i = 0; i < 3; i++)
            test (letter);

        mutex_unlock (MUTEX);
    }
}
----