Mutex is not a Binary Semaphore
Very few people know this: a mutex is not the same as a binary semaphore. The two differ in implementation, purpose, and usage. We discuss the details in this post.
Before we begin, let us agree on some terms and conventions used in the rest of the explanation.
- A > B > C in terms of execution priority!
- Added means the process was added to the queue.
- Exit would mean the process terminated.
- A and C share a resource.
- In our case, C acquires the resource first and A waits for C to release it!
- A dark circle represents a shared resource being acquired.
- A hollow circle represents a process wanting to acquire the resource.
- The scheduler is called every time a process is added.
Binary semaphore
Look at what happens when A is added (@ t=7) while B is executing. In the case of a semaphore being used between A and C - there is priority inversion!
A has a higher priority than B, yet B gets to execute. A is blocked on the resource that C holds, and because B has a higher priority than C, C cannot run to release it. So A cannot run anyway!
Notice how B gets to run while A is waiting. This is problematic, especially in a real-time operating system. How can we prevent a higher priority process from being blocked by a lower priority process holding a required resource? One solution is to increase the priority of the process holding the resource.
Mutex
If a mutex is used, the scheduler will be aware of the dependency between A and C, and prioritize C's execution. This minimizes priority inversion.
With a mutex, the process holding it inherits the priority of the highest-priority process waiting on it, so the scheduler can determine if a higher priority process is blocked by a resource acquired and held by a lower priority process. Note that in the figure above, as soon as process A tries to acquire the resource held by C (@ t=9), the scheduler lets C run and then immediately schedules A after C has released the resource (@ t=13).
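To make the two timelines easier to compare, here is a minimal sketch of a tick-based, fixed-priority scheduler simulation. Everything in it is an assumption made for illustration: the `Task` class, the `simulate` and `make_tasks` names, the one-tick-per-step model, and the arrival times, which do not match the figures. With `inherit=False` the resource behaves like a plain binary semaphore; with `inherit=True` the holder borrows the priority of the highest-priority task blocked on it.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Task:
    name: str
    priority: int   # larger number = higher priority
    arrival: int    # tick at which the task is added
    steps: str      # "L" = acquire, "W" = work, "U" = release; one tick each
    pc: int = 0     # index of the next step to execute

    def done(self):
        return self.pc >= len(self.steps)


def simulate(tasks, inherit):
    """Return one character per tick: the name of the task that ran."""
    holder = None   # task currently holding the shared resource
    timeline = []
    t = 0
    while not all(task.done() for task in tasks):
        arrived = [x for x in tasks if x.arrival <= t and not x.done()]
        # A task is blocked if its next step is an acquire and another task holds the resource.
        blocked = [x for x in arrived
                   if x.steps[x.pc] == "L" and holder is not None and holder is not x]
        runnable = [x for x in arrived if x not in blocked]

        def effective(x):
            # Priority inheritance: the holder runs at the priority of its highest waiter.
            if inherit and x is holder and blocked:
                return max(x.priority, max(b.priority for b in blocked))
            return x.priority

        if not runnable:
            timeline.append("-")    # idle tick
            t += 1
            continue

        current = max(runnable, key=effective)
        step = current.steps[current.pc]
        if step == "L":
            holder = current
        elif step == "U":
            holder = None
        current.pc += 1
        timeline.append(current.name)
        t += 1
    return "".join(timeline)


def make_tasks():
    # Hypothetical workload: A > B > C in priority, A and C share the resource.
    return [
        Task("A", priority=3, arrival=3, steps="WLWU"),
        Task("B", priority=2, arrival=2, steps="WWWW"),
        Task("C", priority=1, arrival=0, steps="LWWWU"),
    ]


if __name__ == "__main__":
    print(simulate(make_tasks(), inherit=False))  # CCBABBBCCCAAA
    print(simulate(make_tasks(), inherit=True))   # CCBACCCAAABBB
```

In the first timeline B runs to completion while A is blocked, which is the priority inversion described earlier. In the second, C runs as soon as A blocks, releases the resource, and A finishes before B gets the CPU again.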
Mechanism: the key difference
There are many differences between semaphores and mutexes, but the biggest one is absolute -
Note
On the usage
A mutex is usually considered for cases where access to a critical section/memory needs to be synchronized.
A and B both need to work on the same memory region. If A has the mutex for that shared memory, then B has to wait. Similarly, if B has the mutex, then A has to wait.
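As a rough sketch of this mutual-exclusion pattern, the snippet below uses Python's `threading.Lock` to guard a shared counter. The names `shared`, `worker`, and `run` are made up for the example. Note also that `threading.Lock` is only a stand-in here: it provides mutual exclusion but does not do priority inheritance, and Python allows any thread to release it.

```python
import threading

shared = {"total": 0}       # the shared memory region
lock = threading.Lock()     # guards every access to `shared`


def worker(n):
    for _ in range(n):
        # Only one thread at a time can be inside this block; the other waits.
        with lock:
            shared["total"] += 1


def run(n=10000):
    shared["total"] = 0
    threads = [threading.Thread(target=worker, args=(n,)) for _ in range(2)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return shared["total"]


if __name__ == "__main__":
    print(run())  # 20000
```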
A semaphore, in contrast, is usually used for signaling. Process A cannot proceed until B completes an action, so A waits for a signal from B, and both A and B use a semaphore.
An example of this in an embedded systems context is when A needs to turn on an LED only after B signals that a button was pressed. A can then wait on a semaphore that B gives when B detects the button press.
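The LED example could look roughly like this with Python's `threading.Semaphore`. This is a sketch, not embedded code: `task_a`, `task_b`, and the `events` list are hypothetical stand-ins for the real LED driver and button interrupt.

```python
import threading

button_pressed = threading.Semaphore(0)   # starts at 0, so A blocks until B gives it
events = []                               # records what happened, in order


def task_a():
    button_pressed.acquire()              # wait for B's signal
    events.append("LED on")


def task_b():
    events.append("button pressed")       # B detects the button press
    button_pressed.release()              # B gives the semaphore


def run():
    events.clear()
    a = threading.Thread(target=task_a)
    b = threading.Thread(target=task_b)
    a.start()
    b.start()
    a.join()
    b.join()
    return list(events)


if __name__ == "__main__":
    print(run())  # ['button pressed', 'LED on']
```

Notice that the task that gives the semaphore (B) is not the one that takes it (A). That is the signaling pattern, as opposed to the lock/unlock pairing of a mutex.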
References
EDIT: 22 April, 2023
Parag Sangtani (one of the members of the community) provided the following links, which suggest the complete opposite of what I highlighted in this post above.
The fact that a binary semaphore can be used as a mutex is discussed in the links below -