Lock/mutex primitives in multi-threading libraries

Locking/Mutex primitives in Pthreads

Spin locks

pthread_spin_{init|lock|trylock|unlock|destroy}

On x86_64, glibc implements pthread_spin_lock with a lock decl instruction, followed by a spin loop around pause if the lock is already held. pthread_spin_trylock is implemented with lock cmpxchg followed by cmove (conditional move if equal).
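As a usage illustration of the calls listed above, here is a minimal sketch in which several threads increment a shared counter under a spin lock. The names spin_demo, spin_worker, SPIN_THREADS and SPIN_ITERS are made up for this example; only the pthread_spin_* calls come from Pthreads.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

enum { SPIN_THREADS = 4, SPIN_ITERS = 20000 };   /* arbitrary demo sizes */

static pthread_spinlock_t spin;
static long spin_counter;

/* Each worker bumps the shared counter under the spin lock. */
static void *spin_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < SPIN_ITERS; i++) {
        pthread_spin_lock(&spin);
        spin_counter++;
        pthread_spin_unlock(&spin);
    }
    return NULL;
}

/* Runs the demo and returns the final counter value. */
long spin_demo(void)
{
    pthread_t tid[SPIN_THREADS];
    int rc;

    rc = pthread_spin_init(&spin, PTHREAD_PROCESS_PRIVATE);
    assert(rc == 0);
    spin_counter = 0;

    /* trylock never spins: 0 means acquired, EBUSY means already held. */
    rc = pthread_spin_trylock(&spin);
    assert(rc == 0);
    rc = pthread_spin_trylock(&spin);
    assert(rc == EBUSY);
    pthread_spin_unlock(&spin);

    for (int i = 0; i < SPIN_THREADS; i++) {
        rc = pthread_create(&tid[i], NULL, spin_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < SPIN_THREADS; i++)
        pthread_join(tid[i], NULL);

    pthread_spin_destroy(&spin);
    return spin_counter;
}
```

Called from a main function (built with -pthread), spin_demo() should return SPIN_THREADS * SPIN_ITERS if the lock provides mutual exclusion.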

Mutexes

pthread_mutex_{init|lock|timedlock|trylock|unlock|destroy}
pthread_mutex_{get|set}prioceiling

The implementation of mutexes in glibc changes from version to version. At the time of writing, it uses low-level locks (in the glibc source tree, nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h) in conjunction with the Fast Userspace Locking system call futex.

POSIX specifies a mutex attribute called robustness (set with pthread_mutexattr_setrobust). If the thread owning a robust mutex terminates while holding it, the next thread that acquires the mutex is notified of the termination by the error return value EOWNERDEAD. The notified thread can then either:

  • Call pthread_mutex_consistent to make the mutex usable again, and then call pthread_mutex_unlock.
  • Call pthread_mutex_unlock directly, in which case the mutex ends up in a permanently unusable state and all later attempts to lock it fail with the error ENOTRECOVERABLE. The unusable mutex can only be disposed of with pthread_mutex_destroy.
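The two recovery paths can be sketched as follows. This is a minimal example, assuming a platform that implements robust mutexes (such as Linux with glibc); robust_demo, dying_owner and the recover flag are hypothetical names used only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t robust_mtx;

/* Locks the mutex and terminates without unlocking it. */
static void *dying_owner(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&robust_mtx);
    return NULL;                      /* exits while still the owner */
}

/* Returns the result of the first lock attempt after the owner died and
   stores the result of a second lock attempt in *second_rc.
   recover != 0: mark the mutex consistent before unlocking.
   recover == 0: unlock directly, leaving the mutex unrecoverable. */
int robust_demo(int recover, int *second_rc)
{
    pthread_mutexattr_t attr;
    pthread_t tid;
    int first_rc;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&robust_mtx, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_create(&tid, NULL, dying_owner, NULL);
    pthread_join(tid, NULL);

    /* The lock is granted, but the return value reports the dead owner. */
    first_rc = pthread_mutex_lock(&robust_mtx);
    if (first_rc == EOWNERDEAD && recover) {
        /* Real code would repair the protected data here first. */
        pthread_mutex_consistent(&robust_mtx);
    }
    pthread_mutex_unlock(&robust_mtx);

    /* 0 after recovery, ENOTRECOVERABLE otherwise. */
    *second_rc = pthread_mutex_lock(&robust_mtx);
    if (*second_rc == 0)
        pthread_mutex_unlock(&robust_mtx);

    pthread_mutex_destroy(&robust_mtx);
    return first_rc;
}
```

In both cases the first lock attempt is expected to return EOWNERDEAD; only the result of the second attempt differs.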

POSIX also specifies a mutex attribute called protocol (set with pthread_mutexattr_setprotocol). By default, the protocol is none (PTHREAD_PRIO_NONE). Otherwise, it can be:

  • Priority Inheritance (PI, PTHREAD_PRIO_INHERIT): The owner of a PI mutex inherits the priority of the highest-priority thread waiting for it until it releases the mutex. Use the feature test macro _POSIX_THREAD_PRIO_INHERIT in unistd.h to see if this is supported.
  • Priority Protection (PP, PTHREAD_PRIO_PROTECT): A PP mutex is one with a priority ceiling (pthread_mutex_{get|set}prioceiling). The owner of a PP mutex runs at the higher of its own priority and the highest of the priority ceilings of all the PP mutexes it currently owns, regardless of whether any other thread is waiting for them. Use the feature test macro _POSIX_THREAD_PRIO_PROTECT in unistd.h to see if this is supported.
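The following sketch shows how the protocol attribute might be queried and how a PI mutex might be created behind the feature test macro. The helper names default_mutex_protocol and init_pi_mutex are invented for this example, and the -1 return code for "PI not advertised" is a convention of the sketch, not of POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/* Returns the protocol of a freshly initialised attribute object,
   or -1 if the attribute object could not be initialised. */
int default_mutex_protocol(void)
{
    pthread_mutexattr_t attr;
    int protocol = -1;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    pthread_mutexattr_getprotocol(&attr, &protocol);
    pthread_mutexattr_destroy(&attr);
    return protocol;
}

/* Initialises *m as a priority-inheritance mutex.
   Returns 0 on success, -1 if the feature test macro does not advertise
   PI support, or an errno-style code (e.g. ENOTSUP) from Pthreads. */
int init_pi_mutex(pthread_mutex_t *m)
{
#if defined(_POSIX_THREAD_PRIO_INHERIT) && _POSIX_THREAD_PRIO_INHERIT > 0
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
#else
    (void)m;
    return -1;
#endif
}
```

A PP mutex would be set up the same way with PTHREAD_PRIO_PROTECT plus pthread_mutexattr_setprioceiling; that path is left out here because it usually depends on the scheduling policy and privileges of the calling process.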

Read/write locks

pthread_rwlock_{init|destroy}
pthread_rwlock_{rdlock|timedrdlock|tryrdlock}
pthread_rwlock_{wrlock|timedwrlock|trywrlock}
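A small single-threaded sketch of the read/write lock semantics: read locks can be shared, while a write lock is refused as long as any read lock is held. The struct rw_result and the function rwlock_demo are hypothetical names for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

struct rw_result {
    int second_reader;       /* tryrdlock while a read lock is held */
    int writer_during_read;  /* trywrlock while read locks are held */
    int writer_after_read;   /* trywrlock after all read locks are released */
};

struct rw_result rwlock_demo(void)
{
    pthread_rwlock_t rw;
    struct rw_result r;
    int rc = pthread_rwlock_init(&rw, NULL);
    assert(rc == 0);

    /* Take a first read lock, then try to share it. */
    pthread_rwlock_rdlock(&rw);
    r.second_reader = pthread_rwlock_tryrdlock(&rw);

    /* A writer needs exclusive access, so this attempt should be refused. */
    r.writer_during_read = pthread_rwlock_trywrlock(&rw);

    /* Release both read locks; the write lock becomes available. */
    if (r.second_reader == 0)
        pthread_rwlock_unlock(&rw);
    pthread_rwlock_unlock(&rw);

    r.writer_after_read = pthread_rwlock_trywrlock(&rw);
    if (r.writer_after_read == 0)
        pthread_rwlock_unlock(&rw);

    pthread_rwlock_destroy(&rw);
    return r;
}
```

The try variants are used so that the sketch never blocks; the blocking and timed variants follow the same sharing rules.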

Condition variables

pthread_cond_{init|wait|signal|broadcast|destroy}
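The usual pattern pairs a condition variable with a mutex and a predicate that is re-checked in a loop, since pthread_cond_wait may wake up spuriously. A minimal sketch, with cond_demo, cond_producer and the shared variables being names chosen for this example:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t cond_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond_cv = PTHREAD_COND_INITIALIZER;
static int cond_ready;     /* the predicate guarded by cond_mtx */
static int cond_payload;

/* Publishes a value, then signals the waiter. */
static void *cond_producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&cond_mtx);
    cond_payload = 42;
    cond_ready = 1;
    pthread_cond_signal(&cond_cv);
    pthread_mutex_unlock(&cond_mtx);
    return NULL;
}

/* Waits until the producer has published and returns the value. */
int cond_demo(void)
{
    pthread_t tid;
    int value;
    int rc;

    cond_ready = 0;
    cond_payload = 0;
    rc = pthread_create(&tid, NULL, cond_producer, NULL);
    assert(rc == 0);

    pthread_mutex_lock(&cond_mtx);
    while (!cond_ready)                      /* re-check after every wakeup */
        pthread_cond_wait(&cond_cv, &cond_mtx);
    value = cond_payload;
    pthread_mutex_unlock(&cond_mtx);

    pthread_join(tid, NULL);
    return value;
}
```

Because the predicate is tested under the mutex before waiting, the sketch works whether the producer runs before or after the consumer reaches pthread_cond_wait.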

Non-standard Pthread call

pthread_yield

Locking/Mutex primitives in GNU OpenMP

GCC has supported OpenMP since version 4.x.0, through its GNU OpenMP runtime library (libgomp). Details of OpenMP and automatic parallelization in GCC can be found in Diego Novillo's paper and slides.

Mutexes

gomp_mutex_{lock|unlock} and gomp_ptrlock_{get|set}, which are inline functions.

gomp_mutex_* use an integer as the lock word, while gomp_ptrlock_* use a pointer.

Depending on how libgomp is configured (./configure --enable-linux-futex, which is the default, or --disable-linux-futex), gomp_mutex_{lock|unlock} either use libgomp's own implementation (as in libgomp/config/linux/mutex.h) or call pthread_mutex_{lock|unlock} (as in libgomp/config/posix/mutex.h).

In the former case, gomp_mutex_{lock|unlock} first try a fast path using the GCC built-in __sync_lock_test_and_set function; if that fails, they fall through to gomp_mutex_{lock|unlock}_slow.

gomp_mutex_lock_slow spins for up to millions or even billions of iterations before falling back to the Fast Userspace Locking system call futex. The spin count can be controlled by the environment variables OMP_WAIT_POLICY and GOMP_SPINCOUNT, and it also depends on whether there is any oversubscription, i.e. more threads than CPUs.

gomp_mutex_unlock_slow simply calls futex with the opcode FUTEX_WAKE.
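The fast-path/slow-path structure described above can be sketched as follows. This is not libgomp's code: it is a generic, Linux-only illustration that uses C11 atomics instead of the __sync built-ins and a well-known three-state futex lock design. All sketch_* names and the SKETCH_SPIN_COUNT value are assumptions made for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* 0 = unlocked, 1 = locked, 2 = locked with possible sleepers. */
typedef struct { atomic_int state; } sketch_mutex;

enum { SKETCH_SPIN_COUNT = 1000 };  /* arbitrary; not libgomp's value */

static long sketch_futex(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void sketch_lock_slow(sketch_mutex *m)
{
    /* Spin for a bounded number of rounds, hoping for a quick release. */
    for (int i = 0; i < SKETCH_SPIN_COUNT; i++) {
        int expected = 0;
        if (atomic_compare_exchange_weak(&m->state, &expected, 1))
            return;
    }
    /* Mark the lock contended and sleep in the kernel until woken. */
    while (atomic_exchange(&m->state, 2) != 0)
        sketch_futex(&m->state, FUTEX_WAIT_PRIVATE, 2);
}

static inline void sketch_lock(sketch_mutex *m)
{
    int expected = 0;
    /* Fast path: one atomic operation, no system call. */
    if (!atomic_compare_exchange_strong(&m->state, &expected, 1))
        sketch_lock_slow(m);
}

static inline void sketch_unlock(sketch_mutex *m)
{
    /* Slow path only if a waiter may be sleeping: wake one of them. */
    if (atomic_exchange(&m->state, 0) == 2)
        sketch_futex(&m->state, FUTEX_WAKE_PRIVATE, 1);
}

static sketch_mutex demo_mutex;
static long demo_counter;

static void *demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 50000; i++) {
        sketch_lock(&demo_mutex);
        demo_counter++;
        sketch_unlock(&demo_mutex);
    }
    return NULL;
}

/* Four threads increment the counter; returns the final value. */
long sketch_mutex_demo(void)
{
    pthread_t tid[4];

    atomic_init(&demo_mutex.state, 0);
    demo_counter = 0;
    for (int i = 0; i < 4; i++) {
        int rc = pthread_create(&tid[i], NULL, demo_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < 4; i++)
        pthread_join(tid[i], NULL);
    return demo_counter;
}
```

The point of the design is that an uncontended lock or unlock costs a single atomic instruction, and the kernel is involved only when a thread actually has to sleep or be woken.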

gomp_ptrlock_{get|set}_slow are similar.

Semaphores

gomp_sem_{wait|post}, which are inline functions.

As in the previous case, there are two implementations. The POSIX version uses pthread_mutex_{lock|unlock} and pthread_cond_{wait|signal}.
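A counting semaphore built from a mutex and a condition variable might look like the sketch below. It illustrates the general technique only and is not copied from libgomp; the sketch_sem type and all function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int value;               /* number of available permits */
} sketch_sem;

void sketch_sem_init(sketch_sem *s, int value)
{
    pthread_mutex_init(&s->mutex, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->value = value;
}

/* Blocks until a permit is available, then consumes it. */
void sketch_sem_wait(sketch_sem *s)
{
    pthread_mutex_lock(&s->mutex);
    while (s->value == 0)
        pthread_cond_wait(&s->cond, &s->mutex);
    s->value--;
    pthread_mutex_unlock(&s->mutex);
}

/* Adds a permit and wakes one waiter, if any. */
void sketch_sem_post(sketch_sem *s)
{
    pthread_mutex_lock(&s->mutex);
    s->value++;
    pthread_mutex_unlock(&s->mutex);
    pthread_cond_signal(&s->cond);
}

void sketch_sem_destroy(sketch_sem *s)
{
    pthread_cond_destroy(&s->cond);
    pthread_mutex_destroy(&s->mutex);
}

static sketch_sem demo_sem;
static int demo_sem_payload;

static void *demo_sem_poster(void *arg)
{
    (void)arg;
    demo_sem_payload = 7;            /* published before the post */
    sketch_sem_post(&demo_sem);
    return NULL;
}

/* The caller waits on the semaphore until the poster has run. */
int sketch_sem_demo(void)
{
    pthread_t tid;
    int value;
    int rc;

    sketch_sem_init(&demo_sem, 0);
    demo_sem_payload = 0;
    rc = pthread_create(&tid, NULL, demo_sem_poster, NULL);
    assert(rc == 0);

    sketch_sem_wait(&demo_sem);
    value = demo_sem_payload;

    pthread_join(tid, NULL);
    sketch_sem_destroy(&demo_sem);
    return value;
}
```

Starting the semaphore at 0 turns it into a simple "wait until the other thread is done" signal, which is the kind of hand-off a barrier needs.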

The Linux version closely mirrors the mutex implementation: gomp_sem_{wait|post} first try a fast path using the built-in __sync_bool_compare_and_swap function, and if that fails, they fall through to gomp_sem_{wait|post}_slow.

Semaphores here are used to implement Barriers.

Locking/Mutex primitives in Intel OpenMP

There are certain functions that appear to be related to locks (the initial "k" refers to KAI, Kuck & Associates, Inc., a high-performance compiler vendor that was acquired by Intel in 2000):
__kmp_acquire_lock
__kmp_init_lock
__kmp_wait_yield_4
__kmp_x86_pause
__kmp_acquire_ticket_lock
__kmp_query_cpuid
and some related to profiling:
kmp_gvs_event
kmp_send_omp_collector_event
Also note that the global variable __kmp_lock_method will be set to 1 if HyperThreading is available, and 2 otherwise.
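The name __kmp_acquire_ticket_lock suggests a ticket lock. The sketch below shows the generic ticket lock technique with C11 atomics; it makes no claim about how Intel's runtime implements it, and every name in it (ticket_lock, ticket_acquire, ticket_release, ticket_demo) is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

typedef struct {
    atomic_uint next_ticket;   /* next ticket number to hand out */
    atomic_uint now_serving;   /* ticket currently allowed to enter */
} ticket_lock;

void ticket_init(ticket_lock *l)
{
    atomic_init(&l->next_ticket, 0);
    atomic_init(&l->now_serving, 0);
}

/* Take a ticket, then wait until it is being served (FIFO order). */
void ticket_acquire(ticket_lock *l)
{
    unsigned me = atomic_fetch_add(&l->next_ticket, 1);
    while (atomic_load(&l->now_serving) != me)
        sched_yield();         /* give up the CPU while waiting */
}

/* Pass the lock on to the holder of the next ticket. */
void ticket_release(ticket_lock *l)
{
    atomic_fetch_add(&l->now_serving, 1);
}

static ticket_lock demo_ticket;
static long demo_ticket_counter;

static void *ticket_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        ticket_acquire(&demo_ticket);
        demo_ticket_counter++;
        ticket_release(&demo_ticket);
    }
    return NULL;
}

/* Four threads increment the counter; returns the final value. */
long ticket_demo(void)
{
    pthread_t tid[4];

    ticket_init(&demo_ticket);
    demo_ticket_counter = 0;
    for (int i = 0; i < 4; i++) {
        int rc = pthread_create(&tid[i], NULL, ticket_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < 4; i++)
        pthread_join(tid[i], NULL);
    return demo_ticket_counter;
}
```

Compared with a plain test-and-set spin lock, a ticket lock grants the lock in arrival order, which avoids starvation at the cost of a second shared counter.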