Modeling a CLH lock in TLA+

Reproducing a Java 21 virtual threads... CrowdStrike: how did we get here?

Modeling a CLH lock in TLA+

The last post I wrote mentioned in passing how Java’s ReentrantLock class is implemented as a modified version of something called a CLH lock. I’d never heard of a CLH lock before, and so I thought it would be fun to learn more about it by trying to model it. My model here is based on the implementation described in section 3.1 of the paper Building FIFO and Priority-Queueing Spin Locks from Atomic Swap by Travis Craig.

This type of lock is a first-in-first-out (FIFO) spin lock. FIFO means that processes acquire the lock in order in which they requested it, also known as first-come-first-served. A spin lock is a type of lock where the thread runs in a loop continually checking the state of a variable to determine whether it can take the lock. (Note that Java’s ReentrantLock is not a spin lock, despite being based on CLH).

Also note that I’m going to use the terms process and thread interchangeably in this post. The literature generally uses the term process, but I prefer thread because of the shared-memory nature of the lock.

Visualizing the state of a CLH lock

I thought I’d start off by showing you what a CLH data structure looks like under lock contention. Below is a visualization of the state of a CLH lock when:

There are three processes (P1, P2, P3)P1 has the lockP2 and P3 are waiting on the lockThe order that the lock was requested in is: P1, P2, P3

The paper uses the term requests to refer to the shared variables that hold the granted/pending state of the lock, those are the boxes with the sharp corners. The head of the line is at the left, and the end of the line is at the right, with a tail pointer pointing to the final request in line.

Each process owns two pointers: watch and myreq. These pointers each point to a request variable.

If I am a process, then my watch pointer points to a request that’s owned by the process that’s ahead of me in line. The myreq pointer points to the request that I own, which is the request that the process behind in me line will wait on. Over time, you can imagine the granted value propagating to the right as the processes release the lock.

This structure behaves like a queue, where P1 is at the head and P3 is at the tail. But it isn’t a conventional linked list: the pointers don’t form a chain.

Requesting the lock

When a new process requests the lock, it:

points its watch pointer to the current tailcreates a new pending request object and points to it with its myreq pointerupdates the tail to point to this new objectReleasing the lock

To release the lock, a process sets the value of its myreq request to granted. (It also takes ownership of the watch request, setting myreq to that request, for future use).

Modeling desired high-level behavior

Before we build our CLH model, let’s think about how we’ll verify if our model is correct. One way to do this is to start off with a description of the desired behavior of our lock without regard to the particular implementation details of CLH.

What we want out of a CLH lock is a lock that preserves the FIFO ordering of the requests. I defined a model I called FifoLock that has just the high-level behavior, namely:

it implements mutual exclusionthreads are granted the lock in first-come-first-served orderExecution path

Here’s the state transition diagram for each thread in my model. The arrows are annotated with the name of the TLA+ sub-actions in the model.

FIFO behavior

To model the FIFO behavior, I used a TLA+ sequence to implement a queue. The Request action appends the thread id to a sequence, and the Acquire action only grants the lock to the thread at the head of the sequence:

Request(thread) == /\ queue' = Append(queue, thread) ...Acquire(thread) == /\ Head(queue) = thread ...

The FifoLock behavior model can be found at https://github.com/lorin/clh-tla/blob/main/FifoLock.tla.

The CLH model

Now let’s build an actual model of the CLH algorithm. Note that I called these processes instead of threads because that’s what the original paper does, and I wanted to stay close to the paper in terminology to help readers.

The CLH model relies on pointers. I used a TLA+ function to model pointers. I defined a variable named requests that maps a request id to the value of the request variable. This way, I can model pointers by having processes interact with request values using request ids as a layer of indirection.

---- MODULE CLH ----...CONSTANTS NProc, GRANTED, PENDING, Xfirst == 0 \* Initial request on the queue, not owned by any processProcesses == 1..NProcRequestIDs == Processes \union {first}VARIABLES requests, \* map request ids to request state ...TypeOk == /\ requests \in [RequestIDs -> {PENDING, GRANTED, X}] ...

(I used “X” to mean “don’t care”, which is what Craig uses in his paper).

The way the algorithm works, there needs to be an initial request which is not owned by any of the process. I used 0 as the Request id for this initial request and called it first.

The myreq and watch pointers are modeled as functions that map from a process id to a request id. I used -1 to model a null pointer.

NIL == -1TypeOk == /\ myreq \in [Processes -> RequestIDs] /\ watch \in [Processes -> RequestIDs \union {NIL}] ...Execution path

The state-transition diagram for each process in our CLH model is similar to the one in our abstract (FifoLock) model. The only real difference is that requesting the lock is done in two steps:

Update the request variable to pending (MarkPending)Append the request to the tail of the queue (EnqueueRequest)

The CLH model can be found at https://github.com/lorin/clh-tla/blob/main/CLH.tla.

Checking our CLH model’s behavior

We want to verify that our CLH model always behaves in a way consistent with our FifoLock model. We can check this in using the model checker by specifying what’s called a refinement mapping. We need to show that:

We can create a mapping from the CLH model’s variables to the FifoLock model’s variables.Every valid behavior of the CLH model satisfies the FifoLock specification under this mapping.The mapping

The FifoLock model has two variables, state and queue.

---- MODULE FifoLock ----...VARIABLES state, queueTypeOK == /\ state \in [Threads -> {"ready", "requested", "acquired", "in-cs"}] /\ queue \in Seq(Threads)

The state variable is a function that maps each thread id to the state of that thread in its state machine. The dashed arrows show how I defined the state refinement mapping:

Note that two states in the CLH model (ready, to-enqueue) map to one state in FifoLock (ready).

For the queue mapping, I need to take the CLH representation (recall it looks like this)…

…and define a transformation so that I end up with a sequence that looks like this, which represents the queue in the FifoLock model:

To do that, I defined a helper operator called QueueFromTail that, given a tail pointer in CLH, constructs a sequence of ids in the right order.

Unowned(request) == ~ \E p \in Processes : myreq[p] = request\* Need to reconstruct the queue to do our mappingRECURSIVE QueueFromTail(_)QueueFromTail(rid) == IF Unowned(rid) THEN <<>> ELSE LET tl == CHOOSE p \in Processes : myreq[p] = rid r == watch[tl] IN Append(QueueFromTail(r), tl)

Then I can define the mapping like this:

Mapping == INSTANCE FifoLock WITH queue CASE state[p]="ready" -> "ready" [] state[p]="to-enqueue" -> "ready" [] state[p]="waiting" -> "requested" [] state[p]="acquired" -> "acquired" [] state[p]="in-cs" -> "in-cs"]Refinement == Mapping!Spec

In my CLH.cfg file, I added this line to check the refinement mapping:

PROPERTY Refinement

Note again what we’re doing here for validation: we’re checking if our more detailed model faithfully implements the behavior of a simpler model (under a given mapping). Note that we’re using TLA+ to serve two different purposes.

To specify the high-level behavior that we consider correct (our behavioral contract)To model a specific algorithm (our implementation)

Because TLA+ allows us to choose the granularity of our model, we can use it for both high-level specs of desired behavior and low-level specifications of an algorithm.

You can find my model at https://github.com/lorin/clh-tla

View more on Lorin Hochstein's website »

Like • 0 comments • flag

Published on August 04, 2024 16:47

No comments have been added yet.

Lorin Hochstein's Blog

Lorin Hochstein's profile
35 followers