Locking when No One’s Looking

You’re getting ready to start a thread. But wait! There are some variables you need to set up first, and the thread will read them. How do you know the new values of those variables will be visible to the thread?

var = 1;
thread = std::thread([this] {
    use(var);  // Will I see "1"?
});

One conservative approach might be to obtain a lock while modifying those variables. This would make a lot of sense, particularly if those variables are typically guarded by that lock during the lifetime of the thread. At the same time, it does feel a little silly obtaining a lock that you know will never be contended, as the only other thing that would lock that mutex has not started yet.

{
    std::lock_guard lock(mutex);
    var = 1;
}
thread = std::thread([this] {
    std::lock_guard lock(mutex);  // Even without this
    use(var);
});

Modifying those variables under a lock is for-sure safe: releasing the lock after modifying them acts as a “release” operation, ensuring that all writes made before the unlock are visible to whoever acquires the mutex next. The thread reading them has not started yet, so there is no way its reads could happen before the release. If the thread also locks the mutex before reading those values, it is even more obviously safe.

But is any of that really needed? Intuitively it shouldn’t be needed, but memory models are notoriously tricky things, so let’s check our assumptions.

The C++ standard has this to say under the specification for std::thread’s constructor:

Synchronization: The completion of the invocation of the constructor synchronizes with the beginning of the invocation of the copy of f.

That sounds good, but is it what we want? What does “synchronizes with” mean? That’s defined elsewhere:

Certain library calls synchronize with other library calls performed by another thread. For example, an atomic store-release synchronizes with a load-acquire that takes its value from the store ([atomics.order]). [ Note: Except in the specified cases, reading a later value does not necessarily ensure visibility as described below. Such a requirement would sometimes interfere with efficient implementation.  — end note ] [ Note: The specifications of the synchronization operations define when one reads the value written by another. For atomic objects, the definition is clear. All operations on a given mutex occur in a single total order. Each mutex acquisition “reads the value written” by the last mutex release.  — end note ]
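
For concreteness, the store-release / load-acquire case that passage mentions looks roughly like this (a minimal sketch of my own, not part of the original example; flag and data are invented names, and use() is the same placeholder as above):

std::atomic<bool> flag{false};
int data = 0;

// Thread 1: publish
data = 42;
flag.store(true, std::memory_order_release);  // release store

// Thread 2: consume
if (flag.load(std::memory_order_acquire)) {   // acquire load that takes its value from the store
    use(data);  // guaranteed to see 42
}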

That first note (“Except in the specified cases, reading a later value does not necessarily ensure visibility”) is not necessarily a good sign, but the specified cases do save us. Walking through the definitions that follow:

  • “synchronizes with”: Yes, per std::thread constructor specification.
  • “carries a dependency”: Irrelevant.
  • “dependency-ordered before”: Irrelevant (but I believe no such relation exists).
  • “inter-thread happens before”: Yes per (9.1), since the “synchronizes with” relation exists.
  • “happens before”: Yes per (10.2), since the “inter-thread happens before” relation exists.
  • “strongly happens before”: Yes per (11.2), since the “synchronizes with” relation exists.

With that, we have enough information to evaluate whether there is a “visible side effect”:

A visible side effect A on a scalar object or bit-field M with respect to a value computation B of M satisfies the conditions:

  • (12.1) A happens before B and
  • (12.2) there is no other side effect X to M such that A happens before X and X happens before B.

The value of a non-atomic scalar object or bit-field M, as determined by evaluation B, shall be the value stored by the visible side effect A. [ Note: If there is ambiguity about which side effect to a non-atomic object or bit-field is visible, then the behavior is either unspecified or undefined.  — end note ] [ Note: This states that operations on ordinary objects are not visibly reordered. This is not actually detectable without data races, but it is necessary to ensure that data races, as defined below, and with suitable restrictions on the use of atomics, correspond to data races in a simple interleaved (sequentially consistent) execution.  — end note ]

We know that A happens-before B, so (12.1) is satisfied. Assuming there are no other assignments to var, there are no other side-effects X to M, so (12.2) is satisfied as well. Hence, accessing var inside the thread is guaranteed to see the value set before creating the thread. This applies whether or not any locking is used.

This shows that the expected behavior is in fact guaranteed by the standard, and no lock is needed in this case.
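
Putting it all together, a minimal self-contained version of the pattern might look like this (my own sketch; the Worker class and the printing are invented for illustration):

#include <cstdio>
#include <thread>

struct Worker {
    int var = 0;
    std::thread thread;

    void start() {
        var = 1;  // plain write, no lock
        // The std::thread constructor synchronizes with the start of the lambda,
        // so the write above happens before the read inside it.
        thread = std::thread([this] {
            std::printf("%d\n", var);  // guaranteed to print 1
        });
        // Caveat: assigning to var again here, without synchronization,
        // would be a data race with the read in the new thread.
    }

    ~Worker() {
        if (thread.joinable())
            thread.join();
    }
};

int main() {
    Worker w;
    w.start();
}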
