On Lock-Free Programming and Concurrency

 Summary

This article first outlines relevant terminology, then explores the definition of lock-free programming with examples. After that, I touch upon current code/libraries and recent research literature, and conclude. I have also classified the resources I read, in the hope of saving time for anyone seeking to explore further. I am learning this purely out of my own initiative: it seemed a very interesting topic, and, more importantly, I kept reading experts saying "You shouldn't be doing this" and "It's the last thing you should be trying to do". They forget basic toddler psychology. Once anyone utters those words, it's impossible to just ignore them and move on. You are obligated to dig, get your hands dirty, and get hurt. 😀

Terminology:

There is quite a bit of confusing terminology around lock-free data structures floating about the internet. It probably prompted this blog article by RethinkDB, an open-source distributed database. They differentiate between wait-free and lock-free, and cite examples.

This 2013 article, on a blog dedicated entirely to concurrency, tried to put an end to the nomenclature confusion. It states "wait-free > lock-free > blocking" with respect to guarantees, but since wait-free and lock-free are not that widely understood, the definitions and their interpretation remain ambiguous.

In this Hacker News thread from 2014, we can see some more discussion of the confusion around terminology.

The book The Art of Multiprocessor Programming is generally regarded as the definitive resource for gaining a deeper understanding.

(a) Concurrency, Parallelism, Multithreading:    Rob Pike gave a really nice talk on this topic; here's a blog post/video of his talk. I strongly recommend going through the slides first to understand the difference between the two concepts. The gopher-cart slides in particular are very illustrative. Concurrency does not imply parallelism; it can be present in a single thread too. Take the example of kqueue- or select/poll-based event-driven TCP servers. Multi-threaded systems, on the other hand, achieve concurrency through parallelism.

 Let’s see more on concurrency with respect to  multi-threaded systems.

Here are the definitions, referenced from the book and from the lock-free documentation on boost.org. Tony Van Eerd gave a really nice talk on lock-free programming at BoostCon 2010; here are the slides of the same.

(b) Lock-Free:    Concurrent operations are guaranteed to finish in a finite number of steps. In theory it is possible that some operations never make any progress, but this is very unlikely to happen in practical applications.

(c) Wait-Free:    Every concurrent operation is guaranteed to finish in a finite number of steps. It is therefore possible to give worst-case guarantees for the number of operations.

(d) Obstruction-Free:   A concurrent operation is guaranteed to finish in a finite number of steps, unless another concurrent operation interferes.

(e) Livelock:     A livelock occurs when processes keep changing state in response to each other without making any overall progress.

(f) ABA problem:   In a single line: if Thread 1 observes that a memory location holds value 'A', and Thread 2 changes it to 'B', does some other work, and then sets it back to 'A', Thread 1 has no way of knowing anything happened. In the lock-free world, where CAS compares only values, this becomes important.

(g) Priority inversion occurs whenever a high-priority thread must wait for a low-priority thread.

What is Lock-Free programming? When does it become relevant?

 Lock-Based  Concurrency:

Before delving into lock-free data structures, it's important to refresh how lock-based concurrent data structures are implemented. For instance, that article has implementations of concurrent lock-based linked lists (among others), and shows different types of locking, from a single global lock to lock-coupling (hand-over-hand) locking, with their associated performance trade-offs.

An intuitive look at Lock-Free programming:    Look at the YouTube tutorial over here, and the notes over here. The presenter at BoostCon puts it this way: if you can visualize the lock size shrinking from one lock over a big critical section, to several locks over smaller critical sections, and ultimately down to the size of a single CPU instruction, then you are converging on "lock-freeness". Hence the importance of the atomic CAS instruction, and of the processor-architecture specifics behind it.

He takes the example of a global counter incremented by two threads using atomic CAS operations, referenced from his slides/talk.

word atomic_increment(word& target) {
    word oldval, newval;
    do {
        oldval = target;       // READ
        newval = oldval + 1;   // MODIFY
    } while (!CAS(&target, oldval, newval));  // WRITE, or retry
    return newval;
}

static word counter;

void thread1() {
    int i = 100;
    while (i--) atomic_increment(counter);
}

void thread2() {
    int i = 100;
    while (i--) atomic_increment(counter);
}

Difference between a Spin-Lock and a CAS loop:

In the atomic_increment() function, the do..while loop looks suspiciously similar to a spin-lock, and appears to be bad code. However, the CAS loop is not blocking. A CAS loop can be intuitively thought of as an atomic "if the target is still what it just was, set it to the new value".

When the two threads try to modify the variable counter in parallel, at least one of them is guaranteed to succeed.

So now we can present the official definition:

"An algorithm is lock-free if at all times at least one thread is guaranteed to be making progress"

A few pertinent points on lock-free programming:

(1) The lock-free approach usually implies the use of hardware-based atomic CAS (Compare-And-Swap) operations. Preshing's introduction to the topic covers this part: sequential consistency, atomicity, memory models, and instruction reordering.

(2) The lock-free approach trades throughput and mean latency for predictable latency. If there is little contention, a lock-free approach gets less work done than a lock-based one, since hardware CAS can slow down performance. Under higher contention, more throughput and lower latency are possible.

(3) The lock-free approach is typically taken not for the entire application, but only for a particular set of operations.

Here’s a page  with a nice bunch of links about Lock-Free programming

 

How do different languages support Lock-Free programming?

There have been three different schools of thought on addressing or solving problems involving concurrency, discussed in greater detail in this article.

They are

(a) Lock-free data structures

(b) Fine-grained algorithms

(c) Software Transactional Memory

Languages like Go and Erlang have built-in concurrency features (non-blocking I/O, user-space concurrency, advanced scheduling, and easy message passing), unlike more mainstream languages like C, C++, and Java.

In C++ and Java we can find implementations of lock-free libraries (Boost for C++, Facebook's Folly library, or Dr. Cliff Click's lock-free hash table in Java), among others.

In Concurrent Haskell, the approach taken to solve the problem is Software Transactional Memory.

In Erlang, concurrency is achieved through message passing and the Actor model. Check out this article dealing with the same.

In Go, the approach to concurrency is through channels. On their official page they state: "Don't communicate by sharing memory; share memory by communicating." Channels allow you to pass references to data structures between goroutines.

Purely functional data structures are immutable and lock-free; they are called persistent data structures, because they automatically preserve all previous versions of the structure. Here's an interesting Stack Overflow thread on the same.

What are the difficulties of Lock-free programming?

Difficulty in Lock-Free programming:    

Multi-threaded applications using lock-based concurrent structures give engineers enough pain (or joy) as it is. When there are serious performance issues involving locking and contention, the lock-free approach may prove beneficial.

(1) In this Stack Overflow thread, we see a discussion about when lock-free data structures or programming perform worse than lock-based ones. In this Stack Overflow thread, we can see cases questioning the performance benefits of lock-free.

(2) Writing lock-free code is hard, or harder, as Herb Sutter, a widely known C++ expert, elucidated here in 2008. Among his chief observations:

(a) It’s difficult even for experts to get it right.

(b) In 2008, there wasn't a lock-free implementation of doubly-linked lists. Fittingly, people become famous, publish papers, or earn their degrees by implementing lock-free data structures.

And hey, I came across a 2014 technical report on lock-free binary search trees.

Here's a well-cited lock-free B+ tree published in 2012, which, per its abstract, is the first implementation of a lock-free B+ tree.

 

 What are some code-bases to read about Lock-Free Data Structures/Algorithms?

(1) Dr. Cliff Click from Azul Systems has published papers and released code that are well appreciated; for instance, his lock-free hash table.

(2) Dmitry Vyukov is another highly appreciated person in this space. On his site he has this concurrent hashmap.

Additionally, his site is a treasure trove of articles, links, and code.

(3) The Folly C++ library, which implements concurrent queues, among other things.

 CONCLUSION:

I'm just scratching the surface of lock-free programming and the black magic of advanced concurrency. It was (and is) fun learning about it. 🙂 As a side-effect, I am definitely more curious to learn about different languages and their approaches to concurrency, in particular functional programming languages.

REFERENCES:

Books,Slides,and in-depth articles :

[1] http://www.cs.cmu.edu/~410-s05/lectures/L31_LockFree.pdf

[2] http://www.amazon.com/The-Multiprocessor-Programming-Revised-Reprint/dp/0123973376

[3] http://www.quora.com/Data-Structures/What-are-good-resources-for-learning-about-lock-free-data-structures

[4] http://kukuruku.co/hub/cpp/lock-free-data-structures-introduction 

[5] http://www.drdobbs.com/lock-free-data-structures/184401865 

[6] http://pages.cs.wisc.edu/~remzi/OSTEP/threads-locks-usage.pdf

[7] http://www.drdobbs.com/cpp/lock-free-code-a-false-sense-of-security/210600279

[8]  http://web.stanford.edu/class/ee380/Abstracts/070221_LockFreeHash.pdf

[9]  http://www.1024cores.net/home/lock-free-algorithms

[10] http://preshing.com/20120612/an-introduction-to-lock-free-programming/

[11]  Double Checked Locking paper :  http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf

 Discussion about Terminology/Nomenclature:

  [12] http://rethinkdb.com/blog/lock-free-vs-wait-free-concurrency/

[13] https://news.ycombinator.com/item?id=7815443 

[14] http://concurrencyfreaks.blogspot.in/2013/05/lock-free-and-wait-free-definition-and.html

[15] https://groups.google.com/forum/#!topic/mechanical-sympathy/Tm3xRcvpnzE
[16] http://www.boost.org/doc/libs/1_53_0/doc/html/lockfree.html
Discussion of related topics, different approaches to parallelism

[17] http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504

[18] On immutability and the need for locks 

[19] http://www.scribd.com/doc/28336206/Multicore-programming-in-Haskell

[20] http://talks.golang.org/2012/waza.slide#1 
[21] http://blog.golang.org/concurrency-is-not-parallelism

[22] http://learnyousomeerlang.com/the-hitchhikers-guide-to-concurrency
[23]  http://kachayev.github.io/talks/kharkivpy%230/#/
[24]  http://arschles.github.io/2014/06/25/concurrency-mem-mgmt.html

 Stackoverflow Threads:

[25] when-are-lock-free-data-structures-less-performant-than-mutual-exclusion-mutexes? 
[26]Are-lock-free-algorithms-really-more-performant-than-their-lock-full-counterpart?
[27] does-immutability-entirely-eliminate-the-need-for-locks-in-multi-processor-programming?
[28]  http://stackoverflow.com/questions/6945288/lock-free-programming-in-haskell
http://stackoverflow.com/questions/1050222/concurrency-vs-parallelism-what-is-the-difference

Good bunch of Links listed: 

[29] http://yinsochen.com/thread-safe-and-or-lockless-data-structures

[30] http://www.1024cores.net/home/lock-free-algorithms/links

 
Wikipedia articles:  
[31]  http://en.wikipedia.org/wiki/Persistent_data_structure
[32] http://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms
[33]  http://en.wikipedia.org/wiki/Herb_Sutter

Research Work: 
 [34] http://arxiv.org/pdf/1404.3272v3.pdf
 [35] http://dl.acm.org/citation.cfm?id=2312016
 [36] Publications of Nir Shavit , one of authors of "Art of Multiprocessor programming"

Code Resources:
 [37] Dr Cliff's Lock-Free Hash Table 
 [38] Dmitry Vyukov's concurrent hashmap 
 [39]  Boost Library has Lockfree constructs 


Video Tutorial: 
 [40]  http://www.youtube.com/watch?v=LbOB_moUa94
 [41] Notes of the tutorial: 
      http://zao.se/~zao/boostcon/10/2010_presentations/tue/lockfree_boostcon2010_notes.pdf

2 responses to “On Lock-Free Programming and Concurrency”

  1. Hi,
    In the “Terminology” section (a), the link to Rob Pike’s presentation is actually the link to Tony Van Eerd’s presentation at BoostCon 2010 (which is even more instructive!).

    It seems to me, that out of all the Lock-Free definitions, you suggest the usage of the one that says:
    “An algorithm is lock-free if at all times at least one thread is guaranteed to be making progress”
    Personally, I prefer the one by Maurice Herlihy and Nir Shavit, taken from “The Art of Multiprocessor Programming”:
    “A method is lock-free if it guarantees that infinitely often some method call finishes in a finite number of steps”.

    The difference between the two is subtle but important: Imagine the case where multiple threads attempt an operation and they all fail, but this failure is non-deterministic, and there will be a non-zero probability that at least one of the threads will succeed, even if for some consecutive tries all will fail.
    Under the first definition (the one you propose), this scenario would not be Lock-Free, while under the second definition (by Maurice Herlihy) it would be Lock-Free. Which one is best can be disputed, but if you use the first definition, then what will the scenario I’ve described be? Blocking? Something else?

    The different definitions of Blocking, Obstruction-Free, Lock-Free, and Wait-Free are all inter-related, such that they only make sense when defined together, so that no scenario “escapes through the cracks”… this stuff is hard enough as it is 😉

    Cool post though!

    Cheers,
    Pedro

    • Hi Pedro,

      Thanks a lot for your comment!
      I updated the links in the article to correctly reflect what you pointed out. 🙂

      I had actually referenced Tony Van Eerd’s presentation for the definition I listed here, but I knew I didn’t do it enough justice. Your blog article about the definition (which I had referenced!) definitely covered it in far greater detail than I have done.

      Thanks a lot for taking the time to clarify about the definition.

      Considering that I have been spending quite a bit of time on your blog to learn more, it’s pretty awesome and encouraging that you found this article nice!! 🙂

      Cheers,
      Arvind
