Avoiding Data Races in Multithreaded Programming

4 minute read

In Spring 2022, one of the courses I took was ECE 566 - Parallel Processing. As part of the course, we were tasked with presenting a topic of our choice. Thomas and I chose to present on data races and data race detection methods. Our slides covered the basics of data races, data race prevention methods and different methods that can be used to detect data races. Recently, I decided to brush up on these ideas and I wanted to write a blog post based on our slides.

Let’s start with the concept of multithreading. Multithreading is a technique that allows multiple tasks to be executed concurrently in a single program. The program is divided into multiple threads, and each thread runs independently. Threads share the same memory space and resources as the main process. They can access and modify the same data. If the data that is accessed by multiple threads is not protected sufficiently, hard-to-debug issues may occur. These issues can be difficult to reproduce, find the cause of and fix. Data races are one common type of such bugs.

Data race occurs when

  • Two or more instructions from different threads access the same memory location concurrently,
  • At least one of these accesses is a ‘write’,
  • There is no synchronization that mandates any particular order among these accesses. When these three conditions hold, the order of access is nondeterministic and the program behaves in an unpredictable way. Consider the following example:
void thread_1() {
    foo = 1;
    // thread_2 might assign foo = 2 here
    printf("The value of counter in thread 1 is: %d\n", counter);
}

void thread_2() {
    foo = 2;
    // thread_1 might assign foo = 1 here
    if (foo == 2){
        printf(Nothing strange happened.\n);
    } else {
        printf(Data race!.\n)
    }
}

In this example, thread_1() and thread_2() are assigning different values to the same global variable. If these two threads are running concurrently, the order in which they execute their assignments is non-deterministic and the result of the program may change in each run. If the program does a critical operation depending on the value of the shared resource, the data race becomes a bug.

So, how do we prevent data races? We need a protection mechanism. Mutual exclusion locks, or mutexes, serve this purpose. A mutex ensures that only one thread can access a critical section of code at a time. Other threads are prevented from accessing the critical section until the first thread is done with it. It is important to unlock the mutex once the first thread is finished, otherwise other threads will be blocked indefinitely.

Now that we know about data races and how to prevent them, let’s look at the ways of detecting data races. Data race detection tools can be grouped into two categories: static analysis tools and dynamic analysis tools.

Static Analysis Tools

Static analysis tools analyze the source code of a program to find potential data races. They work at compile time, which means they do not need to run the program. Static analysis tools can be very effective at finding data races, but the drawback is that they can be very noisy due to high number of false reports.

Static analysis tools first discover the shared variables in the program. Then, they run a lockset analysis to find the potential places for data races. A lockset is a set of locks that must be held by a thread in order to access a shared variable safely. If two threads are accessing a shared variable with overlapping locksets, then there is a potential for a data race. Finally, static analysis tools run a warning reduction step to reduce the number of false reports. This step is typically based on heuristics, such as the fact that data races can only occur in concurrent portions of the code.

Dynamic Analysis Tools

Dynamic analysis tools execute the program and monitor its execution for potential data races. They track the access of shared variables. Since dynamic analysis tools actually run the code, they are typically more accurate than static analysis tools. However, dynamic analysis tools will miss some data races because they only analyze a particular run of the code. Bugs that are not in the path of execution will not be caught.

Dynamic analysis tools use the happens-before relationship. The happens-before relationship is a partial ordering on events in a multithreaded program. DJIT+ and DataCollider are two dynamic analysis tools that use the happens-before relationship to track the access history of shared variables and detect data races.

References

  1. V. Kahlon, Y. Yang, S. Sankaranarayanan, and A. Gupta, “Fast and accurate static data-race detection for concurrent programs,” in Computer Aided Verification: 19th International Conference, CAV 2007, Berlin, Germany, July 3-7, 2007. Proceedings 19, pp. 226-239, Springer, 2007.
  2. E. Pozniansky and A. Schuster, “MultiRace: efficient on-the-fly data race detection in multithreaded C++ programs,” Concurrency and Computation: Practice and Experience, vol. 19, no. 3, pp. 327-340, 2007.
  3. J. Erickson, M. Musuvathi, S. Burckhardt, and K. Olynyk, “Effective Data-Race Detection for the Kernel,” in 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI 10), 2010.
  4. J. Erickson, et al., “Dynamic Analyses for Data Race Detection,” Slideshow, Microsoft.
  5. Y. Wang, “Race Detection in Parallel Programming,” Computer Science Department, San Jose State University, [Online]
  6. Nathan’s Blog, “Finding Races in Firefox with ThreadSanitizer,” Mozilla Blog, Feb. 20, 2015. [Online]. Available: https://blog.mozilla.org/nfroyd/2015/02/20/finding-races-in-firefox-with-threadsanitizer/
  7. Y. Yong-chul and A. Gangopadhyay, “What Are Data Races? How to Avoid Them During Software Development,” MathWorks, [Online]. Available: https://www.mathworks.com/products/polyspace/static-analysis-notes/what-data-races-how-avoid-during-software-development.html