Concurrency in Go: Race Conditions and their Solutions

You can find the Chinese version at 用一個小例子談談 Golang 中的 Race Condition.

Goroutines are one of the most important features of Go. They make it easy to achieve concurrency, and they are lightweight, so creating a large number of them is practical.

However, if race conditions are not considered when using goroutines, the program can produce incorrect results. This article uses a simple example to show where race conditions may arise and how to detect and resolve them.

Race Condition

According to the definition of Race Condition on Wikipedia:

A race condition or race hazard is the behavior of electronics, software, or other systems where the output is dependent on the sequence or timing of other uncontrollable events.

For example, suppose two goroutines are running and both operate on the variable a, one doing a = a * 2 and the other doing a = a + 1. Depending on which goroutine runs first, a ends up with a different value. This is a race condition. To prevent race conditions, the program must enforce a deterministic order on such operations, which avoids unpredictable bugs.
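To make the effect of ordering concrete, here is a minimal sketch that applies the two operations one after another in both possible orders, with no goroutines involved. The helper names double, addOne and applyInOrder, and the starting value 1, are assumptions made for illustration; the text above does not specify them.

```go
package main

import "fmt"

// double and addOne are the two operations from the text.
func double(a int) int { return a * 2 }
func addOne(a int) int { return a + 1 }

// applyInOrder applies the given operations to start, one after another.
func applyInOrder(start int, ops ...func(int) int) int {
	a := start
	for _, op := range ops {
		a = op(a)
	}
	return a
}

func main() {
	start := 1 // assumed starting value, chosen only for illustration

	fmt.Println(applyInOrder(start, double, addOne)) // (1 * 2) + 1 = 3
	fmt.Println(applyInOrder(start, addOne, double)) // (1 + 1) * 2 = 4
}
```

With two unsynchronized goroutines, either of these orders can happen at run time, so the final value is not predictable.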

Let's consider the following example, which consists of three steps.

  1. Set the initial value of the variable a to 0.

  2. Start 3 goroutines that each do a++ once.

  3. Use a channel to wait for all three goroutines to complete.

package main

import "fmt"

func main() {
    a := 0
    times := 3
    c := make(chan bool)

    for i := 0; i < times; i++ {
        go func() {
            a++
            c <- true
        }()
    }

    for i := 0; i < times; i++ {
        <-c
    }
    fmt.Printf("a = %d\n", a)
}

The expected result is a = 3, and in practice that is what this program prints. Now, let's change the number of goroutines to ten thousand:

package main

import "fmt"

func main() {
    a := 0
    times := 10000  // <-- HERE
    c := make(chan bool)

    for i := 0; i < times; i++ {
        go func() {
            a++
            c <- true
        }()
    }

    for i := 0; i < times; i++ {
        <-c
    }
    fmt.Printf("a = %d\n", a)
}

In theory we should get a = 10000, but running the code may yield a result like a = 9903. That looks strange: we created ten thousand goroutines and ran a++ ten thousand times, so why is the result wrong? The reason is that a race condition occurred during the a++ operation.

Why does a++ result in a race condition?

When you write a++, the computer actually performs three steps:

  1. The CPU retrieves the value of a.

  2. It adds 1 to the retrieved value.

  3. It stores the result back into the variable a.

On a multi-core CPU, the following scenario is possible:

Two cores retrieve the value of a at the same time, each increments its own copy, and each stores the result back. Both store the same value, so a is effectively incremented only once instead of twice. Every such lost update makes the final value smaller, which is why the output 9903 is below the expected 10000.
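The lost update can be replayed by hand. The sketch below runs on a single goroutine and simply performs the three steps of a++ for two imaginary cores in the interleaved order described above; the function name lostUpdate and the variables cpu1 and cpu2 are made up for illustration.

```go
package main

import "fmt"

// lostUpdate replays, step by step, the interleaving in which both
// "cores" read a before either of them stores its result.
func lostUpdate(a int) int {
	cpu1 := a // core 1 retrieves the value of a
	cpu2 := a // core 2 retrieves the same value of a

	cpu1++ // core 1 adds 1 to its copy
	cpu2++ // core 2 adds 1 to its copy

	a = cpu1 // core 1 stores its result
	a = cpu2 // core 2 stores the same value, overwriting core 1's store

	return a
}

func main() {
	// Two increments were performed, but a only grows by one.
	fmt.Println(lostUpdate(0)) // prints 1
}
```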

Solution: Mutex Lock

The root cause of this race condition is that two goroutines can access the variable a at the same time. To make sure only one goroutine executes a++ at a time, we can use sync.Mutex, a type from the sync package.

package main

import (
    "fmt"
    "sync"
)

func main() {
    a := 0
    times := 10000
    c := make(chan bool)

    var m sync.Mutex   // init the lock

    for i := 0; i < times; i++ {
        go func() {
            m.Lock()   // acquire lock
            a++
            m.Unlock() // release lock
            c <- true
        }()
    }

    for i := 0; i < times; i++ {
        <-c
    }
    fmt.Printf("a = %d\n", a)
}

The term "Mutex" stands for mutual exclusion, where only one person (goroutine) can hold the lock at a time. If others attempt to acquire the lock, they will be blocked until the previous holder releases it. By using a mutex, we ensure that only one goroutine performs 'a++' at any given time, resulting in the correct output of 10000.

How to detect race conditions

In this example, the race condition occurs during a++. However, it may be challenging to identify such issues if you are not familiar with low-level computer operations. Fortunately, Go provides a powerful tool called the Data Race Detector.

package main

import "fmt"

func main() {
    a := 0
    times := 10000
    c := make(chan bool)

    for i := 0; i < times; i++ {
        go func() {
            a++
            c <- true
        }()
    }

    for i := 0; i < times; i++ {
        <-c
    }
    fmt.Printf("a = %d\n", a)
}

By running the code with the '-race' flag, the Data Race Detector can help detect potential race conditions. You can clone the source code from the repository and run it yourself.

$ go run -race main.go
==================
WARNING: DATA RACE
Read at 0x00c4200a4008 by goroutine 7:
  main.main.func1()
      .../add_few_times/main.go:12 +0x38

Previous write at 0x00c4200a4008 by goroutine 6:
  main.main.func1()
      .../add_few_times/main.go:12 +0x4e

Goroutine 7 (running) created at:
  main.main()
      .../add_few_times/main.go:11 +0xc1

Goroutine 6 (running) created at:
  main.main()
  main.main()
      .../add_few_times/main.go:11 +0xc1
==================
Found 1 data race(s)
exit status 66

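The Race Detector warns about a data race. It reports that goroutine 7 read the variable a at main.go:12 and that goroutine 6 had previously written to the same address at the same line, with no synchronization between the two accesses. That line is the unsynchronized a++. Keep in mind that the detector only reports races that actually happen while the program runs, so code paths that are never executed are not checked. Within the code that does run, it is very effective at finding data races.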

Drawbacks of using locks

Performance

In the previous example, we used a mutex to prevent multiple goroutines from accessing the same variable simultaneously. However, with ten thousand goroutines, while one of them holds the lock, every other goroutine that wants to increment the variable has to wait. The increments therefore run one at a time, with no parallelism between them. A plain loop that increments the variable from 0 to 10000 might well be faster.

Therefore, use locks with care and only where they are needed. Otherwise, the performance of the program may degrade significantly.
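One way to lock less often is to let each goroutine do its work on a local variable and take the lock only once, when it merges its result. The sketch below is one possible arrangement rather than a recommendation from the original example; the function name countWithBatches and the choice of 4 workers are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// countWithBatches splits total increments across the given number of
// goroutines (workers must be greater than 0). Each goroutine counts into
// a local variable and acquires the lock only once to add its subtotal.
func countWithBatches(total, workers int) int {
	a := 0
	c := make(chan bool)
	var m sync.Mutex

	for w := 0; w < workers; w++ {
		// Give each worker its share; spread the remainder over the first few.
		n := total / workers
		if w < total%workers {
			n++
		}

		go func(n int) {
			local := 0
			for i := 0; i < n; i++ {
				local++ // no lock needed: local is private to this goroutine
			}

			m.Lock() // one lock acquisition per goroutine
			a += local
			m.Unlock()
			c <- true
		}(n)
	}

	// Wait for all workers, as in the earlier examples.
	for w := 0; w < workers; w++ {
		<-c
	}
	return a
}

func main() {
	fmt.Printf("a = %d\n", countWithBatches(10000, 4)) // prints a = 10000
}
```

Here the mutex is acquired 4 times instead of 10000 times, so goroutines spend far less time waiting for each other.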

Forgetting to unlock

Locking and unlocking is not always as simple as in this example. Real code may involve multiple locks, conditional branches, network requests, and other complications. In such code it is easy to forget an unlock or to release the lock too late, which makes the program slow or, if the lock is never released, leaves it stuck in a deadlock.
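A common way to reduce the risk of a forgotten unlock in Go is to call Unlock with defer right after Lock, so the lock is released on every return path. The following is a minimal sketch; the counter type, the addIfBelow method and the limit of 2 are hypothetical and exist only to give the function more than one way to return.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// counter is a hypothetical type that guards a with a mutex.
type counter struct {
	m sync.Mutex
	a int
}

// addIfBelow increments a unless it has already reached limit.
// The deferred Unlock runs on both the error path and the normal path.
func (c *counter) addIfBelow(limit int) error {
	c.m.Lock()
	defer c.m.Unlock()

	if c.a >= limit {
		return errors.New("limit reached") // early return, lock is still released
	}
	c.a++
	return nil
}

func main() {
	c := &counter{}
	for i := 0; i < 3; i++ {
		if err := c.addIfBelow(2); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Printf("a = %d\n", c.a)
}
```

If the early return had skipped the unlock, the next call to addIfBelow would block forever. With defer, the third call fails cleanly and later calls can still acquire the lock.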

Summary

In this article, we discussed when race conditions may occur in Go and how to address them. Due to the ease of creating goroutines, it is easy to unintentionally overlook race conditions. Fortunately, Go provides tools like the Race Detector to detect such issues without the need for manual searching.

When I mainly wrote Node.js, I didn't have to worry about race conditions of this kind because JavaScript is single-threaded. Tasks are not interrupted halfway through, and multiple threads never access a variable at the same time. JavaScript code was easy to write, but the downside was that a time-consuming computation could block the entire program. Each approach has its pros and cons.

Lastly, the source code used in this article is available in this GitHub repository. Feel free to clone it and try it yourself.
