Software engineering notes

Go Scheduler

What is the Go Scheduler?

Go scheduler’s purpose is to efficiently distribute goroutines over multiple OS threads.

Kernel thread is expensive; therefore, reducing and reusing kernel threads are keys.

Scheduling Basics

P (Processor)

M (Machine, OS thread)

G (Goroutine)

Runqueue

P2 has no work to do and P1 has no work to steal (we will talk about work stealing later).

Global Runqueue
┏━━━━┓
┃ G7 ┃
┗━━━━┛

┏━━━━┓  ┏━━━━┓  ┏━━━━┓         ┏━━━━┓  ┏━━━━┓  ┏━━━━┓
┃ P1 ┣━━┃ M1 ┣━━┃ G1 ┃         ┃ P2 ┣━━┃ M2 ┣━━┃    ┃
┗━┳━━┛  ┗━━━━┛  ┗━━━━┛         ┗━┳━━┛  ┗━━━━┛  ┗━━━━┛
  ┃                              ┃
  ┃    Local Runqueue            ┃    Local Runqueue
  ┃    ┏━━━━━━━┓                 ┃    ┏━━━━━━━┓
  ┗━━━━┫ empty ┃                 ┗━━━━┫ empty ┃
       ┗━━━━━━━┛                      ┗━━━━━━━┛

Since there was runnable G G7 in global runqueue, P2 got G7 from there.

Global Runqueue
┏━━━━━━━┓
┃ empty ┃
┗━━━━━━━┛

┏━━━━┓  ┏━━━━┓  ┏━━━━┓         ┏━━━━┓  ┏━━━━┓  ┏━━━━┓
┃ P1 ┣━━┃ M1 ┣━━┃ G1 ┃         ┃ P2 ┣━━┃ M2 ┣━━┃ G7 ┃
┗━┳━━┛  ┗━━━━┛  ┗━━━━┛         ┗━┳━━┛  ┗━━━━┛  ┗━━━━┛
  ┃                              ┃
  ┃    Local Runqueue            ┃    Local Runqueue
  ┃    ┏━━━━━━━┓                 ┃    ┏━━━━━━━┓
  ┗━━━━┫ empty ┃                 ┗━━━━┫ empty ┃
       ┗━━━━━━━┛                      ┗━━━━━━━┛

Context Switching

┏━━━━━━━━━━━━━━━━━━━━━━┓
┃  CPU Core            ┃
┃┏━━━━━━━━━━━━━━━━━━━━┓┃
┃┃ Thread 1           ┣╋━━┓
┃┃ ┏━━━━┓┏━━━━┓┏━━━━┓ ┃┃  ┃
┃┃ ┃ G1 ┃┃ G2 ┃┃ G3 ┃ ┃┃  ┃
┃┃ ┗━━━━┛┗━━━━┛┗━━━━┛ ┃┃  ┣━━>OS scheduler
┃┗━━━━━━━━━━━━━━━━━━━━┛┃  ┃
┃┏━━━━━━━━━━━━━━━━━━━━┓┃  ┃
┃┃ Thread 2           ┣╋━━┛
┃┃ ┏━━━━┓┏━━━━┓┏━━━━┓ ┃┃
┃┃ ┃ G4 ┃┃ G5 ┃┃ G6 ┃ ┣╋━━━━━>Go scheduler
┃┃ ┗━━━━┛┗━━━━┛┗━━━━┛ ┃┃
┃┗━━━━━━━━━━━━━━━━━━━━┛┃
┗━━━━━━━━━━━━━━━━━━━━━━┛

Asynchronous System Calls

netpoller

G1 ran on M1 and made a network call, so it was moved to the netpoller. And P1 got G2 from its local runqueue and G2 was context-switched on M1. Until G1 finish its network call, it will be moved back to the P1’s local runqueue.

netpoller
┏━━━━┓
┃ G1 ┃
┗━━━━┛

┏━━━━┓  ┏━━━━┓  ┏━━━━┓
┃ P1 ┣━━┃ M1 ┣━━┃ G2 ┃
┗━┳━━┛  ┗━━━━┛  ┗━━━━┛
  ┃
  ┃    Local Runqueue
  ┃    ┏━━━━┓  ┏━━━━┓  ┏━━━━┓
  ┗━━━━┫ G3 ┣━━┃ G4 ┣━━┃ G5 ┃
       ┗━━━━┛  ┗━━━━┛  ┗━━━━┛

Synchronous System Calls

G1 ran on M1 and made a system call that blocked M1. M1 with G1 was moved off from P1, then M2 was created and replace M1. Then P1 got G2 from its local runqueue and G2 was context-switched on M2. Until G1 finish its system call, it will be moved back to P1’s local runqueue and M1 will park as a idle thread for future use.

┏━━━━┓  ┏━━━━┓
┃ M1 ┣━━┃ G1 ┃
┗━━━━┛  ┗━━━━┛

┏━━━━┓  ┏━━━━┓  ┏━━━━┓
┃ P1 ┣━━┃ M2 ┣━━┃ G2 ┃
┗━┳━━┛  ┗━━━━┛  ┗━━━━┛
  ┃
  ┃    Local Runqueue
  ┃    ┏━━━━┓  ┏━━━━┓  ┏━━━━┓
  ┗━━━━┫ G3 ┣━━┃ G4 ┣━━┃ G5 ┃
       ┗━━━━┛  ┗━━━━┛  ┗━━━━┛

Spinning Thread

For the reason that hand-off and thread parking/unparking increase latency, minimise these actions for optimal thread management.

The problem is that it is impossible to predict the future whether goroutines are comming. We do not want to just park a worker thread and then unpark it.

The solution is to maintain one “idle” thread on M for incoming goroutines, and this state of worker thread called “spinning”.

Work Stealing

P2 has finished all work and it will try to steal work from P1.

┏━━━━┓  ┏━━━━┓  ┏━━━━┓         ┏━━━━┓  ┏━━━━┓  ┏━━━━┓
┃ P1 ┣━━┃ M1 ┣━━┃ G1 ┃         ┃ P2 ┣━━┃ M2 ┣━━┃    ┃
┗━┳━━┛  ┗━━━━┛  ┗━━━━┛         ┗━┳━━┛  ┗━━━━┛  ┗━━━━┛
  ┃                              ┃
  ┃    Local Runqueue            ┃    Local Runqueue
  ┃    ┏━━━━┓  ┏━━━━┓  ┏━━━━┓    ┃    ┏━━━━━━━┓
  ┗━━━━┫ G4 ┣━━┃ G5 ┣━━┃ G6 ┃    ┗━━━━┫ empty ┃
       ┗━━━━┛  ┗━━━━┛  ┗━━━━┛         ┗━━━━━━━┛

P2 stealed half of Gs (G4 and G5) from P1’s local runqueue and G4 was context-switched on M2.

┏━━━━┓  ┏━━━━┓  ┏━━━━┓         ┏━━━━┓  ┏━━━━┓  ┏━━━━┓
┃ P1 ┣━━┃ M1 ┣━━┃ G1 ┃         ┃ P2 ┣━━┃ M2 ┣━━┃ G4 ┃
┗━┳━━┛  ┗━━━━┛  ┗━━━━┛         ┗━┳━━┛  ┗━━━━┛  ┗━━━━┛
  ┃                              ┃
  ┃    Local Runqueue            ┃    Local Runqueue
  ┃    ┏━━━━┓                    ┃    ┏━━━━┓
  ┗━━━━┫ G6 ┃                    ┗━━━━┫ G5 ┃
       ┗━━━━┛                         ┗━━━━┛

Trace go scheduler

schedtrace

$ GOMAXPROCS=2 GODEBUG=schedtrace=1000 go run main.go
...
SCHED 2009ms: gomaxprocs=2 idleprocs=0 threads=4 spinningthreads=0 idlethreads=1 runqueue=0 [8 0]
...

schedetail

$ GOMAXPROCS=2 GODEBUG=schedtrace=1000,scheddetail=1 go run main.go
...
SCHED 0ms: gomaxprocs=2 idleprocs=0 threads=4 spinningthreads=1 idlethreads=1 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=0 syscalltick=0 m=3 runqsize=0 gfreecnt=0
  P1: status=0 schedtick=2 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  M3: p=0 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=true blocked=false lockedg=-1
  M2: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=true lockedg=-1
  M1: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  M0: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=true lockedg=1
  G1: status=1(chan receive) m=-1 lockedm=0
  G2: status=4(force gc (idle)) m=-1 lockedm=-1
  G3: status=4(GC sweep wait) m=-1 lockedm=-1
...

Ref: