Provides the processor with a hint indicating that the current code is in a spin loop.
The YieldProcessor routine improves the performance of spin loops by giving the processor a hint that the current code is in a spin loop.
A spin loop is a way of delaying the execution of a thread without yielding control of the thread to the OS. Yielding causes the OS to enter its scheduler and choose another thread to run, which means a context switch. A context switch involves:
- Saving the thread's context (state of CPU registers, and so on)
- A kernel transition from user mode to kernel mode
- Selecting a new thread to run and loading its context
- A kernel transition from kernel mode to user mode
- Losing any state in the CPU cache, the branch prediction buffer (also known as the branch history table), the translation lookaside buffers (TLBs, which cache virtual memory page mappings), and so on, that helped make the code in the current thread "hot" and run faster. That state is lost because the new thread picks up where it left off, somewhere else entirely and possibly in a different process altogether.
If the expected delay for the thread is small, that is, smaller than a thread's time slice (which is on the order of milliseconds), then it can make sense to avoid the costs of a context switch and wait using a spin loop rather than by yielding the thread (such as by calling the Win32 function Sleep() with 0 as the argument).
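The trade-off above can be sketched as a bounded spin wait that falls back to yielding. This is a minimal illustration, not the documented implementation of any Windows component. The names cpu_relax, wait_until_ready, and demo_wait are hypothetical, as is the spin budget of 4000 iterations. Off Windows, the sketch assumes that _mm_pause() on x86 is an acceptable stand-in for YieldProcessor, and std::this_thread::yield() plays the role of Sleep(0).

```cpp
#include <atomic>
#include <thread>

#if defined(_WIN32)
#include <windows.h>    // provides YieldProcessor()
#elif defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>  // provides _mm_pause()
#endif

// Hypothetical portable wrapper around the spin-loop hint.
inline void cpu_relax() {
#if defined(_WIN32)
    YieldProcessor();
#elif defined(__x86_64__) || defined(__i386__)
    _mm_pause();        // assumed stand-in for YieldProcessor on non-Windows x86
#endif
    // On other targets this sketch emits no hint and simply busy-waits.
}

// Spins up to max_spins times waiting for `ready` to become true, then
// falls back to yielding the thread until it does.
// Returns true if the flag was observed during the spin phase.
inline bool wait_until_ready(const std::atomic<bool>& ready, int max_spins) {
    for (int i = 0; i < max_spins; ++i) {
        if (ready.load(std::memory_order_acquire)) {
            return true;
        }
        cpu_relax();
    }
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();  // the wait was not short; let the OS schedule
    }
    return false;
}

// Usage: one thread publishes a value and the caller waits for it.
inline int demo_wait() {
    std::atomic<bool> ready{false};
    int payload = 0;
    std::thread producer([&] {
        payload = 42;
        ready.store(true, std::memory_order_release);
    });
    wait_until_ready(ready, 4000);  // 4000 is an arbitrary spin budget
    producer.join();
    return payload;
}
```

Capping the spin phase keeps the cost bounded when the expected delay turns out to be longer than a time slice.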
Such cases arise when using so-called lock-free techniques and user-mode synchronization primitives. For example, a user-mode synchronization primitive may be implemented with an atomic conditional test-and-set operation such as InterlockedCompareExchange. To reliably and correctly change the state of a variable shared among threads, where the new state of the variable depends in some way on its previous state, you must use either an OS-provided synchronization primitive or an atomic test-and-set in a loop. Using the OS-provided synchronization primitive means inducing a context switch in the case of contention. An atomic test-and-set in a loop simply fails every time contention occurs, so you loop and try again. But if you try again too soon, you may be doing busy-work, because the interfering threads are still busy. So you want to wait just long enough to increase the likelihood that the other threads have moved on; hence a spin loop.
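The retry pattern described above might look like the following sketch. It uses std::atomic's compare_exchange_weak as a portable analogue of InterlockedCompareExchange, which is an assumption made so the example builds outside Windows. The names cpu_relax, add_capped, and demo_add_capped are hypothetical. The doubling backoff is one possible way to choose how long to spin, not something the routine prescribes.

```cpp
#include <atomic>
#include <thread>
#include <vector>

#if defined(_WIN32)
#include <windows.h>    // provides YieldProcessor()
#elif defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>  // provides _mm_pause()
#endif

// Hypothetical portable wrapper around the spin-loop hint.
inline void cpu_relax() {
#if defined(_WIN32)
    YieldProcessor();
#elif defined(__x86_64__) || defined(__i386__)
    _mm_pause();        // assumed stand-in for YieldProcessor on non-Windows x86
#endif
}

// Adds `delta` to `counter` without letting it exceed `limit`. The new
// state depends on the previous state, so a plain store is not enough.
// Returns the value this call stored.
inline long add_capped(std::atomic<long>& counter, long delta, long limit) {
    long observed = counter.load(std::memory_order_relaxed);
    int backoff = 1;  // number of spin hints to issue after a failed attempt
    for (;;) {
        long desired = observed + delta;
        if (desired > limit) {
            desired = limit;
        }
        // Succeeds only if counter still equals `observed`; on failure,
        // `observed` is refreshed with the current value.
        if (counter.compare_exchange_weak(observed, desired,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
            return desired;
        }
        // Another thread interfered (or the weak CAS failed spuriously).
        // Spin briefly before retrying instead of hammering the cache line.
        for (int i = 0; i < backoff; ++i) {
            cpu_relax();
        }
        if (backoff < 64) {
            backoff *= 2;  // assumed cap; tune for the workload
        }
    }
}

// Usage: several threads increment one shared counter concurrently.
inline long demo_add_capped(int threads, int per_thread, long limit) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, per_thread, limit] {
            for (int i = 0; i < per_thread; ++i) {
                add_capped(counter, 1, limit);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter.load();
}
```

No update is lost under contention, because each failed attempt recomputes the desired value from the freshly observed state before retrying.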
Warning: When you try to improve performance using spin loops and lock-free techniques, you might be introducing a fairly large risk of errors in return for only a moderate chance of performance gains.