Why is there a GIL?
The GIL has been around in CPython since the inception of Python threads, in 1992. It's designed to ensure thread safety of running python code. Python interpreters written with a GIL prevent multiple native threads from executing Python bytecodes at once. This makes it easy for plugins to ensure that their code is thread-safe: simply lock the GIL, and only your active thread is able to run, so your code is automatically thread-safe.
Short version: the GIL ensures that no matter how many processors and threads you have, only one thread of a python interpreter will run at one time.
This has a lot of ease-of-use benefits, but also has a lot of negative benefits as well.
Note that a GIL is not a requirment of the Python language. Consequently, you can't access the GIL directly from standard python code. Not all implementations of Python use a GIL.
Interpreters that have a GIL: CPython, PyPy, Cython (but you can disable the GIL with nogil
)
Interpreters that do not have a GIL: Jython, IronPython
Details on how the GIL operates:
When a thread is running, it locks the GIL. When a thread wants to run, it requests the GIL, and waits until it is available. In CPython, before version 3.2, the running thread would check after a certain number of python instructions to see if other code wanted the lock (that is, it released the lock and then requested it again). This method tended to cause thread starvation, largely because the thread that released the lock would acquire it again before the waiting threads had a chance to wake up. Since 3.2, threads that want the GIL wait for the lock for some time, and after that time, they set a shared variable that forces the running thread to yield. This can still result in drastically longer execution times, though. See the links below from dabeaz.com (in the references section) for more details.
CPython automatically releases the GIL when a thread performs an I/O operation. Image processing libraries and numpy number crunching operations release the GIL before doing their processing.
Benefits of the GIL
For interpreters that use the GIL, the GIL is systemic. It is used to preserve the state of the application. Benefits include:
- Garbage collection - thread-safe reference counts must be modified while the GIL is locked. In CPython, all of garbarge collection is tied to the GIL. This is a big one; see the python.org wiki article about the GIL (listed in References, below) for details about what must still be functional if one wanted to remove the GIL.
- Ease for programmers dealing with the GIL - locking everything is simplistic, but easy to code to
- Eases the import of modules from other languages
Consequences of the GIL
The GIL only allows one thread to run python code at a time inside the python interpreter. This means that multithreading of processes that run strict python code simply doesn't work. When using threads against the GIL, you will likely have worse performance with the threads than if you ran in a single thread.
References:
https://wiki.python.org/moin/GlobalInterpreterLock - quick summary of what it does, fine details on all the benefits
http://programmers.stackexchange.com/questions/186889/why-was-python-written-with-the-gil - clearly written summary
http://www.dabeaz.com/python/UnderstandingGIL.pdf - how the GIL works and why it slows down on multiple cores
http://www.dabeaz.com/GIL/gilvis/index.html - visualization of the data showing how the GIL locks up threads
http://jeffknupp.com/blog/2012/03/31/pythons-hardest-problem/ - simple to understand history of the GIL problem
https://jeffknupp.com/blog/2013/06/30/pythons-hardest-problem-revisited/ - details on ways to work around the GIL's limitations