The idea here is to move the computationally intensive jobs to C (using special macros), independent of Python, and have the C code release the GIL while it's working.
#include "Python.h"
...
PyObject *pyfunc(PyObject *self, PyObject *args) {
...
Py_BEGIN_ALLOW_THREADS
// Threaded C code
...
Py_END_ALLOW_THREADS
...
}