A look into the functools implementation in CPython

As someone who's still relatively new to Python, I'm on a quest to get a better understanding of the language and its tools. As part of that, I decided to take a look under the hood at how some of the modules in the standard library have been implemented. Since "functions as first class citizens" is one of my favorite Python features, I decided to start with the functools module.

The first step was to get the source code. There's a semi-official CPython repository on GitHub that can be a good start. But eventually, I opted for the more official route and downloaded the source code from the Python Software Foundation's downloads page so that I could comfortably explore it with my text editor.

Glancing at the top-level directories, and since I was looking for a module in the standard library, Lib caught my attention. Expanding it, there were a lot of familiar names, including a file called functools.py. Well, that was easy.

At the top of the file I found:

__all__ = ['update_wrapper', 'wraps', 'WRAPPER_ASSIGNMENTS', 'WRAPPER_UPDATES',  
           'total_ordering', 'cmp_to_key', 'lru_cache', 'reduce', 'partial',
           'partialmethod', 'singledispatch']

Those are all the functions in the module's documentation, plus the constants WRAPPER_ASSIGNMENTS and WRAPPER_UPDATES, which are used as default values for update_wrapper and wraps. I must be in the right place.
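To make the role of those two constants a bit more concrete, here is a small sketch of wraps in action. The decorator name passthrough and the function greet are made up for illustration; only functools.wraps, WRAPPER_ASSIGNMENTS and WRAPPER_UPDATES come from the module itself.

```python
import functools


def passthrough(func):
    """Hypothetical decorator that just forwards the call."""
    # wraps copies the attributes named in WRAPPER_ASSIGNMENTS from func
    # onto wrapper, and merges the ones named in WRAPPER_UPDATES.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@passthrough
def greet(name):
    """Return a greeting."""
    return f"Hello, {name}"


print(functools.WRAPPER_ASSIGNMENTS)  # includes '__name__' and '__doc__'
print(functools.WRAPPER_UPDATES)
print(greet.__name__, "-", greet.__doc__)
```

Without the wraps line, greet.__name__ would report "wrapper" and the docstring would be lost.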

The lines that came immediately after that were a little less encouraging though.

try:
    from _functools import reduce
except ImportError:
    pass

So, reduce is one of the functions exposed by the functools module, but it's not defined here. It's imported from a mysterious internal module called _functools. At this point I had a strong feeling this would be a C extension. I was right.
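One quick way to check that hunch from the interpreter, without opening any C files, is to ask inspect whether the function is a built-in. This is only a sketch: the printed answer depends on the interpreter and on whether the C accelerator is available, so I'm not promising a particular result.

```python
import functools
import inspect

# Ordinary use of reduce: fold a list into a single value.
total = functools.reduce(lambda acc, x: acc + x, [1, 2, 3, 4], 0)
print(total)

# On a build where reduce comes from the C extension, this reports True;
# a pure-Python function would report False.
print(inspect.isbuiltin(functools.reduce))
```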

I didn't find any other files in the Lib folder that looked related to functools (the folder also seems to contain only Python code), so I took another step back and looked at what else was in the root. Another promising folder appeared, this one called Modules. Inside, there's a README file that says:

Source files for standard library extension modules, and former extension modules that are now builtin modules.

That sounds about right. Among many other C files, there was the one I was looking for: _functoolsmodule.c.

Now, at this point I can't say that I fully understand what's going on in this file, but I took the following line as proof that this was in fact the definition of the _functools module:

static struct PyModuleDef _functoolsmodule = {  

After browsing this file for a while (and relying on some of the comments), there seem to be definitions for:

  • partial
  • cmp_to_key
  • reduce
  • lru_cache

So, functools is a hybrid module, partly written in Python and partly written in C. But it actually goes a bit further than that. Going back to the functools.py file and looking at, for example, cmp_to_key, we find this:

def cmp_to_key(mycmp):  
    """Convert a cmp= function into a key= function"""
    class K(object):
        __slots__ = ['obj']
        def __init__(self, obj):
            self.obj = obj
        def __lt__(self, other):
            return mycmp(self.obj, other.obj) < 0
        def __gt__(self, other):
            return mycmp(self.obj, other.obj) > 0
        def __eq__(self, other):
            return mycmp(self.obj, other.obj) == 0
        def __le__(self, other):
            return mycmp(self.obj, other.obj) <= 0
        def __ge__(self, other):
            return mycmp(self.obj, other.obj) >= 0
        __hash__ = None
    return K

try:
    from _functools import cmp_to_key
except ImportError:
    pass

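There's a pure-Python implementation of cmp_to_key, and right after it the module tries to import the same name from _functools. If the import succeeds, the C version replaces the Python one. If it fails, the pure-Python version stays in place as a fallback for when the C extension is not available for some reason.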
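Stripped of the functools specifics, the define-then-override idea looks roughly like this. Both the function clamp and the module name _hypothetical_accel are invented for the example; on any normal installation the import fails and the Python version is what gets used.

```python
def clamp(value, low, high):
    """Pure-Python fallback (hypothetical example function)."""
    return max(low, min(value, high))


try:
    # `_hypothetical_accel` is a made-up name standing in for a C extension.
    # If it existed, its clamp would silently replace the one defined above.
    from _hypothetical_accel import clamp
except ImportError:
    pass  # no accelerator available: keep the pure-Python definition

print(clamp(15, 0, 10))
```

Callers never need to know which implementation they got, since both are reachable under the same name.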

This pattern is used for every other function that gets imported from _functools except for reduce. My guess is that this has something to do with the fact that reduce used to be a built-in function.

The code I found here varies a lot in complexity. The implementations of singledispatch and lru_cache are quite complex (and perhaps deserving of their own posts), while wraps and total_ordering are so accessible that even a beginner can understand them very quickly. The C extension parts are arguably much harder to understand, and it would take a lot more effort to become familiar with that part of the code.
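As a reminder of what total_ordering does from the user's side, here is a minimal sketch. The Version class is hypothetical; the decorator fills in the comparison methods I didn't write, based on __eq__ and __lt__.

```python
import functools


@functools.total_ordering
class Version:
    """Hypothetical two-part version number used only for this example."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key == other.key

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key < other.key


# __ge__ and __le__ were never defined above; total_ordering supplied them.
print(Version(3, 5) >= Version(3, 4))
print(Version(3, 4) <= Version(3, 4))
```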

But what about tests?

Inside the Lib folder, there's another folder called test, which holds a large number of files. One in particular, conveniently called test_functools.py, holds the test cases for the functools module.

The first thing to notice about this is that it uses unittest. Great example of dogfooding. The second thing I noticed was these two lines:

py_functools = support.import_fresh_module('functools', blocked=['_functools'])  
c_functools = support.import_fresh_module('functools', fresh=['_functools'])  

Since parts of the module exist in both Python and C, both versions need to be tested individually. The first line imports functools with _functools blocked, which forces the pure-Python fallbacks, and the second imports it with a fresh _functools, which exercises the C versions. A curiosity I found here is that for the functions that have implementations in both C and Python, the C versions seem to be much more thoroughly tested. That makes sense, since they are probably the ones being called in most use cases, so there's a bigger ROI there.
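To get a feel for what "blocked" means, here is a simplified stand-in I put together. It is not the real test.support helper, and the function name import_blocking is my own. It relies on the documented behavior that a None entry in sys.modules makes importing that name raise ImportError.

```python
import importlib
import inspect
import sys


def import_blocking(name, blocked):
    """Import a fresh copy of `name` while imports of `blocked` fail.

    Simplified illustration only, not the test.support implementation.
    """
    missing = object()
    saved = {n: sys.modules.get(n, missing) for n in [name, *blocked]}
    try:
        sys.modules.pop(name, None)      # force `name` to be re-executed
        for b in blocked:
            sys.modules[b] = None        # `import b` now raises ImportError
        return importlib.import_module(name)
    finally:
        # Put sys.modules back exactly as it was found.
        for n, mod in saved.items():
            if mod is missing:
                sys.modules.pop(n, None)
            else:
                sys.modules[n] = mod


py_functools = import_blocking("functools", ["_functools"])

# With the C module blocked, cmp_to_key is the plain Python function.
print(inspect.isfunction(py_functools.cmp_to_key))
print(sorted([3, 1, 2], key=py_functools.cmp_to_key(lambda a, b: a - b)))
```

The returned module is a separate copy, so the functools that the rest of the program uses is left untouched.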

Diving into Python's source code made for an entertaining and educational couple of hours. For a codebase that's been around for decades, it was pretty easy to find what I was looking for.