How CPython implements free lists

Last updated on April 13, 2022, in python

CPython, the reference implementation of Python, uses free lists to reuse some of the objects again. Such a technique reduces the number of memory allocations and speeds up object creation. This is especially important for simple objects such as floats.

Python types that use free lists:

  • float
  • dictionary
  • list
  • tuple
  • unicode string

When a Python object is destroyed, a free list has an extra slot, and the object meets all the requirements (e.g. size of a tuple), it's moved to the free list and is not deallocated.

When creating a new object, Python checks the free list that contains all free objects of the same type. If there is an object, it removes it from the list and reuses it instead of allocating a new one.

CPython implements free lists using the linked list data structure.

For example, here is the C structure that keeps track of float objects:

struct _Py_float_state {
    /* Special free list
       free_list is a singly-linked list of available PyFloatObjects,
       linked via abuse of their ob_type members. */
    int numfree;
    PyFloatObject *free_list;
};

It keeps the first item and the number of items in the free list. The maximum amount of items in the free list for floats is 100.

The classic implementation of the linked list usually uses another data structure for each item in it. It keeps the pointer to the next item:

struct Item {
    double data;
    struct Item* next;
};

To improve memory consumption and speed, Python does not use such a structure. Instead, it keeps the location of the next object in the object itself, which is implemented as a C structure.

Each Python object has the ob_type type field. Here is how it looks for the float:

typedef struct {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
    double ob_fval;
} PyFloatObject;

The ob_type field points to a structure that describes the type and all its features and methods. When CPython moves the object to the free list, it replaces the ob_type field with the pointer that points to the next item. Since each object type has its own free list, there is no need to keep the type information. It can be easily restored.

Here is how the object allocation looks like for the float object:

PyObject *
PyFloat_FromDouble(double fval)
{
    struct _Py_float_state *state = get_float_state();
    PyFloatObject *op = state->free_list;
    if (op != NULL) {
        state->free_list = (PyFloatObject *) Py_TYPE(op);
        state->numfree--;
    }
    else {
        op = PyObject_Malloc(sizeof(PyFloatObject));
        if (!op) {
            return PyErr_NoMemory();
        }
    }
    _PyObject_Init((PyObject*)op, &PyFloat_Type);
    op->ob_fval = fval;
    return (PyObject *) op;
}

The Py_TYPE macros returns the pointer of the ob_type and CPython casts it to the float object. The _PyObject_Init function sets the correct pointer to the type structure back.

And here is the deallocation

static void
float_dealloc(PyFloatObject *op)
{
    if (PyFloat_CheckExact(op)) {
        struct _Py_float_state *state = get_float_state();
        if (state->numfree >= PyFloat_MAXFREELIST)  {
            PyObject_Free(op);
            return;
        }
        state->numfree++;
        Py_SET_TYPE(op, (PyTypeObject *)state->free_list);
        state->free_list = op;
    }
    else {
        Py_TYPE(op)->tp_free((PyObject *)op);
    }
}

Here the Py_SET_TYPE macros is used to set the pointer to the next item in the free list.

Comments

There are no comments for this post. Be the first.