How many objects does Python allocate during its interpreter lifetime?

Last updated on June 19, 2018, in python

It can be very surprising to see how many objects the Python interpreter temporarily allocates while executing simple scripts. In fact, Python provides a way to check this.

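To do so, we need to compile the standard CPython interpreter with additional debug flags (the COUNT_ALLOCS build and sys.getcounts() are available up to CPython 3.8; they were removed in 3.9):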

./configure CFLAGS='-DCOUNT_ALLOCS' --with-pydebug 
make -s -j2

Let's open an empty interactive REPL and check allocation statistics:

>>> import sys
>>> sys.getcounts()
[('iterator', 7, 7, 4), ('functools._lru_cache_wrapper', 1, 0, 1), ('re.Match', 2, 2, 1),
('re.Pattern', 3, 2, 1), ('SubPattern', 10, 10, 8), ('Pattern', 3, 3, 1),
('IndexError', 4, 4, 1), ('Tokenizer', 3, 3, 1), ('odict_keys', 1, 1, 1),
('odict_iterator', 18, 18, 1), ('odict_items', 17, 17, 1), ('RegexFlag', 18, 8, 10),
('operator.itemgetter', 4, 0, 4), ('PyCapsule', 1, 1, 1), ('Repr', 1, 0, 1),
('_NamedIntConstant', 74, 0, 74), ('collections.OrderedDict', 5, 0, 5),
('EnumMeta', 5, 0, 5), ('DynamicClassAttribute', 2, 0, 2), ('_EnumDict', 5, 5, 1),
('TypeError', 1, 1, 1), ('method-wrapper', 365, 365, 2), ('_C', 1, 1, 1),
('symtable entry', 5, 5, 2), ('OSError', 1, 1, 1), ('Completer', 1, 0, 1),
('ExtensionFileLoader', 2, 0, 2), ('ModuleNotFoundError', 2, 2, 1),
('_Helper', 1, 0, 1), ('_Printer', 3, 0, 3), ('Quitter', 2, 0, 2),
('enumerate', 5, 5, 1), ('_io.IncrementalNewlineDecoder', 1, 1, 1),
('map', 25, 25, 1), ('_Environ', 2, 0, 2), ('async_generator', 2, 1, 1),
('coroutine', 2, 2, 1), ('zip', 1, 1, 1), ('longrange_iterator', 1, 1, 1),
('range_iterator', 7, 7, 1), ('range', 14, 14, 2), ('list_reverseiterator', 2, 2, 1),
('dict_valueiterator', 1, 1, 1), ('dict_values', 2, 2, 1), ('dict_keyiterator', 25, 25, 1),
('dict_keys', 5, 5, 1), ('bytearray_iterator', 1, 1, 1), ('bytearray', 4, 4, 1),
('bytes_iterator', 2, 2, 1), ('IncrementalEncoder', 2, 0, 2), ('_io.BufferedWriter', 2, 0, 2),
('IncrementalDecoder', 2, 1, 2), ('_io.TextIOWrapper', 4, 1, 4), ('_io.BufferedReader', 2, 1, 2),
('_abc_data', 39, 0, 39), ('mappingproxy', 199, 199, 1), ('ABCMeta', 39, 0, 39),
('CodecInfo', 1, 0, 1), ('str_iterator', 7, 7, 1), ('memoryview', 60, 60, 2),
('managedbuffer', 31, 31, 1), ('slice', 589, 589, 1), ('_io.FileIO', 33, 30, 5),
('SourceFileLoader', 29, 0, 29), ('set', 166, 101, 80), ('StopIteration', 33, 33, 1),
('FileFinder', 11, 0, 11), ('os.stat_result', 145, 145, 1), ('ImportError', 2, 2, 1),
('FileNotFoundError', 10, 10, 1), ('ZipImportError', 12, 12, 1), ('zipimport.zipimporter', 12, 12, 1),
('NameError', 4, 4, 1), ('set_iterator', 46, 46, 1), ('frozenset', 50, 0, 50), ('_ImportLockContext', 113, 113, 1),
('list_iterator', 305, 305, 5), ('_thread.lock', 92, 92, 10), ('_ModuleLock', 46, 46, 5), ('KeyError', 67, 67, 2),
('_ModuleLockManager', 46, 46, 5), ('generator', 125, 125, 1), ('_installed_safely', 52, 52, 5),
('method', 1095, 1093, 14), ('ModuleSpec', 58, 4, 54), ('AttributeError', 22, 22, 1),
('traceback', 154, 154, 3), ('dict_itemiterator', 45, 45, 1), ('dict_items', 46, 46, 1),
('object', 8, 1, 7), ('tuple_iterator', 631, 631, 3), ('cell', 71, 31, 42),
('classmethod', 58, 0, 58), ('property', 18, 2, 16), ('super', 360, 360, 1),
('type', 78, 3, 75), ('function', 1705, 785, 922), ('frame', 5442, 5440, 36),
('code', 1280, 276, 1063), ('bytes', 2999, 965, 2154), ('Token.MISSING', 1, 0, 1),
('stderrprinter', 1, 1, 1), ('MemoryError', 16, 16, 16), ('sys.thread_info', 1, 0, 1),
('sys.flags', 2, 0, 2), ('types.SimpleNamespace', 1, 0, 1), ('sys.version_info', 1, 0, 1),
('sys.hash_info', 1, 0, 1), ('sys.int_info', 1, 0, 1), ('float', 584, 569, 20),
('sys.float_info', 1, 0, 1), ('module', 56, 0, 56), ('staticmethod', 16, 0, 16),
('weakref', 505, 82, 426), ('int', 3540, 2775, 766), ('member_descriptor', 246, 10, 239),
('list', 992, 919, 85), ('getset_descriptor', 240, 4, 240), ('classmethod_descriptor', 12, 0, 12),
('method_descriptor', 678, 0, 678), ('builtin_function_or_method', 1796, 1151, 651), ('wrapper_descriptor', 1031, 5, 1026),
('str', 16156, 9272, 6950), ('dict', 1696, 900, 810), ('tuple', 10367, 6110, 4337)]

Here is how we can make it more readable:

import sys

def print_allocations(top_k=None):
    # Each entry is a tuple: (type name, allocs, deallocs, max alive).
    allocs = sys.getcounts()
    if top_k:
        # Keep only the top_k types with the most allocations.
        allocs = sorted(allocs, key=lambda tup: tup[1], reverse=True)[:top_k]

    for obj in allocs:
        alive = obj[1] - obj[2]
        print("Type {},  allocs: {}, deallocs: {}, max: {}, alive: {}".format(*obj, alive))

>>> print_allocations(10)
Type str,  allocs: 17328, deallocs: 10312, max: 7016, alive: 7016
Type tuple,  allocs: 10550, deallocs: 6161, max: 4389, alive: 4389
Type frame,  allocs: 5445, deallocs: 5442, max: 36, alive: 3
Type int,  allocs: 3988, deallocs: 3175, max: 813, alive: 813
Type bytes,  allocs: 3031, deallocs: 1044, max: 2154, alive: 1987
Type builtin_function_or_method,  allocs: 1809, deallocs: 1164, max: 651, alive: 645
Type dict,  allocs: 1726, deallocs: 930, max: 815, alive: 796
Type function,  allocs: 1706, deallocs: 811, max: 922, alive: 895
Type code,  allocs: 1284, deallocs: 304, max: 1063, alive: 980
Type method,  allocs: 1095, deallocs: 1093, max: 14, alive: 2

Where:

  • allocs - the number of allocations since interpreter startup
  • deallocs - the number of deallocations since interpreter startup (objects freed by reference counting or by the garbage collector)
  • alive - the number of objects that are currently alive (allocs - deallocs)
  • max - the maximum number of objects alive at the same time since interpreter startup
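
The four fields can be given names to make the arithmetic explicit. This is a minimal sketch, assuming each entry has the (name, allocs, deallocs, max) layout shown above; `TypeCount` is a hypothetical helper, not part of CPython, and the sample row is copied from the listing above.

```python
from typing import NamedTuple


class TypeCount(NamedTuple):
    """Hypothetical wrapper for one sys.getcounts() entry."""
    name: str
    allocs: int
    deallocs: int
    max_alive: int

    @property
    def alive(self):
        # Objects still alive = allocations minus deallocations.
        return self.allocs - self.deallocs


# Sample row taken from the REPL output above.
row = TypeCount('frame', 5445, 5442, 36)
print(row.name, row.alive)  # frame 3
```

On a COUNT_ALLOCS build, the same wrapper could be applied to every entry with `[TypeCount(*entry) for entry in sys.getcounts()]`.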


As you can see, an empty Python REPL has allocated 17 328 strings and 10 550 tuples. That's an insane number of allocations! Keep in mind that, unlike a regular Python script, the REPL imports additional modules which implement its features.

Now, let's test a hello world Flask application:

import sys

from flask import Flask

app = Flask(__name__)

# print_allocations() is the helper defined above; it has to be included in this file.

@app.route('/')
def hello_world():
    print_allocations(15)
    return 'Hello, World!'

./python -m flask run
ab -n 100 http://127.0.0.1:5000/

After processing 100 HTTP requests, the statistics look as follows:

Type str,  allocs: 192649, deallocs: 138892, max: 54320, alive: 53757
Type frame,  allocs: 191752, deallocs: 191714, max: 158, alive: 38
Type tuple,  allocs: 183474, deallocs: 150069, max: 33581, alive: 33405
Type int,  allocs: 85154, deallocs: 81100, max: 4115, alive: 4054
Type bytes,  allocs: 31671, deallocs: 14331, max: 17381, alive: 17340
Type list,  allocs: 29846, deallocs: 27541, max: 2415, alive: 2305
Type builtin_function_or_method,  allocs: 28525, deallocs: 27572, max: 957, alive: 953
Type dict,  allocs: 19900, deallocs: 14800, max: 5280, alive: 5100
Type method,  allocs: 15170, deallocs: 15105, max: 74, alive: 65
Type function,  allocs: 14761, deallocs: 7086, max: 7711, alive: 7675
Type slice,  allocs: 12521, deallocs: 12521, max: 1, alive: 0
Type list_iterator,  allocs: 10795, deallocs: 10795, max: 35, alive: 0
Type code,  allocs: 9849, deallocs: 1749, max: 8107, alive: 8100
Type tuple_iterator,  allocs: 8938, deallocs: 8938, max: 4, alive: 0
Type float,  allocs: 6033, deallocs: 5889, max: 152, alive: 144

Across all the requests, the Python interpreter running a simple Flask application allocated 847 261 objects in total. Most of them (714 336) were temporary and were deallocated once they were no longer needed. The rest (132 925) are still alive.
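
Totals like these come from summing the counters over every type, not only the top 15 printed above. A rough sketch of that arithmetic, assuming the same (name, allocs, deallocs, max) tuple layout; `summarize` is a hypothetical helper, and the sample uses only three rows copied from the Flask output, so its result is not the full total:

```python
def summarize(counts):
    """Return (total allocs, total deallocs, still alive) for a list of
    (name, allocs, deallocs, max) tuples."""
    total_allocs = sum(entry[1] for entry in counts)
    total_deallocs = sum(entry[2] for entry in counts)
    return total_allocs, total_deallocs, total_allocs - total_deallocs


# Three rows from the Flask statistics above, used purely as sample input.
sample = [
    ('slice', 12521, 12521, 1),
    ('list_iterator', 10795, 10795, 35),
    ('method', 15170, 15105, 74),
]
print(summarize(sample))  # (38486, 38421, 65)
```

On a COUNT_ALLOCS build, `summarize(sys.getcounts())` would give the interpreter-wide figures.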

Frame and code objects

There are a lot of code and frame objects in the example above. Why do we need them?

In short, a code object stores a block of compiled code, while a frame object represents a single entry on the call stack. In Python, the most common kind of block is a function. Every function definition is compiled into a code object, and every function call requires its own frame object, where Python stores the local variables. Apart from local variables, each frame object may allocate tens of auxiliary objects.
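
The relationship between the two can be observed on a regular CPython build. The sketch below is only an illustration: `depth_frames` is a hypothetical function, and `sys._getframe()` is a CPython-specific call. It recurses three times, keeps the frame of every call, and checks that the frames are distinct objects that all point to the same code object.

```python
import sys


def depth_frames(n, seen=None):
    """Recurse n times and collect the frame object of every call."""
    if seen is None:
        seen = []
    seen.append(sys._getframe())  # frame of the current call
    if n > 1:
        depth_frames(n - 1, seen)
    return seen


frames = depth_frames(3)

# Every call got its own frame object...
distinct_frames = len({id(f) for f in frames})
# ...but all of them execute the one code object created for the function.
shared_code = all(f.f_code is depth_frames.__code__ for f in frames)

print(distinct_frames, shared_code)  # 3 True
```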

From where do all these allocations come?

Python has a very dynamic nature which comes at a cost. In order to support many features at runtime, it allocates a lot of auxiliary objects.

For example, I found out that a simple function definition allocates at least five dictionaries, five tuples, and four lists that live until the end of the Python process. In turn, all these objects allocate their members, e.g., integers, floats, and strings. An ordinary class definition may allocate hundreds of container objects. Unfortunately, it's hard to give precise numbers, since I didn't find an easy way to measure them automatically.
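
One rough way to approximate such a measurement on a regular build is to compare the objects known to the garbage collector before and after running a piece of code. This is not the method used for the numbers above, and it is only a sketch: `gc.get_objects()` reports just the GC-tracked objects that are still alive, so temporary objects and untracked ones (such as strings and integers) are missed, and the measurement itself adds a little noise. The helper names are hypothetical.

```python
import gc
from collections import Counter


def tracked_type_counts():
    """Count the currently alive GC-tracked objects by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def allocation_delta(source):
    """Execute source and report which GC-tracked types gained objects."""
    namespace = {}
    gc.collect()
    before = tracked_type_counts()
    exec(source, namespace)  # the namespace keeps the new objects alive
    gc.collect()
    after = tracked_type_counts()
    # Counter subtraction keeps only the types whose count increased.
    return after - before, namespace


delta, namespace = allocation_delta("def f():\n    return 1\n")
for type_name, gained in delta.most_common():
    print(type_name, gained)
```

The exact output depends on the interpreter version, so treat it as a lower bound rather than a precise count.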

To keep object allocation fast, CPython relies on a variety of optimizations, such as a dedicated allocator for small objects and caches of frequently used objects.
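
One optimization of this kind that is easy to observe is CPython's cache of small integers: values from -5 to 256 are preallocated and reused instead of being allocated again. The sketch below relies on this CPython implementation detail, and `same_object` is a hypothetical helper; the integers are built from strings at runtime so that the compiler cannot merge them as constants.

```python
def same_object(text):
    """Build the same integer twice at runtime and compare identities."""
    first = int(text)
    second = int(text)
    return first is second


small_shared = same_object("100")     # inside the small-integer cache
large_shared = same_object("100000")  # outside the cache: two allocations

print(small_shared, large_shared)  # on CPython: True False
```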

Sometimes it's good to know how many details Python hides from us: unnecessary for everyday Python use, but interesting nonetheless.


