How many objects does Python allocate during its interpreter lifetime?
It can be very surprising to see how many objects the Python interpreter temporarily allocates while executing even simple scripts. In fact, CPython provides a way to check.
To do so, we need to compile a standard CPython interpreter with additional debug flags (the COUNT_ALLOCS special build used here is available in CPython 3.8 and earlier; it was removed in Python 3.9):
./configure CFLAGS='-DCOUNT_ALLOCS' --with-pydebug
make -s -j2
Let's open an empty interactive REPL and check allocation statistics:
>>> import sys
>>> sys.getcounts()
[('iterator', 7, 7, 4), ('functools._lru_cache_wrapper', 1, 0, 1), ('re.Match', 2, 2, 1), ('re.Pattern', 3, 2, 1), ('SubPattern', 10, 10, 8), ('Pattern', 3, 3, 1), ('IndexError', 4, 4, 1), ('Tokenizer', 3, 3, 1), ('odict_keys', 1, 1, 1), ('odict_iterator', 18, 18, 1), ('odict_items', 17, 17, 1), ('RegexFlag', 18, 8, 10), ('operator.itemgetter', 4, 0, 4), ('PyCapsule', 1, 1, 1), ('Repr', 1, 0, 1), ('_NamedIntConstant', 74, 0, 74), ('collections.OrderedDict', 5, 0, 5), ('EnumMeta', 5, 0, 5), ('DynamicClassAttribute', 2, 0, 2), ('_EnumDict', 5, 5, 1), ('TypeError', 1, 1, 1), ('method-wrapper', 365, 365, 2), ('_C', 1, 1, 1), ('symtable entry', 5, 5, 2), ('OSError', 1, 1, 1), ('Completer', 1, 0, 1), ('ExtensionFileLoader', 2, 0, 2), ('ModuleNotFoundError', 2, 2, 1), ('_Helper', 1, 0, 1), ('_Printer', 3, 0, 3), ('Quitter', 2, 0, 2), ('enumerate', 5, 5, 1), ('_io.IncrementalNewlineDecoder', 1, 1, 1), ('map', 25, 25, 1), ('_Environ', 2, 0, 2), ('async_generator', 2, 1, 1), ('coroutine', 2, 2, 1), ('zip', 1, 1, 1), ('longrange_iterator', 1, 1, 1), ('range_iterator', 7, 7, 1), ('range', 14, 14, 2), ('list_reverseiterator', 2, 2, 1), ('dict_valueiterator', 1, 1, 1), ('dict_values', 2, 2, 1), ('dict_keyiterator', 25, 25, 1), ('dict_keys', 5, 5, 1), ('bytearray_iterator', 1, 1, 1), ('bytearray', 4, 4, 1), ('bytes_iterator', 2, 2, 1), ('IncrementalEncoder', 2, 0, 2), ('_io.BufferedWriter', 2, 0, 2), ('IncrementalDecoder', 2, 1, 2), ('_io.TextIOWrapper', 4, 1, 4), ('_io.BufferedReader', 2, 1, 2), ('_abc_data', 39, 0, 39), ('mappingproxy', 199, 199, 1), ('ABCMeta', 39, 0, 39), ('CodecInfo', 1, 0, 1), ('str_iterator', 7, 7, 1), ('memoryview', 60, 60, 2), ('managedbuffer', 31, 31, 1), ('slice', 589, 589, 1), ('_io.FileIO', 33, 30, 5), ('SourceFileLoader', 29, 0, 29), ('set', 166, 101, 80), ('StopIteration', 33, 33, 1), ('FileFinder', 11, 0, 11), ('os.stat_result', 145, 145, 1), ('ImportError', 2, 2, 1), ('FileNotFoundError', 10, 10, 1),
('ZipImportError', 12, 12, 1), ('zipimport.zipimporter', 12, 12, 1), ('NameError', 4, 4, 1), ('set_iterator', 46, 46, 1), ('frozenset', 50, 0, 50), ('_ImportLockContext', 113, 113, 1), ('list_iterator', 305, 305, 5), ('_thread.lock', 92, 92, 10), ('_ModuleLock', 46, 46, 5), ('KeyError', 67, 67, 2), ('_ModuleLockManager', 46, 46, 5), ('generator', 125, 125, 1), ('_installed_safely', 52, 52, 5), ('method', 1095, 1093, 14), ('ModuleSpec', 58, 4, 54), ('AttributeError', 22, 22, 1), ('traceback', 154, 154, 3), ('dict_itemiterator', 45, 45, 1), ('dict_items', 46, 46, 1), ('object', 8, 1, 7), ('tuple_iterator', 631, 631, 3), ('cell', 71, 31, 42), ('classmethod', 58, 0, 58), ('property', 18, 2, 16), ('super', 360, 360, 1), ('type', 78, 3, 75), ('function', 1705, 785, 922), ('frame', 5442, 5440, 36), ('code', 1280, 276, 1063), ('bytes', 2999, 965, 2154), ('Token.MISSING', 1, 0, 1), ('stderrprinter', 1, 1, 1), ('MemoryError', 16, 16, 16), ('sys.thread_info', 1, 0, 1), ('sys.flags', 2, 0, 2), ('types.SimpleNamespace', 1, 0, 1), ('sys.version_info', 1, 0, 1), ('sys.hash_info', 1, 0, 1), ('sys.int_info', 1, 0, 1), ('float', 584, 569, 20), ('sys.float_info', 1, 0, 1), ('module', 56, 0, 56), ('staticmethod', 16, 0, 16), ('weakref', 505, 82, 426), ('int', 3540, 2775, 766), ('member_descriptor', 246, 10, 239), ('list', 992, 919, 85), ('getset_descriptor', 240, 4, 240), ('classmethod_descriptor', 12, 0, 12), ('method_descriptor', 678, 0, 678), ('builtin_function_or_method', 1796, 1151, 651), ('wrapper_descriptor', 1031, 5, 1026), ('str', 16156, 9272, 6950), ('dict', 1696, 900, 810), ('tuple', 10367, 6110, 4337)]
Here is how we can make it more readable:
import sys

def print_allocations(top_k=None):
    # Each entry is a tuple: (type name, allocs, deallocs, max alive).
    allocs = sys.getcounts()
    if top_k:
        # Keep only the top_k types with the most allocations.
        allocs = sorted(allocs, key=lambda tup: tup[1], reverse=True)[:top_k]
    for obj in allocs:
        alive = obj[1] - obj[2]
        print("Type {}, allocs: {}, deallocs: {}, max: {}, alive: {}".format(*obj, alive))

>>> print_allocations(10)
Type str, allocs: 17328, deallocs: 10312, max: 7016, alive: 7016
Type tuple, allocs: 10550, deallocs: 6161, max: 4389, alive: 4389
Type frame, allocs: 5445, deallocs: 5442, max: 36, alive: 3
Type int, allocs: 3988, deallocs: 3175, max: 813, alive: 813
Type bytes, allocs: 3031, deallocs: 1044, max: 2154, alive: 1987
Type builtin_function_or_method, allocs: 1809, deallocs: 1164, max: 651, alive: 645
Type dict, allocs: 1726, deallocs: 930, max: 815, alive: 796
Type function, allocs: 1706, deallocs: 811, max: 922, alive: 895
Type code, allocs: 1284, deallocs: 304, max: 1063, alive: 980
Type method, allocs: 1095, deallocs: 1093, max: 14, alive: 2
Where:
- allocs - the number of allocations since interpreter startup
- deallocs - the number of objects freed since interpreter startup, whether by reference counting or by the garbage collector
- alive - the number of alive (active) objects (allocs - deallocs)
- max - the maximum seen number of alive objects since interpreter startup
As you can see, an empty Python REPL has allocated 17 328 strings and 10 550 tuples. That's an insane number of allocations! Keep in mind that, unlike a regular Python script, the REPL imports additional modules that implement its features.
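To make the relationship between the four numbers defined above concrete, here is a toy model of a per-type counter. It is only a sketch: the `TypeCounter` class and its method names are hypothetical and do not reflect how CPython keeps these statistics internally; the class simply mirrors the definitions of allocs, deallocs, alive, and max.

```python
class TypeCounter:
    """Toy model of per-type allocation statistics (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.allocs = 0      # objects created so far
        self.deallocs = 0    # objects freed so far
        self.max_alive = 0   # highest number of live objects seen so far

    @property
    def alive(self):
        # Live objects are those allocated but not yet freed.
        return self.allocs - self.deallocs

    def on_alloc(self):
        self.allocs += 1
        self.max_alive = max(self.max_alive, self.alive)

    def on_dealloc(self):
        self.deallocs += 1


counter = TypeCounter("str")
for _ in range(3):
    counter.on_alloc()   # three objects alive
counter.on_dealloc()
counter.on_dealloc()     # one object alive
counter.on_alloc()       # two objects alive

print(counter.allocs, counter.deallocs, counter.max_alive, counter.alive)
# prints: 4 2 3 2
```

Note that max can exceed alive, as it does for bytes in the REPL output, because it records a past peak rather than the current state.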
Now, let's test a "Hello, World!" Flask application:
import sys
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    # print_allocations is the helper defined earlier
    print_allocations(15)
    return 'Hello, World!'
./python -m flask run
ab -n 100 http://127.0.0.1:5000/
After processing 100 HTTP requests, the statistics look as follows:
Type str, allocs: 192649, deallocs: 138892, max: 54320, alive: 53757
Type frame, allocs: 191752, deallocs: 191714, max: 158, alive: 38
Type tuple, allocs: 183474, deallocs: 150069, max: 33581, alive: 33405
Type int, allocs: 85154, deallocs: 81100, max: 4115, alive: 4054
Type bytes, allocs: 31671, deallocs: 14331, max: 17381, alive: 17340
Type list, allocs: 29846, deallocs: 27541, max: 2415, alive: 2305
Type builtin_function_or_method, allocs: 28525, deallocs: 27572, max: 957, alive: 953
Type dict, allocs: 19900, deallocs: 14800, max: 5280, alive: 5100
Type method, allocs: 15170, deallocs: 15105, max: 74, alive: 65
Type function, allocs: 14761, deallocs: 7086, max: 7711, alive: 7675
Type slice, allocs: 12521, deallocs: 12521, max: 1, alive: 0
Type list_iterator, allocs: 10795, deallocs: 10795, max: 35, alive: 0
Type code, allocs: 9849, deallocs: 1749, max: 8107, alive: 8100
Type tuple_iterator, allocs: 8938, deallocs: 8938, max: 4, alive: 0
Type float, allocs: 6033, deallocs: 5889, max: 152, alive: 144
Across all the requests, the Python interpreter running this simple Flask application allocated 847 261 objects in total. Most of them (714 336) were temporary and were deallocated once they were no longer needed. The rest (132 925) were still alive.
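Totals like these can be obtained by summing the per-type counters. The sketch below assumes rows in the same (name, allocs, deallocs, max) shape that sys.getcounts() returns; the `summarize` helper is a hypothetical name, and the sample contains only the first three rows of the Flask output, so its totals are smaller than the full figures quoted in the text.

```python
def summarize(counts):
    """Return (total allocs, total deallocs, still alive) for getcounts-style rows."""
    total_allocs = sum(row[1] for row in counts)
    total_deallocs = sum(row[2] for row in counts)
    return total_allocs, total_deallocs, total_allocs - total_deallocs


# First three rows of the Flask statistics, as (name, allocs, deallocs, max).
sample = [
    ("str", 192649, 138892, 54320),
    ("frame", 191752, 191714, 158),
    ("tuple", 183474, 150069, 33581),
]

allocs, deallocs, alive = summarize(sample)
print(allocs, deallocs, alive)
# prints: 567875 480675 87200
```

On a COUNT_ALLOCS build, calling summarize(sys.getcounts()) would give the totals over every type.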
Frame and code objects
There are a lot of code and frame objects in the example above. Why do we need them?
In short, a code object stores a block of compiled code, while a frame object represents a single entry on the call stack. In Python, the most common kind of block is a function. Every function definition is compiled into a code object, and every call of that function requires its own frame object, where Python stores local variables. Apart from local variables, each frame object may allocate tens of auxiliary objects.
Where do all these allocations come from?
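The difference between the two kinds of objects can be observed directly. The following is a small sketch, assuming CPython (where inspect.currentframe() returns a real frame object); the `collect_frames` function is a hypothetical helper that recurses a few levels and keeps a reference to the frame of each call.

```python
import inspect


def collect_frames(depth, frames=None):
    """Recurse `depth` levels and record the frame object of every call."""
    frames = [] if frames is None else frames
    frames.append(inspect.currentframe())
    if depth > 1:
        collect_frames(depth - 1, frames)
    return frames


frames = collect_frames(3)

# Every call got its own frame object...
distinct_frames = len({id(frame) for frame in frames})
# ...but all of them execute the single code object of the function.
shared_code = all(frame.f_code is collect_frames.__code__ for frame in frames)
# Frames are chained through f_back, which is what forms the call stack.
linked = frames[2].f_back is frames[1] and frames[1].f_back is frames[0]

print(distinct_frames, shared_code, linked)
# prints: 3 True True
```

Keeping frames in a list like this is for demonstration only: it creates reference cycles and keeps the local variables of each call alive.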
Python has a very dynamic nature which comes at a cost. In order to support many features at runtime, it allocates a lot of auxiliary objects.
For example, I found that a simple function definition allocates at least five dictionaries, five tuples, and four lists that live until the end of the Python process. In turn, all these objects allocate their members, e.g., integers, floats, and strings. An ordinary class definition may allocate hundreds of container objects. Unfortunately, it's hard to give precise numbers since I didn't find an easy way to measure this automatically.
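A rough approximation is possible on a stock interpreter with the standard gc module. This is a different technique from the COUNT_ALLOCS build, so it will not reproduce the numbers above: it sees only objects tracked by the cyclic garbage collector (containers such as dicts, lists, and functions, but not strings or integers) and only those that are still alive, not temporaries. The helper name and the sample function source are assumptions made for illustration.

```python
import gc
from collections import Counter


def live_gc_objects_by_type():
    """Count objects currently tracked by the cyclic GC, grouped by type name."""
    gc.collect()  # drop unreachable garbage first
    return Counter(type(obj).__name__ for obj in gc.get_objects())


SOURCE = "def add(a, b=1):\n    return a + b\n"

before = live_gc_objects_by_type()
namespace = {}
exec(SOURCE, namespace)  # execute the function definition
after = live_gc_objects_by_type()

# Counter subtraction keeps only the types whose count went up.
delta = after - before
for type_name, count in delta.most_common():
    print(type_name, count)
```

The exact output depends on the Python version, and the measurement itself adds a little noise (for example, the `before` counter and the `namespace` dict show up in the difference), so treat the result as an indication rather than a precise count.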
To keep object allocation fast, Python employs a variety of optimizations, such as a specialized allocator for small objects and caches of frequently used objects.
Sometimes it's good to know how many details, unnecessary for a Python user but interesting nonetheless, Python hides from us.
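One such optimization, which is easy to observe, is CPython's cache of small integers: instead of allocating a new object for every small value, the interpreter hands out a shared, preallocated one. The sketch below relies on this CPython implementation detail, which the text above does not name explicitly; other implementations may behave differently, and the exact cached range is deliberately not assumed here.

```python
import sys

# int("...") builds the value at run time, so the compiler cannot merge constants.
small_a, small_b = int("7"), int("7")
large_a, large_b = int("1000000"), int("1000000")

small_shared = small_a is small_b   # on CPython: the same cached object
large_shared = large_a is large_b   # on CPython: two separate allocations

print(sys.implementation.name, small_shared, large_shared)
```

On CPython this reports that the two small integers are the same object while the two large ones are not, which means no new allocation was needed for the small value.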