On code isolation in Python

Last updated on October 27, 2020, in python

I started learning Python in 2009 with a pretty challenging task and a somewhat unusual use of the language. I was working on a desktop application that used PyQt for the GUI and Python as the main language.

To hide the code, I embedded the Python interpreter into a standalone Windows executable. There are many tools for this (e.g. PyInstaller, py2exe), and they all work similarly: they compile your Python scripts to bytecode files and bundle them with an interpreter into an executable. Compiling scripts down to bytecode makes it harder for people with bad intentions to get the source code and crack or hack your software, because the bytecode has to be extracted from the executable and decompiled. Decompilation can also produce obfuscated code that is much harder to understand.
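
To make the bytecode step concrete, here is a minimal sketch that uses only the standard library. It compiles a small function to a code object, serializes it with marshal (the same serialization that pyc files use after their header), and then recovers a working function without the source. The add function and the "<hidden>" filename are made up for illustration.

```python
import dis
import marshal

# Hypothetical source that we want to "hide".
source = "def add(a, b):\n    return a + b\n"

# Compile the source text into a module-level code object.
module_code = compile(source, "<hidden>", "exec")

# Serialize the code object to bytes, as a bundler would before packing it.
blob = marshal.dumps(module_code)

# Anyone holding the blob can load it back and run it without the source.
recovered = marshal.loads(blob)
namespace = {}
exec(recovered, namespace)
print(namespace["add"](2, 3))

# Argument names and instructions survive compilation, which helps decompilers.
print(namespace["add"].__code__.co_varnames)
dis.dis(namespace["add"])
```

The marshal format is specific to the Python version that produced it, so a blob has to be loaded by a matching interpreter.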

At one point, I wanted to add a plugin system so that users could benefit from extra features. Executing arbitrary third-party code on your server is dangerous. But can it harm your commercial product when the code runs on users' machines and the users trust what they are executing? At that time, the answer was not obvious, and I decided to implement the system.

A few years later, it became apparent that you should never execute third-party code using the same Python process (interpreter) if you don't want to leak the source code. There are a lot of commercial products that use Python for desktop software or as a scripting language. Some of them can be at risk.
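
One way to keep third-party code out of the host interpreter is to run it in a separate process. The sketch below is only an illustration under my own assumptions: plugin_source is a made-up plugin, and 'licensing' stands for a hypothetical module that the host has loaded. A fresh interpreter does not share the host's sys.modules or call stack.

```python
import subprocess
import sys

# Hypothetical plugin code; in practice it would be read from a plugin file.
plugin_source = (
    "import sys\n"
    "print('host modules visible:', 'licensing' in sys.modules)\n"
)

# Start a fresh interpreter for the plugin. It has its own sys.modules and
# its own frames, so it cannot walk the host's call stack or module objects.
result = subprocess.run(
    [sys.executable, "-c", plugin_source],
    capture_output=True,
    text=True,
    timeout=10,
)
print(result.stdout.strip())
```

This only separates the interpreters. The plugin can still read files that the host ships on disk, so it is not a complete protection.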

There are many ways to extract Python bytecode even if you don't run any third-party code. It's a never-ending arms race between developers and reverse engineers, but it's much easier to extract the bytecode and crack your program when you can run your own code. My software was later cracked without using the plugins subsystem.

So what can you do to the "host" code when it executes your scripts?

Python is a very dynamic language, and you can do a lot of things. This article demonstrates a few ways to modify or extract the host's code.

When you work with a regular Python process, you don't even need a plugins system. You can always attach to a running process using GDB and inject your own code.

Monkey patching

If a plugin is initialized before the function that we want to modify is called, we can simply mock it.

Let's suppose we have a function that validates a license:

import time
import requests

def validate_license():
    # get_hardware_id() is assumed to be defined in the same module
    hw_hash, hw_id = get_hardware_id()
    data = {"timestamp": time.time(), "hid": hw_id}
    r = requests.post('https://rushter.com/validate', data)
    server_hash = r.text
    return hw_hash == server_hash

We can bypass the checks by replacing a few functions:

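import types


def mock_licensing():
    requests = __import__('requests')
    licensing = __import__('licensing')

    def post(*args, **kwargs):
        # Fake response object that exposes only the .text attribute
        mocked_object = types.SimpleNamespace()
        mocked_object.text = "a8f5f167"
        return mocked_object

    # Both values now agree, so the comparison in validate_license passes
    licensing.get_hardware_id = lambda: ("a8f5f167", 123)
    requests.post = post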
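
If you want to try the idea without a real licensing module or network access, here is a self-contained sketch. The licensing module below is a stand-in that I build in memory, and its functions are simplified assumptions rather than the real code.

```python
import types

# Hypothetical stand-in for the host's "licensing" module, built in memory.
licensing = types.ModuleType("licensing")
exec(
    "def get_hardware_id():\n"
    "    return ('real-hash', 42)\n"
    "\n"
    "def validate_license():\n"
    "    hw_hash, hw_id = get_hardware_id()\n"
    "    return hw_hash == 'a8f5f167'\n",
    licensing.__dict__,
)

before = licensing.validate_license()

# validate_license looks up get_hardware_id in the module globals at call
# time, so replacing the module attribute changes what it calls.
licensing.get_hardware_id = lambda: ("a8f5f167", 123)

after = licensing.validate_license()
print(before, after)
```

The patch works because the function body refers to get_hardware_id by name. Nothing in validate_license itself had to change.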

Frame objects

In Python, frame objects keep the state of functions that are currently running. Each frame corresponds to a single function call. Module and class bodies execute in frames too. Frames are the building blocks of the call stack.

Given a frame object, you can:

  • Change locals, globals, and builtins at runtime.
  • Get bytecode of a function (code block) that is being executed.

Here is how you can list all frames in the current call stack:

import sys


def list_frames():
    current_frame = sys._getframe(0)
    # Walk from the current frame up to the outermost (module-level) frame
    while current_frame:
        print(f"""
locals: {current_frame.f_locals}
globals: {current_frame.f_globals}
bytecode: {current_frame.f_code.co_code}
function name: {current_frame.f_code.co_name}
line number: {current_frame.f_lineno}
        """)
        current_frame = current_frame.f_back

The documentation for the inspect module describes all available attributes of frame and code objects.
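
As a smaller example, here is a sketch of a callee reading the state of its caller through the frame object. The function names and the secret variable are made up for illustration.

```python
import sys


def who_called_me():
    # Frame 0 is this function; frame 1 is whoever called it.
    caller = sys._getframe(1)
    # Copy the caller's local variables into a plain dictionary.
    return caller.f_code.co_name, dict(caller.f_locals)


def host_function():
    secret = "s3cr3t"
    return who_called_me()


print(host_function())
```

The callee receives no arguments, yet it can see both the name of the calling function and the values of its local variables.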

Changing locals

Let's suppose we have a function that calls a callback, and we have control over the callback. For example, the path to the callback can be defined in a settings file.

import logging


def get_amount():
    return 10


def update_database(user, amount):
    pass


def charge_user_for_subscription(user, logger=logging.info):
    amount = get_amount()
    print(amount)
    logger(amount)
    update_database(user, amount)
    print(amount)

The last function charges a user for a monthly subscription and lets you specify a custom logging callback. I've added a few print calls so that you can copy-paste the code and see the results.

Since the logging callback runs in the middle of the function, we can modify the amount variable from inside it.

import ctypes
import sys


def fix_amount(_):
    # Get the parent frame (the function that called this callback)
    frame = sys._getframe(1)
    # Update the locals dictionary
    frame.f_locals['amount'] = -100
    # Copy the dictionary back into the frame's fast locals
    ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(frame), 0)

In [8]: charge_user_for_subscription('Ivan', fix_amount)
10
-100

Fast locals

In Python, local and global variables are stored in dictionaries. Every time you use a variable, Python needs to look it up in a dictionary. Since dictionary lookups are not free, Python uses various optimization techniques.

By analyzing the code of a function, it's possible to detect the variable names that the function will use when running. The charge_user_for_subscription function has three local variables: amount, user, and logger. Function arguments are local variables too.

When compiling source code to bytecode, Python maps known variable names to indexes in a special array and stores the values there. Accessing a variable by index is fast, and most functions only use names that are known in advance. These optimized variables are called fast locals. To keep variable names that are generated on the go, Python uses a dictionary as a fallback.

When reading variables, Python prioritizes fast locals and ignores changes in the dictionary. That's why we use ctypes and call the internal PyFrame_LocalsToFast function.
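
You can see the mapping from names to array slots on any function's code object. This is a small sketch with a made-up charge function; the exact opcode names differ between Python versions, so I only filter for the ones that contain FAST.

```python
import dis


def charge(user, logger=print):
    amount = 10
    return amount


code = charge.__code__

# Arguments come first, then the remaining locals, in slot order.
print(code.co_varnames)  # ('user', 'logger', 'amount')
print(code.co_nlocals)   # 3

# Instructions with FAST in the name address locals by slot index.
fast_ops = [
    ins.opname for ins in dis.get_instructions(charge) if "FAST" in ins.opname
]
print(fast_ops)
```

The compiler emits index-based instructions for amount because it saw the assignment at compile time. No dictionary lookup is needed at runtime.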

Patching bytecode

I have an article on bytecode patching that describes how to patch a function definition. We can go even further and patch a running function.

Instead of source code, the Python interpreter executes bytecode produced by a compiler. A virtual machine executes the instructions one by one, which allows us to replace instructions that have not been executed yet.

Let's use this function as an example:

def is_valid():
    return False


def check_license(callback):
    callback()
    if not is_valid():
        print('exiting')
        exit(0)

The built-in dis module allows us to see the bytecode in a human-readable format:

In [12]: check_license.__code__.co_code
Out[12]: b'|\x00\x83\x00\x01\x00t\x00\x83\x00s\x1ct\x01d\x01\x83\x01\x01\x00t\x02d\x02\x83\x01\x01\x00d\x00S\x00'

In [13]: dis.dis(check_license)
  6           0 LOAD_FAST                0 (callback)
              2 CALL_FUNCTION            0
              4 POP_TOP

  7           6 LOAD_GLOBAL              0 (is_valid)
              8 CALL_FUNCTION            0
             10 POP_JUMP_IF_TRUE        28

  8          12 LOAD_GLOBAL              1 (print)
             14 LOAD_CONST               1 ('exiting')
             16 CALL_FUNCTION            1
             18 POP_TOP

  9          20 LOAD_GLOBAL              2 (exit)
             22 LOAD_CONST               2 (0)
             24 CALL_FUNCTION            1
             26 POP_TOP
        >>   28 LOAD_CONST               0 (None)
             30 RETURN_VALUE

Our license is not valid, and we want to remove the not operator from the condition. To do so, we need to replace the POP_JUMP_IF_TRUE instruction with POP_JUMP_IF_FALSE.

Since we can control the callback function, we can apply a hot patch in the middle of a function.

import ctypes
import dis
import sys


def fix():
    # get parent frame
    frame = sys._getframe(1)
    # address of the raw bytecode: object address plus the bytes header size
    memory_offset = id(frame.f_code.co_code) + sys.getsizeof(b'') - 1
    # overwrite the opcode at byte offset 10 (POP_JUMP_IF_TRUE)
    ctypes.memset(memory_offset + 10, dis.opmap['POP_JUMP_IF_FALSE'], 1)

if __name__ == '__main__':
    check_license(fix)

As you can see above, internally, the bytecode is stored as bytes. Unfortunately, we can't modify the frame.f_code.co_code attribute since it's a read-only attribute.

To bypass this restriction, we use the ctypes module, which allows us to modify the memory of the Python process. Every bytes object starts with metadata, such as the reference count and type information. To locate the exact address of the raw C string, we take the object's address in memory from the id function and skip the metadata (the size of an empty bytes object, minus one for its trailing null byte). The output from dis.dis shows that the POP_JUMP_IF_TRUE instruction we need to replace is at offset 10 in the byte string.
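
Here is a read-only sketch that checks the address arithmetic without modifying anything. It assumes CPython, where id returns the address of the object; the payload value is arbitrary.

```python
import ctypes
import sys

payload = b"\x01\x02\x03\x04"

# CPython-specific assumption: sys.getsizeof(b"") is the size of the bytes
# header plus one trailing null byte, so the data starts at that size minus 1.
header_size = sys.getsizeof(b"") - 1
address = id(payload) + header_size

# Read the raw buffer back; it should match the contents of the object.
raw = ctypes.string_at(address, len(payload))
print(raw == payload)
```

If the computed address points at the data, reading from it gives back the original bytes. Writing to the same address with ctypes.memset is what the fix function does.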

Extracting the source code

Every script that Python runs or imports creates a module object that stores constants, functions, class definitions, and so on. If you don't have source code, you can get it back by decompiling the bytecode.

Here is how you can iterate over all modules and find all available functions:

import inspect
import sys

for name, module in list(sys.modules.items()):
    if name not in ['license', 'runner']:
        continue
    for obj_name, obj in inspect.getmembers(module):
        if inspect.isfunction(obj):
            print(obj_name, obj.__code__)

Fortunately (or unfortunately for some people), there is no easy way to extract the bytecode of a whole module if you don't have a pyc (bytecode cache) file. If you want to get the source code from a running Python process, you will need to extract the bytecode of each function, class, and module definition, as well as from some frame objects. After that, you will need to run it through one of the decompilers.
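
Nested functions and lambdas are stored as constants of the enclosing code object, so collecting them is a recursive walk. This is a minimal sketch; iter_code_objects is my own helper name, and outer is a made-up example.

```python
import types


def iter_code_objects(code):
    """Yield a code object and every code object nested in its constants."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)


def outer():
    def inner():
        return lambda: 1
    return inner


names = [c.co_name for c in iter_code_objects(outer.__code__)]
print(names)
```

Each collected code object can then be serialized with marshal.dumps and passed to a decompiler.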

Conclusion

This article is part of the CPython internals series. It's hard to find a practical application for these techniques, but such language details are useful for people who perform security research or participate in CTF competitions. My previous articles on bytecode have helped people get extra points in security competitions.
