How virtual environment libraries work in Python

Last updated on July 01, 2018, in Python

Have you ever wondered what happens when you activate a virtual environment and how it works internally? Here is a quick overview of internals behind popular virtual environments, e.g., virtualenv, virtualenvwrapper, conda, pipenv.

Initially, Python didn't have built-in support for virtual environments, so the feature was implemented as a hack. As it turns out, this hack is based on a simple concept.

When the Python interpreter starts, it searches for the site-specific directory where all packages are stored. The search starts at the parent directory of the Python executable's location and continues by walking up the path (i.e., looking at successive parent directories) until it reaches the root directory. To decide whether a directory is the site-specific one, Python looks for the os.py module, which is part of the standard library and must be present for Python to work.

Let's suppose our Python binary is located at /usr/dev/bin/python. The search pattern will look as follows:

/usr/dev/lib/python3.7/os.py
/usr/lib/python3.7/os.py
/lib/python3.7/os.py

As you can see, Python appends a special landmark path (lib/python$VERSION/os.py) to each candidate directory. When the interpreter finds the first occurrence of the os module, it sets sys.prefix and sys.exec_prefix to the found location with the landmark removed from the path. If none is found, Python falls back to a hardcoded prefix.
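The search described above can be sketched in a few lines of Python. This is a simplified model of the article's description, not CPython's actual startup code: the names candidate_landmarks and find_prefix, the exists callback, and the "/usr/local" fallback are all assumptions made for illustration (the real fallback prefix is fixed when Python is built).

```python
from pathlib import PurePosixPath


def candidate_landmarks(executable, version="3.7"):
    """Yield the os.py locations probed for a given executable path."""
    # Start at the parent of the directory that holds the executable.
    directory = PurePosixPath(executable).parent.parent
    while True:
        yield directory / "lib" / f"python{version}" / "os.py"
        if directory == directory.parent:  # reached the root directory
            break
        directory = directory.parent


def find_prefix(executable, exists, version="3.7", fallback="/usr/local"):
    """Return the directory that would become the prefix.

    `exists` is a callable that says whether a candidate path is present.
    `fallback` is a hypothetical stand-in for the hardcoded prefix.
    """
    for landmark in candidate_landmarks(executable, version):
        if exists(landmark):
            # Strip lib/pythonX.Y/os.py (three path components).
            return str(landmark.parents[2])
    return fallback


if __name__ == "__main__":
    for path in candidate_landmarks("/usr/dev/bin/python"):
        print(path)

    # Pretend that only the system-wide standard library exists.
    fake_files = {"/usr/lib/python3.7/os.py"}
    print(find_prefix("/usr/dev/bin/python", lambda p: str(p) in fake_files))
```

The loop prints the same three paths as the list above, and with the pretend file system the prefix resolves to /usr.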

Now, let's see how virtualenv, an old and well-known library, creates its virtual environments:

user@arb:/usr/home/test# virtualenv ENV
Running virtualenv with interpreter /usr/bin/python3
New python executable in /usr/home/test/ENV/bin/python3
Also creating executable in /usr/home/test/ENV/bin/python
Installing setuptools, pkg_resources, pip, wheel...done.

After execution, it creates an additional directory:

user@arb:/usr/home/test/ENV# tree -L 3
.
├── bin
│   ├── activate
│   ├── activate.csh
│   ├── activate.fish
│   ├── activate_this.py
│   ├── easy_install
│   ├── easy_install-3.7
│   ├── pip
│   ├── pip3
│   ├── pip3.7
│   ├── python
│   ├── python-config
│   ├── python3 -> python
│   ├── python3.7 -> python
│   └── wheel
├── include
│   └── python3.7m -> /usr/include/python3.7m
├── lib
│   └── python3.7
│       ├── __future__.py -> /usr/lib/python3.7/__future__.py
│       ├── __pycache__
│       ├── _bootlocale.py -> /usr/lib/python3.7/_bootlocale.py
│       ├── _collections_abc.py -> /usr/lib/python3.7/_collections_abc.py
│       ├── _dummy_thread.py -> /usr/lib/python3.7/_dummy_thread.py
│       ├── _weakrefset.py -> /usr/lib/python3.7/_weakrefset.py
│       ├── abc.py -> /usr/lib/python3.7/abc.py
│       ├── base64.py -> /usr/lib/python3.7/base64.py
│       ├── bisect.py -> /usr/lib/python3.7/bisect.py
│       ├── codecs.py -> /usr/lib/python3.7/codecs.py
│       ├── collections -> /usr/lib/python3.7/collections
│       ├── config-3.7m-darwin -> /usr/lib/python3.7/config-3.7m-darwin
│       ├── copy.py -> /usr/lib/python3.7/copy.py
│       ├── copyreg.py -> /usr/lib/python3.7/copyreg.py
│       ├── distutils
│       ├── encodings -> /usr/lib/python3.7/encodings
│       ├── enum.py -> /usr/lib/python3.7/enum.py
│       ├── fnmatch.py -> /usr/lib/python3.7/fnmatch.py
│       ├── functools.py -> /usr/lib/python3.7/functools.py
│       ├── genericpath.py -> /usr/lib/python3.7/genericpath.py
│       ├── hashlib.py -> /usr/lib/python3.7/hashlib.py
│       ├── heapq.py -> /usr/lib/python3.7/heapq.py
│       ├── hmac.py -> /usr/lib/python3.7/hmac.py
│       ├── imp.py -> /usr/lib/python3.7/imp.py
│       ├── importlib -> /usr/lib/python3.7/importlib
│       ├── io.py -> /usr/lib/python3.7/io.py
│       ├── keyword.py -> /usr/lib/python3.7/keyword.py
│       ├── lib-dynload -> /usr/lib/python3.7/lib-dynload
│       ├── linecache.py -> /usr/lib/python3.7/linecache.py
│       ├── locale.py -> /usr/lib/python3.7/locale.py
│       ├── no-global-site-packages.txt
│       ├── ntpath.py -> /usr/lib/python3.7/ntpath.py
│       ├── operator.py -> /usr/lib/python3.7/operator.py
│       ├── orig-prefix.txt
│       ├── os.py -> /usr/lib/python3.7/os.py
│       ├── posixpath.py -> /usr/lib/python3.7/posixpath.py
│       ├── random.py -> /usr/lib/python3.7/random.py
│       ├── re.py -> /usr/lib/python3.7/re.py
│       ├── readline.so -> /usr/lib/python3.7/lib-dynload/readline.cpython-37m-darwin.so
│       ├── reprlib.py -> /usr/lib/python3.7/reprlib.py
│       ├── rlcompleter.py -> /usr/lib/python3.7/rlcompleter.py
│       ├── shutil.py -> /usr/lib/python3.7/shutil.py
│       ├── site-packages
│       ├── site.py
│       ├── sre_compile.py -> /usr/lib/python3.7/sre_compile.py
│       ├── sre_constants.py -> /usr/lib/python3.7/sre_constants.py
│       ├── sre_parse.py -> /usr/lib/python3.7/sre_parse.py
│       ├── stat.py -> /usr/lib/python3.7/stat.py
│       ├── struct.py -> /usr/lib/python3.7/struct.py
│       ├── tarfile.py -> /usr/lib/python3.7/tarfile.py
│       ├── tempfile.py -> /usr/lib/python3.7/tempfile.py
│       ├── token.py -> /usr/lib/python3.7/token.py
│       ├── tokenize.py -> /usr/lib/python3.7/tokenize.py
│       ├── types.py -> /usr/lib/python3.7/types.py
│       ├── warnings.py -> /usr/lib/python3.7/warnings.py
│       └── weakref.py -> /usr/lib/python3.7/weakref.py
└── pip-selfcheck.json

As you can see, the environment was created by copying the Python binary to a local directory (ENV/bin/python). The parent directory also contains a lib folder, which stores a collection of symlinks to standard library files. We can't simply create a symlink to the executable, because the interpreter dereferences it and would start its search from the original location.
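To get a feel for what dereferencing means here, the following sketch creates a stand-in "python" file and a symlink to it inside a temporary directory, then resolves the link with os.path.realpath. The directory layout and the resolve_symlinked_executable helper are made up for illustration, and the example assumes a POSIX system where creating symlinks needs no special privileges.

```python
import os
import tempfile


def resolve_symlinked_executable():
    """Create a placeholder 'python' file plus a symlink to it, then resolve both."""
    with tempfile.TemporaryDirectory() as tmp:
        real_bin = os.path.join(tmp, "usr", "bin")
        env_bin = os.path.join(tmp, "ENV", "bin")
        os.makedirs(real_bin)
        os.makedirs(env_bin)

        target = os.path.join(real_bin, "python")
        with open(target, "w") as handle:
            handle.write("")  # empty placeholder, not a real interpreter

        link = os.path.join(env_bin, "python")
        os.symlink(target, link)

        # realpath follows the link, so ENV/bin/python resolves to usr/bin/python.
        return os.path.realpath(link), os.path.realpath(target)


if __name__ == "__main__":
    resolved_link, resolved_target = resolve_symlinked_executable()
    print(resolved_link)
    print(resolved_link == resolved_target)
```

Once the link is resolved, nothing in the resulting path points back to the ENV directory, which is why a plain symlink is not enough for the old approach.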

Now, let's activate our environment:

user@arb:/usr/home/test# source ENV/bin/activate

This command changes $PATH (a shell environment variable) in such a way that the "python" command points to our local version.

Basically, it prepends the path of our local bin directory to $PATH, so it takes priority over all other locations:

export PATH="/usr/home/test/ENV/bin:$PATH"
echo $PATH
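The effect of prepending a directory to the search path can be reproduced with shutil.which, which performs the same kind of lookup the shell does. This is only a sketch: it builds two throwaway bin directories with stub "python" scripts in a temporary folder, and the helper names make_stub and lookup_before_and_after_activation are hypothetical. It assumes a POSIX system.

```python
import os
import shutil
import tempfile


def make_stub(directory):
    """Create an executable placeholder named 'python' in `directory`."""
    os.makedirs(directory)
    path = os.path.join(directory, "python")
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(path, 0o755)  # mark it executable so the lookup accepts it
    return path


def lookup_before_and_after_activation():
    """Return which stub wins the lookup before and after prepending ENV/bin."""
    with tempfile.TemporaryDirectory() as tmp:
        system_python = make_stub(os.path.join(tmp, "system", "bin"))
        env_python = make_stub(os.path.join(tmp, "ENV", "bin"))

        original_path = os.path.dirname(system_python)
        # Same idea as the export line: the local bin directory goes first.
        activated_path = os.pathsep.join([os.path.dirname(env_python), original_path])

        before = shutil.which("python", path=original_path)
        after = shutil.which("python", path=activated_path)
        return before == system_python, after == env_python


if __name__ == "__main__":
    print(lookup_before_and_after_activation())
```

Both directories contain a "python" file, but the first match on the search path wins, so the ENV copy is picked once its directory is placed in front.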

If you run a Python script in such an environment, the Python process will be executed using the /usr/home/test/ENV/bin/python executable. Thus, the interpreter will use this location as the starting point for its search. In our case, the site-specific directory will be found at /usr/home/test/ENV/lib/python3.7/.
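You can check which locations an interpreter actually settled on by printing a few values from the sys, os and sysconfig modules. The describe_interpreter helper below is a hypothetical name used for illustration; the values it prints depend entirely on the machine and environment you run it in.

```python
import os
import sys
import sysconfig


def describe_interpreter():
    """Collect the paths that show where this interpreter found its files."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "exec_prefix": sys.exec_prefix,
        # Location of the os module that was actually imported.
        "os_module": getattr(os, "__file__", None),
        # Directory where pure-Python third-party packages are installed.
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for name, value in describe_interpreter().items():
        print(f"{name:>13}: {value}")
```

Running the script once with the system interpreter and once with ENV/bin/python should show the prefix moving from the system location to the environment directory.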

That is the main idea of the hack, which most virtual environment libraries use under the hood.

Improvements in Python 3

Python 3.3 implemented PEP 405, which introduces a mechanism for lightweight virtual environments.

This PEP adds a new step to the search process. Instead of copying the Python binary and its modules, you create a pyvenv.cfg file and specify their location in it.

This is how the standard venv module works:

user@arb:/usr/home/test2# python3 -m venv ENV
user@arb:/usr/home/test2# tree -L 3
.
└── ENV
  ├── bin
  │   ├── activate
  │   ├── activate.csh
  │   ├── activate.fish
  │   ├── easy_install
  │   ├── easy_install-3.7
  │   ├── pip
  │   ├── pip3
  │   ├── pip3.5
  │   ├── python -> python3
  │   └── python3 -> /usr/bin/python3
  ├── include
  ├── lib
  │   └── python3.7
  ├── lib64 -> lib
  ├── pyvenv.cfg
  └── share
      └── python-wheels
user@arb:/usr/home/test2# cat ENV/pyvenv.cfg
home = /usr/bin
include-system-site-packages = false
version = 3.7.0
user@arb:/usr/home/test2# readlink ENV/bin/python3
/usr/bin/python3

Thanks to the config file, venv uses a symbolic link to the executable instead of a copy. If include-system-site-packages is set to true, all system-installed packages become importable from the environment, because the system site-packages directory is added to sys.path.
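The pyvenv.cfg format is simple enough to read by hand. The sketch below parses the "key = value" lines shown in the listing above and adds a common check for whether the current interpreter runs inside a venv-style environment. The function names parse_pyvenv_cfg and running_in_venv are made up for this example, and the parser is a simplification rather than the code Python itself uses.

```python
import sys


def parse_pyvenv_cfg(text):
    """Parse 'key = value' lines into a dict with lowercased keys."""
    settings = {}
    for line in text.splitlines():
        key, separator, value = line.partition("=")
        if separator:  # skip lines that contain no '='
            settings[key.strip().lower()] = value.strip()
    return settings


def running_in_venv():
    """Return True when the interpreter was started from a venv-style environment."""
    # sys.base_prefix (Python 3.3+) keeps the prefix of the base installation,
    # while sys.prefix points at the environment when one is in use.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    # Contents copied from the pyvenv.cfg listing above.
    sample = (
        "home = /usr/bin\n"
        "include-system-site-packages = false\n"
        "version = 3.7.0\n"
    )
    config = parse_pyvenv_cfg(sample)
    print(config["home"])
    print(config["include-system-site-packages"] == "true")
    print(running_in_venv())
```

For the sample file the first two lines of output are /usr/bin and False; the last line depends on how the script is launched.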

Despite these improvements, most third-party virtual environment libraries still use the old approach.
