The state of Python 3 adoption
The first version of Python 3 was released 9 years ago. Unfortunately, Python 2.7 is still leading in some fields. With 2017 coming to an end soon, let's have a look at the current state of Python 3 adoption.
First, if you haven't read JetBrains' recent developer survey of the Python ecosystem, I suggest reading it.
In this article, I want to share another way of estimating usage: looking at PyPI download statistics. Fortunately, the data is publicly available in BigQuery. To start using it, you just need a Google account and basic SQL knowledge.
Here is how to aggregate download counts by Python version for a single package (numpy in this example) over the past 30 days:
SELECT
  -- Keep only the major.minor part of the version, e.g. "2.7" or "3.6"
  SUBSTR(details.python, 0, 3) AS python_version,
  COUNT(*) AS download_count
FROM
  -- Daily download tables covering the last 30 days
  TABLE_DATE_RANGE(
    [the-psf:pypi.downloads],
    DATE_ADD(CURRENT_TIMESTAMP(), -30, "days"),
    CURRENT_TIMESTAMP()
  )
WHERE
  details.installer.name = 'pip'
  AND file.project = 'numpy'
GROUP BY
  python_version
ORDER BY
  download_count DESC
LIMIT 100
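To repeat the query for several packages, the query text can be generated instead of edited by hand. The following is only a minimal sketch and not part of the original analysis: `build_query`, `QUERY_TEMPLATE` and the parameter names are hypothetical, and the code only builds the legacy SQL string. Running it still requires the BigQuery console or a client of your choice.

```python
import re

# Hypothetical template (an assumption, not from the original article).
# It mirrors the legacy SQL query above with two placeholders.
QUERY_TEMPLATE = """\
SELECT
  SUBSTR(details.python, 0, 3) AS python_version,
  COUNT(*) AS download_count
FROM
  TABLE_DATE_RANGE(
    [the-psf:pypi.downloads],
    DATE_ADD(CURRENT_TIMESTAMP(), -{days}, "days"),
    CURRENT_TIMESTAMP()
  )
WHERE
  details.installer.name = 'pip'
  AND file.project = '{project}'
GROUP BY
  python_version
ORDER BY
  download_count DESC
LIMIT 100
"""


def build_query(project: str, days: int = 30) -> str:
    """Return the query text for one PyPI project and look-back window."""
    # Reject anything that does not look like a package name, so the value
    # cannot break out of the quoted SQL string.
    if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9._-]*", project):
        raise ValueError(f"unexpected project name: {project!r}")
    if days <= 0:
        raise ValueError("days must be positive")
    return QUERY_TEMPLATE.format(project=project, days=days)


if __name__ == "__main__":
    for name in ["numpy", "django", "flask"]:
        print(build_query(name))
```

The name check is deliberately strict, because the project name is interpolated directly into the SQL text.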
Visualization of statistics
To get an idea of how Python versions are distributed, let's visualize the relative frequency of downloads per Python version.
Below are the usage statistics for the following packages: bokeh, celery, click, cython, django, flask, gensim, jupyter, keras, lxml, matplotlib, nltk, numpy, pandas, pillow, requests, scipy, scrapy, sklearn, spacy, tensorflow and xgboost.
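The relative frequency for a package is each version's download count divided by that package's total. As a rough sketch of the calculation, assuming the query result is available as `(python_version, download_count)` pairs: the function name `relative_frequency` is hypothetical, and the counts below are placeholders for illustration, not real PyPI numbers.

```python
def relative_frequency(rows):
    """Map each Python version to its share of the total downloads."""
    total = sum(count for _, count in rows)
    if total == 0:
        return {}
    return {version: count / total for version, count in rows}


# Placeholder counts for illustration only; these are NOT real PyPI data.
sample_rows = [("2.7", 600), ("3.6", 250), ("3.5", 100), ("3.4", 50)]

shares = relative_frequency(sample_rows)

# Print a simple text bar chart, largest share first.
for version, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    bar = "#" * round(share * 40)
    print(f"{version:>4} {share:6.1%} {bar}")
```

Working with shares rather than raw counts makes packages with very different download volumes comparable on the same scale.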
We can see the distributions shifting towards Python 3 when looking at Django, Jupyter, spaCy, Cython and Celery. In the case of Django and Celery, there is a good reason for that: both projects have dropped, or announced that they are dropping, support for Python 2.
However, don't be misled by the numbers: PyPI download counts are not an accurate measure of usage. Many of the downloads are generated by automated tools, such as continuous integration systems, mirroring clients and tox test runs. In reality, the actual usage of Python 2.7 is probably smaller. It's also worth mentioning that many Linux distributions still use Python 2.
P.S. There is no reason to use Python 2 in the upcoming year, and I'm not advocating it.