
View PyPI download statistics with ease.

Project Description

pypinfo is a simple CLI to access PyPI download statistics via Google’s BigQuery.

Installation

pypinfo is distributed on PyPI as a universal wheel. It is available on Linux, macOS, and Windows, and supports Python 3.5+ and PyPy.

This is relatively painless, I swear.

  1. Go to https://bigquery.cloud.google.com.
  2. Sign up if you haven’t already. The first TB of queried data each month is free. Each additional TB is $5.
  3. Go to https://console.developers.google.com/cloud-resource-manager and create a new project if you don’t already have one. Any name is fine, but I recommend you choose something to do with PyPI like pypinfo. This way you know what the project is designated for.
  4. Go to https://console.cloud.google.com/apis/api/bigquery-json.googleapis.com/overview and make sure the correct project is chosen using the drop-down on top. Click the button on top to enable.
  5. Follow https://cloud.google.com/storage/docs/authentication#generating-a-private-key to create credentials in JSON format. During creation, choose BigQuery User as the role. (If BigQuery is not an option in the list, wait 15-20 minutes and try creating the credentials again.) After creation, note where the file was downloaded. You can then move it wherever you want.
  6. pip install pypinfo
  7. Run pypinfo --auth path/to/your_credentials.json, or set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the file.
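
Before running a query, it can help to confirm that the credentials file from step 5 is where you think it is. The following is a minimal sketch, not part of pypinfo: resolve_credentials is a hypothetical helper, and the precedence it uses (explicit path first, then the environment variable) is an assumption rather than documented behavior. It relies only on the fact that service account key files from Google Cloud are JSON with "type" set to "service_account".

```python
import json
import os
from pathlib import Path

ENV_VAR = "GOOGLE_APPLICATION_CREDENTIALS"


def resolve_credentials(auth_path=None, environ=None):
    """Return the credentials file path, or raise if none is usable.

    Hypothetical helper: checking the explicit path before the environment
    variable is an assumption, not pypinfo's documented behavior.
    """
    environ = os.environ if environ is None else environ
    candidate = auth_path or environ.get(ENV_VAR)
    if not candidate:
        raise FileNotFoundError(f"pass a path or set {ENV_VAR}")

    path = Path(candidate).expanduser()
    # Opening raises if the file is missing; json.load raises if it is not JSON.
    with path.open(encoding="utf-8") as handle:
        key = json.load(handle)

    # Service account key files downloaded from Google Cloud carry this marker.
    if not isinstance(key, dict) or key.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")
    return path
```

Calling resolve_credentials("path/to/your_credentials.json") returns the path if the file looks usable. Calling it with no argument falls back to GOOGLE_APPLICATION_CREDENTIALS.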

Usage

$ pypinfo
Usage: pypinfo [OPTIONS] [PROJECT] [FIELDS]... COMMAND [ARGS]...

  Valid fields are:

  project | version | pyversion | percent3 | percent2 | impl | impl-version |

  openssl | date | month | year | country | installer | installer-version |

  setuptools-version | system | system-release | distro | distro-version | cpu

Options:
  -a, --auth TEXT         Path to Google credentials JSON file.
  --run / --test          --test simply prints the query.
  -j, --json              Print data as JSON.
  -t, --timeout INTEGER   Milliseconds. Default: 120000 (2 minutes)
  -l, --limit TEXT        Maximum number of query results. Default: 20
  -d, --days TEXT         Number of days in the past to include. Default: 30
  -sd, --start-date TEXT  Must be negative. Default: -31
  -ed, --end-date TEXT    Must be negative. Default: -1
  -w, --where TEXT        WHERE conditional. Default: file.project = "project"
  -o, --order TEXT        Field to order by. Default: download_count
  -p, --pip               Only show installs by pip.
  --version               Show the version and exit.
  --help                  Show this message and exit.

pypinfo accepts 0 or more options, followed by exactly 1 project, followed by 0 or more fields. By default only the last 30 days are queried. Let’s take a look at some examples!

Tip: If queries fail with NoneType errors, increase the timeout with --timeout.
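
If you call pypinfo from a script, the argument order described above (options, then the project, then fields) can be assembled programmatically. This is a rough sketch under that reading of the help text. build_command is a hypothetical helper, and it uses only flags listed in the help output above.

```python
import shlex


def build_command(project, fields=(), days=None, limit=None,
                  timeout=None, test=False, pip_only=False):
    """Build an argv list: options first, then one project, then fields."""
    argv = ["pypinfo"]
    if test:
        argv.append("--test")  # only print the query
    if days is not None:
        argv += ["--days", str(days)]
    if limit is not None:
        argv += ["--limit", str(limit)]
    if timeout is not None:
        argv += ["--timeout", str(timeout)]  # milliseconds
    if pip_only:
        argv.append("--pip")
    argv.append(project)  # "" stands for all projects
    argv.extend(fields)
    return argv


def as_shell(argv):
    """Quote an argv list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

The resulting list can be passed to subprocess.run, which avoids shell quoting problems with the empty project name.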

Downloads for a project

$ pypinfo requests
download_count
--------------
    13,149,515

All downloads

$ pypinfo ""
download_count
--------------
   765,826,772

Downloads for a project by Python version

$ pypinfo django pyversion
python_version download_count
-------------- --------------
2.7                   611,777
3.6                   259,357
3.5                   200,749
3.4                   104,585
None                   97,813
2.6                     6,318
3.7                     2,342
3.3                     2,106
3.2                       365
2.4                        11
1.17                       10
2.5                         8
3.1                         1
2.1                         1
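
As a worked example, the table above can be reduced to a single Python 3 share. This sketch assumes the plain-text column layout shown here (the changelog notes that 7.0.0 switched to Markdown tables, which would need different parsing). parse_counts and python3_share are illustrative names. Rows with a version of None are kept in the denominator.

```python
TABLE = """\
python_version download_count
-------------- --------------
2.7                   611,777
3.6                   259,357
3.5                   200,749
3.4                   104,585
None                   97,813
2.6                     6,318
3.7                     2,342
3.3                     2,106
3.2                       365
2.4                        11
1.17                       10
2.5                         8
3.1                         1
2.1                         1
"""


def parse_counts(table):
    """Map each version label to its download count, skipping the two header lines."""
    counts = {}
    for line in table.strip().splitlines()[2:]:
        version, count = line.split()
        counts[version] = int(count.replace(",", ""))
    return counts


def python3_share(counts):
    """Percentage of all downloads whose major version is 3."""
    total = sum(counts.values())
    py3 = sum(n for version, n in counts.items() if version.split(".")[0] == "3")
    return 100 * py3 / total


counts = parse_counts(TABLE)
print(f"{python3_share(counts):.1f}% of {sum(counts.values()):,} downloads")
```

For the figures above, this comes to 569,505 Python 3 downloads out of 1,285,443, or about 44.3%.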

All downloads by country code

$ pypinfo "" country
country download_count
------- --------------
US         501,337,782
IE          29,547,697
CN          22,198,589
DE          21,641,064
GB          18,946,922
None        18,077,976
FR          15,593,846
BR          13,500,471
CA          13,098,341
AU          12,482,455
JP          12,390,691
RU          11,381,041
SG          11,326,902
IN          10,186,952
KR           8,141,791
NL           6,695,112
IL           3,381,433
ES           2,622,822
PL           2,408,438
NO           2,292,994

Downloads for a project by system and distribution

$ pypinfo cryptography system distro
system_name distro_name                     download_count
----------- ------------------------------- --------------
Linux       Ubuntu                               1,949,204
Linux       Debian GNU/Linux                       407,626
Linux       None                                   375,363
Linux       CentOS Linux                           251,467
None        None                                   204,007
Windows     None                                   174,763
Linux       debian                                 116,972
Linux       Amazon Linux AMI                       106,790
Linux       CentOS                                  99,851
Darwin      macOS                                   81,554
Linux       Raspbian GNU/Linux                      68,696
Linux       Red Hat Enterprise Linux Server         54,737
Linux       Alpine Linux                            46,135
Linux       Fedora                                  27,746
Darwin      OS X                                    16,918
Linux       Linux                                    9,711
Linux       openSUSE Leap                            8,636
Linux       Virtuozzo                                7,978
Linux       RedHatEnterpriseServer                   5,789
FreeBSD     None                                     4,899

Percentage of Python 3 downloads of the top 100 projects in the past year

Let’s use --test to only see the query instead of sending it.

$ pypinfo --test --days 365 --limit 100 "" project percent3
SELECT
  file.project as project,
  ROUND(100 * SUM(CASE WHEN REGEXP_EXTRACT(details.python, r"^([^\.]+)") = "3" THEN 1 ELSE 0 END) / COUNT(*), 1) as percent_3,
  COUNT(*) as download_count,
FROM
  TABLE_DATE_RANGE(
    [the-psf:pypi.downloads],
    DATE_ADD(CURRENT_TIMESTAMP(), -366, "day"),
    DATE_ADD(CURRENT_TIMESTAMP(), -1, "day")
  )
GROUP BY
  project,
ORDER BY
  download_count DESC
LIMIT 100
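
The two DATE_ADD offsets in the query follow from the options: --days 365 with the default end date of -1 gives a start of -366, just as the default 30 days gives the default start of -31. The sketch below reproduces that arithmetic. The rule start = end - days is inferred from the help text and the query above, not taken from pypinfo's source. The function names are illustrative, and the sketch works on calendar dates where the query uses timestamps.

```python
from datetime import date, timedelta


def day_offsets(days=30, end_offset=-1):
    """Return (start_offset, end_offset) in days relative to today."""
    if end_offset >= 0:
        raise ValueError("end_offset must be negative")
    return end_offset - days, end_offset


def date_window(days=30, end_offset=-1, today=None):
    """Return the (first, last) calendar dates covered by the offsets."""
    today = date.today() if today is None else today
    start, end = day_offsets(days, end_offset)
    return today + timedelta(days=start), today + timedelta(days=end)


print(day_offsets(365))
print(date_window(365, today=date(2017, 11, 9)))
```

With today fixed at 2017-11-09, the window runs from 2016-11-08 to 2017-11-08.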

Credits

Changelog

Important changes are emphasized.

7.0.0

  • Output table is now in Markdown format for easy copying to GitHub issues and PRs.

6.0.0

  • Updated google-cloud-bigquery dependency.

5.0.0

  • Numeric output (non-json) is now prettier (thanks hugovk)
  • You can now filter results for only pip installs with the --pip flag (thanks hugovk)

4.0.0

3.0.1

  • Fix: project names are now normalized to adhere to PEP 503.

3.0.0

  • Breaking: --json option is now just a flag and prints output as prettified JSON.

2.0.0

  • Added --json path option.

1.0.0

  • Initial release

Release History

  • 7.0.0 (this version)
  • 6.0.0
  • 5.0.0
  • 4.0.0
  • 3.0.1
  • 3.0.0
  • 2.0.0
  • 1.0.0
  • 0.0.1

Download Files

File name                      Size    Python version File type Upload date
------------------------------ ------- -------------- --------- -----------
pypinfo-7.0.0-py3-none-any.whl 14.8 kB py3            Wheel     Nov 9, 2017
pypinfo-7.0.0.tar.gz           9.7 kB  -              Source    Nov 9, 2017
