PyPIContents is an application that generates a Module Index from the Python Package Index (PyPI) and also from various versions of the Python Standard Library.


PyPIContents generates a configurable index, written in JSON, that serves as a database for applications like pipsalabim. It can be configured to process only a range of packages (by initial letter) and to stop when memory, time, or log size limits are exceeded. It aims to be for the Python Package Index what the Contents file is for a Debian-based package repository.

This repository stores the application in the master branch. It also stores a Module Index in the contents branch that is updated daily through a Travis cron. Read below for more information on how to use one or the other.

For more information, please read the full documentation.

Getting started

Installation

The pypicontents program is written in Python and hosted on PyPI, so you can use pip to install the stable version:

$ pip install --upgrade pypicontents

If you want to install the development version (not recommended), you can install it directly from GitHub like this:

$ pip install --upgrade https://github.com/LuisAlejandro/pypicontents/archive/master.tar.gz

Using the application

PyPIContents is divided into several commands.

pypicontents pypi

This command generates a JSON module index with information from PyPI. Read below for more information on how to use it:

$ pypicontents pypi --help

usage: pypicontents pypi [options]

General Options:
  -V, --version         Print version and exit.
  -h, --help            Show this help message and exit.

Pypi Options:
  -l <level>, --loglevel <level>
                        Logger verbosity level (default: INFO). Must be one
                        of: DEBUG, INFO, WARNING, ERROR or CRITICAL.
  -f <path>, --logfile <path>
                        A path pointing to a file to be used to store logs.
  -o <path>, --outputfile <path>
                        A path pointing to a file that will be used to store
                        the JSON Module Index (required).
  -R <letter/number>, --letter-range <letter/number>
                        An expression representing an alphanumeric range to be
                        used to filter packages from PyPI (default: 0-z). You
                        can use a single alphanumeric character like "0" to
                        process only packages beginning with "0". You can use
                        commas use as a list o dashes to use as an interval.
  -L <size>, --limit-log-size <size>
                        Stop processing if log size exceeds <size> (default:
                        3M).
  -M <size>, --limit-mem <size>
                        Stop processing if process memory exceeds <size>
                        (default: 2G).
  -T <sec>, --limit-time <sec>
                        Stop processing if process time exceeds <sec>
                        (default: 2100).
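
As a rough illustration of the -R expression described above, the sketch below expands an expression such as 0-z or a-c,x into the initial characters it covers. The function name expand_letter_range and the ordering of digits before letters are assumptions made for this example; it is not the parser PyPIContents itself uses.

```python
import string

# Assumed ordering: digits first, then lowercase letters, so "0-z" spans everything.
ALPHABET = string.digits + string.ascii_lowercase


def _position(char):
    """Return the index of a single alphanumeric character in ALPHABET."""
    if len(char) != 1 or char not in ALPHABET:
        raise ValueError('not a single alphanumeric character: %r' % char)
    return ALPHABET.index(char)


def expand_letter_range(expr):
    """Expand e.g. '0-z' or 'a-c,x' into a sorted list of initial characters."""
    selected = set()
    for part in expr.lower().split(','):
        part = part.strip()
        if not part:
            continue
        if '-' in part:
            # A dash denotes an inclusive interval.
            start, end = part.split('-', 1)
            low, high = _position(start), _position(end)
            if low > high:
                raise ValueError('inverted interval: %r' % part)
            selected.update(ALPHABET[low:high + 1])
        else:
            # A bare character selects only packages starting with it.
            selected.add(ALPHABET[_position(part)])
    return [char for char in ALPHABET if char in selected]
```

For instance, expand_letter_range('a-c,x') selects packages starting with a, b, c or x, which is one way to split a long indexing run into smaller jobs.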

pypicontents stdlib

This command generates a JSON Module Index from the Python Standard Library. Read below for more information on how to use it:

$ pypicontents stdlib --help

usage: pypicontents stdlib [options]

General Options:
  -V, --version         Print version and exit.
  -h, --help            Show this help message and exit.

Stdlib Options:
  -o <path>, --outputfile <path>
                        A path pointing to a file that will be used to store
                        the JSON Module Index (required).
  -p <version>, --pyver <version>
                        Python version to be used for the Standard Library
                        (default: 2.7).

pypicontents stats

This command gathers statistics from the logs generated by the pypi command. Read below for more information on how to use it:

$ pypicontents stats --help

usage: pypicontents stats [options]

General Options:
  -V, --version         Print version and exit.
  -h, --help            Show this help message and exit.

Stats Options:
  -i <path>, --inputdir <path>
                        A path pointing to a directory containing JSON files
                        generated by the pypi command (required).
  -o <path>, --outputfile <path>
                        A path pointing to a file that will be used to store
                        the statistics (required).

pypicontents errors

This command summarizes errors found in the logs generated by the pypi command. Read below for more information on how to use it:

$ pypicontents errors --help

usage: pypicontents errors [options]

General Options:
  -V, --version         Print version and exit.
  -h, --help            Show this help message and exit.

Errors Options:
  -i <path>, --inputdir <path>
                        A path pointing to a directory containing JSON files
                        generated by the pypi command (required).
  -o <path>, --outputfile <path>
                        A path pointing to a file that will be used to store
                        the errors (required).

pypicontents merge

This command searches for JSON files generated by the pypi or stdlib commands and combines them into one. Read below for more information on how to use it:

$ pypicontents merge --help

usage: pypicontents merge [options]

General Options:
  -V, --version         Print version and exit.
  -h, --help            Show this help message and exit.

Merge Options:
  -i <path>, --inputdir <path>
                        A path pointing to a directory containing JSON files
                        generated by pypi or stdlib commands (required).
  -o <path>, --outputfile <path>
                        A path pointing to a file that will be used to store
                        the merged JSON files (required).
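
To make the idea of merging concrete, here is a minimal sketch of combining several JSON index files into one dictionary. The function name merge_indexes, the non-recursive *.json search, and the rule that later files override earlier ones on duplicate package names are all assumptions for illustration; the real merge command may behave differently.

```python
import json
from pathlib import Path


def merge_indexes(inputdir, outputfile):
    """Combine every *.json file found in inputdir into a single JSON file."""
    output_path = Path(outputfile).resolve()
    merged = {}
    for path in sorted(Path(inputdir).glob('*.json')):
        # Skip the output file itself if it lives in the input directory.
        if path.resolve() == output_path:
            continue
        with path.open(encoding='utf-8') as handle:
            # Assumption: on duplicate keys, the later file wins.
            merged.update(json.load(handle))
    with open(outputfile, 'w', encoding='utf-8') as handle:
        json.dump(merged, handle, indent=4, sort_keys=True)
    return merged
```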

About the Module Index

In the pypi.json file (located in the contents branch) you will find a dictionary with all the packages registered at the main PyPI instance, each one with the following information:

{
    "pkg_a": {
        "version": [
            "X.Y.Z"
        ],
        "modules": [
            "module_1",
            "module_2",
            "..."
        ],
        "cmdline": [
            "path_1",
            "path_2",
            "..."
        ]
    },
    "pkg_b": {
         "...": "..."
    },
    "...": {},
    "...": {}
}

This index is generated on Travis by executing the setup.py file of each package through a monkeypatch that lets us read the parameters that were passed to setup(). Check out pypicontents/api/process.py for more info.
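
The general technique can be sketched as follows. This is a simplified illustration, not the code in pypicontents/api/process.py: the function name read_setup_kwargs and the stand-in setuptools module are assumptions made for this example, and it only handles setup scripts that import setup from setuptools. Note that running a setup.py executes arbitrary code, so this should only be done in a sandbox.

```python
import runpy
import sys
import types
from unittest import mock


def read_setup_kwargs(setup_py_path):
    """Run a setup.py with setup() replaced and return the keyword arguments."""
    captured = {}

    def fake_setup(*args, **kwargs):
        # Record the parameters instead of building or installing anything.
        captured.update(kwargs)

    # Stand-in module so that "from setuptools import setup" gets our fake.
    fake_setuptools = types.ModuleType('setuptools')
    fake_setuptools.setup = fake_setup
    fake_setuptools.find_packages = lambda *args, **kwargs: []

    # patch.dict restores sys.modules when the block exits.
    with mock.patch.dict(sys.modules, {'setuptools': fake_setuptools}):
        runpy.run_path(setup_py_path, run_name='__main__')
    return captured
```

The returned dictionary would then hold entries such as name, version, packages or py_modules, from which a module list can be derived.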

Use cases

  • Find which package (or packages) contain a given Python module. Useful for determining a project’s requirements.txt or install_requires.

import json
from pprint import pprint
from urllib.request import urlopen  # Python 3 replacement for urllib2

pypic = 'https://raw.githubusercontent.com/LuisAlejandro/pypicontents/contents/pypi.json'

# Download and parse the Module Index.
with urlopen(pypic) as f:
    pypicontents = json.loads(f.read().decode('utf-8'))

def find_package(contents, module):
    """Yield every package whose module list contains `module`."""
    for pkg, data in contents.items():
        # Tolerate entries with partial data (no 'modules' key).
        if module in data.get('modules', []):
            yield {pkg: data['modules']}

# Which package(s) contain the 'django' module?
pprint(list(find_package(pypicontents, 'django')))

Hint: Check out Pip Sala Bim.
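
The requirements use case mentioned above can be sketched by extracting the imports of a source file and looking each one up in the index. The function names imported_modules and candidate_packages are hypothetical, and the sketch assumes the index lists modules by their top-level name, as in the 'django' lookup above. It is an illustration of the idea, not how Pip Sala Bim is implemented.

```python
import ast


def imported_modules(source):
    """Return the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split('.')[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports: they refer to the project itself.
            if node.level == 0 and node.module:
                names.add(node.module.split('.')[0])
    return names


def candidate_packages(contents, source):
    """Map each imported module to the packages in the index that provide it."""
    wanted = imported_modules(source)
    result = {}
    for pkg, data in contents.items():
        for mod in wanted.intersection(data.get('modules', [])):
            result.setdefault(mod, []).append(pkg)
    return {mod: sorted(pkgs) for mod, pkgs in result.items()}
```

A module provided by several packages yields several candidates, so the result is a starting point for a requirements list rather than a final answer.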

Known Issues

  1. Some packages have partial or missing data for one of these reasons:

    1. Some packages import third-party packages (outside the stdlib) in their setup.py. We try to override these imports, but if the setup depends heavily on them, it fails anyway.

    2. Some packages are broken and error out when executing setup.py.

    3. Some packages are empty or have no releases.

  2. If a package is updated on PyPI and the change adds or removes modules, the index won’t reflect it until the next rebuild. Check the version field to confirm that the indexed data matches the release you care about. If you need a more up-to-date index, feel free to download this software and build your own.
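
The version check suggested above might look like the sketch below. The function names are hypothetical. The first function assumes the version field is a list of strings, as shown in the index layout earlier; the second queries the public PyPI JSON API for the latest release and is only one possible way to obtain that value.

```python
import json
from urllib.request import urlopen


def index_matches_release(entry, released_version):
    """Return True if the index entry was built from the given release."""
    return released_version in entry.get('version', [])


def latest_release(package):
    """Fetch the latest version of a package from the PyPI JSON API."""
    url = 'https://pypi.org/pypi/{0}/json'.format(package)
    with urlopen(url) as response:
        return json.loads(response.read().decode('utf-8'))['info']['version']
```

If index_matches_release returns False for the release reported by latest_release, the module list for that package may be out of date.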

Getting help

If you have any questions or problems, join our Gitter Chat and ask for help. You can also ask your question on StackOverflow (tag it pypicontents) or drop me an email at luis@huntingbears.com.ve.

Contributing

See CONTRIBUTING.rst for details.

Release history

See HISTORY.rst for details.

License

Copyright 2016-2017, PyPIContents Developers (read AUTHORS.rst for a full list of copyright holders).

Released under a GPL-3 License (read COPYING.rst for license details).

Made with ❤ and 🍔

My name is Luis (@LuisAlejandro) and I’m a Free and Open-Source Software developer living in Maracay, Venezuela.

If you like what I do, please support me on Patreon, Flattr, or donate via PayPal, so that I can continue doing what I love.

Blog huntingbears.com.ve · GitHub @LuisAlejandro · Twitter @LuisAlejandro


