PyPIContents is an application that generates a Module Index from the Python Package Index (PyPI) and also from various versions of the Python Standard Library.

Project description

PyPIContents generates a configurable index in JSON format that serves as a database for applications like pipsalabim. It can be configured to process only a range of packages (by initial letter) and to enforce memory, time or log size limits. It aims to be for the Python Package Index what the Contents file is for a Debian-based package repository.

This repository stores the application in the master branch. It also stores a Module Index in the contents branch that is updated daily through a Travis cron. Read below for more information on how to use one or the other.

For more information, please read the full documentation.

Getting started

Installation

The pypicontents program is written in Python and hosted on PyPI, so you can use pip to install the stable version:

$ pip install --upgrade pypicontents

If you want to install the development version (not recommended), you can install it directly from GitHub like this:

$ pip install --upgrade https://github.com/LuisAlejandro/pypicontents/archive/master.tar.gz

Using the application

PyPIContents is divided into several commands.

pypicontents pypi

This command generates a JSON module index with information from PyPI. Read below for more information on how to use it:

$ pypicontents pypi --help

usage: pypicontents pypi [options]

General Options:
  -V, --version         Print version and exit.
  -h, --help            Show this help message and exit.

Pypi Options:
  -l <level>, --loglevel <level>
                        Logger verbosity level (default: INFO). Must be one
                        of: DEBUG, INFO, WARNING, ERROR or CRITICAL.
  -f <path>, --logfile <path>
                        A path pointing to a file to be used to store logs.
  -o <path>, --outputfile <path>
                        A path pointing to a file that will be used to store
                        the JSON Module Index (required).
  -R <letter/number>, --letter-range <letter/number>
                        An expression representing an alphanumeric range to be
                        used to filter packages from PyPI (default: 0-z). You
                        can use a single alphanumeric character like "0" to
                        process only packages beginning with "0". You can use
                        commas use as a list o dashes to use as an interval.
  -L <size>, --limit-log-size <size>
                        Stop processing if log size exceeds <size> (default:
                        3M).
  -M <size>, --limit-mem <size>
                        Stop processing if process memory exceeds <size>
                        (default: 2G).
  -T <sec>, --limit-time <sec>
                        Stop processing if process time exceeds <sec>
                        (default: 2100).
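
The --letter-range expression described above (single characters, comma-separated lists and dash-separated intervals over 0-z) can be illustrated with a short sketch. This is not the parser PyPIContents actually uses; the function name expand_letter_range and the alphabet ordering (digits first, then lowercase letters) are assumptions made for illustration only.

```python
import string

# Assumed ordering for the default "0-z" range: digits, then lowercase letters.
ALPHABET = string.digits + string.ascii_lowercase


def _position(char):
    """Return the index of a single alphanumeric character in ALPHABET."""
    if len(char) != 1 or char not in ALPHABET:
        raise ValueError("not a single alphanumeric character: %r" % char)
    return ALPHABET.index(char)


def expand_letter_range(expr="0-z"):
    """Expand an expression such as "0", "a,c" or "a-f,0" into a list of initials."""
    selected = set()
    for part in expr.lower().split(","):
        part = part.strip()
        if "-" in part:
            # Dash-separated interval, inclusive on both ends.
            start, end = part.split("-", 1)
            low, high = _position(start), _position(end)
            if low > high:
                raise ValueError("reversed interval: %r" % part)
            selected.update(ALPHABET[low:high + 1])
        else:
            # Single character.
            selected.add(ALPHABET[_position(part)])
    return sorted(selected, key=ALPHABET.index)
```

Under these assumptions, expand_letter_range("a-c,0") selects the initials 0, a, b and c, and the default expression covers all 36 characters.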

pypicontents stdlib

This command generates a JSON Module Index from the Python Standard Library. Read below for more information on how to use it:

$ pypicontents stdlib --help

usage: pypicontents stdlib [options]

General Options:
  -V, --version         Print version and exit.
  -h, --help            Show this help message and exit.

Stdlib Options:
  -o <path>, --outputfile <path>
                        A path pointing to a file that will be used to store
                        the JSON Module Index (required).
  -p <version>, --pyver <version>
                        Python version to be used for the Standard Library
                        (default: 2.7).

pypicontents stats

This command gathers statistics from the logs generated by the pypi command. Read below for more information on how to use it:

$ pypicontents stats --help

usage: pypicontents stats [options]

General Options:
  -V, --version         Print version and exit.
  -h, --help            Show this help message and exit.

Stats Options:
  -i <path>, --inputdir <path>
                        A path pointing to a directory containing JSON files
                        generated by the pypi command (required).
  -o <path>, --outputfile <path>
                        A path pointing to a file that will be used to store
                        the statistics (required).

pypicontents errors

This command summarizes errors found in the logs generated by the pypi command. Read below for more information on how to use it:

$ pypicontents errors --help

usage: pypicontents errors [options]

General Options:
  -V, --version         Print version and exit.
  -h, --help            Show this help message and exit.

Errors Options:
  -i <path>, --inputdir <path>
                        A path pointing to a directory containing JSON files
                        generated by the pypi command (required).
  -o <path>, --outputfile <path>
                        A path pointing to a file that will be used to store
                        the errors (required).

pypicontents merge

This command searches for JSON files generated by the pypi or stdlib commands and combines them into one. Read below for more information on how to use it:

$ pypicontents merge --help

usage: pypicontents merge [options]

General Options:
  -V, --version         Print version and exit.
  -h, --help            Show this help message and exit.

Merge Options:
  -i <path>, --inputdir <path>
                        A path pointing to a directory containing JSON files
                        generated by pypi or stdlib commands (required).
  -o <path>, --outputfile <path>
                        A path pointing to a file that will be used to store
                        the merged JSON files (required).
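
Conceptually, merging amounts to loading every JSON file in a directory and combining the top-level dictionaries. The sketch below shows that idea only. The function name merge_indexes is hypothetical, and the rule that later files override earlier ones on a key clash is an assumption, not documented behaviour of the merge command.

```python
import json
from pathlib import Path


def merge_indexes(input_dir, output_file):
    """Combine all *.json files in input_dir into one dictionary and save it."""
    merged = {}
    # Sorted so the result is deterministic; later files win on duplicate keys
    # (an assumption made for this sketch).
    for path in sorted(Path(input_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            merged.update(json.load(handle))
    with open(output_file, "w", encoding="utf-8") as handle:
        json.dump(merged, handle, indent=4, sort_keys=True)
    return merged
```

In this sketch the output file should live outside the input directory, otherwise a previous result would be read back in on the next run.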

About the Module Index

In the pypi.json file (located in the contents branch) you will find a dictionary with all the packages registered at the main PyPI instance, each one with the following information:

{
    "pkg_a": {
        "version": [
            "X.Y.Z"
        ],
        "modules": [
            "module_1",
            "module_2",
            "..."
        ],
        "cmdline": [
            "path_1",
            "path_2",
            "..."
        ]
    },
    "pkg_b": {
         "...": "..."
    },
    "...": {},
    "...": {}
}

The index is generated on Travis by executing the setup.py file of each package through a monkeypatch that lets us read the parameters passed to setup(). Check out pypicontents/api/process.py for more info.
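
The general technique of reading setup() parameters through a monkeypatch can be sketched as follows. This is a simplified illustration and not the code in pypicontents/api/process.py: the function name capture_setup_kwargs and the stand-in setuptools module are assumptions. Note that it executes the setup.py file, so it should only be run on code you trust or inside a sandbox.

```python
import runpy
import sys
import types
from unittest import mock


def capture_setup_kwargs(setup_py_path):
    """Run a setup.py with setup() replaced by a recorder and return its keyword arguments."""
    captured = {}

    def fake_setup(*args, **kwargs):
        # Record the arguments instead of building or installing anything.
        captured.update(kwargs)

    # Stand-in module so that "from setuptools import setup" resolves to the recorder.
    fake_setuptools = types.ModuleType("setuptools")
    fake_setuptools.setup = fake_setup
    fake_setuptools.find_packages = lambda *args, **kwargs: []

    # patch.dict restores sys.modules to its previous state on exit.
    with mock.patch.dict(sys.modules, {"setuptools": fake_setuptools}):
        runpy.run_path(setup_py_path, run_name="__main__")
    return captured
```

A setup.py that imports anything else from setuptools or from third-party packages would fail under this minimal stand-in, which is the kind of limitation described under Known Issues.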

Use cases

  • Find which package (or packages) contain a given Python module. Useful for determining a project’s requirements.txt or install_requires.

import json
from pprint import pprint
from urllib.request import urlopen  # Python 3 replacement for urllib2

pypic = 'https://raw.githubusercontent.com/LuisAlejandro/pypicontents/contents/pypi.json'

# Download and parse the Module Index.
with urlopen(pypic) as f:
    pypicontents = json.loads(f.read().decode('utf-8'))

def find_package(contents, module):
    """Yield {package: modules} for every package that provides `module`."""
    for pkg, data in contents.items():
        # Entries with partial data may lack the 'modules' key.
        if module in data.get('modules', []):
            yield {pkg: data['modules']}

# Which package(s) contain the 'django' module?
pprint(list(find_package(pypicontents, 'django')))

Hint: Check out Pip Sala Bim.
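
When many modules need to be resolved, scanning the whole index for each one is wasteful. A possible refinement is to invert the index once into a module-to-packages mapping. The sketch below assumes the index structure shown in "About the Module Index"; the function name build_module_lookup and the sample data are hypothetical.

```python
from collections import defaultdict


def build_module_lookup(contents):
    """Invert the Module Index into {module: [packages providing it]}."""
    lookup = defaultdict(list)
    for pkg, data in contents.items():
        # Entries with partial data may lack the 'modules' key.
        for mod in data.get("modules", []):
            lookup[mod].append(pkg)
    return dict(lookup)


# Hypothetical sample in the same shape as pypi.json.
sample = {
    "pkg_a": {"version": ["1.0"], "modules": ["module_1", "module_2"]},
    "pkg_b": {"version": ["2.0"], "modules": ["module_2"]},
    "pkg_c": {"version": ["0.1"]},
}
lookup = build_module_lookup(sample)
```

Each lookup is then a single dictionary access, for example lookup.get("module_2", []).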

Known Issues

  1. Some packages have partial or totally absent data for one or more of these reasons:

    1. Some packages import other packages from outside the stdlib in their setup.py. We try to override these imports, but if the setup depends heavily on them, it will fail anyway.

    2. Some packages are broken and error out when executing setup.py.

    3. Some packages are empty or have no releases.

  2. If a package gets updated on PyPI and the change introduces or deletes modules, the change won’t be reflected until the next index rebuild. Check the version field for consistency. If you need a more up-to-date index, feel free to download this software and build your own.
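
The version check mentioned in the second issue could look like the sketch below. It assumes that the version field is a list of strings, as in the structure shown in "About the Module Index", and that you already know the latest version from another source. The function name index_is_current and the sample data are hypothetical.

```python
def index_is_current(contents, package, latest_version):
    """Return True if the index entry for `package` lists `latest_version`."""
    entry = contents.get(package, {})
    # Missing packages or entries without a version count as not current.
    return latest_version in entry.get("version", [])


# Hypothetical sample in the same shape as pypi.json.
sample = {"pkg_a": {"version": ["1.2.0"], "modules": ["module_1"]}}
```

If the function returns False, the modules listed for that package may be out of date.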

Getting help

If you have any questions or problems, subscribe to our Gitter Chat and ask for help. You can also ask your question on StackOverflow (tag it pypicontents) or drop me an email at luis@huntingbears.com.ve.

Contributing

See CONTRIBUTING.rst for details.

Release history

See HISTORY.rst for details.

License

Copyright 2016-2017, PyPIContents Developers (read AUTHORS.rst for a full list of copyright holders).

Released under a GPL-3 License (read COPYING.rst for license details).

Made with :heart: and :hamburger:

My name is Luis (@LuisAlejandro) and I’m a Free and Open-Source Software developer living in Maracay, Venezuela.

If you like what I do, please support me on Patreon, Flattr, or donate via PayPal, so that I can continue doing what I love.

Blog huntingbears.com.ve · GitHub @LuisAlejandro · Twitter @LuisAlejandro



Source Distribution

pypicontents-0.1.14.tar.gz (58.4 kB)
