License scanning and documentation for conda and pip envs

Find license types and texts for all installed packages in conda and pip environments, producing a distribution-ready file THIRDPARTY-LICENSES with required details on all packages, including their respective license texts.

Why yet another tool?

There are several packages with similar intents, but I did not find any that match the particular use case PyLicenses covers: producing a complete set of licenses for all installed packages at the package level (as opposed to the file level, as many other such tools do). I also wanted a focused tool that is easily extensible to any framework, in any language.

Features

PyLicenses

  • produces the THIRDPARTY-LICENSES file as a report on all packages

  • collects data from conda, pip, PyPI and GitHub to retrieve information on authorship, package homepage, license type and, most importantly, the actual license text.

  • uses a pipeline of scanners/data collectors. Adding a new framework to scan (e.g. to include npm modules) is a matter of writing a new PackageProvider class with a single method.

  • produces reports and statistics on primary packages (direct dependencies) and secondary packages (pulled in through a dependency). Notably, this works across conda and pip. Statistics currently include counts per license type.

  • highlights packages where the license information or license text is missing

  • can map packages to a fixed license URL for packages that do not include the license text or where the LICENSE file is difficult to find by automated means.
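As a rough illustration of the statistics and the missing-text check listed above, the logic could look something like the following. This is a minimal sketch, not PyLicenses' actual code: the function names `license_counts` and `missing_license_text` are hypothetical, the sample records are made up, and the keys `name`, `license` and `license_text` follow the data conventions described under "How to implement a new scanner".

```python
from collections import Counter


def license_counts(records):
    """Count packages per license type; a missing license counts as 'UNKNOWN'."""
    return Counter(rec.get("license") or "UNKNOWN" for rec in records)


def missing_license_text(records):
    """Return the sorted names of packages that have no license text."""
    return sorted(rec["name"] for rec in records if not rec.get("license_text"))


# Illustrative records only, not real scan results.
records = [
    {"name": "alpha", "license": "MIT", "license_text": "Permission is hereby granted..."},
    {"name": "beta", "license": "MIT", "license_text": ""},
    {"name": "gamma", "license": None, "license_text": None},
]

counts = license_counts(records)
missing = missing_license_text(records)
print(dict(counts))
print(missing)
```

Treating an empty string the same as a missing value is a choice made for this sketch; the real tool may distinguish the two cases.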

How to use

Within your conda or pip virtualenv, run

$ python -m pylicense

To see options

$ python -m pylicense -h
usage: __main__.py [-h] [--github GITHUB] [--stats STATS]

optional arguments:
  -h, --help       show this help message and exit
  --github GITHUB  specify github user,password
  --stats STATS    print statistics

Sample output

See the THIRDPARTY-LICENSES file in this repository for the full license collection report of this package.

The direct output looks something like this

$ python -m pylicense
Packages directly required:

name        author                      license
----------  --------------------------  -----------------------
pylicenses  Patrick Senti               Apache 2.0
wheel       Daniel Holth                Other
urllib3     Andrey Petrov               MIT
tabulate    Sergey Astanin              MIT
sh          Andrew Moffat               MIT
setuptools  Python Packaging Authority  MIT License
requests    Kenneth Reitz               Apache Software License
pip         The pip developers          MIT
certifi     Kenneth Reitz               MPL-2.0
libedit                                 NetBSD
python                                  PSF

Packages pulled in through other requirements:

name             author            license
---------------  ----------------  --------------------------------------
idna             Kim Davies        BSD Like
chardet          Daniel Blanchard  GNU Lesser General Public License v2.1
ca-certificates                    ISC
libffi                             MIT
libgcc-ng                          GPL
libstdcxx-ng                       GPL3 with runtime exception
ncurses                            Free software - X11 License
openssl                            OpenSSL
readline                           GPL3
sqlite                             Public-Domain
xz                                 Public-Domain, GPL
zlib                               zlib

SUCCESS Good news. There are no packages without license texts

SUCCESS The full license report is available in THIRDPARTY-LICENSES

How to implement a new scanner

  1. Add a class in pylicenses.provider, e.g.

    class MyPackageScanner(PackageProvider):
        def get_packages_info(self, packages, subset=None):
            # … your code to update packages …
            return packages

    packages is a dictionary mapping name=>data, where name is either the package’s canonical name or the full distribution name (name-version-type), and data is the data collected so far. For programming convenience, all keys that refer to the same package reference the same data object, so an update made through one key is visible through the others.

    Currently there are only very few conventions for the contents of data:

    • name is the name of the package without version or distribution type

    • dist_name is the full distribution name (name-version or name-version-type)

    • license is the canonical license name (e.g. MIT, Apache-2.0 etc.)

    • license_text is the actual license text

    • license_source is the filename/URL to the source of the license text

    • license_trace is the last scanner to update the data

    Any other data can be stored by the scanners as they see fit. Note the dependency on PackageProvider as a base class is a convenience only.

  2. Add the new scanner class to PyLicenses.PROVIDERS

  3. Add unit tests
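To make the steps above more concrete, here is a minimal, self-contained sketch of a scanner. It rests on several assumptions: the `PackageProvider` class below is only a stand-in so the example runs without PyLicenses installed, `FixedLicenseScanner` and its `FIXED_LICENSES` table are hypothetical, `subset` is assumed to be an optional collection of keys to restrict the scan to, and storing the scanner's class name in `license_trace` is a guess at the intended convention.

```python
class PackageProvider:
    """Stand-in for pylicenses.provider.PackageProvider (assumed interface)."""

    def get_packages_info(self, packages, subset=None):
        raise NotImplementedError


class FixedLicenseScanner(PackageProvider):
    """Hypothetical scanner that fills in licenses from a fixed lookup table."""

    # Hypothetical table: package name -> (license name, license source URL).
    FIXED_LICENSES = {
        "examplepkg": ("MIT", "https://example.com/examplepkg/LICENSE"),
    }

    def get_packages_info(self, packages, subset=None):
        # Assumption: `subset`, if given, restricts the scan to those keys.
        keys = packages.keys() if subset is None else subset
        for key in keys:
            data = packages.get(key)
            if data is None:
                continue
            fixed = self.FIXED_LICENSES.get(data.get("name"))
            # Only fill in a license if no earlier scanner has set one.
            if fixed and not data.get("license"):
                data["license"], data["license_source"] = fixed
                data["license_trace"] = type(self).__name__
        return packages


# Both keys reference the same data object, as described in step 1.
data = {"name": "examplepkg", "dist_name": "examplepkg-1.0"}
packages = {"examplepkg": data, "examplepkg-1.0": data}

packages = FixedLicenseScanner().get_packages_info(packages)
print(packages["examplepkg-1.0"]["license"])
```

Because both keys share one data object, the scanner updates the package once and the result is visible under either key. A unit test for step 3 could assert exactly that.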

License

MIT License - Copyright (c) 2018 Patrick Senti, productaize.io. See LICENSE file.
