Mine and extract complete package lists from the PyPI registry

PyPI Package Miner

A Python tool to mine and extract complete package lists from the PyPI (Python Package Index) registry.

Features

  • Fetches the names of all ~500,000 PyPI packages from the official simple API
  • Retrieves package metadata including homepage and repository URLs
  • Parallel processing with 50 workers for efficient data collection
  • Extracts repository URLs by checking multiple metadata fields
  • Progress tracking with visual feedback
  • Outputs standardized CSV format for cross-ecosystem analysis
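
As a rough illustration of the first feature, the sketch below lists project names through the JSON form of the simple API (PEP 691). It is not taken from pypi-miner's source: the names parse_simple_index and fetch_package_names are hypothetical, and the tool itself may read the HTML index instead.

```python
import json
import urllib.request

SIMPLE_INDEX_URL = "https://pypi.org/simple/"
# Content type that asks the simple API for JSON instead of HTML (PEP 691).
JSON_ACCEPT = "application/vnd.pypi.simple.v1+json"


def parse_simple_index(payload: str) -> list[str]:
    """Return the project names found in a PEP 691 JSON index document."""
    document = json.loads(payload)
    return [project["name"] for project in document.get("projects", [])]


def fetch_package_names(url: str = SIMPLE_INDEX_URL, timeout: float = 60.0) -> list[str]:
    """Download the full index and return every project name (hypothetical helper)."""
    request = urllib.request.Request(url, headers={"Accept": JSON_ACCEPT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_simple_index(response.read().decode("utf-8"))
```

Keeping the parsing separate from the download makes the parsing step easy to check against a small sample document without touching the network.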

Installation

pip install pypi-miner

Quick Start

pypi-miner

Or use it as a Python module:

from pypi_miner import mine_pypi
mine_pypi()

Output

Generates a CSV file with package information:

  • Package ID, Platform, Name
  • Homepage URL, Repository URL
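
To make the columns above concrete, here is a minimal sketch of how one row might be built from a package's metadata. It assumes the `info` object returned by the PyPI JSON API (fields `name`, `home_page`, `project_urls`). The header spellings, the "PyPI" platform value, the label list, and the host list are all assumptions for illustration, not pypi-miner's actual choices.

```python
import csv
import io

# Hypothetical header names derived from the fields listed above.
COLUMNS = ["ID", "Platform", "Name", "Homepage URL", "Repository URL"]
# Assumed project_urls labels that usually point at source code.
REPO_LABELS = ("repository", "source", "source code", "code")
# Assumed code-hosting domains used as a fallback heuristic.
REPO_HOSTS = ("github.com", "gitlab.com", "bitbucket.org")


def extract_repo_url(info: dict) -> str:
    """Pick a repository URL from several metadata fields, or return ''."""
    project_urls = info.get("project_urls") or {}
    # First pass: trust an explicit label such as "Source" or "Repository".
    for label, url in project_urls.items():
        if label.strip().lower() in REPO_LABELS and url:
            return url
    # Second pass: accept any URL that points at a known code host.
    candidates = [info.get("home_page")] + list(project_urls.values())
    for url in candidates:
        if url and any(host in url for host in REPO_HOSTS):
            return url
    return ""


def write_csv(infos, stream) -> None:
    """Write a header plus one row per metadata dict to a text stream."""
    writer = csv.writer(stream)
    writer.writerow(COLUMNS)
    for package_id, info in enumerate(infos, start=1):
        writer.writerow([
            package_id,
            "PyPI",
            info.get("name", ""),
            info.get("home_page") or "",
            extract_repo_url(info),
        ])
```

Checking labelled links before falling back to host matching is one plausible reading of "multiple metadata fields"; the real tool may order or weight the fields differently.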

Performance

  • Runtime: 3-8 hours for the complete dataset
  • Uses 50 parallel workers
  • Processes ~500,000 packages
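
The figures above imply a request rate that can be checked with simple arithmetic. The sketch below shows one way a 50-worker pool might be arranged with the standard library. The names collect, fetch_one, and required_rate are hypothetical, and pypi-miner's actual concurrency model is not documented here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def collect(names, fetch_one, max_workers=50):
    """Run fetch_one(name) for every name on a thread pool.

    Returns a dict mapping each name to its result, or to None when
    fetch_one raised, so one bad package does not stop the run.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, name): name for name in names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception:
                results[name] = None
    return results


def required_rate(packages: int, hours: float) -> float:
    """Average requests per second needed to finish within the given hours."""
    return packages / (hours * 3600)


fast = required_rate(500_000, 3)  # whole-pool rate for a 3-hour run
slow = required_rate(500_000, 8)  # whole-pool rate for an 8-hour run
```

For 500,000 packages, a 3-hour run needs roughly 46 requests per second across the pool and an 8-hour run roughly 17, which is under one request per second per worker when 50 workers share the load.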

Data Source

Package names come from the official PyPI simple API.

License

MIT License - see LICENSE file for details

File details

Details for the file pypi_miner-1.0.2.tar.gz.

File metadata

  • Download URL: pypi_miner-1.0.2.tar.gz
  • Size: 7.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for pypi_miner-1.0.2.tar.gz
Algorithm    Hash digest
SHA256       5e8907076acabdc4c18b49accd2a93a455162b21cad6749b9d2ca9df610b00ec
MD5          0e3acd7ea3af0ef3d1bd08cc6066b884
BLAKE2b-256  b42780ab54bed30e040c7b4a5a04eb46057bad703355954b1738cc6947e60454

File details

Details for the file pypi_miner-1.0.2-py3-none-any.whl.

File metadata

  • Download URL: pypi_miner-1.0.2-py3-none-any.whl
  • Size: 7.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for pypi_miner-1.0.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       52e37c5096386645cd6cd6879766c94860c4e67f6acac4be2aa896c19a965943
MD5          0e1b4ea3415ed53a9870a7e00031eb5f
BLAKE2b-256  bd41790e29595ad92dfb2bc3b5703f294451602f3836d6c8746cabcde71a3fcb
