Mine and extract complete package lists from the PyPI registry
PyPI Package Miner
A Python tool to mine and extract complete package lists from the PyPI (Python Package Index) registry.
Features
- Fetches all ~500,000 PyPI packages from the official Simple index
- Retrieves package metadata including homepage and repository URLs
- Parallel processing with 50 workers for efficient data collection
- Intelligently extracts repository URLs from multiple metadata fields
- Progress tracking with visual feedback
- Outputs standardized CSV format for cross-ecosystem analysis
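The repository URL extraction mentioned above can be sketched as follows. This is an illustration only, not the package's actual logic: the helper names `looks_like_repo` and `extract_repo_url`, the list of hosts, and the preferred link labels are all assumptions. The input is the `info` object returned by the PyPI JSON API, which carries `project_urls`, `home_page`, and `download_url`.

```python
from urllib.parse import urlparse

# Hosts treated as code repositories in this sketch (an assumption).
REPO_HOSTS = ("github.com", "gitlab.com", "bitbucket.org")

# Link labels that usually point at source code (also an assumption).
PREFERRED_LABELS = ("source", "source code", "repository", "code", "github")


def looks_like_repo(url):
    """Return True if the URL is hosted on one of the known repository hosts."""
    if not url:
        return False
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host in REPO_HOSTS


def extract_repo_url(info):
    """Return the first repository-looking URL in a PyPI `info` dict, or None."""
    project_urls = info.get("project_urls") or {}
    candidates = []
    # 1. Project URLs whose label suggests source code.
    for label, url in project_urls.items():
        if label.lower() in PREFERRED_LABELS:
            candidates.append(url)
    # 2. All remaining project URLs, then the legacy single-value fields.
    candidates.extend(project_urls.values())
    candidates.append(info.get("home_page"))
    candidates.append(info.get("download_url"))
    for url in candidates:
        if looks_like_repo(url):
            return url
    return None
```

Checking several fields in priority order matters because many packages leave `home_page` empty or point it at documentation, while the repository link sits under a `project_urls` label.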
Installation
pip install pypi-miner
Quick Start
pypi-miner
Or use as a Python module:
from pypi_miner import mine_pypi
mine_pypi()
Output
Generates a CSV file with package information:
- Package ID, Platform, Name
- Homepage URL, Repository URL
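To make the column layout concrete, here is a minimal sketch of writing such rows with the standard `csv` module. The exact header names, the `PyPI` platform value, and the sample row are assumptions for illustration; the file produced by the tool may label its columns differently.

```python
import csv
import io

# Column names are assumed from the list above, not taken from the tool.
FIELDS = ["ID", "Platform", "Name", "Homepage URL", "Repository URL"]


def write_rows(rows, out):
    """Write a header line plus one CSV line per package dict to `out`."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def to_csv_text(rows):
    """Render the rows to a CSV string (handy for quick inspection)."""
    buffer = io.StringIO(newline="")
    write_rows(rows, buffer)
    return buffer.getvalue()


# Hypothetical sample row.
sample = [
    {
        "ID": 1,
        "Platform": "PyPI",
        "Name": "requests",
        "Homepage URL": "https://requests.readthedocs.io",
        "Repository URL": "https://github.com/psf/requests",
    }
]
```

Calling `to_csv_text(sample)` yields a header line followed by one line per package; to write a file instead, pass a handle opened with `newline=""` to `write_rows`.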
Performance
- Runtime: 3-8 hours for the complete dataset
- Uses 50 parallel workers
- Processes ~500,000 packages
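The figures above imply a sustained rate of roughly 17 to 46 packages per second across all workers. The sketch below shows one way a 50-worker pool could be arranged with `concurrent.futures`, together with that arithmetic. `fetch_all`, `fetch_one`, and `required_rate` are hypothetical names, and the tool itself may be organized differently.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(names, fetch_one, max_workers=50):
    """Call fetch_one(name) for every name in parallel.

    Returns a dict mapping each name to its result; a name whose fetch
    raised an exception maps to None so one failure does not stop the run.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, name): name for name in names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception:
                results[name] = None
    return results


def required_rate(package_count, hours):
    """Packages per second needed to finish package_count in the given hours."""
    return package_count / (hours * 3600)
```

With these numbers, `required_rate(500_000, 3)` is about 46.3 packages per second and `required_rate(500_000, 8)` is about 17.4. Threads suit this workload because each task spends most of its time waiting on the network.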
Data Source
- PyPI Simple Index: https://pypi.org/simple/
- Package Metadata: https://pypi.org/pypi/{package-name}/json
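The two endpoints can be queried with the standard library alone, as in the sketch below. It assumes the JSON form of the Simple index (requested with the `application/vnd.pypi.simple.v1+json` Accept header), whose payload lists projects under a `projects` key. The function names are hypothetical, and the tool may parse the HTML form of the index instead.

```python
import json
import urllib.request

SIMPLE_INDEX_URL = "https://pypi.org/simple/"
METADATA_URL = "https://pypi.org/pypi/{name}/json"


def project_names(index_payload):
    """Extract project names from a JSON Simple index payload."""
    return [project["name"] for project in index_payload.get("projects", [])]


def metadata_url(name):
    """Build the JSON metadata URL for one package."""
    return METADATA_URL.format(name=name)


def fetch_json(url, accept="application/json", timeout=30):
    """GET a URL and decode the JSON body (performs a network request)."""
    request = urllib.request.Request(url, headers={"Accept": accept})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def list_all_projects():
    """Download the full index and return every project name (network)."""
    payload = fetch_json(
        SIMPLE_INDEX_URL, accept="application/vnd.pypi.simple.v1+json"
    )
    return project_names(payload)
```

For a single package, `fetch_json(metadata_url("requests"))["info"]` returns the metadata object that holds the homepage and project URL fields.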
License
MIT License - see LICENSE file for details
Download files
Source Distribution
pypi_miner-1.0.2.tar.gz (7.3 kB)
Built Distribution
pypi_miner-1.0.2-py3-none-any.whl (7.3 kB)
File details
Details for the file pypi_miner-1.0.2.tar.gz.
File metadata
- Download URL: pypi_miner-1.0.2.tar.gz
- Upload date:
- Size: 7.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5e8907076acabdc4c18b49accd2a93a455162b21cad6749b9d2ca9df610b00ec |
| MD5 | 0e3acd7ea3af0ef3d1bd08cc6066b884 |
| BLAKE2b-256 | b42780ab54bed30e040c7b4a5a04eb46057bad703355954b1738cc6947e60454 |
File details
Details for the file pypi_miner-1.0.2-py3-none-any.whl.
File metadata
- Download URL: pypi_miner-1.0.2-py3-none-any.whl
- Upload date:
- Size: 7.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 52e37c5096386645cd6cd6879766c94860c4e67f6acac4be2aa896c19a965943 |
| MD5 | 0e1b4ea3415ed53a9870a7e00031eb5f |
| BLAKE2b-256 | bd41790e29595ad92dfb2bc3b5703f294451602f3836d6c8746cabcde71a3fcb |
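A downloaded file can be checked against the SHA256 digests listed for each release file. The sketch below uses only `hashlib`; the helper names `sha256_of_stream` and `verify_file` are hypothetical, and the file path in the usage note assumes the archive was saved to the current directory.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536):
    """Return the hex SHA256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_hex):
    """Return True if the file at `path` matches the expected SHA256 digest."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle) == expected_hex.strip().lower()


# Small self-check on in-memory data rather than a real download.
demo_digest = sha256_of_stream(io.BytesIO(b"abc"))
```

For example, `verify_file("pypi_miner-1.0.2.tar.gz", "5e8907076acabdc4c18b49accd2a93a455162b21cad6749b9d2ca9df610b00ec")` should return `True` for an intact copy of the source distribution.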