Translate Package URLs (PURLs) into validated download URLs for source code artifacts
PURL2SRC - Package URL to Source Download URLs
Translate Package URLs (PURLs) into validated download URLs for source code artifacts across multiple package ecosystems. Provides a reliable three-tier resolution strategy with URL validation and batch processing capabilities for automated source code retrieval workflows.
Features
- Multi-Ecosystem Support: NPM, PyPI, Cargo, NuGet, GitHub, Maven, RubyGems, Go, Conda, and more
- Smart Resolution Strategy: Three-tier approach that starts with direct URL construction, then falls back to API queries, then to a local fallback
- URL Validation: Verify download URLs are accessible before returning results
- SEMCL.ONE Integration: Seamlessly integrates with other ecosystem tools for comprehensive source analysis
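The resolution strategy listed above can be pictured as an ordered chain of resolvers, where each tier is tried only if the previous one produced nothing usable. The following is a minimal sketch under that assumption, not purl2src's actual implementation: `resolve_with_fallback`, the three stub resolvers, and the `validate` callback are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional, Sequence

# A resolver takes a PURL string and returns a URL, or None if it cannot help.
Resolver = Callable[[str], Optional[str]]


def resolve_with_fallback(
    purl: str,
    resolvers: Sequence[Resolver],
    validate: Optional[Callable[[str], bool]] = None,
) -> Optional[str]:
    """Try each resolver in order and return the first acceptable URL."""
    for resolver in resolvers:
        try:
            url = resolver(purl)
        except Exception:
            # A failing tier (network error, bad input) should not stop the chain.
            continue
        if url and (validate is None or validate(url)):
            return url
    return None


# Hypothetical stand-ins for the three tiers (assumptions, not library code).
def build_direct_url(purl: str) -> Optional[str]:
    if purl == "pkg:npm/express@4.17.1":
        return "https://registry.npmjs.org/express/-/express-4.17.1.tgz"
    return None


def query_registry_api(purl: str) -> Optional[str]:
    raise ConnectionError("registry unreachable in this sketch")


def local_fallback(purl: str) -> Optional[str]:
    return None


TIERS = [build_direct_url, query_registry_api, local_fallback]
url = resolve_with_fallback("pkg:npm/express@4.17.1", TIERS)
```

Passing the validator as a callback keeps the chain itself free of network code, so the ordering logic can be exercised without any HTTP access.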
Installation
pip install purl2src
For development:
git clone https://github.com/SemClone/purl2src.git
cd purl2src
pip install -e .
Quick Start
# Convert a single PURL to download URL
purl2src "pkg:npm/express@4.17.1"
# Batch process multiple PURLs with validation
purl2src -f purls.txt --validate --output results.json
Usage
CLI Usage
# Single PURL with default text output
purl2src "pkg:npm/express@4.17.1"
# Output: pkg:npm/express@4.17.1 -> https://registry.npmjs.org/express/-/express-4.17.1.tgz
# JSON output format
purl2src "pkg:npm/express@4.17.1" --format json
# With URL validation
purl2src "pkg:pypi/requests@2.28.0" --validate
# Batch processing from file
purl2src -f purls.txt --output results.json
# CSV output format
purl2src -f purls.txt --format csv --output results.csv
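The layout of `purls.txt` is not specified here. Assuming it holds one PURL per line, an input list could be prepared or checked with a small helper like the one below. `read_purls` is a hypothetical function, and skipping blank lines and `#` comments is an assumption of this sketch rather than documented CLI behavior.

```python
import io
from typing import List, TextIO


def read_purls(stream: TextIO) -> List[str]:
    """Collect PURLs from a text stream, assuming one entry per line."""
    purls: List[str] = []
    for line_number, line in enumerate(stream, start=1):
        entry = line.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not entry or entry.startswith("#"):
            continue
        # Every Package URL starts with the 'pkg:' scheme.
        if not entry.startswith("pkg:"):
            raise ValueError(f"line {line_number}: not a PURL: {entry!r}")
        purls.append(entry)
    return purls


sample = io.StringIO(
    "# packages to fetch\n"
    "pkg:npm/express@4.17.1\n"
    "\n"
    "pkg:pypi/requests@2.28.0\n"
)
purls = read_purls(sample)
```

The resulting list has the same shape as the argument passed to `process_purls` in the Python API section.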
Python API
from purl2src import get_download_url
# Get download URL for a PURL
result = get_download_url("pkg:npm/express@4.17.1")
print(result.download_url)
# https://registry.npmjs.org/express/-/express-4.17.1.tgz
# With validation (recommended for production)
result = get_download_url("pkg:pypi/requests@2.28.0", validate=True)
# Batch processing
from purl2src import process_purls
results = process_purls(["pkg:npm/express@4.17.1", "pkg:pypi/requests@2.28.0"])
Supported Ecosystems
| Ecosystem | PURL Type | Example |
|---|---|---|
| NPM | npm | pkg:npm/@angular/core@12.0.0 |
| PyPI | pypi | pkg:pypi/django@4.0.0 |
| Cargo | cargo | pkg:cargo/serde@1.0.0 |
| NuGet | nuget | pkg:nuget/Newtonsoft.Json@13.0.1 |
| Maven | maven | pkg:maven/org.apache.commons/commons-lang3@3.12.0 |
| RubyGems | gem | pkg:gem/rails@7.0.0 |
| Go | golang | pkg:golang/github.com/gin-gonic/gin@v1.8.0 |
| GitHub | github | pkg:github/facebook/react@v18.0.0 |
| Conda | conda | pkg:conda/numpy@1.23.0?channel=conda-forge&subdir=linux-64&build=py39h1234567_0 |
| Generic | generic | pkg:generic/package@1.0.0?download_url=https://example.com/file.tar.gz |
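To make the structure of the example PURLs above easier to read, here is a minimal sketch that splits a PURL into type, namespace, name, version, and qualifiers. It is a simplified illustration, not the parser purl2src uses: `ParsedPurl` and `parse_purl` are hypothetical names, and `#subpath` components and most normalization rules of the PURL specification are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
from urllib.parse import parse_qsl, unquote


@dataclass
class ParsedPurl:
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]
    qualifiers: Dict[str, str] = field(default_factory=dict)


def parse_purl(purl: str) -> ParsedPurl:
    """Split a PURL into its main components (simplified, no subpath support)."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a PURL: {purl!r}")
    remainder, _, query = purl[len("pkg:"):].partition("?")
    qualifiers = dict(parse_qsl(query))

    # Only an '@' after the last '/' marks a version; this keeps npm scopes
    # such as '@angular' from being mistaken for a version separator.
    path, version = remainder, None
    at = remainder.rfind("@")
    if at > remainder.rfind("/"):
        path, version = remainder[:at], remainder[at + 1:]

    parts = [unquote(part) for part in path.strip("/").split("/")]
    if len(parts) < 2:
        raise ValueError(f"PURL needs at least a type and a name: {purl!r}")
    namespace = "/".join(parts[1:-1]) or None
    return ParsedPurl(parts[0].lower(), namespace, parts[-1], version, qualifiers)


scoped = parse_purl("pkg:npm/@angular/core@12.0.0")
conda = parse_purl(
    "pkg:conda/numpy@1.23.0?channel=conda-forge&subdir=linux-64&build=py39h1234567_0"
)
```

Here `scoped.namespace` is `@angular` and `scoped.name` is `core`, while `conda.qualifiers` carries the channel, subdir, and build values that a Conda resolver would need.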
Examples
NPM with Scoped Package
purl2src "pkg:npm/@angular/core@12.0.0"
# Output: https://registry.npmjs.org/@angular/core/-/core-12.0.0.tgz
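The output above follows the public npm registry's tarball layout, where the file name uses only the unscoped part of the package name. A minimal sketch of that direct URL construction is shown below; `npm_tarball_url` is a hypothetical helper written for illustration and is not taken from purl2src.

```python
def npm_tarball_url(name: str, version: str) -> str:
    """Build the npm registry tarball URL for a plain or scoped package name."""
    # For '@angular/core' the tarball is named after 'core' only.
    basename = name.rsplit("/", 1)[-1]
    return f"https://registry.npmjs.org/{name}/-/{basename}-{version}.tgz"


scoped_url = npm_tarball_url("@angular/core", "12.0.0")
plain_url = npm_tarball_url("express", "4.17.1")
```

Both results match the URLs shown in this README for `@angular/core` and `express`.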
Maven with Classifier
purl2src "pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?classifier=sources"
# Output: https://repo.maven.apache.org/maven2/org/apache/xmlgraphics/batik-anim/1.9.1/batik-anim-1.9.1-sources.jar
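The Maven output above reflects the standard repository layout: the group ID becomes a directory path, and the optional classifier is appended to the file name. The sketch below reproduces that mapping; `maven_artifact_url` is a hypothetical helper, and defaulting the extension to `jar` is an assumption of this example rather than documented purl2src behavior.

```python
from typing import Optional

MAVEN_CENTRAL = "https://repo.maven.apache.org/maven2"


def maven_artifact_url(
    group_id: str,
    artifact_id: str,
    version: str,
    classifier: Optional[str] = None,
    extension: str = "jar",  # assumption: default packaging for this sketch
) -> str:
    """Build a Maven Central URL from coordinates and an optional classifier."""
    group_path = group_id.replace(".", "/")
    suffix = f"-{classifier}" if classifier else ""
    filename = f"{artifact_id}-{version}{suffix}.{extension}"
    return f"{MAVEN_CENTRAL}/{group_path}/{artifact_id}/{version}/{filename}"


sources_url = maven_artifact_url(
    "org.apache.xmlgraphics", "batik-anim", "1.9.1", classifier="sources"
)
```

With `classifier="sources"` the helper yields the same URL as the CLI output shown above.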
Generic with Checksum Validation
purl2src "pkg:generic/mypackage@1.0.0?download_url=https://example.com/pkg.tar.gz&checksum=sha256:abcd1234..."
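The `checksum` qualifier in the example uses an `algorithm:hexdigest` form. How purl2src checks it is not described here, so the following is only a sketch of how such a value could be verified against downloaded bytes. `verify_checksum` and the allowlist of algorithms are assumptions made for illustration.

```python
import hashlib
import hmac

# Assumption: a small allowlist of fixed-length digest algorithms.
SUPPORTED_ALGORITHMS = {"md5", "sha1", "sha256", "sha512"}


def verify_checksum(data: bytes, checksum: str) -> bool:
    """Return True if data hashes to the digest named in 'algorithm:hexdigest'."""
    algorithm, separator, expected = checksum.partition(":")
    algorithm = algorithm.lower()
    if not separator or not expected:
        raise ValueError(f"expected 'algorithm:hexdigest', got {checksum!r}")
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm!r}")
    actual = hashlib.new(algorithm, data).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected.lower())


payload = b"example archive contents"
good = "sha256:" + hashlib.sha256(payload).hexdigest()
matches = verify_checksum(payload, good)
```

A caller would typically run this after the download completes and discard the file when the function returns `False`.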
Integration with SEMCL.ONE
PURL2SRC is a core component of the SEMCL.ONE ecosystem, enabling automated source code retrieval workflows:
- Works with src2purl for package identification and coordinate extraction
- Integrates with purl2notices for legal notice generation from source packages
- Supports upmex package metadata extraction workflows
- Complements osslili for comprehensive license analysis of downloaded sources
Documentation
- User Guide - Comprehensive usage examples and configuration
- API Reference - Python API documentation and examples
- Examples - Common workflows and integration patterns
Contributing
We welcome contributions! Please see CONTRIBUTING.md for details on:
- Code of conduct
- Development setup
- Submitting pull requests
- Reporting issues
Support
For support and questions:
- GitHub Issues - Bug reports and feature requests
- Documentation - Complete project documentation
- SEMCL.ONE Community - Ecosystem support and discussions
License
Apache License 2.0 - see LICENSE file for details.
Authors
See AUTHORS.md for a list of contributors.
Part of the SEMCL.ONE ecosystem for comprehensive OSS compliance and code analysis.