Translate Package URLs (PURLs) into validated download URLs for source code artifacts

PURL2SRC - Package URL to Source Download URLs

Requires Python 3.8+.

Translate Package URLs (PURLs) into validated download URLs for source code artifacts across multiple package ecosystems. PURL2SRC resolves each PURL through a three-tier strategy, optionally validates the resulting URL, and supports batch processing for automated source code retrieval workflows.

Features

  • Multi-Ecosystem Support: NPM, PyPI, Cargo, NuGet, GitHub, Maven, RubyGems, Go, Conda, and more
  • Smart Resolution Strategy: Three-level approach from direct URL construction to API queries and local fallback
  • URL Validation: Verify download URLs are accessible before returning results
  • SEMCL.ONE Integration: Works alongside the other SEMCL.ONE tools for source analysis
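
The three-level strategy in the feature list can be pictured as a chain of resolvers tried in order. The sketch below is illustrative only: `resolve_with_fallback` and the three stand-in tiers are hypothetical names chosen for this example, not purl2src internals, and the real tiers may behave differently.

```python
from typing import Callable, Iterable, Optional

# A resolver takes a PURL string and returns a URL, or None if it cannot help.
Resolver = Callable[[str], Optional[str]]


def resolve_with_fallback(purl: str, tiers: Iterable[Resolver]) -> Optional[str]:
    """Try each tier in order and return the first URL produced."""
    for tier in tiers:
        try:
            url = tier(purl)
        except Exception:
            # A failing tier should not stop later tiers from being tried.
            continue
        if url:
            return url
    return None


# Hypothetical stand-ins for the three levels named in the feature list.
def direct_construction(purl: str) -> Optional[str]:
    # Level 1: build the URL from a known pattern (one package handled here).
    if purl == "pkg:npm/express@4.17.1":
        return "https://registry.npmjs.org/express/-/express-4.17.1.tgz"
    return None


def api_query(purl: str) -> Optional[str]:
    # Level 2: would ask the registry API; simulated as unavailable.
    raise ConnectionError("registry API not reachable in this sketch")


def local_fallback(purl: str) -> Optional[str]:
    # Level 3: would consult local tooling; simulated with a fixed answer.
    return "https://example.com/fallback.tar.gz"


TIERS = [direct_construction, api_query, local_fallback]
first = resolve_with_fallback("pkg:npm/express@4.17.1", TIERS)
second = resolve_with_fallback("pkg:pypi/requests@2.28.0", TIERS)
```

Here `first` is answered by the first tier, while `second` falls through the unavailable API tier to the local fallback.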

Installation

pip install purl2src

For development:

git clone https://github.com/SemClone/purl2src.git
cd purl2src
pip install -e .

Quick Start

# Convert a single PURL to a download URL
purl2src "pkg:npm/express@4.17.1"

# Batch process multiple PURLs with validation
purl2src -f purls.txt --validate --output results.json

Usage

CLI Usage

# Single PURL with default text output
purl2src "pkg:npm/express@4.17.1"
# Output: pkg:npm/express@4.17.1 -> https://registry.npmjs.org/express/-/express-4.17.1.tgz

# JSON output format
purl2src "pkg:npm/express@4.17.1" --format json

# With URL validation
purl2src "pkg:pypi/requests@2.28.0" --validate

# Batch processing from file
purl2src -f purls.txt --output results.json

# CSV output format
purl2src -f purls.txt --format csv --output results.csv
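
To give a sense of what a check like `--validate` might involve, here is a minimal sketch that sends a HEAD request and treats any 2xx response as reachable. This is an assumption about the general technique, not a description of how purl2src validates URLs; `is_reachable` and its `opener` parameter are hypothetical names, with `opener` included only so the function can be exercised without network access.

```python
import urllib.request
from typing import Callable


def is_reachable(url: str, timeout: float = 10.0,
                 opener: Callable = urllib.request.urlopen) -> bool:
    """Return True if a HEAD request to url succeeds with a 2xx status."""
    try:
        # Request raises ValueError for strings that are not valid URLs.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # URLError and HTTPError (4xx/5xx) are OSError subclasses,
        # as are socket timeouts.
        return False
```

Some servers reject HEAD requests, so a production check might fall back to a ranged GET before declaring a URL unreachable.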

Python API

from purl2src import get_download_url

# Get download URL for a PURL
result = get_download_url("pkg:npm/express@4.17.1")
print(result.download_url)
# https://registry.npmjs.org/express/-/express-4.17.1.tgz

# With validation (recommended for production)
result = get_download_url("pkg:pypi/requests@2.28.0", validate=True)

# Batch processing
from purl2src import process_purls
results = process_purls(["pkg:npm/express@4.17.1", "pkg:pypi/requests@2.28.0"])
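
When resolving many PURLs from your own code, it can help to isolate failures so one bad entry does not abort the run. The helper below is a hypothetical wrapper, not part of purl2src: it accepts any resolver callable whose result exposes a `download_url` attribute, as `get_download_url` does in the example above. How `get_download_url` reports failures is not stated here, so the sketch tolerates both an exception and a result without a URL.

```python
from types import SimpleNamespace
from typing import Callable, Dict, Iterable, Optional


def resolve_many(purls: Iterable[str],
                 resolver: Callable[[str], object]) -> Dict[str, Optional[str]]:
    """Map each PURL to its download URL, or None when resolution fails."""
    resolved: Dict[str, Optional[str]] = {}
    for purl in purls:
        try:
            result = resolver(purl)
            resolved[purl] = getattr(result, "download_url", None)
        except Exception:
            resolved[purl] = None
    return resolved


# Stand-in resolver so the sketch runs without purl2src installed.
def fake_resolver(purl: str) -> SimpleNamespace:
    if purl == "pkg:npm/express@4.17.1":
        return SimpleNamespace(
            download_url="https://registry.npmjs.org/express/-/express-4.17.1.tgz"
        )
    raise ValueError("unsupported in this stand-in")


urls = resolve_many(["pkg:npm/express@4.17.1", "pkg:unknown/x@1"], fake_resolver)
```

With purl2src installed, passing `get_download_url` as the `resolver` argument should work the same way, assuming its results carry `download_url` as shown above.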

Supported Ecosystems

| Ecosystem | PURL Type | Example |
|-----------|-----------|---------|
| NPM | npm | pkg:npm/@angular/core@12.0.0 |
| PyPI | pypi | pkg:pypi/django@4.0.0 |
| Cargo | cargo | pkg:cargo/serde@1.0.0 |
| NuGet | nuget | pkg:nuget/Newtonsoft.Json@13.0.1 |
| Maven | maven | pkg:maven/org.apache.commons/commons-lang3@3.12.0 |
| RubyGems | gem | pkg:gem/rails@7.0.0 |
| Go | golang | pkg:golang/github.com/gin-gonic/gin@v1.8.0 |
| GitHub | github | pkg:github/facebook/react@v18.0.0 |
| Conda | conda | pkg:conda/numpy@1.23.0?channel=conda-forge&subdir=linux-64&build=py39h1234567_0 |
| Generic | generic | pkg:generic/package@1.0.0?download_url=https://example.com/file.tar.gz |
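
The examples in the table share one shape: `pkg:type/namespace/name@version?qualifiers`. The sketch below splits a PURL into those parts to show how the pieces relate. It is a simplified illustration, not the parser purl2src uses: `ParsedPurl` and `parse_purl` are hypothetical names, and subpaths (`#...`) are not handled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
from urllib.parse import unquote


@dataclass
class ParsedPurl:
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]
    qualifiers: Dict[str, str] = field(default_factory=dict)


def parse_purl(purl: str) -> ParsedPurl:
    """Split a PURL into type, namespace, name, version and qualifiers."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a PURL: {purl!r}")
    remainder = purl[len("pkg:"):].lstrip("/")

    # Everything after the first '?' is a list of key=value qualifiers.
    remainder, _, query = remainder.partition("?")
    qualifiers: Dict[str, str] = {}
    for pair in filter(None, query.split("&")):
        key, _, value = pair.partition("=")
        qualifiers[key.lower()] = unquote(value)

    purl_type, sep, path = remainder.partition("/")
    if not sep or not path:
        raise ValueError(f"missing package name: {purl!r}")

    # The version follows the last '@'; a leading '@' is an npm scope instead.
    version = None
    at = path.rfind("@")
    if at > 0:
        path, version = path[:at], unquote(path[at + 1:])

    # The last path segment is the name; anything before it is the namespace.
    namespace, _, name = path.rpartition("/")
    return ParsedPurl(purl_type.lower(), unquote(namespace) or None,
                      unquote(name), version, qualifiers)


scoped = parse_purl("pkg:npm/@angular/core@12.0.0")
generic = parse_purl("pkg:generic/package@1.0.0?download_url=https://example.com/file.tar.gz")
```

For the scoped npm example, the namespace is `@angular` and the name is `core`; for the generic example, the download location travels entirely in the `download_url` qualifier.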

Examples

NPM with Scoped Package

purl2src "pkg:npm/@angular/core@12.0.0"
# Output: https://registry.npmjs.org/@angular/core/-/core-12.0.0.tgz
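
The output above follows the npm registry's tarball layout, in which the scope appears in the package path but not in the file name. A minimal sketch of that direct construction, assuming the default public registry (`npm_tarball_url` is a hypothetical helper, not a purl2src function):

```python
from typing import Optional


def npm_tarball_url(name: str, version: str, scope: Optional[str] = None,
                    registry: str = "https://registry.npmjs.org") -> str:
    """Build the registry tarball URL for an npm package version."""
    # Scoped packages keep "@scope/" in the path; the file name uses the bare name.
    full_name = f"{scope}/{name}" if scope else name
    return f"{registry}/{full_name}/-/{name}-{version}.tgz"


scoped_url = npm_tarball_url("core", "12.0.0", scope="@angular")
plain_url = npm_tarball_url("express", "4.17.1")
```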

Maven with Classifier

purl2src "pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?classifier=sources"
# Output: https://repo.maven.apache.org/maven2/org/apache/xmlgraphics/batik-anim/1.9.1/batik-anim-1.9.1-sources.jar
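
The Maven output follows the standard repository layout: dots in the group ID become path separators, and the classifier is appended to the file name. A minimal sketch of that mapping, assuming Maven Central and a `jar` extension by default (`maven_artifact_url` is a hypothetical helper, not a purl2src function):

```python
from typing import Optional


def maven_artifact_url(group_id: str, artifact_id: str, version: str,
                       classifier: Optional[str] = None, extension: str = "jar",
                       repository: str = "https://repo.maven.apache.org/maven2") -> str:
    """Build the repository URL for a Maven artifact."""
    group_path = group_id.replace(".", "/")
    suffix = f"-{classifier}" if classifier else ""
    filename = f"{artifact_id}-{version}{suffix}.{extension}"
    return f"{repository}/{group_path}/{artifact_id}/{version}/{filename}"


sources_url = maven_artifact_url(
    "org.apache.xmlgraphics", "batik-anim", "1.9.1", classifier="sources"
)
```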

Generic with Checksum Validation

purl2src "pkg:generic/mypackage@1.0.0?download_url=https://example.com/pkg.tar.gz&checksum=sha256:abcd1234..."
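
How purl2src uses the `checksum` qualifier is not spelled out here. As a general illustration, a value of the form `algorithm:hexdigest` can be checked against downloaded bytes as in the sketch below, where `matches_checksum` and the set of accepted algorithms are assumptions made for this example.

```python
import hashlib

# Algorithms accepted by this sketch; an assumption, not a purl2src list.
SUPPORTED_ALGORITHMS = {"md5", "sha1", "sha256", "sha512"}


def matches_checksum(data: bytes, checksum: str) -> bool:
    """Check data against a checksum written as '<algorithm>:<hex digest>'."""
    algorithm, sep, expected = checksum.partition(":")
    algorithm = algorithm.lower()
    if not sep or algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported checksum: {checksum!r}")
    actual = hashlib.new(algorithm, data).hexdigest()
    # Hex digests are case-insensitive, so normalize before comparing.
    return actual == expected.lower()


payload = b"example archive contents"
good = "sha256:" + hashlib.sha256(payload).hexdigest()
ok = matches_checksum(payload, good)
tampered = matches_checksum(payload + b"!", good)
```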

Integration with SEMCL.ONE

PURL2SRC is a core component of the SEMCL.ONE ecosystem, enabling automated source code retrieval workflows:

  • Works with src2purl for package identification and coordinate extraction
  • Integrates with purl2notices for legal notice generation from source packages
  • Supports upmex package metadata extraction workflows
  • Complements osslili for comprehensive license analysis of downloaded sources

Documentation

  • User Guide - Comprehensive usage examples and configuration
  • API Reference - Python API documentation and examples
  • Examples - Common workflows and integration patterns

Contributing

We welcome contributions! Please see CONTRIBUTING.md for details on:

  • Code of conduct
  • Development setup
  • Submitting pull requests
  • Reporting issues

Support

For support and questions:

License

Apache License 2.0 - see LICENSE file for details.

Authors

See AUTHORS.md for a list of contributors.


Part of the SEMCL.ONE ecosystem for comprehensive OSS compliance and code analysis.


Download files

Source Distribution

purl2src-1.2.4.tar.gz (28.6 kB)

Uploaded Source

Built Distribution

purl2src-1.2.4-py3-none-any.whl (28.3 kB)

Uploaded Python 3

File details

Details for the file purl2src-1.2.4.tar.gz.

File metadata

  • Download URL: purl2src-1.2.4.tar.gz
  • Upload date:
  • Size: 28.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for purl2src-1.2.4.tar.gz
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | 6b3c739dd3d550826640ab3c36834233dff3695c8f01c986f959dbe1645d7b2b |
| MD5 | d1708983f5b14c2f2748b6e0da7ee341 |
| BLAKE2b-256 | fdc26207986adce081973d07aabb64e9c903c31996f2618095282a75e5c97d79 |

Provenance

The following attestation bundles were made for purl2src-1.2.4.tar.gz:

Publisher: python-publish.yml on SemClone/purl2src

File details

Details for the file purl2src-1.2.4-py3-none-any.whl.

File metadata

  • Download URL: purl2src-1.2.4-py3-none-any.whl
  • Upload date:
  • Size: 28.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for purl2src-1.2.4-py3-none-any.whl
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | 6d48bf72a071cf04ee9b832cde50b1d0f3da90a419186a99eabd0931730b1846 |
| MD5 | dc226267fa50d9dac5a6f4b33c1755ee |
| BLAKE2b-256 | 05ccf0a252497c6913ed31e4f2ee4f043c0046df5b0639dcb9d8f12b8c12256f |

Provenance

The following attestation bundles were made for purl2src-1.2.4-py3-none-any.whl:

Publisher: python-publish.yml on SemClone/purl2src
