Julia-style Artifacts system for Python - TOML-based artifact management with automatic downloading and caching
fetch-artifacts
A Julia-style artifact system for Python. Manage large binary files with TOML-based configuration, automatic downloading, content-addressable caching, and checksum verification.
Features
- Julia-compatible: Uses the same Artifacts.toml format as Julia's Pkg.Artifacts
- Content-addressable storage: Artifacts cached by git-tree-sha1 hash for deduplication
- Lazy loading: Download artifacts only when accessed
- Checksum verification: SHA256 verification for all downloads
- Multiple mirrors: Support for fallback download sources
- Simple API: Minimal code to load and use artifacts
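To make the content-addressable and lazy-loading ideas concrete, here is a minimal standard-library sketch. It is not the fetch-artifacts implementation: the names `cache_path_for`, `ensure_cached`, and `populate`, and the one-directory-per-hash layout, are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path
from typing import Callable


def cache_path_for(cache_root: Path, tree_hash: str) -> Path:
    """Map a content hash to its cache directory (hypothetical layout)."""
    return cache_root / tree_hash


def ensure_cached(
    cache_root: Path, tree_hash: str, populate: Callable[[Path], None]
) -> Path:
    """Return the cached directory, calling populate() only on a cache miss."""
    cache_root.mkdir(parents=True, exist_ok=True)
    target = cache_path_for(cache_root, tree_hash)
    if not target.exists():
        # Fill a staging directory first so a failed download never
        # leaves a half-written artifact under the final name.
        staging = Path(tempfile.mkdtemp(dir=cache_root, prefix=".staging-"))
        populate(staging)
        staging.rename(target)
    return target
```

Because the directory name depends only on the hash, two projects that reference the same content under different artifact names would share one cached copy, and the download step runs only the first time the artifact is requested.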
Installation
pip install fetch-artifacts
Usage
1. Create an Artifacts.toml file
[MyDataset]
git-tree-sha1 = "d309b571f5693718c8612d387820a409479fe506"
[[MyDataset.download]]
url = "https://example.com/dataset.tar.xz"
sha256 = "d309b571f5693718c8612d387820a409479fe50688d4c46c87ba8662c6acc09b"
2. Load artifacts in Python
from fetch_artifacts import artifact
# Get path to the artifact (downloads if needed)
dataset_path = artifact("MyDataset")
# Use the artifact
import pandas as pd
data = pd.read_csv(dataset_path / "data.csv")
3. Create and publish artifacts
from fetch_artifacts import create_artifact, bind_artifact
# Create archive from directory
result = create_artifact(
    directory="path/to/data",
    archive_path="output.tar.xz",
    compression="xz"
)
# Add to Artifacts.toml
bind_artifact(
    toml_path="Artifacts.toml",
    name="MyArtifact",
    git_tree_sha1=result['git_tree_sha1'],
    download_url="https://example.com/artifact.tar.xz",
    sha256=result['sha256']
)
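For readers curious what the archive and `sha256` values correspond to, here is a rough standard-library sketch of packing a directory into a `.tar.xz` file and hashing the result. The helper name `pack_directory` is hypothetical, this is not how `create_artifact` is known to work internally, and the git-tree-sha1 computation is deliberately left out.

```python
import hashlib
import tarfile
from pathlib import Path


def pack_directory(directory: Path, archive_path: Path) -> str:
    """Write directory contents to an xz-compressed tarball; return its SHA256."""
    directory = Path(directory)
    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:xz") as archive:
        # Sorted order gives a stable member listing; paths are stored
        # relative to the directory so the archive has no leading folder.
        for path in sorted(directory.rglob("*")):
            arcname = path.relative_to(directory).as_posix()
            archive.add(path, arcname=arcname, recursive=False)

    # Hash the finished archive in chunks so large files are not loaded at once.
    digest = hashlib.sha256()
    with archive_path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Note that the SHA256 identifies the compressed archive as downloaded, while the git-tree-sha1 identifies the unpacked content, which is why an Artifacts.toml entry carries both.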
4. Add existing remote files
from fetch_artifacts import add_artifact
# Download, compute hashes, and add to Artifacts.toml in one step
add_artifact(
    toml_path="Artifacts.toml",
    name="RemoteDataset",
    tarball_url="https://zenodo.org/records/12345/files/data.tar.xz"
)
Advanced Usage
Custom cache directory:
from fetch_artifacts import set_cache_dir
set_cache_dir("/path/to/cache")
Check if artifact exists:
from fetch_artifacts import artifact_exists
if artifact_exists("MyArtifact"):
    print("Artifact is cached")
Clear cache:
from fetch_artifacts import clear_artifact_cache
clear_artifact_cache("MyArtifact") # Clear specific artifact
clear_artifact_cache() # Clear all artifacts
Custom metadata:
[MyEmulator]
git-tree-sha1 = "abc123..."
description = "Neural network emulator for cosmology"
version = "2.0"
[[MyEmulator.download]]
url = "https://zenodo.org/records/12345/files/emulator.tar.xz"
sha256 = "def456..."
Access metadata:
from fetch_artifacts import load_artifacts
manager = load_artifacts("Artifacts.toml")
metadata = manager.artifacts["MyEmulator"].metadata
print(metadata["description"]) # "Neural network emulator for cosmology"
Why fetch-artifacts?
The common ways of managing large datasets or model files in scientific computing each involve trade-offs:
- git-lfs: Expensive, coupled to git history, doesn't deduplicate across projects
- Direct downloads: No versioning, no automatic checksums, manual management
- fetch-artifacts: Content-addressable, automatic verification, global caching, platform-independent
Inspired by Julia's Pkg.Artifacts, fetch-artifacts brings the same robust workflow to Python.
Artifacts.toml Format
[ArtifactName]
git-tree-sha1 = "abc123..." # Content hash (required)
[[ArtifactName.download]]
url = "https://primary.com/data.tar.xz"
sha256 = "def456..."
[[ArtifactName.download]] # Optional fallback mirror
url = "https://mirror.com/data.tar.xz"
sha256 = "def456..."
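The fallback behaviour implied by multiple download entries can be sketched as follows: try each source in order and accept the first payload whose SHA256 matches. This is an illustrative outline using only the standard library, not the actual fetch-artifacts logic; `download_with_fallback`, `_http_get`, and the injectable `fetch` parameter are hypothetical names.

```python
import hashlib
import urllib.request
from typing import Callable, Iterable, Mapping


def _http_get(url: str) -> bytes:
    """Default fetcher: read the whole response body into memory."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_with_fallback(
    sources: Iterable[Mapping[str, str]],
    fetch: Callable[[str], bytes] = _http_get,
) -> bytes:
    """Return the first payload whose SHA256 matches its declared checksum."""
    errors = []
    for source in sources:
        url = source["url"]
        try:
            payload = fetch(url)
        except OSError as exc:  # urllib.error.URLError is an OSError subclass
            errors.append(f"{url}: {exc}")
            continue
        digest = hashlib.sha256(payload).hexdigest()
        if digest == source["sha256"].lower():
            return payload
        # A mismatch is treated like a failed download: move on to the next mirror.
        errors.append(f"{url}: checksum mismatch (got {digest})")
    raise RuntimeError("all download sources failed:\n" + "\n".join(errors))
```

Passing the list parsed from `[[ArtifactName.download]]` as `sources` would try the primary URL first and the mirror second. Making `fetch` a parameter keeps the function testable without network access.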
Development
git clone https://github.com/CosmologicalEmulators/fetch-artifacts.git
cd fetch-artifacts
poetry install
poetry run pytest tests/ -v --cov=fetch_artifacts
License
MIT License. See LICENSE for details.
Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
Links
- Documentation
- Issue Tracker
- PyPI Package
- Julia's Pkg.Artifacts
File details
Details for the file fetch_artifacts-0.1.0.tar.gz.
File metadata
- Download URL: fetch_artifacts-0.1.0.tar.gz
- Upload date:
- Size: 13.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8228063b82f5ddedfd4be6b65a37e182c583a03a94ca39729b37e3354fc36c2a |
| MD5 | 41170c5475338b96873e376955cdea80 |
| BLAKE2b-256 | e4deecc1d01e759df8e114b5ddae887ff924733e14d2a60e0300bebbc56b4728 |
Provenance
The following attestation bundles were made for fetch_artifacts-0.1.0.tar.gz:
Publisher: publish.yml on CosmologicalEmulators/fetch-artifacts
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fetch_artifacts-0.1.0.tar.gz
- Subject digest: 8228063b82f5ddedfd4be6b65a37e182c583a03a94ca39729b37e3354fc36c2a
- Sigstore transparency entry: 737550302
- Sigstore integration time:
- Permalink: CosmologicalEmulators/fetch-artifacts@9b2ec5a37c8033bebeb123811f79fdb92633268e
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/CosmologicalEmulators
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9b2ec5a37c8033bebeb123811f79fdb92633268e
- Trigger Event: release
File details
Details for the file fetch_artifacts-0.1.0-py3-none-any.whl.
File metadata
- Download URL: fetch_artifacts-0.1.0-py3-none-any.whl
- Upload date:
- Size: 14.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | faf5842a60029379ea9286699128d8c6550613fdbac4089604688ca4d717740e |
| MD5 | 324bec57a5d0d4ea5760ff9127c4ae65 |
| BLAKE2b-256 | ecef86ce17de83e1b8caa85e0c2802b940b6e1ef86fc333e69888ed8062addd9 |
Provenance
The following attestation bundles were made for fetch_artifacts-0.1.0-py3-none-any.whl:
Publisher: publish.yml on CosmologicalEmulators/fetch-artifacts
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fetch_artifacts-0.1.0-py3-none-any.whl
- Subject digest: faf5842a60029379ea9286699128d8c6550613fdbac4089604688ca4d717740e
- Sigstore transparency entry: 737550315
- Sigstore integration time:
- Permalink: CosmologicalEmulators/fetch-artifacts@9b2ec5a37c8033bebeb123811f79fdb92633268e
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/CosmologicalEmulators
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9b2ec5a37c8033bebeb123811f79fdb92633268e
- Trigger Event: release