Python Client for Crucible - National Archive for NSRC Observations
nano-crucible: National Archive for NSRC Observations
A Python client library and CLI tool for Crucible - the Molecular Foundry's data lakehouse for scientific research. Crucible stores experimental and synthetic data from DOE Nanoscale Science Research Centers (NSRCs), along with comprehensive metadata about samples, projects, instruments, and users.
🔬 What is Crucible?
Crucible is the centralized data infrastructure for the Molecular Foundry and other DOE Nanoscale Science Research Centers, providing:
- Unified data storage for experimental and synthetic data
- Rich metadata capture to associate with datasets
- Sample provenance tracking with parent-child relationships
✨ Features
🐍 Python API
- Dataset Management: Create, query, update, and download datasets
- Sample Tracking: Manage samples with hierarchical relationships and provenance
- Metadata: Store and retrieve scientific metadata and experimental parameters
- Linking: Connect datasets and samples, and create relationships programmatically
🖥️ Command-Line Interface
- crucible config: One-time setup and configuration management
- crucible upload: Upload datasets with automatic parsing and metadata extraction
- crucible open: Open resources in the Crucible Web Explorer with one command
- crucible link: Create relationships between datasets and samples
📦 Installation
From PyPI (Recommended)
pip install nano-crucible
From GitHub (Latest Development)
pip install git+https://github.com/MolecularFoundryCrucible/nano-crucible
For Development
git clone https://github.com/MolecularFoundryCrucible/nano-crucible.git
cd nano-crucible
pip install -e .
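After any of the install routes above, it can be useful to confirm which version ended up in the active environment. The sketch below relies only on the standard library's importlib.metadata; the helper name installed_version is hypothetical and is not part of nano-crucible.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "nano-crucible") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found:
        print(f"nano-crucible version: {found}")
    else:
        print("nano-crucible is not installed in this environment")
```

Returning None instead of raising keeps the check usable in setup scripts that need to branch on whether the package is present.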
🚀 Quick Start
Python API
Creating and Uploading Datasets
from crucible import CrucibleClient, BaseDataset
from crucible.config import config
# Get client
client = config.client
# Method 1: Create dataset (no files)
dataset = client.create_new_dataset(
    unique_id="my-unique-dataset-id",  # Optional, auto-generated if None
    dataset_name="High-Temperature Synthesis",
    measurement="XRD",
    project_id="nanomaterials-2024",
    public=False,
    scientific_metadata={
        "temperature_C": 800,
        "pressure_bar": 1.0,
        "duration_hours": 12,
        "atmosphere": "nitrogen"
    },
    keywords=["synthesis", "high-temperature", "oxides"]
)
# Method 2: Upload dataset with files using BaseDataset
dataset = BaseDataset(
    unique_id="my-unique-dataset-id",  # Optional, auto-generated if None
    dataset_name="Electron Microscopy Images",
    measurement="TEM",
    project_id="nanomaterials-2024",
    public=False,
    instrument_name="TEM-2100",
    data_format="TIFF",
    file_to_upload="/path/to/image.tiff"
)
# Upload with metadata and files
result = client.create_new_dataset_from_files(
    dataset=dataset,
    scientific_metadata={
        "magnification": 50000,
        "voltage_kV": 200,
        "spot_size": 3
    },
    keywords=["TEM", "imaging", "nanoparticles"],
    files_to_upload=["/path/to/image.tiff", "/path/to/calibration.txt"],
    thumbnail="/path/to/thumbnail.png",  # Optional
    ingestor='ApiUploadIngestor',
    wait_for_ingestion_response=True
)
print(f"Dataset created: {result['created_record']['unique_id']}")
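Because scientific_metadata is passed as a plain dictionary, a quick local check before uploading can catch values that would not survive serialization. The sketch below assumes the metadata is ultimately sent as JSON, which this README does not state explicitly; non_serializable_keys is a hypothetical helper, not a nano-crucible function.

```python
import json
from typing import Any, Dict, List


def non_serializable_keys(metadata: Dict[str, Any]) -> List[str]:
    """Return the top-level keys whose values cannot be encoded as JSON."""
    bad = []
    for key, value in metadata.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            bad.append(key)
    return bad


# Same metadata as in the Method 1 example above.
metadata = {
    "temperature_C": 800,
    "pressure_bar": 1.0,
    "duration_hours": 12,
    "atmosphere": "nitrogen",
}
problems = non_serializable_keys(metadata)
if problems:
    print(f"Fix these keys before uploading: {problems}")
else:
    print("Metadata is JSON-serializable")
```

Values such as sets or open file handles are reported by key, so they can be converted (for example, a set to a sorted list) before calling the client.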
Linking Resources
# Link two datasets
client.link_datasets("parent-dataset", "child-dataset")
# Link two samples
client.link_samples("parent-sample", "child-sample")
# Link sample to dataset
client.add_sample_to_dataset("dataset-id", "sample-id")
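To make the parent-child idea behind these linking calls concrete, here is a toy in-memory model of what a set of links represents. It is purely illustrative: LinkGraph and its methods are hypothetical and say nothing about how Crucible stores or queries relationships on the server.

```python
from collections import defaultdict
from typing import Dict, List, Set


class LinkGraph:
    """Toy in-memory record of parent -> child links (not the Crucible backend)."""

    def __init__(self) -> None:
        self._children: Dict[str, Set[str]] = defaultdict(set)

    def link(self, parent_id: str, child_id: str) -> None:
        """Record that child_id was derived from parent_id."""
        self._children[parent_id].add(child_id)

    def descendants(self, root_id: str) -> List[str]:
        """Return every ID reachable from root_id, sorted for stable output."""
        seen: Set[str] = set()
        stack = [root_id]
        while stack:
            current = stack.pop()
            # .get avoids creating empty entries in the defaultdict.
            for child in self._children.get(current, ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return sorted(seen)


graph = LinkGraph()
graph.link("parent-sample", "child-sample")
graph.link("child-sample", "grandchild-sample")
print(graph.descendants("parent-sample"))
```

The traversal tracks visited IDs, so it terminates even if links accidentally form a cycle.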
Command-Line Interface
1. Initial Configuration
# One-time setup
crucible config init
# View your configuration
crucible config show
# Update settings
crucible config set api_key YOUR_NEW_KEY
Get your API key at: https://crucible.lbl.gov/api/v1/user_apikey
2. Upload Data with Parsers
# Upload as a generic dataset
crucible upload -i data.txt -pid my-project \
    --metadata '{"temperature": 300, "pressure": 1.0}' \
    --keywords "experiment,test"
# Upload specific dataset (e.g. LAMMPS simulation)
# Works only if the parser exists
crucible upload -i simulation.lmp -t lammps -pid my-project
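When uploads are scripted, building the command as an argument list avoids shell-quoting mistakes in the metadata string. The sketch below uses only the flags shown above (-i, -pid, --metadata, --keywords) and assumes that --metadata accepts a JSON object, which is not confirmed here; build_upload_command is a hypothetical helper, not part of the CLI.

```python
import json
import shlex
from typing import Any, Dict, List


def build_upload_command(input_path: str, project_id: str,
                         metadata: Dict[str, Any], keywords: List[str]) -> List[str]:
    """Assemble an argument list for `crucible upload` from the flags shown above."""
    return [
        "crucible", "upload",
        "-i", input_path,
        "-pid", project_id,
        # Assumption: the CLI parses this value as a JSON object.
        "--metadata", json.dumps(metadata),
        "--keywords", ",".join(keywords),
    ]


command = build_upload_command(
    "data.txt",
    "my-project",
    {"temperature": 300, "pressure": 1.0},
    ["experiment", "test"],
)

# Print a copy-pasteable, correctly quoted shell command.
print(shlex.join(command))
```

The list can be passed directly to subprocess.run(command, check=True) once the CLI is installed and configured, with no shell involved.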
3. Link Resources
# Link two datasets
crucible link -p parent_dataset_id -c child_dataset_id
# Link two samples
crucible link -p parent_sample_id -c child_sample_id
# Link sample to dataset
crucible link -d dataset_id -s sample_id
4. Open in Browser
# Open the Crucible Web Explorer
crucible open
# Open to a specific resource
crucible open RESOURCE_MFID
📖 Documentation
- CLI Documentation: See cli/README.md
- Parser Documentation: See parsers/README.md
- API Reference: Coming soon
🤝 Contributing
We welcome contributions! Areas where you can help:
- New parsers for additional data formats
- Bug reports and feature requests
- Documentation improvements
- Example notebooks and tutorials
📄 License
This project is licensed under the BSD-3-Clause License - see the LICENSE file for details.
🔗 Links
- Crucible API: https://crucible.lbl.gov/api/v1
- Crucible Web Interface: https://crucible-graph-explorer.run.app
💬 Support
For issues, questions, or feature requests:
- GitHub Issues: https://github.com/MolecularFoundryCrucible/nano-crucible/issues
- Email: mkwall@lbl.gov, roncoroni@lbl.gov, esbarnard@lbl.gov
nano-crucible is developed and maintained by the Data Group at the Molecular Foundry at Lawrence Berkeley National Laboratory.