Python Client for Crucible - National Archive for NSRC Observations

nano-crucible: National Archive for NSRC Observations

A Python client library and CLI tool for Crucible - the Molecular Foundry's data lakehouse for scientific research. Crucible stores experimental and synthetic data from DOE Nanoscale Science Research Centers (NSRCs), along with comprehensive metadata about samples, projects, instruments, and users.

🔬 What is Crucible?

Crucible is the centralized data infrastructure for the Molecular Foundry and other DOE Nanoscale Science Research Centers, providing:

  • Unified data storage for experimental and synthetic data
  • Rich metadata captured and associated with datasets
  • Sample provenance tracking with parent-child relationships
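
Parent-child provenance can be pictured as a small directed graph in which each link points from a sample to the sample it was derived from. The sketch below is purely illustrative and does not use the Crucible API: `ProvenanceGraph`, its methods, and the sample IDs are hypothetical names chosen for the example.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Toy in-memory model of parent-child sample links (hypothetical, not Crucible's API)."""

    def __init__(self):
        # Maps each child ID to the set of its direct parent IDs.
        self._parents = defaultdict(set)

    def link(self, parent_id, child_id):
        """Record that child_id was derived from parent_id."""
        self._parents[child_id].add(parent_id)

    def ancestors(self, sample_id):
        """Return every sample reachable by following parent links upward."""
        seen = set()
        stack = [sample_id]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


graph = ProvenanceGraph()
graph.link("wafer-001", "chip-001a")
graph.link("chip-001a", "flake-001a-3")
print(sorted(graph.ancestors("flake-001a-3")))
```

Walking the links upward from a derived sample recovers its full lineage, which is the kind of question provenance tracking is meant to answer.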

✨ Features

🐍 Python API

  • Dataset Management: Create, query, update, and download datasets
  • Sample Tracking: Manage samples with hierarchical relationships and provenance
  • Metadata: Store and retrieve scientific metadata and experimental parameters
  • Linking: Connect datasets, samples, and create relationships programmatically

🖥️ Command-Line Interface

  • crucible config: One-time setup and configuration management
  • crucible upload: Upload datasets with automatic parsing and metadata extraction
  • crucible open: Open resources in the Crucible Web Explorer with one command
  • crucible link: Create relationships between datasets and samples

📦 Installation

From PyPI (Recommended)

pip install nano-crucible

From GitHub (Latest Development)

pip install git+https://github.com/MolecularFoundryCrucible/nano-crucible

For Development

git clone https://github.com/MolecularFoundryCrucible/nano-crucible.git
cd nano-crucible
pip install -e .
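
To confirm that the install succeeded, one option is to query the installed distribution metadata from Python. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and it assumes the distribution is registered under the PyPI name `nano-crucible`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


found = installed_version("nano-crucible")
if found is None:
    print("nano-crucible is not installed in this environment")
else:
    print(f"nano-crucible {found} is installed")
```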

🚀 Quick Start

Python API

Creating and Uploading Datasets

from crucible import CrucibleClient, BaseDataset
from crucible.config import config

# Get client
client = config.client

# Method 1: Create dataset (no files)
dataset = client.create_new_dataset(
    unique_id="my-unique-dataset-id",  # Optional, auto-generated if None
    dataset_name="High-Temperature Synthesis",
    measurement="XRD",
    project_id="nanomaterials-2024",
    public=False,
    scientific_metadata={
        "temperature_C": 800,
        "pressure_bar": 1.0,
        "duration_hours": 12,
        "atmosphere": "nitrogen"
    },
    keywords=["synthesis", "high-temperature", "oxides"]
)

# Method 2: Upload dataset with files using BaseDataset
dataset = BaseDataset(
    unique_id="my-unique-dataset-id",  # Optional, auto-generated if None
    dataset_name="Electron Microscopy Images",
    measurement="TEM",
    project_id="nanomaterials-2024",
    public=False,
    instrument_name="TEM-2100",
    data_format="TIFF",
    file_to_upload="/path/to/image.tiff"
)

# Upload with metadata and files
result = client.create_new_dataset_from_files(
    dataset=dataset,
    scientific_metadata={
        "magnification": 50000,
        "voltage_kV": 200,
        "spot_size": 3
    },
    keywords=["TEM", "imaging", "nanoparticles"],
    files_to_upload=["/path/to/image.tiff", "/path/to/calibration.txt"],
    thumbnail="/path/to/thumbnail.png",  # Optional
    ingestor='ApiUploadIngestor',
    wait_for_ingestion_response=True
)

print(f"Dataset created: {result['created_record']['unique_id']}")
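
Before calling an upload method with a list such as `files_to_upload`, it can help to check that every path exists locally. The sketch below is a generic pre-flight check and is not part of nano-crucible; `missing_files` is a hypothetical helper, and whether the client performs a similar check itself is not stated here.

```python
from pathlib import Path


def missing_files(paths):
    """Return the subset of paths that do not point to an existing regular file."""
    return [str(p) for p in paths if not Path(p).is_file()]


# Placeholder paths, as in the upload example; replace with real files.
candidate_files = ["/path/to/image.tiff", "/path/to/calibration.txt"]
problems = missing_files(candidate_files)
if problems:
    print("Not uploading; missing files:", problems)
else:
    print("All files found; safe to upload")
```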

Linking Resources

# Link two datasets
client.link_datasets("parent-dataset", "child-dataset")
# Link two samples
client.link_samples("parent-sample", "child-sample")
# Link sample to dataset
client.add_sample_to_dataset("dataset-id", "sample-id")
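
When several links are always created together, the calls can be grouped in a small helper. The sketch below uses only the `link_datasets` and `add_sample_to_dataset` methods shown above, with the same positional argument order. `link_derived_dataset` and `RecordingClient` are hypothetical names; the stub stands in for a real client so the example runs without network access, and its parameter names are assumptions.

```python
class RecordingClient:
    """Stand-in that records calls instead of contacting Crucible (hypothetical)."""

    def __init__(self):
        self.calls = []

    def link_datasets(self, parent_id, child_id):
        self.calls.append(("link_datasets", parent_id, child_id))

    def add_sample_to_dataset(self, dataset_id, sample_id):
        self.calls.append(("add_sample_to_dataset", dataset_id, sample_id))


def link_derived_dataset(client, parent_id, child_id, sample_ids):
    """Link a child dataset to its parent, then attach each sample to the child."""
    client.link_datasets(parent_id, child_id)
    for sample_id in sample_ids:
        client.add_sample_to_dataset(child_id, sample_id)


stub = RecordingClient()
link_derived_dataset(stub, "raw-scan", "processed-scan", ["sample-a", "sample-b"])
for call in stub.calls:
    print(call)
```

Passing the client as an argument keeps the helper easy to test with a stub and lets the same function run against a real client later.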

Command-Line Interface

1. Initial Configuration

# One-time setup
crucible config init

# View your configuration
crucible config show

# Update settings
crucible config set api_key YOUR_NEW_KEY

Get your API key at: https://crucible.lbl.gov/api/v1/user_apikey

2. Upload Data with Parsers

# Upload with generic dataset
crucible upload -i data.txt -pid my-project \
    --metadata '{"temperature": 300, "pressure": 1.0}' \
    --keywords "experiment,test"

# Upload specific dataset (e.g. LAMMPS simulation)
# Works only if the parser exists
crucible upload -i simulation.lmp -t lammps -pid my-project
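
When uploads are scripted, building the command line programmatically avoids shell-quoting mistakes in the metadata argument. The sketch below only assembles and prints the command; it does not run it. `build_upload_command` is a hypothetical helper, and it assumes that `--metadata` accepts a JSON object string and that `--keywords` takes a comma-separated list, as the examples above suggest.

```python
import json
import shlex


def build_upload_command(input_path, project_id, metadata=None, keywords=None):
    """Assemble an argv list for `crucible upload` without running it."""
    argv = ["crucible", "upload", "-i", input_path, "-pid", project_id]
    if metadata:
        # Assumption: --metadata accepts a JSON object string.
        argv += ["--metadata", json.dumps(metadata)]
    if keywords:
        # Assumption: --keywords takes a comma-separated list.
        argv += ["--keywords", ",".join(keywords)]
    return argv


argv = build_upload_command(
    "data.txt",
    "my-project",
    metadata={"temperature": 300, "pressure": 1.0},
    keywords=["experiment", "test"],
)
# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

The resulting list can be handed to `subprocess.run(argv, check=True)` once the flags have been confirmed against `crucible upload --help`.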

3. Link Resources

# Link two datasets
crucible link -p parent_dataset_id -c child_dataset_id
# Link two samples
crucible link -p parent_sample_id -c child_sample_id
# Link sample to dataset
crucible link -d dataset_id -s sample_id

4. Open in Browser

# Open the Crucible Web Explorer
crucible open
# Open to a specific resource
crucible open RESOURCE_MFID

📖 Documentation

🤝 Contributing

We welcome contributions! Areas where you can help:

  • New parsers for additional data formats
  • Bug reports and feature requests
  • Documentation improvements
  • Example notebooks and tutorials

📄 License

This project is licensed under the BSD-3-Clause License - see the LICENSE file for details.

🔗 Links

💬 Support

For issues, questions, or feature requests:


nano-crucible is developed and maintained by the Data Group at the Molecular Foundry at Lawrence Berkeley National Laboratory.


