Package to support simplified application of machine learning models to datasets in materials science

Foundry-ML simplifies access to machine learning-ready datasets in materials science and chemistry.

  • Search & Load - Find and use curated datasets with a few lines of code
  • Understand - Rich schemas describe what each field means
  • Cite - Automatic citation generation for publications
  • Publish - Share your datasets with the community
  • AI-Ready - MCP server for Claude and other AI assistants

Quick Start

pip install foundry-ml

Need optional integrations? Install extras only when you need them:

pip install "foundry-ml[torch]"        # Enable dataset.get_as_torch()
pip install "foundry-ml[tensorflow]"   # Enable dataset.get_as_tensorflow()
pip install "foundry-ml[huggingface]"  # Enable push-to-hub
pip install "foundry-ml[excel]"        # Excel import support via openpyxl

PyTorch/TensorFlow extras expect wheels compiled against NumPy 2.0. Install PyTorch 2.3+ and TensorFlow 2.18+ (or newer builds with NumPy 2 support) to avoid ABI errors.
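
Before installing the extras, it can help to check whether the frameworks already present in the environment meet those minimums. The following is a minimal sketch using only the standard library. The helper names (`parse_release`, `meets_minimum`, `report`) are hypothetical and not part of Foundry-ML, and the distribution names `torch` and `tensorflow` are assumed to match how the frameworks were installed (builds published under other names would need to be added to the table).

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from the note above; NumPy 2.0 is the ABI target.
MINIMUMS = {"numpy": (2, 0), "torch": (2, 3), "tensorflow": (2, 18)}


def parse_release(version_string):
    """Return the leading numeric components, e.g. '2.3.1+cu121' -> (2, 3, 1).

    Pre-release and local suffixes (rc1, +cu121, ...) are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version_string, minimum):
    """Compare the numeric release against a minimum tuple such as (2, 3)."""
    release = parse_release(version_string)
    # Pad short versions with zeros so that '2' compares equal to (2, 0).
    padded = release + (0,) * (len(minimum) - len(release))
    return padded >= minimum


def report(minimums=MINIMUMS):
    """Map each distribution name to 'ok', 'too old', or 'not installed'."""
    status = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            status[name] = "not installed"
            continue
        status[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return status


if __name__ == "__main__":
    for name, state in report().items():
        print(f"{name}: {state}")
```

A "too old" result for `torch` or `tensorflow` alongside "ok" for `numpy` is the combination most likely to produce the ABI errors mentioned above.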

from foundry import Foundry

# Connect
f = Foundry()

# Search
results = f.search("band gap", limit=5)

# Load
dataset = results.iloc[0].FoundryDataset
X, y = dataset.get_as_dict()['train']

# Understand
schema = dataset.get_schema()
print(schema['fields'])

# Cite
print(dataset.get_citation())
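
The quick start indexes the first search hit directly. Assuming, as the `.iloc` access suggests, that `f.search` returns a pandas-style table, that lookup fails when a query matches nothing. A small guard along the following lines may make scripts easier to debug. The helpers `first_dataset` and `load_train_split` are hypothetical names introduced for illustration; only the Foundry calls already shown in the quick start are used, and the presence of a `'train'` split is an assumption carried over from it.

```python
def first_dataset(results):
    """Return the dataset attached to the first search hit, or None if there are no hits.

    `results` is assumed to behave like the table returned by f.search() in the
    quick start: it supports len() and .iloc[0], and each row exposes a
    FoundryDataset attribute.
    """
    if len(results) == 0:
        return None
    return results.iloc[0].FoundryDataset


def load_train_split(query, limit=5):
    """Search, take the first hit, and return its 'train' split as (X, y)."""
    # Imported lazily so first_dataset() can be used without foundry installed.
    from foundry import Foundry

    f = Foundry()
    dataset = first_dataset(f.search(query, limit=limit))
    if dataset is None:
        raise LookupError(f"No Foundry datasets matched {query!r}")
    # Assumes the dataset defines a 'train' split, as in the quick start.
    X, y = dataset.get_as_dict()["train"]
    return X, y
```

Calling `load_train_split("band gap")` would then either return the training data or raise a `LookupError` naming the query that found nothing.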

Cloud Environments

For Google Colab or remote Jupyter:

f = Foundry(no_browser=True, no_local_server=True)
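
For code that runs both locally and in hosted notebooks, the flags can be chosen at runtime. The sketch below is one possible heuristic, not something Foundry-ML provides: `foundry_auth_kwargs` is a hypothetical helper, Colab is detected by the `google.colab` module being loaded, and a remote session is guessed from the `SSH_CONNECTION` or `SSH_TTY` environment variables. Other hosted environments may need additional checks.

```python
import os
import sys


def foundry_auth_kwargs(modules=None, environ=None):
    """Pick Foundry() keyword arguments for the current environment.

    Heuristics (assumptions, not part of Foundry-ML):
      * Google Colab is detected by the 'google.colab' module being loaded.
      * A remote session is guessed from SSH_CONNECTION or SSH_TTY being set.
    """
    modules = sys.modules if modules is None else modules
    environ = os.environ if environ is None else environ

    in_colab = "google.colab" in modules
    over_ssh = "SSH_CONNECTION" in environ or "SSH_TTY" in environ
    if in_colab or over_ssh:
        # Same flags as the Colab / remote Jupyter example above.
        return {"no_browser": True, "no_local_server": True}
    # On a local machine, fall back to Foundry's defaults.
    return {}
```

The result can be unpacked into the constructor, for example `f = Foundry(**foundry_auth_kwargs())`. The `modules` and `environ` parameters exist only so the heuristic can be exercised without changing the real process state.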

CLI

foundry search "band gap"
foundry schema 10.18126/abc123
foundry --help
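
The same commands can be driven from a Python script when a workflow needs the CLI output. This is a rough sketch with hypothetical helper names (`foundry_command`, `run_foundry`). It relies only on the subcommands listed above and makes no assumption about the format of what the CLI prints.

```python
import shutil
import subprocess


def foundry_command(subcommand, *args):
    """Build an argument list such as ['foundry', 'search', 'band gap']."""
    return ["foundry", subcommand, *args]


def run_foundry(subcommand, *args):
    """Run the CLI and return its stdout, or None if `foundry` is not on PATH."""
    if shutil.which("foundry") is None:
        return None
    completed = subprocess.run(
        foundry_command(subcommand, *args),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

Passing the arguments as a list keeps a query such as `band gap` as a single argument without any shell quoting.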

AI Agent Integration

foundry mcp install  # Add to Claude Code

Documentation

Features

Feature                      Description
Search                       Find datasets by keyword, DOI, or browse catalog
Load                         Automatic download, caching, and format conversion
PyTorch/TensorFlow (extras)  dataset.get_as_torch(), dataset.get_as_tensorflow()
CLI                          Terminal-based workflows
MCP Server                   AI assistant integration
HuggingFace Export (extra)   Publish to HuggingFace Hub

Available Datasets

Browse datasets at Foundry-ML.org, or list them from Python:

f = Foundry()
f.list(limit=20)  # See available datasets

How to Cite

If you use Foundry-ML, please cite:

@article{Schmidt2024,
  doi = {10.21105/joss.05467},
  year = {2024},
  publisher = {The Open Journal},
  volume = {9},
  number = {93},
  pages = {5467},
  author = {Kj Schmidt and Aristana Scourtas and Logan Ward and others},
  title = {Foundry-ML - Software and Services to Simplify Access to Machine Learning Datasets in Materials Science},
  journal = {Journal of Open Source Software}
}

Contributing

Foundry is open source. To contribute:

  1. Fork from main
  2. Make your changes
  3. Open a Pull Request

See CONTRIBUTING.md for details.

Support

This work was supported by the National Science Foundation under NSF Award Number: 1931306 "Collaborative Research: Framework: Machine Learning Materials Innovation Infrastructure".

Foundry integrates with Materials Data Facility, FuncX, and MAST-ML.
