Crystalyse v1.0 - Intelligent Scientific AI Agent for Inorganic Materials Design
Version 1.0.0. A research tool that combines composition validation (SMACT), structure prediction (Chemeleon), energy calculations (MACE), and materials analysis (PyMatGen).
What's New in v1.0
Version 1.0 includes significant architectural changes focused on computational traceability and simplified setup:
Key Features
Provenance System
- Every numerical value traces to actual calculations
- Render gate prevents unprovenanced values from being displayed
- Full audit trail of all operations
- No fabricated or hallucinated numbers
Automated Setup
- Chemeleon checkpoints auto-download on first use (~600 MB)
- Materials Project phase diagrams auto-download (~170 MB, 271,617 entries)
- Files cached in ~/.cache/crystalyse/
- Environment variables are optional
PyMatGen Integration
- Energy above hull calculations using 271,617 Materials Project entries
- Phase diagram construction
- Decomposition product analysis
- Stability assessment (stable/metastable/unstable)
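To make the energy-above-hull idea concrete, here is a minimal pure-Python sketch for a binary A-B system. It is not the PyMatGen implementation that Crystalyse calls; the function names and the `metastable_limit` threshold of 0.1 eV/atom are assumptions chosen for illustration, since this README does not state which cutoffs separate stable, metastable, and unstable.

```python
from typing import Dict, List, Tuple

# (x, e): x is the fraction of B in A(1-x)B(x), e the formation energy in eV/atom.
Point = Tuple[float, float]


def lower_hull(entries: List[Point]) -> List[Point]:
    """Return the lower convex hull of (composition, energy) points."""
    # Keep only the lowest-energy entry at each composition.
    best: Dict[float, float] = {}
    for x, e in entries:
        best[x] = min(e, best.get(x, e))
    hull: List[Point] = []
    for p in sorted(best.items()):
        # Drop the previous vertex while it lies on or above the line to p.
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross > 0:
                break
            hull.pop()
        hull.append(p)
    return hull


def energy_above_hull(x: float, energy: float, entries: List[Point]) -> float:
    """Energy (eV/atom) of a candidate above the hull built from `entries`."""
    hull = lower_hull(entries)
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            hull_energy = e1 + (e2 - e1) * (x - x1) / (x2 - x1)
            return max(0.0, energy - hull_energy)
    raise ValueError("composition lies outside the range covered by the entries")


def classify(e_hull: float, metastable_limit: float = 0.1) -> str:
    """Label a phase; the 0.1 eV/atom default is an illustrative assumption."""
    if e_hull <= 1e-8:
        return "stable"
    return "metastable" if e_hull <= metastable_limit else "unstable"
```

The reference entries must include the two end members (x = 0 and x = 1) so that every intermediate composition falls on a hull segment. For real multi-component systems, PyMatGen's phase diagram tools perform the same construction in higher dimensions.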
Adaptive Clarification System
- Learns user expertise level over time
- Adjusts question complexity accordingly
- Fewer interruptions for experienced users
- More guidance for new users
Analysis Modes
- Creative Mode: Fast exploration (~50s) with Chemeleon + MACE
- Rigorous Mode: Full validation (2-5min) with SMACT + Chemeleon + MACE + analysis
- Adaptive Mode: Automatic mode selection based on query complexity
Simplified Architecture
- Modular tool implementations
- Python packaging for MCP servers
- Cleaner separation of concerns
Quick Start
Installation
# Install from PyPI (includes all dependencies and MCP servers)
pip install crystalyse
# Optional: visualization tools
pip install crystalyse[visualization]
# Set your OpenAI API key
export OPENAI_MDG_API_KEY="sk-your-key-here"
# Verify installation
crystalyse --help
First Run
On first run, CrystaLyse will automatically download:
- Chemeleon model checkpoints (~600 MB, one-time)
- Materials Project phase diagrams (~170 MB, one-time)
These are cached in ~/.cache/crystalyse/ and never downloaded again.
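The download-once behaviour can be pictured with the following sketch. It is an assumption about the general pattern, not Crystalyse's actual code: `ensure_cached` and the `fetch` callable are hypothetical names, and only the cache directory comes from this README.

```python
from pathlib import Path
from typing import Callable

# Cache location named in this README.
CACHE_DIR = Path.home() / ".cache" / "crystalyse"


def ensure_cached(
    filename: str,
    fetch: Callable[[Path], None],
    cache_dir: Path = CACHE_DIR,
) -> Path:
    """Return the cached file, calling `fetch(path)` only if it is missing."""
    target = cache_dir / filename
    if target.exists():
        return target  # already downloaded on an earlier run
    cache_dir.mkdir(parents=True, exist_ok=True)
    partial = target.with_name(target.name + ".part")
    fetch(partial)           # hypothetical downloader writes to a temporary file
    partial.replace(target)  # rename only after the download has completed
    return target
```

Writing to a `.part` file and renaming afterwards means an interrupted download is not mistaken for a complete one on the next run.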
Basic Usage
# Creative mode (fast exploration)
crystalyse analyse "Find stable perovskite materials for solar cells" --mode creative
# Rigorous mode (complete validation)
crystalyse analyse "Analyze LiCoO2 for battery applications" --mode rigorous
# Interactive session with memory
crystalyse chat -u researcher -s battery_project
# Resume previous session
crystalyse resume battery_project -u researcher
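For screening several queries from a script, one option is to drive the CLI through `subprocess`. The sketch below uses only the `analyse` subcommand and `--mode` flag shown above; the helper names are hypothetical, and accepting `adaptive` as a `--mode` value is an assumption based on the list of analysis modes rather than on a documented example.

```python
import subprocess
from typing import List, Sequence

# "adaptive" as a CLI value is an assumption; only creative and rigorous
# appear in the usage examples.
KNOWN_MODES = {"creative", "rigorous", "adaptive"}


def build_command(query: str, mode: str = "creative") -> List[str]:
    """Build the argument list for one `crystalyse analyse` call."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return ["crystalyse", "analyse", query, "--mode", mode]


def run_queries(
    queries: Sequence[str], mode: str = "creative"
) -> List[subprocess.CompletedProcess]:
    """Run each query in turn and collect the completed processes."""
    return [
        subprocess.run(build_command(q, mode), capture_output=True, text=True, check=False)
        for q in queries
    ]
```

Passing the arguments as a list avoids shell quoting problems with queries that contain spaces or quotes.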
Scientific Capabilities
Materials Discovery Pipeline
- Composition Validation (SMACT)
  - Chemical plausibility screening
  - Charge balancing
  - Electronegativity balance
  - Quick filtering of impossible compositions
- Structure Prediction (Chemeleon)
  - AI-powered crystal structure generation
  - Multiple polymorph candidates
  - Confidence scores for each structure
  - Text-guided generation
- Energy Calculations (MACE)
  - Formation energy evaluation
  - Structure relaxation
  - Uncertainty quantification
  - GPU-accelerated calculations
- Stability Analysis (PyMatGen)
  - Energy above hull calculations
  - Phase diagram construction
  - Decomposition products
  - Competing phases analysis
- Visualization (Optional)
  - 3D crystal structures
  - XRD patterns
  - Radial distribution functions
  - Coordination environment analysis
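The charge-balancing step of composition validation can be sketched in a few lines. This is not SMACT's API: the `OXIDATION_STATES` table is a tiny illustrative subset and `charge_balanced_states` is a hypothetical helper. It shows only the charge-neutrality test, not the electronegativity check.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

# Illustrative oxidation states only; SMACT ships far more complete tables.
OXIDATION_STATES: Dict[str, List[int]] = {
    "Li": [1],
    "Na": [1],
    "Ba": [2],
    "Co": [2, 3, 4],
    "Ti": [2, 3, 4],
    "O": [-2],
    "Cl": [-1],
}


def charge_balanced_states(composition: Dict[str, int]) -> Optional[Tuple[int, ...]]:
    """Return the first oxidation-state assignment with zero net charge.

    `composition` maps element symbols to stoichiometric counts,
    e.g. {"Li": 1, "Co": 1, "O": 2}. Returns None if nothing balances.
    """
    elements = list(composition)
    for states in product(*(OXIDATION_STATES[el] for el in elements)):
        net_charge = sum(q * composition[el] for q, el in zip(states, elements))
        if net_charge == 0:
            return states
    return None
```

For example, `charge_balanced_states({"Li": 1, "Co": 1, "O": 2})` finds Li(+1), Co(+3), O(-2), whereas a composition such as NaCl2 has no neutral assignment and would be filtered out before any structure prediction is attempted.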
Research Applications
Energy Materials
- Battery cathodes and anodes (Li-ion, Na-ion, solid-state)
- Solid electrolytes and fast ion conductors
- Photovoltaic semiconductors and perovskites
- Thermoelectric materials
Electronic Materials
- Ferroelectric and multiferroic materials
- Magnetic materials and spintronics
- Semiconductor devices
- Quantum materials
Structural Materials
- High-temperature ceramics
- Hard coatings and superhard materials
- Transparent conductors
Performance
| Operation | Creative Mode | Rigorous Mode |
|---|---|---|
| Simple query (single material) | ~50 seconds | 2-3 minutes |
| Complex analysis (multiple candidates) | 1-2 minutes | 3-5 minutes |
| Batch screening (10+ materials) | 5-10 minutes | 15-30 minutes |
Advanced Usage
Interactive Sessions
# Start research session
crystalyse chat -u researcher -s solar_cells -m creative
# In-session commands
/mode rigorous # Switch to rigorous mode
/mode creative # Switch to creative mode
/help # Show commands
/exit # Exit session
Custom Data Paths
# Use custom checkpoint directory
export CHEMELEON_CHECKPOINT_DIR=/path/to/checkpoints
crystalyse analyse "BaTiO3"
# Use custom phase diagram data
export CRYSTALYSE_PPD_PATH=/path/to/ppd.pkl.gz
crystalyse analyse "LiCoO2"
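The override behaviour implied by these variables (use the environment variable if set, otherwise fall back to the cache) might look like the sketch below. `resolve_data_path` is a hypothetical helper, and the default file name passed to it is an assumption, since this README does not say what the cached files are called.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default cache location named in this README.
DEFAULT_CACHE = Path.home() / ".cache" / "crystalyse"


def resolve_data_path(
    env_var: str,
    default_name: str,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Prefer the path in `env_var`; otherwise use the default cache location."""
    env = os.environ if env is None else env
    override = env.get(env_var)
    if override:
        return Path(override).expanduser()
    # `default_name` is a placeholder; the real cached file name may differ.
    return DEFAULT_CACHE / default_name
```

Accepting the environment as a parameter keeps the function easy to test without modifying the real process environment.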
Programmatic API
from crystalyse.agents import EnhancedCrystaLyseAgent
from crystalyse.config import CrystaLyseConfig
# Initialize agent
config = CrystaLyseConfig()
agent = EnhancedCrystaLyseAgent(config=config, mode="rigorous")
# Run analysis
result = agent.query(
    "Analyze CsSnI3 perovskite for photovoltaic applications",
    user_id="researcher",
)
print(result.response)
Computational Honesty
Crystalyse implements a provenance system to track the origin of all computed values:
- Traceability: Every numerical value traces to actual tool calculations
- Render Gate: Blocks unprovenanced values from being displayed
- Audit Trail: Full JSONL logs of all operations
- Uncertainty: Predictions include confidence scores when available
If Crystalyse reports a formation energy, it was calculated by MACE. If it reports an energy above hull, it came from PyMatGen with Materials Project data. The LLM interprets results but doesn't fabricate numbers.
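As a rough illustration of how a render gate could work, the sketch below registers each computed value with its source tool and refuses to render text containing a decimal number that was never registered. The class and field names are hypothetical and this is not Crystalyse's implementation. For simplicity it checks only numbers written with a decimal point, so digits inside formulas such as LiCoO2 are ignored.

```python
import json
import re
from dataclasses import asdict, dataclass
from typing import List

# Decimal numbers not glued to a preceding letter, digit, or dot.
DECIMAL = re.compile(r"(?<![\w.])-?\d+\.\d+")


@dataclass(frozen=True)
class ProvenanceRecord:
    value: float
    unit: str
    tool: str      # e.g. "MACE" or "PyMatGen"
    call_id: str   # hypothetical identifier of the tool call


class RenderGate:
    """Allow text through only if every decimal number has a provenance record."""

    def __init__(self) -> None:
        self._records: List[ProvenanceRecord] = []

    def register(self, record: ProvenanceRecord) -> None:
        self._records.append(record)

    def audit_log(self) -> str:
        """Serialize all records as JSONL, one JSON object per line."""
        return "\n".join(json.dumps(asdict(r)) for r in self._records)

    def render(self, text: str) -> str:
        for token in DECIMAL.findall(text):
            value = float(token)
            if not any(abs(value - r.value) < 1e-6 for r in self._records):
                raise ValueError(f"unprovenanced value: {token}")
        return text
```

A production gate would also need to handle rounding, unit conversions, and integers, which this sketch deliberately leaves out.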
Migration from crystalyse-ai
If you're upgrading from the old crystalyse-ai package:
Breaking Changes
- Package name: pip install crystalyse (not crystalyse-ai)
- Import name: from crystalyse import ... (not from crystalyse_ai import ...)
- MCP servers: Now bundled automatically (no separate installation)
- Auto-download: Checkpoints and data files download automatically
- Environment variables: Custom checkpoint/data paths are now optional
Migration Steps
# Uninstall old version
pip uninstall crystalyse-ai
# Install new version
pip install crystalyse
# Update imports in your code
# OLD: from crystalyse_ai.agents import ...
# NEW: from crystalyse.agents import ...
# Remove manual checkpoint/data management
# Everything auto-downloads now
What Stays the Same
- CLI interface (mostly compatible)
- Core workflow concepts
- Analysis modes (creative/rigorous/adaptive)
- Session management
System Requirements
- Python: 3.11 or higher
- Memory: 8GB minimum, 16GB recommended
- Disk: 2GB for checkpoints and cache
- GPU: Optional, accelerates MACE calculations
- Network: Required for first-time setup (auto-downloads)
Documentation
- Installation Guide
- User Guide
- API Documentation
- Provenance System
- CLAUDE.md - Developer guide
Acknowledgments
Crystalyse builds on open-source tools from the materials science community:
- SMACT - Semiconducting Materials by Analogy and Chemical Theory
- Chemeleon - AI-powered crystal structure prediction
- MACE - Machine learning ACE force fields
- PyMatGen - Python Materials Genomics
- Pymatviz - Materials visualization toolkit
- OpenAI Agents SDK - Agent orchestration framework
Citation
If you use CrystaLyse in your research, please cite the underlying tools:
@article{davies2019smact,
title={SMACT: Semiconducting Materials by Analogy and Chemical Theory},
author={Davies, Daniel W and Butler, Keith T and Jackson, Adam J and others},
journal={Journal of Open Source Software},
volume={4},
number={38},
pages={1361},
year={2019}
}
@article{park2025chemeleon,
title={Exploration of crystal chemical space using text-guided generative artificial intelligence},
author={Park, Hyunsoo and others},
journal={Nature Communications},
year={2025}
}
@article{batatia2022mace,
title={MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields},
author={Batatia, Ilyes and others},
journal={NeurIPS},
year={2022}
}
License
MIT License - see LICENSE for details.
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: ryannduma@gmail.com
Changelog
See CHANGELOG.md for version history.
Made with computational honesty by Ryan Nduma