Bridge from classical materials databases to first-quantized Hamiltonians for quantum simulation

QMatBridge

A Python library for bridging classical materials databases to first-quantized Hamiltonians for fault-tolerant quantum simulation.

License: MIT | Python: 3.10+

Status: early-stage, active development — schema is stabilizing, adapters are stubs, exporters are planned. The core NIR and I/O layer are functional. Breaking changes to QMatEntry before v1.0 will be announced in CHANGELOG.md and accompanied by migration notes.


Why QMatBridge exists

Research groups developing fault-tolerant Hamiltonian-simulation algorithms — block-encodings, qubitization variants, truncated Dyson series methods — need realistic electronic-structure Hamiltonians to test and benchmark their primitives. The standard path is to write one-off scripts that pull DFT data from the Materials Project or similar databases, reformat it, and feed it into a quantum compiler.

These scripts are rarely shared, seldom work with more than one quantum framework, and often omit the provenance information needed to reproduce or compare results. When two groups report T-gate counts for "silicon", there is no standard way to verify that they used the same Hamiltonian.

QMatBridge addresses this by providing a neutral intermediate representation (NIR) that any upstream database adapter can write to and any downstream quantum compiler or resource estimator can read from, with full provenance baked in.


Core idea

  Materials Project  ──┐
  OQMD               ──┤  adapters  ──▶  QMatEntry (NIR)  ──▶  exporters  ──▶  OpenFermion
  OPTIMADE sources   ──┤                                                   ──▶  qualtran
  Alexandria         ──┘                                                   ──▶  pyLIQTR
                                                                           ──▶  Qiskit

A QMatEntry records:

  • where the material came from (SourceProvenance — database, functional, pseudopotential, code version, retrieval timestamp)
  • what it looks like (StructureMetadata — formula, lattice, spacegroup)
  • how the Hamiltonian is constructed (BasisMetadata — plane-wave cutoff, basis size)
  • what the oracle costs (OracleMetadata — LCU 1-norm λ, term breakdown, SELECT/PREPARE circuit parameters, complexity annotations)
  • where it has been exported (ExportMetadata — framework, format, status, artifact path)

Every entry carries a deterministic canonical_hash() over its physically meaningful fields so that benchmark results can be traced to a specific Hamiltonian without re-deriving it.
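
The idea of a deterministic hash over selected fields can be sketched with the standard library alone. This is a minimal illustration, not the implementation behind QMatEntry.canonical_hash(): ToyEntry, HASHED_FIELDS, and toy_canonical_hash are hypothetical names, and the choice of which fields count as "physically meaningful" is an assumption made for the example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToyEntry:
    """Hypothetical stand-in for QMatEntry; not the library's schema."""
    formula: str
    source: str
    identifier: str
    functional: str
    cutoff_energy_ev: float
    tags: tuple = ()  # bookkeeping only, deliberately left out of the hash


# Assumption: these are the fields treated as physically meaningful.
HASHED_FIELDS = ("formula", "source", "identifier", "functional", "cutoff_energy_ev")


def toy_canonical_hash(entry: ToyEntry) -> str:
    data = asdict(entry)
    payload = {name: data[name] for name in HASHED_FIELDS}
    # Sorted keys and fixed separators make the serialized bytes independent
    # of field ordering and whitespace, so equal content gives an equal digest.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = ToyEntry("Si", "materials_project", "mp-149", "PBE", 520.0, tags=("benchmark",))
b = ToyEntry("Si", "materials_project", "mp-149", "PBE", 520.0, tags=("silicon",))
c = ToyEntry("Si", "materials_project", "mp-149", "PBE", 600.0)

print(toy_canonical_hash(a) == toy_canonical_hash(b))  # tags differ, hash does not
print(toy_canonical_hash(a) == toy_canonical_hash(c))  # cutoff differs, hash changes
```

Excluding bookkeeping fields such as tags is what lets two groups annotate the same Hamiltonian differently and still arrive at the same digest.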


Who it is for

QMatBridge is written for:

  • Algorithm researchers implementing block-encodings, qubitization circuits, LCU decompositions, and related fault-tolerant primitives who need real materials data without building a database-integration layer themselves.
  • Research software engineers building quantum-chemistry / quantum-computing pipelines who need a stable, citable intermediate format between the DFT world and the circuit world.
  • Resource estimation groups running systematic T-count and qubit studies across a library of materials, who need reproducible, provenance-rich Hamiltonian records.

If you are looking for a general materials informatics library, use pymatgen or ASE. QMatBridge is intentionally narrow: it is a bridge, not a database.


Current scope

In scope (v0.1):

  • Periodic bulk materials in a plane-wave basis
  • First-quantized Hamiltonian representation (T + U + V in reciprocal space)
  • LCU / qubitization oracle metadata
  • JSON serialization with canonical entry hashes
  • Adapter stubs for Materials Project and OQMD
  • Zero required dependencies in the core schema

Out of scope (for now):

  • Gaussian / LCAO / real-space basis sets
  • Molecular (non-periodic) systems
  • Excited-state or TDDFT Hamiltonians
  • Live API calls (adapters are stubs until v0.2)
  • Exporter implementations (planned v0.4)

Design principles

Provenance-first. Every QMatEntry records the exact upstream source, DFT functional, pseudopotential family, and retrieval timestamp. The canonical_hash() method produces a stable SHA-256 digest over the physically meaningful fields so that published results can be precisely reproduced.

Open and interoperable. The core schema has zero required dependencies. Adapters for upstream databases and exporters for downstream frameworks are optional extras — install only what you need. The NIR is designed to be targeted by any adapter or consumed by any exporter without coupling the two.
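
One way to picture the decoupling described above is with structural typing: adapters and exporters each depend only on the shared entry type, never on each other. The sketch below uses hypothetical names throughout (ToyEntry, Adapter, Exporter, FakeDatabaseAdapter, TextExporter, bridge) and is not QMatBridge's actual interface.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ToyEntry:
    """Hypothetical stand-in for the neutral intermediate representation."""
    formula: str
    source: str


class Adapter(Protocol):
    """Anything that can produce an entry from an upstream identifier."""
    def fetch(self, identifier: str) -> ToyEntry: ...


class Exporter(Protocol):
    """Anything that can turn an entry into a downstream artifact."""
    def export(self, entry: ToyEntry) -> str: ...


class FakeDatabaseAdapter:
    # Returns a canned entry; a real adapter would query a database here.
    def fetch(self, identifier: str) -> ToyEntry:
        return ToyEntry(formula="Si", source=f"fake_db:{identifier}")


class TextExporter:
    # Emits a plain string; a real exporter would build a framework object.
    def export(self, entry: ToyEntry) -> str:
        return f"{entry.formula} from {entry.source}"


def bridge(adapter: Adapter, exporter: Exporter, identifier: str) -> str:
    # The only shared contract between the two sides is ToyEntry.
    return exporter.export(adapter.fetch(identifier))


result = bridge(FakeDatabaseAdapter(), TextExporter(), "demo-1")
print(result)  # Si from fake_db:demo-1
```

Because neither class imports the other, each could live behind its own optional extra without affecting the core schema.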

Benchmark-oriented. Field names, norm conventions, and oracle metadata types are chosen to align with the quantities reported in fault-tolerant resource-estimation literature (LCU 1-norm λ, plane-wave count N, rotation-precision bits b_r). The library should make it easy to reproduce or extend a published benchmark, not just run a new one.

Representation-aware. OracleMetadata distinguishes between LCU, qubitization, sparse-access, and tensor-hypercontraction decompositions. TermMetadata stores per-physical-term norms so that methods that weight T, U, and V differently can extract exactly the quantities they need.
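
As a rough illustration of why per-term norms are useful: in an LCU decomposition of H = T + U + V, the total 1-norm is the sum of the per-term 1-norms, and the relative weights determine how often each term is selected. The sketch below assumes a hypothetical ToyTerm record (not the library's TermMetadata), and the numbers are placeholders rather than values for any real material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTerm:
    """Hypothetical per-term record; not the library's TermMetadata."""
    label: str       # "T", "U", or "V"
    one_norm: float  # sum of |coefficients| in this term's LCU decomposition


def total_lambda(terms):
    # The LCU 1-norm of a sum of terms is the sum of the per-term 1-norms.
    return sum(term.one_norm for term in terms)


def selection_weights(terms):
    # Fraction of the total 1-norm carried by each physical term.
    lam = total_lambda(terms)
    return {term.label: term.one_norm / lam for term in terms}


# Placeholder values chosen only to make the arithmetic easy to follow.
terms = [ToyTerm("T", 2.0), ToyTerm("U", 5.0), ToyTerm("V", 3.0)]

print(total_lambda(terms))       # 10.0
print(selection_weights(terms))  # {'T': 0.2, 'U': 0.5, 'V': 0.3}
```

A method that treats the kinetic term separately can read the T norm on its own instead of re-deriving it from the full Hamiltonian.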

Community-driven. The schema is a shared contract. Changes to QMatEntry, MaterialReference, or HamiltonianMetadata require a public discussion period and a minor-version bump. The governance model is documented in GOVERNANCE.md; design discussions happen in GitHub Discussions before any code is written.


Getting started

Installation

# from source (recommended until PyPI publication)
git clone https://github.com/rmsreis/qmatbridge.git
cd qmatbridge
pip install -e ".[dev]"

Optional extras:

pip install -e ".[mp]"           # Materials Project adapter (mp-api, pymatgen)
pip install -e ".[oqmd]"         # OQMD adapter (qmpy-rester)
pip install -e ".[openfermion]"  # OpenFermion exporter

Minimal example

from qmatbridge.schema import (
    ExternalIdentifier, SourceProvenance,
    LatticeMetadata, StructureMetadata,
    BasisMetadata, HamiltonianMetadata,
    MaterialReference, QMatEntry,
)
from qmatbridge.io import write_entry_json

provenance = SourceProvenance(
    primary=ExternalIdentifier(
        source="materials_project", identifier="mp-149"
    ),
    functional="PBE",
    pseudopotential="PAW_PBE",
    code="VASP",
)

structure = StructureMetadata(
    formula_reduced="Si",
    formula_unit_cell="Si2",
    num_sites=2,
    species=["Si", "Si"],
    lattice=LatticeMetadata(
        a=3.867, b=3.867, c=3.867,
        alpha=60.0, beta=60.0, gamma=60.0,
        spacegroup_number=227, spacegroup_symbol="Fd-3m",
    ),
)

hamiltonian = HamiltonianMetadata(
    num_electrons=8,
    spin_polarized=False,
    basis=BasisMetadata(type="plane_wave", cutoff_energy_ev=520.0),
    num_bands=16,
)

entry = QMatEntry(
    reference=MaterialReference(provenance=provenance, structure=structure),
    hamiltonian=hamiltonian,
    tags=["silicon", "benchmark"],
)

print(entry)                      # QMatEntry(formula='Si', source='materials_project', ...)
print(entry.canonical_hash())     # deterministic SHA-256

write_entry_json(entry, "silicon.json")   # full nested JSON, 2-space indent

See examples/minimal_entry.py for a fully populated silicon entry including oracle metadata and term breakdown.


Repository layout

qmatbridge/
├── qmatbridge/
│   ├── schema.py                  # Neutral intermediate representation (NIR)
│   ├── io.py                      # JSON serialization utilities
│   └── adapters/
│       ├── materials_project.py   # Materials Project adapter stub (v0.2 target)
│       └── oqmd.py                # OQMD adapter stub (v0.3 target)
├── docs/
│   ├── vision.md                  # Design rationale and architectural constraints
│   ├── adapters.md                # Upstream adapter documentation
│   └── oracle-model.md            # Oracle abstraction and export model
├── examples/
│   └── minimal_entry.py           # Silicon QMatEntry with full oracle metadata
├── benchmarks/
│   └── README.md                  # Benchmark methodology and Tier-1 material plan
├── planning/
│   └── initial_issues.md          # Scoped GitHub issue set for v0.1–v0.4
└── .github/workflows/
    └── python-package.yml         # CI: lint (ruff + mypy) + tests on 3.10–3.12

Roadmap

v0.1 — Schema and scaffold (current)

  • Core dataclasses: MaterialReference, HamiltonianMetadata, QMatEntry
  • Oracle and term metadata: OracleMetadata, TermMetadata, ExportMetadata
  • JSON serialization via qmatbridge.io
  • Canonical entry hash (QMatEntry.canonical_hash())
  • Adapter stubs: Materials Project, OQMD
  • CI: ruff, mypy, pytest on Python 3.10–3.12
  • JSON round-trip regression fixtures
  • Oracle convention vocabulary (oracle_type, index_encoding)
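
A JSON round-trip regression check of the kind listed above might look like the following sketch. It uses hypothetical stand-in dataclasses (ToyBasis, ToyEntry) and hand-written to_json / from_json helpers rather than the functions in qmatbridge.io, whose reader API is not shown in this README.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToyBasis:
    type: str
    cutoff_energy_ev: float


@dataclass(frozen=True)
class ToyEntry:
    """Hypothetical nested record; not the library's QMatEntry."""
    formula: str
    basis: ToyBasis


def to_json(entry: ToyEntry) -> str:
    # asdict() recurses into nested dataclasses; sorted keys keep the
    # output text stable between runs.
    return json.dumps(asdict(entry), indent=2, sort_keys=True)


def from_json(text: str) -> ToyEntry:
    raw = json.loads(text)
    # Nested objects must be rebuilt explicitly; json only yields dicts.
    return ToyEntry(formula=raw["formula"], basis=ToyBasis(**raw["basis"]))


original = ToyEntry("Si", ToyBasis("plane_wave", 520.0))
text = to_json(original)
restored = from_json(text)

print(restored == original)       # object-level round trip
print(to_json(restored) == text)  # text-level round trip
```

Checking both the object and the serialized text catches two different failure modes: lost fields on read, and unstable formatting on write.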

v0.2 — Materials Project live integration

  • fetch_structure_metadata_from_mp and fetch_hamiltonian_metadata_from_mp
  • Plane-wave count utility (num_plane_waves_from_ecut)
  • Tier-1 benchmark material fixtures (Si, LiH, Fe, MgO, TiO₂)
  • MkDocs documentation site on GitHub Pages
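
For orientation, the number of plane waves below a kinetic-energy cutoff can be estimated from the cell volume alone by counting reciprocal-lattice points inside a sphere in the continuum limit, N ≈ Ω k_cut³ / (6π²) with ħ² k_cut² / (2mₑ) = E_cut. The sketch below is an assumption-laden estimate, not the planned num_plane_waves_from_ecut utility; the function names are hypothetical, and an exact count would enumerate G-vectors instead.

```python
import math

# hbar^2 / (2 m_e) expressed in eV * Angstrom^2.
HBAR2_OVER_2M_EV_A2 = 3.80998


def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume (Angstrom^3) of a cell given lengths in Angstrom and angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def estimate_num_plane_waves(volume_a3, cutoff_energy_ev):
    """Continuum estimate of the number of G-vectors with kinetic energy <= cutoff."""
    k_cut = math.sqrt(cutoff_energy_ev / HBAR2_OVER_2M_EV_A2)  # Angstrom^-1
    # Sphere volume (4/3) pi k^3 times the reciprocal-point density V / (2 pi)^3.
    return round(volume_a3 * k_cut**3 / (6 * math.pi**2))


# Lattice parameters and cutoff taken from the minimal example in this README.
volume = cell_volume(3.867, 3.867, 3.867, 60.0, 60.0, 60.0)
n_pw = estimate_num_plane_waves(volume, 520.0)
print(f"cell volume = {volume:.2f} A^3, estimated plane waves = {n_pw}")
```

The estimate ignores the discreteness of the reciprocal lattice, so it is best read as an order-of-magnitude check on a basis size reported elsewhere.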

v0.3 — OPTIMADE, OQMD, and Alexandria adapters

  • Generic OPTIMADE adapter (AFLOW, JARVIS, NOMAD, MC3D)
  • OQMD live fetch
  • Alexandria adapter (PBEsol / HSE06 / r²SCAN, ~4.5 M structures)
  • Cross-database deduplication via shared ICSD numbers
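
Deduplication through a shared identifier amounts to grouping records by that identifier and keeping the groups that span more than one database. The following sketch assumes a hypothetical record layout (source, identifier, icsd_ids); the database names and ID values are made-up placeholders, not real entries.

```python
from collections import defaultdict

# Made-up placeholder records; field names are assumptions for this sketch.
records = [
    {"source": "db_a", "identifier": "a-1", "icsd_ids": [1001]},
    {"source": "db_b", "identifier": "b-7", "icsd_ids": [1001, 1002]},
    {"source": "db_b", "identifier": "b-8", "icsd_ids": [2001]},
    {"source": "db_c", "identifier": "c-3", "icsd_ids": []},
]


def cross_database_matches(records):
    """Map each ICSD number to the records sharing it across >1 database."""
    groups = defaultdict(list)
    for record in records:
        for icsd_id in record.get("icsd_ids", []):
            groups[icsd_id].append((record["source"], record["identifier"]))
    # Keep only ICSD numbers that link records from different sources.
    return {
        icsd_id: members
        for icsd_id, members in groups.items()
        if len({source for source, _ in members}) > 1
    }


matches = cross_database_matches(records)
print(matches)  # {1001: [('db_a', 'a-1'), ('db_b', 'b-7')]}
```

Records with no ICSD number, such as the db_c entry here, cannot be matched this way and would need a structural comparison instead.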

v0.4 — Hamiltonian exporters

  • OpenFermion InteractionOperator exporter
  • First-quantized plane-wave Hamiltonian (raw NumPy arrays)
  • LCU coefficient export for qualtran / pyLIQTR

v0.5 — Resource estimation hooks

  • T-count and Toffoli estimation interface
  • Qubit footprint estimator
  • Direct integration with qualtran / pyLIQTR resource analysis

Beyond v0.5

  • Gaussian and real-space basis support
  • Defect and surface slab geometries
  • Community adapter registry (entry-point group)

Community

Document              Purpose
CONTRIBUTING.md       How to open issues, propose schema changes, and submit pull requests
GOVERNANCE.md         Roles, decision process, and release policy
CODE_OF_CONDUCT.md    Community standards and enforcement
SECURITY.md           Private vulnerability reporting
SUPPORT.md            Where to ask questions vs. file bugs
GitHub Discussions    Q&A, design proposals, and architecture discussions

Contributions are welcome at any level — bug reports, adapter implementations, documentation improvements, and benchmark additions. Please read CONTRIBUTING.md and the project vision before opening a large pull request.


Citation

QMatBridge does not yet have a formal publication. If you use it in academic work, please cite the repository directly:

@software{qmatbridge,
  author  = {Reis, Roberto},
  title   = {{QMatBridge}: A bridge from classical materials databases to
             first-quantized Hamiltonians for quantum simulation},
  url     = {https://github.com/rmsreis/qmatbridge},
  version = {0.1.0},
  year    = {2026},
}

A citable release and, if the project grows, a JOSS submission are planned once the v0.2 Materials Project adapter is complete and the API is stable.


License

MIT — see LICENSE.
