Graph ID

Graph ID is a universal identifier system for atomistic structures including crystals and molecules. It generates unique, deterministic identifiers based on the topological and compositional properties of atomic structures, enabling efficient structure comparison, database indexing, and materials discovery.

Overview

Graph ID works by:

  1. Converting atomic structures into graph representations where atoms are nodes and bonds are edges
  2. Analyzing the local chemical environment around each atom using compositional sequences
  3. Computing a hash-based identifier that captures both topology and composition
  4. Supporting various modes including topology-only comparisons and Wyckoff position analysis
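
The steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the library's implementation: the function names (`compositional_sequence`, `toy_graph_id`), the shell depth, and the choice of BLAKE2b are all hypothetical, and the graph is a finite molecule, so the periodic images that matter for crystals are not handled.

```python
import hashlib
from collections import Counter


def compositional_sequence(adjacency, labels, start, depth=3):
    """Element counts per breadth-first shell (0..depth) around atom `start`."""
    seen = {start}
    frontier = [start]
    shells = []
    for _ in range(depth + 1):
        counts = Counter(labels[i] for i in frontier)
        shells.append(",".join(f"{el}{n}" for el, n in sorted(counts.items())))
        next_frontier = []
        for i in frontier:
            for j in adjacency[i]:
                if j not in seen:
                    seen.add(j)
                    next_frontier.append(j)
        frontier = next_frontier
    return "|".join(shells)


def toy_graph_id(adjacency, labels, depth=3):
    """Hash the sorted per-atom sequences so atom ordering does not matter."""
    sequences = sorted(
        compositional_sequence(adjacency, labels, i, depth)
        for i in range(len(labels))
    )
    payload = ";".join(sequences).encode()
    return hashlib.blake2b(payload, digest_size=8).hexdigest()


# Water: atoms are nodes, O-H bonds are edges.
water_labels = ["O", "H", "H"]
water_bonds = {0: [1, 2], 1: [0], 2: [0]}

# The same molecule with the atoms listed in a different order.
permuted_labels = ["H", "O", "H"]
permuted_bonds = {0: [1], 1: [0, 2], 2: [1]}

print(toy_graph_id(water_bonds, water_labels))
print(toy_graph_id(permuted_bonds, permuted_labels))
```

Because the per-atom sequences are sorted before hashing, reordering the atoms leaves the identifier unchanged, while swapping an element (for example O for S) changes it.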

Features

  • Universal Structure Identification: Generate unique IDs for any crystal or molecular structure
  • Topological Analysis: Option to generate topology-only IDs for structure type comparison
  • Wyckoff Position Support: Include crystallographic symmetry information in ID generation
  • Distance Clustering: Advanced clustering-based analysis for complex structures
  • C++ Performance: High-performance C++ backend with Python bindings
  • Multiple Neighbor Detection: Support for various neighbor-finding algorithms (MinimumDistanceNN, CrystalNN, etc.)

Installation

From PyPI

pip install graph-id-core
pip install graph-id-db  # optional database component

From Source

git clone https://github.com/kmu/graph-id-core.git
cd graph-id-core
git submodule update --init --recursive
pip install -e .

Quick Start

Basic Usage

from pymatgen.core import Structure, Lattice
from graph_id import GraphIDMaker

# Create a structure (NaCl)
structure = Structure.from_spacegroup(
    "Fm-3m",
    Lattice.cubic(5.692),
    ["Na", "Cl"],
    [[0, 0, 0], [0.5, 0.5, 0.5]]
)

# Generate Graph ID
maker = GraphIDMaker()
graph_id = maker.get_id(structure)
print(graph_id)  # Output: NaCl-88c8e156db1b0fd9
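
The printed ID above has the form of a composition prefix followed by a hexadecimal digest. A small helper can separate the two parts, for instance to group IDs by composition. This is a sketch that assumes the `<prefix>-<digest>` layout seen in the example output; `split_graph_id` is a hypothetical helper, not part of the package, and IDs produced in other modes may be laid out differently.

```python
def split_graph_id(graph_id):
    """Split '<prefix>-<digest>' at the last hyphen.

    If the string contains no hyphen, the prefix is returned as ''.
    """
    prefix, _, digest = graph_id.rpartition("-")
    return prefix, digest


prefix, digest = split_graph_id("NaCl-88c8e156db1b0fd9")
print(prefix, digest)  # NaCl 88c8e156db1b0fd9
```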

Loading from Files

from pymatgen.core import Structure
from graph_id_cpp import GraphIDGenerator

# Load structure from file
structure = Structure.from_file("path/to/structure.cif")
generator = GraphIDGenerator()
graph_id = generator.get_id(structure)

Advanced Configuration

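from graph_id_cpp import GraphIDGenerator
from pymatgen.analysis.local_env import CrystalNN
from pymatgen.core import Structure

# Load the structure to identify
structure = Structure.from_file("path/to/structure.cif")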

# Topology-only comparison (ignores composition)
topo_gen = GraphIDGenerator(topology_only=True)
topo_id = topo_gen.get_id(structure)

# Include Wyckoff positions
wyckoff_gen = GraphIDGenerator(wyckoff=True)
wyckoff_id = wyckoff_gen.get_id(structure)

# Use different neighbor detection
crystal_gen = GraphIDGenerator(nn=CrystalNN())  # Faster CrystalNN using C++ is also available
crystal_id = crystal_gen.get_id(structure)
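
To make the effect of a topology-only mode concrete, here is a toy sketch in plain Python. It is not how `GraphIDGenerator` works internally: `toy_id` is a hypothetical function that hashes one-shell environments and, when `topology_only` is set, replaces every element symbol with the same placeholder before hashing.

```python
import hashlib


def toy_id(adjacency, labels, topology_only=False):
    """Hash each atom's label together with its sorted neighbour labels."""
    if topology_only:
        # Erase chemistry: every atom gets the same placeholder symbol.
        labels = ["X"] * len(labels)
    environments = sorted(
        labels[i] + "(" + "".join(sorted(labels[j] for j in adjacency[i])) + ")"
        for i in range(len(labels))
    )
    return hashlib.blake2b("|".join(environments).encode(), digest_size=8).hexdigest()


# Two bent triatomic molecules with identical connectivity.
bonds = {0: [1, 2], 1: [0], 2: [0]}
water = ["O", "H", "H"]
hydrogen_sulfide = ["S", "H", "H"]

same_full = toy_id(bonds, water) == toy_id(bonds, hydrogen_sulfide)
same_topology = toy_id(bonds, water, topology_only=True) == toy_id(
    bonds, hydrogen_sulfide, topology_only=True
)
print(same_full, same_topology)  # False True
```

The two molecules differ in composition, so the full IDs differ, but their bond graphs are identical, so the topology-only IDs match.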

Search Structures from Database

Use graph-id-db to look up Materials Project structures by Graph ID. The package stores precomputed Graph IDs for those structures.

# pip install graph-id-db
from graph_id_cpp import GraphIDGenerator
from graph_id_db import Finder
from pymatgen.core import Structure, Lattice

# Build the primitive cell of NaCl
structure = Structure.from_spacegroup(
    "Fm-3m",
    Lattice.cubic(5.692),
    ["Na", "Cl"],
    [[0, 0, 0], [0.5, 0.5, 0.5]]
).get_primitive_structure()

# Compute its Graph ID
gen = GraphIDGenerator()
graph_id = gen.get_id(structure)
print(f"Graph ID of NaCl is {graph_id}")

# Search for structures in graph-id-db using the Graph ID
finder = Finder()
finder.find(graph_id)

Examples

More comprehensive examples can be found in the tests/ and examples/ directories.

Applications

Graph ID is particularly useful for:

  • Materials Databases: Efficient indexing and deduplication of structure databases
  • High-throughput Screening: Rapid identification of unique structures in computational workflows
  • Polymorph Identification: Distinguishing between different polymorphs of the same composition
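
As an illustration of the deduplication use case, the sketch below groups database entries by a precomputed ID string using only the standard library. The entry names and all ID strings except the NaCl one shown earlier are hypothetical placeholders, and `group_by_graph_id` is not part of the package.

```python
from collections import defaultdict


def group_by_graph_id(entries):
    """Map each Graph ID to the list of entry names that share it."""
    groups = defaultdict(list)
    for name, graph_id in entries:
        groups[graph_id].append(name)
    return dict(groups)


# (entry name, precomputed Graph ID) pairs; IDs are placeholders.
entries = [
    ("entry-001", "NaCl-88c8e156db1b0fd9"),
    ("entry-002", "placeholder-0000000000000001"),
    ("entry-003", "NaCl-88c8e156db1b0fd9"),
]

groups = group_by_graph_id(entries)
duplicates = {gid: names for gid, names in groups.items() if len(names) > 1}
print(duplicates)  # {'NaCl-88c8e156db1b0fd9': ['entry-001', 'entry-003']}
```

Entries that share an ID are candidates for the same structure, so only one representative per group needs to be kept or recomputed.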

Web Service (experimental)

You can search materials using Graph ID at matfinder.net.

Developer's notes

This repository is managed with Poetry.

Installation

  1. Clone the repository:
git clone https://github.com/kmu/graph-id-core.git
cd graph-id-core
  2. Initialize git submodules (required for the C++ build):
git submodule update --init --recursive
  3. Install the package and dependencies using Poetry:
poetry install
  4. Install the pre-commit hooks:
pre-commit install

Note: The git submodules (library/pybind11, library/eigen, library/gtl) are required for building the C++ extension. Without them, the installation will fail during the CMake build step.
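
A quick pre-build check can catch a missing submodule before CMake does. The sketch below is a hypothetical helper, not part of the repository; it only assumes the three submodule paths named in the note and treats an absent or empty directory as "not initialized".

```python
from pathlib import Path

# Submodule paths taken from the note above.
REQUIRED_SUBMODULES = ("library/pybind11", "library/eigen", "library/gtl")


def missing_submodules(repo_root="."):
    """Return the required submodule paths that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for rel in REQUIRED_SUBMODULES:
        path = root / rel
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    missing = missing_submodules(".")
    if missing:
        print("Missing submodules:", ", ".join(missing))
        print("Run: git submodule update --init --recursive")
    else:
        print("All required submodules are present.")
```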

Testing

poetry run pytest

If you have made changes to the C++ code, run poetry run pip install -e . --force-reinstall from the repository root to rebuild the extension before running the tests.

Releasing

  • Bump the version in pyproject.toml.
  • Create a new PR from the main branch to the release branch.
