Standard data for digital Materials R&D.

Standata

Standard data for digital materials R&D entities in the ESSE data format.

1. Installation

1.1. Python

The package is compatible with Python 3.10+. It can be installed as a Python package either via PyPI:

pip install mat3ra-standata

Or as an editable local installation in a virtual environment after cloning the repository:

virtualenv .venv
source .venv/bin/activate
pip install -e PATH_TO_STANDATA_REPOSITORY

1.2. JavaScript

Standata can be installed as a Node.js package via npm (Node Package Manager):

npm install @mat3ra/standata

2. Usage

2.1. Python

from mat3ra.standata.materials import materials_data
# This returns a list of JSON configs for all materials.
material_configs = list(materials_data["filesMapByName"].values())
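
As a follow-up to the snippet above, here is a minimal sketch of filtering the configs by name. It assumes each config carries the "name" property described in Section 4.2; the helper find_by_name, the stand-in dictionary, and its ".json" keys are hypothetical and used only so the example runs without the package installed.

```python
from typing import Any, Dict, List


def find_by_name(files_map: Dict[str, Dict[str, Any]], fragment: str) -> List[Dict[str, Any]]:
    """Return configs whose "name" contains `fragment` (case-insensitive)."""
    needle = fragment.lower()
    return [
        config
        for config in files_map.values()
        if needle in str(config.get("name", "")).lower()
    ]


# Stand-in for materials_data["filesMapByName"]; keys and values are illustrative only.
example_files_map = {
    "C-[Graphene]-HEX_[P6%2Fmmm]_2D_[Monolayer]-[mp-1040425].json": {
        "name": "C, Graphene, HEX (P6/mmm) 2D (Monolayer), mp-1040425"
    },
    "Ni-[Nickel]-FCC_[Fm-3m]_3D_[Bulk]-[mp-23].json": {
        "name": "Ni, Nickel, FCC (Fm-3m) 3D (Bulk), mp-23"
    },
}

graphene_configs = find_by_name(example_files_map, "graphene")
print([config["name"] for config in graphene_configs])
```

With the real package, the same helper could presumably be called as find_by_name(materials_data["filesMapByName"], "graphene"), provided the configs expose "name" as assumed.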

2.2. JavaScript

// Direct import can be used to avoid importing all data at once.
import data from "@mat3ra/standata/lib/runtime_data/materials";
// This creates a list of JSON configs for all materials.
const materialConfigs = Object.values(data.filesMapByName);

3. Conventions

3.1. Runtime Modules

To avoid file system calls on the client, the entity categories and data structures are made available at runtime via the files in src/js/runtime_data. These files are generated automatically using the following command:

npm run build:runtime-data

3.2. CLI Scripts for Creating Symlinks

3.2.1. Python

The Python package adds a command line script, create-symlinks, that builds a category-based file tree: each entity data file is symbolically linked into directories named after the categories associated with that entity. The resulting file tree is placed in a directory named by_category. The script expects the (relative or absolute) path to an entity config file (categories.yml). The destination of the file tree can be changed with the --destination/-d option.

# consult help page to view all options
create-symlinks --help
# creates symbolic links in materials/by_category
create-symlinks materials/categories.yml
# creates symbolic links for materials in tmp/by_category
create-symlinks materials/categories.yml -d tmp
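
The idea behind the category-based tree can be sketched in a few lines of Python. This is not the implementation of create-symlinks, and the layout of categories.yml is not shown here, so the sketch takes a hypothetical, already-parsed mapping from file name to category list; the function name link_by_category and the sample file are likewise assumptions.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict, List


def link_by_category(
    source_dir: Path, categories_by_file: Dict[str, List[str]], destination: Path
) -> List[Path]:
    """Create destination/by_category/<category>/<file> symlinks to source_dir/<file>."""
    created = []
    for filename, categories in categories_by_file.items():
        target = (source_dir / filename).resolve()
        for category in categories:
            category_dir = destination / "by_category" / category
            category_dir.mkdir(parents=True, exist_ok=True)
            link = category_dir / filename
            if link.is_symlink() or link.exists():
                link.unlink()  # replace stale links so the function can be re-run
            os.symlink(target, link)
            created.append(link)
    return created


# Demonstration in a throwaway directory with one hypothetical entity file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "materials"
    source.mkdir()
    (source / "graphene.json").write_text("{}")
    links = link_by_category(source, {"graphene.json": ["2D", "monolayer"]}, root / "tmp")
    link_names = sorted(link.relative_to(root).as_posix() for link in links)
    all_resolve = all(link.resolve() == (source / "graphene.json").resolve() for link in links)
    print(link_names)
```

One file with two categories yields two links, one per category directory, which is what makes the tree browsable by category without duplicating data.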

3.2.2. JavaScript/Node

Analogous to the command line script in Python, the repository also provides a script in TypeScript (src/js/cli.ts) and, after transpiling, in JavaScript (lib/cli.js). The script takes the entity config file as a mandatory positional argument and, optionally, an alternative location for the directory containing the symbolic links (--destination/-d).

# creates symbolic links in materials/by_category (node)
node lib/cli.js materials/categories.yml
# creates symbolic links in materials/by_category (ts-node)
ts-node src/js/cli.ts materials/categories.yml
# creates symbolic links for materials in tmp/by_category
ts-node src/js/cli.ts -d tmp materials/categories.yml
# run via npm
npm run build:categories -- materials/categories.yml

4. Development

See ESSE for the notes about development and testing.

To set up a development environment, create a virtual environment and install the dev dependencies:

python -m venv .venv
source .venv/bin/activate
pip install ".[dev]"

4.1. Materials Source

The materials data is sourced from the Materials Project for 3D materials and 2dmatpedia for 2D materials. The structural data in POSCAR format is stored in the materials/sources directory alongside the manifest.yml file that contains the additional description and metadata for each material.

To add new materials to Standata, place the POSCAR file in the materials/sources directory and update the manifest.yml file with the new material's metadata. Then run the following command to generate the materials data:

python create_materials.py

4.2. Materials Naming Conventions

Our dataset's naming convention for materials is designed to provide a comprehensive description of each material, incorporating essential attributes such as chemical composition, common name, crystal structure, and unique identifiers.

4.2.1. Name Property Format

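The material name property is a structured string that includes the chemical formula, common name, crystal system, space group, dimensionality, specific structure details, and a unique identifier. The formula, the common name, the structure description, and the identifier are separated by a comma and a space; within the structure description, the parts are separated by single spaces, with the space group and structure detail in parentheses.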

Format:

{Chemical Formula}, {Common Name}, {Crystal System} ({Space Group}) {Dimensionality} ({Structure Detail}), {Unique Identifier}

Examples:

  • Ni, Nickel, FCC (Fm-3m) 3D (Bulk), mp-23
  • ZrO2, Zirconium Dioxide, MCL (P2_1/c) 3D (Bulk), mp-2858
  • C, Graphite, HEX (P6_3/mmc) 3D (Bulk), mp-48
  • C, Graphene, HEX (P6/mmm) 2D (Monolayer), mp-1040425
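
To make the format concrete, the following sketch splits a name into its components with a regular expression. The pattern and the field names (formula, common_name, and so on) are assumptions derived from the Format line and the examples above, not part of the package API.

```python
import re
from typing import Dict

# One named group per placeholder in the documented name format.
NAME_PATTERN = re.compile(
    r"^(?P<formula>[^,]+), "
    r"(?P<common_name>[^,]+), "
    r"(?P<crystal_system>\S+) \((?P<space_group>[^)]+)\) "
    r"(?P<dimensionality>\S+) \((?P<structure_detail>[^)]+)\), "
    r"(?P<identifier>.+)$"
)


def parse_material_name(name: str) -> Dict[str, str]:
    """Split a material name into its components, or raise ValueError."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Name does not follow the expected format: {name!r}")
    return match.groupdict()


parsed = parse_material_name("ZrO2, Zirconium Dioxide, MCL (P2_1/c) 3D (Bulk), mp-2858")
print(parsed)
```

Names that deviate from the format (for example, a common name containing a comma) are rejected by this sketch rather than guessed at.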

4.2.2. Filename Format

Filenames are derived from the name property through a slugification process, ensuring they are filesystem-friendly and easily accessible via URLs or command-line interfaces. This process involves converting the structured name into a standardized, URL-safe format that reflects the material's attributes.

Format:

{Chemical_Formula}-[{Common_Name}]-{Crystal_System}_[{Space_Group}]_{Dimensionality}_[{Structure_Detail}]-[{Unique_Identifier}]

Transformation Rules:

  • Commas and Spaces: Replace ", " (comma and space) with "-" (hyphen) and " " (space) with "_" (underscore).
  • Parentheses: Convert "(" and ")" into "[" and "]" respectively.
  • Special Characters: Encode characters such as "/" into URL-safe representations (e.g., "%2F").
  • Brackets: Wrap the common name and identifier parts in square brackets "[]".

Filename Examples:

  • Ni-[Nickel]-FCC_[Fm-3m]_3D_[Bulk]-[mp-23]
  • ZrO2-[Zirconium_Dioxide]-MCL_[P2_1%2Fc]_3D_[Bulk]-[mp-2858]
  • C-[Graphite]-HEX_[P6_3%2Fmmc]_3D_[Bulk]-[mp-48]
  • C-[Graphene]-HEX_[P6%2Fmmm]_2D_[Monolayer]-[mp-1040425]
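
The transformation rules can be sketched as a small function. This is an illustration that follows the Format line and the rules above, not the slugification code used by the package; the function name name_to_filename is hypothetical, and only "/" is encoded because it is the only special character the rules name explicitly.

```python
def name_to_filename(name: str) -> str:
    """Derive a filename stem from a material name using the documented rules."""
    parts = name.split(", ")
    if len(parts) != 4:
        raise ValueError(f"Expected 4 comma-separated parts, got {len(parts)}: {name!r}")
    formula, common_name, structure, identifier = parts
    # Wrap the common name and identifier in brackets; join the parts with hyphens.
    slug = "-".join([formula, f"[{common_name}]", structure, f"[{identifier}]"])
    # Remaining spaces become underscores; parentheses become square brackets.
    slug = slug.replace(" ", "_").replace("(", "[").replace(")", "]")
    # Percent-encode the slash so the result is safe in file systems and URLs.
    return slug.replace("/", "%2F")


filename = name_to_filename("ZrO2, Zirconium Dioxide, MCL (P2_1/c) 3D (Bulk), mp-2858")
print(filename)
```

Splitting on ", " first keeps the bracket-wrapping step simple: the common name and the identifier are then whole list items rather than substrings to locate afterwards.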

5. Links
