Skip to main content

Python library for high-throughput .cif analysis

Project description

cifkit

CI codecov Python - Version PyPi version Conda version

Logo light mode Logo dark mode

cifkit is designed to provide a set of fully-tested utility functions and variables for handling large datasets, on the order of tens of thousands, of .cif files.

The current codebase and documentation are actively being improved as of Sep 18, 2024.

Features:

cifkit provides higher-level functions in just a few lines of code.

  • Coordination geometry - cifkit provides functions for visualing coordination geometry from each site and extracts physics-based features like volume and packing efficiency in each polyhedron.
  • Atomic mixing - cifkit extracts atomic mixing information at the bond pair level—tasks that would otherwise require extensive manual effort using GUI-based tools like VESTA, Diamond, and CrystalMaker.
  • Filter - cifkit offers features for preprocessing. It systematically addresses common issues in CIF files from databases, such as incorrect loop values and missing fractional coordinates, by standardizing and filtering out ill-formatted files. It also preprocesses atomic site labels, transforming labels such as 'M1' to 'Fe1' in files with atomic mixing.
  • Sort - cifkit allows you to copy, move, and sort .cif files based on attributes such as coordination numbers, space groups, unit cells, shortest distances, elements, and more.

Example usage 1 - coordination geometry

The example below uses cifkit to visualize the polyhedron generated from each atomic site based on the coordination number geometry.

from cifkit import Cif

cif = Cif("your_cif_file_path")
site_labels = cif.site_labels

# Loop through each site label
for label in site_labels:
    # Dipslay each polyhedron, .png saved for each label
    cif.plot_polyhedron(label, is_displayed=True)

Polyhedron generation

Example Usage 2 - sort

The following example generates a distribution of structure.

from cifkit import CifEnsemble

ensemble = CifEnsemble("your_folder_path_containing_cif_files")
ensemble.generate_structure_histogram()

structure distribution

Basde on your visual histogram above, you can copy and move .cif files based on specific attributes:

# Return file paths matching structures either Co1.75Ge or CoIn2
ensemble.filter_by_structures(["Co1.75Ge", "CoIn2"])

# Return file path matching CeAl2Ga2
ensemble.filter_by_structures("CeAl2Ga2")

To learn more, please read the official documentation here: https://bobleesj.github.io/cifkit.

Quotes

Here is a quote illustrating how cifkit addresses one of the challenges mentioned above.

"I am building an X-Ray diffraction analysis (XRD) pattern visualization script for my lab using pymatgen. I feel like cifkit integrated really well into my existing stable of libraries, while surpassing some alternatives in preprocessing and parsing. For example, it was often unclear at what stage an error occurred—whether during pre-processing with CifParser, or XRD plot generation with diffraction.core in pymatgen. The pre-processing logic in cifkit was communicated clearly, both in documentation and in actual outputs, allowing me to catch errors in my data before it was used in my visualizations. I now use cifkit by default for processing CIFs before they pass through the rest of my pipeline." - Alex Vtorov `

Documentation

How to contribute

Here is how you can contribute to the cifkit project if you found it helpful:

  • Star the repository on GitHub and recommend it to your colleagues who might find cifkit helpful as well. Star GitHub repository
  • Create a new issue for any bugs or feature requests here
  • Fork the repository and consider contributing changes via a pull request. Fork GitHub repository. Check out CONTRIBUTING.md for instructions.
  • If you have any suggestions or need further clarification on how to use cifkit, please reach out to Bob Lee (@bobleesj).

To render documentation

pip install -r requirements/docs.txt
mkdocs serve

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cifkit-1.0.5.tar.gz (4.0 MB view details)

Uploaded Source

Built Distribution

cifkit-1.0.5-py3-none-any.whl (2.0 MB view details)

Uploaded Python 3

File details

Details for the file cifkit-1.0.5.tar.gz.

File metadata

  • Download URL: cifkit-1.0.5.tar.gz
  • Upload date:
  • Size: 4.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for cifkit-1.0.5.tar.gz
Algorithm Hash digest
SHA256 1b71d398d73800db32b09785af0e755e2bebb1d508fd8fa319ca16379386e98e
MD5 40c3a0c78496b4b76f031df9c6f67c02
BLAKE2b-256 b59210af0f86907359ededfa9d4d70953118cfc0c932136a649575dc251a5aa7

See more details on using hashes here.

File details

Details for the file cifkit-1.0.5-py3-none-any.whl.

File metadata

  • Download URL: cifkit-1.0.5-py3-none-any.whl
  • Upload date:
  • Size: 2.0 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for cifkit-1.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 3cbdf9806eb5a59775f6787d7ddd712f07b1440316bb125b57b27daba0de9660
MD5 f871295c786ed7407d9d165b07b1635c
BLAKE2b-256 36c8610c373d0ae385375e40fb9df329518105ef1d0e2a7c2732b6a27423f383

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page