Python library for high-throughput .cif analysis

Project description

cifkit

cifkit provides a set of fully tested utility functions and variables for handling large datasets of .cif files, on the order of tens of thousands of files.

The current codebase and documentation are actively being improved as of Sep 18, 2024.

Features:

cifkit exposes higher-level functions that can be called in just a few lines of code.

  • Coordination geometry - cifkit provides functions for visualizing the coordination geometry of each site and extracts physics-based features such as the volume and packing efficiency of each polyhedron.
  • Atomic mixing - cifkit extracts atomic mixing information at the bond-pair level, a task that would otherwise require extensive manual effort using GUI-based tools like VESTA, Diamond, and CrystalMaker.
  • Filter - cifkit offers features for preprocessing. It systematically addresses common issues in CIF files from databases, such as incorrect loop values and missing fractional coordinates, by standardizing and filtering out ill-formatted files. It also preprocesses atomic site labels, transforming labels such as 'M1' to 'Fe1' in files with atomic mixing.
  • Sort - cifkit allows you to copy, move, and sort .cif files based on attributes such as coordination numbers, space groups, unit cells, shortest distances, elements, and more.
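
To make the Filter step more concrete, here is a minimal, standard-library-only sketch of one kind of check it describes: flagging a file whose atomic sites lack fractional coordinates. This is not cifkit's implementation, and `has_fractional_coordinates` is a hypothetical helper name. It only checks that the three coordinate data names are present, not that every site has valid values.

```python
# Hypothetical helper, not part of cifkit: flags CIF text that lacks
# fractional-coordinate data names for its atomic sites.
REQUIRED_TAGS = (
    "_atom_site_fract_x",
    "_atom_site_fract_y",
    "_atom_site_fract_z",
)


def has_fractional_coordinates(cif_text: str) -> bool:
    """Return True if all three fractional-coordinate data names appear."""
    # Collect the first token of every line that starts with an underscore.
    # CIF data names are case-insensitive, so compare in lowercase.
    names = {
        line.split()[0].lower()
        for line in cif_text.splitlines()
        if line.strip().startswith("_")
    }
    return all(tag in names for tag in REQUIRED_TAGS)
```

A preprocessing pass could call this on each file's text and set aside the files for which it returns False before any further analysis.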

Example usage 1 - coordination geometry

The example below uses cifkit to visualize the polyhedron generated from each atomic site based on the coordination number geometry.

from cifkit import Cif

cif = Cif("your_cif_file_path")
site_labels = cif.site_labels

# Loop through each site label
for label in site_labels:
    # Display each polyhedron; a .png is saved for each label
    cif.plot_polyhedron(label, is_displayed=True)

Polyhedron generation

Example usage 2 - sort

The following example generates a histogram showing the distribution of structures across the .cif files in a folder.

from cifkit import CifEnsemble

ensemble = CifEnsemble("cif_containing_folder_path")
ensemble.generate_structure_histogram()

structure distribution
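
Conceptually, a structure histogram is a count of files per structure label. The sketch below shows that idea with the standard library only; it is not how cifkit builds its histogram. The file names and the `count_structures` helper are hypothetical, and the structure labels are borrowed from the examples in this README.

```python
from collections import Counter


def count_structures(structure_by_file: dict[str, str]) -> Counter:
    """Count how many files share each structure label."""
    return Counter(structure_by_file.values())


# Hypothetical mapping of file name -> structure label.
structure_by_file = {
    "a.cif": "CoIn2",
    "b.cif": "CeAl2Ga2",
    "c.cif": "CoIn2",
}

counts = count_structures(structure_by_file)

# Print labels from most to least frequent.
for structure, n in counts.most_common():
    print(f"{structure}: {n}")
```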

Based on the histogram above, you can filter .cif files by specific attributes and then copy or move them:

# Return file paths matching either structure, Co1.75Ge or CoIn2
ensemble.filter_by_structures(["Co1.75Ge", "CoIn2"])

# Return file paths matching CeAl2Ga2
ensemble.filter_by_structures("CeAl2Ga2")
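
The comments above describe `filter_by_structures` as returning file paths. Assuming the result is an iterable of path strings, the following standard-library sketch shows one way to copy the matching files into a separate folder. `copy_files` is a hypothetical helper written for illustration, not a cifkit function; cifkit may offer its own helpers for copying and moving files, so check the official documentation first.

```python
import shutil
from pathlib import Path


def copy_files(file_paths, destination_dir):
    """Copy each file into destination_dir and return the new paths."""
    dest = Path(destination_dir)
    # Create the destination folder (and parents) if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in file_paths:
        target = dest / Path(path).name
        # copy2 keeps file metadata such as modification time.
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

The original files are left in place; files with the same name in the destination are overwritten.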

To learn more, please read the official documentation here: https://bobleesj.github.io/cifkit.

Quotes

Here is a quote illustrating how cifkit addresses one of the challenges mentioned above.

"I am building an X-Ray diffraction analysis (XRD) pattern visualization script for my lab using pymatgen. I feel like cifkit integrated really well into my existing stable of libraries, while surpassing some alternatives in preprocessing and parsing. For example, it was often unclear at what stage an error occurred—whether during pre-processing with CifParser, or XRD plot generation with diffraction.core in pymatgen. The pre-processing logic in cifkit was communicated clearly, both in documentation and in actual outputs, allowing me to catch errors in my data before it was used in my visualizations. I now use cifkit by default for processing CIFs before they pass through the rest of my pipeline." - Alex Vtorov

Documentation

How to contribute

Here is how you can contribute to the cifkit project if you found it helpful:

  • Star the repository on GitHub and recommend it to colleagues who might find cifkit helpful as well.
  • Create a new issue for any bugs or feature requests.
  • Fork the repository and consider contributing changes via a pull request. Check out CONTRIBUTING.md for instructions.
  • If you have any suggestions or need further clarification on how to use cifkit, please reach out to Bob Lee (@bobleesj).

Download files

Source Distribution

cifkit-1.0.4rc4.tar.gz (3.0 MB)

Built Distribution

cifkit-1.0.4rc4-py3-none-any.whl (943.5 kB)

File details

Details for the file cifkit-1.0.4rc4.tar.gz.

File metadata

  • Download URL: cifkit-1.0.4rc4.tar.gz
  • Size: 3.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for cifkit-1.0.4rc4.tar.gz
Algorithm Hash digest
SHA256 f060f945a6791d5b3886c74ebe766cdd61a27df04084571c557004717c2a91fd
MD5 156eae1b9c2fdcc4a6bb560b7ac62999
BLAKE2b-256 ff7aef06549ba6c0505698e5c2f4733663c09d9255946eec05f85574f927917a

File details

Details for the file cifkit-1.0.4rc4-py3-none-any.whl.

File metadata

  • Download URL: cifkit-1.0.4rc4-py3-none-any.whl
  • Size: 943.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for cifkit-1.0.4rc4-py3-none-any.whl
Algorithm Hash digest
SHA256 6199642b274d59f1d5bbe6f38fcf9535c38ec243d88dee963e676468caf32632
MD5 9898c71b0e5bcb5adc9348f423e7e368
BLAKE2b-256 69ac04eaa8dada3cf9b58d2bcb4de9a940ad8d72f7592e2f7f5eb33957b28a0d
