Python library for high-throughput .cif analysis

cifkit

cifkit provides a set of fully tested utility functions and variables for handling large datasets of .cif files, on the order of tens of thousands of files.

The current codebase and documentation are actively being improved as of Sep 18, 2024.

Features:

cifkit exposes higher-level functions that cover the following tasks in just a few lines of code.

  • Coordination geometry - cifkit provides functions for visualizing the coordination geometry of each site and extracts physics-based features, such as volume and packing efficiency, for each polyhedron.
  • Atomic mixing - cifkit extracts atomic mixing information at the bond-pair level, a task that would otherwise require extensive manual effort using GUI-based tools like VESTA, Diamond, and CrystalMaker.
  • Filter - cifkit offers features for preprocessing. It systematically addresses common issues in CIF files from databases, such as incorrect loop values and missing fractional coordinates, by standardizing and filtering out ill-formatted files. It also preprocesses atomic site labels, transforming labels such as 'M1' to 'Fe1' in files with atomic mixing.
  • Sort - cifkit allows you to copy, move, and sort .cif files based on attributes such as coordination numbers, space groups, unit cells, shortest distances, elements, and more.

Example usage 1 - coordination geometry

The example below uses cifkit to visualize the polyhedron generated from each atomic site based on the coordination number geometry.

from cifkit import Cif

cif = Cif("your_cif_file_path")
site_labels = cif.site_labels

# Loop through each site label
for label in site_labels:
    # Display each polyhedron; a .png is saved for each label
    cif.plot_polyhedron(label, is_displayed=True)

Polyhedron generation
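
When a file has many sites, it can help to keep going if one site fails to plot. The block below is a minimal sketch of that idea, not part of cifkit: `plot_all_polyhedra` is a hypothetical helper that relies only on the `site_labels` attribute and the `plot_polyhedron(label, is_displayed=...)` call shown above. Whether `plot_polyhedron` raises an exception on a problematic site is an assumption.

```python
def plot_all_polyhedra(cif, is_displayed=False):
    """Plot a polyhedron for every site label of a Cif-like object.

    Hypothetical helper (not part of cifkit). Assumes `cif` exposes
    `site_labels` and `plot_polyhedron(label, is_displayed=...)` as in
    the example above, and that a failure surfaces as an exception.
    Returns (plotted_labels, failed) where `failed` maps label -> error text.
    """
    plotted = []
    failed = {}
    for label in cif.site_labels:
        try:
            cif.plot_polyhedron(label, is_displayed=is_displayed)
        except Exception as exc:  # record the failure and continue with the next site
            failed[label] = str(exc)
        else:
            plotted.append(label)
    return plotted, failed
```

With a real file this might be called as `plot_all_polyhedra(Cif("your_cif_file_path"))`; the returned `failed` mapping shows which sites need a closer look.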

Example usage 2 - sort

The following example generates a histogram of the structures found in a folder of .cif files.

from cifkit import CifEnsemble

ensemble = CifEnsemble("cif_containing_folder_path")
ensemble.generate_structure_histogram()

structure distribution

Based on the histogram above, you can copy and move .cif files according to specific attributes:

# Return file paths matching structures either Co1.75Ge or CoIn2
ensemble.filter_by_structures(["Co1.75Ge", "CoIn2"])

# Return file paths matching CeAl2Ga2
ensemble.filter_by_structures("CeAl2Ga2")
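
Once a filter has returned file paths, the matching files can be gathered into their own folder. The sketch below uses only the Python standard library and assumes, as the comments above indicate, that `filter_by_structures` returns a collection of file paths. `copy_matching_files` is a hypothetical helper, not a cifkit function; cifkit may offer its own copy and move helpers, so check the official documentation first.

```python
import shutil
from pathlib import Path


def copy_matching_files(file_paths, dest_dir):
    """Copy each path in `file_paths` into `dest_dir`, creating it if needed.

    Hypothetical helper (not part of cifkit). `file_paths` is assumed to be
    an iterable of file paths such as the result of a filter call.
    Returns the list of copied destination paths, sorted by source path.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(str(p) for p in file_paths):
        target = dest / Path(path).name
        shutil.copy2(path, target)  # copy2 also preserves file timestamps
        copied.append(target)
    return copied


# Example (assumes `ensemble` from the snippet above):
# paths = ensemble.filter_by_structures("CeAl2Ga2")
# copy_matching_files(paths, "CeAl2Ga2_files")
```

Files with the same name in different source folders would overwrite each other in this sketch, so a real workflow may need to handle name collisions.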

To learn more, please read the official documentation: https://bobleesj.github.io/cifkit.

Quotes

Here is a quote illustrating how cifkit addresses one of the challenges mentioned above.

"I am building an X-Ray diffraction analysis (XRD) pattern visualization script for my lab using pymatgen. I feel like cifkit integrated really well into my existing stable of libraries, while surpassing some alternatives in preprocessing and parsing. For example, it was often unclear at what stage an error occurred—whether during pre-processing with CifParser, or XRD plot generation with diffraction.core in pymatgen. The pre-processing logic in cifkit was communicated clearly, both in documentation and in actual outputs, allowing me to catch errors in my data before it was used in my visualizations. I now use cifkit by default for processing CIFs before they pass through the rest of my pipeline." - Alex Vtorov

Documentation

How to contribute

Here is how you can contribute to the cifkit project if you found it helpful:

  • Star the repository on GitHub and recommend it to colleagues who might find cifkit helpful as well.
  • Create a new issue on GitHub for any bugs or feature requests.
  • Fork the repository and consider contributing changes via a pull request. Check out CONTRIBUTING.md for instructions.
  • If you have any suggestions or need further clarification on how to use cifkit, please reach out to Bob Lee (@bobleesj).


