
Python library for high-throughput .cif analysis

Project description

cifkit



cifkit provides a set of fully tested utility functions and variables for handling large datasets of .cif files, on the order of tens of thousands of files.

The current codebase and documentation are actively being improved as of Sep 18, 2024.

Features:

cifkit provides high-level functions that cover the following tasks in just a few lines of code.

  • Coordination geometry - cifkit provides functions for visualizing the coordination geometry around each site and extracts physics-based features such as the volume and packing efficiency of each polyhedron.
  • Atomic mixing - cifkit extracts atomic mixing information at the bond-pair level, a task that would otherwise require extensive manual effort using GUI-based tools like VESTA, Diamond, and CrystalMaker.
  • Filter - cifkit offers features for preprocessing. It systematically addresses common issues in CIF files from databases, such as incorrect loop values and missing fractional coordinates, by standardizing and filtering out ill-formatted files. It also preprocesses atomic site labels, transforming labels such as 'M1' to 'Fe1' in files with atomic mixing.
  • Sort - cifkit allows you to copy, move, and sort .cif files based on attributes such as coordination numbers, space groups, unit cells, shortest distances, elements, and more.
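To make the filtering idea concrete, here is a minimal standard-library sketch of one kind of check mentioned above: flagging a file whose atomic-site loop lacks fractional coordinates. It is not cifkit's implementation. The helper name `missing_site_tags` and the sample strings are hypothetical; the tag names come from the core CIF dictionary.

```python
# Hypothetical helper, NOT part of cifkit: a crude text-level check that the
# atom-site tags from the core CIF dictionary appear somewhere in the file.
REQUIRED_TAGS = (
    "_atom_site_label",
    "_atom_site_fract_x",
    "_atom_site_fract_y",
    "_atom_site_fract_z",
)


def missing_site_tags(cif_text: str) -> list[str]:
    """Return the required atom-site tags that never appear in the text."""
    present = set()
    for line in cif_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("_"):
            # The tag is the first whitespace-separated token on the line.
            present.add(stripped.split()[0])
    return [tag for tag in REQUIRED_TAGS if tag not in present]


# Made-up sample data for illustration only.
good_cif = """data_example
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
Fe1 0.0 0.0 0.0
"""
bad_cif = good_cif.replace("_atom_site_fract_z\n", "")

print(missing_site_tags(good_cif))  # []
print(missing_site_tags(bad_cif))   # ['_atom_site_fract_z']
```

A real preprocessing step would also need to validate loop values and row lengths, which this sketch does not attempt.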

Example usage 1 - coordination geometry

The example below uses cifkit to plot the coordination polyhedron around each atomic site, based on that site's coordination geometry.

from cifkit import Cif

cif = Cif("your_cif_file_path")
site_labels = cif.site_labels

# Loop through each site label
for label in site_labels:
    # Display each polyhedron; a .png is saved for each label
    cif.plot_polyhedron(label, is_displayed=True)


Example usage 2 - sort

The following example generates a histogram showing how the .cif files in a folder are distributed across structures.

from cifkit import CifEnsemble

ensemble = CifEnsemble("cif_containing_folder_path")
ensemble.generate_structure_histogram()


Using the histogram above as a guide, you can copy and move .cif files based on specific attributes:

# Return file paths whose structure is either Co1.75Ge or CoIn2
ensemble.filter_by_structures(["Co1.75Ge", "CoIn2"])

# Return file paths whose structure is CeAl2Ga2
ensemble.filter_by_structures("CeAl2Ga2")
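Once a filter has returned file paths, they can be copied into a separate folder. The sketch below uses only the standard library and assumes that `filter_by_structures` returns an iterable of file paths, as the comments above suggest. The helper name `copy_matching_files` and the destination folder name are hypothetical; the official documentation covers cifkit's own copy and move features.

```python
import shutil
from pathlib import Path


def copy_matching_files(file_paths, dest_dir):
    """Copy each file into dest_dir (created if needed); return the new paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in file_paths:
        # shutil.copy2 keeps file metadata and returns the destination path.
        copied.append(Path(shutil.copy2(src, dest)))
    return copied


# Hypothetical usage with the ensemble from the example above:
# paths = ensemble.filter_by_structures("CeAl2Ga2")
# copy_matching_files(paths, "CeAl2Ga2_files")
```

Copying rather than moving leaves the original dataset untouched, which is the safer default while a filter is still being tuned.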

To learn more, please read the official documentation: https://bobleesj.github.io/cifkit.

Quotes

Here is a quote illustrating how cifkit addresses one of the challenges mentioned above.

"I am building an X-Ray diffraction analysis (XRD) pattern visualization script for my lab using pymatgen. I feel like cifkit integrated really well into my existing stable of libraries, while surpassing some alternatives in preprocessing and parsing. For example, it was often unclear at what stage an error occurred—whether during pre-processing with CifParser, or XRD plot generation with diffraction.core in pymatgen. The pre-processing logic in cifkit was communicated clearly, both in documentation and in actual outputs, allowing me to catch errors in my data before it was used in my visualizations. I now use cifkit by default for processing CIFs before they pass through the rest of my pipeline." - Alex Vtorov

Documentation

How to contribute

Here is how you can contribute to the cifkit project if you find it helpful:

  • Star the repository on GitHub and recommend it to colleagues who might find cifkit helpful as well.
  • Create a new issue on GitHub for any bugs or feature requests.
  • Fork the repository and consider contributing changes via a pull request. Check out CONTRIBUTING.md for instructions.
  • If you have any suggestions or need further clarification on how to use cifkit, please reach out to Bob Lee (@bobleesj).
