A tool for parsing CIF files and conducting large-scale analysis on them.

Project description

TFSI Crystallography Analysis using PyCIFTer

This repository contains Python code for PyCIFTer, a program I made to analyze X-ray crystallographic data of bis(trifluoromethanesulfonyl)imide (TFSI) compounds. View at:

Published Article

Graphs Dashboard Here

Overview

Chemical data science is an emerging field within chemistry with applications across academic and industrial research. As quantum computing and artificial intelligence become more mainstream, both data scientists and chemists have an interest in using these technologies to streamline data processing, analyze large data sets, and reveal chemical insights that would otherwise be hidden by the sheer volume and complexity of the available data.

This project combines modern computing with fundamental chemical analysis to pursue new insights into the structural behavior of the weakly coordinating anion bis(trifluoromethylsulfonyl)imide, otherwise known as TFSI. Published solid-state crystal structures in the Cambridge Structural Database (CSD) that include one or more TFSI species of interest were categorized and statistically analyzed using software built for this purpose.

From a chemical perspective, the goal was to determine the structural characteristics displayed by TFSI, as inferred from the structural data. From a data-science perspective, the goal was to develop a new Python program that parses Crystallographic Information Files (CIFs) obtained from the CSD into statistically relevant information that can be compared across the individual structural data sets. This research aims to highlight the application of data science to an area of research where it is otherwise unfamiliar (structural chemistry) and to outline the capabilities of data processing for future structural and chemical investigations.

Features

  • Import and parse CIF (Crystallographic Information File) data
  • Analyze TFSI bond lengths and angles
  • Visualize TFSI molecular structure
  • Perform statistical analysis on crystallographic parameters

Requirements

The tool is written in Python. The latest version of the program was run using Python 3.14, but any version above 3.10 will suffice. The requirements.txt file lists all the packages required for running the tool.

Install PyCIFTer and its required packages using pip:

pip install pycifter

Components of PyCIFTer

  • index.py: Main script used in the research to demonstrate the effectiveness of PyCIFTer. Executes the main analysis pipeline.
  • cifFileParser.py: TFSI structural analysis functions
  • render.py: Plotting and visualization functions
  • heatmap.py: Interactive rendering

cifFileParser.py - Utility functions

  • getElementAtoms(self, symbol) : Returns a list of the atoms with a particular symbol, found by a linear search through self.Atoms.
  • containsAtom(self, symbol) : Checks whether the current molecule contains an atom with the given symbol.
  • getAtomsInARadius(self, targetAtom, radius) : Returns a list of the atoms found by a radial search around targetAtom, that is, all atoms whose Euclidean distance from targetAtom is less than radius.
  • getParticularAtom(self, identifier) : Returns a particular atom object based on its identifier (e.g. C57).
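
The four utility functions above can be sketched as follows. This is a minimal illustration of the described behavior, not the code in cifFileParser.py: the SimpleAtom and MoleculeSketch classes, the Cartesian position tuple, and the choice to exclude targetAtom from its own search results are all assumptions made for the example, and the coordinates are arbitrary values rather than crystallographic data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleAtom:
    """Hypothetical minimal atom record used only for this sketch."""
    identifier: str   # label such as "C57"
    symbol: str       # element symbol such as "C"
    position: tuple   # assumed Cartesian (x, y, z) coordinates


class MoleculeSketch:
    """Hypothetical container mirroring the utility functions listed above."""

    def __init__(self, atoms):
        self.Atoms = list(atoms)

    def getElementAtoms(self, symbol):
        # Linear search: keep every atom whose element symbol matches.
        return [atom for atom in self.Atoms if atom.symbol == symbol]

    def containsAtom(self, symbol):
        # True as soon as one atom with the given symbol is found.
        return any(atom.symbol == symbol for atom in self.Atoms)

    def getAtomsInARadius(self, targetAtom, radius):
        # Keep atoms strictly closer than `radius` to the target.
        # Assumption: the target atom itself is left out of the result.
        return [
            atom for atom in self.Atoms
            if atom is not targetAtom
            and math.dist(atom.position, targetAtom.position) < radius
        ]

    def getParticularAtom(self, identifier):
        # Return the first atom with a matching identifier, or None.
        for atom in self.Atoms:
            if atom.identifier == identifier:
                return atom
        return None


# Arbitrary illustration coordinates, not taken from any real structure.
molecule = MoleculeSketch([
    SimpleAtom("N1", "N", (0.0, 0.0, 0.0)),
    SimpleAtom("S1", "S", (1.6, 0.0, 0.0)),
    SimpleAtom("S2", "S", (-1.6, 0.0, 0.0)),
    SimpleAtom("C57", "C", (5.0, 0.0, 0.0)),
])
nitrogen = molecule.getParticularAtom("N1")
neighbours = molecule.getAtomsInARadius(nitrogen, 2.0)
print([atom.identifier for atom in neighbours])
```

A search like this visits every atom once per call, which is simple and adequate for single molecules; a spatial index would only become worthwhile for much larger structures.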

Atom.py - Utility functions

  • getDistance(self, other) : Returns the Euclidean distance between the position vectors of the current atom and the “other” atom.
  • __eq__(self, other) : Operator overload for the “==” operator. When two atoms are compared using “==”, the result depends on whether they have exactly the same position vector; without this overload, “==” would compare the two objects’ memory addresses.
  • __hash__(self) : Defines the input used to compute the hash of an atom. The position vector is unique to each atom and is therefore used as the hashing input.
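
The position-based equality and hashing described above can be sketched as follows. This is a hedged illustration rather than the contents of Atom.py: the constructor arguments and the identifier, symbol, and position attribute names are assumptions, and the coordinates are arbitrary values chosen for the example.

```python
import math


class Atom:
    """Hypothetical stand-in for the Atom class; attribute names are assumed."""

    def __init__(self, identifier, symbol, position):
        self.identifier = identifier
        self.symbol = symbol
        # Store the position as a tuple so that it is hashable.
        self.position = tuple(float(c) for c in position)

    def getDistance(self, other):
        # Euclidean distance between the two position vectors.
        return math.dist(self.position, other.position)

    def __eq__(self, other):
        # Two atoms are equal when their position vectors match exactly.
        if not isinstance(other, Atom):
            return NotImplemented
        return self.position == other.position

    def __hash__(self):
        # Hash on the same data used by __eq__ so that equal atoms
        # always produce equal hashes.
        return hash(self.position)


# Arbitrary illustration coordinates, not taken from any real structure.
a = Atom("S1", "S", (0.0, 0.0, 0.0))
b = Atom("N1", "N", (3.0, 4.0, 0.0))
duplicate_of_a = Atom("S1", "S", (0.0, 0.0, 0.0))

print(a.getDistance(b))
print(a == duplicate_of_a)
print(len({a, duplicate_of_a, b}))
```

Defining __eq__ and __hash__ on the same data lets atoms be placed in sets or used as dictionary keys, so duplicate atoms at the same position collapse to a single entry.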

The current version of the software is a pre-alpha version built for research purposes. I will continue working on this project, adding more robust features to the CIFParser class. See my project on Crown Ethers for another example of molecular analysis with PyCIFTer.

License

MIT License
