CrystalML

Disclaimer: This program is under active development and should not be used in production. All APIs and commands are subject to change without notice.

Integrated tool to measure the nucleation rate of protein crystals from the crystallization kinetics of an array of independent identical droplets.

From a directory containing a time series of images of multiple droplets, the tool segments individual droplets and uses a pre-trained CNN model to determine the presence or absence of crystals in each drop. The nucleation rate is evaluated from the rate of decay of the proportion of drops that do not exhibit visible crystals.
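The decay-rate step can be sketched as follows. This is a minimal illustration, not CrystalML's implementation: it assumes nucleation is a first-order process, so that the fraction of clear drops follows f(t) = exp(-k t), and it fits k by least squares on ln f. The function name `estimate_decay_rate` and the synthetic data are hypothetical.

```python
import math


def estimate_decay_rate(times, clear_fractions):
    """Fit ln(clear fraction) against time and return the decay rate k (> 0).

    Assumes f(t) = exp(-k * t); this model is an assumption of the sketch.
    """
    # ln is undefined once every drop has crystallized, so skip f == 0
    points = [(t, math.log(f)) for t, f in zip(times, clear_fractions) if f > 0]
    if len(points) < 2:
        raise ValueError("need at least two time points with clear drops")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    return -sxy / sxx  # slope of ln f is -k


# Synthetic example generated with k = 0.05 per hour
hours = [0.0, 5.0, 10.0, 20.0, 40.0]
fractions = [math.exp(-0.05 * h) for h in hours]
k = estimate_decay_rate(hours, fractions)
```

Under the same assumption, dividing k by the droplet volume would give a nucleation rate per unit volume and time.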

Schematic

Installation

Install with pip

CrystalML is on PyPI, so it can be installed with pip:

pip install crystalml

Install from source

Clone the repository to your computer

git clone https://github.com/hlgirard/CrystalML.git

and install with pip

cd CrystalML
pip install .

Usage

Quickstart

A time series of images of an emulsion of protein-laden droplets must be stored in a directory before using CrystalML. The application can then be used to process the images as follows:

crystalml process --save-plot path/to/directory

crystalml process command

The process command takes a directory of images, segments the droplets in each image and determines how many droplets are present and how many of these contain crystals. The program saves a .csv file at the root of that directory with the name of each image, the time it was taken (from EXIF data) and the number of droplets (total, clear and containing crystals).
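The per-image records described above could be post-processed along these lines. This is a sketch under stated assumptions: the column names and layout of the .csv file are not given here, so the example works on hypothetical in-memory rows instead of reading the file. The only fixed convention used is the standard EXIF timestamp layout, `YYYY:MM:DD HH:MM:SS`.

```python
from datetime import datetime

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"  # standard EXIF date/time layout


def elapsed_hours(exif_timestamps):
    """Convert EXIF timestamp strings to hours since the earliest image."""
    stamps = [datetime.strptime(s, EXIF_FORMAT) for s in exif_timestamps]
    start = min(stamps)
    return [(s - start).total_seconds() / 3600.0 for s in stamps]


def clear_fraction(num_clear, num_total):
    """Fraction of droplets without visible crystals; None if no droplets were found."""
    return num_clear / num_total if num_total else None


# Hypothetical rows: (EXIF time, total droplets, clear droplets)
rows = [
    ("2019:06:01 10:00:00", 200, 200),
    ("2019:06:01 12:30:00", 198, 150),
    ("2019:06:02 10:00:00", 201, 67),
]
hours = elapsed_hours([row[0] for row in rows])
fractions = [clear_fraction(clear, total) for _, total, clear in rows]
```

Normalizing by the total count in each image, as done here, keeps the fraction meaningful when the number of detected droplets varies slightly between frames.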

Arguments

  • -c, --check-segmentation displays the result of segmenting an image (selected at approximately 80% of the time series) to verify that the segmentation algorithm works well before processing.
  • -o, --save-overlay resaves all images in the directory with an overlay showing detected droplets in red (no crystal) or green (crystal detected) for process control.
  • -p, --save-plot generates and saves plots of crystal contents over time.
  • -v, --verbose increases the verbosity level.
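The image selection behind --check-segmentation can be pictured as below. This is an illustration only: how CrystalML orders the images and rounds the 80% position is not stated, so the sketch assumes that file names sort chronologically and truncates the index. The helper `pick_check_image` is hypothetical.

```python
def pick_check_image(image_names, position=0.8):
    """Return the image located `position` of the way through the sorted series."""
    if not image_names:
        raise ValueError("no images to choose from")
    ordered = sorted(image_names)  # assumption: names sort in time order
    # Clamp so that position=1.0 still yields a valid index
    index = min(len(ordered) - 1, int(position * len(ordered)))
    return ordered[index]


names = [f"img_{i:03d}.jpg" for i in range(10)]
chosen = pick_check_image(names)
```

Checking a late image is useful because crystals are more likely to be present by then, which makes it a harder case for the segmentation than the first frame.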

crystalml segment command

The segment command runs the segmentation algorithm on an image or a directory of images and saves the segmented droplet images to disk.

Arguments

  • -o, --save-overlay resaves all images in the directory with an overlay showing detected droplets
  • -v, --verbose increases the verbosity level

crystalml train command

The train command trains the machine learning models that label segmented droplets as containing crystals or not. It expects a directory of training data with two subdirectories, named Crystal and Clear, each containing grayscale images of segmented droplets (use the segment command to generate these images).
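Before training, the expected layout can be checked with a few lines of Python. This is a hypothetical helper and not part of CrystalML; it only relies on the two subdirectory names given above and counts the files in each so that a missing or empty class is noticed early.

```python
from pathlib import Path

EXPECTED_CLASSES = ("Crystal", "Clear")  # subdirectory names from the text above


def count_training_images(training_dir):
    """Return {class name: number of files}, raising if a class directory is missing."""
    root = Path(training_dir)
    counts = {}
    for name in EXPECTED_CLASSES:
        class_dir = root / name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"missing subdirectory: {class_dir}")
        counts[name] = sum(1 for p in class_dir.iterdir() if p.is_file())
    return counts
```

A strongly unbalanced count between the two classes is worth noticing before training a binary classifier, since it can bias the result toward the larger class.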

Arguments

  • -m, --model selects the type of model to train (svm|cnn|cnn-transfer)
  • -tb, --tensorboard saves logs for tensorboard visualization in <cwd>/logs
  • -v, --verbose increases the verbosity level

Repository structure

  • models: pre-trained machine learning models for crystal presence discrimination
  • notebooks: Jupyter notebooks evaluating different image segmentation strategies
  • src: source code for the project
    • crystal_processing: processing pipeline from directory to nucleation rate
    • data: data processing methods, including cropping, segmentation, extraction
    • models: model definition and training scripts for the droplet binary labelling task
    • visualization: visualization and plotting methods
    • cli.py: entry point to the command line interface
  • tests: unit tests

License

This project is licensed under the GPLv3 License - see the LICENSE.md file for details.

Credit

Initial models were built starting from the example at: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html

Live data visualization class TrainingPlot originally from: https://github.com/kapil-varshney/utilities/blob/master/training_plot/trainingplot.py

