Chemical image classification project: a program for training a deep neural network model and a web service for classifying chemical images.
Chemical Image Classifier (ChemIC) v1.3.1
This is the official fork and continuation of the ChemIC project, which was originally developed by Dr. Aleksei Krasnov. The original repository can be found at https://github.com/ontochem/ChemIC
Table of Contents
- Project Description
- Requirements
- Prepare Workspace Environment with Conda
- Model Construction
- Models Download
- Usage: Web Service for Chemical Image Classification
- Jupyter Notebook
- Author
- Citation
- References
- License
User Web Interface
You can try out the user frontend web interface at https://chemic-ai.streamlit.app/
Project Description
The Chemical Image Classifier (ChemIC) project provides a solution for classifying chemical images using a Convolutional Neural Network (CNN). The model categorizes images into one of four predefined classes:
- Images containing a single chemical structure.
- Images depicting chemical reactions.
- Images featuring multiple chemical structures.
- Images with no identifiable chemical structures.
The package consists of three main components:
A) CNN Model for Image Classification (chemic_train_eval.py)
- Trains a deep learning model to classify images into the four predefined classes.
- Utilizes a pre-trained ResNet-50 model and includes steps for data preparation, model training, evaluation, and testing.
B) Web Service for Chemical Image Classification (app.py)
- Provides a FastAPI web application for classifying chemical images using the trained ResNet-50 model.
- Exposes an endpoint /classify_images for accepting chemical images and returning the predicted class.
C) Image Classification Client (client.py)
- Interacts with the ChemIC web server. The client can send to the server:
- The path to an individual image file
- The path to a directory with multiple images
- Base64 encoded image data
The server classifies the images and returns the recognition results to the client.
Prepare Workspace Environment with Conda
# 1. Create and activate the conda environment
conda create --name chemic "python<3.13"
conda activate chemic
# 2. Install ChemIC-ml
# 2.1 From PyPI
pip install ChemIC-ml
# 2.2 Or, install from the GitHub repository
pip install git+https://github.com/alexey-krasnov/ChemIC.git
# 2.3 Or, install in editable mode from the GitHub repository
git clone https://github.com/alexey-krasnov/ChemIC.git
cd ChemIC
pip install -r requirements.txt
pip install -e .
- Where -e means "editable" mode.
Model Construction
First, download the archive with manually labeled images, available as part of the supplementary materials from Zenodo: dataset_for_image_classifier.zip. Unzip the archive:
unzip dataset_for_image_classifier.zip
To perform model training, validation, and testing, as well as saving your trained model, run the following command in the CLI:
python chemic_train_eval.py --dataset_dir /path/to/data --checkpoint_path /path/to/checkpoint.pth --models_dir /path/to/models
- --dataset_dir: Directory containing the dataset (with train, test, and validation subdirectories).
- --checkpoint_path: Path to the existing model checkpoint file.
- --models_dir: Directory to save newly trained models.
This command executes the training and evaluation using the specified paths.
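Before starting a long training run, it can help to confirm that the dataset directory has the layout the script expects. The sketch below is not part of ChemIC: the helper name `missing_splits` is hypothetical, and the exact subdirectory names (`train`, `test`, `validation`) are assumptions based on the option description above, so adjust them to match the unpacked archive.

```python
from pathlib import Path

# Assumed split names; check them against the unpacked archive.
EXPECTED_SPLITS = ("train", "test", "validation")


def missing_splits(dataset_dir, expected=EXPECTED_SPLITS):
    """Return the expected subdirectories that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in expected if not (root / name).is_dir()]


# Example: report what is missing in the current directory.
missing = missing_splits(".")
if missing:
    print("Missing subdirectories: " + ", ".join(missing))
else:
    print("Dataset layout looks complete.")
```

An empty list means all expected subdirectories are present and the path can be passed to --dataset_dir.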
Models Download
Download the pre-trained models from Zenodo as an archive: models.zip.
Unzip it into the chemic/models directory. The models directory should contain the pre-trained model chemical_image_classifier_resnet50.pth for chemical image classification.
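A quick way to confirm that the archive was unpacked to the right place is to look for the checkpoint file. This is a minimal sketch rather than part of the package: the helper name `find_checkpoint` is hypothetical, and the `chemic/models` path is assumed to be relative to the directory you run it from.

```python
from pathlib import Path

MODEL_FILENAME = "chemical_image_classifier_resnet50.pth"


def find_checkpoint(models_dir):
    """Return the checkpoint path if it exists in models_dir, else None."""
    candidate = Path(models_dir) / MODEL_FILENAME
    return candidate if candidate.is_file() else None


checkpoint = find_checkpoint("chemic/models")
if checkpoint is None:
    print(f"{MODEL_FILENAME} not found; unzip models.zip into chemic/models")
else:
    size_mb = checkpoint.stat().st_size / 1e6
    print(f"Found {checkpoint} ({size_mb:.1f} MB)")
```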
Usage: Web Service for Chemical Image Classification
1. Start the FastAPI Web Server in Production Mode
Run the following command in terminal:
uvicorn chemic.app:app --host 127.0.0.1 --port 5010 --workers 1 --timeout-keep-alive 3600
- --workers 1: Specifies the number of worker processes. Adjust based on your server's capabilities.
- --host 127.0.0.1 --port 5010: Binds the application to the specified address and port. Modify as needed.
- --timeout-keep-alive 3600: Sets how long, in seconds, Uvicorn keeps an idle keep-alive connection open before closing it (the Uvicorn default is 5). Adjust as necessary.
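After launching the server, a script may want to wait until the port accepts connections before sending images. The sketch below only tests TCP reachability with the standard library. It does not call any ChemIC endpoint, and the helper name `wait_for_port` is hypothetical.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=5010, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port(timeout=10)` can guard the first request in a startup script. An open port only shows that a process is listening, so it does not replace the client's own health check.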
2. Use the Frontend Web Interface
In another terminal window, run the following command:
streamlit run chemic_frontendapp.py --server.address=0.0.0.0 --server.port=5009
This command will launch the ChemIC user web interface.
3. Classify Images Using the client.py Module via CLI
python chemic/client.py --image_path /path/to/images --export_dir /path/to/export
OR
python chemic/client.py --image_data <base64_encoded_string> --export_dir /path/to/export
- --image_path is the path to the image file or directory with images for classification.
- --image_data is the base64 encoded image data.
- --export_dir is the export directory for the results.
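The --image_data option expects the image as base64 text. One way to produce that from a file with the standard library is sketched below. The helper name `encode_image` is hypothetical and not part of ChemIC.

```python
import base64
from pathlib import Path


def encode_image(path):
    """Read an image file and return its contents as a base64 ASCII string."""
    raw = Path(path).read_bytes()
    return base64.b64encode(raw).decode("ascii")
```

The returned string can be pasted after --image_data. For a PNG file it begins with `iVBORw0KGgo`, the base64 form of the PNG signature.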
4. Alternatively, Use the Client for Classification in Your Python Code
from chemic.client import ChemClassifierClient
client = ChemClassifierClient(server_url='http://127.0.0.1:5010')
# Check the health of the server
health_status = client.healthcheck().get('status')
print(f"Health Status: {health_status}")
# Use image path or directory. Replace with the actual path to your image file
image_path = '<path to the image file or directory with images for classification>'
recognition_results = client.classify_images(image_path)
# OR use base64-encoded image data. Replace with your base64-encoded image data:
base64_data = b'iVBORw0KGgoAAAANSUhEUgA....'
recognition_results = client.classify_images(image_data=base64_data)
# Recognition results will be returned in the form of a list of dictionaries
print(recognition_results)
[
{
'image_id': 'image_name_1.png',
'predicted_label': 'single chemical structure',
'classifier_package': 'ChemIC-ml_1.3.1',
'classifier_model': 'ResNet_50',
},
{
'image_id': 'image_name_2.png',
'predicted_label': 'multiple chemical structures',
'classifier_package': 'ChemIC-ml_1.3.1',
'classifier_model': 'ResNet_50',
},
...
]
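Because the results are plain dictionaries, they are easy to post-process. The sketch below tallies predictions per class and picks out the images labelled as single structures. The sample data mirrors the output format shown above with illustrative values, and the helper names `count_labels` and `select_by_label` are hypothetical.

```python
from collections import Counter


def count_labels(results):
    """Count how many images were assigned to each predicted label."""
    return Counter(item["predicted_label"] for item in results)


def select_by_label(results, label):
    """Return the image IDs whose predicted label equals `label`."""
    return [item["image_id"] for item in results if item["predicted_label"] == label]


# Sample data in the format returned by classify_images (values are illustrative).
recognition_results = [
    {"image_id": "image_name_1.png", "predicted_label": "single chemical structure"},
    {"image_id": "image_name_2.png", "predicted_label": "multiple chemical structures"},
    {"image_id": "image_name_3.png", "predicted_label": "single chemical structure"},
]

print(count_labels(recognition_results))
print(select_by_label(recognition_results, "single chemical structure"))
```

Filtering like this is useful when only single-structure images should go on to a later step, such as optical chemical structure recognition.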
Jupyter Notebook
The client_image_classifier.ipynb notebook in the notebooks directory provides an easy-to-use interface for classifying images. Follow the steps outlined in the notebook to perform image classification.
Author
Dr. Aleksei Krasnov dr.aleksei.krasnov@gmail.com
Citation
- A. Krasnov, S. Barnabas, T. Böhme, S. Boyer, L. Weber, Comparing software tools for optical chemical structure recognition, Digital Discovery (2024). https://doi.org/10.1039/D3DD00228D
- L. Weber, A. Krasnov, S. Barnabas, T. Böhme, S. Boyer, Comparing Optical Chemical Structure Recognition Tools, ChemRxiv. (2023). https://doi.org/10.26434/chemrxiv-2023-d6kmg-v2
References
- A. Krasnov, Images dataset for Chemical Images Classifier model. https://zenodo.org/records/13378718
- A. Krasnov, Chemical Image Classifier Model. https://zenodo.org/records/10709886
License
This project is licensed under the MIT License; see the LICENSE.md file for details.