A Python library for mosquito detection, segmentation, and classification in images
Project description
CulicidaeLab 🦟
A configuration-driven Python library for advanced mosquito analysis, featuring pre-trained models for detection, segmentation, and species classification.
The culicidaelab Python library provides a robust, extensible framework designed to streamline the mosquito image analysis pipeline. Built on a configuration system, it lets researchers and developers manage datasets, experiment with models, and process images for classification, detection, and segmentation tasks. The culicidaelab library is part of the CulicidaeLab Ecosystem.
CulicidaeLab Ecosystem Architecture
flowchart TD
subgraph L0 [" "]
%% Define layers with subgraphs
subgraph L1 ["Data Layer"]
DS1["🦟 mosquito_dataset_46_3139<br/>Base Diversity Dataset<br/>(46 species, 3139 unique images)<br/>📄 CC-BY-SA-4.0"]
DS2["📊 mosquito-species-<br/>classification-dataset<br/>📄 CC-BY-SA-4.0"]
DS3["🔍 mosquito-species-<br/>detection-dataset<br/>📄 CC-BY-SA-4.0"]
DS4["✂️ mosquito-species-<br/>segmentation-dataset<br/>📄 CC-BY-SA-4.0"]
end
subgraph L2 ["AI Model Layer"]
subgraph M_COLLECTION ["Top-5 Model Collection"]
M4["📊 exp_7_new_bg_simple-subs_1_v_5<br/>pvt_v2_b0.in1k_ep_60<br/>(Classification)<br/>📄 Apache 2.0"]
end
subgraph M_DEFAULT ["Top-1 Models used as default in 'culicidaelab'"]
M1["📊 culico-net-cls-v1<br/>(Classification)<br/>📄 Apache 2.0"]
M2["🔍 culico-net-det-v1<br/>(Detection)<br/>📄 AGPL-3.0"]
M3["✂️ culico-net-segm-v1-nano<br/>(Segmentation)<br/>📄 Apache 2.0"]
end
end
subgraph L3 ["Application Layer"]
APP1["🐍 culicidaelab<br/>Python Library<br/>(Core ML functionality) <br/>📄 AGPL-3.0"]
APP2["🌐 culicidaelab-server<br/>Web Application<br/>(API services)<br/>📄 AGPL-3.0"]
APP3["📸 culicidaelab-mobile<br/>Mobile Application<br/><br/>📄 AGPL-3.0"]
end
subgraph L4 ["API Service Layer"]
S1["🗲 Prediction Service<br/>(ML inference)"]
S2["💾 Observation Service<br/>(Data storage & retrieval)"]
S3["🗺️ Map Service<br/>(Geospatial visualization)"]
S4["🦟 Mosquito Gallery Service<br/>"]
S5["💊 Diseases Gallery Service<br/>"]
end
end
%% Dataset derivation and training flows
DS1 -.->|"derives"| DS2
DS1 -.->|"derives"| DS3
DS1 -.->|"derives"| DS4
DS2 -->|"used for train"| M1
DS3 -->|"used for train"| M2
DS4 -->|"used for train"| M3
DS2 -->|"used for train"| M4
%% Model integration
M1 -->|"integrated into"| APP1
M2 -->|"integrated into"| APP1
M3 -->|"integrated into"| APP1
M4 -->|"integrated into"| APP3
%% Data source for gallery
DS1 -->|"provides photos"| APP2
DS1 -->|"provides photos"| APP3
%% Library to server integration
APP1 -->|"powers"| APP2
%% Service provisioning
APP2 -->|"hosts"| S1
APP2 -->|"hosts"| S2
APP2 -->|"hosts"| S3
APP2 -->|"hosts"| S4
APP2 -->|"hosts"| S5
%% Mobile app service consumption
APP3 <-->|"API calls"| S1
APP3 <-->|"API calls"| S2
APP3 -->|"WebView"| S3
%% Styling
classDef dataLayer fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef modelLayer fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef appLayer fill:#e8f5e8,stroke:#388e3c,stroke-width:2px
classDef serviceLayer fill:#fff3e0,stroke:#f57c00,stroke-width:2px
classDef collections fill:#f4dbf8,stroke:#9b3ac5,stroke-width:1px,stroke-dasharray:5,5
classDef dataset fill:#bbdefb,stroke:#1565c0,stroke-width:2px
classDef model fill:#e1bee7,stroke:#8e24aa,stroke-width:2px
classDef application fill:#c8e6c9,stroke:#43a047,stroke-width:2px
classDef service fill:#ffe0b2,stroke:#fb8c00,stroke-width:2px
class L1 dataLayer
class L2 modelLayer
class L3 appLayer
class L4 serviceLayer
class L5 collections
class DS1,DS2,DS3,DS4 dataset
class M1,M2,M3,M4 model
class APP1,APP2,APP3 application
class S1,S2,S3,S4,S5 service
class M_DEFAULT,M_COLLECTION collections
The open-source system for mosquito research and analysis includes the following components:
- Data:
  - Base diversity dataset (46 species, 3139 images) under CC-BY-SA-4.0 license.
  - Specialized derivatives: classification, detection, and segmentation datasets under CC-BY-SA-4.0 licenses.
- Models:
  - Top-1 models (see reports), used as default by the culicidaelab library: classification (Apache 2.0), detection (AGPL-3.0), segmentation (Apache 2.0).
  - Top-5 classification models collection with accuracy >90% for 17 mosquito species.
- Protocols: All training parameters and metrics available at:
- Applications:
  - Python library (AGPL-3.0) providing core ML functionality
  - Web server (AGPL-3.0) hosting API services
  - Mobile app (AGPL-3.0) for field use with optimized models
These components form a cohesive ecosystem: the datasets are used to train the models that power the applications, the Python library provides core functionality to the web server, and the server exposes services consumed by the mobile application. All components are openly licensed, promoting transparency and collaboration.
This integrated approach enables comprehensive mosquito research, from data collection to analysis and visualization, supporting both scientific research and public health initiatives.
Key Features of the culicidaelab library
- Configuration-Driven Workflow: Manage all settings—from file paths to model parameters—through simple YAML files. Override defaults easily for custom experiments.
- Ready-to-Use Models: Leverage pre-trained models for:
  - Species Classification: Identify mosquito species using a high-accuracy classifier.
  - Mosquito Detection: Localize mosquitoes in images with a YOLO-based detector.
  - Instance Segmentation: Generate precise pixel-level masks with a SAM-based segmenter.
- Unified API: All predictors share a consistent interface (.predict(), .visualize(), .evaluate()) for a predictable user experience.
- Automatic Resource Management: The library intelligently manages local storage, automatically downloading and caching model weights and datasets on first use.
- Extensible Provider System: Seamlessly connect to data sources. A HuggingFaceProvider is built-in, with an easy-to-implement interface for adding more providers.
- Powerful Visualization: Instantly visualize model outputs with built-in, configurable methods for drawing bounding boxes, classification labels, and segmentation masks.
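The configuration-driven workflow can be pictured as user overrides layered on top of defaults. The following is a minimal sketch of that idea using plain dictionaries, as they might look after parsing YAML. It is not the library's implementation: the `deep_merge` helper and every key shown are hypothetical and chosen only for illustration.

```python
from copy import deepcopy
from typing import Any


def deep_merge(defaults: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which `overrides` is layered over `defaults`."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the default.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults, shaped like a parsed YAML file (not the library's real schema).
defaults = {
    "predictors": {"classifier": {"confidence": 0.5, "device": "cpu"}},
    "paths": {"cache_dir": "~/.cache/example"},
}
# A user override touches only the values it wants to change.
overrides = {"predictors": {"classifier": {"device": "cuda"}}}

config = deep_merge(defaults, overrides)
print(config["predictors"]["classifier"])
# {'confidence': 0.5, 'device': 'cuda'}
```

Merging recursively means an experiment file only has to name the settings it changes, while every other default stays in effect.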
Requirements
Hardware Requirements
Processor (CPU): Any modern x86-64 CPU.
Memory (RAM): Minimum 2 GB. 8 GB or more is recommended for processing large datasets or using more complex models.
Graphics Card (GPU): An NVIDIA GPU with CUDA support is highly recommended for a significant performance increase in deep learning model operations, especially detection and segmentation; it is not essential for classification (see performance logs and notebook). For the SAM model, a GPU is virtually essential for acceptable performance. Minimum video memory is 2 GB; 4 GB or more is recommended.
Hard Drive: At least 10 GB of free space to install the library, dependencies, download pre-trained models, and store processed data.
Software Requirements
Operating Systems (tested):
- Windows 10/11
- Linux 22.04+
Software:
- libgl1 package (required on Linux)
- Git
- Python 3.11
- uv 0.8.13
Python packages:
- PyTorch 2.3.1+
- FastAI 2.7.0 - 2.8.0
- Ultralytics 8.3.0+
- HuggingFace Hub 0.16.0+
- Datasets 4.0.0
- Pillow 9.4.0
- Pydantic 2.0.0+
Full list of requirements: requirements.txt
Development requirements: requirements-dev.txt
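A few of the requirements above can be checked from a script before installing anything heavy. The sketch below uses only the standard library, plus PyTorch if it happens to be installed. The `check_environment` helper and its report keys are illustrative names, not part of culicidaelab, and the exact-match test for Python 3.11 simply mirrors the version listed above.

```python
import shutil
import sys
from importlib.util import find_spec


def check_environment(path: str = ".", min_free_gb: float = 10.0) -> dict[str, bool]:
    """Compare the local machine with a few of the listed requirements."""
    report = {
        # The requirements list names Python 3.11.
        "python_3_11": sys.version_info[:2] == (3, 11),
        # At least 10 GB of free disk space is advised.
        "enough_disk": shutil.disk_usage(path).free >= min_free_gb * 1024**3,
        # PyTorch may not be installed yet; find_spec avoids importing it.
        "torch_installed": find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the script also runs without PyTorch

        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


for name, ok in check_environment().items():
    print(f"{name}: {'ok' if ok else 'not satisfied'}")
```

A `cuda_available: not satisfied` line is not an error for classification, which the section above describes as workable without a GPU.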
Installation
For general use in Python scripts or in Google Colab, install culicidaelab with pip:
pip install culicidaelab
To run the example Jupyter notebooks in a local environment:
pip install culicidaelab[examples]
To build the documentation in a local environment:
pip install culicidaelab[docs]
To run the tests in a local environment:
pip install culicidaelab[test]
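After installing, it can be useful to confirm which version ended up in the active environment. The sketch below relies only on the standard library's `importlib.metadata`; the `installed_version` helper is an illustrative name, not something the library provides.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("culicidaelab")
print(f"culicidaelab: {found or 'not installed'}")
```

Reading the package metadata avoids importing the library itself, so the check stays fast and does not trigger any model downloads.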
To get a development environment running:
- Clone the repository:
  git clone https://github.com/iloncka-ds/culicidaelab.git
  cd culicidaelab
- Install dependencies with uv (recommended):
  uv venv -p 3.11
  uv sync -p 3.11
  uv cache clean
  # This installs the library in editable mode and includes all dev tools
  uv pip install -e .[dev]
  Or with pip:
  python -m venv .venv
  # Activate the environment first: source .venv/bin/activate (Linux) or .venv\Scripts\activate (Windows)
  pip install --upgrade pip
  pip install -e .[dev]
  pip cache purge
- Set up pre-commit hooks:
  pre-commit install
  This will run linters and formatters automatically on each commit to ensure code quality and consistency.
Quick Start
Here's how to classify the species of a mosquito in just a few lines of code. The library will automatically download the necessary model on the first run.
from culicidaelab import MosquitoClassifier, get_settings
# 1. Get the central settings object
# This loads all default configurations for the library.
settings = get_settings()
# 2. Instantiate the classifier
# The settings object knows how to configure the classifier.
# load_model=True loads the model up front; it is downloaded on the first run.
classifier = MosquitoClassifier(settings, load_model=True)
# 3. Make a prediction
# Without load_model=True, the model is lazy-loaded (downloaded and loaded into memory) here.
predictions = classifier.predict("path/to/your/image.jpg")
# 4. Print the results
# The output is a list of (species_name, confidence_score) tuples.
print("Top 3 Predictions:")
for species, confidence in predictions[:3]:
    print(f"- {species}: {confidence:.4f}")
# Example Output:
# Top 3 Predictions:
# - Aedes aegypti: 0.9876
# - Aedes albopictus: 0.0112
# - Culex quinquefasciatus: 0.0009
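Since the Quick Start describes the result as a list of (species_name, confidence_score) tuples, a common next step is to keep only confident predictions. The sketch below assumes exactly that tuple format and uses the example values above as stand-in data. The `confident_predictions` helper and the 0.5 threshold are illustrative choices, not library defaults.

```python
Prediction = tuple[str, float]


def confident_predictions(
    predictions: list[Prediction], threshold: float = 0.5, top_k: int = 3
) -> list[Prediction]:
    """Keep at most `top_k` predictions whose confidence is >= `threshold`."""
    ranked = sorted(predictions, key=lambda item: item[1], reverse=True)
    return [(species, score) for species, score in ranked[:top_k] if score >= threshold]


# Values copied from the example output, standing in for classifier.predict(...).
sample = [
    ("Aedes albopictus", 0.0112),
    ("Aedes aegypti", 0.9876),
    ("Culex quinquefasciatus", 0.0009),
]

accepted = confident_predictions(sample, threshold=0.5)
if accepted:
    species, score = accepted[0]
    print(f"Accepted: {species} ({score:.2%})")
else:
    print("No prediction passed the threshold; flag the image for manual review.")
# Accepted: Aedes aegypti (98.76%)
```

Sorting before filtering keeps the helper correct even if the input list is not already ordered by confidence.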
Practical Applications
CulicidaeLab is more than just a set of models; it's a powerful engine for building real-world solutions. Here are some of the ways it can be applied:
- Automation in Scientific Laboratories:
  - Bulk Data Processing: Automatically analyze thousands of images from camera traps or microscopes to assess mosquito populations without manual intervention.
  - Reproducible Research: Standardize the data analysis process, allowing other scientists to easily reproduce and verify research results published using the library.
- Integration into Governmental and Commercial Systems:
  - Epidemiological Surveillance: Use the library as the core "engine" for national or regional monitoring systems to track vector-borne disease risks.
  - Custom Solution Development: Rapidly prototype and create specialized software products for pest control services, agro-industrial companies, or environmental organizations.
- Advanced Analytics and Data Science:
  - Geospatial Analysis: Write scripts to build disease vector distribution maps by processing geotagged images.
  - Predictive Modeling: Use the library's outputs as features for larger models that forecast disease outbreaks based on vector presence and density.
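The bulk data processing scenario can be sketched as a loop over a folder of images that writes one CSV row per image. To keep the sketch runnable without any model, the predictor is passed in as a plain callable. `classify_folder`, `fake_predict`, and the CSV column names are hypothetical, and the callable is assumed to take an image path and return (species, confidence) tuples as in the Quick Start.

```python
import csv
import tempfile
from pathlib import Path
from typing import Callable

Prediction = tuple[str, float]
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def classify_folder(
    image_dir: Path,
    output_csv: Path,
    predict_fn: Callable[[str], list[Prediction]],
) -> int:
    """Write the top prediction for each image in `image_dir`; return rows written."""
    images = sorted(p for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
    rows = 0
    with output_csv.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["image", "species", "confidence"])
        for image_path in images:
            predictions = predict_fn(str(image_path))
            if not predictions:
                continue  # nothing returned for this image; skip the row
            species, confidence = max(predictions, key=lambda item: item[1])
            writer.writerow([image_path.name, species, f"{confidence:.4f}"])
            rows += 1
    return rows


def fake_predict(image_path: str) -> list[Prediction]:
    """Stand-in predictor so the sketch runs without downloading a model."""
    return [("Aedes aegypti", 0.9876), ("Aedes albopictus", 0.0112)]


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "trap_001.jpg").write_bytes(b"")  # empty placeholder files
    (folder / "trap_002.png").write_bytes(b"")
    rows_written = classify_folder(folder, folder / "results.csv", fake_predict)
    report = (folder / "results.csv").read_text(encoding="utf-8")

print(rows_written)
print(report)
```

If the Quick Start classifier behaves as described there, passing `classifier.predict` as `predict_fn` should fit this shape, though that pairing is an assumption rather than something tested here.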
Documentation
For complete guides, tutorials, and the full API reference, visit the documentation site.
The documentation includes:
- In-depth installation and configuration guides.
- Detailed tutorials for each predictor.
- Architectural deep-dives for contributors.
- A full, auto-generated API reference.
Contributing
Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
Please see our Contributing Guide for details on our code of conduct, development setup, and the pull request process.
Acknowledgments
CulicidaeLab development is supported by a grant from the Foundation for Assistance to Small Innovative Enterprises (FASIE).
License
This project is distributed under the AGPL-3.0 License. See the LICENSE file for more information.
Citation
If you use CulicidaeLab in your research, please cite it as follows:
@software{culicidaelab2024,
author = {Ilona Kovaleva},
title = {{CulicidaeLab: A Configuration-Driven Python Library for Mosquito Analysis}},
year = {2024},
publisher = {GitHub},
journal = {GitHub repository},
url = {https://github.com/iloncka-ds/culicidaelab}
}
Contact
- Issues: Please use the GitHub issue tracker.
- Email: iloncka.ds@gmail.com
Download files
File details
Details for the file culicidaelab-0.2.2.tar.gz.
File metadata
- Download URL: culicidaelab-0.2.2.tar.gz
- Upload date:
- Size: 69.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8ccdb3c57257e8686f665eb3ffd74eab0dac0303db529e2a9c0873b713354425 |
| MD5 | fc0eb523de79d99ccab8f5a294f632ad |
| BLAKE2b-256 | b9894390a59ee69ec2facf4d02c0c7d080eced528fc7fd4a59dabfe32f110104 |
File details
Details for the file culicidaelab-0.2.2-py3-none-any.whl.
File metadata
- Download URL: culicidaelab-0.2.2-py3-none-any.whl
- Upload date:
- Size: 77.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c5e59d2aab3686ebb9e40ab07419cc1d4f32b2f7beb55bf162bc2b383ed3ecd |
| MD5 | ff0d4fcd39fea097b71208190f0bda8d |
| BLAKE2b-256 | 01e6703d25d6fc452a9029eadcace7e911d7a4b58144598cd13db5d9dcbde683 |