Flocculation kinetic modelling and retention time simulation toolkit
Floclib
Floclib is a Python toolkit for analyzing flocculation image feature data.
It derives flocculation kinetics from the Aggregate Size Distribution (ASD) by computing the Power Law Slope (Beta), fits the aggregation and breakage coefficients (Ka, Kb) using Swarm Intelligence (SI) followed by nonlinear least squares (NLS), and simulates the Total Hydraulic Retention Time (THRT) for a range of treatment efficiencies across Completely Stirred Tank Reactors (CSTRs) in series (chambers in series). Floclib is designed for reproducible, offline use with feature tables exported from segmentation tools.
Key features
- Two ASD methods: legacy delta (dN = previous − current) and standard density (counts / bin_width).
- Robust fitting: PSO global search (configurable grid) with optional Huber loss, followed by Levenberg–Marquardt refinement (scipy.optimize.curve_fit).
- Retention time solvers: Secant and Newton–Raphson methods for simulating THRT for a multi-compartment CSTR system.
- Feature-first workflow: accepts CSV / Parquet / NumPy feature tables from an upstream floc image segmentation step (a version that includes direct image segmentation is planned).
- CLI + Python API: scriptable and interactive usage.
Installation
Note: Do not pip install into the base/system Python. Create a virtual environment first, using either "conda env create -f environment.yml" or "python -m venv .venv", and then install floclib.
Install runtime dependencies (Linux / macOS):
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip setuptools wheel
pip install -r requirements_ranges.txt
pip install floclib
Windows (PowerShell)
python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip setuptools wheel
pip install -r requirements_ranges.txt
pip install floclib
Using Conda (recommended; binary-safe): Linux / macOS / Windows (Anaconda/Miniconda)
# from repo root (where environment.yml is)
conda env create -f environment.yml
conda activate floclib
# install floclib into the activated environment
pip install floclib
Quick verification (after installation)
Run these to confirm that the core imports work and that the CLI shows its help:
# basic import checks
python -c "import sys; from floclib.asd import compute_beta; print('ASD OK'); from floclib.fit import fit_ka_kb; print('FIT OK')"
# CLI help
python -m floclib.cli --help
If these succeed, the install is good.
Input data format
Minimum required columns in feature table (rows = detected particles):
- Folder: grouping key (one folder per G or Tf).
- longest_length: particle size measure (units must be consistent across the dataset).
Optional useful columns: Particle_num, Area_px, Equivalent_diameter_px, Perimeter_px, Major_axis_length_px, Minor_axis_length_px, Threshold_val, Timestamp.
Supported filetypes: .csv, .parquet, .feather, .npy, .npz.
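As an illustration of the minimum table described above, the sketch below builds a tiny feature table with pandas and checks for the two required columns. The helper name check_required_columns and all sample values are hypothetical; they are not part of floclib.

```python
import pandas as pd

REQUIRED_COLUMNS = ("Folder", "longest_length")  # the two required columns


def check_required_columns(features: pd.DataFrame) -> None:
    """Hypothetical helper: raise if a required column is missing."""
    missing = [c for c in REQUIRED_COLUMNS if c not in features.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")


# One row per detected particle; Folder groups the rows (one folder per G or Tf).
features = pd.DataFrame(
    {
        "Folder": ["5", "5", "10", "10"],             # made-up group labels
        "longest_length": [0.12, 0.35, 0.20, 0.80],   # made-up sizes, one unit throughout
    }
)
check_required_columns(features)  # returns None when both columns are present
```

A table like this can then be written to any of the supported file types before being passed to the toolkit.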
Python API reference
Import:
from floclib.io import load_features, build_beta, save_results
from floclib.asd import compute_beta
from floclib.fit import fit_ka_kb
from floclib.cstr import simulate_retention_times
compute_beta(...)
Calculate Beta per folder/group.
Signature (key args):
compute_beta(
features: pd.DataFrame,
*,
size_col: str = "longest_length",
folder_col: str = "Folder",
method: str = "delta", # "delta" or "density"
bins: Optional[Sequence[float]] = None,
min_size: Optional[float] = None,
max_size: Optional[float] = None,
interval: Optional[float] = None,
midpoint_type: str = "geom", # "geom" or "mid"
min_points_for_fit: int = 3,
include_lowest: bool = True,
verbose: bool = False
) -> pd.DataFrame
Notes:
- method="delta" reproduces the legacy dN = prev − current and fits log(dN/dp) vs log(size).
- method="density" fits log(counts/dp) vs log(size).
- Provide either bins or (min_size, max_size, interval).
- Returns a DataFrame with Tf, Beta, Intercept, n_points, r2.
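To make the density fit concrete, here is a minimal NumPy sketch of a log-log regression of counts / bin_width against geometric bin midpoints. It is not floclib's implementation: the function name density_slope, the use of natural logarithms, and the handling of empty bins are assumptions, and the sign convention floclib uses when reporting Beta is not stated here, so the raw slope is returned as-is.

```python
import numpy as np


def density_slope(sizes, edges):
    """Fit log(counts / bin_width) against log(geometric bin midpoint)."""
    sizes = np.asarray(sizes, dtype=float)
    edges = np.asarray(edges, dtype=float)
    counts, _ = np.histogram(sizes, bins=edges)
    widths = np.diff(edges)
    mids = np.sqrt(edges[:-1] * edges[1:])  # geometric midpoints ("geom")
    keep = counts > 0                       # log is undefined for empty bins
    x = np.log(mids[keep])
    y = np.log(counts[keep] / widths[keep])
    slope, intercept = np.polyfit(x, y, 1)  # straight-line fit in log-log space
    return slope, intercept


# Synthetic sizes chosen so that counts / width falls off as size**-2.
sizes = np.repeat([1.5, 3.0, 6.0, 12.0], [32, 16, 8, 4])
edges = [1.0, 2.0, 4.0, 8.0, 16.0]
slope, intercept = density_slope(sizes, edges)
print(round(slope, 6))
```

With these synthetic counts the density in each bin is exactly 64 / midpoint², so the fitted slope is -2 up to floating-point error. The legacy delta method is not sketched here because its exact differencing scheme is defined by the library.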
fit_ka_kb(...)
Fit Ka and Kb using PSO + NLS.
Signature (key args):
fit_ka_kb(
Tf: np.ndarray,
Bo_B_obs: np.ndarray,
Gf: float,
*,
lb: Tuple[float,float] = (1e-13, 1e-13),
ub: Tuple[float,float] = (1e-3, 1e-3),
param_grid: Optional[dict] = None,
pso_iters: int = 100,
loss_for_pso: str = "huber", # "huber" or "mse"
huber_delta: float = 0.01,
run_grid_search: bool = True,
verbose: bool = False,
plot: bool = True,
plot_title: Optional[str] = None
) -> Dict[str, Any]
Behavior:
- Default replicates the legacy workflow: PSO hyperparameter grid (w, c1, c2, swarm sizes) + Huber loss → select best PSO result → curve_fit refinement.
- Returns Ka_pso_init, Kb_pso_init, pso_best_score, pso_best_opts, Ka_fit, Kb_fit, Bo_B_fit, pcov.
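For readers unfamiliar with the PSO hyperparameters named above, the following is a minimal global-best particle swarm sketch in NumPy. It is a generic illustration, not floclib's optimizer: the function pso_minimize, its defaults, and the toy objective are all assumptions. Here w is the inertia weight, c1 and c2 weight the pull toward each particle's own best and the swarm's best position, and n_particles is the swarm size.

```python
import numpy as np


def pso_minimize(objective, lb, ub, *, n_particles=30, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO; returns (best_position, best_score)."""
    rng = np.random.default_rng(seed)  # fixed seed makes the run repeatable
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    pos = rng.uniform(lb, ub, size=(n_particles, lb.size))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_score = np.array([objective(p) for p in pos])
    g = pbest[np.argmin(pbest_score)].copy()
    for _ in range(iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # inertia + pull toward personal best + pull toward swarm best
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lb, ub)  # keep particles inside the bounds
        score = np.array([objective(p) for p in pos])
        improved = score < pbest_score
        pbest[improved] = pos[improved]
        pbest_score[improved] = score[improved]
        g = pbest[np.argmin(pbest_score)].copy()
    return g, float(pbest_score.min())


# Toy objective with a known minimum at (1, -2); it stands in for a fitting loss.
def toy_loss(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, best_score = pso_minimize(toy_loss, lb=(-5, -5), ub=(5, 5))
```

A hyperparameter grid search, in this picture, would call such a routine once per (w, c1, c2, n_particles) combination and keep the result with the lowest score as the starting guess for the least-squares refinement.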
Tuning tips:
- run_grid_search=True gives more robust PSO starting guesses (slower).
- loss_for_pso="huber" is robust to outliers; huber_delta controls sensitivity.
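The effect of huber_delta can be seen in a short sketch of the standard Huber loss next to the mean squared error. This uses the textbook definition (quadratic for residuals up to delta, linear beyond it); whether floclib scales or averages the loss in exactly this way is an assumption.

```python
import numpy as np


def huber_loss(residuals, delta=0.01):
    """Mean Huber loss: quadratic for |r| <= delta, linear beyond."""
    r = np.abs(np.asarray(residuals, dtype=float))
    quadratic = 0.5 * r ** 2
    linear = delta * (r - 0.5 * delta)
    return float(np.mean(np.where(r <= delta, quadratic, linear)))


def mse_loss(residuals):
    """Mean squared error, for comparison."""
    r = np.asarray(residuals, dtype=float)
    return float(np.mean(r ** 2))


residuals = np.array([0.005, -0.005, 0.1])  # the last value plays the outlier
huber_value = huber_loss(residuals, delta=0.01)
mse_value = mse_loss(residuals)
```

Under MSE the outlier contributes its square, whereas under Huber it contributes only linearly once it exceeds delta. A smaller huber_delta therefore treats more residuals as outliers.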
simulate_retention_times(...)
Simulate retention times T for specified R values.
Signature:
simulate_retention_times(
Gf_val: float,
Ka_fitted: float,
Kb_fitted: float,
R_values: Sequence[float] = (2,3,10),
m: int = 5,
T0: float = 50.0,
T1: float = 100.0
) -> pd.DataFrame
Behavior:
- Repeats the provided scalars to build arrays for m identical compartments, and solves for each R value (the reciprocal of efficiency).
- Uses Secant and Newton–Raphson methods to find THRT, solving the reactor product equation.
- Returns a DataFrame with Date, R, m, Gf, Ka, Kb, Newton_T, Newton_T_min, Secant_T, Secant_T_min.
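The two root-finding methods can be sketched generically as follows. The equation solved here is a placeholder with a single lumped rate constant k, chosen because it has a closed-form root to compare against; it is an assumption for illustration and not floclib's reactor product equation. The starting guesses 50 and 100 echo the T0 and T1 defaults in the signature.

```python
def newton(f, dfdx, x0, tol=1e-9, max_iter=100):
    """Newton-Raphson: follow the tangent line to its zero, repeat."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / dfdx(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")


def secant(f, x0, x1, tol=1e-9, max_iter=100):
    """Secant: like Newton, but the slope comes from the last two points."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < tol:
            return x2
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    raise RuntimeError("secant method did not converge")


# Placeholder equation (an assumption, not floclib's reactor equation):
# m equal stirred tanks with one lumped rate k and a target ratio R.
k, m, R = 0.01, 5, 10.0
f = lambda T: (1.0 + k * T / m) ** m - R
dfdT = lambda T: k * (1.0 + k * T / m) ** (m - 1)

T_newton = newton(f, dfdT, x0=50.0)
T_secant = secant(f, x0=50.0, x1=100.0)
T_exact = m * (R ** (1.0 / m) - 1.0) / k  # closed form for this placeholder
```

Newton–Raphson needs the derivative but usually converges in fewer iterations; the secant method needs two starting guesses and no derivative. Reporting both, as the returned table does, gives a simple cross-check.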
IO helpers
- load_features(path): loads CSV / Parquet / NumPy arrays into a DataFrame.
- build_beta(beta_df, tf_col="Tf", beta_col="Beta", time_multiplier=60): constructs Tf_arr and Bo_B_obs used for fitting.
- save_results(obj, out_path): saves a DataFrame/dict to JSON / CSV / Parquet as appropriate.
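Putting the documented functions together, an end-to-end run from Python might look like the sketch below. It uses only the names and keyword arguments listed in this reference, with the bin settings taken from the CLI example. Two points are assumptions: that build_beta returns the pair (Tf_arr, Bo_B_obs), and the wrapper name run_pipeline, which is hypothetical.

```python
def run_pipeline(path, gf, out_path):
    """Feature table -> Beta -> (Ka, Kb) fit -> retention times."""
    # Imports are local so the function can be defined without floclib installed.
    from floclib.io import load_features, build_beta, save_results
    from floclib.asd import compute_beta
    from floclib.fit import fit_ka_kb
    from floclib.cstr import simulate_retention_times

    features = load_features(path)
    beta_df = compute_beta(
        features, method="delta", min_size=0.02, max_size=2.375, interval=0.10
    )
    # Assumption: build_beta returns the pair (Tf_arr, Bo_B_obs).
    tf_arr, bo_b_obs = build_beta(
        beta_df, tf_col="Tf", beta_col="Beta", time_multiplier=60
    )
    fit = fit_ka_kb(tf_arr, bo_b_obs, gf, loss_for_pso="huber", plot=False)
    cstr_df = simulate_retention_times(
        gf, fit["Ka_fit"], fit["Kb_fit"], R_values=(2, 3, 10), m=5
    )
    save_results(cstr_df, out_path)
    return fit, cstr_df


# Example call (the output file name is hypothetical):
# fit, cstr_df = run_pipeline("examples/testing.csv", 18, "run_cstr.parquet")
```

If build_beta returns a different structure in your installed version, adjust the unpacking line accordingly.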
CLI usage (example)
Run end-to-end (features → Beta → fit → simulate). Activate the environment first. The single-line form works on Windows, macOS, and Linux:
python -m floclib.cli -i examples/testing.csv --Gf 18 --method delta --min-size 0.02 --max-size 2.375 --interval 0.10 --loss huber --pso-grid --pso-iters 100 --out run_results.json
Optional multi-line form (Linux / macOS; bash, zsh):
python -m floclib.cli \
-i examples/testing.csv \
--Gf 18 \
--method delta \
--min-size 0.02 \
--max-size 2.375 \
--interval 0.10 \
--loss huber \
--pso-grid \
--pso-iters 100 \
--out run_results.json
Key CLI options:
- -i, --input: feature file path (csv/parquet/npy)
- --Gf: shear velocity (scalar)
- --method: ASD method (delta or density)
- --bins or (--min-size, --max-size, --interval): bin specification
- --loss: huber or mse for PSO objective
- --pso-grid: toggle PSO hyperparameter grid search
- --pso-iters: iterations per PSO run
- --plot: show observed vs fitted curve
Outputs: JSON summary and companion Parquet files: <out>_beta.parquet, <out>_cstr.parquet.
Output artifacts
- <out>.json: summary (fit metadata, Beta table, simulation results).
- <out>_beta.parquet: Beta table with Time and Bo_B.
- <out>_cstr.parquet: retention time results.
Notes & recommendations
- Units consistency: ensure particle-size units and bin edges use the same unit (mm or µm). Unit changes and bin size/intervals alter fitted slopes.
- ASD method selection: use delta to reproduce legacy behaviour; density is the standard alternative.
- PSO performance: grid search improves robustness but increases runtime. Adjust pso_iters and swarm sizes for faster iteration during heavy simulation.
- Reproducibility: PSO is stochastic. Add a seed option (if deterministic results are required) before large-scale production runs.
- Error handling: input validation checks for required columns; ensure the Folder column is present and holds the grouping key that corresponds to each Tf.
Contributing & license
Contributions are welcome. Include tests for algorithmic changes.
License: MIT. Copyright (c) 2025 Bankoleabayomi.
Citation
@article{bankole_novel_2025,
title = {A novel open-source framework for automatic flocculation kinetics and retention time modelling using image analysis and swarm intelligence},
volume = {74},
rights = {All rights reserved},
issn = {2214-7144},
url = {https://www.sciencedirect.com/science/article/pii/S2214714425009432},
doi = {10.1016/j.jwpe.2025.107871},
pages = {107871},
journaltitle = {Journal of Water Process Engineering},
author = {Bankole, Abayomi O. and Moruzzi, Rodrigo and Negri, Rogério G. and Campos, Luiza C.},
urldate = {2025-05-05},
date = {2025-05-01},
}
Contact
For questions, issues, or feature requests, open an issue in the project repository with a reproducible example and expected vs. actual behavior.
Download files
File details
Details for the file floclib-0.1.3.tar.gz.
File metadata
- Download URL: floclib-0.1.3.tar.gz
- Upload date:
- Size: 19.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ab088964ebf7239b940f7012c92b4a0420706c69948769f322b23fed8dd70176 |
| MD5 | 78f72a827e858a60ab0ec4ca22dda17e |
| BLAKE2b-256 | 63f1231160e25dc3cb0a66ec7e71956a22676685f77e28bb389472ac78513a00 |
File details
Details for the file floclib-0.1.3-py3-none-any.whl.
File metadata
- Download URL: floclib-0.1.3-py3-none-any.whl
- Upload date:
- Size: 18.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 128204887a1672b1a248b8f273c99734362a97918762e9be947622b7b1c488e0 |
| MD5 | 503c6df2fb080cc140ede15117de4614 |
| BLAKE2b-256 | 128e61175fa13bb05866a3b31e0ddd505950dfc8c074653902e7893e860ee1ff |