Utility functions for proteomics data analysis

Reason this release was yanked:

Broken annotation downloads

Project description

Dousatsu

Dousatsu is a Python library for the analysis of quantitative mass spectrometry-based proteomics data. It provides a set of tools for feature preprocessing, analysis, selection, and visualization, enabling a comprehensive workflow from raw data to biological insights.

The library is designed to be modular and easy to use, with a focus on integrating with the scientific Python ecosystem, including pandas, numpy, scikit-learn, and statsmodels.

Core Modules

Dousatsu is organized into four main modules, each addressing a specific step in the proteomics data analysis pipeline:

`feature_preprocessing`

This module provides a suite of tools for cleaning, normalizing, and transforming raw proteomics data into an analysis-ready format. Key functionalities include:

Data Loading: Functions to load data from common proteomics software outputs such as TRIC, DIA-NN, and Spectronaut.
Data Cleaning: Transformers to remove contaminants, non-proteotypic peptides, and low-quality data based on intensity and q-value cutoffs.
Data Formatting: Tools to reshape data from wide to long format and to standardize column names.
Normalization: Methods for median and quantile normalization to correct for systematic variations between samples.
Missing Value Imputation: Strategies to handle missing values, a common issue in proteomics data.

The preprocessing steps are implemented as scikit-learn compatible transformers, allowing them to be chained together in a Pipeline.
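As an illustration of how scikit-learn compatible transformers chain together, here is a minimal sketch of a preprocessing pipeline. The class names (`Log2Transformer`, `MedianNormalizer`, `MinImputer`) and the wide samples-by-peptides layout are assumptions made for this example; they are not Dousatsu's actual classes, whose names and signatures are not given in this description.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class Log2Transformer(TransformerMixin, BaseEstimator):
    """Hypothetical step: log2-transform raw intensities."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.log2(X)


class MedianNormalizer(TransformerMixin, BaseEstimator):
    """Hypothetical step: shift each sample (row) to a common median."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        sample_medians = X.median(axis=1)  # NaNs are skipped by default
        target = sample_medians.mean()
        return X.sub(sample_medians - target, axis=0)


class MinImputer(TransformerMixin, BaseEstimator):
    """Hypothetical step: fill gaps with each peptide's minimum observed value."""

    def fit(self, X, y=None):
        self.fill_values_ = X.min(axis=0)
        return self

    def transform(self, X):
        return X.fillna(self.fill_values_)


# Toy data: rows are samples, columns are peptides, NaN marks a missing value.
raw = pd.DataFrame(
    {
        "PEPTIDE_A": [1024.0, 2048.0, 4096.0],
        "PEPTIDE_B": [512.0, np.nan, 2048.0],
        "PEPTIDE_C": [256.0, 512.0, 1024.0],
    },
    index=["sample_1", "sample_2", "sample_3"],
)

pipeline = Pipeline(
    steps=[
        ("log2", Log2Transformer()),
        ("normalize", MedianNormalizer()),
        ("impute", MinImputer()),
    ]
)
processed = pipeline.fit_transform(raw)
```

In this toy example every sample ends up with the same median log2 intensity, and the single missing value is replaced by the lowest normalized value observed for that peptide. The order of the steps (normalize before imputing) is a design choice of the sketch, not a statement about the library's defaults.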

`feature_analysis`

Once the data is preprocessed, this module offers functions for statistical analysis to identify differentially abundant proteins or peptides. Features include:

Statistical Tests: Implementation of two-sample t-tests with corrections for multiple testing (e.g., Benjamini-Hochberg).
Fold Change Calculation: Functions to calculate log2 fold changes between different conditions.
Correlation Analysis: Tools to assess the correlation between technical replicates.
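The statistical steps listed above can be sketched with SciPy and NumPy. This is an assumption-laden illustration rather than Dousatsu's implementation: the function names `benjamini_hochberg` and `differential_abundance` are hypothetical, Welch's unequal-variance t-test is chosen here for the example, and the input is assumed to be a complete (already imputed) samples-by-proteins table of log2 intensities.

```python
import numpy as np
import pandas as pd
from scipy import stats


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity starting from the largest p-value, then cap at 1.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(adjusted_sorted, 0.0, 1.0)
    return adjusted


def differential_abundance(log2_intensities, groups, case, control):
    """Two-sample t-test and log2 fold change per protein (column)."""
    case_values = log2_intensities.loc[groups == case]
    control_values = log2_intensities.loc[groups == control]
    _, p_value = stats.ttest_ind(
        case_values, control_values, axis=0, equal_var=False
    )
    return pd.DataFrame(
        {
            # On the log2 scale, the fold change is a difference of means.
            "log2_fold_change": case_values.mean() - control_values.mean(),
            "p_value": p_value,
            "adjusted_p_value": benjamini_hochberg(p_value),
        },
        index=log2_intensities.columns,
    )


# Simulated data: 5 control and 5 treated samples, 20 proteins.
rng = np.random.default_rng(0)
samples = [f"sample_{i}" for i in range(10)]
proteins = [f"protein_{j}" for j in range(20)]
data = pd.DataFrame(rng.normal(20.0, 0.5, size=(10, 20)), index=samples, columns=proteins)
groups = pd.Series(["control"] * 5 + ["treated"] * 5, index=samples)
data.loc[groups == "treated", "protein_0"] += 3.0  # simulated true effect

result = differential_abundance(data, groups, case="treated", control="control")
```

If statsmodels is available, `statsmodels.stats.multitest.multipletests(p_values, method="fdr_bh")` provides the same correction without the hand-written helper.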

`feature_selection`

This module helps in identifying the most informative features (peptides or proteins) for building predictive models or for biomarker discovery. It includes:

Recursive Feature Elimination (RFE): A cross-validated RFE implementation to select the most stable and predictive features.
Visualization: Functions to visualize the results of the feature selection process.
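To give a sense of what cross-validated RFE does, the following sketch uses scikit-learn's own `RFECV` on synthetic data. It does not call Dousatsu's wrapper, whose interface is not described here; the estimator, the fold count, and the scoring metric are all assumptions picked for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for a samples-by-features intensity matrix with class labels.
X, y = make_classification(
    n_samples=60,
    n_features=15,
    n_informative=4,
    n_redundant=0,
    random_state=0,
)

selector = RFECV(
    estimator=LogisticRegression(max_iter=1000),
    step=1,  # drop one feature per elimination round
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="accuracy",
    min_features_to_select=1,
)
selector.fit(X, y)

# Column indices of the features kept at the best cross-validated subset size.
selected_features = selector.get_support(indices=True)
```

After fitting, `selector.n_features_` holds the chosen subset size and `selector.ranking_` assigns rank 1 to every selected feature, which is a convenient starting point for plotting the selection results.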

`feature_visualization`

This module provides visualization functions for exploring the data and presenting the results of the analysis:

Dimensionality Reduction: PCA plots to visualize sample clustering and identify batch effects.
Differential Abundance: Volcano plots to visualize the results of statistical tests.
Heatmaps: Clustermaps to visualize the expression patterns of proteins or peptides across samples.
Data Quality: Plots for visualizing intensity distributions and missingness.
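As a rough illustration of the PCA plot described above, the sketch below projects simulated samples onto two principal components and colors them by batch. It uses scikit-learn and matplotlib directly rather than Dousatsu's plotting functions, and the simulated batch shift and variable names are assumptions for the example. PCA requires a matrix without missing values, so real data would need imputation first.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import matplotlib.pyplot as plt
import numpy as np
from sklearn.decomposition import PCA

# Simulated log2 intensities: 12 samples by 50 proteins, two batches.
rng = np.random.default_rng(0)
intensities = rng.normal(20.0, 1.0, size=(12, 50))
intensities[6:] += 1.5  # simulated systematic shift in the second batch
batches = np.array(["batch_1"] * 6 + ["batch_2"] * 6)

pca = PCA(n_components=2)
scores = pca.fit_transform(intensities)

fig, ax = plt.subplots()
for batch in np.unique(batches):
    mask = batches == batch
    ax.scatter(scores[mask, 0], scores[mask, 1], label=batch)
ax.set_xlabel(f"PC1 ({pca.explained_variance_ratio_[0]:.0%} of variance)")
ax.set_ylabel(f"PC2 ({pca.explained_variance_ratio_[1]:.0%} of variance)")
ax.set_title("Samples in PCA space")
ax.legend()
```

Samples that separate along the first component by batch rather than by biological condition are a typical sign of a batch effect. Calling `fig.savefig(...)` with a file name of your choice writes the figure to disk.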

Development Environment

This repository is set up for development inside a Docker container to ensure a consistent and reproducible environment.

Requirements

Docker

How to use

Initial setup

Clone the repository.
Build and start the development container:
```
./start_dev.sh
```
The first time you start the container, install the pre-commit hooks:
```
pre-commit install
```

Developing

The project directory is mounted inside the container at /App, so you can edit the files on your host machine with your favorite editor.
Run all git commands from within the container.
Install the package in editable mode to test your changes:
```
pip install -e .
```
To stop the container, run:
```
./stop_dev.sh
```

This will also remove the container, so you can start fresh the next time.

Project details

Release history

0.2.1

Apr 10, 2026

This version

0.2.0 yanked

Apr 10, 2026

Reason this release was yanked:

Broken annotation downloads

0.1.3

Dec 4, 2025

0.1.2 yanked

Nov 19, 2025

Reason this release was yanked:

missing deps

Download files

Source Distribution

dousatsu-0.2.0.tar.gz (69.1 kB view details)

Uploaded Apr 10, 2026 Source

Built Distribution

dousatsu-0.2.0-py3-none-any.whl (81.0 kB view details)

Uploaded Apr 10, 2026 Python 3

File details

Details for the file dousatsu-0.2.0.tar.gz.

File metadata

Download URL: dousatsu-0.2.0.tar.gz
Upload date: Apr 10, 2026
Size: 69.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for dousatsu-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`c389b035f28b7894317169ee372533bbc785a80bd869b8c7447278b3ec2ba46d`
MD5	`d7df3a8626a12f88698ba08923c12f76`
BLAKE2b-256	`6577201e25374a6694576ba74816cac907d4e0c767159184bbabd9295cb95e21`

File details

Details for the file dousatsu-0.2.0-py3-none-any.whl.

File metadata

Download URL: dousatsu-0.2.0-py3-none-any.whl
Upload date: Apr 10, 2026
Size: 81.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for dousatsu-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`86d0f8c82009d7d704093a2ac47c3cae5b44fc199acb3e0a4512d2e12046dbc3`
MD5	`e4f4cedb20935a9e8140e8f0f3dcaf5d`
BLAKE2b-256	`85882bbb9935a0850f9a9a69692abb1040e943fb654f983f361be6bd5fd7cf0f`
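A downloaded file can be checked against the SHA256 digests listed above with Python's standard library. The helper names below are hypothetical and the sketch assumes the file has already been downloaded to a local path.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call, assuming the source distribution sits in the working directory:
# matches_published_digest(
#     "dousatsu-0.2.0.tar.gz",
#     "c389b035f28b7894317169ee372533bbc785a80bd869b8c7447278b3ec2ba46d",
# )
```

A `False` result means the file differs from the one that was uploaded and should not be installed.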
