Sequence analysis and bioinformatics utilities in Python and Perl.
A hybrid Python + Perl bioinformatics toolkit for sequence alignment, Markov modeling, sequence analysis, and FM-index–based search — unified under a single Python CLI and API.
Documentation
- 📘 Live Docs (GitHub Pages): https://bibymaths.github.io/bio-sea-pearl/
- 📂 Local Docs: located in docs/
Quick Start
1. Clone
git clone https://github.com/bibymaths/bio-sea-pearl.git
cd bio-sea-pearl
2. Install Dependencies
This project uses Python ≥ 3.10 and is built with Hatch.
With uv (recommended)
uv pip install -e ".[dev]"
With pip
pip install -e ".[dev]"
Note: Some features (alignment, Markov generation) delegate to Perl scripts at runtime. Install Perl ≥ 5.26 if you need those features.
3. Run the CLI
The unified CLI entrypoint is:
biosea --help
CLI Usage
Alignment
biosea align seq1.fa seq2.fa \
--matrix alignment/scoring/blosum62.mat \
--mode global
⚠️ Notes:
- Matrix filenames are case-sensitive.
- Use blosum62.mat, not BLOSUM62.mat.
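The case-sensitivity note above can be guarded against in a calling script. The following is a minimal sketch, not part of the toolkit: `resolve_matrix` is a hypothetical helper that looks in the matrix directory for a file whose name matches the requested one when case is ignored.

```python
from pathlib import Path


def resolve_matrix(path: str) -> Path:
    """Return the on-disk matrix path matching `path`, ignoring filename case.

    Hypothetical helper for illustration; it is not provided by biosea.
    """
    wanted = Path(path)
    parent = wanted.parent
    if not parent.is_dir():
        raise FileNotFoundError(f"Matrix directory not found: {parent}")

    # Collect the real filenames as stored on disk.
    names = [p.name for p in parent.iterdir() if p.is_file()]
    if wanted.name in names:
        return wanted  # exact, case-correct match

    # Fall back to a case-insensitive comparison; require a unique match.
    matches = [n for n in names if n.lower() == wanted.name.lower()]
    if len(matches) == 1:
        return parent / matches[0]
    raise FileNotFoundError(f"No matrix file matching {path!r}")
```

Passing the resolved path to `--matrix` avoids a failure caused only by a filename typed in the wrong case.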
Optionally, generate a dot plot in SVG format:
perl alignment/bin/dotplot.pl align.matrix.tsv dotplot.svg
Markov Chain
biosea markov \
--fasta seq1.fa \
--length 100 \
--start A \
--order 1 \
--method alias
For higher-order models:
--order 2 --start AA
⚠️ Constraint: the length of --start must equal --order.
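To see why the start string and the order must agree, here is a minimal Python sketch of an order-k Markov generator. It is an illustration under stated assumptions, not the toolkit's Perl implementation: the function names `train` and `generate` are hypothetical, sampling uses `random.choices` rather than an alias-table sampler, and the output length is assumed to include the start string.

```python
import random
from collections import Counter, defaultdict


def train(seq: str, order: int) -> dict[str, Counter]:
    """Count which symbol follows each context of `order` symbols."""
    if order < 1:
        raise ValueError("order must be at least 1")
    counts: dict[str, Counter] = defaultdict(Counter)
    for i in range(len(seq) - order):
        counts[seq[i:i + order]][seq[i + order]] += 1
    return counts


def generate(counts, order: int, start: str, length: int, seed=None) -> str:
    """Extend `start` one symbol at a time until `length` is reached."""
    # The model is keyed by contexts of exactly `order` symbols, so the
    # start state has to be one such context.
    if len(start) != order:
        raise ValueError(f"Start state must be length {order}")
    rng = random.Random(seed)
    out = start
    while len(out) < length:
        followers = counts.get(out[-order:])
        if not followers:
            break  # context never seen in training data: stop early
        symbols = list(followers)
        weights = list(followers.values())
        out += rng.choices(symbols, weights=weights)[0]
    return out
```

With `order=2`, every lookup uses the last two symbols as the key, so a one-letter start such as `A` has no matching context and is rejected up front.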
Sequence Utilities
# Hamming distance
biosea seqtools hamming ACGT AGGT
# Levenshtein distance
biosea seqtools levenshtein kitten sitting
# k-mer counts
biosea seqtools kmer ACGTACGT --k 3
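For reference, the three utilities above can be sketched in a few lines of Python. This is an illustrative sketch of the standard definitions, not the code in seqtools_py/; the function names are assumptions.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
```

For the inputs used in the commands above, this sketch gives a Hamming distance of 1 for ACGT and AGGT, a Levenshtein distance of 3 for kitten and sitting, and 3-mer counts of ACG: 2, CGT: 2, GTA: 1, TAC: 1 for ACGTACGT.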
BWT / FM-Index Search
biosea bwt search \
--sequence ACGTACGT \
--pattern CGT
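The idea behind FM-index search is backward search over the Burrows-Wheeler transform. The sketch below is a compact illustration, not the implementation in bwt/: the function names are hypothetical, the suffix array is built by naive sorting (fine for short inputs only), `$` is assumed to be absent from the text, and reported positions are 0-based.

```python
def build_fm_index(text: str):
    """Build suffix array, C table and occurrence table for text + '$'."""
    if "$" in text:
        raise ValueError("'$' is reserved as the sentinel")
    text += "$"  # sentinel sorts before A, C, G, T in ASCII
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # i == 0 wraps to the sentinel

    # c_table[ch]: number of characters in the text strictly smaller than ch
    alphabet = sorted(set(text))
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)

    # occ[ch][i]: occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, ch in enumerate(bwt):
        for a in alphabet:
            occ[a][i + 1] = occ[a][i] + (a == ch)
    return sa, c_table, occ


def search(text: str, pattern: str) -> list[int]:
    """Return sorted 0-based start positions of pattern in text."""
    if not pattern or "$" in pattern:
        return []
    sa, c_table, occ = build_fm_index(text)
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    for ch in reversed(pattern):  # extend the match one character leftwards
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])
```

Calling `search("ACGTACGT", "CGT")` in this sketch returns `[1, 5]`. Each pattern character narrows the suffix-array range with two table lookups, so the search loop runs in time proportional to the pattern length once the index is built.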
REST API
Start the FastAPI server:
uvicorn api.server:app --reload
Endpoints:
- POST /align
- POST /markov
- POST /distance
- POST /kmer
- POST /bwt/search
Example:
curl -X POST http://localhost:8000/distance \
-H "Content-Type: application/json" \
-d '{"seq1": "kitten", "seq2": "sitting", "metric": "levenshtein"}'
Interactive API documentation is available at http://localhost:8000/docs.
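The same request can be sent from Python using only the standard library. This is a sketch under assumptions: the helper names are hypothetical, the server is assumed to be running on localhost:8000 as above, and the response is returned as parsed JSON without assuming its field names.

```python
import json
import urllib.request


def build_distance_request(seq1: str, seq2: str, metric: str = "levenshtein",
                           base_url: str = "http://localhost:8000"):
    """Build (but do not send) a POST /distance request."""
    payload = json.dumps(
        {"seq1": seq1, "seq2": seq2, "metric": metric}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/distance",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_distance(seq1: str, seq2: str, metric: str = "levenshtein",
                  base_url: str = "http://localhost:8000"):
    """Send the request and return the decoded JSON body as-is."""
    req = build_distance_request(seq1, seq2, metric, base_url)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server started, `post_distance("kitten", "sitting")` sends the same body as the curl example. Check the interactive documentation for the actual response schema.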
Docker
Build and start
./docker_up.sh
Stop and remove
./docker_down.sh
Interactive shell
./docker_interactive.sh
Manual Docker commands
docker compose up --build -d
docker compose exec biosea biosea --help
docker compose down
Project Structure
src/bio_sea_pearl/
├── cli.py # Unified CLI
├── api/ # Clean Python API layer
├── perl_wrappers/ # Bridge to legacy Perl scripts
├── seqtools_py/ # Python ports of core algorithms
└── bwt/ # Native Python FM-index
alignment/ # Legacy alignment tools
markov/ # Perl Markov models
seqtools/ # Perl sequence utilities
api/server.py # FastAPI layer
docs/ # MkDocs documentation
tests/ # Unit + integration tests
Architecture Overview
The system is layered:
CLI / API
↓
Python API Layer
↓
Wrappers (subprocess)
↓
Perl + Python legacy tools
This design:
- preserves legacy code
- enables gradual Python migration
- provides production-ready interfaces
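The wrapper layer in the diagram can be pictured as a thin function around `subprocess.run`. The sketch below is hypothetical and is not the code in perl_wrappers/: `run_script` and `ToolError` are illustrative names, and the script path and arguments are supplied by the caller.

```python
import shutil
import subprocess


class ToolError(RuntimeError):
    """Raised when a wrapped script cannot be started or exits non-zero."""


def run_script(interpreter: str, script: str, args: list[str],
               timeout: float = 60.0) -> str:
    """Run `interpreter script *args` and return its standard output."""
    exe = shutil.which(interpreter)
    if exe is None:
        # For example, Perl is not installed on this machine.
        raise ToolError(f"{interpreter} not found on PATH")
    result = subprocess.run(
        [exe, script, *args],
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        timeout=timeout,
    )
    if result.returncode != 0:
        raise ToolError(result.stderr.strip()
                        or f"exit code {result.returncode}")
    return result.stdout
```

A call such as `run_script("perl", "alignment/bin/dotplot.pl", ["align.matrix.tsv", "dotplot.svg"])` would mirror the dot plot command shown earlier. Keeping the subprocess call behind one function lets a Perl script be replaced by a Python port later without changing callers.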
Running Tests
pytest
Building
pip install build
python -m build
This produces a source distribution and wheel in dist/.
Releasing
Releases are automated via GitHub Actions. Push a tag to trigger the workflow:
git tag v0.1.0
git push origin v0.1.0
This will:
- Run the test suite
- Create a GitHub release
- Build and publish the package to PyPI
- Build and push multi-arch Docker images to ghcr.io/bibymaths/bio-sea-pearl
Troubleshooting
Alignment fails
- Check the matrix path: alignment/scoring/blosum62.mat
- Avoid uppercase filenames
Markov fails
Error:
Start state must be length N
Fix:
With --order N, the --start string must be exactly N characters long (for example, --order 2 --start AA).
CLI not found
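Reinstall the package in editable mode from the repository root, then confirm the entry point is available:
pip install -e .
biosea --help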
License
This project is licensed under the MIT License. See LICENSE for details.
File details
Details for the file bio_sea_pearl-0.1.2.tar.gz.
File metadata
- Download URL: bio_sea_pearl-0.1.2.tar.gz
- Upload date:
- Size: 117.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e9694310bc617785dc25368a06f8929f9222b194adc9c94ede38c5a6cbdfbc5d |
| MD5 | 2f6c318ad3b4c9b9a2f6a576fd63cd59 |
| BLAKE2b-256 | f9b849f7943c0e049d00e025fe67c16f4cf44fd3cd73c0d406ef9a40e5be507f |
File details
Details for the file bio_sea_pearl-0.1.2-py3-none-any.whl.
File metadata
- Download URL: bio_sea_pearl-0.1.2-py3-none-any.whl
- Upload date:
- Size: 133.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fb07f2193eea797ee826aabf8c5352fa4797f0ad4ff373ca7f684fec6a85967d |
| MD5 | 2bd0c20d26f368e28b71cde1f3890e80 |
| BLAKE2b-256 | fa200439267620c57368d20b5824a258ec714d8169ea2fa07c172cf3a6a1ad09 |