
A comprehensive package of biological constants, serving as a foundational resource for biology and bioinformatics, complemented by functions to streamline related tasks.


Biobase


A Python package providing standardized biological constants and scoring matrices for bioinformatics pipelines. Biobase aims to eliminate the need to repeatedly recreate common biological data structures and scoring systems in your code.


Quick Start

Access amino acid properties

from biobase.constants import ONE_LETTER_CODES, MONO_MASS
print(ONE_LETTER_CODES)  # 'ACDEFGHIKLMNPQRSTVWY'
print(MONO_MASS['A'])    # 71.037113805
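A residue-mass table like this is typically used to compute peptide masses. The sketch below does not import biobase; it uses a small hypothetical table (`RESIDUE_MASS`, with the value for 'A' taken from the example above and the others rounded reference values) and a hypothetical helper `peptide_mono_mass` to show the arithmetic: sum the residue masses and add one water.

```python
# Hypothetical three-residue subset of monoisotopic residue masses (Da).
# 'A' matches the value printed above; 'G' and 'S' are rounded reference values.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.037113805,
    "S": 87.03203,
}
WATER_MASS = 18.01056  # rounded monoisotopic mass of H2O


def peptide_mono_mass(peptide: str) -> float:
    """Sum residue masses and add one water for the free termini."""
    total = 0.0
    for aa in peptide.upper():
        if aa not in RESIDUE_MASS:
            raise ValueError(f"no mass available for residue {aa!r}")
        total += RESIDUE_MASS[aa]
    return total + WATER_MASS


print(round(peptide_mono_mass("GAS"), 3))  # 233.101
```

With the real package, the same loop could presumably index `MONO_MASS` instead of the hypothetical table, assuming it also stores residue masses.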

Use scoring matrices

from biobase.matrix import Blosum
blosum62 = Blosum(62)
print(blosum62['A']['A'])  # 4
print(blosum62['W']['C'])  # -2
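Nested lookups of this shape make it easy to score an ungapped alignment. The sketch below is a minimal illustration that does not import biobase: `MINI_BLOSUM62` is a hypothetical three-residue excerpt of BLOSUM62 held in a plain dict, and `ungapped_score` is a hypothetical helper, not part of the package.

```python
# Hypothetical excerpt of BLOSUM62 for residues A, C and W only.
MINI_BLOSUM62 = {
    "A": {"A": 4, "C": 0, "W": -3},
    "C": {"A": 0, "C": 9, "W": -2},
    "W": {"A": -3, "C": -2, "W": 11},
}


def ungapped_score(seq_a: str, seq_b: str, matrix: dict) -> int:
    """Sum the substitution scores of two equal-length sequences, position by position."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(matrix[a][b] for a, b in zip(seq_a, seq_b))


print(ungapped_score("ACW", "ACW", MINI_BLOSUM62))  # 4 + 9 + 11 = 24
print(ungapped_score("ACW", "AWC", MINI_BLOSUM62))  # 4 - 2 - 2 = 0
```

Because the helper only relies on `matrix[a][b]` indexing, an object such as `blosum62` from the example above should work in place of the dict, assuming it supports the same two-level lookup.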

Analyze DNA sequences

from biobase.analysis import Dna
sequence = "ATCGTAGC"
print(Dna.transcribe(sequence))               # 'AUCGUAGC'
print(Dna.complement(sequence))               # 'TAGCATCG'
print(Dna.complement(sequence, reverse=True)) # 'GCTACGAT'
print(Dna.calculate_gc_content(sequence))     # 50.0
print(Dna.entropy(sequence))                  # 2.0

seq = "ccatgccctaaatggggtag"
for start, end, orf in Dna.find_orfs(seq, include_seq=True):
    print(start, end, orf)
# 2 11 ATGCCCTAA
# 11 20 ATGGGGTAG
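To make the ORF output above easier to interpret, here is a minimal pure-Python sketch of one way such a search can work. It is not biobase's implementation: `find_orfs_sketch` is a hypothetical name, it scans the forward strand only, and it assumes 0-based start positions with exclusive end positions, which is consistent with the numbers shown above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs_sketch(seq: str) -> list[tuple[int, int, str]]:
    """Return (start, end, orf) for each ATG followed by an in-frame stop codon."""
    seq = seq.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the start codon until the first stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                orfs.append((start, pos + 3, seq[start:pos + 3]))
                break
    return orfs


for start, end, orf in find_orfs_sketch("ccatgccctaaatggggtag"):
    print(start, end, orf)
# 2 11 ATGCCCTAA
# 11 20 ATGGGGTAG
```

Note that this sketch reports an ORF for every ATG, so a start codon nested inside a longer ORF yields an additional, overlapping entry.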

Find protein motifs

from biobase.analysis import find_motifs
sequence = "ACDEFGHIKLMNPQRSTVWY"
print(find_motifs(sequence, "DEF"))  # [3]
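The result `[3]` suggests that positions are reported 1-based, since "DEF" begins at 0-based index 2. The sketch below illustrates an exact-match motif search under that assumption; `find_motif_positions` is a hypothetical helper, not the package's `find_motifs`, and it also reports overlapping matches.

```python
def find_motif_positions(sequence: str, motif: str, one_based: bool = True) -> list[int]:
    """Return the start position of every (possibly overlapping) exact match."""
    if not motif:
        raise ValueError("motif must not be empty")
    offset = 1 if one_based else 0  # assumption: 1-based output, as in the example above
    positions = []
    idx = sequence.find(motif)
    while idx != -1:
        positions.append(idx + offset)
        # Resume one character later so overlapping matches are found too.
        idx = sequence.find(motif, idx + 1)
    return positions


print(find_motif_positions("ACDEFGHIKLMNPQRSTVWY", "DEF"))  # [3]
print(find_motif_positions("AAAA", "AA"))                   # [1, 2, 3]
```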

Parse FASTA

from biobase.parser import fasta_parser
# `fasta` must be defined beforehand; it is assumed here to hold the FASTA input
records = fasta_parser(fasta)
for r in records:
    print(r.id) # CAA39742.1
    print(r.seq) # MTNIRKSHPLMKII...
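For readers unfamiliar with the format, the sketch below shows what parsing FASTA text involves. It is a standalone illustration, not biobase's parser: `Record`, `parse_fasta_text`, and the sample records are all hypothetical, and it assumes the record ID is the first whitespace-delimited token of the header line.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Record:
    id: str
    description: str
    seq: str


def _make_record(header: str, chunks: list[str]) -> Record:
    # Assumption: the ID is the first token of the header, the rest is description.
    parts = header.split(maxsplit=1)
    rec_id = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return Record(rec_id, description, "".join(chunks))


def parse_fasta_text(text: str) -> Iterator[Record]:
    """Yield one Record per '>' header, joining wrapped sequence lines."""
    header = None
    chunks: list[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _make_record(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield _make_record(header, chunks)


# Made-up sample input with a wrapped sequence in the first record.
sample = ">seq1 example protein\nMTNIRK\nSHPLMK\n>seq2\nACDE\n"
for rec in parse_fasta_text(sample):
    print(rec.id, rec.seq)
# seq1 MTNIRKSHPLMK
# seq2 ACDE
```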

Requirements

  • Python 3.10+
  • pip (for installation)

Installation

Regular Installation

pip install biobase

Development Installation

Clone the repository and install in editable mode:

git clone https://github.com/lignum-vitae/biobase.git
cd biobase
pip install -e .

Running Files

To ensure relative imports work correctly, always run files using the module path from the project root:

Run a specific file

python -m src.biobase.matrix

Data Files

  • src/biobase/matrices/: Scoring matrix data stored in JSON file format
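As a rough illustration of how a JSON-encoded scoring matrix can be loaded and sanity-checked, consider the sketch below. The JSON layout (a nested object keyed by residue), the inline `EXAMPLE_JSON` string, and the `load_matrix` helper are all assumptions made for this example; the actual files in the package may be structured differently.

```python
import json

# Hypothetical layout: {"row residue": {"column residue": score, ...}, ...}
EXAMPLE_JSON = '{"A": {"A": 4, "C": 0}, "C": {"A": 0, "C": 9}}'


def load_matrix(text: str) -> dict[str, dict[str, int]]:
    """Parse a nested-object matrix and check that it is symmetric."""
    matrix = json.loads(text)
    for row, cols in matrix.items():
        for col, score in cols.items():
            mirrored = matrix.get(col, {}).get(row)
            if mirrored != score:
                raise ValueError(f"matrix is not symmetric at ({row}, {col})")
    return matrix


matrix = load_matrix(EXAMPLE_JSON)
print(matrix["A"]["C"])  # 0
```

To read from disk instead, the same function could be called with `pathlib.Path(path).read_text()`, where `path` points at whichever matrix file is of interest.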

Project Goals

Biobase aims to provide Python-friendly versions of common biological constants and tools for bioinformatics pipelines. Key objectives:

  1. Standardize biological data structures
  2. Provide efficient implementations of common scoring systems
  3. Ensure type safety and validation
  4. Maintain comprehensive documentation
  5. Support modern Python practices

Contributing

We welcome contributions through the GitHub repository.

Stability

This project is in the beta stage. APIs may change without warning until version 1.0.0.

License

This project is licensed under the MIT License - see the LICENSE file for details.
