A Python library for sequence inference.

SeqInfer

SeqInfer is a Python package for sequence inference. It supports tasks such as outcome prediction, sequence generation, and discovery of meaningful representations for sequence-like data.

Initially focused on biological sequences such as DNA, RNA, and protein sequences, it aims to provide essential tools and algorithms for handling sequence data. The package is designed to be easily extended to other types of sequences, such as SMILES strings or time series. Relevant helper modules may be added in future development.

Note: This library was renamed from SeqLearn to SeqInfer to avoid potential conflicts and confusion, since the name SeqLearn is already used by other repositories.

Installation

You can install SeqInfer using pip:

pip install seqinfer

Or install it directly from the GitHub repository:

pip install git+https://github.com/jiajiexiao/seqinfer.git
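To confirm that the installation succeeded, one option is to query the installed package metadata. This is a minimal sketch using only the standard library (importlib.metadata, available in Python 3.8+). The helper name installed_version is hypothetical and not part of SeqInfer.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent.

    Hypothetical helper for illustration; not part of SeqInfer.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("seqinfer")
print(version if version else "seqinfer is not installed")
```

If the package is missing, the helper returns None instead of raising, which makes it easy to use in setup scripts.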

Usage

To use SeqInfer, import the desired modules from the seq and infer sub-packages.

For example, you can prepare the data as below:

from seqinfer.seq.datasets import SeqFromFileDataset
from seqinfer.seq.transforms import Compose, KmerTokenizer, OneHotEncoder, ToTensor
from seqinfer.seq.vocabularies import unambiguous_dna_vocabulary_dict

seq_dataset = SeqFromFileDataset(
    seq_file="examples/toys/CCA-TXXAGG-AG-TGG-TC-A-T/pos.fasta",
    seq_file_fmt="fasta",
    transform_sequences=Compose(
        [
            KmerTokenizer(
                k=1,
                stride=1,
                vocab_dict=unambiguous_dna_vocabulary_dict,
                num_output_tokens=3,
                special_tokens=None,
            ),
            OneHotEncoder(vocab_size=len(unambiguous_dna_vocabulary_dict)),
            ToTensor(),
        ]
    ),
)
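To make the transform chain above more concrete, here is a rough pure-Python sketch of what k-mer tokenization followed by one-hot encoding does conceptually. It does not use SeqInfer's API. The DNA_VOCAB mapping and the three helper functions are assumptions made for illustration, and the real KmerTokenizer and OneHotEncoder may behave differently (for example in how num_output_tokens or special tokens are handled, which this sketch does not cover).

```python
# Hypothetical stand-in for a DNA vocabulary; the actual
# unambiguous_dna_vocabulary_dict in SeqInfer may differ.
DNA_VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_tokenize(seq, k=1, stride=1):
    """Split seq into k-mers of length k, advancing by stride each step."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def tokens_to_ids(tokens, vocab):
    """Map each token to its integer index in vocab."""
    return [vocab[tok] for tok in tokens]


def one_hot(ids, vocab_size):
    """Turn a list of indices into a list of one-hot rows."""
    return [[1 if j == idx else 0 for j in range(vocab_size)] for idx in ids]


# With k=1 and stride=1, each nucleotide becomes its own token.
tokens = kmer_tokenize("ACGT", k=1, stride=1)
ids = tokens_to_ids(tokens, DNA_VOCAB)
encoded = one_hot(ids, len(DNA_VOCAB))

for token, row in zip(tokens, encoded):
    print(token, row)
```

With larger k, the vocabulary would need one entry per k-mer rather than per nucleotide, so the one-hot width grows with the number of possible k-mers.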

Project Structure

The SeqInfer package is organized into two major parts:

  1. seq: Contains modules to define and manage the data/dataset of sequences and provides various related transformation operations.
  2. infer: Contains modules for different learners (learning algorithms) to conduct learning tasks such as classification, regression, self-supervised representation learning, sequence generation, etc.

Examples

The examples folder contains illustrative examples demonstrating the usage of SeqInfer for various tasks, including classification, regression, multitask learning, etc. Each example includes a README to guide you through the usage and expected results.

Contributing

We welcome contributions to improve and extend SeqInfer. If you would like to contribute, please follow our contribution guidelines (to be added).

License

This project is licensed under the MIT License - see the LICENSE file for details.


We hope you find SeqInfer useful for your sequence learning tasks! If you encounter any issues or have suggestions for improvement, please feel free to open an issue or submit a pull request. Happy coding!
