A package for fitting simple categorical mixture models to sequence data

categorical_mix

Fast, scalable clustering for fixed length sequences with a simple generative model.

This package is a fairly special-purpose tool for fitting a categorical mixture model to multiple sequence alignments of protein or DNA sequences. (It may work for other tasks, but we have never investigated that.) The model is very simple, and for precisely that reason it can be quite useful: it is fully human-interpretable, easy to visualize, and can be fit to a few million sequences very quickly. It is designed to handle datasets too large to fit in memory.
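
To make the model concrete, here is a minimal sketch of a categorical mixture fit with expectation-maximization in plain NumPy. It is not this package's API or implementation: the function name `fit_categorical_mixture`, its parameters, and the integer encoding of the alignment are all assumptions made for illustration, and unlike the package it holds the whole dataset in memory. Each cluster is simply a table of per-position letter probabilities, which is what makes the model easy to inspect.

```python
import numpy as np


def fit_categorical_mixture(X, n_clusters, n_categories, n_iter=50, seed=0, eps=1e-10):
    """Hypothetical helper (not part of categorical_mix): fit a mixture of
    position-wise categorical distributions with EM.

    X is an integer array of shape (n_sequences, seq_len) whose entries lie
    in [0, n_categories), e.g. an alignment with letters mapped to integers.
    """
    rng = np.random.default_rng(seed)
    n_seq, seq_len = X.shape

    # mu[k, j, a] = P(letter a at position j | cluster k), random start.
    mu = rng.dirichlet(np.ones(n_categories), size=(n_clusters, seq_len))
    weights = np.full(n_clusters, 1.0 / n_clusters)

    # One-hot encoding of the data, shape (n_seq, seq_len, n_categories).
    onehot = np.eye(n_categories)[X]

    for _ in range(n_iter):
        # E-step: log P(sequence i, cluster k), then normalize over clusters.
        log_joint = np.einsum("ija,kja->ik", onehot, np.log(mu + eps))
        log_joint += np.log(weights + eps)
        log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
        resp = np.exp(log_joint - log_norm)  # responsibilities, rows sum to 1

        # M-step: responsibility-weighted letter counts per cluster and position.
        weights = resp.sum(axis=0) / n_seq
        counts = np.einsum("ik,ija->kja", resp, onehot) + eps
        mu = counts / counts.sum(axis=2, keepdims=True)

    return weights, mu, resp


# Toy usage: two obviously different groups of length-6 sequences.
X_demo = np.array([[0] * 6] * 5 + [[1] * 6] * 5)
weights_demo, mu_demo, resp_demo = fit_categorical_mixture(
    X_demo, n_clusters=2, n_categories=4
)
labels_demo = resp_demo.argmax(axis=1)  # hard cluster assignment per sequence
```

Reading `mu[k]` row by row gives the letter distribution at each alignment position for cluster `k`, and `weights[k]` gives the share of sequences that cluster explains.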

This package is primarily used by AntPack, which uses it to score antibody sequences for human-likeness and for other tasks. For installation and usage instructions, including use outside AntPack, see the docs.


Source Distribution

categorical_mix-0.1.0.1.tar.gz (68.7 kB view details)

Uploaded Source

File details

Details for the file categorical_mix-0.1.0.1.tar.gz.

File metadata

  • Download URL: categorical_mix-0.1.0.1.tar.gz
  • Size: 68.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.12

File hashes

Hashes for categorical_mix-0.1.0.1.tar.gz

Algorithm    Hash digest
SHA256       ad30dac660aa372b7aa31be7f1183d12799793c631eccf86a8b266df19c6044c
MD5          44a946976a7ad0e92d9cd9bd1e4a0a81
BLAKE2b-256  d1755b16fe56434a2fdec3de72da525e4625466472393d9d3011f7bbd5b15a0c
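
One way to check a downloaded archive against the SHA256 digest listed above is sketched below, using only the Python standard library. The helper names `sha256_of_file` and `matches_digest` are hypothetical, and the local file path in the commented usage is an assumption about where the archive was saved.

```python
import hashlib

# SHA256 digest copied from the listing above.
EXPECTED_SHA256 = "ad30dac660aa372b7aa31be7f1183d12799793c631eccf86a8b266df19c6044c"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hypothetical helper: hex SHA256 of a file, read in chunks so that
    large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the archive sits in the current directory):
# ok = matches_digest("categorical_mix-0.1.0.1.tar.gz", EXPECTED_SHA256)
```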
