
A package for fitting simple categorical mixture models to sequence data

Project description

categorical_mix

Fast, scalable clustering for fixed length sequences with a simple generative model.

This package is a fairly special-purpose tool for fitting a categorical mixture model to multiple sequence alignments of protein or DNA sequences. (It may be usable for other tasks, but we have never investigated that.) The model is very simple, and for precisely that reason it can be quite useful: it is fully human-interpretable, easy to visualize, and can be fit to a few million sequences very quickly. It is designed to handle datasets too large to fit in memory.
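To make the model concrete, here is a minimal sketch of a categorical mixture in plain Python. It is not this package's API or implementation: the names `cluster_log_probs`, `log_likelihood` and `em_step`, the pseudocount, and the toy DNA data are all hypothetical and chosen only for illustration. The sketch assumes each cluster has a mixture weight and, for every alignment position, its own probability distribution over the alphabet, with positions independent given the cluster. Because `em_step` only accumulates per-cluster counts while iterating, `seqs` could in principle be a generator that streams sequences from disk.

```python
import math

ALPHABET = "ACGT"  # toy alphabet (assumption); proteins would use 20+ letters


def cluster_log_probs(seq, weights, probs, alphabet=ALPHABET):
    """log(w_k) + sum_j log(p_k[j][letter_j]) for every cluster k."""
    idx = [alphabet.index(c) for c in seq]
    return [
        math.log(w) + sum(math.log(p[j][a]) for j, a in enumerate(idx))
        for w, p in zip(weights, probs)
    ]


def log_likelihood(seq, weights, probs, alphabet=ALPHABET):
    """Log-probability of one sequence under the mixture (log-sum-exp)."""
    logp = cluster_log_probs(seq, weights, probs, alphabet)
    top = max(logp)
    return top + math.log(sum(math.exp(v - top) for v in logp))


def em_step(seqs, weights, probs, alphabet=ALPHABET, pseudocount=1e-3):
    """One EM iteration; `seqs` may be any iterable of equal-length strings."""
    n_clusters, length, n_letters = len(weights), len(probs[0]), len(alphabet)
    # The pseudocount keeps every probability strictly positive.
    counts = [[[pseudocount] * n_letters for _ in range(length)]
              for _ in range(n_clusters)]
    mass = [pseudocount] * n_clusters
    for seq in seqs:
        # E-step: responsibility of each cluster for this sequence.
        logp = cluster_log_probs(seq, weights, probs, alphabet)
        top = max(logp)
        unnorm = [math.exp(v - top) for v in logp]
        total = sum(unnorm)
        for k in range(n_clusters):
            resp = unnorm[k] / total
            mass[k] += resp
            for j, c in enumerate(seq):
                counts[k][j][alphabet.index(c)] += resp
    # M-step: normalize the accumulated soft counts.
    new_weights = [m / sum(mass) for m in mass]
    new_probs = [[[c / sum(row) for c in row] for row in cluster]
                 for cluster in counts]
    return new_weights, new_probs


# Toy "alignment": two obvious groups of length-4 sequences.
toy_seqs = ["AAAA", "AAAC", "GGGG", "GGGT"]
weights = [0.5, 0.5]
probs = [
    [[0.4, 0.2, 0.2, 0.2] for _ in range(4)],  # cluster 0 starts biased to A
    [[0.2, 0.2, 0.4, 0.2] for _ in range(4)],  # cluster 1 starts biased to G
]
for _ in range(20):
    weights, probs = em_step(toy_seqs, weights, probs)

print(weights)
print(log_likelihood("AAAA", weights, probs))
```

The fitted `probs` table can be read directly as per-position letter frequencies for each cluster, which is what makes this kind of model easy to inspect. A log-likelihood like the one returned by `log_likelihood` is one plausible way to score how typical a new sequence is of the training set.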

This package is primarily used by AntPack, which uses it to score antibody sequences for human-likeness and for other tasks. If you are interested in using it for some other purpose, see the docs for installation and usage.

Download files


Source Distribution

categorical_mix-0.2.0.0.tar.gz (68.8 kB)

File details

Details for the file categorical_mix-0.2.0.0.tar.gz.

File metadata

  • Download URL: categorical_mix-0.2.0.0.tar.gz
  • Upload date:
  • Size: 68.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.12

File hashes

Hashes for categorical_mix-0.2.0.0.tar.gz
Algorithm Hash digest
SHA256 eb8a6662e425a06e1e05b3773aca454bd9d95b0a1733769596cc9f5568145336
MD5 7d198d6f792aaf1044ea8e032e330487
BLAKE2b-256 9b47950130e163095b34f27de51f71ceadd8f6906da22729291cff50ae53bda9
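A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The following is a minimal sketch; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file path passed in is assumed to point at your local copy of the source distribution.

```python
import hashlib

# SHA256 digest listed above for categorical_mix-0.2.0.0.tar.gz
EXPECTED_SHA256 = "eb8a6662e425a06e1e05b3773aca454bd9d95b0a1733769596cc9f5568145336"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file at `path` has the expected SHA256 digest."""
    return sha256_of_file(path) == expected
```

For example, `matches_published_hash("categorical_mix-0.2.0.0.tar.gz")` should return `True` for an intact download in the current directory.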

