gpmap-v2

A typed, Rust-accelerated container and simulator toolkit for genotype-phenotype maps.

gpmap-v2 is a clean-break rewrite of harmslab/gpmap. It exposes the same conceptual model (a GenotypePhenotypeMap backed by a pandas DataFrame) with a locked schema, vectorized hot paths, and a PyO3 Rust core for the operations that used to be inner Python loops.

Why

  • Fast. String-encoded genotypes are replaced with packed uint8 matrices. The encoding step (genotypes_to_binary) runs rayon-parallel in Rust and is about two orders of magnitude faster than the pure-Python v1 at L >= 16.
  • Typed. Full type hints, mypy-checked, strict mode.
  • Safe. Cartesian-product enumeration is size-guarded out of the box (SpaceTooLargeError). No more silent 10^26 allocations.
  • Stable surface. The container and encoding_table schema are locked in SCHEMA.md for downstream consumers.
  • Modern tooling. uv + maturin + pyproject.toml. Automated releases via python-semantic-release. OIDC-based PyPI publishing.
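
To make the size guard concrete, here is a minimal pure-Python sketch of the idea: compute the size of the Cartesian product before enumerating it, and refuse if it is too large. The names enumerate_genotypes, SpaceTooLarge, and the max_size default are hypothetical stand-ins for illustration, not the library's actual API or limit.

```python
import itertools
import math


class SpaceTooLarge(Exception):
    """Hypothetical stand-in for the library's SpaceTooLargeError."""


def enumerate_genotypes(mutations, max_size=1_000_000):
    """Enumerate all genotypes, refusing if the space exceeds max_size.

    mutations maps site index -> list of allowed letters.
    max_size is an assumed default, not the library's actual limit.
    """
    # The space size is the product of the per-site alphabet sizes.
    # Computing it is O(L) and allocates nothing.
    size = math.prod(len(letters) for letters in mutations.values())
    if size > max_size:
        raise SpaceTooLarge(f"{size:.3e} genotypes exceeds limit {max_size}")
    sites = sorted(mutations)
    return ["".join(g) for g in itertools.product(*(mutations[i] for i in sites))]


# A 3-site, 2-letter space has 2**3 = 8 genotypes and is enumerated.
small = enumerate_genotypes({i: ["A", "T"] for i in range(3)})

# 20 amino acids at 20 sites is 20**20 (about 10^26) genotypes: rejected.
amino_acids = list("ACDEFGHIKLMNPQRSTVWY")
try:
    enumerate_genotypes({i: amino_acids for i in range(20)})
    guarded = False
except SpaceTooLarge:
    guarded = True
```

The check runs before itertools.product is consumed, so the oversized case fails immediately instead of exhausting memory.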

Install

pip install gpmap-v2

Or with uv:

uv add gpmap-v2

Python 3.10+. Prebuilt wheels ship for Linux (x86_64, aarch64), macOS (x86_64, aarch64), and Windows (x64).

Quick start

from gpmap import GenotypePhenotypeMap

gpm = GenotypePhenotypeMap(
    wildtype="AAA",
    genotypes=["AAA", "AAT", "ATA", "TAA", "ATT", "TAT", "TTA", "TTT"],
    phenotypes=[0.1, 0.2, 0.2, 0.6, 0.4, 0.6, 1.0, 1.1],
    stdeviations=[0.05] * 8,
)

gpm.genotypes        # np.ndarray of strings, shape (8,)
gpm.phenotypes       # np.ndarray[float64], shape (8,)
gpm.binary_packed    # np.ndarray[uint8], shape (8, 3) - the fast path
gpm.binary           # np.ndarray of '0'/'1' strings - back-compat accessor
gpm.n_mutations      # per-genotype Hamming weight
gpm.encoding_table   # pandas DataFrame per SCHEMA.md
gpm.data             # pandas DataFrame view for Jupyter
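
To illustrate how the binary accessors above relate to each other, the following pure-Python sketch encodes the quick-start genotypes against the wildtype. It assumes the simplest case, two letters per site, where each site maps to a single 0/1 value; encode_binary is a hypothetical helper, and the library itself stores the result as a NumPy uint8 matrix computed in Rust.

```python
def encode_binary(wildtype, genotypes):
    """Return one row of 0/1 ints per genotype: 0 = wildtype letter, 1 = mutant."""
    rows = []
    for g in genotypes:
        if len(g) != len(wildtype):
            raise ValueError(f"genotype {g!r} has the wrong length")
        rows.append([0 if a == w else 1 for a, w in zip(g, wildtype)])
    return rows


genotypes = ["AAA", "AAT", "ATA", "TAA", "ATT", "TAT", "TTA", "TTT"]

# Integer matrix with one row per genotype and one column per site.
packed = encode_binary("AAA", genotypes)

# String form, analogous to the back-compat accessor.
binary_strings = ["".join(map(str, row)) for row in packed]

# Hamming weight: the number of sites that differ from the wildtype.
n_mutations = [sum(row) for row in packed]
```

With integers instead of strings, the Hamming weight is a plain row sum, which is the kind of operation that vectorizes well.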

Simulating landscapes

from gpmap.simulate import NKSimulation, MountFujiSimulation
import numpy as np

sim = NKSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
    K=2,
    rng=np.random.default_rng(0),
)
sim.phenotypes.shape  # (16,)
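
For readers unfamiliar with NK landscapes, here is a small sketch of a textbook NK model in pure Python. It is not the library's implementation: the neighbourhood convention (each site plus its K right-hand neighbours, with wrap-around), the uniform random contributions, and the function name nk_phenotypes are all assumptions made for illustration.

```python
import itertools
import random


def nk_phenotypes(L, K, rng):
    """Phenotypes for all 2**L binary genotypes under a textbook NK model."""
    # One lookup table per site, with an entry for every joint state of
    # the site and its K right-hand neighbours (periodic boundary).
    tables = [[rng.random() for _ in range(2 ** (K + 1))] for _ in range(L)]
    phenotypes = []
    for g in itertools.product((0, 1), repeat=L):
        total = 0.0
        for i in range(L):
            # Read the K+1 bits starting at site i as a table index.
            idx = 0
            for j in range(K + 1):
                idx = (idx << 1) | g[(i + j) % L]
            total += tables[i][idx]
        # Average the per-site contributions.
        phenotypes.append(total / L)
    return phenotypes


phenotypes = nk_phenotypes(L=4, K=2, rng=random.Random(0))
```

As in the example above, four binary sites give 2**4 = 16 phenotypes, and seeding the generator makes the landscape reproducible.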

I/O round-trips

from gpmap import to_json, read_json, to_csv, read_csv

to_json(gpm, "map.json")
gpm2 = read_json("map.json")

to_csv(gpm, "map.csv")   # writes map.csv + map.csv.meta.json
gpm3 = read_csv("map.csv")
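
The CSV writer above produces a sidecar metadata file next to the table. The sketch below shows that general pattern with the standard library only; the column names, the metadata fields, and the helper names are hypothetical and do not reflect the actual file layout, which is defined in SCHEMA.md.

```python
import csv
import json
import os
import tempfile


def write_csv_with_meta(path, rows, meta):
    """Write tabular rows to path and metadata to path + '.meta.json'."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["genotype", "phenotype"])
        writer.writeheader()
        writer.writerows(rows)
    with open(path + ".meta.json", "w") as fh:
        json.dump(meta, fh)


def read_csv_with_meta(path):
    """Read back the rows and the sidecar metadata written above."""
    with open(path, newline="") as fh:
        rows = [
            {"genotype": r["genotype"], "phenotype": float(r["phenotype"])}
            for r in csv.DictReader(fh)
        ]
    with open(path + ".meta.json") as fh:
        meta = json.load(fh)
    return rows, meta


rows = [
    {"genotype": "AAA", "phenotype": 0.1},
    {"genotype": "AAT", "phenotype": 0.2},
]
meta = {"wildtype": "AAA", "schema_version": "1"}  # hypothetical fields

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "map.csv")
    write_csv_with_meta(path, rows, meta)
    rows2, meta2 = read_csv_with_meta(path)
```

A sidecar keeps the CSV itself a flat table that any spreadsheet can open, while information that has no natural column (here, the wildtype) travels alongside it.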

Public API surface

The full list of stable exports lives in gpmap.__all__. The ones downstream consumers depend on (notably epistasis-v2) are:

  • GenotypePhenotypeMap, GenotypePhenotypeMap.from_dataframe
  • get_encoding_table, genotypes_to_binary, genotypes_to_binary_packed
  • upper_transform, lower_transform
  • StandardDeviationMap, StandardErrorMap
  • SpaceTooLargeError, SchemaError, UnknownLetterError
  • read_csv, read_json, read_pickle, read_excel (and to_* counterparts)
  • All simulators under gpmap.simulate

The load-bearing schema contract (column names, dtypes, invariants) is in SCHEMA.md. Breaking changes to that document bump the major version.

Development

git clone https://github.com/lperezmo/gpmap-v2
cd gpmap-v2
uv sync
uv run maturin develop --release
uv run pytest
uv run ruff check python/gpmap tests

A typical dev inner loop after editing Rust:

uv run maturin develop --release && uv run pytest

Consuming from another local project

gpmap-v2 is designed to be consumed as an editable dependency during co-development with sister packages (e.g. epistasis-v2). In the consumer's pyproject.toml:

[tool.uv.sources]
gpmap-v2 = { path = "/absolute/path/to/gpmap-v2", editable = true }

[project]
dependencies = ["gpmap-v2"]

Then run uv sync or uv add gpmap-v2 in the consumer project. The import statement is unchanged: import gpmap.

Migration from v1 (harmslab/gpmap)

gpmap-v2 is not wire-compatible with v1. Key differences:

  • Distribution name is gpmap-v2 on PyPI; import path is still gpmap.
  • The encoding_table column genotype_index has been renamed to site_index to match its actual meaning. The old name is still readable via a deprecated alias.
  • read_dataframe is now from_dataframe.
  • binary_packed (uint8 2D) is exposed alongside the string-form binary. Prefer the packed form for any hot-path consumer.
  • JSON files must carry "schema_version": "1". Legacy files are readable with a warning.
  • upper_transform and lower_transform are now genuinely distinct (v1 had a copy-paste bug where they were identical).
  • stats.unbiased_var honors the axis kwarg (v1 ignored it and hardcoded axis=1).
  • simulate.random_mutation_set no longer mutates the module-level amino-acid list.
  • simulate.MultiPeakMountFujiSimulation peak search has a retry cap; it raises instead of spinning forever on infeasible constraints.
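
The schema_version rule in the list above can be sketched as follows. This is an illustration only: load_map_payload is a hypothetical helper, the warning category is a guess, and rejecting unknown versions with an error is an assumption rather than documented behaviour.

```python
import json
import warnings

SUPPORTED_SCHEMA_VERSION = "1"


def load_map_payload(text):
    """Parse a JSON payload and check its schema_version field."""
    payload = json.loads(text)
    version = payload.get("schema_version")
    if version is None:
        # Legacy (v1-style) file: still readable, but flagged.
        warnings.warn(
            "no schema_version found; treating as a legacy file",
            UserWarning,
            stacklevel=2,
        )
    elif version != SUPPORTED_SCHEMA_VERSION:
        # Assumed behaviour for unknown versions.
        raise ValueError(f"unsupported schema_version {version!r}")
    return payload


current = load_map_payload('{"schema_version": "1", "wildtype": "AAA"}')

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = load_map_payload('{"wildtype": "AAA"}')
```

Note that the version is the string "1", not the integer 1, so a strict comparison would treat the two differently.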

See CHANGELOG.md for the full list.

Releases

Releases are driven by python-semantic-release on merge to main. Commit messages follow Conventional Commits:

  • fix: ... -> patch
  • feat: ... -> minor
  • feat!: ... or BREAKING CHANGE: footer -> major
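
As a rough illustration of the mapping above, the following sketch classifies a commit message into a version bump. It is a simplified reading of Conventional Commits, not the parser that python-semantic-release actually uses; bump_for and the regular expression are hypothetical.

```python
import re

# type, optional (scope), optional "!", then ": description"
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<bang>!)?: .+")


def bump_for(message):
    """Return 'major', 'minor', 'patch', or None for a commit message."""
    header = message.splitlines()[0] if message else ""
    match = HEADER.match(header)
    if match is None:
        return None
    # A "!" after the type/scope or a BREAKING CHANGE footer means major.
    if match["bang"] or re.search(r"^BREAKING CHANGE: ", message, re.MULTILINE):
        return "major"
    # Other types (docs, chore, ...) do not trigger a release here.
    return {"feat": "minor", "fix": "patch"}.get(match["type"])


examples = {
    "fix: handle empty map": bump_for("fix: handle empty map"),
    "feat: add reader": bump_for("feat: add reader"),
    "feat!: drop alias": bump_for("feat!: drop alias"),
    "docs: update readme": bump_for("docs: update readme"),
}
```

Commit types other than fix and feat return None in this sketch, meaning no release would be cut for them.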

CHANGELOG.md, version bumps, Git tags, GitHub Releases, wheel builds, and PyPI uploads all happen automatically.

License

MIT. See LICENSE.
