Genotype-phenotype map container and simulators, Rust-accelerated.

gpmap-v2

A typed, Rust-accelerated container and simulator toolkit for genotype-phenotype maps.

gpmap-v2 is a clean-break rewrite of harmslab/gpmap. It exposes the same conceptual model (a GenotypePhenotypeMap backed by a pandas DataFrame) with a locked schema, vectorized hot paths, and a PyO3 Rust core for the operations that used to be inner Python loops.

Why

  • Fast. String-encoded genotypes are replaced with packed uint8 matrices. The encoding step (genotypes_to_binary) runs rayon-parallel in Rust, delivering a speedup of two orders of magnitude over the pure-Python v1 at L >= 16.
  • Typed. Full type hints, mypy-checked, strict mode.
  • Safe. Cartesian-product enumeration is size-guarded out of the box (SpaceTooLargeError). No more silent 10^26 allocations.
  • Stable surface. The container and encoding_table schema are locked in SCHEMA.md for downstream consumers.
  • Modern tooling. uv + maturin + pyproject.toml. Automated releases via python-semantic-release. OIDC-based PyPI publishing.
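
To make the size guard concrete, here is a minimal pure-Python sketch of the idea: compute the size of the Cartesian product before enumerating it, and refuse if it exceeds a limit. The function name enumerate_genotypes, the max_size default, and the local SpaceTooLargeError class are stand-ins for illustration only; they are not gpmap's actual implementation or signature.

```python
from itertools import product
from math import prod


class SpaceTooLargeError(ValueError):
    """Local stand-in for illustration; not imported from gpmap."""


def enumerate_genotypes(mutations, max_size=2**20):
    """Enumerate every genotype, refusing spaces larger than max_size.

    `mutations` maps a site index to the letters allowed at that site.
    The max_size default is an arbitrary choice for this sketch.
    """
    # The Cartesian product has (letters at site 0) * (letters at site 1) * ...
    # members, so its size is known before anything is allocated.
    size = prod(len(letters) for letters in mutations.values())
    if size > max_size:
        raise SpaceTooLargeError(
            f"{size} genotypes exceeds the limit of {max_size}"
        )
    sites = sorted(mutations)
    return ["".join(combo) for combo in product(*(mutations[s] for s in sites))]


small = {i: ["A", "T"] for i in range(3)}
print(enumerate_genotypes(small))  # 8 genotypes

# 20 sites with 20 letters each is 20**20 (about 10**26) genotypes.
huge = {i: list("ACDEFGHIKLMNPQRSTVWY") for i in range(20)}
try:
    enumerate_genotypes(huge)
except SpaceTooLargeError as exc:
    print("refused:", exc)
```

The check costs one multiplication per site, so it is cheap compared with the enumeration it protects.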

Install

pip install gpmap-v2

Or with uv:

uv add gpmap-v2

Requires Python 3.10+. Prebuilt wheels ship for Linux (x86_64, aarch64), macOS (x86_64, aarch64), and Windows (x64).

Quick start

from gpmap import GenotypePhenotypeMap

gpm = GenotypePhenotypeMap(
    wildtype="AAA",
    genotypes=["AAA", "AAT", "ATA", "TAA", "ATT", "TAT", "TTA", "TTT"],
    phenotypes=[0.1, 0.2, 0.2, 0.6, 0.4, 0.6, 1.0, 1.1],
    stdeviations=[0.05] * 8,
)

gpm.genotypes        # np.ndarray of strings, shape (8,)
gpm.phenotypes       # np.ndarray[float64], shape (8,)
gpm.binary_packed    # np.ndarray[uint8], shape (8, 3) - the fast path
gpm.binary           # np.ndarray of '0'/'1' strings - back-compat accessor
gpm.n_mutations      # per-genotype Hamming weight
gpm.encoding_table   # pandas DataFrame per SCHEMA.md
gpm.data             # pandas DataFrame view for Jupyter
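
For intuition about how the binary form and n_mutations relate to the genotype strings above, here is a pure-Python sketch. It assumes every site has exactly two letters, so one 0/1 column per site is enough; to_binary_rows is a hypothetical helper, not part of gpmap, and the library's actual encoding for larger alphabets is defined by its encoding table.

```python
def to_binary_rows(wildtype, genotypes):
    """Return one row of 0/1 ints per genotype: 1 where a site differs from wildtype.

    Assumes biallelic sites (illustration only, not gpmap's implementation).
    """
    rows = []
    for genotype in genotypes:
        if len(genotype) != len(wildtype):
            raise ValueError(f"{genotype!r} does not match wildtype length")
        rows.append([int(a != b) for a, b in zip(genotype, wildtype)])
    return rows


genotypes = ["AAA", "AAT", "ATA", "TAA", "ATT", "TAT", "TTA", "TTT"]
rows = to_binary_rows("AAA", genotypes)

# String form, analogous to a '0'/'1' string accessor.
binary_strings = ["".join(str(bit) for bit in row) for row in rows]
# Hamming weight: the number of sites that differ from wildtype.
n_mutations = [sum(row) for row in rows]

print(binary_strings)
print(n_mutations)
```

The row-of-integers form is what makes vectorized operations cheap: counting mutations is a row sum rather than a string comparison.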

Simulating landscapes

from gpmap.simulate import NKSimulation, MountFujiSimulation
import numpy as np

sim = NKSimulation(
    wildtype="AAAA",
    mutations={i: ["A", "T"] for i in range(4)},
    K=2,
    rng=np.random.default_rng(0),
)
sim.phenotypes.shape  # (16,)
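
As background on what an NK landscape is, here is a toy stdlib sketch of the textbook model: each site contributes a random value that depends on its own state and the states of K neighbouring sites, and the phenotype is the mean contribution. The neighbour choice (cyclic, to the right), the averaging, and the name nk_phenotypes are assumptions of this sketch; NKSimulation may use different conventions for K and for how contributions are combined.

```python
import random
from itertools import product


def nk_phenotypes(n_sites, k, seed=0):
    """Toy NK landscape over binary genotypes; returns {genotype: phenotype}."""
    rng = random.Random(seed)
    # One random lookup table per site, keyed by the states of that site
    # and its k cyclic right-hand neighbours.
    tables = [
        {key: rng.random() for key in product("01", repeat=k + 1)}
        for _ in range(n_sites)
    ]
    landscape = {}
    for bits in product("01", repeat=n_sites):
        total = 0.0
        for i in range(n_sites):
            key = tuple(bits[(i + j) % n_sites] for j in range(k + 1))
            total += tables[i][key]
        # Average the per-site contributions so values stay in [0, 1).
        landscape["".join(bits)] = total / n_sites
    return landscape


landscape = nk_phenotypes(n_sites=4, k=2)
print(len(landscape))  # 16
```

Larger K means each site's contribution depends on more of the genotype, which generally makes the landscape more rugged.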

I/O round-trips

from gpmap import to_json, read_json, to_csv, read_csv

to_json(gpm, "map.json")
gpm2 = read_json("map.json")

to_csv(gpm, "map.csv")   # writes map.csv + map.csv.meta.json
gpm3 = read_csv("map.csv")

Public API surface

The full list of stable exports lives in gpmap.__all__. The ones downstream consumers depend on (notably epistasis-v2) are:

  • GenotypePhenotypeMap, GenotypePhenotypeMap.from_dataframe
  • get_encoding_table, genotypes_to_binary, genotypes_to_binary_packed
  • upper_transform, lower_transform
  • StandardDeviationMap, StandardErrorMap
  • SpaceTooLargeError, SchemaError, UnknownLetterError
  • read_csv, read_json, read_pickle, read_excel (and to_* counterparts)
  • All simulators under gpmap.simulate

The load-bearing schema contract (column names, dtypes, invariants) is in SCHEMA.md. Breaking changes to that document bump the major version.

Development

git clone https://github.com/lperezmo/gpmap-v2
cd gpmap-v2
uv sync
uv run maturin develop --release
uv run pytest
uv run ruff check python/gpmap tests

A typical dev inner loop after editing Rust:

uv run maturin develop --release && uv run pytest

Consuming from another local project

gpmap-v2 is designed to be consumed as an editable dependency during co-development with sister packages (e.g. epistasis-v2). In the consumer's pyproject.toml:

[tool.uv.sources]
gpmap-v2 = { path = "/absolute/path/to/gpmap-v2", editable = true }

[project]
dependencies = ["gpmap-v2"]

Then run uv sync (or uv add gpmap-v2) in the consumer. The import path remains import gpmap.

Migration from v1 (harmslab/gpmap)

gpmap-v2 is not wire-compatible with v1. Key differences:

  • Distribution name is gpmap-v2 on PyPI; import path is still gpmap.
  • The encoding_table column genotype_index has been renamed to site_index to match its actual meaning. The old name is still readable via a deprecated alias.
  • read_dataframe is now from_dataframe.
  • binary_packed (uint8 2D) is exposed alongside the string-form binary. Prefer the packed form for any hot-path consumer.
  • JSON files must carry "schema_version": "1". Legacy files are readable with a warning.
  • upper_transform and lower_transform are now genuinely distinct (v1 had a copy-paste bug where they were identical).
  • stats.unbiased_var honors the axis kwarg (v1 ignored it and hardcoded axis=1).
  • simulate.random_mutation_set no longer mutates the module-level amino-acid list.
  • simulate.MultiPeakMountFujiSimulation peak search has a retry cap; it raises instead of spinning forever on infeasible constraints.
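
The schema_version rule can be pictured with a small stdlib sketch. Everything here is hypothetical: the function name load_map_payload, the choice to treat a missing field as "legacy", and the choice to reject unknown versions are assumptions for illustration, not the behaviour of gpmap's read_json.

```python
import json
import warnings

SCHEMA_VERSION = "1"


def load_map_payload(text):
    """Parse a JSON payload and check its schema_version (illustrative only)."""
    payload = json.loads(text)
    version = payload.get("schema_version")
    if version is None:
        # Assumption: a file without the field is a legacy file.
        warnings.warn(
            "no schema_version field; treating as a legacy file",
            UserWarning,
            stacklevel=2,
        )
    elif version != SCHEMA_VERSION:
        # Assumption: unknown versions are rejected rather than guessed at.
        raise ValueError(f"unsupported schema_version: {version!r}")
    return payload


current = load_map_payload('{"schema_version": "1", "wildtype": "AAA"}')

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = load_map_payload('{"wildtype": "AAA"}')

print(current["wildtype"], legacy["wildtype"], len(caught))
```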

See CHANGELOG.md for the full list.

Releases

Releases are driven by python-semantic-release on merge to main. Commit messages follow Conventional Commits:

  • fix: ... -> patch
  • feat: ... -> minor
  • feat!: ... or BREAKING CHANGE: footer -> major

CHANGELOG.md, version bumps, Git tags, GitHub Releases, wheel builds, and PyPI uploads all happen automatically.
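
The three rules listed above can be sketched as a small classifier. This is an illustration of the mapping only, not python-semantic-release's parser: the bump_for name is hypothetical, and the real tool supports more commit types and configuration than this covers.

```python
import re

# Conventional Commits header: type, optional (scope), optional "!", then ": ".
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<bang>!)?: ")


def bump_for(message):
    """Return 'major', 'minor', 'patch', or None for one commit message."""
    header = message.splitlines()[0] if message else ""
    match = HEADER.match(header)
    if match is None:
        return None
    # A "!" after the type or a BREAKING CHANGE: footer forces a major bump.
    if match["bang"] or re.search(r"^BREAKING CHANGE: ", message, re.MULTILINE):
        return "major"
    return {"feat": "minor", "fix": "patch"}.get(match["type"])


for msg in [
    "fix: handle empty genotype list",
    "feat(simulate): add retry cap",
    "feat!: rename genotype_index to site_index",
    "fix: tighten schema\n\nBREAKING CHANGE: column dtype changed",
    "docs: update README",
]:
    print(repr(msg.splitlines()[0]), "->", bump_for(msg))
```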

License

MIT. See LICENSE.
