PEFF-aware in-silico protein digestion with PTM enumeration, sequence variants, and ProForma output.

peff_digest

A PEFF-aware protein digest tool. Given a PEFF file and enzymatic digestion parameters, produces a CSV of peptides with their sequence (ProForma notation), variant annotation, length, and monoisotopic mass.

Each PEFF VariantSimple and VariantComplex annotation is applied independently: every variant yields its own peptides, and variants are never combined within a single peptide. PEFF PTMs (ModResPsi / ModResUnimod) are applied combinatorially up to a configurable limit per peptide. Fixed and variable user-defined modifications, including terminal modifications, are also supported.
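
The effect of the per-peptide limit can be pictured with a small sketch. This is not peff_digest's implementation: the function name `enumerate_variable_mods` and the `(position, mod_name)` representation of candidate sites are assumptions made purely for illustration. It lists every way of applying at most `max_ptm` of the candidate variable modifications, with at most one modification per position.

```python
from itertools import combinations


def enumerate_variable_mods(sites, max_ptm):
    """Yield tuples of (position, mod_name) choices using at most max_ptm sites.

    `sites` is a list of (position, mod_name) candidates. Illustrative only:
    two candidates at the same position are never applied together.
    """
    for k in range(min(max_ptm, len(sites)) + 1):
        for combo in combinations(sites, k):
            positions = [pos for pos, _ in combo]
            if len(set(positions)) == len(positions):
                yield combo


# Hypothetical candidates on one peptide: two oxidisable M and one phospho S.
sites = [(2, "Oxidation"), (5, "Phospho"), (9, "Oxidation")]
forms = list(enumerate_variable_mods(sites, max_ptm=2))
print(len(forms))  # 1 unmodified + 3 single + 3 double = 7
```

With a limit of 0 the only form produced is the unmodified peptide, and the number of forms grows quickly as the limit rises, which is why a cap is useful.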

Installation

uv sync
# or
just install

Requires Python 3.12+.

Usage

Via config file

peff-digest --config config.toml

Via flags

peff-digest --peff-file human.peff --output-file peptides.csv

Flags override any values set in `--config`. Run `peff-digest --help` for the full flag reference.
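
The precedence rule can be sketched as a simple layered merge. The helper `merge_settings` is hypothetical, and representing an unset flag as `None` is an assumption of this sketch rather than a statement about how the CLI is written.

```python
def merge_settings(config_values, flag_values):
    """Return config values with any explicitly given flags layered on top.

    Flags left unset are represented as None here (an assumption of this
    sketch) and do not override the config file.
    """
    merged = dict(config_values)
    merged.update({k: v for k, v in flag_values.items() if v is not None})
    return merged


from_file = {"peff_file": "human.peff", "missed_cleavages": 2}
from_flags = {"missed_cleavages": 1, "output_file": None}
settings = merge_settings(from_file, from_flags)
print(settings)  # {'peff_file': 'human.peff', 'missed_cleavages': 1}
```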

Output

The output CSV has five columns:

| Column | Description |
|---|---|
| `protein_id` | `db_unique_id` from the PEFF header |
| `sequence` | ProForma-annotated peptide sequence (includes mods) |
| `variant` | PEFF variant notation, e.g. `(42\|R)`, or empty for canonical |
| `length` | Peptide length in residues |
| `mass` | Monoisotopic mass in Da, or empty if not computable |
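
To make the `mass` column concrete, here is a minimal sketch of a monoisotopic mass calculation for an unmodified peptide. It is not how peff_digest computes masses and it ignores modifications entirely; `RESIDUE_MASS`, `WATER`, and `monoisotopic_mass` are names chosen for this example. Returning `None` for an unrecognised residue mirrors the "empty if not computable" case only by analogy.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O accounts for the free N- and C-termini


def monoisotopic_mass(sequence):
    """Return the unmodified peptide mass, or None for unknown residues."""
    try:
        return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    except KeyError:
        return None


print(round(monoisotopic_mass("PEPTIDE"), 2))  # 799.36
print(monoisotopic_mass("PEPXIDE"))  # None, because X has no defined mass
```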

Config reference

All options can be set in a TOML or JSON config file. TOML example:

peff_file = "human.peff"
output_file = "peptides.csv"

cleave_on = "KR"
missed_cleavages = 2
semi_enzymatic = false
max_ptm_per_peptide = 2
min_length = 7
max_length = 40
restrict_after = "P"
restrict_before = ""
cterminal = true
min_mass = 400.0
max_mass = 10000.0
drop_invalid_mass = false
annotate_variants = true

# Internal modifications — one [[internal_mods]] block per mod:
# [[internal_mods]]
# modification = "Carbamidomethyl"
# residue = "C"
# mod_type = "fixed"
#
# [[internal_mods]]
# modification = "Oxidation"
# residue = "M"
# mod_type = "variable"

# Terminal modifications — one [[terminal_mods]] block per mod:
# [[terminal_mods]]
# modification = "Acetyl"
# position = "nterm"
# mod_type = "variable"
# protein_terminus = true   # only the first peptide of each protein
#
# [[terminal_mods]]
# modification = "UNIMOD:737"
# position = "nterm"
# mod_type = "fixed"
# residue = "M"             # only if the terminal residue is M
#
# [[terminal_mods]]
# modification = "Amidated"
# position = "cterm"
# mod_type = "variable"

`DigestConfig` fields

| Field | Type | Default | Description |
|---|---|---|---|
| `peff_file` | `str` | required | Path to the input PEFF file. Must exist. |
| `output_file` | `str` | `"peptides.csv"` | Path for the output CSV. |
| `cleave_on` | `str` | `"KR"` | Amino acids at which to cleave (e.g. `"KR"` for trypsin). |
| `missed_cleavages` | `int` | `2` | Maximum number of missed cleavage sites per peptide. Min 0. |
| `semi_enzymatic` | `bool` | `false` | Include semi-enzymatic peptides (one non-enzymatic terminus). |
| `max_ptm_per_peptide` | `int` | `2` | Maximum number of variable mods (PEFF + user) applied simultaneously per peptide. `0` disables all variable mods. Min 0. |
| `min_length` | `int` | `7` | Minimum peptide length in residues (inclusive). Min 1. |
| `max_length` | `int` | `40` | Maximum peptide length in residues (inclusive). Min 1. |
| `restrict_after` | `str` | `"P"` | Skip cleavage when the following residue is in this set (e.g. `"P"` for trypsin/Pro rule). |
| `restrict_before` | `str` | `""` | Skip cleavage when the preceding residue is in this set. |
| `cterminal` | `bool` | `true` | `true` = C-terminal cleavage (standard); `false` = N-terminal. |
| `internal_mods` | `list[InternalMod]` | `[]` | Per-residue modifications. See `InternalMod` fields below. |
| `terminal_mods` | `list[TerminalMod]` | `[]` | Terminal modifications. See `TerminalMod` fields below. |
| `min_mass` | `float \| None` | `None` | Minimum peptide mass in Da. Ignored if `None`. |
| `max_mass` | `float \| None` | `None` | Maximum peptide mass in Da. Ignored if `None`. |
| `drop_invalid_mass` | `bool` | `false` | If `true`, exclude peptides whose mass cannot be computed. |
| `annotate_variants` | `bool` | `true` | If `false`, do not set `peptide_name` on variant peptides. |
| `workers` | `int \| None` | `None` | Number of worker processes. Defaults to all available CPUs. Min 1. |
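
How `cleave_on`, `restrict_after`, `missed_cleavages`, and the length bounds interact can be shown with a minimal, self-contained sketch. It is an illustration under stated assumptions, not the package's digestion code: it handles C-terminal cleavage only and leaves out `restrict_before`, semi-enzymatic peptides, variants, and modifications. The function names `cleavage_sites` and `digest_sequence` are hypothetical.

```python
def cleavage_sites(seq, cleave_on="KR", restrict_after="P"):
    """Return indices i where the chain is cut between seq[i-1] and seq[i]."""
    sites = []
    for i in range(1, len(seq)):
        if seq[i - 1] in cleave_on and seq[i] not in restrict_after:
            sites.append(i)
    return sites


def digest_sequence(seq, missed_cleavages=2, min_length=7, max_length=40):
    """Enumerate fully enzymatic peptides (C-terminal cleavage only)."""
    bounds = [0, *cleavage_sites(seq), len(seq)]
    peptides = []
    for start_idx in range(len(bounds) - 1):
        # Spanning n + 1 consecutive fragments leaves n missed cleavages.
        last = min(start_idx + missed_cleavages + 1, len(bounds) - 1)
        for end_idx in range(start_idx + 1, last + 1):
            peptide = seq[bounds[start_idx]:bounds[end_idx]]
            if min_length <= len(peptide) <= max_length:
                peptides.append(peptide)
    return peptides


# The R at position 6 is followed by P, so no cut is made there.
print(cleavage_sites("MKAAARPCCKDDE"))  # [2, 10]
print(digest_sequence("MKAAARPCCKDDE", missed_cleavages=1, min_length=1))
```

With one missed cleavage allowed, each peptide spans at most two adjacent fragments, so `MK`, `AAARPCCK`, and `DDE` appear alongside `MKAAARPCCK` and `AAARPCCKDDE`.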

`InternalMod` fields

| Field | Type | Default | Description |
|---|---|---|---|
| `modification` | `str` | required | Modification name (e.g. `"Carbamidomethyl"`, `"UNIMOD:21"`). |
| `residue` | `str` | required | One or more amino acids the mod applies to (e.g. `"C"` or `"KR"`). |
| `mod_type` | `"fixed" \| "variable"` | required | `"fixed"` = always applied; `"variable"` = enumerated combinatorially (counts against `max_ptm_per_peptide`). |

`TerminalMod` fields

| Field | Type | Default | Description |
|---|---|---|---|
| `modification` | `str` | required | Modification name (e.g. `"Acetyl"`, `"UNIMOD:737"`). |
| `position` | `"nterm" \| "cterm"` | required | Which terminus to apply the mod to. |
| `mod_type` | `"fixed" \| "variable"` | required | `"fixed"` = always applied; `"variable"` = enumerated combinatorially (counts against `max_ptm_per_peptide`). |
| `residue` | `str \| None` | `None` | If set, the mod is only applied when the terminal residue is in this string (e.g. `"M"` or `"KR"`). |
| `protein_terminus` | `bool` | `false` | If `true`, only apply to the protein-level terminus (first peptide for N-term, last for C-term). |

Python API

Full digest → Polars DataFrame

from peff_digest import DigestConfig, InternalMod, TerminalMod, digest

config = DigestConfig(
    peff_file="human.peff",
    missed_cleavages=2,
    min_length=7,
    max_length=40,
    min_mass=400.0,
    max_mass=10000.0,
    drop_invalid_mass=True,
    internal_mods=[
        InternalMod(modification="Carbamidomethyl", residue="C", mod_type="fixed"),
        InternalMod(modification="Oxidation", residue="M", mod_type="variable"),
    ],
    terminal_mods=[
        TerminalMod(modification="Acetyl", position="nterm", mod_type="variable", protein_terminus=True),
    ],
)

df = digest(config)
print(df)

`digest` returns a `polars.DataFrame` with columns `protein_id`, `sequence`, `variant`, `length`, and `mass`. All filtering from the config (mass bounds, `drop_invalid_mass`) is applied before returning.

Single-entry digest

import pefftacular as pf
from peff_digest import InternalMod, TerminalMod, digest_peff_sequence

entry = next(iter(pf.PeffReader("human.peff")))

peptides = digest_peff_sequence(
    entry,
    cleave_on="KR",
    missed_cleavages=2,
    min_length=7,
    max_length=40,
    restrict_after="P",
    internal_mods=[
        InternalMod(modification="Carbamidomethyl", residue="C", mod_type="fixed"),
        InternalMod(modification="Oxidation", residue="M", mod_type="variable"),
    ],
    max_ptm_per_peptide=2,
    terminal_mods=[
        TerminalMod(modification="Acetyl", position="nterm", mod_type="variable", protein_terminus=True),
        TerminalMod(modification="Amidated", position="cterm", mod_type="variable"),
    ],
)

for peptide in peptides:
    print(str(peptide), len(peptide), peptide.mass())

`digest_peff_sequence` returns a `set[peptacular.ProFormaAnnotation]`. Each element supports `len()`, `.mass()`, `str()`, and `.peptide_name` (PEFF variant notation, or `None` for canonical).

Development

just lint      # ruff check
just format    # ruff format + import sort
just test      # pytest
just check     # lint + type check (ty) + test

This version: 0.1.0, released Mar 31, 2026

File details

Details for the file peff_digest-0.1.0.tar.gz.

File metadata

Download URL: peff_digest-0.1.0.tar.gz
Upload date: Mar 31, 2026
Size: 268.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for peff_digest-0.1.0.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | `5c5918fa62b27d26edf7bf6648f10e715758178ceeb3c9d3e063d56aa65f5af3` |
| MD5 | `7456657399664216b9eff7951263368c` |
| BLAKE2b-256 | `6ce94ae17f361619bdab88ed95fa1286e2696fb06a926ea92bd7b1364dd31113` |

Provenance

The following attestation bundles were made for peff_digest-0.1.0.tar.gz:

Publisher: release.yml on tacular-omics/peff_digest

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: peff_digest-0.1.0.tar.gz
- Subject digest: 5c5918fa62b27d26edf7bf6648f10e715758178ceeb3c9d3e063d56aa65f5af3
- Sigstore transparency entry: 1202012339
- Sigstore integration time: Mar 31, 2026
Source repository:
- Permalink: tacular-omics/peff_digest@ddfa7e1f84bacf9f2208f045096f78a168535042
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/tacular-omics
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@ddfa7e1f84bacf9f2208f045096f78a168535042
- Trigger Event: release

File details

Details for the file peff_digest-0.1.0-py3-none-any.whl.

File metadata

Download URL: peff_digest-0.1.0-py3-none-any.whl
Upload date: Mar 31, 2026
Size: 16.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for peff_digest-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | `4f1d7bf8a5374c8e0b0fa3f3da27974c8f3d1a75ec0b97ac588ca20f7400cac6` |
| MD5 | `c57e3ed89d8cb1c7afacc8740933b01a` |
| BLAKE2b-256 | `2f4e0a096e27c828a994758846008a4060c78511140bec49acfbb2d39f6a1a37` |

Provenance

The following attestation bundles were made for peff_digest-0.1.0-py3-none-any.whl:

Publisher: release.yml on tacular-omics/peff_digest

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: peff_digest-0.1.0-py3-none-any.whl
- Subject digest: 4f1d7bf8a5374c8e0b0fa3f3da27974c8f3d1a75ec0b97ac588ca20f7400cac6
- Sigstore transparency entry: 1202012348
- Sigstore integration time: Mar 31, 2026
Source repository:
- Permalink: tacular-omics/peff_digest@ddfa7e1f84bacf9f2208f045096f78a168535042
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/tacular-omics
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@ddfa7e1f84bacf9f2208f045096f78a168535042
- Trigger Event: release
