Read, write, and manipulate SnapGene files (.dna, .rna, .prot)

SnapGene File Format Parser

SnapGene File Format Parser (SGFFP for short) is a reverse-engineered parser for SnapGene DNA, RNA, and protein file formats.

Important: Found an unknown block type? Run sff check your_file.dna -l and look for [NEW] markers, then report them in #1 together with a dump (sff check your_file.dna -d). Help us decode more blocks!

The parser reads SnapGene files into Python objects and exports to JSON, with a writer for creating new SnapGene files.

The project aims to be a minimalistic, fast, and useful tool for molecular biologists who need to parse large libraries of SnapGene files, or for developers building SnapGene-compatible applications.

Architecture

flowchart LR
    subgraph Input
        DNA[".dna file"]
        Bytes["bytes/stream"]
    end

    subgraph SGFFP
        Reader["SgffReader"]
        Object["SgffObject"]
        Ops["SgffOps"]
        Writer["SgffWriter"]
    end

    subgraph Output
        JSON["JSON"]
        File[".dna file"]
    end

    DNA --> Reader
    Bytes --> Reader
    Reader --> Object
    Object --> Ops
    Ops --> Object
    Object --> Writer
    Object --> JSON
    Writer --> File

Installation

pip install sgffp

Or with uv:

uv add sgffp

For development:

git clone https://github.com/merv1n34k/sgffp.git
cd sgffp
uv sync --all-extras

Quick Start

from sgffp import SgffReader, SgffWriter, SgffObject

# Read a SnapGene file
sgff = SgffReader.from_file("plasmid.dna")

# Access data via typed properties
print(sgff.sequence.value)
print(sgff.features[0].name)

# Modify and write back
sgff.sequence.topology = "circular"
SgffWriter.to_file(sgff, "output.dna")

# Create a new file from scratch
sgff = (
    SgffObject.new("ATGCATGCATGC", topology="circular")
    .add_feature("GFP", "CDS", 0, 8)
    .add_primer("fwd", "ATGC", bind_position=0)
)
SgffWriter.to_file(sgff, "new_plasmid.dna")
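
For the library-scale use case mentioned earlier, a loop over many files might look like the sketch below. It is not part of sgffp: summarize_library and its read parameter are hypothetical names, and the code assumes only what Quick Start shows (sequence.value is a string, and features yields objects with a name attribute).

```python
from typing import Any, Callable, Dict, Iterable, List


def summarize_library(
    paths: Iterable[Any], read: Callable[[str], Any]
) -> List[Dict[str, Any]]:
    """Collect one summary row per file; unreadable files become error rows.

    `read` is any callable that takes a path string and returns an object
    shaped like the one in Quick Start (assumed: .sequence.value, .features).
    """
    rows: List[Dict[str, Any]] = []
    for path in paths:
        name = str(path)
        try:
            sgff = read(name)
        except Exception as exc:  # keep going so one bad file does not stop the run
            rows.append({"file": name, "error": str(exc)})
            continue
        rows.append(
            {
                "file": name,
                "length": len(sgff.sequence.value),
                "features": [feature.name for feature in sgff.features],
            }
        )
    return rows


# Intended use (requires sgffp to be installed):
#   from pathlib import Path
#   from sgffp import SgffReader
#   rows = summarize_library(Path("library").glob("*.dna"), SgffReader.from_file)
```

Passing the reader in as an argument keeps the loop independent of the parser, so it can be exercised with a stand-in reader before pointing it at real files.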

History Operations

# Continues from Quick Start: sgff is the SgffObject created above
# Record edits with automatic history tracking
sgff.ops.insert_fragment("ATCGATCG")
sgff.ops.digest("GGCC", InputSummary={"manipulation": "insert"})

# Build an entire history tree from a specification
sgff.ops.build_from_spec(
    [
        {"id": 1, "operation": "ligateFragments", "sequence": "ATCGATCG",
         "name": "Final", "children": [2, 3]},
        {"id": 2, "operation": "makeDna", "sequence": "ATCG"},
        {"id": 3, "operation": "makeDna", "sequence": "ATCG"},
    ],
    final_sequence="ATCGATCG",
)

# Edit existing history nodes in place
sgff.ops.edit_node(node_id=2, name="Renamed", sequence="GGGGCCCC")
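
The specification passed to build_from_spec is a flat list in which each node refers to its children by id. To make that shape easier to see, here is a small sketch that renders such a list as an indented tree. It is purely illustrative and does not reflect how sgffp processes the spec internally; render_spec_tree is a hypothetical helper, and it assumes the spec is acyclic.

```python
from typing import Dict, List

# Same shape as the specification shown above.
SPEC = [
    {"id": 1, "operation": "ligateFragments", "sequence": "ATCGATCG",
     "name": "Final", "children": [2, 3]},
    {"id": 2, "operation": "makeDna", "sequence": "ATCG"},
    {"id": 3, "operation": "makeDna", "sequence": "ATCG"},
]


def render_spec_tree(spec: List[dict]) -> List[str]:
    """Return one indented line per node, parents before their children."""
    nodes: Dict[int, dict] = {node["id"]: node for node in spec}
    child_ids = {c for node in spec for c in node.get("children", [])}

    missing = child_ids - nodes.keys()
    if missing:
        raise ValueError(f"children refer to unknown ids: {sorted(missing)}")

    lines: List[str] = []

    def walk(node_id: int, depth: int) -> None:
        node = nodes[node_id]
        label = f" ({node['name']})" if node.get("name") else ""
        lines.append("  " * depth + f"{node_id}: {node['operation']}{label}")
        for child in node.get("children", []):
            walk(child, depth + 1)

    # Roots are the nodes that no other node lists as a child.
    for node in spec:
        if node["id"] not in child_ids:
            walk(node["id"], 0)
    return lines


print("\n".join(render_spec_tree(SPEC)))
```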

CLI Tool

uv run sff check plasmid.dna    # Inspect file blocks
uv run sff parse plasmid.dna    # Export to JSON
uv run sff info plasmid.dna     # Show file information
uv run sff tree plasmid.dna     # Display edit history timeline

File Format

SnapGene uses a Type-Length-Value (TLV) binary format where each block contains:

Field	Size	Description
Type	1 byte	Block type identifier
Length	4 bytes	Payload size (big-endian)
Data	N bytes	Block payload
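
The block layout in the table can be walked with a few lines of standard-library Python. This is a minimal sketch of the TLV idea only, not sgffp's reader: iter_blocks is a hypothetical name, the sample bytes are made up, and any file-level details beyond the three fields in the table are not handled.

```python
import struct
from typing import Iterator, Tuple

HEADER = struct.Struct(">BI")  # big-endian: 1-byte type, 4-byte unsigned length


def iter_blocks(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (block_type, payload) for each TLV block in `data`."""
    offset = 0
    while offset < len(data):
        if offset + HEADER.size > len(data):
            raise ValueError(f"truncated block header at offset {offset}")
        block_type, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        payload = data[offset:offset + length]
        if len(payload) != length:
            raise ValueError(f"truncated payload for block type {block_type}")
        yield block_type, payload
        offset += length


# Made-up sample: two blocks with placeholder payloads.
sample = HEADER.pack(6, 8) + b"<Notes/>" + HEADER.pack(5, 10) + b"<Primers/>"
for block_type, payload in iter_blocks(sample):
    print(block_type, len(payload), payload)
```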

Data encoding varies by block type: UTF-8 for sequences, XML for annotations, 2-bit encoding for compressed DNA (G, A, T, C → 00, 01, 10, 11), and LZMA compression for history blocks.
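
To illustrate the 2-bit mapping, the sketch below packs and unpacks bases using the G/A/T/C codes listed above. Everything else is an assumption made for the example: four bases per byte, the first base in the most significant bits, and a base count supplied by the caller. Any header fields the real compressed-DNA block carries are not covered, and encode_2bit / decode_2bit are hypothetical names.

```python
BASES = "GATC"  # position in this string is the 2-bit code: G=00, A=01, T=10, C=11


def encode_2bit(seq: str) -> bytes:
    """Pack a G/A/T/C string into bytes, four bases per byte (assumed MSB-first)."""
    packed = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq.upper()):
        code = BASES.index(base)  # raises ValueError for anything but G, A, T, C
        packed[i // 4] |= code << (6 - 2 * (i % 4))
    return bytes(packed)


def decode_2bit(packed: bytes, n_bases: int) -> str:
    """Unpack `n_bases` bases; the count is needed because the last byte may be padded."""
    if n_bases > len(packed) * 4:
        raise ValueError("n_bases exceeds the capacity of the packed data")
    out = []
    for i in range(n_bases):
        shift = 6 - 2 * (i % 4)
        out.append(BASES[(packed[i // 4] >> shift) & 0b11])
    return "".join(out)


packed = encode_2bit("ATGCATGCA")
print(packed.hex(), decode_2bit(packed, 9))
```

The explicit base count matters because a sequence whose length is not a multiple of four leaves padding bits in the final byte, and those bits would otherwise decode as extra G bases.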

Block Types

All known SnapGene block types and their encoding formats:

ID	Block Type	Format
0	DNA Sequence	UTF-8
1	Compressed DNA	2-bit GATC
5	Primers	XML
6	Notes	XML
7	History Tree	LZMA + XML
8	Sequence Properties	XML
10	Features	XML
11	History Nodes	Binary + TLV
14	Custom Enzyme Sets	XML
16	Trace Container	Binary + TLV
17	Alignable Sequences	XML
18	Sequence Trace	ZTR
20	Strand Colors	XML
21	Protein Sequence	UTF-8
28	Enzyme Visibilities	XML
29	History Modifier	LZMA + XML
30	History Content	LZMA + TLV
32	RNA Sequence	UTF-8
34	RNA Structure	LZMA + JSON

Block 18 (ZTR trace) only appears inside block 16 containers. Blocks 2, 3, 13 (enzyme maps and display settings) are auto-generated by SnapGene and not parsed. For a complete binary format reference, see SNAPGENE_FORMAT_SPEC.md.
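
When inspecting raw blocks by hand, it can help to have the table above available as a lookup. The sketch below simply transcribes it into a dictionary; BLOCK_TYPES and describe_block are hypothetical names for this example and are not taken from sgffp.

```python
# Block ID -> (name, format), transcribed from the Block Types table.
BLOCK_TYPES = {
    0: ("DNA Sequence", "UTF-8"),
    1: ("Compressed DNA", "2-bit GATC"),
    5: ("Primers", "XML"),
    6: ("Notes", "XML"),
    7: ("History Tree", "LZMA + XML"),
    8: ("Sequence Properties", "XML"),
    10: ("Features", "XML"),
    11: ("History Nodes", "Binary + TLV"),
    14: ("Custom Enzyme Sets", "XML"),
    16: ("Trace Container", "Binary + TLV"),
    17: ("Alignable Sequences", "XML"),
    18: ("Sequence Trace", "ZTR"),
    20: ("Strand Colors", "XML"),
    21: ("Protein Sequence", "UTF-8"),
    28: ("Enzyme Visibilities", "XML"),
    29: ("History Modifier", "LZMA + XML"),
    30: ("History Content", "LZMA + TLV"),
    32: ("RNA Sequence", "UTF-8"),
    34: ("RNA Structure", "LZMA + JSON"),
}


def describe_block(block_id: int) -> str:
    """Return a one-line label for a block ID, flagging IDs not in the table."""
    name, fmt = BLOCK_TYPES.get(block_id, ("Unknown", "?"))
    return f"{block_id:>2}  {name} ({fmt})"


# IDs whose payload is LZMA-compressed according to the table.
LZMA_BLOCKS = sorted(i for i, (_, fmt) in BLOCK_TYPES.items() if fmt.startswith("LZMA"))

print(describe_block(7))
print(LZMA_BLOCKS)
```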

Supported Block Types

The table below shows which block types can be read from and written to SnapGene files. Blocks with a + in the Model column have typed Python classes for convenient access (e.g., sgff.sequence, sgff.features, sgff.history).

ID	Block Type	Read	Write	Model
0	DNA Sequence	+	+	+
1	Compressed DNA	+	+	+
5	Primers (XML)	+	+	+
6	Notes (XML)	+	+	+
7	History Tree (XML)	+	+	+
8	Sequence Properties (XML)	+	+	+
10	Features (XML)	+	+	+
11	History Nodes	+	+	+
14	Custom Enzyme Sets (XML)	+	+	-
16	Trace Container	+	+	+
17	Alignable Sequences (XML)	+	+	+
18	ZTR Trace (in block 16)	+	+	+
20	Strand Colors (XML)	+	+	-
21	Protein Sequence	+	+	+
28	Enzyme Visibilities (XML)	+	+	-
29	History Modifier (XML)	+	+	+
30	History Content (Nested)	+	+	+
32	RNA Sequence	+	+	+
34	RNA Structure (LZMA JSON)	+	+	-

Roadmap

Improve SGFF parsing, unify TLV strategy
Understand whole file structure
Correctly parse into readable format from all common blocks
Create writer for supported block types
Add comprehensive test suite (380 tests)
Parse XML into pure JSON format
Add write support for history blocks (LZMA compression)
Add typed model classes for easy data access
De novo file creation with builder pattern
History operations API (SgffOps)
Documentation improvements

Acknowledgments

This project would not have been possible without previous work by:

Damien Goutte-Gattat, see his PDF on SGFF structure: https://incenp.org/dvlpt/docs/binary-sequence-formats/binary-sequence-formats.pdf
Isaac Luo, for his version of SnapGene reader: https://github.com/IsaacLuo/SnapGeneFileReader
Kale Kundert, for autosnapgene, a SnapGene automation tool: https://github.com/kalekundert/autosnapgene

License

Distributed under the MIT License; see LICENSE for details.

Project details

Maintainers

merv1n34k

Release history

Version	Released
0.17.1	May 21, 2026
0.17.0	May 20, 2026
0.16.0	May 19, 2026
0.15.4	Mar 26, 2026
0.15.3	Mar 26, 2026
0.15.2	Mar 25, 2026
0.15.1	Mar 19, 2026
0.15.0 (this version)	Mar 9, 2026
0.14.0	Feb 16, 2026
0.13.1	Feb 2, 2026
0.13.0	Feb 1, 2026
0.11.0	Feb 1, 2026
0.10.0	Jan 26, 2026
0.9.0	Jan 26, 2026

Download files

Source Distribution

sgffp-0.15.0.tar.gz (121.6 kB)

Uploaded Mar 9, 2026 Source

Built Distribution

sgffp-0.15.0-py3-none-any.whl (44.1 kB)

Uploaded Mar 9, 2026 Python 3

File details

Details for the file sgffp-0.15.0.tar.gz.

File metadata

Download URL: sgffp-0.15.0.tar.gz
Upload date: Mar 9, 2026
Size: 121.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sgffp-0.15.0.tar.gz
Algorithm	Hash digest
SHA256	`a10db57c3a10eabc4f9e2acc53e2cb293f32a77eed8b56e05aa1f40cc1e5b0f6`
MD5	`839b3c066c434d1bada4d56da9e522b1`
BLAKE2b-256	`c70ea714f6cdadda457ee1d302c02ec294ec254d170a69199eb1cd449c8d9e4c`

Provenance

The following attestation bundles were made for sgffp-0.15.0.tar.gz:

Publisher: publish.yml on merv1n34k/sgffp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sgffp-0.15.0.tar.gz
- Subject digest: a10db57c3a10eabc4f9e2acc53e2cb293f32a77eed8b56e05aa1f40cc1e5b0f6
- Sigstore transparency entry: 1065892578
- Sigstore integration time: Mar 9, 2026
Source repository:
- Permalink: merv1n34k/sgffp@7b1946722426d0ac8d4ccc9ef8d028cdb64046b6
- Branch / Tag: refs/tags/v0.15.0
- Owner: https://github.com/merv1n34k
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7b1946722426d0ac8d4ccc9ef8d028cdb64046b6
- Trigger Event: release

File details

Details for the file sgffp-0.15.0-py3-none-any.whl.

File metadata

Download URL: sgffp-0.15.0-py3-none-any.whl
Upload date: Mar 9, 2026
Size: 44.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sgffp-0.15.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`233d82a964643a9b5e4d51b1d3b31b6d9454756f20b66dd814669ce6f83a1977`
MD5	`f145121de585048890a1dc3208d6e62a`
BLAKE2b-256	`427f91d1938cfd0cd75278e7daf18efddddbcbfcd811813c9584179c3801001c`

Provenance

The following attestation bundles were made for sgffp-0.15.0-py3-none-any.whl:

Publisher: publish.yml on merv1n34k/sgffp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sgffp-0.15.0-py3-none-any.whl
- Subject digest: 233d82a964643a9b5e4d51b1d3b31b6d9454756f20b66dd814669ce6f83a1977
- Sigstore transparency entry: 1065892617
- Sigstore integration time: Mar 9, 2026
Source repository:
- Permalink: merv1n34k/sgffp@7b1946722426d0ac8d4ccc9ef8d028cdb64046b6
- Branch / Tag: refs/tags/v0.15.0
- Owner: https://github.com/merv1n34k
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7b1946722426d0ac8d4ccc9ef8d028cdb64046b6
- Trigger Event: release
