
GFFBase — Rust-accelerated GFF3/GTF parser with a DuckDB-backed storage engine and a drop-in gffutils-compatible Python API.


GFFBase



What is GFFBase?

GFFBase is a high-performance genomic-annotation engine combining a SIMD Rust parser, a DuckDB columnar backend, and a zero-copy PyArrow interface — purpose-built for whole-genome-scale ingest and bulk machine-learning feature extraction, while remaining a drop-in successor to gffutils.

A SIMD Rust+PyO3 parser feeds DuckDB's columnar storage through record-batch Arrow handoffs. A smart query router auto-picks an R-tree or B-tree spatial index per query, and a closure-cache / recursive-CTE relational dispatcher selects the right strategy based on the corpus's actual hierarchy depth. The full FeatureDB / Feature / create_db / DataIterator / GFFWriter / merge_criteria legacy API is preserved verbatim — most users migrate by changing one import line.

Three reasons it matters

  1. 🚀 ≥ 32× faster GENCODE GTF ingest (v49, 6.07 M lines) — and mathematically more efficient: legacy needs a Python loop + ~5 million correlated SQLite subqueries to invent the missing gene/transcript rows, while gffbase does the same work in two set-based DuckDB GROUP BY aggregations + one recursive CTE. (Proven by a same-release GTF/GFF3 head-to-head)
  2. ⚡ 36.68× faster bulk ML extraction — children_batched(format='arrow') returns 50 000 transcripts → 1.6 M exons as a zero-copy PyArrow table in 1.16 s. No Python Feature objects, ever. (How?)
  3. 🛡️ Validated NCBI compliance — all four canonical human-genome annotations (GENCODE / RefSeq / MANE / CHESS 3) ingest cleanly with zero strict-mode warnings. RefSeq's split-CDS duplicate-ID convention is handled automatically.

⚡ Comprehensive Human Genome Annotations — validated across every canonical corpus

Validated head-to-head against legacy gffutils on the four canonical human-genome annotation sources, including the GENCODE v49 GTF and GFF3 versions of the same release — a same-biology, same-features, different-format pairing that exposes the GTF Synthesis Advantage in its purest form:

Corpus | Format | Lines | gffbase ingest | legacy ingest | speedup | spatial qps | batched (5 k anchors)
GENCODE v49 (basic) | GTF | 6,068,892 | 4 min 37 s | ≥ 2 hr 30 min[^1] | 🚀 ≥ 32× | 1,204 | 172 ms / 596 k desc
GENCODE v49 (basic) | GFF3 | 6,066,054 | 6 min 7 s | 11 min 23 s | 1.86× | 1,292 | 422 ms / 1.93 M desc
RefSeq GRCh38.p14 | GFF3 | 4,932,571 | 4 min 12 s[^2] | 6 min 5 s | 1.45× | 1,011 | 263 ms / 999 k desc
MANE v1.5 (Ensembl) | GFF3 | 524,834 | 21.6 s | 45.1 s | 2.09× | 1,766 | 78 ms / 156 k desc
CHESS 3.1.3 | GFF3 | 2,761,061 | 53.6 s | 2 min 13.1 s | 2.48× | 1,175 | 91 ms / 161 k desc

[^1]: Legacy gffutils.create_db() on GENCODE v49 GTF (6.07 M lines) hits the bench's safety-valve cap (75 min). The reported wall is a conservative 2× extrapolation — the canonical GENCODE v45 GTF (2.0 M lines, 3× smaller) ran uncapped at 3,582 s (59 min 42 s) on the same hardware, so the v49 wall is well past 2 hours. See Performance Comparison §"GTF Synthesis Advantage" for the formal cost model.

[^2]: Result of the v0.1.0 ingest-pipeline optimization — the same RefSeq corpus used to take 7 min 49 s before the GFF3 path was re-architected to stamp seqid_y and bbox inline during the Arrow batch INSERT.

The same biological release, ingested in two different formats, by two different engines — that's the load-bearing comparison. Legacy GFF3 ingest finishes in 11 min because every parent edge is explicit; legacy GTF ingest takes hours because the parent rows have to be invented from the data (one Python ↔ SQLite round-trip per missing row). gffbase replaces those millions of round-trips with two set-based DuckDB GROUP BY aggregations + one recursive CTE — the same code path runs for GTF and GFF3, which is why the gffbase column barely shifts (4 min 37 s → 6 min 7 s) between the two rows while the legacy column balloons by 13×–20×.
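
To make the set-based claim concrete, here is a minimal, self-contained sketch using DuckDB directly. The toy schema and rows are illustrative only — they are not gffbase's internal tables or SQL — but they show how every missing transcript parent falls out of a single GROUP BY instead of one correlated lookup per missing row:

import duckdb

con = duckdb.connect()  # in-memory database, just for the sketch
con.execute("""
    CREATE TABLE features (
        id VARCHAR, featuretype VARCHAR, seqid VARCHAR,
        start BIGINT, "end" BIGINT, transcript_id VARCHAR
    )
""")
con.execute("""
    INSERT INTO features VALUES
        ('exon:1', 'exon', 'chr1', 100, 200, 'tx1'),
        ('exon:2', 'exon', 'chr1', 300, 450, 'tx1'),
        ('exon:3', 'exon', 'chr1', 900, 990, 'tx2')
""")

# One set-based pass synthesizes every missing transcript row at once:
# a parent's span is simply MIN(start)..MAX(end) over its children.
con.execute("""
    INSERT INTO features
    SELECT transcript_id, 'transcript', seqid,
           MIN(start), MAX("end"), transcript_id
    FROM features
    WHERE featuretype = 'exon'
    GROUP BY transcript_id, seqid
""")
print(con.execute("SELECT * FROM features WHERE featuretype = 'transcript'").fetchall())

# A second GROUP BY over the gene attribute synthesizes the gene rows the
# same way; legacy gffutils instead issues one SQLite round-trip per
# missing parent from inside a Python loop.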

Robustness: every corpus ingests cleanly with zero strict-mode warnings from the NCBI-spec-hardened Rust parser (9 enforced rules, line-numbered GFFFormatError, opt-in non-strict mode). RefSeq's notorious duplicate-ID=cds-NP_xxx convention (split CDS segments) is handled transparently — gffbase mirrors gffutils.merge_strategy="create_unique" automatically and records the remap in the duplicates table. No config knobs to flip.
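
For context, the knob this replaces looks roughly like the following in legacy gffutils (an illustrative sketch — the RefSeq filename is a placeholder; gffbase needs none of this):

import gffutils

# Legacy gffutils: RefSeq's repeated ID=cds-NP_xxx rows abort ingest under
# the default merge_strategy="error", so you opt into deduplication yourself.
legacy_db = gffutils.create_db(
    "GRCh38_latest_genomic.gff.gz",     # placeholder path
    "refseq_legacy.sqlite3",
    merge_strategy="create_unique",     # the behaviour gffbase applies automatically
    force=True,
)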

📊 Full reproducible numbers + per-corpus root-cause analysis: PERFORMANCE_COMPARISON.md. Re-run via python benchmarks/06_mega.py --legacy-timeout 900.


🚀 The Killer Feature — zero-copy PyArrow for ML pipelines

Modern ML genomics pipelines have one shape: pull every exon for 50 000 transcripts, push the column-oriented table into a tensor, train. Legacy gffutils forces a per-feature Python loop — constructing 1.6 M throwaway Feature objects per pull, which crushes both wall time and memory. gffbase bypasses Python entirely with a single batched call that returns DuckDB's internal Arrow buffers directly:

# 50 000 transcript IDs → every exon, in one query.
# Returns a zero-copy pyarrow.Table — no Python `Feature` object
# is constructed at any layer.
exons = db.children_batched(
    transcript_ids,
    featuretype="exon",
    format="arrow",        # or "df" / "polars"
)

# Hand off directly to PyTorch / Hugging Face datasets / JAX / Lance.
import torch
starts = torch.from_numpy(exons.column("start").to_numpy())
ends   = torch.from_numpy(exons.column("end").to_numpy())
# The "anchor" column carries the input id for each row, so you can
# reconstruct per-transcript groups without re-issuing N queries.
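
If you do want per-transcript groups back on the Python side, one option (a sketch, not part of the gffbase API) is to group the flat table by that anchor column — shown here with polars, one of the supported return formats:

import polars as pl

# Sketch: rebuild per-transcript exon groups from the flat batched result.
# Assumes the id-carrying column is literally named "anchor" (as described
# above) and a reasonably recent polars (from_arrow / group_by).
exon_df = pl.from_arrow(exons)
per_tx = exon_df.group_by("anchor").agg(
    pl.col("start"),   # exon starts collected into one list per transcript
    pl.col("end"),
)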

Numbers for that one call (50 000 transcripts, GENCODE basic annotation, returning 1.6 M exon rows):

Path | Wall | vs legacy
gffbase children_batched(format='arrow') | 1.16 s | 36.68× faster
legacy gffutils row-by-row loop | 42.55 s | 1.0× (baseline)
gffbase row-by-row loop | ≥ 642 s | 0.07× (slower!)

This is the reason GFFBase exists. Iterating for x in ids: db.children(x) with DuckDB pays vectorization startup per call and is slower than legacy's SQLite row-by-row path — but the batched API obliterates both row-by-row paths because it issues one set-based SQL query and avoids constructing any Python Feature objects whatsoever.

region_batched(...) and parents_batched(...) have the same zero-copy contract for spatial and parent workloads.


📦 Installation

pip install gffbase

Universal abi3-py39 wheels — single binary per arch covers CPython 3.9 → 3.13. No Rust toolchain required at install time.

For source/dev installs (Rust ≥ 1.69 + maturin):

pip install -e .[dev]
maturin develop --release

🏃 Quick start — row-by-row (drop-in for gffutils)

from gffbase import create_db

# 1. Ingest a GTF/GFF3 in seconds (auto-detects format, gzipped OK).
db = create_db("gencode.v49.chr_patch_hapl_scaff.basic.annotation.gtf.gz",
               "gencode.duckdb", force=True)

# 2. Walk a single gene's hierarchy.
for tx in db.children("ENSG00000139618", level=1, featuretype="transcript"):
    print(tx.id, tx.start, tx.end)

# 3. Spatial overlap query — uses the per-seqid R-tree under the hood.
for f in db.region("chr17:43044295-43125483", featuretype="exon"):
    print(f)

If you're migrating from gffutils, change one line:

import gffbase as gffutils    # one-line alias migration
db = gffutils.create_db(...)  # everything else identical

(But please read the Migration Guide first — it has one important note about ML loops.)
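
The short version of that note, using only calls shown on this page: keep the drop-in loop for one-off lookups, but move hot ML loops to the batched API.

# Fine for a handful of lookups, but slow if you loop over tens of
# thousands of IDs on DuckDB (see the benchmark table above):
for tx_id in transcript_ids:
    for exon in db.children(tx_id, featuretype="exon"):
        ...

# The batched form issues one set-based query instead:
exons = db.children_batched(transcript_ids, featuretype="exon", format="arrow")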


🤖 Quick start — vectorized for ML

from gffbase import FeatureDB

db = FeatureDB("gencode.duckdb")

# Pull every exon for 50 000 transcripts — one set-based SQL query.
exons = db.children_batched(
    transcript_ids,                # iterable of 50 000 IDs
    featuretype="exon",
    format="arrow",                # "df" / "polars" also supported
)
# exons is a pyarrow.Table sharing memory with DuckDB. No copies.

# Spatial: "for each ATAC-seq peak, find every overlapping CDS."
peaks = [("chr1", 100_000, 110_000), ("chr1", 200_000, 210_000), ...]
overlaps = db.region_batched(peaks, featuretype="CDS", format="arrow")

See the Machine Learning Workflows Cookbook for end-to-end pipelines with PyTorch and Hugging Face datasets.


✨ What's inside

  • Rust + PyO3 parser — SIMD line/tab splitting, lazy URL-decoding, GTF semicolon-in-quotes safe, gzipped input transparent. Hardened against the NCBI GFF3 spec (line-numbered GFFFormatError, strict / non-strict modes, 9 enforced rules).
  • DuckDB columnar storage — 7-table schema, set-based GTF gene/transcript synthesis, recursive-CTE transitive closure, per-seqid-banded R-tree spatial index built inline during ingest.
  • Smart routing — region() auto-picks R-tree vs B-tree; children() auto-picks closure cache vs dynamic CTE based on measured corpus depth.
  • Vectorized batched API — children_batched, parents_batched, region_batched return pyarrow.Table / pandas.DataFrame / polars.DataFrame directly out of DuckDB's buffer pool.
  • Drop-in legacy API — FeatureDB, Feature, create_db, DataIterator, GFFWriter, merge_criteria, interfeatures, bed12, the execute() SQL escape hatch (sketched just after this list), export_sqlite().
  • abi3 wheels — single binary per arch covers CPython 3.9–3.13.
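
A quick taste of the execute() escape hatch — a sketch only: the features table name is an assumption about the schema rather than something documented here, and the exact cursor type execute() returns may differ:

from gffbase import FeatureDB

db = FeatureDB("gencode.duckdb")
# Assumption: execute() mirrors the gffutils escape hatch and yields rows,
# and a table named "features" holds one row per annotation record.
for row in db.execute(
    "SELECT featuretype, COUNT(*) AS n FROM features "
    "GROUP BY featuretype ORDER BY n DESC"
):
    print(row)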

📚 Documentation

Full site (rendered with MkDocs Material) — build it locally:

pip install -e .[docs]
mkdocs serve            # http://localhost:8000
Page | What's there
Usage Gallery | Copy-pasteable snippets for every public API method
Performance comparison | Head-to-head numbers across every canonical human-genome annotation + per-corpus root-cause analysis
Migration guide for gffutils users | Drop-in compat checklist + the one OLAP/OLTP gotcha you must understand
Cookbooks | GENCODE/Ensembl, RefSeq, MANE, ML workflows
API reference | Every public method, full signatures + docstrings

🧪 Testing

pip install -e .[test]
pytest                  # 523 passed, 7 skipped, 99.19% coverage

CI runs the full matrix on Linux + macOS + Windows, both R-tree and B-tree fallback paths, on Python 3.9 / 3.11 / 3.13.


🤝 Contributing

GFFBase welcomes pull requests, bug reports, and feature suggestions. Start with CONTRIBUTING.md for the full guide:

  • Rust + Python development setup (maturin develop --release)
  • Running the test suite + the 99 % coverage gate
  • Branch naming, Conventional Commits, the PR checklist

The repo ships standard issue templates and a PR template so new contributions land with the context maintainers need to triage them quickly.


🪪 License

Apache License 2.0. See LICENSE.


Citation: if GFFBase helps your research, please cite the project at the Releases page.

