Fast Cython-backed parsing for GTF attribute columns.

gtfreader

gtfreader is a small package that parses GTF files into pandas DataFrames.

Install

python -m pip install -e .

Usage

from gtfreader import read_gtf, read_gtf_python

df = read_gtf("annotation.gtf")
df_python = read_gtf_python("annotation.gtf")

read_gtf(...) uses the compiled parser path when it is available. read_gtf_python(...) uses the high-level pure-Python parser path, which follows the style of the current pyranges reader.
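The "compiled when available" behaviour is commonly implemented as an import-time fallback. The following is a minimal sketch of that general pattern, not gtfreader's actual code: the module name `hypothetical_compiled_gtf_parser`, the `parse` attribute, and both function names are assumptions made for illustration only.

```python
import importlib


def parse_pure_python(lines):
    """Stand-in pure-Python parser (hypothetical): split each line on whitespace."""
    return [line.split() for line in lines]


def select_parser(compiled_module_name):
    """Prefer a compiled module's `parse` function; fall back if the import fails."""
    try:
        module = importlib.import_module(compiled_module_name)
    except ImportError:
        # The compiled extension is not installed, so use the Python path.
        return parse_pure_python, "python"
    return module.parse, "compiled"


# The module name below is made up, so the fallback branch is taken here.
parser, backend = select_parser("hypothetical_compiled_gtf_parser")
rows = parser(['gene_id "G1";'])
```

With this pattern the caller gets the same interface either way, and the backend label makes it easy to log which path was taken.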

To use the compiled low-level parser directly, pass it the raw attribute strings from column 9 of the GTF file, before they have been expanded into separate columns:

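import pandas as pd

from gtfreader import find_first_data_line_index, parse_chunk_columns

# Index of the first data line, so the leading header lines can be skipped.
skiprows = find_first_data_line_index("annotation.gtf")

# Read only column 9 (zero-based index 8), which holds the raw attribute strings.
attribute_lines = pd.read_csv(
    "annotation.gtf",
    sep="\t",
    header=None,
    usecols=[8],
    names=["Attribute"],
    skiprows=skiprows,
)["Attribute"].tolist()

# Hand the unexpanded attribute strings to the compiled parser.
compiled_columns = parse_chunk_columns(attribute_lines)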
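To make "raw" versus "expanded" attributes concrete, here is a small standard-library sketch of the two steps above: locating the first data line and expanding a column 9 string into key/value pairs. It only illustrates the idea and is not the compiled parser's implementation. The helper names `first_data_line_index` and `expand_attributes` and the sample record are assumptions, and the output shape of `parse_chunk_columns` is not shown because the text above does not specify it.

```python
import re

# Matches one `key "value";` or `key value;` pair in a GTF attribute string.
_ATTRIBUTE_PAIR = re.compile(r'\s*([^\s;]+)\s+(?:"([^"]*)"|([^\s;"]+))\s*;?')


def first_data_line_index(lines):
    """Return the index of the first line that is not a '#' comment line."""
    for index, line in enumerate(lines):
        if not line.startswith("#"):
            return index
    return len(lines)


def expand_attributes(attribute):
    """Expand one raw column-9 string into a {key: value} dict.

    Repeated keys (for example several `tag` entries) keep only the last value.
    """
    result = {}
    for match in _ATTRIBUTE_PAIR.finditer(attribute):
        quoted, bare = match.group(2), match.group(3)
        result[match.group(1)] = quoted if quoted is not None else bare
    return result


# Made-up two-line GTF: one header line followed by one gene record.
gtf_lines = [
    "#!genome-build GRCh38",
    'chr1\thavana\tgene\t11869\t14409\t.\t+\t.\t'
    'gene_id "ENSG00000223972"; gene_name "DDX11L1"; level 2;',
]

start = first_data_line_index(gtf_lines)
raw_attributes = [line.split("\t")[8] for line in gtf_lines[start:]]
expanded = [expand_attributes(raw) for raw in raw_attributes]
```

Here `raw_attributes` holds the kind of unexpanded strings the low-level parser expects as input, while `expanded` shows the per-record key/value view that a GTF reader eventually turns into DataFrame columns.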

Build

python -m build

Test

python -m pytest -q

Source Distribution

gtfreader-0.1.0.tar.gz (111.7 kB view details)

Uploaded Source

File details

Details for the file gtfreader-0.1.0.tar.gz.

File metadata

  • Download URL: gtfreader-0.1.0.tar.gz
  • Upload date:
  • Size: 111.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for gtfreader-0.1.0.tar.gz
Algorithm Hash digest
SHA256 701d2308fdc2d66384f4a3c7000284d8dbcd3efb2975100482b8eeade4c645d5
MD5 dc74dd91e84d53668cdba5cb61771eb6
BLAKE2b-256 b826010b61e71fd1f9773f89c0ac4092480c9d83c5aa288978eb497047d51535
