Python library for parsing UniProt XML data

uniprotlib

Note: This library was vibe coded with Claude. It works, it's tested, but review accordingly.

Python library for parsing UniProt XML files. Handles both single-entry downloads and multi-GB gzip-compressed database dumps with bounded memory usage.

Installation

pip install uniprotlib

Or with uv:

uv add uniprotlib

Usage

from uniprotlib import parse_xml

# single file
for entry in parse_xml("Q9Y261.xml"):
    print(entry.primary_accession, entry.protein_name)

# gzipped bulk download
for entry in parse_xml("uniprot_sprot.xml.gz"):
    print(entry.gene.primary, entry.organism.scientific_name)

# multiple files
for entry in parse_xml("human.xml.gz", "mouse.xml.gz"):
    print(entry.primary_accession)

parse_xml() accepts one or more file paths and returns an iterator that yields UniProtEntry objects one at a time. Gzip compression is detected automatically from the .gz extension. Memory use stays bounded regardless of file size.

Parsed fields

| Model | Fields |
| --- | --- |
| UniProtEntry | primary_accession, accessions, entry_name, dataset, protein_name, gene, organism, sequence, keywords, db_references |
| Gene | primary, synonyms, ordered_locus_names, orf_names |
| Organism | scientific_name, common_name, tax_id, lineage |
| Sequence | value, length, mass, checksum |
| DbReference | type, id, molecule, properties |
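As an illustration of how these fields might be combined, here is a small sketch that works on any iterable of entry-like objects. It assumes db_references is a sequence of objects with type and id attributes and that organism is always set; neither is confirmed by the table. The stand-in objects are built with SimpleNamespace so the example runs without the library, and the helper names are hypothetical.

```python
from collections import Counter
from collections.abc import Iterable
from types import SimpleNamespace


def xref_ids(entry, db_type: str) -> list[str]:
    """Return the ids of an entry's cross-references for one database type."""
    return [ref.id for ref in entry.db_references if ref.type == db_type]


def count_by_organism(entries: Iterable) -> Counter:
    """Tally entries by organism.scientific_name, consuming the iterable lazily."""
    return Counter(entry.organism.scientific_name for entry in entries)


# Stand-in entries with made-up values; real code would pass parse_xml(...).
demo = [
    SimpleNamespace(
        organism=SimpleNamespace(scientific_name="Homo sapiens"),
        db_references=[
            SimpleNamespace(type="PDB", id="1ABC"),
            SimpleNamespace(type="GO", id="GO:0000001"),
        ],
    ),
    SimpleNamespace(
        organism=SimpleNamespace(scientific_name="Homo sapiens"),
        db_references=[],
    ),
    SimpleNamespace(
        organism=SimpleNamespace(scientific_name="Mus musculus"),
        db_references=[SimpleNamespace(type="PDB", id="2XYZ")],
    ),
]

pdb_ids = xref_ids(demo[0], "PDB")
organism_counts = count_by_organism(demo)
```

Because count_by_organism consumes its argument as a generator expression, passing it the iterator from parse_xml() should not require holding all entries in memory at once.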

All model classes are fully type-annotated dataclasses, and the package ships a py.typed marker so that type checkers pick up the annotations.

Development

Requires Python >= 3.12 and uv.

uv sync
uv run pytest tests/ -v

License

MIT
