Python HML parser

pyHML

Features

import pyhml
hml_file = "hml_example.xml"
hmlparser = pyhml.HmlParser()
hml = hmlparser.parse(hml_file)
outdir = 'output/directory'

# Write each subject to the output directory in FASTA format
hml.tobiotype(outdir, dtype='fasta', by='subject')

# Write the full HML file to the output directory in IMGT dat file format
hml.tobiotype(outdir, dtype='imgt', by='file')

# Get a pandas DataFrame from the HML object
pandasdf = hml.toPandas()
print(pandasdf)

         ID     Locus                             glstring dbversion  \
    0   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    1   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    2   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    3   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    4   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    5   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    6   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    7   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    8   1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    9   1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    10  1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    11  1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    12  1367-7150-8  HLA-DPB1  HLA-DPB1*02:01:02+HLA-DPB1*04:01:01    3.14.0
    13  1367-7150-8  HLA-DPB1  HLA-DPB1*02:01:02+HLA-DPB1*04:01:01    3.14.0
    14  1367-7150-8  HLA-DRB1  HLA-DRB1*03:01:01+HLA-DRB1*07:01:01    3.15.0
    15  1367-7150-8  HLA-DRB1  HLA-DRB1*03:01:01+HLA-DRB1*07:01:01    3.15.0

                                                 sequence
    0   TTCCTGGATACTCACGACGCGGACCCAGTTCTCACTCCCATTGGGT...
    1   TTCCCGTCAGACCCCCCCAAGACACATATGACCCACCACCCCATCT...
    2   TTCCTGGATACTCACGACGCGGACCCAGTTCTCACTCCCATTGGGT...
    3   GTGCCTGTGTCCAGGCTGGTGTCTGGGTTCTGTGCTCTCTTCCCCA...
    4   CCATGGTGAGTTTCCCTGTACAAGAGTCCAAGGGGAGAGGTAAGTG...
    5   GGCCTCTGCGGAGAGGAGCGAGGGGCCCGCCCGGCGAGGGCGCAGG...
    6   CCATGGTGAGTTTCCCTGTACAAGAGTCCAAGGGGAGAGGTAAGTG...
    7   GGCCTCTGCGGAGAGGAGCGAGGGGCCCGCCCGGCGAGGGCGCAGG...
    8   AGGGATCAGGACGAAGTCCCAGGTCCCGGACGGGGCTCTCAGGGTC...
    9   CGCATCCCCACTTCCCACTCCCATTGGGTGTCGGATATCTAGAGAA...
    10  AGGGATCAGGACGAAGTCCCAGGTCCCGGACGGGGCTCTCAGGGTC...
    11  CGCATCCCCACTTCCCACTCCCATTGGGTGTCGGATATCTAGAGAA...
    12  CCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATT...
    13  CCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATT...
    14  CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA...
    15  CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA...
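
The glstring column above holds one genotype per locus, with the two alleles joined by "+". As a minimal sketch, not part of pyhml itself, the following shows one way to turn such rows into a per-locus allele list using only the standard library. The helper names split_genotype and alleles_by_locus are hypothetical, the rows are copied by hand from the printed output (sequence column omitted), and the code assumes simple GL strings that contain no ambiguity ("/", "|") or phasing ("~") delimiters.

```python
# Rows copied from the DataFrame printed above; the sequence column is omitted.
rows = [
    {"ID": "1367-7150-8", "Locus": "HLA-A",
     "glstring": "HLA-A*01:01:01+HLA-A*24:02:01", "dbversion": "3.14.0"},
    {"ID": "1367-7150-8", "Locus": "HLA-A",
     "glstring": "HLA-A*01:01:01+HLA-A*24:02:01", "dbversion": "3.14.0"},
    {"ID": "1367-7150-8", "Locus": "HLA-B",
     "glstring": "HLA-B*08:01:01+HLA-B*57:01:01", "dbversion": "3.14.0"},
    {"ID": "1367-7150-8", "Locus": "HLA-DRB1",
     "glstring": "HLA-DRB1*03:01:01+HLA-DRB1*07:01:01", "dbversion": "3.15.0"},
]


def split_genotype(glstring):
    """Split a simple GL string into its alleles on the '+' delimiter.

    Assumption: the string has no '/', '|', '~' or '^' delimiters,
    which a full GL string parser would also need to handle.
    """
    return glstring.split("+")


def alleles_by_locus(records):
    """Map each locus to its allele list, keeping the first row seen per locus.

    The table repeats the same glstring once per sequence, so later
    rows for an already-seen locus are skipped.
    """
    result = {}
    for record in records:
        locus = record["Locus"]
        if locus not in result:
            result[locus] = split_genotype(record["glstring"])
    return result


if __name__ == "__main__":
    for locus, alleles in alleles_by_locus(rows).items():
        print(locus, alleles)
```

With a real DataFrame, the same list of dictionaries can likely be obtained with pandasdf.to_dict("records"), assuming the column names match those in the printed output.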

Install

pip install pyhml

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.0.5 (2017-04-16)

  • Improved documentation

  • Fixed issues with parsing HML files that contain NMDP-CORRECTION

0.0.4 (2017-04-15)

  • Fixed dependency issues.

  • Moved tobiotype to HML object.

  • Moved toDF to the HML object and renamed it to toPandas()

  • Added tests and linked them to Travis CI

0.0.3 (2017-04-14)

  • Added the ability to parse .gz files

  • Added the ability to parse HML files with bad tags.

0.0.2 (2017-11-14)

  • Fixed issues with parsing HML files with missing data

0.0.1 (2017-10-19)

  • First release on PyPI.
