Python HML parser

pyHML

A Python parser for HML (Histoimmunogenetics Markup Language) files.

Features

import pyhml

# Parse an HML file into an HML object
hml_file = "hml_example.xml"
hmlparser = pyhml.HmlParser()
hml = hmlparser.parse(hml_file)

# Output directory passed to tobiotype()
outdir = 'output/directory'

# Print out each subject in fasta format
hml.tobiotype(outdir, dtype='fasta', by='subject')

# Print out the full HML file in IMGT dat file format
hml.tobiotype(outdir, dtype='imgt', by='file')

# Get pandas DF from HML object
pandasdf = hml.toPandas()
print(pandasdf)

         ID     Locus                             glstring dbversion  \
    0   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    1   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    2   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    3   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    4   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    5   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    6   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    7   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    8   1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    9   1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    10  1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    11  1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    12  1367-7150-8  HLA-DPB1  HLA-DPB1*02:01:02+HLA-DPB1*04:01:01    3.14.0
    13  1367-7150-8  HLA-DPB1  HLA-DPB1*02:01:02+HLA-DPB1*04:01:01    3.14.0
    14  1367-7150-8  HLA-DRB1  HLA-DRB1*03:01:01+HLA-DRB1*07:01:01    3.15.0
    15  1367-7150-8  HLA-DRB1  HLA-DRB1*03:01:01+HLA-DRB1*07:01:01    3.15.0

                                                 sequence
    0   TTCCTGGATACTCACGACGCGGACCCAGTTCTCACTCCCATTGGGT...
    1   TTCCCGTCAGACCCCCCCAAGACACATATGACCCACCACCCCATCT...
    2   TTCCTGGATACTCACGACGCGGACCCAGTTCTCACTCCCATTGGGT...
    3   GTGCCTGTGTCCAGGCTGGTGTCTGGGTTCTGTGCTCTCTTCCCCA...
    4   CCATGGTGAGTTTCCCTGTACAAGAGTCCAAGGGGAGAGGTAAGTG...
    5   GGCCTCTGCGGAGAGGAGCGAGGGGCCCGCCCGGCGAGGGCGCAGG...
    6   CCATGGTGAGTTTCCCTGTACAAGAGTCCAAGGGGAGAGGTAAGTG...
    7   GGCCTCTGCGGAGAGGAGCGAGGGGCCCGCCCGGCGAGGGCGCAGG...
    8   AGGGATCAGGACGAAGTCCCAGGTCCCGGACGGGGCTCTCAGGGTC...
    9   CGCATCCCCACTTCCCACTCCCATTGGGTGTCGGATATCTAGAGAA...
    10  AGGGATCAGGACGAAGTCCCAGGTCCCGGACGGGGCTCTCAGGGTC...
    11  CGCATCCCCACTTCCCACTCCCATTGGGTGTCGGATATCTAGAGAA...
    12  CCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATT...
    13  CCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATT...
    14  CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA...
    15  CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA...
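
The DataFrame above repeats the same typing for every sequence row, so a common next step is to collapse it to one row per locus and split the GL string into its alleles. The following is a minimal sketch using plain pandas only. It assumes the frame returned by toPandas() has the column names shown in the printed output (ID, Locus, glstring, dbversion, sequence); the small frame built here is a hypothetical stand-in with shortened sequences, not real pyhml output.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by hml.toPandas().
# Column names and typing values are copied from the printed output above;
# the sequences are shortened placeholders.
df = pd.DataFrame(
    {
        "ID": ["1367-7150-8"] * 4,
        "Locus": ["HLA-A", "HLA-A", "HLA-B", "HLA-B"],
        "glstring": [
            "HLA-A*01:01:01+HLA-A*24:02:01",
            "HLA-A*01:01:01+HLA-A*24:02:01",
            "HLA-B*08:01:01+HLA-B*57:01:01",
            "HLA-B*08:01:01+HLA-B*57:01:01",
        ],
        "dbversion": ["3.14.0"] * 4,
        "sequence": ["TTCCTGG", "TTCCCGT", "CCATGGT", "GGCCTCT"],
    }
)

# Keep one row per subject and locus, dropping the per-sequence repeats.
typing = (
    df[["ID", "Locus", "glstring", "dbversion"]]
    .drop_duplicates()
    .reset_index(drop=True)
)

# In GL String notation "+" separates the two gene copies of a genotype,
# so splitting on it yields the individual alleles.
typing["alleles"] = typing["glstring"].str.split("+")

# Count how many sequence rows were reported for each locus.
seq_counts = df.groupby("Locus")["sequence"].count()

print(typing[["Locus", "alleles"]])
print(seq_counts)
```

With a real HML file, `df` would instead come from `hml.toPandas()`, and the rest of the sketch should apply unchanged as long as the column names match.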

Install

pip install pyhml

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.0.5 (2017-04-16)

  • Improved documentation
  • Fixed issues with parsing HML files with NMDP-CORRECTION

0.0.4 (2017-04-15)

  • Fixed dependency issues
  • Moved tobiotype to the HML object
  • Moved toDF to the HML object and renamed it to toPandas()
  • Added tests and linked them to Travis CI

0.0.3 (2017-04-14)

  • Added the ability to parse .gz files
  • Added the ability to parse HML files with bad tags

0.0.2 (2017-11-14)

  • Fixed issues with parsing HML files with missing data

0.0.1 (2017-10-19)

  • First release on PyPI.
