Python HML parser

pyHML

Features

import pyhml
hml_file = "hml_example.xml"
hmlparser = pyhml.HmlParser()
hml = hmlparser.parse(hml_file)
outdir = 'output/directory'

# Write each subject to outdir in FASTA format
hml.tobiotype(outdir, dtype='fasta', by='subject')

# Write the full HML file to outdir in IMGT dat format
hml.tobiotype(outdir, dtype='imgt', by='file')

# Get pandas DF from HML object
pandasdf = hml.toPandas()
print(pandasdf)

         ID     Locus                             glstring dbversion  \
    0   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    1   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    2   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    3   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    4   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    5   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    6   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    7   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    8   1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    9   1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    10  1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    11  1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    12  1367-7150-8  HLA-DPB1  HLA-DPB1*02:01:02+HLA-DPB1*04:01:01    3.14.0
    13  1367-7150-8  HLA-DPB1  HLA-DPB1*02:01:02+HLA-DPB1*04:01:01    3.14.0
    14  1367-7150-8  HLA-DRB1  HLA-DRB1*03:01:01+HLA-DRB1*07:01:01    3.15.0
    15  1367-7150-8  HLA-DRB1  HLA-DRB1*03:01:01+HLA-DRB1*07:01:01    3.15.0

                                                 sequence
    0   TTCCTGGATACTCACGACGCGGACCCAGTTCTCACTCCCATTGGGT...
    1   TTCCCGTCAGACCCCCCCAAGACACATATGACCCACCACCCCATCT...
    2   TTCCTGGATACTCACGACGCGGACCCAGTTCTCACTCCCATTGGGT...
    3   GTGCCTGTGTCCAGGCTGGTGTCTGGGTTCTGTGCTCTCTTCCCCA...
    4   CCATGGTGAGTTTCCCTGTACAAGAGTCCAAGGGGAGAGGTAAGTG...
    5   GGCCTCTGCGGAGAGGAGCGAGGGGCCCGCCCGGCGAGGGCGCAGG...
    6   CCATGGTGAGTTTCCCTGTACAAGAGTCCAAGGGGAGAGGTAAGTG...
    7   GGCCTCTGCGGAGAGGAGCGAGGGGCCCGCCCGGCGAGGGCGCAGG...
    8   AGGGATCAGGACGAAGTCCCAGGTCCCGGACGGGGCTCTCAGGGTC...
    9   CGCATCCCCACTTCCCACTCCCATTGGGTGTCGGATATCTAGAGAA...
    10  AGGGATCAGGACGAAGTCCCAGGTCCCGGACGGGGCTCTCAGGGTC...
    11  CGCATCCCCACTTCCCACTCCCATTGGGTGTCGGATATCTAGAGAA...
    12  CCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATT...
    13  CCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATT...
    14  CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA...
    15  CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA...
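
The glstring column in the output above holds one genotype per locus, with the two alleles joined by "+" (the GL String delimiter between gene copies). As a minimal sketch that does not depend on pyhml, the values can be split into individual alleles as shown below. The helper names split_glstring and alleles_by_locus are hypothetical and not part of pyhml, the rows are copied by hand from the example output, and only the "+" delimiter is handled (ambiguity delimiters such as "/" or "|" are out of scope here).

```python
def split_glstring(glstring):
    """Split a simple genotype GL String on '+' into its alleles.

    Hypothetical helper, not part of pyhml. Only the '+' delimiter
    (gene copies) is handled; ambiguity delimiters are left untouched.
    """
    return [allele for allele in glstring.split("+") if allele]


def alleles_by_locus(rows):
    """Map each locus to its allele list, keeping the first row per locus.

    The example DataFrame repeats the same glstring once per sequence,
    so later rows for a locus are ignored.
    """
    result = {}
    for locus, glstring in rows:
        result.setdefault(locus, split_glstring(glstring))
    return result


# (Locus, glstring) pairs copied by hand from the example output
rows = [
    ("HLA-A", "HLA-A*01:01:01+HLA-A*24:02:01"),
    ("HLA-A", "HLA-A*01:01:01+HLA-A*24:02:01"),
    ("HLA-B", "HLA-B*08:01:01+HLA-B*57:01:01"),
    ("HLA-DRB1", "HLA-DRB1*03:01:01+HLA-DRB1*07:01:01"),
]

typing = alleles_by_locus(rows)
for locus, alleles in typing.items():
    print(locus, alleles)
```

With a real DataFrame, the same pairs could presumably be taken from its Locus and glstring columns, assuming those column names match the output shown above.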

Install

pip install pyhml

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.0.5 (2017-04-16)

  • Improved documentation
  • Fixed issues with parsing HML files with NMDP-CORRECTION

0.0.4 (2017-04-15)

  • Fixed dependency issues
  • Moved tobiotype to the HML object
  • Moved toDF to the HML object and renamed it to toPandas()
  • Added tests and linked the project to Travis CI

0.0.3 (2017-04-14)

  • Added the ability to parse .gz files
  • Added the ability to parse HML files with bad tags.

0.0.2 (2017-11-14)

  • Fixed issues with parsing HML files with missing data

0.0.1 (2017-10-19)

  • First release on PyPI.
