Python HML parser
Project description
pyHML
Free software: LGPL 3.0
Documentation: https://pyhml.readthedocs.io
Features
import pyhml
hml_file = "hml_example.xml"
hmlparser = pyhml.HmlParser()
hml = hmlparser.parse(hml_file)
outdir = 'output/directory'
# Print out each subject in fasta format
hml.tobiotype(outdir, dtype='fasta', by='subject')
# Print out the full HML file in IMGT dat file format
hml.tobiotype(outdir, dtype='imgt', by='file')
# Get pandas DF from HML object
pandasdf = hml.toPandas()
print(pandasdf)
ID Locus glstring dbversion \
0 1367-7150-8 HLA-A HLA-A*01:01:01+HLA-A*24:02:01 3.14.0
1 1367-7150-8 HLA-A HLA-A*01:01:01+HLA-A*24:02:01 3.14.0
2 1367-7150-8 HLA-A HLA-A*01:01:01+HLA-A*24:02:01 3.14.0
3 1367-7150-8 HLA-A HLA-A*01:01:01+HLA-A*24:02:01 3.14.0
4 1367-7150-8 HLA-B HLA-B*08:01:01+HLA-B*57:01:01 3.14.0
5 1367-7150-8 HLA-B HLA-B*08:01:01+HLA-B*57:01:01 3.14.0
6 1367-7150-8 HLA-B HLA-B*08:01:01+HLA-B*57:01:01 3.14.0
7 1367-7150-8 HLA-B HLA-B*08:01:01+HLA-B*57:01:01 3.14.0
8 1367-7150-8 HLA-C HLA-C*06:02:01+HLA-C*07:01:01 3.14.0
9 1367-7150-8 HLA-C HLA-C*06:02:01+HLA-C*07:01:01 3.14.0
10 1367-7150-8 HLA-C HLA-C*06:02:01+HLA-C*07:01:01 3.14.0
11 1367-7150-8 HLA-C HLA-C*06:02:01+HLA-C*07:01:01 3.14.0
12 1367-7150-8 HLA-DPB1 HLA-DPB1*02:01:02+HLA-DPB1*04:01:01 3.14.0
13 1367-7150-8 HLA-DPB1 HLA-DPB1*02:01:02+HLA-DPB1*04:01:01 3.14.0
14 1367-7150-8 HLA-DRB1 HLA-DRB1*03:01:01+HLA-DRB1*07:01:01 3.15.0
15 1367-7150-8 HLA-DRB1 HLA-DRB1*03:01:01+HLA-DRB1*07:01:01 3.15.0
sequence
0 TTCCTGGATACTCACGACGCGGACCCAGTTCTCACTCCCATTGGGT...
1 TTCCCGTCAGACCCCCCCAAGACACATATGACCCACCACCCCATCT...
2 TTCCTGGATACTCACGACGCGGACCCAGTTCTCACTCCCATTGGGT...
3 GTGCCTGTGTCCAGGCTGGTGTCTGGGTTCTGTGCTCTCTTCCCCA...
4 CCATGGTGAGTTTCCCTGTACAAGAGTCCAAGGGGAGAGGTAAGTG...
5 GGCCTCTGCGGAGAGGAGCGAGGGGCCCGCCCGGCGAGGGCGCAGG...
6 CCATGGTGAGTTTCCCTGTACAAGAGTCCAAGGGGAGAGGTAAGTG...
7 GGCCTCTGCGGAGAGGAGCGAGGGGCCCGCCCGGCGAGGGCGCAGG...
8 AGGGATCAGGACGAAGTCCCAGGTCCCGGACGGGGCTCTCAGGGTC...
9 CGCATCCCCACTTCCCACTCCCATTGGGTGTCGGATATCTAGAGAA...
10 AGGGATCAGGACGAAGTCCCAGGTCCCGGACGGGGCTCTCAGGGTC...
11 CGCATCCCCACTTCCCACTCCCATTGGGTGTCGGATATCTAGAGAA...
12 CCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATT...
13 CCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATT...
14 CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA...
15 CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA...
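The glstring column holds one GL String per locus. In GL String notation, `+` separates the two gene copies of a genotype. As a minimal sketch, independent of pyhml itself, the helper below splits such a value into its alleles using only the standard library. `split_genotype` is a hypothetical name and not part of the pyhml API. It handles only the simple `allele+allele` form shown in the output above and ignores the other GL String delimiters (`^`, `|`, `~`, `/`).

```python
def split_genotype(glstring):
    """Split a single-locus GL String such as 'HLA-A*01:01:01+HLA-A*24:02:01'.

    Returns (locus, [allele, ...]). Hypothetical helper, not part of pyhml.
    """
    # '+' separates the gene copies within one genotype
    alleles = glstring.split("+")
    # The locus is the part of each allele name before '*', e.g. 'HLA-A'
    loci = {allele.split("*", 1)[0] for allele in alleles}
    if len(loci) != 1:
        raise ValueError("expected alleles from a single locus: %r" % glstring)
    return loci.pop(), alleles


locus, alleles = split_genotype("HLA-A*01:01:01+HLA-A*24:02:01")
print(locus, alleles)
# HLA-A ['HLA-A*01:01:01', 'HLA-A*24:02:01']
```

Applied to a DataFrame like the one above, the same function could presumably be mapped over the glstring column, though that usage is untested here.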
Install
pip install pyhml
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.0.5 (2017-04-16)
Improved documentation
Fixed issues with parsing HML files with NMDP-CORRECTION
0.0.4 (2017-04-15)
Fixed dependency issues
Moved tobiotype to the HML object
Moved toDF to the HML object and renamed it toPandas()
Added tests and linked to Travis CI
0.0.3 (2017-04-14)
Added the ability to parse .gz files
Added the ability to parse HML files with bad tags.
0.0.2 (2017-11-14)
Fixed issues with parsing HML files with missing data
0.0.1 (2017-10-19)
First release on PyPI.
Download files
Source Distribution
pyhml-0.0.5.tar.gz (3.5 MB)
Built Distribution
pyhml-0.0.5-py2.py3-none-any.whl (92.4 kB)
Hashes for pyhml-0.0.5-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 8612da89f1cc830ade94699eb96d9a94067a37238a010ca395006cc407b5fc51
MD5 | 45a7e2be2a4dc7d21bf2eacf09ba437a
BLAKE2b-256 | 5d7ec6f4ecc10b161c7cd49d8b89bdb2fc3b357ffbea7e665a462958edcc2736
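To check a downloaded wheel against the SHA256 digest listed above, a minimal sketch using only the standard library could look like the following. The function name `sha256_of` and the local file path are assumptions for illustration. The expected digest is the one from the hash listing.

```python
import hashlib
import os


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest copied from the hash listing for the wheel
EXPECTED = "8612da89f1cc830ade94699eb96d9a94067a37238a010ca395006cc407b5fc51"
# Assumed local path of the downloaded file
wheel = "pyhml-0.0.5-py2.py3-none-any.whl"

if os.path.exists(wheel):
    print("OK" if sha256_of(wheel) == EXPECTED else "MISMATCH")
```

Reading in chunks keeps memory use constant, which matters more for the 3.5 MB source distribution than for the wheel.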