Python HML parser
pyHML
- Free software: LGPL 3.0
- Documentation: https://pyhml.readthedocs.io.
- Jupyter Notebook
Features
    import pyhml

    hml_file = "hml_example.xml"
    hmlparser = pyhml.HmlParser()
    hml = hmlparser.parse(hml_file)
    outdir = 'output/directory'

    # Print out each subject in fasta format
    hml.tobiotype(outdir, dtype='fasta', by='subject')

    # Print out the full HML file in IMGT dat file format
    hml.tobiotype(outdir, dtype='imgt', by='file')

    # Get pandas DF from HML object
    pandasdf = hml.toPandas()
    print(pandasdf)

                 ID     Locus                             glstring dbversion  \
    0   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    1   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    2   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    3   1367-7150-8     HLA-A        HLA-A*01:01:01+HLA-A*24:02:01    3.14.0
    4   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    5   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    6   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    7   1367-7150-8     HLA-B        HLA-B*08:01:01+HLA-B*57:01:01    3.14.0
    8   1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    9   1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    10  1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    11  1367-7150-8     HLA-C        HLA-C*06:02:01+HLA-C*07:01:01    3.14.0
    12  1367-7150-8  HLA-DPB1  HLA-DPB1*02:01:02+HLA-DPB1*04:01:01    3.14.0
    13  1367-7150-8  HLA-DPB1  HLA-DPB1*02:01:02+HLA-DPB1*04:01:01    3.14.0
    14  1367-7150-8  HLA-DRB1  HLA-DRB1*03:01:01+HLA-DRB1*07:01:01    3.15.0
    15  1367-7150-8  HLA-DRB1  HLA-DRB1*03:01:01+HLA-DRB1*07:01:01    3.15.0

                                                 sequence
    0   TTCCTGGATACTCACGACGCGGACCCAGTTCTCACTCCCATTGGGT...
    1   TTCCCGTCAGACCCCCCCAAGACACATATGACCCACCACCCCATCT...
    2   TTCCTGGATACTCACGACGCGGACCCAGTTCTCACTCCCATTGGGT...
    3   GTGCCTGTGTCCAGGCTGGTGTCTGGGTTCTGTGCTCTCTTCCCCA...
    4   CCATGGTGAGTTTCCCTGTACAAGAGTCCAAGGGGAGAGGTAAGTG...
    5   GGCCTCTGCGGAGAGGAGCGAGGGGCCCGCCCGGCGAGGGCGCAGG...
    6   CCATGGTGAGTTTCCCTGTACAAGAGTCCAAGGGGAGAGGTAAGTG...
    7   GGCCTCTGCGGAGAGGAGCGAGGGGCCCGCCCGGCGAGGGCGCAGG...
    8   AGGGATCAGGACGAAGTCCCAGGTCCCGGACGGGGCTCTCAGGGTC...
    9   CGCATCCCCACTTCCCACTCCCATTGGGTGTCGGATATCTAGAGAA...
    10  AGGGATCAGGACGAAGTCCCAGGTCCCGGACGGGGCTCTCAGGGTC...
    11  CGCATCCCCACTTCCCACTCCCATTGGGTGTCGGATATCTAGAGAA...
    12  CCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATT...
    13  CCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATTGGCCAATT...
    14  CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA...
    15  CATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCA...
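The glstring column in the output above holds one genotype per locus, with the two alleles joined by "+". As a minimal sketch of how such a value might be taken apart, the helpers below use only the Python standard library. The names split_glstring and locus_of are hypothetical and are not part of pyhml. The sketch assumes unambiguous single-locus genotypes like those shown; it does not handle the other GL String delimiters ("/", "|", "~", "^").

```python
def split_glstring(glstring):
    """Split an unambiguous single-locus GL String into its alleles.

    Hypothetical helper, not part of pyhml.
    """
    # In GL String notation, '+' separates the gene copies of a genotype.
    return glstring.split("+")


def locus_of(allele):
    """Return the locus prefix of an allele name (the text before '*')."""
    return allele.split("*", 1)[0]


# Value copied from the glstring column of the example output.
alleles = split_glstring("HLA-A*01:01:01+HLA-A*24:02:01")
print(alleles)
print([locus_of(a) for a in alleles])
```

The same functions could be applied row by row to the glstring column of the DataFrame returned by toPandas().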
Install
pip install pyhml
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.0.5 (2017-04-16)
- Improved documentation
- Fixed issues with parsing HML files with NMDP-CORRECTION
0.0.4 (2017-04-15)
- Fixed dependency issues.
- Moved tobiotype to HML object.
- Moved toDF to HML object and renamed it to toPandas()
- Added tests and linked to Travis CI
0.0.3 (2017-04-14)
- Added the ability to parse .gz files
- Added the ability to parse HML files with bad tags.
0.0.2 (2017-11-14)
- Fixed issues with parsing HML files with missing data
0.0.1 (2017-10-19)
- First release on PyPI.