# semi structured xml to dict
ssxtd is an XML reader similar to xmltodict, but it supports semi-structured XML and provides a more flexible environment.
ssxtd uses either of:
* the lxml package
* the native package ElementTree
The choice, and the installation, are up to the user.
In general, lxml performs better than ElementTree.
## Getting started
* if you can't install lxml, and are limited in RAM, use:
`parsers.xml_iterparse(my_file, depth=2)`
* if you can't install lxml, and are NOT limited in RAM, use:
`parsers.xml_parse(my_file, depth=2)`
* if you CAN install lxml, and are limited in RAM, use:
`parsers.lxml_iterparse(my_file, depth=2)`
* if you CAN install lxml, and are NOT limited in RAM, use:
`parsers.lxml_parse(my_file, depth=2)`
`depth` is the depth of the tag you want to parse.
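The four cases above reduce to two yes/no questions. A minimal sketch of that decision follows; `choose_parser_name` and `lxml_is_installed` are hypothetical helpers written for this illustration and are not part of ssxtd. Only the four returned names come from the list above.

```python
import importlib.util


def choose_parser_name(lxml_available, low_memory):
    """Return the name of the parser function suggested by the list above."""
    prefix = "lxml" if lxml_available else "xml"      # which backend is usable
    suffix = "iterparse" if low_memory else "parse"   # iterative when RAM is limited
    return f"{prefix}_{suffix}"


def lxml_is_installed():
    # find_spec returns None when the package cannot be found
    return importlib.util.find_spec("lxml") is not None


if __name__ == "__main__":
    print(choose_parser_name(lxml_is_installed(), low_memory=True))
```

Assuming `parsers` is the imported ssxtd module, the returned name could then be resolved with `getattr(parsers, name)`.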
## Mixed tag and text
ssxtd converts mixed tags and text to a single string, preserving the order in which they appear in the XML.
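To make "mixed tags and text" concrete, here is a small sketch using only the standard library. It is not ssxtd's implementation, and the exact string ssxtd produces may differ; `has_mixed_content` and `flatten_mixed` are hypothetical helpers that show one way to detect such an element and flatten it in document order.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(element):
    """True when an element holds both child tags and non-blank text."""
    has_children = len(element) > 0
    texts = [element.text] + [child.tail for child in element]
    has_text = any(t and t.strip() for t in texts)
    return has_children and has_text


def flatten_mixed(element):
    """Join the leading text and the serialized children in document order."""
    parts = [element.text or ""]
    for child in element:
        # ET.tostring also emits child.tail, the text that follows the child
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


sample = ET.fromstring("<p>Hello <b>world</b> again</p>")
flattened = flatten_mixed(sample)
```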
## Flexible
### Compressed files
If you specify the `compression` parameter when calling a parser, the file will be decompressed.
Accepted values: "gz", "zip"
```
parsers.xml_parse(my_file, depth=2, compression="gz")
```
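For readers curious what such an option involves, the sketch below opens a plain, gzip or zip file with the standard library. It is an illustration only, not how ssxtd handles compression internally; `open_xml` is a hypothetical helper, and the zip branch assumes the archive contains a single XML file.

```python
import gzip
import io
import zipfile


def open_xml(path, compression=None):
    """Return a binary file object for a plain, gzip or zip file."""
    if compression is None:
        return open(path, "rb")
    if compression == "gz":
        return gzip.open(path, "rb")
    if compression == "zip":
        with zipfile.ZipFile(path) as archive:
            # Assumption: the first member is the XML file; it is read into memory
            first_member = archive.namelist()[0]
            return io.BytesIO(archive.read(first_member))
    raise ValueError(f"unsupported compression: {compression!r}")
```

The returned object can be passed to any parser that accepts a file object, for example `xml.etree.ElementTree.parse`.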
### Object processor
If you specify the parameter "object_processor=my_function" when calling a parser, your function will be called for each object.
```
WIP (see bin/run_exemple.py )
```
This allows special actions, such as merging tags, to be performed directly during parsing.
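Since the documented example is still a work in progress, the following is only a guess at what a tag-merging processor might look like. It assumes the processor receives one parsed object as a dict and returns the object to keep; the real signature may differ, so check bin/run_exemple.py. The function name and the tag names `first_name`, `last_name` and `name` are hypothetical.

```python
def merge_name_tags(obj):
    """Hypothetical object processor: merge two tags into a single 'name' tag."""
    merged = dict(obj)  # copy so the caller's object is left untouched
    first = merged.pop("first_name", None)
    last = merged.pop("last_name", None)
    full_name = " ".join(part for part in (first, last) if part)
    if full_name:
        merged["name"] = full_name
    return merged
```

Under that assumption it would be passed as `object_processor=merge_name_tags`.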
### Value processor
If you specify the parameter "value_processor=my_function" when calling a parser, your function will be called for each value.
e.g. a simple type conversion:
```
def try_conversion(value):
    # Try the strictest numeric type first
    try:
        return int(value)
    except (ValueError, TypeError):
        pass
    # Fall back to float for decimal values
    try:
        return float(value)
    except (ValueError, TypeError):
        pass
    # Not numeric: return the value unchanged
    return value
```
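To see the effect of such a converter without running a parser, the sketch below applies it to every leaf of an already parsed nested structure. `convert_values` is a hypothetical helper, and the compact `try_conversion` follows the same int-then-float idea as the converter above.

```python
def try_conversion(value):
    # int first, then float, otherwise return the value unchanged
    for cast in (int, float):
        try:
            return cast(value)
        except (ValueError, TypeError):
            continue
    return value


def convert_values(node):
    """Apply try_conversion to every leaf of nested dicts and lists."""
    if isinstance(node, dict):
        return {key: convert_values(value) for key, value in node.items()}
    if isinstance(node, list):
        return [convert_values(item) for item in node]
    return try_conversion(node)


converted = convert_values({"a": "42", "b": ["3.5", "x"], "c": None})
```

Here `converted["a"]` is the integer 42, `"3.5"` becomes a float, and the non-numeric values are left as they were.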
## Performance of the parsing functions
| Parser | Time (unit not stated) | Processed values |
|---|---|---|
| lxml_parse | 17.63099956512451 | 7764 |
| lxml_iterparse | 19.163238525390625 | 7764 |
| xml_parse | 17.3682701587677 | 7764 |
| xml_iterparse | 27.15250539779663 | 7764 |
| xmltodict (other lib) | 18.277526140213013 | 7764 |
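The benchmark script behind these figures is not shown. As a rough sketch of how a comparable measurement could be taken, the hypothetical helper below times any callable that returns an iterable and counts the items it yields. It assumes the parsers return iterables, and it counts yielded items rather than individual values, so it will not reproduce the numbers above exactly.

```python
import time


def time_parser(parse, *args, **kwargs):
    """Return (elapsed_seconds, item_count) for one full pass over parse(...)."""
    start = time.perf_counter()
    # Consume the whole iterable so that lazy parsers do all of their work
    count = sum(1 for _ in parse(*args, **kwargs))
    elapsed = time.perf_counter() - start
    return elapsed, count
```

Assuming `parsers` is the imported ssxtd module, a call would look like `time_parser(parsers.xml_parse, my_file, depth=2)`.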