Decimate the data elements of an XML file while keeping all other elements intact
Project description
Xml Subsetter
Keep only a leading fraction of the repeated data elements in an XML file while leaving every other element intact.
before
# bulk.xml
<r>
<meta>
some meta data
</meta>
<something>
thing thing thing thing
</something>
<e>e0</e>
<e>e1</e>
<e>e2</e>
<e>e3</e>
<some-annoying-non-data-you-have-to-keep-1>ah yah yah</some-annoying-non-data-you-have-to-keep-1>
<some-annoying-non-data-you-have-to-keep-2>ah yah yah</some-annoying-non-data-you-have-to-keep-2>
<some-annoying-non-data-you-have-to-keep-3>ah yah yah</some-annoying-non-data-you-have-to-keep-3>
<e>e4</e>
<e>e5</e>
<e>e6</e>
<e>e7</e>
<e>e8</e>
<e>e9</e>
<e>e10</e>
...
<e>e99</e>
</r>
subset_head("bulk.xml", target_file="/tmp/small.xml", data_tag="e", ratio=0.05)
after
# small.xml
<r>
<meta>
some meta data
</meta>
<something>
thing thing thing thing
</something>
<e>e0</e>
<e>e1</e>
<e>e2</e>
<e>e3</e>
<some-annoying-non-data-you-have-to-keep-1>ah yah yah</some-annoying-non-data-you-have-to-keep-1>
<some-annoying-non-data-you-have-to-keep-2>ah yah yah</some-annoying-non-data-you-have-to-keep-2>
<some-annoying-non-data-you-have-to-keep-3>ah yah yah</some-annoying-non-data-you-have-to-keep-3>
<e>e4</e>
</r>
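The before/after pair above suggests that the first `ratio` share of the `data_tag` elements is kept (5 of the 100 `<e>` elements here) and that everything else stays in place. A minimal sketch of that behaviour with the standard library follows. It is not the package's actual implementation: the function name `keep_head`, the restriction to direct children of the root, and the use of `round` to pick the cut-off are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def keep_head(root, data_tag, ratio):
    """Drop all but the leading `ratio` share of `data_tag` children.

    Hypothetical helper: only direct children of `root` are considered,
    and the number of elements to keep is rounded to the nearest integer.
    """
    # Collect the data elements in document order.
    data = [child for child in root if child.tag == data_tag]
    keep = round(len(data) * ratio)
    # Remove every data element after the first `keep`; other tags are untouched.
    for child in data[keep:]:
        root.remove(child)
    return root


# Small demonstration on an in-memory document shaped like bulk.xml.
doc = (
    "<r><meta>some meta data</meta>"
    + "".join(f"<e>e{i}</e>" for i in range(4))
    + "<keep-me>ah yah yah</keep-me>"
    + "".join(f"<e>e{i}</e>" for i in range(4, 100))
    + "</r>"
)
small = keep_head(ET.fromstring(doc), data_tag="e", ratio=0.05)
print(ET.tostring(small, encoding="unicode"))
```

Filtering by position among the data elements, rather than by position in the file, is what lets the non-data elements that sit between `e3` and `e4` survive.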
Source Distribution
xml-subsetter-0.0.2.tar.gz (4.7 kB)
File hashes
Algorithm | Hash digest
---|---
SHA256 | 3c102e20288d37d1fff6e081b3e22339acc710e8e24458a050bd777985589837
MD5 | f641bb0e774ad2d20c89b96d49b14aec
BLAKE2b-256 | 6eab37c989d1c2bda5f47e5e5157bd5ec1ee4c13ae4577ec8b50cbbf13fc904c