xml_dwarf
Mining XML/HTML fast
This is the Rust part of the xdwarf library.
Installation
pip install rust_dwarf
Here is a preview of how xdwarf works:
dwarf = Dwarf.from_glob(".../pmc_xml/PMC003xxxxxx/PMC31*.xml", "PMC", 20)
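As a rough illustration of which files a glob pattern like the one above selects, the sketch below uses Python's standard fnmatch module on hypothetical file names. It is not part of xdwarf, and it says nothing about the second and third arguments of from_glob, whose meaning is not documented here.

```python
from fnmatch import fnmatch

# Hypothetical file names, for illustration only.
names = ["PMC3100001.xml", "PMC3199999.xml", "PMC3200000.xml", "PMC3100001.txt"]

# "*" matches any run of characters, so only names that start with
# "PMC31" and end with ".xml" are kept.
matched = [n for n in names if fnmatch(n, "PMC31*.xml")]
print(matched)
```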
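Define what to mine with query patterns. The patterns in this example use CSS-selector-style syntax (a child combinator and attribute filters) rather than XPath path expressions. Chaining multiple mining stages is well supported.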
dwarf.find_one('article-meta > article-id[pub-id-type=pmid]', "pmid")
dwarf.find_one("abstract", "abstract").find_many("p", "paragraph")
# mining stages can be chained to extract deeper details
reference = dwarf.find_one("ref-list", "ref_list").find_many("ref","reference")
reference.find_one("pub-id[pub-id-type=pmid]", "ref_id")
reference.find_one("pub-id[pub-id-type=doi]", "doi")
ref_name = reference.find_many("name", "ref_name")
ref_name.find_one("surname", "ref_surname")
dwarf.set_necessary("pmid")
dwarf.create_children()
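To make the patterns above more concrete, here is a minimal sketch of what two of them would pick out of a small document. It uses only Python's standard xml.etree.ElementTree, not xdwarf, and the XML snippet is made up. It assumes that find_one means "first match" and find_many means "all matches", and it treats find_many("p", ...) as matching any descendant p element; xdwarf's actual semantics may differ.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up document shaped like the PMC XML queried above.
xml_text = """
<article>
  <front>
    <article-meta>
      <article-id pub-id-type="pmid">12345</article-id>
      <article-id pub-id-type="doi">10.0000/example</article-id>
    </article-meta>
  </front>
  <abstract><p>First paragraph.</p><p>Second paragraph.</p></abstract>
</article>
"""
root = ET.fromstring(xml_text)

# Rough ElementTree counterpart of 'article-meta > article-id[pub-id-type=pmid]':
# a direct child of <article-meta> with a matching attribute value.
pmid = root.find(".//article-meta/article-id[@pub-id-type='pmid']").text

# One <abstract>, then every <p> under it (a one-then-many chain).
abstract = root.find(".//abstract")
paragraphs = [p.text for p in abstract.findall(".//p")]

print(pmid, paragraphs)
```

The one-then-many shape mirrors the find_one(...).find_many(...) chains above: each stage narrows the search to the elements found by the previous stage.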
Start mining
result = dwarf()
See result
result.child_df().head(2)
See child result
result['ref_list'].child_df().head()
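One way to picture the parent and child results is as two tables linked by a shared key. The sketch below is plain Python with made-up records, not xdwarf's implementation. It assumes that marking "pmid" as necessary means documents without a pmid are skipped, and that each child row carries its parent's key; both are guesses about the behavior, not documented facts.

```python
# Made-up mining output: one record per document, with nested references.
records = [
    {"pmid": "111", "ref_list": [{"ref_id": "1", "doi": "10.0000/a"},
                                 {"ref_id": "2", "doi": None}]},
    {"pmid": None, "ref_list": [{"ref_id": "3", "doi": "10.0000/b"}]},
]

parent_rows, child_rows = [], []
for rec in records:
    # Assumed effect of a "necessary" field: skip documents that lack it.
    if rec["pmid"] is None:
        continue
    parent_rows.append({"pmid": rec["pmid"]})
    for ref in rec["ref_list"]:
        # Each child row keeps the parent's key so the tables can be joined.
        child_rows.append({"pmid": rec["pmid"], **ref})

print(parent_rows)
print(child_rows)
```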
Source Distributions
No source distribution files available for this release.
Built Distributions
rust_dwarf-0.1.1-cp310-none-win32.whl (431.6 kB)
rust_dwarf-0.1.1-cp39-none-win32.whl (431.7 kB)
rust_dwarf-0.1.1-cp38-none-win32.whl (431.3 kB)
rust_dwarf-0.1.1-cp37-none-win32.whl (431.3 kB)
rust_dwarf-0.1.1-cp36-none-win32.whl (428.4 kB)
Hashes
File | SHA256 | MD5 | BLAKE2b-256
---|---|---|---
rust_dwarf-0.1.1-pp37-pypy37_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.whl | cf3e00f3361d1c64cbb1b72c502f5a078127d45f3371197355ecb218e1480420 | ce244481d2322d829682c386366ac55b | 4c61fdf3b81dad4ab1f433196f06d64383e77af73b8ce46ff709704b021df906
rust_dwarf-0.1.1-pp37-pypy37_pp73-manylinux_2_5_i686.manylinux1_i686.whl | 5ab270ca29e87eb9bb30eece7feb224a8cb2853c55e305a9b408c47f6117c4fb | 715c3c8ace720833618e56918bc82073 | 52bcccdb505c6df0d7cd48bd03dc4c4787016fae7d8d9f700ec24c9e392e7fd1
rust_dwarf-0.1.1-cp310-none-win_amd64.whl | 8e574d1d4e2fde1eddaebb0f4cdec45d8378156da1ebbac0ce36f1a767ef5282 | 45d27511d05ddbeaabdf838bdd617cd4 | f206ae2904183bf744069bd6a617a6939e1b1c4cb7eb37c69cc4e64e216f14da
rust_dwarf-0.1.1-cp310-none-win32.whl | dfeb2455c60edb3520330562173b2121f4088acd8a29dfa5deea30e89a3faf54 | 0d607b85966a37333eae899afe4af1c4 | 1911c853c84cacb898303fe22321fb54f5c22eb09bb39cc4c2062c2b579df83e
rust_dwarf-0.1.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.whl | 75f261155efddc27f7e6c064acbd3192933954eae9e11fecefbffa5e699d3bf9 | 6045569dbc692e7eeb513dd12c30be1d | e70341dceb4dd19e99c37a6f716ec5f66cce77f091d30a36868ed9f188027d1d
rust_dwarf-0.1.1-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.whl | f75aa2c0ab18609ca0530cde1c58b1638a36791f875fe30f71f70936ab3e809c | 077efaaab94041d3b61a1c655c0a8460 | 0c7ad7552884dfba3182b884f3872ceaeb06a141baad8d60b4d6f97a087fdfb9
rust_dwarf-0.1.1-cp39-none-win_amd64.whl | e3bfee01ba07fdfd2fad61c9172f0fc9746c62ef1f0ddcdb3973568c0ac45f08 | bd7124ec5eb930f0e87e7249215ad5af | e2c0c001933bf8255d1bd6a758b8c5b0ca5cb8255057365e977ac15caaafaba0
rust_dwarf-0.1.1-cp39-none-win32.whl | 8d734f8101718998a9a392cb7aa529716d53353ffe82fda099e766cd0490d5eb | dd27fa001943b9aca2eaafbf03b7d426 | 5c02fa882bb4bbccd2ab8fcfc063aee0dc116b6d70dc9778b83d1d5583fc276c
rust_dwarf-0.1.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl | 5656158ca3caeec9ad40e8a18d48aec28f3c2f39edf102ce71009445de6c5667 | f03959dab6259c08310649404cb97d06 | 33fde407decc906cbf4206ea529fa62b5dc4410ac74fb675dc8c4aa9cd4b9c65
rust_dwarf-0.1.1-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.whl | 5914283fc51d771d5a3549988dfe6301ae2f81e0d77cba025024c6bd2128e8d4 | 58da8df1687dbf759742e303d3b46f46 | 237588713b2b715ca240ec27061ae7402ffccce06b04714ff3d82fd8f31748a8
rust_dwarf-0.1.1-cp39-cp39-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl | 151146d2676c1518994642e81946809fc83dd7c8a355dbfd4c86f5201ab32e6c | 435172bebe16be3269f35924c1446cdd | 7ec57fd6f352731935bffa6c69ae792a954fbbbc90c1118aa3b3d16778b5e993
rust_dwarf-0.1.1-cp39-cp39-macosx_10_7_x86_64.whl | cbcdb4c5ff87d3c37d4b961833531280b4370ca416492ac42b05fb0ec4d4a099 | 17269ea950d7e8f200bd8c8401db50cc | 21f7927066f8de7a91dccdc9f3dd225b4146e9ff7f80f9df417781ffa0616d1b
rust_dwarf-0.1.1-cp38-none-win_amd64.whl | e6ce24e66c6125fe49456b008213761d5a2033a1a3de0cd4d56ecb4db4c452cd | 3221d914d1b8f74b018ea45aadb3eee7 | 6a39ede12bfe664ebf834d591cd7ddb6b7278ef5cf8a1a2475e6e2e0b9c27b96
rust_dwarf-0.1.1-cp38-none-win32.whl | 07481354ccd2d0637d853b7daebd2c5a2965fe85015f80dc92d082d2a2ca8bc7 | eb88ce5b2d79a227ad1b7ead64621948 | 99ae92dde1952471a151449d00c7bfc5694df427f7f12597690c2bfd215007cc
rust_dwarf-0.1.1-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl | 222f33ab93e4ebaf62292b2dc375d332a1ae26747b35c74f7a8eb7d30a98ebcd | 30a05736e2a0205dfcc909b829687e18 | fc1393a57f944c79d119e0be4fa049a194941b4cdf876a362be0e5c994007c43
rust_dwarf-0.1.1-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.whl | e460cf9eccea0edd2c8637cf5b7e0c8600783b7bbc7a3dc158a0a5b959013633 | 027a1db3104b1bdd86d7160dccbb9c6f | 10388b8c8fe5d6e221059225e01e0cd2a152c48ba42ef40b35c8515d4ad460fd
rust_dwarf-0.1.1-cp37-none-win_amd64.whl | 4d59b1bb9219fc121cd38f8afc403b6c1709703bf30c6ef99eae3dbf00f9e212 | d7be046bcf904c4c8fb861856db992c2 | 520c683d9a3ab8b4a3221c9acad17db19af2d0272cbf13e2f5b27aaed3800d91
rust_dwarf-0.1.1-cp37-none-win32.whl | a8ec17010ff7cb9a822ada28649a56831ba1c8c8c927e5cc85f38512a9e1551e | d8d8bdfe8abe0aa0b4d89e42209eea95 | b2244a4b80d765e5c8535d21d4ac69682ac2a07942d451590a30a7467ea9148d
rust_dwarf-0.1.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl | b1861c3240cdf6e2f37248a7c1d07371c6889d01f1f42e9c521110f25e890cfb | 3b8399e17805436e601be4e1228a7d21 | 61afad0c59c11733a3eaf8221fec936b56e4ce193a369b2c175280a8e86f3c80
rust_dwarf-0.1.1-cp37-cp37m-manylinux_2_5_i686.manylinux1_i686.whl | 9db1e05f01b7ff1daec65bc82b1aed19c2c87b17d13c40d5c123e4d330a45411 | d92dd1254b006daecd9637bbc8f430e6 | de9c0afd239aadc237fc4d931a6b3b75d02cc896e527cac03e63a0be7ca29540
rust_dwarf-0.1.1-cp37-cp37m-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl | 00f5c24b57d20cc62ba76f5b17a4d855752eedce42f0447bf3939fc453f05a28 | 9ce98aaa0d6638a84318b91e227e52b5 | ecaad7b092a2223906d5527f0f7b425e6a5dec13323faa35fc166455eccbcf13
rust_dwarf-0.1.1-cp37-cp37m-macosx_10_7_x86_64.whl | 850fef2bffd55da2f7786e9c3bbbe4119c8743913afac1d0f4931ae2a6bd33e4 | 402be57abc0bbc17202ea863c5bb3310 | 59ba475ce6bf0f1b73126eddca214c84c6ecc775c07fac14cb26a44ed20ce43d
rust_dwarf-0.1.1-cp36-none-win_amd64.whl | 57579b8023b32bb80e5f315a9c67c29382e5c9bf52a3a1e09a622cef86131bad | 3ab8afef27c7cbbe1e31f32acd5f1302 | 51b2fbd16de0dc70dca978220f17fd7c8175ff566134160000ade72a98a29604
rust_dwarf-0.1.1-cp36-none-win32.whl | 1ab4759ea724765e22f4d4fff481c487cd163dfe17f535615e65d0edc1df27bf | 8b9950280c7b33fff0cbae195f5bde3f | a671dea2329991419ff8d21c6d7c8de131ef6fecc2e533c20a170cf8982e5fe3
rust_dwarf-0.1.1-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl | 482d5ac3e538ace91086bb0030a3bda609e3a321c98043d6cb5141f6378b6b0b | 7edeaebced0eef8d1edf8f54b3c99d7f | 453ea92c88c2c016d24bb64a4acf6cc4debc5ca7022c30bc48355300f8a36d82
rust_dwarf-0.1.1-cp36-cp36m-manylinux_2_5_i686.manylinux1_i686.whl | 5374a8b7ddc82e2b43e138ac4077d04df5e98658e0d292b1483a1e8533bf0e9e | c11d41f1130fb1f7482c7597796790a6 | ac8b11f8875b742614576c4d5399d4f01100e4a0cce9aa74d965de3687b2af1e
rust_dwarf-0.1.1-cp36-cp36m-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl | 43ad2c5b071d8627a06fbb52db9c4300f86b5187ac1c53a824cf35497d1d79af | de3d368d4674e7b29791aec449e5d5e7 | da4679b521dff083490938f44277c93bd9d1623bd67ddbe41483209ed2b2ca1c
rust_dwarf-0.1.1-cp36-cp36m-macosx_10_7_x86_64.whl | 49995c3751dc8f530c19d97a6081daba67761334dd9c5436719ff4bc2da69718 | de4baafae2f0a1c782dc9335138cf79e | d9c34aa332ef2e1add82b364471ed4d4180fafa046cb3dec7f803ee1c13cbf45