Project description
Changelog
All notable changes to this project will be documented in this file.
[0.3.12]
- Ignore large exhibit files when identifying the main statement
[0.3.10]
- Handle cases where a comment indicates the page break
[0.3.9]
- In `from_zip_to_json`, fix error where unique_anchor is None
[0.3.8]
- In `from_zip_to_json`, handle absence of the Metalinks.json file
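The entry above is about archives that do not ship a Metalinks.json member. A minimal sketch of that kind of guard, using only the standard library, might look as follows. The helper name `read_optional_json` is hypothetical and is not taken from domtag.

```python
import io
import json
import zipfile
from typing import Any, Optional


def read_optional_json(archive: zipfile.ZipFile, member: str) -> Optional[Any]:
    """Return the parsed JSON member, or None if the archive does not contain it.

    Hypothetical helper for illustration; not part of the domtag API.
    """
    if member not in archive.namelist():
        return None
    with archive.open(member) as handle:
        return json.load(handle)


# Build a small in-memory archive that has no Metalinks.json member.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("report.htm", "<html></html>")

# The lookup returns None instead of raising KeyError.
with zipfile.ZipFile(buffer) as archive:
    metalinks = read_optional_json(archive, "Metalinks.json")
```

Checking `namelist()` first keeps the missing-file case on the normal control path, so callers can skip any step that depends on the file.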
[0.3.7]
- In `from_zip_to_json`, update to keep `contextref` and `name` attributes during merge
[0.3.6]
- In `from_zip_to_json`, update financial table detection with children elements
[0.3.5]
- In `from_zip_to_json`, add filtering of financial tables based on metalinks file
[0.3.4]
- In `from_zip_to_json`, fix merge issue in `is_row_merge_case` method
[0.3.3]
- In `from_zip_to_json`, handle row size mismatch in `create_table_html_empty_cell_grid` method
[0.3.2]
- In `xbrl_parser`, fix error when tr is empty in `_is_anchor` method
- In `xbrl_parser`, fix error when padding is missing from `unique_paddings` list
[0.3.1]
- In `xbrl_parser`, add handling for paddings/margins given as integers in HTML
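Padding and margin values can arrive either as bare integers or as CSS-style strings such as "12px". A minimal sketch of normalising both forms to a float, under the assumption that units are simply dropped rather than converted, is shown here. The function name `padding_to_float` and the accepted units are illustrative and are not taken from domtag.

```python
import re
from typing import Optional, Union

# A number with an optional px/pt suffix; anything else is treated as unknown.
_LENGTH = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*(px|pt)?\s*$", re.IGNORECASE)


def padding_to_float(value: Union[int, float, str, None]) -> Optional[float]:
    """Normalise a padding/margin value to a float, or None if unparseable.

    Hypothetical helper: the unit suffix is dropped, not converted.
    """
    if value is None:
        return None
    if isinstance(value, (int, float)):
        # Integers given directly in the HTML attribute.
        return float(value)
    match = _LENGTH.match(value)
    return float(match.group(1)) if match else None
```

For example, `padding_to_float(12)` and `padding_to_float("12px")` both yield `12.0`, while `padding_to_float("auto")` yields `None`.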
[0.3.0]
- In `xbrl_parser`, save page breaks in the `source.html` file
[0.2.9]
- In `xbrl_parser`, annotate page breaks
[0.2.8]
- In `xbrl_parser`, fix border attribute error
[0.2.7]
- Handle HtmlExtractor._merge_cells index error
[0.2.6]
- In `xbrl_parser`, add tr and td ids in JSON data
- In `xbrl_parser`, make cosmetic changes to HTML table extractor
- In `xbrl_parser`, replace uuid1 with uuid4
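The entries above mention ids on tr and td elements and a switch from uuid1 to uuid4. In Python's `uuid` module, `uuid1()` derives its value from the host id and the current time, while `uuid4()` is random. A minimal sketch of assigning uuid4-based ids to rows and cells follows. It uses the standard library's `xml.etree.ElementTree` on a well-formed fragment purely for illustration, and the function name `tag_rows_and_cells` is hypothetical.

```python
import uuid
import xml.etree.ElementTree as ET


def tag_rows_and_cells(table_markup: str) -> ET.Element:
    """Parse a well-formed table fragment and give every tr/td a random id.

    Hypothetical helper for illustration; not part of the domtag API.
    """
    root = ET.fromstring(table_markup)
    for element in root.iter():
        if element.tag in ("tr", "td"):
            # uuid4 is random, so ids do not encode the host or a timestamp.
            element.set("id", str(uuid.uuid4()))
    return root


table = tag_rows_and_cells("<table><tr><td>Revenue</td><td>100</td></tr></table>")
ids = [el.get("id") for el in table.iter() if el.tag in ("tr", "td")]
```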
[0.2.5]
- In `xbrl_parser`, add random uuid to all HTML tags
[0.2.4]
- In `xbrl_parser`, add table flip functionality
[0.2.3]
- Add case handling for only numeric cells regex
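As an illustration of what a numeric-only cell check can look like, here is a minimal sketch. The pattern and the name `is_numeric_cell` are assumptions made for this example and do not reproduce the regex used in the package. The pattern accepts an optional sign, an optional dollar sign, thousands separators, decimals, a trailing percent sign and accounting-style parentheses.

```python
import re

# Hypothetical pattern: optional "(", sign, "$", digits with commas,
# optional decimals, optional "%", optional ")".
NUMERIC_CELL = re.compile(r"^\(?[-+]?[$]?\s*\d[\d,]*(?:\.\d+)?\s*%?\)?$")


def is_numeric_cell(text: str) -> bool:
    """Return True when the cell text consists only of a numeric value."""
    return bool(NUMERIC_CELL.match(text.strip()))
```

With this pattern, `is_numeric_cell("1,234.50")` and `is_numeric_cell("(12.5)")` are true, while `is_numeric_cell("Total")` and `is_numeric_cell("-")` are false.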
[0.2.2]
- Fix handling of tables that only contain non-numeric values
[0.2.1]
- In `xbrl_parser`, update the heuristics for merging irregular cells
[0.2.0]
- Add handling of indentations using empty td cells
- Add handling of tag attributes with lxml parser
[0.1.10]
- In `xbrl_parser`, remove hidden cells
[0.1.9]
- In `xbrl_parser`, change HTML parser to lxml (from xml)
[0.1.8]
- In `xbrl_parser`, handle cases where indent is given to child text block
- In `xbrl_parser`, handle processing of tables that have at least one numeric value
[0.1.7]
- In `xbrl_parser`, handle cases where border value is not identified
[0.1.6]
- In `xbrl_parser`, fix border attribute checks
[0.1.5]
- In `xbrl_parser`, add border-top and border-bottom information
[0.1.4]
- In `xbrl_parser`, activate removal of empty tables
- In `xbrl_parser`, change some attributes of output JSON to camelCase
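Renaming output attributes to camelCase can be sketched as a key-mapping step. The helper names and the example keys below are hypothetical and are not taken from the package's actual output schema.

```python
from typing import Any, Dict


def snake_to_camel(name: str) -> str:
    """Convert a snake_case name to camelCase, e.g. border_top -> borderTop."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


def camelize_keys(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the record with top-level keys converted to camelCase."""
    return {snake_to_camel(key): value for key, value in record.items()}


# Example keys are illustrative only.
cell = camelize_keys({"border_top": True, "left_padding": 4})
```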
[0.1.3]
- In `xbrl_parser`, remove empty tables
[0.1.2]
- In `xbrl_parser`, add bold and italic information
[0.1.1]
- In `xbrl_parser`, merge tables using heuristics, add left padding
[0.1.0]
- In `xbrl_parser`, read zip from folder instead of full filepath and save outputs in the same folder
- In `xbrl_parser`, add table ids in output HTML and JSON files
[0.0.9]
- In `xbrl_parser`, skip merge logic if the table is empty or has an inconsistent number of tds
[0.0.8]
- In `xbrl_parser`, merge th tags into one with the corresponding colspan value
[0.0.7]
- In `xbrl_parser`, fix tables with empty merges
[0.0.6]
- Prevent taking the bold text as a title if it's inside a table
[0.0.5]
- Take the first bold text above the table as title
[0.0.4]
- Fix list index out of range error for table title extraction
[0.0.3]
- Extract table titles and store in json output
- Fix value extraction from table cells
[0.0.2]
- Store thead trs in a list for table json output
[0.0.1]
- Initial version of the package
- Extract table information into a JSON file from an htm/html file or a zip of HTML files
Download files
Source Distribution
domtag-0.3.12.tar.gz (17.0 kB)
Built Distribution
domtag-0.3.12-py3-none-any.whl (17.7 kB)
File details
Details for the file domtag-0.3.12.tar.gz.
File metadata
- Download URL: domtag-0.3.12.tar.gz
- Upload date:
- Size: 17.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.9.18
File hashes
Algorithm | Hash digest
---|---
SHA256 | a51e9b3f5be442658ac6dc9979c3719a303cd33887820dc5b097e37ccd23b266
MD5 | 156060635e472bc5dced4fbae0459c74
BLAKE2b-256 | e75801ab8298194502358179a45fa4be3db95bd76395736fca222b05a135fac9
File details
Details for the file domtag-0.3.12-py3-none-any.whl.
File metadata
- Download URL: domtag-0.3.12-py3-none-any.whl
- Upload date:
- Size: 17.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.9.18
File hashes
Algorithm | Hash digest
---|---
SHA256 | f2bd14980b814ddbc5572f9081b1bd5ceadd35ab2d17fa789a0c75e4410fc44b
MD5 | 7bd1852a07c5bfc592b8325e71830e80
BLAKE2b-256 | 3b05243562c6ec92875b40e4158e4545370673f4e2719a3ea366f80de76e1e3b