Extract the information represented in any HTML table
Tablextract
This Python 3 library extracts the information represented in any HTML table. This project has been developed in the context of the paper "TOMATE: On extracting information from HTML tables".
How to install
You can install this library via pip using:
pip install tablextract
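To confirm that the install succeeded, one option is to ask Python for the installed version of the distribution. This is a minimal sketch using only the standard library (Python 3.8 or later); the helper name installed_version is hypothetical and not part of tablextract.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    found = installed_version("tablextract")
    print("tablextract is not installed" if found is None else f"tablextract {found}")
```

A result of None suggests that pip installed the package into a different environment than the one running the script.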
Usage
>>> from tablextract import tables
>>> tables('http://example.com/tables')
[]
Further information will be written soon.
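As an illustration of the call above, the following sketch runs the extractor over several pages and collects the results per URL. It assumes only what the usage snippet shows: that tables accepts a URL string and returns a list. The helper extract_all and its extractor parameter are hypothetical, added so the loop can be tried with a stand-in function and no browser.

```python
def extract_all(urls, extractor=None):
    """Map each URL to the list returned by the extractor for that URL."""
    if extractor is None:
        # Imported lazily so the helper can be exercised without the library.
        # Assumption: tables(url) returns a list, as in the usage snippet.
        from tablextract import tables as extractor
    results = {}
    for url in urls:
        results[url] = extractor(url)
    return results


# Example with the real library (needs network access and Firefox):
# found = extract_all(["http://example.com/tables"])
# for url, page_tables in found.items():
#     print(url, len(page_tables))
```

Passing a stub as extractor keeps the collection logic separate from the browser-driven extraction, which may be convenient when testing code built on top of the library.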
Changes
v1
Released on Jan 24, 2019.
- Before using Selenium, geckodriver is automatically downloaded for Linux, Windows and Mac OS.
- The Firefox process is closed automatically when the process ends.
- Geckodriver's quit is called instead of close.
- Side projects have been moved from this core project to tablextract-server and datamart.
- Fixed project imports and setup
- More readable Table objects
v0
Released on Jan 22, 2019.
- Initial package upload.
- Moved side projects to tablextractserver and datamart.
Download files
Source Distribution
tablextract-1.0.15.tar.gz (13.5 kB)
Built Distribution
Hashes for tablextract-1.0.15-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 873d593ecee31c3816811d8d88df5c0edebf56d7ff578a7cfd5f16138a39b3ea
MD5 | c5dea8c87d967cb4b46d81090d439802
BLAKE2b-256 | 3b8244fc6d411e45908fc2091e4ddb1865ee7fb1cd274e5b659d05a22ab867c1
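
A downloaded file can be checked against the SHA256 digest listed above before installing it. The sketch below uses only the standard hashlib module; the function names file_sha256 and matches_digest are hypothetical, and the file path in the commented usage assumes the wheel was saved to the current directory.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the hexadecimal SHA256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hexadecimal string."""
    return file_sha256(path) == expected_hex.strip().lower()


# Example (assumes the wheel is in the current directory):
# expected = "873d593ecee31c3816811d8d88df5c0edebf56d7ff578a7cfd5f16138a39b3ea"
# print(matches_digest("tablextract-1.0.15-py3-none-any.whl", expected))
```

A mismatch usually means the download is incomplete or is a different file than the one the digest was published for.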