Summary
pytablereader is a Python library for loading structured table data from files, strings, and URLs in various data formats: CSV / Excel / Google Sheets / HTML / JSON / LDJSON / LTSV / Markdown / SQLite / TSV.
Features
- Extract structured tabular data from various data formats:
    - CSV / Tab-separated values (TSV) / Space-separated values (SSV)
    - Microsoft Excel™ file
    - HTML (table tags)
    - JSON
    - Line-delimited JSON (LDJSON) / NDJSON / JSON Lines
    - Markdown
    - MediaWiki
    - SQLite database file
- Supported data sources:
    - Files on a local file system
    - Accessible URLs
    - str instances
- Loaded table data can be used as:
    - pandas.DataFrame instance
    - dict instance
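As a rough illustration of two of the features above (loading from a str instance and using the result as a dict), the sketch below reads a JSON string and converts each loaded table to a dict. It assumes that `JsonTableTextLoader` and the `as_dict()` method of the loaded table data behave as described in the library documentation; the helper name `load_json_text_as_dict` and the sample data are made up for this example. The import is guarded so that the script still runs where pytablereader is not installed.

```python
import importlib.util
import json


def load_json_text_as_dict(json_text: str) -> dict:
    """Load every table found in a JSON string and merge them into one dict.

    Hypothetical helper for illustration; assumes JsonTableTextLoader and
    as_dict() behave as documented for pytablereader.
    """
    import pytablereader as ptr  # imported lazily so this module loads without it

    tables = {}
    for table_data in ptr.JsonTableTextLoader(json_text).load():
        # as_dict() is assumed to return {table_name: [row_dict, ...]}
        tables.update(table_data.as_dict())
    return tables


# Sample input: a JSON array of objects, one object per table row.
json_text = json.dumps([
    {"attr_a": 1, "attr_b": 4.0, "attr_c": "a"},
    {"attr_a": 2, "attr_b": 2.1, "attr_c": "bb"},
])

if importlib.util.find_spec("pytablereader") is not None:
    print(load_json_text_as_dict(json_text))
else:
    print("pytablereader is not installed; skipping the example")
```

The same pattern should carry over to the other text loaders, with only the loader class changing.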
Examples
Load a CSV table
- Sample Code:
    import pytablereader as ptr
    import pytablewriter as ptw


    # prepare data ---
    file_path = "sample_data.csv"
    csv_text = "\n".join([
        '"attr_a","attr_b","attr_c"',
        '1,4,"a"',
        '2,2.1,"bb"',
        '3,120.9,"ccc"',
    ])

    with open(file_path, "w") as f:
        f.write(csv_text)

    # load from a csv file ---
    loader = ptr.CsvTableFileLoader(file_path)
    for table_data in loader.load():
        print("\n".join([
            "load from file",
            "==============",
            "{:s}".format(ptw.dumps_tabledata(table_data)),
        ]))

    # load from a csv text ---
    loader = ptr.CsvTableTextLoader(csv_text)
    for table_data in loader.load():
        print("\n".join([
            "load from text",
            "==============",
            "{:s}".format(ptw.dumps_tabledata(table_data)),
        ]))
- Output:
    load from file
    ==============
    .. table:: sample_data

        ======  ======  ======
        attr_a  attr_b  attr_c
        ======  ======  ======
             1     4.0  a
             2     2.1  bb
             3   120.9  ccc
        ======  ======  ======

    load from text
    ==============
    .. table:: csv2

        ======  ======  ======
        attr_a  attr_b  attr_c
        ======  ======  ======
             1     4.0  a
             2     2.1  bb
             3   120.9  ccc
        ======  ======  ======
Get loaded table data as pandas.DataFrame instance
- Sample Code:
    import pytablereader as ptr

    loader = ptr.CsvTableTextLoader(
        "\n".join([
            "a,b",
            "1,2",
            "3.3,4.4",
        ]))
    for table_data in loader.load():
        print(table_data.as_dataframe())
- Output:
         a    b
    0    1    2
    1  3.3  4.4
For more information
More examples are available at https://pytablereader.rtfd.io/en/latest/pages/examples/index.html
Installation
Install from PyPI
pip install pytablereader
Some formats require additional dependency packages. Install them as follows:
- Excel
pip install pytablereader[excel]
- Google Sheets
pip install pytablereader[gs]
- Markdown
pip install pytablereader[md]
- MediaWiki
pip install pytablereader[mediawiki]
- SQLite
pip install pytablereader[sqlite]
- Load from URLs
pip install pytablereader[url]
- All of the extra dependencies
pip install pytablereader[all]
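To see which extras an installed copy of the package actually declares, the standard library's `importlib.metadata` can read the `Provides-Extra` fields from the distribution metadata. This is a minimal sketch using only the standard library; the helper name `declared_extras` is made up for this example, and the extras reported depend on the installed version.

```python
from __future__ import annotations

from importlib import metadata


def declared_extras(dist_name: str) -> list[str]:
    """Return the extras a distribution declares, or [] if it is not installed."""
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return []
    # get_all() returns None when the field is absent
    return sorted(meta.get_all("Provides-Extra") or [])


if __name__ == "__main__":
    extras = declared_extras("pytablereader")
    if extras:
        print("pytablereader extras:", ", ".join(extras))
    else:
        print("pytablereader is not installed or declares no extras")
```

Declaring an extra only lists it in the metadata; whether its dependency packages are installed depends on which `pip install pytablereader[...]` command was run.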
Install from PPA (for Ubuntu)
sudo add-apt-repository ppa:thombashi/ppa
sudo apt update
sudo apt install python3-pytablereader
Dependencies
Optional Python packages
Optional packages (other than Python packages)
- libxml2 (faster HTML conversion)
- pandoc (required when loading MediaWiki files)
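Since pandoc is an external program rather than a Python package, pip will not install it. A quick way to check whether it is reachable before attempting to load MediaWiki data is to look for the executable on `PATH`. This is a small sketch using only the standard library; the helper name `pandoc_available` is made up for this example.

```python
import shutil


def pandoc_available() -> bool:
    """Return True if a `pandoc` executable is found on PATH."""
    return shutil.which("pandoc") is not None


if __name__ == "__main__":
    if pandoc_available():
        print("pandoc found:", shutil.which("pandoc"))
    else:
        print("pandoc not found; MediaWiki loading would likely fail")
```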
Documentation