html2csv

html2csv is a utility that extracts tables from HTML documents and converts them to CSV format, written in Python.
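To make the conversion concrete, here is a minimal sketch of table extraction using only the Python standard library. It is not html2csv's actual implementation; the names `TableExtractor` and `html_tables_to_csv` are hypothetical, and the sketch assumes tables are not nested.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the rows of every <table> as lists of cell strings.

    Hypothetical helper for illustration; nested tables are not handled.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tables = []   # each table is a list of rows
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None


def html_tables_to_csv(html):
    """Return the rows of all tables in `html` as one CSV string."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for table in parser.tables:
        writer.writerows(table)
    return out.getvalue()
```

The `csv` module quotes cells that contain commas, so a cell such as `x, y` survives the round trip as a single field.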

Setup

Python 3.6 or later is required. Install html2csv with pip:

pip install html-to-csv

Yes, the package name is html-to-csv, due to a name collision ;-)

Examples

Read from standard input and write to standard output:

html2csv

Read from a file and write to standard output:

html2csv example.html

Read from several files and write to a file:

html2csv example1.html example2.html -o output.csv

Read from a URL and write to standard output:

html2csv http://example.com
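The examples above accept standard input, file paths, and URLs as sources. One way a tool could tell these apart is sketched below; this is an assumption about the general approach, not html2csv's own code, and `is_url` and `read_source` are hypothetical names.

```python
import sys
from urllib.parse import urlparse
from urllib.request import urlopen


def is_url(source):
    """Treat only http and https sources as URLs (an assumption of this sketch)."""
    return urlparse(source).scheme in ("http", "https")


def read_source(source=None, encoding="utf-8"):
    """Return the text of a source: standard input, a URL, or a file path."""
    if source is None:
        # No source given: fall back to standard input.
        return sys.stdin.read()
    if is_url(source):
        with urlopen(source) as response:
            return response.read().decode(encoding, errors="replace")
    with open(source, encoding=encoding) as f:
        return f.read()
```

Checking the URL scheme rather than looking for `://` keeps Windows paths such as `C:\data\example.html` on the file branch.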

Usage

usage: html2csv [-h] [-o [OUTPUT]] [-e ENGINE] [-V] [input [input ...]]

Convert HTML table to CSV format.

positional arguments:
  input                 input sources (files, URLs, etc., default: standard
                        input)

optional arguments:
  -h, --help            show this help message and exit
  -o [OUTPUT], --output [OUTPUT]
                        output target (default: standard output)
  -e ENGINE, --engine ENGINE
                        HTML parser engine (default: html.parser or lxml if
                        installed)
  -V, --version         display version
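The help text above has the shape that Python's `argparse` module produces. A minimal sketch of a parser with the same options follows, assuming `argparse` is in use; `build_parser` and `default_engine` are hypothetical names, and the engine default is only a guess at how "html.parser or lxml if installed" might be decided.

```python
import argparse
import importlib.util


def default_engine():
    """Prefer lxml when it is importable, else the stdlib html.parser."""
    return "lxml" if importlib.util.find_spec("lxml") else "html.parser"


def build_parser():
    """Build a parser mirroring the options in the usage text (sketch only)."""
    parser = argparse.ArgumentParser(
        prog="html2csv", description="Convert HTML table to CSV format.")
    # Zero or more sources; an empty list means standard input.
    parser.add_argument(
        "input", nargs="*",
        help="input sources (files, URLs, etc., default: standard input)")
    # nargs="?" gives the optional "[OUTPUT]" shown in the usage line.
    parser.add_argument(
        "-o", "--output", nargs="?",
        help="output target (default: standard output)")
    parser.add_argument(
        "-e", "--engine", default=default_engine(),
        help="HTML parser engine (default: html.parser or lxml if installed)")
    # Placeholder version string; the real tool reports its own version.
    parser.add_argument(
        "-V", "--version", action="version", version="%(prog)s (sketch)",
        help="display version")
    return parser
```

For example, `build_parser().parse_args(["example1.html", "example2.html", "-o", "output.csv"])` yields both file names in `input` and `output.csv` in `output`.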

Author and Contact

Wentao Han (wentao.han@gmail.com)
