
HDX Python scraper utilities to assemble data from multiple sources

Project description


The HDX Python Scraper Library makes it easy to develop code that assembles data from one or more tabular sources in CSV, XLS, XLSX or JSON format. A YAML file specifies what to read from each source and allows some transformations to be performed on the data. The output is written to JSON, Google Sheets and/or Excel and includes Humanitarian Exchange Language (HXL) hashtags specified in the YAML file. Custom Python scrapers that conform to a defined specification can also be written, and the framework handles the execution of both configurable and custom scrapers.
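
The idea described above (a configuration that names what to read from a tabular source, a light transformation, and output that carries an HXL hashtag row) can be illustrated with a minimal standard-library sketch. It does not use this library's API or its YAML schema: the configuration keys, the `assemble` function and the sample data are all hypothetical, and a plain Python dict stands in for the YAML file.

```python
import csv
import io
import json

# Hypothetical configuration standing in for what a YAML file might express.
# The key names are illustrative assumptions, not the library's schema.
CONFIG = {
    "key_column": "iso3",
    "input": ["population"],
    "output_headers": ["Population"],
    "output_hxltags": ["#population"],
}

# Made-up sample data in place of a real CSV source.
SAMPLE_CSV = "iso3,population,notes\nAFG,100,x\nSDN,250,y\n"


def assemble(csv_text, config):
    """Return a header row, an HXL hashtag row and the selected data rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = ["Country"] + config["output_headers"]
    hxltags = ["#country+code"] + config["output_hxltags"]
    rows = []
    for record in reader:
        row = [record[config["key_column"]]]
        # A simple transformation: convert each selected value to an integer.
        row.extend(int(record[column]) for column in config["input"])
        rows.append(row)
    return [headers, hxltags] + rows


result = assemble(SAMPLE_CSV, CONFIG)
print(json.dumps(result, indent=2))
```

Columns not named in the configuration (here `notes`) are dropped, and the hashtag row sits directly under the headers, which is the usual layout for HXL-tagged tabular data.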

For more information, please read the documentation.

This library is part of the Humanitarian Data Exchange (HDX) project. If you have humanitarian-related data, please upload your datasets to HDX.


Download files

Download the file for your platform.

Source Distribution

hdx-python-scraper-2.1.5.tar.gz (6.1 MB)

Uploaded Source

Built Distribution


hdx_python_scraper-2.1.5-py2.py3-none-any.whl (48.5 kB)

Uploaded Python 2, Python 3

File details

Details for the file hdx-python-scraper-2.1.5.tar.gz.

File metadata

  • Download URL: hdx-python-scraper-2.1.5.tar.gz
  • Upload date:
  • Size: 6.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.3

File hashes

Hashes for hdx-python-scraper-2.1.5.tar.gz:

Algorithm    Hash digest
SHA256       aa780951025e3ec66817ba1fc8182b8e8b38e584d9fc0e698edb6757fed0beb6
MD5          31e27af862e1db8aa7e697e66aa86919
BLAKE2b-256  4ac24dba3339e0187f2cd5a34d1bcdc7801684684d4e03657b1d677ff064e8e9


File details

Details for the file hdx_python_scraper-2.1.5-py2.py3-none-any.whl.

File metadata

File hashes

Hashes for hdx_python_scraper-2.1.5-py2.py3-none-any.whl:

Algorithm    Hash digest
SHA256       149d24b273dae97275a3d6547fd5287357bfb4b1fb6a5d8f331f64c9ee7b5244
MD5          da53098615cffd737b16d65032b0a3a8
BLAKE2b-256  6cfff8294130caf0fa0fff1ac14262923f87f4454ed3559f857b8b286f8bad92

