HDX Python scraper utilities to assemble data from multiple sources

Project description

The HDX Python Scraper Library helps you develop code that assembles data from one or more tabular sources in CSV, XLS, XLSX or JSON format. A YAML file specifies what to read from each source and which transformations to apply to the data. The output is written to JSON, Google Sheets and/or Excel, and includes the Humanitarian Exchange Language (HXL) hashtags specified in the YAML file. You can also write custom Python scrapers that conform to a defined specification; the framework handles the execution of both configurable and custom scrapers.
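To make the idea of configuration-driven assembly concrete, here is a minimal sketch using only the Python standard library. It does not use or reproduce this library's API or its YAML schema: the `SOURCES` mapping, its keys (`csv_text`, `key_column`, `input_column`, `output_header`, `output_hxl`), the `assemble` function and the inline sample values are all hypothetical and chosen purely for illustration.

```python
import csv
import io

# Hypothetical configuration. The real library reads a YAML file whose
# schema is not shown here; these keys and values are illustrative only.
SOURCES = {
    "population": {
        "csv_text": "code,value\nAAA,100\nBBB,200\n",
        "key_column": "code",
        "input_column": "value",
        "output_header": "Population",
        "output_hxl": "#population",
    },
    "affected": {
        "csv_text": "code,value\nBBB,20\nCCC,30\n",
        "key_column": "code",
        "input_column": "value",
        "output_header": "Affected",
        "output_hxl": "#affected",
    },
}


def assemble(sources):
    """Merge one column from each source into rows keyed by a shared code.

    Returns a list of rows: a header row, an HXL hashtag row, then one
    data row per key, sorted by key. Missing values become empty strings.
    """
    headers = ["Code"]
    hxl_tags = ["#country+code"]
    merged = {}
    for spec in sources.values():
        headers.append(spec["output_header"])
        hxl_tags.append(spec["output_hxl"])
        reader = csv.DictReader(io.StringIO(spec["csv_text"]))
        for row in reader:
            key = row[spec["key_column"]]
            merged.setdefault(key, {})[spec["output_header"]] = row[spec["input_column"]]
    rows = [
        [key] + [values.get(header, "") for header in headers[1:]]
        for key, values in sorted(merged.items())
    ]
    return [headers, hxl_tags] + rows


if __name__ == "__main__":
    for line in assemble(SOURCES):
        print(line)
```

Keeping the per-source details in data rather than in code is what lets a new source be added by editing configuration alone, which is the design the description above attributes to the configurable scrapers.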

For more information, please read the documentation.

This library is part of the Humanitarian Data Exchange (HDX) project. If you have humanitarian-related data, please upload your datasets to HDX.

This version

2.3.1

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hdx_python_scraper-2.3.1.tar.gz (6.1 MB view details)

Uploaded Source

Built Distribution

hdx_python_scraper-2.3.1-py3-none-any.whl (50.2 kB view details)

Uploaded Python 3

File details

Details for the file hdx_python_scraper-2.3.1.tar.gz.

File metadata

  • Download URL: hdx_python_scraper-2.3.1.tar.gz
  • Upload date:
  • Size: 6.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-httpx/0.26.0

File hashes

Hashes for hdx_python_scraper-2.3.1.tar.gz
Algorithm Hash digest
SHA256 78925d5b53de431ff2c654e85591dbe5f4ecaf876799af69427731c7541d9e36
MD5 ed3c219a591c4a5bdf31cf4472dd2c7e
BLAKE2b-256 2bb8d4d3cc58328e63e8ceb37a155c5858f6a21f36625769276769a6a2131ea0

See more details on using hashes here.
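One way to check a downloaded file against the SHA256 digest listed above is sketched below with Python's standard `hashlib` module. The function names `sha256_of_file` and `matches_expected` are hypothetical helpers written for this example, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the source distribution was saved in the current directory, `matches_expected("hdx_python_scraper-2.3.1.tar.gz", "78925d5b53de431ff2c654e85591dbe5f4ecaf876799af69427731c7541d9e36")` should return `True` for an intact download.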

File details

Details for the file hdx_python_scraper-2.3.1-py3-none-any.whl.

File hashes

Hashes for hdx_python_scraper-2.3.1-py3-none-any.whl
Algorithm Hash digest
SHA256 ae517893152075db9c89acad74d12106967e2d6f17ab25334b0fd8f1f610e8fa
MD5 2369c0b06300ae8d524ab2c45c2ba56b
BLAKE2b-256 1b7c94284876ce61292da8c1d11eb8e8e4838569bec4d9e8f5815e629321c804
