
Selenium-based web scraper that extracts data from the E-REDES website and loads it into database storage.


E-REDES Scraper

Description

This is a web scraper that collects data from the E-REDES website and stores it in a database. Since E-REDES exposes no interface to this data, scraping is the only available way to collect it. A high-level overview of the process:

  1. The scraper collects the data from the E-REDES website.
  2. A file with the energy consumption readings is downloaded.
  3. The file is parsed and the data is compared to the data in the database to determine if there are new readings.
  4. If there are new readings, they are stored in the database.
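
Steps 3 and 4 boil down to a set difference on reading timestamps. The snippet below is a minimal sketch of that idea, not the package's actual implementation: the function name `find_new_readings`, the `(timestamp, kwh)` tuple shape, and the sample values are all assumptions made for illustration.

```python
from datetime import datetime


def find_new_readings(parsed, stored_timestamps):
    """Return readings from `parsed` whose timestamp is not already stored.

    `parsed` is a list of (timestamp, kwh) tuples read from the downloaded
    file; `stored_timestamps` is a set of timestamps already present in the
    database. Both shapes are assumptions made for this sketch.
    """
    new = [(ts, kwh) for ts, kwh in parsed if ts not in stored_timestamps]
    # Sort chronologically so the readings can be written in order.
    return sorted(new)


# Made-up sample values, only to show the comparison.
parsed = [
    (datetime(2023, 10, 1, 0, 15), 0.12),
    (datetime(2023, 10, 1, 0, 30), 0.10),
    (datetime(2023, 10, 1, 0, 45), 0.11),
]
stored = {datetime(2023, 10, 1, 0, 15), datetime(2023, 10, 1, 0, 30)}

new_readings = find_new_readings(parsed, stored)
```

With these sample values only the 00:45 reading is new, so only that one would be written in step 4. Running the scraper twice against the same file would yield an empty list the second time.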

:information_source: This package supports the E-REDES website as available at the time of writing (23/10/2023). The entrypoint for the scraper is the page https://balcaodigital.e-redes.pt/login.

Installation

The package can be installed using pip:

pip install eredesscraper

Configuration

Usage is based on a YAML configuration file.
A config.yml is used to specify the credentials for the E-REDES website and, optionally, the database connection. Currently, only InfluxDB is supported as a database sink.

Template config.yml:

eredes:
  # eredes credentials
  nif: <my-eredes-nif>
  pwd: <my-eredes-password>
  # CPE to monitor. e.g. PT00############04TW (where # is a digit). CPE can be found in your bill details
  cpe: PT00############04TW


influxdb:
  # url to InfluxDB.  e.g. http://localhost or https://influxdb.my-domain.com
  host: http://localhost
  # default port is 8086
  port: 8086
  bucket: <my-influx-bucket>
  org: <my-influx-org>
  # access token with write access
  token: <token>
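
Before running the scraper it can help to check that the placeholders in the template were actually replaced. The following is a rough sketch of such a check and is not part of the package: `validate_config`, `REQUIRED_KEYS` and `CPE_PATTERN` are hypothetical names, the input is assumed to be the config already parsed into a plain dict (for example by a YAML loader), and the CPE pattern ("PT", 16 digits, 2 letters) is only inferred from the example in the template.

```python
import re

# Assumed CPE shape, inferred from the template example: "PT", 16 digits, 2 letters.
CPE_PATTERN = re.compile(r"PT\d{16}[A-Z]{2}")

# Keys taken from the template config.yml above.
REQUIRED_KEYS = {
    "eredes": ("nif", "pwd", "cpe"),
    "influxdb": ("host", "port", "bucket", "org", "token"),
}


def validate_config(config):
    """Return a list of problems found in a parsed config dict (empty if none)."""
    problems = []
    for section, keys in REQUIRED_KEYS.items():
        if section not in config:
            # Only the eredes section is mandatory; the database is optional.
            if section == "eredes":
                problems.append("missing section: eredes")
            continue
        values = config[section] or {}
        for key in keys:
            if not values.get(key):
                problems.append(f"missing value: {section}.{key}")
    cpe = str((config.get("eredes") or {}).get("cpe") or "")
    if cpe and not CPE_PATTERN.fullmatch(cpe):
        problems.append(f"unexpected CPE format: {cpe}")
    return problems


# Made-up values for illustration only.
ok = {"eredes": {"nif": "123456789", "pwd": "secret", "cpe": "PT0002000012345678TW"}}
bad = {"eredes": {"nif": "123456789", "pwd": "", "cpe": "PT00############04TW"}}

ok_problems = validate_config(ok)
bad_problems = validate_config(bad)
```

Here `ok_problems` is empty, while `bad_problems` reports the empty password and the CPE placeholder that was left unedited.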

Usage

:snake: Python script:

from eredesscraper.workflows import switchboard
from pathlib import Path

switchboard(name="current_month_consumption",
            db="influxdb",
            config_path=Path("./config.yml"))

:computer: CLI:

ers config load "/path/to/config.yml"

ers run

Limitations

Available workflows:

  • current_month_consumption: Collects the current month consumption data from the E-REDES website.

Available databases:

  • influxdb: Stores the readings in InfluxDB, currently the only supported database sink.

Roadmap

  • Add support for other workflows.
  • Add support for other databases.
  • Build CLI.
  • Add tests.
  • Add CI/CD.
  • Add logging.
  • Add runtime support for multiple CPEs.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

See LICENSE file.


