
a scraper to mirror edi-energy.de


edi-energy.de scraper


The Python package edi_energy_scraper provides easy-to-use methods to mirror the website edi-energy.de.

Rationale / Why?

If you'd like to be informed about new regulations or data formats published on edi-energy.de, you can either

  • visit the site every day and hope that you spot the changes (if this is your favourite hobby),
  • or automate the task.

This repository helps you with the latter. It allows you to create an up-to-date copy of edi-energy.de on your local computer. Unlike a mirror made with wget or curl, the result has a clean and intuitive directory structure.

From there you can, for example, commit the files into a VCS (as in our edi_energy_mirror) or scrape the PDF/Word files for later use.
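If you only want to know which files differ between two runs, without setting up a VCS, one option is to compare content hashes. The following is a minimal sketch using only the standard library. The helpers snapshot and diff_snapshots are hypothetical names chosen for illustration; they are not part of edi_energy_scraper.

```python
import hashlib
from pathlib import Path


def snapshot(directory: Path) -> dict[str, str]:
    """Map each file path (relative to `directory`) to its SHA-256 hex digest."""
    return {
        path.relative_to(directory).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(directory.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify files as added, removed, or changed between two snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(
            name for name in old.keys() & new.keys() if old[name] != new[name]
        ),
    }
```

Take one snapshot of the mirror directory before a run and one after it, then pass both to diff_snapshots. To compare across days, the snapshot dictionary could be stored as JSON between runs.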

We're all hoping for the day of true digitization on which this repository will become obsolete.

How to use the Package (as a user)

Install via pip:

pip install edi_energy_scraper

Create a directory in which you'd like to save the mirrored data:

mkdir edi_energy_de

Then import the scraper and start the download:

import asyncio
from edi_energy_scraper import EdiEnergyScraper

# add the following lines to enable debug logging to stdout (CLI)
# import logging
# import sys
# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)

async def mirror():
    scraper = EdiEnergyScraper(path_to_mirror_directory="edi_energy_de")
    await scraper.mirror()


if __name__ == "__main__":
    # asyncio.run() creates and closes its own event loop,
    # so no manual loop setup is needed
    asyncio.run(mirror())

This creates a directory structure:

-|-your_script_cwd.py
 |-edi_energy_de
    |- past (contains archived files)
        |- ahb.pdf
        |- ahb.docx
        |- ...
    |- current (contains files valid as of today)
        |- mig.pdf
        |- mig.docx
        |- ...
    |- future (contains files valid in the future)
        |- allgemeine_festlegungen.pdf
        |- schema.xsd
        |- ...
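Once the mirror exists, you can inspect it with the standard library alone. The following sketch counts the files in each of the three sub-directories from the tree above. summarize_mirror is a hypothetical helper for illustration, not a function of edi_energy_scraper, and it assumes the mirror directory is named edi_energy_de as in the example.

```python
from pathlib import Path

# Sub-directory names as shown in the directory tree above.
CATEGORIES = ("past", "current", "future")


def summarize_mirror(mirror_root: Path) -> dict[str, list[str]]:
    """Return the sorted file names found in each validity sub-directory."""
    summary: dict[str, list[str]] = {}
    for category in CATEGORIES:
        directory = mirror_root / category
        if directory.is_dir():
            summary[category] = sorted(
                path.name for path in directory.iterdir() if path.is_file()
            )
        else:
            # A missing sub-directory is reported as empty instead of raising.
            summary[category] = []
    return summary


if __name__ == "__main__":
    for category, names in summarize_mirror(Path("edi_energy_de")).items():
        print(f"{category}: {len(names)} file(s)")
```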

How to use this Repository on Your Machine (for development)

Please follow the instructions in our Python Template Repository. For further information, see the Tox Repository.

Contribute

You are very welcome to contribute to this repository by opening a pull request against the main branch.

