a scraper to mirror edi-energy.de
edi-energy.de scraper
The Python package edi_energy_scraper provides easy-to-use methods to mirror the website edi-energy.de.
Rationale / Why?
If you'd like to be informed about new regulations or data formats being published on edi-energy.de, you can either
- visit the site every day (if this is your favourite hobby) and hope that you spot the changes,
- or automate the task.
This repository helps you with the latter. It allows you to create an up-to-date copy of edi-energy.de on your local
computer. Unlike mirroring the files with wget or curl, you get a clean and intuitive directory structure.
From there you can, for example, commit the files into a VCS (as we do in our edi_energy_mirror) or scrape the PDF/Word files for later use.
We're all hoping for the day of true digitization on which this repository will become obsolete.
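To be informed about changes rather than spot them by eye, one option is to compare content hashes of the mirrored files between two runs. The following is a minimal sketch using only the standard library. The helpers `snapshot` and `diff_snapshots` are hypothetical names chosen for illustration and are not part of edi_energy_scraper.

```python
import hashlib
from pathlib import Path


def snapshot(mirror_dir):
    """Map each file path (relative to mirror_dir) to the SHA256 of its content."""
    root = Path(mirror_dir)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(old, new):
    """Return (added, removed, changed) lists of relative paths."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return added, removed, changed
```

Taking one snapshot before and one after a mirror run, then passing both to `diff_snapshots`, would yield the files that appeared, disappeared, or changed in between.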
How to use the Package (as a user)
Install via pip:
pip install edi_energy_scraper
Create a directory in which you'd like to save the mirrored data:
mkdir edi_energy_de
Then import it and start the download:
import asyncio

from edi_energy_scraper import EdiEnergyScraper

# add the following lines to enable debug logging to stdout (CLI)
# import logging
# import sys
# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)


async def mirror():
    scraper = EdiEnergyScraper(path_to_mirror_directory="edi_energy_de")
    await scraper.mirror()


if __name__ == "__main__":
    # asyncio.run() creates and closes its own event loop,
    # so no manual new_event_loop()/set_event_loop() is needed
    asyncio.run(mirror())
This creates a directory structure:
|- your_script_cwd.py
|- edi_energy_de
   |- past (contains archived files)
      |- ahb.pdf
      |- ahb.docx
      |- ...
   |- current (contains files valid as of today)
      |- mig.pdf
      |- mig.docx
      |- ...
   |- future (contains files valid in the future)
      |- allgemeine_festlegungen.pdf
      |- schema.xsd
      |- ...
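Once the mirror exists, the past/current/future layout can be processed with ordinary file tools. As a rough sketch, the following counts the mirrored files per validity folder and file extension. The function name `count_files_by_type` is hypothetical and not provided by the package, and the code assumes nothing about the layout beyond the three folder names shown above.

```python
from collections import Counter
from pathlib import Path

# Folder names taken from the directory structure shown above.
VALIDITY_FOLDERS = ("past", "current", "future")


def count_files_by_type(mirror_dir):
    """Return {folder: Counter({suffix: count})} for the three validity folders."""
    root = Path(mirror_dir)
    summary = {}
    for folder in VALIDITY_FOLDERS:
        folder_path = root / folder
        counts = Counter()
        if folder_path.is_dir():  # a folder may be missing if nothing was mirrored into it
            for path in folder_path.rglob("*"):
                if path.is_file():
                    counts[path.suffix.lower()] += 1
        summary[folder] = counts
    return summary
```

Calling `count_files_by_type("edi_energy_de")` after a mirror run gives a quick overview of how many PDF, Word, or schema files each folder holds.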
How to use this Repository on Your Machine (for development)
Please follow the instructions in our Python Template Repository. For further information, see the Tox Repository.
Contribute
You are very welcome to contribute to this repository by opening a pull request against the main branch.
Download files
File details
Details for the file edi_energy_scraper-0.7.1.tar.gz.
File metadata
- Download URL: edi_energy_scraper-0.7.1.tar.gz
- Upload date:
- Size: 17.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2f4741724d0a4c8cb6fe8ac70280f697adcd67be4d5c3ec49659ddbba9b62376
MD5 | bfbb49e89eb25099706ff9a879939e66
BLAKE2b-256 | 5454a1b960cfdd4bdfac63a992221db414fe9724e6a5ef6259141b7b6ef5a92c
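A downloaded file can be checked against the published SHA256 digest before it is installed. The following sketch uses only `hashlib` from the standard library. The helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local file path is whatever you saved the download as.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for edi_energy_scraper-0.7.1.tar.gz
EXPECTED_SHA256 = "2f4741724d0a4c8cb6fe8ac70280f697adcd67be4d5c3ec49659ddbba9b62376"


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

The same check works for the wheel by passing its SHA256 digest as `expected`.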
Provenance
The following attestation bundles were made for edi_energy_scraper-0.7.1.tar.gz:
Publisher: python-publish.yml on Hochfrequenz/edi_energy_scraper
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: edi_energy_scraper-0.7.1.tar.gz
- Subject digest: 2f4741724d0a4c8cb6fe8ac70280f697adcd67be4d5c3ec49659ddbba9b62376
- Sigstore transparency entry: 150107624
- Sigstore integration time:
File details
Details for the file edi_energy_scraper-0.7.1-py3-none-any.whl.
File metadata
- Download URL: edi_energy_scraper-0.7.1-py3-none-any.whl
- Upload date:
- Size: 11.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 30fc8d7f30e91b1c0802aab834ead2deeb602792ad33211f98380d4a321544f0
MD5 | 4f52c5455e188a1dadbcd3b626518550
BLAKE2b-256 | d729e7c59b6c2f03bfd40cbbe2abaefbdf036184f8cb244e80b07736ec849f9e
Provenance
The following attestation bundles were made for edi_energy_scraper-0.7.1-py3-none-any.whl:
Publisher: python-publish.yml on Hochfrequenz/edi_energy_scraper
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: edi_energy_scraper-0.7.1-py3-none-any.whl
- Subject digest: 30fc8d7f30e91b1c0802aab834ead2deeb602792ad33211f98380d4a321544f0
- Sigstore transparency entry: 150107625
- Sigstore integration time: