Python package for easy access to the umwelt.info API

Project description

umwelt_apy Python package

The goal of umwelt_apy is to provide Python-based access to the (meta-)datasets of umwelt.info, so that these datasets can be integrated easily into your Python workflows. The functionality of the package mirrors the web-based access provided at https://umwelt.info and https://md.umwelt.info/: you can use the same queries and get the same datasets by accessing our API. The datasets are returned either as JSON or as dataframes (pandas or polars).

For more information, the API endpoint provides a Swagger UI description of the data model, which also explains how to search for multiple datasets.

Installation

The umwelt_apy package can be installed from PyPI using:

# basic version (requests only)
pip install umwelt_apy
# for using pandas
pip install umwelt_apy[pandas]
# for using pandas and polars
pip install umwelt_apy[pandas,polars]

You can also install the development version of the package from our GitLab repository using:

pip config set global.extra-index-url https://gitlab.opencode.de/api/v4/projects/[TODO]/packages/pypi/simple
pip install umwelt-apy

Usage

The package provides a number of functions for querying umwelt.info and representing the resulting datasets in Python:

Core Functions

  • fetch_by_url(url): This function is ideal for reproducing an earlier search of yours from our web-based interface. A button in the web interface creates a URL that fetches the same datasets.
  • fetch_by_query(query="*", ...): This function sends a search query to umwelt.info and fetches the same datasets you would get by entering this query in our web-based interface. It returns these datasets in one of several formats: a generator of JSON entries, a pandas.DataFrame, or a polars.DataFrame. This is the most performance-efficient way to retrieve large amounts of data. However, reproducing all available filters through the query can be challenging, and users may prefer the fetch_by_url() function instead.
  • fetch_by_ids(ids): This function fetches datasets by their unique identifiers. It is ideal when the IDs are already known, for example when downloading resources. After using preview_resources(), you can select a subset of the preview list with fetch_by_ids() and pass it as input to download_resources().
  • download_resources(url): This function downloads all resources attached to the datasets returned by a query.

Examples

Example 1: Fetching datasets by a direct URL

The basic function of umwelt_apy is fetch_by_url(url, ...). The following example queries umwelt.info for 'Luftqualität' and prints the title of every resulting dataset:

from umwelt_apy import fetch_by_url

for dataset in fetch_by_url(url="https://md.umwelt.info/search?query=luftqualit%C3%A4t"):
    print(dataset['title'])
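The search URL above carries the query in percent-encoded form ('ä' becomes %C3%A4). As a minimal sketch of how such a URL could be assembled by hand, using only the Python standard library: the helper name build_search_url is hypothetical and not part of umwelt_apy, and the endpoint is simply the one shown in the example.

```python
from urllib.parse import quote_plus

# Search endpoint as it appears in the example URL above.
SEARCH_URL = "https://md.umwelt.info/search"


def build_search_url(query: str, base_url: str = SEARCH_URL) -> str:
    """Return a search URL with a percent-encoded query (hypothetical helper)."""
    # quote_plus encodes non-ASCII characters as UTF-8 bytes and spaces as "+".
    # Parentheses are kept literal to match the URLs shown in this document.
    return f"{base_url}?query={quote_plus(query, safe='()')}"


print(build_search_url("luftqualität"))
```

The resulting string can then be passed to fetch_by_url() in place of a URL copied from the web interface.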

Example 2: Fetching by query string

For background on how to build a query, see https://md.umwelt.info/swagger-ui/#/search/text_search. If you want to know which values exist for a certain facet, you can use fetch_facet_values. Let's say you only want to fetch datasets from the data portal of Bavaria:

from umwelt_apy import fetch_by_query, fetch_facet_values

organisations = fetch_facet_values("organisations")
print(organisations)
results = list(fetch_by_query(query="organisation:/Land/Bayern/open.bydata AND Ozon"))

First, you fetch all organisation values. Then you pick the facet value of the Bavarian data portal and use it in your query, so that only "Ozon" entries from that portal are fetched.
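The query string in this example combines a facet filter with a free-text term using AND. A small sketch of composing such strings programmatically follows; facet_query is a hypothetical helper, not a function of umwelt_apy, and it only covers the simple "facet:value AND term" pattern used here.

```python
def facet_query(facet: str, value: str, *terms: str) -> str:
    """Join one facet filter and free-text terms with AND (hypothetical helper)."""
    parts = [f"{facet}:{value}", *terms]
    return " AND ".join(parts)


# Rebuild the query string used in the example above.
query = facet_query("organisation", "/Land/Bayern/open.bydata", "Ozon")
print(query)
```

The resulting string can be passed as the query argument of fetch_by_query().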

Example 3: Download resources attached to the datasets of a given query

This is a four-step process. First, you retrieve the list of resources from the API (fetch_by_url or fetch_by_query). Second, you unnest it and optionally refine it further (unnest_and_filter). Third, you screen the preview (preview_resources). Finally, you download the resources (download_resources). Note that unnesting is a prerequisite for preview_resources() and download_resources(). Currently, this workflow only works with output = "Pandas".

import pandas
from umwelt_apy import fetch_by_url, unnest_and_filter, preview_resources, download_resources

# Search URL copied from the web interface
url = "https://md.umwelt.info/search/all?query=(Ozon)+AND+organisation%3A%2FLand%2FBayern%2Fopen.bydata"

results = (
    # Step 1: fetch only datasets that have resources, as a pandas DataFrame
    fetch_by_url(
        api_url = url,
        build_row = "only_resources",
        filter_datasets = lambda dataset: "resources" in dataset,
        output = "Pandas",
    )
    # Step 2: unnest the resources and keep Excel/CSV files whose description mentions "Ozon"
    .pipe(unnest_and_filter, formats=["Microsoft Excel Spreadsheet", "CSV"], description_regex="Ozon")
    # Step 3: screen the preview
    .pipe(preview_resources)
)

To download the resources listed in the preview, run:

download_resources(results, base_dir="downloads")
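To make the unnest-and-filter step more concrete, here is a plain-Python sketch of the idea on made-up data. It is not the package's implementation: the resource keys "type", "description" and "url" and both helper names are assumptions for illustration only; only the "title" and "resources" keys appear in the examples above.

```python
import re


def unnest_resources(datasets):
    """Yield one flat row per resource, keeping the parent dataset's title."""
    for dataset in datasets:
        for resource in dataset.get("resources", []):
            yield {"dataset_title": dataset.get("title"), **resource}


def filter_resources(rows, formats, description_regex):
    """Keep rows whose format is listed and whose description matches the regex."""
    pattern = re.compile(description_regex)
    return [
        row
        for row in rows
        if row.get("type") in formats
        and pattern.search(row.get("description") or "")
    ]


# Made-up sample data; the keys "type", "description" and "url" are assumptions.
datasets = [
    {
        "title": "Ozon Messwerte",
        "resources": [
            {"type": "CSV", "description": "Ozon Stundenwerte", "url": "https://example.org/ozon.csv"},
            {"type": "PDF", "description": "Ozon Bericht", "url": "https://example.org/ozon.pdf"},
        ],
    },
    {"title": "Dataset without resources"},
]

selected = filter_resources(
    unnest_resources(datasets),
    formats=["Microsoft Excel Spreadsheet", "CSV"],
    description_regex="Ozon",
)
for row in selected:
    print(row["dataset_title"], row["url"])
```

Each dataset with several resources becomes several rows, which is why unnesting has to happen before previewing or downloading individual files.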

More examples are provided in the docstrings of each function.

Requirements

umwelt_apy requires Python 3.11 or higher and the requests package in version 2.32 or higher. If you want to use pandas or polars dataframes, you need pandas version 2.2.3 or polars version 1.26, respectively. Full details are provided in the pyproject.toml.
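A quick way to check an environment against these requirements is sketched below, using only the standard library. Treating the pandas and polars versions as minimums is an assumption (the pyproject.toml is authoritative), and the simple numeric comparison ignores pre-release suffixes.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

# Versions named in the paragraph above; treating them as minimums is an assumption.
MINIMUMS = {"requests": "2.32", "pandas": "2.2.3", "polars": "1.26"}


def as_tuple(version_string: str) -> tuple[int, ...]:
    """Parse the leading numeric components, e.g. "2.32.3" -> (2, 32, 3)."""
    parts = []
    for piece in version_string.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two dotted versions numerically, padding the shorter one with zeros."""
    a, b = as_tuple(installed), as_tuple(minimum)
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)) >= b + (0,) * (width - len(b))


print("Python 3.11+:", sys.version_info >= (3, 11))
for name, minimum in MINIMUMS.items():
    try:
        status = "ok" if meets_minimum(version(name), minimum) else f"older than {minimum}"
    except PackageNotFoundError:
        status = "not installed"
    print(name, status)
```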

Contributing

Contributions are welcome, whether you want to fix a bug, add a new feature, or improve the documentation. You can find the source code at https://gitlab.opencode.de/umwelt-info/packages

Download files

Source Distribution

umwelt_apy-0.1.4.tar.gz (14.5 kB)

Uploaded Source

Built Distribution

umwelt_apy-0.1.4-py3-none-any.whl (16.7 kB)

Uploaded Python 3

File details

Details for the file umwelt_apy-0.1.4.tar.gz.

File metadata

  • Download URL: umwelt_apy-0.1.4.tar.gz
  • Upload date:
  • Size: 14.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.12

File hashes

Hashes for umwelt_apy-0.1.4.tar.gz
Algorithm Hash digest
SHA256 cdb4c8a328af0eff4b85db00edbb80f2e6f38502389607c9fd04228587e2765d
MD5 46d1c703ae63d33644ba38da8dfb7e07
BLAKE2b-256 f70a933b10755bbcacbf5c0b8bd094c59939e811cc8d15c2b50ba65a19a325ee
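A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch with the standard library's hashlib; it assumes the file was saved under its original name in the current directory, and the helper name sha256_of_file is illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digest of umwelt_apy-0.1.4.tar.gz as listed above.
EXPECTED_SHA256 = "cdb4c8a328af0eff4b85db00edbb80f2e6f38502389607c9fd04228587e2765d"


def sha256_of_file(path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed location of the downloaded archive; adjust the path as needed.
archive = Path("umwelt_apy-0.1.4.tar.gz")
if archive.exists():
    print("hash matches:", sha256_of_file(archive) == EXPECTED_SHA256)
else:
    print(f"{archive} not found, nothing to verify")
```

pip can perform the same verification itself when hashes are pinned in a requirements file and --require-hashes is used.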

File details

Details for the file umwelt_apy-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: umwelt_apy-0.1.4-py3-none-any.whl
  • Upload date:
  • Size: 16.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.12

File hashes

Hashes for umwelt_apy-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 69accb7f0ec33d9ff8ec4bd81817ce3385571d7e823bffbf760c3f003e9c4b55
MD5 1c2aedfb090d4e2fab4076176c14e66f
BLAKE2b-256 ca1c3578bbdde47099b5030323d5f8285e969a9470a2259a3eb2235c9f88bf41
