oldp-ingestor
Ingesting legal data like laws and court decisions via OLDP API.
Data sources:
| CLI provider | Type | Source |
|---|---|---|
| ris | laws + cases | Rechtsinformationssystem des Bundes (RIS) |
| rii | cases | Rechtsprechung im Internet (RII), federal courts |
| by | cases | Gesetze Bayern, Bavarian courts |
| nrw | cases | NRWE Rechtsprechungsdatenbank, NRW courts |
| ns | cases | NI-VORIS Niedersachsen |
| eu | cases | EUR-Lex, EU court decisions |
| juris-bb | cases | Landesrecht Berlin-Brandenburg |
| juris-bw | cases | Landesrecht Baden-Württemberg |
| juris-he | cases | Landesrecht Hessen |
| juris-hh | cases | Landesrecht Hamburg |
| juris-mv | cases | Landesrecht Mecklenburg-Vorpommern |
| juris-rlp | cases | Landesrecht Rheinland-Pfalz |
| juris-sa | cases | Landesrecht Sachsen-Anhalt |
| juris-sh | cases | Landesrecht Schleswig-Holstein |
| juris-sl | cases | Landesrecht Saarland |
| juris-th | cases | Landesrecht Thüringen |
| dummy | laws + cases | Django fixture JSON files (for testing) |
Installation
pip install oldp-ingestor
Some providers require Playwright browsers. Install them after pip:
playwright install chromium
For development, clone the repo and use Make (auto-detects uv or falls back
to pip):
git clone https://github.com/openlegaldata/oldp-ingestor.git
cd oldp-ingestor
make install
Configuration
Set the following environment variables (or add them to a .env file):
| Variable | Description |
|---|---|
| OLDP_API_URL | Base URL of the OLDP instance (e.g. http://localhost:8000) |
| OLDP_API_TOKEN | API authentication token |
| OLDP_API_HTTP_AUTH | Optional HTTP basic auth in user:password format |
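As an illustration of how these three variables fit together, here is a minimal Python sketch that reads them from a mapping such as `os.environ`. The names `OldpConfig` and `load_config` are hypothetical and not part of oldp-ingestor; the trailing-slash handling and the error messages are assumptions, and `.env` loading is not covered.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass
class OldpConfig:
    """Hypothetical container for the settings listed in the table."""
    api_url: str
    api_token: str
    http_auth: Optional[Tuple[str, str]] = None  # (user, password)


def load_config(env: Mapping[str, str] = os.environ) -> OldpConfig:
    """Read the OLDP_* variables and fail early if a required one is missing."""
    try:
        # Assumption: a trailing slash on the base URL is not significant.
        api_url = env["OLDP_API_URL"].rstrip("/")
        api_token = env["OLDP_API_TOKEN"]
    except KeyError as exc:
        raise RuntimeError(f"missing required variable: {exc.args[0]}") from exc

    http_auth = None
    raw = env.get("OLDP_API_HTTP_AUTH", "")
    if raw:
        # Split on the first colon only, so the password may contain ':'.
        user, sep, password = raw.partition(":")
        if not sep:
            raise RuntimeError("OLDP_API_HTTP_AUTH must be in user:password format")
        http_auth = (user, password)
    return OldpConfig(api_url, api_token, http_auth)
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.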
Usage
Show API info
oldp-ingestor info
Ingest laws
From the RIS API (rechtsinformationen.bund.de)
# Ingest all available legislation
oldp-ingestor laws --provider ris
# Search for specific legislation
oldp-ingestor laws --provider ris --search-term "EinbTestV"
# Limit the number of law books to ingest
oldp-ingestor laws --provider ris --limit 5
# Combine search and limit
oldp-ingestor laws --provider ris --search-term "BGB" --limit 1
Incremental fetching and request pacing
# Only fetch legislation adopted since a given date
oldp-ingestor laws --provider ris --date-from 2025-12-01
# Fetch legislation within a date range
oldp-ingestor laws --provider ris --date-from 2025-01-01 --date-to 2025-06-30
# Override the default request delay (0.2s) for slower pacing
oldp-ingestor laws --provider ris --request-delay 0.5
For automated cron usage, see dev-deployment/ingest-ris.sh (laws) and
dev-deployment/ingest-ris-cases.sh (cases). Both scripts track the last
successful run date in a state file and pass it as --date-from on subsequent runs.
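The state-file idea can be sketched in Python as follows. This is not the content of the shell scripts named above; the function names, the state file format (a single ISO date) and the exact command line assembled here are assumptions made for illustration.

```python
from datetime import date
from pathlib import Path
from typing import List, Optional


def read_last_run(state_file: Path) -> Optional[str]:
    """Return the stored ISO date, or None if no successful run was recorded."""
    if not state_file.exists():
        return None
    text = state_file.read_text().strip()
    date.fromisoformat(text)  # raises ValueError if the file content is not a date
    return text


def build_command(state_file: Path) -> List[str]:
    """Assemble the CLI call, adding --date-from only when a state exists."""
    cmd = ["oldp-ingestor", "laws", "--provider", "ris"]
    last = read_last_run(state_file)
    if last is not None:
        cmd += ["--date-from", last]
    return cmd


def record_success(state_file: Path, run_date: date) -> None:
    """Store the run date; call this only after the ingest exited successfully."""
    state_file.write_text(run_date.isoformat() + "\n")
```

A wrapper would run the list returned by `build_command` (for example with `subprocess.run(cmd, check=True)`) and call `record_success` only if that call did not raise, so a failed run is retried from the same date next time.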
From a JSON fixture file (dummy provider)
oldp-ingestor laws --provider dummy --path /path/to/fixture.json
Ingest cases
From the RIS API (rechtsinformationen.bund.de)
# Ingest all cases from all federal courts
oldp-ingestor cases --provider ris
# Filter by court and date range
oldp-ingestor cases --provider ris --court BGH --date-from 2026-01-01
# Limit for testing
oldp-ingestor cases --provider ris --limit 10 -v
From a JSON fixture file (dummy provider)
oldp-ingestor cases --provider dummy --path /path/to/fixture.json
# Limit the number of cases to ingest
oldp-ingestor cases --provider dummy --path /path/to/fixture.json --limit 10
The fixture file should contain Django fixture entries with courts.court and
cases.case models. Court foreign keys are resolved to court_name strings
for the OLDP cases API.
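To make the foreign-key resolution concrete, here is a minimal sketch that works on the standard Django fixture layout (`model`, `pk`, `fields`). The field names `name`, `court` and `file_number` are assumptions chosen for the example, and `resolve_cases` is a hypothetical helper, not the dummy provider's actual code.

```python
import json
from typing import Any, Dict, List


def resolve_cases(fixture: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Replace each case's court foreign key with a court_name string."""
    # Map court primary key -> court name (field name "name" is assumed).
    courts = {
        entry["pk"]: entry["fields"]["name"]
        for entry in fixture
        if entry["model"] == "courts.court"
    }
    cases = []
    for entry in fixture:
        if entry["model"] != "cases.case":
            continue
        fields = dict(entry["fields"])       # copy, leave the fixture untouched
        court_pk = fields.pop("court")       # field name "court" is assumed
        fields["court_name"] = courts[court_pk]
        cases.append(fields)
    return cases


fixture = json.loads("""
[
  {"model": "courts.court", "pk": 7, "fields": {"name": "Bundesgerichtshof"}},
  {"model": "cases.case", "pk": 1, "fields": {"court": 7, "file_number": "I ZR 1/25"}}
]
""")
resolved = resolve_cases(fixture)
```

In this sketch a case that points to a court missing from the fixture raises `KeyError`, which surfaces broken fixtures early instead of sending incomplete records to the API.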
Output sinks
By default, data is written to the OLDP REST API. Use --sink json-file to
write JSON files to disk instead:
# Export laws to local files
oldp-ingestor --sink json-file --output-dir /tmp/export \
    laws --provider ris --search-term BGB --limit 1
# Export cases to local files
oldp-ingestor --sink json-file --output-dir /tmp/export \
    cases --provider ris --court BGH --limit 5
See docs/sinks.md for details on directory structure and implementing custom sinks.
Architecture
The ingestor uses a provider-based architecture. Each data source implements a
provider class (LawProvider or CaseProvider), and shared RIS HTTP logic
(retry, pacing, User-Agent) lives in RISBaseClient. Output is routed through
a sink (ApiSink or JSONFileSink).
Provider
├── LawProvider  → DummyLawProvider, RISProvider
└── CaseProvider → DummyCaseProvider, RISCaseProvider,
                   RiiCaseProvider, ByCaseProvider,
                   NrwCaseProvider, NsCaseProvider,
                   EuCaseProvider, JurisCaseProvider (10 state variants)
Sink
├── ApiSink      → OLDP REST API (default)
└── JSONFileSink → local JSON files
See docs/architecture.md for the full design.
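The provider/sink split can be pictured with a small Python sketch. The class and method names below (`SketchCaseProvider`, `SketchSink`, `get_cases`, `write_case`, `ingest_cases`) are hypothetical and chosen only to illustrate the pattern; the real interfaces are described in docs/architecture.md and may differ.

```python
from abc import ABC, abstractmethod
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Optional

Case = Dict[str, Any]


class SketchCaseProvider(ABC):
    """A data source that yields case records one at a time."""

    @abstractmethod
    def get_cases(self) -> Iterator[Case]: ...


class SketchSink(ABC):
    """A destination that accepts case records."""

    @abstractmethod
    def write_case(self, case: Case) -> None: ...


class StaticCaseProvider(SketchCaseProvider):
    """Provider backed by an in-memory list, similar in spirit to a fixture."""

    def __init__(self, cases: Iterable[Case]) -> None:
        self._cases = list(cases)

    def get_cases(self) -> Iterator[Case]:
        return iter(self._cases)


class ListSink(SketchSink):
    """Sink that collects records in memory instead of calling an API."""

    def __init__(self) -> None:
        self.items: List[Case] = []

    def write_case(self, case: Case) -> None:
        self.items.append(case)


def ingest_cases(provider: SketchCaseProvider, sink: SketchSink,
                 limit: Optional[int] = None) -> int:
    """Pipe records from provider to sink; stop after `limit` records if set."""
    count = 0
    for case in islice(provider.get_cases(), limit):
        sink.write_case(case)
        count += 1
    return count
```

Because the ingest loop only depends on the two abstract types, any provider can be combined with any sink, which is what lets one CLI flag switch the output between the API and local files.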
Politeness and rate limiting
The RIS API allows 600 requests per minute. The ingestor stays under this limit with:
- Request pacing: 0.2 s delay between requests (configurable), which caps the rate at 300 requests per minute
- Retry with backoff: exponential backoff on 429/503, respects Retry-After
- Descriptive User-Agent: oldp-ingestor/0.1.0
See docs/politeness.md for details.
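The pacing arithmetic and the retry rule can be sketched as follows. This is an illustration only, not the code in RISBaseClient: the function names, the base delay of 1 s, the 60 s cap and the attempt count are assumptions, and only the delta-seconds form of Retry-After is handled (an HTTP-date value falls back to exponential backoff).

```python
import time
from typing import Callable, Dict, Optional, Tuple

RIS_LIMIT_PER_MINUTE = 600  # limit stated for the RIS API
DEFAULT_DELAY = 0.2         # seconds between requests (ingestor default)


def max_requests_per_minute(delay: float) -> float:
    """Upper bound on the request rate when sleeping `delay` between requests."""
    return 60.0 / delay


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next try; `attempt` is 0-based."""
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after)          # the server's hint wins
    return min(cap, base * 2 ** attempt)   # 1, 2, 4, ... capped at `cap`


Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call `send` until it returns something other than 429/503."""
    status, body = 0, ""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in (429, 503):
            break
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    return status, body
```

With the default delay, `max_requests_per_minute(DEFAULT_DELAY)` comes out at about 300, half of the stated limit, which leaves headroom for retries and for other clients sharing the same address.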
Further documentation
- docs/architecture.md — class hierarchy, data flow, file layout
- docs/sinks.md — sink concept, CLI examples, custom sinks
- docs/politeness.md — rate limiting, retry logic, cron operation
Provider docs
| Provider | Doc |
|---|---|
| RIS (laws + cases) | docs/providers/de/ris.md |
| RII (federal courts) | docs/providers/de/rii.md |
| Bayern | docs/providers/de/by.md |
| NRW | docs/providers/de/nrw.md |
| Niedersachsen | docs/providers/de/ns.md |
| EUR-Lex (EU) | docs/providers/de/eu.md |
| Bremen | docs/providers/de/hb.md |
| Sachsen OVG | docs/providers/de/sn_ovg.md |
| Sachsen ESAMOSplus | docs/providers/de/sn.md |
| Sachsen VerfGH | docs/providers/de/sn_verfgh.md |
| Juris (10 states) | docs/providers/de/juris.md |
| Dummy (test/dev) | docs/providers/dummy/dummy.md |
Development
# Run tests
make test
# Run tests with coverage
make test-cov
# Lint
make lint
# Auto-format
make format
See CONTRIBUTING.md for the full development setup, how to add new providers, and pull request guidelines.