Swedish legal data collection tool

juris

A command-line tool for collecting and normalizing Swedish legal documents from official government sources.

Sweden has a wealth of public legal information — laws, government bills, public inquiries, court decisions — scattered across multiple government websites and APIs with inconsistent formats. juris collects documents from these sources, normalizes them into a unified format, and saves them as browsable, version-controlled files (Markdown + JSON). Think of it as a git-native open database for Swedish law.

Features

  • 8 data sources covering Swedish parliament, government, courts, authorities, and EU law
  • 21 document types from bills and motions to court decisions and EU regulations
  • Dual output format — Markdown (human-readable, browsable on GitHub) and JSON (machine-parseable)
  • Incremental collection with state tracking to resume where you left off
  • Async I/O with built-in rate limiting to respect source servers
  • PDF text extraction from document attachments
  • Date and session filtering for targeted collection
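As an illustration of the rate-limited async I/O mentioned above, here is a minimal sketch using only `asyncio`. It is not taken from the juris code base: the `RateLimiter` class, the `fetch_all` helper, and the placeholder "request" are all hypothetical names chosen for this example.

```python
import asyncio
import time


class RateLimiter:
    """Hypothetical helper: lets one caller proceed every `min_interval` seconds."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._lock = asyncio.Lock()
        self._next_allowed = 0.0

    async def wait(self) -> None:
        # The lock serializes callers so each one gets its own time slot.
        async with self._lock:
            now = time.monotonic()
            delay = self._next_allowed - now
            if delay > 0:
                await asyncio.sleep(delay)
            self._next_allowed = max(now, self._next_allowed) + self.min_interval


async def fetch_all(doc_ids: list[str], limiter: RateLimiter) -> list[str]:
    async def fetch_one(doc_id: str) -> str:
        await limiter.wait()
        # Placeholder for a real HTTP request to a source server.
        await asyncio.sleep(0)
        return f"fetched {doc_id}"

    # gather() preserves the order of the input ids in its result list.
    return await asyncio.gather(*(fetch_one(d) for d in doc_ids))
```

The tasks are started concurrently, but the limiter spaces out the moment each one is allowed to contact the server, which is the general idea behind respecting source servers while still using async I/O.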

Data sources

Source           Method              Document types
Riksdagen        JSON API            prop, sou, mot, bet, dir, skr, sfs
Regeringen.se    Web scraping        prop, sou, ds, lagr, dir, skr
Domstolsverket   REST API            nja, ad, hfd, mod, pmod
JO               Web scraping        jo
JK               Web scraping        jk
Lagrummet        Web scraping        foreskrift
EUR-Lex          SPARQL              eu_reg, eu_dir
CURIA / HUDOC    SPARQL / JSON API   cjeu, echr
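Several document types are available from more than one source (for example, prop from both Riksdagen and Regeringen.se). The table can be expressed as a lookup, sketched below. The `SOURCE_DOC_TYPES` mapping and `sources_for` function are hypothetical and only restate the table; they are not part of the juris API.

```python
# Data copied from the table above; the names are illustrative only.
SOURCE_DOC_TYPES: dict[str, set[str]] = {
    "Riksdagen": {"prop", "sou", "mot", "bet", "dir", "skr", "sfs"},
    "Regeringen.se": {"prop", "sou", "ds", "lagr", "dir", "skr"},
    "Domstolsverket": {"nja", "ad", "hfd", "mod", "pmod"},
    "JO": {"jo"},
    "JK": {"jk"},
    "Lagrummet": {"foreskrift"},
    "EUR-Lex": {"eu_reg", "eu_dir"},
    "CURIA / HUDOC": {"cjeu", "echr"},
}


def sources_for(doc_type: str) -> list[str]:
    """Return the sources (alphabetically) that list the given document type."""
    return sorted(
        name for name, types in SOURCE_DOC_TYPES.items() if doc_type in types
    )
```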

Document types

Swedish Parliament

Type   Swedish         English
prop   Propositioner   Government bills
mot    Motioner        Parliamentary motions
bet    Betänkanden     Committee reports
skr    Skrivelser      Government communications

Swedish Government

Type   Swedish                          English
sou    Statens offentliga utredningar   Swedish Government Official Reports
ds     Departementsserien               Ministry Publications Series
dir    Kommittédirektiv                 Committee directives
lagr   Lagrådsremisser                  Council on Legislation referrals
sfs    Svensk författningssamling       Swedish Code of Statutes

Courts

Type   Swedish                                English
nja    Nytt Juridiskt Arkiv                   Supreme Court precedents
ad     Arbetsdomstolens domar                 Labour Court decisions
hfd    Högsta förvaltningsdomstolens årsbok   Supreme Administrative Court yearbook
mod    Mark- och miljööverdomstolen           Land and Environment Court of Appeal
pmod   Patent- och marknadsöverdomstolen      Patent and Market Court of Appeal

Authorities

Type         Swedish                        English
jo           Justitieombudsmannens beslut   Parliamentary Ombudsman decisions
jk           Justitiekanslerns beslut       Chancellor of Justice decisions
foreskrift   Myndighetsföreskrifter         Agency regulations

EU law

Type     Swedish                  English
eu_reg   EU-förordningar          EU regulations
eu_dir   EU-direktiv              EU directives
cjeu     EU-domstolens domar      Court of Justice of the EU judgments
echr     Europadomstolens domar   European Court of Human Rights judgments

Installation

# Editable install from a local clone of the repository
pip install -e .

Requires Python 3.11 or later.

Usage

# Collect government bills from the 2024/25 parliamentary session
juris collect riksdagen --type prop --session 2024/25

# Collect SOU reports published since a specific date
juris collect riksdagen --type sou --since 2024-01-01

# Collect from the government website with a limit
juris collect regeringen --type prop --session 2024/25 --limit 5

# Collect Supreme Court decisions
juris collect domstol --type nja --since 2024-01-01

# Collect agency regulations
juris collect lagrummet --type foreskrift --limit 10

# Collect EU regulations
juris collect eur_lex --type eu_reg --since 2024-01-01

# Check collection progress
juris status

# Count collected documents
juris stats

Options

Option                                  Description
--type TYPE                             Document type to collect (required)
--session SESSION                       Parliamentary session, e.g. 2024/25
--since DATE                            Collect documents from this date (YYYY-MM-DD)
--until DATE                            Collect documents until this date (YYYY-MM-DD)
--limit N                               Maximum number of documents to collect
--skip-existing / --no-skip-existing    Skip already collected documents (default: on)
--skip-content / --no-skip-content      Metadata only, skip full text (default: off)
--data-dir PATH                         Output directory (default: data)
-v, --verbose                           Enable debug logging
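To make the `--skip-existing` behaviour concrete, here is one way such a filter could work: remember the ids already collected in a small JSON file and drop them from the next run. This is a sketch under assumptions. The state file format, its location, and the function names are all hypothetical, and juris may track state differently.

```python
import json
from pathlib import Path


def load_seen(state_file: Path) -> set[str]:
    """Read previously collected ids; an absent file means nothing was collected."""
    if not state_file.exists():
        return set()
    return set(json.loads(state_file.read_text(encoding="utf-8")))


def save_seen(state_file: Path, seen: set[str]) -> None:
    """Persist the ids as a sorted JSON array (assumed format)."""
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps(sorted(seen)), encoding="utf-8")


def select_new(
    candidates: list[str], seen: set[str], skip_existing: bool = True
) -> list[str]:
    """Return the ids to fetch, honouring a skip-existing style switch."""
    if not skip_existing:
        return list(candidates)
    return [doc_id for doc_id in candidates if doc_id not in seen]
```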

Output format

Each document is saved in two formats:

Markdown (human-readable, browsable on GitHub):

---
doc_id: "prop-2024/25:208"
doc_type: prop
title: "Ett mer heltäckande straffansvar vid angrepp på företagshemligheter"
date: "2025-09-08"
source: riksdagen
department: Justitiedepartementet
session: "2024/25"
source_url: "https://..."
---

# Ett mer heltäckande straffansvar vid angrepp på företagshemligheter

Proposition 2024/25:208

[full text...]
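The Markdown files start with a metadata block delimited by `---` lines, so they can be split into metadata and body with a few lines of Python. The sketch below handles only flat `key: value` pairs like those shown above and is not a full YAML parser; `parse_front_matter` is a hypothetical helper name.

```python
def parse_front_matter(markdown: str) -> tuple[dict[str, str], str]:
    """Split a document into (metadata, body).

    Only simple `key: value` lines are supported; surrounding double
    quotes are stripped from values. Text without a complete front
    matter block is returned unchanged with empty metadata.
    """
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, markdown

    meta: dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:]).lstrip("\n")
            return meta, body
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')

    # No closing delimiter was found.
    return {}, markdown
```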

JSON (machine-readable, full metadata):

{
  "doc_id": "prop-2024/25:208",
  "doc_type": "prop",
  "title": "Ett mer heltäckande straffansvar...",
  "date": "2025-09-08",
  "text": "...",
  "source": "riksdagen",
  "attachments": [...]
}
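A consumer of the JSON files might load them into a typed record. The sketch below uses a standard-library dataclass with the fields visible in the example; the `Document` class and `load_document` function are hypothetical and do not reproduce the project's own Pydantic models. The shape of `attachments` entries is not shown above, so they are kept as untyped values.

```python
import datetime as dt
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Document:
    """Illustrative record with the fields from the JSON example."""

    doc_id: str
    doc_type: str
    title: str
    date: dt.date
    source: str
    text: str = ""
    # Attachment structure is an unknown here, so entries stay untyped.
    attachments: list[Any] = field(default_factory=list)


def load_document(raw: str) -> Document:
    """Parse one JSON document string into a Document."""
    data = json.loads(raw)
    return Document(
        doc_id=data["doc_id"],
        doc_type=data["doc_type"],
        title=data["title"],
        date=dt.date.fromisoformat(data["date"]),
        source=data["source"],
        text=data.get("text", ""),
        attachments=data.get("attachments", []),
    )
```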

Documents are organized by type, then by session or year:

data/
├── prop/
│   └── 2024-25/
│       ├── prop-2024-25_208.json
│       └── prop-2024-25_208.md
├── sou/
│   └── 2024/
├── nja/
└── .state/
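Judging from the single example in the tree above, file names appear to be derived from the document id by replacing `/` with `-` and `:` with `_`. The sketch below encodes that inferred rule; `output_paths` is a hypothetical helper, and the real naming logic in juris may cover cases this does not.

```python
from pathlib import Path


def output_paths(
    data_dir: str, doc_type: str, session: str, doc_id: str
) -> tuple[Path, Path]:
    """Return the (JSON, Markdown) paths for a document.

    Assumed rule, inferred from one example: "/" becomes "-" and
    ":" becomes "_" so the id is safe to use as a file name.
    """
    folder = Path(data_dir) / doc_type / session.replace("/", "-")
    stem = doc_id.replace("/", "-").replace(":", "_")
    return folder / f"{stem}.json", folder / f"{stem}.md"
```

For the document shown earlier, `output_paths("data", "prop", "2024/25", "prop-2024/25:208")` yields the two `prop-2024-25_208` paths listed in the tree.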

Project structure

src/juris/
├── cli.py              # Command-line interface (Click)
├── models.py           # Document data models (Pydantic)
├── storage.py          # File storage (JSON + Markdown)
├── state.py            # Incremental collection state
├── pdf.py              # PDF text extraction
├── utils.py            # Shared utilities
└── collectors/
    ├── base.py         # Abstract base collector
    ├── riksdagen.py    # Riksdagen API
    ├── regeringen.py   # Regeringen.se scraper
    ├── domstol.py      # Court decisions API
    ├── jo_jk.py        # JO/JK decisions
    ├── lagrummet.py    # Agency regulations
    ├── eurlex.py       # EUR-Lex SPARQL
    ├── curia.py        # CJEU SPARQL
    └── hudoc.py        # ECtHR API

Development

# Install with dev dependencies (or use: make install)
pip install -e ".[dev]"

# Lint and format check
ruff check src/ tests/
ruff format --check src/ tests/

# Type check (strict mode)
mypy src/

# Run unit tests
pytest tests/ --ignore=tests/test_e2e.py

# Or use the Makefile shortcuts
make lint        # Lint + format check
make typecheck   # Type check
make test        # Unit tests
make format      # Auto-format code
make test-e2e    # End-to-end tests (hits live APIs)

Contributing

See CONTRIBUTING.md for development setup, coding standards, and how to add new collectors.

Please report security vulnerabilities through GitHub's private vulnerability reporting; see SECURITY.md for details.

This project follows the Contributor Covenant v2.1.

License

MIT
