Skip to main content

BAG tools

Project description

bag-nl

bag-nl is a small toolkit for importing Dutch BAG (Basisregistratie Adressen en Gebouwen) extracts into a SQLite (or other SQLAlchemy-compatible) database.

It focuses on a fast, one-way ETL process: read BAG ZIP archives, normalize a subset of the data, and write it into a relational schema that is easier to query from Python or SQL.

Features

  • Command-line interface: bag-nl import ... for one-shot imports.

  • SQLite support out of the box; any SQLAlchemy-supported database can be used via a DSN.

  • Imports the following BAG object types into normalized tables:

    • Nummeraanduidingen (bag_num)
    • Openbare ruimte (bag_opr)
    • Panden (bag_pnd)
    • Verblijfsobjecten (bag_vbo)
    • Adressen met status (bag_sta)
    • Ligplaatsen (bag_lig)
    • Woonplaatsen (bag_wpl)
    • Woonplaats–gemeente relaties (bag_wpl_gem)
    • Relaties verblijfsobject–pand (bag_vbo_pnd)
    • Gebruiksdoelen van verblijfsobjecten (bag_vbo_purpose)
    • A shared text code table (bag_text) containing status and purpose codes
  • Calculates simple tile codes from geometry to speed up spatial prefix queries.

  • Uses SQLAlchemy ORM so you can query the resulting database directly from Python.

Installation

This package requires Python 3.12 or later.

Install from source:

git clone https://git.leptonix.net/bag-nl.git
cd bag-nl
pip install .

Or install using pip directly from the remote if your environment allows:

pip install git+https://git.leptonix.net/bag-nl.git

The installer pulls in the runtime dependencies:

  • corylus (XML parsing and time utilities)
  • dpath (deep dictionary access)
  • SQLAlchemy >= 2
  • tqdm

Getting BAG Data

bag-nl expects BAG extract ZIP archives in the current working directory when you run the import.

Roughly:

  • For most collections, it looks for files matching:

    • 9999*<TYPE>*

    where <TYPE> is the uppercase prefix of the collection name (for example, VBO, PND, NUM, OPR, STA, LIG, WPL).

  • For the wpl_gem collection, it looks for files matching:

    • GEM-WPL*

The exact filenames and formats follow the official BAG delivery conventions. If the expected files are not present, the importer will fail with an error.

Obtain the official BAG extracts from the Dutch Kadaster or your data provider, then place the relevant ZIP files in a directory and run bag-nl from there.

Usage

Command-line Interface

The installed console script is bag-nl.

Basic usage:

bag-nl import /path/to/bag.db

This will:

  1. Create (or open) the SQLite database at /path/to/bag.db.
  2. Create all necessary tables if they do not exist.
  3. Import all default collections.

Database Argument

The db parameter can be:

  • A SQLite file path, for example:

    bag-nl import data/bag.sqlite
    
  • A full SQLAlchemy DSN for another database backend, for example:

    bag-nl import "postgresql+psycopg2://user:pass@host/dbname"
    

When you pass a plain path without ://, it is treated as a SQLite file.

Selecting Collections

By default, the following collections are imported:

  • text
  • wpl
  • wpl_gem
  • opr
  • num
  • vbo
  • vbo_purpose
  • pnd
  • vbo_pnd
  • sta
  • lig

You can restrict the import using the --collections option:

# Import only numbers and public spaces
bag-nl import bag.db --collections num opr

# Comma-separated syntax also works
bag-nl import bag.db --collections vbo,num,pnd

Both forms can be mixed; internally, the option is normalized to a list of collection names.

Test Mode

Use --test to limit how many rows are imported per collection.

  • Without a value: imports a small sample (currently 10 rows per collection):

    bag-nl import bag.db --test
    
  • With an integer: imports at most N rows per collection:

    bag-nl import bag.db --test 1000
    

This is useful for quickly validating that the pipeline works on your environment.

Verbose Error Output

Add --verbose to see more detailed exception information when the import fails:

bag-nl import bag.db --collections vbo --verbose

Without --verbose, only the exception message is printed; with --verbose, the full Python representation (including type and traceback location) is shown.

Database Schema Overview

All tables live in a single schema, with simple integer primary keys and some indexes for common queries.

A brief overview:

  • bag_text
    Simple mapping between text codes (status, usage types, etc.) and integer IDs used throughout the schema.

  • bag_num
    BAG address numbers (nummeraanduidingen). Fields include:

    • id (primary key)
    • postcode
    • number
    • extra
    • letter
    • type
    • opr (linked to bag_opr.id)
    • date (julian day of validity start)
  • bag_opr
    Public spaces (openbare ruimten):

    • id, name, type, status, wpl, date
  • bag_pnd
    Buildings (panden):

    • id, geo, year, status, date
  • bag_vbo
    Verblijfsobjecten (occupiable units):

    • id, num (main address), geo, area, tile, date
  • bag_sta / bag_lig
    Addresses with status, for addresses on land and water:

    • id, num, status, geo, tile, date
  • bag_vbo_pnd
    Relationship between verblijfsobjecten and panden:

    • composite key (vbo, pnd)
  • bag_vbo_purpose
    Relationship between verblijfsobjecten and usage purposes:

    • composite key (vbo, purpose), where purpose references bag_text.id.
  • bag_wpl
    Woonplaatsen (localities):

    • id, name, geo, tile, date
  • bag_wpl_gem
    Relationship between woonplaatsen and gemeenten:

    • composite key (wpl, gem)

Geometry fields (geo) are stored as text using an internal, compact representation derived from the BAG XML geometries. Tile fields (tile) are short string prefixes representing location buckets, intended to speed up prefix filtering.

Programmatic Use

You can also import and use the components directly from Python:

from bag_nl.importer import importer
from bag_nl.models import BagNum, BagVbo
from sqlalchemy.orm import Session

# Run an import
importer(db="bag.sqlite", collections=["num", "vbo"])

# Query results
from bag_nl.db import get_session_factory

SessionFactory = get_session_factory("bag.sqlite")
with SessionFactory() as session:
    some_address = session.query(BagNum).filter_by(postcode="1234AB", number=1).first()
    print(some_address)

This flows through the same schema and importer as the CLI.

Limitations and Caveats

  • The tool assumes specific BAG XML structures and file naming conventions. Changes in upstream BAG release formats may require code adjustments.
  • There is no upsert logic; running the importer twice into the same database can produce integrity errors from conflicting primary keys.
  • Geometry is stored as unnamed, internal textual encodings; there is no GeoJSON or WKT export built in.
  • Some parts of the parsing (for example, status codes and usage purposes) rely on a fixed internal table of strings. Unknown strings will cause failures.

Use this tool as an ETL helper for your own BAG workflows, and be prepared to adapt the importer or models as Kadaster changes their formats.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bag_nl-0.3.0.tar.gz (9.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

bag_nl-0.3.0-py3-none-any.whl (10.8 kB view details)

Uploaded Python 3

File details

Details for the file bag_nl-0.3.0.tar.gz.

File metadata

  • Download URL: bag_nl-0.3.0.tar.gz
  • Upload date:
  • Size: 9.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.17

File hashes

Hashes for bag_nl-0.3.0.tar.gz
Algorithm Hash digest
SHA256 011c085d565f53af92eba28bcab78b7b0d08378129afc85178b4d7b44a9f9638
MD5 0f154d02ad296649b041caafee3237f9
BLAKE2b-256 8b2a08b4c0f149a3fc8f46490d221e6536ecb2670c4528a5eac00d0086787409

See more details on using hashes here.

File details

Details for the file bag_nl-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: bag_nl-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 10.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.17

File hashes

Hashes for bag_nl-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 afce7cc9d8c2973eec14a925d28ec39d97e8e9bc9fc1a9bbec6ee36949e37c7d
MD5 b39167252102b23e247b4f5c2c40d83b
BLAKE2b-256 857fe8f4c65e6ac30fcb95eaadd707faa510834fe9f07f66b69172a882b2c12f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page