BAG tools
bag-nl
bag-nl is a small toolkit for importing Dutch BAG (Basisregistratie Adressen en Gebouwen) extracts into a SQLite (or other SQLAlchemy-compatible) database.
It focuses on a fast, one-way ETL process: read BAG ZIP archives, normalize a subset of the data, and write it into a relational schema that is easier to query from Python or SQL.
Features
- Command-line interface: bag-nl import ... for one-shot imports.
- SQLite support out of the box; any SQLAlchemy-supported database can be used via a DSN.
- Imports the following BAG object types into normalized tables:
  - Nummeraanduidingen (bag_num)
  - Openbare ruimte (bag_opr)
  - Panden (bag_pnd)
  - Verblijfsobjecten (bag_vbo)
  - Standplaatsen (bag_sta)
  - Ligplaatsen (bag_lig)
  - Woonplaatsen (bag_wpl)
  - Woonplaats–gemeente relaties (bag_wpl_gem)
  - Relaties verblijfsobject–pand (bag_vbo_pnd)
  - Gebruiksdoelen van verblijfsobjecten (bag_vbo_purpose)
  - A shared text code table (bag_text) containing status and purpose codes
- Calculates simple tile codes from geometry to speed up spatial prefix queries.
- Uses SQLAlchemy ORM so you can query the resulting database directly from Python.
Installation
This package requires Python 3.12 or later.
Install from source:
git clone https://git.leptonix.net/bag-nl.git
cd bag-nl
pip install .
Or install using pip directly from the remote if your environment allows:
pip install git+https://git.leptonix.net/bag-nl.git
The installer pulls in the runtime dependencies:
- corylus (XML parsing and time utilities)
- dpath (deep dictionary access)
- SQLAlchemy >= 2
- tqdm
Getting BAG Data
bag-nl expects BAG extract ZIP archives in the current working directory when you run the import.
Roughly:
- For most collections, it looks for files matching 9999*<TYPE>*, where <TYPE> is the uppercase prefix of the collection name (for example, VBO, PND, NUM, OPR, STA, LIG, WPL).
- For the wpl_gem collection, it looks for files matching GEM-WPL*.
The exact filenames and formats follow the official BAG delivery conventions. If the expected files are not present, the importer will fail with an error.
Obtain the official BAG extracts from the Dutch Kadaster or your data provider, then place the relevant ZIP files in a directory and run bag-nl from there.
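The matching rules above can be sketched in a few lines of Python, which may help when checking a download directory before starting an import. This is a minimal sketch, not the importer's actual code: the helper names archive_pattern and find_archives are hypothetical, and treating the part before the first underscore as the "uppercase prefix" is an assumption.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def archive_pattern(collection: str) -> str:
    """Return the glob pattern assumed for a collection's ZIP archive."""
    if collection == "wpl_gem":
        return "GEM-WPL*"
    # Assumption: the prefix is the part before the first underscore,
    # uppercased (for example "vbo" -> "VBO").
    prefix = collection.split("_")[0].upper()
    return f"9999*{prefix}*"


def find_archives(collection: str, directory: str = ".") -> list[Path]:
    """List files in `directory` whose names match the collection's pattern."""
    pattern = archive_pattern(collection)
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and fnmatchcase(path.name, pattern)
    )


if __name__ == "__main__":
    for name in ("num", "opr", "wpl_gem"):
        print(name, archive_pattern(name), find_archives(name))
```

An empty list for a collection you intend to import suggests the importer would fail for that collection.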
Usage
Command-line Interface
The installed console script is bag-nl.
Basic usage:
bag-nl import /path/to/bag.db
This will:
- Create (or open) the SQLite database at /path/to/bag.db.
- Create all necessary tables if they do not exist.
- Import all default collections.
Database Argument
The db parameter can be:
- A SQLite file path, for example:
  bag-nl import data/bag.sqlite
- A full SQLAlchemy DSN for another database backend, for example:
  bag-nl import "postgresql+psycopg2://user:pass@host/dbname"
When you pass a plain path without ://, it is treated as a SQLite file.
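The path-versus-DSN rule can be illustrated as follows. This is a sketch under assumptions: the function name to_dsn is hypothetical, and the real importer may resolve paths differently. The sqlite:/// prefix is the standard SQLAlchemy URL form for a SQLite file.

```python
def to_dsn(db: str) -> str:
    """Turn the CLI `db` argument into a SQLAlchemy-style URL."""
    if "://" in db:
        # Already a full DSN: pass it through unchanged.
        return db
    # Plain path: treat it as a SQLite file. An absolute path yields
    # four slashes in total, which is what SQLAlchemy expects.
    return f"sqlite:///{db}"


if __name__ == "__main__":
    print(to_dsn("data/bag.sqlite"))
    print(to_dsn("/path/to/bag.db"))
    print(to_dsn("postgresql+psycopg2://user:pass@host/dbname"))
```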
Selecting Collections
By default, the following collections are imported:
- text
- wpl
- wpl_gem
- opr
- num
- vbo
- vbo_purpose
- pnd
- vbo_pnd
- sta
- lig
You can restrict the import using the --collections option:
# Import only numbers and public spaces
bag-nl import bag.db --collections num opr
# Comma-separated syntax also works
bag-nl import bag.db --collections vbo,num,pnd
Both forms can be mixed; internally, the option is normalized to a list of collection names.
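One way such normalization could work is shown below. The function name normalize_collections is hypothetical and the actual implementation may differ; the sketch only shows how space-separated and comma-separated values can be flattened into one list.

```python
def normalize_collections(values: list[str]) -> list[str]:
    """Flatten space- and comma-separated collection names into one list."""
    names: list[str] = []
    for value in values:
        # Each command-line token may itself contain several comma-separated names.
        for name in value.split(","):
            name = name.strip()
            if name:
                names.append(name)
    return names


if __name__ == "__main__":
    # Tokens as the shell would pass them for: --collections vbo,num pnd
    print(normalize_collections(["vbo,num", "pnd"]))
```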
Test Mode
Use --test to limit how many rows are imported per collection.
- Without a value: imports a small sample (currently 10 rows per collection):
  bag-nl import bag.db --test
- With an integer: imports at most N rows per collection:
  bag-nl import bag.db --test 1000
This is useful for quickly validating that the pipeline works in your environment.
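An option that takes an optional integer like this can be expressed with argparse using nargs="?" and const. The parser below is an illustrative sketch, not the project's actual CLI definition; build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser that mimics the documented --test behaviour."""
    parser = argparse.ArgumentParser(prog="bag-nl")
    commands = parser.add_subparsers(dest="command", required=True)
    import_cmd = commands.add_parser("import")
    import_cmd.add_argument("db")
    # nargs="?" makes the value optional: a bare --test yields `const`,
    # "--test N" yields int(N), and omitting the flag yields `default`.
    import_cmd.add_argument("--test", nargs="?", type=int, const=10, default=None)
    return parser


if __name__ == "__main__":
    parser = build_parser()
    for argv in (["import", "bag.db"],
                 ["import", "bag.db", "--test"],
                 ["import", "bag.db", "--test", "1000"]):
        print(argv, "->", parser.parse_args(argv).test)
```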
Verbose Error Output
Add --verbose to see more detailed exception information when the import fails:
bag-nl import bag.db --collections vbo --verbose
Without --verbose, only the exception message is printed; with --verbose, the full Python representation (including type and traceback location) is shown.
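The difference between the two output modes can be approximated as follows. This is a sketch only: describe_error is a hypothetical helper, and the exact format printed by bag-nl may differ.

```python
import traceback


def describe_error(exc: BaseException, verbose: bool = False) -> str:
    """Format an exception tersely, or with type and location when verbose."""
    if not verbose:
        # Terse mode: only the exception message.
        return str(exc)
    # Verbose mode: repr() includes the exception type; the last traceback
    # frame (if any) shows where the exception was raised.
    frames = traceback.extract_tb(exc.__traceback__)
    location = f" (at {frames[-1].filename}:{frames[-1].lineno})" if frames else ""
    return f"{exc!r}{location}"


if __name__ == "__main__":
    try:
        int("not a number")
    except ValueError as exc:
        print(describe_error(exc))
        print(describe_error(exc, verbose=True))
```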
Database Schema Overview
All tables live in a single schema. Entity tables use simple integer primary keys, relationship tables use composite keys, and some indexes cover common queries.
A brief overview:
- bag_text
  Simple mapping between text codes (status, usage types, etc.) and integer IDs used throughout the schema.
- bag_num
  BAG address numbers (nummeraanduidingen). Fields include:
  - id (primary key)
  - postcode
  - number
  - extra
  - letter
  - type
  - opr (linked to bag_opr.id)
  - date (julian day of validity start)
- bag_opr
  Public spaces (openbare ruimten): id, name, type, status, wpl, date
- bag_pnd
  Buildings (panden): id, geo, year, status, date
- bag_vbo
  Verblijfsobjecten (occupiable units): id, num (main address), geo, area, tile, date
- bag_sta / bag_lig
  Standplaatsen (pitches, on land) and ligplaatsen (berths, on water): id, num, status, geo, tile, date
- bag_vbo_pnd
  Relationship between verblijfsobjecten and panden: composite key (vbo, pnd)
- bag_vbo_purpose
  Relationship between verblijfsobjecten and usage purposes: composite key (vbo, purpose), where purpose references bag_text.id.
- bag_wpl
  Woonplaatsen (localities): id, name, geo, tile, date
- bag_wpl_gem
  Relationship between woonplaatsen and gemeenten: composite key (wpl, gem)
Geometry fields (geo) are stored as text using an internal, compact representation derived from the BAG XML geometries. Tile fields (tile) are short string prefixes representing location buckets, intended to speed up prefix filtering.
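To make the tile and date columns more concrete, here is a sketch of a prefix query against a stand-in bag_vbo table using Python's built-in sqlite3 module. Several things are assumptions: the column types, the made-up tile values (the real tile alphabet is not documented here), and the storage of date as an integer Julian day number. The helper name vbo_in_tile is hypothetical.

```python
import sqlite3

# Stand-in table with the columns listed for bag_vbo; the types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bag_vbo (id INTEGER PRIMARY KEY, num INTEGER, geo TEXT, "
    "area INTEGER, tile TEXT, date INTEGER)"
)
conn.execute("CREATE INDEX ix_bag_vbo_tile ON bag_vbo (tile)")
conn.executemany(
    "INSERT INTO bag_vbo VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 101, "", 80, "abc12", 2451545),  # fictitious rows
        (2, 102, "", 95, "abc97", 2455198),
        (3, 103, "", 60, "xyz01", 2458850),
    ],
)


def vbo_in_tile(conn: sqlite3.Connection, prefix: str) -> list[tuple]:
    """Return (id, tile, ISO date) for rows whose tile starts with `prefix`.

    Assumes tile codes contain no GLOB wildcards (*, ?, [).
    """
    rows = conn.execute(
        # GLOB is case-sensitive; SQLite's date() accepts a Julian day number.
        "SELECT id, tile, date(bag_vbo.date) FROM bag_vbo "
        "WHERE tile GLOB ? ORDER BY id",
        (prefix + "*",),
    )
    return rows.fetchall()


if __name__ == "__main__":
    for row in vbo_in_tile(conn, "abc"):
        print(row)
```

A shorter prefix selects a larger bucket, so the same query shape works at several levels of coarseness.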
Programmatic Use
You can also import and use the components directly from Python:
from bag_nl.db import get_session_factory
from bag_nl.importer import importer
from bag_nl.models import BagNum

# Run an import
importer(db="bag.sqlite", collections=["num", "vbo"])

# Query results
SessionFactory = get_session_factory("bag.sqlite")
with SessionFactory() as session:
    some_address = session.query(BagNum).filter_by(postcode="1234AB", number=1).first()
    print(some_address)
This flows through the same schema and importer as the CLI.
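Because the result is an ordinary relational database, it can also be queried with plain SQL. The sketch below uses Python's built-in sqlite3 module against a tiny stand-in schema so that it runs on its own. The column names follow the schema overview, while the column types, the sample row, and the helper name street_for_address are assumptions.

```python
import sqlite3


def street_for_address(conn: sqlite3.Connection, postcode: str, number: int):
    """Look up the public space name for an address via bag_num.opr."""
    row = conn.execute(
        """
        SELECT bag_opr.name
        FROM bag_num
        JOIN bag_opr ON bag_opr.id = bag_num.opr
        WHERE bag_num.postcode = ? AND bag_num.number = ?
        """,
        (postcode, number),
    ).fetchone()
    return row[0] if row else None


# Stand-in data; against a real import you would connect to the database file.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bag_opr (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bag_num (id INTEGER PRIMARY KEY, postcode TEXT,
                          number INTEGER, opr INTEGER);
    INSERT INTO bag_opr VALUES (1, 'Voorbeeldstraat');
    INSERT INTO bag_num VALUES (10, '1234AB', 1, 1);
    """
)

if __name__ == "__main__":
    print(street_for_address(conn, "1234AB", 1))
```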
Limitations and Caveats
- The tool assumes specific BAG XML structures and file naming conventions. Changes in upstream BAG release formats may require code adjustments.
- There is no upsert logic; running the importer twice into the same database can produce integrity errors from conflicting primary keys.
- Geometry is stored in an internal, non-standard text encoding; there is no GeoJSON or WKT export built in.
- Some parts of the parsing (for example, status codes and usage purposes) rely on a fixed internal table of strings. Unknown strings will cause failures.
Use this tool as an ETL helper for your own BAG workflows, and be prepared to adapt the importer or models as Kadaster changes its formats.
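Since a second import into the same database can fail on primary key conflicts, a wrapper script might check whether a target table already holds rows before starting. The helper below is a hypothetical sketch for SQLite databases only and is not part of bag-nl. The table names come from the schema overview.

```python
import sqlite3

# Table names from the schema overview; used to validate input before
# interpolating a table name into SQL.
KNOWN_TABLES = {
    "bag_text", "bag_num", "bag_opr", "bag_pnd", "bag_vbo", "bag_sta",
    "bag_lig", "bag_wpl", "bag_wpl_gem", "bag_vbo_pnd", "bag_vbo_purpose",
}


def is_populated(conn: sqlite3.Connection, table: str) -> bool:
    """Return True if `table` exists and contains at least one row."""
    if table not in KNOWN_TABLES:
        raise ValueError(f"unknown table: {table}")
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if exists is None:
        return False
    return conn.execute(f"SELECT 1 FROM {table} LIMIT 1").fetchone() is not None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bag_num (id INTEGER PRIMARY KEY)")
    print(is_populated(conn, "bag_num"))  # empty table
    conn.execute("INSERT INTO bag_num VALUES (1)")
    print(is_populated(conn, "bag_num"))  # one row present
```

If the check returns True for a collection you are about to import, importing into a fresh database file avoids the conflict.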
License
This project is licensed under the MIT License. See the LICENSE file for details.
File details
Details for the file bag_nl-0.3.0.tar.gz.
File metadata
- Download URL: bag_nl-0.3.0.tar.gz
- Upload date:
- Size: 9.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.17
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 011c085d565f53af92eba28bcab78b7b0d08378129afc85178b4d7b44a9f9638 |
| MD5 | 0f154d02ad296649b041caafee3237f9 |
| BLAKE2b-256 | 8b2a08b4c0f149a3fc8f46490d221e6536ecb2670c4528a5eac00d0086787409 |
File details
Details for the file bag_nl-0.3.0-py3-none-any.whl.
File metadata
- Download URL: bag_nl-0.3.0-py3-none-any.whl
- Upload date:
- Size: 10.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.17
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | afce7cc9d8c2973eec14a925d28ec39d97e8e9bc9fc1a9bbec6ee36949e37c7d |
| MD5 | b39167252102b23e247b4f5c2c40d83b |
| BLAKE2b-256 | 857fe8f4c65e6ac30fcb95eaadd707faa510834fe9f07f66b69172a882b2c12f |