OpenStreetMap Stats Generator: Commandline

osmsg

OpenStreetMap Stats Generator. A tiny CLI (and Python library) that turns OSM history into per-user counts of nodes, ways, and relations created, modified, or deleted, written to parquet, csv, json, markdown, or Postgres.

A Project of OSGeo Nepal.

What does it do?

  • Per-user create/modify/delete counts over any time window.
  • Tag and hashtag breakdowns (e.g. building, #hotosm).
  • Country and custom-boundary filters via Geofabrik.
  • Cron-friendly resume with --update.
  • One-command setup: osmsg --insert loads all history into your store, osmsg --update keeps it current.
  • Outputs you can query: parquet, csv, json, markdown, DuckDB, Postgres.
  • Cloud-native history: months covered by a published parquet dataset are read remotely.

Install

Pick the one that fits how you work.

uvx --from osmsg osmsg --last hour       # zero-install, one-shot run
pip install osmsg                        # into your project
uv tool install osmsg                    # standalone CLI
docker run --rm -v "$PWD:/work" -w /work ghcr.io/osgeonepal/osmsg:latest --last hour

uvx runs osmsg in a throwaway environment: no install, no virtualenv to manage. It works with any flag combination, e.g. uvx --from osmsg osmsg --last hour --tags building --summary -f parquet -f markdown.

More ways to install:

conda install -c conda-forge osmsg                 # conda / mamba
brew install osgeonepal/tap/osmsg          # macOS / Linux (Homebrew tap)

On Windows, download osmsg.exe from the latest release and run it directly, no Python required.

Quick start

osmsg --last hour                        # planet, last hour
osmsg --last day --tags building         # last day with a tag breakdown
osmsg --hashtags hotosm --last day       # only changesets tagged #hotosm

That's it. A stats.duckdb and a stats.parquet show up in your current folder.

Set up a full history store

Two commands give you a complete, self-updating store. The first loads all of OSM history from the published dataset and records where to resume; the second catches up to now and runs on a schedule.

osmsg --insert            # load all history into stats.duckdb, then exit
osmsg --update            # catch up to now (repeat on cron)

When catching up, osmsg works through a multi-week backlog using day diffs, then switches to finer diffs once the store is close to current. For near-real-time updates, run osmsg --update --url minute.

Pick your store with one flag. DuckDB is the default (stats.duckdb); add a DSN for Postgres:

osmsg --insert --psql-dsn "postgresql://user:pass@localhost/osmsg"
osmsg --update --psql-dsn "postgresql://user:pass@localhost/osmsg"
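
Passwords that contain characters such as @ or / need percent-encoding before they go into a postgresql:// URI. The following is a minimal sketch using only the Python standard library. build_dsn is a hypothetical helper, not part of osmsg, and the user, host, and database names are placeholders:

```python
from urllib.parse import quote


def build_dsn(user: str, password: str, host: str, dbname: str, port: int = 5432) -> str:
    """Return a postgresql:// URI with user and password percent-encoded.

    Hypothetical helper for illustration; osmsg only needs the final string.
    """
    # safe="" makes quote() escape "/" and "@" as well, which would
    # otherwise be read as URI delimiters.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{dbname}"
    )


dsn = build_dsn("user", "p@ss/word", "localhost", "osmsg")
print(dsn)  # postgresql://user:p%40ss%2Fword@localhost:5432/osmsg
```

The resulting string can be passed to --psql-dsn or exported as OSMSG_PSQL_DSN.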

Load only a slice with --start/--end; --update then continues from the end of that slice:

osmsg --insert --start 2020-01-01 --end 2023-01-01

Already have the planet files? Insert from them directly:

osmsg --insert --osh-file history-latest.osh.pbf --changeset-file changesets-latest.osm.bz2

Tutorials

1. Stats for a country

osmsg --country nepal --last day

--country resolves through Geofabrik and needs an OSM account. Set OSM_USERNAME and OSM_PASSWORD in your shell or a .env file:

export OSM_USERNAME=you
export OSM_PASSWORD=secret
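
When wrapping osmsg in a script, it can help to fail early if the credentials are missing. A small sketch, assuming the variables are already present in the process environment (it does not read a .env file itself); missing_credentials is a hypothetical helper, not an osmsg function:

```python
import os
from collections.abc import Mapping

REQUIRED = ("OSM_USERNAME", "OSM_PASSWORD")


def missing_credentials(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
print(missing_credentials({"OSM_USERNAME": "you"}))  # ['OSM_PASSWORD']
```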

2. A custom date range with summaries

osmsg --start "2026-04-01" --end "2026-04-08" \
      --tags building --tags highway --summary

--summary adds a daily rollup file alongside the per-changeset stats.

3. Run on a schedule

osmsg --country nepal --update           # picks up where the last run stopped

Drop that into cron or a GitHub Actions schedule. State is stored inside the DuckDB file, so reruns are safe.

4. Query the output

duckdb stats.duckdb -c "SELECT username, SUM(nodes_created) AS n
                        FROM users JOIN changeset_stats USING (uid)
                        GROUP BY username ORDER BY n DESC LIMIT 10"

Same schema in DuckDB and Postgres: users, changesets, changeset_stats, state.
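
The same query can be run from Python. To keep this sketch runnable without any osmsg output, it uses the standard-library sqlite3 module as a stand-in and a toy dataset containing only the tables and columns the query above touches; the rows are invented for illustration. top_mappers is a hypothetical helper name:

```python
import sqlite3

TOP_MAPPERS_SQL = """
    SELECT username, SUM(nodes_created) AS n
    FROM users JOIN changeset_stats USING (uid)
    GROUP BY username ORDER BY n DESC LIMIT ?
"""


def top_mappers(con, limit: int = 10) -> list[tuple]:
    """Run the query on any connection exposing execute(...).fetchall()."""
    return con.execute(TOP_MAPPERS_SQL, (limit,)).fetchall()


# Toy stand-in data: two users, three changesets, values invented.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users (uid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE changeset_stats (uid INTEGER, nodes_created INTEGER);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO changeset_stats VALUES (1, 5), (1, 7), (2, 3);
""")
rows = top_mappers(con)
print(rows)  # [('alice', 12), ('bob', 3)]
```

Against real output, the function should accept a connection from the duckdb Python package instead, e.g. duckdb.connect("stats.duckdb", read_only=True), assuming that package is installed.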

5. Run the API

Push stats into Postgres, then start the Litestar API:

osmsg --last day --format psql --psql-dsn "postgresql://user:pass@localhost/osmsg"
litestar --app api.app:app run --host 0.0.0.0 --port 8000

Once the server is up, these endpoints are available:

GET /health
GET /api/v1/user-stats?start=2026-05-01T00:00:00Z&end=2026-05-02T00:00:00Z
GET /docs

For self-hosting with Docker Compose and systemd, see docs/infra.md.
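
A client for the user-stats endpoint needs little more than a correctly formatted URL. The sketch below builds it with the standard library; the base URL follows the --port 8000 example above, and treating the response body as JSON is an assumption. user_stats_url and fetch_user_stats are hypothetical names:

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlencode
from urllib.request import urlopen


def user_stats_url(base: str, start: datetime, end: datetime) -> str:
    """Build the /api/v1/user-stats URL with UTC timestamps ending in Z."""
    def fmt(ts: datetime) -> str:
        # Pass timezone-aware datetimes; naive ones are treated as local time.
        return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    # safe=":" keeps the colons in the timestamps unescaped.
    query = urlencode({"start": fmt(start), "end": fmt(end)}, safe=":")
    return f"{base.rstrip('/')}/api/v1/user-stats?{query}"


def fetch_user_stats(base: str, start: datetime, end: datetime):
    """GET the endpoint and decode the body, assuming it returns JSON."""
    with urlopen(user_stats_url(base, start, end)) as resp:
        return json.load(resp)


url = user_stats_url(
    "http://localhost:8000",
    datetime(2026, 5, 1, tzinfo=timezone.utc),
    datetime(2026, 5, 2, tzinfo=timezone.utc),
)
print(url)
```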

6. Use it as a library

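from datetime import datetime, UTC  # datetime.UTC needs Python 3.11+
from osmsg import RunConfig, run

# One day of edits in Nepal; both bounds are timezone-aware UTC datetimes.
result = run(RunConfig(
    name="nepal",
    countries=["nepal"],
    start_date=datetime(2026, 4, 25, tzinfo=UTC),
    end_date=datetime(2026, 4, 26, tzinfo=UTC),
))
# Look up the parquet entry among the files this run reported.
print(result["files"]["parquet"])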

Same pipeline as the CLI.
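
For repeated runs, the window arithmetic can be factored out. This sketch mirrors the example above (UTC midnight to the next UTC midnight) and uses only the RunConfig fields shown there; day_window and nepal_stats_for are hypothetical helpers, and the osmsg import is deferred so the date helper works on its own:

```python
from datetime import date, datetime, timedelta, timezone


def day_window(day: date) -> tuple[datetime, datetime]:
    """Return UTC midnight of `day` and UTC midnight of the following day."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    return start, start + timedelta(days=1)


def nepal_stats_for(day: date):
    """Run osmsg for one day over Nepal (needs osmsg and OSM credentials)."""
    from osmsg import RunConfig, run  # deferred: only needed when called

    start, end = day_window(day)
    return run(RunConfig(name="nepal", countries=["nepal"],
                         start_date=start, end_date=end))


start, end = day_window(date(2026, 4, 25))
print(start.isoformat(), end.isoformat())
# 2026-04-25T00:00:00+00:00 2026-04-26T00:00:00+00:00
```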

7. Long flag lists? Use a config

osmsg --config nepal.yaml

Any flag works as a YAML key. See docs/Manual.md for the full list.

Output formats

Every run writes stats.duckdb (or <--name>.duckdb) plus the formats you ask for via -f parquet|csv|json|markdown|psql. Parquet is the default. Open it with duckdb, polars, pandas, anything.

Rerunning the same query with a different -f re-exports from the existing <name>.duckdb instead of refetching, so adding a format is instant. Pass --overwrite to force a fresh recompute.

Configuration

Every meaningful flag has a matching OSMSG_* env var, so the CLI, a .env file, and a docker-compose environment: block all reach the same setting. When both are set, the CLI flag wins over the env var.

| CLI flag | Env var | Default | Notes |
| --- | --- | --- | --- |
| --name | OSMSG_NAME | stats | Output basename; sets <name>.duckdb. |
| --country | OSMSG_COUNTRY | unset | Geofabrik region id(s). Comma-separated when set via env. |
| --boundary | OSMSG_BOUNDARY | unset | GeoJSON path or inline GeoJSON. |
| --url | OSMSG_URL | minute | minute/hour/day shortcut or full URL. Comma-separated when set via env. |
| --workers | OSMSG_WORKERS | cpu count | Parallel workers. |
| --cache-dir | OSMSG_CACHE_DIR | platform cache | Where downloaded OSM files are kept across runs. |
| --output-dir | OSMSG_OUTPUT_DIR | . | Where <name>.duckdb and exports are written. |
| --format / -f | OSMSG_FORMAT | parquet | Repeat for multiple. Comma-separated when set via env. |
| --overwrite | (none) | off | Recompute even if <name>.duckdb already holds this exact query. |
| --psql-dsn | OSMSG_PSQL_DSN | unset | libpq DSN for -f psql. |
| --psql-bulk | OSMSG_PSQL_BULK | off | Faster first full load to Postgres. |
| --history / --no-history | OSMSG_HISTORY | on | Read covered months from the published dataset. |
| --history-url | OSMSG_HISTORY_URL | osmsg-history | Published dataset location. |
| --insert | (none) | off | Load history into the store and seed resume, then exit. No window loads all of it. |
| --osh-file / --changeset-file | (none) | unset | Insert from local planet history + changeset files instead of the dataset. |
| --changeset-pad-hours | OSMSG_CHANGESET_PAD_HOURS | 1 | See below. |
| (auto-bootstrap on --update) | OSMSG_BOOTSTRAP | hour | hour, day, or week. Used when --update runs against an empty DB. |
| (auto-bootstrap on --update) | OSMSG_BOOTSTRAP_DAYS | unset | Integer N; overrides OSMSG_BOOTSTRAP. |
| OSM credentials (Geofabrik) | OSM_USERNAME, OSM_PASSWORD | unset | Required only when a Geofabrik URL is in use. |

A .env file in the working directory is loaded automatically.
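
The precedence described in this section (CLI flag, then env var, then default) can be pictured with a short sketch. This is an illustration of the rule, not osmsg's actual code; resolve and split_list are hypothetical helpers, and stripping whitespace around comma-separated items is an assumption:

```python
import os
from collections.abc import Mapping


def resolve(cli_value, env_name: str, default, env: Mapping[str, str] = os.environ):
    """Pick a setting: CLI value first, then the env var, then the default."""
    if cli_value is not None:
        return cli_value
    return env.get(env_name, default)


def split_list(raw: str) -> list[str]:
    """Split a comma-separated env value such as OSMSG_FORMAT into items."""
    return [item.strip() for item in raw.split(",") if item.strip()]


env = {"OSMSG_NAME": "nepal", "OSMSG_FORMAT": "parquet,csv"}
print(resolve(None, "OSMSG_NAME", "stats", env))      # nepal (from env)
print(resolve("custom", "OSMSG_NAME", "stats", env))  # custom (CLI wins)
print(resolve(None, "OSMSG_OUTPUT_DIR", ".", env))    # . (default)
print(split_list(env["OSMSG_FORMAT"]))                # ['parquet', 'csv']
```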

Maintainers

Generating and publishing the history dataset is handled by the osmsg maintain command group:

osmsg maintain month 2026-06 --repo osgeonepal/osmsg-history   # append one finished month
osmsg maintain month 2026-06 --no-upload                       # generate locally, review, upload later
osmsg maintain convert history.osh.pbf changesets.osm.bz2 2005-01-01 2026-06-01 work --parts 24
osmsg maintain publish work/out --repo osgeonepal/osmsg-history

See experiments/parquet-history for the full-history batch.

Contributing

Pull requests are welcome. Quick path:

git clone https://github.com/osgeonepal/osmsg && cd osmsg
git switch develop
uv sync
uv run pre-commit install
uv run pytest -m "not network"

Please read CONTRIBUTING.md and the Code of Conduct before opening a PR. Use Conventional Commits (cz commit).

License

MIT © OSGeo Nepal contributors.
