Agent-first Python CLI for querying Overture Maps (overturemaps.org) data, built so LLM agents (and humans) can answer geospatial questions without GIS expertise. Forked from overturemaps-py.
botmap
A fork of overturemaps-py, rebuilt as a demonstration of agent-first CLI design.
botmap exists to demonstrate two things:
- How to design a CLI for agent use. What changes when the primary user of a command-line tool is an LLM agent rather than a person who has read the docs — and how those same changes make the tool friendlier for humans too.
- How LLMs can deliver GIS data to non-experts. Open geospatial data is abundant but gated behind domain expertise: feature types, schemas, bounding boxes, category taxonomies. Pair a well-designed CLI with an agent's natural language abilities and "how many coffee shops are in Brooklyn?" becomes a question anyone can ask, no GIS background required.
The data is Overture Maps: free and open geospatial
map data from many sources, normalized to a
common schema (see
https://docs.overturemaps.org). The upstream overturemaps CLI remains the
official Overture tool; botmap is the design experiment built on top of it.
The redesign in a nutshell
The upstream CLI is data-shaped: a single download command whose flags mirror
the storage layout.
overturemaps download --bbox=-71.068,42.353,-71.058,42.363 -t place -f geojson -o pois.geojson
Using it means already knowing which type holds your answer, having a WGS84 bounding box on hand, and reading the schema docs in another tab. Agents, and geospatial novices, can't easily ask data-shaped questions. They ask question-shaped ones: "find hospitals in Manhattan", "what's at this lat/lon?", "coffee shops within 1 km of here". So our redesign rule is:
Expose verbs that read like the question, and resolve the data-shaped parts internally.
| Design move | What it looks like |
|---|---|
| Verbs over types | botmap places --category hospital, not download -t place --where … |
| Lookups, not coordinates | --in "Brooklyn" resolves names to geometry; no hand-built bboxes |
| Discovery is a command, not a manual | themes, types, schema -t place, categories, capabilities |
| Preview before paying | count and sample size up a query before any download |
| --json everywhere | every metadata command is a clean pipe source; stdout for data, stderr for humans |
| Errors teach | zero rows for --category ferry_terminal suggests ferry_boat_company |
| Ship the context | install-skill registers a Skill so agents discover the CLI on their own |
| Measure, don't assume | an eval suite scores whether agents can actually use it |
download still exists, but it's the escape hatch now, not the front door.
Quick Start
Question-shaped, end to end:
# "Where is Brooklyn, and how many coffee shops does it have?"
botmap where "Brooklyn"
botmap count -t place --in "Brooklyn" --where categories.primary=coffee_shop
# "Get them as GeoJSON"
botmap places --in "Brooklyn" --category coffee_shop -f geojsonseq -o brooklyn_coffee.jsonl
This is still more confusing than it should be, especially for human users. Bot users, however, can lean on the installed Skill, the error hints, and the schema built into the CLI.
The data-shaped form still works when you have exact coordinates:
botmap download --bbox=-71.068,42.353,-71.058,42.363 -f geojson --type=building -o boston.geojson
Quick Start for Coding Agents
Install the Skill so an agent can discover this CLI automatically:
pip install botmap
botmap install-skill
Self-introspect:
botmap --json capabilities # list every subcommand + parameters
botmap --json themes # list themes
botmap --json types # list types
botmap --json schema -t place # fields + a sample feature
Resolve a place, count, then download:
botmap --json where "Boston, MA"
botmap --json count -t place --in "Boston, MA" --where categories.primary=restaurant
botmap places --in "Boston, MA" --category restaurant -f geojsonseq -o out.jsonl
Examples
Finding POIs
# All hospitals in Brooklyn
botmap places --in "Brooklyn" --category hospital -f geojsonseq -o hospitals.jsonl
# Coffee shops in Brooklyn, with high source confidence
botmap places --in "Brooklyn" --category coffee_shop --where 'confidence>0.8' \
-f geojsonseq -o brooklyn_coffee.jsonl
# Hotels in Berlin (using a country code qualifier)
botmap places --in "Berlin, DE" --category hotel -f geojsonseq -o berlin_hotels.jsonl
# Pharmacies near the Empire State Building (~250m)
botmap at 40.7484,-73.9857 -t place --category pharmacy --radius 250 -n 20
Discovering before downloading
# What categories exist in Brooklyn? (cheap; reads only places in the bbox)
botmap categories -t place --in "Brooklyn" --top 30
# How many buildings in Manhattan are at least 100m tall? Decide before downloading.
botmap count -t building --in "Manhattan" --where 'height>=100'
# Peek at five matching features before committing to the full pull
botmap sample -t building --in "Manhattan" --where 'height>=100' -n 5
# Every convenience verb takes -n/--limit to cap output (default: all matches);
# sample and at take it too but default to 10. The low-level `download` command does not.
botmap places --in "Brooklyn" --category coffee_shop -n 20
Buildings with attributes
# Tall buildings in Manhattan, as GeoParquet for analytics
botmap buildings --in "Manhattan" --where 'height>150' -f geoparquet -o tall.parquet
# Skyscrapers (≥40 floors) in Chicago
botmap buildings --in "Chicago, IL" --where 'num_floors>=40' -f geojsonseq -o skyscrapers.jsonl
# Buildings of a specific subtype
botmap buildings --in "Boston, MA" --where subtype=education -f geojsonseq -o schools.jsonl
Roads and transportation
# Highways in Texas
botmap roads --in "Texas, USA" --class motorway -f geojsonseq -o tx_highways.jsonl
# Main roads (primary or secondary) in Berlin
botmap roads --in "Berlin, DE" --where "class in [primary,secondary]" \
-f geojsonseq -o berlin_main.jsonl
# Footways and cycleways in central Amsterdam
botmap roads --in "Amsterdam, NL" --where "class in [footway,cycleway]" \
-f geojsonseq -o amsterdam_paths.jsonl
# `roads` covers every transportation segment — use --class for bike paths too
botmap roads --in "Alameda County, CA" --class cycleway \
-f geojsonseq -o bikepaths.jsonl
Water and land use
# Lakes near Minneapolis
botmap water --in "Minneapolis, MN" --class lake -f geojsonseq -o lakes.jsonl
# Residential land-use polygons in Brooklyn
botmap landuse --in "Brooklyn, NY" --class residential \
-f geojsonseq -o residential.jsonl
Both water and landuse mirror roads: pass --class (e.g. ocean,
lake, river for water; commercial, residential, recreation,
agriculture for land use) or any --where filter.
Boundary polygons
# Get a division's polygon as a GeoJSON Feature (for clipping / spatial joins)
botmap boundary "Alameda County, CA" > county.geojson
# Longer form: where --geometry does the same thing
botmap where "Alameda County, CA" --geometry > county.geojson
boundary is the dedicated verb for fetching a division's polygon. It accepts
any place name that where resolves, including neighborhood+city forms like
"Brooklyn, NY". where --geometry (alias --geojson) is an equivalent
long form. Using download -t division_area now errors and redirects to boundary.
Address lookups
# Find a specific address (case-insensitive substring on street;
# --number / --postcode are exact). --in or --bbox is required.
botmap addresses --in "Alameda, US-CA" \
--street Fountain --number 1208
# All "Main St" addresses in a city
botmap addresses --in "Brookline, MA" --street "Main St"
# All addresses inside a small bbox over Beacon Hill
botmap addresses --bbox=-71.075,42.355,-71.060,42.365 \
-f geojsonseq -o beacon_hill_addresses.jsonl
# Address density in a neighborhood
botmap count -t address --in "Brookline, MA"
The addresses command requires --in or --bbox so queries stay
bounded — the global address dataset is too large to scan unfiltered.
--street is a case-insensitive substring match (so Fountain will
match Fountain St, Fountain Avenue, and E Fountain Blvd).
Overture's address coverage is uneven; if a known address returns no
rows, the data simply isn't there for that area yet.
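The --street matching rule can be illustrated in a few lines of Python. This is only a sketch of the stated behavior (case-insensitive substring); the function name `street_matches` and the sample street names are hypothetical and not part of botmap.

```python
def street_matches(query: str, street: str) -> bool:
    """Case-insensitive substring test, mirroring the documented --street rule."""
    return query.casefold() in street.casefold()


# Illustrative street names only; "Mountain Rd" shows a near miss.
streets = ["Fountain St", "Fountain Avenue", "E Fountain Blvd", "Mountain Rd"]
matches = [s for s in streets if street_matches("Fountain", s)]
print(matches)
```

Because the match is a substring test, a short query such as "Main" would also hit "Maine Ave"; adding --number or --postcode narrows the result.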
Point queries
# What's at a given lat/lon (defaults to nearest POIs)
botmap at 51.5074,-0.1278 -n 5
# Which admin divisions contain this point? (innermost-first)
botmap containing 35.6762,139.6503
Composing commands
--json makes any metadata command pipeable. Use this for ad-hoc workflows or
when scripting against the CLI.
# Resolve a bbox, then download with it
BBOX=$(botmap --json where "Berlin, DE" | jq -r '.bbox | join(",")')
botmap download -t place --bbox "$BBOX" \
--where categories.primary=hotel \
-f geojsonseq -o berlin_hotels.jsonl
# Top-3 categories in a place, then dump features for each
for cat in $(botmap --json categories -t place --in "Brooklyn" --top 3 | jq -r '.[].value'); do
botmap places --in "Brooklyn" --category "$cat" \
-f geojsonseq -o "brooklyn_${cat}.jsonl"
done
# Count all road segments in a country
COUNT=$(botmap --json count -t segment --in "Iceland" | jq '.count')
echo "Iceland has $COUNT road segments"
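The same composition works from Python when jq is not available. The sketch below assumes only what the jq example implies, namely that `botmap --json where` emits an object with a `bbox` array; the helper name `bbox_arg` and the sample payload values are hypothetical.

```python
import json


def bbox_arg(where_json: str) -> str:
    """Turn the `bbox` array from `botmap --json where` output into a --bbox value."""
    bbox = json.loads(where_json)["bbox"]
    if len(bbox) != 4:
        raise ValueError(f"expected 4 bbox values, got {len(bbox)}")
    return ",".join(str(v) for v in bbox)


# Hypothetical payload: only the `bbox` key is taken from the jq example;
# the coordinate values are illustrative, not real botmap output.
sample = '{"name": "Berlin", "bbox": [13.088, 52.338, 13.761, 52.675]}'
print(bbox_arg(sample))
```

The resulting string can be passed as the value of --bbox, in the same xmin,ymin,xmax,ymax order the CLI expects.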
Multi-step agent workflow
A typical sequence an agent runs when given a layperson question like "how many coffee shops are in Brooklyn?":
# 1. Confirm the place resolves
botmap --json where "Brooklyn"
# > {"name": "Brooklyn", "subtype": "locality", "region": "US-NY", "population": 2736074, ...}
# 2. Discover the right category name
botmap --json categories -t place --in "Brooklyn" --top 50 | jq -r '.[].value' | grep -i coffee
# > coffee_shop
# 3. Count
botmap --json count -t place --in "Brooklyn" --where categories.primary=coffee_shop
# > {"count": 412, ...}
# 4. Download if needed
botmap places --in "Brooklyn" --category coffee_shop \
-f geojsonseq -o brooklyn_coffee.jsonl
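Steps 1 to 3 can be scripted from Python as well. The following is a minimal sketch, assuming the JSON shapes implied by the examples in this README (`categories` returns a list of objects with a `value` key, `count` returns an object with a `count` key). The function names and the injectable `run` parameter are hypothetical; the latter exists only so the logic can be exercised without network access.

```python
import json
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], str]


def run_botmap(args: List[str]) -> str:
    """Run botmap and return stdout. stderr is not captured, so warnings stay visible."""
    result = subprocess.run(
        ["botmap", *args], stdout=subprocess.PIPE, text=True, check=True
    )
    return result.stdout


def count_category(place: str, keyword: str, run: Runner = run_botmap) -> int:
    """Resolve a place, find a category containing `keyword`, and count it."""
    # 1. Confirm the place resolves (the default runner raises if botmap exits non-zero).
    run(["--json", "where", place])
    # 2. Discover the category name from what is actually present.
    rows = json.loads(
        run(["--json", "categories", "-t", "place", "--in", place, "--top", "50"])
    )
    names = [r["value"] for r in rows if keyword.lower() in r["value"].lower()]
    if not names:
        raise LookupError(f"no category containing {keyword!r} in {place}")
    # 3. Count using the first matching category.
    out = run(["--json", "count", "-t", "place", "--in", place,
               "--where", f"categories.primary={names[0]}"])
    return json.loads(out)["count"]
```

With botmap installed, `count_category("Brooklyn", "coffee")` would run the three commands in order; step 4 (the download) is left out because it writes a file rather than returning a value.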
Usage
where TEXT
Resolve a place name to a division feature. Returns the matched division's id,
subtype, country/region, bbox, population, and parent. --json emits a
candidates array so an ambiguous query can be re-narrowed.
Qualifier syntax: "Place, ST", "Place, US-ST", "Place, CC",
"Place, CCC", or "Place, Country Name" — e.g. all of these resolve to
Boston, US-MA: "Boston, MA", "Boston, US-MA", "Boston, US",
"Boston, USA", "Boston, United States".
botmap where "Boston, MA"
botmap where "Alameda, CA" --all # list every candidate
botmap --json where "Walnut Creek, CA, USA" | jq '.bbox'
botmap --json where "Cambridge" | jq '.candidates | length' # how many Cambridges?
Best match is picked by:
- presence of population data (real places people search for outrank thinly-documented administrative areas),
- higher population,
- innermost admin_level as a final tiebreaker.
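The three criteria amount to a lexicographic sort. Here is a rough sketch of that ordering, not botmap's actual resolver: the `Candidate` class is a hypothetical record shape, and treating a larger `admin_level` as "more inner" is an assumption the text does not confirm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:  # hypothetical shape, not botmap's division record
    name: str
    population: Optional[int]
    admin_level: int


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates best-first using the three listed criteria."""
    return sorted(
        candidates,
        key=lambda c: (
            c.population is None,  # candidates with population data sort first
            -(c.population or 0),  # then higher population
            -c.admin_level,        # ASSUMPTION: larger admin_level means more inner
        ),
    )


# Made-up candidates purely to show the ordering.
pool = [
    Candidate("no-population", None, 8),
    Candidate("small", 10_000, 8),
    Candidate("large-outer", 500_000, 4),
    Candidate("large-inner", 500_000, 8),
]
print([c.name for c in rank(pool)])
```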
When more than one candidate matches, every data command (places,
buildings, roads, addresses, count, sample, …) prints a one-line
stderr warning naming the picked division and the top alternative, pointing
at where --all for full inspection. Do not silence stderr — that warning
is the only signal that the resolver made a judgment call.
where (and all data commands) support neighborhood+city names like
"Brooklyn, NY": when the exact string isn't in the divisions index, the
resolver retries scoped to the parent locality's region, or falls back to
the parent's bbox with a yellow stderr note.
boundary TEXT
Emit a division's polygon as a GeoJSON Feature on stdout, for clipping or
spatial joins. Accepts the same place names as where.
botmap boundary "Alameda County, CA" > county.geojson
botmap boundary "Brooklyn, NY" | jq '.properties'
download -t division_area is no longer supported — boundary is the
replacement.
count
Row count for a query without downloading. The cheap preview that should
precede any download.
botmap count -t place --in "Boston, MA"
botmap --json count -t place --in "Boston, MA" --where categories.primary=restaurant
sample
Emit the first N features matching a query. Defaults to geojsonseq and N=10.
botmap sample -t building --in "Brooklyn" --where 'height>100' -n 5
botmap sample -t place --in "Brooklyn" --where categories.primary=coffee_shop -n 3
themes, types, schema
Introspect what's queryable.
botmap themes # 6 themes with one-line descriptions
botmap types --theme buildings # 2 types in this theme
botmap --json schema -t place # full field list + a sample feature
categories -t place
Enumerate categories.primary values (with counts) for a place-scoped region.
botmap categories -t place --in "Brooklyn" --top 20
botmap --json categories -t place --in "Manhattan" --top 50 | jq -r '.[] | "\(.count)\t\(.value)"'
capabilities
Emit a machine-readable manifest of all subcommands with their parameters. Agents read this once to learn the CLI surface.
botmap --json capabilities | jq '.commands[].name'
places, buildings, roads, addresses, water, landuse
Intent verbs that wrap download with a familiar shape. Each accepts either
--in "Place Name" (resolved via the divisions index) or --bbox xmin,ymin,xmax,ymax.
--category / --class / --street desugar to common --where filters,
and --where is still available for advanced predicates. water and landuse
take --class just like roads. Running download -t TYPE for a type covered
by one of these verbs prints a one-line stderr tip pointing at the verb. All
data verbs accept a trailing --json flag silently (they already emit GeoJSON).
Transit stops (bus_stop, bus_station, train_station) are place features —
download -t infrastructure --where class=bus_stop will error and redirect to
places --category bus_stop.
# POIs by category (named place)
botmap places --in "Brooklyn" --category hospital -f geojsonseq -o hospitals.jsonl
# POIs by category (manual bbox — skip the named-place lookup)
botmap places --bbox=-122.295,37.778,-122.265,37.800 --category coffee_shop
# Buildings filtered by attribute
botmap buildings --in "Manhattan" --where 'height>150' -f geojsonseq -o tall.jsonl
botmap buildings --in "Boston, MA" --where 'num_floors>=10' --where 'height>30' -f geoparquet -o tall.parquet
# Roads by class
botmap roads --in "Texas, US" --class motorway -f geojsonseq -o tx_highways.jsonl
botmap roads --in "Berlin, DE" --where "class in [primary,secondary]" -f geojsonseq -o berlin_main.jsonl
# Addresses by street (case-insensitive substring on --street; --number / --postcode are exact)
botmap addresses --in "Alameda, US-CA" --street Fountain --number 1234
botmap addresses --in "Brookline, MA" --street "Main St"
# Water and land use by class
botmap water --in "Minneapolis, MN" --class lake -f geojsonseq -o lakes.jsonl
botmap landuse --in "Brooklyn, NY" --class residential -f geojsonseq -o zoning.jsonl
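To make the desugaring concrete, here is a small sketch of how the shorthand flags could map onto --where predicates. The `categories.primary=X` form for --category is stated in this README; mapping --class to `class=X` is inferred from the `--where class=bus_stop` form and is an assumption. --street is left out because its predicate syntax is not documented here. The function name `desugar` is hypothetical.

```python
from typing import List, Optional


def desugar(
    category: Optional[str] = None,
    class_: Optional[str] = None,
    where: Optional[List[str]] = None,
) -> List[str]:
    """Collect shorthand flags and explicit predicates into one --where list."""
    predicates = list(where or [])  # explicit --where predicates come first
    if category is not None:
        predicates.append(f"categories.primary={category}")
    if class_ is not None:
        predicates.append(f"class={class_}")  # assumed mapping, see lead-in
    return predicates


print(desugar(category="hospital"))
print(desugar(class_="motorway", where=["height>30"]))
```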
places includes a zero-result hint: when --category X (or
--where categories.primary=X) returns 0 rows AND that value isn't
present in the bbox, the CLI scans the bbox once for the live category
list and emits a stderr suggestion of up to 3 near-matches drawn from
what's actually there. So --category ferry_terminal in a bbox where
only ferry_boat_company exists yields:
[botmap] 0 rows. No place has categories.primary='ferry_terminal' in
this bbox. Did you mean: ferry_boat_company? Run `botmap categories
-t place --bbox …` to see the full list.
This means agents typically don't need to round-trip through categories
themselves; the hint surfaces the right value automatically.
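The README does not say how near-matches are chosen, so the following sketch swaps in a simple stand-in technique: suggest live categories that share an underscore-separated word with the requested one, preferring more shared words and then higher counts. The function name `suggest` and the sample counts are hypothetical.

```python
from typing import Dict, List


def suggest(requested: str, live_counts: Dict[str, int], limit: int = 3) -> List[str]:
    """Return up to `limit` live categories sharing a word with `requested`."""
    if requested in live_counts:
        return []  # the value exists in this bbox, so no hint is needed
    wanted = set(requested.split("_"))
    scored = []
    for name, count in live_counts.items():
        shared = len(wanted & set(name.split("_")))
        if shared:
            # Negate so that sorting puts most-shared, most-common first.
            scored.append((-shared, -count, name))
    return [name for _, _, name in sorted(scored)[:limit]]


# Illustrative counts only, not real Overture data.
live = {"ferry_boat_company": 4, "coffee_shop": 412, "boat_rental": 9}
print(suggest("ferry_terminal", live))
```

Drawing suggestions from the categories present in the bbox, rather than from the global taxonomy, is what keeps the hint actionable: every suggested value is guaranteed to return rows.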
at LAT,LON
Nearest-neighbor lookup at a point. Defaults to -t place and -n 10. The
--radius (meters) controls how far out to search; per-type defaults are
100 m for place, 50 m for building, 25 m for address. --where
filters apply just like the intent verbs, so this is the right command for
"X near a point."
botmap at 40.7484,-73.9857 # POIs near the Empire State Building
botmap at 37.8270,-122.4230 -t place \
--radius 1500 --where "categories.primary=restaurant" -n 5
botmap at 51.5074,-0.1278 -t building -n 3
Use at … --where … instead of constructing a manual bbox + download.
It's the dedicated proximity primitive and returns features sorted by
distance.
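For intuition about what a radius-limited, distance-sorted lookup involves, here is a self-contained sketch using the haversine formula. It is an illustration, not botmap's implementation (the README does not state how distances are computed). The per-type default radii are the ones quoted above; the `Point` tuple shape and the function names are hypothetical.

```python
import math
from typing import List, Optional, Tuple

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters
# Per-type default radii quoted in the text above.
DEFAULT_RADIUS_M = {"place": 100, "building": 50, "address": 25}

Point = Tuple[str, float, float]  # (label, lat, lon); hypothetical record shape


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest(
    lat: float,
    lon: float,
    points: List[Point],
    feature_type: str = "place",
    radius_m: Optional[float] = None,
    limit: int = 10,
) -> List[str]:
    """Labels of points within the radius, closest first, capped at `limit`."""
    radius = DEFAULT_RADIUS_M[feature_type] if radius_m is None else radius_m
    scored = sorted((haversine_m(lat, lon, p_lat, p_lon), label)
                    for label, p_lat, p_lon in points)
    return [label for dist, label in scored if dist <= radius][:limit]
```

Calling `nearest(40.7484, -73.9857, points, radius_m=250)` mirrors the shape of `botmap at 40.7484,-73.9857 --radius 250`, with `points` standing in for the features botmap would read.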
containing LAT,LON
Which admin divisions contain this point, innermost-first.
botmap containing 42.3601,-71.0589
botmap --json containing 35.6762,139.6503 | jq -r '.[] | "\(.subtype)\t\(.name)"'
install-skill
Install the agent-discoverable Skill for Claude Code and/or write an
AGENTS.md section so coding agents will reach for this CLI when a user's
question implies geospatial data.
botmap install-skill # interactive
botmap install-skill --target claude-user --yes # scripted
botmap install-skill --target agents-md --yes # writes ./AGENTS.md
cache info|clear|build
The first --in or containing call builds an on-disk divisions index under
$XDG_CACHE_HOME/botmap/ (default ~/.cache/botmap/). The index
is keyed by Overture release and rebuilds automatically when the latest
release changes; these commands let you inspect or force the lifecycle.
botmap cache info # path, current release, up-to-date status
botmap cache build # force a rebuild against the latest release
botmap cache clear # remove all cached index files
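A script that needs to find the index directory can apply the same XDG rule. This sketch covers only the path resolution described above; the function name `cache_dir` is hypothetical, and the layout of files inside the directory is not assumed.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def cache_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve $XDG_CACHE_HOME/botmap, falling back to ~/.cache/botmap."""
    env = os.environ if env is None else env
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / "botmap"


print(cache_dir())
```

`botmap cache info` remains the authoritative source for the path; the sketch is useful mainly for cleanup scripts or CI cache keys.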
gers [UUID]
Look up an ID in the GERS Registry. If the feature is present in the latest release, it will download the feature and write it out in the specified format.
Command-line options:
- -f ("geojson", "geojsonseq", "geoparquet"): output format, defaults to geojsonseq for a single feature on one line.
- --output/-o (optional): Location of output file. When omitted, output will be written to stdout.
- --connect_timeout (optional): Socket connection timeout, in seconds. If omitted, the AWS SDK default value is used (typically 1 second).
- --request_timeout (optional): Socket read timeout on Windows and macOS, in seconds. If omitted, the AWS SDK default value is used (typically 3 seconds). This option is ignored on non-Windows, non-macOS systems.
Python API
botmap is also a Python library. Import directly from botmap to query Overture data
without using the CLI.
Place-name geocoding
resolve(name) returns all matching divisions; best_match(name) returns the top
pick. Both read a small on-disk index that builds lazily on first call.
from botmap import best_match, resolve
pick = best_match("Boston, MA")
print(pick.name, pick.region, pick.bbox)
# Boston US-MA (-71.19, 42.23, -70.80, 42.40)
# Disambiguate manually
all_bostons = resolve("Boston")
for d in all_bostons:
print(d.name, d.region, d.population)
Counting before downloading
count_rows returns the row count for a query without streaming data.
from botmap import best_match, count_rows
division = best_match("Brooklyn")
n = count_rows("place", bbox=division.bbox, stac=True)
print(f"Brooklyn has {n:,} places")
Installation
Install botmap from PyPI using pip:
pip install botmap
If you have uv installed, you can run botmap with uvx without installing it:
uvx botmap count -t building --in "Boston, MA"
(The upstream overturemaps package is additionally available via Homebrew and
conda-forge; botmap is currently PyPI-only.)
Agent-Usability Evals
The eval suite measures whether an AI agent can answer real geospatial questions
using the CLI's high-level verbs — without falling back to the low-level download
command and without triggering CLI errors. The goal is to drive download usage
toward zero for any question a convenience verb already covers.
Running the evals
Requires the claude CLI on PATH and network access to Overture S3. The first run
warms the divisions index cache (one-time, ~30 seconds).
# Full batch: 10 questions × 2 repeats
uv run python -m evals.runner --model sonnet
uv run python -m evals.score
uv run python -m evals.synthesize --model opus
# Single-question smoke test (cheap sanity check)
uv run python -m evals.runner --smoke --model sonnet
uv run python -m evals.score
Each run produces three artifacts; the last two rows below are aggregate outputs for the whole batch:
| Artifact | What it contains |
|---|---|
| evals/runs/<id>__r<n>/transcript.jsonl | Full Claude Code session transcript |
| evals/runs/<id>__r<n>/shim.log | Every botmap call with exit codes |
| evals/runs/<id>__r<n>/record.json | Scored metrics for that run |
| evals/report.md | Ranked failure clusters + per-question rates |
| evals/proposals.json | Concrete CLI/skill/docs improvement proposals |
Question bank
Questions live in evals/questions.yaml and are organized into five tiers of
increasing complexity:
| Tier | What it tests |
|---|---|
| 1 | Single-verb lookups (where, count) |
| 2 | Filtered downloads with attribute predicates |
| 3 | Point-query primitives (at, containing) |
| 4 | Types with no convenience verb — download is the right answer |
| 5 | Multi-layer spatial joins requiring two verbs plus in-process computation |
Each question carries a download_is_legitimate flag. When false, any
download call is scored as an agent failure. When true (tier 4 questions
with no convenience verb), a download is a coverage-gap candidate — a signal
to add a new verb rather than a failure to penalize the agent.
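The flag's effect on scoring can be summarized as a three-way classification. This is a minimal sketch of the rule as stated, not the scorer's actual code; the function name and the label strings are hypothetical.

```python
def classify_download(download_is_legitimate: bool, used_download: bool) -> str:
    """Label one run according to the download_is_legitimate rule."""
    if not used_download:
        return "ok"
    # A download on a question with no convenience verb signals a missing verb,
    # not an agent mistake.
    return "coverage_gap" if download_is_legitimate else "agent_failure"


print(classify_download(download_is_legitimate=False, used_download=True))
```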
Adding questions
Add an entry to evals/questions.yaml:
- id: my-new-question # stable slug, no '__'
question: "Natural-language prompt handed verbatim to the agent"
tier: 2
download_is_legitimate: false
target_type: place
place: "Brooklyn, US-NY" # optional; used by the cost guard to bound S3 reads
notes: "Ideal path: ..."
Reading the output
evals/report.md summarises every run after just eval completes. The key
columns in the per-question table:
- Download: fraction of runs where any download was issued
- Unnecessary DL: fraction where download was used when a verb existed
- Error: fraction where at least one CLI call exited non-zero
- Completed: fraction where the agent produced a final answer
evals/proposals.json contains LLM-generated, evidence-backed suggestions
(targeting cli, skill, docs, or hint) derived from the failure clusters.
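Each column is a simple fraction over the runs of one question. The sketch below shows that arithmetic under stated assumptions: the record keys (`used_download`, `download_is_legitimate`, `had_error`, `completed`) are hypothetical stand-ins, since the real record.json schema is not shown here.

```python
from typing import Dict, List


def rates(records: List[dict]) -> Dict[str, float]:
    """Per-question fractions for the four report columns."""
    n = len(records)
    if n == 0:
        raise ValueError("no runs to score")

    def frac(pred) -> float:
        return sum(1 for r in records if pred(r)) / n

    return {
        "download": frac(lambda r: r["used_download"]),
        "unnecessary_dl": frac(
            lambda r: r["used_download"] and not r["download_is_legitimate"]
        ),
        "error": frac(lambda r: r["had_error"]),
        "completed": frac(lambda r: r["completed"]),
    }


# Two made-up runs of one question where a convenience verb exists.
runs = [
    {"used_download": True, "download_is_legitimate": False, "had_error": False, "completed": True},
    {"used_download": False, "download_is_legitimate": False, "had_error": True, "completed": True},
]
print(rates(runs))
```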
How it works
The runner sets up an isolated working directory per run, installs the Overture
skill so the agent can discover the CLI, and puts a logging shim first on PATH.
The shim intercepts every botmap call, records the arguments and exit
code to shim.log, then forwards the call to the real binary. After all runs
complete, the scorer reads each shim.log and transcript to produce
record.json, and the synthesizer aggregates those records into the final report
and proposals.
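The shim idea is small enough to sketch directly. The version below is an illustration under assumptions, not the eval suite's actual shim: it logs one JSON object per line, and the `real_binary` parameter and the function name are hypothetical.

```python
import json
import subprocess
import sys
from pathlib import Path
from typing import List


def shim(real_binary: List[str], args: List[str], log_path: Path) -> int:
    """Forward args to the real command, then append args and exit code to the log."""
    # stdout/stderr are inherited, so the caller sees the real tool's output unchanged.
    result = subprocess.run([*real_binary, *args])
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps({"args": args, "exit_code": result.returncode}) + "\n")
    return result.returncode


if __name__ == "__main__":
    # Hypothetical wiring: the real binary's path would come from the runner.
    sys.exit(shim(["botmap"], sys.argv[1:], Path("shim.log")))
```

Returning the real exit code matters: the agent must see the same failures it would see without the shim, otherwise the Error column would measure the shim rather than the CLI.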
Development
uv sync
uv run pytest tests/