Nigerian-language geo-tagging and gazetteer resolution (Pidgin, Hausa, Igbo, Yoruba, English).
SafeSignal-Geo
An open-source geolocation library for informal Nigerian place names.
"Oshodi under-bridge" · "behind Shoprite Surulere" · "Mile 2 last bus stop" · "Berger junction"
The problem
Across Nigerian apps — preventive safety, ride-hailing, last-mile logistics, journalism, humanitarian response — the place names that matter most are not in any gazetteer.
Input: "trouble for Berger this morning, dem don block road"
Want: [{name: "Berger Junction", lat: 6.5841, lng: 3.3790, confidence: 0.91}]
Existing geoparsers (Mordecai 3, Edinburgh Geoparser, Nominatim) collapse on Nigerian vernacular geography. They rely on GeoNames, which is sparse below the LGA level and has zero entries for under-bridges, junctions, last bus stops, behind-landmark references, or named markets.
SafeSignal-Geo is the first open library to treat informal Nigerian geolocation as the primary problem, not an edge case.
What you get
- `safesignal-geo`: a Python package (`pip install safesignal-geo`). Ships today with a 42K-row Nigerian gazetteer, a BM25 resolver, and a context-aware reranker (Top-1 0.93, P95 65 ms on CPU).
- `safesignal-geo-base`: a ~270M-parameter AfroXLMR + LoRA span tagger, trained on the v0.2 corpus (partial F1 0.749 on v0.2 dev gold). Weights publish to Hugging Face in v0.2.0.
- `safesignal-geo-gazetteer`: the bundled Nigerian gazetteer, CC-BY 4.0. v0.2 is a 42,315-row national extract with an OSM-mined informal-place index (junctions, bus stops, motor parks, markets) for Lagos / FCT / Rivers / Kano.
- A public eval leaderboard: planned for v1.0. Submit your own model and see how it ranks.
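For readers new to BM25, the following is a minimal sketch of how a BM25 lookup over gazetteer names can work. It is not the package's resolver: the `TinyBM25` class, the whitespace tokenizer, the `k1` and `b` defaults, and the five toy rows are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace, treating hyphens as spaces (assumed tokenizer)."""
    return text.lower().replace("-", " ").split()


class TinyBM25:
    """Okapi BM25 over a small list of place names (illustrative only)."""

    def __init__(self, names, k1=1.5, b=0.75):
        self.names = list(names)
        self.docs = [tokenize(name) for name in self.names]
        self.k1, self.b = k1, b
        self.avg_len = sum(len(d) for d in self.docs) / len(self.docs)
        doc_freq = Counter(term for d in self.docs for term in set(d))
        n = len(self.docs)
        # Smoothed IDF that stays positive even for very common terms.
        self.idf = {t: math.log(1 + (n - df + 0.5) / (df + 0.5)) for t, df in doc_freq.items()}

    def score(self, query, doc_index):
        doc = self.docs[doc_index]
        term_freq = Counter(doc)
        length_norm = 1 - self.b + self.b * len(doc) / self.avg_len
        total = 0.0
        for term in set(tokenize(query)):
            tf = term_freq.get(term, 0)
            if tf:
                total += self.idf[term] * tf * (self.k1 + 1) / (tf + self.k1 * length_norm)
        return total

    def top_k(self, query, k=3):
        scored = [(self.score(query, i), name) for i, name in enumerate(self.names)]
        matches = [pair for pair in scored if pair[0] > 0]
        return sorted(matches, key=lambda pair: pair[0], reverse=True)[:k]


# Toy rows taken from the examples in this README, not the bundled gazetteer.
index = TinyBM25([
    "Berger Junction",
    "Oshodi Under Bridge",
    "Mile 2 Last Bus Stop",
    "Ojuelegba",
    "Computer Village",
])
best = index.top_k("trouble for Berger this morning, dem don block road")
print(best[0][1])
```

The sketch scores the whole sentence against each name, so only the token `berger` contributes to the match. A production resolver would also need punctuation handling and alias expansion, which this example leaves out.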
Quickstart
```shell
pip install safesignal-geo
```

```python
from safesignal_geo import Geo

geo = Geo()  # bundled v0.2 Nigerian gazetteer (~42k records)
for hit in geo.resolve("Traffic don jam for Ojuelegba this morning."):
    print(hit.canonical_name, hit.admin_state, hit.admin_lga, hit.lat, hit.lng)
# Ojuelegba Lagos Surulere 6.5093742 3.3665407
```
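Resolved hits often end up on a map. The sketch below converts hit-like objects into a GeoJSON `FeatureCollection`. It assumes only the five attributes printed in the quickstart; the `Hit` dataclass is a hypothetical stand-in so the example runs without the package installed, and `hits_to_geojson` is not part of the library.

```python
import json
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical stand-in exposing the attributes shown in the quickstart."""
    canonical_name: str
    admin_state: str
    admin_lga: str
    lat: float
    lng: float


def hits_to_geojson(hits):
    """Build a GeoJSON FeatureCollection from objects with the attributes above."""
    features = []
    for hit in hits:
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [hit.lng, hit.lat]},
            "properties": {
                "name": hit.canonical_name,
                "state": hit.admin_state,
                "lga": hit.admin_lga,
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Values copied from the quickstart output.
sample = [Hit("Ojuelegba", "Lagos", "Surulere", 6.5093742, 3.3665407)]
collection = hits_to_geojson(sample)
print(json.dumps(collection, indent=2))
```

Because the function is duck-typed, the same call should work on whatever `geo.resolve(...)` yields, provided those objects expose the same attribute names.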
Or from the command line:
```shell
safesignal-geo resolve "incident at Computer Village, Lagos"
safesignal-geo resolve --span "Bauchi" --context "today in Bauchi state"
```
The library bundles the gazetteer, BM25 resolver, and a context-aware
reranker that uses Nigerian-state priors and co-mention signals. The
fine-tuned span tagger is an optional [tagger] extra; weights publish
to Hugging Face in v0.2.0.
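To illustrate what "state priors and co-mention signals" can mean in a heuristic reranker, here is a minimal sketch. The weights, the `STATE_PRIOR` table, the `Candidate` fields, and the toy candidates are all assumptions; the package's actual scoring is not described in this README.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    state: str
    retrieval_score: float  # e.g. a first-stage BM25 score


# Hypothetical priors: how often mentions fall in each state.
STATE_PRIOR = {"Lagos": 0.5, "Kano": 0.2}
DEFAULT_PRIOR = 0.05


def rerank(candidates, context, prior_weight=1.0, comention_boost=2.0):
    """Sort candidates by retrieval score plus a state prior plus a co-mention boost."""
    context_lower = context.lower()

    def final_score(candidate):
        score = candidate.retrieval_score
        score += prior_weight * STATE_PRIOR.get(candidate.state, DEFAULT_PRIOR)
        # Co-mention: the candidate's state is named elsewhere in the text.
        if candidate.state.lower() in context_lower:
            score += comention_boost
        return score

    return sorted(candidates, key=final_score, reverse=True)


# Toy candidates, not real gazetteer rows: one ambiguous name in two states.
candidates = [
    Candidate("Central Market", "Lagos", 3.0),
    Candidate("Central Market", "Kano", 3.0),
]
with_context = rerank(candidates, "fire for Central Market, Kano people dey run")
without_context = rerank(candidates, "fire for Central Market")
print(with_context[0].state, without_context[0].state)
```

With no state named in the text, the prior breaks the tie; once the context names a state, the co-mention boost outweighs the prior. The substring check is deliberately naive and would need word-boundary matching in real use.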
Status
Pre-alpha. Project started May 2026. Target v1.0: August 2026.
| Milestone | Target | Status |
|---|---|---|
| Gazetteer schema + 500 Lagos seed | May 2026 | ✅ Done (42,315 rows in v0.2 — Lagos/FCT/Rivers/Kano informal-place index growing) |
| Annotation guidelines v0.1 | May 2026 | ✅ Done |
| Span tagger v0.2 (LoRA fine-tune) | June 2026 | ✅ Trained (partial F1 0.749); weights publishing in v0.2.0 |
| Resolver + reranker | July 2026 | ✅ Done (Top-1 0.93, P95 65 ms; heuristic reranker — learned cross-encoder is a v1.1 candidate) |
| Pip-installable package | July 2026 | ✅ Done (v0.1.0) |
| Public Gradio demo | July 2026 | 🚧 In progress (rewrite to call the package + HF Space deploy) |
| v1.0 public release | August 2026 | ⏳ Planned |
v0.2 numbers vs. spec
| Metric | Target | v0.2 actual | Status |
|---|---|---|---|
| Span F1 (partial) | ≥ 0.82 | 0.749 | below target (-0.071) |
| Top-1 resolution accuracy | ≥ 0.70 | 0.931 | pass (+0.231) |
| Top-3 recall | ≥ 0.88 | 0.970 | pass (+0.090) |
| Latency P95 (CPU) | ≤ 200 ms | 65.30 ms | pass (3.1× headroom) |
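For reference, the resolution and latency metrics in the table can be computed along the following lines. This is a sketch under assumptions: the benchmark's exact procedure lives in the benchmarks doc, the nearest-rank percentile method is one common choice rather than a confirmed one, and the IDs and latencies below are toy values.

```python
import math


def top_k_accuracy(ranked_predictions, gold_ids, k):
    """Fraction of mentions whose gold ID appears in the first k predictions."""
    correct = sum(
        1 for preds, gold in zip(ranked_predictions, gold_ids) if gold in preds[:k]
    )
    return correct / len(gold_ids)


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Toy data: four mentions, each with three ranked candidate IDs.
predictions = [["a", "b", "c"], ["x", "y", "z"], ["m", "n", "o"], ["q", "r", "s"]]
gold = ["a", "y", "o", "t"]
latencies_ms = [12.0, 15.0, 18.0, 20.0, 22.0, 25.0, 30.0, 41.0, 55.0, 65.3]

top1 = top_k_accuracy(predictions, gold, k=1)
top3 = top_k_accuracy(predictions, gold, k=3)
p95 = percentile_nearest_rank(latencies_ms, 95)
print(top1, top3, p95)
```

Top-3 recall here is the same function with `k=3`; the headroom figure in the table is simply the target divided by the measured P95.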
Reproducibility commands and full slice breakdowns: `docs/benchmarks/v0.2.md`. Design doc: `docs/design-doc.md`.
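"Partial" span F1 has more than one definition in the literature. The sketch below uses one common convention, in which a predicted span counts as correct if it overlaps a gold span by at least one character and each gold span can be matched once. Whether this matches the project's scoring is an assumption; the benchmarks doc is the authority.

```python
def overlaps(a, b):
    """True if two (start, end) character spans share at least one character (end exclusive)."""
    return a[0] < b[1] and b[0] < a[1]


def partial_span_f1(predicted, gold):
    """F1 where a prediction is a true positive if it overlaps an unmatched gold span."""
    unmatched_gold = list(gold)
    true_positives = 0
    for span in predicted:
        match = next((g for g in unmatched_gold if overlaps(span, g)), None)
        if match is not None:
            unmatched_gold.remove(match)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy sentence with hand-computed character offsets (illustrative only).
text = "Traffic don jam for Ojuelegba under bridge near Stadium"
gold_spans = [(20, 42), (48, 55)]      # "Ojuelegba under bridge", "Stadium"
predicted_spans = [(20, 29), (0, 7)]   # "Ojuelegba" (partial hit), "Traffic" (miss)

score = partial_span_f1(predicted_spans, gold_spans)
print(score)
```

Under exact-match scoring the same prediction would earn no credit, which is why partial and exact F1 can diverge sharply on long informal names.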
How to contribute
You don't need to be an ML engineer. The single most valuable thing you can do is add 10 places from your neighborhood to the gazetteer. That takes 30 minutes and meaningfully moves v1.0 forward.
- Add a place: open an issue with the `Add a place` template, or submit a PR to `gazetteer/contributions/`.
- Annotate text: once the Label Studio instance is live (Month 1), pick up annotation tasks.
- Flag an error: wrong coordinates? duplicate? missing alias? File an issue with `Flag an error`.
- Code & model contributions: see CONTRIBUTING.md.
All contributions are credited in the dataset's per-row source field.
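Before submitting a place, a quick sanity check can catch the most common mistakes, such as swapped latitude and longitude. The sketch below is hypothetical: the `PlaceContribution` fields are guesses modeled on the attributes shown in the quickstart plus the per-row source field, and the bounding box is a deliberately loose approximation of Nigeria's extent, not a value taken from the project.

```python
from dataclasses import dataclass, field

# Approximate bounding box for Nigeria (assumed, intentionally generous).
NIGERIA_BBOX = {"lat": (4.0, 14.0), "lng": (2.5, 15.0)}


@dataclass
class PlaceContribution:
    """Hypothetical record shape; check the real schema before submitting."""
    canonical_name: str
    lat: float
    lng: float
    admin_state: str
    source: str
    aliases: list = field(default_factory=list)


def validate(place):
    """Return a list of human-readable problems; an empty list means the row looks sane."""
    problems = []
    if not place.canonical_name.strip():
        problems.append("canonical_name is empty")
    lat_min, lat_max = NIGERIA_BBOX["lat"]
    lng_min, lng_max = NIGERIA_BBOX["lng"]
    if not lat_min <= place.lat <= lat_max:
        problems.append(f"lat {place.lat} is outside {lat_min}..{lat_max}")
    if not lng_min <= place.lng <= lng_max:
        problems.append(f"lng {place.lng} is outside {lng_min}..{lng_max}")
    if not place.source.strip():
        problems.append("source is empty (needed for contributor credit)")
    return problems


# Coordinates from the Berger Junction example earlier in this README.
good = PlaceContribution("Berger Junction", 6.5841, 3.3790, "Lagos",
                         "community:example-user", aliases=["Berger"])
swapped = PlaceContribution("Berger Junction", 3.3790, 6.5841, "Lagos",
                            "community:example-user")
print(validate(good))
print(validate(swapped))
```

A bounding-box check only catches gross errors. It will not notice a point placed in the wrong neighborhood, so a visual check on a map is still worthwhile.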
Coverage
v0.2 (bundled): Nigeria-wide gazetteer (42,315 rows). Informal-place counts for the four priority states (principal city in parentheses where the name differs):
| city | junctions | bus_stops | motor_parks | markets |
|---|---|---|---|---|
| Lagos | 100 | 114 | 23 | 73 |
| FCT (Abuja) | 1 | 16 | 4 | 52 |
| Rivers (Port Harcourt) | 1 | 19 | 9 | 18 |
| Kano | 0 | 2 | 1 | 13 |
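Summing the table makes the current skew easier to see. The snippet below only restates the counts above and adds them up; it introduces no new data.

```python
# Counts copied from the coverage table above.
COUNTS = {
    "Lagos": {"junctions": 100, "bus_stops": 114, "motor_parks": 23, "markets": 73},
    "FCT (Abuja)": {"junctions": 1, "bus_stops": 16, "motor_parks": 4, "markets": 52},
    "Rivers (Port Harcourt)": {"junctions": 1, "bus_stops": 19, "motor_parks": 9, "markets": 18},
    "Kano": {"junctions": 0, "bus_stops": 2, "motor_parks": 1, "markets": 13},
}

# Total informal-place rows per area.
per_area = {area: sum(kinds.values()) for area, kinds in COUNTS.items()}

# Total rows per category across all four areas.
per_category = {}
for kinds in COUNTS.values():
    for kind, count in kinds.items():
        per_category[kind] = per_category.get(kind, 0) + count

grand_total = sum(per_area.values())
lagos_share = per_area["Lagos"] / grand_total

print(per_area)
print(per_category)
print(grand_total, round(lagos_share, 3))
```

Lagos accounts for roughly 70% of the 446 informal-place rows listed, and junctions outside Lagos are nearly absent, which is where community contributions are likely to help most.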
v1.0 (August 2026): grow informal-place rows via Nominatim retries and community contributions; OSM PBF mining was largely exhausted in v0.2.
v1.1+: Ibadan, Benin City, Onitsha, Aba, Enugu, Kaduna. Native Yoruba/Hausa/Igbo support with Masakhane collaboration. Learned cross-encoder reranker.
Scope: what this is not
- Not a routing engine.
- Not reverse geocoding (lat/lng → name).
- Not a safety / incident / crime model. SafeSignal-Geo is only geolocation. Domain logic lives downstream.
- Not trained on ACLED data (their EULA prohibits ML training).
- Not used for security-force tracking. The library does not include surveillance categories.
License stack
| Artifact | License |
|---|---|
| Code | Apache 2.0 |
| Data (gazetteer + spans) | CC-BY 4.0 |
| Model weights | OpenRAIL-M |
| Documentation | CC-BY 4.0 |
OpenRAIL-M restricts use of the weights for surveillance, military, and discriminatory applications. We chose this deliberately given Nigeria's surveillance context.
Acknowledgements
SafeSignal-Geo is incubated by Jyv Tech LLC and built around Chipon's preventive-safety platform as the anchor user. We build on top of:
- AfroXLMR — Adelani et al., 2022
- MasakhaNER — Adelani et al., 2021 (evaluation only)
- OpenStreetMap Nigeria contributors (ODbL)
- The Masakhane, Data Science Nigeria, and AfricaNLP communities
Citation
@software{safesignal_geo_2026,
author = {Tanta, Abraham Esandayinze},
title = {SafeSignal-Geo: An Open-Source Geolocation Library for Informal Nigerian Place Names},
year = {2026},
url = {https://github.com/mr-tanta/safesignal-geo},
organization = {Jyv Tech LLC}
}
Contact
- Maintainer: Abraham Esandayinze Tanta · abraham@jyvtechllc.com
- Issues: github.com/mr-tanta/safesignal-geo/issues
- Discussions: github.com/mr-tanta/safesignal-geo/discussions