Nigerian-language geo-tagging and gazetteer resolution (Pidgin, Hausa, Igbo, Yoruba, English).

SafeSignal-Geo

An open-source geolocation library for informal Nigerian place names.

"Oshodi under-bridge" · "behind Shoprite Surulere" · "Mile 2 last bus stop" · "Berger junction"

License: Apache 2.0 · Data: CC-BY 4.0 · Weights: OpenRAIL-M · Status: Pre-alpha


The problem

Across Nigerian apps — preventive safety, ride-hailing, last-mile logistics, journalism, humanitarian response — the place names that matter most are not in any gazetteer.

Input:   "trouble for Berger this morning, dem don block road"
Want:    [{name: "Berger Junction", lat: 6.5841, lng: 3.3790, confidence: 0.91}]

Existing geoparsers (Mordecai 3, Edinburgh Geoparser, Nominatim) collapse on Nigerian vernacular geography. They rely on GeoNames, which is sparse below the LGA level and has zero entries for under-bridges, junctions, last bus stops, behind-landmark references, or named markets.

SafeSignal-Geo is the first open library to treat informal Nigerian geolocation as the primary problem, not an edge case.

What you get

  • safesignal-geo — a Python package: pip install safesignal-geo. Ships today with the 42,315-row Nigerian gazetteer, a BM25 resolver, and a context-aware reranker (Top-1 0.93, P95 65 ms on CPU).
  • safesignal-geo-base — a ~270M-parameter AfroXLMR + LoRA span tagger. Trained on the v0.2 corpus (partial F1 0.749 on v0.2 dev gold); weights publishing to Hugging Face in v0.2.0.
  • safesignal-geo-gazetteer — the bundled Nigerian gazetteer, CC-BY 4.0. v0.2 is a 42,315-row national extract with an OSM-mined informal-place index (junctions, bus stops, motor parks, markets) for Lagos / FCT / Rivers / Kano.
  • A public eval leaderboard — planned for v1.0; submit your own model, see how it ranks.
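The span tagger above is scored with "partial" F1. The exact matching rule is documented in the benchmarks, not here, so the following is only a minimal sketch under one common assumption: a predicted span counts as correct if it overlaps a gold span by at least one character. The function names are illustrative and are not part of the package.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def overlaps(a: Span, b: Span) -> bool:
    """True when two half-open spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def partial_span_f1(gold: List[Span], pred: List[Span]) -> float:
    """F1 under the assumed rule: any overlap with a gold span is a match."""
    if not gold and not pred:
        return 1.0
    # Precision counts predictions that touch some gold span.
    matched_pred = sum(1 for p in pred if any(overlaps(p, g) for g in gold))
    # Recall counts gold spans that are touched by some prediction.
    matched_gold = sum(1 for g in gold if any(overlaps(g, p) for p in pred))
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# "trouble for Berger this morning": gold span "Berger" is (12, 18).
# A prediction of "for Berger" (8, 18) still counts under the partial rule.
print(partial_span_f1([(12, 18)], [(8, 18)]))
```

Under a strict rule the same prediction would score 0, which is why partial and exact F1 should not be compared directly.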

Quickstart

pip install safesignal-geo

from safesignal_geo import Geo

geo = Geo()  # bundled v0.2 Nigerian gazetteer (~42k records)

for hit in geo.resolve("Traffic don jam for Ojuelegba this morning."):
    print(hit.canonical_name, hit.admin_state, hit.admin_lga, hit.lat, hit.lng)

# Ojuelegba Lagos Surulere 6.5093742 3.3665407

Or from the command line:

safesignal-geo resolve "incident at Computer Village, Lagos"
safesignal-geo resolve --span "Bauchi" --context "today in Bauchi state"

The library bundles the gazetteer, BM25 resolver, and a context-aware reranker that uses Nigerian-state priors and co-mention signals. The fine-tuned span tagger is an optional [tagger] extra; weights publish to Hugging Face in v0.2.0.
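To make the resolver and reranker idea concrete, here is a toy sketch of BM25 retrieval over place names followed by a state-prior boost taken from the surrounding context. It is not the library's implementation: the rows, the `rerank` function, and the `state_boost` value are assumptions for illustration, and the bundled reranker also uses co-mention signals that are not modelled here.

```python
import math
import re
from collections import Counter

# Toy rows for illustration only; not taken from the bundled gazetteer.
GAZETTEER = [
    {"name": "Berger Junction", "state": "Lagos"},
    {"name": "Berger Roundabout", "state": "FCT"},
    {"name": "Ojuelegba", "state": "Lagos"},
]


def tokenize(text):
    """Lowercase and split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of the query against each document string."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rerank(span, context, rows, state_boost=1.0):
    """Boost lexical matches whose state is named in the context.

    Simplification: only single-token state names are detected.
    """
    base = bm25_scores(span, [r["name"] for r in rows])
    context_tokens = set(tokenize(context))
    ranked = []
    for row, score in zip(rows, base):
        if score > 0 and row["state"].lower() in context_tokens:
            score += state_boost
        ranked.append((score, row))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


best_score, best_row = rerank("Berger", "dem don block road for Berger, Lagos", GAZETTEER)[0]
print(best_row["name"], best_row["state"])
```

The two "Berger" rows tie on BM25 alone, so the context is what separates them. That is the role the reranker plays for ambiguous names.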

Status

Pre-alpha. Project started May 2026. Target v1.0: August 2026.

Milestone | Target | Status
Gazetteer schema + 500 Lagos seed | May 2026 | ✅ Done (42,315 rows in v0.2; Lagos/FCT/Rivers/Kano informal-place index growing)
Annotation guidelines v0.1 | May 2026 | ✅ Done
Span tagger v0.2 (LoRA fine-tune) | June 2026 | ✅ Trained (partial F1 0.749); weights publishing in v0.2.0
Resolver + reranker | July 2026 | ✅ Done (Top-1 0.93, P95 65 ms; heuristic reranker, learned cross-encoder is a v1.1 candidate)
Pip-installable package | July 2026 | ✅ Done (v0.1.0)
Public Gradio demo | July 2026 | 🚧 In progress (rewrite to call the package + HF Space deploy)
v1.0 public release | August 2026 | ⏳ Planned

v0.2 numbers vs. spec

Metric | Target | v0.2 actual | Status
Span F1 (partial) | ≥ 0.82 | 0.749 | working toward
Top-1 resolution accuracy | ≥ 0.70 | 0.931 | pass (+0.231)
Top-3 recall | ≥ 0.88 | 0.970 | pass (+0.090)
Latency P95 (CPU) | ≤ 200 ms | 65.30 ms | pass (3.1× headroom)
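For readers who want to compute the same kinds of numbers on their own data, the resolution metrics can be sketched as below. This is an assumption-laden sketch, not the project's benchmark harness: the function names are hypothetical, and the nearest-rank percentile is one of several common P95 definitions.

```python
import math
from typing import Sequence


def top_k_hit_rate(ranked_ids: Sequence[Sequence[str]],
                   gold_ids: Sequence[str], k: int) -> float:
    """Share of queries whose gold ID appears among the first k candidates.

    With k=1 this is Top-1 accuracy; with k=3 it is Top-3 recall.
    """
    hits = sum(1 for ranked, gold in zip(ranked_ids, gold_ids)
               if gold in ranked[:k])
    return hits / len(gold_ids)


def percentile_nearest_rank(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical candidate lists and gold IDs for four queries.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "d", "e"], ["x", "y", "z"]]
gold = ["a", "a", "e", "q"]
print(top_k_hit_rate(ranked, gold, k=1))  # 0.25
print(top_k_hit_rate(ranked, gold, k=3))  # 0.75

# Hypothetical per-query latencies in milliseconds.
latencies_ms = list(range(1, 101))
print(percentile_nearest_rank(latencies_ms, 95))  # 95
```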

Reproducibility commands and full slice breakdowns: docs/benchmarks/v0.2.md. Design doc: docs/design-doc.md.

How to contribute

You don't need to be an ML engineer. The single most valuable thing you can do is add 10 places from your neighborhood to the gazetteer. That takes 30 minutes and meaningfully moves v1.0 forward.

  1. Add a place — open an issue with the Add a place template, or submit a PR to gazetteer/contributions/.
  2. Annotate text — once the Label Studio instance is live (Month 1), pick up annotation tasks.
  3. Flag an error — wrong coordinates? duplicate? missing alias? File an issue with Flag an error.
  4. Code & model contributions — see CONTRIBUTING.md.

All contributions are credited in the dataset's per-row source field.
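Before opening an issue or PR, it can help to sanity-check a new place locally. The sketch below is an assumption, not the project's contribution schema: the field names mirror the attributes shown in the Quickstart plus the source and alias fields mentioned above, and the bounding box is a deliberately loose rectangle around Nigeria.

```python
from dataclasses import dataclass, field
from typing import List

# Loose bounding box around Nigeria, used only as a sanity check (assumption).
LAT_RANGE = (4.0, 14.0)
LNG_RANGE = (2.5, 15.0)


@dataclass
class PlaceContribution:
    """Hypothetical record shape for one contributed place."""
    canonical_name: str
    lat: float
    lng: float
    admin_state: str
    admin_lga: str
    source: str
    aliases: List[str] = field(default_factory=list)


def validate(place: PlaceContribution) -> List[str]:
    """Return a list of human-readable problems; an empty list means it looks sane."""
    problems = []
    if not place.canonical_name.strip():
        problems.append("canonical_name is empty")
    if not LAT_RANGE[0] <= place.lat <= LAT_RANGE[1]:
        problems.append(f"lat {place.lat} is outside {LAT_RANGE}")
    if not LNG_RANGE[0] <= place.lng <= LNG_RANGE[1]:
        problems.append(f"lng {place.lng} is outside {LNG_RANGE}")
    if not place.source.strip():
        problems.append("source is empty")
    return problems


ok = PlaceContribution("Ojuelegba", 6.5093742, 3.3665407,
                       "Lagos", "Surulere", source="local knowledge")
swapped = PlaceContribution("Ojuelegba", 3.3665407, 6.5093742,
                            "Lagos", "Surulere", source="local knowledge")
print(validate(ok))       # []
print(validate(swapped))  # one problem: the latitude is out of range
```

A range check like this catches the most common mistake, swapped latitude and longitude, for most of the country, though not for points where both values happen to fall inside both ranges.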

Coverage

v0.2 (bundled): Nigeria-wide gazetteer (42,315 rows). Per-state informal-place counts in the four priority cities:

city | junctions | bus_stops | motor_parks | markets
Lagos | 100 | 114 | 23 | 73
FCT (Abuja) | 1 | 16 | 4 | 52
Rivers (Port Harcourt) | 1 | 19 | 9 | 18
Kano | 0 | 2 | 1 | 13

v1.0 (August 2026): grow informal-place rows via Nominatim retries and community contributions; OSM PBF mining was largely exhausted in v0.2.

v1.1+: Ibadan, Benin City, Onitsha, Aba, Enugu, Kaduna. Native Yoruba/Hausa/Igbo support with Masakhane collaboration. Learned cross-encoder reranker.

Scope: what this is not

  • Not a routing engine.
  • Not reverse geocoding (lat/lng → name).
  • Not a safety / incident / crime model. SafeSignal-Geo is only geolocation. Domain logic lives downstream.
  • Not trained on ACLED data (their EULA prohibits ML training).
  • Not used for security-force tracking. The library does not include surveillance categories.

License stack

Artifact | License
Code | Apache 2.0
Data (gazetteer + spans) | CC-BY 4.0
Model weights | OpenRAIL-M
Documentation | CC-BY 4.0

OpenRAIL-M restricts use of the weights for surveillance, military, and discriminatory applications. We chose this deliberately given Nigeria's surveillance context.

Acknowledgements

SafeSignal-Geo is incubated by Jyv Tech LLC and built around Chipon's preventive-safety platform as the anchor user. We build on top of:

Citation

@software{safesignal_geo_2026,
  author  = {Tanta, Abraham Esandayinze},
  title   = {SafeSignal-Geo: An Open-Source Geolocation Library for Informal Nigerian Place Names},
  year    = {2026},
  url     = {https://github.com/mr-tanta/safesignal-geo},
  organization = {Jyv Tech LLC}
}
