⚡ Atlas GMP Engine

Bayesian inference engine for geographic place guessing

The AI brain powering GuessMyPlace —
identifies any place on Earth through intelligent yes/no questions.

What is Atlas GMP Engine?

Atlas GMP Engine is a standalone Python package implementing a Bayesian inference system for geographic place identification. Given a dataset of places (countries, cities, landmarks, etc.) and a bank of yes/no questions, the engine:

Maintains a probability distribution across all places
Selects the most informative next question using information gain + Bayesian scoring
Updates probabilities after each answer using likelihood multipliers
Eliminates low-probability candidates through soft filtering
Returns a prediction together with a confidence score

Live performance: ~94% accuracy on 115 world countries, averaging 10 questions per game.

How It Works

                    ┌──────────────────────────────────────────┐
                    │            Atlas GMP Engine               │
                    │                                          │
  User Answer ──→  │  ProbabilityManager                      │
                    │    ↓  Bayesian likelihood updates         │
                    │  BayesianNetwork                         │
                    │    ↓  Belief propagation across attrs     │
                    │  InformationGain  ←── FeatureImportance  │
                    │    ↓  Shannon entropy (NumPy + C++)       │
                    │  QuestionSelector                        │
                    │    ↓  5-factor weighted scoring           │
                    │  ConfidenceCalculator                    │
                    │    ↓  4-signal composite score (0–100%)   │
                    │  Prediction / Next Question              │
                    └──────────────────────────────────────────┘

Core Components

Component	File	Purpose
`InferenceEngine`	`inference_engine.py`	Main coordinator — manages game sessions
`ProbabilityManager`	`probability_manager.py`	Bayesian likelihood updates + soft filtering
`BayesianNetwork`	`bayesian_network.py`	Belief propagation across related attributes
`InformationGain`	`information_gain.py`	Shannon entropy calculation (NumPy + C++)
`QuestionSelector`	`question_selector.py`	5-factor question scoring + disambiguation
`ConfidenceCalculator`	`confidence_calculator.py`	4-signal composite confidence score
`FeatureImportance`	`feature_importance.py`	ML-learned attribute weights
`Embeddings`	`embeddings.py`	MiniLM-L6-v2 semantic similarity
`FAISSIndex`	`faiss_index.py`	Fast last-mile disambiguation

Question Selection Algorithm

Every candidate question is scored with a weighted formula:

score = (information_gain  × 0.40)   # How much entropy does this reduce?
      + (stage_bonus        × 0.35)   # continent→region→culture→specific
      + (answer_balance     × 0.10)   # prefer questions that split ~50/50
      + (bayesian_belief    × 0.10)   # prior probability of this attribute value
      + (feature_importance × 0.05)   # weight learned from real game data (ML)
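
The weighted sum above can be sketched in a few lines of Python. This is an illustration rather than the package's own code: the names `WEIGHTS`, `question_score`, and `pick_best` are hypothetical, and each factor is assumed to be pre-scaled to the range 0 to 1.

```python
# Weights copied from the formula above; they sum to 1.0.
WEIGHTS = {
    "information_gain": 0.40,
    "stage_bonus": 0.35,
    "answer_balance": 0.10,
    "bayesian_belief": 0.10,
    "feature_importance": 0.05,
}


def question_score(factors: dict[str, float]) -> float:
    """Combine the five factors (each assumed to lie in [0, 1]) into one score."""
    return sum(weight * factors.get(name, 0.0) for name, weight in WEIGHTS.items())


def pick_best(candidates: dict[str, dict[str, float]]) -> str:
    """Return the id of the highest-scoring candidate question."""
    return max(candidates, key=lambda qid: question_score(candidates[qid]))


# Made-up factor values for two hypothetical questions.
candidates = {
    "q_continent": {"information_gain": 0.9, "stage_bonus": 1.0, "answer_balance": 0.8,
                    "bayesian_belief": 0.5, "feature_importance": 0.95},
    "q_capital": {"information_gain": 1.0, "stage_bonus": 0.0, "answer_balance": 0.1,
                  "bayesian_belief": 0.5, "feature_importance": 0.95},
}
best = pick_best(candidates)
```

With these made-up numbers the broad continent question wins even though the capital question has the higher information gain, because the stage bonus carries 35% of the weight.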

Stage ordering biases the engine toward asking broad questions first:

Stage 0 → continent, type
Stage 1 → region, sub-region
Stage 2 → coast, landlocked, island, climate, mountains
Stage 3 → population, size, GDP level
Stage 4 → government, religion, drive side
Stage 5 → language, flag, colonial history, UNESCO
Stage 6 → exports, famous for, neighbors
Stage 7 → capital, currency (very specific — asked last)
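
The information_gain term in the scoring formula is described as a Shannon entropy calculation. A minimal pure-Python sketch of information gain for a yes/no question follows. It is a re-implementation for illustration, not the package's NumPy or C++ code, and it assumes every place answers each question with a definite yes or no.

```python
import math


def shannon_entropy(probs):
    """Entropy in bits; the input is renormalized, zero entries contribute nothing."""
    total = sum(probs)
    if total <= 0:
        return 0.0
    return -sum((p / total) * math.log2(p / total) for p in probs if p > 0)


def information_gain_binary(probs, matches):
    """Expected entropy reduction from asking one yes/no question.

    probs   -- current probability of each place
    matches -- parallel list of bools: would this place answer "yes"?
    """
    total = sum(probs)
    if total <= 0:
        return 0.0
    yes = [p for p, m in zip(probs, matches) if m]
    no = [p for p, m in zip(probs, matches) if not m]
    p_yes, p_no = sum(yes) / total, sum(no) / total
    expected = p_yes * shannon_entropy(yes) + p_no * shannon_entropy(no)
    return shannon_entropy(probs) - expected


uniform = [0.25, 0.25, 0.25, 0.25]
even_split = information_gain_binary(uniform, [True, True, False, False])
uneven_split = information_gain_binary(uniform, [True, False, False, False])
```

An even split of four equally likely places yields exactly one bit, while the one-versus-three split yields less, which is why questions that divide the candidates near 50/50 tend to score well.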

Probability Updates

Each answer multiplies every place's probability by a likelihood multiplier, chosen by whether the place's attribute matches the value the question asks about:

Answer	Match multiplier	Mismatch multiplier
Yes	×10.0	×0.001
Probably	×3.5	×0.15
Don't Know	×1.0	×1.0
Probably Not	×0.15	×3.5
No	×0.001	×10.0

After each update, probabilities are normalized and a soft filter eliminates candidates below 0.5% of the top probability (keeping at least 5).
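
A small sketch of this update step, using the multipliers from the table and the 0.5% / keep-at-least-5 rule. The function names are hypothetical, the answer keys are taken from the Quick Start prompt, and renormalizing again after filtering is an assumption.

```python
# Multipliers from the table above: answer -> (match, mismatch).
MULTIPLIERS = {
    "yes": (10.0, 0.001),
    "probably": (3.5, 0.15),
    "dontknow": (1.0, 1.0),
    "probablynot": (0.15, 3.5),
    "no": (0.001, 10.0),
}


def update(probs, matches, answer):
    """Multiply each place's probability by its multiplier, then renormalize."""
    match_mult, mismatch_mult = MULTIPLIERS[answer]
    scaled = {
        place: p * (match_mult if matches[place] else mismatch_mult)
        for place, p in probs.items()
    }
    total = sum(scaled.values())
    return {place: p / total for place, p in scaled.items()}


def soft_filter(probs, ratio=0.005, keep_min=5):
    """Drop places below `ratio` of the top probability, keeping at least `keep_min`."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    cutoff = ranked[0][1] * ratio
    kept = [item for item in ranked if item[1] >= cutoff]
    if len(kept) < keep_min:
        kept = ranked[:keep_min]
    total = sum(p for _, p in kept)  # assumed: renormalize the survivors
    return {place: p / total for place, p in kept}


# Eight equally likely places; only "a" and "b" match the question.
probs = {name: 1 / 8 for name in "abcdefgh"}
matches = {name: name in "ab" for name in probs}
posterior = soft_filter(update(probs, matches, "yes"))
```

Because a mismatch is multiplied by 0.001 rather than 0, a single wrong answer pushes the right place far down the ranking without removing it outright, and the keep-at-least-5 floor gives it a chance to recover.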

Confidence Score

The confidence signal is a weighted combination of 4 measurements:

confidence = (probability_gap   × 0.40)   # gap between top-1 and top-2 probability
           + (normalized_prob   × 0.30)   # top probability / total
           + (item_count_score  × 0.20)   # fewer remaining = more confident
           + (entropy_score     × 0.10)   # inverse of distribution entropy
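
The README gives the weights but not the exact definition of each signal, so the sketch below fills them in with plausible guesses. Every signal definition here is an assumption (marked in the comments), and `confidence` is a hypothetical helper, not the package's `ConfidenceCalculator`.

```python
import math


def confidence(probs, initial_count):
    """Composite confidence in percent from the remaining places' probabilities."""
    total = sum(probs)
    ranked = sorted((p / total for p in probs), reverse=True)
    top = ranked[0]
    second = ranked[1] if len(ranked) > 1 else 0.0

    probability_gap = top - second  # assumed: top-1 minus top-2
    normalized_prob = top           # top probability / total
    # Assumed: 1.0 with one candidate left, 0.0 when nothing has been eliminated.
    item_count_score = max(0.0, 1.0 - (len(ranked) - 1) / max(initial_count - 1, 1))
    # Assumed: 1 - H / H_max, where H_max is the entropy of a uniform distribution.
    entropy = -sum(p * math.log2(p) for p in ranked if p > 0)
    max_entropy = math.log2(len(ranked)) if len(ranked) > 1 else 1.0
    entropy_score = 1.0 - entropy / max_entropy

    score = (probability_gap * 0.40 + normalized_prob * 0.30
             + item_count_score * 0.20 + entropy_score * 0.10)
    return 100.0 * score
```

Under these assumptions a single surviving candidate scores 100%, and four equally likely candidates out of four score 7.5%, all of it from the normalized_prob term.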

The engine triggers a guess when confidence crosses a threshold that relaxes as more questions are asked:

Questions 1–10: requires 99%
Questions 11–25: requires 95%
Questions 26–50: requires 88%
Questions 51+: requires 78%
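
The threshold schedule can be written as a simple lookup. The helper names are hypothetical; the sketch assumes question 50 belongs to the 26–50 band and that "crosses" means greater than or equal to the threshold.

```python
def guess_threshold(questions_asked: int) -> float:
    """Confidence (in percent) required before guessing, by questions asked so far."""
    if questions_asked <= 10:
        return 99.0
    if questions_asked <= 25:
        return 95.0
    if questions_asked <= 50:
        return 88.0
    return 78.0


def should_guess(confidence: float, questions_asked: int) -> bool:
    """True once confidence reaches the threshold for the current question count."""
    return confidence >= guess_threshold(questions_asked)
```

For example, a confidence of 96% is not enough after 8 questions but is enough after 12.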

Installation

pip install atlas-gmp-engine

With C++ extensions (recommended; about 8× faster probability operations at 10,000 places):

pip install "atlas-gmp-engine[cpp]"

With semantic embeddings (for FAISS disambiguation):

pip install "atlas-gmp-engine[embeddings]"

Full installation:

pip install "atlas-gmp-engine[all]"

Quick Start

from atlas_engine import InferenceEngine

# Define your places
places = [
    {
        "id": "bd",
        "name": "Bangladesh",
        "type": "country",
        "emoji": "🇧🇩",
        "description": "A South Asian nation known for the Sundarbans and the Padma River.",
        "fun_fact": "Bangladesh is home to the world's largest river delta.",
        "attributes": {
            "continent":    "asia",
            "subRegion":    "south asia",
            "landlocked":   False,
            "hasCoast":     True,
            "hasDelta":     True,
            "climate":      "tropical",
            "mainReligion": "islam",
            "language":     "Bengali",
            "population":   "verylarge",
            "driveSide":    "left",
            "famousFor":    ["Sundarbans", "Padma River", "garments industry", "rickshaws"],
        },
    },
    # ... more places
]

# Define your questions
questions = [
    {
        "id": "q1",
        "question_text": "🌏 Is it located in Asia?",
        "attribute": "continent",
        "value": "asia",
        "stage": 0,
        "base_weight": 1.0,
    },
    {
        "id": "q2",
        "question_text": "🌊 Does it have a coastline?",
        "attribute": "hasCoast",
        "value": True,
        "stage": 2,
        "base_weight": 1.2,
    },
    # ... more questions
]

# Initialize engine
engine = InferenceEngine()

# Optionally load ML-learned feature importance
engine.load_feature_importance({
    "continent":    0.95,
    "subRegion":    0.90,
    "mainReligion": 0.88,
    "famousFor":    0.85,
    "language":     0.90,
})

# Start a game session
session = engine.start_game(places, questions)

# Game loop
while True:
    question = engine.get_next_question(session)

    if question is None:
        break  # Engine is ready to guess

    print(f"\n{question['question_text']}")
    answer = input("(yes / probably / dontknow / probablynot / no): ").strip().lower()

    result = engine.process_answer(session, answer)
    print(f"  Confidence: {result['confidence']:.1f}%")
    print(f"  Remaining:  {result['active_places_count']} places")

    if result["should_stop"]:
        break

# Get prediction
pred = engine.get_prediction(session)

if pred["prediction"]:
    p = pred["prediction"]
    print(f"\n🎯 Atlas guesses: {p['emoji']} {p['name']}")
    print(f"   Confidence: {pred['confidence']}%")
    print(f"   Questions asked: {pred['questions_asked']}")

Data Format

Place object

{
    "id":          str,              # unique identifier
    "name":        str,              # display name
    "type":        str,              # "country" | "city" | "landmark" | ...
    "emoji":       str | None,       # optional emoji flag or symbol
    "description": str | None,       # 2-3 sentence description
    "fun_fact":    str | None,       # surprising fact
    "attributes": {                  # key-value pairs matching your questions
        "continent":    str,         # "asia" | "europe" | "africa" | ...
        "subRegion":    str,         # "south asia" | "western europe" | ...
        "landlocked":   bool,
        "hasCoast":     bool,
        "hasMountains": bool,
        "climate":      str,         # "tropical" | "desert" | "temperate" | ...
        "population":   str,         # "small" | "medium" | "large" | "verylarge"
        "mainReligion": str,
        "language":     str,
        "famousFor":    list[str],   # list values supported
        "neighbors":    list[str],
        # ... any attributes your questions reference
    }
}

Question object

{
    "id":            str,     # unique identifier
    "question_text": str,     # "🌏 Is it in Asia?" — emoji prefix recommended
    "attribute":     str,     # "continent" — must match place attributes key
    "value":         any,     # "asia" — the value for which answer is YES
    "stage":         int,     # 0–7 (see stage ordering above)
    "base_weight":   float,   # 1.0 default, higher = preferred
}
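
To make the relationship between the two objects concrete, here is a hedged sketch of how a question might be evaluated against a place, plus a sanity check for a dataset. Both helpers are hypothetical. The list rule (a list-valued attribute matches when it contains the question's value) and the exact, case-sensitive comparison are assumptions, not documented behaviour.

```python
def place_matches(place: dict, question: dict) -> bool:
    """True if the place would answer "yes" to the question.

    Assumption: a list-valued attribute matches when it contains the value;
    any other attribute matches on exact equality.
    """
    actual = place["attributes"].get(question["attribute"])
    if isinstance(actual, list):
        return question["value"] in actual
    return actual == question["value"]


def unanswerable_questions(places: list[dict], questions: list[dict]) -> list[str]:
    """Ids of questions whose attribute is missing from every place."""
    return [
        q["id"] for q in questions
        if not any(q["attribute"] in p["attributes"] for p in places)
    ]


place = {
    "id": "bd", "name": "Bangladesh", "type": "country",
    "attributes": {"continent": "asia", "hasCoast": True,
                   "famousFor": ["Sundarbans", "Padma River"]},
}
questions = [
    {"id": "q1", "attribute": "continent", "value": "asia", "stage": 0},
    {"id": "q2", "attribute": "famousFor", "value": "Sundarbans", "stage": 6},
    {"id": "q3", "attribute": "currency", "value": "taka", "stage": 7},
]
missing = unanswerable_questions([place], questions)
```

Running a check like `unanswerable_questions` before starting a game can catch typos such as `subregion` versus `subRegion`, which would otherwise make a question uninformative.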

Advanced Usage

Load ML-learned feature importance

engine = InferenceEngine()

# Scores from 0.0 to 1.0 — higher = more important for discrimination
engine.load_feature_importance({
    "continent":    0.95,
    "type":         0.98,
    "subRegion":    0.90,
    "mainReligion": 0.88,
    "language":     0.90,
    "famousFor":    0.85,
    "capital":      0.95,
    "landlocked":   0.80,
})

Handle user correction (feedback)

# When Atlas guesses wrong and user corrects it:
engine.apply_feedback(session, correct_place_id="bd")
# Boosts Bangladesh ×25, reduces all others ×0.04
# Engine can then continue asking and make a new prediction
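
The effect of those two multipliers on the distribution can be sketched as follows. `boost_correct_place` is a hypothetical stand-in, not the engine's `apply_feedback`, and the renormalization step is an assumption.

```python
def boost_correct_place(probs: dict[str, float], correct_place_id: str,
                        boost: float = 25.0, penalty: float = 0.04) -> dict[str, float]:
    """Scale the corrected place up and every other place down, then renormalize."""
    scaled = {
        place_id: p * (boost if place_id == correct_place_id else penalty)
        for place_id, p in probs.items()
    }
    total = sum(scaled.values())
    return {place_id: p / total for place_id, p in scaled.items()}


# Made-up distribution in which the engine wrongly favoured "in".
before = {"bd": 0.10, "in": 0.60, "np": 0.30}
after = boost_correct_place(before, "bd")
```

In this made-up case the corrected place moves from 10% to roughly 98.6% of the probability mass.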

Use semantic embeddings for disambiguation

from atlas_engine.embeddings import embed_place

# Generate embedding for a place
place_data = {"name": "Bangladesh", "description": "...", "attributes": {...}}
embedding = embed_place(place_data)   # returns numpy array (384-dim)
# Store in your vector DB (e.g. Supabase pgvector)
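
Once embeddings exist, near-identical candidates can be compared by vector similarity. The sketch below uses cosine similarity in pure Python on toy 3-dimensional vectors. Cosine is an assumption here (the README only says "semantic similarity"), real embeddings would be 384-dimensional, and both helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(target_id, embeddings, k=3):
    """Return the k other places closest to `target_id`, best first."""
    target = embeddings[target_id]
    scored = [
        (place_id, cosine_similarity(target, vector))
        for place_id, vector in embeddings.items()
        if place_id != target_id
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy vectors standing in for real 384-dim embeddings.
toy_embeddings = {
    "bd": [1.0, 0.0, 0.0],
    "in": [0.9, 0.1, 0.0],
    "is": [0.0, 0.0, 1.0],
}
neighbours = most_similar("bd", toy_embeddings, k=2)
```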

Build FAISS index for fast similarity search

from atlas_engine.faiss_index import build_index, load_index

# Build from places with embeddings
places_with_embeddings = [
    {"id": "bd", "name": "Bangladesh", "type": "country", "embedding": [...]},
    # ...
]
build_index(places_with_embeddings)

# Load into memory (call once at startup)
load_index()

C++ Extensions

For large datasets (10,000+ places), the hot-path operations are implemented in C++ via pybind11:

atlas_engine/cpp/probability_ops.cpp
  ├── normalize_probabilities()   ← called after every answer
  ├── soft_filter()               ← eliminates near-zero candidates
  ├── shannon_entropy()           ← information gain inner loop
  └── information_gain_binary()   ← runs for every candidate question

Performance comparison:

Dataset	Python (NumPy)	C++ (pybind11)
100 places	~3ms	~1ms
1,000 places	~25ms	~5ms
10,000 places	~600ms	~70ms
50,000 places	~8s	~400ms

The engine automatically falls back to NumPy if C++ is not compiled.
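
A common way to implement this kind of fallback is a guarded import. In the sketch below the module path `atlas_engine.cpp.probability_ops` is only inferred from the source file name above and may not match the real package layout. The fallback is written in pure Python for brevity, whereas the engine itself falls back to NumPy. Resetting to a uniform distribution when all values are zero is also an assumption.

```python
try:
    # Hypothetical import path, inferred from atlas_engine/cpp/probability_ops.cpp.
    from atlas_engine.cpp.probability_ops import normalize_probabilities
    USING_CPP = True
except ImportError:
    USING_CPP = False

    def normalize_probabilities(probs):
        """Stand-in for the fallback path: scale the values so they sum to 1."""
        if not probs:
            return []
        total = sum(probs)
        if total <= 0:
            # Assumed behaviour: reset to a uniform distribution.
            return [1.0 / len(probs)] * len(probs)
        return [p / total for p in probs]
```

Callers use `normalize_probabilities` the same way in both cases, and `USING_CPP` can be logged at startup to confirm which path is active.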

Build C++ extensions manually:

cd atlas_engine/cpp
pip install pybind11
python setup.py build_ext --inplace

Performance

Dataset Size	Avg Response	Memory	C++ Required
≤ 1,000	< 20ms	~150MB	No
≤ 10,000	< 80ms	~800MB	Recommended
≤ 50,000	< 400ms	~5GB	Yes

Requirements

Core (required unless marked optional):

numpy >= 1.26
scipy >= 1.13
structlog >= 24.2   (optional, falls back to stdlib logging)

Optional extras:

scikit-learn >= 1.5       (ML feature importance training)
sentence-transformers >= 3.0  (semantic embeddings)
faiss-cpu >= 1.8          (fast vector similarity search)
pybind11 >= 2.13          (C++ hot-path extensions)

Changelog

[1.0.0] — 2026

Initial release as a standalone package.

Features:

Bayesian inference engine with 5-factor question selection
Probability Manager with likelihood multipliers
Bayesian Network for belief propagation across attributes
Information Gain Calculator (NumPy + C++ pybind11)
Confidence Calculator (4-signal composite score)
FAISS semantic index for last-mile disambiguation
MiniLM-L6-v2 embeddings (384-dim)
Soft filtering with configurable thresholds
Stage-ordered question selection
Feature importance (both static and ML-learned)
C++ extensions for hot-path operations (8× speedup)
Graceful fallback to NumPy when the C++ extensions are unavailable

Used By

GuessMyPlace — the geography guessing game this engine was built for

License

MIT License — see LICENSE for details.

Part of the GuessMyPlace project

Version 1.0.0, released Jun 4, 2026.
