GeoAI-VLM
Geospatial Vision-Language Model analysis for street-level imagery.
GeoAI-VLM combines ZenSVI's Mapillary downloading capabilities with Vision-Language Models (VLMs) and a high-performance vLLM backend to generate structured descriptions of street-level images. Starting with v0.2.0, GeoAI-VLM also supports multimodal embedding with Qwen3-VL-Embedding, enabling semantic clustering, spatial autocorrelation analysis, and vector similarity search over geotagged imagery. It's designed for GeoAI research.
Features
Core
- 🗺️ Geospatial Queries: Point, line, polygon, and bounding box queries with automatic buffering
- 📸 Mapillary Integration: Download street-level imagery via ZenSVI
- 🤖 VLM Analysis: Generate structured descriptions using Qwen-VL and other image-text-to-text models
- 📊 GeoParquet Output: Native geometry columns for seamless GIS integration
- 📏 Distance Calculations: Automatic distance-to-query computation using haversine
- ⚡ High Performance: vLLM backend for fast batch inference (Transformers fallback available)
- 🔄 Resume Support: Skip already-processed images for incremental workflows
Embedding & Analysis (v0.2.0)
- 🧬 Multimodal Embeddings: Generate dense vector representations from text and images using Qwen3-VL-Embedding (2B & 8B variants)
- 🔍 Vector Search: Build searchable indices with ChromaDB or FAISS and retrieve semantically similar places by text or image query
- 📈 Semantic Clustering: K-Means clustering over embeddings with automatic keyword extraction per cluster
- 🌐 Spatial Autocorrelation: Global and local Moran's I to detect spatial patterns in cluster assignments
- 📉 Visualization: Elbow curves, cluster maps, LISA significance maps, category distributions, and full HTML reports
Requirements & Platform Support
- Python 3.9-3.12 supported
- Windows is NOT supported due to the vLLM dependency. Please use Linux or macOS.
- CUDA-compatible GPU (recommended for VLM inference and embedding generation)
- Mapillary API key for downloading street-level imagery
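If you plan to use the vLLM backend, it helps to confirm a CUDA device is visible first. A minimal check, assuming PyTorch is already installed:

import torch

# vLLM needs a CUDA-capable GPU; the Transformers backend can also run on CPU
if torch.cuda.is_available():
    print("CUDA available:", torch.cuda.get_device_name(0))
else:
    print("No CUDA GPU detected")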
Set up using Python
Create a new Python environment
We recommend uv, a very fast Python environment manager, for creating and managing Python environments. Follow the uv documentation to install it, then create a new Python environment with:
uv venv --python 3.12 --seed
source .venv/bin/activate
Installation
Option 1: Install from PyPI
uv pip install geoai-vlm
Option 2: Install from GitHub
# Clone the repository
git clone https://github.com/yunusserhat/geoai-vlm.git
cd geoai-vlm
# Install in the current environment
uv pip install .
# For development (editable mode)
uv pip install -e ".[dev]"
Verify Installation
python -c "import geoai_vlm; print('GeoAI-VLM installed successfully!')"
Quick Start
Basic Usage
from geoai_vlm import describe_place
# Describe images from a place name
results = describe_place(
    place_name="Sultanahmet, Istanbul",
    mly_api_key="YOUR_MAPILLARY_API_KEY",
    buffer_m=100,
    output_path="sultanahmet_descriptions.parquet"
)
print(results.head())
Point Query with Distance
from geoai_vlm import describe_point
# Query images near a specific coordinate
results = describe_point(
    lat=41.0082,
    lon=28.9784,
    buffer_m=50,
    mly_api_key="YOUR_API_KEY",
    output_path="hagia_sophia.parquet"
)
# Results include distance_to_query_m column
print(results[['image_id', 'distance_to_query_m', 'scene_narrative']].head())
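The distance_to_query_m values are great-circle distances. A minimal sketch of the underlying computation using the haversine package (a core dependency; GeoAI-VLM's internal call may differ):

from haversine import haversine, Unit

# Great-circle distance from the query point to one image location, in metres
query = (41.0082, 28.9784)   # (lat, lon)
image = (41.0085, 28.9790)
print(haversine(query, image, unit=Unit.METERS))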
Line Query (Street/Route Analysis)
from geoai_vlm import describe_line
from shapely.geometry import LineString
# Analyze images along a street
street_line = LineString([
    (28.9700, 41.0100),  # Start point (lon, lat)
    (28.9750, 41.0120),  # Midpoint
    (28.9800, 41.0080),  # End point
])
results = describe_line(
    geometry=street_line,
    buffer_m=25,
    mly_api_key="YOUR_API_KEY"
)
# Results include distance_to_line_m and distance_along_line_m
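Conceptually, distance_along_line_m is the position of each image's projection onto the query line. A sketch of that idea in plain shapely (not the library's internals); with EPSG:4326 coordinates the results are in degrees, so reproject to a metric CRS to get metres:

from shapely.geometry import Point

img_point = Point(28.9740, 41.0115)
d_along = street_line.project(img_point)   # distance along the line (CRS units)
d_to = street_line.distance(img_point)     # perpendicular distance (CRS units)
# Reproject both geometries to a local metric CRS (e.g., a UTM zone)
# before calling project()/distance() if you need metres.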
Bounding Box Query
from geoai_vlm import describe_bbox
results = describe_bbox(
    minx=28.970, miny=41.005,
    maxx=28.985, maxy=41.015,
    mly_api_key="YOUR_API_KEY",
    model_name="Qwen/Qwen3-VL-2B-Instruct"
)
Custom Prompts
from geoai_vlm import describe_place
# Use custom system/user prompts
custom_system = """You are an urban safety analyst. Describe safety-relevant features."""
custom_user = """Analyze this street image for: lighting, visibility, foot traffic, escape routes."""
results = describe_place(
    place_name="Fatih, Istanbul",
    mly_api_key="YOUR_API_KEY",
    system_prompt=custom_system,
    user_prompt=custom_user,
    output_path="safety_analysis.parquet"
)
Using Different Backends
from geoai_vlm import ImageDescriber
# vLLM backend (default, fastest)
describer = ImageDescriber(
    model_name="Qwen/Qwen3-VL-2B-Instruct",
    backend="vllm",
    gpu_memory_utilization=0.8
)
# Transformers backend (fallback)
describer = ImageDescriber(
    model_name="Qwen/Qwen3-VL-2B-Instruct",
    backend="transformers",
    device="cuda"
)
# Describe images
results = describer.describe(
    image_dir="./my_images",
    output_path="descriptions.parquet",
    batch_size=8
)
Output Schema
The default GeoAI schema extracts structured urban features:
{
  "scene_narrative": "80-120 word description of the urban scene",
  "land_use_character": {"primary": "commercial", "intensity": "high"},
  "urban_morphology": {"street_type": "pedestrian", "enclosure_ratio": "high"},
  "streetscape_elements": {"sidewalk_quality": "good", "street_trees": "moderate"},
  "mobility_infrastructure": {"modes_visible": ["pedestrian", "bicycle"]},
  "place_character": {"dominant_activity": "shopping", "human_presence": "crowded"},
  "environmental_quality": {"greenery_coverage": "moderate", "cleanliness": "good"},
  "semantic_tags": ["historic", "tourist", "commercial", "pedestrian", "busy"]
}
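These fields become columns in the output GeoParquet, so you can query them directly. A sketch, assuming semantic_tags is stored as a list-like column (the exact storage format may vary between versions):

import geopandas as gpd

gdf = gpd.read_parquet("sultanahmet_descriptions.parquet")
# Keep images tagged as both "historic" and "pedestrian"
mask = gdf["semantic_tags"].apply(lambda tags: {"historic", "pedestrian"} <= set(tags))
print(gdf.loc[mask, ["image_id", "scene_narrative"]].head())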
Multimodal Embeddings
Generate dense vector representations from VLM descriptions and street-level images using Qwen3-VL-Embedding:
from geoai_vlm import ImageEmbedder
# Initialize the embedder (auto-selects vLLM or Transformers backend)
embedder = ImageEmbedder(
    model_name="Qwen/Qwen3-Embedding-0.6B",
    backend="auto"
)
# Embed text descriptions
vectors = embedder.embed_texts(["A busy commercial street with shops"])
print(vectors.shape) # (1, hidden_dim)
# Embed images directly
img_vectors = embedder.embed_images(["path/to/image.jpg"])
# Multimodal: combine text + image into a single embedding
mm_vectors = embedder.embed_multimodal(
    texts=["A quiet residential area"],
    image_paths=["path/to/image.jpg"]
)
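With embeddings in hand, semantic comparison reduces to cosine similarity. A generic numpy sketch, assuming the embed_* methods return 2-D arrays with one row per input:

import numpy as np

def cosine_sim(a, b):
    # L2-normalize rows, then take pairwise dot products
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

# Similarity between the text description and the image embedding
print(cosine_sim(vectors, img_vectors))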
Semantic Clustering
Cluster geotagged descriptions by semantic similarity and extract per-cluster keywords:
from geoai_vlm import SemanticClusterer, ClusterConfig
config = ClusterConfig(
    n_clusters=8,
    embedding_columns=["scene_narrative", "semantic_tags"],
    n_keywords=10
)
clusterer = SemanticClusterer(embedder=embedder, config=config)
# Cluster a GeoDataFrame of VLM descriptions
gdf = clusterer.cluster(gdf)
print(gdf["cluster"].value_counts())
# Find the optimal number of clusters
k_values, inertias = clusterer.find_optimal_k(gdf, k_range=range(2, 20))
# Extract TF-IDF keywords per cluster
keywords = clusterer.extract_keywords(gdf)
for cluster_id, words in keywords.items():
    print(f"Cluster {cluster_id}: {words}")
Spatial Autocorrelation
Detect whether semantic clusters are spatially random or form significant patterns:
from geoai_vlm import SpatialAnalyzer
analyzer = SpatialAnalyzer(k_neighbors=8)
# Global Moran's I — is there overall spatial clustering?
global_result = analyzer.moran_global(gdf, column="cluster")
print(f"Moran's I = {global_result.I:.3f}, p = {global_result.p_sim:.4f}")
# Local Moran's I (LISA) — where are the hot/cold spots?
gdf = analyzer.moran_local(gdf, column="cluster")
# Adds 'lisa_Is', 'lisa_q', 'lisa_p_sim' columns to the GeoDataFrame
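Under the hood this maps onto the libpysal/esda stack listed under Dependencies. A rough equivalent of the global test (a sketch; SpatialAnalyzer's exact weights and options may differ):

from libpysal.weights import KNN
from esda.moran import Moran

w = KNN.from_dataframe(gdf, k=8)   # k-nearest-neighbour spatial weights
w.transform = "r"                  # row-standardize
mi = Moran(gdf["cluster"].values, w)
print(mi.I, mi.p_sim)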
Vector Similarity Search
Build a searchable index over your geotagged descriptions and find semantically similar places:
from geoai_vlm import VectorDB
# Build an index from a GeoDataFrame
vdb = VectorDB(embedder=embedder, store_backend="chromadb")
vdb.build(
    gdf,
    text_column="scene_narrative",
    image_dir="./images",
    metadata_columns=["land_use_character", "cluster"]
)
# Search by natural language
results = vdb.search(query_text="tree-lined residential street", n_results=5)
print(results[["scene_narrative", "distance"]])
# Search by image
results = vdb.search(query_image="query_photo.jpg", n_results=5)
Visualization
from geoai_vlm import (
    plot_elbow_curve,
    plot_cluster_map,
    plot_lisa_map,
    plot_category_distribution,
    generate_report
)
# Elbow curve for choosing k
plot_elbow_curve(k_values, inertias, save_path="elbow.png")
# Map of clusters
plot_cluster_map(gdf, cluster_column="cluster", save_path="clusters.png")
# LISA significance map
plot_lisa_map(gdf, save_path="lisa.png")
# Category breakdown
plot_category_distribution(gdf, category_columns=["land_use_character"])
# Full HTML report
generate_report(gdf, output_dir="./report")
One-Line Pipeline
Run the entire workflow — download, describe, embed, cluster, analyze — in a single call:
from geoai_vlm import embed_place, cluster_descriptions, analyze_spatial
# 1. Download + embed
gdf = embed_place(
    place_name="Sultanahmet, Istanbul",
    mly_api_key="YOUR_API_KEY",
    embedding_model="Qwen/Qwen3-Embedding-0.6B"
)
# 2. Cluster
gdf = cluster_descriptions(gdf, n_clusters=8)
# 3. Spatial analysis
gdf = analyze_spatial(gdf, column="cluster", k_neighbors=8)
GeoParquet Output
Results are saved as GeoParquet with native geometry:
import geopandas as gpd
# Load results
gdf = gpd.read_parquet("results.parquet")
# Native geometry column preserved
print(gdf.geometry) # POINT geometries
print(gdf.crs) # EPSG:4326
# Easy GIS operations
gdf.to_file("results.geojson", driver="GeoJSON")
gdf.explore() # Interactive map in Jupyter
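Because the data is in a geographic CRS (EPSG:4326), reproject before metric operations such as buffering. For Istanbul, UTM zone 35N (EPSG:32635) is one suitable choice:

# Reproject to a metric CRS before computing buffers or areas
gdf_m = gdf.to_crs(epsg=32635)
print(gdf_m.buffer(50).area.head())  # 50 m buffers; areas in square metres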
Dependencies
- Core: geopandas, pandas, shapely, pyarrow, haversine
- Downloading: zensvi (Mapillary integration)
- VLM (choose one):
- vLLM + qwen-vl-utils (recommended)
- Transformers + torch + accelerate
- Embedding & Analysis: chromadb, faiss-cpu, scikit-learn, matplotlib, libpysal, esda
License
MIT License - see LICENSE for details.
Citation
If you use GeoAI-VLM in your research, please cite:
@software{geoai_vlm,
  author    = {B{\i}cak{\c{c}}{\i}, Yunus Serhat},
  title     = {GeoAI-VLM: Geospatial Vision-Language Model Analysis},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.18169685},
  url       = {https://github.com/yunusserhat/GeoAI-VLM}
}