Geospatial Vision-Language Model analysis for street-level imagery. Download Mapillary images by location and generate structured descriptions using VLMs.
GeoAI-VLM
GeoAI-VLM combines ZenSVI's Mapillary downloading capabilities with Vision-Language Models (VLMs) and a high-performance vLLM backend to generate structured descriptions of street-level images. It's designed for GeoAI research.
Features
- 🗺️ Geospatial Queries: Point, line, polygon, and bounding box queries with automatic buffering
- 📸 Mapillary Integration: Download street-level imagery via ZenSVI
- 🤖 VLM Analysis: Generate structured descriptions using Qwen-VL and other image-text-to-text models
- 📊 GeoParquet Output: Native geometry columns for seamless GIS integration
- 📏 Distance Calculations: Automatic distance-to-query computation using haversine
- ⚡ High Performance: vLLM backend for fast batch inference (Transformers fallback available)
- 🔄 Resume Support: Skip already-processed images for incremental workflows
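The distance calculation mentioned above relies on the haversine formula. As a minimal, standard-library sketch of that formula (not GeoAI-VLM's own code; the function name `haversine_m` and the sample coordinates are illustrative assumptions), the distance between a query point and an image location could be computed like this:

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points (hypothetical helper)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Illustrative coordinates: a query point and a nearby image position
query = (41.0082, 28.9784)
image = (41.0086, 28.9802)
d = haversine_m(*query, *image)
print(f"{d:.1f} m from the query point")
```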
Requirements & Platform Support
- Python 3.9-3.12 supported
- Windows is NOT supported due to the vLLM dependency. Please use Linux or macOS.
- CUDA-compatible GPU (recommended for VLM inference)
- Mapillary API key for downloading street-level imagery
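The platform constraints above can be checked before installing anything. The following is a small sketch using only the standard library; `check_environment` is a hypothetical helper and not part of GeoAI-VLM:

```python
import platform
import sys


def check_environment(version_info=None, system=None):
    """Return a list of problems; an empty list means the basics look fine."""
    major_minor = tuple(version_info or sys.version_info)[:2]
    system = system or platform.system()
    problems = []
    if not ((3, 9) <= major_minor <= (3, 12)):
        problems.append(
            f"Python {major_minor[0]}.{major_minor[1]} is outside the supported 3.9-3.12 range"
        )
    if system == "Windows":
        problems.append("Windows is not supported because of the vLLM dependency")
    return problems


# Report anything that does not match the requirements listed above
for problem in check_environment():
    print("WARNING:", problem)
```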
Set up using Python
Create a new Python environment
We recommend uv, a fast Python package and environment manager, for creating and managing the environment. Install uv by following its documentation, then create and activate a new environment with the following commands:
uv venv --python 3.12 --seed
source .venv/bin/activate
Installation
Option 1: Install from PyPI
uv pip install geoai-vlm
Option 2: Install from GitHub
# Clone the repository
git clone https://github.com/yunusserhat/geoai-vlm.git
cd geoai-vlm
# Install in the current environment
uv pip install .
# For development (editable mode)
uv pip install -e ".[dev]"
Verify Installation
python -c "import geoai_vlm; print('GeoAI-VLM installed successfully!')"
Quick Start
Basic Usage
from geoai_vlm import describe_place
# Describe images from a place name
results = describe_place(
place_name="Sultanahmet, Istanbul",
mly_api_key="YOUR_MAPILLARY_API_KEY",
buffer_m=100,
output_path="sultanahmet_descriptions.parquet"
)
print(results.head())
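Rather than hard-coding the API key in a script, it can be read from the environment. This is a sketch only: the variable name `MAPILLARY_API_KEY` and the helper `get_mapillary_key` are assumptions made for illustration, not names defined by GeoAI-VLM.

```python
import os


def get_mapillary_key(env_var="MAPILLARY_API_KEY"):
    """Read the Mapillary API key from an environment variable (name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Mapillary API key")
    return key


# Intended use with the call shown above:
# results = describe_place(..., mly_api_key=get_mapillary_key())
```

Keeping the key out of source files also keeps it out of version control.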
Point Query with Distance
from geoai_vlm import describe_point
# Query images near a specific coordinate
results = describe_point(
lat=41.0082,
lon=28.9784,
buffer_m=50,
mly_api_key="YOUR_API_KEY",
output_path="hagia_sophia.parquet"
)
# Results include distance_to_query_m column
print(results[['image_id', 'distance_to_query_m', 'scene_narrative']].head())
Line Query (Street/Route Analysis)
from geoai_vlm import describe_line
from shapely.geometry import LineString
# Analyze images along a street
street_line = LineString([
(28.9700, 41.0100), # Start point (lon, lat)
(28.9750, 41.0120), # Midpoint
(28.9800, 41.0080), # End point
])
results = describe_line(
geometry=street_line,
buffer_m=25,
mly_api_key="YOUR_API_KEY"
)
# Results include distance_to_line_m and distance_along_line_m
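To illustrate what `distance_to_line_m` and `distance_along_line_m` mean, here is a rough standard-library sketch. It assumes a local flat-earth (equirectangular) approximation and is not how GeoAI-VLM necessarily computes these columns; `line_distances` and `_to_local_m` are hypothetical helpers.

```python
import math

M_PER_DEG = 111_320.0  # approximate metres per degree of latitude


def _to_local_m(lon, lat, lon0, lat0):
    """Project lon/lat to metres relative to (lon0, lat0); valid for short distances only."""
    return ((lon - lon0) * M_PER_DEG * math.cos(math.radians(lat0)), (lat - lat0) * M_PER_DEG)


def line_distances(line_lonlat, point_lonlat):
    """Return (distance_to_line_m, distance_along_line_m) for a point and a polyline."""
    lon0, lat0 = line_lonlat[0]
    pts = [_to_local_m(lon, lat, lon0, lat0) for lon, lat in line_lonlat]
    px, py = _to_local_m(*point_lonlat, lon0, lat0)
    best = (math.inf, 0.0)
    walked = 0.0  # length of all segments before the current one
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len = math.hypot(dx, dy)
        if seg_len == 0:
            continue
        # Fraction along the segment of the closest point, clamped to the segment ends
        t = max(0.0, min(1.0, ((px - x1) * dx + (py - y1) * dy) / seg_len**2))
        dist = math.hypot(px - (x1 + t * dx), py - (y1 + t * dy))
        if dist < best[0]:
            best = (dist, walked + t * seg_len)
        walked += seg_len
    return best


street = [(28.9700, 41.0100), (28.9750, 41.0120), (28.9800, 41.0080)]
to_line, along_line = line_distances(street, (28.9760, 41.0115))
print(f"{to_line:.1f} m from the line, {along_line:.1f} m along it")
```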
Bounding Box Query
from geoai_vlm import describe_bbox
results = describe_bbox(
minx=28.970, miny=41.005,
maxx=28.985, maxy=41.015,
mly_api_key="YOUR_API_KEY",
model_name="Qwen/Qwen3-VL-2B-Instruct"
)
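The bounding box above is given in degrees, while buffers elsewhere are in metres. As a rough illustration of how the two relate, the sketch below derives an approximate box around a point from a metre buffer. It assumes about 111,320 m per degree of latitude and breaks down near the poles; `bbox_around_point` is a hypothetical helper, not a GeoAI-VLM function.

```python
import math


def bbox_around_point(lat, lon, buffer_m):
    """Approximate (minx, miny, maxx, maxy) in degrees for a square of half-width buffer_m."""
    m_per_deg_lat = 111_320.0
    # A degree of longitude shrinks with the cosine of the latitude
    m_per_deg_lon = m_per_deg_lat * math.cos(math.radians(lat))
    dlat = buffer_m / m_per_deg_lat
    dlon = buffer_m / m_per_deg_lon
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


minx, miny, maxx, maxy = bbox_around_point(lat=41.0082, lon=28.9784, buffer_m=50)
print(f"minx={minx:.5f}, miny={miny:.5f}, maxx={maxx:.5f}, maxy={maxy:.5f}")
```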
Custom Prompts
from geoai_vlm import describe_place
# Use custom system/user prompts
custom_system = """You are an urban safety analyst. Describe safety-relevant features."""
custom_user = """Analyze this street image for: lighting, visibility, foot traffic, escape routes."""
results = describe_place(
query="Fatih, Istanbul",
mly_api_key="YOUR_API_KEY",
system_prompt=custom_system,
user_prompt=custom_user,
output_path="safety_analysis.parquet"
)
Using Different Backends
from geoai_vlm import ImageDescriber
# vLLM backend (default, fastest)
describer = ImageDescriber(
model_name="Qwen/Qwen3-VL-2B-Instruct",
backend="vllm",
gpu_memory_utilization=0.8
)
# Transformers backend (fallback); use this instead of the vLLM describer above
describer = ImageDescriber(
model_name="Qwen/Qwen3-VL-2B-Instruct",
backend="transformers",
device="cuda"
)
# Describe images
results = describer.describe(
image_dir="./my_images",
output_path="descriptions.parquet",
batch_size=8
)
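The resume behaviour listed under Features amounts to skipping images that already have a description. One way this idea could look, as a sketch that assumes image IDs match file stems (an assumption, not confirmed GeoAI-VLM behaviour) and uses a hypothetical `pending_images` helper:

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def pending_images(image_dir, processed_ids):
    """Return image paths whose file stem is not already in processed_ids."""
    done = {str(i) for i in processed_ids}
    return sorted(
        p for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES and p.stem not in done
    )


# Demo with a throwaway directory: one image is already processed, one is not
with tempfile.TemporaryDirectory() as tmp:
    for name in ("1001.jpg", "1002.jpg", "notes.txt"):
        (Path(tmp) / name).touch()
    todo_names = [p.name for p in pending_images(tmp, processed_ids={"1001"})]

print(todo_names)  # ['1002.jpg']
```

In a real workflow, `processed_ids` would come from the image IDs already present in the existing output file.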
Output Schema
The default GeoAI schema extracts structured urban features:
{
"scene_narrative": "80-120 word description of the urban scene",
"land_use_character": {"primary": "commercial", "intensity": "high"},
"urban_morphology": {"street_type": "pedestrian", "enclosure_ratio": "high"},
"streetscape_elements": {"sidewalk_quality": "good", "street_trees": "moderate"},
"mobility_infrastructure": {"modes_visible": ["pedestrian", "bicycle"]},
"place_character": {"dominant_activity": "shopping", "human_presence": "crowded"},
"environmental_quality": {"greenery_coverage": "moderate", "cleanliness": "good"},
"semantic_tags": ["historic", "tourist", "commercial", "pedestrian", "busy"]
}
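Model output is not guaranteed to be clean JSON, so downstream code may want to check it against the schema above. The sketch below pulls the outermost JSON object out of a raw response and lists which top-level keys are missing. `parse_description` is a hypothetical helper, and the sample response is made up for illustration.

```python
import json

REQUIRED_KEYS = {
    "scene_narrative", "land_use_character", "urban_morphology",
    "streetscape_elements", "mobility_infrastructure", "place_character",
    "environmental_quality", "semantic_tags",
}


def parse_description(raw_text):
    """Parse the text between the first '{' and last '}' and report missing keys."""
    start, end = raw_text.find("{"), raw_text.rfind("}")
    if start == -1 or end <= start:
        return None, sorted(REQUIRED_KEYS)
    try:
        data = json.loads(raw_text[start:end + 1])
    except json.JSONDecodeError:
        return None, sorted(REQUIRED_KEYS)
    return data, sorted(REQUIRED_KEYS - data.keys())


# Made-up response with chatter around an incomplete JSON object
raw = 'Sure, here it is: {"scene_narrative": "A busy street.", "semantic_tags": ["busy"]} Done.'
data, missing = parse_description(raw)
print(data["semantic_tags"], "missing:", missing)
```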
GeoParquet Output
Results are saved as GeoParquet with native geometry:
import geopandas as gpd
# Load results
gdf = gpd.read_parquet("results.parquet")
# Native geometry column preserved
print(gdf.geometry) # POINT geometries
print(gdf.crs) # EPSG:4326
# Easy GIS operations
gdf.to_file("results.geojson", driver="GeoJSON")
gdf.explore() # Interactive map in Jupyter
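Many GIS formats expect flat attribute columns, while the default schema nests values. If a nested record needs to be flattened before export, one possible approach is sketched below. Whether GeoAI-VLM already stores these fields flat is not stated here, so treat `flatten` as a hypothetical helper.

```python
def flatten(record, parent="", sep="_"):
    """Flatten nested dicts into one level; lists become comma-separated strings."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        elif isinstance(value, list):
            flat[name] = ", ".join(str(v) for v in value)
        else:
            flat[name] = value
    return flat


record = {
    "land_use_character": {"primary": "commercial", "intensity": "high"},
    "semantic_tags": ["historic", "tourist"],
}
flat_record = flatten(record)
print(flat_record)
```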
Dependencies
- Core: geopandas, pandas, shapely, pyarrow, haversine
- Downloading: zensvi (Mapillary integration)
- VLM (choose one):
  - vLLM + qwen-vl-utils (recommended)
  - Transformers + torch + accelerate
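Since either VLM stack can be installed, a script may want to detect which one is available before choosing a backend. This sketch uses the standard library's `importlib.util.find_spec`; `pick_backend` is a hypothetical helper, and the returned strings simply mirror the `backend` values used earlier.

```python
import importlib.util


def pick_backend(preferred=("vllm", "transformers")):
    """Return the first importable backend package name, or None if neither is installed."""
    for name in preferred:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


backend = pick_backend()
print("Available backend:", backend)
```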
License
MIT License - see LICENSE for details.
Citation
If you use GeoAI-VLM in your research, please cite:
@software{geoai_vlm,
author = {B{\i}cak{\c{c}}{\i}, Yunus Serhat},
title = {GeoAI-VLM: Geospatial Vision-Language Model Analysis},
year = {2026},
publisher = {Zenodo},
doi = {10.5281/zenodo.18169685},
url = {https://github.com/yunusserhat/GeoAI-VLM}
}