🌍 GeoDataSim v0.3.0 - Intelligence Boost

World's most comprehensive city data platform with ML, auto-update, and visualization

Geographic + Socioeconomic + Climate intelligence library with ML clustering, auto-update engine, and interactive visualization. All data from free public APIs (World Bank, REST Countries, Open-Meteo).


🚀 What's NEW in v0.3.0 - Intelligence Boost

🤖 ML-Powered Intelligence

  • City Clustering (KMeans, DBSCAN, Agglomerative)
  • 10x Faster Similarity (numba JIT optimization)
  • Advanced Feature Engineering (sklearn integration)

📊 Interactive Visualization

  • Plotly Charts (scatter, heatmap, radar, bar, geo)
  • Export to HTML (interactive, shareable)
  • Quick visualization APIs

🔄 Auto-Update Engine

  • Monthly data refresh from World Bank API
  • 30-day cache (avoids unnecessary API calls)
  • Update history tracking
  • No API key required (100% free sources)

✅ Production-Ready Features

  • Pydantic validation (type-safe data models)
  • Progress bars (tqdm integration)
  • Enhanced geopy distance calculations
  • Comprehensive error handling
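
The validation listed above is meant to reject malformed records before they reach analysis code. GeoDataSim's actual Pydantic models are not shown in this README, so the sketch below uses a standard-library dataclass as a stand-in for a Pydantic model. The class name `CityRecord`, its fields, and its bounds are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CityRecord:
    """Hypothetical city record; not GeoDataSim's real model."""
    name: str
    latitude: float
    longitude: float
    population: int

    def __post_init__(self) -> None:
        # Reject values that cannot describe a real place.
        if not self.name.strip():
            raise ValueError("name must not be empty")
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")
        if self.population < 0:
            raise ValueError(f"population must be non-negative: {self.population}")


# Coordinates match the Istanbul example in this README;
# the population figure is a placeholder, not real data.
record = CityRecord("Istanbul", 41.0082, 28.9784, population=1_000_000)

try:
    CityRecord("Nowhere", 123.0, 0.0, population=1)
except ValueError as exc:
    error_message = str(exc)
    print(f"Rejected: {error_message}")
```

A Pydantic model would express the same bounds declaratively and also coerce input types, which a plain dataclass does not do.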

📦 Installation

pip install geodatasim

Requirements: Python 3.10+


⚡ Quick Start

Basic Usage

from geodatasim import City

# Create city with automatic data loading
istanbul = City("Istanbul")

print(f"Population: {istanbul.population:,}")
print(f"GDP per capita: ${istanbul.gdp_per_capita:,.2f}")
print(f"Climate: {istanbul.climate_zone} ({istanbul.avg_temperature}°C)")
print(f"HDI: {istanbul.hdi}")

# Find similar cities
similar = istanbul.find_similar(n=5)
for city in similar:
    print(f"  - {city.name}, {city.country}")
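
This README does not describe how `find_similar` scores cities. One common approach, sketched below under that caveat, is to min-max scale each feature and rank candidates by Euclidean distance. The city names, feature values, and helper functions are all placeholders for illustration and are not part of GeoDataSim.

```python
import math

# Placeholder feature vectors: (population in millions,
# GDP per capita in thousands of USD, average temperature in °C).
# Illustrative values only, not real statistics.
CITIES = {
    "City A": (15.0, 10.0, 14.0),
    "City B": (14.0, 12.0, 15.0),
    "City C": (2.0, 45.0, 11.0),
    "City D": (9.0, 40.0, 12.0),
}


def min_max_scale(rows):
    """Scale every feature column to [0, 1] so no single column dominates."""
    columns = list(zip(*rows.values()))
    lows = [min(col) for col in columns]
    # A constant column has zero span; fall back to 1.0 to avoid dividing by zero.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return {
        name: tuple((v - lo) / span for v, lo, span in zip(vec, lows, spans))
        for name, vec in rows.items()
    }


def find_similar(target, rows, n=3):
    """Return the n names closest to target in scaled feature space."""
    scaled = min_max_scale(rows)
    distances = [
        (math.dist(scaled[target], vec), name)
        for name, vec in scaled.items()
        if name != target
    ]
    return [name for _, name in sorted(distances)[:n]]


print(find_similar("City A", CITIES, n=2))
```

Scaling first matters because raw population and GDP values differ by orders of magnitude, and an unscaled distance would be driven almost entirely by the largest column.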

🆕 ML Clustering (v0.3.0)

from geodatasim.ml import CityClustering, cluster_cities
from geodatasim.analysis import BatchAnalyzer

# Get data
analyzer = BatchAnalyzer(["Istanbul", "Paris", "Tokyo", "New York"])
df = analyzer.to_dataframe()

# Cluster cities
clustering = CityClustering(n_clusters=3, method='kmeans')
clustering.fit(df)

print(f"Silhouette score: {clustering.silhouette_score_:.3f}")
summary = clustering.get_cluster_summary(df)
print(summary)
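
The silhouette score reported above ranges from -1 to 1, where values near 1 indicate tight, well-separated clusters. To make the number less opaque, here is a small standalone sketch that computes it by hand with Euclidean distance. It does not call GeoDataSim or scikit-learn, and the sample points are arbitrary.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (needs at least 2 clusters)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so dividing by len - 1 is correct)
        a = sum(math.dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = silhouette_score(points, [0, 0, 1, 1])  # neighbours grouped together
bad = silhouette_score(points, [0, 1, 0, 1])   # neighbours split apart
print(f"good labelling: {good:.3f}, bad labelling: {bad:.3f}")
```

With only a handful of cities, as in the four-city example above, the score is sensitive to every single assignment, so it is best read as a rough indicator.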

🆕 Interactive Visualization (v0.3.0)

from geodatasim.viz import CityVisualizer

viz = CityVisualizer()

# Scatter plot (df is the DataFrame built with BatchAnalyzer.to_dataframe() in the clustering example)
fig = viz.scatter(df, x='population', y='gdp_per_capita',
                  color='region', size='population')
fig.show()  # Interactive in browser
fig.write_html("cities.html")

# Correlation heatmap
viz.heatmap(df, columns=['population', 'gdp', 'hdi']).show()

# Radar chart comparison
viz.radar(df, metrics=['population', 'gdp', 'hdi'],
          cities=['Istanbul', 'Paris', 'Tokyo']).show()

🆕 Auto-Update Engine (v0.3.0)

from geodatasim import City
from geodatasim.core.updater import UpdateEngine, update_city

engine = UpdateEngine()

# Method 1: Update City object (easiest)
istanbul = City("Istanbul")
updated_data = engine.update(istanbul)
if 'population' in updated_data:
    print(f"Population: {updated_data['population']:,}")
if 'gdp_per_capita' in updated_data:
    print(f"GDP: ${updated_data['gdp_per_capita']:,.2f}")

# Method 2: Update specific fields
pop_result = engine.update_population(istanbul)
if pop_result:
    population, metadata = pop_result
    print(f"Updated population: {population:,}")

gdp_result = engine.update_gdp(istanbul)
if gdp_result:
    gdp, metadata = gdp_result
    print(f"Updated GDP: ${gdp:,.2f}")

# Method 3: Update with city data dictionary
city_data = {
    'name': 'Istanbul',
    'country_code': 'TUR',
    'latitude': 41.0082,
    'longitude': 28.9784
}
updated = engine.update_city_all(city_data)

# Method 4: Convenience function (update_city is already imported at the top)
updated = update_city(istanbul)

# Check if update needed (30-day interval)
needs_update = engine.should_update('Istanbul', 'population')
if needs_update:
    print("Update available!")
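
The 30-day check above can be pictured as a simple timestamp comparison. How `UpdateEngine` stores its update history is not documented in this README, so the sketch below keeps last-update times in a plain dictionary. `UPDATE_INTERVAL`, `history`, and this standalone `should_update` function are hypothetical, and the dates are arbitrary.

```python
from datetime import datetime, timedelta, timezone

UPDATE_INTERVAL = timedelta(days=30)  # refresh interval stated in this README


def should_update(history, city, field, now=None):
    """Return True if (city, field) was never fetched or is at least 30 days old."""
    now = now or datetime.now(timezone.utc)
    last = history.get((city, field))
    return last is None or now - last >= UPDATE_INTERVAL


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
history = {
    # fetched 30 days before `now`: due for a refresh
    ("Istanbul", "population"): datetime(2025, 1, 1, tzinfo=timezone.utc),
    # fetched 11 days before `now`: still fresh
    ("Istanbul", "gdp_per_capita"): datetime(2025, 1, 20, tzinfo=timezone.utc),
}

for field in ("population", "gdp_per_capita", "hdi"):
    print(field, should_update(history, "Istanbul", field, now=now))
```

Treating a missing entry as stale means a field that has never been fetched is refreshed on first use.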

📊 Features

v0.3.0 - Intelligence Boost 🆕

  • 🤖 ML Clustering (KMeans, DBSCAN, Agglomerative)
  • ⚡ 10x Faster Similarity (numba optimization)
  • 📊 Interactive Visualization (plotly)
  • 🔄 Auto-Update Engine (monthly refresh)
  • ✅ Pydantic Validation
  • 📈 Progress Bars (tqdm)

v0.2.0 - Data Science Tools

  • ✅ Batch Analysis
  • ✅ Rankings & Filtering
  • ✅ Export (CSV, Excel, JSON, Markdown)
  • ✅ pandas Integration
  • ✅ Statistical Analysis

v0.1.0 - Core Features

  • ✅ 46 cities from 36 countries
  • ✅ 20+ data fields per city
  • ✅ World Bank API integration
  • ✅ Smart caching (90-day TTL)
  • ✅ City similarity algorithm
  • ✅ Distance calculations
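
For readers curious what a city-to-city distance calculation involves, here is a standard-library sketch of the haversine formula. It treats the Earth as a sphere, so results differ slightly from ellipsoidal geodesic distances such as those geopy can compute. The Istanbul coordinates come from this README; the Paris coordinates and the function name are added for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


istanbul = (41.0082, 28.9784)
paris = (48.8566, 2.3522)
distance = haversine_km(*istanbul, *paris)
print(f"Istanbul to Paris: about {distance:.0f} km")
```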

📈 Data Sources

All data comes from free public sources:

Source         | Data                 | API Key Required
---------------|----------------------|-----------------
World Bank     | GDP, Population, HDI | ❌ No
REST Countries | Country metadata     | ❌ No
Open-Meteo     | Climate data         | ❌ No

All sources are free to access without an API key. Review each provider's license terms (for example, attribution requirements) before commercial use.
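
As an illustration of the keyless access described above, the sketch below builds a World Bank Indicators API URL and extracts the most recent non-null value from a response. The indicator codes `SP.POP.TOTL` and `NY.GDP.PCAP.CD` are real World Bank codes, but the helper names and the sample payload are hypothetical, and whether GeoDataSim issues exactly these requests is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.worldbank.org/v2"
POPULATION = "SP.POP.TOTL"         # total population
GDP_PER_CAPITA = "NY.GDP.PCAP.CD"  # GDP per capita, current US$


def indicator_url(country_code, indicator, per_page=100):
    """Build a JSON request URL for one country and one indicator."""
    query = urlencode({"format": "json", "per_page": per_page})
    return f"{BASE_URL}/country/{country_code}/indicator/{indicator}?{query}"


def latest_value(payload):
    """payload is [metadata, records]; return (year, value) for the newest non-null record."""
    records = payload[1] or []
    dated = [(int(r["date"]), r["value"]) for r in records if r["value"] is not None]
    return max(dated) if dated else None


# The shape mirrors a World Bank response; the numbers are placeholders.
sample = [
    {"page": 1, "pages": 1, "per_page": 100, "total": 3},
    [
        {"date": "2023", "value": None},
        {"date": "2022", "value": 200.0},
        {"date": "2021", "value": 100.0},
    ],
]

print(indicator_url("TUR", POPULATION))
print(latest_value(sample))
```

The newest year is often still null when a request is made, which is why the helper skips missing values instead of taking the first record.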


🎯 Use Cases

Data Science & ML

from geodatasim.ml import CityClustering

# cities_df: a DataFrame of city features, e.g. from BatchAnalyzer([...]).to_dataframe()
clustering = CityClustering(n_clusters=5)
clustering.fit(cities_df)

Urban Planning

from geodatasim import City

istanbul = City("Istanbul")
similar = istanbul.find_similar(min_population=5_000_000)

Business Intelligence

from geodatasim.analysis import CityRankings
rankings = CityRankings()
wealthy_cities = rankings.filter_cities(min_gdp=40000)
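
Conceptually, a threshold filter such as `min_gdp=40000` keeps the records that meet every bound and then orders them. The sketch below shows that idea in plain Python. It is not the `CityRankings` implementation, the record layout is an assumption, and the records themselves are placeholders.

```python
def filter_cities(cities, min_gdp=None, min_population=None):
    """Keep cities that meet every threshold given, richest first."""
    def keep(city):
        if min_gdp is not None and city["gdp_per_capita"] < min_gdp:
            return False
        if min_population is not None and city["population"] < min_population:
            return False
        return True

    return sorted(
        filter(keep, cities),
        key=lambda city: city["gdp_per_capita"],
        reverse=True,
    )


# Placeholder records, not real statistics.
cities = [
    {"name": "City B", "population": 3_000_000, "gdp_per_capita": 42_000},
    {"name": "City A", "population": 8_000_000, "gdp_per_capita": 55_000},
    {"name": "City C", "population": 12_000_000, "gdp_per_capita": 15_000},
]

wealthy = filter_cities(cities, min_gdp=40_000)
print([city["name"] for city in wealthy])
```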

Interactive Dashboards

from geodatasim.viz import CityVisualizer

# df: a DataFrame of city features, e.g. from BatchAnalyzer([...]).to_dataframe()
viz = CityVisualizer()
viz.scatter(df, 'population', 'gdp').show()

📖 Examples

# Test basic features
python test_v0_3_0.py

# Run comprehensive examples
python examples/v0_3_0_intelligence_boost.py

🛣️ Roadmap

  • v0.4.0 - Performance (Polars, UMAP, PyArrow)
  • v0.5.0 - Geo Intelligence (geopandas, folium)
  • v1.0.0 - Complete Platform (100+ cities, predictions)


📄 License

MIT License


📬 Contact

  • PyPI: pypi.org/project/geodatasim
  • GitHub: github.com/teyfikoz/GeoDataSim


GeoDataSim v0.3.0 🚀 ML · Visualization · Auto-Update · Intelligence
