
GeoFeatureKit turns latitude/longitude coordinates into geospatial features. Analyze street networks, POI diversity, and spatial patterns with optional progress tracking. No paid APIs or complex setup required.


GeoFeatureKit

Python 3.9+ PyPI version PyPI downloads Tests License: MIT

GeoFeatureKit: Instantly extract geospatial features, POI analysis, and accessibility insights from coordinates for ML, urban planning, and location intelligence.

🎯 What You Get

Input: Just latitude and longitude coordinates
Output: Comprehensive geospatial intelligence including:

  • 🚀 NEW: Multi-modal isochrone accessibility: Walk, bike, and drive accessibility analysis with custom speeds
  • 23 Comprehensive Points of Interest (POI) categories: dining, retail, education, healthcare, culture, recreation, transportation, bicycle services, public transit, water features, green infrastructure, community, financial, accommodation, services, childcare, toilets & hygiene, automotive, animal services, workspace, utilities, safety & emergency, and natural features
  • Street network metrics: connectivity, total street length, segment distributions, pattern entropy
  • Spatial intelligence: POI diversity indices (Shannon, Simpson) and clustering patterns

🚀 Use Cases

| Functionality | Example Applications | Target Users |
| --- | --- | --- |
| 🔍 Matching & Similarity | Propensity score matching, site similarity | Data scientists, causal researchers |
| 📈 Predictive Modeling | Retail sales, price models | ML engineers, analysts |
| 🚶 Accessibility Analysis | Walk/bike/drive accessibility scoring | Urban planners, mobility researchers |
| 🎯 Clustering & Segmentation | Urban typology, market segmentation | GIS analysts, city scientists |
| 📊 Exposure Analysis | Competitor density, service area analysis | Business analysts, planners |

🌟 Enhanced Features

See CHANGELOG.md for complete version history and recent updates.

🏙️ 23 Comprehensive POI Categories

📋 View all 23 POI categories

| Category | Examples | Use Cases |
| --- | --- | --- |
| 🍽️ Dining | Restaurants, cafes, bars, fast food | Food accessibility, nightlife analysis |
| 🏪 Retail | Supermarkets, malls, convenience stores | Shopping accessibility, commercial zones |
| 🎓 Education | Schools, universities, libraries | Educational accessibility, learning hubs |
| 🏥 Healthcare | Hospitals, clinics, pharmacies | Medical accessibility, health services |
| 🎭 Culture | Museums, theaters, art centers | Cultural richness, entertainment venues |
| 🏃 Recreation | Parks, gyms, sports centers | Fitness accessibility, recreational spaces |
| 🚗 Transportation | Parking, bike rental, airports | Mobility infrastructure, transport hubs |
| 🚇 Public Transit | Subway stations, bus stops, trams | Public transport accessibility |
| 🌊 Water Features | Rivers, fountains, coastlines | Natural water access, scenic features |
| 🌳 Green Infrastructure | Trees, parks, gardens, benches | Environmental quality, green spaces |
| 🏛️ Community | Community centers, places of worship | Social infrastructure, civic spaces |
| 🏦 Financial | Banks, ATMs, financial services | Banking accessibility, financial hubs |
| 🏨 Accommodation | Hotels, hostels, guest houses | Tourism infrastructure, lodging |
| 🔧 Services | Post offices, laundry, salons | Daily services, convenience access |
| ⚡ Utilities | Power stations, water treatment | Critical infrastructure, urban systems |
| 🚨 Safety & Emergency | Fire stations, police, hospitals | Emergency services, public safety |
| 🌿 Natural | Forests, beaches, nature reserves | Natural environment, biodiversity |
| 👶 Childcare | Nurseries, kindergartens, daycare | Family support, child development |
| 🚻 Toilets & Hygiene | Public toilets, showers, chemists | Essential public amenities |
| 🚗 Automotive | Gas stations, EV charging, car repair | Vehicle infrastructure & services |
| 🐕 Animal Services | Veterinarians, pet shops, shelters | Pet care & animal welfare |
| 💼 Workspace | Coworking spaces, offices | Modern work infrastructure |
| 🚴 Bicycle Services | Bike shops, repairs, rentals | Cycling infrastructure support |

🚀 Multi-Modal Isochrone Accessibility

| Mode | Default Speed | Use Cases |
| --- | --- | --- |
| 🚶 Walking | 5.0 km/h | Pedestrian accessibility, walkability analysis |
| 🚴 Biking | 15.0 km/h | Cycling infrastructure, bike-friendly areas |
| 🚗 Driving | 40.0 km/h | Car accessibility, service area analysis |

  • Network-based routing: Uses actual street networks for realistic travel times
  • Custom speed configuration: Adjust speeds for different analysis scenarios
  • Combined analysis: Compare radius-based vs time-based accessibility
  • Comprehensive metrics: POI counts, area coverage, accessibility comparisons
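
The network-based idea in the list above can be sketched on a toy graph. This is a minimal illustration under stated assumptions, not GeoFeatureKit's implementation: the `reachable_nodes` helper, the graph layout, and the node names are all hypothetical. It converts a speed into meters per minute, then runs Dijkstra's algorithm over edge lengths and keeps only nodes that fit inside the time budget.

```python
import heapq

def reachable_nodes(graph, origin, speed_kmh, max_minutes):
    """Return {node: travel_minutes} for nodes reachable within the time budget.

    graph maps node -> list of (neighbor, edge_length_m) pairs (hypothetical format).
    """
    meters_per_minute = speed_kmh * 1000.0 / 60.0
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        minutes, node = heapq.heappop(queue)
        if minutes > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, length_m in graph.get(node, []):
            candidate = minutes + length_m / meters_per_minute
            if candidate <= max_minutes and candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best

# Hypothetical toy street network: edge lengths in meters, edges listed in both directions
toy_graph = {
    "A": [("B", 400.0), ("C", 900.0)],
    "B": [("A", 400.0), ("C", 300.0)],
    "C": [("A", 900.0), ("B", 300.0), ("D", 500.0)],
    "D": [("C", 500.0)],
}

walk = reachable_nodes(toy_graph, "A", speed_kmh=5.0, max_minutes=10)
bike = reachable_nodes(toy_graph, "A", speed_kmh=15.0, max_minutes=10)
print("Walk:", sorted(walk))
print("Bike:", sorted(bike))
```

In this toy network, node D lies 1,200 m from A along streets, beyond the roughly 833 m a 10-minute walk at 5 km/h allows, so only the biking result includes it.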

🔬 Advanced Spatial Analysis

  • Nearest Neighbor Analysis: Spatial clustering patterns (clustered/random/dispersed)
  • Diversity Metrics: Shannon and Simpson indices for POI variety
  • Street Pattern Analysis: Bearing entropy, intersection ratios, connectivity metrics
  • Graceful Error Handling: Robust extraction even with limited data availability
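
For intuition about the clustered/random/dispersed labels, here is a rough sketch of a nearest-neighbor ratio in the style of the Clark-Evans statistic. It is an assumption that GeoFeatureKit's `r_statistic` follows this exact formula. The function names and the `tolerance` threshold are hypothetical, and the points are assumed to be in projected coordinates measured in meters.

```python
import math

def nearest_neighbor_r(points, area_sqm):
    """Ratio of the observed mean nearest-neighbor distance to the value
    expected for a random (Poisson) pattern of the same density."""
    n = len(points)
    if n < 2 or area_sqm <= 0:
        raise ValueError("need at least two points and a positive area")
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area_sqm)
    return observed / expected

def interpret_r(r, tolerance=0.1):
    """Map R to a label; the tolerance band is an illustrative choice."""
    if r < 1.0 - tolerance:
        return "clustered"
    if r > 1.0 + tolerance:
        return "dispersed"
    return "random"

# Four points at the corners of a 100 m square vs. four points packed into 1 m
spread_out = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
packed = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
for name, pts in [("spread_out", spread_out), ("packed", packed)]:
    r = nearest_neighbor_r(pts, area_sqm=10_000.0)
    print(f"{name}: R = {r:.2f} ({interpret_r(r)})")
```

The pairwise loop is quadratic in the number of points, which is fine for a sketch but would call for a spatial index on large POI sets.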

✨ Why GeoFeatureKit?

| Advantage | Benefit |
| --- | --- |
| ✅ Simple | Just coordinates in, structured features out |
| ✅ Powerful | Dozens of geospatial metrics in one function call |
| ✅ User-friendly | Optional progress bars and verbose modes |
| ✅ Open Data | Built entirely on OpenStreetMap (OSM) and public geospatial libraries |

🚀 Quick Start

Installation

pip install geofeaturekit

Basic Usage

from geofeaturekit import features_from_location

# Example: Analyze Times Square with progress bar
features = features_from_location({
    'latitude': 40.7580,
    'longitude': -73.9855,
    'radius_meters': 500
}, show_progress=True)

print(features)

Enhanced POI Analysis

from geofeaturekit import features_from_location

# Get comprehensive POI analysis
result = features_from_location({
    'latitude': 40.7580,  # Times Square
    'longitude': -73.9855,
    'radius_meters': 800
})

# Access enhanced POI categories
poi_counts = result['poi_metrics']['absolute_counts']['counts_by_category']

# Example: Find areas with good public transit
transit_count = poi_counts.get('total_public_transit_places', {}).get('count', 0)
print(f"Public transit stops: {transit_count}")

# Example: Analyze green infrastructure
green_count = poi_counts.get('total_green_infrastructure_places', {}).get('count', 0)
print(f"Green infrastructure: {green_count}")

# Example: Check water features
water_count = poi_counts.get('total_water_features_places', {}).get('count', 0)
print(f"Water features: {water_count}")

# Access spatial distribution analysis
spatial = result['poi_metrics']['distribution_metrics']['spatial_distribution']
print(f"Spatial pattern: {spatial['pattern_interpretation']}")
print(f"Mean distance between POIs: {spatial['mean_nearest_neighbor_distance_meters']}m")

🚀 NEW: Multi-Modal Isochrone Accessibility Analysis

from geofeaturekit import features_from_coordinate

# Multi-modal accessibility analysis
features = features_from_coordinate(
    lat=40.7580,  # Times Square
    lon=-73.9855,
    max_travel_time_min_walk=10,    # 10-minute walking isochrone
    max_travel_time_min_bike=5,     # 5-minute biking isochrone  
    max_travel_time_min_drive=15,   # 15-minute driving isochrone
    speed_config={'walk': 4.8, 'bike': 17, 'drive': 35}  # Custom speeds
)

# Access walking accessibility
walk_data = features['isochrone_features_walk']
walk_pois = walk_data['poi_metrics']['absolute_counts']['total_points_of_interest']
walk_area = walk_data['isochrone_info']['area_sqm']
print(f"Walking (10min): {walk_pois} POIs accessible in {walk_area:.0f} sqm")

# Access biking accessibility  
bike_data = features['isochrone_features_bike']
bike_pois = bike_data['poi_metrics']['absolute_counts']['total_points_of_interest']
bike_area = bike_data['isochrone_info']['area_sqm']
print(f"Biking (5min): {bike_pois} POIs accessible in {bike_area:.0f} sqm")

# Compare accessibility by transportation mode
print("Accessibility Comparison:")
for key, data in features.items():
    # Only the isochrone_features_* entries carry isochrone_info
    if not key.startswith('isochrone_features_'):
        continue
    info = data['isochrone_info']
    poi_count = data['poi_metrics']['absolute_counts']['total_points_of_interest']
    print(f"  {info['mode'].title()}: {poi_count} POIs in {info['travel_time_minutes']}min")

Combined Radius + Isochrone Analysis

from geofeaturekit import features_from_coordinate

# Analyze both circular radius and accessibility isochrones
features = features_from_coordinate(
    lat=40.7580,
    lon=-73.9855,
    radius_m=500,                   # 500m circular radius
    max_travel_time_min_walk=8,     # 8-minute walking accessibility
    max_travel_time_min_bike=4      # 4-minute biking accessibility
)

# Compare different analysis methods
radius_pois = features['radius_features']['poi_metrics']['absolute_counts']['total_points_of_interest']
walk_pois = features['isochrone_features_walk']['poi_metrics']['absolute_counts']['total_points_of_interest'] 
bike_pois = features['isochrone_features_bike']['poi_metrics']['absolute_counts']['total_points_of_interest']

print(f"Circular (500m): {radius_pois} POIs")
print(f"Walking (8min): {walk_pois} POIs") 
print(f"Biking (4min): {bike_pois} POIs")
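
To relate a radius to a time budget, it can help to convert each isochrone into the farthest distance its budget allows. The sketch below is illustrative only: the helper names are hypothetical, and the speeds are the defaults from the mode table in this README. Because travel follows streets, the real isochrone can be no larger than a circle with that radius.

```python
import math

# Default speeds from the mode table in this README (km/h)
DEFAULT_SPEEDS_KMH = {"walk": 5.0, "bike": 15.0, "drive": 40.0}

def distance_budget_m(mode, minutes, speeds_kmh=DEFAULT_SPEEDS_KMH):
    """Farthest distance along the network that the time budget allows, in meters."""
    return speeds_kmh[mode] * 1000.0 / 60.0 * minutes

def max_isochrone_area_sqm(mode, minutes, speeds_kmh=DEFAULT_SPEEDS_KMH):
    """Upper bound on isochrone area: a circle whose radius is the distance budget."""
    return math.pi * distance_budget_m(mode, minutes, speeds_kmh) ** 2

for mode, minutes in [("walk", 8), ("bike", 4)]:
    budget = distance_budget_m(mode, minutes)
    print(f"{mode} {minutes} min: up to {budget:.0f} m along streets")
```

With these defaults, an 8-minute walk covers at most about 667 m and a 4-minute bike ride about 1,000 m, so both isochrones can extend past the 500 m circle along well-connected streets while missing parts of it where the network is sparse.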

Example Output

Times Square Analysis (500m radius):

{
  "network_metrics": {
    "basic_metrics": {
      "total_nodes": 777,
      "total_street_segments": 2313,
      "total_intersections": 731,
      "total_dead_ends": 0,
      "total_street_length_meters": 80044.7
    },
    "density_metrics": {
      "intersections_per_sqkm": 930.74,
      "street_length_per_sqkm": 101.92
    },
    "connectivity_metrics": {
      "streets_to_nodes_ratio": 1.488,
      "average_connections_per_node": {
        "value": 5.954,
        "confidence_interval_95": {
          "lower": 5.837,
          "upper": 6.071
        }
      }
    },
    "street_pattern_metrics": {
      "street_segment_length_distribution": {
        "minimum_meters": 0.5,
        "maximum_meters": 286.6,
        "mean_meters": 34.6,
        "median_meters": 12.0,
        "std_dev_meters": 50.7
      },
      "street_bearing_distribution": {
        "mean_degrees": 163.3,
        "std_dev_degrees": 101.5
      },
      "ninety_degree_intersection_ratio": 0.0,
      "bearing_entropy": 2.056
    }
  },
  "poi_metrics": {
    "absolute_counts": {
      "total_points_of_interest": 1076,
      "counts_by_category": {
        "total_dining_places": {
          "count": 400,
          "percentage": 37.17
        },
        "total_transportation_places": {
          "count": 190,
          "percentage": 17.66
        },
        "total_retail_places": {
          "count": 126,
          "percentage": 11.71
        },
        "total_public_transit_places": {
          "count": 96,
          "percentage": 8.92
        },
        "total_bicycle_services_places": {
          "count": 86,
          "percentage": 7.99
        },
        "total_green_infrastructure_places": {
          "count": 37,
          "percentage": 3.44
        },
        "total_culture_places": {
          "count": 34,
          "percentage": 3.16
        },
        "total_financial_places": {
          "count": 31,
          "percentage": 2.88
        },
        "total_services_places": {
          "count": 27,
          "percentage": 2.51
        },
        "total_accommodation_places": {
          "count": 16,
          "percentage": 1.49
        },
        "total_healthcare_places": {
          "count": 13,
          "percentage": 1.21
        },
        "total_water_features_places": {
          "count": 11,
          "percentage": 1.02
        },
        "total_recreation_places": {
          "count": 6,
          "percentage": 0.56
        },
        "total_toilets_hygiene_places": {
          "count": 5,
          "percentage": 0.46
        },
        "total_workspace_places": {
          "count": 5,
          "percentage": 0.46
        },
        "total_education_places": {
          "count": 3,
          "percentage": 0.28
        },
        "total_community_places": {
          "count": 3,
          "percentage": 0.28
        }
      }
    },
    "density_metrics": {
      "points_of_interest_per_sqkm": 1370.7,
      "density_by_category": {
        "dining_places_per_sqkm": 471.1,
        "transportation_places_per_sqkm": 241.9,
        "public_transit_places_per_sqkm": 122.2,
        "green_infrastructure_places_per_sqkm": 47.1,
        "culture_places_per_sqkm": 43.3,
        "financial_places_per_sqkm": 39.5,
        "retail_places_per_sqkm": 22.9,
        "water_features_places_per_sqkm": 14.0
      }
    },
    "distribution_metrics": {
      "unique_category_count": 15,
      "largest_category": {
        "name": "dining",
        "count": 370,
        "percentage": 34.39
      },
      "diversity_metrics": {
        "shannon_diversity_index": 2.11,
        "simpson_diversity_index": 0.81,
        "category_evenness": 0.78
      },
      "spatial_distribution": {
        "mean_nearest_neighbor_distance_meters": 13.2,
        "nearest_neighbor_distance_std_meters": 9.7,
        "r_statistic": 0.978,
        "pattern_interpretation": "random"
      }
    }
  }
}

🔍 Analysis Results

| Location Characteristics | Value | Interpretation |
| --- | --- | --- |
| 🏙️ POI Density | 1,371 per km² | Ultra-dense location (rural areas: <10) |
| 🍽️ Food Scene | 400 establishments | Dining powerhouse - major food hub |
| 🚴 Bicycle Infrastructure | 86 bike facilities | Excellent cycling support services |
| 🏪 Retail Access | 126 stores | Strong shopping accessibility |
| 🚇 Public Transit | 96 stops/stations | Outstanding public transport connectivity |
| 🚻 Essential Amenities | 5 toilet facilities | Basic public amenities available |
| 💼 Workspace Options | 5 coworking spaces | Modern work infrastructure present |

| Network Intelligence | Value | Interpretation |
| --- | --- | --- |
| 🚶 Walkability | 5.95 connections/node | Very high pedestrian connectivity (2-8+ scale) |
| 🗺️ Street Pattern | 2.056 bearing entropy | Organized grid-like layout (0-4+ scale, lower = more organized) |
| 🛣️ Network Density | 101.9 km/km² | Dense street network |

| Spatial Intelligence | Value | Use Case |
| --- | --- | --- |
| 📊 Shannon Diversity | 2.245 | High POI variety (0-4+ scale) → Rich ML features |
| 📈 Simpson Diversity | 0.79 | Robust POI mix probability (0-1 scale) → Stable predictions |
| 🎯 Clustering Pattern | R = 0.978 | Random distribution (<1 clustered, ~1 random, >1 dispersed) → Uniform coverage |

Perfect for: Price prediction models, accessibility scoring, urban planning analysis

🎯 Key Features

Rich Points of Interest (POI) Analysis

  • 23 comprehensive categories: dining, retail, healthcare, education, transportation, childcare, toilets & hygiene, bicycle services, workspace, and more
  • Density metrics: POIs per square kilometer by category
  • Diversity indices:
    • Shannon diversity: Measures variety and evenness (higher = more diverse)
    • Simpson diversity: Probability two random POIs are different types
  • Spatial patterns: clustered, dispersed, or random POI distributions
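
The two diversity indices described above can be sketched from per-category counts. This is a minimal illustration under assumptions, not necessarily the library's exact formulas: it uses the natural logarithm for Shannon, the 1 - Σp² form for Simpson (the chance that two POIs drawn with replacement differ in type), and Shannon divided by ln(number of categories) for evenness. The function name and the sample counts are hypothetical.

```python
import math

def diversity_metrics(counts):
    """Compute Shannon, Simpson, and evenness from {category: count}."""
    positive = [c for c in counts.values() if c > 0]
    total = sum(positive)
    if total == 0:
        return {"shannon": 0.0, "simpson": 0.0, "evenness": 0.0}
    shares = [c / total for c in positive]
    shannon = -sum(p * math.log(p) for p in shares)
    simpson = 1.0 - sum(p * p for p in shares)
    # Evenness compares Shannon to its maximum, ln(k), for k observed categories
    evenness = shannon / math.log(len(shares)) if len(shares) > 1 else 0.0
    return {"shannon": shannon, "simpson": simpson, "evenness": evenness}

# Hypothetical counts: one perfectly even mix, one dominated by a single category
even_mix = {"dining": 10, "retail": 10, "public_transit": 10, "culture": 10}
skewed_mix = {"dining": 97, "retail": 1, "public_transit": 1, "culture": 1}
for name, counts in [("even", even_mix), ("skewed", skewed_mix)]:
    m = diversity_metrics(counts)
    print(f"{name}: shannon={m['shannon']:.3f} simpson={m['simpson']:.3f} evenness={m['evenness']:.3f}")
```

Under the same evenness definition, a Shannon index of 2.11 across 15 categories gives 2.11 / ln(15) ≈ 0.78.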

Street Network Insights

  • Connectivity: average connections per intersection
  • Total length: meters of streets within radius
  • Segment patterns: distribution of street segment lengths
  • Bearing analysis: street orientation entropy and grid patterns
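
As a rough picture of what bearing entropy measures, the sketch below bins street bearings into equal-width sectors and takes the Shannon entropy of the resulting histogram. The bin count of 36, the natural logarithm, and the function name are assumptions for illustration; GeoFeatureKit's own computation may differ.

```python
import math
from collections import Counter

def bearing_entropy(bearings_deg, bin_count=36):
    """Shannon entropy (natural log) of street bearings binned into equal sectors."""
    if not bearings_deg:
        return 0.0
    width = 360.0 / bin_count
    bins = Counter(int((b % 360.0) // width) % bin_count for b in bearings_deg)
    total = len(bearings_deg)
    return -sum((n / total) * math.log(n / total) for n in bins.values())

# A strict grid has four dominant directions; an irregular layout spreads over many
grid = [0.0, 90.0, 180.0, 270.0] * 10
irregular = [float(angle) for angle in range(0, 360, 10)]
print(f"grid: {bearing_entropy(grid):.3f}")
print(f"irregular: {bearing_entropy(irregular):.3f}")
```

With these assumptions a perfect four-direction grid scores ln(4) and a layout spread evenly over all 36 sectors scores ln(36), which is why lower values indicate a more organized street pattern.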

Progress Tracking

Control output verbosity with show_progress=True/False and progress_detail='normal'/'verbose'.

🔬 Scientific Applications

Geospatial Research:

from geofeaturekit import features_from_location

# Compare neighborhood walkability
locations = [
    {'latitude': 40.7580, 'longitude': -73.9855, 'radius_meters': 800},  # Times Square
    {'latitude': 40.7829, 'longitude': -73.9654, 'radius_meters': 800}   # Central Park
]

for loc in locations:
    features = features_from_location(loc)
    # Illustrative weighting: 40% POI density, 60% scaled street connectivity
    walkability_score = (
        features['poi_metrics']['density_metrics']['points_of_interest_per_sqkm'] * 0.4 +
        features['network_metrics']['connectivity_metrics']['average_connections_per_node']['value'] * 100 * 0.6
    )
    print(f"({loc['latitude']}, {loc['longitude']}): walkability score {walkability_score:.1f}")

Machine Learning (ML) Feature Engineering:

# Generate features for price prediction model
import pandas as pd
from geofeaturekit import features_from_location

properties = pd.read_csv('real_estate.csv')  # lat, lon, price columns
features_list = []

for _, row in properties.iterrows():
    location_features = features_from_location({
        'latitude': row['lat'],
        'longitude': row['lon'],
        'radius_meters': 1000
    }, show_progress=False)
    poi = location_features['poi_metrics']

    # Extract key features for ML; key names follow the example output above,
    # and .get() guards against categories that are absent at a location
    features_list.append({
        'restaurant_density': poi['density_metrics']['density_by_category'].get('dining_places_per_sqkm', 0.0),
        'transit_access': poi['absolute_counts']['counts_by_category'].get('total_public_transit_places', {}).get('count', 0),
        'street_connectivity': location_features['network_metrics']['connectivity_metrics']['average_connections_per_node']['value'],
        'location_diversity': poi['distribution_metrics']['diversity_metrics']['shannon_diversity_index']
    })

# Add to your ML pipeline (both frames share the default 0..n-1 index)
features_df = pd.DataFrame(features_list)
properties = pd.concat([properties, features_df], axis=1)
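
When many metrics are needed rather than a handful, one option is to flatten the nested result into a single row of numeric columns. The helper below is a hypothetical sketch, not part of GeoFeatureKit. The `flatten_features` name and the `__` separator are illustrative choices, and the sample dictionary copies a small fragment of the example output.

```python
def flatten_features(nested, prefix="", sep="__"):
    """Flatten a nested dict into {joined_key: number}, skipping non-numeric leaves."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten_features(value, name, sep))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            flat[name] = value
    return flat

# Fragment of the example output shown in this README
sample = {
    "poi_metrics": {
        "distribution_metrics": {
            "diversity_metrics": {"shannon_diversity_index": 2.11},
            "spatial_distribution": {"r_statistic": 0.978, "pattern_interpretation": "random"},
        }
    }
}

row = flatten_features(sample)
for column, value in row.items():
    print(column, value)
```

Text fields such as `pattern_interpretation` are dropped here. A real pipeline might one-hot encode them instead.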

🛠 Advanced Usage

Batch Processing

# Process multiple locations efficiently
locations = [
    {'latitude': 40.7580, 'longitude': -73.9855, 'radius_meters': 500},
    {'latitude': 40.7829, 'longitude': -73.9654, 'radius_meters': 500},
    {'latitude': 40.7527, 'longitude': -73.9772, 'radius_meters': 500}
]

results = features_from_location(locations, show_progress=True)

Command Line Interface (CLI)

# Single location analysis
geofeaturekit analyze 40.7580 -73.9855 --radius 500 --verbose

# Batch analysis from file
geofeaturekit batch-analyze locations.json --radius 1000 --output results/

Custom Radius Analysis

# Compare different scales
radii = [200, 500, 1000, 2000]  # meters

for radius in radii:
    features = features_from_location({
        'latitude': 40.7580,
        'longitude': -73.9855, 
        'radius_meters': radius
    })
    
    poi_count = features['poi_metrics']['absolute_counts']['total_points_of_interest']
    print(f"{radius}m radius: {poi_count} POIs")
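
Raw counts grow with the search area, so comparing scales is usually easier after normalizing by area. The sketch below assumes a plain circular zone of area πr². The helper names and the counts are hypothetical and are not GeoFeatureKit output.

```python
import math

def circle_area_sqkm(radius_m):
    """Area of a circular analysis zone in square kilometers."""
    return math.pi * radius_m ** 2 / 1_000_000.0

def density_per_sqkm(count, radius_m):
    """POIs per square kilometer for a count observed inside a circular radius."""
    return count / circle_area_sqkm(radius_m)

# Hypothetical counts per radius, for illustration only
example_counts = {200: 40, 500: 200, 1000: 600}
for radius_m, count in example_counts.items():
    density = density_per_sqkm(count, radius_m)
    print(f"{radius_m}m radius: {count} POIs, {density:.1f} per sq km")
```

A density that falls as the radius grows suggests the location sits in a local hotspot, while a flat density suggests a uniformly busy area.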

📊 Output Structure

GeoFeatureKit returns a comprehensive dictionary with:

  • network_metrics: Street connectivity, density, and patterns
  • poi_metrics: POI counts, density, and diversity analysis
  • units: Standardized International System of Units (SI) measurements

See full JSON structure in the example output section.
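
Because the result is deeply nested and some categories may be missing for a given location, a small lookup helper can replace long chains of `.get()` calls. The `dig` function below is a hypothetical convenience, not part of GeoFeatureKit, and the sample dictionary reuses values from the example output.

```python
def dig(data, *keys, default=None):
    """Follow a path of keys through nested dicts, returning default if any is missing."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

# Fragment shaped like the example output in this README
result = {
    "poi_metrics": {
        "absolute_counts": {
            "total_points_of_interest": 1076,
            "counts_by_category": {
                "total_dining_places": {"count": 400, "percentage": 37.17}
            },
        }
    }
}

counts_path = ("poi_metrics", "absolute_counts", "counts_by_category")
dining = dig(result, *counts_path, "total_dining_places", "count", default=0)
childcare = dig(result, *counts_path, "total_childcare_places", "count", default=0)
print(f"Dining: {dining}, childcare: {childcare}")
```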

Standards & Quality

  • International System of Units (SI): All measurements in meters, square kilometers
  • Confidence Intervals: Statistical uncertainty for network metrics
  • Reproducible: Deterministic results with caching
  • Validated: Comprehensive test suite with property-based testing
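
To show how an interval like `confidence_interval_95` can arise, here is a sketch using the normal approximation, mean ± 1.96 × s / √n. Whether GeoFeatureKit uses this method is an assumption. The function name and the node-degree sample are hypothetical.

```python
import math
import statistics

def mean_with_ci95(values):
    """Mean and normal-approximation 95% confidence interval for a sample."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(n)
    return {
        "value": mean,
        "confidence_interval_95": {"lower": mean - half_width, "upper": mean + half_width},
    }

# Hypothetical connections-per-node sample
node_degrees = [2, 4, 4, 4, 6]
summary = mean_with_ci95(node_degrees)
ci = summary["confidence_interval_95"]
print(f"mean={summary['value']:.3f}, 95% CI=({ci['lower']:.3f}, {ci['upper']:.3f})")
```

The interval narrows as the number of nodes grows, so a dense urban network yields a tighter estimate than a sparse one.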

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

🚀 Automated Releases

GeoFeatureKit uses automated releases via GitHub Actions. Every time a version tag is pushed, the package is automatically:

  • ✅ Tested on Python 3.9, 3.10, 3.11, and 3.12
  • ✅ Built with proper validation
  • ✅ Published to Python Package Index (PyPI)
  • ✅ Released on GitHub with auto-generated notes

For maintainers: Use ./release.sh <version> to automate the entire release process.

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

Built with OSMnx, NetworkX, and GeoPandas. Data from OpenStreetMap (OSM) contributors.

📚 Citation

If you use GeoFeatureKit in your research, please cite:

@software{geofeaturekit2025,
    title={GeoFeatureKit: Geospatial Feature Extraction and Analysis},
    author={Alexander Li},
    year={2025},
    url={https://github.com/lihangalex/geofeaturekit}
}

Ready to analyze any location? Start with pip install geofeaturekit and explore geospatial patterns like never before!
