Transform raw coordinates into actionable geospatial features – street networks, POI data, and spatial metrics – using open data with zero setup. Perfect for ML engineers and data scientists building location-based models.

Development Status
- 3 - Alpha
Intended Audience
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
Topic
- Scientific/Engineering :: GIS

Project description

GeoFeatureKit

A comprehensive Python library for extracting and analyzing urban features from OpenStreetMap data.

GeoFeatureKit empowers urban planners, researchers, and developers to analyze city infrastructure, amenities, and spatial patterns using scientifically rigorous metrics and statistical analysis.

🚀 Key Features

🏙️ Street Network Analysis

- Connectivity Metrics: Streets-to-nodes ratios, average connections per node with confidence intervals
- Pattern Analysis: Street bearing distributions, entropy measures, grid pattern detection
- Density Calculations: Street length per km², intersection density, segment distributions
- Statistical Rigor: Confidence intervals, standard deviations, robust statistical measures
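
To make the "entropy measures" item concrete, here is a minimal sketch of a bearing entropy calculation: bearings are binned and the Shannon entropy of the bin shares is taken. The function name, the 10° bin width, and the natural logarithm are assumptions made for illustration; this page does not state how GeoFeatureKit bins bearings or which log base it uses, so the numbers will not necessarily match the library's `bearing_entropy`.

```python
import math
from collections import Counter


def bearing_entropy(bearings_deg, bin_width_deg=10.0):
    """Shannon entropy (natural log) of a binned bearing distribution.

    Hypothetical helper for illustration; not part of GeoFeatureKit.
    """
    if not bearings_deg:
        return 0.0
    n_bins = int(round(360.0 / bin_width_deg))
    # Wrap each bearing into [0, 360) and assign it to a bin index
    bins = Counter(
        int((b % 360.0) // bin_width_deg) % n_bins for b in bearings_deg
    )
    total = sum(bins.values())
    # Entropy is 0 when all bearings share one bin and grows as they spread out
    return sum(-(c / total) * math.log(c / total) for c in bins.values())


# A perfect grid has only four directions, so entropy is ln(4)
grid_like = [0, 90, 180, 270] * 25
print(round(bearing_entropy(grid_like), 3))  # 1.386
```

Lower values suggest a grid-like layout (few dominant directions); higher values suggest streets running in many directions.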

📍 Points of Interest (POI) Analysis

- Comprehensive Categorization: 40+ POI categories with automatic classification
- Density Metrics: POI counts per km² with category-specific breakdowns
- Diversity Analysis: Shannon diversity index, Simpson diversity, category evenness
- Spatial Distribution: Nearest neighbor analysis, clustering patterns
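
The diversity measures named above can be sketched from per-category counts as follows. This is an illustration under stated assumptions, not the library's code: the helper name is hypothetical, Simpson diversity is taken as 1 − Σp² (the Gini-Simpson form), and evenness is taken as Pielou's H / ln(S). The evenness assumption is at least consistent with the sample output on this page, where 2.245 / ln(42) ≈ 0.601.

```python
import math


def diversity_metrics(counts_by_category):
    """Shannon, Simpson (1 - sum p^2) and Pielou evenness from category counts.

    Hypothetical helper for illustration; not part of GeoFeatureKit.
    """
    counts = [c for c in counts_by_category.values() if c > 0]
    total = sum(counts)
    if total == 0:
        return {"shannon": 0.0, "simpson": 0.0, "evenness": 0.0}
    shares = [c / total for c in counts]
    shannon = sum(-p * math.log(p) for p in shares)
    simpson = 1.0 - sum(p * p for p in shares)
    # Evenness is undefined for a single category; report 0.0 in that case
    evenness = shannon / math.log(len(counts)) if len(counts) > 1 else 0.0
    return {"shannon": shannon, "simpson": simpson, "evenness": evenness}


# Made-up counts, only to show the call
example = {"restaurant": 50, "cafe": 30, "bank": 15, "pharmacy": 5}
print(diversity_metrics(example))

# Check against the sample output: Shannon 2.245 with 42 categories
print(round(2.245 / math.log(42), 3))  # 0.601
```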

📊 Advanced Urban Metrics

- Data Quality Assessment: Completeness percentages, reliability scores
- Statistical Analysis: Confidence intervals for all major metrics
- Spatial Analysis: Area calculations, density distributions, pattern recognition
- Real-world Validation: Tested on major urban areas worldwide

📦 Installation

# Install from PyPI
pip install geofeaturekit

# Or install directly from GitHub for latest development version
pip install git+https://github.com/lihangalex/geofeaturekit.git

# For development
git clone https://github.com/lihangalex/geofeaturekit.git
cd geofeaturekit
pip install -e .

Requirements

- Python 3.9+
- NumPy, SciPy for statistical analysis
- GeoPandas, OSMnx for geospatial processing
- NetworkX for network analysis

🎯 Quick Start

from geofeaturekit import features_from_location

# Analyze any location worldwide
features = features_from_location({
    'latitude': 40.7580,   # Times Square, NYC
    'longitude': -73.9855,
    'radius_meters': 500
})

# Access comprehensive metrics
network = features['network_metrics']
pois = features['poi_metrics']

print(f"Street length: {network['basic_metrics']['total_street_length_meters']:.1f}m")
print(f"POI count: {pois['absolute_counts']['total_points_of_interest']}")
print(f"POI density: {pois['density_metrics']['points_of_interest_per_sqkm']:.1f} per km²")

🌟 Real-World Examples

Times Square Analysis

# Dense commercial district
features = features_from_location({
    'latitude': 40.7580, 'longitude': -73.9855, 'radius_meters': 500
})

# Results:
# - 777 network nodes, 2,313 street segments
# - 80.0 km of streets in 0.785 km² area
# - 1,076 POIs (1,371 per km²)
# - 42 unique POI categories
# - High connectivity: 3.59 connections per node

Central Park Analysis

# Park and recreational area
features = features_from_location({
    'latitude': 40.7829, 'longitude': -73.9654, 'radius_meters': 500  
})

# Results:
# - 356 network nodes, 1,002 street segments
# - 41.3 km of paths and streets
# - 185 POIs (236 per km²) 
# - Dominated by benches (35.7%) and recreational amenities
# - Lower but adequate connectivity: 3.26 connections per node

Grand Central District

# Transportation and business hub
features = features_from_location({
    'latitude': 40.7527, 'longitude': -73.9772, 'radius_meters': 500
})

# Results:
# - 1,002 network nodes, 2,975 street segments  
# - 91.2 km of streets (highest density)
# - 1,131 POIs (1,441 per km²)
# - Mixed commercial and transportation amenities
# - Excellent connectivity: 3.60 connections per node

📈 Comprehensive Output Structure

{
    "network_metrics": {
        "basic_metrics": {
            "total_nodes": 777,
            "total_street_segments": 2313,
            "total_intersections": 0,
            "total_dead_ends": 41,
            "total_street_length_meters": 80044.7
        },
        "density_metrics": {
            "intersections_per_sqkm": 0.0,
            "street_length_per_sqkm": 101.916091
        },
        "connectivity_metrics": {
            "streets_to_nodes_ratio": 1.488417,
            "average_connections_per_node": {
                "value": 3.589,
                "confidence_interval_95": {
                    "lower": 3.536,
                    "upper": 3.643
                }
            }
        },
        "street_pattern_metrics": {
            "street_segment_length_distribution": {
                "minimum_meters": 0.5,
                "maximum_meters": 286.6,
                "mean_meters": 34.6,
                "median_meters": 12.0,
                "std_dev_meters": 50.7
            },
            "street_bearing_distribution": {
                "mean_degrees": 163.3,
                "std_dev_degrees": 101.5
            },
            "ninety_degree_intersection_ratio": 0.0,
            "bearing_entropy": 2.056
        }
    },
    "poi_metrics": {
        "absolute_counts": {
            "total_points_of_interest": 1076,
            "counts_by_category": {
                "total_restaurant_places": {
                    "count": 173,
                    "percentage": 16.1,
                    "confidence_interval_95": {
                        "lower": 14.0,
                        "upper": 18.4
                    }
                }
                // ... 40+ categories
            }
        },
        "density_metrics": {
            "points_of_interest_per_sqkm": 1370.700637,
            "density_by_category": {
                "restaurant_places_per_sqkm": 220.382166,
                "cafe_places_per_sqkm": 94.267516
                // ... per-category densities
            }
        },
        "distribution_metrics": {
            "unique_category_count": 42,
            "diversity_metrics": {
                "shannon_diversity_index": 2.245,
                "simpson_diversity_index": 0.79,
                "category_evenness": 0.601
            },
            "spatial_distribution": {
                "pattern_interpretation": "clustered"
            }
        }
    },
    "units": {
        "area": "square_meters",
        "length": "meters", 
        "density": "per_square_kilometer"
    }
}
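
Several of the sample values above can be reproduced with plain arithmetic, which helps when interpreting the units. The sketch below recomputes them from the Times Square numbers shown on this page. The formulas are inferred from those numbers and are not confirmed by the library's documentation, so treat them as assumptions.

```python
import math

# Inputs copied from the sample output above (Times Square, 500 m radius)
radius_m = 500
total_street_length_m = 80044.7
total_nodes = 777
total_street_segments = 2313
total_pois = 1076

# Circular study area in km²: pi * r²
area_sqkm = math.pi * (radius_m / 1000) ** 2
print(round(area_sqkm, 3))  # 0.785

# Street density matches kilometers of street per km² (not meters per km²)
street_density = (total_street_length_m / 1000) / area_sqkm
print(round(street_density, 3))  # 101.916

# POI density matches the count divided by the area rounded to 0.785 km²
poi_density = total_pois / round(area_sqkm, 3)
print(round(poi_density, 6))  # 1370.700637

# Streets-to-nodes ratio matches segments / (2 * nodes), as if each
# street were counted once per direction
streets_to_nodes = total_street_segments / (2 * total_nodes)
print(round(streets_to_nodes, 6))  # 1.488417
```

If these inferences hold, `street_length_per_sqkm` is expressed in km per km² even though individual lengths are reported in meters, which is worth keeping in mind when comparing against other sources.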

🔬 Scientific Applications

Urban Planning Research

# Compare neighborhood walkability
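locations = [
    {'name': 'Downtown', 'latitude': 40.7580, 'longitude': -73.9855},
    {'name': 'Residential', 'latitude': 40.7829, 'longitude': -73.9654}
]

for loc in locations:
    # Use the same keys as Quick Start; 'name' is only a display label.
    # The 500 m radius is chosen here to match the earlier examples.
    features = features_from_location({
        'latitude': loc['latitude'],
        'longitude': loc['longitude'],
        'radius_meters': 500
    })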
    connectivity = features['network_metrics']['connectivity_metrics']
    poi_density = features['poi_metrics']['density_metrics']
    
    print(f"{loc['name']} Walkability Score:")
    print(f"  Connectivity: {connectivity['average_connections_per_node']['value']:.2f}")
    print(f"  POI Density: {poi_density['points_of_interest_per_sqkm']:.0f} per km²")

Accessibility Analysis

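# Analyze service accessibility (same input keys as Quick Start)
features = features_from_location({
    'latitude': 40.7527, 'longitude': -73.9772, 'radius_meters': 800
})

essential_services = [
    'restaurant_places_per_sqkm',
    'bank_places_per_sqkm',
    'pharmacy_places_per_sqkm'
]

# Per-category densities are nested under 'density_by_category'
# (see Comprehensive Output Structure)
by_category = features['poi_metrics']['density_metrics']['density_by_category']
for service in essential_services:
    density = by_category[service]
    print(f"{service}: {density:.1f} per km²")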
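
A direct dictionary lookup raises `KeyError` if a category is missing from the result. This page does not say whether GeoFeatureKit omits categories with no POIs or reports them as 0.0, so the sketch below assumes they may be absent and falls back to a default. The helper name is hypothetical; the key path follows the output structure shown earlier.

```python
def category_density(features, key, default=0.0):
    """Look up a per-category density, returning `default` if it is absent.

    Hypothetical helper for illustration; not part of GeoFeatureKit.
    """
    by_category = (
        features.get("poi_metrics", {})
        .get("density_metrics", {})
        .get("density_by_category", {})
    )
    return by_category.get(key, default)


# Stand-in for a features_from_location() result, trimmed to the relevant part
sample = {
    "poi_metrics": {
        "density_metrics": {
            "density_by_category": {"restaurant_places_per_sqkm": 220.382166}
        }
    }
}
print(category_density(sample, "restaurant_places_per_sqkm"))  # 220.382166
print(category_density(sample, "pharmacy_places_per_sqkm"))    # 0.0
```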

Comparative Urban Studies

# Multi-city comparison
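cities = [
    {'name': 'NYC Times Square', 'latitude': 40.7580, 'longitude': -73.9855},
    {'name': 'London Piccadilly', 'latitude': 51.5100, 'longitude': -0.1347},
    {'name': 'Tokyo Shibuya', 'latitude': 35.6598, 'longitude': 139.7006}
]

results = {}
for city in cities:
    # Same keys as Quick Start; the 500 m radius matches the earlier examples
    features = features_from_location({
        'latitude': city['latitude'],
        'longitude': city['longitude'],
        'radius_meters': 500
    })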
    results[city['name']] = {
        'street_density': features['network_metrics']['density_metrics']['street_length_per_sqkm'],
        'poi_diversity': features['poi_metrics']['distribution_metrics']['diversity_metrics']['shannon_diversity_index']
    }

🛠️ Advanced Usage

Batch Processing

import pandas as pd

# Process multiple locations
locations_df = pd.read_csv('study_locations.csv')
results = []

for _, row in locations_df.iterrows():
    try:
        features = features_from_location({
            'latitude': row['lat'],
            'longitude': row['lon'], 
            'radius_meters': row['radius']
        })
        
        results.append({
            'location_id': row['id'],
            'poi_count': features['poi_metrics']['absolute_counts']['total_points_of_interest'],
            'street_length': features['network_metrics']['basic_metrics']['total_street_length_meters'],
            'connectivity': features['network_metrics']['connectivity_metrics']['average_connections_per_node']['value']
        })
    except Exception as e:
        print(f"Error processing {row['id']}: {e}")

results_df = pd.DataFrame(results)
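
Picking fields by hand works for a few metrics. For model features it can be easier to flatten the whole nested result into one row. The sketch below is a generic dictionary flattener, not a GeoFeatureKit function; the helper name and the dotted key format are choices made for this example.

```python
def flatten_features(nested, prefix="", sep="."):
    """Flatten a nested dict into {'a.b.c': value} pairs.

    Hypothetical helper for illustration; not part of GeoFeatureKit.
    """
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the path so far
            flat.update(flatten_features(value, name, sep))
        else:
            flat[name] = value
    return flat


# Stand-in for part of a features_from_location() result
sample = {
    "network_metrics": {
        "basic_metrics": {"total_nodes": 777, "total_dead_ends": 41}
    },
    "poi_metrics": {"absolute_counts": {"total_points_of_interest": 1076}},
}
for name, value in flatten_features(sample).items():
    print(name, value)
```

Each flattened dictionary can be appended to a list and passed to `pd.DataFrame`, giving one column per metric path.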

Statistical Analysis

# Extract confidence intervals and statistical measures
features = features_from_location({'latitude': 40.7580, 'longitude': -73.9855, 'radius_meters': 500})

# Network connectivity with confidence intervals
conn = features['network_metrics']['connectivity_metrics']['average_connections_per_node']
print(f"Average connections: {conn['value']:.3f}")
print(f"95% CI: [{conn['confidence_interval_95']['lower']:.3f}, {conn['confidence_interval_95']['upper']:.3f}]")

# POI category analysis with statistical measures
categories = features['poi_metrics']['absolute_counts']['counts_by_category']
for category, data in categories.items():
    if 'confidence_interval_95' in data:
        print(f"{category}: {data['percentage']:.1f}% ± {(data['confidence_interval_95']['upper'] - data['confidence_interval_95']['lower'])/2:.1f}%")
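
This page does not state how the category confidence intervals are computed. One common method for a proportion is the Wilson score interval, sketched below as an assumption rather than as the library's method. It does reproduce the restaurant figures in the sample output (173 of 1,076 POIs, 14.0% to 18.4%).

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 for about 95%).

    Hypothetical helper for illustration; not part of GeoFeatureKit.
    """
    if n == 0:
        return (0.0, 0.0)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (center - half, center + half)


# Restaurants in the Times Square sample: 173 of 1,076 POIs
low, high = wilson_interval(173, 1076)
print(f"{low * 100:.1f}% to {high * 100:.1f}%")  # 14.0% to 18.4%
```

Note that a Wilson interval is not symmetric around the observed percentage, so the "± half-width" printed above is an approximation of the interval rather than an exact restatement of it.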

📋 Metric Standards

All metrics follow SI (International System of Units) standards:

- Length: meters (m)
- Area: square meters (m²)
- Density: per square kilometer (per km²)
- Angles: degrees (°)
- Statistical measures: Include confidence intervals where applicable

🧪 Testing & Quality

- Comprehensive test suite: Property-based testing with Hypothesis
- Real-world validation: Tested on major urban areas
- Statistical rigor: All major metrics include confidence intervals
- Error handling: Robust handling of edge cases and missing data
- Performance: Optimized for large-scale analysis

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Run tests (tox -e py310)
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open a Pull Request

📄 License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.

🙏 Acknowledgments

- OpenStreetMap: For providing the foundational geographic data
- OSMnx: For excellent OpenStreetMap network analysis tools
- GeoPandas: For robust geospatial data processing
- SciPy ecosystem: For statistical analysis capabilities

📚 Citation

If you use GeoFeatureKit in your research, please cite:

@software{geofeaturekit2024,
    title={GeoFeatureKit: Urban Feature Extraction and Analysis},
    author={Your Name},
    year={2024},
    url={https://github.com/lihangalex/geofeaturekit}
}

Ready to analyze your city? Start with pip install geofeaturekit and explore urban patterns like never before! 🏙️

Release history

- 0.6.1 (Jul 8, 2025)
- 0.6.0 (Jul 7, 2025)
- 0.5.1 (Jul 6, 2025)
- 0.5.0 (Jul 6, 2025)
- 0.4.0 (Jul 6, 2025)
- 0.2.9 (Jul 6, 2025)
- 0.2.8 (Jul 6, 2025)
- 0.2.7 (Jul 6, 2025)
- 0.2.6 (Jul 6, 2025)
- 0.2.4 (Jul 5, 2025)
- 0.2.3 (Jul 5, 2025)
- 0.2.2 (Jul 5, 2025)
- 0.2.1 (Jul 5, 2025)
- 0.2.0 (Jul 5, 2025)
- 0.1.5 (Jul 5, 2025)
- 0.1.4 (Jul 5, 2025)
- 0.1.2 (Jul 5, 2025), this version
- 0.1.1 (Jul 5, 2025)
- 0.1.0 (Jul 5, 2025)

Download files

Source Distribution
- geofeaturekit-0.1.2.tar.gz (47.5 kB), uploaded Jul 5, 2025, Source

Built Distribution
- geofeaturekit-0.1.2-py3-none-any.whl (44.6 kB), uploaded Jul 5, 2025, Python 3

File details

Details for the file geofeaturekit-0.1.2.tar.gz.

File metadata

Download URL: geofeaturekit-0.1.2.tar.gz
Upload date: Jul 5, 2025
Size: 47.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.13

File hashes

Hashes for geofeaturekit-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`a7441ac90e358544a19488019d2cb18f5388c30672f0e33638e9030fab49d117`
MD5	`3b268ba312214d0ee0ba42ba82d8061d`
BLAKE2b-256	`5f6d78f2ada62f830f800d65ef1fc7ad5d2898fc7a5323a2ff3cd889b32cd33f`


File details

Details for the file geofeaturekit-0.1.2-py3-none-any.whl.

File metadata

Download URL: geofeaturekit-0.1.2-py3-none-any.whl
Upload date: Jul 5, 2025
Size: 44.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.13

File hashes

Hashes for geofeaturekit-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`67a2e9ed00867da3a1e7cad17d4d14bb89d5f7bcb8b9dd8e21957662692d4aa2`
MD5	`50553daad7574dfc69463f05cd2594fa`
BLAKE2b-256	`ac1df6a774663455cc7b2abb75556a140466294d336a9c49e32894ae5e6c7bbf`

