
A package for creating and managing datacubes for coastal resilience analysis.

Project description

VectorOps

A Python library for efficiently working with geospatial vector data using DuckDB. VectorOps simplifies querying, filtering, and manipulating GeoParquet files with support for both local and S3 storage.

Features

  • Fast querying of GeoParquet files using DuckDB
  • Support for both local and S3 data sources
  • Spatial indexing using Hilbert curves for improved query performance
  • Easy conversion between DuckDB tables and GeoDataFrames
  • Partitioned writing of GeoParquet files
  • SQL-based filtering and querying

Installation

pip install vectorops

Requirements

  • geospatial-analysis-environment

Quick Start

Example data can be found in Notion: https://www.notion.so/GeoDataLake-1-0-1897554f2151804d9fdcefbe4ca50f26?pvs=4#19e7554f215180d19e99db64239c0976

Reading Data

from vectorops.geoduck import GeoDuck
# Local file
geoduck = GeoDuck("./data/counties.parquet")

# S3 file
geoduck = GeoDuck("s3://bucket/path/to/counties.parquet")

Querying Data

# SQL queries
result = geoduck.query("SELECT NAME FROM source WHERE state = 'CA'")

# Filter and create new view
geoduck.filter("population > 1000000", view="large_counties")

# Get as GeoDataFrame
gdf = geoduck.get_dataframe("large_counties")

Writing Data

# Write filtered data with partitioning
geoduck.write_parquet(
    "./output/counties",
    partition_by=["state"]
)

# Write GeoDataFrame directly
from vectorops.storage import save_gdf_to_parquet
save_gdf_to_parquet(gdf, "./output/counties", partition_by=["state"])

S3 Configuration

For S3 access, set the following environment variables:

export AWS_ACCESS_KEY_ID="your_access_key_id"
export AWS_SECRET_ACCESS_KEY="your_secret_access_key"
export AWS_ENDPOINT_URL="your_endpoint_url"
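
The same variables can also be set from Python before the connection is created. A minimal sketch, assuming the standard-library os.environ mechanism (the variable names are exactly those listed above; the values are placeholders):

import os

# Configure S3 credentials before opening any connection (placeholder values)
os.environ["AWS_ACCESS_KEY_ID"] = "your_access_key_id"
os.environ["AWS_SECRET_ACCESS_KEY"] = "your_secret_access_key"
os.environ["AWS_ENDPOINT_URL"] = "your_endpoint_url"

from vectorops.geoduck import GeoDuck

geoduck = GeoDuck("s3://bucket/path/to/counties.parquet")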

API Reference

GeoDuck Class

The main interface for working with geospatial data. Key methods (a short end-to-end sketch follows the list):

  • __init__(path): Initialize with path to parquet file(s)
  • filter(query, view): Filter data using SQL WHERE clause
  • query(sql): Execute custom SQL query
  • get_dataframe(view): Convert view to GeoDataFrame
  • write_parquet(path, view, partition_by): Write data to parquet
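
Taken together, these methods cover a typical filter-inspect-write workflow. A minimal sketch using the signatures above, with placeholder paths, column names, and view names:

from vectorops.geoduck import GeoDuck

# Open a GeoParquet source (local path used as a placeholder)
geoduck = GeoDuck("./data/counties.parquet")

# Keep only large counties in a named view, then materialize it as a GeoDataFrame
geoduck.filter("population > 1000000", view="large_counties")
gdf = geoduck.get_dataframe("large_counties")

# Run an ad-hoc SQL query against the source view
result = geoduck.query("SELECT NAME FROM source WHERE state = 'CA'")

# Persist the filtered view, partitioned by state
geoduck.write_parquet("./output/large_counties", view="large_counties", partition_by=["state"])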

Storage Functions

Utility functions for data storage (see the sketch after this list):

  • save_gdf_to_parquet(gdf, path, partition_by): Save GeoDataFrame to parquet
  • write_view_to_parquet(con, view, path, partition_by): Write DuckDB view to parquet
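
For example, a minimal sketch built on the signatures above: here `gdf` is assumed to be a GeoDataFrame such as the one produced by get_dataframe, and `con` is assumed to be the underlying DuckDB connection holding a previously created "large_counties" view.

from vectorops.storage import save_gdf_to_parquet, write_view_to_parquet

# Save a GeoDataFrame to partitioned GeoParquet
save_gdf_to_parquet(gdf, "./output/counties", partition_by=["state"])

# Write an existing DuckDB view straight to parquet without materializing it in pandas
write_view_to_parquet(con, "large_counties", "./output/large_counties", partition_by=["state"])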

Performance

VectorOps uses several optimizations for performance (a rough DuckDB illustration follows the list):

  • Hilbert curve spatial indexing for improved query speed
  • DuckDB for efficient SQL operations
  • Lazy loading of data
  • ZSTD compression for storage efficiency
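
These optimizations live inside VectorOps, but the Hilbert ordering and ZSTD compression are easy to illustrate in plain DuckDB. The sketch below is not VectorOps' internal code: it assumes a recent DuckDB spatial extension (for ST_Hilbert, ST_XMin/ST_YMin/ST_XMax/ST_YMax, and the BOX_2D struct cast) and uses a tiny in-memory table with placeholder geometries.

import duckdb

con = duckdb.connect()
con.execute("INSTALL spatial; LOAD spatial;")

# Tiny stand-in table with a GEOMETRY column (placeholder data)
con.execute("""
    CREATE TABLE counties AS
    SELECT * FROM (VALUES
        (1, ST_Point(0.10, 0.20)),
        (2, ST_Point(5.00, 5.10)),
        (3, ST_Point(0.15, 0.25))
    ) AS t(id, geometry);
""")

# Sort features along a Hilbert curve so spatially nearby geometries land in the
# same row groups, then write the result with ZSTD compression (illustration only)
con.execute("""
    COPY (
        WITH b AS (
            SELECT MIN(ST_XMin(geometry)) AS min_x, MIN(ST_YMin(geometry)) AS min_y,
                   MAX(ST_XMax(geometry)) AS max_x, MAX(ST_YMax(geometry)) AS max_y
            FROM counties
        )
        SELECT c.*
        FROM counties c, b
        ORDER BY ST_Hilbert(c.geometry,
                            {'min_x': b.min_x, 'min_y': b.min_y,
                             'max_x': b.max_x, 'max_y': b.max_y}::BOX_2D)
    )
    TO 'counties_hilbert.parquet'
    (FORMAT PARQUET, COMPRESSION ZSTD);
""")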

Project details


Download files

Download the file for your platform.

Source Distribution

vectorops-0.1.1.tar.gz (5.6 kB)

Uploaded Source

Built Distribution


vectorops-0.1.1-py3-none-any.whl (6.7 kB)

Uploaded Python 3

File details

Details for the file vectorops-0.1.1.tar.gz.

File metadata

  • Download URL: vectorops-0.1.1.tar.gz
  • Upload date:
  • Size: 5.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.1

File hashes

Hashes for vectorops-0.1.1.tar.gz:

  • SHA256: 9a9fd1fe9b8ca7b6eb66a1c2fe6ab6e42200dc65af116197c7a5620a3f923583
  • MD5: 3bcc4e1d81afac5fa087c77f43d0a945
  • BLAKE2b-256: 5625d79383fdcf50791b6a4a7de22eaaa835cf189b89ba544dfbda82c24fb923


File details

Details for the file vectorops-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: vectorops-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 6.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.1

File hashes

Hashes for vectorops-0.1.1-py3-none-any.whl:

  • SHA256: 040d78912749abc78567c26f191448df627b46d7bb45b1ed78010c339f8646fd
  • MD5: 02be3d67645fa0b4639eb6d12fc92784
  • BLAKE2b-256: 0231164486b4697a0a590c524b17f8739364d1f7d974cb4e53ed4ef93c05b619

