A package for creating and managing datacubes for coastal resilience analysis.
Project description
VectorOps
A Python library for efficiently working with geospatial vector data using DuckDB. VectorOps simplifies querying, filtering, and manipulating GeoParquet files with support for both local and S3 storage.
Features
- Fast querying of GeoParquet files using DuckDB
- Support for both local and S3 data sources
- Spatial indexing using Hilbert curves for improved query performance
- Easy conversion between DuckDB tables and GeoDataFrames
- Partitioned writing of GeoParquet files
- SQL-based filtering and querying
Installation
pip install vectorops
Requirements
- geospatial-analysis-environment
Quick Start
Example data can be found in Notion: https://www.notion.so/GeoDataLake-1-0-1897554f2151804d9fdcefbe4ca50f26?pvs=4#19e7554f215180d19e99db64239c0976
Reading Data
from vectorops.geoduck import GeoDuck
# Local file
geoduck = GeoDuck("./data/counties.parquet")
# S3 file
geoduck = GeoDuck("s3://bucket/path/to/counties.parquet")
Querying Data
# SQL queries
result = geoduck.query("SELECT NAME FROM source WHERE state = 'CA'")
# Filter and create new view
geoduck.filter("population > 1000000", view="large_counties")
# Get as GeoDataFrame
gdf = geoduck.get_dataframe("large_counties")
Writing Data
# Write filtered data with partitioning
geoduck.write_parquet(
    "./output/counties",
    partition_by=["state"],
)
# Write GeoDataFrame directly
from vectorops.storage import save_gdf_to_parquet
save_gdf_to_parquet(gdf, "./output/counties", partition_by=["state"])
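With partition_by=["state"], the rows are typically split into a Hive-style directory tree, with one subdirectory per distinct value of the partition column. A small sketch of that layout (partition_paths and the part-N file names are illustrative helpers, not part of the VectorOps API):

```python
def partition_paths(base, rows, keys):
    """Return the Hive-style output path each row would land in,
    e.g. base/state=CA/part-0.parquet."""
    paths = []
    for i, row in enumerate(rows):
        parts = [f"{k}={row[k]}" for k in keys]  # one key=value segment per partition column
        paths.append("/".join([base, *parts, f"part-{i}.parquet"]))
    return paths

rows = [{"state": "CA"}, {"state": "NY"}]
print(partition_paths("./output/counties", rows, ["state"]))
# ['./output/counties/state=CA/part-0.parquet', './output/counties/state=NY/part-1.parquet']
```

Readers such as DuckDB and pyarrow recognize the key=value directory names and can prune partitions when a query filters on the partition column.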
S3 Configuration
For S3 access, set the following environment variables:
export AWS_ACCESS_KEY_ID="your_access_key_id"
export AWS_SECRET_ACCESS_KEY="your_secret_access_key"
export AWS_ENDPOINT_URL="your_endpoint_url"
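The same configuration can be applied from Python before constructing a GeoDuck, assuming VectorOps (via DuckDB) reads the credentials from the process environment. The values below are placeholders:

```python
import os

# Placeholder credentials; set these before creating a GeoDuck instance
# so the underlying DuckDB connection can authenticate against S3.
os.environ["AWS_ACCESS_KEY_ID"] = "your_access_key_id"
os.environ["AWS_SECRET_ACCESS_KEY"] = "your_secret_access_key"
os.environ["AWS_ENDPOINT_URL"] = "your_endpoint_url"
```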
API Reference
GeoDuck Class
The main interface for working with geospatial data. Key methods:
- __init__(path): Initialize with a path to the parquet file(s)
- filter(query, view): Filter data using a SQL WHERE clause
- query(sql): Execute a custom SQL query
- get_dataframe(view): Convert a view to a GeoDataFrame
- write_parquet(path, view, partition_by): Write data to parquet
Storage Functions
Utility functions for data storage:
- save_gdf_to_parquet(gdf, path, partition_by): Save a GeoDataFrame to parquet
- write_view_to_parquet(con, view, path, partition_by): Write a DuckDB view to parquet
Performance
VectorOps uses several optimizations for performance:
- Hilbert curve spatial indexing for improved query speed
- DuckDB for efficient SQL operations
- Lazy loading of data
- ZSTD compression for storage efficiency
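A Hilbert curve maps 2-D coordinates to a 1-D index while preserving locality, so sorting rows by that index clusters spatially close features together in the file and lets range scans touch fewer row groups. A minimal pure-Python sketch of the mapping (for illustration only; VectorOps delegates the actual indexing to DuckDB):

```python
def xy2d(n: int, x: int, y: int) -> int:
    """Map cell (x, y) on an n x n grid (n a power of two) to its
    distance along the Hilbert curve (iterative algorithm)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve orientation lines up
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d

# Neighbouring cells receive consecutive indices:
print([xy2d(2, x, y) for x, y in [(0, 0), (0, 1), (1, 1), (1, 0)]])  # [0, 1, 2, 3]
```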
Download files
Source Distribution
Built Distribution
File details
Details for the file vectorops-0.1.1.tar.gz.
File metadata
- Download URL: vectorops-0.1.1.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9a9fd1fe9b8ca7b6eb66a1c2fe6ab6e42200dc65af116197c7a5620a3f923583 |
| MD5 | 3bcc4e1d81afac5fa087c77f43d0a945 |
| BLAKE2b-256 | 5625d79383fdcf50791b6a4a7de22eaaa835cf189b89ba544dfbda82c24fb923 |
File details
Details for the file vectorops-0.1.1-py3-none-any.whl.
File metadata
- Download URL: vectorops-0.1.1-py3-none-any.whl
- Upload date:
- Size: 6.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 040d78912749abc78567c26f191448df627b46d7bb45b1ed78010c339f8646fd |
| MD5 | 02be3d67645fa0b4639eb6d12fc92784 |
| BLAKE2b-256 | 0231164486b4697a0a590c524b17f8739364d1f7d974cb4e53ed4ef93c05b619 |