Open AIS data platform for Python
🔱 Neptune AIS
Download, normalize, fuse, and analyze vessel tracking data from multiple open-source AIS archives.
One interface. Many sources. Clean output.
Installation • Quick Start • Data Sources • Features • CLI • Docs
What is Neptune?
Neptune is a Python library that gives you a single, unified interface to download, normalize, and analyze AIS (Automatic Identification System) vessel tracking data from multiple open-source archives.
AIS data powers maritime domain awareness: vessel tracking, trade analytics, environmental monitoring, fishing surveillance, and port operations. But working with it is painful, because every provider uses a different format, schema, delivery mechanism, and quality profile. Neptune handles all of that so you can focus on analysis.
from neptune_ais import Neptune
n = Neptune("2024-06-15", sources=["noaa"])
n.download()
positions = n.positions()  # Polars LazyFrame: normalized, QC'd
result = n.sql("SELECT mmsi, count(*) as n FROM positions GROUP BY mmsi ORDER BY n DESC LIMIT 5")
Think of Neptune as Herbie for maritime data: a clean data-access layer that handles the messy plumbing of fetching, normalizing, and cataloging data from heterogeneous archives, so you get reproducible, analysis-ready output every time.
Key Features
- Multi-source ingestion: Download from NOAA, DMA, Global Fishing Watch, Finland, AISHub, and AISStream through one API
- Automatic normalization: Every source is normalized to a canonical schema with QC scoring and provenance tracking
- Multi-source fusion: Merge overlapping sources with configurable dedup strategies (`best`, `union`, `prefer:<source>`)
- Polars-native: Query positions, vessels, tracks, and events as lazy DataFrames with full predicate pushdown
- SQL via DuckDB: Run SQL queries directly over your cataloged data
- Event detection: Derive port calls, EEZ crossings, vessel encounters, and loitering from raw positions
- Real-time streaming: Connect to live AIS feeds with backpressure, checkpointing, and durable sinks
- Interactive maps: Visualize positions, tracks, and events with lonboard
- Plugin system: Add custom source adapters via Python entry points
- CLI included: `neptune download`, `neptune inventory`, `neptune sql`, and more
Installation
Neptune's core is lightweight: only Polars, Pydantic, and httpx are required. Everything else is opt-in.
# Core (Polars + Pydantic + httpx)
pip install neptune-ais
# With SQL support (DuckDB)
pip install neptune-ais[sql]
# With spatial & visualization (GeoDataFrames, lonboard, H3)
pip install neptune-ais[geo]
# With real-time streaming (WebSocket feeds)
pip install neptune-ais[stream]
# With the CLI (Click + Rich)
pip install neptune-ais[cli]
# Everything
pip install neptune-ais[all]
Requirements: Python 3.10+
Optional dependency groups explained
| Extra | Adds | Used by |
|---|---|---|
| `sql` | duckdb | Neptune.sql(), Neptune.duckdb(), DuckDBSink |
| `parquet` | pyarrow | Full Parquet write options (compression, statistics) |
| `geo` | shapely, geopandas, movingpandas, lonboard, h3 | Boundary lookups, GeoDataFrame bridges, maps |
| `stream` | websockets | NeptuneStream, live AIS feeds |
| `cli` | click, rich | neptune console commands |
| `notebooks` | jupyter, ipykernel | Interactive notebook examples |
| `dev` | pytest, mypy, ruff, coverage, nbstripout | Development and testing |
| `all` | All of the above (except dev) | Full-featured install |
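Because most functionality is opt-in, it can help to check which extras are usable in the current environment before calling into them. The following is a minimal sketch using only the standard library; the `EXTRAS` mapping is copied from the table above, and `missing_modules` and `available_extras` are hypothetical helper names, not part of Neptune's API.

```python
from importlib.util import find_spec

# Extra name -> importable top-level modules, as listed in the table above.
EXTRAS = {
    "sql": ["duckdb"],
    "parquet": ["pyarrow"],
    "geo": ["shapely", "geopandas", "movingpandas", "lonboard", "h3"],
    "stream": ["websockets"],
    "cli": ["click", "rich"],
}


def missing_modules(extra: str) -> list[str]:
    """Return the modules of an extra that cannot be found on the import path."""
    return [name for name in EXTRAS[extra] if find_spec(name) is None]


def available_extras() -> dict[str, bool]:
    """Map each extra to True when all of its modules are importable."""
    return {extra: not missing_modules(extra) for extra in EXTRAS}
```

Calling `missing_modules("geo")` lists what still needs to be installed, for example before attempting a map or boundary lookup.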
Quick Start
Download and query AIS data
from neptune_ais import Neptune
# Download a day of NOAA AIS data
# Neptune handles: fetch → normalize → QC → partition → catalog
n = Neptune("2024-06-15", sources=["noaa"])
n.download()
# Query as a Polars LazyFrame
positions = n.positions()
df = positions.collect()
print(f"{len(df):,} position reports from {df['mmsi'].n_unique():,} vessels")
# SQL queries via DuckDB
top_vessels = n.sql("""
SELECT mmsi, count(*) as n
FROM positions
GROUP BY mmsi
ORDER BY n DESC
LIMIT 10
""")
Common operations with helpers
from neptune_ais.helpers import latest_positions, snapshot, vessel_history
# Most recent position per vessel
latest = latest_positions(positions)
# Point-in-time snapshot: where was every vessel at noon?
noon = snapshot(positions, when="2024-06-15T12:00:00")
# Full history for a single vessel
history = vessel_history(367000001, positions=positions)
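The helpers' internals are not shown here. As a rough illustration of the semantics their names suggest, the sketch below works on plain dictionaries instead of a Polars LazyFrame. It assumes "latest" means the row with the greatest timestamp per `mmsi`, and that a snapshot is the last report at or before the requested time. The function names and the `timestamp` field are assumptions made for this example.

```python
from datetime import datetime


def latest_per_vessel(rows):
    """Keep, for each mmsi, the row with the greatest timestamp."""
    latest = {}
    for row in rows:
        seen = latest.get(row["mmsi"])
        if seen is None or row["timestamp"] > seen["timestamp"]:
            latest[row["mmsi"]] = row
    return latest


def snapshot_at(rows, when):
    """Last known row per vessel at or before `when`."""
    return latest_per_vessel(r for r in rows if r["timestamp"] <= when)


rows = [
    {"mmsi": 367000001, "timestamp": datetime(2024, 6, 15, 11, 50), "lat": 40.60, "lon": -74.05},
    {"mmsi": 367000001, "timestamp": datetime(2024, 6, 15, 12, 10), "lat": 40.62, "lon": -74.04},
    {"mmsi": 367000002, "timestamp": datetime(2024, 6, 15, 12, 5), "lat": 40.70, "lon": -74.01},
]

latest = latest_per_vessel(rows)
noon = snapshot_at(rows, datetime(2024, 6, 15, 12, 0))
```

In this toy data only the first vessel has reported by noon, so the snapshot holds a single entry while `latest` holds both vessels.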
Multi-source fusion
# Combine NOAA and DMA with automatic deduplication
n = Neptune(
    ("2024-06-15", "2024-06-16"),
    sources=["noaa", "dma"],
    merge="best",  # "best" | "union" | "prefer:noaa"
)
n.download()
fused = n.positions() # Deduplicated across sources
Event detection
# Derive maritime events from position data
events = n.events(kind="port_call", min_confidence=0.7)
# Event types: port_call, eez_crossing, encounter, loitering
# Each event includes confidence scores and full provenance
Real-time streaming
import asyncio
from neptune_ais.stream import NeptuneStream, StreamConfig
from neptune_ais.sinks import ParquetSink, promote_landing
config = StreamConfig(
    source="aisstream",
    api_key="YOUR_KEY",
    bbox=(-74.5, 40.0, -73.5, 41.0),  # New York harbor
)

async def ingest():
    sink = ParquetSink("/tmp/neptune_landing", source="aisstream")
    async with NeptuneStream(config=config) as stream:
        await stream.run_sink(sink, max_messages=10_000)
    # Promote to canonical storage
    promote_landing("/tmp/neptune_landing", store_root="~/.neptune", source="aisstream")

asyncio.run(ingest())
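The streaming path is described as applying backpressure between the feed and the sink. How NeptuneStream does this is not documented here; one common way to get that behavior in asyncio is a bounded queue, sketched below with a simulated feed. All names (`producer`, `consumer`, `run`) and the batch and queue sizes are illustrative assumptions.

```python
import asyncio


async def producer(queue: asyncio.Queue, n_messages: int) -> None:
    """Simulated feed: put() suspends whenever the bounded queue is full."""
    for seq in range(n_messages):
        await queue.put({"seq": seq})
    await queue.put(None)  # sentinel: no more messages


async def consumer(queue: asyncio.Queue, batch_size: int) -> list[list[dict]]:
    """Simulated sink: drain the queue and group messages into batches."""
    batches, batch = [], []
    while (msg := await queue.get()) is not None:
        batch.append(msg)
        if len(batch) >= batch_size:
            batches.append(batch)
            batch = []
    if batch:
        batches.append(batch)  # flush the final partial batch
    return batches


async def run(n_messages: int = 25, queue_size: int = 4, batch_size: int = 10):
    queue = asyncio.Queue(maxsize=queue_size)
    _, batches = await asyncio.gather(
        producer(queue, n_messages), consumer(queue, batch_size)
    )
    return batches


batches = asyncio.run(run())
```

With a queue of size 4, the producer can never run more than four messages ahead of the consumer, which bounds memory use when the sink is slower than the feed.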
Data Sources
Neptune includes adapters for six open AIS data providers, with a plugin system for adding more.
| Source | Provider | Coverage | Delivery | Auth | Backfill |
|---|---|---|---|---|---|
| `noaa` | NOAA AIS Archive | US waters, global ATON | Daily files | None | Yes |
| `dma` | Danish Maritime Authority | European waters | Daily files | None | Yes |
| `gfw` | Global Fishing Watch | Global (satellite AIS) | Daily files | API key | Yes |
| `finland` | Digitraffic Finland | Finnish waters | Epoch-based | None | Yes |
| `aishub` | AISHub | Global (variable quality) | Multiple feeds | API key | Yes |
| `aisstream` | AISStream | Global (real-time) | WebSocket | API key | No (live only) |
Discover sources programmatically
from neptune_ais import sources
sources.load_all_adapters()
# List all sources
for s in sources.catalog():
    print(f"{s.source_id:<12} {s.provider:<30} auth={s.auth_scheme or 'none'}")
# Find open-data sources with backfill
for s in sources.find_sources(backfill=True, auth=False):
    print(s.source_id)
Add a custom source via plugin
External packages register adapters through Python entry points:
# In your plugin's pyproject.toml
[project.entry-points."neptune_ais.adapters"]
my_source = "my_package.adapter:MyAdapter"
Features
Architecture
Neptune is organized around a canonical dataset family and a three-layer local store:
                   ┌─────────────────┐
                   │    Your Code    │
                   │  Polars / SQL   │
                   └────────┬────────┘
                            │
                 ┌──────────▼──────────┐
                 │    Neptune API      │
                 │    .positions()     │
                 │    .tracks()        │
                 │    .events()        │
                 │    .sql()           │
                 └──┬───────────────┬──┘
  ┌─────────────────▼─┐         ┌───▼───────────────────┐
  │ Archival Path     │         │ Streaming Path        │
  │ fetch → norm      │         │ NeptuneStream         │
  │ → QC → store      │         │ → sink → promote      │
  └─────────┬─────────┘         └───────────┬───────────┘
┌───────────▼───────────────────────────────▼─────────────┐
│                    Three-Layer Store                    │
│  raw/ (source payloads) → canonical/ (normalized)       │
│  → derived/ (cached products)                           │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│                   Catalog & Manifests                   │
│   partition tracking · schema versions · QC summaries   │
│   staleness detection · atomic writes (stage → commit)  │
└─────────────────────────────────────────────────────────┘
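The catalog layer mentions atomic writes with a stage and commit step. Neptune's actual implementation is not shown; a common way to get this behavior on a local filesystem is to write to a temporary file in the target directory and then rename it into place, as in this sketch. The function name `atomic_write_json` and the `.staged` suffix are assumptions made for illustration.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path, payload) -> None:
    """Stage the payload in a temp file, then commit it with an atomic rename."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Stage: the temp file lives in the same directory so the rename stays
    # on one filesystem.
    fd, staged = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".staged"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh, sort_keys=True)
            fh.flush()
            os.fsync(fh.fileno())
        # Commit: readers see either the old file or the new one, never a partial write.
        os.replace(staged, path)
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(staged)
        raise
```

If the process dies during staging, the previous manifest is left untouched and only a stray `.staged` file remains to clean up.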
Canonical Datasets
| Dataset | Description | Schema |
|---|---|---|
| positions | Timestamped AIS point observations (mmsi, lat, lon, sog, cog, ...) | positions/v1 |
| vessels | Vessel identity and reference data (slowly changing dimensions) | vessels/v1 |
| tracks | Derived trip/trajectory segments | tracks/v1 |
| events | Maritime events (port calls, EEZ crossings, encounters, loitering) | events/v1 |
Quality Control
Every ingested record passes through row-level and partition-level QC checks:
- Data type validation, range checks, sentinel detection
- Confidence scoring in three tiers: HIGH (>= 0.7), MEDIUM (>= 0.3 and < 0.7), LOW (< 0.3)
- Per-adapter QC rule injection for source-specific quirks
- Full provenance tracking from source through fusion
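The tier thresholds listed above translate directly into a small classifier. This is a sketch of the thresholds only; `confidence_tier` is a hypothetical name, and how Neptune computes the underlying score is not covered here.

```python
def confidence_tier(score: float) -> str:
    """Map a confidence score to the tier names used in the list above."""
    if score >= 0.7:
        return "HIGH"
    if score >= 0.3:
        return "MEDIUM"
    return "LOW"
```

A filter such as `min_confidence=0.7` would then correspond to keeping only HIGH-tier records under these thresholds.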
Fusion Modes
When querying across multiple sources, Neptune supports three merge strategies:
| Mode | Behavior |
|---|---|
| `best` | Deduplicate with configurable field-level precedence |
| `union` | Keep all records from all sources, tag provenance |
| `prefer:<source>` | Deterministic source preference (e.g., `prefer:noaa`) |
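To make the three modes concrete, here is a simplified sketch over plain dictionaries. It assumes duplicates are records sharing the exact same `(mmsi, timestamp)` key, which is cruder than any real matching logic, and it reads "field-level precedence" as a per-field ordering of sources. The names `fuse`, `merge_fields`, `field_precedence`, and `default_order` are assumptions for this example, not Neptune's configuration API.

```python
from collections import defaultdict


def merge_fields(group, field_precedence, default_order):
    """Build one record from duplicates, choosing each field by source precedence."""
    merged = {}
    fields = sorted({name for rec in group for name in rec} - {"source"})
    for name in fields:
        order = field_precedence.get(name, default_order)
        rank = {source: i for i, source in enumerate(order)}
        ranked = sorted(group, key=lambda rec: rank.get(rec["source"], len(rank)))
        # First non-null value in precedence order wins.
        merged[name] = next((rec[name] for rec in ranked if rec.get(name) is not None), None)
    merged["sources"] = sorted({rec["source"] for rec in group})
    return merged


def fuse(records, merge="best", field_precedence=None, default_order=()):
    if merge == "union":
        return [dict(rec) for rec in records]  # keep everything, source tag intact
    if merge != "best" and not merge.startswith("prefer:"):
        raise ValueError(f"unknown merge mode: {merge!r}")
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["mmsi"], rec["timestamp"])].append(rec)
    fused = []
    for group in groups.values():
        if merge == "best":
            fused.append(merge_fields(group, field_precedence or {}, default_order))
        else:
            preferred = merge.split(":", 1)[1]
            # Fall back to the first record when the preferred source is absent.
            fused.append(next((r for r in group if r["source"] == preferred), group[0]))
    return fused


records = [
    {"mmsi": 1, "timestamp": "2024-06-15T12:00:00", "lat": 55.0, "sog": None, "source": "noaa"},
    {"mmsi": 1, "timestamp": "2024-06-15T12:00:00", "lat": 55.0001, "sog": 9.5, "source": "dma"},
    {"mmsi": 2, "timestamp": "2024-06-15T12:00:00", "lat": 40.0, "sog": 0.1, "source": "noaa"},
]
best = fuse(records, "best", default_order=("noaa", "dma"))
```

In the sample data, `best` takes `lat` from NOAA but fills the missing `sog` from DMA, which is the practical difference from `prefer:noaa`, where the whole NOAA record wins.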
Event Detection
Neptune derives four maritime event families from position data using heuristic detectors:
| Event | Description |
|---|---|
| Port calls | Sustained low-speed presence within a port boundary |
| EEZ crossings | Transitions between exclusive economic zones |
| Encounters | Two vessels within 500m for a sustained duration |
| Loitering | Sustained low-speed movement in a small area |
Each event includes a deterministic event_id, confidence score, timestamps, and full provenance linking back to source positions. See HEURISTICS.md for detection assumptions and known limitations.
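As an illustration of the encounter heuristic and of a deterministic `event_id`, the sketch below flags vessel pairs that stay within 500 m across consecutive shared timestamps. It assumes tracks are already resampled onto common timestamps, uses a 10-minute minimum duration chosen only for the example, and derives the ID by hashing the event's defining fields. None of this is taken from HEURISTICS.md; the names and thresholds other than 500 m are assumptions.

```python
import hashlib
import math
from datetime import datetime, timedelta
from itertools import combinations

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def event_id(kind, mmsi_a, mmsi_b, start):
    """Same inputs always hash to the same ID, so re-runs do not mint new events."""
    key = f"{kind}|{mmsi_a}|{mmsi_b}|{start.isoformat()}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def detect_encounters(tracks, max_distance_m=500.0, min_duration=timedelta(minutes=10)):
    """tracks: {mmsi: {timestamp: (lat, lon)}} sampled on shared timestamps."""
    events = []
    for a, b in combinations(sorted(tracks), 2):
        shared = sorted(tracks[a].keys() & tracks[b].keys())
        run = []
        for ts in shared + [None]:  # trailing None flushes the final run
            close = ts is not None and (
                haversine_m(*tracks[a][ts], *tracks[b][ts]) <= max_distance_m
            )
            if close:
                run.append(ts)
                continue
            if run and run[-1] - run[0] >= min_duration:
                events.append({
                    "event_id": event_id("encounter", a, b, run[0]),
                    "mmsi_a": a, "mmsi_b": b, "start": run[0], "end": run[-1],
                })
            run = []
    return events


t0 = datetime(2024, 6, 15, 12, 0)
steps = [t0 + timedelta(minutes=5 * i) for i in range(4)]
tracks = {
    1: {ts: (40.0, -74.0) for ts in steps},
    # Vessel 2 stays about 111 m away for three samples, then moves off.
    2: {**{ts: (40.001, -74.0) for ts in steps[:3]}, steps[3]: (40.1, -74.0)},
}
events = detect_encounters(tracks)
```

A production detector would also need to handle sampling gaps and irregular timestamps, which this sketch ignores.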
CLI
Neptune includes a full command-line interface (requires pip install neptune-ais[cli]):
# Download data
neptune download --source noaa --date 2024-06-15
neptune download --source noaa --source dma --start 2024-06-01 --end 2024-06-07
# Inspect what you have
neptune inventory
neptune inventory --dataset positions
# Quality reports
neptune qc --source noaa --date 2024-06-15
# SQL queries from the terminal
neptune sql "SELECT count(*) FROM positions WHERE source = 'noaa'"
# Source catalog
neptune sources
neptune sources --compare noaa dma gfw
# Event queries
neptune events --kind port_call --date 2024-06-15
# Health and provenance
neptune health
neptune provenance --date 2024-06-15
# Promote streaming data to canonical store
neptune promote --landing-dir /tmp/neptune_landing --source aisstream
Documentation
Full Sphinx documentation is planned. In the meantime:
| Resource | Description |
|---|---|
| `examples/` | Six narrative examples covering the full workflow |
| HEURISTICS.md | Event detection assumptions, confidence limits, non-goals |
| RELEASING.md | Release procedures and checklist |
| RC_CHECKLIST.md | Release-candidate validation results |
Examples
| # | Example | Topics |
|---|---|---|
| 1 | Source Discovery (.py) | Inspect sources, capabilities, filters |
| 2 | Archival Ingest (.py) | Download, Polars queries, SQL, helpers |
| 3 | Multi-Source Fusion (.py) | Merge strategies, fusion config |
| 4 | Event Detection (.py) | Port calls, EEZ crossings, encounters |
| 5 | Streaming Pipeline (.py) | Live feeds, sinks, promotion |
| 6 | External Plugin | Custom adapter via entry point |
Tip: Install notebook support with `pip install neptune-ais[notebooks]` to run the interactive examples.
Contributing
Contributions are welcome. To get started:
git clone https://github.com/yourorg/neptune-ais.git
cd neptune-ais
pip install -e ".[all,dev]"
pytest
The test suite includes 768 tests covering adapter certification, schema reproducibility, streaming soak tests, and packaging validation.
License