Skip to main content

A reproducible workflow for the FAA aircraft registry

Project description

Hangarbay

Stop struggling with FAA registration data. Get clean, queryable aircraft registry tables in seconds.

Hangarbay downloads, normalizes and indexes the complete FAA aircraft registry so you can skip straight to analysis. No more hunting for files to download, or joining fragmented tables, or trying to decoding cryptic lookup relationships. One command gets you 307,000+ aircraft registrations, owners and specifications ready to query with SQL or pandas.

This is a tool built for researchers, data journalists and aviation analysts who need fast, reliable and repeatable workflows.

Quick start

Requirements: Python 3.9+

Python API (notebooks & scripts)

# Install
pip install hangarbay

# One-time setup
import hangarbay as hb
hb.load_data()  # Downloads & processes FAA data to ~/.hangarbay/data/

# Look up aircraft
df = hb.search("N221LA")

# Find fleets
fleet = hb.fleet("United Airlines")
fleet = hb.fleet("LAPD|Los Angeles Police", state="CA")

# Custom SQL
df = hb.query("""
    SELECT maker, COUNT(*) as count
    FROM aircraft_decoded
    WHERE year_mfr > 2020
    GROUP BY maker
    ORDER BY count DESC
""")

# Check data age
info = hb.status()
print(f"Data is {info['age_days']} days old")

Command-line interface

# Clone and install
git clone https://github.com/stiles/hangarbay.git
cd hangarbay

# Recommended: use a virtual environment (uv, venv, conda, etc.)
uv venv && source .venv/bin/activate  # or: python -m venv .venv && source .venv/bin/activate

pip install -e ".[dev]"

# Run the full pipeline (~2 minutes)
hangar update     # Download, normalize and publish (all-in-one)

# Or run steps individually
hangar fetch      # Download FAA data
hangar normalize  # Parse to typed Parquet tables
hangar publish    # Build DuckDB + SQLite FTS indexes

# Check data status and age
hangar status

# Start querying!
hangar search N757AF
hangar sql "SELECT COUNT(*) FROM aircraft"

What this does

Hangarbay downloads FAA aircraft registration data, normalizes it into typed tables and provides fast SQL querying:

  • 307K+ aircraft registrations with owner and address information
  • 93K+ make/model references (Cessna, Piper, Boeing, etc.)
  • 4,700+ engine specifications (horsepower, type, manufacturer)
  • DuckDB for analytical SQL queries (sub-second on 300K+ rows)
  • SQLite FTS5 for full-text search by owner name or address
  • Parquet files for efficient columnar storage
  • Data lineage tracking with SHA256 checksums and version metadata
  • Shared data directory at ~/.hangarbay/data/ - works from any project or notebook

Features

N-Number lookup

Look up any aircraft registration with decoded status codes and human-readable output:

hangar search N221LA

Fleet search

Find all aircraft owned by a person or company:

# Search by owner name (case-insensitive)
hangar fleet "United Airlines"
hangar fleet "LAPD"
hangar fleet "NetJets"

# OR logic with pipe separator (great for finding multiple airlines)
hangar fleet "LAPD|Los Angeles Police"     # Either pattern matches
hangar fleet "Delta|American|United"       # Any of the three

# Filter by state
hangar fleet "Delta" --state GA

# Export to CSV
hangar fleet "Boeing" --export boeing_fleet.csv

# Limit results
hangar fleet "Cessna" --limit 20
Aircraft Registration: N221LA

Owner: LAPD AIR SUPPORT DIVISION
Location: LOS ANGELES, CA

  Make & Model:           AIRBUS HELICOPTERS INC AS350B3
  Year Manufactured:      2014
  Serial Number:          7900
  Registration Status:    Valid
  Certificate Type:       Standard Airworthiness - Normal
  Status Date:            May 19, 2023
  Expiration:             Jun 30, 2028

SQL queries

Execute analytical queries with pretty output:

# Total aircraft
hangar sql "SELECT COUNT(*) FROM aircraft"

# Use decoded views for readable output
hangar sql "SELECT * FROM aircraft_decoded WHERE year_mfr > 2020 LIMIT 10"
hangar sql "SELECT * FROM owners_clean WHERE owner_name LIKE '%boeing%'" -i

# Top manufacturers
hangar sql "SELECT maker, COUNT(*) as count FROM aircraft 
  JOIN aircraft_make_model USING(mfr_mdl_code) 
  WHERE maker != '' GROUP BY 1 ORDER BY 2 DESC LIMIT 10"

# Lookup status codes
hangar sql "SELECT * FROM status_codes"

# Top states by registrations
hangar sql "SELECT state_std, COUNT(*) as count FROM owners 
  WHERE state_std != '' GROUP BY 1 ORDER BY 2 DESC LIMIT 10"

Output Formats:

# Pretty table (default)
hangar sql "SELECT * FROM status_codes LIMIT 3"

# JSON output (for APIs/scripts)
hangar sql "SELECT * FROM status_codes LIMIT 3" --output-format json

# CSV output (for Excel/spreadsheets)
hangar sql "SELECT * FROM status_codes LIMIT 3" --output-format csv

Available tables

Core Tables:

  • aircraft - Current registration facts (N-number, serial, make/model, year, status)
  • registrations - Canonical registration state (type, status, dates)
  • owners - Owner records with raw and standardized addresses
  • owners_summary - Aggregated ownership (co-owners, counts)
  • aircraft_make_model - Make/model reference (Cessna 172, Boeing 737, etc.)
  • engines - Engine specifications (manufacturer, horsepower, type)

Decoded Views (Recommended):

  • aircraft_decoded - Aircraft with decoded status codes and joined make/model
  • owners_clean - Simplified owner table (standardized fields only, decoded owner type)

Reference Tables:

  • status_codes - Registration status code lookups
  • airworthiness_classes - Airworthiness certificate class lookups
  • owner_types - Owner type code lookups

Data quality

  • Type safety: PyArrow schemas enforce correct data types (dates, integers, strings)
  • Address standardization: Uppercase, state/ZIP normalization, whitespace cleanup
  • Deterministic IDs: xxhash64 ensures consistent owner IDs across pipeline runs
  • File verification: SHA256 checksums validate all downloads
  • Version tracking: Manifest records schema versions and row counts for each snapshot

Architecture

FAA Data (MASTER, ACFTREF, ENGINE)
    ↓ fetch (with SHA256 verification)
Raw Text Files (275 MB)
    ↓ normalize (PyArrow type casting)
Typed Parquet Tables (36 MB)
    ↓ publish (indexing)
DuckDB (106 MB) + SQLite FTS (55 MB)
    ↓ query
CLI / Python API

Development

# Run tests
make test

# Update data
make update       # or: hangar update

# Check data status
make status       # or: hangar status

# Full pipeline (individual steps)
make all          # fetch, normalize, publish, verify

# Individual steps
make fetch
make normalize
make publish

# Clean intermediate files
make clean

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hangarbay-0.4.0.tar.gz (29.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hangarbay-0.4.0-py3-none-any.whl (29.8 kB view details)

Uploaded Python 3

File details

Details for the file hangarbay-0.4.0.tar.gz.

File metadata

  • Download URL: hangarbay-0.4.0.tar.gz
  • Upload date:
  • Size: 29.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for hangarbay-0.4.0.tar.gz
Algorithm Hash digest
SHA256 b79c4259651c019c57c5b93c48a4d174d7865ea24358a65789f546c54ad6559a
MD5 9c17da278c195b4310652631027aa4b2
BLAKE2b-256 86a2e42fc94fa9d7455521ea3b4be51116ffffa281211c68c9243f278159e43b

See more details on using hashes here.

File details

Details for the file hangarbay-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: hangarbay-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 29.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for hangarbay-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fdef1295effc5b4982fa4ad1fb8a350f9310267fd351b09673b0232884f516f8
MD5 aad5edb10a1d078ceca08b1f42e8f981
BLAKE2b-256 a1daa8807941650cd735d97104bcf2a7a571190520d8b8725ceba9c5e2675fcb

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page