Terminal-first S3 browser for scientists and data engineers

s3peek

Navigate S3 buckets from your terminal — with instant header quicklook for FITS, ASDF, Parquet, and JSON files, plus one-command pre-signed URL sharing.

CI | License: MIT | Python 3.11+ | Homebrew


Purpose

Problem: Navigating S3 buckets from the CLI is clunky. aws s3 ls shows raw keys; inspecting a FITS or ASDF file means downloading it first; sharing a file with a colleague requires remembering the aws s3 presign syntax and expiry flags.

Solution: s3peek is a terminal-first S3 browser that combines interactive bucket navigation (arrow keys, fuzzy filter) with in-place header quicklook for astronomy and data science formats, and instant pre-signed URL generation with clipboard copy.

Who benefits: Astronomers and data engineers at IPAC, STScI, or any institution working with AWS-hosted science data (FITS, ASDF, Parquet, JSON). End users do not need to write or import any Python: s3peek is distributed as a standalone binary or Homebrew formula.


Architecture

┌──────────────────────────────────────────────────────────┐
│                        s3peek CLI                        │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────────┐  │
│  │  TUI browser │  │  Quicklook   │  │  Presign cmd   │  │
│  │  (Textual)   │  │  engine      │  │  (boto3)       │  │
│  └──────┬───────┘  └──────┬───────┘  └───────┬────────┘  │
│         └─────────────────┴──────────────────┘           │
│                   S3 abstraction layer                   │
│                (boto3 + s3fs + range-GET)                │
└──────────────────────────────────────────────────────────┘
         │                                    │
    AWS S3 API                         Clipboard (pyperclip)

Key design decisions:

  • Range-GET for headers: FITS headers are read from the first N bytes of the object and Parquet footers from the last N bytes, both via HTTP Range requests. No full file download.
  • Streaming ASDF open — ASDF tree is read via asdf.open() with lazy_load=True; only the YAML header block is parsed.
  • No local state — no database, no cache file. All navigation state is in-memory for the session.
  • AWS credentials pass-through — uses the standard boto3 credential chain (~/.aws, env vars, instance profile). No credential storage.
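
As a rough illustration of the range-GET decision above, the sketch below builds the Range header for "first N bytes" and "last N bytes" and passes it to a boto3-style S3 client. The File Map names a `range_get()` symbol, but the signature, the `from_end` flag, and the `range_header()` helper shown here are assumptions made for this example.

```python
from typing import Any


def range_header(n_bytes: int, from_end: bool = False) -> str:
    """Build an HTTP Range header value for the first or last n_bytes."""
    if n_bytes <= 0:
        raise ValueError("n_bytes must be positive")
    # "bytes=0-(N-1)" selects the first N bytes; "bytes=-N" selects the last N.
    return f"bytes=-{n_bytes}" if from_end else f"bytes=0-{n_bytes - 1}"


def range_get(client: Any, bucket: str, key: str, n_bytes: int = 65536,
              from_end: bool = False) -> bytes:
    """Fetch part of an object. `client` is expected to be a boto3 S3 client."""
    response = client.get_object(
        Bucket=bucket, Key=key, Range=range_header(n_bytes, from_end)
    )
    # boto3 returns the payload as a streaming body under the "Body" key.
    return response["Body"].read()
```

Taking the client as a parameter keeps the function easy to exercise against a mock, which fits the "no real AWS endpoints in tests" constraint.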

Repository Layout

s3peek/
├── README.md                  # This file — spec + public docs
├── pyproject.toml             # Build config; entry_points for CLI
├── Makefile                   # Dev commands: lint, test, build, brew-test
├── Formula/
│   └── s3peek.rb              # Homebrew formula (auto-generated by release CI)
├── src/
│   └── s3peek/
│       ├── __init__.py
│       ├── cli.py             # Typer app: entry point, top-level commands
│       ├── browser.py         # Textual TUI: bucket/prefix navigation widget
│       ├── quicklook.py       # Header readers per format (FITS, ASDF, Parquet, JSON)
│       ├── presign.py         # Pre-signed URL generation + clipboard copy
│       ├── s3.py              # S3 abstraction: list, stat, range-GET
│       └── config.py          # Config model: defaults, env var bindings
├── tests/
│   ├── conftest.py            # moto-based S3 fixtures; sample test files
│   ├── test_quicklook.py      # Format readers against fixture files
│   ├── test_presign.py        # Pre-signed URL generation (moto)
│   ├── test_s3.py             # S3 abstraction layer (moto)
│   └── test_cli.py            # CLI smoke tests via Typer test runner
├── fixtures/
│   ├── sample.fits            # Minimal FITS with header only
│   ├── sample.asdf            # Minimal ASDF with known tree
│   ├── sample.parquet         # Minimal Parquet with schema
│   └── sample.json            # Sample JSON object
├── .github/
│   └── workflows/
│       ├── ci.yml             # Test + lint on push/PR
│       └── release.yml        # PyPI publish + Homebrew formula bump on tag
├── .env.example               # Documented env vars; never committed with values
└── CHANGELOG.md

Prerequisites

| Requirement | Version | Notes |
|---|---|---|
| Python | 3.11+ | CPython; PyPy untested |
| AWS credentials | any valid chain | ~/.aws/credentials, env vars, or instance profile |
| IAM permissions | s3:ListBucket, s3:GetObject | s3:GetObjectAttributes for stat; no write permissions needed |
| xclip or xsel (Linux) | any | For clipboard copy; optional, URL printed to stdout if absent |
| macOS | 12+ | pbcopy built-in; no extra deps |

IAM minimum policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetObject", "s3:GetObjectAttributes"],
      "Resource": ["arn:aws:s3:::YOUR_BUCKET", "arn:aws:s3:::YOUR_BUCKET/*"]
    }
  ]
}

For pre-signed URLs: the caller’s identity signs the URL. The recipient does not need AWS credentials. No s3:PutObject or s3:GetBucketPolicy required.


Quick Start

Install via Homebrew (macOS / Linux with Linuxbrew)

brew tap ejoliet/tap
brew install s3peek

Install via tarball (Linux, no Homebrew)

# tar -C requires the target directory to exist
mkdir -p ~/.local/bin
curl -fsSL https://github.com/ejoliet/s3peek/releases/latest/download/s3peek-linux-x86_64.tar.gz \
  | tar -xz -C ~/.local/bin
chmod +x ~/.local/bin/s3peek

Install via pip / uv

pip install s3peek
# or
uv tool install s3peek

First run

# Browse a bucket interactively
s3peek browse s3://my-bucket/

# Quicklook a file header
s3peek peek s3://my-bucket/data/obs001.fits

# Copy a pre-signed URL to clipboard (1-day default)
s3peek share s3://my-bucket/data/obs001.fits

# Pre-signed URL with custom expiry
s3peek share s3://my-bucket/data/obs001.fits --expires 4h

Configuration Reference

All settings can be set via environment variable or ~/.config/s3peek/config.toml.

| Env var | Type | Default | Description |
|---|---|---|---|
| S3PEEK_DEFAULT_EXPIRY | string | 1d | Default pre-signed URL expiry. Format: Xd, Xh, Xm |
| S3PEEK_AWS_PROFILE | string | default | AWS CLI profile name to use |
| S3PEEK_AWS_REGION | string | us-east-1 | AWS region for S3 requests |
| S3PEEK_FITS_MAX_HDUS | int | 10 | Max HDUs (Header/Data Units) to display in FITS quicklook |
| S3PEEK_PARQUET_MAX_COLS | int | 50 | Max columns to show in Parquet schema quicklook |
| S3PEEK_CLIPBOARD | bool | true | Auto-copy pre-signed URL to clipboard |
| S3PEEK_PAGE_SIZE | int | 200 | S3 list_objects_v2 page size for browser |
| S3PEEK_THEME | string | dark | TUI theme: dark or light |

~/.config/s3peek/config.toml example:

default_expiry = "1d"
aws_profile = "roman-dev"
aws_region = "us-east-1"
fits_max_hdus = 5
clipboard = true

API / Interface Contract

CLI commands

s3peek [OPTIONS] COMMAND [ARGS]

Commands:
  browse   Interactive TUI browser for a bucket or prefix
  peek     Print header/schema of a single S3 object to stdout
  share    Generate a pre-signed URL; copy to clipboard
  ls       Non-interactive list (like aws s3 ls, but with size/type cols)
  version  Print version and exit

Options:
  --profile TEXT    AWS profile [env: S3PEEK_AWS_PROFILE]
  --region TEXT     AWS region  [env: S3PEEK_AWS_REGION]
  --no-color        Disable ANSI color output
  --help            Show this message and exit

s3peek browse

s3peek browse S3_URI [OPTIONS]

Arguments:
  S3_URI    s3://bucket[/prefix]  required

Options:
  --page-size INT   Objects per page [default: 200]

TUI keybindings:
  ↑ / ↓          Navigate list
  Enter           Descend into prefix / open peek for object
  Backspace       Go up one prefix level
  p               Peek selected object (header quicklook)
  s               Share selected object (pre-signed URL)
  /               Filter (fuzzy, case-insensitive)
  q               Quit
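
The `/` filter is described only as "fuzzy, case-insensitive"; the matching algorithm is not specified. One common reading is subsequence matching, sketched below. `fuzzy_match()` and `filter_keys()` are hypothetical helpers, not symbols from the File Map.

```python
def fuzzy_match(query: str, candidate: str) -> bool:
    """Return True if the characters of query appear, in order, in candidate."""
    remaining = iter(candidate.lower())
    # `ch in remaining` consumes the iterator up to the first match, so each
    # later query character can only match further to the right.
    return all(ch in remaining for ch in query.lower())


def filter_keys(query: str, keys: list[str]) -> list[str]:
    """Keep the keys that fuzzily match the query, preserving their order."""
    return [key for key in keys if fuzzy_match(query, key)]
```

An empty query matches every key, which gives the natural "no filter" behaviour when the filter box is cleared.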

s3peek peek

s3peek peek S3_URI [OPTIONS]

Arguments:
  S3_URI    s3://bucket/key   required

Options:
  --format [fits|asdf|parquet|json|auto]   Force format [default: auto]
  --output [text|yaml|json]                Output format [default: text]
  --max-bytes INT                          Max bytes for range-GET [default: 65536]

Exit codes:
  0   success
  1   S3 access error
  2   format not supported
  3   parse error (file exists but header unreadable)
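
A minimal sketch of how `--format auto` could resolve a format from the key's extension, leading to exit code 2 when the extension is unknown. The extension table and the helper names `split_s3_uri()` and `detect_format()` are assumptions; only `FormatNotSupportedError` is named in this README.

```python
from pathlib import PurePosixPath

# Assumed mapping from file extension to format name.
EXTENSIONS = {".fits": "fits", ".asdf": "asdf", ".parquet": "parquet", ".json": "json"}


class FormatNotSupportedError(Exception):
    """Raised when no reader exists for the object (exit code 2)."""


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split s3://bucket/key into (bucket, key)."""
    if not uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {uri}")
    bucket, _, key = uri[len("s3://"):].partition("/")
    return bucket, key


def detect_format(uri: str, forced: str = "auto") -> str:
    """Return the forced format, or infer one from the key's extension."""
    if forced != "auto":
        return forced
    _, key = split_s3_uri(uri)
    suffix = PurePosixPath(key).suffix.lower()
    if suffix not in EXTENSIONS:
        raise FormatNotSupportedError(
            "Format not supported. Supported: fits, asdf, parquet, json"
        )
    return EXTENSIONS[suffix]
```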

s3peek share

s3peek share S3_URI [OPTIONS]

Arguments:
  S3_URI    s3://bucket/key   required

Options:
  --expires TEXT    Expiry: Xd, Xh, Xm [default: 1d, max: 7d]
  --no-clipboard    Print URL only; do not copy to clipboard
  --qr              Print QR code to terminal (requires `qrcode` extra)

Output (stdout):
  Pre-signed URL as plain text (always printed regardless of --no-clipboard)
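
The `--expires` rule (Xd, Xh, Xm, capped at 7 days) can be sketched as below. The error class names come from the Error Handling table; rejecting a zero value such as `0h` as a syntax error is an assumption of this example.

```python
import re

MAX_EXPIRY_SECONDS = 604800  # 7 days
_SECONDS_PER_UNIT = {"d": 86400, "h": 3600, "m": 60}


class PresignExpirySyntaxError(ValueError):
    """Expiry string does not look like 1d, 6h, or 30m."""


class PresignExpiryTooLongError(ValueError):
    """Expiry exceeds the 7-day cap."""


def parse_expiry(text: str) -> int:
    """Convert an expiry such as '4h' into seconds."""
    match = re.fullmatch(r"([1-9]\d*)([dhm])", text.strip())
    if match is None:
        raise PresignExpirySyntaxError("Invalid expiry format. Use: 1d, 6h, 30m")
    seconds = int(match.group(1)) * _SECONDS_PER_UNIT[match.group(2)]
    if seconds > MAX_EXPIRY_SECONDS:
        raise PresignExpiryTooLongError("Maximum expiry is 7 days (604800 seconds)")
    return seconds
```

For instance, `parse_expiry("4h")` yields 14400, while `parse_expiry("8d")` raises `PresignExpiryTooLongError`, matching the acceptance criterion for `--expires 8d`.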

Quicklook output contract

Each format reader returns a HeaderResult object:

from dataclasses import dataclass, field
from typing import Any

@dataclass
class HeaderResult:
    format: str                        # "fits" | "asdf" | "parquet" | "json"
    s3_uri: str
    size_bytes: int | None             # None if unavailable
    headers: list[dict[str, Any]]      # one dict per HDU (FITS) or one (others)
    truncated: bool = False            # True if range-GET hit max_bytes
    error: str | None = None           # set on parse failure

FITS headers entry structure:

{
    "hdu_index": 0,
    "hdu_type": "PrimaryHDU",         # HDU type string from astropy
    "naxis": 2,
    "shape": [2048, 2048],
    "cards": {"SIMPLE": True, "BITPIX": -32, ...}
}

ASDF headers entry structure:

{
    "asdf_version": "1.6.0",
    "tree": { ... }                   # full YAML tree dict, no array data
}

Parquet headers entry structure:

{
    "num_rows": 1048576,
    "num_row_groups": 4,
    "schema": [
        {"name": "ra", "type": "DOUBLE", "nullable": False},
        ...
    ],
    "metadata": { ... }               # file-level key/value metadata
}

JSON headers entry structure:

{
    "type": "object",                  # top-level JSON type
    "keys": ["ra", "dec", "mag"],     # top-level keys if object
    "length": 3                        # array length if top-level is array
}
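
As an illustration of the JSON entry, the sketch below summarises a JSON document into that shape, with the `raw_decode()` fallback mentioned under Constraints. It assumes `keys` is emitted only for objects and `length` only for arrays, which is one reading of the comments above; `summarize_json()` is a hypothetical name.

```python
import json
from typing import Any


def _json_type(value: Any) -> str:
    """Map a decoded Python value to its JSON type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, str):
        return "string"
    if isinstance(value, (int, float)):
        return "number"
    return "null"


def summarize_json(text: str) -> dict[str, Any]:
    """Describe the top-level value of a JSON document."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the first complete value, ignoring trailing data
        # (for example, newline-delimited JSON).
        value, _end = json.JSONDecoder().raw_decode(text.lstrip())
    summary: dict[str, Any] = {"type": _json_type(value)}
    if isinstance(value, dict):
        summary["keys"] = list(value)
    elif isinstance(value, list):
        summary["length"] = len(value)
    return summary
```

Note that a document cut off mid-value by the range-GET still fails in the fallback, which is where a parse error (exit code 3) would apply.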

Data Model

No persistent storage. All runtime state lives in:

@dataclass
class SessionState:
    bucket: str
    prefix: str = ""
    history: list[str] = field(default_factory=list)   # navigation breadcrumb
    selected_key: str | None = None

Config is loaded once at startup into:

from pydantic import BaseModel

class Config(BaseModel):
    default_expiry: str = "1d"
    aws_profile: str = "default"
    aws_region: str = "us-east-1"
    fits_max_hdus: int = 10
    parquet_max_cols: int = 50
    clipboard: bool = True
    page_size: int = 200
    theme: str = "dark"

Error Handling

| Error class | When raised | Exit code | User message |
|---|---|---|---|
| S3AccessError | NoCredentialsError, ClientError 403 | 1 | "AWS credentials missing or insufficient permissions" |
| S3KeyNotFoundError | ClientError 404 | 1 | "Object not found: s3://..." |
| FormatNotSupportedError | Extension not in supported list | 2 | "Format not supported. Supported: fits, asdf, parquet, json" |
| QuicklookParseError | Header bytes unreadable | 3 | "Could not parse header — file may be truncated or corrupt" |
| PresignExpirySyntaxError | Expiry string invalid | 1 | "Invalid expiry format. Use: 1d, 6h, 30m" |
| PresignExpiryTooLongError | Expiry > 7 days | 1 | "Maximum expiry is 7 days (604800 seconds)" |

All errors write to stderr. stdout is reserved for data output only.
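
One way to keep the exit codes and the stderr/stdout split in a single place is to attach the code to each exception class, as sketched below. The shared `S3PeekError` base class and the `run()` wrapper are assumptions for illustration; only three of the documented error classes are shown.

```python
import sys
from typing import Callable


class S3PeekError(Exception):
    """Hypothetical base class carrying the process exit code."""
    exit_code = 1


class S3AccessError(S3PeekError):
    exit_code = 1


class FormatNotSupportedError(S3PeekError):
    exit_code = 2


class QuicklookParseError(S3PeekError):
    exit_code = 3


def run(action: Callable[[], str]) -> int:
    """Run a command body; data goes to stdout, errors go to stderr."""
    try:
        output = action()
    except S3PeekError as exc:
        print(str(exc), file=sys.stderr)
        return exc.exit_code
    print(output)
    return 0
```

A CLI entry point could then call `sys.exit(run(...))`, so every command shares the same mapping.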


Testing

# Run full suite
make test

# With coverage report
make test-cov

# Lint only
make lint

# Single module
pytest tests/test_quicklook.py -v

Test matrix

| Suite | Scope | Fixtures |
|---|---|---|
| test_s3.py | list, stat, range-GET | moto S3 mock; fixtures/ uploaded at setup |
| test_quicklook.py | all four format readers | fixtures/sample.{fits,asdf,parquet,json} |
| test_presign.py | URL generation, expiry parsing, clipboard skip | moto + monkeypatched pyperclip |
| test_cli.py | all CLI commands, exit codes, --output json | moto + Typer CliRunner |

Constraint: Tests must never hit real AWS endpoints. moto mocking is mandatory.


Deployment / Installation Targets

Homebrew (primary macOS + Linux)

The Formula/s3peek.rb formula is auto-generated by release.yml on tag push.

Manual formula update (for maintainer):

make brew-bump VERSION=0.2.0 SHA256=<sha256_of_tarball>

Standalone binary (PyInstaller)

make build-binary   # outputs dist/s3peek (macOS) or dist/s3peek-linux

CI builds for macos-latest and ubuntu-latest via GitHub Actions matrix. Artifacts uploaded to GitHub Release assets.

pip / uv

pip install s3peek
uv tool install s3peek   # preferred; isolated env

Linux tarball (no package manager)

# Built by release.yml; SHA256 verified in formula
mkdir -p ~/.local/bin   # tar -C requires the target directory to exist
curl -fsSL https://github.com/ejoliet/s3peek/releases/latest/download/s3peek-linux-x86_64.tar.gz \
  | tar -xz -C ~/.local/bin

Security

  • Pre-signed URLs are signed with the caller’s temporary or long-term AWS credentials. They do not grant any additional IAM permissions beyond what the signing identity has.
  • Max expiry is hard-capped at 7 days — the AWS maximum for signature v4 pre-signed URLs with IAM user credentials; STS session tokens cap at session duration.
  • No credentials stored by s3peek itself. The tool is read-only by design (no s3:PutObject).
  • Clipboard warning — if S3PEEK_CLIPBOARD=true (default), the pre-signed URL is silently written to the system clipboard. Users sharing their screen should be aware.

Non-Goals (v1)

  • No upload or delete operations — read-only tool; no s3:PutObject, s3:DeleteObject
  • No recursive download — use aws s3 sync for that
  • No full file render — quicklook reads headers/schema only; not a FITS image viewer
  • No bucket creation or policy management
  • No Windows support — macOS and Linux only in v1; PRs welcome
  • No auth management — does not create or rotate AWS credentials
  • No HiPS / MOC / VOTable quicklook — v2 candidate
  • No interactive diff between two S3 keys

Open Questions

| # | Question | Blocking? | Owner |
|---|---|---|---|
| 1 | Should s3peek browse support paginating >10k objects with a spinner, or just limit to first page_size? | No; default to limit, add pagination in v2 | Emmanuel |
| 2 | qrcode extra for --qr flag: include in default install or opt-in extra? | No; make it an extra: pip install s3peek[qr] | Emmanuel |
| 3 | Should FITS quicklook display WCS (World Coordinate System) summary if present? | No; nice-to-have for v2 | open |
| 4 | For ASDF files from Roman pipeline: should asdf-astropy be a hard dep or optional? | No; optional extra: pip install s3peek[astro] | Emmanuel |

Agent Build Instructions

This section is the authoritative build specification. A coding agent should implement this tool end-to-end using only this README — no clarifying questions needed.

Build Order

| Phase | Deliverable | Done when |
|---|---|---|
| 0 | Repo scaffold + CI skeleton | make lint passes on empty project; GitHub Actions runs |
| 1 | S3 abstraction layer (s3.py) | test_s3.py passes with moto; list, stat, range-GET work |
| 2 | Quicklook engine (quicklook.py) | test_quicklook.py passes for all 4 formats against fixtures |
| 3 | Presign module (presign.py) | test_presign.py passes; clipboard copy mocked; expiry parsing correct |
| 4 | CLI commands (cli.py), non-TUI first | test_cli.py passes for peek, share, ls, version |
| 5 | TUI browser (browser.py) | Manual smoke test: arrow navigation + p/s keys work |
| 6 | Build + packaging | make build-binary succeeds; brew install from local formula |

File Map

| File | Purpose | Key symbols |
|---|---|---|
| src/s3peek/config.py | Pydantic config model; env var + TOML loading | class Config(BaseModel) |
| src/s3peek/s3.py | S3 list, stat, range-GET via boto3 | list_prefix(), stat_object(), range_get() |
| src/s3peek/quicklook.py | Format dispatch + four readers | quicklook(), _read_fits(), _read_asdf(), _read_parquet(), _read_json() |
| src/s3peek/presign.py | URL generation + expiry parsing + clipboard | generate_presigned_url(), parse_expiry(), copy_to_clipboard() |
| src/s3peek/browser.py | Textual TUI app and widget | S3Browser(App), ObjectList(Widget) |
| src/s3peek/cli.py | Typer app; all commands | app = typer.Typer(), browse, peek, share, ls, version |
| tests/conftest.py | moto fixtures; fixture file upload | s3_client, populated_bucket |
| tests/test_s3.py | S3 layer tests | test_list_prefix, test_range_get |
| tests/test_quicklook.py | Format reader tests | one test per format; error path tests |
| tests/test_presign.py | Presign + expiry tests | test_expiry_parsing, test_url_structure |
| tests/test_cli.py | CLI integration tests | test_peek_fits, test_share_no_clipboard, test_ls |
| Makefile | Dev commands | lint, test, test-cov, build-binary, brew-bump |
| pyproject.toml | Build + deps + entry point | [project.scripts] s3peek = "s3peek.cli:app" |
| Formula/s3peek.rb | Homebrew formula | url, sha256, depends_on blocks |

Constraints

  • Python 3.11+ only. Python 3.10 and earlier are not supported, so match statements and other 3.11+ syntax can be used freely.
  • Range-GET for FITS: read first 65536 bytes (configurable via --max-bytes). Parse with astropy.io.fits.open(BytesIO(...), ignore_missing_end=True).
  • ASDF range-GET: read first 65536 bytes; open with asdf.open(BytesIO(...), lazy_load=True, copy_arrays=False).
  • Parquet range-GET: the footer sits at the end of a Parquet file, so fetch the last 65536 bytes (not the first) and open them with pyarrow.parquet.ParquetFile(pa.BufferReader(bytes)).
  • JSON: fetch first 65536 bytes; parse with json.loads; on parse failure try json.JSONDecoder().raw_decode() for streaming objects.
  • Pre-signed URL expiry: parse Xd/Xh/Xm → seconds. Cap at 604800 (7 days). Error on invalid format.
  • Clipboard: use pyperclip; catch pyperclip.PyperclipException and fall back to stdout-only with a warning.
  • Tests must use moto (@mock_aws decorator). No real boto3 calls in tests.
  • All public functions must have typed signatures and docstrings.
  • ruff + mypy must pass at zero warnings.
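
For the Parquet constraint above, it helps to know whether the fetched tail actually contains the whole footer. In the Parquet file format the last 8 bytes are a 4-byte little-endian footer length followed by the magic `PAR1`, so the check can be sketched with the standard library alone. `footer_length()` and `footer_fits()` are hypothetical helpers, and files with an encrypted footer (magic `PARE`) are not handled here.

```python
import struct

PARQUET_MAGIC = b"PAR1"
TRAILER_SIZE = 8  # 4-byte footer length + 4-byte magic


def footer_length(tail: bytes) -> int:
    """Read the footer metadata length from the last bytes of a Parquet file."""
    if len(tail) < TRAILER_SIZE or tail[-4:] != PARQUET_MAGIC:
        raise ValueError("tail does not end with the Parquet magic bytes")
    (length,) = struct.unpack("<I", tail[-8:-4])
    return length


def footer_fits(tail: bytes) -> bool:
    """True if the fetched tail holds the complete footer plus trailer."""
    return footer_length(tail) + TRAILER_SIZE <= len(tail)
```

When `footer_fits()` is false, a reader could issue a second range-GET for exactly `footer_length(tail) + 8` bytes from the end, or report the result as truncated.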

Acceptance Criteria

  • make test passes with ≥ 80% coverage
  • make lint passes (ruff check + mypy --strict)
  • s3peek peek s3://test-bucket/sample.fits prints HDU table to stdout
  • s3peek share s3://test-bucket/sample.fits --no-clipboard prints a valid pre-signed URL
  • s3peek share s3://test-bucket/sample.fits --expires 8d exits with code 1 and error on stderr
  • s3peek browse s3://test-bucket/ launches TUI without crash (manual check)
  • make build-binary produces a standalone executable that runs on macOS and Linux
  • brew install --build-from-source Formula/s3peek.rb succeeds locally
  • All Open Questions resolved or explicitly deferred to v2 in CHANGELOG

Next Steps

Ordered agent task list:

  1. git init s3peek && cd s3peek — initialise repo
  2. Create pyproject.toml with deps: typer, textual, boto3, astropy, asdf, pyarrow, pyperclip, pydantic, tomli; dev deps: moto[s3], pytest, pytest-cov, ruff, mypy
  3. Scaffold directory tree per Repository Layout
  4. Create fixtures/ with minimal valid sample files (use astropy, asdf, pyarrow to generate)
  5. Implement config.py → pass test_config.py (not listed in Repository Layout or File Map; add it under tests/)
  6. Implement s3.py → pass test_s3.py
  7. Implement quicklook.py → pass test_quicklook.py for all 4 formats
  8. Implement presign.py → pass test_presign.py
  9. Implement cli.py (non-TUI commands first) → pass test_cli.py
  10. Implement browser.py (Textual TUI) → manual smoke test
  11. Write Makefile with lint, test, test-cov, build-binary, brew-bump targets
  12. Set up .github/workflows/ci.yml (test matrix: macOS + ubuntu, Python 3.11/3.12)
  13. Set up .github/workflows/release.yml (tag → PyPI publish + binary upload + formula bump)
  14. Write Formula/s3peek.rb template; validate with brew audit
  15. Resolve all Open Questions; update CHANGELOG
