Terminal-first S3 browser for scientists and data engineers
s3peek
Navigate S3 buckets from your terminal — with instant header quicklook for FITS, ASDF, Parquet, and JSON files, plus one-command pre-signed URL sharing.
Purpose
Problem: Navigating S3 buckets from the CLI is clunky. aws s3 ls shows raw keys; inspecting a FITS or ASDF file means downloading it first; sharing a file with a colleague requires remembering the aws s3 presign syntax and expiry flags.
Solution: s3peek is a terminal-first S3 browser that combines interactive bucket navigation (arrow keys, fuzzy filter) with in-place header quicklook for astronomy and data science formats, and instant pre-signed URL generation with clipboard copy.
Who benefits: Astronomers and data engineers at IPAC, STScI, or any institution working with AWS-hosted science data (FITS, ASDF, Parquet, JSON). Zero Python import required by end users — distributed as a standalone binary or Homebrew formula.
Architecture
┌──────────────────────────────────────────────────────────┐
│                        s3peek CLI                        │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────────┐  │
│  │ TUI browser  │  │  Quicklook   │  │  Presign cmd   │  │
│  │  (Textual)   │  │    engine    │  │    (boto3)     │  │
│  └──────┬───────┘  └──────┬───────┘  └───────┬────────┘  │
│         └─────────────────┴──────────────────┘           │
│                   S3 abstraction layer                   │
│                (boto3 + s3fs + range-GET)                │
└──────────────────────────────────────────────────────────┘
                │                             │
           AWS S3 API               Clipboard (pyperclip)
Key design decisions:
- Range-GET for headers: FITS headers are read with HTTP Range requests (first N bytes only), and Parquet footers with a suffix range (last N bytes only). No full file download.
- Streaming ASDF open: the ASDF tree is read via asdf.open() with lazy_load=True; only the YAML header block is parsed.
- No local state: no database, no cache file. All navigation state is in-memory for the session.
- AWS credentials pass-through: uses the standard boto3 credential chain (~/.aws, env vars, instance profile). No credential storage.
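The two range-GET shapes the design relies on (a prefix read for FITS, ASDF and JSON, a suffix read for the Parquet footer) correspond to standard HTTP Range header values. A minimal sketch follows; the helper names prefix_range and suffix_range are hypothetical and not part of the spec. The resulting string is what would be passed as the Range argument of boto3's get_object.

```python
def prefix_range(n_bytes: int) -> str:
    """Range header value for the first n_bytes of an object."""
    if n_bytes <= 0:
        raise ValueError("n_bytes must be positive")
    # HTTP byte ranges are inclusive, so the last index is n_bytes - 1.
    return f"bytes=0-{n_bytes - 1}"


def suffix_range(n_bytes: int) -> str:
    """Range header value for the last n_bytes of an object."""
    if n_bytes <= 0:
        raise ValueError("n_bytes must be positive")
    # A suffix range has no start offset: "bytes=-N" means the final N bytes.
    return f"bytes=-{n_bytes}"


print(prefix_range(65536))  # bytes=0-65535
print(suffix_range(65536))  # bytes=-65536
```

If the object is shorter than the requested range, the server returns only the bytes that exist, so callers should not assume the response has exactly n_bytes.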
Repository Layout
s3peek/
├── README.md                   # This file: spec + public docs
├── pyproject.toml              # Build config; entry_points for CLI
├── Makefile                    # Dev commands: lint, test, build, brew-test
├── Formula/
│   └── s3peek.rb               # Homebrew formula (auto-generated by release CI)
├── src/
│   └── s3peek/
│       ├── __init__.py
│       ├── cli.py              # Typer app: entry point, top-level commands
│       ├── browser.py          # Textual TUI: bucket/prefix navigation widget
│       ├── quicklook.py        # Header readers per format (FITS, ASDF, Parquet, JSON)
│       ├── presign.py          # Pre-signed URL generation + clipboard copy
│       ├── s3.py               # S3 abstraction: list, stat, range-GET
│       └── config.py           # Config model: defaults, env var bindings
├── tests/
│   ├── conftest.py             # moto-based S3 fixtures; sample test files
│   ├── test_quicklook.py       # Format readers against fixture files
│   ├── test_presign.py         # Pre-signed URL generation (moto)
│   ├── test_s3.py              # S3 abstraction layer (moto)
│   └── test_cli.py             # CLI smoke tests via Typer test runner
├── fixtures/
│   ├── sample.fits             # Minimal FITS with header only
│   ├── sample.asdf             # Minimal ASDF with known tree
│   ├── sample.parquet          # Minimal Parquet with schema
│   └── sample.json             # Sample JSON object
├── .github/
│   └── workflows/
│       ├── ci.yml              # Test + lint on push/PR
│       └── release.yml         # PyPI publish + Homebrew formula bump on tag
├── .env.example                # Documented env vars; never committed with values
└── CHANGELOG.md
Prerequisites
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.11+ | CPython; PyPy untested |
| AWS credentials | any valid chain | ~/.aws/credentials, env vars, or instance profile |
| IAM permissions | s3:ListBucket, s3:GetObject | s3:GetObjectAttributes for stat; no write permissions needed |
| xclip or xsel (Linux) | any | For clipboard copy; optional (URL printed to stdout if absent) |
| macOS | 12+ | pbcopy built-in; no extra deps |
IAM minimum policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetObject", "s3:GetObjectAttributes"],
      "Resource": ["arn:aws:s3:::YOUR_BUCKET", "arn:aws:s3:::YOUR_BUCKET/*"]
    }
  ]
}
For pre-signed URLs: the caller’s identity signs the URL. The recipient does not need AWS credentials. No s3:PutObject or s3:GetBucketPolicy required.
Quick Start
Install via Homebrew (macOS / Linux with Linuxbrew)
brew tap ejoliet/tap
brew install s3peek
Install via tarball (Linux, no Homebrew)
curl -fsSL https://github.com/ejoliet/s3peek/releases/latest/download/s3peek-linux-x86_64.tar.gz \
| tar -xz -C ~/.local/bin
chmod +x ~/.local/bin/s3peek
Install via pip / uv
pip install s3peek
# or
uv tool install s3peek
First run
# Browse a bucket interactively
s3peek browse s3://my-bucket/
# Quicklook a file header
s3peek peek s3://my-bucket/data/obs001.fits
# Copy a pre-signed URL to clipboard (1-day default)
s3peek share s3://my-bucket/data/obs001.fits
# Pre-signed URL with custom expiry
s3peek share s3://my-bucket/data/obs001.fits --expires 4h
Configuration Reference
All settings can be set via environment variable or ~/.config/s3peek/config.toml.
| Env var | Type | Default | Description |
|---|---|---|---|
| S3PEEK_DEFAULT_EXPIRY | string | 1d | Default pre-signed URL expiry. Format: Xd, Xh, Xm |
| S3PEEK_AWS_PROFILE | string | default | AWS CLI profile name to use |
| S3PEEK_AWS_REGION | string | us-east-1 | AWS region for S3 requests |
| S3PEEK_FITS_MAX_HDUS | int | 10 | Max HDUs (Header/Data Units) to display in FITS quicklook |
| S3PEEK_PARQUET_MAX_COLS | int | 50 | Max columns to show in Parquet schema quicklook |
| S3PEEK_CLIPBOARD | bool | true | Auto-copy pre-signed URL to clipboard |
| S3PEEK_PAGE_SIZE | int | 200 | S3 list_objects_v2 page size for browser |
| S3PEEK_THEME | string | dark | TUI theme: dark or light |
~/.config/s3peek/config.toml example:
default_expiry = "1d"
aws_profile = "roman-dev"
aws_region = "us-east-1"
fits_max_hdus = 5
clipboard = true
API / Interface Contract
CLI commands
s3peek [OPTIONS] COMMAND [ARGS]
Commands:
browse Interactive TUI browser for a bucket or prefix
peek Print header/schema of a single S3 object to stdout
share Generate a pre-signed URL; copy to clipboard
ls Non-interactive list (like aws s3 ls, but with size/type cols)
version Print version and exit
Options:
--profile TEXT AWS profile [env: S3PEEK_AWS_PROFILE]
--region TEXT AWS region [env: S3PEEK_AWS_REGION]
--no-color Disable ANSI color output
--help Show this message and exit
s3peek browse
s3peek browse S3_URI [OPTIONS]
Arguments:
S3_URI s3://bucket[/prefix] required
Options:
--page-size INT Objects per page [default: 200]
TUI keybindings:
↑ / ↓ Navigate list
Enter Descend into prefix / open peek for object
Backspace Go up one prefix level
p Peek selected object (header quicklook)
s Share selected object (pre-signed URL)
/ Filter (fuzzy, case-insensitive)
q Quit
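The spec does not say which fuzzy algorithm the / filter uses. One common choice, assumed here purely for illustration, is case-insensitive subsequence matching: a key matches when the characters of the pattern appear in it in order, not necessarily adjacent. fuzzy_match and filter_keys are hypothetical names.

```python
def fuzzy_match(pattern: str, candidate: str) -> bool:
    """True if pattern's characters appear in candidate in order (case-insensitive)."""
    chars = iter(candidate.lower())
    # `ch in chars` consumes the iterator up to the match, so each pattern
    # character must be found after the position of the previous one.
    return all(ch in chars for ch in pattern.lower())


def filter_keys(pattern: str, keys: list[str]) -> list[str]:
    """Keep only the keys that fuzzy-match the pattern, preserving order."""
    return [key for key in keys if fuzzy_match(pattern, key)]


keys = ["data/obs001.fits", "data/obs002.asdf", "catalog/stars.parquet"]
print(filter_keys("O1F", keys))  # ['data/obs001.fits']
```

An empty pattern matches every key, which is a convenient behaviour for a filter box that starts out empty.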
s3peek peek
s3peek peek S3_URI [OPTIONS]
Arguments:
S3_URI s3://bucket/key required
Options:
--format [fits|asdf|parquet|json|auto] Force format [default: auto]
--output [text|yaml|json] Output format [default: text]
--max-bytes INT Max bytes for range-GET [default: 65536]
Exit codes:
0 success
1 S3 access error
2 format not supported
3 parse error (file exists but header unreadable)
s3peek share
s3peek share S3_URI [OPTIONS]
Arguments:
S3_URI s3://bucket/key required
Options:
--expires TEXT Expiry: Xd, Xh, Xm [default: 1d, max: 7d]
--no-clipboard Print URL only; do not copy to clipboard
--qr Print QR code to terminal (requires `qrcode` extra)
Output (stdout):
Pre-signed URL as plain text (always printed regardless of --no-clipboard)
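The --expires value has to be converted to seconds and checked against the 7-day limit before a URL is signed. A minimal sketch of that step: parse_expiry and the two error class names come from this README, while the regular expression, the rejection of zero or zero-padded values, and the use of ValueError as a base class are assumptions.

```python
import re

MAX_EXPIRY_SECONDS = 604800  # 7 days
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60}


class PresignExpirySyntaxError(ValueError):
    """Raised when the expiry string is not of the form Xd, Xh or Xm."""


class PresignExpiryTooLongError(ValueError):
    """Raised when the expiry exceeds 7 days."""


def parse_expiry(text: str) -> int:
    """Convert an expiry such as '1d', '6h' or '30m' into seconds."""
    match = re.fullmatch(r"([1-9]\d*)([dhm])", text.strip())
    if match is None:
        raise PresignExpirySyntaxError("Invalid expiry format. Use: 1d, 6h, 30m")
    seconds = int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
    if seconds > MAX_EXPIRY_SECONDS:
        raise PresignExpiryTooLongError("Maximum expiry is 7 days (604800 seconds)")
    return seconds


print(parse_expiry("4h"))  # 14400
print(parse_expiry("7d"))  # 604800
```

A value of 8d raises PresignExpiryTooLongError, which the CLI would report on stderr with exit code 1.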
Quicklook output contract
Each format reader returns a HeaderResult object:
from dataclasses import dataclass, field
from typing import Any

@dataclass
class HeaderResult:
    format: str                      # "fits" | "asdf" | "parquet" | "json"
    s3_uri: str
    size_bytes: int | None           # None if unavailable
    headers: list[dict[str, Any]]    # one dict per HDU (FITS) or one (others)
    truncated: bool = False          # True if range-GET hit max_bytes
    error: str | None = None         # set on parse failure
FITS headers entry structure:
{
    "hdu_index": 0,
    "hdu_type": "PrimaryHDU",   # HDU type string from astropy
    "naxis": 2,
    "shape": [2048, 2048],
    "cards": {"SIMPLE": True, "BITPIX": -32, ...}
}
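To see why a fixed-size range-GET can leave a FITS header incomplete (the case the truncated flag reports), it helps to recall the on-disk layout: a header is a sequence of 80-byte ASCII cards, padded to a multiple of 2880 bytes, and ends with a card containing only END. The sketch below is a stdlib-only illustration of that layout, not the reader the spec calls for (which parses with astropy); primary_header_cards is a hypothetical name.

```python
CARD = 80     # bytes per header card
BLOCK = 2880  # FITS headers are padded to multiples of 2880 bytes


def primary_header_cards(data: bytes) -> tuple[list[str], bool]:
    """Return (cards, complete) for the first header found in data.

    complete is False when no END card was reached, meaning the fetched
    byte range was too short to hold the whole header.
    """
    cards: list[str] = []
    for offset in range(0, len(data) - CARD + 1, CARD):
        card = data[offset:offset + CARD].decode("ascii", errors="replace").rstrip()
        if card == "END":
            return cards, True
        cards.append(card)
    return cards, False


def _card(text: str) -> bytes:
    """Pad a card to 80 bytes, as the FITS layout requires."""
    return text.ljust(CARD).encode("ascii")


# Build a tiny synthetic header: three cards, END, padding to one block.
header = (
    _card("SIMPLE  =                    T")
    + _card("BITPIX  =                  -32")
    + _card("NAXIS   =                    0")
    + _card("END")
).ljust(BLOCK, b" ")

cards, complete = primary_header_cards(header)
print(len(cards), complete)                    # 3 True
print(primary_header_cards(header[:160])[1])   # False
```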
ASDF headers entry structure:
{
    "asdf_version": "1.6.0",
    "tree": { ... }   # full YAML tree dict, no array data
}
Parquet headers entry structure:
{
    "num_rows": 1048576,
    "num_row_groups": 4,
    "schema": [
        {"name": "ra", "type": "DOUBLE", "nullable": False},
        ...
    ],
    "metadata": { ... }   # file-level key/value metadata
}
JSON headers entry structure:
{
    "type": "object",               # top-level JSON type
    "keys": ["ra", "dec", "mag"],   # top-level keys if object
    "length": 3                     # array length if top-level is array
}
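A JSON headers entry of this shape can be derived with the standard library alone. The sketch below assumes that keys is present only for objects and length only for arrays (the example above shows both fields together for documentation purposes), and that the type names follow JSON terminology. summarize_json is a hypothetical name; it uses json.loads with a raw_decode fallback so that a stream of concatenated objects yields a summary of the first one.

```python
import json
from typing import Any


def _json_type(value: Any) -> str:
    """Map a decoded Python value to its JSON type name."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, str):
        return "string"
    if isinstance(value, bool):
        return "boolean"
    if value is None:
        return "null"
    return "number"


def summarize_json(raw: bytes) -> dict[str, Any]:
    """Build a headers entry from the fetched bytes of a JSON object."""
    text = raw.decode("utf-8", errors="replace")
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        # raw_decode parses the first complete value and ignores what follows.
        # It does not skip leading whitespace, hence the lstrip().
        value, _end = json.JSONDecoder().raw_decode(text.lstrip())
    entry: dict[str, Any] = {"type": _json_type(value)}
    if isinstance(value, dict):
        entry["keys"] = list(value)
    elif isinstance(value, list):
        entry["length"] = len(value)
    return entry


print(summarize_json(b'{"ra": 10.5, "dec": -3.2, "mag": 17.1}'))
print(summarize_json(b'{"a": 1}\n{"b": 2}\n'))
```

A single large document cut off mid-value by the byte limit still fails in raw_decode; in the tool that case would surface as a parse error rather than a summary.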
Data Model
No persistent storage. All runtime state lives in:
from dataclasses import dataclass, field

@dataclass
class SessionState:
    bucket: str
    prefix: str = ""
    history: list[str] = field(default_factory=list)   # navigation breadcrumb
    selected_key: str | None = None
Config is loaded once at startup into:
from pydantic import BaseModel

class Config(BaseModel):
    default_expiry: str = "1d"
    aws_profile: str = "default"
    aws_region: str = "us-east-1"
    fits_max_hdus: int = 10
    parquet_max_cols: int = 50
    clipboard: bool = True
    page_size: int = 200
    theme: str = "dark"
Error Handling
| Error class | When raised | Exit code | User message |
|---|---|---|---|
| S3AccessError | NoCredentialsError, ClientError 403 | 1 | "AWS credentials missing or insufficient permissions" |
| S3KeyNotFoundError | ClientError 404 | 1 | "Object not found: s3://..." |
| FormatNotSupportedError | Extension not in supported list | 2 | "Format not supported. Supported: fits, asdf, parquet, json" |
| QuicklookParseError | Header bytes unreadable | 3 | "Could not parse header — file may be truncated or corrupt" |
| PresignExpirySyntaxError | Expiry string invalid | 1 | "Invalid expiry format. Use: 1d, 6h, 30m" |
| PresignExpiryTooLongError | Expiry > 7 days | 1 | "Maximum expiry is 7 days (604800 seconds)" |
All errors write to stderr. stdout is reserved for data output only.
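One way to keep the exit codes and the stderr/stdout split consistent is to attach the exit code to each exception class and translate exceptions in a single place. The sketch below assumes a common base class S3PeekError and a wrapper named run, neither of which appears in the spec; only three of the error classes are shown.

```python
import sys
from typing import Any, Callable


class S3PeekError(Exception):
    """Hypothetical base class; subclasses carry their exit code."""
    exit_code = 1


class S3AccessError(S3PeekError):
    exit_code = 1


class FormatNotSupportedError(S3PeekError):
    exit_code = 2


class QuicklookParseError(S3PeekError):
    exit_code = 3


def run(command: Callable[..., Any], *args: Any) -> int:
    """Run a command; data goes to stdout, errors to stderr, code is returned."""
    try:
        output = command(*args)
    except S3PeekError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return exc.exit_code
    print(output)
    return 0


def unsupported() -> str:
    raise FormatNotSupportedError(
        "Format not supported. Supported: fits, asdf, parquet, json"
    )


print(run(lambda: "header text"))  # prints the data, then 0
print(run(unsupported))            # message goes to stderr, then 2
```

A CLI entry point could pass the returned code to sys.exit, so scripts that pipe stdout never see error text mixed into the data.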
Testing
# Run full suite
make test
# With coverage report
make test-cov
# Lint only
make lint
# Single module
pytest tests/test_quicklook.py -v
Test matrix
| Suite | Scope | Fixtures |
|---|---|---|
| test_s3.py | list, stat, range-GET | moto S3 mock; fixtures/ uploaded at setup |
| test_quicklook.py | all four format readers | fixtures/sample.{fits,asdf,parquet,json} |
| test_presign.py | URL generation, expiry parsing, clipboard skip | moto + monkeypatched pyperclip |
| test_cli.py | all CLI commands, exit codes, --output json | moto + Typer CliRunner |
Constraint: Tests must never hit real AWS endpoints. moto mocking is mandatory.
Deployment / Installation Targets
Homebrew (primary macOS + Linux)
The Formula/s3peek.rb formula is auto-generated by release.yml on tag push.
Manual formula update (for maintainer):
make brew-bump VERSION=0.2.0 SHA256=<sha256_of_tarball>
Standalone binary (PyInstaller)
make build-binary # outputs dist/s3peek (macOS) or dist/s3peek-linux
CI builds for macos-latest and ubuntu-latest via GitHub Actions matrix.
Artifacts uploaded to GitHub Release assets.
pip / uv
pip install s3peek
uv tool install s3peek # preferred; isolated env
Linux tarball (no package manager)
# Built by release.yml; SHA256 verified in formula
curl -fsSL https://github.com/ejoliet/s3peek/releases/latest/download/s3peek-linux-x86_64.tar.gz \
| tar -xz -C ~/.local/bin
Security
- Pre-signed URLs are signed with the caller’s temporary or long-term AWS credentials. They do not grant any additional IAM permissions beyond what the signing identity has.
- Max expiry is hard-capped at 7 days — the AWS maximum for signature v4 pre-signed URLs with IAM user credentials; STS session tokens cap at session duration.
- No credentials stored by s3peek itself. The tool is read-only by design (no s3:PutObject).
- Clipboard warning: if S3PEEK_CLIPBOARD=true (default), the pre-signed URL is silently written to the system clipboard. Users sharing their screen should be aware.
Non-Goals (v1)
- No upload or delete operations: read-only tool; no s3:PutObject, s3:DeleteObject
- No recursive download: use aws s3 sync for that
- No full file render: quicklook reads headers/schema only; not a FITS image viewer
- No bucket creation or policy management
- No Windows support — macOS and Linux only in v1; PRs welcome
- No auth management — does not create or rotate AWS credentials
- No HiPS / MOC / VOTable quicklook — v2 candidate
- No interactive diff between two S3 keys
Open Questions
| # | Question | Blocking? | Owner |
|---|---|---|---|
| 1 | Should s3peek browse support paginating >10k objects with a spinner, or just limit to first page_size? | No (default to limit; add pagination in v2) | Emmanuel |
| 2 | qrcode extra for --qr flag: include in default install or opt-in extra? | No (make it an extra: pip install s3peek[qr]) | Emmanuel |
| 3 | Should FITS quicklook display WCS (World Coordinate System) summary if present? | No (nice-to-have for v2) | open |
| 4 | For ASDF files from Roman pipeline: should asdf-astropy be a hard dep or optional? | No (optional extra: pip install s3peek[astro]) | Emmanuel |
Agent Build Instructions
This section is the authoritative build specification. A coding agent should implement this tool end-to-end using only this README — no clarifying questions needed.
Build Order
| Phase | Deliverable | Done when |
|---|---|---|
| 0 | Repo scaffold + CI skeleton | make lint passes on empty project; GitHub Actions runs |
| 1 | S3 abstraction layer (s3.py) | test_s3.py passes with moto; list, stat, range-GET work |
| 2 | Quicklook engine (quicklook.py) | test_quicklook.py passes for all 4 formats against fixtures |
| 3 | Presign module (presign.py) | test_presign.py passes; clipboard copy mocked; expiry parsing correct |
| 4 | CLI commands (cli.py), non-TUI first | test_cli.py passes for peek, share, ls, version |
| 5 | TUI browser (browser.py) | Manual smoke test: arrow navigation + p/s keys work |
| 6 | Build + packaging | make build-binary succeeds; brew install from local formula |
File Map
| File | Purpose | Key symbols |
|---|---|---|
| src/s3peek/config.py | Pydantic config model; env var + TOML loading | class Config(BaseModel) |
| src/s3peek/s3.py | S3 list, stat, range-GET via boto3 | list_prefix(), stat_object(), range_get() |
| src/s3peek/quicklook.py | Format dispatch + four readers | quicklook(), _read_fits(), _read_asdf(), _read_parquet(), _read_json() |
| src/s3peek/presign.py | URL generation + expiry parsing + clipboard | generate_presigned_url(), parse_expiry(), copy_to_clipboard() |
| src/s3peek/browser.py | Textual TUI app and widget | S3Browser(App), ObjectList(Widget) |
| src/s3peek/cli.py | Typer app; all commands | app = typer.Typer(), browse, peek, share, ls, version |
| tests/conftest.py | moto fixtures; fixture file upload | s3_client, populated_bucket |
| tests/test_s3.py | S3 layer tests | test_list_prefix, test_range_get |
| tests/test_quicklook.py | Format reader tests | one test per format; error path tests |
| tests/test_presign.py | Presign + expiry tests | test_expiry_parsing, test_url_structure |
| tests/test_cli.py | CLI integration tests | test_peek_fits, test_share_no_clipboard, test_ls |
| Makefile | Dev commands | lint, test, test-cov, build-binary, brew-bump |
| pyproject.toml | Build + deps + entry point | [project.scripts] s3peek = "s3peek.cli:app" |
| Formula/s3peek.rb | Homebrew formula | url, sha256, depends_on blocks |
Constraints
- Python 3.11+ only. There is no need to avoid match or other syntax missing from Python < 3.10; use 3.11+ syntax freely.
- Range-GET for FITS: read first 65536 bytes (configurable via --max-bytes). Parse with astropy.io.fits.open(BytesIO(...), ignore_missing_end=True).
- ASDF range-GET: read first 65536 bytes; open with asdf.open(BytesIO(...), lazy_load=True, copy_arrays=False).
- Parquet range-GET: use pyarrow.parquet.ParquetFile(pa.BufferReader(bytes)), which reads the footer from the end; for range-GET, fetch the last 65536 bytes (the footer is at the end of the file in Parquet format).
- JSON: fetch first 65536 bytes; parse with json.loads; on parse failure try json.JSONDecoder().raw_decode() for streaming objects.
- Pre-signed URL expiry: parse Xd/Xh/Xm → seconds. Cap at 604800 (7 days). Error on invalid format.
- Clipboard: use pyperclip; catch pyperclip.PyperclipException and fall back to stdout-only with a warning.
- Tests must use moto (@mock_aws decorator). No real boto3 calls in tests.
- All public functions must have typed signatures and docstrings.
- ruff + mypy must pass at zero warnings.
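For the Parquet constraint, a fixed 65536-byte tail is enough only when the footer is no larger than that. A Parquet file ends with the footer metadata, a 4-byte little-endian footer length, and the 4-byte magic PAR1, so the fetched tail itself says how many bytes are actually needed. The sketch below is a stdlib-only check under that layout; footer_fits_in_tail is a hypothetical helper, and encrypted-footer files (which use a different magic) are out of scope.

```python
import struct

PARQUET_MAGIC = b"PAR1"


def footer_fits_in_tail(tail: bytes) -> tuple[bool, int]:
    """Check whether a suffix range-GET captured the whole Parquet footer.

    Returns (complete, bytes_needed), where bytes_needed is the suffix size
    that covers the footer metadata, its 4-byte length, and the magic.
    """
    if len(tail) < 8 or tail[-4:] != PARQUET_MAGIC:
        raise ValueError("not a Parquet file tail (missing PAR1 magic)")
    # The 4 bytes before the trailing magic hold the footer length.
    (footer_len,) = struct.unpack("<I", tail[-8:-4])
    needed = footer_len + 8
    return len(tail) >= needed, needed


# Synthetic tail: 100 bytes standing in for footer metadata.
tail = b"x" * 100 + struct.pack("<I", 100) + PARQUET_MAGIC
print(footer_fits_in_tail(tail))        # (True, 108)
print(footer_fits_in_tail(tail[-50:]))  # (False, 108)
```

When the check reports an incomplete footer, one option is a second suffix request for bytes_needed bytes before handing the buffer to pyarrow; whether s3peek does this or simply sets truncated is not specified here.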
Acceptance Criteria
- make test passes with ≥ 80% coverage
- make lint passes (ruff check + mypy --strict)
- s3peek peek s3://test-bucket/sample.fits prints HDU table to stdout
- s3peek share s3://test-bucket/sample.fits --no-clipboard prints a valid pre-signed URL
- s3peek share s3://test-bucket/sample.fits --expires 8d exits with code 1 and error on stderr
- s3peek browse s3://test-bucket/ launches TUI without crash (manual check)
- make build-binary produces a standalone executable that runs on macOS and Linux
- brew install --build-from-source Formula/s3peek.rb succeeds locally
- All Open Questions resolved or explicitly deferred to v2 in CHANGELOG
Next Steps
Ordered agent task list:
1. git init s3peek && cd s3peek: initialise repo
2. Create pyproject.toml with deps: typer, textual, boto3, astropy, asdf, pyarrow, pyperclip, pydantic, tomli; dev deps: moto[s3], pytest, pytest-cov, ruff, mypy
3. Scaffold directory tree per Repository Layout
4. Create fixtures/ with minimal valid sample files (use astropy, asdf, pyarrow to generate)
5. Implement config.py → pass test_config.py
6. Implement s3.py → pass test_s3.py
7. Implement quicklook.py → pass test_quicklook.py for all 4 formats
8. Implement presign.py → pass test_presign.py
9. Implement cli.py (non-TUI commands first) → pass test_cli.py
10. Implement browser.py (Textual TUI) → manual smoke test
11. Write Makefile with lint, test, test-cov, build-binary, brew-bump targets
12. Set up .github/workflows/ci.yml (test matrix: macOS + ubuntu, Python 3.11/3.12)
13. Set up .github/workflows/release.yml (tag → PyPI publish + binary upload + formula bump)
14. Write Formula/s3peek.rb template; validate with brew audit
15. Resolve all Open Questions; update CHANGELOG