tap-substack

A Singer tap for extracting data from Substack newsletters, built with the Meltano Singer SDK.

What's Working

Streams

| Stream | Endpoint | Auth | Status | Description |
| --- | --- | --- | --- | --- |
| posts | /api/v1/archive | No | Working | All published posts (paginated, sorted by date) |
| post_details | /api/v1/posts/{slug} | No | Working | Full HTML content per post (child of posts) |
| comments | /api/v1/post/{id}/comments | No | Working | Comments per post (child of posts) |
| dashboard_summary | /api/v1/publish-dashboard/summary | Yes | Working | High-level publication metrics (subscribers, views, open rate) |
| email_stats | /api/v1/publication/stats/email_stats | Yes | Working | Per-post email delivery and engagement metrics |
| recommendations_inbound | /api/v1/recommendations/stats/to | Yes | Working | Publications recommending you + signup stats |
| recommendations_outbound | /api/v1/recommendations/stats/from | Yes | Working | Publications you recommend + signup stats |
| subscribers | /api/v1/publication/subscribers | Yes | Working | Subscriber list with email, type, source |
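
To illustrate how the two child streams relate to a parent posts record, here is a minimal sketch that fills the endpoint templates from the table. The base URL pattern `https://{subdomain}.substack.com`, the `child_urls` helper, and the assumption that each posts record carries `slug` and `id` fields are illustrative, not taken from the tap's source.

```python
# Endpoint templates copied from the streams table.
CHILD_ENDPOINTS = {
    "post_details": "/api/v1/posts/{slug}",
    "comments": "/api/v1/post/{id}/comments",
}

# Assumed base URL pattern (hypothetical; not confirmed by the tap's source).
BASE_URL = "https://{subdomain}.substack.com"


def child_urls(subdomain, post):
    """Build the child-stream URLs for one parent `posts` record.

    `post` is assumed to be a dict with `slug` and `id` keys.
    """
    base = BASE_URL.format(subdomain=subdomain)
    return {
        name: base + template.format(slug=post["slug"], id=post["id"])
        for name, template in CHILD_ENDPOINTS.items()
    }


urls = child_urls("newsletter", {"id": 123, "slug": "hello-world"})
print(urls["post_details"])
print(urls["comments"])
```

In the tap itself this hand-off is presumably handled by the Singer SDK's parent-child stream mechanism rather than by a helper like this one.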

Features

  • Public + authenticated modes: Runs without auth for public data; add session_token to unlock dashboard/analytics streams
  • Pagination: Offset-based pagination for list endpoints (posts, email_stats, subscribers)
  • Rate limiting: Built-in 0.5s request throttle to avoid 429s
  • Graceful auth errors: Authenticated streams log warnings and skip on 403/404/429 instead of crashing
  • Parent-child streams: post_details and comments automatically iterate over posts from the posts stream
  • Incremental replication: The posts stream replicates incrementally, using post_date as its replication_key
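
The pagination and throttling behavior described above can be sketched roughly as follows. This is not the tap's implementation: `fetch_page` is a hypothetical callable standing in for the HTTP request, and the page size of 25 is an assumption. Only the 0.5s pause comes from the feature list.

```python
import time


def paginate(fetch_page, limit=25, throttle_seconds=0.5, sleep=time.sleep):
    """Yield records from an offset-paginated endpoint.

    `fetch_page(offset, limit)` is a hypothetical callable that returns the
    list of records for one page, or an empty list when nothing is left.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        if not page:
            break  # empty page: nothing more to read
        yield from page
        if len(page) < limit:
            break  # short page: this was the last one
        offset += len(page)
        sleep(throttle_seconds)  # pause between requests to avoid 429s


# Usage with an in-memory stand-in for the API.
fake_posts = [{"id": i} for i in range(60)]
records = list(
    paginate(lambda offset, limit: fake_posts[offset:offset + limit],
             sleep=lambda seconds: None)  # skip real sleeping in the demo
)
print(len(records))
```

Injecting `sleep` keeps the throttle testable without slowing down a test run.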

Installation

pip install -e .

Configuration

| Setting | Required | Description |
| --- | --- | --- |
| subdomain | Yes | Your Substack subdomain (e.g. newsletter for newsletter.substack.com) |
| session_token | No | Your substack.sid cookie; needed only for the authenticated streams (see below) |
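
As a rough illustration, a config file with these two settings could be generated like this. The `write_config` helper is hypothetical, and the file name `config.json` simply mirrors the usage examples. Because the feature list says public streams run without auth, the sketch treats `session_token` as optional and omits it when not supplied.

```python
import json
from pathlib import Path


def write_config(path, subdomain, session_token=None):
    """Write a tap config file and return the dict that was written.

    `session_token` is left out entirely when not provided, which is
    assumed to select public-only mode.
    """
    config = {"subdomain": subdomain}
    if session_token:
        config["session_token"] = session_token
    Path(path).write_text(json.dumps(config, indent=2))
    return config


# Public-only config for newsletter.substack.com.
print(write_config("config.json", "newsletter"))
```

Since the session token is a login credential, it is safer to keep the generated file out of version control.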

Getting your session token

  1. Log in to your Substack dashboard
  2. Open browser DevTools > Application > Cookies
  3. Copy the value of the substack.sid cookie
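
To check that a copied token is usable, one option is to send it back as the substack.sid cookie on a request to an authenticated endpoint. The sketch below only builds the request. The `build_request` helper and the base URL pattern are assumptions, and the README does not say how the tap itself attaches the token.

```python
import urllib.request


def build_request(subdomain, path, session_token=None):
    """Build a request for a Substack endpoint, adding the cookie if given.

    The `https://{subdomain}.substack.com` pattern is an assumption.
    """
    url = f"https://{subdomain}.substack.com{path}"
    request = urllib.request.Request(url)
    if session_token:
        # Send the token as the same cookie the browser stores.
        request.add_header("Cookie", f"substack.sid={session_token}")
    return request


request = build_request(
    "newsletter", "/api/v1/publish-dashboard/summary", "PASTE_TOKEN_HERE"
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the actual call; a 403 response would suggest the token is missing or expired.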

Usage

# Discovery mode
tap-substack --config sample_config.json --discover

# Sync public data only
tap-substack --config sample_config.json

# Pipe to a target
tap-substack --config config.json | target-jsonl
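
For a quick sanity check of a sync, the tap's output can also be inspected directly. Singer taps write one JSON message per line, with a `type` of SCHEMA, RECORD, or STATE. The sketch below counts RECORD messages per stream. The `count_records` helper is hypothetical and not part of the tap.

```python
import json
from collections import Counter


def count_records(lines):
    """Count Singer RECORD messages per stream from an iterable of JSON lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        message = json.loads(line)
        if message.get("type") == "RECORD":
            counts[message["stream"]] += 1
    return counts


# Example with hand-written messages in place of real tap output.
sample = [
    '{"type": "SCHEMA", "stream": "posts", "schema": {}, "key_properties": ["id"]}',
    '{"type": "RECORD", "stream": "posts", "record": {"id": 1}}',
    '{"type": "RECORD", "stream": "posts", "record": {"id": 2}}',
    '{"type": "STATE", "value": {}}',
]
print(dict(count_records(sample)))
```

To use it on a real run, pipe the tap's output into a script that calls `count_records(sys.stdin)`.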

Development

# Set up
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

# Run against a test newsletter (public only)
tap-substack --config test_config.json
