
Bluesky/AT Protocol social-graph CLI — twecoll equivalent


skycoll

skycoll is a Bluesky/AT Protocol social-graph CLI tool — the equivalent of twecoll for the ATmosphere.

It resolves identities, fetches social graphs, downloads posts and likes via CAR repo sync, reconstructs reply threads, and produces GML graph files (with optional PNG visualisations).

Installation

From source

git clone https://github.com/j4ckxyz/skycoll.git
cd skycoll
pip install -e .

# Optional: graph visualisation support
pip install -e ".[graph]"

Dependencies

Core: httpx, atproto, cryptography, cbor2

Optional: python-igraph (for PNG graph rendering)

Dev: pytest, pytest-httpx

First-run OAuth flow

The first time you run a command that requires authentication (init, fetch, posts, likes), skycoll will:

  1. Resolve your handle to a DID and PDS endpoint.
  2. Discover the OAuth 2.0 authorisation server from your PDS.
  3. Use localhost-development OAuth client mode (client_id on http://localhost with redirect_uri query parameter), and start a temporary HTTP server on 127.0.0.1:<random-port> only to receive the callback.
  4. Open your browser for you to authorise the request (scopes: atproto transition:generic).
  5. Exchange the authorisation code using PKCE (S256) and bind it with DPoP (ES256).
  6. Save the session to ~/.skycoll/sessions/<did>.json (mode 0600).
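
The PKCE part of step 5 can be sketched with the standard library alone. This is a minimal illustration of the S256 method from RFC 7636, not skycoll's actual code; the function names are hypothetical.

```python
import base64
import hashlib
import secrets


def s256_challenge(verifier: str) -> str:
    """Derive the S256 code challenge: base64url(SHA-256(verifier)), no padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple:
    """Return (code_verifier, code_challenge) for one authorisation request."""
    # 32 random bytes encode to a 43-character URL-safe verifier,
    # the minimum length RFC 7636 allows.
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    return verifier, s256_challenge(verifier)
```

The challenge is sent with the authorisation request; the verifier is only revealed later, during the code exchange.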

On subsequent runs, the saved session is reused and refreshed automatically when within 60 seconds of token expiry.
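
As a rough sketch of the session handling described above: the file is written with mode 0600 and a refresh is triggered inside the 60-second window. The `expires_at` field name (a Unix timestamp) and both function names are assumptions for illustration; skycoll's real session schema may differ.

```python
import json
import os
import time
from pathlib import Path
from typing import Optional

REFRESH_MARGIN_SECONDS = 60  # refresh window stated in the text above


def save_session(path: Path, session: dict) -> None:
    """Write the session as JSON, readable and writable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(session, fh)
    # If the file already existed with looser permissions, tighten them.
    os.chmod(path, 0o600)


def needs_refresh(session: dict, now: Optional[float] = None) -> bool:
    """True when the token expires within the refresh margin (or already has)."""
    now = time.time() if now is None else now
    # "expires_at" is a hypothetical field holding a Unix timestamp.
    return session["expires_at"] - now <= REFRESH_MARGIN_SECONDS
```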

NOTE: The transition:generic scope is broader than skycoll needs; skycoll uses it only for reads and like deletion. When AT Protocol proposal 0011 (granular scopes) stabilises, this should be narrowed to only app.bsky.feed.* reads and app.bsky.feed.like delete.

Commands

resolve

Resolve a handle to a DID (or a DID to a handle + PDS endpoint).

skycoll resolve j4ck.xyz
skycoll resolve did:plc:z72i7hdynmk6r22z27h6tvae

init

Fetch your profile, follows, and followers. Writes <handle>.dat and downloads avatars to img/.

skycoll init j4ck.xyz

# Also fetch lists you've created
skycoll init j4ck.xyz --lists

# Include self-labels and server-assigned labels
skycoll init j4ck.xyz --labels

# Route through the Blacksky AppView
skycoll init j4ck.xyz --appview blacksky

# Query a Constellation backlinks index
skycoll init j4ck.xyz --constellation https://constellation.example.com

# All flags combined
skycoll init j4ck.xyz --lists --labels --appview blacksky --constellation https://constellation.example.com

The .dat file includes:

  • Profile header row with labels column
  • F rows for follows
  • B rows for followers
  • L rows for lists (with --lists)
  • S rows for starter packs
  • K rows for Constellation backlink counts (with --constellation)

By default, labels are omitted from the profile row. Use --labels to include server/self labels in the .dat profile header.
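
For downstream scripts, a .dat file could be split by row type roughly as follows. This sketch assumes that the first tab-separated field of each non-header row is the one-letter prefix listed above, and that the first row without such a prefix is the profile header. The column layout after the prefix is not documented here, so the remaining fields are left unparsed.

```python
from collections import defaultdict

# Prefix letters taken from the list above; the dictionary keys on the
# right-hand side are illustrative names only.
ROW_KINDS = {
    "F": "follows",
    "B": "followers",
    "L": "lists",
    "S": "starter_packs",
    "K": "backlinks",
}


def group_dat_rows(lines):
    """Return (header_fields, {kind: [fields, ...]}) for an iterable of lines."""
    header = None
    rows = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        kind = ROW_KINDS.get(fields[0])
        if kind is None:
            # Assumption: the first unprefixed row is the profile header.
            if header is None:
                header = fields
            continue
        rows[kind].append(fields[1:])
    return header, dict(rows)
```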

appview flag

Several commands accept --appview to route API requests through a specific Bluesky-compatible AppView. This sets the atproto-proxy HTTP header to a service DID, rather than hardcoding a base URL.

Built-in names:

Name      Service DID                                  Description
bluesky   did:web:api.bsky.app#bsky_appview            Bluesky official AppView (default)
blacksky  did:web:api.blacksky.community#bsky_appview  Blacksky community AppView

You can also pass a raw DID+fragment string for custom AppViews:

skycoll init j4ck.xyz --appview did:web:custom.example#bsky_appview
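
To illustrate how an --appview value might map to the atproto-proxy header, here is a small sketch. The two built-in DIDs come from the table above; the function name and the validation rule for raw DIDs are assumptions, not skycoll's implementation.

```python
BUILTIN_APPVIEWS = {
    "bluesky": "did:web:api.bsky.app#bsky_appview",
    "blacksky": "did:web:api.blacksky.community#bsky_appview",
}


def proxy_headers(appview: str = "bluesky") -> dict:
    """Build the atproto-proxy header for a built-in name or a raw DID+fragment."""
    if appview in BUILTIN_APPVIEWS:
        service = BUILTIN_APPVIEWS[appview]
    elif appview.startswith("did:") and "#" in appview:
        # Raw service reference such as did:web:custom.example#bsky_appview
        service = appview
    else:
        raise ValueError(f"unknown AppView: {appview!r}")
    return {"atproto-proxy": service}
```

Because only a header changes, the request still goes to the user's own PDS, which forwards it to the named service.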

appviews

List the built-in AppView names and their service DIDs:

skycoll appviews

fetch

Fetch the follows of every person listed in <handle>.dat. Writes one fdat/<friend>.dat per followed user.

skycoll fetch j4ck.xyz

posts

Download posts using paginated getAuthorFeed (default, no cap — pages until cursor is exhausted):

skycoll posts j4ck.xyz

Use --car for full CAR repo sync (slower but gives a complete archive including all record types):

skycoll posts j4ck.xyz --car

Output is written to <handle>.twt as tab-separated columns: type uri timestamp reply_to_uri root_uri text

The type column is one of post, repost, or quote.
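
A reader for this format might look like the sketch below. It assumes six tab-separated fields per line, with empty strings in reply_to_uri and root_uri for posts that are not replies; how skycoll escapes tabs or newlines inside text is not specified here, so this version simply keeps everything after the fifth tab as the text.

```python
from dataclasses import dataclass
from typing import Optional

VALID_TYPES = {"post", "repost", "quote"}


@dataclass
class TwtRow:
    type: str
    uri: str
    timestamp: str
    reply_to_uri: Optional[str]
    root_uri: Optional[str]
    text: str


def parse_twt_line(line: str) -> TwtRow:
    """Parse one .twt line into a TwtRow, raising ValueError on malformed input."""
    # Split at most five times so stray tabs stay inside the text column.
    parts = line.rstrip("\n").split("\t", 5)
    if len(parts) != 6:
        raise ValueError(f"expected 6 columns, got {len(parts)}")
    kind, uri, timestamp, reply_to, root, text = parts
    if kind not in VALID_TYPES:
        raise ValueError(f"unexpected type: {kind!r}")
    # Assumption: an empty field means "not a reply".
    return TwtRow(kind, uri, timestamp, reply_to or None, root or None, text)
```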

Route through an alternative AppView:

skycoll posts j4ck.xyz --appview blacksky

likes

Download all likes. Writes <handle>.fav (tab-separated: uri timestamp author_did author_handle text).

skycoll likes j4ck.xyz

Purge (delete all likes — the only write operation):

skycoll likes j4ck.xyz --purge

Route likes reads through an alternative AppView:

skycoll likes j4ck.xyz --appview blacksky

Verbose logging

Use global verbose mode to print low-level network/auth debug logs:

skycoll --verbose init j4ck.xyz
skycoll -v posts j4ck.xyz --car

You can also enable it via environment variable:

SKYCOLL_VERBOSE=1 skycoll init j4ck.xyz

threads

Reconstruct reply threads from an existing <handle>.twt file. Uses the reply_to_uri and root_uri fields to build thread trees. Outputs <handle>.threads as JSON.

skycoll threads j4ck.xyz
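
The tree-building step can be pictured with the following sketch. It is one plausible approach, not necessarily skycoll's: each post is attached to its parent via reply_to_uri, and posts whose parent is not in the file become roots. The output keys (`uri`, `text`, `replies`) are illustrative, and the sketch assumes every URI appears only once.

```python
import json


def build_threads(posts):
    """posts: list of dicts with 'uri', 'reply_to_uri' and 'text' keys.

    Returns a list of root nodes, each with nested 'replies'.
    """
    nodes = {
        p["uri"]: {"uri": p["uri"], "text": p["text"], "replies": []}
        for p in posts
    }
    roots = []
    for p in posts:
        node = nodes[p["uri"]]
        parent = nodes.get(p.get("reply_to_uri"))
        if parent is None:
            # Not a reply, or the parent post is not in this file.
            roots.append(node)
        else:
            parent["replies"].append(node)
    return roots


if __name__ == "__main__":
    sample = [
        {"uri": "at://a", "reply_to_uri": None, "text": "root"},
        {"uri": "at://b", "reply_to_uri": "at://a", "text": "reply"},
    ]
    print(json.dumps(build_threads(sample), indent=2))
```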

edgelist

Generate graph files from .dat and fdat/ data.

  • Default output: <handle>.gml
  • Optional Gephi-native output: <handle>.gexf via --gexf
  • Use --no-gml to skip writing GML

If python-igraph is installed and GML output is enabled, skycoll also renders a <handle>.png visualisation.

The GML includes bidirectional edges, mutual_only attributes, and node_type attributes.

skycoll edgelist j4ck.xyz

# Also write GEXF for Gephi
skycoll edgelist j4ck.xyz --gexf

# Write only GEXF (no GML)
skycoll edgelist j4ck.xyz --gexf --no-gml

# Enrich edges with likes counts from Constellation
skycoll edgelist j4ck.xyz --constellation https://constellation.example.com
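
To show what detecting mutual follows involves, here is a small sketch that flags reciprocal edges in a directed follow list and writes a minimal GML graph. The edge attribute name `mutual` and the node labelling are illustrative choices; skycoll's actual GML schema (including mutual_only and node_type) may be laid out differently.

```python
def mutual_flags(edges):
    """Map each directed edge (a, b) to True when (b, a) also exists."""
    edge_set = set(edges)
    return {(a, b): (b, a) in edge_set for (a, b) in edge_set}


def to_gml(edges):
    """Render directed follow edges as a minimal GML document."""
    flags = mutual_flags(edges)
    names = sorted({name for edge in flags for name in edge})
    ids = {name: i for i, name in enumerate(names)}
    lines = ["graph [", "  directed 1"]
    for name in names:
        lines.append(f'  node [ id {ids[name]} label "{name}" ]')
    for a, b in sorted(flags):
        # 'mutual' is a hypothetical attribute name used for illustration.
        lines.append(
            f"  edge [ source {ids[a]} target {ids[b]} mutual {int(flags[(a, b)])} ]"
        )
    lines.append("]")
    return "\n".join(lines)
```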

convert

Convert an existing graph file between GML and GEXF without re-fetching data.

skycoll convert j4ck.xyz --to gexf
skycoll convert j4ck.xyz --to gml

sync

Download the full repo CAR and write it to <handle>.car for archival. No parsing.

skycoll sync j4ck.xyz

backlinks

Query a Constellation backlinks index and pretty-print the full backlink breakdown for a handle.

Constellation is a self-hostable AT Protocol backlinks index. A public instance may be available; this feature is opt-in and the host must be provided explicitly.

skycoll backlinks j4ck.xyz --constellation https://constellation.example.com

plc

Fetch the full PLC directory operation log for a DID and write it to <did>.plc as JSON. This gives the complete identity history — handle changes, PDS migrations, key rotations.

skycoll plc did:plc:z72i7hdynmk6r22z27h6tvae

# Also print a human-readable summary
skycoll plc did:plc:z72i7hdynmk6r22z27h6tvae --audit

firehose

Connect to an AT Protocol relay WebSocket and stream repo events in real time. Filter by handle or DID, and optionally stop after N events.

# Stream all events from the default relay (wss://bsky.network)
skycoll firehose

# Filter by DID
skycoll firehose --did did:plc:abc123

# Filter by handle (resolved to DID automatically)
skycoll firehose --handle j4ck.xyz

# Use the Blacksky/atproto.africa relay
skycoll firehose --relay wss://atproto.africa

# Stop after 100 matching events
skycoll firehose --handle j4ck.xyz --limit 100

File formats

File               Format
<handle>.dat       Tab-separated: profile header + F/B/L/S/K prefixed rows
fdat/<handle>.dat  Same format as .dat, one file per followed user
<handle>.twt       Tab-separated: type uri timestamp reply_to_uri root_uri text
<handle>.fav       Tab-separated: uri timestamp author_did author_handle text
<handle>.threads   JSON array of thread trees (root + nested replies)
<handle>.gml       Graph Modeling Language file with mutual_only and node_type
<handle>.gexf      GEXF 1.3 graph for Gephi with richer node/edge attributes
<handle>.car       Raw CAR archive (binary)
<did>.plc          PLC directory operation log (JSON)
img/<handle>       Avatar image

Gephi workflow

  1. Run skycoll edgelist j4ck.xyz --gexf
  2. Open j4ck.xyz.gexf in Gephi via File → Open (Gephi auto-detects GEXF)
  3. Suggested layout: ForceAtlas2 with LinLog mode enabled and Prevent Overlap on
  4. Use the mutual edge attribute for edge colouring and followers_count for node sizing

Authentication details

  • PKCE: S256 code challenge method (mandatory)
  • DPoP: ES256 keypair; separate nonces for auth server vs PDS
  • Scopes: atproto transition:generic
  • Client type: Public/native — loopback redirect URI on a random port
  • Client metadata mode: Uses atproto localhost-development client_id (http://localhost/?redirect_uri=...&scope=...)
  • Session storage: ~/.skycoll/sessions/<did>.json (mode 0600)
  • sub verification: Token exchange verifies the sub claim matches the expected DID
  • atproto-proxy header: Routes requests through a specified AppView service DID
  • PAR + nonce handling: Uses pushed authorization requests and retries with server-provided DPoP-Nonce

PDS resolution

skycoll never assumes that an account's PDS is bsky.social. For every handle:

  1. Resolve handle → DID via DNS _atproto TXT or https://bsky.social/xrpc/com.atproto.identity.resolveHandle
  2. Fetch the DID document (plc.directory for did:plc, HTTPS well-known for did:web)
  3. Extract the #atproto_pds service endpoint
  4. Make all authenticated API calls against that PDS
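
Steps 2 and 3 above can be sketched as two small helpers. The DID document shape (a `service` list whose entries carry `id` and `serviceEndpoint`) follows the AT Protocol identity conventions; the function names are hypothetical, and the did:web branch handles only plain hostnames.

```python
def did_document_url(did: str) -> str:
    """Return the URL from which the DID document can be fetched."""
    if did.startswith("did:plc:"):
        return f"https://plc.directory/{did}"
    if did.startswith("did:web:"):
        # Hostname-only did:web; paths and encoded ports are not handled here.
        return f"https://{did[len('did:web:'):]}/.well-known/did.json"
    raise ValueError(f"unsupported DID method: {did!r}")


def extract_pds_endpoint(did_doc: dict) -> str:
    """Find the #atproto_pds service entry and return its endpoint URL."""
    for service in did_doc.get("service", []):
        # The id may be relative ("#atproto_pds") or prefixed with the DID.
        if service.get("id", "").endswith("#atproto_pds"):
            return service["serviceEndpoint"]
    raise LookupError("DID document has no #atproto_pds service")
```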

Pagination & rate limits

All AT Protocol list endpoints are cursor-based. skycoll loops until no cursor is returned. On HTTP 429, it retries with exponential backoff (max 3 attempts).
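
The loop can be sketched independently of any HTTP client. In this illustration `fetch_page` is a caller-supplied function that returns one decoded response page and raises `RateLimited` on HTTP 429; both names, the one-second base delay, and the reading of "max 3 attempts" as three tries in total are assumptions.

```python
import time
from typing import Callable, Optional


class RateLimited(Exception):
    """Stand-in for an HTTP 429 response."""


def fetch_all(
    fetch_page: Callable[[Optional[str]], dict],
    items_key: str,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Collect items from every page, following cursors until none is returned."""
    items = []
    cursor = None
    while True:
        for attempt in range(max_attempts):
            try:
                page = fetch_page(cursor)
                break
            except RateLimited:
                if attempt == max_attempts - 1:
                    raise  # give up after the final attempt
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, ...
        items.extend(page.get(items_key, []))
        cursor = page.get("cursor")
        if not cursor:
            return items
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.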

Running tests

pip install -r requirements-dev.txt
pytest tests/ -v

License

MIT
