Social media archival with authenticity guarantees
Pusheen Archiver
Save social media posts before they disappear. Pusheen Archiver captures posts, profiles, and media from X, YouTube, TikTok, Instagram, SoundCloud, and Pinterest — all stored locally, with cryptographic hashes so you can prove the content is untampered.
No cloud account. No subscription. Just pip install pusheen-archiver and you're done.
Install
pip install pusheen-archiver
pusheen
That's it. The first time you run pusheen with no arguments it'll ask where you want to store your archives and set everything up.
Windows installer: grab PusheenInstaller.exe from the releases page. It is a GUI installer that sets up Python, PATH, and Chromium automatically.
Quick start
# archive a whole profile
pusheen save https://x.com/someuser
pusheen save https://www.tiktok.com/@someuser
pusheen save https://soundcloud.com/some_artist
# single post
pusheen save https://www.youtube.com/watch?v=dQw4w9WgXcQ
# keep it up to date
pusheen sync x someuser
Paste any supported URL and pusheen figures out what it is — profile, post, playlist, whatever.
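As a rough illustration of the first step in that detection, the sketch below maps a URL's hostname to a platform name. The host table and the `detect_platform` helper are assumptions made for this example, not pusheen's actual lookup, and telling a profile from a post or playlist would need per-platform path rules that are not shown here.

```python
from typing import Optional
from urllib.parse import urlparse

# Hostname -> platform name. The platform list comes from this README;
# the table itself is an illustrative assumption, not pusheen's real one.
HOSTS = {
    "x.com": "x",
    "twitter.com": "x",
    "youtube.com": "youtube",
    "youtu.be": "youtube",
    "tiktok.com": "tiktok",
    "instagram.com": "instagram",
    "soundcloud.com": "soundcloud",
    "pinterest.com": "pinterest",
}


def detect_platform(url: str) -> Optional[str]:
    """Return the platform name for a URL, or None if the host is unknown."""
    host = (urlparse(url).hostname or "").lower()
    # Drop common prefixes such as "www." and "m." before the lookup.
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]
    return HOSTS.get(host)
```

Calling `detect_platform("https://www.tiktok.com/@someuser")` gives `"tiktok"` with this table, and an unrecognised host gives `None`.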
What gets saved
For each post:
- The media (video, images, audio) at the best available quality
- A metadata.json with the caption, stats, hashtags, and everything else
- A full-page screenshot and rendered HTML snapshot (via Playwright)
- A versions/ folder that records every edit the post goes through
For each profile run:
- Avatar and banner images
- A signed manifest.json listing every file with its SHA256 hash
- A receipt.txt you can attach to a legal filing
Nothing ever gets deleted from disk. If a post disappears online, it gets flagged in the database but stays in your archive.
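One way to picture the versions/ folder is as an append-only series of snapshots, where a new file is written only when the metadata differs from the latest one. The sketch below is a minimal version of that idea; the `record_version` helper and the plain-dict comparison are assumptions for illustration, and pusheen's real change detection may work differently.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def record_version(versions_dir: Path, metadata: dict) -> Optional[Path]:
    """Write metadata as the next vN.json if it differs from the latest one.

    Returns the new file's path, or None when nothing changed.
    Assumes the folder only contains files named v<number>.json.
    """
    versions_dir.mkdir(parents=True, exist_ok=True)
    # Sort numerically so v10.json comes after v9.json.
    existing = sorted(versions_dir.glob("v*.json"), key=lambda p: int(p.stem[1:]))
    if existing:
        latest = json.loads(existing[-1].read_text(encoding="utf-8"))
        if latest == metadata:
            return None  # unchanged: leave the history as it is
    target = versions_dir / f"v{len(existing) + 1}.json"
    target.write_text(json.dumps(metadata, indent=2, sort_keys=True), encoding="utf-8")
    return target


# Demo in a throwaway directory: save, save again unchanged, then save an edit.
with tempfile.TemporaryDirectory() as tmp:
    versions = Path(tmp) / "versions"
    first = record_version(versions, {"caption": "hello"})
    repeat = record_version(versions, {"caption": "hello"})
    edited = record_version(versions, {"caption": "hello, edited"})
    names = [first.name, repeat, edited.name]
```

Older snapshots are never overwritten, which matches the rule that nothing gets deleted from disk.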
Platforms
| Platform | Auth needed? | Notes |
|---|---|---|
| X (Twitter) | No API key — browser cookies work | See cookie setup below |
| YouTube | Optional API key | Works fine without one |
| TikTok | Optional | Public profiles work without credentials |
| Instagram | Optional | Public profiles work without credentials |
| SoundCloud | None | client_id is auto-discovered |
| Pinterest | Optional | Public boards work without credentials |
Configuration
All settings are in a single TOML file — no scattered environment variables for desktop use:
| OS | Location |
|---|---|
| Windows | %APPDATA%\pusheen-archiver\config.toml |
| macOS | ~/Library/Application Support/pusheen-archiver/config.toml |
| Linux | ~/.config/pusheen-archiver/config.toml |
pusheen config edit # opens it in your default editor
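If a script needs to find the same file, the table above translates into a short lookup. This is a sketch under assumptions: the `config_path` helper is hypothetical, and the Windows fallback used when %APPDATA% is unset is a guess rather than documented behaviour.

```python
import os
import sys
from pathlib import Path

APP_DIR = "pusheen-archiver"


def config_path(platform: str = sys.platform) -> Path:
    """Return the config.toml location listed in the table for a platform."""
    home = Path.home()
    if platform.startswith("win"):
        # %APPDATA% is normally set on Windows; the fallback is an assumption.
        base = Path(os.environ.get("APPDATA", str(home / "AppData" / "Roaming")))
    elif platform == "darwin":
        base = home / "Library" / "Application Support"
    else:
        base = home / ".config"
    return base / APP_DIR / "config.toml"
```

With no argument, `config_path()` uses the platform the script is running on.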
The file is fully commented so you know what everything does. The important bits:
[paths]
archive_root = "C:/Users/you/archive"
[archive]
capture_screenshots = true
capture_html = true
skip_media = false # true = metadata only, no downloads
save_info_json = true # yt-dlp .info.json sidecar files
save_thumbnail = true # thumbnail images alongside media
max_posts = 0 # 0 = no limit
[media]
media_format = "default" # default | mp4 | webm | mp3 | m4a | flac | opus
media_quality = "best" # best | high | medium | low | worst
You can also pass --no-info-json or --no-thumbnail on the command line to skip those for a single run without touching the config.
Cookie auth for X
X doesn't require an API key. Browser cookies are enough.
Option A — cookies file (more reliable)
- Install the Get cookies.txt LOCALLY extension
- Log in to x.com, click the extension, and export as cookies.txt
- In config.toml under [x]: cookies_file = "C:/path/to/cookies.txt"
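Before pointing the config at an exported file, it can help to confirm that the file parses and actually contains cookies for x.com. The sketch below does that with the standard library's MozillaCookieJar, assuming the extension writes the usual Netscape cookies.txt format. The `cookie_names` helper and the sample cookie are invented for the example.

```python
import tempfile
from http.cookiejar import MozillaCookieJar
from pathlib import Path


def cookie_names(cookies_file: Path, domain: str) -> list[str]:
    """Return the names of cookies in a Netscape-format file that match domain."""
    jar = MozillaCookieJar(str(cookies_file))
    # Keep session and expired cookies so the check reports everything in the file.
    jar.load(ignore_discard=True, ignore_expires=True)
    names = []
    for cookie in jar:
        host = cookie.domain.lstrip(".")
        if host == domain or host.endswith("." + domain):
            names.append(cookie.name)
    return sorted(names)


# A tiny made-up cookies.txt; real exports contain many more lines.
SAMPLE = (
    "# Netscape HTTP Cookie File\n"
    ".x.com\tTRUE\t/\tTRUE\t2147483647\texample_cookie\texample_value\n"
    ".example.org\tTRUE\t/\tFALSE\t2147483647\tother\tvalue\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "cookies.txt"
    path.write_text(SAMPLE, encoding="utf-8")
    found = cookie_names(path, "x.com")
```

An empty result for x.com usually means the export was taken while logged out or from the wrong tab.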
Option B — live browser (easier)
[x]
cookies_browser = "brave" # chrome | firefox | edge | brave | chromium
The browser has to be closed when you run pusheen — Chrome and Brave lock their cookie database while they're open.
Archive structure
archive/
  x/
    someuser/
      profile/
        profile.json
        avatar.jpg
        banner.jpg
      posts/
        2026-06-10_1234567890/
          metadata.json
          screenshot.png
          page.html
          media/
            video.mp4
          versions/
            v1.json
            v2.json        ← created automatically when a post is edited
      manifests/
        manifest.json      ← every file + its SHA256
        manifest.sig       ← Ed25519 signature (if you've run `pusheen keygen`)
        receipt.txt
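Since the layout is plain folders, an archive can be inspected without pusheen itself. The sketch below lists the posts under one account by parsing the folder names, assuming they always follow the `<YYYY-MM-DD>_<post id>` pattern seen in the example above. The `list_posts` helper is hypothetical.

```python
import tempfile
from datetime import date
from pathlib import Path


def list_posts(account_dir: Path) -> list[tuple[date, str]]:
    """Return (date, post id) pairs found under <account>/posts/."""
    posts = []
    for entry in sorted((account_dir / "posts").iterdir()):
        if not entry.is_dir():
            continue
        # Folder names are assumed to look like "2026-06-10_1234567890".
        day, _, post_id = entry.name.partition("_")
        posts.append((date.fromisoformat(day), post_id))
    return posts


# Build a miniature archive in a throwaway directory and list it.
with tempfile.TemporaryDirectory() as tmp:
    account = Path(tmp) / "x" / "someuser"
    for name in ("2026-06-10_1234567890", "2026-06-11_1234567999"):
        (account / "posts" / name / "media").mkdir(parents=True)
    found_posts = list_posts(account)
```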
Verifying an archive
pusheen verify archive/x/someuser/manifests
Checks every file hash against the manifest. If you generated signing keys (pusheen keygen), it validates the Ed25519 signature too.
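The hash check is conceptually simple: recompute each file's SHA-256 and compare it with the recorded value. The sketch below shows that idea with the standard library only. It assumes a flat mapping of relative path to hex digest, which may not match the real manifest.json layout, and it leaves out the Ed25519 signature step entirely. `sha256_of`, `build_manifest`, and `verify_manifest` are names made up for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large videos do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root, POSIX-style) to its SHA-256."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the paths that are missing or whose hash no longer matches."""
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "posts").mkdir()
    (root / "posts" / "metadata.json").write_text('{"caption": "hi"}', encoding="utf-8")
    (root / "avatar.jpg").write_bytes(b"\xff\xd8 fake image bytes")

    manifest = build_manifest(root)
    clean = verify_manifest(root, manifest)  # nothing has changed yet

    # Simulate tampering by rewriting one file after the manifest was built.
    (root / "posts" / "metadata.json").write_text('{"caption": "edited"}', encoding="utf-8")
    tampered = verify_manifest(root, manifest)
```

A hash list alone only shows that files match the manifest. The signature is what ties the manifest to a key holder, which is why the two checks are separate.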
All commands
pusheen save <url>                  archive anything: post, profile, playlist
    --no-media                      skip downloads, save metadata only
    --no-screenshots                skip Playwright screenshots
    --no-info-json                  skip yt-dlp .info.json sidecars
    --no-thumbnail                  skip thumbnail images
    --out <dir>                     save to a specific directory
    --watch                         re-archive a profile on a schedule
pusheen sync <platform> <user>      incremental sync (new posts only)
pusheen sync-all                    sync every account you've archived
pusheen daemon                      run sync-all on repeat until Ctrl-C
pusheen search <query>              search across all archived captions
pusheen history <platform> <user>   show profile change timeline
pusheen export <platform> <user>    pack to .zip or .tar.gz
pusheen status                      list archived accounts and stats
pusheen verify <manifest_dir>       check file hashes and signature
pusheen keygen                      generate Ed25519 signing keys
pusheen config edit                 open config.toml in your editor
pusheen config show                 print current settings
pusheen config update               add missing keys to an existing config
pusheen db init                     create database tables (first run)
pusheen db migrate                  run Alembic migrations
pusheen install-browser             install Playwright browser
pusheen shell                       interactive REPL
Platform aliases: x/tw, yt, ig/insta, tt, sc, pin
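Wrapper scripts that accept the same shorthand can resolve it with a small table. In the sketch below, "x" and "tiktok" are the names used elsewhere in this README; the other full names and the `resolve_platform` helper are assumptions for illustration.

```python
# Alias -> platform name. "x" and "tiktok" appear in this README's examples;
# the remaining full names are assumed.
ALIASES = {
    "tw": "x",
    "yt": "youtube",
    "ig": "instagram",
    "insta": "instagram",
    "tt": "tiktok",
    "sc": "soundcloud",
    "pin": "pinterest",
}


def resolve_platform(name: str) -> str:
    """Expand an alias; anything that is not an alias passes through unchanged."""
    key = name.strip().lower()
    return ALIASES.get(key, key)
```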
Search
pusheen search "concert announcement"
pusheen search "cute" --platform tiktok
pusheen search "dropped" --username someuser --limit 50
Profile history
pusheen history x someuser
2026-01-15  first seen      bio: "just a person"             followers: 1,204
2026-03-02  bio changed     "just a person on the internet"  +185 followers
2026-06-10  avatar changed                                   +113 followers
Server mode
The default setup uses SQLite and runs entirely locally. If you want to run pusheen as a shared service with a REST API and async job queue:
pip install "pusheen-archiver[server]"
# set database_url in config.toml to your PostgreSQL connection string
docker-compose up -d db redis
pusheen db migrate
uvicorn pusheen_archiver.api.main:app --host 0.0.0.0 --port 8000
API docs at http://localhost:8000/docs. Requires PostgreSQL + Redis. This is for self-hosted or developer deployments — not needed for personal use.
Adding a platform
- Create src/pusheen_archiver/adapters/myplatform.py, subclass BasePlatformAdapter
- Implement discover_account, discover_posts, fetch_metadata, download_media
- Register it in src/pusheen_archiver/adapters/__init__.py
Everything else — signing, manifests, version history, deduplication, CLI — works automatically.
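To give a feel for the shape of an adapter, here is a toy skeleton with the four methods named above. The `BasePlatformAdapter` defined here is a stand-in so the example runs on its own; the real base class lives in the pusheen source, and its method signatures and return types may differ from the ones guessed below.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BasePlatformAdapter(ABC):
    """Stand-in for pusheen's base class; the real signatures may differ."""

    @abstractmethod
    def discover_account(self, username: str) -> dict: ...

    @abstractmethod
    def discover_posts(self, username: str) -> list[str]: ...

    @abstractmethod
    def fetch_metadata(self, post_id: str) -> dict: ...

    @abstractmethod
    def download_media(self, post_id: str, dest: Path) -> list[Path]: ...


class MyPlatformAdapter(BasePlatformAdapter):
    """Toy adapter that serves canned data instead of calling a real site."""

    POSTS = {"1": {"caption": "first"}, "2": {"caption": "second"}}

    def discover_account(self, username: str) -> dict:
        return {"username": username, "platform": "myplatform"}

    def discover_posts(self, username: str) -> list[str]:
        return sorted(self.POSTS)

    def fetch_metadata(self, post_id: str) -> dict:
        return {"id": post_id, **self.POSTS[post_id]}

    def download_media(self, post_id: str, dest: Path) -> list[Path]:
        # A real adapter would write files under dest and return their paths.
        return []


adapter = MyPlatformAdapter()
post_ids = adapter.discover_posts("someuser")
```

Check the actual base class before copying this: the abstract methods there are the authoritative contract.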
Development
git clone https://github.com/pusheenism/pusheen-archiver
cd pusheen-archiver
pip install -e ".[dev]"
pytest
For a deep dive into the internals — database schema, adapter interface, API endpoints, signing system — see docs/ARCHITECTURE.md.
License
MIT
Download files
File details
Details for the file pusheen_archiver-0.1.0.tar.gz.
File metadata
- Download URL: pusheen_archiver-0.1.0.tar.gz
- Upload date:
- Size: 111.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aaf657204790f756983f94354baf2459730e1c68b53c0f8b23de3b2d325ddb51 |
| MD5 | 021486df7954f4edfdb4a950cbdbd403 |
| BLAKE2b-256 | 2d067cbf8241ee4a97c46f93151ce9bac859461e79ae4ecdca1deef53cb9f9ec |
File details
Details for the file pusheen_archiver-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pusheen_archiver-0.1.0-py3-none-any.whl
- Upload date:
- Size: 126.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 921cf680892e3c471c2780854f140c08ce7d654d28ee53a608bd4e2110386b20 |
| MD5 | 8ab0c0987c398573c1d9e43c15543707 |
| BLAKE2b-256 | f2527879c4fb874a890c31c681e6b673012b26aa3fac5d27063b34690871f2aa |