🔌 graftpunk

Turn any website into an API.

Graft scriptable access onto authenticated web services.

Python 3.11+ · License: MIT · Code style: ruff · Typed

Installation · Quick Start · Plugins · CLI Reference · Examples · Architecture


The Problem

That service has your data—but no API.

Your ISP account. Your kid's school portal. Your local library. That niche e-commerce site you order from. Your medical records. They all have data that belongs to you, locked behind a login page with no API in sight.

You're left with two options: click through the UI manually every time, or give up.

graftpunk gives you a third option.

The Solution

Log in once, script forever.

  1. LOG IN              2. CACHE               3. SCRIPT

  +-------------+       +-------------+       +-------------+
  |   Browser   |       |  Encrypted  |       |   Python    |
  |   Session   |------>|   Storage   |------>|   Script    |
  |             |       |             |       |             |
  +-------------+       +-------------+       +-------------+

  Log in manually       Session cached        Use the session
  or declaratively      with AES-128          with real browser
  via plugin config     encryption            headers replayed

Once your session is cached, you can:

  • Make HTTP requests with your authenticated cookies and real browser headers
  • Reverse-engineer XHR calls from browser dev tools
  • Build CLI tools that feel like real APIs
  • Automate downloads of documents and data
  • Keep sessions alive with background daemons
  • Capture network traffic for debugging and auditing

What You Can Build

With graftpunk as your foundation, you can turn any authenticated website into a terminal-based interface:

# Pull your kid's grades and assignments
gp schoolportal grades --student emma --format table

# Download your medical lab results
gp mychart labs --after 2024-06-01 --output ./results/

# Export your energy usage data
gp utility usage --months 12 --format csv > energy.csv

# Scrape your property tax history
gp county assessor --parcel 12345 --format json

# Make ad-hoc requests with cached session cookies + browser headers
gp http get -s mychart https://mychart.example.com/api/appointments

These aren't real APIs—they're commands defined in graftpunk plugins that replay the same XHR calls the website makes. To the server, it looks like a browser. To you, it's just automation.
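
As a rough illustration of what "replaying an XHR call" amounts to, the sketch below builds (but does not send) a request that carries session cookies and browser headers. It uses only the standard library, not graftpunk itself; the cookie values, header values, and the `build_replayed_request` helper are all hypothetical.

```python
from urllib.request import Request

# Hypothetical values: in practice these would come from a cached session.
cookies = {"sessionid": "abc123", "csrftoken": "xyz789"}
browser_headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
    "Accept": "application/json, text/plain, */*",
    "X-Requested-With": "XMLHttpRequest",
}


def build_replayed_request(
    url: str, cookies: dict[str, str], headers: dict[str, str]
) -> Request:
    """Build a GET request carrying the session's cookies and browser headers."""
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return Request(url, headers={**headers, "Cookie": cookie_header}, method="GET")


request = build_replayed_request(
    "https://mychart.example.com/api/appointments", cookies, browser_headers
)
print(request.get_method(), request.full_url)
print(request.get_header("Cookie"))
```

The server sees the same cookies and headers the browser would send, which is why the endpoint answers as it does for the real page.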

Installation

pip install graftpunk

With cloud storage:

pip install graftpunk[supabase]   # Supabase backend
pip install graftpunk[s3]         # AWS S3 backend
pip install graftpunk[all]        # Everything

Quick Start

1. Cache a Session

The fastest way is with a plugin. Here's the httpbin example (no auth needed):

# Drop a YAML plugin into your plugins directory
mkdir -p ~/.config/graftpunk/plugins
cp examples/plugins/httpbin.yaml ~/.config/graftpunk/plugins/

# Use it immediately
gp httpbin ip
gp httpbin headers
gp httpbin status --code 418  # I'm a teapot!

For sites that require authentication, plugins can define declarative login:

# Log in via auto-generated command (opens browser, fills form, caches session)
gp quotes login

# Use the cached session for API calls
gp quotes list
gp quotes random

2. Use It Programmatically

from pathlib import Path

from graftpunk import load_session_for_api

# Load your cached session. This returns a GraftpunkSession with
# browser headers pre-loaded for realistic request fingerprints.
api = load_session_for_api("mysite")

# Make authenticated requests that look like they came from Chrome
response = api.get("https://app.example.com/api/internal/documents")
documents = response.json()

for doc in documents:
    # Keep only the final path component so a server-supplied name
    # such as "../../x" cannot write outside the current directory.
    filename = Path(doc["name"]).name
    print(f"Downloading {filename}...")
    content = api.get(doc["download_url"]).content
    Path(filename).write_bytes(content)

3. Keep It Alive

Sessions expire. graftpunk can keep them alive in the background with the keepalive daemon.
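
The idea behind a keepalive loop can be sketched in a few lines. This is not graftpunk's daemon, only a minimal illustration under the assumption that "keeping a session alive" means calling some ping function on a fixed interval; `start_keepalive` and the `ping` callable are hypothetical names.

```python
import threading
from typing import Callable


def start_keepalive(ping: Callable[[], None], interval_seconds: float) -> threading.Event:
    """Call `ping` every `interval_seconds` in a daemon thread.

    Returns an Event; setting it stops the loop.
    """
    stop = threading.Event()

    def loop() -> None:
        # Event.wait returns False on timeout and True once `stop` is set.
        while not stop.wait(interval_seconds):
            ping()

    threading.Thread(target=loop, daemon=True).start()
    return stop


# Demo with a stand-in ping that only records that it ran.
first_ping = threading.Event()
stop = start_keepalive(first_ping.set, interval_seconds=0.01)
first_ping.wait(timeout=2)
stop.set()
```

In a real setup the ping would be an authenticated request to a cheap endpoint, and the interval would be shorter than the site's idle timeout.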

Features

| Feature | Why It Matters |
| --- | --- |
| 🥷 Stealth Mode | Multiple backends: Selenium with undetected-chromedriver, or NoDriver for CDP-direct automation without WebDriver detection. Bot-detection cookies (Akamai, etc.) are automatically filtered during cookie injection to prevent WAF rejection. |
| 🔒 Encrypted Storage | Sessions encrypted with AES-128 (Fernet). Local by default, optional cloud storage. |
| 🔑 Declarative Login | Define login flows with CSS selectors. graftpunk opens the browser, fills the form, and caches the session. Works in both Python and YAML plugins. |
| 🌐 Browser Header Replay | Captures real browser headers during login and replays them in API calls. Requests look like they came from Chrome, not Python. |
| 🔌 Plugin System | Full command framework with CommandContext, resource limits, output formatting, and auto-generated CLI. Python for complex logic, YAML for simple calls. |
| 🛡️ Token & CSRF Support | Declarative token extraction from cookies, headers, or page content. EAFP injection with automatic 403 retry. Tokens cached through session serialization. |
| 📡 Observability | Capture screenshots, HAR files, console logs, and network traffic. Interactive mode lets you browse manually while recording. |
| 🔄 Keepalive Daemon | Background daemon pings sites periodically to prevent session timeout. |
| 🛠️ Ad-hoc HTTP | gp http get -s <session> <url>: make one-off authenticated requests without writing a plugin. |
| 🎨 Beautiful CLI | Rich terminal output with spinners, tables, and color. --format (json, table, or raw) on all commands. |
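
The "EAFP injection with automatic 403 retry" row can be read as: send the request with the cached token first, and only fetch a fresh token if the server answers 403. The sketch below illustrates that pattern in isolation. It is an interpretation, not graftpunk's code; `request_with_token_retry`, `fake_send`, and the token values are hypothetical.

```python
from typing import Callable


def request_with_token_retry(
    send: Callable[[str], int],
    get_cached_token: Callable[[], str],
    refresh_token: Callable[[], str],
) -> int:
    """Try the cached token first; on a 403, refresh the token and retry once."""
    status = send(get_cached_token())
    if status == 403:
        status = send(refresh_token())
    return status


# Fake server for illustration: it accepts only the token "fresh".
attempts: list[str] = []


def fake_send(token: str) -> int:
    attempts.append(token)
    return 200 if token == "fresh" else 403


status = request_with_token_retry(fake_send, lambda: "stale", lambda: "fresh")
print(status, attempts)
```

Retrying only once keeps a genuinely forbidden request from looping forever.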

Plugins

graftpunk is extensible via Python classes or YAML configuration. Both support declarative login, resource limits, and output formatting.

YAML Plugin (Simple REST Calls)

For straightforward HTTP calls, no Python needed:

# ~/.config/graftpunk/plugins/mybank.yaml
site_name: mybank
base_url: "https://secure.mybank.com"

login:
  url: /login
  fields:
    username: "input#email"
    password: "input#password"
  submit: "button[type=submit]"

commands:
  accounts:
    help: "List all accounts"
    method: GET
    url: "/api/accounts"
    jmespath: "accounts[].{id: id, name: name, balance: balance}"

  statements:
    help: "Get statements for a month"
    method: GET
    url: "/api/statements"
    params:
      - name: month
        required: true
        help: "Month name"
      - name: year
        type: int
        default: 2024
    timeout: 30
    max_retries: 2
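
For readers unfamiliar with JMESPath, the `jmespath` expression on the `accounts` command selects three fields from every element of the response's `accounts` list. A plain-Python equivalent, assuming a flat list of account objects, looks roughly like this; the `project_accounts` helper and the sample payload are made up for illustration.

```python
def project_accounts(payload: dict) -> list[dict]:
    """Plain-Python equivalent of: accounts[].{id: id, name: name, balance: balance}"""
    # Note: JMESPath yields null when "accounts" is missing; this sketch
    # returns an empty list instead.
    return [
        {"id": a.get("id"), "name": a.get("name"), "balance": a.get("balance")}
        for a in payload.get("accounts", [])
    ]


# Made-up response body; the extra fields are dropped by the projection.
payload = {
    "accounts": [
        {"id": 1, "name": "Checking", "balance": 250.0, "routing": "000"},
        {"id": 2, "name": "Savings", "balance": 1200.0, "routing": "000"},
    ],
    "page": 1,
}
rows = project_accounts(payload)
print(rows)
```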

Python Plugin (Complex Logic)

from graftpunk.plugins import CommandContext, LoginConfig, SitePlugin, command

class MyBankPlugin(SitePlugin):
    site_name = "mybank"
    base_url = "https://secure.mybank.com"
    backend = "nodriver"  # or "selenium"
    api_version = 1

    login_config = LoginConfig(
        url="/login",
        fields={"username": "input#email", "password": "input#password"},
        submit="button[type=submit]",
        success=".dashboard",
    )

    @command(help="List all accounts")
    def accounts(self, ctx: CommandContext):
        return ctx.session.get(f"{self.base_url}/api/accounts").json()

    @command(help="Get statements for a month")
    def statements(self, ctx: CommandContext, month: str, year: int = 2024):
        url = f"{self.base_url}/api/statements/{year}/{month}"
        return ctx.session.get(url).json()

Using Plugins

# Login (auto-generated from declarative config)
gp mybank login

# Run commands
gp mybank accounts
gp mybank statements --month january --year 2024 --format table

# List all discovered plugins
gp plugins

Plugin Discovery

Plugins are discovered from three sources:

  1. Entry points: Python packages registered via pyproject.toml
  2. YAML files: ~/.config/graftpunk/plugins/*.yaml and *.yml
  3. Python files: ~/.config/graftpunk/plugins/*.py

If two plugins share the same site_name, registration fails with an error showing both sources. No silent shadowing.
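
The "no silent shadowing" rule can be sketched as a registry that refuses a second registration for the same site_name and names both sources in the error. This is a minimal illustration, not graftpunk's registry; `PluginRegistry`, `PluginCollisionError`, and the source strings are hypothetical.

```python
class PluginCollisionError(Exception):
    """Raised when two plugins claim the same site_name."""


class PluginRegistry:
    def __init__(self) -> None:
        # Maps site_name to a description of where the plugin came from.
        self._sources: dict[str, str] = {}

    def register(self, site_name: str, source: str) -> None:
        existing = self._sources.get(site_name)
        if existing is not None:
            raise PluginCollisionError(
                f"site_name {site_name!r} is defined by both {existing} and {source}"
            )
        self._sources[site_name] = source


registry = PluginRegistry()
registry.register("mybank", "~/.config/graftpunk/plugins/mybank.yaml")

try:
    registry.register("mybank", "~/.config/graftpunk/plugins/mybank.py")
except PluginCollisionError as exc:
    collision_message = str(exc)
    print(collision_message)
```

Failing at registration time means the conflict surfaces when plugins are loaded, not when a command quietly runs the wrong implementation.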

See examples/ for working plugins and templates.

CLI Reference

$ gp --help

 🔌 graftpunk - turn any website into an API

Commands:
  session     Manage encrypted browser sessions
  http        Make ad-hoc HTTP requests with cached session cookies
  observe     Capture and view browser observability data
  plugins     List discovered plugins
  import-har  Import HAR file and generate a plugin
  config      Show current configuration
  keepalive   Manage the session keepalive daemon
  version     Show version info

Session Management

gp session list              # List all cached sessions
gp session show <name>       # Session metadata (domain, cookies, expiry)
gp session clear <name>      # Remove a session (or --all)
gp session export <name>     # Export cookies to HTTPie session format
gp session use <name>        # Set active session for subsequent commands
gp session unset             # Clear active session

Ad-hoc HTTP Requests

Make authenticated requests using cached sessions without writing a plugin:

gp http get -s mybank https://secure.mybank.com/api/accounts
gp http post -s mybank https://secure.mybank.com/api/transfer --data '{"amount": 100}'

Supports all HTTP methods: get, post, put, patch, delete, head, options.

Observability

Capture browser activity for debugging:

# Open authenticated browser and capture network traffic
gp observe -s mybank go https://secure.mybank.com/dashboard

# Interactive mode — browse manually, Ctrl+C to save
gp observe -s mybank interactive https://secure.mybank.com/dashboard

# Or use the --interactive flag on observe go
gp observe -s mybank go --interactive https://secure.mybank.com/dashboard

# View captured data
gp observe list
gp observe show mybank
gp observe clean mybank

Interactive mode opens an authenticated browser and records all network traffic (including response bodies) while you click around. Press Ctrl+C to stop — HAR files, screenshots, page source, and console logs are saved automatically.

Pass --observe full to any command to capture screenshots, HAR files, and console logs.

HAR Import

Generate plugins from browser network captures:

gp import-har auth-flow.har --name mybank
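
A HAR file is plain JSON, so it is easy to inspect before importing it. The sketch below lists the requests whose responses were JSON, which are usually the XHR calls worth turning into commands. It relies only on the standard HAR layout (`log.entries[].request` and `.response.content.mimeType`); the `list_json_calls` helper and the sample data are hypothetical and unrelated to how `gp import-har` works internally.

```python
import json


def list_json_calls(har_text: str) -> list[tuple[str, str]]:
    """Return (method, url) for every HAR entry whose response was JSON."""
    har = json.loads(har_text)
    calls = []
    for entry in har["log"]["entries"]:
        mime = entry["response"]["content"].get("mimeType", "")
        if "json" in mime:
            calls.append((entry["request"]["method"], entry["request"]["url"]))
    return calls


# Tiny made-up capture with one API call and one static asset.
SAMPLE_HAR = json.dumps({
    "log": {
        "version": "1.2",
        "entries": [
            {
                "request": {"method": "GET", "url": "https://secure.mybank.com/api/accounts"},
                "response": {"status": 200, "content": {"mimeType": "application/json"}},
            },
            {
                "request": {"method": "GET", "url": "https://secure.mybank.com/static/app.css"},
                "response": {"status": 200, "content": {"mimeType": "text/css"}},
            },
        ],
    }
})

calls = list_json_calls(SAMPLE_HAR)
print(calls)
```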

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| GRAFTPUNK_STORAGE_BACKEND | local | Storage: local, supabase, or s3 |
| GRAFTPUNK_CONFIG_DIR | ~/.config/graftpunk | Config and encryption key location |
| GRAFTPUNK_SESSION_TTL_HOURS | 720 | Session lifetime (30 days) |
| GRAFTPUNK_LOG_LEVEL | WARNING | Logging verbosity |
| GRAFTPUNK_LOG_FORMAT | console | Log format: console or json |

CLI flags: -v (info), -vv (debug), --log-format json, --observe full, --network-debug (wire-level HTTP tracing).
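
If you want to mirror these settings in your own scripts, the variables and defaults above can be resolved with a few lines of standard-library code. This is a sketch of the lookup logic only, not graftpunk's configuration loader; `load_settings` and the returned key names are hypothetical.

```python
import os
from collections.abc import Mapping
from datetime import timedelta

# Variable names and defaults as documented in the table above.
DEFAULTS = {
    "GRAFTPUNK_STORAGE_BACKEND": "local",
    "GRAFTPUNK_CONFIG_DIR": "~/.config/graftpunk",
    "GRAFTPUNK_SESSION_TTL_HOURS": "720",
    "GRAFTPUNK_LOG_LEVEL": "WARNING",
    "GRAFTPUNK_LOG_FORMAT": "console",
}


def load_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Resolve each variable from `environ`, falling back to the documented default."""
    raw = {name: environ.get(name, default) for name, default in DEFAULTS.items()}
    return {
        "storage_backend": raw["GRAFTPUNK_STORAGE_BACKEND"],
        "config_dir": raw["GRAFTPUNK_CONFIG_DIR"],
        "session_ttl": timedelta(hours=int(raw["GRAFTPUNK_SESSION_TTL_HOURS"])),
        "log_level": raw["GRAFTPUNK_LOG_LEVEL"],
        "log_format": raw["GRAFTPUNK_LOG_FORMAT"],
    }


# Override one variable; everything else falls back to its default.
settings = load_settings({"GRAFTPUNK_STORAGE_BACKEND": "s3"})
print(settings["storage_backend"], settings["session_ttl"])
```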

Browser Backends

graftpunk supports two browser automation backends (both included by default):

| Backend | Best For |
| --- | --- |
| selenium | Simple sites, backward compatibility |
| nodriver | Enterprise sites, better anti-detection |

Why NoDriver? NoDriver uses Chrome DevTools Protocol (CDP) directly without the WebDriver binary, eliminating a common detection vector used by anti-bot systems.

Bot-detection cookie filtering: When injecting session cookies into a nodriver browser (for observe mode, token extraction, etc.), graftpunk automatically skips known WAF tracking cookies (Akamai bm_*, ak_bmsc, _abck). These cookies carry stale bot-classification state that causes WAFs to reject the browser with ERR_HTTP2_PROTOCOL_ERROR. Disable with skip_bot_cookies=False if needed.

from graftpunk import BrowserSession

# Use BrowserSession with explicit backend
session = BrowserSession(backend="nodriver", headless=False)
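
The bot-detection cookie filtering described above boils down to dropping cookies whose names match a short list of patterns. The sketch below shows that filter on a plain dict, using the cookie names mentioned in this section. It is an illustration, not graftpunk's implementation; `filter_bot_cookies` and the sample cookie values are hypothetical.

```python
from fnmatch import fnmatchcase

# Cookie name patterns taken from the description above.
BOT_COOKIE_PATTERNS = ("bm_*", "ak_bmsc", "_abck")


def filter_bot_cookies(
    cookies: dict[str, str], skip_bot_cookies: bool = True
) -> dict[str, str]:
    """Drop cookies whose names match known WAF tracking patterns."""
    if not skip_bot_cookies:
        return dict(cookies)
    return {
        name: value
        for name, value in cookies.items()
        if not any(fnmatchcase(name, pattern) for pattern in BOT_COOKIE_PATTERNS)
    }


# Made-up cookie jar mixing a session cookie with WAF tracking cookies.
cookies = {
    "sessionid": "abc123",
    "bm_sz": "stale-bot-state",
    "ak_bmsc": "stale-bot-state",
    "_abck": "stale-bot-state",
}
filtered = filter_bot_cookies(cookies)
print(filtered)
```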

Security

Your Data, Your Rules

graftpunk is for automating access to your own accounts. You're not scraping other people's data—you're building tools to access information that already belongs to you.

Some services may consider automation a ToS violation. Use your judgment.

Encryption

  • Algorithm: Fernet (AES-128-CBC + HMAC-SHA256)
  • Key storage: ~/.config/graftpunk/.session_key with 0600 permissions
  • Integrity: SHA-256 checksum validated before deserializing
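
To show why the checksum is validated before deserializing, here is a minimal sketch of that ordering with the standard library. The byte layout (digest followed by payload) and the `pack` and `unpack` helpers are assumptions made for illustration; graftpunk's actual on-disk format may differ, and the Fernet encryption layer is omitted.

```python
import hashlib
import hmac
import pickle

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def pack(obj: object) -> bytes:
    """Serialize `obj` and prefix the payload with its SHA-256 digest."""
    payload = pickle.dumps(obj)
    return hashlib.sha256(payload).digest() + payload


def unpack(blob: bytes) -> object:
    """Verify the digest first; unpickle only if the payload is intact."""
    digest, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    if not hmac.compare_digest(hashlib.sha256(payload).digest(), digest):
        raise ValueError("checksum mismatch; refusing to deserialize")
    return pickle.loads(payload)


blob = pack({"cookies": {"sessionid": "abc123"}})
restored = unpack(blob)

# Flip one bit in the payload to simulate corruption.
tampered = blob[:-1] + bytes([blob[-1] ^ 1])
try:
    unpack(tampered)
    tamper_detected = False
except ValueError:
    tamper_detected = True
```

A bare checksum only catches corruption. Protection against deliberate tampering comes from the HMAC that Fernet applies to the encrypted data.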

Best Practices

  • Keep your encryption key secure
  • Don't share session files
  • Run graftpunk on trusted machines
  • Use unique, strong passwords for automated accounts

Pickle warning: graftpunk uses Python's pickle for serialization, and unpickling untrusted data can execute arbitrary code. Only load sessions you created.

Development

git clone https://github.com/stavxyz/graftpunk.git
cd graftpunk
just setup    # Install deps with uv
just check    # Run lint, typecheck, tests
just build    # Build for PyPI

Requires uv for development. See CONTRIBUTING.md for full guidelines.

License

MIT License—see LICENSE.

Acknowledgments


Built for automating your own data access.
