🔌 graftpunk

Turn any website into an API.

Graft scriptable access onto authenticated web services.

PyPI · Python 3.11+ · License: MIT · Code style: ruff · Typed


The Problem

That service has your data—but no API.

Your ISP account. Your kid's school portal. Your local library. That niche e-commerce site you order from. Your medical records. They all have data that belongs to you, locked behind a login page with no API in sight.

You're left with two options: click through the UI manually every time, or give up.

graftpunk gives you a third option.

The Solution

Log in once, script forever.

  1. LOG IN              2. CACHE               3. SCRIPT

  +-------------+       +-------------+       +-------------+
  |   Browser   |       |  Encrypted  |       |   Python    |
  |   Session   |------>|   Storage   |------>|   Script    |
  |             |       |             |       |             |
  +-------------+       +-------------+       +-------------+

  Log in manually       Session cached        Use the session
  or declaratively      with AES-128          with real browser
  via plugin config     encryption            headers replayed

Once your session is cached, you can:

  • Make HTTP requests with your authenticated cookies and real browser headers
  • Reverse-engineer XHR calls from browser dev tools
  • Build CLI tools that feel like real APIs
  • Automate downloads of documents and data
  • Keep sessions alive with background daemons
  • Capture network traffic for debugging and auditing

What You Can Build

With graftpunk as your foundation, you can turn any authenticated website into a terminal-based interface:

# Pull your kid's grades and assignments
gp schoolportal grades --student emma --format table

# Download your medical lab results
gp mychart labs --after 2024-06-01 --output ./results/

# Export your energy usage data
gp utility usage --months 12 --format csv > energy.csv

# Scrape your property tax history
gp county assessor --parcel 12345 --format json

# Make ad-hoc requests with cached session cookies + browser headers
gp http get -s mychart https://mychart.example.com/api/appointments

These aren't real APIs—they're commands defined in graftpunk plugins that replay the same XHR calls the website makes. To the server, it looks like a browser. To you, it's just automation.
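As a rough illustration of what "replaying" a call means, the sketch below builds (but does not send) a request that carries cached cookies and captured browser headers. It uses only the standard library; the function name `build_replayed_request`, the cookie value, and the header values are hypothetical and are not part of graftpunk's API.

```python
from urllib.request import Request


def build_replayed_request(url, cookies, browser_headers):
    """Build a GET request carrying cached cookies and captured headers.

    Hypothetical helper for illustration only; nothing is sent over the network.
    """
    request = Request(url, method="GET")
    for name, value in browser_headers.items():
        request.add_header(name, value)
    # A Cookie header is a single "name=value; name=value" string.
    cookie_header = "; ".join(f"{k}={v}" for k, v in cookies.items())
    request.add_header("Cookie", cookie_header)
    return request


request = build_replayed_request(
    "https://mychart.example.com/api/appointments",
    cookies={"sessionid": "abc123"},  # made-up cookie
    browser_headers={
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
        "X-Requested-With": "XMLHttpRequest",
    },
)
print(request.get_method(), request.full_url)
```

A plugin command is conceptually a named wrapper around a request like this one, so the server sees the same cookies and headers it would see from the browser.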

Installation

pip install graftpunk

With cloud storage:

pip install graftpunk[supabase]   # Supabase backend
pip install graftpunk[s3]         # AWS S3 backend
pip install graftpunk[all]        # Everything

Quick Start

1. Cache a Session

The fastest way is with a plugin. Here's the httpbin example (no auth needed):

# Drop a YAML plugin into your plugins directory
mkdir -p ~/.config/graftpunk/plugins
cp examples/plugins/httpbin.yaml ~/.config/graftpunk/plugins/

# Use it immediately
gp httpbin ip
gp httpbin headers
gp httpbin status --code 418  # I'm a teapot!

For sites that require authentication, plugins can define declarative login:

# Log in via auto-generated command (opens browser, fills form, caches session)
gp quotes login

# Use the cached session for API calls
gp quotes list
gp quotes random

2. Use It Programmatically

from graftpunk import GraftpunkClient

# Use plugin commands from Python — same session, tokens, and retries as the CLI
with GraftpunkClient("mybank") as client:
    accounts = client.accounts()
    statements = client.statements(month="january", year=2024)

    # Grouped commands use nested attribute access
    detail = client.accounts.detail(id=42)

For lower-level access without plugins, load a session directly:

from graftpunk import load_session_for_api

# Returns a GraftpunkSession with browser headers pre-loaded
api = load_session_for_api("mysite")
response = api.get("https://app.example.com/api/internal/documents")

3. Keep It Alive

Sessions expire. graftpunk can keep them alive in the background with the keepalive daemon.
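The idea behind a keepalive loop can be sketched in a few lines. This is a minimal illustration, not the daemon's actual implementation: `TrackedSession`, `due_for_ping`, `run_once`, and the `ping` callback are all hypothetical names, and the interval is an arbitrary example value.

```python
import time
from dataclasses import dataclass


@dataclass
class TrackedSession:
    name: str
    ping_url: str
    last_ping: float  # seconds, on the same clock as `now`


def due_for_ping(sessions, interval_seconds, now):
    """Return the sessions whose last ping is at least one interval old."""
    return [s for s in sessions if now - s.last_ping >= interval_seconds]


def run_once(sessions, interval_seconds, ping, now=None):
    """Ping every due session once and record the time of the ping."""
    now = time.monotonic() if now is None else now
    pinged = []
    for session in due_for_ping(sessions, interval_seconds, now):
        ping(session)  # e.g. an authenticated GET against session.ping_url
        session.last_ping = now
        pinged.append(session.name)
    return pinged
```

A daemon would call something like `run_once` in a loop with a sleep between iterations; passing `now` explicitly keeps the scheduling logic easy to test.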

Features

| Feature | Why It Matters |
| --- | --- |
| 🥷 Stealth Mode | Multiple backends: Selenium with undetected-chromedriver, or NoDriver for CDP-direct automation without WebDriver detection. Bot-detection cookies (Akamai, etc.) are automatically filtered during cookie injection to prevent WAF rejection. |
| 🔒 Encrypted Storage | Sessions encrypted with AES-128 (Fernet). Local by default, optional cloud storage. |
| 🔑 Declarative Login | Define login flows with CSS selectors. graftpunk opens the browser, fills the form, and caches the session. Works in both Python and YAML plugins. |
| 🌐 Browser Header Replay | Captures real browser headers during login and replays them in API calls. Requests look like they came from Chrome, not Python. |
| 🔌 Plugin System | Full command framework with CommandContext, resource limits, output formatting, and auto-generated CLI. Python for complex logic, YAML for simple calls. |
| 🛡️ Token & CSRF Support | Declarative token extraction from cookies, headers, or page content. EAFP injection with automatic 403 retry. Tokens cached through session serialization. |
| 📡 Observability | Capture screenshots, HAR files, console logs, and network traffic. Interactive mode lets you browse manually while recording. |
| 🔄 Keepalive Daemon | Background daemon pings sites periodically to prevent session timeout. |
| 🛠️ Ad-hoc HTTP | gp http get -s <session> <url>: make one-off authenticated requests without writing a plugin. |
| 🎨 Beautiful CLI | Rich terminal output with spinners, tables, and color. Choose --format json, table, or raw on all commands. |
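The "EAFP injection with automatic 403 retry" pattern named in the Token & CSRF Support entry can be sketched as follows: send the request with the cached token first, and only re-extract the token if the server answers 403. The callables `send`, `get_token`, and `refresh_token` are hypothetical stand-ins, not graftpunk functions.

```python
def request_with_token(send, get_token, refresh_token, max_retries=1):
    """Try the cached token first; on a 403, refresh the token and retry.

    send(token) must return a (status_code, body) tuple.
    """
    token = get_token()
    for attempt in range(max_retries + 1):
        status, body = send(token)
        if status != 403:
            return status, body
        if attempt < max_retries:
            # The cached token was rejected: extract a fresh one and go again.
            token = refresh_token()
    return status, body
```

The design choice is to avoid a token-fetching round trip on every call: in the common case the cached token is still valid and one request is enough.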

Plugins

graftpunk is extensible via Python classes or YAML configuration. Both support declarative login, resource limits, and output formatting.

YAML Plugin (Simple REST Calls)

For straightforward HTTP calls, no Python needed:

# ~/.config/graftpunk/plugins/mybank.yaml
site_name: mybank
base_url: "https://secure.mybank.com"

login:
  url: /login
  fields:
    username: "input#email"
    password: "input#password"
  submit: "button[type=submit]"

commands:
  accounts:
    help: "List all accounts"
    method: GET
    url: "/api/accounts"
    jmespath: "accounts[].{id: id, name: name, balance: balance}"

  statements:
    help: "Get statements for a month"
    method: GET
    url: "/api/statements"
    params:
      - name: month
        required: true
        help: "Month name"
      - name: year
        type: int
        default: 2024
    timeout: 30
    max_retries: 2
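For readers unfamiliar with JMESPath, the expression on the accounts command projects each account down to three fields. In the common case where the response is an object with an "accounts" list of objects, it behaves roughly like the plain-Python function below. The response shape and the function name `project_accounts` are assumptions made for illustration; JMESPath's handling of missing or non-list values differs in edge cases.

```python
def project_accounts(payload):
    """Approximate `accounts[].{id: id, name: name, balance: balance}`.

    Missing keys become None, mirroring JMESPath's null for absent fields.
    """
    return [
        {
            "id": account.get("id"),
            "name": account.get("name"),
            "balance": account.get("balance"),
        }
        for account in payload.get("accounts") or []
    ]


sample = {
    "accounts": [
        {"id": 1, "name": "Checking", "balance": 250.0, "routing": "hidden"},
        {"id": 2, "name": "Savings"},
    ]
}
print(project_accounts(sample))
```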

Python Plugin (Complex Logic)

from graftpunk.plugins import CommandContext, LoginConfig, SitePlugin, command

class MyBankPlugin(SitePlugin):
    site_name = "mybank"
    base_url = "https://secure.mybank.com"
    backend = "nodriver"  # or "selenium"
    api_version = 1

    login_config = LoginConfig(
        url="/login",
        fields={"username": "input#email", "password": "input#password"},
        submit="button[type=submit]",
        success=".dashboard",
    )

    @command(help="List all accounts")
    def accounts(self, ctx: CommandContext):
        return ctx.session.get(f"{self.base_url}/api/accounts").json()

    @command(help="Get statements for a month")
    def statements(self, ctx: CommandContext, month: str, year: int = 2024):
        url = f"{self.base_url}/api/statements/{year}/{month}"
        return ctx.session.get(url).json()

Using Plugins

# Login (auto-generated from declarative config)
gp mybank login

# Run commands
gp mybank accounts
gp mybank statements --month january --year 2024 --format table

# List all discovered plugins
gp plugins

Plugin Discovery

Plugins are discovered from three sources:

  1. Entry points: Python packages registered via pyproject.toml
  2. YAML files: ~/.config/graftpunk/plugins/*.yaml and *.yml
  3. Python files: ~/.config/graftpunk/plugins/*.py

If two plugins share the same site_name, registration fails with an error showing both sources. No silent shadowing.
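The "no silent shadowing" rule amounts to a registry that refuses a second registration for the same site_name and names both sources in the error. A minimal sketch, with hypothetical class names (`PluginRegistry`, `PluginConflictError`) that are not taken from graftpunk:

```python
class PluginConflictError(Exception):
    """Raised when two plugins claim the same site_name."""


class PluginRegistry:
    def __init__(self):
        self._sources = {}  # site_name -> where the plugin came from

    def register(self, site_name, source):
        existing = self._sources.get(site_name)
        if existing is not None:
            # Fail loudly and name both sources instead of letting one win.
            raise PluginConflictError(
                f"site_name {site_name!r} is defined by both {existing} and {source}"
            )
        self._sources[site_name] = source


registry = PluginRegistry()
registry.register("mybank", "~/.config/graftpunk/plugins/mybank.yaml")
try:
    registry.register("mybank", "~/.config/graftpunk/plugins/mybank.py")
except PluginConflictError as error:
    print(error)
```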

See examples/ for working plugins and templates.

CLI Reference

$ gp --help

 🔌 graftpunk - turn any website into an API

Commands:
  session     Manage encrypted browser sessions
  http        Make ad-hoc HTTP requests with cached session cookies
  observe     Capture and view browser observability data
  plugins     List discovered plugins
  import-har  Import HAR file and generate a plugin
  config      Show current configuration
  keepalive   Manage the session keepalive daemon
  version     Show version info

Session Management

gp session list              # List all cached sessions
gp session show <name>       # Session metadata (domain, cookies, expiry)
gp session clear <name>      # Remove a session (or --all)
gp session export <name>     # Export cookies to HTTPie session format
gp session use <name>        # Set active session for subsequent commands
gp session unset             # Clear active session

Ad-hoc HTTP Requests

Make authenticated requests using cached sessions without writing a plugin:

gp http get -s mybank https://secure.mybank.com/api/accounts
gp http post -s mybank https://secure.mybank.com/api/transfer --data '{"amount": 100}'

Use --role to set browser header roles (built-in or plugin-defined):

gp http get -s mybank --role xhr https://secure.mybank.com/api/status
gp http get -s mybank --role api https://secure.mybank.com/v2/data  # custom plugin role

Supports all HTTP methods: get, post, put, patch, delete, head, options.

Observability

Capture browser activity for debugging:

# Open authenticated browser and capture network traffic
gp observe -s mybank go https://secure.mybank.com/dashboard

# Interactive mode — browse manually, Ctrl+C to save
gp observe -s mybank interactive https://secure.mybank.com/dashboard

# Or use the --interactive flag on observe go
gp observe -s mybank go --interactive https://secure.mybank.com/dashboard

# View captured data
gp observe list
gp observe show mybank
gp observe clean mybank

Interactive mode opens an authenticated browser and records all network traffic (including response bodies) while you click around. Press Ctrl+C to stop — HAR files, screenshots, page source, and console logs are saved automatically.

Pass --observe full to any command to capture screenshots, HAR files, and console logs.

HAR Import

Generate plugins from browser network captures:

gp import-har auth-flow.har --name mybank
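A HAR file is plain JSON, so it can be inspected before importing. The sketch below lists the distinct JSON-returning calls in a capture using the standard HAR layout (log.entries[].request and .response). It is an illustration of what a capture contains, not a description of how import-har works; `list_api_calls` and the sample data are hypothetical.

```python
import json
from urllib.parse import urlsplit


def list_api_calls(har):
    """Return unique (method, path) pairs for entries with a JSON response."""
    seen = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        mime = entry.get("response", {}).get("content", {}).get("mimeType", "")
        if "json" not in mime:
            continue  # skip HTML pages, images, stylesheets, ...
        call = (request.get("method", ""), urlsplit(request.get("url", "")).path)
        if call not in seen:
            seen.append(call)
    return seen


sample_har = json.loads("""
{"log": {"entries": [
  {"request": {"method": "GET", "url": "https://secure.mybank.com/dashboard"},
   "response": {"content": {"mimeType": "text/html"}}},
  {"request": {"method": "GET", "url": "https://secure.mybank.com/api/accounts?page=1"},
   "response": {"content": {"mimeType": "application/json"}}},
  {"request": {"method": "GET", "url": "https://secure.mybank.com/api/accounts?page=2"},
   "response": {"content": {"mimeType": "application/json"}}}
]}}
""")
print(list_api_calls(sample_har))
```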

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| GRAFTPUNK_STORAGE_BACKEND | local | Storage: local, supabase, or s3 |
| GRAFTPUNK_CONFIG_DIR | ~/.config/graftpunk | Config and encryption key location |
| GRAFTPUNK_SESSION_TTL_HOURS | 720 | Session lifetime (30 days) |
| GRAFTPUNK_LOG_LEVEL | WARNING | Logging verbosity |
| GRAFTPUNK_LOG_FORMAT | console | Log format: console or json |

CLI flags: -v (info), -vv (debug), --log-format json, --observe full, --network-debug (wire-level HTTP tracing).
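To make the defaults concrete, here is a small sketch of how the environment variables listed above could be resolved. The variable names and defaults come from the table; the `Settings` class and `load_settings` function are hypothetical and do not reflect graftpunk's internals.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    storage_backend: str
    config_dir: str
    session_ttl_hours: int
    log_level: str
    log_format: str


def load_settings(env=None):
    """Resolve settings from an environment mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        storage_backend=env.get("GRAFTPUNK_STORAGE_BACKEND", "local"),
        config_dir=env.get("GRAFTPUNK_CONFIG_DIR", "~/.config/graftpunk"),
        session_ttl_hours=int(env.get("GRAFTPUNK_SESSION_TTL_HOURS", "720")),
        log_level=env.get("GRAFTPUNK_LOG_LEVEL", "WARNING"),
        log_format=env.get("GRAFTPUNK_LOG_FORMAT", "console"),
    )


defaults = load_settings({})
print(defaults.session_ttl_hours / 24)  # session lifetime in days
```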

Browser Backends

graftpunk supports two browser automation backends (both included by default):

| Backend | Best For |
| --- | --- |
| selenium | Simple sites, backward compatibility |
| nodriver | Enterprise sites, better anti-detection |

Why NoDriver? NoDriver uses Chrome DevTools Protocol (CDP) directly without the WebDriver binary, eliminating a common detection vector used by anti-bot systems.

Bot-detection cookie filtering: When injecting session cookies into a nodriver browser (for observe mode, token extraction, etc.), graftpunk automatically skips known WAF tracking cookies (Akamai bm_*, ak_bmsc, _abck). These cookies carry stale bot-classification state that causes WAFs to reject the browser with ERR_HTTP2_PROTOCOL_ERROR. Disable with skip_bot_cookies=False if needed.
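The filtering rule can be pictured as a name match against the patterns listed above. This is a sketch under assumptions: the cookie dictionaries, the `filter_cookies` function, and the pattern tuple are illustrative, and only the cookie names and the `skip_bot_cookies` flag name come from the text.

```python
from fnmatch import fnmatchcase

# Cookie-name patterns mentioned above (Akamai bot-manager cookies).
BOT_COOKIE_PATTERNS = ("bm_*", "ak_bmsc", "_abck")


def filter_cookies(cookies, skip_bot_cookies=True):
    """Drop cookies whose name matches a known bot-detection pattern."""
    if not skip_bot_cookies:
        return list(cookies)
    return [
        cookie
        for cookie in cookies
        if not any(fnmatchcase(cookie["name"], pattern) for pattern in BOT_COOKIE_PATTERNS)
    ]


cached = [
    {"name": "sessionid", "value": "abc123"},
    {"name": "bm_sz", "value": "stale"},
    {"name": "ak_bmsc", "value": "stale"},
    {"name": "_abck", "value": "stale"},
]
print([c["name"] for c in filter_cookies(cached)])
```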

from graftpunk import BrowserSession

# Use BrowserSession with explicit backend
session = BrowserSession(backend="nodriver", headless=False)

Security

Your Data, Your Rules

graftpunk is for automating access to your own accounts. You're not scraping other people's data—you're building tools to access information that already belongs to you.

Some services may consider automation a ToS violation. Use your judgment.

Encryption

  • Algorithm: Fernet (AES-128-CBC + HMAC-SHA256)
  • Key storage: ~/.config/graftpunk/.session_key with 0600 permissions
  • Integrity: SHA-256 checksum validated before deserializing
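The integrity step matters because pickle can execute code while deserializing, so the payload should be checked before it is loaded. A minimal sketch of "verify the SHA-256 checksum, then unpickle", using only the standard library. The digest-then-payload byte layout and the `pack`/`unpack` names are assumptions for illustration, not graftpunk's on-disk format.

```python
import hashlib
import hmac
import pickle

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def pack(obj):
    """Serialize obj and prefix the payload with its SHA-256 digest."""
    payload = pickle.dumps(obj)
    return hashlib.sha256(payload).digest() + payload


def unpack(blob):
    """Verify the checksum first; only then hand the payload to pickle."""
    digest, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    if not hmac.compare_digest(digest, hashlib.sha256(payload).digest()):
        raise ValueError("checksum mismatch; refusing to deserialize")
    return pickle.loads(payload)


blob = pack({"cookies": {"sessionid": "abc123"}})
print(unpack(blob))
```

A bare checksum only catches corruption; protection against deliberate tampering comes from the keyed HMAC that Fernet adds around the encrypted data.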

Best Practices

  • Keep your encryption key secure
  • Don't share session files
  • Run graftpunk on trusted machines
  • Use unique, strong passwords for automated accounts

Pickle warning: graftpunk uses Python's pickle for serialization. Only load sessions you created.

Development

git clone https://github.com/stavxyz/graftpunk.git
cd graftpunk
just setup    # Install deps with uv
just check    # Run lint, typecheck, tests
just build    # Build for PyPI

Requires uv for development. See CONTRIBUTING.md for full guidelines.

License

MIT License—see LICENSE.

Acknowledgments


Built for automating your own data access.
