plasmate

Agent-native headless browser for Python. HTML in, Semantic Object Model out.

Install

pip install plasmate

The SDK requires the plasmate binary on your PATH. To install it:

curl -fsSL https://plasmate.app/install.sh | sh
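
Because the SDK depends on an external binary, it can help to confirm the binary is reachable before creating a client. The following is a minimal sketch using only the standard library; `require_binary` is a hypothetical helper written for illustration and is not part of the plasmate package.

```python
import shutil


def require_binary(name: str = "plasmate") -> str:
    """Return the path of `name` as found on PATH, or raise if it is missing.

    Hypothetical helper for illustration; not part of the plasmate package.
    """
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"{name!r} was not found on PATH; install the plasmate binary first"
        )
    return path
```

The returned path could then be passed to the constructor's `binary` parameter, for example `Plasmate(binary=require_binary())`.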

Quick Start

from plasmate import Plasmate

browser = Plasmate()

# Fetch a page as a structured Semantic Object Model
som = browser.fetch_page("https://news.ycombinator.com")
print(f"{som['title']}: {len(som['regions'])} regions")

# Extract clean text only
text = browser.extract_text("https://example.com")
print(text)

# Interactive browsing
session = browser.open_page("https://example.com")
print(session["session_id"], session["som"]["title"])

title = browser.evaluate(session["session_id"], "document.title")
print(title)

browser.close_page(session["session_id"])
browser.close()

Async

import asyncio

from plasmate import AsyncPlasmate

async def main():
    async with AsyncPlasmate() as browser:
        som = await browser.fetch_page("https://example.com")
        print(som["title"])

asyncio.run(main())

Context Manager

from plasmate import Plasmate

with Plasmate() as browser:
    som = browser.fetch_page("https://example.com")
# The plasmate process is shut down automatically when the block exits

API

Plasmate(binary="plasmate", timeout=30)

Param    Type   Default     Description
binary   str    "plasmate"  Path to the plasmate binary
timeout  float  30          Response timeout in seconds

Stateless (one-shot)

  • fetch_page(url, *, budget=None, javascript=True) - Returns SOM dict
  • extract_text(url, *, max_chars=None) - Returns clean text string

Stateful (interactive sessions)

  • open_page(url) - Returns dict with session_id and som
  • evaluate(session_id, expression) - Run JS, get result
  • click(session_id, element_id) - Click element, get updated SOM
  • close_page(session_id) - Close session
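
Since a session opened with open_page is presumably held until close_page is called, a small wrapper can make sure cleanup happens even when an error occurs mid-session. This is a sketch only: `page_session` is a hypothetical helper, not part of the plasmate package, and it relies solely on the open_page and close_page methods listed above.

```python
from contextlib import contextmanager


@contextmanager
def page_session(browser, url):
    """Open a page and always close its session, even if the body raises.

    Hypothetical helper. `browser` is expected to expose open_page(url),
    returning a dict with a "session_id" key, and close_page(session_id).
    """
    session = browser.open_page(url)
    try:
        yield session
    finally:
        browser.close_page(session["session_id"])


# Intended usage with a real client (requires the plasmate binary):
# with page_session(browser, "https://example.com") as session:
#     print(browser.evaluate(session["session_id"], "document.title"))
```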

Lifecycle

  • close() - Shut down the plasmate process

Pydantic Models

Parse SOM responses into typed Pydantic v2 models:

from plasmate import Plasmate, Som

browser = Plasmate()
data = browser.fetch_page("https://example.com")
som = Som(**data)

print(som.title)               # "Example Domain"
print(som.meta.element_count)  # 12

for region in som.regions:
    print(f"{region.role}: {len(region.elements)} elements")

Query Helpers

Search and traverse SOM documents:

from plasmate import Som, find_by_role, find_by_id, find_by_tag
from plasmate import find_interactive, find_by_text, flat_elements, get_token_estimate

# `som` is the Som instance built in the Pydantic Models example above

# Find all navigation regions
navs = find_by_role(som, "navigation")

# Find a specific element
el = find_by_id(som, "e5")
if el:
    print(el.role, el.text)

# Find all links
links = find_by_tag(som, "link")

# Get all interactive elements (buttons, inputs, etc.)
for el in find_interactive(som):
    print(f"{el.id}: {el.role} - {el.text}")

# Search by text content (case-insensitive)
results = find_by_text(som, "sign up")

# Flatten all elements for iteration
all_elements = flat_elements(som)
print(f"{len(all_elements)} total elements")

# Estimate token usage
tokens = get_token_estimate(som)
print(f"~{tokens} tokens")
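
To make the idea behind a case-insensitive text search concrete, here is a rough sketch that works on the raw SOM dict instead of the Pydantic models. It is not how find_by_text is implemented. It assumes each region carries an "elements" list and each element may carry a "text" string, which matches the fields used in the examples above but is otherwise unverified. `search_text` and the sample data are hypothetical.

```python
def search_text(som: dict, needle: str) -> list[dict]:
    """Return elements whose text contains `needle`, ignoring case.

    Assumes regions hold an "elements" list and elements may have a
    "text" string; elements without text are skipped.
    """
    needle = needle.casefold()
    matches = []
    for region in som.get("regions", []):
        for element in region.get("elements", []):
            text = element.get("text") or ""
            if needle in text.casefold():
                matches.append(element)
    return matches


# Hand-written sample in the assumed shape, not real plasmate output.
sample = {
    "title": "Example",
    "regions": [
        {
            "role": "navigation",
            "elements": [{"id": "e1", "role": "link", "text": "Sign Up"}],
        },
        {
            "role": "main",
            "elements": [
                {"id": "e2", "role": "paragraph", "text": "Welcome"},
                {"id": "e3", "role": "button"},  # no text key
            ],
        },
    ],
}

print([el["id"] for el in search_text(sample, "sign up")])  # ['e1']
```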

How It Works

The SDK spawns plasmate mcp as a child process and communicates via JSON-RPC 2.0 over stdio. The plasmate binary handles HTML parsing, JavaScript execution (V8), and SOM compilation in Rust.
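
As an illustration of what JSON-RPC 2.0 messages look like, the sketch below builds a request and unpacks a response using only the standard library. The envelope fields (`jsonrpc`, `id`, `method`, `params`, `result`, `error`) come from the JSON-RPC 2.0 specification. Everything else is an assumption: the one-message-per-line framing, the method name, and the params are placeholders, and the helper names are hypothetical. The actual messages exchanged with plasmate mcp may differ.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing request ids


def make_request(method: str, params: dict) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line of JSON.

    Newline-delimited framing is an assumption made for this sketch.
    """
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": method,
        "params": params,
    }
    return json.dumps(message) + "\n"


def parse_response(line: str):
    """Return the `result` of a JSON-RPC 2.0 response, or raise on `error`."""
    message = json.loads(line)
    if "error" in message:
        err = message["error"]
        raise RuntimeError(
            f"JSON-RPC error {err.get('code')}: {err.get('message')}"
        )
    return message["result"]


# "fetch_page" as a method name and these params are placeholders.
print(make_request("fetch_page", {"url": "https://example.com"}), end="")
```

In a real client the request line would be written to the child process's stdin and the response read back from its stdout; the SDK handles that exchange internally.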

License

Apache-2.0
