plasmate

Agent-native headless browser for Python. HTML in, Semantic Object Model out.

Install

pip install plasmate

The SDK requires the plasmate binary on your PATH. Install it with:

curl -fsSL https://plasmate.app/install.sh | sh
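
Because the SDK depends on an external binary, it can help to check that the binary is discoverable before constructing a browser. The following is a minimal sketch using only the standard library; require_binary is a hypothetical helper written for illustration and is not part of the plasmate package.

```python
import shutil


def require_binary(name: str = "plasmate") -> str:
    """Return the absolute path of `name` on PATH, or raise a clear error.

    Hypothetical helper for illustration; not part of the plasmate package.
    """
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"{name!r} was not found on PATH; install it before using the SDK"
        )
    return path
```

The returned path could then be passed to the constructor's binary parameter, for example Plasmate(binary=require_binary()).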

Quick Start

from plasmate import Plasmate

browser = Plasmate()

# Fetch a page as a structured Semantic Object Model
som = browser.fetch_page("https://news.ycombinator.com")
print(f"{som['title']}: {len(som['regions'])} regions")

# Extract clean text only
text = browser.extract_text("https://example.com")
print(text)

# Interactive browsing
session = browser.open_page("https://example.com")
print(session["session_id"], session["som"]["title"])

title = browser.evaluate(session["session_id"], "document.title")
print(title)

browser.close_page(session["session_id"])
browser.close()

Async

import asyncio

from plasmate import AsyncPlasmate


async def main():
    async with AsyncPlasmate() as browser:
        som = await browser.fetch_page("https://example.com")
        print(som["title"])

asyncio.run(main())
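
The async client lends itself to fetching several pages at once. The sketch below is one possible pattern, not a documented feature: it assumes that a single AsyncPlasmate instance tolerates overlapping fetch_page calls, which the README does not state. fetch_titles and its limit parameter are hypothetical names chosen for illustration.

```python
import asyncio


async def fetch_titles(browser, urls, limit=4):
    """Fetch several pages concurrently and return {url: title}.

    `browser` is any object with an async fetch_page(url) method that
    returns a dict with a "title" key, such as AsyncPlasmate.
    `limit` caps the number of requests in flight (an assumed, tunable value).
    """
    gate = asyncio.Semaphore(limit)

    async def one(url):
        # Wait for a free slot before issuing the request.
        async with gate:
            som = await browser.fetch_page(url)
            return url, som["title"]

    pairs = await asyncio.gather(*(one(u) for u in urls))
    return dict(pairs)
```

Inside an async with AsyncPlasmate() as browser block, this would be called as await fetch_titles(browser, [...]). If overlapping calls turn out not to be supported, setting limit=1 reduces it to sequential fetching.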

Context Manager

from plasmate import Plasmate

with Plasmate() as browser:
    som = browser.fetch_page("https://example.com")
    # The plasmate process shuts down automatically when the block exits

API

Plasmate(binary="plasmate", timeout=30)

Param    Type   Default     Description
binary   str    "plasmate"  Path to the plasmate binary
timeout  float  30          Response timeout in seconds

Stateless (one-shot)

  • fetch_page(url, *, budget=None, javascript=True) - Returns SOM dict
  • extract_text(url, *, max_chars=None) - Returns clean text string
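
The two stateless calls can be combined to build a quick overview of several pages. This is a rough sketch that uses only the signatures listed above. It assumes javascript=False skips script execution and that max_chars truncates the returned text; the README does not spell out either behavior. preview_pages is a hypothetical helper.

```python
def preview_pages(browser, urls, max_chars=200):
    """Return a list of (url, title, text preview) tuples.

    `browser` is expected to expose fetch_page() and extract_text()
    with the keyword arguments listed in the API section.
    """
    rows = []
    for url in urls:
        # Assumption: javascript=False requests the page without running scripts.
        som = browser.fetch_page(url, javascript=False)
        # Assumption: max_chars limits the length of the returned string.
        text = browser.extract_text(url, max_chars=max_chars)
        rows.append((url, som["title"], text))
    return rows
```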

Stateful (interactive sessions)

  • open_page(url) - Returns dict with session_id and som
  • evaluate(session_id, expression) - Run JS, get result
  • click(session_id, element_id) - Click element, get updated SOM
  • close_page(session_id) - Close session
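
Since every open_page call should be paired with close_page, a small context manager can make that pairing hard to forget. The sketch below relies only on the open_page and close_page methods listed above; page_session is a hypothetical wrapper, not something the package provides.

```python
from contextlib import contextmanager


@contextmanager
def page_session(browser, url):
    """Open a page and guarantee close_page() runs, even if the body raises.

    Hypothetical wrapper for illustration; yields the dict returned by
    open_page(), which contains "session_id" and "som".
    """
    session = browser.open_page(url)
    try:
        yield session
    finally:
        browser.close_page(session["session_id"])
```

With a Plasmate instance this might be used as: with page_session(browser, "https://example.com") as s, followed by browser.evaluate(s["session_id"], "document.title") inside the block.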

Lifecycle

  • close() - Shut down the plasmate process

Pydantic Models

Parse SOM responses into typed Pydantic v2 models:

from plasmate import Plasmate, Som

browser = Plasmate()
data = browser.fetch_page("https://example.com")
som = Som(**data)

print(som.title)               # e.g. "Example Domain"
print(som.meta.element_count)  # e.g. 12

for region in som.regions:
    print(f"{region.role}: {len(region.elements)} elements")
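
The same kind of summary can be computed on the raw dict without Pydantic. The sketch below assumes the dict keys mirror the model attributes used above ("regions", "role", "elements"); only "title" and "regions" are confirmed by the Quick Start, so treat the other keys as assumptions. The sample data is invented for illustration.

```python
from collections import Counter


def elements_per_role(som: dict) -> Counter:
    """Count elements per region role in a raw SOM-like dict.

    Assumes each region is a dict with "role" and "elements" keys.
    """
    counts = Counter()
    for region in som.get("regions", []):
        counts[region.get("role", "unknown")] += len(region.get("elements", []))
    return counts


# Invented sample in the assumed shape, not real plasmate output.
sample = {
    "title": "Example",
    "regions": [
        {"role": "navigation", "elements": [{"id": "e1"}, {"id": "e2"}]},
        {"role": "main", "elements": [{"id": "e3"}]},
    ],
}
print(elements_per_role(sample))
```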

Query Helpers

Search and traverse SOM documents:

from plasmate import find_by_role, find_by_id, find_by_tag
from plasmate import find_interactive, find_by_text, flat_elements, get_token_estimate

# `som` is a Som instance, e.g. Som(**browser.fetch_page(url))

# Find all navigation regions
navs = find_by_role(som, "navigation")

# Find a specific element
el = find_by_id(som, "e5")
if el:
    print(el.role, el.text)

# Find all links
links = find_by_tag(som, "link")

# Get all interactive elements (buttons, inputs, etc.)
for el in find_interactive(som):
    print(f"{el.id}: {el.role} - {el.text}")

# Search by text content (case-insensitive)
results = find_by_text(som, "sign up")

# Flatten all elements for iteration
all_elements = flat_elements(som)
print(f"{len(all_elements)} total elements")

# Estimate token usage
tokens = get_token_estimate(som)
print(f"~{tokens} tokens")
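
To make the idea behind flattening and case-insensitive text search concrete, here is an independent sketch over plain dicts. It is not the library's implementation: it assumes regions hold a flat "elements" list and that each element has an optional "text" key. iter_elements and search_text are hypothetical names.

```python
def iter_elements(som: dict):
    """Yield every element dict from every region, in document order."""
    for region in som.get("regions", []):
        yield from region.get("elements", [])


def search_text(som: dict, needle: str) -> list:
    """Return elements whose text contains `needle`, ignoring case."""
    needle = needle.casefold()
    return [
        el
        for el in iter_elements(som)
        if needle in (el.get("text") or "").casefold()
    ]


# Invented sample in the assumed shape, not real plasmate output.
page = {
    "regions": [
        {"role": "navigation", "elements": [{"id": "e1", "text": "Home"}]},
        {"role": "main", "elements": [{"id": "e2", "text": "Sign Up now"},
                                      {"id": "e3", "text": None}]},
    ]
}
matches = search_text(page, "sign up")
```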

How It Works

The SDK spawns plasmate mcp as a child process and communicates with it via JSON-RPC 2.0 over stdio. The plasmate binary, written in Rust, handles HTML parsing, JavaScript execution (V8), and SOM compilation.
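
The transport described above can be illustrated with a stand-in child process. The sketch below assumes one JSON message per line on stdin and stdout; the README does not specify the framing or any method names, so the child here is a tiny Python echo script and "echo" is a made-up method. It shows the request and response shape of JSON-RPC 2.0, not plasmate's actual protocol surface.

```python
import json
import subprocess
import sys

# Stand-in child: reads one JSON-RPC request per line and echoes the
# params back as the result. It only imitates the transport; it is
# NOT plasmate and implements none of its methods.
CHILD = r"""
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    resp = {"jsonrpc": "2.0", "id": req["id"], "result": req.get("params")}
    sys.stdout.write(json.dumps(resp) + "\n")
    sys.stdout.flush()
"""


def rpc_call(proc, request_id, method, params):
    """Send one JSON-RPC 2.0 request and block until its response line arrives."""
    request = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    proc.stdin.write(json.dumps(request) + "\n")
    proc.stdin.flush()
    response = json.loads(proc.stdout.readline())
    if "error" in response:
        raise RuntimeError(response["error"])
    return response["result"]


def demo():
    """Spawn the stand-in child, make one call, and shut it down."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        # "echo" is a made-up method name understood only by the stand-in.
        return rpc_call(proc, 1, "echo", {"url": "https://example.com"})
    finally:
        proc.stdin.close()   # EOF ends the child's read loop
        proc.wait(timeout=5)
        proc.stdout.close()
```

Matching each response to its request by id is what lets a client like this keep several calls outstanding on one pipe; the sketch keeps things simple by waiting for each response before sending the next request.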

License

Apache-2.0
