# Maxun Python SDK
The official Python SDK for Maxun — turn any website into an API.
Works with both Maxun Cloud and Maxun Open Source.
## What can you do with Maxun SDK?
- Extract structured data from any website
- Scrape entire pages as Markdown or HTML
- Crawl multiple pages automatically to discover and scrape content
- Perform web searches and extract results as metadata or full content
- Use AI to extract data with natural language prompts
- Capture screenshots (visible area or full page)
- Automate workflows with clicks, form fills, and navigation
- Schedule recurring jobs to keep your data fresh
- Get webhooks when extractions complete
- Handle pagination automatically (scroll, click, load more)
## Installation

```shell
pip install maxun
```

With LLM support:

```shell
pip install "maxun[anthropic]"   # Anthropic Claude
pip install "maxun[openai]"      # OpenAI GPT
pip install "maxun[all]"         # all LLM providers
```
## Local Development

Dependencies are declared in `pyproject.toml`. To install the SDK locally in editable mode:

```shell
cd python-sdk
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -e .            # core only
pip install -e ".[all]"     # core + all LLM providers
```
## Configuration

```python
from maxun import Config

config = Config(
    api_key="your-api-key",                     # required
    base_url="https://app.maxun.dev/api/sdk/",  # optional, defaults to localhost
    team_id="your-team-uuid",                   # optional, for team-scoped robots
)
```

Environment variables are supported via a `.env` file (uses `python-dotenv`):

```
MAXUN_API_KEY=your-api-key
MAXUN_BASE_URL=https://app.maxun.dev/api/sdk/
MAXUN_TEAM_ID=your-team-uuid
```
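To illustrate how those `MAXUN_*` variables line up with the `Config` fields above, here is a minimal stdlib-only sketch that parses `.env`-style text into `Config` keyword arguments. The prefix-stripping and key mapping are for illustration only; in practice `python-dotenv` handles the loading for you.

```python
def env_to_config_kwargs(env_text: str) -> dict:
    """Map MAXUN_* variables from .env-style text onto Config keyword names."""
    mapping = {
        "MAXUN_API_KEY": "api_key",
        "MAXUN_BASE_URL": "base_url",
        "MAXUN_TEAM_ID": "team_id",
    }
    kwargs = {}
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        if key in mapping:
            kwargs[mapping[key]] = value
    return kwargs

sample = "MAXUN_API_KEY=your-api-key\nMAXUN_BASE_URL=https://app.maxun.dev/api/sdk/\n"
print(env_to_config_kwargs(sample))
```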
## Core Classes

### Scrape

Scrape full pages as Markdown, HTML, or screenshots.

```python
from maxun import Scrape, Config

scraper = Scrape(Config(api_key="..."))

robot = await scraper.create(
    "Page Scraper",
    "https://example.com",
    formats=["markdown", "html"],
)

result = await robot.run()
print(result["data"]["markdown"])
```

Available formats: `"markdown"`, `"html"`, `"screenshot-visible"`, `"screenshot-fullpage"`
### Crawl

Crawl multiple pages starting from a URL.

```python
from maxun import Crawl, CrawlConfig, Config

crawler = Crawl(Config(api_key="..."))

robot = await crawler.create(
    "Site Crawler",
    "https://example.com",
    CrawlConfig(
        mode="domain",  # "domain" | "subdomain" | "path"
        limit=50,
        max_depth=3,
        use_sitemap=True,
        follow_links=True,
        respect_robots=True,
    ),
)

result = await robot.run()
```
### Search

Search the web and collect results.

```python
from maxun import Search, SearchConfig, Config

searcher = Search(Config(api_key="..."))

robot = await searcher.create(
    "AI News Search",
    SearchConfig(
        query="artificial intelligence 2025",
        mode="discover",  # "discover" | "scrape"
        limit=10,
    ),
)

result = await robot.run()
```
### Extract

Build robots that extract structured data from pages.

```python
from maxun import Extract, Config

extractor = Extract(Config(api_key="..."))

# Capture specific text fields
robot = await (
    extractor
    .create("My Robot")
    .navigate("https://example.com")
    .capture_text({"Title": "h1", "Price": ".price"})
)

# Capture a list of items with optional pagination
robot = await (
    extractor
    .create("Product List")
    .navigate("https://shop.example.com")
    .capture_list({
        "selector": "article.product",
        "pagination": {"type": "clickNext", "selector": "a.next"},
        "maxItems": 100,
    })
)

result = await robot.run()
```
### LLM Extraction

Use a natural language prompt to extract data.

```python
from maxun import Extract, Config

extractor = Extract(Config(api_key="..."))

robot = await extractor.extract(
    prompt="Extract the product name, price, and rating",
    url="https://shop.example.com/product/123",
    llm_provider="anthropic",
    llm_model="claude-3-5-sonnet-20241022",
    llm_api_key="your-anthropic-key",
)

result = await robot.run()
```
## Robot Management

All robot types return a Robot instance with a consistent API:

```python
# Run the robot
result = await robot.run()

# Schedule recurring runs
await robot.schedule({
    "runEvery": 1,
    "runEveryUnit": "DAYS",
    "timezone": "UTC",
})

# Add a webhook
await robot.add_webhook({
    "url": "https://your-server.com/webhook",
    "events": ["run.completed", "run.failed"],
})

# Get execution history
runs = await robot.get_runs()
latest = await robot.get_latest_run()
specific = await robot.get_run("run-id")

# Update metadata or workflow
await robot.update({"meta": {"name": "New Name"}})
await robot.refresh()  # reload from server

# Delete
await robot.delete()
```
## Scheduling

```python
from maxun import ScheduleConfig

await robot.schedule({
    "runEvery": 6,
    "runEveryUnit": "HOURS",  # MINUTES | HOURS | DAYS | WEEKS | MONTHS
    "timezone": "America/New_York",
})

# Stop scheduling
await robot.unschedule()

# Read current schedule
schedule = robot.get_schedule()
```
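The `runEvery` / `runEveryUnit` pair describes a plain repeat interval. As a sketch of those semantics only (this is not SDK code; the real schedule is evaluated server-side, and `MONTHS` is approximated here as 30 days), the next run time can be computed like this:

```python
from datetime import datetime, timedelta

# Interval per unit; MONTHS is a 30-day approximation for illustration.
UNIT_DELTAS = {
    "MINUTES": timedelta(minutes=1),
    "HOURS": timedelta(hours=1),
    "DAYS": timedelta(days=1),
    "WEEKS": timedelta(weeks=1),
    "MONTHS": timedelta(days=30),
}

def next_run(last_run: datetime, run_every: int, unit: str) -> datetime:
    """Return the next run time after last_run for a runEvery/runEveryUnit pair."""
    return last_run + run_every * UNIT_DELTAS[unit]

print(next_run(datetime(2025, 1, 1, 0, 0), 6, "HOURS"))  # 2025-01-01 06:00:00
```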
## Webhooks

```python
await robot.add_webhook({
    "url": "https://your-server.com/webhook",
    "events": ["run.completed", "run.failed"],
})

webhooks = robot.get_webhooks()
await robot.remove_webhooks()
```
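On the receiving side, your server dispatches on the event name. A minimal sketch of a handler follows; the payload shape (`{"event": ..., "data": ...}`) is an assumption for illustration, so check your actual Maxun deliveries for the real schema:

```python
import json

def handle_event(raw_body: bytes) -> str:
    """Dispatch a webhook delivery by its event name (payload shape assumed)."""
    payload = json.loads(raw_body)
    event = payload.get("event")
    if event == "run.completed":
        # e.g. persist payload["data"] or trigger downstream processing
        return f"completed: {len(payload.get('data', {}))} fields"
    if event == "run.failed":
        # e.g. alert or re-queue the run
        return "failed"
    return "ignored"

print(handle_event(b'{"event": "run.completed", "data": {"markdown": "..."}}'))
```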
## Error Handling

```python
from maxun import MaxunError

try:
    result = await robot.run()
except MaxunError as e:
    print(f"Error {e.status_code}: {e}")
    print(f"Details: {e.details}")
```
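For transient failures (timeouts, rate limits), a common pattern is to wrap the run in a retry loop with exponential backoff. This is a generic sketch, not part of the SDK: with the SDK you would pass `run=robot.run` and `retryable=(MaxunError,)`.

```python
import asyncio

async def run_with_retry(run, retries=3, base_delay=1.0, retryable=(Exception,)):
    """Retry an async callable with exponential backoff (1x, 2x, 4x base_delay)."""
    for attempt in range(retries):
        try:
            return await run()
        except retryable:
            if attempt == retries - 1:
                raise  # out of attempts, surface the error
            await asyncio.sleep(base_delay * 2 ** attempt)
```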
## Types Reference

| Type | Description |
|---|---|
| `Config` | SDK configuration (`api_key`, `base_url`, `team_id`) |
| `CrawlConfig` | Crawl robot configuration |
| `SearchConfig` | Search robot configuration |
| `ScheduleConfig` | Schedule configuration |
| `WebhookConfig` | Webhook configuration |
| `ExtractListConfig` | List capture configuration |
| `PaginationConfig` | Pagination strategy |
| `MaxunError` | SDK exception with `status_code` and `details` |
## Examples

See the `examples/` directory for complete working examples.
## Requirements

- Python 3.8+
- `httpx >= 0.24.0`
- `python-dotenv >= 1.0.0`
- Optional: `anthropic >= 0.18.0`, `openai >= 1.0.0`