A simple API client for code execution
InstaVM Client
A comprehensive Python client library for InstaVM's code execution and browser automation APIs.
Features
- Code Execution: Run Python, Bash, and other languages in secure cloud environments
- Browser Automation: Control web browsers for testing, scraping, and automation
- Session Management: Automatic session creation and server-side expiration
- File Operations: Upload files to execution environments
- Async Support: Execute commands asynchronously for long-running tasks
- Error Handling: Comprehensive exception handling for different failure modes
Installation
You can install the package using pip:
pip install instavm
Quick Start
Code Execution
from instavm import InstaVM, ExecutionError, NetworkError
# Create client with automatic session management
client = InstaVM(api_key='your_api_key')
try:
    # Execute a command
    result = client.execute("print(100**100)")
    print(result)
    # Get usage info for the session
    usage = client.get_usage()
    print(usage)
except ExecutionError as e:
    print(f"Code execution failed: {e}")
except NetworkError as e:
    print(f"Network issue: {e}")
finally:
    client.close_session()
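The try/finally pattern above can be wrapped in a small context manager so the session is closed even when an exception escapes. This is a minimal sketch and not part of the InstaVM library: `managed_client` is a hypothetical helper that only assumes the client exposes the `close_session()` method shown above, and `FakeClient` is a stand-in so the example runs without an API key.

```python
from contextlib import contextmanager


@contextmanager
def managed_client(client):
    """Yield the client and always call close_session() on exit.

    Hypothetical helper: works with any object that has close_session().
    """
    try:
        yield client
    finally:
        client.close_session()


class FakeClient:
    """Stand-in for InstaVM so the sketch runs without an API key."""

    def __init__(self):
        self.closed = False

    def execute(self, code):
        return {"stdout": f"ran: {code}"}

    def close_session(self):
        self.closed = True


fake = FakeClient()
with managed_client(fake) as c:
    output = c.execute("print(1)")
```

With a real client, the same `with managed_client(InstaVM(api_key=...)) as client:` form would replace the explicit `finally` block.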
File Upload
from instavm import InstaVM
client = InstaVM(api_key='your_api_key')
# Upload a file to the execution environment
result = client.upload_file("local_script.py", "/remote/path/script.py")
print(result)
# Execute the uploaded file
execution_result = client.execute("python /remote/path/script.py", language="bash")
print(execution_result)
Error Handling
from instavm import InstaVM, AuthenticationError, RateLimitError, SessionError
try:
    client = InstaVM(api_key='invalid_key')
except AuthenticationError:
    print("Invalid API key")
except RateLimitError:
    print("Rate limit exceeded - try again later")
except SessionError as e:
    print(f"Session error: {e}")
Async Execution
from instavm import InstaVM
client = InstaVM(api_key='your_api_key')
# Execute command asynchronously (returns task ID)
result = client.execute_async("sleep 5 && echo 'Long task complete!'", language="bash")
task_id = result['task_id']
print(f"Task {task_id} is running in background...")
# Poll for task result
task_result = client.get_task_result(task_id, poll_interval=2, timeout=30)
print("Task complete!")
print(f"Stdout: {task_result['stdout']}")
print(f"Stderr: {task_result['stderr']}")
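`get_task_result` accepts a `poll_interval` and a `timeout`. As a rough illustration of what polling with those two parameters involves, the sketch below repeatedly calls a fetch function until it returns a result or the deadline passes. It is a generic pattern, not InstaVM's actual implementation; `poll_until_done` and `fake_fetch` are hypothetical names.

```python
import time


def poll_until_done(fetch, poll_interval=2.0, timeout=30.0):
    """Call fetch() every poll_interval seconds until it returns a non-None value.

    Raises TimeoutError if no result arrives within timeout seconds.
    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no result after {timeout} seconds")
        time.sleep(poll_interval)


# Simulated task that finishes on the third poll.
attempts = {"count": 0}


def fake_fetch():
    attempts["count"] += 1
    if attempts["count"] < 3:
        return None
    return {"stdout": "Long task complete!\n", "stderr": ""}


task_result = poll_until_done(fake_fetch, poll_interval=0.01, timeout=1.0)
```

Using `time.monotonic()` rather than wall-clock time keeps the deadline stable if the system clock is adjusted while waiting.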
VMs, Snapshots, and Shares
Manage snapshots (templates), VMs, and shared ports via either:
- Sub-clients (recommended): client.vms.*, client.snapshots.*, client.shares.*
- Flat aliases: client.create_vm(...), client.list_snapshots(...), etc.
import os
from instavm import InstaVM
client = InstaVM(
    api_key=os.environ["INSTA_API_KEY"],
    base_url=os.getenv("INSTA_BASE_URL", "https://api.instavm.io"),
)
# 1) Create a snapshot from an OCI image
snapshot = client.snapshots.create(
    oci_image="python:3.12-slim",
    name="py312",
)
# 2) Create a VM from that snapshot
vm = client.vms.create(
    snapshot_id=snapshot["id"],
    memory_mb=1024,
    vcpu_count=2,
    metadata={"project": "demo"},
)
# 3) (Optional) Snapshot a running VM
vm_snapshot = client.vms.snapshot(vm["vm_id"], name="post-install")
print("VM snapshot:", vm_snapshot)
# 4) (Optional) Clone a VM
clone = client.vms.clone(vm["vm_id"])
print("Clone:", clone)
# 5) (Optional) Share a port from the VM
client.shares.create(vm_id=vm["vm_id"], port=8000, is_public=False)
# Cleanup
client.vms.delete(vm["vm_id"])
Browser Automation
Basic Browser Usage
from instavm import InstaVM
client = InstaVM(api_key='your_api_key')
# Create browser session
session_id = client.create_browser_session(1920, 1080)
print(f"Browser session: {session_id}")
# Navigate to a webpage
nav_result = client.browser_navigate("https://example.com", session_id)
print(f"Navigation: {nav_result}")
# Take screenshot (returns base64 string)
screenshot = client.browser_screenshot(session_id)
print(f"Screenshot size: {len(screenshot)} characters")
# Extract page elements
elements = client.browser_extract_elements(session_id, "title", attributes=["text"])
print(f"Page title: {elements}")
# Interact with page
client.browser_scroll(session_id, y=200)
client.browser_click("button#submit", session_id)
client.browser_fill("input[name='email']", "test@example.com", session_id)
# Sessions auto-expire on server side (no explicit close needed)
# But you can close manually if desired:
# client.close_browser_session(session_id)
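Because `browser_screenshot` returns a base64 string, it has to be decoded before it is written to disk as an image. The sketch below shows that step in isolation. `save_screenshot` is a hypothetical helper, and the input here is a stand-in string (the base64-encoded PNG signature) rather than a real screenshot.

```python
import base64
import os
import tempfile


def save_screenshot(screenshot_b64, path):
    """Decode a base64 screenshot string and write the raw bytes to path.

    Returns the number of bytes written. Hypothetical helper.
    """
    image_bytes = base64.b64decode(screenshot_b64)
    with open(path, "wb") as f:
        f.write(image_bytes)
    return len(image_bytes)


# Stand-in for the string returned by client.browser_screenshot(session_id):
# the 8-byte PNG signature, base64-encoded.
png_signature = b"\x89PNG\r\n\x1a\n"
fake_screenshot = base64.b64encode(png_signature).decode("ascii")

out_path = os.path.join(tempfile.mkdtemp(), "screenshot.png")
bytes_written = save_screenshot(fake_screenshot, out_path)
```

Note that `len(screenshot)` on the base64 string counts characters, which is roughly a third larger than the decoded byte count.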
Browser Manager (High-Level Interface)
from instavm import InstaVM
client = InstaVM(api_key='your_api_key')
# Create managed browser session
browser_session = client.browser.create_session(1366, 768)
print(f"Managed session: {browser_session.session_id}")
# Use session object for operations
browser_session.navigate("https://example.com")
browser_session.click("button#submit")
browser_session.fill("input[name='email']", "test@example.com")
browser_session.type("textarea", "Hello world!")
# Take screenshot
screenshot = browser_session.screenshot()
print(f"Screenshot: {len(screenshot)} chars")
# Extract elements
titles = browser_session.extract_elements("h1", attributes=["text"])
print(f"H1 elements: {titles}")
# Close session when done
browser_session.close()
Convenience Methods (Auto-Session)
from instavm import InstaVM
client = InstaVM(api_key='your_api_key')
# These methods auto-create a browser session if needed
client.browser.navigate("https://example.com")
screenshot = client.browser.screenshot()
elements = client.browser.extract_elements("title")
print(f"Auto-session screenshot: {len(screenshot)} chars")
print(f"Elements found: {elements}")
Available Browser Methods
Session Management:
- create_browser_session(width, height, user_agent) - Create new browser session
- get_browser_session(session_id) - Get session information
- list_browser_sessions() - List active sessions
- close_browser_session(session_id) - Close session (optional - sessions auto-expire)
Navigation & Interaction:
- browser_navigate(url, session_id, timeout) - Navigate to URL
- browser_click(selector, session_id, force, timeout) - Click element
- browser_type(selector, text, session_id, delay, timeout) - Type text
- browser_fill(selector, value, session_id, timeout) - Fill form field
- browser_scroll(session_id, selector, x, y) - Scroll page or element
- browser_wait(condition, session_id, selector, timeout) - Wait for condition
Data Extraction:
- browser_screenshot(session_id, full_page, clip, format) - Take screenshot
- browser_extract_elements(session_id, selector, attributes) - Extract DOM elements
- browser_extract_content(session_id, include_interactive, include_anchors, max_anchors) - NEW: Extract LLM-friendly content
Browser Error Handling
from instavm import (
    InstaVM, BrowserSessionError, BrowserInteractionError,
    ElementNotFoundError, BrowserTimeoutError, QuotaExceededError
)
client = InstaVM(api_key='your_api_key')
try:
    session_id = client.create_browser_session(1920, 1080)
    client.browser_navigate("https://example.com", session_id)
    client.browser_click("button#nonexistent", session_id)
except BrowserSessionError:
    print("Browser session error - may be down or quota exceeded")
except ElementNotFoundError as e:
    print(f"Element not found: {e}")
except BrowserTimeoutError:
    print("Browser operation timed out")
except BrowserInteractionError as e:
    print(f"Browser interaction failed: {e}")
Complete Automation Example
from instavm import InstaVM
import base64
def web_automation_example():
    client = InstaVM(api_key='your_api_key')
    # 1. Execute setup code
    setup = client.execute("""
import json
data = {"timestamp": "2024-01-01", "status": "starting"}
print(json.dumps(data))
""", language="python")
    print("Setup result:", setup)
    # 2. Browser automation
    session_id = client.create_browser_session(1920, 1080)
    # Navigate and interact
    client.browser_navigate("https://httpbin.org/forms/post", session_id)
    client.browser_fill("input[name='custname']", "Test User", session_id)
    client.browser_fill("input[name='custemail']", "test@example.com", session_id)
    # Take screenshot before submission
    screenshot = client.browser_screenshot(session_id)
    # Save screenshot
    with open("automation_screenshot.png", "wb") as f:
        f.write(base64.b64decode(screenshot))
    # Get page info
    elements = client.browser_extract_elements(session_id, "input", attributes=["name", "value"])
    # 3. Process results
    analysis = client.execute(f"""
elements_count = {len(elements)}
screenshot_size = {len(screenshot)}
print(f"Found {{elements_count}} form elements")
print(f"Screenshot size: {{screenshot_size}} characters")
print("Automation completed successfully")
""", language="python")
    return {
        "setup": setup,
        "elements": elements,
        "analysis": analysis,
        "screenshot_saved": True
    }
# Run automation
result = web_automation_example()
print("Final result:", result)
Egress Policy Management
Control outbound network access for sessions and VMs with fine-grained egress policies.
from instavm import InstaVM
client = InstaVM(api_key='your_api_key')
# Set session egress policy (whitelist-only model)
client.set_session_egress(
    allow_package_managers=True,  # allow pypi, npm, apt, OS repos
    allow_http=True,  # allow outbound HTTP to whitelisted domains
    allow_https=True,  # allow outbound HTTPS to whitelisted domains
    allowed_domains=["api.example.com", "pypi.org"],
    allowed_cidrs=["10.0.0.0/8"]
)
# Get current egress policy
policy = client.get_session_egress()
print(policy)
# Set VM-level egress policy
client.set_vm_egress(
    "vm-id-here",
    allow_package_managers=False,
    allowed_domains=["internal-api.example.com"]
)
# Get VM egress policy
vm_policy = client.get_vm_egress("vm-id-here")
print(vm_policy)
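To make the whitelist-only model concrete, the sketch below shows one way a destination could be checked against `allowed_domains` and `allowed_cidrs`. It is purely illustrative: enforcement happens on the InstaVM side and its exact matching rules are not described here. `is_destination_allowed` is a hypothetical helper, and exact-match hostname comparison is an assumption.

```python
import ipaddress


def is_destination_allowed(host, allowed_domains=(), allowed_cidrs=()):
    """Return True if host matches the allowlist (illustrative logic only).

    IP literals are checked against allowed_cidrs; hostnames must equal an
    allowed domain exactly. Whether the real service also matches
    subdomains is not specified in this document.
    """
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as a hostname.
        return host.lower() in {d.lower() for d in allowed_domains}
    return any(
        address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs
    )


# Same values as the session policy example above.
policy = {
    "allowed_domains": ["api.example.com", "pypi.org"],
    "allowed_cidrs": ["10.0.0.0/8"],
}
pypi_ok = is_destination_allowed("pypi.org", **policy)
private_ip_ok = is_destination_allowed("10.1.2.3", **policy)
other_ip_ok = is_destination_allowed("192.168.1.1", **policy)
```

Anything that matches neither list is rejected, which is the defining property of an allowlist.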
SSH Key Management
Manage SSH keys for secure VM access.
from instavm import InstaVM
client = InstaVM(api_key='your_api_key')
# Add an SSH public key
key = client.add_ssh_key("ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAA... user@host")
print(f"Added key: {key['fingerprint']}")
# List all SSH keys
keys = client.list_ssh_keys()
for k in keys:
    print(f" {k['id']}: {k['fingerprint']} ({k['key_type']})")
# Delete an SSH key
client.delete_ssh_key(key['id'])
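For reference, the standard OpenSSH SHA256 fingerprint (the form printed by `ssh-keygen -lf`) can be computed locally from a public key line, which may help when matching a listed key to a local one. This is a sketch under the assumption that the `fingerprint` field uses that format, which this document does not confirm. `openssh_sha256_fingerprint` is a hypothetical helper and the key blob below is synthetic.

```python
import base64
import hashlib


def openssh_sha256_fingerprint(public_key_line):
    """Compute the SHA256 fingerprint of an OpenSSH public key line.

    The line has the form "<key-type> <base64-blob> [comment]".
    """
    parts = public_key_line.split()
    if len(parts) < 2:
        raise ValueError("expected '<key-type> <base64-blob> [comment]'")
    blob = base64.b64decode(parts[1])
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the base64 digest without trailing '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


# Synthetic blob for demonstration only; it is not a real key.
demo_blob = base64.b64encode(b"not-a-real-key-blob").decode("ascii")
fingerprint = openssh_sha256_fingerprint(f"ssh-ed25519 {demo_blob} user@host")
```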
LLM-Friendly Content Extraction
InstaVM provides content extraction designed for LLM-powered browser automation. It addresses the two-step problem of discovering content and then interacting with page elements by returning clean, structured text together with the selectors needed to act on it.
The Problem
LLMs face two key challenges when automating browsers:
- Context Limits: Full DOM with JS/CSS/ads overwhelms LLM context windows
- Element Location: After reading content, LLMs need exact selectors to interact
The Solution: Three-Part Content Structure
The extract_content() method returns three complementary components:
from instavm import InstaVM
client = InstaVM(api_key='your_api_key')
session = client.browser.create_session()
# Navigate and extract
session.navigate("https://example.com/article")
content = session.extract_content()
# 1. readable_content: Clean article text (no JS/CSS/ads)
article_text = content['readable_content']['content']
title = content['readable_content']['title']
word_count = content['readable_content']['word_count']
# 2. interactive_elements: All clickable/typeable elements with selectors
for element in content['interactive_elements']:
    print(f"{element['interactive_type']}: {element['text']} → {element['selector']}")
# 3. content_anchors: Text snippets mapped to DOM selectors
for anchor in content['content_anchors']:
    print(f"{anchor['text']} → {anchor['selector']}")
LLM Workflow Pattern
This enables a powerful multi-step workflow:
1. Navigate to page
2. Extract content (readable + interactive + anchors)
3. LLM reads readable_content to understand the page
4. LLM identifies target: "I need to click 'Sign Up'"
5. LLM searches content_anchors for 'sign up' text
6. LLM finds selector: 'button.signup-btn'
7. LLM clicks using discovered selector
8. Wait for new page load
9. Extract content again from new page
10. Repeat the cycle...
Complete Example: LLM-Powered Research Agent
from instavm import InstaVM
def llm_browser_workflow():
    """
    Example: Find the latest Python release version from python.org
    """
    client = InstaVM(api_key='your_api_key')
    session = client.browser.create_session()
    # Step 1: Navigate to target site
    session.navigate("https://www.python.org")
    session.wait_for("visible", "body")
    # Step 2: Extract LLM-friendly content
    content = session.extract_content(
        include_interactive=True,
        include_anchors=True,
        max_anchors=30
    )
    # Step 3: LLM analyzes clean content (no noise)
    article = content['readable_content']['content']
    # LLM prompt: "Given this page content, find where to click for downloads"
    # LLM response: "Look for 'Downloads' link"
    # Step 4: LLM finds selector using content_anchors
    target_selector = None
    for anchor in content['content_anchors']:
        if 'download' in anchor['text'].lower():
            target_selector = anchor['selector']
            break
    # If not in anchors, search interactive elements
    if not target_selector:
        for elem in content['interactive_elements']:
            if 'download' in elem['text'].lower():
                target_selector = elem['selector']
                break
    # Step 5: Click using discovered selector
    if target_selector:
        session.click(target_selector)
        session.wait_for("visible", "h1")
        # Step 6: Extract content from new page
        new_content = session.extract_content()
        new_article = new_content['readable_content']['content']
        # Step 7: LLM extracts answer from clean text
        # LLM prompt: "Extract the latest Python version from this text"
        # LLM reads: new_article (clean, no HTML noise)
        # LLM responds: "Python 3.12.0"
    session.close()
    return "Task completed"
# Usage
result = llm_browser_workflow()
Content Extraction Methods
Low-level (requires session_id):
client = InstaVM(api_key='your_api_key')
session_id = client.create_browser_session()
content = client.browser_extract_content(
    session_id,
    include_interactive=True,
    include_anchors=True,
    max_anchors=50
)
High-level (BrowserSession):
session = client.browser.create_session()
content = session.extract_content(
    include_interactive=True,
    include_anchors=True,
    max_anchors=50
)
Auto-session (BrowserManager):
# Creates session automatically if none exists
content = client.browser.extract_content(
    include_interactive=True,
    include_anchors=True
)
Response Structure
{
    "success": True,
    "readable_content": {
        "title": "Article Title",
        "byline": "Author Name",
        "content": "Clean article text without JS/CSS/ads...",
        "word_count": 1250,
        "length": 6543
    },
    "interactive_elements": [
        {
            "text": "Sign Up",
            "selector": "button.signup-btn",
            "interactive_type": "button",
            "position": {"x": 150, "y": 200, "width": 100, "height": 40}
        },
        {
            "text": "Learn More",
            "selector": "a#learn-more-link",
            "interactive_type": "link",
            "attributes": {"href": "/learn"}
        }
    ],
    "content_anchors": [
        {
            "text": "Click here to sign up for our newsletter",
            "selector": "button.signup-btn",
            "length": 45
        }
    ],
    "extraction_time": 0.85,
    "url": "https://example.com/article",
    "title": "Page Title"
}
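Given a response shaped like the structure above, one way to keep prompts within a context budget is to truncate the readable text and list the interactive elements compactly. The sketch below is an assumption-laden illustration: `build_page_summary` and its `max_chars` parameter are hypothetical and not part of InstaVM, and the sample dict is a trimmed copy of the documented structure.

```python
def build_page_summary(content, max_chars=500):
    """Format an extract_content()-style dict as compact text for an LLM prompt.

    Hypothetical helper: truncates the readable text to max_chars and lists
    each interactive element as "type: text -> selector".
    """
    readable = content.get("readable_content", {})
    text = readable.get("content", "")
    if len(text) > max_chars:
        text = text[:max_chars] + "..."
    lines = [f"Title: {readable.get('title', '')}", text, "Interactive elements:"]
    for element in content.get("interactive_elements", []):
        lines.append(
            f"- {element['interactive_type']}: {element['text']} -> {element['selector']}"
        )
    return "\n".join(lines)


sample = {
    "readable_content": {"title": "Article Title", "content": "Clean article text"},
    "interactive_elements": [
        {"text": "Sign Up", "selector": "button.signup-btn", "interactive_type": "button"},
    ],
}
summary = build_page_summary(sample, max_chars=10)
```

The resulting string can be placed in a prompt so the model sees both the page text and the selectors it may act on.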
Key Benefits
- Context Efficient: LLMs receive clean text, not bloated HTML
- Precise Interaction: Text-to-selector mapping eliminates guesswork
- Stateful Workflows: Multi-step automation across page loads
- Noise Filtering: Readability.js removes ads, navigation, footers
- Smart Element Detection: Automatically identifies all interactive elements
LLM Framework Integrations
InstaVM now includes built-in integrations with popular LLM frameworks, eliminating boilerplate code for AI-powered automation.
OpenAI Integration
from instavm import InstaVM
from instavm.integrations.openai import get_tools, execute_tool
from openai import OpenAI
client = InstaVM(api_key='your_api_key')
openai_client = OpenAI(api_key='your_openai_key')
# Get pre-built OpenAI function definitions
tools = get_tools()
# Let the LLM decide what to do
response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Navigate to example.com and take a screenshot"}],
    tools=tools,
    tool_choice="auto"
)
# Execute the LLM's tool calls (tool_calls is None when the model answers in plain text)
browser_session = None
for tool_call in response.choices[0].message.tool_calls or []:
    result = execute_tool(client, tool_call, browser_session)
    if result.get("session"):
        browser_session = result["session"]
    print(f"Tool result: {result}")
Azure OpenAI Integration
from instavm import InstaVM
from instavm.integrations.azure_openai import get_azure_tools, execute_azure_tool
from openai import AzureOpenAI
client = InstaVM(api_key='your_api_key')
azure_client = AzureOpenAI(
    api_key="your_azure_key",
    api_version="2024-02-01",
    azure_endpoint="https://your-resource.openai.azure.com/"
)
tools = get_azure_tools()
browser_session = None
response = azure_client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Find the current weather in New York"}],
    tools=tools
)
# tool_calls is None when the model answers in plain text
for tool_call in response.choices[0].message.tool_calls or []:
    result = execute_azure_tool(client, tool_call, browser_session)
    if result.get("session"):
        browser_session = result["session"]
Ollama Integration
from instavm import InstaVM
from instavm.integrations.ollama import get_ollama_tools, execute_ollama_tool
import requests
client = InstaVM(api_key='your_api_key')
# Get tool definitions for Ollama
tools = get_ollama_tools()
# Make request to local Ollama instance
response = requests.post('http://localhost:11434/api/chat', json={
    'model': 'llama3',
    'messages': [{'role': 'user', 'content': 'Navigate to github.com and extract the page title'}],
    'tools': tools,
    'stream': False
})
# Execute tool calls from Ollama response
browser_session = None
if response.json().get('message', {}).get('tool_calls'):
    for tool_call in response.json()['message']['tool_calls']:
        result = execute_ollama_tool(client, tool_call, browser_session)
        if result.get("session"):
            browser_session = result["session"]
LangChain Integration
from instavm import InstaVM
from instavm.integrations.langchain import InstaVMTool
from langchain.agents import initialize_agent, AgentType
from langchain.llms import OpenAI
# Create InstaVM client and LangChain tool
client = InstaVM(api_key='your_api_key')
instavm_tool = InstaVMTool(client)
# Initialize LangChain agent
llm = OpenAI(api_key='your_openai_key')
tools = [instavm_tool]
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)
# Let the agent use InstaVM for web automation
result = agent.run("Go to example.com and tell me what you see on the page")
print(result)
LlamaIndex Integration
from instavm import InstaVM
from instavm.integrations.llamaindex import get_llamaindex_tools
from llama_index.llms.openai import OpenAI
from llama_index.core.agent import ReActAgent
# Create InstaVM client
client = InstaVM(api_key='your_api_key')
# Get InstaVM function tools for LlamaIndex
tools = get_llamaindex_tools(client)
# Create agent with InstaVM capabilities
llm = OpenAI(model="gpt-4", api_key='your_openai_key')
agent = ReActAgent(
    tools=tools,
    llm=llm,
    verbose=True
)
# Use the agent for web tasks
response = agent.chat("Navigate to news.ycombinator.com and summarize the top 3 posts")
print(response)
Complete LLM Intelligence Example
from instavm import InstaVM
from instavm.integrations.openai import get_tools, execute_tool
from openai import OpenAI
import json
class WebIntelligenceAgent:
    def __init__(self, instavm_key, openai_key):
        self.instavm = InstaVM(api_key=instavm_key)
        self.openai = OpenAI(api_key=openai_key)
        self.tools = get_tools()
        self.browser_session = None
    def run_task(self, task_description):
        messages = [
            {"role": "system", "content": "You are a web intelligence agent. Use browser automation and code execution to complete tasks."},
            {"role": "user", "content": task_description}
        ]
        for turn in range(5):  # Max 5 turns
            response = self.openai.chat.completions.create(
                model="gpt-4",
                messages=messages,
                tools=self.tools,
                tool_choice="auto"
            )
            message = response.choices[0].message
            messages.append({
                "role": "assistant",
                "content": message.content,
                "tool_calls": message.tool_calls
            })
            if message.tool_calls:
                for tool_call in message.tool_calls:
                    result = execute_tool(self.instavm, tool_call, self.browser_session)
                    if result.get("session"):
                        self.browser_session = result["session"]
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call.id,
                        "name": tool_call.function.name,
                        "content": json.dumps(result)
                    })
            else:
                # Task complete
                return message.content
        return "Task completed with maximum turns reached"
# Usage
agent = WebIntelligenceAgent('your_instavm_key', 'your_openai_key')
result = agent.run_task("Find the current Bitcoin price and create a Python chart showing the trend")
print(result)
File details
Details for the file instavm-0.9.0.tar.gz.
File metadata
- Download URL: instavm-0.9.0.tar.gz
- Upload date:
- Size: 35.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bb9a6bf69e87ffad4abe0659700526c0d06cba0d7ae469b3985efb09822f83fb |
| MD5 | 7d63be51c4e8c6cb6c1a2790d6f5d146 |
| BLAKE2b-256 | 0663fc725eac5d944256c3a7fa72940e98cd6a7b3f3ad4027103c2b11ce9b92f |
File details
Details for the file instavm-0.9.0-py3-none-any.whl.
File metadata
- Download URL: instavm-0.9.0-py3-none-any.whl
- Upload date:
- Size: 32.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1187d40e54354b8904a8fc06f5dc02a6ef565df8550209e40a5205029640dcb6 |
| MD5 | 4edd771313e12561a0f685260fc5d66a |
| BLAKE2b-256 | 301ebe6765d84bb458b78abae9662e8178e470597576641830acf604c7b7a7a0 |