AI-powered web browser automation
Allyson Python SDK
AI-powered web browser automation.
Installation
pip install allyson
After installation, you'll need to install the Playwright browsers:
python -m playwright install
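To confirm that both packages are importable before running any automation, a quick check along these lines may help. It uses only the standard library; the helper name `missing_packages` is made up for this sketch and is not part of Allyson. It does not verify that the Playwright browsers themselves were downloaded.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["allyson", "playwright"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("allyson and playwright are importable")
```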
Features
- Simple, intuitive API for browser automation
- AI-powered element selection and interaction
- Support for multiple browsers (Chromium, Firefox, WebKit)
- Asynchronous and synchronous interfaces
- Robust error handling and recovery
- DOM extraction and analysis for AI integration
- Screenshot annotation with element bounding boxes
- Agent loop for automating tasks with natural language
Quick Start
import asyncio

from allyson import Browser, Agent, AgentLoop, Tool, ToolType

async def automate_task():
    # Create a browser instance
    async with Browser(
        headless=False,
        # Optional: Use your own Chrome installation instead of the default Chromium
        executable_path="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
    ) as browser:
        # Create an agent instance with your OpenAI API key
        agent = Agent(api_key="your-api-key")

        # Create a custom tool
        weather_tool = Tool(
            name="get_weather",
            description="Get the current weather for a location",
            type=ToolType.CUSTOM,
            parameters_schema={
                "location": {"type": "string", "description": "Location to get weather for"}
            },
            function=lambda location: {"temperature": 72, "condition": "Sunny"}
        )

        # Create an agent loop
        agent_loop = AgentLoop(
            browser=browser,
            agent=agent,
            tools=[weather_tool],  # Optional custom tools
            max_steps=15,
            screenshot_dir="screenshots",
            plan_dir="plans",  # Directory to save task plans
            verbose=True
        )

        # Run the agent loop with a natural language task
        task = "Go to Google, search for 'Python programming language', and find information about it"
        memory = await agent_loop.run(task)

        # The memory contains the full conversation and actions taken
        print("Task completed!")

        # Print the final plan with completed steps
        if agent_loop.state.plan_path:
            with open(agent_loop.state.plan_path, "r") as f:
                print(f.read())

# Run the async function
asyncio.run(automate_task())
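The lambda in the custom tool above returns fixed values. For anything beyond a placeholder, a named function is easier to test on its own. The sketch below assumes, based on the Quick Start, that `Tool` calls `function` with the schema keys as arguments and accepts a plain dictionary as the result; `get_weather`, `WEATHER_SCHEMA`, and the canned data are hypothetical.

```python
# Hypothetical stand-in data; a real tool would call a weather service here.
_CANNED_WEATHER = {
    "san francisco": {"temperature": 61, "condition": "Foggy"},
}

# Same shape as the parameters_schema shown in the Quick Start.
WEATHER_SCHEMA = {
    "location": {"type": "string", "description": "Location to get weather for"}
}


def get_weather(location: str) -> dict:
    """Return weather for `location`, or an error entry if it is unknown."""
    key = location.strip().lower()
    if key in _CANNED_WEATHER:
        return dict(_CANNED_WEATHER[key])
    return {"error": f"No weather data for {location!r}"}


# With allyson installed, this could be wired in as in the Quick Start:
# weather_tool = Tool(
#     name="get_weather",
#     description="Get the current weather for a location",
#     type=ToolType.CUSTOM,
#     parameters_schema=WEATHER_SCHEMA,
#     function=get_weather,
# )
```

Keeping the function separate from the `Tool` definition means it can be unit-tested without launching a browser or calling the model.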
Using Your Own Chrome Installation
By default, Allyson uses the Playwright-managed Chromium browser. However, for better stability and compatibility, you can use your own Chrome installation:
# Windows
browser = Browser(executable_path="C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe")
# macOS
browser = Browser(executable_path="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome")
# Linux
browser = Browser(executable_path="/usr/bin/google-chrome")
This is especially useful for automation tasks that require specific browser versions or configurations.
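If a script has to run on more than one operating system, the path can be chosen at runtime. The following is a minimal sketch using only the standard library; `find_chrome` and `CHROME_PATHS` are hypothetical helpers, and the paths are just the three shown above, so Chrome installed elsewhere will not be detected.

```python
import os
import platform

# Candidate paths taken from the examples above; other install locations exist.
CHROME_PATHS = {
    "Windows": r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    "Darwin": "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
    "Linux": "/usr/bin/google-chrome",
}


def find_chrome(system=None):
    """Return the Chrome path for this OS if it exists on disk, else None."""
    system = system or platform.system()
    path = CHROME_PATHS.get(system)
    if path and os.path.isfile(path):
        return path
    return None


chrome_path = find_chrome()
# Only pass executable_path when Chrome was found, so that the
# Playwright-managed Chromium remains the fallback.
browser_kwargs = {"executable_path": chrome_path} if chrome_path else {}
print(browser_kwargs)
```

Assuming `executable_path` is optional, as the default-Chromium behavior suggests, the result could then be passed along as `Browser(**browser_kwargs)`.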
Agent Loop Features
The agent loop provides several powerful features for automating web tasks:
- Natural Language Instructions: Describe tasks in plain English, and the agent will figure out how to accomplish them.
- Task Planning: The agent automatically creates a step-by-step plan for completing the task and tracks progress by marking steps as completed.
- Built-in Tools:
  - goto: Navigate to a URL
  - click: Click on an element by its ID number
  - type: Type text into an element by its ID number
  - enter: Press the Enter key to submit forms
  - scroll: Scroll the page in any direction
  - done: Mark the task as complete
- Action Chaining: The agent can chain multiple actions together for efficiency:
  # The agent can chain actions like typing and pressing Enter
  {
    "actions": [
      {
        "tool": "type",
        "parameters": {
          "element_id": 2,
          "text": "search query"
        }
      },
      {
        "tool": "enter",
        "parameters": {}
      }
    ]
  }
- Custom Tools: Add your own tools to extend the agent's capabilities.
- Memory and Context: The agent maintains a memory of all actions and observations, providing context for decision-making.
- Error Handling: The agent can recover from errors and try alternative approaches.
- Screenshot Annotations: Automatically take screenshots with annotated elements for better visibility.
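To make the chained-action payload more concrete, the sketch below runs such a payload against a table of handler functions, in order. It illustrates the idea only and is not Allyson's internal dispatch code; `run_actions` and the stand-in handlers are hypothetical.

```python
import json

# A chained-action payload in the shape shown in the list above.
payload = json.loads("""
{
  "actions": [
    {"tool": "type", "parameters": {"element_id": 2, "text": "search query"}},
    {"tool": "enter", "parameters": {}}
  ]
}
""")


def run_actions(payload, handlers):
    """Call the handler for each action in order and collect the results."""
    results = []
    for action in payload["actions"]:
        tool = action["tool"]
        if tool not in handlers:
            raise KeyError(f"Unknown tool: {tool}")
        results.append(handlers[tool](**action.get("parameters", {})))
    return results


# Stand-in handlers that only describe what they would do.
handlers = {
    "type": lambda element_id, text: f"typed {text!r} into element {element_id}",
    "enter": lambda: "pressed Enter",
}

for line in run_actions(payload, handlers):
    print(line)
```

Processing the list strictly in order is what makes chaining safe here: the Enter key is only pressed after the text has been typed.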
Example Plan
The agent creates a Markdown plan like this for each task:
# Plan for: Search for information about Python programming language
## Steps:
- [x] Navigate to a search engine
- [x] Search for "Python programming language"
- [ ] Review search results
  - [ ] Identify official Python website
  - [ ] Identify Wikipedia page
- [ ] Visit the most relevant page
- [ ] Extract key information
  - [ ] What is Python
  - [ ] Key features
  - [ ] Current version
- [ ] Summarize findings
As the agent completes steps, it automatically updates the plan by checking off each finished step.
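Because the plan is ordinary Markdown with task-list checkboxes, progress can be measured by counting checked and unchecked items. The following is a minimal sketch, assuming the plan keeps the `- [x]` / `- [ ]` format shown above; `plan_progress` is a hypothetical helper, not part of Allyson, and the example plan is abbreviated.

```python
import re

# Matches Markdown task-list items such as "- [x] Done" or "  - [ ] Todo".
CHECKBOX = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def plan_progress(markdown):
    """Return (completed, total) counts of checkbox items in a plan."""
    done = total = 0
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match:
            total += 1
            if match.group(1).lower() == "x":
                done += 1
    return done, total


example = """\
# Plan for: Search for information about Python programming language
## Steps:
- [x] Navigate to a search engine
- [x] Search for "Python programming language"
- [ ] Review search results
- [ ] Summarize findings
"""

done, total = plan_progress(example)
print(f"{done}/{total} steps completed")
```

In a real run, the text would come from the file at `agent_loop.state.plan_path`, as read in the Quick Start.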
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Changelog
- 0.1.6 - Added support for custom Chrome browser path
- 0.1.5 - Added planner feature for creating and tracking task progress
- 0.1.4 - Enhanced agent loop with action chaining, Enter key tool, and improved error handling
- 0.1.3 - Added DOM extraction and screenshot annotation features
- 0.1.2 - Updated Description
- 0.1.1 - Test release for GitHub Actions automated publishing
- 0.1.0 - Initial release