Allyson Python SDK

AI-powered web browser automation.

Installation

pip install allyson

After installation, you'll need to install the Playwright browsers:

python -m playwright install
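To confirm both packages are importable before running the examples, a small check like the one below can help. It is only a sketch: it uses the standard library's `importlib.util.find_spec`, the helper name `missing_packages` is made up for illustration, and it checks Python packages only, not whether the Playwright browser binaries have been downloaded.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "allyson" and "playwright" are the import names used elsewhere in this README.
required = ["allyson", "playwright"]
not_found = missing_packages(required)
if not_found:
    print("Missing packages:", ", ".join(not_found))
else:
    print("All required packages are importable.")
```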

Features

  • Simple, intuitive API for browser automation
  • AI-powered element selection and interaction
  • Support for multiple browsers (Chromium, Firefox, WebKit)
  • Asynchronous and synchronous interfaces
  • Robust error handling and recovery
  • DOM extraction and analysis for AI integration
  • Screenshot annotation with element bounding boxes
  • Agent loop for automating tasks with natural language

Quick Start

from allyson import Browser, Agent, AgentLoop, Tool, ToolType

async def automate_task():
    # Create a browser instance
    async with Browser(headless=False) as browser:
        # Create an agent instance with your OpenAI API key
        agent = Agent(api_key="your-api-key")
        
        # Create a custom tool
        weather_tool = Tool(
            name="get_weather",
            description="Get the current weather for a location",
            type=ToolType.CUSTOM,
            parameters_schema={
                "location": {"type": "string", "description": "Location to get weather for"}
            },
            function=lambda location: {"temperature": 72, "condition": "Sunny"}
        )
        
        # Create an agent loop
        agent_loop = AgentLoop(
            browser=browser,
            agent=agent,
            tools=[weather_tool],  # Optional custom tools
            max_steps=15,
            screenshot_dir="screenshots",
            plan_dir="plans",      # Directory to save task plans
            verbose=True
        )
        
        # Run the agent loop with a natural language task
        task = "Go to Google, search for 'Python programming language', and find information about it"
        memory = await agent_loop.run(task)
        
        # The memory contains the full conversation and actions taken
        print("Task completed!")
        
        # Print the final plan with completed steps
        if agent_loop.state.plan_path:
            with open(agent_loop.state.plan_path, "r") as f:
                print(f.read())

# Run the async function
import asyncio
asyncio.run(automate_task())
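The lambda passed as `function` in the Quick Start can be swapped for a named function, which is easier to test in isolation. The sketch below assumes the tool callable receives the schema's parameters as arguments and returns a plain dictionary, as the lambda suggests; the README does not confirm the exact calling convention. The lookup table is hypothetical stand-in data, not a real weather source.

```python
# Hypothetical stand-in data; a real tool would query a weather service.
_FAKE_WEATHER = {
    "san francisco": {"temperature": 61, "condition": "Foggy"},
    "phoenix": {"temperature": 102, "condition": "Sunny"},
}


def get_weather(location: str) -> dict:
    """Return canned weather for a location, or an error entry if unknown."""
    key = location.strip().lower()
    if key in _FAKE_WEATHER:
        # Echo the requested location alongside the canned reading.
        return {"location": location, **_FAKE_WEATHER[key]}
    return {"location": location, "error": "unknown location"}
```

Under that assumption, it would be passed as `function=get_weather` in the `Tool(...)` call shown in the Quick Start.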

Agent Loop Features

The agent loop provides the following features for automating web tasks:

  1. Natural Language Instructions: Describe tasks in plain English, and the agent will figure out how to accomplish them.

  2. Task Planning: The agent automatically creates a step-by-step plan for completing the task and tracks progress by marking steps as completed.

  3. Built-in Tools:

    • goto: Navigate to a URL
    • click: Click on an element by its ID number
    • type: Type text into an element by its ID number
    • enter: Press the Enter key to submit forms
    • scroll: Scroll the page in any direction
    • done: Mark the task as complete

  4. Action Chaining: The agent can chain multiple actions together for efficiency:

# The agent can chain actions like typing and pressing Enter
{
  "actions": [
    {
      "tool": "type",
      "parameters": {
        "element_id": 2,
        "text": "search query"
      }
    },
    {
      "tool": "enter",
      "parameters": {}
    }
  ]
}
  5. Custom Tools: Add your own tools to extend the agent's capabilities.

  6. Memory and Context: The agent maintains a memory of all actions and observations, providing context for decision-making.

  7. Error Handling: The agent can recover from errors and try alternative approaches.

  8. Screenshot Annotations: Automatically take screenshots with annotated elements for better visibility.
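To make the chained-action format above concrete, here is a minimal sketch that checks such a payload before it would be executed. The tool names come from the built-in tools list; the `validate_actions` helper and its rules are illustrative assumptions, not part of the Allyson API.

```python
from typing import Any

# Tool names as listed under "Built-in Tools"; the checks below are illustrative.
BUILT_IN_TOOLS = {"goto", "click", "type", "enter", "scroll", "done"}


def validate_actions(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Check a chained-action payload and return its actions in order."""
    actions = payload.get("actions")
    if not isinstance(actions, list) or not actions:
        raise ValueError("payload must contain a non-empty 'actions' list")
    for index, action in enumerate(actions):
        if not isinstance(action, dict):
            raise ValueError(f"action {index}: must be an object")
        tool = action.get("tool")
        if tool not in BUILT_IN_TOOLS:
            raise ValueError(f"action {index}: unknown tool {tool!r}")
        if not isinstance(action.get("parameters"), dict):
            raise ValueError(f"action {index}: 'parameters' must be an object")
    return actions


# The same type-then-enter chain shown in the JSON example.
payload = {
    "actions": [
        {"tool": "type", "parameters": {"element_id": 2, "text": "search query"}},
        {"tool": "enter", "parameters": {}},
    ]
}
tool_sequence = [action["tool"] for action in validate_actions(payload)]
```

Custom tools would need their names added to the allowed set; how Allyson itself validates actions is not documented here.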

Example Plan

The agent creates a Markdown plan like this for each task:

# Plan for: Search for information about Python programming language

## Steps:
- [x] Navigate to a search engine
- [x] Search for "Python programming language"
- [ ] Review search results
  - [ ] Identify official Python website
  - [ ] Identify Wikipedia page
- [ ] Visit the most relevant page
- [ ] Extract key information
  - [ ] What is Python
  - [ ] Key features
  - [ ] Current version
- [ ] Summarize findings

As the agent completes steps, it updates the plan by checking off the corresponding boxes.
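Because the plan is plain Markdown, progress can be read back with a few lines of code. The sketch below counts checked and unchecked items in a plan string; `plan_progress` is a hypothetical helper written for illustration, and it assumes the `- [x]` / `- [ ]` checkbox syntax shown in the example plan.

```python
import re

# Matches Markdown task-list items such as "- [x] Step" or "  - [ ] Sub-step".
CHECKBOX = re.compile(r"^\s*- \[( |x|X)\] .+$")


def plan_progress(plan_text: str) -> tuple[int, int]:
    """Return (completed, total) counts for checkbox items in a plan."""
    completed = total = 0
    for line in plan_text.splitlines():
        match = CHECKBOX.match(line)
        if match:
            total += 1
            if match.group(1).lower() == "x":
                completed += 1
    return completed, total


# An excerpt of the example plan, used as sample input.
example_plan = """\
# Plan for: Search for information about Python programming language

## Steps:
- [x] Navigate to a search engine
- [x] Search for "Python programming language"
- [ ] Review search results
  - [ ] Identify official Python website
"""

done, total = plan_progress(example_plan)
print(f"{done}/{total} steps completed")
```

In the Quick Start, the text read from `agent_loop.state.plan_path` could be passed to this function in place of the sample string.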

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License; see the LICENSE file for details.

Changelog

  • 0.1.5 - Added planner feature for creating and tracking task progress
  • 0.1.4 - Enhanced agent loop with action chaining, Enter key tool, and improved error handling
  • 0.1.3 - Added DOM extraction and screenshot annotation features
  • 0.1.2 - Updated package description
  • 0.1.1 - Test release for GitHub Actions automated publishing
  • 0.1.0 - Initial release
