AI-powered web browser automation

Development Status
- 3 - Alpha
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Project description

Allyson Python SDK

AI-powered web browser automation.

Installation

```bash
pip install allyson
```

After installation, you'll need to install the Playwright browsers:

```bash
python -m playwright install
```
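
To confirm that both install steps left importable packages behind, a small check like the following can help. It is a minimal sketch: the import names `allyson` and `playwright` are assumed from the commands above, `missing_packages` is a hypothetical helper, and the check does not verify that the browser binaries themselves were downloaded.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level module names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "allyson" and "playwright" are the import names assumed from the
    # install commands above.
    missing = missing_packages(["allyson", "playwright"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("Both packages are importable.")
```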

Features

- Simple, intuitive API for browser automation
- AI-powered element selection and interaction
- Support for multiple browsers (Chromium, Firefox, WebKit)
- Asynchronous and synchronous interfaces
- Robust error handling and recovery
- DOM extraction and analysis for AI integration
- Screenshot annotation with element bounding boxes
- Agent loop for automating tasks with natural language

Quick Start

```python
from allyson import Browser

# Create a browser instance
browser = Browser()

# Navigate to a website
browser.goto("https://example.com")

# Interact with the page
browser.click("Sign in")
browser.fill("Email", "user@example.com")
browser.fill("Password", "password")
browser.click("Submit")

# Take a screenshot
browser.screenshot("login.png")

# Close the browser
browser.close()
```
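
In the synchronous example, `browser.close()` is skipped if any earlier step raises. One way to guard against that is the standard library's `contextlib.closing`, which calls `close()` on exit. The sketch below uses `FakeBrowser`, a hypothetical stand-in written for illustration so the snippet runs without launching a real browser; it is not part of the SDK, and the real `Browser` may offer its own cleanup mechanism.

```python
from contextlib import closing


class FakeBrowser:
    """Hypothetical stand-in for allyson.Browser so the sketch runs offline."""

    def __init__(self):
        self.closed = False
        self.visited = []

    def goto(self, url):
        self.visited.append(url)

    def click(self, label):
        # Simulate a step that fails, e.g. the element was not found.
        raise RuntimeError(f"could not find element: {label}")

    def close(self):
        self.closed = True


def run_login(browser):
    """Run the steps; closing() calls browser.close() even if a step raises."""
    with closing(browser):
        browser.goto("https://example.com")
        browser.click("Sign in")


browser = FakeBrowser()
try:
    run_login(browser)
except RuntimeError as exc:
    print(f"Step failed: {exc}")
print(f"Browser closed: {browser.closed}")
```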

Advanced Usage

```python
import asyncio

from allyson import Browser


async def run_automation():
    # Use async API with context manager
    async with Browser(headless=False) as browser:
        await browser.goto("https://example.com")

        # Wait for specific element
        await browser.wait_for_selector(".content")

        # Execute JavaScript
        result = await browser.evaluate("document.title")
        print(f"Page title: {result}")

        # Multiple tabs/pages
        new_page = await browser.new_page()
        await new_page.goto("https://another-example.com")


# Run the async function
asyncio.run(run_automation())
```
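
Awaited steps such as waiting for an element can hang if the page never reaches the expected state. A generic way to bound any awaitable is `asyncio.wait_for`. The sketch below is an assumption-laden illustration: `slow_step` is a stand-in coroutine and `with_timeout` is a hypothetical helper; whether the SDK's own methods accept a timeout argument is not stated here.

```python
import asyncio


async def slow_step(delay):
    """Stand-in for an awaitable browser call such as waiting for a selector."""
    await asyncio.sleep(delay)
    return "ready"


async def with_timeout(awaitable, seconds):
    """Return the awaitable's result, or None if it exceeds the time limit."""
    try:
        return await asyncio.wait_for(awaitable, timeout=seconds)
    except asyncio.TimeoutError:
        return None


async def main():
    fast = await with_timeout(slow_step(0.01), seconds=1.0)
    slow = await with_timeout(slow_step(1.0), seconds=0.05)
    return fast, slow


fast_result, slow_result = asyncio.run(main())
print(fast_result, slow_result)
```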

DOM Extraction and Screenshot Annotation

```python
import asyncio

from allyson import Browser, DOMExtractor


async def extract_and_annotate():
    async with Browser(headless=False) as browser:
        # Navigate to a website
        await browser.goto("https://example.com")

        # Create a DOM extractor
        dom_extractor = DOMExtractor(browser._page)

        # Extract interactive elements
        elements = await dom_extractor.extract_interactive_elements()
        print(f"Found {len(elements)} interactive elements")

        # Take a screenshot with annotations
        result = await dom_extractor.screenshot_with_annotations(
            path="screenshot.png",
            elements=elements,
            show_element_ids=True,
            box_color="red",
        )

        print(f"Clean screenshot: {result['clean']}")
        print(f"Annotated screenshot: {result['annotated']}")

        # Create an element map for AI analysis
        map_result = await dom_extractor.screenshot_with_element_map(
            path="element_map.png"
        )

        # The element map contains detailed information about each element
        for element in map_result["elementMap"]:
            print(f"Element #{element['id']}: {element['elementType']}")


# Run the async function
asyncio.run(extract_and_annotate())
```
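
Once an element map is available, it can be summarized with plain Python before being handed to a model. The sketch below works on sample data shaped like `map_result["elementMap"]`; only the `id` and `elementType` keys come from the example above, while the sample values and the helper names `count_by_type` and `ids_of_type` are made up for illustration.

```python
from collections import Counter

# Sample data shaped like map_result["elementMap"] in the example above.
# Only the "id" and "elementType" keys come from that example; the values
# here are made up for illustration.
element_map = [
    {"id": 1, "elementType": "link"},
    {"id": 2, "elementType": "input"},
    {"id": 3, "elementType": "button"},
    {"id": 4, "elementType": "link"},
]


def count_by_type(elements):
    """Count how many elements of each type appear in the map."""
    return Counter(element["elementType"] for element in elements)


def ids_of_type(elements, element_type):
    """Return the IDs of all elements with the given type."""
    return [e["id"] for e in elements if e["elementType"] == element_type]


counts = count_by_type(element_map)
link_ids = ids_of_type(element_map, "link")
print(dict(counts))
print(link_ids)
```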

Agent Loop for Task Automation

```python
import asyncio

from allyson import Browser, Agent, AgentLoop, Tool, ToolType


async def automate_task():
    # Create a browser instance
    async with Browser(headless=False) as browser:
        # Create an agent instance with your OpenAI API key
        agent = Agent(api_key="your-api-key")

        # Create a custom tool
        weather_tool = Tool(
            name="get_weather",
            description="Get the current weather for a location",
            type=ToolType.CUSTOM,
            parameters_schema={
                "location": {"type": "string", "description": "Location to get weather for"}
            },
            function=lambda location: {"temperature": 72, "condition": "Sunny"},
        )

        # Create an agent loop
        agent_loop = AgentLoop(
            browser=browser,
            agent=agent,
            tools=[weather_tool],  # Optional custom tools
            max_iterations=15,
            screenshot_dir="screenshots",
            verbose=True,
        )

        # Run the agent loop with a natural language task
        task = "Go to Google, search for 'Python programming language', and find information about it"
        memory = await agent_loop.run(task)

        # The memory contains the full conversation and actions taken
        print("Task completed!")


# Run the async function
asyncio.run(automate_task())
```
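
A custom tool's function receives whatever arguments the model produces, so it can be useful to check them against the tool's `parameters_schema` first. The following is a minimal sketch that assumes the schema is a flat mapping of parameter name to `{"type", "description"}`, as in the example above. `validate_arguments` and `TYPE_MAP` are hypothetical helpers, not part of the SDK, and the SDK may already perform its own validation.

```python
# Maps the JSON-schema-style type names used in the example to Python types.
TYPE_MAP = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def validate_arguments(schema, arguments):
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    for name, spec in schema.items():
        if name not in arguments:
            problems.append(f"missing parameter: {name}")
            continue
        expected = TYPE_MAP.get(spec.get("type"))
        if expected is not None and not isinstance(arguments[name], expected):
            problems.append(f"{name} should be of type {spec['type']}")
    for name in arguments:
        if name not in schema:
            problems.append(f"unexpected parameter: {name}")
    return problems


def get_weather(location):
    """Placeholder tool function returning fixed data, as in the example."""
    return {"temperature": 72, "condition": "Sunny"}


schema = {
    "location": {"type": "string", "description": "Location to get weather for"}
}

good = validate_arguments(schema, {"location": "Berlin"})
bad = validate_arguments(schema, {"city": 42})

# Only call the tool when validation found no problems.
result = get_weather(location="Berlin") if not good else None
print(good, bad, result)
```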

Agent Loop Features

The agent loop provides the following features for automating web tasks:

- Natural Language Instructions: Describe tasks in plain English, and the agent will figure out how to accomplish them.
- Built-in Tools:
  - goto: Navigate to a URL
  - click: Click on an element by its ID number
  - type: Type text into an element by its ID number
  - enter: Press the Enter key to submit forms
  - scroll: Scroll the page in any direction
  - done: Mark the task as complete
- Action Chaining: The agent can chain multiple actions together for efficiency:

For example, the agent can chain typing and pressing Enter:

```json
{
  "actions": [
    {
      "tool": "type",
      "parameters": {
        "element_id": 2,
        "text": "search query"
      }
    },
    {
      "tool": "enter",
      "parameters": {}
    }
  ]
}
```

- Custom Tools: Add your own tools to extend the agent's capabilities.
- Memory and Context: The agent maintains a memory of all actions and observations, providing context for decision-making.
- Error Handling: The agent can recover from errors and try alternative approaches.
- Screenshot Annotations: Automatically take screenshots with annotated elements for better visibility.
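
To make the action-chaining format concrete, the sketch below parses a payload of that shape and runs each action in order. It illustrates the idea only and is not the SDK's actual dispatcher: `run_actions` and the `handlers` mapping are hypothetical, and the handlers merely return a description of what they were asked to do.

```python
import json

# The payload format is copied from the action-chaining example above.
payload = """
{
  "actions": [
    {"tool": "type", "parameters": {"element_id": 2, "text": "search query"}},
    {"tool": "enter", "parameters": {}}
  ]
}
"""


def run_actions(payload_text, handlers):
    """Run each action in order and return the list of handler results."""
    results = []
    for action in json.loads(payload_text)["actions"]:
        name = action["tool"]
        if name not in handlers:
            raise ValueError(f"unknown tool: {name}")
        results.append(handlers[name](**action.get("parameters", {})))
    return results


# Hypothetical handlers that only describe what they were asked to do.
handlers = {
    "type": lambda element_id, text: f"typed {text!r} into element {element_id}",
    "enter": lambda: "pressed Enter",
}

log = run_actions(payload, handlers)
print(log)
```

Keeping the handlers in a plain dictionary means a custom tool can be added by registering one more entry, and an unknown tool name fails loudly instead of being silently skipped.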

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Automated Publishing

This package uses GitHub Actions for automated testing and publishing to PyPI. The workflow is configured to:

- Run tests on every push to the main branch and on pull requests
- Build the package on every push to the main branch
- Publish to PyPI automatically when:
  - A new tag is pushed with the format v* (e.g., v0.1.0, v1.0.0)
  - A new GitHub Release is created

To publish a new version:

1. Update the version number in setup.py
2. Commit and push your changes to the main branch
3. Create and push a new tag:
   ```bash
   git tag v0.1.1
   git push origin v0.1.1
   ```
4. The GitHub Action will automatically build and publish the package to PyPI

Note: You need to set up a PyPI API token as a GitHub secret named PYPI_API_TOKEN for the automated publishing to work.
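
A common mistake in this flow is pushing a tag that does not match the version in setup.py. A pre-push check along the following lines can catch that. It is a sketch under assumptions: it expects setup.py to contain a literal `version="..."` argument, and `version_from_setup` and `tag_matches_version` are hypothetical helpers rather than part of the project's workflow.

```python
import fnmatch
import re


def version_from_setup(setup_text):
    """Extract the version string from setup.py source text, or None."""
    match = re.search(r"version\s*=\s*['\"]([^'\"]+)['\"]", setup_text)
    return match.group(1) if match else None


def tag_matches_version(tag, setup_text):
    """True if the tag fits the v* pattern and equals 'v' + setup.py version."""
    if not fnmatch.fnmatchcase(tag, "v*"):
        return False
    return tag[1:] == version_from_setup(setup_text)


# Illustrative setup.py content; the real file will contain more arguments.
sample_setup = 'setup(name="allyson", version="0.1.1")'

print(tag_matches_version("v0.1.1", sample_setup))
print(tag_matches_version("v0.1.2", sample_setup))
print(tag_matches_version("0.1.1", sample_setup))
```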

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

0.1.4 - Enhanced agent loop with action chaining, Enter key tool, and improved error handling
0.1.3 - Added DOM extraction and screenshot annotation features
0.1.2 - Updated Description
0.1.1 - Test release for GitHub Actions automated publishing
0.1.0 - Initial release

Release history

- 0.1.6 - Mar 9, 2025
- 0.1.5 - Mar 9, 2025
- 0.1.4 - Mar 9, 2025 (this version)
- 0.1.3 - Mar 9, 2025
- 0.1.2 - Mar 9, 2025
- 0.1.1 - Mar 9, 2025

Download files

Source Distribution

allyson-0.1.4.tar.gz (39.2 kB)

Uploaded Mar 9, 2025 Source

Built Distribution

allyson-0.1.4-py3-none-any.whl (28.1 kB)

Uploaded Mar 9, 2025 Python 3

File details

Details for the file allyson-0.1.4.tar.gz.

File metadata

Download URL: allyson-0.1.4.tar.gz
Upload date: Mar 9, 2025
Size: 39.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for allyson-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`af5105205bcb8fe3062992fb01d8f2ac89211ee3830c536dae381cde37e8f9be`
MD5	`5708addac17d8f43269c79965be84b64`
BLAKE2b-256	`da262080a39468b87aa02b97ba8210af6c0f506785db6e00a24bf3752247e9e2`

File details

Details for the file allyson-0.1.4-py3-none-any.whl.

File metadata

Download URL: allyson-0.1.4-py3-none-any.whl
Upload date: Mar 9, 2025
Size: 28.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for allyson-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d507df251564086786327bbed2a64a82c33136025e249ad31d441672d754a6f1`
MD5	`8af85048761d5cb0d3a1393521c0cd4c`
BLAKE2b-256	`a2096fcafd1ea33946e4e4b3bd60ae0b0348fa02ea1dd339f2f442119129d66d`
