
Automate anything with a vision agent.


🤖 AskUI Vision Agent

🔧 Setup

1. Install AskUI Agent OS

Windows
AMD64

AskUI Installer for AMD64

ARM64

AskUI Installer for ARM64

Linux
AMD64
curl -o /tmp/AskUI-Suite-24.9.1-User-Installer-Linux-x64-Full.run https://files.askui.com/releases/Installer/24.9.1/AskUI-Suite-24.9.1-User-Installer-Linux-x64-Full.run

bash /tmp/AskUI-Suite-24.9.1-User-Installer-Linux-x64-Full.run
ARM64
curl -o /tmp/AskUI-Suite-24.9.1-User-Installer-Linux-ARM64-Full.run https://files.askui.com/releases/Installer/24.9.1/AskUI-Suite-24.9.1-User-Installer-Linux-ARM64-Full.run

bash /tmp/AskUI-Suite-24.9.1-User-Installer-Linux-ARM64-Full.run
MacOS
curl -o /tmp/AskUI-Suite-24.9.1-User-Installer-MacOS-ARM64-Full.run https://files.askui.com/releases/Installer/24.9.1/AskUI-Suite-24.9.1-User-Installer-MacOS-ARM64-Full.run

bash /tmp/AskUI-Suite-24.9.1-User-Installer-MacOS-ARM64-Full.run

2. Install vision-agent in your Python environment

pip install git+https://github.com/askui/vision-agent.git
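
To confirm that the package landed in the active environment, a quick check along these lines may help. This is only a sketch: it uses the standard library's `importlib.util.find_spec`, and the helper name `is_installed` is made up for illustration. The module name `askui` is taken from the import used in the examples in this README.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if module_name can be found by the current interpreter."""
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "askui" is the module imported in the examples in this README.
    print("askui importable:", is_installed("askui"))
```

If this prints `False`, the install most likely went into a different Python environment than the one running the script.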

3. Authenticate with Anthropic

Set the ANTHROPIC_API_KEY environment variable to access the Claude computer use model. (Create an Anthropic key here)

Linux & MacOS

Use export to set an environment variable:

export ANTHROPIC_API_KEY=<your-api-key-here>
Windows PowerShell

Set an environment variable with $env:

$env:ANTHROPIC_API_KEY="<your-api-key-here>"
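
A missing key tends to surface only once the agent first calls the model, so it can be useful to fail early. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper and not part of the askui package.

```python
import os


def require_api_key(name: str = "ANTHROPIC_API_KEY") -> str:
    """Return the value of the environment variable, or raise if it is missing or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the agent.")
    return value
```

Calling `require_api_key()` at the top of a script, before creating the agent, turns a forgotten `export` into an immediate and readable error.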

▶️ Start Building

from askui import VisionAgent

# Initialize your agent context manager
with VisionAgent() as agent:
    # Use the webbrowser tool to start browsing
    agent.webbrowser.open_new("http://www.google.com")

    # Start to automate individual steps
    agent.click("url bar")
    agent.type("http://www.google.com")
    agent.keyboard("enter")

    # Extract information from the screen
    datetime = agent.get("What is the datetime at the top of the screen?")
    print(datetime)

    # Or let the agent work on its own
    agent.act("search for a flight from Berlin to Paris in January")
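
Instructions passed to `agent.act` are plain strings, so they can be assembled from parameters. The sketch below assumes only the `act` method shown above; the function name `search_flight` and its arguments are hypothetical and exist purely for illustration.

```python
def search_flight(agent, origin: str, destination: str, month: str) -> str:
    """Build a natural-language instruction and hand it to agent.act()."""
    instruction = f"search for a flight from {origin} to {destination} in {month}"
    # agent is expected to be an object exposing act(), such as the
    # VisionAgent from the example above.
    agent.act(instruction)
    return instruction
```

Inside the `with VisionAgent() as agent:` block, `search_flight(agent, "Berlin", "Paris", "January")` would send the same instruction as the example above.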

📜 Logging

Want a better understanding of what your agent is doing? Set the log_level to DEBUG.

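import logging

from askui import VisionAgent

# Pass log_level=logging.DEBUG to get detailed output from the agent
with VisionAgent(log_level=logging.DEBUG) as agent:
    ...  # your agent calls go here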
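
To switch verbosity without editing the script, the level could be read from the environment. This is a sketch under stated assumptions: the variable name `AGENT_LOG_LEVEL` and the helper `log_level_from_env` are invented here and are not something the askui package defines. Only the `log_level` parameter comes from the example above.

```python
import logging
import os

# Level names accepted by this sketch, mapped to the standard logging constants.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def log_level_from_env(var: str = "AGENT_LOG_LEVEL", default: int = logging.INFO) -> int:
    """Return the logging level named in the environment variable, or the default."""
    name = os.environ.get(var, "").strip().upper()
    # Unknown or empty names fall back to the default level.
    return _LEVELS.get(name, default)
```

The result can then be passed along as `VisionAgent(log_level=log_level_from_env())`.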

