🤖 AskUI Vision Agent
⚡ Automate computer tasks in Python ⚡
🔧 Setup
1. Install AskUI Agent OS
Linux
⚠️ Warning: Agent OS currently does not work on Wayland. Switch to XOrg to use it.
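Before installing on Linux, it can help to check which display server the current session uses. The sketch below is not part of AskUI. It assumes the desktop session exports the conventional `XDG_SESSION_TYPE` variable (commonly `x11` or `wayland`) or, under Wayland, `WAYLAND_DISPLAY`. The function name `is_wayland_session` is made up for illustration.

```python
import os


def is_wayland_session(env=None):
    """Return True if the environment looks like a Wayland session.

    Assumption: the session sets XDG_SESSION_TYPE and/or WAYLAND_DISPLAY,
    which is common on Linux desktops but not guaranteed.
    """
    env = os.environ if env is None else env
    session_type = env.get("XDG_SESSION_TYPE", "").strip().lower()
    if session_type:
        return session_type == "wayland"
    # Fall back to the socket name that Wayland compositors usually export.
    return bool(env.get("WAYLAND_DISPLAY"))


if __name__ == "__main__":
    if is_wayland_session():
        print("Wayland session detected: switch to XOrg before using Agent OS.")
    else:
        print("No Wayland session detected.")
```

A negative result only means that neither variable points to Wayland; it does not prove that an XOrg session is running.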
AMD64
curl -o /tmp/AskUI-Suite-24.9.1-User-Installer-Linux-x64-Full.run https://files.askui.com/releases/Installer/24.9.1/AskUI-Suite-24.9.1-User-Installer-Linux-x64-Full.run
bash /tmp/AskUI-Suite-24.9.1-User-Installer-Linux-x64-Full.run
ARM64
curl -o /tmp/AskUI-Suite-24.9.1-User-Installer-Linux-ARM64-Full.run https://files.askui.com/releases/Installer/24.9.1/AskUI-Suite-24.9.1-User-Installer-Linux-ARM64-Full.run
bash /tmp/AskUI-Suite-24.9.1-User-Installer-Linux-ARM64-Full.run
MacOS
curl -o /tmp/AskUI-Suite-24.9.1-User-Installer-MacOS-ARM64-Full.run https://files.askui.com/releases/Installer/24.9.1/AskUI-Suite-24.9.1-User-Installer-MacOS-ARM64-Full.run
bash /tmp/AskUI-Suite-24.9.1-User-Installer-MacOS-ARM64-Full.run
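The three installers above differ only in the platform part of the file name. As an illustration, the following sketch maps the current platform to the matching download URL from this section. The helper name `installer_url` and the machine-name aliases are assumptions; only the three URLs listed above are taken from this README, and any other platform raises an error.

```python
import platform

BASE_URL = "https://files.askui.com/releases/Installer/24.9.1"

# (system, architecture) -> installer file name, copied from the commands above.
INSTALLERS = {
    ("Linux", "x64"): "AskUI-Suite-24.9.1-User-Installer-Linux-x64-Full.run",
    ("Linux", "ARM64"): "AskUI-Suite-24.9.1-User-Installer-Linux-ARM64-Full.run",
    ("Darwin", "ARM64"): "AskUI-Suite-24.9.1-User-Installer-MacOS-ARM64-Full.run",
}

# Assumed aliases for the values platform.machine() typically reports.
ARCH_ALIASES = {"x86_64": "x64", "amd64": "x64", "aarch64": "ARM64", "arm64": "ARM64"}


def installer_url(system=None, machine=None):
    """Return the installer URL for the given (or current) platform."""
    system = system or platform.system()
    machine = (machine or platform.machine()).lower()
    arch = ARCH_ALIASES.get(machine)
    try:
        return f"{BASE_URL}/{INSTALLERS[(system, arch)]}"
    except KeyError:
        raise ValueError(f"No installer listed here for {system}/{machine}") from None
```

Calling `installer_url()` without arguments uses the values reported by `platform.system()` and `platform.machine()`.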
2. Install vision-agent in your Python environment
pip install askui
3. Authenticate with Anthropic
Set the ANTHROPIC_API_KEY environment variable to access the Claude computer use model. (Create an Anthropic API key first if you do not have one.)
Linux & MacOS
Use export to set an environment variable:
export ANTHROPIC_API_KEY=<your-api-key-here>
Windows PowerShell
Set an environment variable with $env:
$env:ANTHROPIC_API_KEY="<your-api-key-here>"
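A missing key usually surfaces only once the model is first called, so a script may want to check for it up front. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper and not part of askui.

```python
import os


def require_api_key(env=None, name="ANTHROPIC_API_KEY"):
    """Return the API key from the environment or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the agent.")
    return key
```

Calling `require_api_key()` at the top of a script fails fast with a readable message instead of a later authentication error.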
▶️ Start Building
from askui import VisionAgent

# Initialize your agent context manager
with VisionAgent() as agent:
    # Use the webbrowser tool to start browsing
    agent.tools.webbrowser.open_new("http://www.google.com")

    # Start to automate individual steps
    agent.click("url bar")
    agent.type("http://www.google.com")
    agent.keyboard("enter")

    # Extract information from the screen
    datetime = agent.get("What is the datetime at the top of the screen?")
    print(datetime)

    # Or let the agent work on its own
    agent.act("search for a flight from Berlin to Paris in January")
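The individual steps above can also be kept as data and replayed in order. The sketch below is an illustration under assumptions: `run_steps` and `RecordingAgent` are hypothetical names, not part of askui, and the helper relies only on the `click`, `type`, and `keyboard` methods shown in the example. It accepts any object exposing those methods, so it can be tried with a stand-in before pointing it at a real agent.

```python
def run_steps(agent, steps):
    """Call agent.<action>(argument) for each (action, argument) pair, in order."""
    allowed = {"click", "type", "keyboard"}  # methods used in the example above
    for action, argument in steps:
        if action not in allowed:
            raise ValueError(f"Unsupported action: {action!r}")
        getattr(agent, action)(argument)


class RecordingAgent:
    """Stand-in that records calls instead of controlling the computer."""

    def __init__(self):
        self.calls = []

    def click(self, target):
        self.calls.append(("click", target))

    def type(self, text):
        self.calls.append(("type", text))

    def keyboard(self, key):
        self.calls.append(("keyboard", key))


steps = [("click", "url bar"), ("type", "http://www.google.com"), ("keyboard", "enter")]
recorder = RecordingAgent()
run_steps(recorder, steps)
print(recorder.calls)
```

With a real agent, the same list would be passed inside the `with VisionAgent() as agent:` block as `run_steps(agent, steps)`.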
📜 Logging
Want a better understanding of what your agent is doing? Set the log_level to DEBUG.
import logging
from askui import VisionAgent

with VisionAgent(log_level=logging.DEBUG) as agent:
    ...  # your automation steps go here
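To switch the log level without editing code, the level can be looked up by name. This is a minimal sketch: `log_level_from_name` and the `AGENT_LOG_LEVEL` variable are hypothetical names chosen for this example; only the `log_level` parameter comes from the snippet above.

```python
import logging
import os


def log_level_from_name(name, default=logging.INFO):
    """Translate a level name such as "debug" into the logging constant."""
    level = getattr(logging, str(name).strip().upper(), None)
    # Anything that is not a numeric logging level falls back to the default.
    return level if isinstance(level, int) else default


# AGENT_LOG_LEVEL is a made-up variable name for this sketch.
level = log_level_from_name(os.environ.get("AGENT_LOG_LEVEL", "INFO"))
print(logging.getLevelName(level))
```

The result could then be passed on as `VisionAgent(log_level=level)`.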
🖥️ Multi-Monitor Support
Have multiple monitors? Choose which one to automate by setting display to 1 or 2.
from askui import VisionAgent

with VisionAgent(display=1) as agent:
    ...  # your automation steps go here
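If the monitor should be selectable at run time, the display number can be read from the command line. The sketch below uses the standard `argparse` module; the `--display` flag, the helper name `parse_display`, and the default of 1 are assumptions made for this example, while the allowed values 1 and 2 come from the text above.

```python
import argparse


def parse_display(argv=None):
    """Read the monitor number from the command line (1 or 2, default 1)."""
    parser = argparse.ArgumentParser(description="Pick the monitor to automate.")
    parser.add_argument("--display", type=int, choices=[1, 2], default=1)
    return parser.parse_args(argv).display
```

The returned value could then be used as `VisionAgent(display=parse_display())`; any value other than 1 or 2 makes `argparse` exit with a usage message.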
What is AskUI Vision Agent?
AskUI Vision Agent is a versatile AI-powered framework that enables you to automate computer tasks in Python.
It connects Agent OS with powerful computer use models like Anthropic's Claude Sonnet 3.5 v2 and the AskUI Prompt-to-Action series. It is your entry point for building complex automation scenarios with detailed instructions or for letting the agent explore new challenges on its own.
Agent OS is a custom-built OS controller designed to enhance your automation experience.
It offers powerful features like
- multi-screen support,
- support for all major operating systems (incl. Windows, MacOS and Linux),
- process visualizations,
- real Unicode character typing.

More features, such as application selection, in-background automation, and video streaming, are to be released soon.