The world's most powerful personal AGI agent: control your PC via Telegram, voice, or code. 100% free & open source.

🤖 DoItAgent: The World's Most Powerful Personal AGI

PyPI version Python 3.9+ License: MIT Platform

Control your entire PC from anywhere via Telegram, voice, or Python.
Natural language. 100% free. Runs 24/7. Fully autonomous.


⚡ What is DoItAgent?

DoItAgent is a Python library that turns any computer into an AI-powered agent you can control from anywhere in the world via Telegram. Tell it what to do in plain English and it executes the task.

from doitagent import Agent

ai = Agent()                                                    # auto-loads config
ai.do("take a screenshot and save it to Desktop")
ai.do("find all PDFs in Downloads, zip them, email to me")
ai.do("search for today's AI news and create a Word summary doc")
ai.do("play lofi music on YouTube")
ai.do("what processes are consuming most memory?")
ai.do("create an Excel budget spreadsheet with 3 months of data")

Or message your Telegram bot from your phone, and your laptop executes the request.


🚀 Quick Start

# Install
pip install doitagent

# Setup (interactive wizard, takes ~3 minutes)
doit setup

# Start the 24/7 Telegram bot
doit start

# Or use the interactive terminal
doit

That's it. The wizard handles everything.


📱 Telegram Setup

After doit setup, message your bot from anywhere:

take a screenshot
list files on my Desktop
create a Word doc about AI trends
open Chrome and go to YouTube  
run: ipconfig
search for Python news
make an Excel file with my budget
set volume to 60%
zip my Documents folder
say "task complete" out loud
teach skill morning: take screenshot, check email, play music
run morning skill

🧠 Supported LLM Backends

| Backend | Cost | Privacy | Setup |
| --- | --- | --- | --- |
| Ollama | Free | 100% local | doit setup → option 1 |
| NVIDIA NIM | Free tier | API | nvapi-... key from build.nvidia.com |
| Anthropic | ~$0.05/task | API | sk-ant-... key |
| OpenAI | ~$0.05/task | API | sk-... key |
| DeepSeek | ~$0.001/task | API | Key from platform.deepseek.com |

# Ollama (free, local)
ai = Agent.with_ollama("mistral")

# NVIDIA NIM (free tier)
ai = Agent.with_nvidia("nvapi-your-key", model="llama-3.1-70b")

# Anthropic
ai = Agent.with_anthropic("sk-ant-...", model="claude-opus-4-6")

# OpenAI
ai = Agent.with_openai("sk-...", model="gpt-4o")

# DeepSeek
ai = Agent.with_deepseek("your-key")

🔧 12 Capability Modules

ai.screen: See the Screen

ai.screen.screenshot()                           # Full screenshot → file path
ai.screen.screenshot(region=(0,0,800,600))       # Region screenshot
ai.screen.read_text()                            # OCR all text on screen
ai.screen.find_text_on_screen("Submit")          # → (x, y) coordinates
ai.screen.find_image_on_screen("icon.png")       # Template matching
ai.screen.watch_for_change((0,0,300,100), cb)    # Monitor region

ai.files: File CRUD

ai.files.create_file("~/Desktop/notes.txt", "content")
ai.files.read_file("~/Desktop/notes.txt")
ai.files.list_folder("~/Desktop", "*.pdf")
ai.files.find_files("*.pdf", "~/Downloads")
ai.files.move("~/Downloads/file.pdf", "~/Documents/")
ai.files.copy("~/file.txt", "~/backup/")
ai.files.trash("~/old.txt")                      # Recycle bin (safe)
ai.files.zip_folder("~/Project")
ai.files.unzip("~/archive.zip", "~/extracted/")
ai.files.read_json("~/config.json")
ai.files.write_json("~/config.json", {"key": "val"})

ai.browser: Web Browser (Playwright)

ai.browser.open("https://google.com")
ai.browser.goto("https://news.ycombinator.com")
ai.browser.click("text=Sign In")
ai.browser.type_text("input[name='q']", "python")
ai.browser.search_google("best Python books 2024")
ai.browser.get_text()                            # Full page text
ai.browser.screenshot("page.png", full_page=True)
ai.browser.fill_form({"#email": "me@x.com", "#pw": "pass"})
ai.browser.download_file("https://x.com/file.pdf", "~/file.pdf")
ai.browser.execute_js("return document.title")

ai.shell: Commands & Scripts

ai.shell.run("ipconfig")
ai.shell.run("dir C:/Users")
ai.shell.powershell("Get-Process | Sort CPU -Desc | Select -First 10")
ai.shell.open_app("notepad")
ai.shell.open_app("chrome")
ai.shell.kill_process("notepad.exe")
ai.shell.run_python("~/scripts/process.py", ["--input", "data.csv"])
ai.shell.run_python_code("import math; print(math.pi)")
ai.shell.pip_install("pandas numpy")

ai.kb: Keyboard & Mouse

ai.kb.click(500, 300)
ai.kb.click(500, 300, clicks=2)                  # Double-click
ai.kb.click_text("OK")                           # Find & click text on screen
ai.kb.type("Hello World!")
ai.kb.type_fast("Long text goes here...")        # Paste method
ai.kb.hotkey("ctrl", "c")
ai.kb.hotkey("ctrl", "shift", "esc")             # Task manager
ai.kb.hotkey("win", "d")                         # Show desktop
ai.kb.press("enter")
ai.kb.scroll(-5)                                 # Scroll down
ai.kb.drag(100, 100, 500, 100)
ai.kb.set_clipboard("Copy this!")

ai.system: Windows System

ai.system.notify("DoItAgent", "Task complete!")
ai.system.list_processes()                       # → List[dict]
ai.system.kill_process("notepad.exe")
ai.system.get_system_info()                      # CPU, RAM, disk
ai.system.get_battery()
ai.system.get_network_info()
ai.system.set_volume(70)
ai.system.lock_screen()
ai.system.get_installed_apps()
ai.system.schedule_task("MyTask", "python script.py", "daily")

ai.media: Audio & Video

ai.media.play("C:/Music/song.mp3")
ai.media.play_youtube("lofi hip hop study")
ai.media.play_spotify("Discover Weekly")
ai.media.pause()
ai.media.next_track()
ai.media.set_volume(80)
ai.media.text_to_speech("Task completed!")
ai.media.record_audio(duration=10, save_path="~/recording.wav")

ai.docs: Document Creation

# Word
path = ai.docs.create_word("report.docx", title="Q3 Report",
    paragraphs=["Revenue up 20%", "New markets: Asia, Europe"])
ai.docs.add_table_to_word("report.docx",
    headers=["Name","Sales"], rows=[["Alice","$50K"]])

# Excel
path = ai.docs.create_excel("data.xlsx",
    headers=["Date","Revenue","Units"],
    rows=[["Jan",50000,120], ["Feb",62000,155]])

# PDF
path = ai.docs.create_pdf("report.pdf", title="Annual Report",
    paragraphs=["Summary paragraph here."])
text = ai.docs.read_pdf("document.pdf")

# CSV
ai.docs.create_csv("export.csv", ["Name","Email"], [["Alice","a@b.com"]])
data = ai.docs.read_csv("export.csv")

ai.db: SQLite Database

ai.db.connect("~/myapp.db")                      # or ":memory:"
ai.db.create_table("tasks", {
    "id":   "INTEGER PRIMARY KEY AUTOINCREMENT",
    "name": "TEXT NOT NULL",
    "done": "INTEGER DEFAULT 0",
    "created_at": "TEXT DEFAULT CURRENT_TIMESTAMP",
})
task_id = ai.db.insert("tasks", {"name": "Buy groceries"})   # avoid shadowing the built-in id()
rows = ai.db.select("tasks")
rows = ai.db.select("tasks", "done = ?", (0,))
ai.db.update("tasks", {"done": 1}, "id = ?", (task_id,))
ai.db.delete("tasks", "done = 1")
ai.db.execute("SELECT COUNT(*) as total FROM tasks")
ai.db.export_to_csv("tasks", "~/tasks.csv")
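
For readers curious what calls like these could translate to, here is a minimal sketch using only Python's standard sqlite3 module. It is not DoItAgent's implementation: the helpers insert_row and select_rows are hypothetical names, and the sketch assumes the where string and params tuple are passed through as a parameterized query.

```python
import sqlite3

def insert_row(conn, table, values):
    """Insert a dict of column -> value and return the new row id (hypothetical helper)."""
    columns = ", ".join(values)
    placeholders = ", ".join("?" for _ in values)
    # Only the values are parameterized; table and column names must be trusted.
    cur = conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        tuple(values.values()),
    )
    conn.commit()
    return cur.lastrowid

def select_rows(conn, table, where=None, params=()):
    """Return matching rows as a list of dicts (hypothetical helper)."""
    sql = f"SELECT * FROM {table}"
    if where:
        sql += f" WHERE {where}"
    return [dict(row) for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row      # rows behave like mappings
conn.execute(
    "CREATE TABLE tasks ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "name TEXT NOT NULL, "
    "done INTEGER DEFAULT 0)"
)
task_id = insert_row(conn, "tasks", {"name": "Buy groceries"})
open_tasks = select_rows(conn, "tasks", "done = ?", (0,))
```

Keeping the filter values in a separate tuple, as the select and update calls above do, lets SQLite bind them safely instead of splicing them into the SQL text.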

ai.search: Web Search & Scraping

results = ai.search.web_search("Python automation 2024", num_results=10)
summary = ai.search.search_and_summarize("latest AI news")
text    = ai.search.scrape_page("https://example.com")
links   = ai.search.scrape_page("https://example.com", extract="links")
news    = ai.search.search_news("artificial intelligence", num_results=5)
page    = ai.search.get_page_text("https://docs.python.org", max_chars=5000)

ai.voice: Speech I/O

command = ai.voice.listen()                      # Listen 5 seconds
command = ai.voice.listen(duration=10)
ai.voice.speak("Task completed successfully!")
text    = ai.voice.transcribe_file("~/meeting.mp3")
path    = ai.voice.save_speech("Hello", "~/hello.mp3")
voices  = ai.voice.list_voices()

ai.email: Send & Read Email

ai.email.configure("me@gmail.com", "app_password")
ai.email.send("boss@company.com", "Report Ready", "See attached.", ["~/report.pdf"])
emails = ai.email.read_inbox(limit=10, unread_only=True)
found  = ai.email.search_emails("invoice")

🔗 Advanced Features

Pipelines: Chain tasks

ai.pipeline(
    "search the web for top 5 Python frameworks in 2024",
    "create a Word document summarizing the results with a comparison table",
    "save it to Desktop as python_frameworks_2024.docx",
)

Watch Triggers: React to events

ai.watch(
    trigger="CPU usage exceeds 90%",
    action="notify me and save a list of top processes to Desktop/cpu_alert.txt",
    interval=60,
)
ai.watch(
    trigger="new PDF appears in Downloads folder",
    action="move it to Documents/PDFs/ and send me a Telegram notification",
    interval=15,
)
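
The interval argument suggests a polling loop. The following is a rough sketch of that idea, not DoItAgent's actual mechanism: watch_condition is a hypothetical function, and it assumes the natural-language trigger has already been turned into a Python callable. A simulated list of CPU readings stands in for a real probe.

```python
import time

def watch_condition(check, action, interval=1.0, max_polls=None):
    """Poll check() every `interval` seconds and call action() when it turns True."""
    fired = 0
    was_true = False
    polls = 0
    while max_polls is None or polls < max_polls:
        is_true = bool(check())
        if is_true and not was_true:    # edge-triggered: fire on False -> True only
            action()
            fired += 1
        was_true = is_true
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
    return fired

# Simulated CPU percentages stand in for a real measurement.
readings = iter([40, 95, 97, 50, 92])
alerts = []
fired = watch_condition(
    check=lambda: next(readings) > 90,
    action=lambda: alerts.append("cpu high"),
    interval=0.01,
    max_polls=5,
)
```

Firing only on the transition from false to true avoids repeating the same alert on every poll while the condition stays true.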

Schedule Tasks: Time-based automation

ai.schedule("take a screenshot and save with timestamp", "every day at 9am")
ai.schedule("search for AI news and append to ~/daily_news.txt", "every morning at 7am")
ai.schedule("clear old files in ~/Downloads older than 30 days", "every sunday at midnight")

Learn Skills: Reusable workflows

ai.learn(
    skill_name="morning_briefing",
    description="My morning routine",
    steps=[
        "take a screenshot of current desktop",
        "search the web for today's top tech news",
        "create a text file ~/Desktop/briefing.txt with the news",
        "say 'Good morning! Your briefing is ready.'",
    ],
)
ai.do("run morning_briefing skill")

Voice Command Mode

ai.listen()  # Speak commands aloud. Say "DoIt stop" to exit.

🛡️ Security

  • Safe mode: confirms before deleting files or taking other dangerous actions
  • Sandbox: shell commands run in an isolated temp directory
  • Risk classification: CRITICAL/HIGH/MEDIUM/LOW for every command
  • Audit log: every action is logged to ~/.doitagent/logs/audit.jsonl
  • Approval workflow: the Telegram bot asks before destructive actions
  • User whitelist: only your Telegram user ID can control the bot
  • Secrets in keyring: API keys are stored in the OS keyring, not in plain text
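
To make the risk classification and audit log ideas concrete, here is a small illustrative sketch. Everything in it is an assumption: the regex rules, the classify and audit helpers, and the JSON field names are hypothetical and are not taken from DoItAgent's source. It writes to a temporary file rather than the real audit log.

```python
import json
import re
import tempfile
import time
from pathlib import Path

# Hypothetical rules, checked from most to least severe; the first match wins.
RISK_RULES = [
    ("CRITICAL", re.compile(r"\b(format|mkfs|shutdown)\b|rm\s+-rf\s+/", re.I)),
    ("HIGH", re.compile(r"\b(del|rm|rmdir|taskkill|kill)\b", re.I)),
    ("MEDIUM", re.compile(r"\b(pip|move|mv|copy|cp)\b", re.I)),
]

def classify(command):
    """Return a risk level for a shell command string."""
    for level, pattern in RISK_RULES:
        if pattern.search(command):
            return level
    return "LOW"

def audit(log_path, command, level, approved):
    """Append one JSON object per line (the JSON Lines format)."""
    entry = {"ts": time.time(), "command": command, "risk": level, "approved": approved}
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")

log_path = Path(tempfile.mkdtemp()) / "audit.jsonl"
for cmd in ["ipconfig", "pip install pandas", "del old.txt", "format C:"]:
    level = classify(cmd)
    approved = level in ("LOW", "MEDIUM")   # higher levels would need user approval
    audit(log_path, cmd, level, approved)

entries = [json.loads(line) for line in log_path.read_text(encoding="utf-8").splitlines()]
```

A .jsonl file like this can be appended to cheaply and read back one record at a time, which suits an audit trail that only grows.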

🔄 24/7 Daemon

doit start           # Start with watchdog (auto-restarts on crash)
doit stop            # Stop the daemon
doit status          # Check status

Features:

  • Automatic restart on crash (configurable max attempts)
  • Periodic Telegram heartbeat messages
  • Cross-platform: Windows Task Scheduler / Linux systemd / macOS LaunchAgent
  • PID file management
  • Full crash log
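
The restart behaviour can be pictured as a small supervisor loop. This is a sketch under stated assumptions, not the code in daemon/watchdog.py: run_with_restarts, its max_attempts and backoff parameters, and the linear backoff are all hypothetical. A child process that always exits with code 1 stands in for a crashing bot.

```python
import subprocess
import sys
import time

def run_with_restarts(cmd, max_attempts=3, backoff=0.1):
    """Run cmd and restart it after a non-zero exit, up to max_attempts runs in total."""
    exit_codes = []
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        exit_codes.append(result.returncode)
        if result.returncode == 0:          # clean exit: stop supervising
            break
        if attempt < max_attempts:
            time.sleep(backoff * attempt)   # wait a little longer before each retry
    return exit_codes

# A child that always fails stands in for the real bot process.
crashing = [sys.executable, "-c", "import sys; sys.exit(1)"]
codes = run_with_restarts(crashing, max_attempts=3, backoff=0.01)
```

Capping the number of attempts keeps a permanently broken configuration from restarting forever.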

📦 Installation Options

# Minimal
pip install doitagent

# With voice (Whisper STT + pyttsx3 TTS)
pip install doitagent[voice]

# With Windows-specific extras
pip install doitagent[windows]

# Everything
pip install doitagent[all]

🗂️ Project Structure

doitagent/
├── core.py              # Agent, the main class
├── config.py            # Secure configuration
├── cli.py               # CLI (doit command)
├── exceptions.py        # Exception hierarchy
├── _version.py
├── agent/
│   ├── executor.py      # LLM reasoning loop
│   └── memory.py        # Persistent memory
├── llm/
│   └── client.py        # Multi-backend LLM client
├── modules/
│   ├── screen.py        # Screen vision & OCR
│   ├── files.py         # File CRUD
│   ├── browser.py       # Playwright automation
│   ├── shell.py         # Shell execution
│   ├── keyboard_mouse.py
│   ├── system.py        # Windows system
│   ├── media.py         # Audio/video
│   ├── documents.py     # Word/Excel/PDF
│   ├── database.py      # SQLite
│   ├── search.py        # Web search
│   ├── voice.py         # STT + TTS
│   └── email_mod.py     # Email
├── telegram/
│   └── bot.py           # Telegram bot
├── security/
│   └── sandbox.py       # Sandboxing + audit
├── daemon/
│   └── watchdog.py      # 24/7 daemon
└── utils/
    ├── logger.py         # Rich logging
    └── installer.py      # Auto-install deps

💻 CLI Reference

doit setup                    # First-time setup wizard
doit start                    # Start Telegram bot (24/7)
doit stop                     # Stop daemon
doit status                   # Show agent status
doit "take a screenshot"      # Run a single task
doit                          # Interactive chat mode
doit listen                   # Voice command mode
doit run reset                # Reset configuration
doit --version                # Show version

🤝 Contributing

git clone https://github.com/doitagent/doitagent
cd doitagent
pip install -e .[dev]
pytest tests/

📄 License

MIT License. Free to use, modify, and distribute.


Because life's too short to click things yourself.

Download files

Source Distribution

doitagent-2.0.0.tar.gz (77.5 kB)

Uploaded Source

Built Distribution

doitagent-2.0.0-py3-none-any.whl (80.4 kB)

Uploaded Python 3

File details

Details for the file doitagent-2.0.0.tar.gz.

File metadata

  • Download URL: doitagent-2.0.0.tar.gz
  • Upload date:
  • Size: 77.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.0rc1

File hashes

Hashes for doitagent-2.0.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | db171a4089b371b758270c22c2ee7c24ec7a756b1c399b8a38123f5f02e56fb4 |
| MD5 | 8de7e9829940efdac0f76cde4d0d29b5 |
| BLAKE2b-256 | 4039c4ee1edea60b6363464303680b366bf0d5bb354f2f89b57671909d12d1b5 |
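
A published digest is only useful if it is checked against the downloaded file. Below is a minimal sketch of such a check with Python's standard hashlib module. The helper names sha256_of and verify are hypothetical, and the demo runs on a throwaway file so that it is self-contained; for a real check, pass the path of the downloaded archive and the SHA256 value listed above.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected_hex):
    """Return True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()

# Demo on a small temporary file instead of a real download.
sample = Path(tempfile.mkdtemp()) / "sample.bin"
sample.write_bytes(b"hello")
ok = verify(sample, hashlib.sha256(b"hello").hexdigest())
```

pip can perform the same kind of check itself when a requirements file pins hashes and is installed with --require-hashes.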

File details

Details for the file doitagent-2.0.0-py3-none-any.whl.

File metadata

  • Download URL: doitagent-2.0.0-py3-none-any.whl
  • Upload date:
  • Size: 80.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.0rc1

File hashes

Hashes for doitagent-2.0.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 29615878e8a2c06fdd3f31fb5b9b2937768a9a14bff4d2210f701a031f8ba799 |
| MD5 | 47b7a204729dbad16e4b3a74e25c2aa4 |
| BLAKE2b-256 | ee6c367212ab33cb0d84cef3b9496c0d919a868c4d05def99492d66507fe7d56 |
