Skip to main content

State-of-the-art AGI powered by NVIDIA NIM + Moonshot Kimi 2.5 โ€” control your PC via Telegram, voice, or code. 100+ frontier models.

Project description

๐Ÿค– DoItAgent โ€” State-of-the-Art AGI ยท Powered by NVIDIA NIM + Moonshot Kimi 2.5

PyPI version Python 3.9+ License: MIT Platform

Control your entire PC from anywhere via Telegram, voice, or Python.
Natural language. 100+ frontier models. 100% free tier. Runs 24/7. Fully autonomous.

๐ŸŸข Primary AI: NVIDIA NIM โ€” FREE access to Moonshot Kimi 2.5, DeepSeek R1, LLaMA 405B, and 100+ models.


โšก What is DoItAgent?

DoItAgent is a Python library that turns any computer into an AI-powered agent you can control from anywhere in the world via Telegram. Tell it what to do in plain English โ€” it executes.

from doitagent import Agent

ai = Agent()                                                    # auto-loads config
ai.do("take a screenshot and save it to Desktop")
ai.do("find all PDFs in Downloads, zip them, email to me")
ai.do("search for today's AI news and create a Word summary doc")
ai.do("play lofi music on YouTube")
ai.do("what processes are consuming most memory?")
ai.do("create an Excel budget spreadsheet with 3 months of data")

Or message your Telegram bot from your phone โ€” your laptop executes it instantly.


๐Ÿš€ Quick Start

# Install
pip install doitagent

# Setup (interactive wizard โ€” takes ~3 minutes, NVIDIA NIM is default)
doit setup

# Start the 24/7 Telegram bot
doit start

# Or use the interactive terminal
doit

The wizard walks you through getting a free NVIDIA NIM API key and selecting your model.


๐ŸŸข NVIDIA NIM โ€” Primary AI (Free Tier)

Get a free API key at build.nvidia.com and unlock:

Model Best For
๐ŸŒ™ Kimi 2.5 (default) General intelligence, long context, multimodal
๐Ÿง  DeepSeek R1 Complex reasoning, math, science
๐Ÿฆ™ LLaMA 3.1 405B Most capable open model
โšก Nemotron Super 49B NVIDIA-optimised, blazing fast
๐Ÿ”ข Qwen 2.5 Coder 32B Code generation
๐Ÿ’ก DeepSeek V3 Latest frontier model
doit models           # See all 100+ available models
doit switch-model deepseek-r1   # Swap model instantly

๐Ÿ“ฑ Telegram Setup

After doit setup, message your bot from anywhere:

take a screenshot
list files on my Desktop
create a Word doc about AI trends
open Chrome and go to YouTube  
run: ipconfig
search for Python news
make an Excel file with my budget
set volume to 60%
zip my Documents folder
say "task complete" out loud
teach skill morning: take screenshot, check email, play music
run morning skill

๐Ÿง  Supported LLM Backends

Backend Cost Privacy Notes
๐ŸŸข NVIDIA NIM โ˜… Free tier API 100+ models incl. Kimi 2.5, DeepSeek R1
Ollama Free 100% local Runs on your GPU/CPU
Anthropic ~$0.05/task API Claude Opus 4
OpenAI ~$0.05/task API sk-... key
DeepSeek ~$0.001/task API Key from platform.deepseek.com
# Ollama (free, local)
ai = Agent.with_ollama("mistral")

# NVIDIA NIM (free tier)
ai = Agent.with_nvidia("nvapi-your-key", model="llama-3.1-70b")

# Anthropic
ai = Agent.with_anthropic("sk-ant-...", model="claude-opus-4-6")

# OpenAI
ai = Agent.with_openai("sk-...", model="gpt-4o")

# DeepSeek
ai = Agent.with_deepseek("your-key")

๐Ÿ”ง 12 Capability Modules

ai.screen โ€” See the Screen

ai.screen.screenshot()                           # Full screenshot โ†’ file path
ai.screen.screenshot(region=(0,0,800,600))       # Region screenshot
ai.screen.read_text()                            # OCR all text on screen
ai.screen.find_text_on_screen("Submit")          # โ†’ (x, y) coordinates
ai.screen.find_image_on_screen("icon.png")       # Template matching
ai.screen.watch_for_change((0,0,300,100), cb)    # Monitor region

ai.files โ€” File CRUD

ai.files.create_file("~/Desktop/notes.txt", "content")
ai.files.read_file("~/Desktop/notes.txt")
ai.files.list_folder("~/Desktop", "*.pdf")
ai.files.find_files("*.pdf", "~/Downloads")
ai.files.move("~/Downloads/file.pdf", "~/Documents/")
ai.files.copy("~/file.txt", "~/backup/")
ai.files.trash("~/old.txt")                      # Recycle bin (safe)
ai.files.zip_folder("~/Project")
ai.files.unzip("~/archive.zip", "~/extracted/")
ai.files.read_json("~/config.json")
ai.files.write_json("~/config.json", {"key": "val"})

ai.browser โ€” Web Browser (Playwright)

ai.browser.open("https://google.com")
ai.browser.goto("https://news.ycombinator.com")
ai.browser.click("text=Sign In")
ai.browser.type_text("input[name='q']", "python")
ai.browser.search_google("best Python books 2024")
ai.browser.get_text()                            # Full page text
ai.browser.screenshot("page.png", full_page=True)
ai.browser.fill_form({"#email": "me@x.com", "#pw": "pass"})
ai.browser.download_file("https://x.com/file.pdf", "~/file.pdf")
ai.browser.execute_js("return document.title")

ai.shell โ€” Commands & Scripts

ai.shell.run("ipconfig")
ai.shell.run("dir C:/Users")
ai.shell.powershell("Get-Process | Sort CPU -Desc | Select -First 10")
ai.shell.open_app("notepad")
ai.shell.open_app("chrome")
ai.shell.kill_process("notepad.exe")
ai.shell.run_python("~/scripts/process.py", ["--input", "data.csv"])
ai.shell.run_python_code("import math; print(math.pi)")
ai.shell.pip_install("pandas numpy")

ai.kb โ€” Keyboard & Mouse

ai.kb.click(500, 300)
ai.kb.click(500, 300, clicks=2)                  # Double-click
ai.kb.click_text("OK")                           # Find & click text on screen
ai.kb.type("Hello World!")
ai.kb.type_fast("Long text goes here...")        # Paste method
ai.kb.hotkey("ctrl", "c")
ai.kb.hotkey("ctrl", "shift", "esc")             # Task manager
ai.kb.hotkey("win", "d")                         # Show desktop
ai.kb.press("enter")
ai.kb.scroll(-5)                                 # Scroll down
ai.kb.drag(100, 100, 500, 100)
ai.kb.set_clipboard("Copy this!")

ai.system โ€” Windows System

ai.system.notify("DoItAgent", "Task complete!")
ai.system.list_processes()                       # โ†’ List[dict]
ai.system.kill_process("notepad.exe")
ai.system.get_system_info()                      # CPU, RAM, disk
ai.system.get_battery()
ai.system.get_network_info()
ai.system.set_volume(70)
ai.system.lock_screen()
ai.system.get_installed_apps()
ai.system.schedule_task("MyTask", "python script.py", "daily")

ai.media โ€” Audio & Video

ai.media.play("C:/Music/song.mp3")
ai.media.play_youtube("lofi hip hop study")
ai.media.play_spotify("Discover Weekly")
ai.media.pause()
ai.media.next_track()
ai.media.set_volume(80)
ai.media.text_to_speech("Task completed!")
ai.media.record_audio(duration=10, save_path="~/recording.wav")

ai.docs โ€” Document Creation

# Word
path = ai.docs.create_word("report.docx", title="Q3 Report",
    paragraphs=["Revenue up 20%", "New markets: Asia, Europe"])
ai.docs.add_table_to_word("report.docx",
    headers=["Name","Sales"], rows=[["Alice","$50K"]])

# Excel
path = ai.docs.create_excel("data.xlsx",
    headers=["Date","Revenue","Units"],
    rows=[["Jan",50000,120], ["Feb",62000,155]])

# PDF
path = ai.docs.create_pdf("report.pdf", title="Annual Report",
    paragraphs=["Summary paragraph here."])
text = ai.docs.read_pdf("document.pdf")

# CSV
ai.docs.create_csv("export.csv", ["Name","Email"], [["Alice","a@b.com"]])
data = ai.docs.read_csv("export.csv")

ai.db โ€” SQLite Database

ai.db.connect("~/myapp.db")                      # or ":memory:"
ai.db.create_table("tasks", {
    "id":   "INTEGER PRIMARY KEY AUTOINCREMENT",
    "name": "TEXT NOT NULL",
    "done": "INTEGER DEFAULT 0",
    "created_at": "TEXT DEFAULT CURRENT_TIMESTAMP",
})
id = ai.db.insert("tasks", {"name": "Buy groceries"})
rows = ai.db.select("tasks")
rows = ai.db.select("tasks", "done = ?", (0,))
ai.db.update("tasks", {"done": 1}, "id = ?", (id,))
ai.db.delete("tasks", "done = 1")
ai.db.execute("SELECT COUNT(*) as total FROM tasks")
ai.db.export_to_csv("tasks", "~/tasks.csv")

ai.search โ€” Web Search & Scraping

results = ai.search.web_search("Python automation 2024", num_results=10)
summary = ai.search.search_and_summarize("latest AI news")
text    = ai.search.scrape_page("https://example.com")
links   = ai.search.scrape_page("https://example.com", extract="links")
news    = ai.search.search_news("artificial intelligence", num_results=5)
page    = ai.search.get_page_text("https://docs.python.org", max_chars=5000)

ai.voice โ€” Speech I/O

command = ai.voice.listen()                      # Listen 5 seconds
command = ai.voice.listen(duration=10)
ai.voice.speak("Task completed successfully!")
text    = ai.voice.transcribe_file("~/meeting.mp3")
path    = ai.voice.save_speech("Hello", "~/hello.mp3")
voices  = ai.voice.list_voices()

ai.email โ€” Send & Read Email

ai.email.configure("me@gmail.com", "app_password")
ai.email.send("boss@company.com", "Report Ready", "See attached.", ["~/report.pdf"])
emails = ai.email.read_inbox(limit=10, unread_only=True)
found  = ai.email.search_emails("invoice")

๐Ÿ”— Advanced Features

Pipelines โ€” Chain tasks

ai.pipeline(
    "search the web for top 5 Python frameworks in 2024",
    "create a Word document summarizing the results with a comparison table",
    "save it to Desktop as python_frameworks_2024.docx",
)

Watch Triggers โ€” React to events

ai.watch(
    trigger="CPU usage exceeds 90%",
    action="notify me and save a list of top processes to Desktop/cpu_alert.txt",
    interval=60,
)
ai.watch(
    trigger="new PDF appears in Downloads folder",
    action="move it to Documents/PDFs/ and send me a Telegram notification",
    interval=15,
)

Schedule Tasks โ€” Time-based automation

ai.schedule("take a screenshot and save with timestamp", "every day at 9am")
ai.schedule("search for AI news and append to ~/daily_news.txt", "every morning at 7am")
ai.schedule("clear old files in ~/Downloads older than 30 days", "every sunday at midnight")

Learn Skills โ€” Reusable workflows

ai.learn(
    skill_name="morning_briefing",
    description="My morning routine",
    steps=[
        "take a screenshot of current desktop",
        "search the web for today's top tech news",
        "create a text file ~/Desktop/briefing.txt with the news",
        "say 'Good morning! Your briefing is ready.'",
    ],
)
ai.do("run morning_briefing skill")

Voice Command Mode

ai.listen()  # Speak commands aloud. Say "DoIt stop" to exit.

๐Ÿ›ก๏ธ Security

  • Safe mode โ€” confirms before deleting files or dangerous actions
  • Sandbox โ€” shell commands run in isolated temp directory
  • Risk classification โ€” CRITICAL/HIGH/MEDIUM/LOW for every command
  • Audit log โ€” every action logged to ~/.doitagent/logs/audit.jsonl
  • Approval workflow โ€” Telegram bot asks before destructive actions
  • User whitelist โ€” only your Telegram user ID can control the bot
  • Secrets in keyring โ€” API keys stored in OS keyring, not plain text

๐Ÿ”„ 24/7 Daemon

doit start           # Start with watchdog (auto-restarts on crash)
doit stop            # Stop the daemon
doit status          # Check status

Features:

  • Automatic restart on crash (configurable max attempts)
  • Periodic Telegram heartbeat messages
  • Cross-platform: Windows Task Scheduler / Linux systemd / macOS LaunchAgent
  • PID file management
  • Full crash log

๐Ÿ“ฆ Installation Options

# Minimal
pip install doitagent

# With voice (Whisper STT + pyttsx3 TTS)
pip install doitagent[voice]

# With Windows-specific extras
pip install doitagent[windows]

# Everything
pip install doitagent[all]

๐Ÿ—‚๏ธ Project Structure

doitagent/
โ”œโ”€โ”€ core.py              # Agent โ€” the main class
โ”œโ”€โ”€ config.py            # Secure configuration
โ”œโ”€โ”€ cli.py               # CLI (doit command)
โ”œโ”€โ”€ exceptions.py        # Exception hierarchy
โ”œโ”€โ”€ _version.py
โ”œโ”€โ”€ agent/
โ”‚   โ”œโ”€โ”€ executor.py      # LLM reasoning loop
โ”‚   โ””โ”€โ”€ memory.py        # Persistent memory
โ”œโ”€โ”€ llm/
โ”‚   โ””โ”€โ”€ client.py        # Multi-backend LLM client
โ”œโ”€โ”€ modules/
โ”‚   โ”œโ”€โ”€ screen.py        # Screen vision & OCR
โ”‚   โ”œโ”€โ”€ files.py         # File CRUD
โ”‚   โ”œโ”€โ”€ browser.py       # Playwright automation
โ”‚   โ”œโ”€โ”€ shell.py         # Shell execution
โ”‚   โ”œโ”€โ”€ keyboard_mouse.py
โ”‚   โ”œโ”€โ”€ system.py        # Windows system
โ”‚   โ”œโ”€โ”€ media.py         # Audio/video
โ”‚   โ”œโ”€โ”€ documents.py     # Word/Excel/PDF
โ”‚   โ”œโ”€โ”€ database.py      # SQLite
โ”‚   โ”œโ”€โ”€ search.py        # Web search
โ”‚   โ”œโ”€โ”€ voice.py         # STT + TTS
โ”‚   โ””โ”€โ”€ email_mod.py     # Email
โ”œโ”€โ”€ telegram/
โ”‚   โ””โ”€โ”€ bot.py           # Telegram bot
โ”œโ”€โ”€ security/
โ”‚   โ””โ”€โ”€ sandbox.py       # Sandboxing + audit
โ”œโ”€โ”€ daemon/
โ”‚   โ””โ”€โ”€ watchdog.py      # 24/7 daemon
โ””โ”€โ”€ utils/
    โ”œโ”€โ”€ logger.py         # Rich logging
    โ””โ”€โ”€ installer.py      # Auto-install deps

๐Ÿ’ป CLI Reference

doit setup                    # First-time setup wizard
doit start                    # Start Telegram bot (24/7)
doit stop                     # Stop daemon
doit status                   # Show agent status
doit "take a screenshot"      # Run a single task
doit                          # Interactive chat mode
doit listen                   # Voice command mode
doit run reset                # Reset configuration
doit --version                # Show version

๐Ÿค Contributing

git clone https://github.com/doitagent/doitagent
cd doitagent
pip install -e .[dev]
pytest tests/

๐Ÿ“„ License

MIT License โ€” Free to use, modify, distribute.


Because life's too short to click things yourself.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

doitagent-2.1.0.tar.gz (96.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

doitagent-2.1.0-py3-none-any.whl (99.3 kB view details)

Uploaded Python 3

File details

Details for the file doitagent-2.1.0.tar.gz.

File metadata

  • Download URL: doitagent-2.1.0.tar.gz
  • Upload date:
  • Size: 96.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.0rc1

File hashes

Hashes for doitagent-2.1.0.tar.gz
Algorithm Hash digest
SHA256 453e35c78b7a7a68da345609c6c34d0813d144df62d630a096563963968efa47
MD5 90397b4a7a7946a7f5105db1df00582d
BLAKE2b-256 39752ba1d6581270b0929772431f4971eb6e367c6365406c70c85468dea3061e

See more details on using hashes here.

File details

Details for the file doitagent-2.1.0-py3-none-any.whl.

File metadata

  • Download URL: doitagent-2.1.0-py3-none-any.whl
  • Upload date:
  • Size: 99.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.0rc1

File hashes

Hashes for doitagent-2.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 7a64ba6d5ac78b7b77032d7872563eb7f635f114bd06af60fcfcae9e4aae4586
MD5 c7d81ecdb619fdda73c517843eefc12c
BLAKE2b-256 8f68d9737fb51e2e08fe693c7921c98d24793437a910d19b1310995f5cc69bfb

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page