
Project description

AI Kit - Supercharge Your AI IDE ⚡️

AI Kit is the first CLI that's not designed for you. It's designed for your agent.

Designed to integrate with a text editor like Cursor or Windsurf (or any environment with a shell), it extends your agent with search, reasoning, and memory. The goal: make myself obsolete, so I can go do other, better stuff - like scrolling TikTok.

Only one command for you

init - Initialize or reset AI Kit in your project:

ai-kit init

This creates an .ai-kit directory that both you and your agent have control over.

Principles

  • Local first, for full control
  • Hardcode as little as possible, instead use composable patterns and leverage agency
  • At the current state of LLMs, it is best not to have a non-reasoning model (like claude-3.5-sonnet) plan or make decisions. These models can operate effectively within hardcoded workflows, for example:
  1. If the user says "what is the weather in SF?" use the weather tool to get the weather in SF.
  2. If the user says "what is 3+3?" use the math tool to get the answer.
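
A hardcoded workflow like the two rules above can be sketched as a plain dispatch function. The tool names here are hypothetical, just to illustrate the pattern:

```python
# A non-reasoning model works fine when the routing is hardcoded:
# match the request against fixed heuristics, no planning involved.
# Tool names are hypothetical.
def dispatch(user_message: str) -> str:
    msg = user_message.lower()
    if "weather" in msg:
        return "weather_tool"      # rule 1: weather questions
    if any(op in msg for op in ("+", "-", "*", "/")):
        return "math_tool"         # rule 2: arithmetic questions
    return "respond_directly"      # no tool needed
```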

The problem arises when we have a complex task that requires planning and orchestration that can't be hardcoded into a set of heuristics like the above. In this case, we want to avoid extending the execution agent with many tools and instructions. Instead, we build a composable system that mirrors the human workflow.

  • Runtime first, prefill as little as possible.

Consider the following: the user asks the Executioner LLM to write code with the Anthropic Python SDK. Since:

  1. the Executioner LLM's training data is out of date (as with all LLMs), and
  2. the Anthropic Python SDK changes often,

the model will hallucinate and write outdated code that doesn't work. The typical way to solve this is to provide the relevant information in context.

Since we have a robust crawler and vector store, the principles of Retrieval-Augmented Generation would recommend:

  1. Use the crawler to crawl lots of relevant docs
  2. Store them all in the vector store
  3. Use a search function to retrieve the most relevant docs
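
The three steps above could be sketched like this. A toy in-memory store scored by word overlap stands in for real embeddings, and the crawler is stubbed:

```python
# Classic RAG pipeline, sketched. A real system would embed documents;
# here word overlap stands in for vector similarity, and crawl() is a stub.
def crawl(urls):
    # 1. crawl lots of relevant docs (stubbed: pretend we fetched them)
    return {url: f"contents of {url}" for url in urls}

def index(store, docs):
    # 2. store them all in the "vector store"
    store.update(docs)

def search(store, query, k=1):
    # 3. retrieve the k most relevant docs by word overlap
    words = set(query.lower().split())
    ranked = sorted(store,
                    key=lambda u: len(words & set(store[u].lower().split())),
                    reverse=True)
    return ranked[:k]
```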

There are a few problems with this approach:

  1. We need to put the docs into the vector store before our query. Or, we can have the model use a tool to crawl what it needs. This adds some complexity.
  2. We need to crawl the entire docs so that all the information is available. This is clearly problematic.

Is this how humans would do it? Definitely not. If I'm using a new package, I will:

  1. Search for the docs
  2. Poke around for the page I want
  3. Use that information only
  4. Note to refer back to it later if I need to

With AI Agents, instead of using traditional RAG patterns, we can use an Agent to go get what we need at runtime, just like a human would.

This is the advantage of agents - we don't have to prefill data and setup heuristics, instead we use an Augmented LLM with a task description but without a specific workflow.
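
The runtime-first loop is just the human workflow from above, with the agent's tools passed in. The `search`/`fetch` helpers and the memory list are hypothetical stand-ins:

```python
# Runtime retrieval, mirroring the human workflow: search, pick a page,
# use only that, note it for later. search/fetch are the agent's
# (hypothetical) tools; memory is wherever notes get persisted.
def agent_lookup(task, search, fetch, memory):
    results = search(task)     # 1. search for the docs
    page = results[0]          # 2. poke around for the page we want
    content = fetch(page)      # 3. use that information only
    memory.append(page)        # 4. note it to refer back to later
    return content
```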

Setting up your system prompt

The AI has different commands that it can run. The core workflow is as follows:

The System

graph TD
    User([User])
    Exec["Executioner LLM (claude-3.5-sonnet)<br/>1) Shell<br/>2) Read/Write<br/>3) Local Grep"]
    Router{Router}
    Think["Thinking Model (r1)"]
    Prompts[(User Prompts)]
    Mem[(Memory)]
    System[(System Prompts)]
    
    User <--> Exec
    Exec --> Router
    
    subgraph Brain
        Router -->|1\. simple| Exec
        Router -->|2\. complex| Think
        Think --> Prompts
        Think --> Mem
        Think --> System
        Think -->|thought stream| Exec
    end

An example system prompt (.cursorrules) is as follows:

# <-- mode: markdown -->
IF THE USER SAYS "BYPASS INIT" YOU MUST IMMEDIATELY RESPOND WITHOUT RUNNING THINK

<IMPORTANT>
EVERY TIME the user talks to you except for the `BYPASS INIT` command, you must use `ai-kit think [prompt]` to generate a thought stream.

When the user's request involves files or code, run `ai-kit think "prompt {{ path/to/file_1 }} {{ path/to/file_2 }} ..."` You must use the `{{ path/to/file_1 }}` syntax. Do not just use the file name.
</IMPORTANT>

DO NOT RESPOND TO THE USER WITHOUT RUNNING `ai-kit think "prompt {{ path/to/file_1 }} {{ path/to/file_2 }} ..."`

Notes:

  • We include a "BYPASS INIT" command so that we can prevent the initialization when we'd like. For example, if we are using a non-agent that doesn't have tool access, we would use this command to prevent the initialization.
  • Be wary of adding too much to the system prompt, which will degrade performance. Instead, use `user_prompts` that are passed to the `Thinking LLM`. If you're having a hard time steering the agent, make sure to pass your prompt into the conversation *before* giving it a task.

The Execution LLM - our entry point

The Execution LLM is the LLM that

  1. Talks to the user
  2. Executes things on disk via read/write, shell, and grep

In my case, the executioner is a claude-3.5-sonnet model built into Cursor's composer.

The Executioner LLM is instructed to run the `ai-kit think "prompt {{ some_file_path }}..."` command, which will call the Router and decide what to do next.

The Router - not every task needs deep thought

The router (`ai_kit.core.router`) sends the user's request down the appropriate route.

There are currently two routes:

  • simple: a simple route that immediately returns to the Executioner. For simple tasks that don't require thinking.
  • complex: a complex route that uses the thinking model to think about the user's request.

The router has 3 input sources:

  • user_prompts: the user's prompt
  • system_prompts: the system prompt
  • memory: the memory
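
A minimal sketch of that routing. The keyword heuristic here is a hypothetical stand-in for whatever classifier `ai_kit.core.router` actually uses:

```python
# Route a request to "simple" (straight back to the Executioner) or
# "complex" (hand off to the thinking model). The keyword check is a
# hypothetical stand-in for the real classifier.
def route(user_prompt, system_prompts=(), memory=()):
    simple_markers = ("rename", "fix typo", "format", "list files")
    if any(m in user_prompt.lower() for m in simple_markers):
        return "simple"
    return "complex"
```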

The Thinking LLM - injecting the thought stream

The Thinking LLM is the LLM that is used to think about the user's request. In this case, we're using DeepSeek's R1, which always outputs a thought stream before responding. We take only the thought stream and pass it back to the Executioner LLM. This is a significant performance increase over the Executioner LLM alone. It also gives us more control over the overall workflow, since we can "steer" this LLM with user prompts.

Note: The trick here is, when streaming the thought stream into stdout for the Executioner LLM, we essentially prefill the context window with the thought stream. The Executioner LLM will think that it generated the thought stream itself, and adjust its response accordingly.
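
One way that filtering might look, assuming a streaming response where reasoning tokens arrive in a separate field (as DeepSeek's R1 API exposes via `reasoning_content`). The chunks here are faked; there is no network call:

```python
# Forward only the reasoning tokens to stdout; the final answer is
# dropped. The chunk shape mimics a streaming API that separates
# reasoning from answer content (faked below, no real client).
def thought_stream(chunks):
    for chunk in chunks:
        delta = chunk.get("reasoning_content")
        if delta:                 # answer-only chunks are skipped
            yield delta

fake_chunks = [
    {"reasoning_content": "The user wants a rename. "},
    {"reasoning_content": "Plan: grep, then edit."},
    {"content": "Done - renamed the variable."},  # answer, filtered out
]
print("".join(thought_stream(fake_chunks)))
```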

Development Principles

  • don't integration test something you can test manually - waste of time
  • unit tests as much as possible

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

python_ai_kit-0.8.1.tar.gz (47.8 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

python_ai_kit-0.8.1-py3-none-any.whl (59.8 kB)

Uploaded Python 3

File details

Details for the file python_ai_kit-0.8.1.tar.gz.

File metadata

  • Download URL: python_ai_kit-0.8.1.tar.gz
  • Upload date:
  • Size: 47.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.16

File hashes

Hashes for python_ai_kit-0.8.1.tar.gz
Algorithm Hash digest
SHA256 a3629c3052255eb64fdccce6af742a3709838486549e2ed869f514a7acacca62
MD5 38c778d99823d35f0cbdaa2c167f3271
BLAKE2b-256 36c19dd9d02e548a7181df49a6bc93f3b7228466f0da6740b5fad9936e6c2298


File details

Details for the file python_ai_kit-0.8.1-py3-none-any.whl.

File metadata

  • Download URL: python_ai_kit-0.8.1-py3-none-any.whl
  • Upload date:
  • Size: 59.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.16

File hashes

Hashes for python_ai_kit-0.8.1-py3-none-any.whl
Algorithm Hash digest
SHA256 f693f51a44ab48b15217200507526ba577a0cd47b43d0c2a38d99f9b48769963
MD5 d959c220d0a48baf779195c7873c2227
BLAKE2b-256 97834c9ee3db2e6d224d99a90e763a6a5e76da22af7d5eb8dbc0160b59407c7f

