Add your description here
Project description
ML Intern
An ML intern that autonomously researches, writes, and ships good quality ML related code using the Hugging Face ecosystem — with deep access to docs, papers, datasets, and cloud compute.
Quick Start
Installation
git clone git@github.com:huggingface/ml-intern.git
cd ml-intern
uv sync
uv tool install -e .
That's it. Now ml-intern works from any directory:
ml-intern
Create a .env file in the project root (or export these in your shell):
ANTHROPIC_API_KEY=<your-anthropic-api-key> # if using anthropic models
OPENAI_API_KEY=<your-openai-api-key> # if using openai models
HF_TOKEN=<your-hugging-face-token>
GITHUB_TOKEN=<github-personal-access-token>
If no HF_TOKEN is set, the CLI will prompt you to paste one on first launch. To get a GITHUB_TOKEN follow the tutorial here.
Usage
Interactive mode (start a chat session):
ml-intern
Headless mode (single prompt, auto-approve):
ml-intern "fine-tune llama on my dataset"
Options:
ml-intern --model anthropic/claude-opus-4-6 "your prompt"
ml-intern --model openai/gpt-5.5 "your prompt"
ml-intern --max-iterations 100 "your prompt"
ml-intern --no-stream "your prompt"
Supported Gateways
ML Intern currently supports one-way notification gateways from CLI sessions. These gateways send out-of-band status updates; they do not accept inbound chat messages.
Slack
Slack notifications use the Slack Web API to post messages when the agent needs
approval, hits an error, or completes a turn. Create a Slack app with a bot token
that has chat:write, invite the bot to the target channel, then set:
SLACK_BOT_TOKEN=xoxb-...
SLACK_CHANNEL_ID=C...
The CLI automatically creates a slack.default destination when both variables
are present. Optional environment variables for the env-only default:
ML_INTERN_SLACK_NOTIFICATIONS=false
ML_INTERN_SLACK_DESTINATION=slack.ops
ML_INTERN_SLACK_AUTO_EVENTS=approval_required,error,turn_complete
ML_INTERN_SLACK_ALLOW_AGENT_TOOL=true
ML_INTERN_SLACK_ALLOW_AUTO_EVENTS=true
For a persistent user-level config, put overrides in
~/.config/ml-intern/cli_agent_config.json or point ML_INTERN_CLI_CONFIG at a
JSON file:
{
"messaging": {
"enabled": true,
"auto_event_types": ["approval_required", "error", "turn_complete"],
"destinations": {
"slack.ops": {
"provider": "slack",
"token": "${SLACK_BOT_TOKEN}",
"channel": "${SLACK_CHANNEL_ID}",
"allow_agent_tool": true,
"allow_auto_events": true
}
}
}
}
Architecture
Component Overview
┌─────────────────────────────────────────────────────────────┐
│ User/CLI │
└────────────┬─────────────────────────────────────┬──────────┘
│ Operations │ Events
↓ (user_input, exec_approval, ↑
submission_queue interrupt, compact, ...) event_queue
│ │
↓ │
┌────────────────────────────────────────────────────┐ │
│ submission_loop (agent_loop.py) │ │
│ ┌──────────────────────────────────────────────┐ │ │
│ │ 1. Receive Operation from queue │ │ │
│ │ 2. Route to handler (run_agent/compact/...) │ │ │
│ └──────────────────────────────────────────────┘ │ │
│ ↓ │ │
│ ┌──────────────────────────────────────────────┐ │ │
│ │ Handlers.run_agent() │ ├──┤
│ │ │ │ │
│ │ ┌────────────────────────────────────────┐ │ │ │
│ │ │ Agentic Loop (max 300 iterations) │ │ │ │
│ │ │ │ │ │ │
│ │ │ ┌──────────────────────────────────┐ │ │ │ │
│ │ │ │ Session │ │ │ │ │
│ │ │ │ ┌────────────────────────────┐ │ │ │ │ │
│ │ │ │ │ ContextManager │ │ │ │ │ │
│ │ │ │ │ • Message history │ │ │ │ │ │
│ │ │ │ │ (litellm.Message[]) │ │ │ │ │ │
│ │ │ │ │ • Auto-compaction (170k) │ │ │ │ │ │
│ │ │ │ │ • Session upload to HF │ │ │ │ │ │
│ │ │ │ └────────────────────────────┘ │ │ │ │ │
│ │ │ │ │ │ │ │ │
│ │ │ │ ┌────────────────────────────┐ │ │ │ │ │
│ │ │ │ │ ToolRouter │ │ │ │ │ │
│ │ │ │ │ ├─ HF docs & research │ │ │ │ │ │
│ │ │ │ │ ├─ HF repos, datasets, │ │ │ │ │ │
│ │ │ │ │ │ jobs, papers │ │ │ │ │ │
│ │ │ │ │ ├─ GitHub code search │ │ │ │ │ │
│ │ │ │ │ ├─ Sandbox & local tools │ │ │ │ │ │
│ │ │ │ │ ├─ Planning │ │ │ │ │ │
│ │ │ │ │ └─ MCP server tools │ │ │ │ │ │
│ │ │ │ └────────────────────────────┘ │ │ │ │ │
│ │ │ └──────────────────────────────────┘ │ │ │ │
│ │ │ │ │ │ │
│ │ │ ┌──────────────────────────────────┐ │ │ │ │
│ │ │ │ Doom Loop Detector │ │ │ │ │
│ │ │ │ • Detects repeated tool patterns │ │ │ │ │
│ │ │ │ • Injects corrective prompts │ │ │ │ │
│ │ │ └──────────────────────────────────┘ │ │ │ │
│ │ │ │ │ │ │
│ │ │ Loop: │ │ │ │
│ │ │ 1. LLM call (litellm.acompletion) │ │ │ │
│ │ │ ↓ │ │ │ │
│ │ │ 2. Parse tool_calls[] │ │ │ │
│ │ │ ↓ │ │ │ │
│ │ │ 3. Approval check │ │ │ │
│ │ │ (jobs, sandbox, destructive ops) │ │ │ │
│ │ │ ↓ │ │ │ │
│ │ │ 4. Execute via ToolRouter │ │ │ │
│ │ │ ↓ │ │ │ │
│ │ │ 5. Add results to ContextManager │ │ │ │
│ │ │ ↓ │ │ │ │
│ │ │ 6. Repeat if tool_calls exist │ │ │ │
│ │ └────────────────────────────────────────┘ │ │ │
│ └──────────────────────────────────────────────┘ │ │
└────────────────────────────────────────────────────┴──┘
Agentic Loop Flow
User Message
↓
[Add to ContextManager]
↓
╔═══════════════════════════════════════════╗
║ Iteration Loop (max 300) ║
║ ║
║ Get messages + tool specs ║
║ ↓ ║
║ litellm.acompletion() ║
║ ↓ ║
║ Has tool_calls? ──No──> Done ║
║ │ ║
║ Yes ║
║ ↓ ║
║ Add assistant msg (with tool_calls) ║
║ ↓ ║
║ Doom loop check ║
║ ↓ ║
║ For each tool_call: ║
║ • Needs approval? ──Yes──> Wait for ║
║ │ user confirm ║
║ No ║
║ ↓ ║
║ • ToolRouter.execute_tool() ║
║ • Add result to ContextManager ║
║ ↓ ║
║ Continue loop ─────────────────┐ ║
║ ↑ │ ║
║ └───────────────────────┘ ║
╚═══════════════════════════════════════════╝
Events
The agent emits the following events via event_queue:
processing- Starting to process user inputready- Agent is ready for inputassistant_chunk- Streaming token chunkassistant_message- Complete LLM response textassistant_stream_end- Token stream finishedtool_call- Tool being called with argumentstool_output- Tool execution resulttool_log- Informational tool log messagetool_state_change- Tool execution state transitionapproval_required- Requesting user approval for sensitive operationsturn_complete- Agent finished processingerror- Error occurred during processinginterrupted- Agent was interruptedcompacted- Context was compactedundo_complete- Undo operation completedshutdown- Agent shutting down
Development
Adding Built-in Tools
Edit agent/core/tools.py:
def create_builtin_tools() -> list[ToolSpec]:
return [
ToolSpec(
name="your_tool",
description="What your tool does",
parameters={
"type": "object",
"properties": {
"param": {"type": "string", "description": "Parameter description"}
},
"required": ["param"]
},
handler=your_async_handler
),
# ... existing tools
]
Adding MCP Servers
Edit configs/cli_agent_config.json for CLI defaults, or
configs/frontend_agent_config.json for web-session defaults:
{
"model_name": "anthropic/claude-sonnet-4-5-20250929",
"mcpServers": {
"your-server-name": {
"transport": "http",
"url": "https://example.com/mcp",
"headers": {
"Authorization": "Bearer ${YOUR_TOKEN}"
}
}
}
}
Note: Environment variables like ${YOUR_TOKEN} are auto-substituted from .env.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file ml_intern-0.1.0.tar.gz.
File metadata
- Download URL: ml_intern-0.1.0.tar.gz
- Upload date:
- Size: 199.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"20.04","id":"focal","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
7275d3a37b17d123df7018dbf9441ceb72d1a2c617adb3931bfcd67b1b75e234
|
|
| MD5 |
0e75d26b0e317f0170946f45e479d9c9
|
|
| BLAKE2b-256 |
11629001008aa74f086e78b58735294fa3cea494df0a2aeb5cc3e1339cc15ab2
|
File details
Details for the file ml_intern-0.1.0-py3-none-any.whl.
File metadata
- Download URL: ml_intern-0.1.0-py3-none-any.whl
- Upload date:
- Size: 224.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"20.04","id":"focal","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9e70dc6e443bec3c6948a95e1d79acd2c7fe17e9df742dad5a9b43664eebc14c
|
|
| MD5 |
019801f3919b33fefa9dde1480bdc033
|
|
| BLAKE2b-256 |
c340817a5dcaf2b92f4b02e74b2ae12023080411f25d5f98811ebb10bff27804
|