Moatless Tree Search
Code for paper SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement
Note: The original development code can be found at github.com/a-antoniades/swe-search. It is only intended for reproducing the results in the paper. This is a clean refactor with a modular design, which will be maintained and extended.
Overview of SWE-Search showing the tree search process, where states (nodes) and actions (edges) are evaluated using contextual information and value function feedback to guide expansion.
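The select, expand, evaluate and backpropagate loop described in the overview can be illustrated with a small toy example. This is a generic textbook UCT sketch, not the selection rule or node model used by this package: the `Node` class, the integer "state", the `evaluate` value function, and the reading of `max_expansions` as "maximum children per node" are all assumptions made for illustration.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Toy search-tree node (hypothetical; unrelated to the package's own classes)."""
    state: int
    parent: Optional["Node"] = None
    children: list = field(default_factory=list)
    visits: int = 0
    total_value: float = 0.0


def uct_score(node: Node, exploration: float = 1.41) -> float:
    """Standard UCT: mean value plus an exploration bonus for rarely visited nodes."""
    if node.visits == 0:
        return math.inf
    mean = node.total_value / node.visits
    bonus = exploration * math.sqrt(math.log(node.parent.visits) / node.visits)
    return mean + bonus


def select(root: Node, max_expansions: int) -> Node:
    """Descend through fully expanded nodes, following the best UCT score."""
    node = root
    while len(node.children) >= max_expansions:
        node = max(node.children, key=uct_score)
    return node


def expand(node: Node, rng: random.Random) -> Node:
    """Apply a random toy action (an edge) to create a new child state."""
    child = Node(state=node.state + rng.choice([-1, 1, 2]), parent=node)
    node.children.append(child)
    return child


def evaluate(node: Node, target: int) -> float:
    """Toy value function in (0, 1]: states closer to the target score higher."""
    return 1.0 / (1.0 + abs(target - node.state))


def backpropagate(node: Optional[Node], value: float) -> None:
    """Add the value to the node and every ancestor up to the root."""
    while node is not None:
        node.visits += 1
        node.total_value += value
        node = node.parent


def search(target: int = 5, max_iterations: int = 100,
           max_expansions: int = 3, seed: int = 0):
    """Run the loop and return the root and the best-valued node found."""
    rng = random.Random(seed)
    root = Node(state=0)
    best, best_value = root, evaluate(root, target)
    for _ in range(max_iterations):
        leaf = select(root, max_expansions)
        child = expand(leaf, rng)
        value = evaluate(child, target)
        backpropagate(child, value)
        if value > best_value:
            best, best_value = child, value
    return root, best


root, best = search()
print(f"root visits: {root.visits}, best state found: {best.state}")
```

In the real package the value comes from an LLM-based value function and the actions are code-search and edit operations; the toy only shows how value feedback steers which node is expanded next.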
Installation
Install the package:
pip install moatless-tree-search
Environment Setup
Before running the evaluation, you'll need:
- At least one LLM provider API key (e.g., OpenAI or Anthropic)
- A Voyage AI API key from voyageai.com to use the pre-embedded vector stores for SWE-Bench instances.
- (Optional) Access to a testbed environment - see moatless-testbeds for setup instructions
You can configure these settings by either:
- Create a .env file in the project root (copy from .env.example):
cp .env.example .env
# Edit .env with your values
- Or export the variables directly:
# Directory for storing vector index store files
export INDEX_STORE_DIR="/tmp/index_store"
# Directory for storing cloned repositories
export REPO_DIR="/tmp/repos"
# Required: At least one LLM provider API key
export OPENAI_API_KEY="<your-key>"
export ANTHROPIC_API_KEY="<your-key>"
export HUGGINGFACE_API_KEY="<your-key>"
export DEEPSEEK_API_KEY="<your-key>"
# ...or Base URL for custom LLM API service (optional)
export CUSTOM_LLM_API_BASE="<your-base-url>"
export CUSTOM_LLM_API_KEY="<your-key>"
# Required: API Key for Voyage Embeddings
export VOYAGE_API_KEY="<your-key>"
# Optional: Configuration for testbed environment (https://github.com/aorwall/moatless-testbeds)
export TESTBED_API_KEY="<your-key>"
export TESTBED_BASE_URL="<your-base-url>"
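Before starting a long run, it can help to check that the required variables are actually set. The following is a minimal sketch, not part of the package: the helper name `missing_settings` is hypothetical, and two of its rules are assumptions (that a custom base URL counts as an LLM provider, and that the two testbed variables should be set together).

```python
import os

# Variable names taken from the setup instructions above.
LLM_PROVIDER_VARS = (
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "HUGGINGFACE_API_KEY",
    "DEEPSEEK_API_KEY",
    "CUSTOM_LLM_API_BASE",  # assumption: a custom endpoint also counts as a provider
)
TESTBED_VARS = ("TESTBED_API_KEY", "TESTBED_BASE_URL")


def missing_settings(env) -> list:
    """Return human-readable problems found in a mapping of environment variables."""
    problems = []
    if not any(env.get(name) for name in LLM_PROVIDER_VARS):
        problems.append("no LLM provider API key or custom base URL is set")
    if not env.get("VOYAGE_API_KEY"):
        problems.append("VOYAGE_API_KEY is not set")
    # The testbed is optional; assumption: if one variable is set, both are needed.
    testbed_set = [bool(env.get(name)) for name in TESTBED_VARS]
    if any(testbed_set) and not all(testbed_set):
        problems.append("set both TESTBED_API_KEY and TESTBED_BASE_URL, or neither")
    return problems


for problem in missing_settings(os.environ):
    print(f"configuration problem: {problem}")
```

The function takes a plain mapping rather than reading `os.environ` directly, so it can also be run against values loaded from a .env file.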
Streamlit
To launch the Streamlit app, run:
# Launch with direct file loading
moatless-streamlit path/to/trajectory.json
# Launch interactive UI (file can be selected in browser)
moatless-streamlit
The following badges are used to indicate the status of a node:
Badge | Shape | Color | Description |
---|---|---|---|
⭐ | Star | Green | Node is marked as resolved |
❌ | X | Red | Invalid edits or failed tests |
🟢 | Circle | Green | Correct code spans present in the context |
🟡 | Circle | Yellow | Either found files but not spans, or found spans but in wrong files |
Evaluation
To run the evaluation script:
moatless-evaluate \
--model "gpt-4o-mini" \
--repo_base_dir /tmp/repos \
--eval_dir "./evaluations" \
--eval_name mts \
--temp 0.7 \
--num_workers 1 \
--use_testbed \
--feedback \
--max_iterations 100 \
--max_expansions 5
You can optionally set --instance_ids to evaluate a specific instance or a list of instances.
Use --use_testbed if you have access to a testbed environment. Otherwise, tests will not be run.
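When running several evaluations with different settings, the command can be assembled programmatically. This is a sketch only: `build_eval_command` is a hypothetical helper, and it uses just the flags shown in the command above. `--instance_ids` is left out because the expected value format is not described here.

```python
import shlex


def build_eval_command(model, eval_name, *, repo_base_dir="/tmp/repos",
                       eval_dir="./evaluations", temp=0.7, num_workers=1,
                       max_iterations=100, max_expansions=5,
                       use_testbed=False, feedback=False):
    """Build the moatless-evaluate argument list from the flags shown above."""
    cmd = [
        "moatless-evaluate",
        "--model", model,
        "--repo_base_dir", repo_base_dir,
        "--eval_dir", eval_dir,
        "--eval_name", eval_name,
        "--temp", str(temp),
        "--num_workers", str(num_workers),
        "--max_iterations", str(max_iterations),
        "--max_expansions", str(max_expansions),
    ]
    # Boolean switches are only appended when enabled.
    if use_testbed:
        cmd.append("--use_testbed")
    if feedback:
        cmd.append("--feedback")
    return cmd


cmd = build_eval_command("gpt-4o-mini", "mts", use_testbed=True, feedback=True)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting issues; `shlex.join` is used only to print a copy-pasteable version.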
Examples
Example: Basic Flow
Basic setup similar to the moatless-tools agent.
from moatless.agent import CodingAgent
from moatless.agent.code_prompts import SIMPLE_CODE_PROMPT
from moatless.benchmark.swebench import create_repository
from moatless.benchmark.utils import get_moatless_instance
from moatless.completion import CompletionModel
from moatless.file_context import FileContext
from moatless.index import CodeIndex
from moatless.search_tree import SearchTree
from moatless.actions import FindClass, FindFunction, FindCodeSnippet, SemanticSearch, RequestMoreContext, RequestCodeChange, Finish, Reject
index_store_dir = "/tmp/index_store"
repo_base_dir = "/tmp/repos"
persist_path = "trajectory.json"
instance = get_moatless_instance("django__django-16379")
completion_model = CompletionModel(model="gpt-4o", temperature=0.0)
repository = create_repository(instance)
code_index = CodeIndex.from_index_name(
    instance["instance_id"], index_store_dir=index_store_dir, file_repo=repository
)
actions = [
    FindClass(code_index=code_index, repository=repository),
    FindFunction(code_index=code_index, repository=repository),
    FindCodeSnippet(code_index=code_index, repository=repository),
    SemanticSearch(code_index=code_index, repository=repository),
    RequestMoreContext(repository=repository),
    RequestCodeChange(repository=repository, completion_model=completion_model),
    Finish(),
    Reject()
]
file_context = FileContext(repo=repository)
agent = CodingAgent(actions=actions, completion=completion_model, system_prompt=SIMPLE_CODE_PROMPT)
search_tree = SearchTree.create(
    message=instance["problem_statement"],
    agent=agent,
    file_context=file_context,
    max_expansions=1,
    max_iterations=50
)
node = search_tree.run_search()
print(node.observation.message)
Example: MCTS Flow
How to set up the evaluation flow with MCTS and testbeds.
from moatless.agent import CodingAgent
from moatless.benchmark.swebench import create_repository
from moatless.benchmark.utils import get_moatless_instance
from moatless.completion import CompletionModel
from moatless.discriminator import AgentDiscriminator
from moatless.feedback import FeedbackGenerator
from moatless.file_context import FileContext
from moatless.index import CodeIndex
from moatless.search_tree import SearchTree
from moatless.selector import BestFirstSelector
from moatless.actions import FindClass, FindFunction, FindCodeSnippet, SemanticSearch, RequestMoreContext, RequestCodeChange, Finish, Reject, RunTests
from moatless.value_function import ValueFunction
from testbeds.sdk import TestbedSDK
from moatless.runtime.testbed import TestbedEnvironment
index_store_dir = "/tmp/index_store"
repo_base_dir = "/tmp/repos"
persist_path = "trajectory.json"
instance = get_moatless_instance("django__django-16379")
completion_model = CompletionModel(model="gpt-4o-mini", temperature=0.7)
repository = create_repository(instance, repo_base_dir=repo_base_dir)
code_index = CodeIndex.from_index_name(
    instance["instance_id"], index_store_dir=index_store_dir, file_repo=repository
)
file_context = FileContext(repo=repository)
selector = BestFirstSelector()
value_function = ValueFunction(completion=completion_model)
discriminator = AgentDiscriminator(
    completion=completion_model,
    n_agents=5,
    n_rounds=3,
)
feedback = FeedbackGenerator()
runtime = TestbedEnvironment(
    testbed_sdk=TestbedSDK(),
    repository=repository,
    instance=instance
)
actions = [
    FindClass(code_index=code_index, repository=repository),
    FindFunction(code_index=code_index, repository=repository),
    FindCodeSnippet(code_index=code_index, repository=repository),
    SemanticSearch(code_index=code_index, repository=repository),
    RequestMoreContext(repository=repository),
    RequestCodeChange(repository=repository, completion_model=completion_model),
    RunTests(code_index=code_index, repository=repository, runtime=runtime),
    Finish(),
    Reject()
]
agent = CodingAgent(actions=actions, completion=completion_model)
search_tree = SearchTree.create(
    message=instance["problem_statement"],
    agent=agent,
    file_context=file_context,
    selector=selector,
    value_function=value_function,
    discriminator=discriminator,
    feedback_generator=feedback,
    max_iterations=100,
    max_expansions=3,
    max_depth=25,
    persist_path=persist_path,
)
node = search_tree.run_search()
print(node.observation.message)
Citation
@misc{antoniades2024swesearchenhancingsoftwareagents,
title={SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement},
author={Antonis Antoniades and Albert Örwall and Kexun Zhang and Yuxi Xie and Anirudh Goyal and William Wang},
year={2024},
eprint={2410.20285},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2410.20285},
}