
Moatless Tree Search

Code for the paper "SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement".

Note: The original development code can be found at github.com/a-antoniades/swe-search; it is intended only for reproducing the results in the paper. This repository is a clean refactor with a modular design, which will be maintained and extended.


Method Diagram: Overview of SWE-Search showing the tree search process, where states (nodes) and actions (edges) are evaluated using contextual information and value function feedback to guide expansion.

Installation

Install the package:

pip install moatless-tree-search

Environment Setup

Before running the evaluation, you'll need:

  1. At least one LLM provider API key (e.g., OpenAI, Anthropic, etc.)
  2. A Voyage AI API key from voyageai.com to use the pre-embedded vector stores for SWE-Bench instances.
  3. (Optional) Access to a testbed environment - see moatless-testbeds for setup instructions

You can configure these settings in one of two ways:

  1. Create a .env file in the project root (copy from .env.example):

    cp .env.example .env
    # Edit .env with your values
    
  2. Export the variables directly:

    # Directory for storing vector index store files  
    export INDEX_STORE_DIR="/tmp/index_store"    
    
    # Directory for storing cloned repositories
    export REPO_DIR="/tmp/repos"
    
    # Required: At least one LLM provider API key
    export OPENAI_API_KEY="<your-key>"
    export ANTHROPIC_API_KEY="<your-key>"
    export HUGGINGFACE_API_KEY="<your-key>"
    export DEEPSEEK_API_KEY="<your-key>"
    
    # ...or Base URL for custom LLM API service (optional)
    export CUSTOM_LLM_API_BASE="<your-base-url>"
    export CUSTOM_LLM_API_KEY="<your-key>"
    
    # Required: API Key for Voyage Embeddings
    export VOYAGE_API_KEY="<your-key>"
    
    # Optional: Configuration for testbed environment (https://github.com/aorwall/moatless-testbeds)
    export TESTBED_API_KEY="<your-key>"
    export TESTBED_BASE_URL="<your-base-url>"
    
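Before kicking off a run, it can help to confirm that the process actually sees these variables. A minimal sketch using only the Python standard library (the variable names match the exports above; adjust the lists to the providers you actually use):

import os

# Always required for the pre-embedded vector stores
required = ["VOYAGE_API_KEY"]
# At least one LLM provider key must be set
llm_keys = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "HUGGINGFACE_API_KEY",
            "DEEPSEEK_API_KEY", "CUSTOM_LLM_API_KEY"]

missing = [name for name in required if not os.getenv(name)]
if missing:
    raise RuntimeError(f"Missing required environment variables: {missing}")
if not any(os.getenv(name) for name in llm_keys):
    raise RuntimeError("Set at least one LLM provider API key")
print("Environment looks OK")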

Streamlit

To launch the Streamlit app, run:

# Launch with direct file loading
moatless-streamlit path/to/trajectory.json

# Launch interactive UI (file can be selected in browser)
moatless-streamlit

The following badges are used to indicate the status of a node:

  • Green star: the node is marked as resolved
  • Red X: invalid edits or failed tests
  • 🟢 Green circle: correct code spans are present in the context
  • 🟡 Yellow circle: either the right files were found but not the right spans, or spans were found but in the wrong files

Evaluation

To run the evaluation script:

moatless-evaluate \
    --model "gpt-4o-mini" \
    --repo_base_dir /tmp/repos \
    --eval_dir "./evaluations" \
    --eval_name mts \
    --temp 0.7 \
    --num_workers 1 \
    --use_testbed \
    --feedback \
    --max_iterations 100 \
    --max_expansions 5

You can optionally set --instance_ids to evaluate a specific instance or a list of instances (see the example below).

Use --use_testbed if you have access to a testbed environment; otherwise, tests will not be run.
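For example, to restrict the run above to the instance used in the examples below (the exact syntax --instance_ids expects for multiple IDs is an assumption here; check moatless-evaluate --help):

moatless-evaluate \
    --model "gpt-4o-mini" \
    --repo_base_dir /tmp/repos \
    --eval_dir "./evaluations" \
    --eval_name mts \
    --temp 0.7 \
    --num_workers 1 \
    --max_iterations 100 \
    --max_expansions 5 \
    --instance_ids "django__django-16379"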

Examples

Example: Basic Flow

A basic setup similar to the moatless-tools agent.

from moatless.agent import CodingAgent
from moatless.agent.code_prompts import SIMPLE_CODE_PROMPT
from moatless.benchmark.swebench import create_repository
from moatless.benchmark.utils import get_moatless_instance
from moatless.completion import CompletionModel
from moatless.file_context import FileContext
from moatless.index import CodeIndex
from moatless.search_tree import SearchTree
from moatless.actions import FindClass, FindFunction, FindCodeSnippet, SemanticSearch, RequestMoreContext, RequestCodeChange, Finish, Reject

index_store_dir = "/tmp/index_store"  # pre-built vector index files (INDEX_STORE_DIR)
repo_base_dir = "/tmp/repos"  # where repositories are cloned (REPO_DIR)
persist_path = "trajectory.json"  # optional: where the search trajectory is saved (see note below)

# Load a SWE-Bench instance by its ID
instance = get_moatless_instance("django__django-16379")

completion_model = CompletionModel(model="gpt-4o", temperature=0.0)

# Clone and check out the repository for the instance
repository = create_repository(instance, repo_base_dir=repo_base_dir)

# Load the pre-embedded vector store for this instance
code_index = CodeIndex.from_index_name(
    instance["instance_id"], index_store_dir=index_store_dir, file_repo=repository
)

actions = [
    FindClass(code_index=code_index, repository=repository),
    FindFunction(code_index=code_index, repository=repository),
    FindCodeSnippet(code_index=code_index, repository=repository),
    SemanticSearch(code_index=code_index, repository=repository),
    RequestMoreContext(repository=repository),
    RequestCodeChange(repository=repository, completion_model=completion_model),
    Finish(),
    Reject()
]

file_context = FileContext(repo=repository)
agent = CodingAgent(actions=actions, completion=completion_model, system_prompt=SIMPLE_CODE_PROMPT)

search_tree = SearchTree.create(
    message=instance["problem_statement"],
    agent=agent,
    file_context=file_context,
    max_expansions=1,
    max_iterations=50
)

node = search_tree.run_search()
print(node.observation.message)
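To keep the run for later inspection, the persist_path defined above (unused in this minimal flow) can be passed to SearchTree.create, as the MCTS example below does:

search_tree = SearchTree.create(
    message=instance["problem_statement"],
    agent=agent,
    file_context=file_context,
    max_expansions=1,
    max_iterations=50,
    persist_path=persist_path,  # writes the search trajectory to trajectory.json
)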

Example: MCTS Flow

How to set up the evaluation flow with MCTS and testbeds.

from moatless.agent import CodingAgent
from moatless.benchmark.swebench import create_repository
from moatless.benchmark.utils import get_moatless_instance
from moatless.completion import CompletionModel
from moatless.discriminator import AgentDiscriminator
from moatless.feedback import FeedbackGenerator
from moatless.file_context import FileContext
from moatless.index import CodeIndex
from moatless.search_tree import SearchTree
from moatless.selector import BestFirstSelector
from moatless.actions import FindClass, FindFunction, FindCodeSnippet, SemanticSearch, RequestMoreContext, RequestCodeChange, Finish, Reject, RunTests
from moatless.value_function import ValueFunction
from testbeds.sdk import TestbedSDK
from moatless.runtime.testbed import TestbedEnvironment

index_store_dir = "/tmp/index_store"
repo_base_dir = "/tmp/repos"
persist_path = "trajectory.json"

instance = get_moatless_instance("django__django-16379")

completion_model = CompletionModel(model="gpt-4o-mini", temperature=0.7)

repository = create_repository(instance, repo_base_dir=repo_base_dir)

code_index = CodeIndex.from_index_name(
    instance["instance_id"], index_store_dir=index_store_dir, file_repo=repository
)

file_context = FileContext(repo=repository)

# Best-first selection of which node to expand next
selector = BestFirstSelector()

# LLM-based value function used to score expanded nodes
value_function = ValueFunction(completion=completion_model)

# Discriminator that picks the final solution among finished trajectories
discriminator = AgentDiscriminator(
    completion=completion_model,
    n_agents=5,
    n_rounds=3,
)

# Generates feedback that guides subsequent expansions
feedback = FeedbackGenerator()

# Testbed runtime for executing tests (requires the TESTBED_* variables above)
runtime = TestbedEnvironment(
    testbed_sdk=TestbedSDK(),
    repository=repository,
    instance=instance
)

actions = [
    FindClass(code_index=code_index, repository=repository),
    FindFunction(code_index=code_index, repository=repository),
    FindCodeSnippet(code_index=code_index, repository=repository),
    SemanticSearch(code_index=code_index, repository=repository),
    RequestMoreContext(repository=repository),
    RequestCodeChange(repository=repository, completion_model=completion_model),
    RunTests(code_index=code_index, repository=repository, runtime=runtime),
    Finish(),
    Reject()
]

agent = CodingAgent(actions=actions, completion=completion_model)

search_tree = SearchTree.create(
    message=instance["problem_statement"],
    agent=agent,
    file_context=file_context,
    selector=selector,
    value_function=value_function,
    discriminator=discriminator,
    feedback_generator=feedback,
    max_iterations=100,
    max_expansions=3,
    max_depth=25,
    persist_path=persist_path,
)

node = search_tree.run_search()
print(node.observation.message)
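Because persist_path is set, the finished search tree is written to trajectory.json, which can then be opened in the Streamlit app described above:

moatless-streamlit trajectory.json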

Citation

@misc{antoniades2024swesearchenhancingsoftwareagents,
      title={SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement}, 
      author={Antonis Antoniades and Albert Örwall and Kexun Zhang and Yuxi Xie and Anirudh Goyal and William Wang},
      year={2024},
      eprint={2410.20285},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.20285}, 
}
