
Moatless Tree Search

Code for the paper SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement.

Note: The original development code can be found at github.com/a-antoniades/swe-search; it is intended only for reproducing the results in the paper. This package is a clean refactor with a modular design, and it will be maintained and extended.


Method diagram (figure): Overview of SWE-Search showing the tree search process, where states (nodes) and actions (edges) are evaluated using contextual information and value function feedback to guide expansion.

Installation

Install the package:

pip install moatless-tree-search
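
As a quick, optional sanity check, the core modules used in the examples below should be importable once the install completes:

# These imports are taken from the examples further down; if they succeed, the package is installed
from moatless.search_tree import SearchTree
from moatless.agent import CodingAgent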

Environment Setup

Before running the evaluation, you'll need:

  1. At least one LLM provider API key (e.g., OpenAI, Anthropic, or DeepSeek)
  2. A Voyage AI API key from voyageai.com to use the pre-embedded vector stores for SWE-Bench instances
  3. (Optional) Access to a testbed environment; see moatless-testbeds for setup instructions

You can configure these settings in one of two ways:

  1. Create a .env file in the project root (copy from .env.example):

    cp .env.example .env
    # Edit .env with your values
    
  2. Export the variables directly:

    # Directory for storing vector index store files  
    export INDEX_STORE_DIR="/tmp/index_store"    
    
    # Directory for storing cloned repositories
    export REPO_DIR="/tmp/repos"
    
    # Required: At least one LLM provider API key
    export OPENAI_API_KEY="<your-key>"
    export ANTHROPIC_API_KEY="<your-key>"
    export HUGGINGFACE_API_KEY="<your-key>"
    export DEEPSEEK_API_KEY="<your-key>"
    
    # ...or a base URL and key for a custom LLM API service (optional)
    export CUSTOM_LLM_API_BASE="<your-base-url>"
    export CUSTOM_LLM_API_KEY="<your-key>"
    
    # Required: API Key for Voyage Embeddings
    export VOYAGE_API_KEY="<your-key>"
    
    # Optional: Configuration for testbed environment (https://github.com/aorwall/moatless-testbeds)
    export TESTBED_API_KEY="<your-key>"
    export TESTBED_BASE_URL="<your-base-url>"
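
If you use the .env approach, the values can also be read back in your own scripts instead of hardcoding paths as the examples below do. A minimal sketch, assuming the python-dotenv package is installed (it is not documented as a dependency of this project):

import os

from dotenv import load_dotenv  # assumption: python-dotenv is installed separately

# Read variables from a .env file in the current working directory
load_dotenv()

# Fall back to the default paths used throughout this README
index_store_dir = os.getenv("INDEX_STORE_DIR", "/tmp/index_store")
repo_base_dir = os.getenv("REPO_DIR", "/tmp/repos")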
    

Streamlit

To launch the Streamlit app, run:

# Launch with direct file loading
moatless-streamlit path/to/trajectory.json

# Launch interactive UI (file can be selected in browser)
moatless-streamlit

The following badges are used to indicate the status of a node:

  • Star (green): the node is marked as resolved
  • X (red): invalid edits or failed tests
  • 🟢 Circle (green): correct code spans are present in the context
  • 🟡 Circle (yellow): the relevant files were found but not the spans, or the spans were found in the wrong files

Evaluation

To run the evaluation script:

moatless-evaluate \
    --model "gpt-4o-mini" \
    --repo_base_dir /tmp/repos \
    --eval_dir "./evaluations" \
    --eval_name mts \
    --temp 0.7 \
    --num_workers 1 \
    --use_testbed \
    --feedback \
    --max_iterations 100 \
    --max_expansions 5

You can optionally pass --instance_ids to evaluate a specific instance or list of instances (for example, django__django-16379, the instance used in the examples below).

Pass --use_testbed if you have access to a testbed environment; otherwise, tests will not be run.

Examples

Example: Basic Flow

Basic setup similar to the moatless-tools agent.

from moatless.agent import CodingAgent
from moatless.agent.code_prompts import SIMPLE_CODE_PROMPT
from moatless.benchmark.swebench import create_repository
from moatless.benchmark.utils import get_moatless_instance
from moatless.completion import CompletionModel
from moatless.file_context import FileContext
from moatless.index import CodeIndex
from moatless.search_tree import SearchTree
from moatless.actions import FindClass, FindFunction, FindCodeSnippet, SemanticSearch, RequestMoreContext, RequestCodeChange, Finish, Reject

index_store_dir = "/tmp/index_store"
repo_base_dir = "/tmp/repos"
persist_path = "trajectory.json"

instance = get_moatless_instance("django__django-16379")

completion_model = CompletionModel(model="gpt-4o", temperature=0.0)

repository = create_repository(instance, repo_base_dir=repo_base_dir)

code_index = CodeIndex.from_index_name(
    instance["instance_id"], index_store_dir=index_store_dir, file_repo=repository
)

actions = [
    FindClass(code_index=code_index, repository=repository),
    FindFunction(code_index=code_index, repository=repository),
    FindCodeSnippet(code_index=code_index, repository=repository),
    SemanticSearch(code_index=code_index, repository=repository),
    RequestMoreContext(repository=repository),
    RequestCodeChange(repository=repository, completion_model=completion_model),
    Finish(),
    Reject()
]

file_context = FileContext(repo=repository)
agent = CodingAgent(actions=actions, completion=completion_model, system_prompt=SIMPLE_CODE_PROMPT)

search_tree = SearchTree.create(
    message=instance["problem_statement"],
    agent=agent,
    file_context=file_context,
    max_expansions=1,
    max_iterations=50,
    persist_path=persist_path,
)

node = search_tree.run_search()
print(node.observation.message)

Example: MCTS Flow

How to set up the evaluation flow with MCTS and testbeds.

from moatless.agent import CodingAgent
from moatless.benchmark.swebench import create_repository
from moatless.benchmark.utils import get_moatless_instance
from moatless.completion import CompletionModel
from moatless.discriminator import AgentDiscriminator
from moatless.feedback import FeedbackGenerator
from moatless.file_context import FileContext
from moatless.index import CodeIndex
from moatless.search_tree import SearchTree
from moatless.selector import BestFirstSelector
from moatless.actions import FindClass, FindFunction, FindCodeSnippet, SemanticSearch, RequestMoreContext, RequestCodeChange, Finish, Reject, RunTests
from moatless.value_function import ValueFunction
from testbeds.sdk import TestbedSDK
from moatless.runtime.testbed import TestbedEnvironment

index_store_dir = "/tmp/index_store"
repo_base_dir = "/tmp/repos"
persist_path = "trajectory.json"

instance = get_moatless_instance("django__django-16379")

completion_model = CompletionModel(model="gpt-4o-mini", temperature=0.7)

repository = create_repository(instance, repo_base_dir=repo_base_dir)

code_index = CodeIndex.from_index_name(
    instance["instance_id"], index_store_dir=index_store_dir, file_repo=repository
)

file_context = FileContext(repo=repository)

# Selector that decides which node in the tree to expand next (best-first search)
selector = BestFirstSelector()

# LLM-based value function used to score nodes in the search tree
value_function = ValueFunction(completion=completion_model)

# Discriminator that compares finished trajectories, using several agents over
# several rounds, to pick the best solution
discriminator = AgentDiscriminator(
    completion=completion_model,
    n_agents=5,
    n_rounds=3,
)

# Generates feedback that is passed to the agent during the search
feedback = FeedbackGenerator()

# Testbed runtime used to run the instance's tests (requires access to moatless-testbeds)
runtime = TestbedEnvironment(
    testbed_sdk=TestbedSDK(),
    repository=repository,
    instance=instance
)

actions = [
    FindClass(code_index=code_index, repository=repository),
    FindFunction(code_index=code_index, repository=repository),
    FindCodeSnippet(code_index=code_index, repository=repository),
    SemanticSearch(code_index=code_index, repository=repository),
    RequestMoreContext(repository=repository),
    RequestCodeChange(repository=repository, completion_model=completion_model),
    RunTests(code_index=code_index, repository=repository, runtime=runtime),
    Finish(),
    Reject()
]

agent = CodingAgent(actions=actions, completion=completion_model)

search_tree = SearchTree.create(
    message=instance["problem_statement"],
    agent=agent,
    file_context=file_context,
    selector=selector,
    value_function=value_function,
    discriminator=discriminator,
    feedback_generator=feedback,
    max_iterations=100,
    max_expansions=3,
    max_depth=25,
    persist_path=persist_path,
)

node = search_tree.run_search()
print(node.observation.message)

Citation

@misc{antoniades2024swesearchenhancingsoftwareagents,
      title={SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement}, 
      author={Antonis Antoniades and Albert Örwall and Kexun Zhang and Yuxi Xie and Anirudh Goyal and William Wang},
      year={2024},
      eprint={2410.20285},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2410.20285}, 
}
