Build smarter AI agents with tree search

🌳 Saplings

Saplings lets you build agents that reason using tree search.

Think of this as tree-of-thoughts meets tool use. With tree search, an agent can explore and evaluate different tool-use trajectories before choosing the optimal path. This ability to look multiple steps ahead and backtrack reduces mistakes and boosts reasoning compared to traditional CoT/ReAct-style agents.

Agents that use tree search achieve SOTA on tasks like:

Coding: 92.7% on HumanEval using MCTS
Q&A/RAG: 63% on HotPotQA using MCTS
Web Navigation: 26.4% on VisualWebArena using A*

Features:

Plug-and-play. Build a smarter agent with just a couple lines of code.
Supports popular search algorithms: Monte Carlo Tree Search (MCTS), A*, and greedy best-first search.
Uses function calling under the hood.
Customize the value function, prompts, search parameters, etc.
Supports 100+ LLMs (via LiteLLM).

Installation
Quickstart
- Creating a tool
- Configuring an agent
Docs
Roadmap

Installation

$ pip install saplings

Quickstart

Below is a simple agent implementing Monte Carlo tree search (MCTS). It's equipped with a multiplication tool to solve tricky arithmetic problems.

from saplings.examples import MultiplicationTool
from saplings import MonteCarloAgent, Evaluator, Model

model = Model(model="openai/gpt-4o") # Wraps LiteLLM
evaluator = Evaluator(model)
tools = [MultiplicationTool()]

agent = MonteCarloAgent(tools, model, evaluator)
messages, _, _ = agent.run("Let x = 9418.343 * 8.11 and y = 2x. Calculate (xy)(x^2).")

This is the "bare minimum" for setting up a search agent with saplings: just a few lines of code. There are many more parameters you can control, all covered in the docs. But let's first walk through the basics of creating your own tools and configuring an agent.

Creating a tool

Tools are what your agent will use to perform a task or answer a query. Each tool must extend the Tool base class and implement a few variables and methods. Here's an example of a simple tool that multiplies two numbers together:

from saplings.abstract import Tool

class MultiplicationTool(Tool):
   def __init__(self, **kwargs):
      self.name = "multiply"
      self.description = "Multiplies two numbers and returns the result number."
      self.parameters = {
         "type": "object",
         "properties": {
            "a": {
               "type": "number",
               "description": "The number to multiply."
            },
            "b": {
               "type": "number",
               "description": "The number to multiply by."
            }
         },
         "required": ["a", "b"],
         "additionalProperties": False
      }
      self.is_terminal = False

   async def run(self, a, b, **kwargs):
      return a * b

Variables:

The instance variables in the class tell the agent when and how to call the tool. If you've used OpenAI function calling before, most of this should be familiar to you.

name (str): Name of the tool.
description (str): Description of what the tool does and when to call it.
parameters (dict): Arguments for the tool as a JSON schema.
is_terminal (bool): If True, calling this tool terminates a search trajectory, meaning no subsequent tools can be called after it. This is typically used for tools that generate a final output for the user (e.g. an answer to a question). More on this under Termination conditions.

run() method:

This is what actually executes the tool when the agent calls it. Arguments should be the same as the input parameters in the tool schema.
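To illustrate the is_terminal flag, here is a minimal sketch of a terminal tool that hands a final answer back to the user. The name FinalAnswerTool and its answer parameter are hypothetical. The class is written without the Tool base class so that it runs without saplings installed; in a real project it would extend saplings.abstract.Tool like the multiplication tool above.

```python
import asyncio


class FinalAnswerTool:  # hypothetical; would extend saplings.abstract.Tool in practice
    def __init__(self, **kwargs):
        self.name = "final_answer"
        self.description = "Returns the final answer to the user. Call this last."
        self.parameters = {
            "type": "object",
            "properties": {
                "answer": {
                    "type": "string",
                    "description": "The final answer to show the user.",
                }
            },
            "required": ["answer"],
            "additionalProperties": False,
        }
        self.is_terminal = True  # calling this tool ends the search trajectory

    async def run(self, answer, **kwargs):
        # Argument names mirror the keys under "properties" in the schema.
        return answer


tool = FinalAnswerTool()
result = asyncio.run(tool.run(answer="42"))
print(result)  # 42
```

Note that the run() signature uses the same argument name (answer) as the schema, which is what lets the agent map a generated tool call onto the method.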

Advanced options:

There are additional things you can do with tools, such as accessing the agent's memory during tool execution, or controlling how tool output is shown to the model (vs. how it's stored in memory). You can read about these options under Advanced tool options.

Configuring an agent

Choosing a model:

Saplings wraps LiteLLM to provide access to 100+ LLMs. Choose a model from their list of supported providers and create a Model object with it:

from saplings import Model

model = Model("openai/gpt-4o")

Note: any additional kwargs will be passed down to all the LiteLLM completion calls.

Setting up the evaluator:

This is what will guide the search process. The evaluator takes a search trajectory (i.e. a list of OpenAI-style messages) and returns a score between 0 and 1, indicating how promising the trajectory is. By default, a score of 1.0 means the agent has solved the problem and can terminate the search. You can change the solution cutoff by setting the threshold parameter in the agent, described under Parameters.

from saplings import Evaluator

evaluator = Evaluator(model)

The default evaluator provided by saplings uses an LLM (i.e. the model you pass in above) to score trajectories. The Evaluator object has parameters that let you control things like the system prompt used and the sampling rate. You can also define your own custom evaluator if necessary. Read more about evaluators under Custom evaluators.

Choosing an agent/search algorithm:

Once your tools, model, and evaluator are ready, you can simply plug them into a saplings agent. There are several to choose from, each implementing its own tree search algorithm: MonteCarloAgent, AStarAgent, and GreedyAgent. There's also a regular chain-of-thought agent available, COTAgent, which does not implement any search. Each agent has its own advantages and disadvantages, which you can read about under Agents.

from saplings import MonteCarloAgent

agent = MonteCarloAgent(tools, model, evaluator)

This will initialize your agent. To actually run it on an input, call the run method. To run it asynchronously, call the run_async method.

messages, score, is_solution = agent.run("What's 2 * 2?") # await agent.run_async("What's 2 * 2?")

The output is a list of messages representing the best tool-use trajectory, the final score of the trajectory (as given by the evaluator), and whether or not the search terminated because the evaluator deemed the trajectory a solution to the prompt. The messages are Message objects, which are special objects native to saplings that wrap OpenAI messages.

Notably, there are many more parameters you can set for the agent, such as the system prompt that governs it.

Docs

Agents

Parameters

Every agent in saplings has the same parameters, listed below:

tools (List[Tool]): List of tools your agent can use.
model (Model): LLM provider that your agent will use to call tools.
evaluator (BaseEvaluator): Evaluation function that the agent will use to guide the search process.
prompt (str): System prompt for the agent.
b_factor (int): Branching factor, i.e. the number of potential next tool calls to evaluate at each step in a search trajectory. Note that this parameter does not do anything for COTAgent.
max_depth (int): Maximum depth of the search tree, indicating how many levels the agent can explore.
threshold (float): A cutoff value for the evaluation function. If a trajectory's evaluation score is at or above this threshold, the search will terminate and that trajectory will be accepted as the solution.
verbose (bool): Whether to print logging statements when you run the agent.
tool_choice ("auto" | "required"): Same as the tool_choice parameter in the OpenAI chat completions function. Indicates whether the model must always call a tool, or if it can decide to generate a normal response instead.
parallel_tool_calls (bool): Same as the parallel_tool_calls parameter in the OpenAI chat completions function. Indicates whether the model can generate multiple tool calls in a single completion request.

GreedyAgent: Greedy best-first search

This agent implements a greedy best-first search. It's the fastest and cheapest search agent, in terms of LLM calls, but it's also incapable of backtracking, thus making it the least effective agent. GreedyAgent works by taking the input and generating a set of candidate tool calls. It executes each tool call and evaluates their outputs. Then, it picks the best tool call based on its evaluation and generates a set of candidate next tool calls. It repeats this process until a termination condition is met.
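The loop described above can be sketched in plain Python. This is an illustration of greedy best-first search on a toy tree, not the library's implementation: the expand and score callables stand in for "generate candidate tool calls" and "evaluate their outputs", and the TREE and SCORES data are made up.

```python
from typing import Callable, List


def greedy_search(
    root: str,
    expand: Callable[[str], List[str]],
    score: Callable[[str], float],
    max_depth: int,
) -> List[str]:
    """Follow the best-scoring child at each step; never revisit a sibling."""
    path = [root]
    for _ in range(max_depth):
        candidates = expand(path[-1])
        if not candidates:  # nothing left to try, so the trajectory ends
            break
        path.append(max(candidates, key=score))
    return path


# Toy tree (node -> children) with a hand-written score per node.
TREE = {"start": ["a", "b"], "a": ["a1"], "b": ["b1"], "a1": [], "b1": []}
SCORES = {"start": 0.0, "a": 0.6, "b": 0.5, "a1": 0.2, "b1": 0.9}

path = greedy_search("start", lambda n: TREE[n], lambda n: SCORES[n], max_depth=3)
print(path)  # ['start', 'a', 'a1']
```

The search commits to "a" because it scores higher than "b" at the first step, and so never discovers the better leaf "b1". That is the cost of not being able to backtrack.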

MonteCarloAgent: Monte Carlo tree search

This agent implements the Monte Carlo tree search (MCTS) algorithm, based on the paper Language Agent Tree Search (Zhou et al.). It is the most effective agent you can build with saplings, but also the slowest and most expensive (in terms of LLM calls) in the worst case. The primary advantage of this agent is its ability to balance exploration and exploitation, allowing it to efficiently find optimal trajectories by using past experiences and adjusting its strategy accordingly.

Note that, besides the parameters listed above, this agent has one additional parameter:

max_rollouts (int, default = 10): This controls the maximum number of simulations the agent can perform.
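One common way MCTS balances exploration and exploitation is the UCT selection rule, sketched below. Whether saplings uses this exact formula is an assumption; the NodeStats class, the uct function, and the exploration constant c are hypothetical names used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeStats:
    name: str
    visits: int
    total_value: float  # sum of evaluator scores backed up through this node


def uct(node: NodeStats, parent_visits: int, c: float = 1.0) -> float:
    """Average value so far (exploitation) plus a bonus for rarely visited nodes."""
    if node.visits == 0:
        return math.inf  # always try unvisited children first
    exploit = node.total_value / node.visits
    explore = c * math.sqrt(math.log(parent_visits) / node.visits)
    return exploit + explore


children = [
    NodeStats("a", visits=8, total_value=5.6),  # average 0.7, well explored
    NodeStats("b", visits=2, total_value=1.2),  # average 0.6, barely explored
    NodeStats("c", visits=0, total_value=0.0),  # never visited
]
parent_visits = 10

chosen = max(children, key=lambda n: uct(n, parent_visits))
print(chosen.name)  # c
```

Once every child has been visited, the bonus term still favors "b" over "a" here even though "a" has the higher average, because "b" has been tried far fewer times. Each rollout updates these statistics, which is how the search adjusts its strategy based on what it has already explored.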

AStarAgent: A* search

Implements a variation of the A* pathfinding algorithm, based on the paper Tree Search for Language Model Agents (Koh et al.). Compared to GreedyAgent, this agent makes more LLM calls in the worst case, but it can backtrack and recover from mistakes. However, unlike MonteCarloAgent, it does not update its search strategy based on the trajectories it has already explored. AStarAgent is often a good middle ground between GreedyAgent (dumb but fast) and MonteCarloAgent (smart but slow).
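Backtracking can be illustrated with a best-first search over a single global priority queue. This is a simplified sketch of the general idea rather than the library's implementation: expand, score, and the toy TREE and SCORES data are hypothetical stand-ins for tool-call generation and evaluation.

```python
import heapq
from itertools import count
from typing import Callable, List


def best_first_search(
    root: str,
    expand: Callable[[str], List[str]],
    score: Callable[[str], float],
    threshold: float,
    max_depth: int,
) -> List[str]:
    """Always expand the highest-scoring node anywhere in the tree."""
    tie = count()  # tie-breaker so heapq never has to compare paths
    frontier = [(-score(root), next(tie), [root])]
    best = [root]
    while frontier:
        _, _, path = heapq.heappop(frontier)
        if score(path[-1]) > score(best[-1]):
            best = path
        if score(path[-1]) >= threshold:  # good enough: stop the search
            return path
        if len(path) - 1 >= max_depth:  # depth limit reached on this branch
            continue
        for child in expand(path[-1]):
            heapq.heappush(frontier, (-score(child), next(tie), path + [child]))
    return best  # no solution found: fall back to the best node seen


# Toy tree (node -> children) with a hand-written score per node.
TREE = {"start": ["a", "b"], "a": ["a1"], "b": ["b1"], "a1": [], "b1": []}
SCORES = {"start": 0.0, "a": 0.6, "b": 0.5, "a1": 0.2, "b1": 0.9}

path = best_first_search(
    "start", lambda n: TREE[n], lambda n: SCORES[n], threshold=0.9, max_depth=3
)
print(path)  # ['start', 'b', 'b1']
```

After expanding "a" and finding that its child "a1" scores poorly, the search returns to the sibling "b" that is still waiting in the queue and reaches "b1". A greedy search on the same tree would have stopped at "a1".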

COTAgent: Chain-of-thought (no search)

This is a standard tool-calling agent and does not implement any search. It takes an input, calls a tool, then uses the tool output to inform the next tool call, and so on until a termination condition is met. Think of COTAgent as a baseline to compare your search agents to.

The `Message` object

Messages are a core data structure in saplings. They are essentially equivalent to OpenAI messages (e.g. user input, tool calls, tool responses, assistant responses), with a few extra properties and helper methods. A list of messages represents a search trajectory. When you run an agent, it will return a list of messages representing the best trajectory it found.

Saplings messages can be easily converted into OpenAI-style messages using the to_openai_message() method.

messages, _, _ = agent.run("This is my prompt!")
messages = [message.to_openai_message() for message in messages]

print(messages)
# [{"role": "user", "content": "This is my prompt!"}, ..., {"role": "assistant", "content": "This is a response!"}]

Message objects have only one additional attribute that OpenAI messages don't have. If a message represents a tool response, it will have a raw_output property that contains the output of that tool. What's stored here may be different than the tool response that gets shown to the model, which is stored in the content property.

Termination conditions

Every tool has an is_terminal property. This is a boolean flag that tells the agent if calling the tool should terminate a search trajectory. If it's True, no subsequent tool calls can be made after the tool is invoked, and the agent will terminate that search trajectory. Terminal tools are typically used to generate some sort of final output for the user (e.g. an answer to a question).

We say that an agent can self-terminate if it has at least one terminal tool, OR if the tool_choice parameter is set to "auto". In the latter case, calling a tool is optional for the agent, and instead of a tool call, it can generate a regular assistant response to the input prompt. We consider such a response to also terminate a search trajectory.

If an agent cannot self-terminate, then a search trajectory will only ever terminate if either the maximum depth is reached (set by the max_depth parameter), or the evaluator marks a trajectory as solved (i.e. the score is >= the agent's threshold parameter), in which case the entire search itself terminates.

An important point of confusion here: even if an evaluator marks a trajectory as solved, the search may not terminate if the agent can self-terminate. This happens when a trajectory ends with a non-terminal tool call (or a non-assistant response, in the case when tool use is optional) but is still given a score above the solution threshold. In this case, the search will continue until a terminal state is reached that is marked as solved. If no terminal state is ever reached, the trajectory with the best score is returned. If no solution is ever found, and there is one trajectory with a terminal state and another with a non-terminal state but a higher score, the terminal trajectory is preferred and returned.
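The selection rules in the paragraph above, for an agent that can self-terminate, can be summarized in a short sketch. This models only which trajectory is returned once the search has ended, not the search itself; the Candidate class and pick_result function are hypothetical and are not part of saplings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    score: float
    is_terminal: bool  # ended with a terminal tool call or an assistant response


def pick_result(candidates: List[Candidate], threshold: float = 1.0) -> Candidate:
    terminal = [c for c in candidates if c.is_terminal]
    solved = [c for c in terminal if c.score >= threshold]
    if solved:  # a terminal state marked as solved wins outright
        return max(solved, key=lambda c: c.score)
    if terminal:  # no solution: prefer terminal states over higher-scoring non-terminal ones
        return max(terminal, key=lambda c: c.score)
    return max(candidates, key=lambda c: c.score)  # no terminal state was ever reached


candidates = [
    Candidate("non_terminal", score=0.9, is_terminal=False),
    Candidate("terminal", score=0.6, is_terminal=True),
]
print(pick_result(candidates).name)  # terminal
```

A high score on a non-terminal trajectory is not enough on its own: in this sketch, only a terminal candidate can be returned as a solution.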

Advanced tool options

Accessing agent memory

In some cases, running your tool may depend on the output of the previous tools your agent has used, or the user input itself. If this is the case, you can access the agent's current search trajectory in the run method when you implement your tool. Simply use kwargs.get("trajectory"). This will return a list of Message objects, which are wrappers around OpenAI messages.
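As a rough sketch of this pattern, the hypothetical tool below adds up the numeric outputs of every earlier tool call by reading raw_output from each message in the trajectory. The class is written without the Tool base class, and the messages are simple stand-ins, so that the example runs without saplings installed; in practice the class would extend saplings.abstract.Tool and receive real Message objects.

```python
import asyncio
from types import SimpleNamespace


class SumPreviousResultsTool:  # hypothetical; would extend saplings.abstract.Tool
    def __init__(self, **kwargs):
        self.name = "sum_previous_results"
        self.description = "Adds up the numeric outputs of all earlier tool calls."
        self.parameters = {
            "type": "object",
            "properties": {},
            "required": [],
            "additionalProperties": False,
        }
        self.is_terminal = False

    async def run(self, **kwargs):
        # The agent's current search trajectory, passed in as a keyword argument.
        trajectory = kwargs.get("trajectory") or []
        # Only tool responses carry a raw_output; other messages are skipped.
        outputs = [getattr(m, "raw_output", None) for m in trajectory]
        return sum(o for o in outputs if isinstance(o, (int, float)))


# Stand-in messages that expose the same attributes as a Message object.
trajectory = [
    SimpleNamespace(content="Multiply some numbers.", raw_output=None),
    SimpleNamespace(content="2 * 3 = 6", raw_output=6),
    SimpleNamespace(content="6 * 6 = 36", raw_output=36),
]

total = asyncio.run(SumPreviousResultsTool().run(trajectory=trajectory))
print(total)  # 42
```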

Reformatting tool output

In some cases, it makes sense for the raw output of a tool to be separated from the output that's shown to the model. By default, the output of run() is what's shown to the model. But you can add the optional format_output method to your tool class to change how the output is presented to the agent. For example, in our quickstart example, instead of seeing the multiplication result N, you might want the model to see "A * B = N" so the agent can more easily keep track of what numbers have been multiplied. Here's how you'd modify the tool to do that:

from saplings.abstract import Tool

class MultiplicationTool(Tool):
   ...

   async def run(self, a, b, **kwargs):
      return {"a": a, "b": b, "result": a * b}

   def format_output(self, output):
      a, b = output['a'], output['b']
      result = output['result']
      return f"{a} * {b} = {result}"

The unformatted output of the tool is still stored in the agent's memory. It can be accessed via the raw_output property of the Message object that represents the tool response.

Custom evaluators

Every agent implements a heuristic search algorithm, meaning that it uses some heuristic or value function to guide the search. By default, saplings offers the Evaluator object, which evaluates a search trajectory using an LLM. It takes a trajectory (i.e. a list of OpenAI messages) as input and returns a score between 0 and 1 which tells the agent if it's on the right track or not, along with some written justification for the score.

The Evaluator object has the following parameters:

model (Model): The LLM used to generate the score.
n_samples (int): The number of scores to generate for a given trajectory. Equivalent to the n parameter in an OpenAI chat completion. If it's greater than 1, multiple candidate scores will be generated for a given trajectory and then averaged to return the final score. Making this greater than 1 is equivalent to enabling self-consistency in the evaluation process.
prompt (str): The system prompt that tells the model how it should evaluate a trajectory and generate a score.
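The averaging behind n_samples can be sketched in a few lines. This is an illustration only: sample_score is a hypothetical stand-in for one LLM-generated score, and the fixed values below replace real model calls.

```python
from statistics import mean
from typing import Callable, List


def averaged_score(sample_score: Callable[[], float], n_samples: int) -> float:
    """Draw n_samples candidate scores and return their average."""
    scores: List[float] = [sample_score() for _ in range(n_samples)]
    return mean(scores)


fake_samples = iter([0.5, 1.0, 0.75])  # stand-ins for three sampled scores
final = averaged_score(lambda: next(fake_samples), n_samples=3)
print(final)  # 0.75
```

Averaging several samples smooths out a single noisy judgment, at the cost of generating more completions per evaluation.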

In most cases, simply customizing this object will be sufficient, but in some situations it makes sense to build your own evaluator. For example, if you're building a coding agent, you may want to evaluate a search trajectory using some external feedback, such as whether the code compiles or whether a set of unit tests are passing. To build a custom evaluator, you must extend the Evaluator base class and implement a run method. This method must take in a list of Message objects as input, representing a search trajectory, and return an Evaluation object as output. This object has two properties: score (a value between 0 and 1) and reasoning (an optional string with written justification for the score).

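from typing import List

from saplings.abstract import Evaluator
from saplings.dtos import Evaluation, Message  # Message is assumed to live alongside Evaluation

class CustomEvaluator(Evaluator):
   def __init__(self):
      pass

   async def run(self, trajectory: List[Message]) -> Evaluation:
      # Score the trajectory here (e.g. by running unit tests) and justify the score
      return Evaluation(score=1.0, reasoning="Justification goes here.")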

Note that the trajectory will always contain the original input message, every tool call, and every tool response. For the tool responses, you can access the raw output of the tool using the Message.raw_output property, discussed in more detail under The Message object.

Each agent has a threshold parameter, which determines the minimum score at which to terminate the search and deem a trajectory as a solution. By default, it is 1.0, so you should keep this in mind when designing your evaluator.
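To make the external-feedback idea concrete, here is a hedged sketch of an evaluator that scores a trajectory by the fraction of checks its latest tool output passes. UnitTestEvaluator and its checks argument are hypothetical. The Evaluation dataclass is a stand-in for saplings.dtos.Evaluation, and the class does not extend the Evaluator base class, so that the example runs without saplings installed.

```python
import asyncio
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Callable, List, Optional


@dataclass
class Evaluation:  # stand-in for saplings.dtos.Evaluation
    score: float
    reasoning: Optional[str] = None


class UnitTestEvaluator:  # hypothetical; would extend saplings.abstract.Evaluator
    def __init__(self, checks: List[Callable[[object], bool]]):
        self.checks = checks

    async def run(self, trajectory) -> Evaluation:
        # Collect the raw outputs of tool responses; other messages have none.
        outputs = [getattr(m, "raw_output", None) for m in trajectory]
        outputs = [o for o in outputs if o is not None]
        if not outputs or not self.checks:
            return Evaluation(score=0.0, reasoning="No tool output to check yet.")
        # Score the most recent tool output by the fraction of checks it passes.
        passed = sum(1 for check in self.checks if check(outputs[-1]))
        return Evaluation(
            score=passed / len(self.checks),
            reasoning=f"{passed}/{len(self.checks)} checks passed.",
        )


checks = [lambda out: isinstance(out, int), lambda out: out == 4]
trajectory = [
    SimpleNamespace(content="What's 2 * 2?", raw_output=None),
    SimpleNamespace(content="4", raw_output=4),
]

evaluation = asyncio.run(UnitTestEvaluator(checks).run(trajectory))
print(evaluation.score, evaluation.reasoning)  # 1.0 2/2 checks passed.
```

With the default threshold of 1.0, only a trajectory that passes every check counts as solved. An evaluator like this that is meant to accept partial passes would need a lower threshold on the agent.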

Roadmap

Support for chat history
Support for vision agents
Add an llm_call_budget parameter to every agent

In general, as inference gets cheaper and faster, it will become table stakes for agents to use search.

Note from the author

One of my other open-source packages used to be called saplings. It has since been renamed to syntaxis and is now associated with the package of the same name on PyPI.

Release history

6.2.0 (this version): Jun 22, 2025
6.1.0: Jun 4, 2025
6.0.3: Apr 6, 2025
6.0.2: Mar 29, 2025
6.0.1: Mar 9, 2025
6.0.0: Mar 5, 2025
5.0.9: Feb 23, 2025
5.0.8: Feb 18, 2025
5.0.7: Feb 17, 2025
5.0.6: Feb 17, 2025
5.0.5: Jan 19, 2025
5.0.4: Jan 13, 2025
5.0.3: Jan 13, 2025
5.0.2: Nov 25, 2024
5.0.1: Nov 22, 2024
5.0.0: Nov 20, 2024
4.3.1: Nov 3, 2022
4.2.1: Jun 3, 2022
4.2.0: May 30, 2022
4.1.1: Dec 24, 2020
4.1.0: Dec 24, 2020
4.0.3: Dec 22, 2020
4.0.2: Dec 22, 2020
4.0.1: Dec 22, 2020
4.0.0: Dec 22, 2020
3.0.0: Jul 26, 2020

