MCTS + LLM + Prompt Engineering => Enhanced LLM Response Quality.
🌟 Overview
mcts-llm is a lightweight repo that integrates Monte Carlo Tree Search (MCTS) with prompt engineering techniques to enhance the performance of Large Language Models (LLMs). The idea is that scaling up inference-time compute for better LLM response quality may become more valuable than spending additional compute during training. The approach extends beyond math problems to tasks such as general reasoning and knowledge extraction. The repo can fine-tune prompt instructions and benchmark the performance of various MCTS adaptations for prompt engineering.
🛠️ Installation
PyPI
```bash
pip install mcts-llm
```
Docker
Create a `.env` file with the following variables:
```
OPENAI_API_KEY=<your-openai-api-key>
DEEPSEEK_API_KEY=<your-deepseek-api-key>
DEEPSEEK_BASE_URL=<your-deepseek-base-url>
OLLAMA_BASE_URL=http://host.docker.internal:11434
```
Build the Docker container:
```bash
cd mcts-llm
make debug
```
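Outside of Docker, the same variables can be loaded and wired into DSPy directly. A minimal sketch, assuming python-dotenv is installed and that your DSPy version exposes the OpenAI-compatible client shown below (DeepSeek serves an OpenAI-compatible API):

```python
import os

import dspy
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # read the .env file created above into the environment

# Point DSPy's OpenAI-compatible client at DeepSeek's endpoint.
deepseek = dspy.OpenAI(
    model="deepseek-chat",
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    api_base=os.getenv("DEEPSEEK_BASE_URL"),
)
dspy.settings.configure(lm=deepseek)
```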
🚀 Run
Quickstart
```python
import dspy
from mcts_llm.mctsr import MCTSr

ollama = dspy.OllamaLocal(
    model="qwen2.5:7b-instruct",
    model_type="chat",
    temperature=1.0,
    max_tokens=1024,
    num_ctx=1024,
    timeout_s=600,
)
dspy.settings.configure(lm=ollama, experimental=True)

# Example GSM8K-style problem.
problem = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"

mctsr = MCTSr()
mctsr_answer = mctsr(problem).answer
print(f"MCTSr answer: {mctsr_answer}")
```
Demo
```bash
make debug
python examples/demo.py
```
📊 Preliminary Results
Initial experiments were conducted with qwen2.5:7b-instruct using the following settings:
- Temperature: 1.0
- Model Type: Chat
- Max Tokens: 1024
- Context Length: 1024
- Dataset: Shuffled GSM8K (20 examples)
- Prompts: Standard, non-optimized instructions
- Hardware: M3 Mac Pro (12 threads)
Default hyperparameters:
- c: sqrt(2)
- initialization: "I don't know."
- eps: 1e-8
- reward_ub: 95
- reward_penalty: 50
- default_uct_score: 1000
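For context, `c`, `eps`, and `default_uct_score` typically enter a standard UCT selection rule, while `reward_ub` and `reward_penalty` bound the model's self-assigned rewards (full-score suppression, as in the MCTSr paper). A minimal sketch of those rules, assuming this repo follows the paper's formulation; the exact scoring in `mcts_llm/mctsr.py` may differ:

```python
import math

def uct_score(q, child_visits, parent_visits,
              c=math.sqrt(2), eps=1e-8, default_uct_score=1000):
    """Standard UCT: balance a node's value q against an exploration bonus."""
    if child_visits == 0:
        # Unvisited children get a large constant so they are expanded first.
        return default_uct_score
    return q + c * math.sqrt(math.log(parent_visits + 1) / (child_visits + eps))

def clip_reward(raw_reward, reward_ub=95, reward_penalty=50):
    # Discourage inflated self-evaluations: rewards above the upper bound
    # are penalised rather than taken at face value.
    return raw_reward - reward_penalty if raw_reward > reward_ub else raw_reward
```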
Method | Accuracy | Total Time | Avg Time per Example | Additional Parameters |
---|---|---|---|---|
Zero-shot CoT | 13 / 20 (65%) | 2m 01s | 6.09s | N/A |
One-Turn Self-Refine | 15 / 20 (75%) | 7m 45s | 23.75s | N/A |
MCTSr | 16 / 20 (80%) | 43m 03s | 129.18s | • max_rollouts = 4 • policy = "greedy" • samples_per_node = 3 |
MCTSr | 17 / 20 (85%) | 44m 09s | 132.50s | • max_rollouts = 4 • policy = "importance_sampling" • samples_per_node = 3 |
MCTSr | 16 / 20 (80%) | 51m 10s | 153.51s | • max_rollouts = 4 • policy = "importance_sampling" • samples_per_node = 4 |
MCTSr | 18 / 20 (90%) | 51m 42s | 153.13s | • max_rollouts = 4 • policy = "greedy" • samples_per_node = 4 |
MCTSr | 15 / 20 (75%) | 1h 38m 53s | 296.68s | • max_rollouts = 8 • policy = "greedy" • samples_per_node = 4 |
MCTSr | 14 / 20 (70%) | 1h 39m 03s | 298.40s | • max_rollouts = 8 • policy = "importance_sampling" • samples_per_node = 4 |
Note: These results are preliminary and obtained under specific conditions. Further experimentation is needed to generalize the findings.
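The table can be reproduced with a short evaluation loop. A minimal sketch using DSPy's bundled GSM8K helpers (available in DSPy 2.4.x), assuming MCTSr's forward accepts the problem text exactly as in the Quickstart:

```python
from dspy.datasets.gsm8k import GSM8K, gsm8k_metric
from mcts_llm.mctsr import MCTSr

# Configure dspy.settings with your LM first, as in the Quickstart.
devset = GSM8K().dev[:20]            # 20 examples, mirroring the runs above

mctsr = MCTSr()
correct = 0
for example in devset:
    prediction = mctsr(example.question)             # problem text in, answer out
    correct += int(gsm8k_metric(example, prediction))
print(f"Accuracy: {correct} / {len(devset)}")
```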
Paper Implementations
- Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B
- More to come!
🚀 TODOs
- Upgrade DSPy to >= 2.5.0.
- Include additional evaluation datasets such as MATH, AIME, and Math Odyssey.
- Tune optimal hyperparameters for MCTSr.
- Fine-tune with Llama3.1-8B.
- Fine-tune with Qwen2.5-7B.
- Fine-tune with DeepSeek-Chat as the prompting model and smaller Ollama-served LLMs as the task model.
⚠️ Disclaimer
Please be aware of potential costs when using OpenAI/Anthropic LLMs, especially with larger rollouts. Familiarize yourself with DSPy and its optimizers before extensive use.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.