In-Context Reinforcement Learning framework for LLMs — no fine-tuning required.

Maintainers

makoeta

Project description

FastICRL

In-Context Reinforcement Learning for LLMs — no fine-tuning, no gradient updates, no GPU.

FastICRL implements the ICRL paradigm from Reward Is Enough: LLMs Are In-Context Reinforcement Learners (Song et al., 2025). A learner LLM improves its outputs purely by reading its own history of attempts and rewards inside the context window, guided by a meta-cognitive strategist. There is no training step and no training infrastructure; every improvement comes from inference calls alone.
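To make "reading its own history of attempts and rewards inside the context window" concrete, here is a minimal sketch of how such a prompt could be assembled. The `AttemptRecord` class, the `build_prompt` function, and the prompt wording are all hypothetical illustrations; FastICRL's actual prompt format is not documented here.

```python
from dataclasses import dataclass


@dataclass
class AttemptRecord:
    """Hypothetical record of one past try: task, learner output, and score."""
    task: str
    output: str
    score: float


def build_prompt(task: str, history: list[AttemptRecord], strategy: str = "") -> str:
    """Assemble a prompt that exposes past attempts and rewards to the learner."""
    lines = []
    if strategy:
        lines.append(f"Strategy: {strategy}")
    # Each past attempt becomes one line so the model can compare outputs and rewards.
    for index, attempt in enumerate(history, start=1):
        lines.append(
            f"Attempt {index} | task: {attempt.task} | "
            f"output: {attempt.output} | reward: {attempt.score}/10"
        )
    lines.append(f"New task: {task}")
    return "\n".join(lines)


history = [
    AttemptRecord("Portable espresso maker", "Coffee anywhere.", 4.0),
    AttemptRecord("Portable espresso maker", "Barista-grade espresso in your backpack.", 8.0),
]
prompt = build_prompt(
    "Ergonomic standing desk", history, strategy="Lead with a concrete benefit."
)
print(prompt)
```

Because the only state is text in the prompt, "learning" here means the next inference call sees more scored evidence than the previous one did.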

How it works

Three LLM agents collaborate in a feedback loop:

┌──────────────────────────────────────────────────┐
│                   ICRLLearner                    │
│                                                  │
│  Task ──► Learner ──► Output ──► Reward Agent    │
│             ▲                         │          │
│             │        Attempt          │          │
│             │  (task, output, score)  │          │
│             └─────────────────────────┘          │
│                          │                       │
│                  (every N episodes)              │
│                          ▼                       │
│                      Strategist                  │
│                (refines the strategy)            │
└──────────────────────────────────────────────────┘

Agent	Role
Learner	Generates task outputs; balances exploration vs. exploitation based on reward history
Reward	Scores each output on a 0–10 scale (acts as the reward function)
Strategist	Analyzes past attempts to synthesize actionable strategies for future episodes

Each agent can be backed by a different model — e.g. a cheap model for reward, a powerful one for the learner.
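The feedback loop in the diagram can be sketched with plain functions standing in for the three agents. This is an illustration under stated assumptions, not FastICRL's implementation: `run_loop` and the `toy_*` functions are hypothetical, the LLM calls are replaced by deterministic stubs, and one episode is assumed to handle a single task while cycling through the task list.

```python
from typing import Callable

# (task, output, score), mirroring the tuple shown in the diagram.
Attempt = tuple[str, str, float]


def run_loop(
    tasks: list[str],
    learner: Callable[[str, list[Attempt], str], str],
    reward: Callable[[str, str], float],
    strategist: Callable[[list[Attempt], str], str],
    episodes: int,
    strategy_update_interval: int,
) -> tuple[list[Attempt], str]:
    """Generate, score, store; refresh the strategy every N episodes."""
    buffer: list[Attempt] = []
    strategy = ""
    for episode in range(1, episodes + 1):
        task = tasks[(episode - 1) % len(tasks)]  # assumption: cycle through tasks
        output = learner(task, buffer, strategy)
        score = reward(task, output)
        buffer.append((task, output, score))
        if episode % strategy_update_interval == 0:
            strategy = strategist(buffer, strategy)
    return buffer, strategy


def toy_learner(task: str, buffer: list[Attempt], strategy: str) -> str:
    # Stand-in for an LLM call: the output gets richer once a strategy exists.
    return f"{task}: great product" + (" with details" if strategy else "")


def toy_reward(task: str, output: str) -> float:
    # Stand-in scorer on the 0-10 scale: one point per word, capped at 10.
    return min(10.0, float(len(output.split())))


def toy_strategist(buffer: list[Attempt], strategy: str) -> str:
    # Stand-in strategist: point at the best-scoring attempt so far.
    best = max(buffer, key=lambda attempt: attempt[2])
    return f"Favor the style of: {best[1]}"


buffer, strategy = run_loop(
    ["headphones", "desk"], toy_learner, toy_reward, toy_strategist,
    episodes=4, strategy_update_interval=2,
)
print([score for _, _, score in buffer])
print(strategy)
```

Passing the agents in as callables mirrors the point above: each role can be backed by a different model without changing the loop.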

Installation

pip install fasticrl

Or with uv:

uv add fasticrl

Model provider extras (install whichever you use):

pip install "fasticrl[openai]"   # OpenAI
pip install "fasticrl[ollama]"   # Ollama (local models)

Requires Python ≥ 3.13.

Quick start

from fasticrl import ICRLLearner
from agno.models.openai import OpenAIChat

model = OpenAIChat(id="gpt-4o-mini")

learner = ICRLLearner(
    learner_model=model,
    reward_model=model,
    strategy_model=model,
    task_description="Write a concise, compelling product description for an e-commerce listing.",
    tasks=[
        "Wireless noise-cancelling headphones",
        "Ergonomic standing desk",
        "Portable espresso maker",
    ],
)

# Run 3 episodes with batch_size=2 (tasks run in parallel), refresh the strategy every 2 episodes, show a progress bar
learner.auto_learn(episodes=3, batch_size=2, cli_mode=True, strategy_update_interval=2)

# Inspect what the agent learned
print(learner.strategy)

API

`ICRLLearner`

ICRLLearner(
    learner_model,        # agno Model for the learner agent
    reward_model,         # agno Model for the reward agent
    strategy_model,       # agno Model for the strategist agent
    task_description,     # describes the overall task domain (required)
    tasks,                # list of concrete task instances to cycle through
    buffer,               # optional: pre-loaded list of Attempt objects
    strategy,             # optional: pre-loaded strategy string
)

Key methods

Method	Description
`auto_learn(episodes, batch_size, cli_mode, strategy_update_interval)`	Run N episodes. `batch_size > 1` parallelizes tasks with a thread pool. `cli_mode=True` shows a progress bar. `strategy_update_interval=K` refreshes the strategy every K episodes.
`generate_action(task)`	Run the learner on a single task and return its output
`generate_reward(task, action)`	Score a learner output with the reward agent
`generate_attempt_by_present_task()`	Single step: generate + score the current task
`update_strategy()`	Ask the strategist to refine the strategy from the current buffer
`to_yaml(path)`	Persist the full agent state (buffer + strategy) to a YAML file
`ICRLLearner.from_yaml(path, ...)`	Resume from a saved state
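The `batch_size > 1` behavior described for `auto_learn` can be pictured with the standard library's thread pool. The sketch below is only an approximation of the idea: `score_task` and `run_batch` are hypothetical names, and the fake scoring function stands in for the learner and reward calls.

```python
from concurrent.futures import ThreadPoolExecutor


def score_task(task: str) -> tuple[str, int]:
    """Hypothetical stand-in for one generate + score step (normally LLM calls)."""
    return task, len(task) % 11  # fake score in the 0-10 range


def run_batch(tasks: list[str], batch_size: int) -> list[tuple[str, int]]:
    """Process tasks with up to batch_size worker threads, preserving input order."""
    if batch_size <= 1:
        return [score_task(task) for task in tasks]
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        # pool.map yields results in the order of the inputs, not completion order.
        return list(pool.map(score_task, tasks))


results = run_batch(["a", "bb", "ccc"], batch_size=2)
print(results)
```

Threads suit this workload because each step mostly waits on a network or local inference call, so the interpreter lock is rarely the bottleneck.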

Saving and resuming

# Save (`learner` and `model` are the objects created in the quick start)
learner.to_yaml("my_agent.yaml")

# Resume later
learner = ICRLLearner.from_yaml(
    "my_agent.yaml",
    learner_model=model,
    reward_model=model,
    strategy_model=model,
)
learner.auto_learn(episodes=5)
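As a rough picture of what persisting "buffer + strategy" involves, the sketch below round-trips a comparable state through a file. It deliberately uses JSON from the standard library instead of YAML so that it runs without extra packages; `AttemptRecord`, `save_state`, `load_state`, and the file layout are hypothetical and do not describe the schema that `to_yaml` writes.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class AttemptRecord:
    """Hypothetical stand-in for a stored attempt."""
    task: str
    output: str
    score: float


def save_state(path: Path, buffer: list[AttemptRecord], strategy: str) -> None:
    """Write the strategy and every attempt to a single file."""
    state = {"strategy": strategy, "buffer": [asdict(a) for a in buffer]}
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")


def load_state(path: Path) -> tuple[list[AttemptRecord], str]:
    """Rebuild the attempts and the strategy from a saved file."""
    state = json.loads(path.read_text(encoding="utf-8"))
    return [AttemptRecord(**a) for a in state["buffer"]], state["strategy"]


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "state.json"
    original = [AttemptRecord("Ergonomic standing desk", "Stand tall, work better.", 7.0)]
    save_state(state_file, original, "Lead with a concrete benefit.")
    restored_buffer, restored_strategy = load_state(state_file)

print(restored_buffer == original, restored_strategy)
```

Note that the models are not part of the saved state in the example above either, which is why `from_yaml` takes them again as arguments.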

Using Ollama (local models)

from fasticrl import ICRLLearner
from agno.models.ollama import Ollama

learner = ICRLLearner(
    learner_model=Ollama(id="llama3.2"),
    reward_model=Ollama(id="llama3.2"),
    strategy_model=Ollama(id="llama3.2"),
    task_description="...",
    tasks=[...],
)

Any agno-compatible model works.

Citation

This project is based on and inspired by the following papers:

Reward Is Enough: LLMs Are In-Context Reinforcement Learners
Kefan Song, Amir Moeini, Peng Wang, Lei Gong, Rohan Chandra, Shangtong Zhang, Yanjun Qi
arXiv:2506.06303 — https://arxiv.org/abs/2506.06303

Large Language Models as Optimizers
Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, Xinyun Chen
arXiv:2309.03409 — https://arxiv.org/abs/2309.03409

Prompted Policy Search: Reinforcement Learning through Linguistic and Numerical Reasoning in LLMs
Yifan Zhou, Sachin Grover, Mohamed El Mistiri, Kamalesh Kalirathinam, Pratyush Kerhalkar, Swaroop Mishra, Neelesh Kumar, Sanket Gaurav, Oya Aran, Heni Ben Amor
NeurIPS 2025 — https://openreview.net/forum?id=95plu1Mo20

License

MIT

Project details

Release history

This version

1.0.0

Jun 28, 2026

Download files

Source Distribution

fasticrl-1.0.0.tar.gz (8.4 kB)

Uploaded Jun 28, 2026 Source

Built Distribution

fasticrl-1.0.0-py3-none-any.whl (14.0 kB)

Uploaded Jun 28, 2026 Python 3

File details

Details for the file fasticrl-1.0.0.tar.gz.

File metadata

Download URL: fasticrl-1.0.0.tar.gz
Upload date: Jun 28, 2026
Size: 8.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for fasticrl-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`5ae4bda7966e9752d7e4ba54d7bcfe08d883a0bbbf2255c78de2fcb5529e7b95`
MD5	`4d2fcb4880a8569c6dbb6e4ebd11ed38`
BLAKE2b-256	`e38235f563d048101cad2805583835648e1e45e18367e96199c511f34de837e9`
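A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: `sha256_of` and `matches_published_hash` are hypothetical helper names, and the demonstration runs on a small temporary file rather than on the real archive.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demonstration on a throwaway file containing the bytes b"abc".
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"abc")
    sample_ok = matches_published_hash(
        sample, "BA7816BF8F01CFEA414140DE5DAE2223B00361A396177A9CB410FF61F20015AD"
    )

print(sample_ok)
```

For the source distribution, the same check would be `matches_published_hash(Path("fasticrl-1.0.0.tar.gz"), "5ae4bda7966e9752d7e4ba54d7bcfe08d883a0bbbf2255c78de2fcb5529e7b95")`, assuming the file sits in the current directory.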

Provenance

The following attestation bundles were made for fasticrl-1.0.0.tar.gz:

Publisher: python-publish.yml on makoeta/FastICRL

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fasticrl-1.0.0.tar.gz
- Subject digest: 5ae4bda7966e9752d7e4ba54d7bcfe08d883a0bbbf2255c78de2fcb5529e7b95
- Sigstore transparency entry: 2000818940
- Sigstore integration time: Jun 28, 2026
Source repository:
- Permalink: makoeta/FastICRL@3020f0922afe30323c2cb7a90ee952e8b7a79bf1
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/makoeta
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@3020f0922afe30323c2cb7a90ee952e8b7a79bf1
- Trigger Event: release

File details

Details for the file fasticrl-1.0.0-py3-none-any.whl.

File metadata

Download URL: fasticrl-1.0.0-py3-none-any.whl
Upload date: Jun 28, 2026
Size: 14.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for fasticrl-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cb6afcf207f6dd39bcd9369cf75f4df3519a53d6d2a07819ba4d1a65a1844074`
MD5	`2c9eaea237ba5879e64210a94245babd`
BLAKE2b-256	`34925b6a5394bdc5b84720ae97511d78688b3ae314a5a183d93be4e9e82b05af`

Provenance

The following attestation bundles were made for fasticrl-1.0.0-py3-none-any.whl:

Publisher: python-publish.yml on makoeta/FastICRL

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fasticrl-1.0.0-py3-none-any.whl
- Subject digest: cb6afcf207f6dd39bcd9369cf75f4df3519a53d6d2a07819ba4d1a65a1844074
- Sigstore transparency entry: 2000819052
- Sigstore integration time: Jun 28, 2026
Source repository:
- Permalink: makoeta/FastICRL@3020f0922afe30323c2cb7a90ee952e8b7a79bf1
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/makoeta
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@3020f0922afe30323c2cb7a90ee952e8b7a79bf1
- Trigger Event: release
