orign-py

A Python client for Orign

Installation

pip install orign

Install the Orign CLI

curl -fsSL -H "Cache-Control: no-cache" https://storage.googleapis.com/orign/releases/install.sh | bash

To run the server locally

orign server --docker

Quick Start

Let's create an LLM that can be trained and used for inference online.

from orign import Qwen2_5, TRLOpts, VLLMOpts

llm = Qwen2_5(
    name="greeter",
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    platform="runpod",
    bucket="my-bucket",
    train_opts=TRLOpts(
        accelerators=["2:H100_SXM"],
        train_type="sft",
        num_train_epochs=1,
    ),
    infer_opts=VLLMOpts(
        accelerators=["1:A100_SXM"],
    ),
    train_every=50,
    sample_n=100,
)

messages = [
    {"role": "user", "content": "Dlrow Olleh?"},
    {"role": "assistant", "content": "Hello World!"},
]
resp = llm.chat(messages)
print(resp)

# Now we can train the model on the response (if it's good)
messages.append(resp['choices'][0]['message'])
llm.learn(messages)
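
The `resp['choices'][0]['message']` lookup above suggests a chat-completion-style response dict. Here is a minimal sketch of a defensive accessor, assuming that shape. The helper name `first_message` and the sample payload are hypothetical and not part of Orign.

```python
from typing import Any, Dict, List


def first_message(resp: Dict[str, Any]) -> Dict[str, Any]:
    """Return the first choice's message from a chat-completion-style dict.

    Assumes the shape implied by the snippet above:
    {"choices": [{"message": {"role": ..., "content": ...}}]}
    """
    choices = resp.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]


# Hypothetical payload, used only to exercise the helper.
sample_resp = {"choices": [{"message": {"role": "assistant", "content": "Hello World!"}}]}

messages: List[Dict[str, Any]] = [{"role": "user", "content": "Dlrow Olleh?"}]
messages.append(first_message(sample_resp))
```

Failing loudly on an empty `choices` list keeps a malformed response from being passed on to `learn`.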

Next, let's create a human that can provide feedback to the model.

from orign import Human, V1FeedbackResponse

# This function will be called when the human provides feedback
def on_feedback(feedback: V1FeedbackResponse):
    from orign import OnlineLLM

    llm = OnlineLLM.get("greeter")
    if feedback.approved:
        llm.learn(feedback.messages)

# The Orign app must be installed in your Slack workspace
human = Human(
    name="my-slack-human",
    medium="slack",
    channel="#my-channel",
    response_func=on_feedback,
)

# This will send a message to the human asking for feedback
needs_review = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm good, thank you!"},
]
human.request_feedback(content="Is this a good response?", messages=needs_review)

# We can also post update messages to the human
human.post_message(content="I'm training the model on your feedback...")

Now, putting it all together, let's train a model to accomplish tasks interactively.

task = "Search for the latest news on Cats"
mcp_server = ...  # placeholder: your MCP server
max_steps = 20

for i in range(max_steps):
    # The leading space before "with" keeps the task and the tool list from running together
    prompt = "Please try to accomplish the task: " + task + " with these tools: "  # ... MCP tools
    messages = [{"role": "user", "content": prompt}]

    mcp_state = ...  # placeholder: get MCP state

    resp = llm.chat(messages)
    print(resp)

    mcp_action = ...  # placeholder: take MCP action

    # Record the model's reply and ask the human to review the step
    messages.append(resp['choices'][0]['message'])
    human.request_feedback(content="Was this a good action?", messages=messages)

Alternatively, use our high-level objects.

from typing import Any, List

from orign import actor, validator, solve

# Step and EnvState are the step and state types used by the high-level API;
# import them from wherever your installed orign version exposes them.
# `llm` and `human` are the objects created above; `reward_llm` is assumed
# to be a second LLM used for scoring.

@actor
def act(task: str, mcp_servers: List[Any], history: List[Step]) -> Step:
    prompt = "Please try to accomplish the task: " + task + " with these tools: "  # ... MCP tools
    messages = [{"role": "user", "content": prompt}]

    mcp_state = ...  # placeholder: get MCP state

    resp = llm.chat(messages)
    print(resp)

    mcp_action = ...  # placeholder: take MCP action

    messages.append(resp['choices'][0]['message'])
    human.request_feedback(content="Was this a good action?", messages=messages)

    return Step(
        state=EnvState(
            text=mcp_state,
        ),
        action=mcp_action,
    )

@validator
def score(step: Step) -> float:
    prompt = f"""Given the step {step.model_dump()}, return a value between 1 and 10 on how good
    it was with respect to the task {step.task}
    """
    messages = [{"role": "user", "content": prompt}]
    resp = reward_llm.chat(messages)

    human.request_feedback(content="Was this a good action?", messages=messages)

    # Assumes the reply content is a bare number, e.g. "7"
    return float(resp['choices'][0]['message']['content'])

solve(
    task="Find the latest news on Cats",
    actor=act,
    validator=score,
    mcp_servers=[],
)
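
A validator that asks an LLM for a score between 1 and 10 has to turn free-form reply text into a number. Here is a minimal sketch of one way to do that, assuming the reply contains the score as its first number. The helper `parse_score` is hypothetical and not part of Orign.

```python
import re


def parse_score(text: str, low: float = 1.0, high: float = 10.0) -> float:
    """Extract the first number in `text` and clamp it to [low, high]."""
    match = re.search(r"-?\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no numeric score found in {text!r}")
    return max(low, min(high, float(match.group())))


rating = parse_score("I'd rate this 7 out of 10")
```

This is deliberately naive: it takes the first number it sees, so a reply such as "Step 2 deserves a 9" would be misread. Asking the reward model to answer with a bare number avoids the ambiguity.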

Now, as you solve tasks with the actor, every action is sent to a human for review. Once they respond, the on_feedback function is called, sending approved feedback to the replay buffer, which trains the model online.
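
The approval gate in this loop can be exercised without Slack or a running server. The sketch below uses stand-in classes (`FakeFeedback`, `FakeLLM`, both hypothetical) that mimic only the `approved` and `messages` fields and the `learn` method used in the examples above. It illustrates the control flow and is not Orign's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Message = Dict[str, str]


@dataclass
class FakeFeedback:
    """Stand-in for V1FeedbackResponse with only the two fields used above."""
    approved: bool
    messages: List[Message]


@dataclass
class FakeLLM:
    """Stand-in for an online LLM; records what it was asked to learn."""
    learned: List[List[Message]] = field(default_factory=list)

    def learn(self, messages: List[Message]) -> None:
        self.learned.append(messages)


def make_on_feedback(llm: FakeLLM) -> Callable[[FakeFeedback], None]:
    """Build a callback that forwards only approved conversations."""
    def on_feedback(feedback: FakeFeedback) -> None:
        if feedback.approved:
            llm.learn(feedback.messages)
    return on_feedback


fake_llm = FakeLLM()
callback = make_on_feedback(fake_llm)
convo = [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello!"}]
callback(FakeFeedback(approved=True, messages=convo))   # forwarded to learn()
callback(FakeFeedback(approved=False, messages=convo))  # dropped
```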

Usage

Replay Buffer

Replay buffers store agent experiences and offer a means of training models in an online fashion.

In this example, we create a replay buffer that will launch a TRL training job every 50 steps by randomly sampling 100 experiences from the buffer.

from orign import ReplayBuffer, TRLRequest

train_job = TRLRequest(
    name="my-train-job",
    namespace="default",
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    platform="runpod",
    bucket="my-bucket",
    accelerators=["2:H100_SXM"],
    train_type="sft",
    num_train_epochs=1,
    save_steps=1,
    save_total_limit=3,
    use_peft=True,
)

buffer = ReplayBuffer(
    name="my-buffer",
    namespace="default",
    train_every=50,
    sample_n=100,
    sample_strategy="Random",
    train_job=train_job,
)


messages = [
    {"role": "user", "content": "Dlrow Olleh?"},
    {"role": "assistant", "content": "Hello World!"},
]
buffer.send(messages)
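
To make the `train_every` and `sample_n` semantics concrete, here is a small in-memory sketch of the same idea: store each experience, and on every 50th send hand a random sample to a training callback. The class `LocalReplayBuffer` is hypothetical and runs entirely locally. How the real buffer behaves when it holds fewer than `sample_n` experiences is not stated here, so capping the sample at the buffer size is an assumption.

```python
import random
from typing import Callable, Dict, List

Message = Dict[str, str]
Experience = List[Message]


class LocalReplayBuffer:
    """In-memory illustration of train-every-N with random sampling."""

    def __init__(
        self,
        train_every: int,
        sample_n: int,
        on_train: Callable[[List[Experience]], None],
        seed: int = 0,
    ) -> None:
        self.train_every = train_every
        self.sample_n = sample_n
        self.on_train = on_train
        self.experiences: List[Experience] = []
        self._rng = random.Random(seed)

    def send(self, experience: Experience) -> None:
        """Store one experience; trigger training on every `train_every`-th send."""
        self.experiences.append(experience)
        if len(self.experiences) % self.train_every == 0:
            # Assumption: sample at most as many experiences as are stored.
            k = min(self.sample_n, len(self.experiences))
            self.on_train(self._rng.sample(self.experiences, k))


batches: List[List[Experience]] = []
local_buffer = LocalReplayBuffer(train_every=50, sample_n=100, on_train=batches.append)
for i in range(120):
    local_buffer.send([
        {"role": "user", "content": f"q{i}"},
        {"role": "assistant", "content": f"a{i}"},
    ])
```

With 120 sends, the callback fires at the 50th and 100th experience.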

Online LLM

Online LLMs are models that can be trained and used for inference online.

In this example, we create a Qwen 2.5 model that will be trained using TRL on 2 H100 GPUs and served using vLLM on 1 A100 GPU.

from orign import Qwen2_5, TRLOpts, VLLMOpts

llm = Qwen2_5(
    name="my-llm",
    model="Qwen/Qwen2.5-7B-Instruct",
    platform="runpod",
    bucket="my-bucket",
    train_opts=TRLOpts(
        accelerators=["2:H100_SXM"],
        train_type="sft",
        num_train_epochs=1,
    ),
    infer_opts=VLLMOpts(
        accelerators=["1:A100_SXM"],
    ),
    train_every=50,
    sample_n=100,
)
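
The accelerator strings appear to follow a `<count>:<type>` pattern, judging from how `"2:H100_SXM"` corresponds to "2 H100 GPUs" in the description. As a sketch under that assumption, a spec can be validated before it is passed along. `parse_accelerator` and `Accelerator` are hypothetical names, not part of Orign.

```python
from typing import NamedTuple


class Accelerator(NamedTuple):
    count: int
    kind: str


def parse_accelerator(spec: str) -> Accelerator:
    """Split a "<count>:<type>" string such as "2:H100_SXM"."""
    count_str, sep, kind = spec.partition(":")
    if not sep or not kind or not count_str.isdigit():
        raise ValueError(f"expected '<count>:<type>', got {spec!r}")
    return Accelerator(int(count_str), kind)


train_gpus = parse_accelerator("2:H100_SXM")
infer_gpus = parse_accelerator("1:A100_SXM")
```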

Now we can chat with the model.

messages = [
    {"role": "user", "content": "What's the capital of the moon?"},
]
resp = llm.chat(messages)
print(resp)

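We can also train the model online.

# Append the assistant's reply so the model learns from the full exchange
messages.append(resp['choices'][0]['message'])
llm.learn(messages)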

This model will by default train every 50 steps by randomly sampling 100 experiences from the buffer. However, if you want to manually trigger a training job, you can do so with the train method.

llm.train()

Human

Connect to humans on Slack who can provide feedback to the model.

from orign import Human, V1FeedbackResponse

def on_feedback(feedback: V1FeedbackResponse):
    print(feedback)

human = Human(
    name="my-human",
    medium="slack",
    channel="#my-channel",
    response_func=on_feedback,
)

messages = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm good, thank you!"},
]

human.request_feedback(content="Is this a good response?", messages=messages)

Examples

See the examples directory for more usage examples.

Download files

Source Distribution

orign-0.2.2.tar.gz (19.5 kB view details)

Uploaded Apr 2, 2025 Source

Built Distribution

orign-0.2.2-py3-none-any.whl (24.0 kB view details)

Uploaded Apr 2, 2025 Python 3

File details

Details for the file orign-0.2.2.tar.gz.

File metadata

Download URL: orign-0.2.2.tar.gz
Upload date: Apr 2, 2025
Size: 19.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.5.29

File hashes

Hashes for orign-0.2.2.tar.gz
Algorithm	Hash digest
SHA256	`6a37fa8ebc16f8b236e4e7376c9040f98258c48557c654c18e822d7a2e65e6cb`
MD5	`a407c35bec097c9e2260ab1ce52e0935`
BLAKE2b-256	`d7d8738ba0c5a251f6d620b526e90b94bebf57d8ba60a8b7759834b7eafab4fc`

File details

Details for the file orign-0.2.2-py3-none-any.whl.

File metadata

Download URL: orign-0.2.2-py3-none-any.whl
Upload date: Apr 2, 2025
Size: 24.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.5.29

File hashes

Hashes for orign-0.2.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0646927abd18d19f16122433cdac2bf505fb1930cddc6fed1660fabb406b7207`
MD5	`21ddb47c32f9b2de7fd8e35feeeaa8e1`
BLAKE2b-256	`bf290526d12223d658c99302df69c0fd994d0ed477dc9df4e11f0b0aa33135e5`

orign 0.2.2
