rcr-lm: Collapse layers into a recurrent block
rcr-lm
A lightweight, high-performance research framework for Large Language Models built on Apple's MLX.
Quickstart
rcr-lm is available on PyPI:
# macOS
pip install rcrlm
# Linux with CUDA
pip install rcrlm[cuda]
# Linux (CPU only)
pip install rcrlm[cpu]
To generate text with an LLM:
>>> rlm
┌────────────────────────────── Streaming ──────────────────────────────┐
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they can process and generate text, not just a few words. Then explain their training data, like the amount of text they're trained on. Also, their capabilities: understanding and generating text, answering questions, etc. Need
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Inp 00000 ──────────────────────────────┐
<|im_start|>user
Give me a short introduction to large language model.
<|im_end|>
<|im_start|>assistant
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00000 ──────────────────────────────┐
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they can process and generate text, not just a few words. Then explain their training data, like the amount of text they're trained on. Also, their capabilities: understanding and generating text, answering questions, etc. Need
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Benchmark ──────────────────────────────┐
Prompt processing: 298.1 tokens/sec ( 18 tokens in 0.1s)
Tokens generation: 217.3 tokens/sec (100 tokens in 0.5s)
└───────────────────────────────────────────────────────────────────────┘
Key Features
Accelerated Inference
rcr-lm reaches generation speeds above 200 tokens/sec in the single-prompt runs shown here, compared with about 174 tokens/sec for mlx-lm and 256 tokens in 10.66 seconds for transformers on the same prompt.
from rcrlm import load, infer
m = load()
_ = infer("Write a story about Einstein\n", **m, max_new_tokens=256)
┌────────────────────────────── Inp 00000 ──────────────────────────────┐
<|im_start|>user
Write a story about Einstein
<|im_end|>
<|im_start|>assistant
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00000 ──────────────────────────────┐
<think>
Okay, the user wants a story about Einstein. Let me start by recalling Einstein's life. He was a genius, a scientist, and a philosopher. I need to make sure the story includes his contributions to science, maybe his work on relativity, and his personal life.
First, I should set the scene. Maybe start with his early life in Germany, where he was born. Then introduce his family, his parents, maybe his mother's influence. Then his education, the famous lectures, and his breakthroughs.
I need to highlight his scientific achievements, like the theory of relativity. Also, his personal struggles, like the time he spent in the Alps, the Alps being a place of isolation and inspiration.
I should include some quotes or references to his work. Maybe mention his quote about the universe being infinite. Also, his later years and how he passed away.
Wait, the user might want the story to be engaging and highlight his legacy. I need to make sure the story flows well, with a good narrative arc. Avoid clichés, but still capture his essence. Check for any inaccuracies, like his actual birth date and death year. Let me confirm: Einstein was born on April 14, 18
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Benchmark ──────────────────────────────┐
Prompt processing: 232.2 tokens/sec ( 14 tokens in 0.1s)
Tokens generation: 200.2 tokens/sec (256 tokens in 1.3s)
└───────────────────────────────────────────────────────────────────────┘
mlx-lm (for comparison)
from mlx_lm import load, generate
model, tokenizer = load("Qwen/Qwen3-0.6B")
prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
text = generate(model, tokenizer, prompt=prompt, verbose=True)
<think>
Okay, the user wants a story about Einstein. Let me start by recalling Einstein's life and achievements. He was a genius, but the story needs to be engaging. Maybe set it in his early years to highlight his early brilliance. I should include some key moments, like his work on the theory of relativity, but also show his personal life and challenges.
I need to make sure the story has a beginning, middle, and end. Maybe start with his childhood, then his education, his breakthroughs, and his later years. Including some quotes from his work would add depth. Also, the user might want to know about his legacy, so I should mention his impact on science and society.
Wait, the user didn't specify the genre. It could be a historical fiction or a modern story. Since Einstein is a well-known figure, maybe a historical account would be better. But I should make sure the story is engaging and not too technical. Maybe include some emotional elements, like his struggles with time and his family.
I should check for any inaccuracies. For example, his early life was not as famous as he is known. Maybe mention his parents, his education, and his eventual fame. Also, the story should end on a positive note,
==========
Prompt: 13 tokens, 34.744 tokens-per-sec
Generation: 256 tokens, 174.251 tokens-per-sec
Peak memory: 1.415 GB
transformers (for comparison)
from transformers import AutoModelForCausalLM, AutoTokenizer
import time
model_name = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="auto",
device_map="auto"
)
prompt = "Write a story about Einstein"
messages = [
{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True,
enable_thinking=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
tic_gen = time.perf_counter()
generated_ids = model.generate(
**model_inputs,
max_new_tokens=256
)
elp_gen = time.perf_counter() - tic_gen
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=False).strip("\n")
print(content)
print('==========')
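# Double quotes inside the f-string: nesting the same quote type is a syntax error before Python 3.12.
print(f'{model_inputs["input_ids"].shape[1]} prompt tokens and {len(output_ids)} generated tokens in {elp_gen:.2f} seconds')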
<think>
Okay, the user wants me to write a story about Einstein. Let me start by thinking about the key elements that make Einstein a famous figure. He's a genius, a scientist, and a man of philosophy. I need to make sure the story captures these aspects.
First, I should set the scene. Maybe start with his early life to show his early brilliance. His childhood, maybe a small village, and his parents. Then introduce his academic journey, the famous equations, and his work on the theory of relativity. Including his personal life, like his family and his later life, would add depth.
I need to include some conflict or challenge to make the story engaging. Perhaps a time when he faced opposition or personal struggles. Maybe his work on the theory of relativity was controversial, which adds tension. Also, highlighting his personality traits—like his curiosity, determination, and the impact he had on society.
I should make sure the story has a clear beginning, middle, and end. Start with his early achievements, then his scientific contributions, and end with his legacy. Avoid clichés, but still capture his essence. Maybe include some quotes or references to his work to add authenticity.
Wait, the user might be looking for a story that's both
==========
13 prompt tokens and 256 generated tokens in 10.66 seconds
Note: Included for baseline reference. transformers is not fully optimized for inference on Apple Silicon (MPS) compared to its performance on NVIDIA GPUs.
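To put the three runs on one scale, the reported figures can be converted to tokens per second. The sketch below uses only numbers printed above; the variable names are illustrative. Note that the transformers timing covers the whole `generate` call (prompt processing plus decoding), so its rate slightly understates pure decoding speed.

```python
# Generation figures copied from the logs above.
runs = {
    "rcr-lm": 200.2,              # tokens/sec, reported directly
    "mlx-lm": 174.251,            # tokens/sec, reported directly
    "transformers": 256 / 10.66,  # 256 tokens in 10.66 s (whole generate call)
}

baseline = runs["rcr-lm"]
for name, tps in runs.items():
    # Ratio > 1 means rcr-lm generated faster than this run.
    print(f"{name:<12} {tps:7.1f} tok/s  (rcr-lm / {name} = {baseline / tps:.2f})")
```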
Batched Decoding
infer(
    [
        "#write a quick sort algorithm\n",
        "Give me a short introduction to large language model.\n",
        "Write a neurology ICU admission note.\n",
        "Comparison of Sortino Ratio for Bitcoin and Ethereum.",
    ],
    **m,
)
┌────────────────────────────── Inp 00000 ──────────────────────────────┐
<|im_start|>user
#write a quick sort algorithm
<|im_end|>
<|im_start|>assistant
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00000 ──────────────────────────────┐
<think>
Okay, I need to write a quick sort algorithm. Let me think about how to approach this. Quick sort is a divide-and-conquer algorithm, right? The basic idea is to select a pivot element, partition the array into elements less than or equal to the pivot and greater than or equal to it, and then recursively sort the subarrays.
First, I should outline the steps. The algorithm should have a function that takes an array and a pivot index. Wait, but how do
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Inp 00001 ──────────────────────────────┐
<|im_start|>user
Give me a short introduction to large language model.
<|im_end|>
<|im_start|>assistant
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00001 ──────────────────────────────┐
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they can process and generate text, not just a few words. Then explain their training data, like the amount of text they're trained on. Also, their capabilities: understanding and generating text, answering questions, etc. Need
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Inp 00002 ──────────────────────────────┐
<|im_start|>user
Write a neurology ICU admission note.
<|im_end|>
<|im_start|>assistant
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00002 ──────────────────────────────┐
<think>
Okay, I need to write a neurology ICU admission note. Let me start by recalling what an ICU admission note typically includes. It's a medical record that outlines the patient's condition, initial assessment, interventions, and any ongoing care.
First, the patient's name and date of admission. I should make sure to include that. Then, the patient's name, age, gender, and primary diagnosis. Since it's a neurology ICU, the main issue is likely a neurological
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Inp 00003 ──────────────────────────────┐
<|im_start|>user
Comparison of Sortino Ratio for Bitcoin and Ethereum.<|im_end|>
<|im_start|>assistant
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00003 ──────────────────────────────┐
<think>
Okay, the user is asking for a comparison between the Sortino Ratio for Bitcoin and Ethereum. Let me start by recalling what the Sortino Ratio is. It's a measure of risk-adjusted return for a portfolio, right? It's calculated as (Return - Risk-Free Rate) divided by the Risk (Standard Deviation). The Sortino Ratio is usually used for a single asset, so comparing it between Bitcoin and Ethereum would be useful.
First, I need to check the historical data
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Benchmark ──────────────────────────────┐
Prompt processing: 1411.6 tokens/sec ( 72 tokens in 0.1s)
Tokens generation: 469.2 tokens/sec (400 tokens in 0.9s)
└───────────────────────────────────────────────────────────────────────┘
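The source does not describe how rcr-lm batches prompts of different lengths. A common approach for decoder-only models is to left-pad every prompt to the length of the longest one and mask the padding, so that the last position of every row is a real token and all rows can be decoded in lockstep. The 72 prompt tokens reported above for four prompts would be consistent with four rows of 18 tokens, but that is only an inference. The sketch below illustrates the general technique with made-up token IDs; `left_pad` and `PAD_ID` are hypothetical names, not part of rcr-lm.

```python
PAD_ID = 0  # hypothetical padding token ID

def left_pad(batch, pad_id=PAD_ID):
    """Left-pad token-ID lists to a common length and build a 0/1 mask."""
    width = max(len(ids) for ids in batch)
    padded = [[pad_id] * (width - len(ids)) + list(ids) for ids in batch]
    mask = [[0] * (width - len(ids)) + [1] * len(ids) for ids in batch]
    return padded, mask

# Made-up token IDs standing in for three tokenized prompts.
prompts = [[11, 12, 13], [21, 22, 23, 24, 25], [31]]
padded, mask = left_pad(prompts)
for row, row_mask in zip(padded, mask):
    print(row, row_mask)
```

Left padding (rather than right padding) keeps the most recent real token in the final column of every row, which is the position the next token is sampled from.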
Efficient Fine-Tuning
Supports DoRA (Weight-Decomposed Low-Rank Adaptation) for parameter-efficient training workflows.
from rcrlm import load, infer, train
m = load()
lora_test_path = 'test_lora.safetensors'
train("RandomNameAnd6/SVGenerator", **m, lora_cfg=dict(wt_to=lora_test_path))
del m
m = load()
_ = infer("medium red circle\n", **m, lora_path=lora_test_path, stream=False, max_new_tokens=256, use_jit=False)
〄 Testing DoRA training...
epoch= 0 avg_loss= 0.29 elp_train= 10.18
└ test output: ['<svg width="100" height="100" viewBox="-50 -5']
epoch= 1 avg_loss= 0.04 elp_train= 11.19
└ test output: ['<svg width="100" height="100" viewBox="-50 -5']
〄 Testing DoRA decoding...
┌────────────────────────────── Inp 00000 ──────────────────────────────┐
<|im_start|>user
medium red circle<|im_end|>
<|im_start|>assistant
<think>
</think>
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00000 ──────────────────────────────┐
<svg width="100" height="100" viewBox="-50 -50 100 100" xmlns="http://www.w3.org/2000/svg"><circle cx="0" cy="0" r="18" fill="#f6f448"/></svg>
<|im_end|>
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Benchmark ──────────────────────────────┐
Prompt processing: 435.0 tokens/sec ( 15 tokens in 0.0s)
Tokens generation: 162.9 tokens/sec (256 tokens in 1.6s)
└───────────────────────────────────────────────────────────────────────┘
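For readers unfamiliar with DoRA: the method splits each pretrained weight matrix into a magnitude vector and a direction matrix, applies a low-rank (LoRA-style) update to the direction, and trains the magnitude separately, giving W' = m * (W0 + BA) / ||W0 + BA|| with the norm taken per column. The pure-Python sketch below follows that formula on a tiny matrix. It illustrates the published technique only, not rcr-lm's implementation, and every function name in it is hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def column_norms(w):
    """Euclidean norm of each column of w."""
    return [math.sqrt(sum(v * v for v in col)) for col in zip(*w)]

def dora_weight(w0, b, a, m):
    """Return m * (w0 + b @ a) / ||w0 + b @ a||, with the norm taken per column."""
    delta = matmul(b, a)
    v = [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(w0, delta)]
    norms = column_norms(v)
    return [[m_j * x / n_j for x, m_j, n_j in zip(row, m, norms)] for row in v]

w0 = [[3.0, 0.0], [4.0, 2.0]]  # frozen pretrained weight (2 x 2)
b = [[0.0], [0.0]]             # low-rank factor B (2 x 1), initialised to zero
a = [[0.5, -0.5]]              # low-rank factor A (1 x 2)
m = column_norms(w0)           # magnitude initialised to the column norms of w0

w_init = dora_weight(w0, b, a, m)               # with B = 0 this reproduces w0
w_step = dora_weight(w0, [[0.1], [0.2]], a, m)  # after a made-up update to B
```

Because the direction is renormalised, changing `B` alone rotates each column without changing its length; only `m` controls the column magnitudes.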
Also supports KD (Knowledge Distillation) from a teacher model.
m = load()
m['model'] = collapse(m['model'])
teacher = load()['model']
m['model'] = distill("HuggingFaceH4/instruction-dataset", **m, teacher=teacher)
_ = infer("Write a story about Einstein\n", **m, stream=False, max_new_tokens=1024)
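The source does not show which loss `distill` minimises. As a generic illustration of knowledge distillation only, the sketch below computes the classic temperature-softened KL divergence between teacher and student logits for a single token position. The function names and the temperature value are assumptions, and rcr-lm may well use a different objective.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2

teacher = [2.0, 1.0, 0.1]
print(kd_loss(teacher, teacher))          # identical logits give zero loss
print(kd_loss([0.1, 1.0, 2.0], teacher))  # mismatched logits give a positive loss
```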
Integrated Evaluation
Native integration with lm-evaluation-harness. Benchmark vanilla and customized models on standard benchmarks (MMLU, GSM8K, etc.) with a single call.
m = load()
eval_str = f'✓ Original:\n{eval_lm(**m)}\n'
m['model'] = collapse(m['model'])
eval_str += f'✓ Collapsed:\n{eval_lm(**m)}\n'
teacher = load()['model']
m['model'] = distill("HuggingFaceH4/instruction-dataset", **m, teacher=teacher)
eval_str += f'✓ Healed:\n{eval_lm(**m)}\n'
m['model'] = dampen(m['model'])
eval_str += f'✓ Dampened:\n{eval_lm(**m)}\n'
print(eval_str)
✓ Original:
- GPQA : 0.40
- GSM8k : 0.30
- MGSM : 0.50
- MMLU : 0.42
✓ Collapsed:
- GPQA : 0.25
- GSM8k : 0.00
- MGSM : 0.00
- MMLU : 0.25
✓ Healed:
- GPQA : 0.25
- GSM8k : 0.10
- MGSM : 0.05
- MMLU : 0.31
✓ Dampened:
- GPQA : 0.25
- GSM8k : 0.10
- MGSM : 0.05
- MMLU : 0.30
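To read these results at a glance, each score can be expressed as a fraction of the original model's score. The snippet below only re-tabulates the numbers printed above (variable names are illustrative); the scores are reported to two decimal places, so the ratios are coarse.

```python
# Scores copied from the evaluation output above.
scores = {
    "Original":  {"GPQA": 0.40, "GSM8k": 0.30, "MGSM": 0.50, "MMLU": 0.42},
    "Collapsed": {"GPQA": 0.25, "GSM8k": 0.00, "MGSM": 0.00, "MMLU": 0.25},
    "Healed":    {"GPQA": 0.25, "GSM8k": 0.10, "MGSM": 0.05, "MMLU": 0.31},
    "Dampened":  {"GPQA": 0.25, "GSM8k": 0.10, "MGSM": 0.05, "MMLU": 0.30},
}

original = scores["Original"]
# Fraction of the original score retained at each stage, per benchmark.
retained = {
    stage: {task: value / original[task] for task, value in results.items()}
    for stage, results in scores.items()
}

for stage, ratios in retained.items():
    cells = "  ".join(f"{task} {ratio:4.0%}" for task, ratio in ratios.items())
    print(f"{stage:<10} {cells}")
```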
Code adapted from [nnx-lm](https://pypi.org/project/nnx-lm/) to try out a few experiments.
## Etc~/D/rcr[1 jobs]> python -m rcrlm.main 〄 Testing vanilla decoding... ┌────────────────────────────── Streaming ──────────────────────────────┐ <think> Okay, the user wants a story about Einstein. Let me start by recalling Einstein's life. He was a genius, a scientist, and a philosopher. I need to make sure the story includes his contributions to science, maybe his work on relativity, and his personal life. First, I should set the scene. Maybe start with his early life in Germany, where he was born. Then introduce his family, his parents, maybe his mother's influence. Then his education, the famous lectures, and his breakthroughs. I need to highlight his scientific achievements, like the theory of relativity. Also, his personal struggles, like the time he spent in the Alps, the Alps being a place of isolation and inspiration. I should include some quotes or references to his work. Maybe mention his quote about the universe being infinite. Also, his later years and how he passed away. Wait, the user might want the story to be engaging and highlight his legacy. I need to make sure the story flows well, with a good narrative arc. Avoid clichés, but still capture his essence. Check for any inaccuracies, like his actual birth date and death year. Let me confirm: Einstein was born on April 14, 18 └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Inp 00000 ──────────────────────────────┐ <|im_start|>user Write a story about Einstein <|im_end|> <|im_start|>assistant └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Out 00000 ──────────────────────────────┐ <think> Okay, the user wants a story about Einstein. Let me start by recalling Einstein's life. He was a genius, a scientist, and a philosopher. I need to make sure the story includes his contributions to science, maybe his work on relativity, and his personal life. First, I should set the scene. 
Maybe start with his early life in Germany, where he was born. Then introduce his family, his parents, maybe his mother's influence. Then his education, the famous lectures, and his breakthroughs. I need to highlight his scientific achievements, like the theory of relativity. Also, his personal struggles, like the time he spent in the Alps, the Alps being a place of isolation and inspiration. I should include some quotes or references to his work. Maybe mention his quote about the universe being infinite. Also, his later years and how he passed away. Wait, the user might want the story to be engaging and highlight his legacy. I need to make sure the story flows well, with a good narrative arc. Avoid clichés, but still capture his essence. Check for any inaccuracies, like his actual birth date and death year. Let me confirm: Einstein was born on April 14, 18 └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Benchmark ──────────────────────────────┐ Prompt processing: 245.8 tokens/sec ( 14 tokens in 0.1s) Tokens generation: 200.0 tokens/sec (256 tokens in 1.3s) └───────────────────────────────────────────────────────────────────────┘ 〄 Testing batch decoding... ┌────────────────────────────── Streaming ──────────────────────────────┐ <think> Okay, the user is asking for a comparison between the Sortino Ratio for Bitcoin and Ethereum. Let me start by recalling what the Sortino Ratio is. It's a measure of risk-adjusted return for a portfolio, right? It's calculated as (Return - Risk-Free Rate) divided by the Risk (Standard Deviation). The Sortino Ratio is usually used for a single asset, so comparing it between Bitcoin and Ethereum would be useful. 
First, I need to check the historical data └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Inp 00000 ──────────────────────────────┐ <|im_start|>user #write a quick sort algorithm <|im_end|> <|im_start|>assistant └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Out 00000 ──────────────────────────────┐ <think> Okay, I need to write a quick sort algorithm. Let me think about how to approach this. Quick sort is a divide-and-conquer algorithm, right? The basic idea is to select a pivot element, partition the array into elements less than or equal to the pivot and greater than or equal to it, and then recursively sort the subarrays. First, I should outline the steps. The algorithm should have a function that takes an array and a pivot index. Wait, but how do └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Inp 00001 ──────────────────────────────┐ <|im_start|>user Give me a short introduction to large language model. <|im_end|> <|im_start|>assistant └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Out 00001 ──────────────────────────────┐ <think> Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they can process and generate text, not just a few words. Then explain their training data, like the amount of text they're trained on. Also, their capabilities: understanding and generating text, answering questions, etc. Need └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Inp 00002 ──────────────────────────────┐ <|im_start|>user Write a neurology ICU admission note. 
<|im_end|> <|im_start|>assistant └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Out 00002 ──────────────────────────────┐ <think> Okay, I need to write a neurology ICU admission note. Let me start by recalling what an ICU admission note typically includes. It's a medical record that outlines the patient's condition, initial assessment, interventions, and any ongoing care. First, the patient's name and date of admission. I should make sure to include that. Then, the patient's name, age, gender, and primary diagnosis. Since it's a neurology ICU, the main issue is likely a neurological └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Inp 00003 ──────────────────────────────┐ <|im_start|>user Comparison of Sortino Ratio for Bitcoin and Ethereum.<|im_end|> <|im_start|>assistant └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Out 00003 ──────────────────────────────┐ <think> Okay, the user is asking for a comparison between the Sortino Ratio for Bitcoin and Ethereum. Let me start by recalling what the Sortino Ratio is. It's a measure of risk-adjusted return for a portfolio, right? It's calculated as (Return - Risk-Free Rate) divided by the Risk (Standard Deviation). The Sortino Ratio is usually used for a single asset, so comparing it between Bitcoin and Ethereum would be useful. First, I need to check the historical data └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Benchmark ──────────────────────────────┐ Prompt processing: 1389.8 tokens/sec ( 72 tokens in 0.1s) Tokens generation: 471.5 tokens/sec (400 tokens in 0.8s) └───────────────────────────────────────────────────────────────────────┘ 〄 Testing DoRA training... 
epoch= 0 avg_loss= 0.27 elp_train= 13.35 └ test output: ['<svg width="100" height="100" viewBox="-50 -5'] epoch= 1 avg_loss= 0.05 elp_train= 13.37 └ test output: ['<svg width="100" height="100" viewBox="-50 -5'] 〄 Testing DoRA decoding... ┌────────────────────────────── Inp 00000 ──────────────────────────────┐ <|im_start|>user medium red circle<|im_end|> <|im_start|>assistant <think> </think> └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Out 00000 ──────────────────────────────┐ <svg width="100" height="100" viewBox="-50 -50 100 100" xmlns="http://www.w3.org/2000/svg"><circle cx="0" cy="0" r="25" fill="#f6f5f4"/></svg> └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Benchmark ──────────────────────────────┐ Prompt processing: 178.3 tokens/sec ( 15 tokens in 0.1s) Tokens generation: 79.0 tokens/sec (256 tokens in 3.2s) └───────────────────────────────────────────────────────────────────────┘ 〄 Testing collapse... 
✓ Colapsed: ┌────────────────────────────── Inp 00000 ──────────────────────────────┐ <|im_start|>user Write a story about Einstein <|im_end|> <|im_start|>assistant └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Out 00000 ──────────────────────────────┐ </think> **The Story of Einstein: A Brief version** *by: The story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of the story of └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Benchmark ──────────────────────────────┐ Prompt processing: 492.6 tokens/sec ( 14 tokens in 0.0s) Tokens generation: 240.8 tokens/sec (100 tokens in 0.4s) └───────────────────────────────────────────────────────────────────────┘ teacher_to_student=[(16, 13)] layer_idx=13 epoch= 0 avg_loss= 2.56 elp_train= 95.08 └ test output: ["I'm sorry, I can't help with that request. It's a medium red circle. Let"] array(-0.0832719, dtype=float32) array(0.548253, dtype=float32) ✓ Healed: ┌────────────────────────────── Inp 00000 ──────────────────────────────┐ <|im_start|>user Write a story about Einstein <|im_end|> <|im_start|>assistant └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Out 00000 ──────────────────────────────┐ <think> Okay, so I need to write a story about Einstein. Let me think about the key elements. Einstein was a famous scientist, right? He was known for his work in relativity and quantum mechanics. The story should highlight his contributions, maybe his personal life, and his impact on science. Let me start by setting the scene. 
Einstein was born in 1905, so I need to include his early life, his work, and his legacy. Maybe he had a difficult childhood, like his parents were working in a small town. I should mention his early achievements, like his work on the theory of relativity, and his later contributions. The story should show his journey from a young boy to a respected scientist, and his impact on the world. I need to include elements like his personal life, his work, and his legacy. Let me think about the structure. Start with his early life, his work, and his legacy. Maybe end with his death and his impact. I should make sure to include his scientific contributions and his personal life. Let me make sure to highlight his scientific achievements and his personal life. Maybe include a moment of personal struggle, like his parents being in a difficult situation. I need to make sure the story flows well and includes all the key points. Let me check if all the elements are covered: his early life, work, contributions, and legacy. I think that's all. </think> **Einstein's Journey** In the quiet streets of Vienna, Einstein was born in 1905, a boy of 13. His parents, parents, and his friends, all around him, were working in small factories. He was known for his curiosity and brilliance, and his parents were both scientists. He was a curious and imaginative child, who would often ask questions about the universe. He was fascinated by the idea of time and space, and he always asked questions that led him to the answers. His parents, both scientists, were able to answer his questions with great confidence. Einstein's childhood was marked by a mix of curiosity and determination. He was always fascinated by the mysteries of the universe, and he often asked questions that led him to the answers. He was always the one who took the lead in his studies. He was known for his ability to think and solve problems, and he always took the lead in his studies. 
He was always the one who took the lead in his studies. Einstein's parents were both scientists, and he was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. **The End.** --- **Einstein's Legacy** Einstein's work in relativity and quantum mechanics revolutionized modern science. He was known for his ability to think and solve problems, and he was always the one who took the lead in his studies. He was known for his ability to think and solve problems, and he was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. 
He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always the one who took the lead in his studies. He was always └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Benchmark ──────────────────────────────┐ Prompt processing: 130.4 tokens/sec ( 14 tokens in 0.1s) Tokens generation: 57.8 tokens/sec (1024 tokens in 17.7s) └───────────────────────────────────────────────────────────────────────┘ [──────────────────────────────] Processing model.layers.13.layer.mlp.up_proj... -> Found 912 candidates. -> Subtracting 70% of rank 200 (Sim: 0.1876) [──────────────────────────────] Processing model.layers.13.layer.mlp.down_proj... -> Found 983 candidates. -> Subtracting 70% of rank 879 (Sim: 0.1530) [──────────────────────────────] Processing model.layers.13.layer.mlp.gate_proj... -> Found 982 candidates. -> Subtracting 70% of rank 413 (Sim: 0.1497) [──────────────────────────────] Processing model.layers.13.layer.self_attn.o_proj... -> Found 1018 candidates. -> Subtracting 70% of rank 406 (Sim: 0.1630) [──────────────────────────────] Processing model.layers.13.layer.self_attn.v_proj... -> Found 615 candidates. -> Subtracting 70% of rank 229 (Sim: 0.2665) [──────────────────────────────] Processing model.layers.13.layer.self_attn.k_proj... -> Found 960 candidates. -> Subtracting 70% of rank 407 (Sim: 0.1506) [──────────────────────────────] Processing model.layers.13.layer.self_attn.q_proj... -> Found 992 candidates. -> Subtracting 70% of rank 443 (Sim: 0.1397) [──────────────────────────────] Successfully healed and merged 7 layers. 
✓ Dampened: ┌────────────────────────────── Inp 00000 ──────────────────────────────┐ <|im_start|>user Write a story about Einstein <|im_end|> <|im_start|>assistant └───────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────── Out 00000 ──────────────────────────────┐ <think> Okay, I need to write a story about Einstein. Let me think about the key elements. First, Einstein was a famous scientist, so I should focus on his life. He was born in 1905, lived in the 20th century, and died in 1915. His work in relativity and quantum mechanics are central. I should highlight his contributions, maybe his theories, and his impact on science. Let me structure the story in three parts: his early life, his work, and his legacy. I need to make sure the story flows well and includes his key ideas. Let me check if all the details are accurate. Einstein's death in 1915 is correct. His theories in relativity are accurate. His impact is significant. I should make sure the story is engaging and highlights his achievements. Let me make sure the characters are well-developed and the plot is clear. I need to avoid clichés and make the story relatable. Let me think about the structure. Start with his early life, his work, and his legacy. Maybe end with a reflection on his impact. I should make sure the story is engaging and highlights his contributions. Let me check if all the details are accurate. Einstein's death in 1915 is correct. His theories in relativity are accurate. His impact is significant. I think that's a solid story. </think> **Einstein's Legacy** In the quiet halls of the University of Zurich, Einstein’s name is etched in history. Born in 1905, he was a prodigious young scientist whose genius defied the limits of human understanding. His work in relativity and quantum mechanics laid the foundation for modern physics. 
His theories, particularly the ones that challenged the classical understanding of space and time, would become the cornerstone of modern science. His legacy endures, and his name is remembered as one of the greatest minds in human history. Einstein’s early life was marked by a relentless curiosity and a passion for discovery. He was a prodigious student, who quickly became one of the most influential figures in the field. His work in relativity and quantum mechanics revolutionized the understanding of space and time. He proposed the theory of relativity, which later became the basis for modern physics. His theories, particularly the ones that challenged the classical understanding of space and time, were revolutionary. He was known for his ability to think creatively and to challenge the established scientific paradigms. His work in relativity and quantum mechanics has become the foundation of modern science. Einstein’s legacy is one of brilliance and innovation. He was a pioneer in both fields, and his theories have become the basis of modern physics. His work in relativity and quantum mechanics has not only changed the course of human history but has also inspired future generations of scientists. His theories have not only changed the course of human history but have also inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but has also inspired future generations of scientists. Einstein’s legacy is one of brilliance and innovation. He was a pioneer in both fields, and his work in relativity and quantum mechanics has become the foundation of modern physics. His theories have not only changed the course of human history but have also inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have also inspired future generations of scientists. 
His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. His work in relativity and quantum mechanics has not only changed the course of human history but have inspired future generations of scientists. 
His work in relativity and quantum mechanics has not only changed the course of
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Benchmark ──────────────────────────────┐
Prompt processing: 297.4 tokens/sec ( 14 tokens in 0.0s)
Tokens generation: 134.2 tokens/sec (1024 tokens in 7.6s)
└───────────────────────────────────────────────────────────────────────┘
〄 Testing lm-eval...
Evaluating generation: 100%|██████████| 40/40 [12:34<00:00, 18.87s/it]
Evaluating loglikelihood: 100%|██████████| 4640/4640 [04:25<00:00, 17.49it/s]
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|---------------------------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|gpqa_main_zeroshot | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| | |none | 0|acc_norm |↑ |0.4000|± |0.1124|
|gsm8k | 3|flexible-extract | 0|exact_match|↑ |0.3000|± |0.1051|
| | |strict-match | 0|exact_match|↑ |0.0000|± |0.0000|
|mgsm_direct_zh | 3|flexible-extract | 0|exact_match|↑ |0.5000|± |0.1147|
| | |remove_whitespace| 0|exact_match|↑ |0.0000|± |0.0000|
|mmlu | 2|none | |acc |↑ |0.4193|± |0.0145|
| - humanities | 2|none | |acc |↑ |0.4385|± |0.0310|
| - formal_logic | 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - high_school_european_history | 1|none | 0|acc |↑ |0.5500|± |0.1141|
| - high_school_us_history | 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - high_school_world_history | 1|none | 0|acc |↑ |0.5500|± |0.1141|
| - international_law | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - jurisprudence | 1|none | 0|acc |↑ |0.5500|± |0.1141|
| - logical_fallacies | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - moral_disputes | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - moral_scenarios | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - philosophy | 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - prehistory | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - professional_law | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - world_religions | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - other | 2|none | |acc |↑ |0.4077|± |0.0299|
| - business_ethics | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - clinical_knowledge | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - college_medicine | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - global_facts | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - human_aging | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - management | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - marketing | 1|none | 0|acc |↑ |0.6000|± |0.1124|
| - medical_genetics | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - miscellaneous | 1|none | 0|acc |↑ |0.6000|± |0.1124|
| - nutrition | 1|none | 0|acc |↑ |0.6000|± |0.1124|
| - professional_accounting | 1|none | 0|acc |↑ |0.1000|± |0.0688|
| - professional_medicine | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - virology | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - social sciences | 2|none | |acc |↑ |0.4292|± |0.0309|
| - econometrics | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - high_school_geography | 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - high_school_government_and_politics| 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - high_school_macroeconomics | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - high_school_microeconomics | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - high_school_psychology | 1|none | 0|acc |↑ |0.6000|± |0.1124|
| - human_sexuality | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - professional_psychology | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - public_relations | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - security_studies | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - sociology | 1|none | 0|acc |↑ |0.7000|± |0.1051|
| - us_foreign_policy | 1|none | 0|acc |↑ |0.6500|± |0.1094|
| - stem | 2|none | |acc |↑ |0.4079|± |0.0251|
| - abstract_algebra | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - anatomy | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - astronomy | 1|none | 0|acc |↑ |0.6000|± |0.1124|
| - college_biology | 1|none | 0|acc |↑ |0.5500|± |0.1141|
| - college_chemistry | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - college_computer_science | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - college_mathematics | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - college_physics | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - computer_security | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - conceptual_physics | 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - electrical_engineering | 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - elementary_mathematics | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - high_school_biology | 1|none | 0|acc |↑ |0.5500|± |0.1141|
| - high_school_chemistry | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - high_school_computer_science | 1|none | 0|acc |↑ |0.5500|± |0.1141|
| - high_school_mathematics | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_physics | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - high_school_statistics | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - machine_learning | 1|none | 0|acc |↑ |0.3500|± |0.1094|
- GPQA : 0.40
- GSM8k : 0.30
- MGSM : 0.50
- MMLU : 0.42
Starting lm-evaluation-harness on: ['mmlu', 'gpqa_main_zeroshot', 'gsm8k', 'mgsm_direct_zh']
Evaluating generation: 100%|██████████| 40/40 [26:13<00:00, 39.33s/it]
Evaluating loglikelihood: 100%|██████████| 4640/4640 [04:08<00:00, 18.69it/s]
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|---------------------------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|gpqa_main_zeroshot | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| | |none | 0|acc_norm |↑ |0.2500|± |0.0993|
|gsm8k | 3|flexible-extract | 0|exact_match|↑ |0.0000|± |0.0000|
| | |strict-match | 0|exact_match|↑ |0.0000|± |0.0000|
|mgsm_direct_zh | 3|flexible-extract | 0|exact_match|↑ |0.0000|± |0.0000|
| | |remove_whitespace| 0|exact_match|↑ |0.0000|± |0.0000|
|mmlu | 2|none | |acc |↑ |0.2509|± |0.0128|
| - humanities | 2|none | |acc |↑ |0.2154|± |0.0257|
| - formal_logic | 1|none | 0|acc |↑ |0.1000|± |0.0688|
| - high_school_european_history | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - high_school_us_history | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_world_history | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - international_law | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - jurisprudence | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - logical_fallacies | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - moral_disputes | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - moral_scenarios | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - philosophy | 1|none | 0|acc |↑ |0.1000|± |0.0688|
| - prehistory | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - professional_law | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - world_religions | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - other | 2|none | |acc |↑ |0.2577|± |0.0272|
| - business_ethics | 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - clinical_knowledge | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - college_medicine | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - global_facts | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - human_aging | 1|none | 0|acc |↑ |0.1000|± |0.0688|
| - management | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - marketing | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - medical_genetics | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - miscellaneous | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - nutrition | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - professional_accounting | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - professional_medicine | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - virology | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - social sciences | 2|none | |acc |↑ |0.2625|± |0.0280|
| - econometrics | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_geography | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - high_school_government_and_politics| 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - high_school_macroeconomics | 1|none | 0|acc |↑ |0.1000|± |0.0688|
| - high_school_microeconomics | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - high_school_psychology | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - human_sexuality | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - professional_psychology | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - public_relations | 1|none | 0|acc |↑ |0.5500|± |0.1141|
| - security_studies | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - sociology | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - us_foreign_policy | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - stem | 2|none | |acc |↑ |0.2632|± |0.0227|
| - abstract_algebra | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - anatomy | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - astronomy | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - college_biology | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - college_chemistry | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - college_computer_science | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - college_mathematics | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - college_physics | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - computer_security | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - conceptual_physics | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - electrical_engineering | 1|none | 0|acc |↑ |0.1000|± |0.0688|
| - elementary_mathematics | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - high_school_biology | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - high_school_chemistry | 1|none | 0|acc |↑ |0.1000|± |0.0688|
| - high_school_computer_science | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - high_school_mathematics | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - high_school_physics | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_statistics | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - machine_learning | 1|none | 0|acc |↑ |0.2500|± |0.0993|
- GPQA : 0.25
- GSM8k : 0.00
- MGSM : 0.00
- MMLU : 0.25
teacher_to_student=[(16, 13)]
layer_idx=13 epoch= 0 avg_loss= 2.58 elp_train= 120.82
└ test output: ["I'm sorry, I can't help with that request. It's a medium red circle. Let"]
array(-0.0801707, dtype=float32)
array(0.548827, dtype=float32)
Starting lm-evaluation-harness on: ['mmlu', 'gpqa_main_zeroshot', 'gsm8k', 'mgsm_direct_zh']
Evaluating generation: 100%|██████████| 40/40 [57:56<00:00, 86.91s/it]
Evaluating loglikelihood: 100%|██████████| 4640/4640 [06:22<00:00, 12.12it/s]
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|---------------------------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|gpqa_main_zeroshot | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| | |none | 0|acc_norm |↑ |0.2500|± |0.0993|
|gsm8k | 3|flexible-extract | 0|exact_match|↑ |0.0000|± |0.0000|
| | |strict-match | 0|exact_match|↑ |0.0000|± |0.0000|
|mgsm_direct_zh | 3|flexible-extract | 0|exact_match|↑ |0.0500|± |0.0500|
| | |remove_whitespace| 0|exact_match|↑ |0.0000|± |0.0000|
|mmlu | 2|none | |acc |↑ |0.2667|± |0.0130|
| - humanities | 2|none | |acc |↑ |0.2846|± |0.0283|
| - formal_logic | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - high_school_european_history | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_us_history | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_world_history | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - international_law | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - jurisprudence | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - logical_fallacies | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - moral_disputes | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - moral_scenarios | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - philosophy | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - prehistory | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - professional_law | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - world_religions | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - other | 2|none | |acc |↑ |0.2923|± |0.0283|
| - business_ethics | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - clinical_knowledge | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - college_medicine | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - global_facts | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - human_aging | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - management | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - marketing | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - medical_genetics | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - miscellaneous | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - nutrition | 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - professional_accounting | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - professional_medicine | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - virology | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - social sciences | 2|none | |acc |↑ |0.2375|± |0.0271|
| - econometrics | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_geography | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_government_and_politics| 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_macroeconomics | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - high_school_microeconomics | 1|none | 0|acc |↑ |0.0500|± |0.0500|
| - high_school_psychology | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - human_sexuality | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - professional_psychology | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - public_relations | 1|none | 0|acc |↑ |0.1000|± |0.0688|
| - security_studies | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - sociology | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - us_foreign_policy | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - stem | 2|none | |acc |↑ |0.2553|± |0.0220|
| - abstract_algebra | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - anatomy | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - astronomy | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - college_biology | 1|none | 0|acc |↑ |0.5500|± |0.1141|
| - college_chemistry | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - college_computer_science | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - college_mathematics | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - college_physics | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - computer_security | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - conceptual_physics | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - electrical_engineering | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - elementary_mathematics | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_biology | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - high_school_chemistry | 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - high_school_computer_science | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - high_school_mathematics | 1|none | 0|acc |↑ |0.0500|± |0.0500|
| - high_school_physics | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - high_school_statistics | 1|none | 0|acc |↑ |0.0500|± |0.0500|
| - machine_learning | 1|none | 0|acc |↑ |0.1500|± |0.0819|
- GPQA : 0.25
- GSM8k : 0.00
- MGSM : 0.05
- MMLU : 0.27
[──────────────────────────────] Processing model.layers.13.layer.mlp.up_proj...
  -> Found 912 candidates.
  -> Subtracting 70% of rank 345 (Sim: 0.1820)
[──────────────────────────────] Processing model.layers.13.layer.mlp.down_proj...
  -> Found 985 candidates.
  -> Subtracting 70% of rank 832 (Sim: 0.1604)
[──────────────────────────────] Processing model.layers.13.layer.mlp.gate_proj...
  -> Found 980 candidates.
  -> Subtracting 70% of rank 450 (Sim: 0.1510)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.o_proj...
  -> Found 1016 candidates.
  -> Subtracting 70% of rank 857 (Sim: 0.1532)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.v_proj...
  -> Found 615 candidates.
  -> Subtracting 70% of rank 86 (Sim: 0.2683)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.k_proj...
  -> Found 965 candidates.
  -> Subtracting 70% of rank 248 (Sim: 0.1629)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.q_proj...
  -> Found 993 candidates.
  -> Subtracting 70% of rank 286 (Sim: 0.1399)
[──────────────────────────────] Successfully healed and merged 7 layers.
Starting lm-evaluation-harness on: ['mmlu', 'gpqa_main_zeroshot', 'gsm8k', 'mgsm_direct_zh']
Evaluating generation: 100%|██████████| 40/40 [30:46<00:00, 46.16s/it]
Evaluating loglikelihood: 100%|██████████| 4640/4640 [04:04<00:00, 18.98it/s]
| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
|---------------------------------------|------:|-----------------|-----:|-----------|---|-----:|---|-----:|
|gpqa_main_zeroshot | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| | |none | 0|acc_norm |↑ |0.2500|± |0.0993|
|gsm8k | 3|flexible-extract | 0|exact_match|↑ |0.0000|± |0.0000|
| | |strict-match | 0|exact_match|↑ |0.0000|± |0.0000|
|mgsm_direct_zh | 3|flexible-extract | 0|exact_match|↑ |0.0500|± |0.0500|
| | |remove_whitespace| 0|exact_match|↑ |0.0000|± |0.0000|
|mmlu | 2|none | |acc |↑ |0.2640|± |0.0130|
| - humanities | 2|none | |acc |↑ |0.2731|± |0.0277|
| - formal_logic | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - high_school_european_history | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - high_school_us_history | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - high_school_world_history | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - international_law | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - jurisprudence | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - logical_fallacies | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - moral_disputes | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - moral_scenarios | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - philosophy | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - prehistory | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - professional_law | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - world_religions | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - other | 2|none | |acc |↑ |0.2923|± |0.0283|
| - business_ethics | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - clinical_knowledge | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - college_medicine | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - global_facts | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - human_aging | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - management | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - marketing | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - medical_genetics | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - miscellaneous | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - nutrition | 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - professional_accounting | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - professional_medicine | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - virology | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - social sciences | 2|none | |acc |↑ |0.2458|± |0.0274|
| - econometrics | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_geography | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - high_school_government_and_politics| 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_macroeconomics | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - high_school_microeconomics | 1|none | 0|acc |↑ |0.1000|± |0.0688|
| - high_school_psychology | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - human_sexuality | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - professional_psychology | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - public_relations | 1|none | 0|acc |↑ |0.1000|± |0.0688|
| - security_studies | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - sociology | 1|none | 0|acc |↑ |0.5000|± |0.1147|
| - us_foreign_policy | 1|none | 0|acc |↑ |0.4000|± |0.1124|
| - stem | 2|none | |acc |↑ |0.2500|± |0.0218|
| - abstract_algebra | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - anatomy | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - astronomy | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - college_biology | 1|none | 0|acc |↑ |0.6000|± |0.1124|
| - college_chemistry | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - college_computer_science | 1|none | 0|acc |↑ |0.3500|± |0.1094|
| - college_mathematics | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - college_physics | 1|none | 0|acc |↑ |0.2000|± |0.0918|
| - computer_security | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - conceptual_physics | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - electrical_engineering | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - elementary_mathematics | 1|none | 0|acc |↑ |0.2500|± |0.0993|
| - high_school_biology | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - high_school_chemistry | 1|none | 0|acc |↑ |0.4500|± |0.1141|
| - high_school_computer_science | 1|none | 0|acc |↑ |0.3000|± |0.1051|
| - high_school_mathematics | 1|none | 0|acc |↑ |0.0500|± |0.0500|
| - high_school_physics | 1|none | 0|acc |↑ |0.1500|± |0.0819|
| - high_school_statistics | 1|none | 0|acc |↑ |0.0500|± |0.0500|
| - machine_learning | 1|none | 0|acc |↑ |0.1500|± |0.0819|
- GPQA : 0.25
- GSM8k : 0.00
- MGSM : 0.05
- MMLU : 0.26
✓ Original:
- GPQA : 0.40
- GSM8k : 0.30
- MGSM : 0.50
- MMLU : 0.42
✓ Collapsed:
- GPQA : 0.25
- GSM8k : 0.00
- MGSM : 0.00
- MMLU : 0.25
✓ Healed:
- GPQA : 0.25
- GSM8k : 0.00
- MGSM : 0.05
- MMLU : 0.27
✓ Dampened:
- GPQA : 0.25
- GSM8k : 0.00
- MGSM : 0.05
- MMLU : 0.26
〄 Testing cascading...
teacher_to_student=[(14, 13)]
layer_idx=13 epoch= 0 avg_loss= 0.88 elp_train= 119.60
└ test output: ["I'm sorry, but I can't assist with that request. Please provide more details or clarify your"]
array(-0.147721, dtype=float32)
array(0.427817, dtype=float32)
teacher_to_student=[(16, 14)]
layer_idx=14 epoch= 0 avg_loss= 2.13 elp_train= 81.87
└ test output: ["I'm sorry, I can't help with that request. Let me know if you have any questions"]
array(-0.129033, dtype=float32)
array(0.54482, dtype=float32)
✓ Cascaded:
┌────────────────────────────── Inp 00000 ──────────────────────────────┐
<|im_start|>user
Write a story about Einstein
<|im_end|>
<|im_start|>assistant
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00000 ──────────────────────────────┐
<think>
Okay, so I need to write a story about Einstein. Let me think about the key elements. Einstein is a famous scientist, right? He was a genius, and I should highlight his contributions. Maybe start with his early life, how he became a scientist, and his impact.
Maybe include some challenges, like his family struggles, and how he overcame them. I should make it relatable. Let's see... Maybe his early days in Vienna, his family's struggles, and his later achievements. How does he overcome his personal challenges? Maybe his family had to move to another place, and he had to work hard. I need to show his growth and how he overcame obstacles. Maybe include some moments where he faced adversity, like a personal loss or a scientific challenge. Let me think about the structure. Start with his early life, then his career, his personal life, and his legacy. Maybe end with his impact on the world. Let me outline the main points: his early life, his career, personal struggles, and his legacy. I need to make it engaging. Maybe include some specific details, like his time in France, his work in the lab, and his later years. How does he overcome his personal challenges? Maybe he had to move to another country, like the U.S., and faced hardships. Maybe his family had to move to another place, and he had to work hard. I need to show his growth and how he overcame these challenges. Let me think about the flow. Start with his early life, then his career, then his personal life, and his legacy. Maybe end with his impact on the world. I need to make it relatable. Let me think about the structure. Maybe start with his early life, then his career, then his personal life, and his legacy. I need to make it engaging. Let me think about the details. Maybe include specific moments, like his time in France, his work in the lab, and his later years. How does he overcome his personal challenges? Maybe he had to move to another country, like the U.S., and faced hardships. Maybe his family had to move to another place, and he had to work hard. I need to make it relatable. Let me think about the details. Maybe include specific events, like his time in France, his work in the lab, and his later years. How does he overcome his personal challenges? 
Maybe he had to move to another country, like the U.S., and faced hardships. Maybe his family had to move to another place, and he had to work hard. I need to make it relatable. Let me think about the structure. Maybe start with his early life, then his career, then his personal life, and his legacy. I need to make it engaging. Let me think about the details. Maybe include specific events, like his time in France, his work in the lab, and his later years. How does he overcome his personal challenges? Maybe he had to move to another country, like the U.S., and faced hardships. Maybe his family had to move to another place, and he had to work hard. I need to make it relatable. Let me think about the details. Maybe include specific events, like his time in France, his work in the lab, and his later years. How does he overcome his personal challenges? Maybe he had to move to another country, like the U.S., and faced hardships. Maybe his family had to move to another place, and he had to work hard. I need to make it relatable. Let me think about the structure. Maybe start with his early life, then his career, then his personal life, and his legacy. I need to make it engaging. Let me think about the details. Maybe include specific events, like his time in France, his work in the lab, and his later years. How does he overcome his personal challenges? Maybe he had to move to another country, like the U.S., and faced hardships. Maybe his family had to move to another place, and he had to work hard. I need to make it relatable. Let me think about the details. Maybe include specific events, like his time in France, his work in the lab, and his later years. How does he overcome his personal challenges? Maybe he had to move to another country, like the U.S., and faced hardships. Maybe his family had to move to another place, and he had to work hard. I need to make it relatable. Let me think about the structure. 
Maybe start with his early life, then his career, then his personal life, and his legacy. I need to make it engaging. Let me think about the details. Maybe include specific events, like his time in France, his work in the lab, and his later years. How does he overcome his personal challenges? Maybe he had to move to another country, like the U.S., and faced hardships. Maybe his family had to move to another
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Benchmark ──────────────────────────────┐
Prompt processing: 169.8 tokens/sec ( 14 tokens in 0.1s)
Tokens generation: 54.7 tokens/sec (1024 tokens in 18.7s)
└───────────────────────────────────────────────────────────────────────┘
[──────────────────────────────] Processing model.layers.14.layer.mlp.up_proj...
  -> Found 938 candidates.
  -> Subtracting 70% of rank 206 (Sim: 0.1825)
[──────────────────────────────] Processing model.layers.14.layer.mlp.down_proj...
  -> Found 989 candidates.
  -> Subtracting 70% of rank 696 (Sim: 0.1513)
[──────────────────────────────] Processing model.layers.14.layer.mlp.gate_proj...
  -> Found 996 candidates.
  -> Subtracting 70% of rank 399 (Sim: 0.1424)
[──────────────────────────────] Processing model.layers.14.layer.self_attn.o_proj...
  -> Found 990 candidates.
  -> Subtracting 70% of rank 689 (Sim: 0.1575)
[──────────────────────────────] Processing model.layers.14.layer.self_attn.v_proj...
  -> Found 594 candidates.
  -> Subtracting 70% of rank 271 (Sim: 0.2613)
[──────────────────────────────] Processing model.layers.14.layer.self_attn.k_proj...
  -> Found 936 candidates.
  -> Subtracting 70% of rank 361 (Sim: 0.1717)
[──────────────────────────────] Processing model.layers.14.layer.self_attn.q_proj...
  -> Found 996 candidates.
  -> Subtracting 70% of rank 423 (Sim: 0.1336)
[──────────────────────────────] Processing model.layers.13.layer.mlp.up_proj...
  -> Found 910 candidates.
  -> Subtracting 70% of rank 516 (Sim: 0.1967)
[──────────────────────────────] Processing model.layers.13.layer.mlp.down_proj...
  -> Found 979 candidates.
  -> Subtracting 70% of rank 960 (Sim: 0.1597)
[──────────────────────────────] Processing model.layers.13.layer.mlp.gate_proj...
  -> Found 982 candidates.
  -> Subtracting 70% of rank 345 (Sim: 0.1456)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.o_proj...
  -> Found 1017 candidates.
  -> Subtracting 70% of rank 881 (Sim: 0.1473)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.v_proj...
  -> Found 601 candidates.
  -> Subtracting 70% of rank 248 (Sim: 0.2690)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.k_proj...
  -> Found 960 candidates.
  -> Subtracting 70% of rank 430 (Sim: 0.1545)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.q_proj...
  -> Found 994 candidates.
  -> Subtracting 70% of rank 443 (Sim: 0.1417)
[──────────────────────────────] Successfully healed and merged 14 layers.
✓ Dampened:
┌────────────────────────────── Inp 00000 ──────────────────────────────┐
<|im_start|>user
Write a story about Einstein
<|im_end|>
<|im_start|>assistant
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00000 ──────────────────────────────┐
<think>
Okay, I need to write a story about Einstein. Let's start with some basic elements. Einstein is a famous scientist, right? He was a genius, and I should highlight his contributions. Maybe include some key moments, like his work on relativity and his impact on science. How to make it engaging? Maybe start with a relatable character, like a person who's curious and wants to understand the universe. Maybe set it in a small town, like a place Einstein is known for. The story should show his journey from a young boy to a scientist, maybe with some challenges. I need to make it relatable and show his growth.
Maybe include some specific events, like his discovery of the theory of relativity. How to make it engaging? Maybe show his struggles, like a young boy who is scared, and how he overcomes obstacles. Maybe include a moment where he makes a breakthrough, like when he realizes the theory of relativity. The story should have a positive ending, showing that Einstein's work has changed the world. I need to make sure the story flows well and has a good message. Let me think about the structure. Start with a relatable character, maybe a young boy who is curious and wants to understand the universe. Then show his journey from a young boy to a scientist, highlighting his contributions. Maybe include some challenges, like a moment where he faces a problem, and he overcomes it. The story should have a positive ending, showing that his work has changed the world. I need to make sure the story is engaging and shows his growth. Maybe include some specific details, like his early days in a small town, his struggles, and his eventual success. I should also make sure the story is clear and has a good message. Let me think about the elements. Maybe start with a simple character, like a young boy who is curious and wants to understand the universe. Then show his journey from a young boy to a scientist, highlighting his contributions. Maybe include some specific events, like his discovery of the theory of relativity. The story should have a positive ending, showing that his work has changed the world. I need to make sure the story is engaging and has a good message. Let me think about the structure. Start with a relatable character, maybe a young boy who is curious and wants to understand the universe. Then show his journey from a young boy to a scientist, highlighting his contributions. Maybe include some specific events, like his discovery of the theory of relativity. The story should have a positive ending, showing that his work has changed the world. 
I need to make sure the story is engaging and has a good message. Let me think about the elements. Maybe start with a simple character, like a young boy who is curious and wants to understand the universe. Then show his journey from a young boy to a scientist, highlighting his contributions. Maybe include some specific events, like his discovery of the theory of relativity. The story should have a positive ending, showing that his work has changed the world. I need to make sure the story is engaging and has a good message. Let me think about the elements. Maybe start with a simple character, like a young boy who is curious and wants to understand the universe. Then show his journey from a young boy to a scientist, highlighting his contributions. Maybe include some specific events, like his discovery of the theory of relativity. The story should have a positive ending, showing that his work has changed the world. I need to make sure the story is engaging and has a good message. I think this is a good start. Now, I need to flesh out the details. Maybe start with a relatable character, like a young boy who is curious and wants to understand the universe. Then show his journey from a young boy to a scientist, highlighting his contributions. Maybe include some specific events, like his discovery of the theory of relativity. The story should have a positive ending, showing that his work has changed the world. I need to make sure the story is engaging and has a good message. Let me think about the elements. Maybe start with a simple character, like a young boy who is curious and wants to understand the universe. Then show his journey from a young boy to a scientist, highlighting his contributions. Maybe include some specific events, like his discovery of the theory of relativity. The story should have a positive ending, showing that his work has changed the world. I need to make sure the story is engaging and has a good message. I think this is a good start. 
Now, I need to think about the details. Maybe start with a simple character, like a young boy who is curious and wants to understand the universe. Then show his journey from a young boy to a scientist, highlighting his contributions. Maybe include some specific events, like his discovery of the theory of relativity. The story should have a positive ending, showing that his work has changed the world. I need to make sure the story is engaging and has
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Benchmark ──────────────────────────────┐
Prompt processing: 248.3 tokens/sec ( 14 tokens in 0.1s)
Tokens generation: 136.9 tokens/sec (1024 tokens in 7.5s)
└───────────────────────────────────────────────────────────────────────┘
〄 Testing RetNet...
✓ Colapsed:
┌────────────────────────────── Inp 00000 ──────────────────────────────┐
<|im_start|>user
Write a story about Einstein
<|im_end|>
<|im_start|>assistant
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00000 ──────────────────────────────┐
?? ???????? ?...????????????????????????????????????????????????????????????????????????????????????????
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Benchmark ──────────────────────────────┐
Prompt processing: 140.2 tokens/sec ( 14 tokens in 0.1s)
Tokens generation: 104.3 tokens/sec (100 tokens in 1.0s)
└───────────────────────────────────────────────────────────────────────┘
teacher_to_student=[(16, 13)]
layer_idx=13 epoch= 0 avg_loss= 2.93 elp_train= 86.67
└ test output: ["I'm sorry for the confusion. I'm sorry for the confusion.
I'm sorry for the confusion"]
array(-0.060288, dtype=float32) array(0.544353, dtype=float32)
✓ Healed:
┌────────────────────────────── Inp 00000 ──────────────────────────────┐
<|im_start|>user
Write a story about Einstein
<|im_end|>
<|im_start|>assistant
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00000 ──────────────────────────────┐
<think>
</think>
Certainly! Here is a story about Einstein, a brilliant scientist who made groundbreaking contributions to modern physics. Here's a brief summary: Einstein, born in 1904, was a genius in the field of physics. He made significant contributions to the theory of relativity, including the theory of general relativity, which revolutionized our understanding of space and time. His work laid the foundation for modern physics, and he became a celebrated figure in the scientific community. His ideas have had a lasting impact on the field of science and technology. Einstein's work continues to be studied and celebrated, and his legacy is recognized in the scientific community. His contributions have helped us understand the universe and the cosmos. Einstein's story is one of remarkable achievement and innovation.
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Benchmark ──────────────────────────────┐
Prompt processing: 185.7 tokens/sec ( 14 tokens in 0.1s)
Tokens generation: 300.9 tokens/sec (1024 tokens in 3.4s)
└───────────────────────────────────────────────────────────────────────┘
[──────────────────────────────] Processing model.layers.13.layer.mlp.up_proj...
-> Found 915 candidates.
-> Subtracting 70% of rank 400 (Sim: 0.1813)
[──────────────────────────────] Processing model.layers.13.layer.mlp.down_proj...
-> Found 978 candidates.
-> Subtracting 70% of rank 776 (Sim: 0.1573)
[──────────────────────────────] Processing model.layers.13.layer.mlp.gate_proj...
-> Found 981 candidates.
-> Subtracting 70% of rank 431 (Sim: 0.1483)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.o_proj...
-> Found 1015 candidates.
-> Subtracting 70% of rank 930 (Sim: 0.1572)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.v_proj...
-> Found 605 candidates.
-> Subtracting 70% of rank 209 (Sim: 0.2631)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.k_proj...
-> Found 962 candidates.
-> Subtracting 70% of rank 507 (Sim: 0.1587)
[──────────────────────────────] Processing model.layers.13.layer.self_attn.q_proj...
-> Found 994 candidates.
-> Subtracting 70% of rank 230 (Sim: 0.1341)
[──────────────────────────────] Successfully healed and merged 7 layers.
✓ Dampened:
┌────────────────────────────── Inp 00000 ──────────────────────────────┐
<|im_start|>user
Write a story about Einstein
<|im_end|>
<|im_start|>assistant
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Out 00000 ──────────────────────────────┐
<think>
</think>
Certainly! Here is a story about Einstein, a brilliant scientist who made groundbreaking contributions to modern physics. Here's a brief summary: Einstein, born in 1904, was a genius in the field of physics. He made significant contributions to the theory of relativity, including the theory of general relativity, which revolutionized our understanding of space and time. His work laid the foundation for modern physics, and he became a celebrated figure in the scientific community. His ideas have had a lasting impact on the field of science and technology. Einstein's work continues to be studied and celebrated, and his legacy is recognized in the scientific community. His contributions have helped us understand the universe and the cosmos. Einstein's story is one of both brilliance and innovation, and his legacy remains one of the most important achievements in science.
└───────────────────────────────────────────────────────────────────────┘
┌────────────────────────────── Benchmark ──────────────────────────────┐
Prompt processing: 142.7 tokens/sec ( 14 tokens in 0.1s)
Tokens generation: 335.2 tokens/sec (1024 tokens in 3.1s)
└───────────────────────────────────────────────────────────────────────┘
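For a quick side-by-side of the generation rates reported in the benchmark boxes above, the following sketch expresses each rate relative to the first one in this part of the log. The stage labels are illustrative shorthand rather than rcr-lm identifiers, and the stage behind the first box is not named in this excerpt. The collapsed run generated only 100 tokens, so its rate is not directly comparable with the 1024-token runs.

```python
# Generation rates (tokens/sec) copied from the benchmark boxes above.
# The labels are illustrative shorthand, not rcr-lm identifiers.
RATES = [
    ("first box in this excerpt", 54.7),
    ("dampened, 14 layers merged", 136.9),
    ("collapsed (100-token run)", 104.3),
    ("healed", 300.9),
    ("dampened, 7 layers merged", 335.2),
]


def relative_rates(rates):
    """Return (label, rate / baseline) pairs, using the first entry as baseline."""
    baseline = rates[0][1]
    return [(label, rate / baseline) for label, rate in rates]


for label, ratio in relative_rates(RATES):
    print(f"{label:<30} {ratio:5.2f}x")
```

The ratios only restate the logged numbers. They say nothing about output quality, which has to be judged from the generated text itself.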
File details
Details for the file rcrlm-0.0.2a6.tar.gz.
File metadata
- Download URL: rcrlm-0.0.2a6.tar.gz
- Upload date:
- Size: 57.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 225e086b0a15fc1186cb03fd261e6b21312a8ad56ae3a75aff06e11b3b092b02 |
| MD5 | 38c4e1692092ce3b7e4ac208d5bcd8af |
| BLAKE2b-256 | f0c8dc9afe5af563e0c00686e3868bbe5bc1b2c51a2289c0c6b75017bf86b243 |
File details
Details for the file rcrlm-0.0.2a6-py3-none-any.whl.
File metadata
- Download URL: rcrlm-0.0.2a6-py3-none-any.whl
- Upload date:
- Size: 36.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 15b5f9686d795b293cb861f3e7fc2770a8a58b746c8dee263633067677bdbce6 |
| MD5 | 2bc458442512231a72c57f0b69068768 |
| BLAKE2b-256 | 6212668e9945fbcfae357d290e57dad021d3505a7511089f508a33ba8c863ec7 |
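The SHA256 digests listed above can be checked against a locally downloaded copy of either file. The following is a minimal sketch using only the Python standard library. The helper names `sha256_of` and `verify` are hypothetical, and it assumes the files sit in the current working directory under their published names.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the "File hashes" tables above.
EXPECTED_SHA256 = {
    "rcrlm-0.0.2a6.tar.gz":
        "225e086b0a15fc1186cb03fd261e6b21312a8ad56ae3a75aff06e11b3b092b02",
    "rcrlm-0.0.2a6-py3-none-any.whl":
        "15b5f9686d795b293cb861f3e7fc2770a8a58b746c8dee263633067677bdbce6",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Compare a file's digest with the published one for its file name."""
    path = Path(path)
    return sha256_of(path) == EXPECTED_SHA256[path.name]


if __name__ == "__main__":
    for name in EXPECTED_SHA256:
        local = Path(name)
        if local.exists():
            print(name, "OK" if verify(local) else "MISMATCH")
        else:
            print(name, "not found, skipped")
```

Reading in chunks keeps memory use flat regardless of file size. A mismatch means the local file differs from the published artifact and should not be installed.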