The official Cartesia PyTorch library.

Project description


license: apache-2.0
language:
  • en
datasets:
  • allenai/dolma
tags:
  • rene
  • mamba
  • cartesia

Model Card for Rene

Rene is a 1.3 billion-parameter language model trained by Cartesia. Rene has a hybrid architecture based on Mamba-2, with feedforward and sliding window attention layers interspersed. It uses the allenai/OLMo-1B-hf tokenizer. Rene was pretrained on 1.5 trillion tokens of the Dolma-1.7 dataset. For more details, see our blog post.
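
To make the "sliding window attention" part of the description concrete, here is a minimal sketch of the causal sliding-window mask such a layer typically uses: each position may attend only to itself and a fixed number of preceding positions. The function name, the sequence length, and the window size are illustrative assumptions; they are not taken from Rene's configuration, and the sketch does not cover the Mamba-2 or feedforward layers.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j] == True when query position i may attend to key position j.

    Hypothetical helper for illustration only: position i sees positions
    i - window + 1 .. i (clipped at 0), and never sees future positions.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Illustrative sizes only; not Rene's actual window or context length.
    for row in sliding_window_causal_mask(seq_len=5, window=3):
        print("".join("x" if allowed else "." for allowed in row))
```

Each printed row is one query position; an `x` marks a key position it can attend to. Because the number of attended positions per row is capped at the window size, attention cost per token stays bounded as the sequence grows.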

Usage

Installation

The Rene model depends on the cartesia-pytorch package, which can be installed with pip as follows:

pip install --no-binary :all: cartesia-pytorch
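
After installing, one way to confirm that the package is visible to the current Python environment is to query its installed version through the standard library. This is a small sketch, not part of the cartesia-pytorch package; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("cartesia-pytorch")
    if found is None:
        print("cartesia-pytorch is not installed in this environment")
    else:
        print(f"cartesia-pytorch {found} is installed")
```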

Generation example

from cartesia_pytorch import ReneLMHeadModel
from transformers import AutoTokenizer

# Load the pretrained weights, cast them to half precision, and move them to the GPU.
model = ReneLMHeadModel.from_pretrained("cartesia-ai/Rene-v0.1-1.3b-pytorch").half().cuda()
# Rene uses the OLMo tokenizer rather than shipping its own.
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1B-hf")

# Tokenize a batch containing a single prompt.
in_message = ["Rene Descartes was"]
inputs = tokenizer(in_message, return_tensors="pt")

# Generate a continuation; top_k and top_p restrict which tokens can be sampled.
outputs = model.generate(inputs.input_ids.cuda(), max_length=50, top_k=100, top_p=0.99)
out_message = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(out_message)
# Example output: "Rene Descartes was a French mathematician, philosopher, and scientist. Descartes is famously credited for creating the Cartesian coordinate system: a 3 dimensional representation of points, vectors, and directions. This work is, for the most part" ...

Evaluation example

You can use our cartesia_lm_eval wrapper around the Language Model Evaluation Harness to evaluate Rene on standard text benchmarks. For example, clone this repo and run the following command from within the cartesia-pytorch directory:

python -m evals.cartesia_lm_eval \
  --model rene_ssm \
  --model_args pretrained=cartesia-ai/Rene-v0.1-1.3b-pytorch,trust_remote_code=True \
  --trust_remote_code \
  --tasks copa,hellaswag,piqa,arc_easy,arc_challenge,winogrande,openbookqa \
  --cache_requests true \
  --batch_size auto:4 \
  --output_path outputs/rene_evals/

Results on common benchmarks

| Model | Params (B) | Train Tokens (T) | COPA | HellaSwag | MMLU (5-shot) | PIQA | ARC-e | ARC-c | WinoGrande | OpenBookQA | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| allenai/OLMo-1B-hf | 1.2 | 3.0 | 82.0 | 62.9 | 26.2 | 75.1 | 57.4 | 31.1 | 60.0 | 36.2 | 53.9 |
| apple/OpenELM-1_1B | 1.1 | 1.5 | 81.0 | 64.8 | 27.1 | 75.6 | 55.4 | 32.3 | 61.9 | 36.2 | 54.3 |
| state-spaces/mamba2-1.3b | 1.3 | 0.3 | 82.0 | 60.0 | 25.8 | 73.7 | 64.2 | 33.3 | 61.0 | 37.8 | 54.7 |
| microsoft/phi-1_5 | 1.4 | 0.15 | 79.0 | 62.6 | 42.5 | 75.5 | 73.2 | 48.0 | 72.8 | 48.0 | 62.7 |
| Qwen/Qwen2-1.5B | 1.5 | 7.0 | 80.0 | 65.4 | 56.0 | 75.5 | 60.4 | 35.0 | 65.8 | 36.4 | 59.3 |
| RWKV/rwkv-6-world-1b6 | 1.6 | 1.1 | 84.0 | 58.3 | 25.9 | 73.5 | 56.7 | 34.1 | 60.0 | 37.4 | 53.7 |
| stabilityai/stablelm-2-1_6b | 1.6 | 4.0 | 86.0 | 69.0 | 38.1 | 76.7 | 68.1 | 38.9 | 63.6 | 38.8 | 59.9 |
| HuggingFaceTB/SmolLM-1.7B | 1.7 | 1.0 | 76.0 | 65.8 | 29.9 | 76.1 | 73.5 | 46.4 | 60.9 | 42.0 | 58.8 |
| h2oai/h2o-danube2-1.8b-base | 1.8 | 3.0 | 82.0 | 72.4 | 39.9 | 77.3 | 69.0 | 39.9 | 63.9 | 41.4 | 60.7 |
| google/recurrentgemma-2b | 2.7 | 2.0 | 62.0 | 61.8 | 32.3 | 68.8 | 46.4 | 29.9 | 57.1 | 29.0 | 48.4 |
| cognitivecomputations/TinyDolphin-2.8.1-1.1b | 1.1 | | 71.0 | 59.9 | 25.7 | 73.1 | 55.8 | 33.0 | 59.7 | 36.6 | 51.9 |
| cartesia-ai/Rene-v0.1-1.3b-pytorch (OUR MODEL) | 1.3 | 1.5 | 82.0 | 69.4 | 32.6 | 77.5 | 61.7 | 34.4 | 62.9 | 39.2 | 57.5 |
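
The Average column is consistent with an unweighted mean of the eight benchmark scores in each row, rounded to one decimal place. That is an inference from the reported numbers rather than something the table states, so treat the sketch below as a sanity check, not as the evaluation harness's own aggregation. The scores are copied from the Rene row; `benchmark_average` is a hypothetical helper name.

```python
from statistics import fmean

# Scores copied from the Rene row of the results table.
rene_scores = {
    "COPA": 82.0,
    "HellaSwag": 69.4,
    "MMLU (5-shot)": 32.6,
    "PIQA": 77.5,
    "ARC-e": 61.7,
    "ARC-c": 34.4,
    "WinoGrande": 62.9,
    "OpenBookQA": 39.2,
}


def benchmark_average(scores: dict[str, float]) -> float:
    """Unweighted mean of the per-benchmark scores, rounded to one decimal."""
    return round(fmean(scores.values()), 1)


if __name__ == "__main__":
    print(benchmark_average(rene_scores))  # prints 57.5, matching the table
```

Because the mean is unweighted, a single strong result such as MMLU for microsoft/phi-1_5 or Qwen/Qwen2-1.5B moves the Average as much as any other benchmark.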

Bias, Risks, and Limitations

Rene is a pretrained base model which has not undergone any alignment or instruction tuning, and therefore does not have any moderation or safety guarantees. Users should implement appropriate guardrails and moderation mechanisms based on their particular needs in order to ensure responsible and ethical usage.

About Cartesia

At Cartesia, we're building real-time multimodal intelligence for every device.

Download files

Source Distribution

cartesia_pytorch-0.0.1.tar.gz (12.5 kB)

Uploaded Source

Built Distribution

cartesia_pytorch-0.0.1-py2.py3-none-any.whl (9.6 kB)

Uploaded Python 2 Python 3

File details

Details for the file cartesia_pytorch-0.0.1.tar.gz.

File metadata

  • Download URL: cartesia_pytorch-0.0.1.tar.gz
  • Upload date:
  • Size: 12.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for cartesia_pytorch-0.0.1.tar.gz
Algorithm Hash digest
SHA256 883a3e598ca1c2333a5f0fb12e178ee1d6ba0df46668f52ad419df021e677d5f
MD5 cc301c414a14db515ab4807820ff4eb8
BLAKE2b-256 a8e65293bc72dc51ea9ea1b62a9bb67eea5814329735eb207c7df4d8e4dfad64

See more details on using hashes here.

File details

Details for the file cartesia_pytorch-0.0.1-py2.py3-none-any.whl.

File hashes

Hashes for cartesia_pytorch-0.0.1-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 a62e26103e5c9e24a0a22d340ebd612788c9ff7340a7c46b4661ebac546ea6d6
MD5 1fe777326b3fa0b80f656deeb640b088
BLAKE2b-256 f732759595523665d799103c22abc0f56042fa8676f6d98963e385d91dcb9cd2
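
The published digests can be used to check that a downloaded file is intact. Below is a minimal sketch using Python's standard hashlib module; the helper names and the local file path are assumptions for illustration, and the expected SHA256 value is the one listed above for the wheel.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path: adjust to wherever the wheel was downloaded.
    wheel = Path("cartesia_pytorch-0.0.1-py2.py3-none-any.whl")
    expected = "a62e26103e5c9e24a0a22d340ebd612788c9ff7340a7c46b4661ebac546ea6d6"
    if wheel.exists():
        print("digest matches" if matches_published_digest(wheel, expected) else "digest MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```

The same helper works for the source distribution by swapping in the tar.gz file name and its SHA256 value.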

