nnx-lm: A portable, pip-installable CLI for running LLMs via JAX on any hardware backend.
Quick Start
pip install nnx-lm
nlm
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets, so they can learn a lot. Then talk about their capabilities, like understanding context, generating coherent responses, and being able to handle various tasks. Also, mention that they're not just
=== Input ===
<|im_start|>user
Give me a short introduction to large language model.<|im_end|>
<|im_start|>assistant
=== Output===
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets, so they can learn a lot. Then talk about their capabilities, like understanding context, generating coherent responses, and being able to handle various tasks. Also, mention that they're not just text
=== Benchmarks ===
Prompt processing: 28.4 tokens/sec (18 tokens in 0.6s)
Token generation: 22.8 tokens/sec (100 tokens in 4.4s)
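Each benchmark line reports a rate together with the token count and elapsed time it was derived from. As a rough sketch, the lines can be parsed and the rate recomputed. The helper name `parse_benchmark` is made up for illustration and is not part of nnx-lm, and the line format is assumed from the output shown above. Because the printed time is rounded to one decimal place, the recomputed rate only approximates the reported one.

```python
import re

# Pattern for lines such as "Token generation: 22.8 tokens/sec (100 tokens in 4.4s)".
# The format is taken from the output shown above; other versions may print it differently.
BENCH_RE = re.compile(
    r"^(?P<label>[^:]+):\s*(?P<rate>[\d.]+) tokens/sec "
    r"\((?P<tokens>\d+) tokens in (?P<seconds>[\d.]+)s\)$"
)


def parse_benchmark(line):
    """Return (label, reported_rate, tokens, seconds) for one benchmark line."""
    m = BENCH_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized benchmark line: {line!r}")
    return m["label"], float(m["rate"]), int(m["tokens"]), float(m["seconds"])


label, rate, tokens, seconds = parse_benchmark(
    "Token generation: 22.8 tokens/sec (100 tokens in 4.4s)"
)
# Recompute from the rounded figures; expect a small difference from `rate`.
recomputed = tokens / seconds
print(f"{label}: reported {rate}, recomputed {recomputed:.1f} tokens/sec")
```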
Examples
Scan:
nlm --scan
=== Input ===
<|im_start|>user
Give me a short introduction to large language model.<|im_end|>
<|im_start|>assistant
=== Output===
<think>
Okay, the user wants a short introduction to a large language model. Let me start by defining what they mean. They might be a student or someone new to the field, or maybe they just want a simple intro.
I should avoid technical terms and instead focus on the key features. Let's see, the introduction should highlight the advantages of the current understanding.
The user's example is a bit of the structure where the assistant has to generate a sentence.
So, the answer is
=== Benchmarks ===
Prompt processing: 28.3 tokens/sec (18 tokens in 0.6s)
Token generation: 76.0 tokens/sec (100 tokens in 1.3s)
Batch:
nlm --no-format
=== Input ===
#write a quick sort algorithm
=== Output===
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = [x for x in arr if x < pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + [pivot] + quicksort(right)
arr = [5, 3, 8, 1, 4, 2]
print(quicksort(arr))
#this code is not working
=== Input ===
Give me a short introduction to large language model.
=== Output===
Large language models (LLMs) are artificial intelligence models that can understand and generate human language. They are trained on vast amounts of text data to understand and generate human language. LLMs are used in various applications, such as chatbots, translation, and content creation. They are also used in other areas like customer service, customer support, and even in creative writing. LLMs are becoming more advanced and are capable of understanding and generating more complex language. They are also being used in research
=== Benchmarks ===
Prompt processing: 31.6 tokens/sec (20 tokens in 0.6s)
Token generation: 45.0 tokens/sec (200 tokens in 4.4s)
Batched scan:
nlm --no-format --scan
=== Benchmarks ===
Prompt processing: 32.0 tokens/sec (20 tokens in 0.6s)
Token generation: 135.7 tokens/sec (200 tokens in 1.5s)
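To put the scan numbers in perspective, the token-generation rates reported above can be compared directly. The short sketch below only does arithmetic on figures copied from this page. They come from single runs on one machine, so the ratios are indicative rather than a general claim about `--scan`.

```python
# Token-generation rates (tokens/sec) copied from the benchmark output above.
rates = {
    "default": {"baseline": 22.8, "scan": 76.0},
    "batch (--no-format)": {"baseline": 45.0, "scan": 135.7},
}

# Ratio of the --scan rate to the rate of the same command without --scan.
speedups = {name: r["scan"] / r["baseline"] for name, r in rates.items()}

for name, s in speedups.items():
    print(f"{name}: --scan is about {s:.1f}x faster at token generation")
```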
Jit:
nlm --jit
UserWarning: Some donated buffers were not usable: ShapedArray(int32[1,1]), ShapedArray(float32[1,1]), ShapedArray(bfloat16[1,1,1,118]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), 
ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bool[1,1]).
Donation is not implemented for ('METAL',).
See an explanation at https://jax.readthedocs.io/en/latest/faq.html#buffer-donation.
warnings.warn("Some donated buffers were not usable:"
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets. Then talk about their capabilities, like understanding context and generating coherent responses. Also, highlight their applications in various fields. Oh, and maybe mention that they're not just text generators but can
=== Input ===
<|im_start|>user
Give me a short introduction to large language model.<|im_end|>
<|im_start|>assistant
=== Output===
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets. Then talk about their capabilities, like understanding context and generating coherent responses. Also, highlight their applications in various fields. Oh, and maybe mention that they're not just text generators but can handle
=== Benchmarks ===
Prompt processing: 28.3 tokens/sec (18 tokens in 0.6s)
Token generation: 18.0 tokens/sec (100 tokens in 5.6s)
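The donated-buffer message in the Jit run is raised through Python's `warnings` module as a `UserWarning`, and the log itself states that donation is not implemented for the METAL backend. If the message is only noise on such a backend, one option is to filter it before running generation. This is a generic standard-library sketch, not an nnx-lm feature; the helper name `silence_donation_warning` is hypothetical and the message pattern is copied from the log above.

```python
import warnings


def silence_donation_warning():
    """Ignore only the donation warning; other UserWarnings still surface."""
    # `message` is a regex matched against the start of the warning text.
    warnings.filterwarnings(
        "ignore",
        message="Some donated buffers were not usable",
        category=UserWarning,
    )


# Demonstration: record warnings, then check which ones get through the filter.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # make sure nothing is hidden by default filters
    silence_donation_warning()        # inserted in front of the "always" rule
    warnings.warn("Some donated buffers were not usable: ShapedArray(int32[1,1])")
    warnings.warn("an unrelated warning")

print([str(w.message) for w in caught])
```

In a real script, `silence_donation_warning()` would be called once near the top, before any JAX computation runs.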
Python:
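import nnxlm as nl
# Load the model, tokenizer, and config for the given model ID.
model, tokenizer, config = nl.load('Qwen/Qwen3-0.6B')
# `return` is only valid inside a function; at module level, keep the result in a variable.
output = nl.generate(model, tokenizer, config)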
Download files
Source Distribution
nnx_lm-0.0.2a0.tar.gz (14.8 kB)
File details
Details for the file nnx_lm-0.0.2a0.tar.gz.
File metadata
- Download URL: nnx_lm-0.0.2a0.tar.gz
- Upload date:
- Size: 14.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6b5519fe38740927cecb29825ec333c7c90e60ddd707ef29536a3585e862fb83 |
| MD5 | 45448f7fa981e3e692e4f5a6b791e2cb |
| BLAKE2b-256 | 90b23203f524e82858bae57a8f4091d338cabd79e0c1f9ead1fe0e35508f5c2d |
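A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the helper name `sha256_of` and the local file path are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for nnx_lm-0.0.2a0.tar.gz (copied from the table above).
EXPECTED_SHA256 = "6b5519fe38740927cecb29825ec333c7c90e60ddd707ef29536a3585e862fb83"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed download location; adjust to wherever the archive was saved.
archive = Path("nnx_lm-0.0.2a0.tar.gz")
if archive.exists():
    ok = sha256_of(archive) == EXPECTED_SHA256
    print("SHA256 matches" if ok else "SHA256 mismatch")
else:
    print(f"{archive} not found; download it first")
```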