nnx-lm: A portable, pip-installable CLI for running LLMs via JAX on any hardware backend.

Project description

nnx-lm: A portable, pip-installable CLI for running LLMs via JAX on any hardware backend.

Quick Start

pip install nnx-lm
nlm -p "Give me a short introduction to large language model.\n"

<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets, so they can learn a lot. Then talk about their capabilities, like understanding context, generating coherent responses, and being able to handle various tasks. Also, mention that they're not just

=== Input ===
<|im_start|>user
Give me a short introduction to large language model.<|im_end|>
<|im_start|>assistant

=== Output===
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets, so they can learn a lot. Then talk about their capabilities, like understanding context, generating coherent responses, and being able to handle various tasks. Also, mention that they're not just text

=== Benchmarks ===
Prompt processing: 28.4 tokens/sec (18 tokens in 0.6s)
Token generation: 22.8 tokens/sec (100 tokens in 4.4s)

Examples

Scan:

nlm --scan -p "Give me a short introduction to large language model.\n"

=== Input ===
<|im_start|>user
Give me a short introduction to large language model.<|im_end|>
<|im_start|>assistant

=== Output===
<think>
Okay, the user wants a short introduction to a large language model. Let me start by defining what they mean. They might be a student or someone new to the field, or maybe they just want a simple intro.

I should avoid technical terms and instead focus on the key features. Let's see, the introduction should highlight the advantages of the current understanding.

The user's example is a bit of the structure where the assistant has to generate a sentence.

So, the answer is

=== Benchmarks ===
Prompt processing: 28.3 tokens/sec (18 tokens in 0.6s)
Token generation: 76.0 tokens/sec (100 tokens in 1.3s)

Batch:

nlm -p "Give me a short introduction to large language model.\n"  "#write a quick sort algorithm\n"

=== Input ===
#write a quick sort algorithm

=== Output===
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = [x for x in arr if x < pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + [pivot] + quicksort(right)

arr = [5, 3, 8, 1, 4, 2]
print(quicksort(arr))

#this code is not working

=== Input ===
Give me a short introduction to large language model.

=== Output===
Large language models (LLMs) are artificial intelligence models that can understand and generate human language. They are trained on vast amounts of text data to understand and generate human language. LLMs are used in various applications, such as chatbots, translation, and content creation. They are also used in other areas like customer service, customer support, and even in creative writing. LLMs are becoming more advanced and are capable of understanding and generating more complex language. They are also being used in research

=== Benchmarks ===
Prompt processing: 31.6 tokens/sec (20 tokens in 0.6s)
Token generation: 45.0 tokens/sec (200 tokens in 4.4s)

Batched scan:

nlm --scan -p "Give me a short introduction to large language model.\n" "#write a quick sort algorithm\n"

=== Benchmarks ===
Prompt processing: 32.0 tokens/sec (20 tokens in 0.6s)
Token generation: 135.7 tokens/sec (200 tokens in 1.5s)

Jit:

nlm --jit -p "Give me a short introduction to large language model.\n"

UserWarning: Some donated buffers were not usable: ShapedArray(int32[1,1]), ShapedArray(float32[1,1]), ShapedArray(bfloat16[1,1,1,118]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bfloat16[1,8,118,128]), ShapedArray(bool[1,1]).
Donation is not implemented for ('METAL',).
See an explanation at https://jax.readthedocs.io/en/latest/faq.html#buffer-donation.
  warnings.warn("Some donated buffers were not usable:"

<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets. Then talk about their capabilities, like understanding context and generating coherent responses. Also, highlight their applications in various fields. Oh, and maybe mention that they're not just text generators but can

=== Input ===
<|im_start|>user
Give me a short introduction to large language model.<|im_end|>
<|im_start|>assistant

=== Output===
<think>
Okay, the user wants a short introduction to a large language model. Let me start by recalling what I know about LLMs. They're big language models, right? So I should mention their ability to understand and generate text. Maybe start with the basics: they're trained on massive datasets. Then talk about their capabilities, like understanding context and generating coherent responses. Also, highlight their applications in various fields. Oh, and maybe mention that they're not just text generators but can handle

=== Benchmarks ===
Prompt processing: 28.3 tokens/sec (18 tokens in 0.6s)
Token generation: 18.0 tokens/sec (100 tokens in 5.6s)

Python:

import nnxlm as nl
m = nl.load('Qwen/Qwen3-0.6B')
nl.generate(*m, ["#write a quick sort algorithm\n", "Give me a short introduction to large language model.\n"])

Test:

nl.main.test()

Project details

Release history Release notifications | RSS feed

0.0.4

May 9, 2025

This version

0.0.3

May 9, 2025

0.0.2

May 8, 2025

0.0.2a0 pre-release

May 8, 2025

0.0.1

May 7, 2025

0.0.1a5 pre-release

May 7, 2025

0.0.1a4 pre-release

May 7, 2025

0.0.1a3 pre-release

May 7, 2025

0.0.1a2 pre-release

May 4, 2025

0.0.1a1 pre-release

May 4, 2025

0.0.1a0 pre-release

May 4, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nnx_lm-0.0.3.tar.gz (16.0 kB view details)

Uploaded May 9, 2025 Source

File details

Details for the file nnx_lm-0.0.3.tar.gz.

File metadata

Download URL: nnx_lm-0.0.3.tar.gz
Upload date: May 9, 2025
Size: 16.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for nnx_lm-0.0.3.tar.gz
Algorithm	Hash digest
SHA256	`31320fe16671c15b23afff9440aa710b475e5249366cb958cafd9e6eca078294`
MD5	`7dca822798f3ad04026831439a0fafbe`
BLAKE2b-256	`71fc1255c32d47047d9824d97f968c0d7a8bfc1cc9870d3d721e127bcc54ff76`

See more details on using hashes here.

nnx-lm 0.0.3

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

nnx-lm: A portable, pip-installable CLI for running LLMs via JAX on any hardware backend.

Quick Start

Examples

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

File details

File metadata

File hashes