Skip to main content

LAM (Linear Attention Models) — Deterministic recall with SAID Crystalline Attention. SAID-LAM-v1 is Linear Attention Memory.

Project description

SAID-LAM-v1

LAM (Linear Attention Models) — a new family beyond semantic transformers. SAID‑LAM‑v1 is Linear Attention Memory.

"The answer IS X. Because I Said so." — At ANY scale.

SAID-LAM-v1 is a 23.85M parameter embedding model with O(n) linear complexity. Where standard transformers rely on O(n²) attention that slows and runs out of memory as context grows, LAM models replace this entirely with a recurrent state update that runs in strict O(n) time and constant memory, defining a new direction separate from transformer-based semantic models.

Distilled from all-MiniLM-L6-v2, while extending context from 512 tokens to 32K+ tokens — and demonstrating 100% recall on LongEmbed Needle-in-a-Haystack benchmarks across evaluated scales.

Model Details

Property Value
Model Category LAM (Linear Attention Models) — SAID-LAM-v1: Linear Attention Memory
Parameters 23,848,788
Embedding Dimension 384
Max Context Length 32,768 tokens
Memory Usage ~95 MB
Complexity O(n) linear — time AND memory
Framework Pure Rust (Candle) — no PyTorch required
Package Size ~6 MB binary + 92 MB weights (auto-downloaded)
License Apache 2.0 (weights) / Proprietary (code)

Performance

O(n) Linear Scaling

LAM scales linearly with input length — empirically validated up to 1M words with R²=1.000, with memory growth from ~0 MB at small inputs up to ~15 MB at 1M words:

STS-B Semantic Quality

Spearman r = 0.8181 on the STS-B test set (1,379 sentence pairs):

MTEB LongEmbed Benchmarks

Combined LongEmbed score (SAID-LAM-v1, average over all six tasks): ~91.0%.

Task Score
LEMBNeedleRetrieval 100.00%
LEMBPasskeyRetrieval 100.00%
LEMBNarrativeQARetrieval 69.93%
LEMBSummScreenFDRetrieval 96.59%
LEMBQMSumRetrieval 85.76%
LEMBWikimQARetrieval 93.98%

LongEmbed SOTA comparison

Task SAID-LAM-v1 (23M) Global SOTA
LEMBNeedleRetrieval 100.00% 100.00%
LEMBPasskeyRetrieval 100.00% 100.00%
LEMBNarrativeQARetrieval 69.93% 66.10%
LEMBSummScreenFDRetrieval 96.59% 99.10%
LEMBQMSumRetrieval 85.76% 83.70%
LEMBWikimQARetrieval 93.98% 91.20%

Install

pip install said-lam

CUDA (GPU) wheels are published under a separate PyPI project:

pip install said-lam-gpu

To upgrade an existing installation to the latest version:

pip install --upgrade said-lam
# or for the GPU version:
pip install --upgrade said-lam-gpu

To install a specific older version:

pip install said-lam==1.0.2
# or for the GPU version:
pip install said-lam-gpu==1.0.2

To uninstall the package:

pip uninstall said-lam
# or for the GPU version:
pip uninstall said-lam-gpu

Note: Both packages use the exact same import said_lam namespace. Please ensure you only have one of them installed at a time to avoid conflicts.

On first use, model weights (~92 MB) are automatically downloaded from HuggingFace and cached locally. The pip package itself is only ~6 MB (compiled Rust binary — weights are NOT bundled).

Drop-in sentence-transformers Replacement

LAM is a drop-in replacement for sentence-transformers. Same API, same output format.

Before (sentence-transformers):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(["Hello world", "Semantic search"])
# embeddings.shape == (2, 384), float32, L2-normalized
similarity = embeddings[0] @ embeddings[1]

After (LAM):

from said_lam import LAM

model = LAM("SAIDResearch/SAID-LAM-v1")
embeddings = model.encode(["Hello world", "Semantic search"])
# embeddings.shape == (2, 384), float32, L2-normalized
similarity = embeddings[0] @ embeddings[1]

Same output format, same shapes, same downstream compatibility. Everything that works with sentence-transformers embeddings (FAISS, ChromaDB, Pinecone, numpy dot product) works with LAM embeddings.

Property sentence-transformers LAM
Output (N, 384) ndarray, float32 (N, 384) ndarray, float32
L2-normalized Yes (default) Yes (default)
Cosine sim = dot product Yes Yes
Max tokens 512 12K (encode) / 32K (SCA)
Complexity O(n²) attention O(n) linear
Framework PyTorch (~2 GB) Rust (~6 MB)
Memory at 1M tokens (no chunking) OOM / impractical ~15 MB

Usage

FREE Tier — Embeddings (up to 12K tokens)

from said_lam import LAM

model = LAM("SAIDResearch/SAID-LAM-v1")
embeddings = model.encode(["Hello world", "Semantic search is powerful"])
# embeddings.shape == (2, 384)

# Cosine similarity (L2-normalized by default)
similarity = embeddings[0] @ embeddings[1]
print(f"Similarity: {similarity:.4f}")

BETA SCA (SAID Crystalline Attention) — MTEB testing only

BETA SCA (SAID Crystalline Attention) is activated for MTEB testing only, to enable perfect LongEmbed context retrieval (e.g. LEMBNeedleRetrieval, LEMBPasskeyRetrieval). Use the MTEB evaluation flow; no signup or activation required for benchmarking.

Common Patterns

Similarity Between Texts

Embeddings are L2-normalized — cosine similarity is just a dot product:

emb = model.encode(["The cat sat on the mat", "A kitten rested on the rug"])
similarity = float(emb[0] @ emb[1])
print(f"Similarity: {similarity:.4f}")  # ~0.5761

Batch Similarity Matrix

import numpy as np

queries = ["How is the weather?", "What time is it?"]
candidates = ["Is it raining today?", "Do you have the time?", "Nice shoes"]

emb_q = model.encode(queries)      # (2, 384)
emb_c = model.encode(candidates)   # (3, 384)
sim_matrix = emb_q @ emb_c.T       # (2, 3)

Semantic Search Over a Corpus (FREE Tier)

import numpy as np

corpus = ["Python is a language", "The Eiffel Tower is in Paris",
          "ML uses neural networks", "Speed of light is 299792458 m/s"]
corpus_emb = model.encode(corpus)

query_emb = model.encode(["fastest thing in physics"])
scores = (query_emb @ corpus_emb.T)[0]
ranked = np.argsort(scores)[::-1]
for i in ranked:
    print(f"  {scores[i]:.4f}  {corpus[i]}")

Matryoshka Dimensionality Reduction

emb_128 = model.encode(["Hello world"], output_dim=128)  # (1, 128)
emb_64  = model.encode(["Hello world"], output_dim=64)   # (1, 64)
# Automatically truncated and re-normalized to unit length

Example impact on STS12 (cosine main_score, GPU):

dim STS12 score rel. to 384d
384 0.7493 100.0%
256 0.7472 99.7%
128 0.7459 99.6%
64 0.7327 97.8%

Token Limits

  • encode(): Up to 12,000 tokens per text. Returns embeddings for your RAG.
  • index() + search(): Up to 32,768 tokens per text (MTEB BETA SCA — LongEmbed/MTEB testing group only). SCA streaming — no embeddings, perfect recall.

encode() — returns one embedding per input text, capped at 12K tokens:

# Each text gets one embedding — long texts are chunked at 12K tokens
embeddings = model.encode(["short text", "very long text..."])  # (2, 384)

# Use output_dim for smaller embeddings (Matryoshka)
embeddings = model.encode(["short text", "very long text..."], output_dim=128)  # (2, 128)

Long documents? encode() caps at 12K tokens. For LongEmbed benchmarks (MTEB BETA SCA testing only), index() + search() support up to 32K tokens via SCA.

MTEB Evaluation

One model, one class: use the same LAM with mteb.evaluate() (LAM implements the global MTEB encoder protocol).

from said_lam import LAM
import mteb

model = LAM("SAIDResearch/SAID-LAM-v1")
tasks = mteb.get_tasks(tasks=["LEMBNeedleRetrieval", "LEMBPasskeyRetrieval"])
results = mteb.evaluate(model=model, tasks=tasks)

API Reference

LAM(model_name_or_path, device)

Parameter Default Description
model_name_or_path "SAIDResearch/SAID-LAM-v1" Hugging Face model ID to load (default: SAIDResearch/SAID-LAM-v1), or a local directory path pointing to the model files
device None (auto) Auto-selects CUDA GPU if available, otherwise CPU

Core Methods

Method Tier Description
model.encode(sentences, output_dim=None) FREE+ Encode to embeddings (384, 256, 128, or 64 dims)
model.index(doc_id, text) MTEB Index a document for search (benchmarks)
model.search(query, top_k) MTEB Retrieve documents by query (benchmarks)
model.truncate_embeddings(emb, dim) FREE+ Matryoshka truncation (64/128/256)
model.clear() MTEB Clear indexed documents (benchmarks)
model.stats() FREE+ Model statistics

Tier System

Tier encode() (SCA) How to Get Features
FREE 12K Default encode() only — embeddings for RAG
MTEB 12K 32K Auto-detected SCA for LongEmbed retrieval (benchmarks only)
LICENSED 32K 32K Coming soon + persistent storage + cloud sync
INFINITE Unlimited Unlimited Coming soon Oracle mode

GPU Support

CPU wheels are installed by default. For GPU acceleration:

# Build from source with CUDA (Linux)
pip install maturin
maturin build --release --features cuda

# Metal (macOS Apple Silicon)
maturin build --release --features metal

Model Files

File Size Description
model.safetensors 92 MB Model weights (SafeTensors format)
config.json 1 KB Model configuration
tokenizer.json 467 KB Tokenizer vocabulary
tokenizer_config.json 350 B Tokenizer settings
vocab.txt 232 KB WordPiece vocabulary
special_tokens_map.json 112 B Special token definitions

Citation

@misc{said-lam-v1,
  title={SAID-LAM-v1: Linear Attention Memory},
  author={SAIDResearch},
  year={2026},
  url={https://saidhome.ai},
  note={23.85M parameter embedding model with O(n) linear complexity.
        384-dim embeddings, 32K context window, 100% NIAH recall.
        Distilled from all-MiniLM-L6-v2. Pure Rust (Candle) implementation.}
}

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

said_lam-1.0.3-cp312-cp312-win_amd64.whl (3.4 MB view details)

Uploaded CPython 3.12Windows x86-64

said_lam-1.0.3-cp312-cp312-manylinux_2_35_x86_64.whl (3.4 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.35+ x86-64

said_lam-1.0.3-cp312-cp312-manylinux_2_35_aarch64.whl (3.2 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.35+ ARM64

said_lam-1.0.3-cp312-cp312-macosx_11_0_arm64.whl (3.1 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

said_lam-1.0.3-cp311-cp311-win_amd64.whl (3.4 MB view details)

Uploaded CPython 3.11Windows x86-64

said_lam-1.0.3-cp311-cp311-manylinux_2_35_x86_64.whl (3.4 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.35+ x86-64

said_lam-1.0.3-cp311-cp311-manylinux_2_35_aarch64.whl (3.2 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.35+ ARM64

said_lam-1.0.3-cp311-cp311-macosx_11_0_arm64.whl (3.1 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

said_lam-1.0.3-cp310-cp310-win_amd64.whl (3.4 MB view details)

Uploaded CPython 3.10Windows x86-64

said_lam-1.0.3-cp310-cp310-manylinux_2_35_x86_64.whl (3.4 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.35+ x86-64

said_lam-1.0.3-cp310-cp310-manylinux_2_35_aarch64.whl (3.2 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.35+ ARM64

said_lam-1.0.3-cp310-cp310-macosx_11_0_arm64.whl (3.1 MB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

File details

Details for the file said_lam-1.0.3-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: said_lam-1.0.3-cp312-cp312-win_amd64.whl
  • Upload date:
  • Size: 3.4 MB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for said_lam-1.0.3-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 ca87921d99b198719206b7f48745417d9c6f471e612b6094bfe468b4566f8fc8
MD5 3e1ee33e22beff630886044b0d6cf63c
BLAKE2b-256 8c0f071fc66d6cce9707c3c02ce2a170adfb63935d743a59c6cdc95e6e6e2622

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp312-cp312-win_amd64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file said_lam-1.0.3-cp312-cp312-manylinux_2_35_x86_64.whl.

File metadata

File hashes

Hashes for said_lam-1.0.3-cp312-cp312-manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 5e5bab274fdba172368cde6f281ef2983f7e26c0f5045bb8b8bb583ac2e4de7f
MD5 1665bc6b565615a700ad8c1dc5515b25
BLAKE2b-256 ec821a96d529a15a5a708aaa214a770ec47bac0c984d2088e9658a37feda5419

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp312-cp312-manylinux_2_35_x86_64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file said_lam-1.0.3-cp312-cp312-manylinux_2_35_aarch64.whl.

File metadata

File hashes

Hashes for said_lam-1.0.3-cp312-cp312-manylinux_2_35_aarch64.whl
Algorithm Hash digest
SHA256 c868d87cf085768e19b7cbc6a4d3e3fd050c257e1545d11e6f1430386fe25251
MD5 3faf385d3be8d73f9a48e52a09bcdd01
BLAKE2b-256 53d27d2b1e6e2f148bfe0e29cedae7e1560d5af5240a9c23d08d537603a444f4

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp312-cp312-manylinux_2_35_aarch64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file said_lam-1.0.3-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for said_lam-1.0.3-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 74e4d290b1b0ba8e35f6ea75ea0265cc6e65b6d7cfdac8da269dcc75db32f280
MD5 4e89c86c7e675b1d44d4cc159c62ba8d
BLAKE2b-256 f1164c722e44d77c0b6926148d7641b0e392b08a23c473930fbdc5b1fbf0ead5

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file said_lam-1.0.3-cp311-cp311-win_amd64.whl.

File metadata

  • Download URL: said_lam-1.0.3-cp311-cp311-win_amd64.whl
  • Upload date:
  • Size: 3.4 MB
  • Tags: CPython 3.11, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for said_lam-1.0.3-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 40f9add72025ecb8208e8b8741d6392e9c2ba54b6a69a1e50545b440346847a9
MD5 a50ea008205e4d8fecec9c5d4d8db12e
BLAKE2b-256 ccc80985618cba664be1f7f08356ee65c0c6c5ff2f86e207d0cc8bc61fcd6424

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp311-cp311-win_amd64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file said_lam-1.0.3-cp311-cp311-manylinux_2_35_x86_64.whl.

File metadata

File hashes

Hashes for said_lam-1.0.3-cp311-cp311-manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 f6d72be4f1d2d2fa3cb090e4bd493b240bd5ddca162d5eb9f2e9aa2178f603fb
MD5 6c2c5de25058a2bf4ad4fe93019bb0cd
BLAKE2b-256 a66fe68331da47ec79a5e1bf919560a90c9d86df4e0cd90c2bb47e6e345bd01a

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp311-cp311-manylinux_2_35_x86_64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file said_lam-1.0.3-cp311-cp311-manylinux_2_35_aarch64.whl.

File metadata

File hashes

Hashes for said_lam-1.0.3-cp311-cp311-manylinux_2_35_aarch64.whl
Algorithm Hash digest
SHA256 33d753fed186367b3dc4909c84ed92d0b8cd14949a49e14ad2d8afb7c7230772
MD5 37c630c9060d9b0ea85e97f181237953
BLAKE2b-256 f5e5fa9e48bf8194415c72ad12cec4809c49e7018cb40a659e0abf7e696b99c7

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp311-cp311-manylinux_2_35_aarch64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file said_lam-1.0.3-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for said_lam-1.0.3-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 3e561916feb30954e70f80d4e823c2cda3773e767cb6b1f8b17a654700a53578
MD5 56d4beae43b719d9534dd9e8a0f0f432
BLAKE2b-256 fd11dab0390ccc456774fe4b21b0913cb82e7cdb6de46111a4f99f6724fe1187

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file said_lam-1.0.3-cp310-cp310-win_amd64.whl.

File metadata

  • Download URL: said_lam-1.0.3-cp310-cp310-win_amd64.whl
  • Upload date:
  • Size: 3.4 MB
  • Tags: CPython 3.10, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for said_lam-1.0.3-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 bdc10928c277e0d050fb846a82a7bd2a6592bb811224f35eb42b51aca1e450d0
MD5 7b67b97402729db141090a38b8081c91
BLAKE2b-256 bc7430397a72660c33d2ca3fd4a59079902a1239815a58f91e3dec7bbfa040d8

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp310-cp310-win_amd64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file said_lam-1.0.3-cp310-cp310-manylinux_2_35_x86_64.whl.

File metadata

File hashes

Hashes for said_lam-1.0.3-cp310-cp310-manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 83426a42051d59639d6888281033bde67c0f5e13f5f38ff6d9ce530c603d22ba
MD5 b8c15b8e1284e65d85b6bd879be3e1a7
BLAKE2b-256 c87fff96548f0b4cb9ec9cc4a4bd310a3af30856eb11166bd2c6d37000c1e144

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp310-cp310-manylinux_2_35_x86_64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file said_lam-1.0.3-cp310-cp310-manylinux_2_35_aarch64.whl.

File metadata

File hashes

Hashes for said_lam-1.0.3-cp310-cp310-manylinux_2_35_aarch64.whl
Algorithm Hash digest
SHA256 582c06a3e7f1b52499688afe544682ed3c535472a2f697824093c7b8d2a69d03
MD5 44db2d8263767335267db26b0fe64ae3
BLAKE2b-256 63c18daeb768ad7851e883d8c470b89bbf47a72d515526bd25b1a32aaed2d2da

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp310-cp310-manylinux_2_35_aarch64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file said_lam-1.0.3-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for said_lam-1.0.3-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 d1dcb5530ad39f482885f8bd05b8b15b2363fe22710a0a87115d94aaeaa29fe9
MD5 8aefa3d195fdcda5d88964293fe0b3b1
BLAKE2b-256 f47040492e5f4a4a86fd83f7703e68eac85d3a67994a55ad39ff333064b0de32

See more details on using hashes here.

Provenance

The following attestation bundles were made for said_lam-1.0.3-cp310-cp310-macosx_11_0_arm64.whl:

Publisher: release.yml on SAIDResearch/SAID-LAM-private

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page