A library for advanced LLM sampling techniques

LLM Samplers

A Python library that collects advanced sampling methods for language models behind a common interface.

Features

  • Temperature Scaling
  • Top-K Sampling
  • Top-P (Nucleus) Sampling
  • Min-P Sampling
  • Anti-Slop Sampling
  • XTC (Exclude Top Choices) Sampling

Installation

From PyPI

pip install llm-samplers

From Source

  1. Clone the repository:
git clone https://github.com/iantimmis/samplers.git
cd samplers
  2. Create and activate a virtual environment (recommended):
# Using venv
python -m venv .venv
source .venv/bin/activate  # On Unix/macOS
# or
.venv\Scripts\activate  # On Windows

# Using uv (recommended)
uv venv
source .venv/bin/activate  # On Unix/macOS
# or
.venv\Scripts\activate  # On Windows
  3. Install the package in development mode:
# Using pip
pip install -e ".[dev]"  # Includes development dependencies

# Using uv (recommended)
uv pip install -e ".[dev]"  # uv pip needs the extra named explicitly, just like pip

Development

Running Tests

The test suite uses pytest. To run the tests:

# Run all tests
python -m pytest tests/

# Run tests with verbose output
python -m pytest tests/ -v

# Run a specific test file
python -m pytest tests/test_temperature.py

Code Quality

This project uses Ruff for linting and code formatting. To check your code:

# Run Ruff linter
ruff check .

# Run Ruff formatter
ruff format .

The project is configured with a GitHub Action that automatically runs Ruff on all pull requests.

Usage

The package is installed as llm-samplers (with a hyphen) but imported as llm_samplers (with an underscore):

from llm_samplers import TemperatureSampler, TopKSampler, TopPSampler, MinPSampler, AntiSlopSampler, XTCSampler

# Initialize a sampler
sampler = TemperatureSampler(temperature=0.7)

# Use with a HuggingFace model
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Generate text with the sampler
input_text = "Once upon a time"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output_ids = sampler.sample(model, input_ids)
generated_text = tokenizer.decode(output_ids[0])

Available Samplers

Temperature Scaling

Divides the logits by a temperature before the softmax, which adjusts the "sharpness" of the probability distribution:

  • Low temperature (<1.0): More deterministic, picks high-probability tokens
  • High temperature (>1.0): More random, flatter distribution
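
The effect of the two regimes above can be seen in a few lines. This is a minimal sketch in plain Python, not the library's implementation; the function name softmax_with_temperature is hypothetical and chosen only for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=0.5)  # mass concentrates on the top token
flat = softmax_with_temperature(logits, temperature=2.0)   # mass spreads across tokens
```

With these logits, the top token receives a larger share of the probability at temperature 0.5 than at 2.0, while the least likely token receives a smaller share.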

Top-K Sampling

Considers only the 'k' most probable tokens, filtering out unlikely ones.
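
As an illustration, the filter can be sketched on a plain list of logits. This is an assumption-laden sketch rather than the library's code: top_k_filter is a hypothetical helper, and masking with negative infinity is one common way to give the discarded tokens zero probability after the softmax.

```python
import math


def top_k_filter(logits, k):
    """Keep the k largest logits and mask the rest with -inf."""
    if k <= 0:
        raise ValueError("k must be positive")
    k = min(k, len(logits))
    # Indices of the k largest logits; ties are broken by position.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    keep = set(ranked[:k])
    return [x if i in keep else -math.inf for i, x in enumerate(logits)]


filtered = top_k_filter([1.0, 3.0, 0.5, 2.0], k=2)
# Only the logits 3.0 and 2.0 survive; the other two positions become -inf.
```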

Top-P (Nucleus) Sampling

Selects the smallest set of most probable tokens whose cumulative probability exceeds the threshold 'p', and samples only from that set.
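
A minimal sketch of the nucleus selection, working on a plain list of probabilities. The helper name top_p_filter is hypothetical, and stopping as soon as the running total reaches 'p' (rather than strictly exceeding it) is an assumption; implementations differ on that boundary case.

```python
def top_p_filter(probs, p):
    """Zero out tokens outside the nucleus and renormalize the rest."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:  # the nucleus is complete once the mass reaches p
            break
    kept = set(keep)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


nucleus = top_p_filter([0.5, 0.3, 0.15, 0.05], p=0.75)
# The two most likely tokens cover 0.8 of the mass, so the last two are dropped.
```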

Min-P Sampling

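Dynamically adjusts the sampling pool size based on the probability of the most likely token: a token is kept only if its probability is at least a fixed fraction of the top token's probability, so the pool shrinks when the model is confident and grows when it is uncertain.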
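
The scaling rule can be sketched as follows. This is an illustrative sketch, not the library's code; min_p_filter and its min_p parameter are hypothetical names, and the cutoff is assumed to be min_p times the probability of the most likely token.

```python
def min_p_filter(probs, min_p):
    """Drop tokens whose probability is below min_p times the top probability."""
    if not 0.0 <= min_p <= 1.0:
        raise ValueError("min_p must be in [0, 1]")
    threshold = min_p * max(probs)
    kept = [q if q >= threshold else 0.0 for q in probs]
    total = sum(kept)  # the top token always survives, so total > 0
    return [q / total for q in kept]


confident = min_p_filter([0.9, 0.05, 0.03, 0.02], min_p=0.1)  # cutoff is about 0.09
uncertain = min_p_filter([0.3, 0.3, 0.2, 0.2], min_p=0.1)     # cutoff is about 0.03
```

With the same min_p, the confident distribution keeps a single token while the uncertain one keeps all four.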

Anti-Slop

Down-weights unwanted words and phrases. When one appears in the output, the sampler backtracks and retries with adjusted probabilities.
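
The backtrack-and-retry idea can be sketched with a toy model. Everything here is an assumption made for illustration: generate_avoiding, toy_model, the penalty multiplier, and the greedy token choice are hypothetical and do not reflect the library's API or its actual algorithm.

```python
def generate_avoiding(next_probs, banned, max_len, penalty=0.1, max_retries=20):
    """Greedy decoding that rewinds whenever the banned phrase appears at the tail."""
    tokens, retries = [], 0
    adjustments = {}  # position -> {token: cumulative multiplier}
    while len(tokens) < max_len:
        pos = len(tokens)
        probs = dict(next_probs(tokens))
        for tok, mult in adjustments.get(pos, {}).items():
            if tok in probs:
                probs[tok] *= mult
        tokens.append(max(probs, key=probs.get))  # greedy pick keeps the demo deterministic
        hit = len(tokens) >= len(banned) and tokens[-len(banned):] == banned
        if hit and retries < max_retries:
            start = len(tokens) - len(banned)
            slot = adjustments.setdefault(start, {})
            # Down-weight the phrase's first token at the position where it began.
            slot[banned[0]] = slot.get(banned[0], 1.0) * penalty
            del tokens[start:]  # rewind to where the phrase began
            retries += 1
    return tokens


def toy_model(tokens):
    """Hypothetical stand-in for a language model's next-token distribution."""
    if tokens and tokens[-1] == "rich":
        return {"tapestry": 0.9, "history": 0.1}
    return {"rich": 0.6, "long": 0.4}


text = generate_avoiding(toy_model, banned=["rich", "tapestry"], max_len=3)
```

The max_retries guard is there only to keep the sketch from looping forever when no alternative token is available.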

XTC (Exclude Top Choices)

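Enhances creativity by nudging the model away from its most predictable choices: when several tokens clear a probability threshold, the most likely of them can be excluded so that a less obvious candidate is sampled.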
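
A minimal sketch of one common formulation of XTC: with some probability, every token at or above a threshold is removed except the least likely of them. The names xtc_filter, threshold, and probability are assumptions for illustration and may not match the library's parameters.

```python
import random


def xtc_filter(probs, threshold=0.1, probability=0.5, rng=random):
    """Randomly exclude all above-threshold tokens except the least likely one."""
    if rng.random() >= probability:
        return list(probs)  # XTC is not triggered on this step
    above = [i for i, q in enumerate(probs) if q >= threshold]
    if len(above) < 2:
        return list(probs)  # at least two viable choices are needed
    least = min(above, key=lambda i: probs[i])
    removed = set(above) - {least}
    kept = [0.0 if i in removed else q for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]


# probability=1.0 forces the exclusion so the result is deterministic.
out = xtc_filter([0.5, 0.3, 0.15, 0.05], threshold=0.1, probability=1.0)
```

Here the tokens with probabilities 0.5 and 0.3 are excluded, and the remaining mass is renormalized over the last two tokens.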

License

MIT License
