Pseudo-random seed data generation for ML/LLM training diversity

These details have not been verified by PyPI

Project description

liquidrandom

Pseudo-random seed data for ML/LLM training diversity.

When using LLMs to generate training data, outputs tend to be repetitive and lack variety. liquidrandom solves this by providing a large pool of diverse, pre-generated seed data (personas, jobs, scenarios, etc.) that you can inject into your prompts to steer generation toward more varied outputs.

Installation

pip install liquidrandom
# or
uv add liquidrandom

Quick Start

import liquidrandom

# Get a random persona to inject into your LLM prompt
persona = liquidrandom.persona()
print(persona)
# Alice is a 30-year-old female from Canada. They work as an engineer. ...

# Get a random coding task
task = liquidrandom.coding_task()
print(task)
# [Python, medium] Implement a trie: Build a trie data structure ...

Available Categories

Function	Returns	Description
`liquidrandom.persona()`	`Persona`	Random personas with name, age, gender, occupation, nationality, personality traits, background
`liquidrandom.job()`	`Job`	Professions with title, industry, description, required skills, experience level
`liquidrandom.coding_task()`	`CodingTask`	Programming challenges with title, language, difficulty, description, constraints, expected behavior
`liquidrandom.math_category()`	`MathCategory`	Math categories with name, field, description, example problems
`liquidrandom.writing_style()`	`WritingStyle`	Writing styles with name, tone, characteristics, description
`liquidrandom.scenario()`	`Scenario`	Real-world scenarios with title, context, setting, stakes, description
`liquidrandom.domain()`	`Domain`	Knowledge domains with name, parent field, description, key concepts
`liquidrandom.science_topic()`	`ScienceTopic`	Scientific topics with name, field, subfield, description
`liquidrandom.language()`	`Language`	Languages/locales with name, region, register, script, cultural notes
`liquidrandom.reasoning_pattern()`	`ReasoningPattern`	Reasoning approaches with name, category, description, when to use
`liquidrandom.emotional_state()`	`EmotionalState`	Emotional states with name, intensity, valence, behavioral description
`liquidrandom.instruction_complexity()`	`InstructionComplexity`	Instruction complexity levels with level, ambiguity, description, example

Usage Example

Use liquidrandom to add diversity to your LLM data generation pipeline:

import liquidrandom
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="<OPENROUTER_API_KEY>",
)

persona = liquidrandom.persona()
style = liquidrandom.writing_style()
topic = liquidrandom.science_topic()

prompt = f"""You are {persona}
Write in the following style: {style}
Explain the following topic: {topic}"""

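response = client.chat.completions.create(
    model="liquid/lfm-2-24b-a2b",
    messages=[{"role": "user", "content": prompt}],
)

# The generated text is in the first choice of the completion
print(response.choices[0].message.content)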
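
The single-prompt example above extends naturally to a batch, where every prompt draws fresh seeds. The following is a minimal sketch, not part of the library: `build_prompt` and `diverse_prompts` are hypothetical helper names, and the samplers are passed in as plain callables so the code runs without liquidrandom or network access.

```python
from typing import Callable, Iterator


def build_prompt(persona: object, style: object, topic: object) -> str:
    """Assemble one prompt; each seed is rendered through its __str__."""
    return (
        f"You are {persona}\n"
        f"Write in the following style: {style}\n"
        f"Explain the following topic: {topic}"
    )


def diverse_prompts(
    n: int,
    sample_persona: Callable[[], object],
    sample_style: Callable[[], object],
    sample_topic: Callable[[], object],
) -> Iterator[str]:
    """Yield n prompts, calling each sampler once per prompt."""
    for _ in range(n):
        yield build_prompt(sample_persona(), sample_style(), sample_topic())
```

With the package installed, the samplers would presumably be the library functions themselves, for example `diverse_prompts(100, liquidrandom.persona, liquidrandom.writing_style, liquidrandom.science_topic)`.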

Each call to a liquidrandom function returns an instance of a typed dataclass. You can use it directly in an f-string (via __str__) or access its individual fields:

persona = liquidrandom.persona()
print(persona.name)               # "Alice"
print(persona.age)                 # 30
print(persona.personality_traits)  # ["curious", "patient"]
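
To illustrate why a dataclass with `__str__` works both in f-strings and through field access, here is a small stand-in. `PersonaSketch` is a hypothetical class written for this example; it is not liquidrandom's actual `Persona` definition, and the field set and the wording of its string form are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PersonaSketch:
    """Hypothetical stand-in for a seed record; not the library's real class."""

    name: str
    age: int
    occupation: str
    personality_traits: list[str] = field(default_factory=list)

    def __str__(self) -> str:
        # Prose rendering, so the object can be dropped straight into a prompt
        traits = ", ".join(self.personality_traits)
        return f"{self.name} is a {self.age}-year-old {self.occupation}. Traits: {traits}."


persona = PersonaSketch("Alice", 30, "engineer", ["curious", "patient"])
prompt = f"You are {persona}"  # the f-string calls __str__ implicitly
print(prompt)
print(persona.age)  # individual fields remain accessible
```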

How It Works

The dataset contains 340,000+ samples across 12 categories, generated using hierarchical taxonomy trees with LLM-based quality validation and fuzzy deduplication.
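
The description does not say which similarity measure the fuzzy deduplication uses. As a rough illustration of the idea only, the sketch below keeps an item unless it is nearly identical to one already kept, using the standard library's `difflib.SequenceMatcher`. The function name `fuzzy_dedupe` and the 0.9 threshold are assumptions, and the quadratic comparison would not scale to a dataset of this size without further indexing.

```python
from difflib import SequenceMatcher


def fuzzy_dedupe(items: list[str], threshold: float = 0.9) -> list[str]:
    """Keep each item unless it is near-identical to an item already kept."""
    kept: list[str] = []
    for item in items:
        normalized = item.lower().strip()
        is_duplicate = any(
            SequenceMatcher(None, normalized, other.lower().strip()).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(item)
    return kept


samples = ["Implement a trie", "implement a trie ", "Sort a linked list"]
print(fuzzy_dedupe(samples))
```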

Seed data is hosted on Hugging Face (mlech26l/liquidrandom-data) as zstd-compressed Parquet files. On first use of a category, only that category's file is downloaded and cached locally; subsequent calls read from the cache.
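
The download-once behavior described above follows a common pattern that can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: `ensure_cached`, the `<category>.parquet` file name, and the injected `download` callable are all hypothetical, and the real cache location and file layout are not documented here.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_cached(category: str, cache_dir: Path, download: Callable[[str], bytes]) -> Path:
    """Return the local file for a category, fetching it only if it is missing."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{category}.parquet"  # hypothetical file name
    if not target.exists():
        target.write_bytes(download(category))
    return target


# Demonstration with a fake downloader that records how often it is called
calls: list[str] = []


def fake_download(category: str) -> bytes:
    calls.append(category)
    return b"placeholder bytes"


with tempfile.TemporaryDirectory() as tmp:
    first = ensure_cached("persona", Path(tmp), fake_download)
    second = ensure_cached("persona", Path(tmp), fake_download)
    print(first == second, len(calls))  # True 1
```

Passing the downloader in as a parameter keeps the caching logic testable without network access.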

License

MIT

Project details

Release history

Version	Release date
0.4.0	Apr 15, 2026
0.3.4	Apr 15, 2026
0.3.3	Apr 2, 2026
0.3.2	Mar 19, 2026
0.3.1	Mar 19, 2026
0.3.0	Mar 19, 2026
0.2.0	Mar 10, 2026
0.1.0 (this version)	Mar 4, 2026

Download files

Source Distribution

liquidrandom-0.1.0.tar.gz (86.7 kB)

Uploaded Mar 4, 2026 Source

Built Distribution

liquidrandom-0.1.0-py3-none-any.whl (11.5 kB)

Uploaded Mar 4, 2026 Python 3

File details

Details for the file liquidrandom-0.1.0.tar.gz.

File metadata

Download URL: liquidrandom-0.1.0.tar.gz
Upload date: Mar 4, 2026
Size: 86.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for liquidrandom-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`dee05c1aad613b4575fc7b2ed27e7feb471128ab23a563bcda2eecdb90763d62`
MD5	`c8f3b6d9dfd8f92102ffc754090b61fc`
BLAKE2b-256	`1c2fd9ede0a7f2a81592ea69d291be50586fccd0fecb291827b6622aa1b3cf4e`
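
A downloaded file can be checked against the SHA256 digest listed above with the standard library. The sketch below is a generic verification routine; the helper names `sha256_of` and `matches_published_hash` are chosen for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files do not need to fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, this would be called as `matches_published_hash(Path("liquidrandom-0.1.0.tar.gz"), "dee05c1aad613b4575fc7b2ed27e7feb471128ab23a563bcda2eecdb90763d62")`, assuming the file sits in the current directory.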

File details

Details for the file liquidrandom-0.1.0-py3-none-any.whl.

File metadata

Download URL: liquidrandom-0.1.0-py3-none-any.whl
Upload date: Mar 4, 2026
Size: 11.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for liquidrandom-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e979fe7c6f5356012b01acfe693ee6a97b56a09f8c44342a40f05b44ae61493f`
MD5	`60844b3857c9ac2eed52d523b21830ec`
BLAKE2b-256	`a845b50af538683f10c2c56acff7f871f1729e3b33352c825e132ecab010ab1f`
