# LatentScore

Generate ambient music from text. Locally. No GPU required.
```python
import latentscore as ls

ls.render("warm sunset over water", model="fast_heavy").play()
```
That's it. One line. You get audio playing on your speakers.
> ⚠️ **Alpha** — under active development. The API may change between versions. See the How It Works section below for details.
## Install

Requires Python 3.10–3.12. If you don't have it: `brew install python@3.10` (macOS) or `pyenv install 3.10`.

```bash
pip install latentscore
```
Or with conda:

```bash
conda create -n latentscore python=3.10 -y
conda activate latentscore
pip install latentscore
```
## CLI

```bash
latentscore doctor                     # check setup and model availability
latentscore demo                       # render and play a sample
latentscore demo --duration 30         # 30-second demo
latentscore demo --output ambient.wav  # save to file
```
## Quick Start

### Render and play
```python
import latentscore as ls

audio = ls.render("warm sunset over water", model="fast_heavy", duration=10.0)
audio.play()              # plays on your speakers
audio.save("output.wav")  # save to WAV
```
### Different vibes

```python
import latentscore as ls

ls.render("jazz cafe at midnight", model="fast_heavy").play()
ls.render("thunderstorm on a tin roof", model="fast_heavy").play()
ls.render("lo-fi study beats", model="fast_heavy").play()
```
## Controlling the Sound

### MusicConfig (full control)

Build a config directly with human-readable labels:
```python
import latentscore as ls

config = ls.MusicConfig(
    tempo="slow",
    brightness="dark",
    space="vast",
    density=3,
    bass="drone",
    pad="ambient_drift",
    melody="contemplative",
    rhythm="minimal",
    texture="shimmer",
    echo="heavy",
    root="d",
    mode="minor",
)
ls.render(config, duration=10.0).play()
```
### MusicConfigUpdate (tweak a vibe)

Start from a vibe and override specific parameters:
```python
import latentscore as ls

audio = ls.render(
    "morning coffee shop",
    duration=10.0,
    update=ls.MusicConfigUpdate(
        brightness="very_bright",
        rhythm="electronic",
    ),
)
audio.play()
```
### Relative steps

`Step(+1)` moves one level up the scale, `Step(-1)` moves one down. Steps saturate at the boundaries.
```python
import latentscore as ls
from latentscore.config import Step

audio = ls.render(
    "morning coffee shop",
    duration=10.0,
    update=ls.MusicConfigUpdate(
        brightness=Step(+2),  # two levels brighter
        space=Step(-1),       # one level less spacious
    ),
)
audio.play()
```
## Streaming

Chain vibes together with smooth crossfade transitions:
```python
import latentscore as ls

stream = ls.stream(
    "morning coffee",
    "afternoon focus",
    "evening wind-down",
    duration=60,     # 60 seconds per vibe
    transition=5.0,  # 5-second crossfade
)
stream.play()

# Or collect and save
stream.collect().save("session.wav")
```
## Live Streaming

For dynamic, interactive use (games, installations, adaptive UIs), use a generator to feed vibes and steer the music in real time:
```python
import asyncio
from collections.abc import AsyncIterator

import latentscore as ls
from latentscore.config import Step

async def my_set() -> AsyncIterator[str | ls.MusicConfigUpdate]:
    yield "warm jazz cafe at midnight"
    await asyncio.sleep(8)
    # Absolute override: switch to bright electronic
    yield ls.MusicConfigUpdate(tempo="fast", brightness="very_bright", rhythm="electronic")
    await asyncio.sleep(8)
    # Relative nudge: dial brightness back down, add more echo
    yield ls.MusicConfigUpdate(brightness=Step(-2), echo=Step(+1))

session = ls.live(my_set(), transition_seconds=2.0)
session.play(seconds=30)
```
Sync generators work too — use `Iterator` instead of `AsyncIterator` and `time.sleep` instead of `await asyncio.sleep`.
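A minimal sketch of the synchronous variant, mirroring the async example above (same API, blocking sleeps):

```python
import time
from collections.abc import Iterator

import latentscore as ls
from latentscore.config import Step

def my_set() -> Iterator[str | ls.MusicConfigUpdate]:
    yield "warm jazz cafe at midnight"
    time.sleep(8)  # blocking sleep instead of await asyncio.sleep
    yield ls.MusicConfigUpdate(brightness=Step(-2), echo=Step(+1))

session = ls.live(my_set(), transition_seconds=2.0)
session.play(seconds=30)
```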
## Async API

For web servers and async apps:
```python
import asyncio

import latentscore as ls

async def main() -> None:
    audio = await ls.arender("neon city rain")
    audio.save("neon.wav")

asyncio.run(main())
```
## Bring Your Own LLM

Use any LLM through LiteLLM — OpenAI, Anthropic, Google, Mistral, Groq, and 100+ others. LiteLLM is included with latentscore.
```python
import latentscore as ls

# Gemini (free tier available)
ls.render("cyberpunk rain on neon streets", model="external:gemini/gemini-3-flash-preview").play()

# Claude
ls.render("cozy library with rain outside", model="external:anthropic/claude-sonnet-4-5-20250929").play()

# GPT
ls.render("space station ambient", model="external:openai/gpt-4o").play()
```
API keys are read from environment variables automatically (`GEMINI_API_KEY`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`).
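You can also set a key from Python before rendering; a minimal sketch, assuming only that the key is looked up from the environment as described above:

```python
import os

import latentscore as ls

# Equivalent to `export GEMINI_API_KEY=...` in your shell; the external
# provider reads the key from the environment at call time.
os.environ["GEMINI_API_KEY"] = "your-key-here"

ls.render("neon city rain", model="external:gemini/gemini-3-flash-preview").play()
```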
### LLM Metadata

External models return rich metadata alongside audio:
```python
import latentscore as ls

audio = ls.render("cyberpunk rain", model="external:gemini/gemini-3-flash-preview")
if audio.metadata is not None:
    print(audio.metadata.title)     # e.g. "Neon Rain Drift"
    print(audio.metadata.thinking)  # the LLM's reasoning
    print(audio.metadata.config)    # the MusicConfig it chose
    for palette in audio.metadata.palettes:
        print([c.hex for c in palette.colors])
```
> **Note:** LLM models are slower than `fast_heavy` (network round-trips) and can occasionally produce invalid configs. `fast_heavy` is recommended for production use.
## How It Works
You give LatentScore a vibe (a short text description) and it generates ambient music that matches.
The recommended `fast_heavy` model uses LAION-CLAP audio embeddings: your vibe text is encoded with CLAP's text encoder and matched against pre-computed CLAP audio embeddings of 10,000+ rendered music configurations. This matches text directly against what configs actually sound like. The best-matching config drives a real-time audio synthesizer.

The lighter `fast` model uses text-to-text retrieval instead (MiniLM sentence embeddings). It's marginally faster but scores 71% lower on audio-text alignment benchmarks.

Both approaches are fast (~2s), fully deterministic (no LLM hallucinations), and require no API keys. Our CLAP benchmarks showed that embedding retrieval outperforms Claude Opus 4.5 and Gemini 3 Flash at mapping vibes to music configurations, and that `fast_heavy` outperforms `fast` by 71%.
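Conceptually, the retrieval step reduces to a nearest-neighbor search in CLAP embedding space. An illustrative sketch (not the library's internals; `pick_config` and its arguments are hypothetical):

```python
import numpy as np

def pick_config(text_emb: np.ndarray, audio_embs: np.ndarray, configs: list):
    """Pick the config whose rendered audio best matches the vibe text.

    text_emb:   CLAP text embedding of the vibe, shape (d,), L2-normalized.
    audio_embs: pre-computed CLAP audio embeddings of rendered configs, shape (n, d).
    """
    sims = audio_embs @ text_emb  # cosine similarity for normalized vectors
    return configs[int(np.argmax(sims))]
```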
## Audio Contract

All audio produced by LatentScore follows this contract:
- Format: `float32` mono
- Sample rate: `44100` Hz
- Range: `[-1.0, 1.0]`
- Shape: `(n,)` numpy array
```python
import numpy as np

import latentscore as ls

audio = ls.render("deep ocean")
samples = np.asarray(audio)  # NDArray[np.float32]
```
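Continuing the snippet above, a quick sanity check against the contract (illustrative):

```python
assert samples.dtype == np.float32          # float32 mono
assert samples.ndim == 1                    # shape (n,)
assert float(np.abs(samples).max()) <= 1.0  # range [-1.0, 1.0]
```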
## Additional Info

### Config Reference

Every `MusicConfig` field uses human-readable labels. Full reference:
| Field | Labels |
|---|---|
| `tempo` | `very_slow` `slow` `medium` `fast` `very_fast` |
| `brightness` | `very_dark` `dark` `medium` `bright` `very_bright` |
| `space` | `dry` `small` `medium` `large` `vast` |
| `motion` | `static` `slow` `medium` `fast` `chaotic` |
| `stereo` | `mono` `narrow` `medium` `wide` `ultra_wide` |
| `echo` | `none` `subtle` `medium` `heavy` `infinite` |
| `human` | `robotic` `tight` `natural` `loose` `drunk` |
| `attack` | `soft` `medium` `sharp` |
| `grain` | `clean` `warm` `gritty` |
| `density` | `2` `3` `4` `5` `6` |
| `root` | `c` `c#` `d` ... `a#` `b` |
| `mode` | `major` `minor` `dorian` `mixolydian` |
Layer styles:

| Layer | Styles |
|---|---|
| `bass` | `drone` `sustained` `pulsing` `walking` `fifth_drone` `sub_pulse` `octave` `arp_bass` |
| `pad` | `warm_slow` `dark_sustained` `cinematic` `thin_high` `ambient_drift` `stacked_fifths` `bright_open` |
| `melody` | `procedural` `contemplative` `rising` `falling` `minimal` `ornamental` `arp_melody` `contemplative_minor` `call_response` `heroic` |
| `rhythm` | `none` `minimal` `heartbeat` `soft_four` `hats_only` `electronic` `kit_light` `kit_medium` `military` `tabla_essence` `brush` |
| `texture` | `none` `shimmer` `shimmer_slow` `vinyl_crackle` `breath` `stars` `glitch` `noise_wash` `crystal` `pad_whisper` |
| `accent` | `none` `bells` `pluck` `chime` `bells_dense` `blip` `blip_random` `brass_hit` `wind` `arp_accent` `piano_note` |
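For instance, fields from both tables combine in a single config (a sketch; it assumes unset fields keep their defaults, as in the `MusicConfig` example earlier):

```python
import latentscore as ls

config = ls.MusicConfig(
    tempo="medium",
    motion="slow",
    stereo="wide",
    human="natural",
    attack="soft",
    grain="warm",
    bass="sub_pulse",
    accent="bells",
)
ls.render(config, duration=10.0).play()
```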
### Local LLM (Expressive Mode)

> **Not recommended.** The default `fast` and `fast_heavy` models are faster, more reliable, and produce higher-quality results. Expressive mode exists for experimentation only.

Runs a 270M-parameter Gemma 3 LLM locally. On macOS Apple Silicon, inference uses MLX (~5–15s). On CPU-only Linux/Windows, it uses `transformers` (30–120s per render). The local model can produce invalid configs, and our benchmarks showed it barely outperforms a random baseline.
```bash
pip install 'latentscore[expressive]'
latentscore download expressive
```

```python
import latentscore as ls

ls.render("jazz cafe at midnight", model="expressive").play()
```
### Research & Training Pipeline

The `data_work/` folder contains the full research pipeline: data preparation, LLM-based config generation, SFT/GRPO training on Modal, CLAP benchmarking, and model export.

See `data_work/README.md` and `docs/architecture.md` for details.
## Contributing

See `CONTRIBUTE.md` for environment setup and contribution guidelines.
See `docs/coding-guidelines.md` for code style requirements.