Turn any recording into a 7.1 surround mix: a model designs a per-song spatial placement and renders a lossless 8-channel FLAC.

Reason this release was yanked:

The installer didn't work easily on a new MacBook.

Project description

Natural Perspective Spatial Audio

Turn any recording into a 7.1 surround mix. After a song is split into its instrument stems, a model invents a scene and decides where each instrument sits around you — then a deterministic renderer builds a lossless 8-channel (7.1) FLAC for your media server.

Natural Perspective — a different soundstage designed per song

One real mix: every stem placed on the stage, the crowd behind you.

See it — no install

Open examples/index.html in any browser, on any OS. It shows a finished mix's scene, soundstage, routing, and stem levels, which is the actual output of the tool.

Run it

Easiest — install from PyPI with pipx, no clone needed. Needs Python 3.10–3.12 and a system FFmpeg:

brew install ffmpeg python@3.12      # macOS; Linux: apt install ffmpeg python3.12
pipx install --python python3.12 'natural-perspective-spatial-audio[full]'
spatial-standards-gui                # or:  spatial-standards song.flac

Or from a clone — one command (macOS/Linux) sets up a private virtualenv and installs everything; on a Mac with Homebrew it installs Python for you too:

./install.sh
./gui                                # the GUI   (or: .venv/bin/spatial-standards song.flac)

Manual / Windows

Needs Python 3.10+ (3.12 recommended — widest wheel coverage):

python3 -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install '.[full]'                # quote it: unquoted, the [..] is a shell glob (Windows cmd.exe: use ".[full]")
spatial-standards song.flac          # also: a folder, or a URL
spatial-standards-gui                # or the GUI

The GUI uses Tkinter, which ships with the python.org installer (recommended on macOS). With Homebrew, add python-tk; on Debian/Ubuntu, run apt install python3-tk.

[full] installs everything as Python packages: FFmpeg and ffprobe (via static-ffmpeg), Demucs, the crowd model (audio-separator), and yt-dlp. A fresh machine works after one install, with no system setup. It runs on CPU by default; for an NVIDIA GPU, install a CUDA build of PyTorch from pytorch.org and run pip install 'audio-separator[gpu]' (much faster). The first run downloads model weights (a few hundred MB).

Prefer your own tools? pip install . has no required Python packages and just calls ffmpeg, demucs, audio-separator, and yt-dlp from your PATH.
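Before a bare pip install ., it can help to confirm that the four external tools are actually on your PATH. The following is a minimal sketch, not part of the package: the helper name missing_tools is hypothetical, and only the executable names come from the text above.

```python
import shutil

# Executable names taken from the README; the helper itself is illustrative.
REQUIRED_TOOLS = ("ffmpeg", "demucs", "audio-separator", "yt-dlp")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the order given."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools found.")
```

The which parameter exists only so the lookup can be swapped out in a test; in normal use the default shutil.which is enough.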

The model layer needs an ANTHROPIC_API_KEY; without one the tool falls back to a built-in mix and runs fully offline.
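The key check described above might look roughly like this. It is a sketch under assumptions: the function name and the "model" / "builtin" labels are hypothetical, and the package's real fallback logic may differ. Only the ANTHROPIC_API_KEY variable name comes from the text.

```python
import os


def choose_design_mode(env=None):
    """Pick "model" when an API key is present, otherwise "builtin".

    Both labels are hypothetical names for the two paths the README describes.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    return "model" if key else "builtin"


if __name__ == "__main__":
    print("Design mode:", choose_design_mode())
```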

Output drops straight into Plex/Jellyfin/Kodi:

<Artist>/Natural Perspective Spatial Audio/<Title> [...].flac   (+ per-album index.html)
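As an illustration of that layout, a path could be assembled as below. This is a hedged sketch: output_path and its parameters are hypothetical, and the source does not say what goes inside the square brackets, so the bracket contents are passed in as an opaque tag instead of being guessed.

```python
from pathlib import Path

# Folder name taken from the layout shown above.
COLLECTION_DIR = "Natural Perspective Spatial Audio"


def output_path(library_root, artist, title, tag):
    """Build <root>/<Artist>/<collection>/<Title> [<tag>].flac.

    `tag` stands in for the bracketed part of the file name, whose real
    contents are not documented here.
    """
    filename = f"{title} [{tag}].flac"
    return Path(library_root) / artist / COLLECTION_DIR / filename
```

For example, output_path("/music", "Artist", "Title", "TAG") yields a path ending in Artist/Natural Perspective Spatial Audio/Title [TAG].flac.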

How it works

Separate stems → measure each stem's level → a model invents a scene and emits a full mix configuration → a safety-clamped renderer builds the 7.1 FLAC. No audio is uploaded — only metadata, cover art, and the measured stem levels inform the design. Every track saves its config and an index.html documenting the scene, routing, and the exact model prompt/response.
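Two of these steps, measuring a stem's level and clamping the model's output before rendering, can be sketched as follows. This is an assumption-laden illustration and not the tool's code: the RMS-in-dBFS measure, the placement fields (stem, channel, gain_db), and the gain limits are all hypothetical choices made for the example.

```python
import math

# Hypothetical limits; the real renderer's clamp ranges are not documented here.
GAIN_DB_RANGE = (-24.0, 6.0)
NUM_CHANNELS = 8  # a 7.1 layout has 8 channels


def rms_dbfs(samples):
    """RMS level of float samples in [-1, 1], in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    # 10*log10 of the mean square equals 20*log10 of the RMS value.
    return 10.0 * math.log10(mean_square)


def clamp(value, low, high):
    """Force value into the closed interval [low, high]."""
    return max(low, min(high, value))


def clamp_placement(placement):
    """Return a copy of one stem's placement with channel and gain in range."""
    low, high = GAIN_DB_RANGE
    return {
        "stem": placement["stem"],
        "channel": int(clamp(placement["channel"], 0, NUM_CHANNELS - 1)),
        "gain_db": clamp(float(placement["gain_db"]), low, high),
    }
```

The point of clamping after the model responds is that the renderer stays deterministic and safe even if the model emits an out-of-range value: a request for channel 12 at +40 dB would be reduced to channel 7 at +6 dB under these example limits.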

Your responsibility, and credits

By processing a file or URL you affirm you have the right to it. This tool hosts no content and ships no audio. See NOTICE for the details and for full credit to the projects it builds on: FFmpeg, Demucs, audio-separator, yt-dlp, and the Mel-Band RoFormer crowd model.

License

Apache-2.0 — see LICENSE and NOTICE. Provided as is, without warranty of any kind.

Download files

Source Distribution

natural_perspective_spatial_audio-0.1.1.tar.gz (54.9 kB view details)

Uploaded Source

Built Distribution

File details

Details for the file natural_perspective_spatial_audio-0.1.1.tar.gz.

File hashes

Hashes for natural_perspective_spatial_audio-0.1.1.tar.gz
Algorithm     Hash digest
SHA256        18a64dfe5cf9e10dde09af1ec92035bfb989e837df4958c9b406e9ec4fab8451
MD5           5054855a366056a2d8f74dfd0ed943c9
BLAKE2b-256   a27536f4259654f93ea1c60969134a61b91bfbfef177cbb1da2d3064b4161ec8
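
To check a downloaded sdist against the SHA256 digest listed above, something like the following would work. It is a minimal sketch using only the standard library; the function names are illustrative, and the local file path is whatever you saved the download as.

```python
import hashlib

# SHA256 digest listed above for natural_perspective_spatial_audio-0.1.1.tar.gz
EXPECTED_SHA256 = "18a64dfe5cf9e10dde09af1ec92035bfb989e837df4958c9b406e9ec4fab8451"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True when the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

Call matches_expected("natural_perspective_spatial_audio-0.1.1.tar.gz") from the download directory; a False result means the file differs from the one these hashes describe.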

Provenance

The following attestation bundles were made for natural_perspective_spatial_audio-0.1.1.tar.gz:

Publisher: publish.yml on mattluttrell/natural-perspective-spatial-audio

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file natural_perspective_spatial_audio-0.1.1-py3-none-any.whl.

File hashes

Hashes for natural_perspective_spatial_audio-0.1.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        271f489c7045f64438c9b94064af0674b81f318e152cac782ec4af0a15d24486
MD5           8b2053ddb0056f93f69757785d591f66
BLAKE2b-256   db147c91575de78ce512d284e34918225c0eaeafed2bbaa83f6ffc1c572142a3

Provenance

The following attestation bundles were made for natural_perspective_spatial_audio-0.1.1-py3-none-any.whl:

Publisher: publish.yml on mattluttrell/natural-perspective-spatial-audio

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
