🦎 Chimera

Cryptographically Irreversible, Speaker-Aware Voice Anonymisation


Chimera disguises the identity of one or more speakers in an audio recording. It runs entirely on CPU and requires no cloud services, GPU, or pre-trained model weights. Given the output audio, with or without a key, no known algorithm can recover the original speaker's acoustic identity.

Wiki · Quick Start · Paper


Why Chimera?

Feature                          Chimera  Pitch shift  Neural VC
Fully local / offline            ✅       ✅           ⚠️
No model weights                 ✅       ✅           ❌
Multi-speaker, per-speaker keys  ✅       ❌           ❌
Automatic VAD (ignores noise)    ✅       ❌           ❌
Natural-sounding output          ✅       ❌           ✅
Deterministic / auditable        ✅       ❌           ❌
Cryptographically one-way        ✅       ❌           ❌
Real-time microphone support     ✅       ✅           GPU

How It Works

Chimera applies six processing stages to every audio input:

Input audio
    │
    ▼  [1] VAD: isolates speech from noise, music, silence
    │
    ▼  [2] Diarization: MFCC k-means, assigns segments to speakers
    │
    ▼  [3] Key Derivation: HKDF-SHA256 per speaker → 8 parameters
    │
    ▼  [4] 7-Layer Vocoder Stack (WORLD)
    │       L1 Pitch shift       F0 × 2^(Δst/12)
    │       L2 Sinusoidal vibrato modulation
    │       L3 Micro-temporal jitter
    │       L4 Formant warp      SP resampled at α·f
    │       L5 Spectral tilt     ±4 dB/kHz ramp
    │       L6 Sub-harmonic injection
    │       L7 Breathiness blend toward noise
    │
    ▼  [5] COWL: Cryptographic One-Way Layer
    │       Sub-perceptual spectral noise (SSNI)
    │       Phase randomisation
    │       Non-linear spectral quantisation (NLSQ)
    │
    ▼  [6] Cross-fade stitching + normalisation
    │
Output audio (mono float64, same sample rate)

All randomness is derived from the key via HKDF, so the transformation is fully deterministic (same key → same output), and it is designed to be one-way (output → original is computationally infeasible).
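As a rough illustration of stage [3], the sketch below implements HKDF-SHA256 (RFC 5869) with the standard library and turns its output into eight numbers in [0, 1). It is not Chimera's keygen.py: the `info` label, the 32-bit word mapping, and the function names are assumptions made for this example. Only the per-speaker key format is taken from the Security section.

```python
import hashlib
import hmac
import struct


def hkdf_sha256(ikm: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand to `length` bytes."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: an empty salt is replaced by hash_len zero bytes, as the RFC specifies.
    prk = hmac.new(salt or b"\x00" * hash_len, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(prk, T(i-1) || info || i)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_unit_params(key: str, speaker_id: str, n_params: int = 8) -> list[float]:
    """Hypothetical helper: n_params deterministic values in [0, 1) per speaker."""
    # Per-speaker key material, using the domain-separation format from the Security section.
    ikm = (key + ":chimera:spk:" + speaker_id).encode("utf-8")
    # The info label is a placeholder chosen for this sketch.
    raw = hkdf_sha256(ikm, info=b"example-params", length=4 * n_params)
    # Each 32-bit big-endian word becomes a float in [0, 1).
    words = struct.unpack(f">{n_params}I", raw)
    return [w / 2**32 for w in words]


a = derive_unit_params("my-secret", "SPEAKER_0")
b = derive_unit_params("my-secret", "SPEAKER_0")
c = derive_unit_params("my-secret", "SPEAKER_1")
print(a == b)   # same key and speaker: identical parameters
print(a != c)   # different speaker: a different parameter set
```

In a real pipeline each unit value would then be scaled into the range of its parameter (pitch shift, formant warp, and so on); those ranges are not given here, so the sketch stops at the unit interval.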


Installation

pip install chimera-voice

With real-time microphone support:

pip install "chimera-voice[realtime]"

From source:

git clone https://github.com/Ohswedd/chimera
cd chimera
pip install -e ".[dev]"

Quick Start

Anonymise a file

import chimera

chimera.mask_file("interview.wav", "anonymous.wav", key="my-secret", preset="strong")

Work with NumPy arrays

import chimera
import soundfile as sf

audio, sr = sf.read("interview.wav")
result = chimera.mask_array(audio, sr, key="my-secret", preset="strong")
sf.write("anonymous.wav", result.audio, sr)

Multi-speaker: independent key per speaker

result = chimera.mask_array(
    audio, sr,
    key        = "my-secret",
    preset     = "moderate",
    mode       = chimera.MaskMode.ALL_UNIQUE,   # default
    n_speakers = 3,
)
print(result.speakers_masked)    # ['SPEAKER_0', 'SPEAKER_1', 'SPEAKER_2']
print(f"Processed in {result.processing_time_s:.2f}s")

Mask only selected speakers

result = chimera.mask_array(
    audio, sr,
    key          = "my-secret",
    preset       = "strong",
    mode         = chimera.MaskMode.SELECTED,
    speaker_ids  = ["SPEAKER_0"],   # only mask speaker 0
)

Real-time microphone

from chimera.realtime import RealtimeAnonymiser

anon = RealtimeAnonymiser(key="my-secret", preset="moderate")
anon.start()
input("🎙  Recording - press Enter to stop...")
anon.stop()
anon.save("recorded_anonymous.wav")

Streaming (generator-based)

from chimera.realtime import mask_stream

# my_chunk_generator and send_to_output are placeholders supplied by the caller.
for masked_chunk in mask_stream(my_chunk_generator, sr=22050,
                                key="my-secret", preset="strong"):
    send_to_output(masked_chunk)

Inspect parameters

p = chimera.get_params("my-secret", preset="strong")
print(p.summary())

# Chimera MaskParams
# ─────────────────────────────────────────────
#   Speaker label      : (none)
#   Pitch shift        : +5.743 st
#   Formant warp       : 0.91234×
#   Spectral tilt      : -2.814 dB/kHz
#   Breathiness        : 0.3122
#   Temporal jitter    : 0.01203 σ
#   Vibrato rate       : 4.210 Hz
#   Vibrato depth      : 0.2891 st
#   Subharmonic mix    : 0.0984
#   Master intensity   : 0.780
# ─────────────────────────────────────────────
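To read the pitch-shift figure in that summary: vocoder layer L1 scales F0 by 2^(Δst/12), so a value in semitones corresponds to a frequency ratio. The sketch below works through that arithmetic. The function names and the F0 contour are made up for illustration, and it assumes the common vocoder convention that unvoiced frames carry F0 = 0.

```python
def semitones_to_ratio(delta_st: float) -> float:
    """Frequency ratio for a pitch shift of delta_st semitones (12 per octave)."""
    return 2.0 ** (delta_st / 12.0)


def shift_f0(f0_hz: list[float], delta_st: float) -> list[float]:
    """Scale an F0 contour; frames with F0 == 0 (unvoiced) stay at 0."""
    ratio = semitones_to_ratio(delta_st)
    return [f * ratio for f in f0_hz]


contour = [0.0, 110.0, 112.5, 0.0]            # made-up F0 track in Hz
print(semitones_to_ratio(12.0))               # one octave up doubles the frequency
print(round(semitones_to_ratio(5.743), 3))    # ratio for the +5.743 st shown above
print(shift_f0(contour, 12.0))
```

A shift of +5.743 st therefore raises every voiced frame by a little under 40 %, roughly the interval of a perfect fourth plus three quarters of a semitone.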

Presets

Preset    Intensity  ASV EER†  Use case
whisper   0.12       ~5 %      Soft watermarking
subtle    0.28       ~15 %     Light disguise
moderate  0.52       ~35 %     Speaker unrecognisable to humans
strong    0.78       ~48 %     ASV systems fail
extreme   1.00       ~50 %     Maximum (content only)

† Indicative Equal Error Rate against ECAPA-TDNN (VoicePrivacy 2024 protocol).
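For readers unfamiliar with the EER column, the sketch below shows how an equal error rate can be estimated from verification scores: it is the operating point where the false-accept and false-reject rates meet, so a value near 50 % means the verifier does no better than chance. The scores are made up and the function is a simple threshold sweep written for this example, not the VoicePrivacy evaluation code.

```python
def equal_error_rate(genuine: list[float], impostor: list[float]) -> float:
    """Approximate EER: the error rate at the threshold where FAR and FRR are closest."""
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)   # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)      # genuine trials rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Well-separated scores: the verifier still recognises the speaker.
print(equal_error_rate([0.8, 0.9], [0.1, 0.2]))
# Indistinguishable score distributions: the verifier is guessing.
print(equal_error_rate([0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.4]))
```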


MaskMode

Mode                 Behaviour
MaskMode.ALL_UNIQUE  Each speaker gets independent parameters (default)
MaskMode.ALL_SAME    All speakers share the same transformation
MaskMode.SELECTED    Only speakers listed in speaker_ids are masked

Security

Chimera is designed with three security properties:

Determinism: F(audio, key) always returns the same output. Every bit of randomness is derived from the key via HKDF-SHA256.

One-way: given F(audio, key), recovering the original speaker's acoustic identity requires simultaneously inverting:

  1. Non-linear μ-law spectral quantisation (ill-posed)
  2. Key-seeded phase randomisation (requires HKDF seed)
  3. Key-derived sub-perceptual noise injection (requires HMAC sub-key)
  4. Seven vocoder transformation layers (non-invertible without all params)

Key independence: HKDF-SHA256 builds on SHA-256, which offers 128-bit collision resistance and 256-bit second-preimage resistance. Per-speaker keys are domain-separated: key + ":chimera:spk:" + speaker_id.
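To make item 1 of the one-way list concrete, the sketch below shows why μ-law quantisation is ill-posed to invert: distinct input magnitudes collapse onto the same quantisation level, so the exact input cannot be read back from the output. The constant μ = 255, the 16 levels, and the sample values are assumptions chosen for this example, and it illustrates information loss only, not a cryptographic guarantee.

```python
import math

MU = 255.0  # companding constant; assumed here, not taken from Chimera


def mu_law_compress(x: float, mu: float = MU) -> float:
    """Map x in [0, 1] to [0, 1] with logarithmic (mu-law) companding."""
    return math.log1p(mu * x) / math.log1p(mu)


def mu_law_expand(y: float, mu: float = MU) -> float:
    """Inverse of mu_law_compress for un-quantised values."""
    return math.expm1(y * math.log1p(mu)) / mu


def quantise(x: float, levels: int = 16) -> float:
    """Compress, snap to one of `levels` steps, then expand again."""
    step = round(mu_law_compress(x) * (levels - 1))
    return mu_law_expand(step / (levels - 1))


# Two different spectral magnitudes (made-up values) land on the same level.
x1, x2 = 0.50, 0.55
print(quantise(x1) == quantise(x2))
print(quantise(x1))
```

Because the mapping is many-to-one, anyone inverting it has to choose among a whole interval of candidate inputs for every quantised value.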

โš ๏ธ Chimera is a privacy-enhancing tool, not an encryption scheme. For high-stakes deployments, combine it with access control, key rotation, and additional anonymisation measures. See the Security wiki page for the full threat model.


Supported Audio Formats

Any format supported by libsndfile: WAV, FLAC, OGG, AIFF, and more.

Supported sample rates: 8 000, 16 000, 22 050, 24 000, 44 100, 48 000 Hz.


Project Structure

chimera/
├── chimera/              # Library source
│   ├── __init__.py       # Public API surface
│   ├── core.py           # High-level functions: mask_file, mask_array, get_params
│   ├── pipeline.py       # ChimeraPipeline: full orchestration
│   ├── keygen.py         # HKDF-SHA256 parameter derivation
│   ├── vad.py            # Voice Activity Detector
│   ├── diarize.py        # MFCC k-means speaker diarizer
│   ├── transform.py      # 7-layer WORLD vocoder stack
│   ├── irreversible.py   # Cryptographic One-Way Layer (COWL)
│   ├── realtime.py       # Real-time microphone / streaming engine
│   ├── presets.py        # Named intensity presets
│   ├── types.py          # MaskParams, ChimeraResult, MaskMode, …
│   ├── exceptions.py     # Exception hierarchy
│   └── py.typed          # PEP 561 marker
├── tests/                # Full test suite (26 tests)
├── examples/             # Runnable usage examples
├── docs/                 # Documentation
├── benchmarks/           # Performance benchmarks
├── paper/                # Academic paper (PDF)
├── .github/workflows/    # CI and release workflows
├── pyproject.toml
├── CHANGELOG.md
├── CONTRIBUTING.md
└── LICENSE

Development

Setup

git clone https://github.com/Ohswedd/chimera
cd chimera
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

Run tests

pytest                         # all tests
pytest -m "not slow"           # skip slow tests
pytest --cov=chimera           # with coverage

Lint and format

black chimera tests examples
ruff check chimera tests examples
mypy chimera

Build distribution

pip install build
python -m build

Academic Paper

The full technical paper is available at paper/chimera_paper.pdf.

It covers the full threat model (A1–A4), HKDF key derivation, all seven vocoder layers, the COWL security argument, the diarization architecture, the real-time latency profile, a comparison with the state of the art, and 15 references.

Citation:

@software{chimera2026,
  title   = {Chimera: Cryptographically Irreversible Speaker-Aware Voice Anonymisation},
  author  = {Ohswedd},
  year    = {2026},
  url     = {https://github.com/Ohswedd/chimera},
  version = {0.1.0},
  license = {MIT}
}

Changelog

See CHANGELOG.md.

Contributing

See CONTRIBUTING.md.

License

MIT © 2026 Ohswedd
