All-In-One Music Structure Analyzer with source separation
All-In-One-Fix Music Structure Analyzer
An enhanced version of All-In-One with integrated source separation and modern PyTorch compatibility
Acknowledgments:
This package builds upon the exceptional work of the foundational project:
- All-In-One by Taejun Kim and Juhan Nam - The core music structure analysis algorithms and models. We are deeply grateful for their groundbreaking research in music information retrieval.
This enhanced version preserves all original research contributions while improving compatibility and workflow flexibility. All credit for the core algorithms belongs to the original authors.
This package provides models for music structure analysis, predicting:
- Tempo (BPM)
- Beats
- Downbeats
- Functional segment boundaries
- Functional segment labels (e.g., intro, verse, chorus, bridge, outro)
What's New in All-In-One-Fix (v2.0.0)
Integrated Source Separation
- Source Separation: Uses demucs-infer package for high-quality source separation
- Clean Dependencies: Inference-only Demucs integration via demucs-infer package
- Model Caching: Intelligent model caching for improved performance (6x faster on repeated use)
- GPU Memory Management: Automatic GPU cleanup prevents out-of-memory errors
- Better Error Messages: Fuzzy matching suggestions for model names
Enhanced Compatibility
- PyTorch 2.x Support: Compatible with PyTorch 2.0 through 2.7+ and CUDA 11.7-12.8
- NATTEN 0.17.x Verified: Fully tested and working with PyTorch 2.0-2.7+
- Automatic version detection supports NATTEN 0.17.x-0.19.x
- Extensively tested with real music analysis workloads
- Note: NATTEN 0.20+ (including 0.21.0) is not compatible due to API changes requiring dimensional validation updates
- Unified Package: Single package with all functionality included
- Modern Packaging: UV-style packaging with full pip compatibility
Flexible Source Separation
- Custom Models: Integrate your own source separation models via pluggable architecture
- Pre-computed Stems: Use existing separated stems from any source separation tool
- Direct Stems Input: Skip source separation entirely by providing stems directly
- Hybrid Workflows: Mix custom separation, pre-computed stems, and default separation
Cache Management
- View Cache: allin1fix --cache-info to see cached separation models
- Clear Cache: allin1fix --clear-cache to free up disk space
- Python API: allin1fix.print_cache_info(), allin1fix.clear_model_cache()
Enhanced CLI & API
- Backward Compatible: All original functionality preserved with allin1fix namespace
- Rich CLI Options: New stems handling and cache management options
- Python API: Enhanced analyze function with new stem provider system
Table of Contents
- Motivation & Changes
- Installation
- Usage for CLI
- Usage for Python
- Visualization & Sonification
- Available Models
- Speed
- Advanced Usage for Research
- Concerning MP3 Files
- Migration from All-In-One
- Citation
- About All-In-One-Fix
- Documentation
Motivation & Changes
Why This Fork?
The original All-In-One package is an excellent music structure analysis tool, but it needed updates for modern PyTorch environments:
- PyTorch 2.x Compatibility: NATTEN library needed upgrade for PyTorch 2.x
- Source Separation: Required separate source separation setup
- Performance: No model caching, repeated model loading
- Modern Tooling: Packaging and dependency management improvements
Note: This fork uses demucs-infer, a maintained inference-only package with PyTorch 2.x support for source separation.
What Changed in v2.0.0?
This fork addresses these issues through strategic integration and improvements:
1. NATTEN Upgrade for Modern PyTorch
Before:
# Original All-In-One used NATTEN 0.15.0 (PyTorch 1.x only)
dependencies = ["natten>=0.15.0"]
After:
# Flexible NATTEN support: 0.17.5 through 0.21.0+
dependencies = ["natten>=0.17.5"] # Supports PyTorch 2.0-2.7.0
Changes Made:
- Upgraded NATTEN dependency from 0.15.0 to 0.17.5+ (flexible)
- Added automatic version detection for NATTEN 0.17.5-0.21.0+
- Code automatically adapts to installed NATTEN version
- Tested with PyTorch 2.0-2.7.0 and CUDA 11.7-12.8
NATTEN Version Support:
- NATTEN 0.17.5: PyTorch 2.0-2.6, CUDA 11.7-12.1
- NATTEN 0.21.0: PyTorch 2.7.0, CUDA 12.8
Impact: All-In-One models work with both legacy and latest PyTorch versions
2. Streamlined Source Separation
Before:
# Required external demucs package (no longer maintained)
dependencies = ["demucs"] # PyTorch 1.x only, not actively maintained
After:
# Uses demucs-infer (maintained, PyTorch 2.x compatible)
dependencies = ["demucs-infer"]
Changes Made:
- Switched to demucs-infer (maintained fork of Demucs for inference)
- PyTorch 2.x compatibility (no torchaudio<2.1 restriction)
- Added intelligent model caching for 6x performance improvement
- Implemented automatic GPU memory cleanup
- Enhanced error messages with model name suggestions
Impact: Actively maintained dependencies, faster processing, modern PyTorch support
3. Enhanced Cache Management
Added Features:
- View cached models: allin1fix --cache-info
- Clear cache: allin1fix --clear-cache (with --clear-cache-dry-run preview)
- Python API: get_cache_size(), list_cached_models(), clear_model_cache()
- Tracks model files (.th, .pth) in ~/.cache/torch/hub/checkpoints/
Impact: Better disk space management, visibility into cached models
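To see what that tracking amounts to, here is a minimal sketch that counts the `.th` and `.pth` files in the checkpoint directory and sums their sizes. It is not the package's implementation: `summarize_cache`, `DEFAULT_CACHE_DIR`, and `MODEL_SUFFIXES` are hypothetical names, and the directory is simply the path quoted above.

```python
from pathlib import Path

# Assumed default checkpoint location, taken from the path quoted in this README.
DEFAULT_CACHE_DIR = Path.home() / ".cache" / "torch" / "hub" / "checkpoints"
MODEL_SUFFIXES = {".th", ".pth"}


def summarize_cache(cache_dir=DEFAULT_CACHE_DIR):
    """Return (file_count, total_bytes) for model files directly inside cache_dir."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return 0, 0
    files = [
        p for p in cache_dir.iterdir()
        if p.is_file() and p.suffix in MODEL_SUFFIXES
    ]
    return len(files), sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    count, total = summarize_cache()
    print(f"{count} model file(s), {total / 1024**3:.2f} GB")
```

For everyday use, the built-in `allin1fix --cache-info` command is the supported way to get this information.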
4. Documentation & Code Cleanup
Changes:
- Updated to use demucs-infer package instead of embedded code
- Added proper acknowledgments to both All-In-One and Demucs projects
- Clarified integration source and original authorship
- Improved docstrings and code comments
Impact: Clear attribution, easier maintenance, better developer experience
5. Modern Packaging with UV
Before:
# Traditional setup.py or older pyproject.toml
After:
# Modern pyproject.toml with UV support
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
Changes Made:
- Converted to modern UV-style packaging using pyproject.toml
- Uses hatchling as build backend
- Maintains full pip compatibility - works with both uv and pip
- Follows PEP 621 for project metadata
Installation Methods:
# With UV (recommended, faster); install PyTorch first
uv pip install all-in-one-fix --no-build-isolation
# With traditional pip (still supported); install PyTorch first
pip install all-in-one-fix --no-build-isolation
# Editable install for development
uv pip install -e .
pip install -e .
Impact: Faster dependency resolution with UV, while maintaining compatibility with traditional pip workflows
Respect for Original Work
This project integrates two foundational open-source projects:
Original Projects:
- All-In-One - Music structure analysis by Taejun Kim & Juhan Nam
- demucs-infer - Source separation package (PyTorch 2.x compatible)
What's New in v2.0.0:
- NATTEN 0.17.5 for PyTorch 2.x compatibility
- demucs-infer integration for source separation
- Performance improvements (6x faster with model caching)
- Enhanced error handling and cache management
- Modern packaging with UV support
- 100% backward compatible with All-In-One API
What Stayed the Same:
- All-In-One model architectures (unchanged)
- Beat/downbeat tracking algorithms (unchanged)
- Tempo estimation (unchanged)
- Structure segmentation (unchanged)
- Research quality and accuracy (unchanged)
Credit:
- All-In-One research → Taejun Kim & Juhan Nam (original paper)
- Source separation → demucs-infer package (openmirlab/demucs-infer)
- This fork → PyTorch 2.x compatibility, performance improvements, modern tooling
Installation
Available on PyPI: https://pypi.org/project/all-in-one-fix/
Quick Install from PyPI
For most users, use these commands:
# Install PyTorch first
pip install "torch>=2.0.0"
# Install all-in-one-fix (madmom will be auto-installed during installation)
pip install all-in-one-fix --no-build-isolation
Note: madmom will be automatically installed from git during the all-in-one-fix installation. If auto-installation fails, install it manually:
pip install git+https://github.com/CPJKU/madmom
Or if you prefer UV (faster):
uv add torch
uv add git+https://github.com/CPJKU/madmom
uv add all-in-one-fix --no-build-isolation
Step-by-Step Installation from PyPI
If you prefer to install step-by-step:
Step 1: Install PyTorch first (required)
pip install "torch>=2.0.0"
Step 2: Install all-in-one-fix (madmom will be auto-installed)
pip install all-in-one-fix --no-build-isolation
Note: madmom is automatically installed during Step 2. If it fails, install manually:
pip install git+https://github.com/CPJKU/madmom
Why Multiple Steps?
Important:
- PyTorch must be installed first because natten requires torch during its build process
- Use the --no-build-isolation flag when installing all-in-one-fix (so natten can access torch during build)
- madmom is automatically installed from git during all-in-one-fix installation (PyPI doesn't allow git dependencies, so we use a post-install hook)
What happens if you skip this?
- pip install all-in-one-fix alone will fail with: ModuleNotFoundError: No module named 'torch'
- This is because pip's build isolation prevents access to installed packages during build
Requirements
- Python: 3.9 or later (required for scipy>=1.13 and madmom)
- PyTorch: 2.0.0 to <2.8.0 (PyTorch 2.8+ breaks natten 0.17.5 compatibility)
- OS: Linux, macOS, Windows
Important Compatibility Note:
- PyTorch 2.0-2.7: Fully supported with natten 0.17.5
- PyTorch 2.8+: Not compatible with natten 0.17.5 (internal API changes). Use PyTorch <2.8.0 or wait for natten 0.21+ support
NATTEN Version Compatibility:
- NATTEN 0.17.5: Works with PyTorch 2.0-2.7, CUDA 11.7-12.1
- NATTEN 0.21.0+: Requires code updates (API changes) - not yet supported
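The version bounds above can be checked before installing. The following is a rough pre-flight sketch, not something shipped with the package: `parse_major_minor`, `torch_version_supported`, and `python_supported` are hypothetical helpers, and the bounds (Python 3.9+, PyTorch 2.0 up to but excluding 2.8) are copied from the requirements listed here.

```python
import sys


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '2.7.1+cu121'."""
    major, minor = version.split("+")[0].split(".")[:2]
    return int(major), int(minor)


def torch_version_supported(version):
    """True if the version lies in the 2.0 <= torch < 2.8 range stated above."""
    return (2, 0) <= parse_major_minor(version) < (2, 8)


def python_supported(version_info=sys.version_info):
    """True for Python 3.9 or later."""
    return tuple(version_info[:2]) >= (3, 9)


if __name__ == "__main__":
    print("Python OK:", python_supported())
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; install it before all-in-one-fix.")
    else:
        print("PyTorch", torch.__version__, "OK:",
              torch_version_supported(torch.__version__))
```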
GPU Support (Optional)
For GPU acceleration, install PyTorch with CUDA support:
# Example: CUDA 12.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install all-in-one-fix --no-build-isolation
What Won't Work
This will FAIL:
pip install all-in-one-fix # Missing torch and --no-build-isolation
Error you'll see:
ModuleNotFoundError: No module named 'torch'
hint: This error likely indicates that `natten@0.17.5` depends on `torch`,
but doesn't declare it as a build dependency.
Verify Installation
After installation, verify it worked:
# Check if installed
python -c "import allin1fix; print('allin1fix installed successfully!')"
# Check version
python -c "import allin1fix; print(allin1fix.__version__)"
# Test CLI
allin1fix --help
Troubleshooting
Installation fails with "No module named 'torch'"
- Cause: Didn't install torch first or didn't use --no-build-isolation
- Solution: Install torch>=2.0.0 first, then use --no-build-isolation
Installation fails with scipy version error
- Cause: Using Python < 3.9
- Solution: Ensure Python 3.9+ is used
ImportError: No module named 'madmom'
- Cause: madmom must be installed separately from git (PyPI limitation)
- Solution: Run pip install git+https://github.com/CPJKU/madmom before using allin1fix
Installation fails with madmom error
- Cause: Installing madmom from GitHub requires git and internet
- Solution: Ensure git is installed (apt install git or brew install git) and you have internet access
Installation from GitHub (Development)
If you want to install the latest development version from GitHub:
Using UV (Recommended):
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install PyTorch first
uv pip install "torch>=2.0.0"
# Install from GitHub
uv pip install git+https://github.com/openmirlab/all-in-one-fix.git --no-build-isolation
Using pip:
# Install PyTorch first
pip install "torch>=2.0.0"
# Install from GitHub
pip install git+https://github.com/openmirlab/all-in-one-fix.git --no-build-isolation
Development Installation (Editable):
git clone https://github.com/openmirlab/all-in-one-fix.git
cd all-in-one-fix
pip install "torch>=2.0.0"
pip install -e . --no-build-isolation
Note: All dependencies (including demucs-infer and madmom) will be installed automatically from GitHub. You must install PyTorch first before installing allin1fix.
(Optional) Install FFmpeg for MP3 support
For Ubuntu:
sudo apt install ffmpeg
For macOS:
brew install ffmpeg
Usage for CLI
Basic Usage
To analyze audio files:
allin1fix your_audio_file1.wav your_audio_file2.mp3
New Stems Features
1. Direct stems input from directory:
allin1fix --stems-from-dir ./my_stems --stems-id "my_song" -o ./results
# Expects: ./my_stems/{bass,drums,other,vocals}.wav
2. Custom stem filename patterns:
allin1fix --stems-from-dir ./stems --stems-pattern "track_{stem}.wav" -o ./results
# Expects: ./stems/track_{bass,drums,other,vocals}.wav
3. Individual stem files:
allin1fix \
--stems-bass path/to/bass.wav \
--stems-drums path/to/drums.wav \
--stems-other path/to/other.wav \
--stems-vocals path/to/vocals.wav \
--stems-id "my_track" -o ./results
4. Pre-computed stems mapping:
# Create stems_mapping.json:
{
"song1.wav": "/path/to/song1_stems/",
"song2.wav": "/path/to/song2_stems/"
}
allin1fix song1.wav song2.wav --stems-dict stems_mapping.json -o ./results
5. Skip separation (use existing stems in demix-dir):
allin1fix track.wav --skip-separation --demix-dir ./existing_stems -o ./results
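Before running any of the stems modes above, it can help to confirm that the expected files exist and to generate the mapping file programmatically. The sketch below rests on two assumptions that the text does not confirm: that `{stem}` in `--stems-pattern` is substituted with each of the four stem names, and that the keys of the `--stems-dict` JSON are audio file names. All function names are hypothetical.

```python
import json
from pathlib import Path

STEM_NAMES = ("bass", "drums", "other", "vocals")


def expected_stem_paths(stems_dir, pattern="{stem}.wav"):
    """Map each stem name to the file the pattern would resolve to (assumed semantics)."""
    stems_dir = Path(stems_dir)
    return {name: stems_dir / pattern.format(stem=name) for name in STEM_NAMES}


def missing_stems(stems_dir, pattern="{stem}.wav"):
    """Return the stem names whose expected file is not on disk."""
    paths = expected_stem_paths(stems_dir, pattern)
    return [name for name, path in paths.items() if not path.is_file()]


def build_stems_mapping(audio_paths, stems_root):
    """Map each audio file name to '<stems_root>/<name without extension>_stems/'."""
    stems_root = Path(stems_root)
    return {
        Path(p).name: (stems_root / f"{Path(p).stem}_stems").as_posix() + "/"
        for p in audio_paths
    }


if __name__ == "__main__":
    mapping = build_stems_mapping(["song1.wav", "song2.wav"], "/path/to")
    # Redirect this output to stems_mapping.json to use it with --stems-dict.
    print(json.dumps(mapping, indent=2))
    for audio, stems_dir in mapping.items():
        print(audio, "missing stems:", missing_stems(stems_dir))
```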
Results will be saved in the ./struct directory by default:
./struct
├── your_audio_file1.json
└── your_audio_file2.json
The analysis results will be saved in JSON format:
{
"path": "/path/to/your_audio_file.wav",
"bpm": 100,
"beats": [ 0.33, 0.75, 1.14, ... ],
"downbeats": [ 0.33, 1.94, 3.53, ... ],
"beat_positions": [ 1, 2, 3, 4, 1, 2, 3, 4, 1, ... ],
"segments": [
{
"start": 0.0,
"end": 0.33,
"label": "start"
},
{
"start": 0.33,
"end": 13.13,
"label": "intro"
},
{
"start": 13.13,
"end": 37.53,
"label": "chorus"
},
{
"start": 37.53,
"end": 51.53,
"label": "verse"
},
...
]
}
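Since the output is plain JSON, it can be consumed without importing the package. The sketch below shows one way to read a result and derive segment durations and bars from `beats` and `beat_positions`. The helper names are hypothetical, and `EXAMPLE` is a truncated copy of the sample above used purely for illustration.

```python
import json
from pathlib import Path


def load_result_json(path):
    """Read one analysis result written by the CLI."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def segment_durations(result):
    """Return (label, duration_in_seconds) for each segment, in order."""
    return [
        (seg["label"], round(seg["end"] - seg["start"], 2))
        for seg in result["segments"]
    ]


def bars(result):
    """Group beat times into bars, starting a new bar at each beat position 1."""
    grouped = []
    for time, position in zip(result["beats"], result["beat_positions"]):
        if position == 1 or not grouped:
            grouped.append([])
        grouped[-1].append(time)
    return grouped


# Truncated, illustrative data copied from the sample output above.
EXAMPLE = {
    "beats": [0.33, 0.75, 1.14],
    "beat_positions": [1, 2, 3],
    "segments": [
        {"start": 0.0, "end": 0.33, "label": "start"},
        {"start": 0.33, "end": 13.13, "label": "intro"},
        {"start": 13.13, "end": 37.53, "label": "chorus"},
    ],
}

if __name__ == "__main__":
    print(segment_durations(EXAMPLE))
    print(bars(EXAMPLE))
```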
Cache Management
Separation models are downloaded to ~/.cache/torch/hub/checkpoints/ and can use several GB of disk space.
View cache information:
allin1fix --cache-info
Output:
============================================================
Model Cache Information
============================================================
Cache directory: /home/user/.cache/torch/hub/checkpoints
Total size: 3.19 GB
Number of models: 23
Cached models:
------------------------------------------------------------
7fd6ef75-a905dd85.th 37.61 MB 2025-09-08 12:20:50
14fc6a69-a89dd0ee.th 36.71 MB 2025-09-08 12:20:46
...
============================================================
Preview what would be deleted (dry run):
allin1fix --clear-cache-dry-run
Clear all cached models:
allin1fix --clear-cache
Python API:
import allin1fix
# View cache info
allin1fix.print_cache_info()
# Get cache size
size_gb = allin1fix.get_cache_size()
# List models
models = allin1fix.list_cached_models()
# Clear cache (dry run first!)
count = allin1fix.clear_model_cache(dry_run=True)
count = allin1fix.clear_model_cache() # Actually delete
Technical Improvements
All-In-One-Fix includes several technical enhancements over the original:
- Modern PyTorch Support: Compatible with PyTorch 2.x and CUDA 12.x
- NATTEN 0.17.5: Upgraded from 0.15.0 for PyTorch 2.x compatibility
- Source Separation: Uses demucs-infer package with model caching and GPU cleanup
- Memory Optimization: Automatic GPU memory cleanup prevents OOM errors on batch processing
- Performance: 6x faster on repeated use with intelligent model caching
- Error Handling: Better error messages with fuzzy matching and helpful suggestions
- Modular Architecture: Clean separation of concerns for easier maintenance and extension
- Cache Management: Built-in tools to view and manage cached separation models
All Available CLI Options
$ allin1fix -h
usage: allin1fix [-h] [-o OUT_DIR] [-v] [--viz-dir VIZ_DIR] [-s]
[--sonif-dir SONIF_DIR] [-a] [-e] [-m MODEL] [-d DEVICE] [-k]
[--demix-dir DEMIX_DIR] [--spec-dir SPEC_DIR] [--overwrite]
[--no-multiprocess] [--stems-dict STEMS_DICT]
[--stems-dir STEMS_DIR] [--skip-separation] [--no-demucs]
[--stems-bass STEMS_BASS] [--stems-drums STEMS_DRUMS]
[--stems-other STEMS_OTHER] [--stems-vocals STEMS_VOCALS]
[--stems-from-dir STEMS_FROM_DIR]
[--stems-pattern STEMS_PATTERN] [--stems-id STEMS_ID]
[paths ...]
positional arguments:
paths Path to tracks (for single track mode) or omit for
stems input mode
Core Options:
-o, --out-dir Path to store analysis results (default: ./struct)
-v, --visualize Save visualizations (default: False)
-s, --sonify Save sonifications (default: False)
-m, --model Model to use (default: harmonix-all)
-d, --device Device to use (default: cuda if available else cpu)
-k, --keep-byproducts Keep demixed audio and spectrograms (default: False)
Stems Input Options:
--stems-dict JSON file mapping audio paths to stem directories
--stems-from-dir Directory containing bass.wav, drums.wav, other.wav, vocals.wav
--stems-pattern Pattern for stem files (e.g. "song_{stem}.wav")
--stems-bass Path to bass stem file
--stems-drums Path to drums stem file
--stems-other Path to other stem file
--stems-vocals Path to vocals stem file
--stems-id Identifier for the stem set
--skip-separation Skip source separation, use existing stems
Usage for Python
Basic Usage
from allin1fix import analyze
# Analyze audio files (uses demucs-infer for separation)
results = analyze(['song1.wav', 'song2.mp3'])
New Stems API Features
1. Custom separation models:
from allin1fix import analyze, CustomSeparatorProvider
# load_my_model() and save_audio() are placeholders for your own code
class MyCustomSeparator:
    def __init__(self, model_path):
        self.model = load_my_model(model_path)
    def separate(self, audio_path, output_dir, device):
        # Your separation logic here
        # Must return Path to directory with bass.wav, drums.wav, other.wav, vocals.wav
        stems_dir = output_dir / 'my_model' / audio_path.stem
        stems_dir.mkdir(parents=True, exist_ok=True)
        # Your model inference
        stems = self.model.separate(audio_path, device)
        # Save stems
        for stem_name, audio_data in stems.items():
            save_audio(audio_data, stems_dir / f"{stem_name}.wav")
        return stems_dir
# Use your custom model
separator = MyCustomSeparator("path/to/model.pth")
provider = CustomSeparatorProvider(separator)
results = analyze(['song.wav'], stem_provider=provider)
2. Pre-computed stems:
from allin1fix import analyze, PrecomputedStemProvider
# Use stems from any source separation tool
stems_mapping = {
'song1.wav': '/path/to/spleeter_output/song1/',
'song2.wav': '/path/to/mdx_output/song2/',
'song3.wav': '/path/to/custom_stems/song3/'
}
provider = PrecomputedStemProvider(stems_mapping)
results = analyze(['song1.wav', 'song2.wav', 'song3.wav'], stem_provider=provider)
3. Direct stems input:
from allin1fix import analyze, StemsInput, create_stems_input_from_directory
# Method 1: Manual specification
stems = StemsInput(
bass='path/to/bass.wav',
drums='path/to/drums.wav',
other='path/to/other.wav',
vocals='path/to/vocals.wav',
identifier='my_song'
)
# Method 2: From directory (expects bass.wav, drums.wav, other.wav, vocals.wav)
stems = create_stems_input_from_directory('/path/to/stems_folder')
# Method 3: Multiple tracks with different stems
stems_list = [
create_stems_input_from_directory('/path/to/song1_stems'),
create_stems_input_from_directory('/path/to/song2_stems')
]
results = analyze(stems_input=stems_list)
4. Hybrid workflows:
# Mix different approaches in the same analysis
from allin1fix import analyze, PrecomputedStemProvider, StemsInput
# Some tracks have pre-computed stems
stems_mapping = {'song1.wav': '/path/to/stems/'}
provider = PrecomputedStemProvider(stems_mapping)
# Other tracks use default separation
regular_tracks = ['song2.wav', 'song3.wav']
# Process each group
results1 = analyze(['song1.wav'], stem_provider=provider)
results2 = analyze(regular_tracks) # Uses default HTDemucs
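When the track list is long, the two groups in the hybrid workflow can be derived from the mapping instead of being written out by hand. This is a small sketch; `split_by_available_stems` is a hypothetical helper, and it assumes the mapping keys match the track strings exactly.

```python
def split_by_available_stems(tracks, stems_mapping):
    """Return (tracks with pre-computed stems, tracks that still need separation)."""
    with_stems = [t for t in tracks if t in stems_mapping]
    needs_separation = [t for t in tracks if t not in stems_mapping]
    return with_stems, needs_separation


tracks = ["song1.wav", "song2.wav", "song3.wav"]
stems_mapping = {"song1.wav": "/path/to/stems/"}

with_stems, needs_separation = split_by_available_stems(tracks, stems_mapping)
print(with_stems)        # ['song1.wav']
print(needs_separation)  # ['song2.wav', 'song3.wav']
```

The first list would go to `analyze(..., stem_provider=provider)` and the second to a plain `analyze(...)` call, as in the hybrid example.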
Available functions:
analyze()
Analyzes the provided audio files and returns the analysis results.
import allin1fix
# You can analyze a single file:
result = allin1fix.analyze('your_audio_file.wav')
# Or multiple files:
results = allin1fix.analyze(['your_audio_file1.wav', 'your_audio_file2.mp3'])
A result is a dataclass instance containing:
AnalysisResult(
path='/path/to/your_audio_file.wav',
bpm=100,
beats=[0.33, 0.75, 1.14, ...],
beat_positions=[1, 2, 3, 4, 1, 2, 3, 4, 1, ...],
downbeats=[0.33, 1.94, 3.53, ...],
segments=[
Segment(start=0.0, end=0.33, label='start'),
Segment(start=0.33, end=13.13, label='intro'),
Segment(start=13.13, end=37.53, label='chorus'),
Segment(start=37.53, end=51.53, label='verse'),
Segment(start=51.53, end=64.34, label='verse'),
Segment(start=64.34, end=89.93, label='chorus'),
Segment(start=89.93, end=105.93, label='bridge'),
Segment(start=105.93, end=134.74, label='chorus'),
Segment(start=134.74, end=153.95, label='chorus'),
Segment(start=153.95, end=154.67, label='end'),
]),
Unlike CLI, it does not save the results to disk by default. You can save them as follows:
result = allin1fix.analyze(
    'your_audio_file.wav',
    out_dir='./struct',
)
Parameters:
- paths: Union[PathLike, List[PathLike]]
  List of paths or a single path to the audio files to be analyzed.
- out_dir: PathLike (optional)
  Path to the directory where the analysis results will be saved. By default, the results will not be saved.
- visualize: Union[bool, PathLike] (optional)
  Whether to visualize the analysis results or not. If a path is provided, the visualizations will be saved in that directory. Default is False. If True, the visualizations will be saved in './viz'.
- sonify: Union[bool, PathLike] (optional)
  Whether to sonify the analysis results or not. If a path is provided, the sonifications will be saved in that directory. Default is False. If True, the sonifications will be saved in './sonif'.
- model: str (optional)
  Name of the pre-trained model to be used for the analysis. Default is 'harmonix-all'. Please refer to the documentation for the available models.
- device: str (optional)
  Device to be used for computation. Default is 'cuda' if available, otherwise 'cpu'.
- include_activations: bool (optional)
  Whether to include activations in the analysis results or not.
- include_embeddings: bool (optional)
  Whether to include embeddings in the analysis results or not.
- demix_dir: PathLike (optional)
  Path to the directory where the source-separated audio will be saved. Default is './demix'.
- spec_dir: PathLike (optional)
  Path to the directory where the spectrograms will be saved. Default is './spec'.
- keep_byproducts: bool (optional)
  Whether to keep the source-separated audio and spectrograms or not. Default is False.
- multiprocess: bool (optional)
  Whether to use multiprocessing for extracting spectrograms. Default is True.
Returns:
Union[AnalysisResult, List[AnalysisResult]]
Analysis results for the provided audio files.
load_result()
Loads the analysis results from the disk.
result = allin1fix.load_result('./struct/24k_Magic.json')
visualize()
Visualizes the analysis results.
fig = allin1fix.visualize(result)
fig.show()
Parameters:
- result: Union[AnalysisResult, List[AnalysisResult]]
  List of analysis results or a single analysis result to be visualized.
- out_dir: PathLike (optional)
  Path to the directory where the visualizations will be saved. By default, the visualizations will not be saved.
Returns:
Union[Figure, List[Figure]]
List of figures or a single figure containing the visualizations. Figure is a class from matplotlib.pyplot.
sonify()
Sonifies the analysis results. It mixes metronome clicks for beats and downbeats, and event sounds for segment boundaries, into the original audio.
y, sr = allin1fix.sonify(result)
# y: sonified audio with shape (channels=2, samples)
# sr: sampling rate (=44100)
Parameters:
- result: Union[AnalysisResult, List[AnalysisResult]]
  List of analysis results or a single analysis result to be sonified.
- out_dir: PathLike (optional)
  Path to the directory where the sonifications will be saved. By default, the sonifications will not be saved.
Returns:
Union[Tuple[NDArray, float], List[Tuple[NDArray, float]]]
List of tuples or a single tuple containing the sonified audio and the sampling rate.
Visualization & Sonification
This package provides a simple visualization (-v or --visualize) and sonification (-s or --sonify) function for the analysis results.
allin1fix -v -s your_audio_file.wav
The visualizations will be saved in the ./viz directory by default:
./viz
└── your_audio_file.pdf
The sonifications will be saved in the ./sonif directory by default:
./sonif
└── your_audio_file.sonif.wav
For example, a visualization looks like this:
You can try it at Hugging Face Space.
Available Models
The models are trained on the Harmonix Set with 8-fold cross-validation. For more details, please refer to the paper.
- harmonix-all: (Default) An ensemble model averaging the predictions of 8 models trained on each fold.
- harmonix-foldN: A model trained on fold N (0~7). For example, harmonix-fold0 is trained on fold 0.
By default, the harmonix-all model is used. To use a different model, use the --model option:
allin1fix --model harmonix-fold0 your_audio_file.wav
Speed
With an RTX 4090 GPU and Intel i9-10940X CPU (14 cores, 28 threads, 3.30 GHz),
the harmonix-all model processed 10 songs (33 minutes) in 73 seconds.
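As a quick sanity check on those figures, the throughput can be expressed as a real-time factor, meaning seconds of audio processed per second of wall-clock time. This is plain arithmetic on the numbers quoted above; `real_time_factor` is just an illustrative helper.

```python
def real_time_factor(audio_seconds, processing_seconds):
    """Seconds of audio processed per second of wall-clock time."""
    return audio_seconds / processing_seconds


audio_seconds = 33 * 60      # 10 songs, 33 minutes in total
processing_seconds = 73

print(f"{real_time_factor(audio_seconds, processing_seconds):.1f}x real time")  # 27.1x real time
print(f"{processing_seconds / 10:.1f} s per song on average")                   # 7.3 s per song on average
```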
Advanced Usage for Research
This package provides researchers with advanced options to extract frame-level raw activations and embeddings without post-processing. These have a resolution of 100 FPS, equivalent to 0.01 seconds per frame.
CLI
Activations
The --activ option also saves frame-level raw activations from sigmoid and softmax:
$ allin1fix --activ your_audio_file.wav
You can find the activations in the .npz file:
./struct
├── your_audio_file1.json
└── your_audio_file1.activ.npz
To load the activations in Python:
>>> import numpy as np
>>> activ = np.load('./struct/your_audio_file1.activ.npz')
>>> activ.files
['beat', 'downbeat', 'segment', 'label']
>>> beat_activations = activ['beat']
>>> downbeat_activations = activ['downbeat']
>>> segment_boundary_activations = activ['segment']
>>> segment_label_activations = activ['label']
Details of the activations are as follows:
- beat: Raw activations from the sigmoid layer for beat tracking (shape: [time_steps])
- downbeat: Raw activations from the sigmoid layer for downbeat tracking (shape: [time_steps])
- segment: Raw activations from the sigmoid layer for segment boundary detection (shape: [time_steps])
- label: Raw activations from the softmax layer for segment labeling (shape: [label_class=10, time_steps])
You can access the label names as follows:
>>> allin1fix.HARMONIX_LABELS
['start',
'end',
'intro',
'outro',
'break',
'bridge',
'inst',
'solo',
'verse',
'chorus']
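To make the label activations easier to inspect, one option is to take the most active class at each frame and convert frame indices to seconds using the 100 FPS resolution. The sketch below is a naive per-frame argmax and not the package's post-processing. `frame_labels` and `frame_to_seconds` are hypothetical helpers, and the label list is copied from the output above. It works on nested lists as well as on the `[10, time_steps]` array stored under `activ['label']`.

```python
# Label order copied from the HARMONIX_LABELS output above.
HARMONIX_LABELS = ["start", "end", "intro", "outro", "break",
                   "bridge", "inst", "solo", "verse", "chorus"]
FPS = 100  # activations have a resolution of 100 frames per second


def frame_labels(label_activations, labels=HARMONIX_LABELS):
    """Pick the most active label for every frame (input shape: [classes, time_steps])."""
    n_frames = len(label_activations[0])
    picked = []
    for t in range(n_frames):
        best = max(range(len(labels)), key=lambda c: label_activations[c][t])
        picked.append(labels[best])
    return picked


def frame_to_seconds(frame_index, fps=FPS):
    """Convert a frame index to a time in seconds."""
    return frame_index / fps


if __name__ == "__main__":
    # Toy activations with three frames; real data comes from the .activ.npz file.
    toy = [[0.0, 0.0, 0.0] for _ in HARMONIX_LABELS]
    toy[2][0], toy[8][1], toy[9][2] = 0.9, 0.8, 0.7
    for t, name in enumerate(frame_labels(toy)):
        print(f"{frame_to_seconds(t):.2f}s {name}")
```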
Embeddings
This package also provides an option to extract raw embeddings from the model.
$ allin1fix --embed your_audio_file.wav
You can find the embeddings in the .npy file:
./struct
├── your_audio_file1.json
└── your_audio_file1.embed.npy
To load the embeddings in Python:
>>> import numpy as np
>>> embed = np.load('your_audio_file1.embed.npy')
Each model embeds for every source-separated stem per time step,
resulting in embeddings shaped as [stems=4, time_steps, embedding_size=24]:
- The number of source-separated stems (the order is bass, drums, other, vocals).
- The number of time steps (frames). The time step is 0.01 seconds (100 FPS).
- The embedding size of 24.
Using the --embed option with the harmonix-all ensemble model will stack the embeddings,
saving them with the shape [stems=4, time_steps, embedding_size=24, models=8].
Python
The Python API allin1fix.analyze() offers the same options as the CLI:
>>> allin1fix.analyze(
...     paths='your_audio_file.wav',
...     include_activations=True,
...     include_embeddings=True,
... )
AnalysisResult(
path='/path/to/your_audio_file.wav',
bpm=100,
beats=[...],
downbeats=[...],
segments=[...],
activations={
'beat': array(...),
'downbeat': array(...),
'segment': array(...),
'label': array(...)
},
embeddings=array(...),
)
Concerning MP3 Files
Due to variations in decoders, MP3 files can have slight offset differences. I recommend first converting your audio files to WAV format using FFmpeg (as shown below) and using the WAV files for all your data processing pipelines.
ffmpeg -i your_audio_file.mp3 your_audio_file.wav
In this package, audio files are read using Demucs. To my understanding, Demucs converts MP3 files to WAV using FFmpeg before reading them. However, using a different MP3 decoder can yield different offsets. I've observed variations of about 20-40 ms, which is problematic for tasks requiring precise timing like beat tracking, where the conventional tolerance is just 70 ms. Hence, I advise standardizing inputs to the WAV format for all data processing, ensuring straightforward decoding.
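For a folder of MP3 files, the same FFmpeg command can be applied in a loop. The sketch below only wraps the command shown above with `subprocess`. `ffmpeg_command` and `convert_all` are hypothetical helper names, and `ffmpeg` is assumed to be on the PATH.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(mp3_path):
    """Build the 'ffmpeg -i input.mp3 input.wav' command for one file."""
    mp3_path = Path(mp3_path)
    return ["ffmpeg", "-i", str(mp3_path), str(mp3_path.with_suffix(".wav"))]


def convert_all(folder):
    """Convert every .mp3 directly inside folder to a .wav file next to it."""
    for mp3 in sorted(Path(folder).glob("*.mp3")):
        # check=True raises CalledProcessError if ffmpeg exits with an error.
        subprocess.run(ffmpeg_command(mp3), check=True)


# Usage: convert_all("./audio") and then analyze the resulting .wav files.
```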
Migration from All-In-One
All-In-One-Fix is designed to be a drop-in replacement with enhanced features. Here's how to migrate:
Package Name Changes
# Old (All-In-One)
from allin1 import analyze
# New (All-In-One-Fix)
from allin1fix import analyze
CLI Command Changes
# Old
allin1 track.wav -o ./results
# New
allin1fix track.wav -o ./results
Dependency Changes
# Old dependencies (All-In-One - original)
dependencies = ["demucs", "natten>=0.15.0"]
# v2.0.0+ dependencies (uses demucs-infer)
dependencies = ["natten==0.17.5", "demucs-infer"] # Clean separation via demucs-infer!
Installation Methods
All-In-One-Fix supports both UV (recommended, faster) and pip (traditional):
# With UV (recommended, faster dependency resolution)
uv pip install git+https://github.com/openmirlab/all-in-one-fix.git
# With traditional pip (still fully supported)
pip install git+https://github.com/openmirlab/all-in-one-fix.git
# Editable install for development (works with both)
git clone https://github.com/openmirlab/all-in-one-fix.git
cd all-in-one-fix
uv pip install -e .
# or
pip install -e .
Note: The package uses modern pyproject.toml with hatchling backend, following PEP 621 standards. Dependencies are automatically installed from GitHub (demucs-infer, madmom) and PyPI (other packages).
What Stays the Same
- All analysis results format (JSON structure unchanged)
- All function signatures and return types
- All model names and parameters
- All core functionality and accuracy
- All visualization and sonification features
What's Enhanced
- Modern PyTorch 2.x support (NATTEN 0.15.0 → 0.17.5-0.21.0+ flexible support)
- Automatic NATTEN version detection (supports 0.17.5 through 0.21.0+)
- PyTorch 2.0-2.7.0 and CUDA 11.7-12.8 compatibility
- Uses demucs-infer package for PyTorch 2.x compatible separation
- Clean dependency management via demucs-infer
- Flexible source separation options
- Direct stems input capability
- Custom model integration
- Performance improvements (model caching, GPU cleanup)
- Better error handling and stability
- Modern packaging (UV-style with pip compatibility)
Training
Please refer to TRAINING.md.
Citation
If you use this package for your research, please cite the following papers:
All-In-One (core music structure analysis algorithms):
@inproceedings{taejun2023allinone,
title={All-In-One Metrical And Functional Structure Analysis With Neighborhood Attentions on Demixed Audio},
author={Kim, Taejun and Nam, Juhan},
booktitle={IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
year={2023}
}
Demucs (source separation models):
@inproceedings{defossez2021hybrid,
title={Hybrid Spectrogram and Waveform Source Separation},
author={Défossez, Alexandre},
booktitle={Proceedings of the ISMIR 2021 Workshop on Music Source Separation},
year={2021}
}
About All-In-One-Fix
What is This Project?
All-In-One-Fix (v2.0.0) is a unified package that combines:
- Music structure analysis from All-In-One by Taejun Kim & Juhan Nam
- Source separation via demucs-infer package
- NATTEN 0.17.5-0.21.0+ support with modern PyTorch 2.x compatibility
- Performance improvements and integration work
Key Principles
Respect Original Work:
Core Research: Unchanged
- All-In-One model architectures (100% original)
- Beat/downbeat/tempo algorithms (100% original)
- Structure segmentation (100% original)
- Research quality and accuracy (100% original)
This Fork's Contributions
- PyTorch 2.x compatibility (NATTEN 0.17.5 upgrade)
- Performance optimizations (model caching, GPU management)
- Modern packaging and dependency management
- Enhanced error handling and user experience
- Source separation via demucs-infer package
Attribution
- All-In-One research → Taejun Kim & Juhan Nam (original)
- Source separation → demucs-infer (openmirlab/demucs-infer)
- PyTorch 2.x compatibility → This fork
For Researchers
When using this package, please cite the All-In-One paper for music structure analysis. See the Citation section for the BibTeX entries.
Project Information
- Version: 2.0.0
- License: MIT (same as All-In-One and Demucs)
- Original All-In-One: github.com/mir-aidj/all-in-one
- Original Demucs: github.com/facebookresearch/demucs
What Changed in v2.0.0?
See the Motivation & Changes section above for a detailed breakdown of the modifications.
Documentation
Comprehensive documentation is available in the docs/ directory:
- USAGE_EXAMPLES.md - Detailed usage examples and code snippets
- TRAINING.md - Guide for training All-In-One models
- INTEGRATION.md - Details about Demucs integration
- IMPROVEMENTS.md - Performance improvements documentation
- CHANGELOG.md - Version history and release notes
For more information, see the Documentation Index.