Audio Alignment and Recognition in Python

These details have not been verified by PyPI

Project links

Homepage

Project description

Audalign

Python package for aligning audio files using audio fingerprinting, cross-correlation, cross-correlation with spectrograms, or visual alignment techniques.

This package offers tools to align many recordings of the same event. This is primarily accomplished with fingerprinting, though where fingerprinting fails, correlation, correlation with spectrograms, and visual alignment techniques can be used to get a closer result. After an initial alignment is found, that alignment can be passed to "fine_align," which will find smaller, relative alignments to the main one.

An alignment is a dictionary containing alignment data for all files in a given directory. If an output directory is given, silence is placed before each file in the target directory so that all of them line up, and the padded files are written to the output directory along with an audio file containing the sum of all audio.
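
As a rough illustration of the silence padding described above, the sketch below computes how much silence each file would need. The input format (file name mapped to the moment a recording started on a shared clock) and the function name silence_to_prepend are hypothetical and are not Audalign's result dictionary or API.

```python
def silence_to_prepend(start_times):
    """Return the seconds of silence to add before each file.

    start_times maps a file name to the moment (in seconds, on a shared
    clock) at which that recording began. Hypothetical input format.
    """
    earliest = min(start_times.values())
    # The earliest recording needs no padding; later ones are padded by
    # how much later they started.
    return {name: start - earliest for name, start in start_times.items()}


starts = {"camera.wav": 12.0, "phone.mp3": 15.5, "recorder.wav": 12.25}
print(silence_to_prepend(starts))
```

Once every file has been padded this way, sample zero of each file refers to the same moment, which is what makes summing the audio meaningful.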

All fingerprints are stored in memory and must be saved to disk with the save_fingerprinted_files method in order to persist them.

Regular file recognition can also be done with Audalign, similar to dejavu, but with the fingerprints held in memory.

A "rankings" key is included in each alignment and recognition result. It helps gauge the strength of the alignment but is not definitive proof. Values range from 1 to 10.

For more details on implementation and results, see the wiki.


This package is primarily focused on the accuracy of alignments and has several accuracy settings. Fingerprinting parameters can generally be set once to get consistent results, but visual alignment parameters require case-by-case adjustment. Parameters for correlation are focused on sample rate or scipy's find_peaks.

Noisereduce is very useful for this application and a wrapper is implemented for ease of use. Uniformly leveling prior to noise reduction using uniform_level_file boosts quiet but important sound features.
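
To show why leveling can bring out quiet features, here is a generic sketch of windowed peak normalization on a plain list of samples. It is not the implementation behind uniform_level_file; the function name normalize_windows, the window measured in samples, and the target_peak parameter are all assumptions made for illustration.

```python
def normalize_windows(samples, window, target_peak=1.0):
    """Scale each window so its loudest sample reaches target_peak."""
    leveled = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        peak = max(abs(s) for s in chunk)
        # Leave silent windows untouched to avoid dividing by zero.
        gain = target_peak / peak if peak > 0 else 1.0
        leveled.extend(s * gain for s in chunk)
    return leveled


quiet_then_loud = [0.01, -0.02, 0.02, 0.5, -1.0, 0.25]
print(normalize_windows(quiet_then_loud, window=3))
```

The quiet first window is boosted to the same peak as the loud second window, so features in it carry more weight in later processing.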

Installation

Don't forget to install ffmpeg/libav (see "Getting ffmpeg set up" below)!

Install from PyPI:

pip install audalign

Or clone the repository and install from the local copy:

git clone https://github.com/benfmiller/audalign.git
cd audalign/
pip install .

Or download and extract audalign, then run the following in the extracted directory:

pip install .

Aligning

import audalign
ada = audalign.Audalign()

print(ada.align("target/folder/", destination_path="write/alignments/to/folder"))
# or
print(ada.target_align(
    "target/files",
    "target/folder/",
    destination_path="write/alignments/to/folder",
    ))

# For Visual
print(ada.target_align(
    "target/files",
    "target/folder/",
    destination_path="write/alignments/to/folder",
    technique="visual",
    ))
# volume_threshold might need to be adjusted depending on the file

# For Correlation
print(ada.target_align(
    "target/files",
    "target/folder/",
    destination_path="write/alignments/to/folder",
    technique="correlation", # or "correlation_spectrogram"
    ))

Returns a dictionary mapping each recognized file to its best alignment, along with a match info dictionary for each recognition in the folder.

You can specify a destination folder to write the aligned files with the appropriate length of silence added to the front.

target_align aligns against a single target file instead of searching for the file with the most and best matches.

Fine Aligning

import audalign
ada = audalign.Audalign()

rough_alignment = ada.align("target/folder/") # get rough alignment with regular aligning

fine_alignment = ada.fine_align( # get fine alignment with rough alignment
    rough_alignment,
    destination_path="write/alignments/to/folder"
    ) # defaults to correlation

# For Fingerprinting
print(ada.fine_align(
    rough_alignment,
    destination_path="write/alignments/to/folder",
    technique="fingerprints",
    ))

Fine aligning takes the output of a regular alignment and searches for more precise alignments within the specified width around it.

This is very useful if there are multiple recordings of the same event with different relative offsets. Correlation is also more precise than fingerprinting and never fails to give an alignment.
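
The idea of refining a rough offset by correlation within a limited width can be sketched in pure Python as follows. This is a brute-force illustration, not Audalign's implementation; the function name best_lag and its center and width parameters (in samples) are hypothetical.

```python
def best_lag(reference, other, center=0, width=5):
    """Return the lag near `center` that maximises the correlation.

    A lag of n means other[i] is compared with reference[i + n].
    Only lags in [center - width, center + width] are searched.
    """
    def score(lag):
        total = 0.0
        for i, value in enumerate(other):
            j = i + lag
            if 0 <= j < len(reference):
                total += value * reference[j]
        return total

    lags = range(center - width, center + width + 1)
    # max() always returns some lag, even when every score is zero.
    return max(lags, key=score)


reference = [0, 0, 0, 0, 1, 3, -2, 4, 0, 0, 0, 0]
other = [1, 3, -2, 4, 0, 0]
print(best_lag(reference, other, center=3, width=2))
```

Because the search takes a maximum over a fixed set of lags, it always produces an answer, which mirrors the point that correlation does not fail to give an alignment. Whether that answer is meaningful still depends on the audio.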

Fingerprinting

Audalign is mostly built on fingerprinting.

import audalign
ada = audalign.Audalign()

ada.fingerprint_file("test_file.wav")

# or

ada.fingerprint_directory("audio/directory")

Fingerprints are stored in ada and can be saved with:

ada.save_fingerprinted_files("save_file.json") # or .pickle
# or loaded with 
ada.load_fingerprinted_files("save_file.json") # or .pickle
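
Since fingerprints live only in memory until they are saved, the round trip looks roughly like the sketch below. The store layout (file name mapped to a list of hash and offset pairs) and the save and load helpers are hypothetical and do not reflect Audalign's actual file format.

```python
import json
import os
import tempfile

# Hypothetical in-memory store: file name -> list of [hash, offset] pairs.
fingerprints = {"test_file.wav": [["a1b2c3", 0], ["d4e5f6", 12]]}


def save(store, path):
    """Write the in-memory store to disk as JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(store, handle)


def load(path):
    """Read a previously saved store back into memory."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "save_file.json")
    save(fingerprints, path)
    restored = load(path)

print(restored == fingerprints)
```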

All formats that ffmpeg or libav support are supported here.

Recognizing

Alignments are built on recognition:

# Only returns matches with total fingerprint matches greater than 50 within 5 second windows
print(ada.recognize("matching_file.mp3", filter_matches=50, locality=5))

# For Visual
print(ada.visrecognize(
    target_file_path="target_file.mp3", against_file_path="against_file.mp3"
    ))

# For Correlation
print(ada.correcognize(
    target_file_path="target_file.mp3", against_file_path="against_file.mp3"
    ))

# For Correlation with spectrogram
print(ada.correcognize_spectrogram(
    target_file_path="target_file.mp3", against_file_path="against_file.mp3"
    ))

The file doesn't have to be fingerprinted already. If it is, it is not fingerprinted again.

Returns a dictionary containing the match time and the match info. The match info is a dictionary keyed by each file that was recognized, and each entry is itself a dictionary of match information.
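
Fingerprint recognition in the style of dejavu generally works by counting how many matching hashes agree on the same time offset. The sketch below shows that general technique in miniature; it is not Audalign's code. The function name best_offset and the min_matches parameter are hypothetical, with min_matches playing a role loosely similar to filter_matches above.

```python
from collections import Counter


def best_offset(target_prints, against_prints, min_matches=2):
    """Find the offset on which most matching hashes agree.

    Each list holds (hash, time) pairs. Returns (offset, count), or
    None when no offset collects at least min_matches votes.
    """
    against_times = {}
    for hash_value, time in against_prints:
        against_times.setdefault(hash_value, []).append(time)

    votes = Counter()
    for hash_value, time in target_prints:
        for other_time in against_times.get(hash_value, []):
            votes[other_time - time] += 1

    if not votes:
        return None
    offset, count = votes.most_common(1)[0]
    return (offset, count) if count >= min_matches else None


target = [("h1", 0), ("h2", 1), ("h3", 2), ("h9", 3)]
against = [("h7", 0), ("h1", 4), ("h2", 5), ("h3", 6), ("h9", 1)]
print(best_offset(target, against))
```

Three hashes agree on an offset of 4 while the stray "h9" match votes for a different offset, so the result is offset 4 with 3 matches. Raising min_matches above 3 would reject the match entirely.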

Other Functions

# wrapper for timsainb/noisereduce
ada.remove_noise_file(
    "target/file",
    "5", # noise start in seconds
    "20", # noise end in seconds
    "destination/file",
    alt_noise_filepath="different/sound/file",
    prop_decrease="0.5", # If you want noise half reduced
)

ada.remove_noise_directory(
    "target/directory/",
    "noise/file",
    "5", # noise start in seconds
    "20", # noise end in seconds
    "destination/directory",
    prop_decrease="0.5", # If you want noise half reduced
)

ada.uniform_level_file(
    "target/file",
    "destination",
    mode="normalize",
    width=5,
)

ada.plot("file.wav") # Plots spectrogram with peaks overlaid
ada.convert_audio_file("audio.wav", "audio.mp3") # Also converts a video file to an audio file
ada.get_metadata("file.wav") # Returns metadata from ffmpeg/libav

You can recalculate the alignment shifts from previous results using recalc_shifts, then write those shifts using write_shifts_from_results. write_shifts_from_results also lets you use different source files for the alignments.

recalculated_results = ada.recalc_shifts(older_results)
ada.write_shifts_from_results(recalculated_results, "source_files_folder_or_file_list", "destination")

Audalign Functions

ada.set_multiprocessing(False) # If you want single threaded
ada.set_num_processors(4) # However many processors you have.
ada.set_accuracy(1) # from 1-4, sets fingerprinting variables for different levels of accuracy
ada.set_hash_style("base") # one of "base", "base_three", "panako", "panako_mod"
ada.set_freq_threshold(100) # Ignores frequencies below this value. Max value is 2049. The unit is not Hertz
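
The note that the frequency threshold is not in Hertz suggests it is an index into the spectrogram's frequency bins. Under that assumption, the sketch below converts a bin index to Hertz. The FFT size of 4096 is inferred from the stated maximum of 2049 (4096 / 2 + 1), and the 44100 Hz sample rate is assumed; neither is confirmed here, and bin_to_hertz is a hypothetical helper.

```python
def bin_to_hertz(bin_index, sample_rate=44100, fft_size=4096):
    """Convert an FFT bin index to its centre frequency in Hertz.

    sample_rate and fft_size are assumed defaults, not values read
    from Audalign.
    """
    return bin_index * sample_rate / fft_size


# Number of frequency bins produced by a real-valued FFT of this size.
print(4096 // 2 + 1)
print(round(bin_to_hertz(100), 1))
```

Under these assumptions a threshold of 100 would correspond to roughly 1.08 kHz, and the highest bin to half the sample rate.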

Getting ffmpeg set up

You can use ffmpeg or libav.

Mac (using homebrew):

# ffmpeg
brew install ffmpeg --with-libvorbis --with-sdl2 --with-theora

####    OR    #####

# libav
brew install libav --with-libvorbis --with-sdl --with-theora

Linux (using apt):

# ffmpeg
apt-get install ffmpeg libavcodec-extra

####    OR    #####

# libav
apt-get install libav-tools libavcodec-extra

Windows:

Download and extract ffmpeg from the Windows binaries provided here.
Add the ffmpeg /bin folder to your PATH environment variable.

####    OR    #####

Download and extract libav from the Windows binaries provided here.
Add the libav /bin folder to your PATH environment variable.
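
To check that one of the two tools is reachable on PATH after setup, a small standard-library sketch such as the following can help. The helper name find_converter is hypothetical; "ffmpeg" and "avconv" are the command-line executables of ffmpeg and libav respectively.

```python
import shutil


def find_converter(candidates=("ffmpeg", "avconv")):
    """Return the path of the first converter found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


print(find_converter() or "No ffmpeg or libav executable found on PATH")
```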

Project details

Release history

1.3.1: Mar 10, 2025
1.3.0: Jun 3, 2024
1.2.4: Jan 8, 2024
1.2.3: Apr 12, 2023
1.2.2: Feb 6, 2023
1.2.1: Jan 9, 2023
1.2.0: Jul 7, 2022
1.1.0: May 21, 2022
1.0.1: Feb 23, 2022
1.0.0: Jan 10, 2022
0.7.2: Sep 22, 2021 (this version)
0.7.1: Aug 29, 2021
0.7.0: Jul 17, 2021
0.6.1: Jul 13, 2021
0.6.0: Jul 8, 2021
0.5.2: Jun 22, 2021
0.5.1: Jun 11, 2021
0.5.0: Jun 4, 2021
0.4.2: May 25, 2021
0.4.1: May 20, 2021
0.4.0: May 18, 2021
0.3.1: May 12, 2021
0.3.0: May 3, 2021
0.2.2: May 3, 2021
0.2.1: Apr 28, 2021
0.2.0: Apr 14, 2021
0.1.6: Feb 26, 2021
0.1.5: Feb 5, 2021
0.1.3: Jan 28, 2021

Download files

Download the file for your platform.

Source Distribution

audalign-0.7.2.tar.gz (43.8 kB)

Uploaded Sep 22, 2021 Source

Built Distribution

audalign-0.7.2-py3-none-any.whl (43.8 kB)

Uploaded Sep 22, 2021 Python 3

File details

Details for the file audalign-0.7.2.tar.gz.

File metadata

Download URL: audalign-0.7.2.tar.gz
Upload date: Sep 22, 2021
Size: 43.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for audalign-0.7.2.tar.gz
Algorithm	Hash digest
SHA256	`a558139e2751bc4ca5a0e6acad8890422519304e0ef333fa885a3ff35ea9bc9c`
MD5	`24c712442dd189523e3e34fb54b94e03`
BLAKE2b-256	`3537696415afd139f3b6a6de622a85e6474f02f4a3cfd8d369463d8baf3104fb`

File details

Details for the file audalign-0.7.2-py3-none-any.whl.

File metadata

Download URL: audalign-0.7.2-py3-none-any.whl
Upload date: Sep 22, 2021
Size: 43.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for audalign-0.7.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d7ddb893b771feee7c3c7b10c267a053d4465dbcd63c0e6c00864705ad41cf21`
MD5	`219ca5f54581d7d035f74e4ac5a02632`
BLAKE2b-256	`ae62aae3112d57f6864c49b2abd1ec17467ece99e8d38ee3da76eb686b3c35c7`

