audio search/retrieval library

shira 🔖🎧

A simple audio search/retrieval library. (wip)

This is the audio version of ripple. Search through audio files/data with text queries or audio samples.
It's meant to be a neural-encoded version of Shazam, though it may only be suited to small-scale/local usage.

Methodology

It's basically a semantic search library for audio.

Local audio files are embedded with CLAP, and the embeddings are stored in a FAISS vector index.
Files are then retrieved by cosine similarity between the query embedding and the indexed embeddings.
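
To make the retrieval step concrete, here is a minimal pure-Python sketch of cosine-similarity top-k search. It is not shira's implementation, and it does not use FAISS: the function names (`l2_normalize`, `build_index`, `search`) and the toy 3-dimensional vectors are hypothetical stand-ins for real CLAP embeddings. It relies on one fact: once vectors are scaled to unit length, their dot product equals their cosine similarity, which is why inner-product vector indexes are commonly fed normalized embeddings.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def build_index(embeddings):
    """Normalize every embedding once, at indexing time."""
    return {name: l2_normalize(vec) for name, vec in embeddings.items()}

def search(index, query, k=5):
    """Return the k (name, cosine similarity) pairs closest to the query."""
    q = l2_normalize(query)
    scored = [
        (name, sum(a * b for a, b in zip(q, vec)))
        for name, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors standing in for real embeddings (hypothetical data).
toy_embeddings = {
    "piano.wav": [0.9, 0.1, 0.0],
    "drums.wav": [0.0, 0.2, 0.9],
    "violin.wav": [0.8, 0.3, 0.1],
}
index = build_index(toy_embeddings)
results = search(index, query=[1.0, 0.0, 0.0], k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

A brute-force scan like this is linear in the number of files; a vector index such as FAISS serves the same role at larger scale.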

This process uses a contrastively pretrained audio-language model, CLAP (like OpenAI's CLIP, but for audio), specifically LAION's laion/larger_clap_music_and_speech checkpoint.

Usage

  • Install the library
pip install shira-audio
  • For text-based search
from shira import AudioSearch, AudioEmbedding

embedder = AudioEmbedding(data_path='.') # init embedder class
audio_data_embeds = embedder.index_files() # create embeddings and index audio files

neural_search = AudioSearch() # init semantic search class

text_query = 'classical music' # text description for search

# get the k most similar audio samples, paired with their scores
matching_samples, scores = neural_search.text_search(text_query, audio_data_embeds, k_count=5)

matching_samples[0]['audio']['path'] # get file path for the top sample
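
As a sketch of how the results might be consumed, the snippet below pairs each returned file path with its score. It uses stand-in data rather than real library output. Two assumptions are made here: that every entry of `matching_samples` can be accessed the way the top sample is above (`['audio']['path']`), and that `scores` holds one score per sample in the same order. The helper `summarize` and the file paths are hypothetical.

```python
# Stand-in values shaped like the result access shown above; not real library output.
matching_samples = [
    {"audio": {"path": "music/bach_prelude.wav"}},
    {"audio": {"path": "music/mozart_sonata.wav"}},
]
scores = [0.71, 0.64]  # assumed: one score per sample, in the same order

def summarize(samples, sample_scores):
    """Pair each sample's file path with its score."""
    return [
        (sample["audio"]["path"], float(score))
        for sample, score in zip(samples, sample_scores)
    ]

ranked = summarize(matching_samples, scores)
for path, score in ranked:
    print(f"{score:.2f}  {path}")
```

If the real return types differ (for example, a dataset object instead of a list), only the body of `summarize` would need to change.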

Acknowledgements
