
Python bindings for the bark.cpp library via ctypes

Project description

🐶 bark-cpp-python 🐍


Python bindings for bark.cpp using ctypes. Harness the power of GGML with Bark, one of the most popular text-to-speech (TTS) models, and its quantized variants, through a friendly Python interface 🔥🔥🔥.

⚙️ Features

Inspired by llama-cpp-python, this package provides:

  • Low-level access to the C API via a ctypes interface (see the rough sketch after this list)
  • High-level Python API for TTS
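
As a rough illustration of the low-level layer, the package bundles libbark.so (the path below mirrors the one in the error message in the Debug section) and the low-level API is a set of ctypes declarations on top of that handle. This is only a sketch, not the actual binding code:

import ctypes
import pathlib

import bark_cpp  # the installed package ships the shared library under bark_cpp/lib/

# Grab a raw ctypes handle to the bundled libbark.so; the high-level API is built
# from declarations against a handle like this.
lib_path = pathlib.Path(bark_cpp.__file__).parent / "lib" / "libbark.so"
libbark = ctypes.CDLL(str(lib_path))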

🚀 Demo

This demo runs on an AMD Ryzen 5 5600H under Ubuntu 20.04:

$ python demo.py ./models/bark-small/ggml_weights_q4_1.bin -p "Hi, I am Bark. Nice to meet you" -t 8 --dest output.wav

                 ___       _      ___     __  ___
 /\__/\  woof   |    \    / \    |    \  |  |/  /
/      \  woof  |    /   /   \   |    /  |     /
\      /        |    \  /  _  \  |  _ \  |     \
 \____/         |____/ /__/ \__\ |_| |_\ |__|\__\
    

encodec_load_model_weights: in_channels = 1
encodec_load_model_weights: hidden_dim  = 128
encodec_load_model_weights: n_filters   = 32
encodec_load_model_weights: kernel_size = 7
encodec_load_model_weights: res_kernel  = 3
encodec_load_model_weights: n_bins      = 1024
encodec_load_model_weights: bandwidth   = 24
encodec_load_model_weights: sample_rate = 24000
encodec_load_model_weights: ftype       = 1
encodec_load_model_weights: qntvr       = 0
encodec_load_model_weights: ggml tensor size    = 320 bytes
encodec_load_model_weights: backend buffer size =  54.36 MB
encodec_load_model_weights: using CPU backend
encodec_load_model_weights: model size =    44.36 MB
encodec_load_model: n_q = 32

bark_tokenize_input: prompt: 'Hi, I am Bark. Nice to meet you'
bark_tokenize_input: number of tokens in prompt = 513, first 8 tokens: 30113 10165 10194 20440 30746 20222 10167 36966 



bark_print_statistics:   sample time =    49.21 ms / 455 tokens
bark_print_statistics:  predict time =  3471.03 ms / 7.63 ms per token
bark_print_statistics:    total time =  3542.42 ms



bark_print_statistics:   sample time =    21.86 ms / 1364 tokens
bark_print_statistics:  predict time = 33798.57 ms / 24.78 ms per token
bark_print_statistics:    total time = 33829.69 ms



bark_print_statistics:   sample time =    70.14 ms / 6144 tokens
bark_print_statistics:  predict time =  8684.00 ms / 1.41 ms per token
bark_print_statistics:    total time =  8783.56 ms

encodec_eval: compute buffer size: 230.30 MB

Evaluated time: 47.49s

(Demo recording: output.webm)

🔧 Installation

Pip

pip install bark-cpp-python

Build from source

  1. Clone the repo and submodules:

git clone --recursive https://github.com/tranminhduc4796/bark-cpp-python.git
cd bark-cpp-python

  2. Build and install:

pip install .

🤖 Debug

GLIBCXX_3.4.32 not found

If you encounter this error when importing bark_cpp:

RuntimeError: Failed to load shared library '~/miniconda3/envs/bark_cpp/lib/python3.10/site-packages/bark_cpp/lib/libbark.so': ~/miniconda3/envs/bark_cpp/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by ~/miniconda3/envs/bark_cpp/lib/python3.10/site-packages/bark_cpp/lib/libencodec.so)

Install the latest gcc with:

conda install -c conda-forge gcc
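
To confirm the fix, you can check that the environment's libstdc++ now provides the missing symbol version and that bark_cpp imports cleanly. This is a small sketch: it assumes a conda environment laid out like the one in the error message above and that the strings utility is installed.

import pathlib
import subprocess
import sys

# The conda env's C++ runtime sits next to the Python interpreter, as in the error above.
libstdcxx = pathlib.Path(sys.prefix) / "lib" / "libstdc++.so.6"
symbols = subprocess.run(["strings", str(libstdcxx)], capture_output=True, text=True).stdout
print("GLIBCXX_3.4.32 available:", "GLIBCXX_3.4.32" in symbols)

import bark_cpp  # should now load libbark.so and libencodec.so without the version error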

🐕 Usage

# Install dependencies
pip install -r requirements.txt

# Download the Bark checkpoints and vocabulary
python3 download_weights.py --out-dir ./models --models bark-small bark

# Convert the model to ggml format
python3 convert.py --dir-model ./models/bark-small --use-f16

# Quantize the model (optional); this requires --use-f16 in the conversion step above
python quantize.py ./models/bark-small/ggml_weights.bin ./models/bark-small/ggml_weights_q4_1.bin q4_1

# Run the demo
python demo.py ./models/bark-small/ggml_weights.bin -p "Hi, I am Bark. Nice to meet you" -t 8 --dest output.wav
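
Once the demo finishes, the generated WAV can be inspected with the standard library to confirm its duration and sample rate (a small sketch; the path matches the --dest argument above):

import wave

with wave.open("output.wav", "rb") as f:
    frames = f.getnframes()
    rate = f.getframerate()  # the Encodec decoder runs at 24000 Hz, per the demo log above
    print(f"{frames} frames at {rate} Hz -> {frames / rate:.2f} s of audio")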

🐍 High-level Python API

from bark_cpp import Bark  # import path assumed from the installed package name

args = parse_arguments()  # CLI arguments, as defined in demo.py

bark = Bark(
    model_path=args.model_path,
    temp=args.temp,
    fine_temp=args.fine_temp,
    min_eos_p=args.min_eos_p,
    sliding_window_size=args.sliding_window_size,
    max_coarse_history=args.max_coarse_history,
    sample_rate=args.sample_rate,
    target_bandwidth=args.target_bandwidth,
    n_steps_text_encoder=args.n_steps_text_encoder,
    semantic_rate_hz=args.semantic_rate_hz,
    coarse_rate_hz=args.coarse_rate_hz,
    seed=args.seed,
)

# Generate audio from the prompt using the given number of threads.
audio_arr = bark.generate_audio(args.prompt, args.threads)

# get_eval_time() reports microseconds, hence the division by 1e6.
print("Evaluated time: {:.2f}s".format(bark.get_eval_time() / 1e6))
bark.write_wav(args.dest, audio_arr)
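
For a quick test without the demo's argument parser, the same flow can be written with explicit values. This is only a sketch: it assumes Bark is importable from the top-level bark_cpp package and that the constructor arguments not shown here have usable defaults.

from bark_cpp import Bark

# Minimal sketch with hard-coded values instead of parse_arguments() from demo.py.
bark = Bark(
    model_path="./models/bark-small/ggml_weights_q4_1.bin",
    seed=0,  # omitted arguments are assumed to fall back to defaults
)

audio_arr = bark.generate_audio("Hi, I am Bark. Nice to meet you", 8)  # prompt, threads
print("Evaluated time: {:.2f}s".format(bark.get_eval_time() / 1e6))
bark.write_wav("output.wav", audio_arr)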

Acknowledgments



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bark_cpp_python-0.2.0.tar.gz (15.6 MB)

Uploaded Source

File details

Details for the file bark_cpp_python-0.2.0.tar.gz.

File metadata

  • Download URL: bark_cpp_python-0.2.0.tar.gz
  • Upload date:
  • Size: 15.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for bark_cpp_python-0.2.0.tar.gz
  • SHA256: 201847c8a19143803cc99d736c24afac2cab985fe8e5c7d9f359e85067cd9258
  • MD5: ca956e5aa413c6b80beeef906cfd9d65
  • BLAKE2b-256: 16c2f6e706e9e196d00f0209cd8e9c0264b2ac204a86a4a1c0689da5d2e70905

