pySATEN

About

This library detects silent segments in a speech signal.

(alt: Image of voice segment detection)

Installation

$ pip install pysaten

Usage

Command line

The trimmed audio file is written as 24-bit PCM at 48 kHz, mono.

$ pysaten_trim input.wav trimmed.wav
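To trim a whole directory of recordings, the command above can be driven from a small script. The following is a minimal sketch, not part of pysaten: the helper names `build_trim_commands` and `trim_directory` and the directory names are hypothetical, and it assumes only the `pysaten_trim <input> <output>` form shown above and that `pysaten_trim` is on your `PATH`.

```python
import subprocess
from pathlib import Path


def build_trim_commands(input_dir, output_dir):
    """Return one `pysaten_trim <in> <out>` argument list per WAV file in input_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    commands = []
    # Sort for a reproducible processing order.
    for wav in sorted(input_dir.glob("*.wav")):
        # Keep the original file name, but write into output_dir.
        commands.append(["pysaten_trim", str(wav), str(output_dir / wav.name)])
    return commands


def trim_directory(input_dir, output_dir):
    """Run pysaten_trim on every WAV file and stop at the first failure."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_trim_commands(input_dir, output_dir):
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(cmd, check=True)
```

For example, `trim_directory("raw", "trimmed")` would write a trimmed copy of every `raw/*.wav` file into `trimmed/`. Building the argument lists separately from running them makes it easy to print and inspect the commands first.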

Python

import pysaten
import soundfile as sf

# y: target signal; sr: sampling rate in Hz.
# Loaded here with soundfile; librosa works as well. "input.wav" is a placeholder path.
y, sr = sf.read("input.wav")

# Get the trimmed signal containing only the speech segment.
y_trimmed = pysaten.trim(y, sr)

# To trim manually, or to get the start/end times:
# start_s: start of the speech segment, in seconds.
# end_s: end of the speech segment, in seconds.
start_s, end_s = pysaten.vsed(y, sr)
y_trimmed = y[int(start_s * sr) : int(end_s * sr)]
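When trimming manually, the times in seconds have to be converted to sample indices, as in the slice above. The sketch below wraps that conversion in a helper that also clamps the indices to the signal length. `seconds_to_slice` is a hypothetical name and is not part of pysaten; the clamping is an extra safeguard added here for illustration, not something the library is known to require.

```python
def seconds_to_slice(start_s, end_s, sr, n_samples):
    """Convert a (start, end) pair in seconds to a sample slice clamped to [0, n_samples]."""
    # Same conversion as int(start_s * sr), but never negative or past the end.
    start = max(0, min(n_samples, int(start_s * sr)))
    # The end index is never allowed to fall before the start index.
    end = max(start, min(n_samples, int(end_s * sr)))
    return slice(start, end)


# Example: 0.5 s to 1.25 s of a 2-second signal sampled at 48 kHz.
print(seconds_to_slice(0.5, 1.25, 48000, 96000))  # slice(24000, 60000, None)
```

With the values returned by `pysaten.vsed`, this would be used as `y[seconds_to_slice(start_s, end_s, sr, len(y))]`.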

For development (Linux only)

$ git clone https://gitlab.com/f-matano44/pysaten.git
$ cd pysaten
$ poetry install

License

Copyright 2024 Fumiyoshi MATANO

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Acknowledgements

The following programs were used to [evaluate the performance of pysaten]. We would like to take this opportunity to express our gratitude.

Cite this

Library version 1.X (Non-peer-reviewed)

Japanese

俣野 文義,小口 純矢,森勢 将雅,``音声コーパス構築のための仮定を追加した発話区間検出法の提案と基礎評価,'' 日本音響学会第 152 回 (2024 年秋季) 研究発表会, pp.1161--1162 (2024.09).

English

F. Matano, J. Koguchi, M. Morise, ``Proposal and basic evaluation of a voice activity detection with additional assumptions for speech corpus construction,'' Proceedings of the 2024 Autumn meeting of the Acoustical Society of Japan, pp.1161--1162 (2024.09) (in Japanese).
