Transcribe media files to text using the Google API.
# Voice2Text
Voice2Text transcribes media files to text files using the Google Speech API.
## Installation
Voice2Text needs a GOOGLE_APPLICATION_CREDENTIALS key file.
If you don't have one, create a Google Cloud project and generate a key from it.
#### Build a Google Cloud project
1. Install the Google Cloud SDK
```
brew install --cask google-cloud-sdk
```
2. Set up the Google Cloud project
```
gcloud auth login
gcloud alpha projects create voicetotext-123456 --name voice2text
```
3. Go to the project URL and enable the Google Speech API.
4. Enable [Billing](https://support.google.com/cloud/answer/6293499?hl=en).
5. Create a service account key and download it (Ref: [Service Account](https://cloud.google.com/storage/docs/authentication#generating-a-private-key)).
6. Set GOOGLE_APPLICATION_CREDENTIALS
```
export GOOGLE_APPLICATION_CREDENTIALS='/your/service/account/key/xxx.json'
```
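Before running the commands, it can help to confirm that the variable really points at a service account key. The following is a minimal sketch, not part of Voice2Text: `check_credentials` is a hypothetical helper, and it only assumes the usual layout of a downloaded key file (a JSON object with `type` and `client_email` fields).

```python
import json
import os


def check_credentials(env=os.environ):
    """Return the service account e-mail if the key file looks usable.

    Hypothetical helper for illustration; Voice2Text does not ship it.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"key file not found: {path}")
    with open(path, encoding="utf-8") as f:
        key = json.load(f)
    # Keys downloaded from the Cloud console are tagged as service accounts.
    if key.get("type") != "service_account":
        raise RuntimeError("file is not a service account key")
    return key.get("client_email")
```

Calling `check_credentials()` with no arguments reads the current environment and raises a `RuntimeError` that says which step went wrong.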
#### Install
```
pip install voicetotext
```
## Usage
This application has two commands.
splitvoice splits an audio file into smaller files.
voicetotext transcribes the audio files in a folder to text through the Google API.
(See each command's help for details.)
```
splitvoice --help
voicetotext --help
```
## Sample
#### Split Audio Files
The sample Japanese audio is from [here](http://nergui.sakura.ne.jp/library.html).
```
$ splitvoice voices/hana_1.mp3 --relative
spliting /57
spliting Done!
File was separete 57 filesOutput Separeted files? [Y/n]:y
separeted done! Have a nice Day!⏎
```
#### Transcribe Japanese audio files
```
$ voicetotext results/ -s 22050 -l "ja_JP"
芥川龍之介
花
line
朗読池田秀雄
禅智内供の鼻といえば池で知らないものはない
長澤語録すがって上唇の上から顎の下まで下がっている
```
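To show how the `-s` and `-l` options relate to the API, here is a rough sketch of the JSON body that the Speech-to-Text v1 REST method `speech:recognize` accepts. This is not taken from the Voice2Text source, and the package may call the API differently; `build_recognize_request` is a hypothetical helper. It assumes the audio is FLAC and converts `ja_JP` to the BCP-47 form `ja-JP` that the API documents.

```python
import base64
import json


def build_recognize_request(flac_bytes, sample_rate, language="ja_JP"):
    """Build a JSON body for the v1 speech:recognize REST method.

    Hypothetical helper for illustration only.
    """
    config = {
        "encoding": "FLAC",
        "sampleRateHertz": sample_rate,  # must match the FLAC header
        "languageCode": language.replace("_", "-"),  # ja_JP -> ja-JP
    }
    # Inline audio is sent as base64 text in the "content" field.
    audio = {"content": base64.b64encode(flac_bytes).decode("ascii")}
    return json.dumps({"config": config, "audio": audio})
```

The `sampleRateHertz` value is the one that triggers the error described under Error Handling when it disagrees with the file.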
## Error Handling
#### "Sample rate in request does not match FLAC header."
The sample rate passed with `-s` must match the sample rate of the audio files, so check it first.
I recommend ffprobe for this.
```
$ ffprobe results/000.flac
Input #0, flac, from 'results/000.flac':
Metadata:
ENCODER : Lavf57.56.101
Duration: 00:00:01.87, start: 0.000000, bitrate: 184 kb/s
Stream #0:0: Audio: flac, 22050 Hz, mono, s16
```
The output shows the sample rate. In this case it is 22050 Hz.
So the command is:
```
$ voicetotext results -s 22050
```
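If ffprobe is not available, the sample rate can also be read straight from the FLAC header. The sketch below uses only the standard library and follows the FLAC format layout (the `fLaC` marker, a 4-byte metadata block header, then the 34-byte STREAMINFO block). The function names are hypothetical and are not part of Voice2Text.

```python
def parse_flac_streaminfo(header: bytes):
    """Return (sample_rate, channels, bits_per_sample) from FLAC header bytes."""
    if len(header) < 42 or header[:4] != b"fLaC":
        raise ValueError("not a FLAC stream")
    # Low 7 bits of the block header give the block type; 0 is STREAMINFO.
    if header[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    # STREAMINFO starts at offset 8. Its bytes 10..17 pack:
    # 20 bits sample rate, 3 bits channels-1, 5 bits bits-1, 36 bits samples.
    packed = int.from_bytes(header[18:26], "big")
    sample_rate = packed >> 44
    channels = ((packed >> 41) & 0x7) + 1
    bits_per_sample = ((packed >> 36) & 0x1F) + 1
    return sample_rate, channels, bits_per_sample


def flac_sample_rate(path):
    """Read the sample rate of the FLAC file at `path`."""
    with open(path, "rb") as f:
        return parse_flac_streaminfo(f.read(42))[0]
```

The value returned by `flac_sample_rate("results/000.flac")` is the number to pass to `-s`.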
## Contributing
1. Fork it!
2. Create your feature branch: `git checkout -b my-new-feature`
3. Commit your changes: `git commit -am 'Add some feature'`
4. Push to the branch: `git push origin my-new-feature`
5. Submit a pull request :D
## Debugging
```
# Create and activate a virtual environment
python3 -m venv env
source ./env/bin/activate
# Install the Python packages
pip install -r requirements.txt
```
## History
## License
This software is released under the MIT License, see LICENSE.txt.