scikit-surgeryspeech

Interface to speech services for image-guided surgery.

Project description

Author: Kim-Celine Kahl

scikit-surgeryspeech is part of the SNAPPY software project, developed at the Wellcome EPSRC Centre for Interventional and Surgical Sciences, part of University College London (UCL).

scikit-surgeryspeech supports Python 3.6.

scikit-surgeryspeech runs the Python Speech Recognition API in the background, listening for a specific keyword. After saying the keyword, you can say different commands, which are converted to Qt signals.

Speech recognition is done by the Google Cloud API. You need credentials to use it, or you can change the recognition service.

Keyword detection is done by the Porcupine API. To get it running, you have to set several paths in your environment variables, as described below.

Please explore the project structure, and implement your own functionality.

Example usage

To run an example, just start

sksurgeryspeech.py

Make sure the Google Cloud API is set up correctly, as described in the section below.

You also have to set all the parameters for the Porcupine keyword detection, also described below.

You can then say the keyword, which depends on the Porcupine keyword file you chose, followed by a command. The command “quit” exits the application.

Note: after each command, you need to say the keyword again before the application listens for the next command.
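The keyword-then-command behaviour described above can be sketched without any audio or Qt dependencies. This is only an illustration of the control flow, not the project's implementation: the function name, the keyword "porcupine" and the command "next" are hypothetical, and plain callables stand in for the Qt signals the real application emits.

```python
from typing import Callable, Dict, Iterable, List


def dispatch_utterances(
    utterances: Iterable[str],
    keyword: str,
    handlers: Dict[str, Callable[[], None]],
) -> List[str]:
    """Simulate the keyword-then-command flow on already recognised phrases.

    Returns the commands that were accepted, in order.
    """
    accepted = []
    armed = False  # True only directly after the keyword has been heard
    for phrase in utterances:
        phrase = phrase.strip().lower()
        if not armed:
            # Ignore everything until the keyword is spoken.
            armed = phrase == keyword
            continue
        # The phrase right after the keyword is treated as the command;
        # the keyword must be said again before the next command.
        armed = False
        handler = handlers.get(phrase)
        if handler is None:
            continue  # unknown command: drop it and wait for the keyword
        accepted.append(phrase)
        handler()  # stand-in for emitting a Qt signal
        if phrase == "quit":
            break
    return accepted


log = []
handlers = {
    "next": lambda: log.append("next fired"),  # hypothetical command
    "quit": lambda: log.append("quit fired"),
}
heard = ["next", "porcupine", "next", "next", "porcupine", "quit", "porcupine", "next"]
accepted = dispatch_utterances(heard, "porcupine", handlers)
print(accepted)  # ['next', 'quit']
```

The first "next" is ignored because no keyword preceded it, and the second "next" in a row is ignored because the keyword was not repeated.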

Developing

Cloning

You can clone the repository using the following command:

git clone https://weisslab.cs.ucl.ac.uk/WEISS/SoftwareRepositories/SNAPPY/scikit-surgeryspeech

If you have problems running the application, you might need to install portaudio, for example with Homebrew on macOS:

brew install portaudio

Set up the Porcupine keyword detection

If you are running the keyword example, you need to clone the Porcupine API

git clone https://github.com/Picovoice/Porcupine.git

Then set the following environment variables. The paths shown here are relative to the Porcupine folder; use the full paths on your system:

PYTHONPATH=Porcupine\binding\python
PORCUPINE_DYNAMIC_LIBRARY=Porcupine\lib\<your os>\<your processor type>\<dynamic-library-file>
PORCUPINE_PARAMS=Porcupine\lib\common\porcupine_params.pv
PORCUPINE_KEYWORD=Porcupine\resources\keyword_files\<your os>\<keyword file of your choice>
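Before starting the application, it can help to check that the three Porcupine file variables listed above are set and point to existing files. The following is a minimal sketch; the function name and messages are illustrative and not part of scikit-surgeryspeech. PYTHONPATH is left out because it points to a directory rather than a file.

```python
import os
from typing import List, Mapping

# Variable names taken from the list above.
REQUIRED_VARIABLES = (
    "PORCUPINE_DYNAMIC_LIBRARY",
    "PORCUPINE_PARAMS",
    "PORCUPINE_KEYWORD",
)


def find_porcupine_problems(env: Mapping[str, str]) -> List[str]:
    """Return one message per variable that is unset or not an existing file."""
    problems = []
    for name in REQUIRED_VARIABLES:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not os.path.isfile(value):
            problems.append(f"{name} does not point to an existing file: {value}")
    return problems


# Report any problems found in the current environment.
for problem in find_porcupine_problems(os.environ):
    print(problem)
```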

You can also generate your own keyword files

If you are using the speech recognition service within your own application, you have to start a background thread that repeatedly calls the method which listens for the keyword.

You can find an example of how to create such a thread in sksurgeryspeech_demo.py.
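The general shape of such a background thread can be sketched with the standard library alone. This is an assumption-laden illustration, not the code in the demo file: `start_keyword_thread` and `fake_listen` are hypothetical names, and `fake_listen` stands in for whatever method of the service listens for the keyword once.

```python
import threading
import time
from typing import Callable


def start_keyword_thread(listen_once: Callable[[], None],
                         stop_event: threading.Event) -> threading.Thread:
    """Call listen_once repeatedly in a daemon thread until stop_event is set."""

    def run() -> None:
        while not stop_event.is_set():
            listen_once()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


calls = []


def fake_listen() -> None:
    # Hypothetical stand-in for one blocking "listen for the keyword" call.
    calls.append(time.monotonic())
    time.sleep(0.01)


stop = threading.Event()
worker = start_keyword_thread(fake_listen, stop)
time.sleep(0.1)          # the main thread is free to do other work here
stop.set()               # ask the loop to finish after the current call
worker.join(timeout=1.0)
print(f"listened {len(calls)} times")
```

Using an event to stop the loop lets the application shut the thread down cleanly, for example when the "quit" command is received.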

Use the Google Cloud speech recognition service

To use the Google Cloud speech recognition service, you need to get the credentials first. After signing up, you should get a JSON file with your credentials. Download this file and set the environment variable

GOOGLE_APPLICATION_CREDENTIALS

to the path of your JSON file. You should then be able to run the application.
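A quick way to confirm that the variable is set up as described is to read the file it points to and check that it parses as JSON. The sketch below uses a hypothetical helper name and makes no claim about how the application itself loads the credentials.

```python
import json
import os
from typing import Mapping


def load_google_credentials(env: Mapping[str, str]) -> str:
    """Return the text of the credentials file named by the environment.

    Raises RuntimeError if the variable is unset, OSError if the file cannot
    be read, and ValueError if the content is not valid JSON.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    with open(path, "r", encoding="utf-8") as handle:
        text = handle.read()
    json.loads(text)  # only validates the syntax, not the account details
    return text


try:
    load_google_credentials(os.environ)
    print("credentials file looks readable")
except (RuntimeError, OSError, ValueError) as error:
    print(f"credentials problem: {error}")
```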

Change speech recognition service

If you don’t want to use the Google Cloud API, you can switch to a different speech recognition service by changing the call

words = recognizer.recognize_google_cloud(audio, credentials_json=self.credentials)

(in the method “listen_to_command(self)” of the file “voice_recognition_service.py”) to the recognition service of your choice. Currently available services are:

recognizer.recognize_sphinx(audio)
recognizer.recognize_google(audio)
recognizer.recognize_google_cloud(audio, credentials_json=GOOGLE_CLOUD_SPEECH_CREDENTIALS)
recognizer.recognize_wit(audio, key=WIT_AI_KEY)
recognizer.recognize_bing(audio, key=BING_KEY)
recognizer.recognize_azure(audio, key=AZURE_SPEECH_KEY)
recognizer.recognize_houndify(audio, client_id=HOUNDIFY_CLIENT_ID, client_key=HOUNDIFY_CLIENT_KEY)
recognizer.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD)
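Because all of the calls above share the `recognize_<service>` naming pattern, one possible alternative to editing the call by hand is to select the method by name. This is a sketch of that idea only: `recognise_with` and `FakeRecognizer` are hypothetical, and the fake class stands in for a real recognizer object so that the example runs without audio hardware or API keys.

```python
from typing import Any


def recognise_with(recognizer: Any, service: str, audio: Any, **options: Any) -> str:
    """Call recognizer.recognize_<service>(audio, **options) and return the text."""
    method = getattr(recognizer, f"recognize_{service}", None)
    if not callable(method):
        raise ValueError(f"recognizer has no service named {service!r}")
    return method(audio, **options)


class FakeRecognizer:
    """Hypothetical stand-in that mimics two of the method names listed above."""

    def recognize_sphinx(self, audio):
        return f"sphinx heard {audio}"

    def recognize_wit(self, audio, key):
        return f"wit heard {audio} with key {key}"


fake = FakeRecognizer()
print(recognise_with(fake, "sphinx", "quit"))
print(recognise_with(fake, "wit", "quit", key="WIT_AI_KEY"))
```

With this approach the service name and its keyword arguments could come from configuration, while the rest of the listening code stays unchanged.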

Python development

This project uses tox. Start with a clean Python environment, then run:

pip install tox
tox

The commands that tox runs can be found in tox.ini.

Installing

You can pip install directly from the repository as follows:

pip install git+https://weisslab.cs.ucl.ac.uk/WEISS/SoftwareRepositories/SNAPPY/scikit-surgeryspeech

Contributing

Please see the contributing guidelines.

Useful links

Source code repository

Licensing and copyright

Acknowledgements

Supported by Wellcome and EPSRC.

Project details

Release history

0.3.2: Mar 1, 2022
0.3.0: Oct 23, 2020
0.2.0: Mar 27, 2020
0.1.0: Mar 26, 2020
0.0.8: Sep 20, 2019
0.0.7: Sep 18, 2019
0.0.6: Aug 5, 2019
0.0.5: Jul 31, 2019
0.0.4: Jul 30, 2019 (this version)
0.0.3: Jul 29, 2019
0.0.2: Jul 25, 2019
0.0.1: Jul 19, 2019

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

scikit_surgeryspeech-0.0.4-py2.py3-none-any.whl (15.8 kB view details)

Uploaded Jul 30, 2019; Python 2, Python 3

File details

Details for the file scikit_surgeryspeech-0.0.4-py2.py3-none-any.whl.

File metadata

Download URL: scikit_surgeryspeech-0.0.4-py2.py3-none-any.whl
Upload date: Jul 30, 2019
Size: 15.8 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/2.7.15+

File hashes

Hashes for scikit_surgeryspeech-0.0.4-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`7bfd935dcd25683e64c8db9f4f34987a3be964c47d5c1c3eafcf25716fbf46a9`
MD5	`5112edbc0492f790491d89471f2a0439`
BLAKE2b-256	`64067303dbdb8311a3b98d4e9f23c2110a387e8c8ca797bada3ddc1d017c4234`
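A downloaded wheel can be compared against the SHA256 digest listed above with Python's standard library. This is a minimal sketch: the function names are illustrative, and the path passed to them is wherever you saved the file.

```python
import hashlib

# SHA256 digest copied from the table above.
EXPECTED_SHA256 = "7bfd935dcd25683e64c8db9f4f34987a3be964c47d5c1c3eafcf25716fbf46a9"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hexadecimal SHA256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("scikit_surgeryspeech-0.0.4-py2.py3-none-any.whl")` should return True for an intact download of this release.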