NeuralSpace VoiceAI Python Client

Installation

pip install -U neuralspace

Authentication

Set the NS_API_KEY environment variable to your NeuralSpace API key:

export NS_API_KEY=YOUR_API_KEY

Alternatively, provide your API key as a parameter when initializing VoiceAI:

import neuralspace as ns

vai = ns.VoiceAI(api_key='YOUR_API_KEY')
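
To fail early with a clear message when the key is missing, a script can check the environment before creating the client. The following is a minimal sketch: `require_api_key` is a hypothetical helper, not part of the neuralspace package, and it makes no claim about how the client itself reacts to a missing key.

```python
import os


def require_api_key(env=None, name='NS_API_KEY'):
    """Return the API key stored under `name`, or raise if it is missing or blank.

    Illustrative helper only; it is not provided by neuralspace.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or '').strip()
    if not key:
        raise RuntimeError(f'{name} is not set; export it or pass api_key= to VoiceAI')
    return key


# Usage (assumes the neuralspace package is installed):
#   import neuralspace as ns
#   vai = ns.VoiceAI(api_key=require_api_key())
```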

Quickstart

File Transcription

import requests
import neuralspace as ns

filename = 'english_audio_sample.mp3'

# Download the sample audio file
print('Downloading sample audio file...')
resp = requests.get('https://github.com/Neural-Space/neuralspace-examples/raw/main/datasets/transcription/en/english_audio_sample.mp3')
resp.raise_for_status()  # fail early if the download did not succeed
with open(filename, 'wb') as fp:
    fp.write(resp.content)


vai = ns.VoiceAI()
# or,
# vai = ns.VoiceAI(api_key='YOUR_API_KEY')

# Setup job configuration
config = {
    'file_transcription': {
        'language_id': 'en',
        'mode': 'advanced',
    },
}

# Create a new file transcription job
job_id = vai.transcribe(file=filename, config=config)
print(f'Created job: {job_id}')

# Check the job's status
result = vai.get_job_status(job_id)
print(f'Current status:\n{result}')

# This should finish in about a minute for the sample audio used here.
# The time taken depends on the duration of the audio file and other config options.
print('Waiting for completion...')
result = vai.poll_until_complete(job_id)
print(result)

Output:

Downloading sample audio file...
Created job: 93e229c7-912d-43aa-9d87-96f873f69882
Current status:
{
  "success": True,
  "message": "Data fetched successfully",
  "data": {
    "timestamp": 1695210581508,
    "filename": "english_audio_sample.mp3",
    "jobId": "93e229c7-912d-43aa-9d87-96f873f69882",
    "filePath": "uploads/bf377596-7a1d-4de9-82a7-9799d83f0ad9",
    "params": {
      "file_transcription": {
        "language_id": "en",
        "mode": "advanced"
      }
    },
    "status": "Queued",
    "audioDuration": 131.568,
    "messsage": "",
    "progress": [
      "Queued"
    ]
  }
}
Waiting for completion...
{
  "success": true,
  "message": "Data fetched successfully",
  "data": {
    "timestamp": 1695210581508,
    "filename": "english_audio_sample.mp3",
    "jobId": "93e229c7-912d-43aa-9d87-96f873f69882",
    "params": {
      "file_transcription": {
        "language_id": "en",
        "mode": "advanced"
      }
    },
    "status": "Completed",
    "audioDuration": 131.568,
    "messsage": "",
    "progress": [
      "queued",
      "Started",
      "Transcription Started",
      "Transcription Completed",
      "Completed"
    ],
    "result": {
      "transcription": {
        "transcript": "We've been at this for hours now. Have you found anything useful in any of those books? Not a single thing, Lewis. I'm sure that there must be something in this library. It's not like there's nothing left to be discovered. Well, I have to say that I'm tired of searching. I'm gonna take a little break. You come and cut us. I am getting a little hungry. Do you want to get someone to eat? Yeah. Food town's great right about now. What was that noise, Curtis? Did you hear that? Yes, I heard that, Lewis. I don't know, but it sounded like it came from the back of the library. Let's check it out. Okay, where you go first? Looks like a book is falling off one of the shelves. It's an old book, but it looks a bit. It's a little dusty and I can't make out what it says. Look at this, Lewis. The last treasure of Lima. Lima? Isn't that the capital city of Peru? Yes, Lewis. And it looks like there's been a treasure missing for centuries now. Look at this, Lewis. Apparently, lost treasure is located inside a temple on the outskirts of Lima. Looks like this book is a map to the treasure. Either even corn that's written down on this page. Let's get some food and plan out this next adventure. As soon as we get to Peru, I'll go straight to these coordinates that are written in the book. Great, I'll talk to you again on our land. 92, 93, 94. I'll meet at the exact location Lewis and I don't see anything. There's absolutely nothing to be seen here, just trees. Faith, look around, is there anything written on any tree? I hope this wasn't a waste of time.",
        "timestamps": [
          {
            "word": "We've",
            "start": 6.69,
            "end": 7.03,
            "conf": 0.99
          },
          {
            "word": "been",
            "start": 7.03,
            "end": 7.09,
            "conf": 0.99
          },
          {
            "word": "at",
            "start": 7.09,
            "end": 7.23,
            "conf": 0.99
          },
          {
            "word": "this",
            "start": 7.23,
            "end": 7.37,
            "conf": 0.97
          },
          {
            "word": "for",
            "start": 7.37,
            "end": 7.47,
            "conf": 0.97
          },
          {
            "word": "hours",
            "start": 7.47,
            "end": 7.87,
            "conf": 0.56
          },
          {
            "word": "now.",
            "start": 7.87,
            "end": 8.43,
            "conf": 1
          }
          ...
        ]
      }
    }
  }
}
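
The per-word `conf` values in the result above can be used to flag words worth reviewing. The following is a small sketch that works on the result dict only; `low_confidence_words` and the 0.8 threshold are illustrative choices, not part of the client.

```python
def low_confidence_words(result, threshold=0.8):
    """Return (word, start, end, conf) tuples whose confidence is below `threshold`."""
    words = result['data']['result']['transcription']['timestamps']
    return [
        (w['word'], w['start'], w['end'], w['conf'])
        for w in words
        if w['conf'] < threshold
    ]


# A trimmed copy of the response shown above, keeping only the fields used here.
sample = {
    'data': {
        'result': {
            'transcription': {
                'timestamps': [
                    {'word': "We've", 'start': 6.69, 'end': 7.03, 'conf': 0.99},
                    {'word': 'for', 'start': 7.37, 'end': 7.47, 'conf': 0.97},
                    {'word': 'hours', 'start': 7.47, 'end': 7.87, 'conf': 0.56},
                    {'word': 'now.', 'start': 7.87, 'end': 8.43, 'conf': 1},
                ]
            }
        }
    }
}

for word, start, end, conf in low_confidence_words(sample):
    print(f'{start:.2f}-{end:.2f}s  {word!r}  conf={conf}')
```

With the sample data, only the word "hours" (confidence 0.56) falls below the threshold.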

Streaming Real-Time Transcription

The following example shows how to use NeuralSpace VoiceAI to transcribe microphone input in real time.
It uses the PyAudio library, which can be installed with: pip install pyaudio
PyAudio depends on the PortAudio library, which must be installed via your OS package manager.

For macOS
```
brew install portaudio
```
For Debian/Ubuntu Linux
```
apt install portaudio19-dev
```

import json
import threading
from queue import Queue

import pyaudio
import neuralspace as ns

q = Queue()

# callback for pyaudio to fill up the queue
def listen(in_data, frame_count, time_info, status):
    q.put(in_data)
    return (None, pyaudio.paContinue)

# transfer from queue to websocket
def send_audio(q, ws):
    try:
        while True:
            data = q.get()
            ws.send_binary(data)
    except Exception:
        # sending fails once the websocket is closed; stop this thread
        print('Stopped sending audio.')

# initialize VoiceAI
vai = ns.VoiceAI()
pa = pyaudio.PyAudio()

# open websocket connection
with vai.stream('en') as ws:
    # start pyaudio stream
    stream = pa.open(
        rate=16000,
        channels=1,
        format=pyaudio.paInt16,
        frames_per_buffer=4096,
        input=True,
        output=False,
        stream_callback=listen,
    )
    # start sending audio bytes on a new daemon thread,
    # so that it cannot keep the process alive on exit
    t = threading.Thread(target=send_audio, args=(q, ws), daemon=True)
    t.start()
    print('Listening...')
    # start receiving results on the current thread
    try:
        while True:
            resp = ws.recv()
            resp = json.loads(resp)
            text = resp['text']
            # optional output formatting; new lines on every 'full' utterance
            if resp['full']:
                print('\r' + ' ' * 120, end='', flush=True)
                print(f'\r{text}', flush=True)
            else:
                if len(text) > 120:
                    text = f'...{text[-115:]}'
                print(f'\r{text}', end='', flush=True)
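    except KeyboardInterrupt:
        print('\nFinishing.')
    finally:
        # release the microphone and PortAudio resources
        stream.stop_stream()
        stream.close()
        pa.terminate()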
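
The loop above overwrites partial results on screen and starts a new line on every 'full' utterance. To keep only the finalized text, for example to write a transcript to a file, the messages can be filtered on the `full` flag. This is a sketch under the assumption that each message is a JSON object with the `text` and `full` keys used above; `final_utterances` is a hypothetical helper and the demo messages are hand-written.

```python
import json


def final_utterances(messages):
    """Yield the text of every message whose 'full' flag is set.

    `messages` is any iterable of JSON strings shaped like the ones
    returned by ws.recv() in the example above ('text' and 'full' keys).
    """
    for raw in messages:
        resp = json.loads(raw)
        if resp['full']:
            yield resp['text']


# Hand-written messages for illustration; real payloads may carry more fields.
demo = [
    json.dumps({'text': 'hello', 'full': False}),
    json.dumps({'text': 'hello world', 'full': False}),
    json.dumps({'text': 'Hello world.', 'full': True}),
    json.dumps({'text': 'how are', 'full': False}),
]

transcript = ' '.join(final_utterances(demo))
print(transcript)
```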

Text to Speech

import neuralspace as ns

vai = ns.VoiceAI()
# or,
# vai = ns.VoiceAI(api_key='YOUR_API_KEY')

# TTS job configuration
data = {
    "text": "كيف حالك",
    "speaker_id": "ar-female-Nadia-saudi-neutral",
    "stream": True,
    "config": {
        "pace": 1,
        "volume": 1
    }
}

# result will be an audio byte array, as stream is set to True
result = vai.synthesize(data=data)
print(f'result with stream=True:\n{result}')

# Creating a new job with stream=False
data['stream'] = False
# result will have the metadata of the job submitted along with the audio upload path
result = vai.synthesize(data=data)
print(f'result with stream=False:\n{result}')

# Fetching the details of previous job
job_id = result['data']['jobId'] # example job_id
result = vai.get_tts_job_status(job_id)
print(f'Details of the job:\n{result}')

# Delete the job 
result = vai.delete_tts_job(job_id)
print(f'Response after deleting the job:\n{result}')

# Fetching the details of all previous jobs
query_params = {
    "pageNumber": 2,
    "pageSize": 10,
    "sort": "asc"
}
result = vai.get_tts_jobs(query_params=query_params)
print(f'Fetching the details of all previous jobs:\n{result}')

Output:

result with stream=True:
b'RIFF$\xb4\x00\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00"V\x00\x00D\xac\x00\x00\x02\x00\x10\x00data\x00\xb4\x00\x00\x0e\x00\x0b\x00\x0e\x00\x0f\x00\x0f\x00\x0e\x00\r\x00\t\x00\x0c\x00\r\x00\x12\x00\x0c\x00\r\x00\n\x00\x0b\x00\x0b\x00\x0f\x00\x10\x00\x0b\x00\r\x00\x11\x00\x10\x00\x11\x00\x12\x00\x11\x00\x11\x00\x0b\x00\x10\x00\x10\x00\x0b\x00\x12\x00\x0e\x00\t\x00\x12\x00\x19\x00\x14\x00\x12\x00\x0f\x00\x12\x00\r\x00\r\x00\x10\x00...'
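
The bytes above begin with a RIFF/WAVE header, so they can be written to a .wav file as they are. The following sketch decodes the first 44 bytes of that sample with the standard library to show the audio format. It assumes the canonical layout in which the `data` chunk directly follows a 16-byte `fmt ` chunk, which holds for this sample but is not guaranteed for every response; `save_audio` is a hypothetical helper.

```python
import struct

# First 44 bytes of the sample output above.
header = (
    b'RIFF$\xb4\x00\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00'
    b'"V\x00\x00D\xac\x00\x00\x02\x00\x10\x00data\x00\xb4\x00\x00'
)

# Little-endian canonical WAV header: RIFF chunk, fmt chunk, data chunk header.
(riff, riff_size, wave_id, fmt_id, fmt_size, audio_format, channels,
 sample_rate, byte_rate, block_align, bits_per_sample,
 data_id, data_size) = struct.unpack('<4sI4s4sIHHIIHH4sI', header)

assert (riff, wave_id, fmt_id, data_id) == (b'RIFF', b'WAVE', b'fmt ', b'data')
print(f'format={audio_format} channels={channels} rate={sample_rate} Hz '
      f'bits={bits_per_sample} duration={data_size / byte_rate:.2f} s')


def save_audio(audio_bytes, path):
    """Write the bytes returned with stream=True to `path` unchanged."""
    with open(path, 'wb') as fp:
        fp.write(audio_bytes)
```

For this sample the header describes 16-bit mono PCM at 22050 Hz. For files with other chunk layouts, Python's `wave` module is a more robust way to read the format.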

result with stream=False:
{
    "success": true,
    "message": "Job created successfully",
    "data": {
        "jobId": "8cf89d36-b55e-4c4f-a480-65bcd8484fae",
        "timestamp": 1701418572768,
        "result": {
            "save_path": "https://largefilestoreprod.blob.core.windows.net/common/uploads/6272df27-81a6-442a-bb7a-f98b63243604"
        }
    }
}

Details of the job:
{
    "success": true,
    "message": "Data fetched successfully",
    "data": {
        "timestamp": 1701418685869,
        "jobId": "4170883b-5ef9-4395-8dd1-deef17e140f8",
        "text": "كيف حالك",
        "params": {
            "pace": 1,
            "volume": 1,
            "speaker_id": "ar-female-Nadia-saudi-neutral",
            "language_id": "ar"
        },
        "status": "Completed",
        "result": {
            "save_path": "https://largefilestoreprod.blob.core.windows.net/common/uploads/6272df27-81a6-442a-bb7a-f98b63243604"
        },
        "audioDuration": 2
    }
}

Response after deleting the job:
{
    "success": true,
    "message": "Job and associated files deleted successfully.",
    "data": {
        "deletedCount": 1
    }
}

Fetching the details of all previous jobs:
{
    "success": true,
    "message": "Data fetched successfully",
    "data": {
        "jobs": [
            {
                "timestamp": 1701326878096,
                "jobId": "a5bcc6fe-f3c6-4efa-ba26-d2e28e8e8914",
                "text": "hello how are you",
                "params": {
                    "pace": 1,
                    "volume": 1,
                    "pitch_shift": 0.5,
                    "pitch_scale": 0.5,
                    "speaker_id": "ar-female-Nadia-saudi-neutral",
                    "language_id": "ar"
                },
                "status": "Completed",
                "audioDuration": 2
            },
            {
                "timestamp": 1701326955634,
                "jobId": "b3bdddbe-3b85-4068-ba59-d659e7469bd3",
                "text": "hello how are you",
                "params": {
                    "pace": 1,
                    "volume": 1,
                    "pitch_shift": 0.5,
                    "pitch_scale": 0.5,
                    "speaker_id": "ar-female-Nadia-saudi-neutral",
                    "language_id": "ar"
                },
                "status": "Completed",
                "audioDuration": 2
            },
            {
                "timestamp": 1701326972188,
                "jobId": "567c720a-37c5-4132-bc10-647207d8e1ad",
                "text": "hello how are you",
                "params": {
                    "pace": 1,
                    "volume": 1,
                    "pitch_shift": 0.5,
                    "pitch_scale": 0.5,
                    "speaker_id": "ar-female-Nadia-saudi-neutral",
                    "language_id": "ar"
                },
                "status": "Completed",
                "audioDuration": 2
            }
            ...
        ],
        "total": 27,
        "pageSize": 10,
        "page": 2
    }
}
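
The `total`, `pageSize` and `page` fields in the response above are enough to walk through every page. The following sketch takes the fetching function as a parameter so that it does not depend on the client; `page_count` and `iter_all_jobs` are hypothetical helpers, and the assumption that page numbering starts at 1 is not confirmed by this README.

```python
import math


def page_count(total, page_size):
    """Number of pages needed to list `total` jobs, `page_size` per page."""
    return math.ceil(total / page_size)


def iter_all_jobs(fetch_page, page_size=10):
    """Yield every job by requesting pages until the reported total is reached.

    `fetch_page` is a caller-supplied function that takes a query_params dict
    and returns a response shaped like the one above, for example
    `lambda qp: vai.get_tts_jobs(query_params=qp)`.
    """
    page = 1  # assumption: page numbering starts at 1
    while True:
        resp = fetch_page({'pageNumber': page, 'pageSize': page_size, 'sort': 'asc'})
        jobs = resp['data']['jobs']
        yield from jobs
        if not jobs or page >= page_count(resp['data']['total'], page_size):
            break
        page += 1


print(page_count(27, 10))  # the sample response reports 27 jobs, 10 per page
```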

More Features

To enable additional features for file transcription such as automatic language detection, speaker diarization, translation and more, check out the NeuralSpace VoiceAI Docs.

List Languages

To get the list of supported language codes based on the transcription type, use:

# for file transcription
langs = vai.languages('file')

# for streaming transcription
langs = vai.languages('stream')

List voices

To get the list of supported voices along with their metadata, use:

voices = vai.voices()

Job Config

Instead of a dict, any config or params argument can be provided as a str (a JSON string or a path to a JSON file), a pathlib.Path, or a file-like object.

job_id = vai.transcribe(
    file='path/to/audio.wav',
    config='{"file_transcription": {"language_id": "en", "mode": "advanced", "number_formatting": "words"}}',
)
# or, 
job_id = vai.transcribe(
    file='path/to/audio.wav',
    config='path/to/config.json',
)
# or, 
with open('path/to/config.json') as fp:
    job_id = vai.transcribe(
        file='path/to/audio.wav',
        config=fp
    )
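
The alternative forms can all be produced from one dict with the standard library. This is a minimal sketch that only prepares the values; the temporary directory stands in for a real config location, and nothing here calls the API.

```python
import json
import tempfile
from pathlib import Path

config = {
    'file_transcription': {
        'language_id': 'en',
        'mode': 'advanced',
    },
}

# 1. JSON string
config_str = json.dumps(config)

# 2. Path to a JSON file (a temporary directory stands in for a real location)
config_path = Path(tempfile.mkdtemp()) / 'config.json'
config_path.write_text(config_str, encoding='utf-8')

# 3. File-like object
with config_path.open(encoding='utf-8') as fp:
    assert json.load(fp) == config  # the file round-trips to the same dict

# Any of `config`, `config_str`, `config_path`, or an open file handle could
# then be passed as `config=` to vai.transcribe(), as described above.
```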

Wait for Completion

You can also poll for the status and wait until the job completes:

result = vai.poll_until_complete(job_id)
print(result['data']['result']['transcription']['transcript'])

Note: This will block the calling thread until the job is complete.
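
This README does not say whether poll_until_complete accepts a timeout. If an upper bound on the wait is needed, a loop can be built on get_job_status instead. The following is a sketch, not the client's own implementation: `poll_with_timeout` is a hypothetical helper, 'Completed' is taken from the sample output, and 'Failed' is an assumed name for an error state.

```python
import time


def poll_with_timeout(get_status, job_id, timeout=300.0, interval=5.0,
                      done_states=('Completed', 'Failed')):
    """Call get_status(job_id) until the job reaches a done state or time runs out.

    'Completed' appears in the sample output; 'Failed' is an assumed name
    for an error state and may differ in the real API.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = get_status(job_id)
        if result['data']['status'] in done_states:
            return result
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f'job {job_id} not finished after {timeout} s')
        time.sleep(interval)


# Usage (assumes `vai` and `job_id` from the Quickstart):
#   result = poll_with_timeout(vai.get_job_status, job_id, timeout=120)
```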

Callbacks

You can also provide a callback function when creating the job.
It will be called with the result once the job completes.

def callback(result):
    print(f'job completed: {result["data"]["jobId"]}')
    print(result['data']['result']['transcription']['transcript'])

job_id = vai.transcribe(file='path/to/audio.wav', config=config, on_complete=callback)

Note: transcribe() returns the job_id as soon as the job is scheduled, and the provided callback is called on a new thread, so the calling thread is not blocked.
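
Because the callback runs on another thread, the main thread needs some way to pick up the result later. One option is a callable object built on threading.Event. The following is a sketch: `CompletionWaiter` is a hypothetical helper, and the demo simulates the client by invoking the callback from a plain thread.

```python
import threading


class CompletionWaiter:
    """Callable to pass as on_complete; lets another thread wait for the result."""

    def __init__(self):
        self._done = threading.Event()
        self.result = None

    def __call__(self, result):
        # Store the result first, then signal, so waiters always see it.
        self.result = result
        self._done.set()

    def wait(self, timeout=None):
        """Block until the callback fires; return the result, or None on timeout."""
        return self.result if self._done.wait(timeout) else None


# Simulate the client invoking the callback on a new thread.
waiter = CompletionWaiter()
fake_result = {'data': {'jobId': 'demo-job', 'status': 'Completed'}}
threading.Thread(target=waiter, args=(fake_result,)).start()

print(waiter.wait(timeout=5)['data']['jobId'])

# With the client, the same object would be passed as the callback:
#   job_id = vai.transcribe(file='path/to/audio.wav', config=config, on_complete=waiter)
#   ...do other work...
#   result = waiter.wait()
```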

Release history

1.2.0

Jul 23, 2024

1.1.0

Dec 6, 2023

1.0.6

Sep 22, 2023

1.0.5

Sep 20, 2023

1.0.4

Sep 14, 2023

1.0.3

Sep 13, 2023

1.0.2

Sep 6, 2023

1.0.1

Sep 6, 2023

1.0.0

Sep 5, 2023

0.3.1

Sep 5, 2023

0.3.0

Sep 5, 2023

0.2.35

Jul 12, 2022

0.2.34

Jun 9, 2022

0.2.33

Jun 9, 2022

0.2.32

Jun 8, 2022

0.2.31

Jun 7, 2022

0.2.30

Jun 7, 2022

0.2.29

Jun 7, 2022

0.2.28

Jun 7, 2022

0.2.27

Jun 7, 2022

0.2.26

Jun 7, 2022

0.2.25

May 4, 2022

0.2.24

May 4, 2022

0.2.23

Apr 21, 2022

0.2.22

Apr 5, 2022

0.2.21

Mar 30, 2022

0.2.20

Mar 30, 2022

0.2.19

Mar 30, 2022

0.2.18

Mar 29, 2022

0.2.17

Mar 29, 2022

0.2.16

Mar 29, 2022

0.2.15

Mar 29, 2022

0.2.14

Mar 29, 2022

0.2.12

Feb 15, 2022

0.2.11

Feb 15, 2022

0.2.10

Feb 15, 2022

0.2.9

Feb 15, 2022

0.2.8

Feb 15, 2022

0.2.7

Feb 15, 2022

0.2.6

Feb 15, 2022

0.2.5

Feb 13, 2022

0.2.4

Feb 13, 2022

0.2.3

Feb 13, 2022

0.2.2

Feb 9, 2022

0.2.1

Feb 8, 2022

0.2.0

Feb 2, 2022

0.1.50

Feb 2, 2022

0.1.49

Feb 1, 2022

0.1.48

Jan 18, 2022

0.1.47

Jan 5, 2022

0.1.46

Jan 5, 2022

0.1.45

Jan 5, 2022

0.1.44

Jan 5, 2022

0.1.43

Jan 5, 2022

0.1.42

Jan 3, 2022

0.1.41

Dec 29, 2021

0.1.38

Nov 11, 2021

0.1.37

Nov 10, 2021

0.1.36

Oct 4, 2021

0.1.35

Oct 1, 2021

0.1.34

Sep 29, 2021

0.1.33

Sep 22, 2021

0.1.32

Sep 20, 2021

0.1.31

Sep 11, 2021

0.1.30

Sep 11, 2021

0.1.29

Sep 11, 2021

0.1.28

Sep 11, 2021

0.1.27

Sep 11, 2021

0.1.26

Sep 11, 2021

0.1.25

Aug 31, 2021

0.1.24

Aug 26, 2021

0.1.23

Aug 26, 2021

0.1.22

Aug 26, 2021

0.1.21

Aug 26, 2021

0.1.20

Aug 26, 2021

0.1.19

Aug 26, 2021

0.1.18

Aug 26, 2021

0.1.17

Aug 26, 2021

0.1.16

Aug 26, 2021

0.1.15

Aug 26, 2021

0.1.14

Aug 26, 2021

0.1.13

Aug 26, 2021

0.1.12

Aug 25, 2021

0.1.11

Aug 25, 2021

0.1.10

Aug 25, 2021

0.1.9

Aug 25, 2021

0.1.8

Aug 25, 2021

0.1.7

Aug 25, 2021

0.1.6

Aug 25, 2021

0.1.5

Aug 5, 2021

0.1.4

Aug 5, 2021

0.1.3

Aug 5, 2021

0.1.2

Aug 4, 2021

0.1.1

Aug 3, 2021

0.1.0

Aug 2, 2021

Download files

Source Distribution

neuralspace-1.2.0.tar.gz (15.8 kB)

Uploaded Jul 23, 2024 Source

Built Distribution

neuralspace-1.2.0-py3-none-any.whl (13.1 kB)

Uploaded Jul 23, 2024 Python 3

File details

Details for the file neuralspace-1.2.0.tar.gz.

File metadata

Download URL: neuralspace-1.2.0.tar.gz
Upload date: Jul 23, 2024
Size: 15.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for neuralspace-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`62cb9f7b82c8e89edcfd14052f0a8957dfd524058696bf411e0673dcdb6d8320`
MD5	`52d608e292af7a73952c2ddd1edd2112`
BLAKE2b-256	`dfe35293a896a97b1d292e940732189be59dae8ac34bb7e2c86370d401055361`


File details

Details for the file neuralspace-1.2.0-py3-none-any.whl.

File metadata

Download URL: neuralspace-1.2.0-py3-none-any.whl
Upload date: Jul 23, 2024
Size: 13.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for neuralspace-1.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`59d123c34227311af1126262047bdb8462d091ee31e6c56fd240d5c75b0c6c23`
MD5	`a94910e2d4c0985572c10c65b00146af`
BLAKE2b-256	`2cae69a19448696771cb735bb0fdbf3dde48fe01dcff346622141d4833ff2270`

