Experimental voice receive extension for discord.py
discord-ext-voice-recv
Voice receive extension package for discord.py
Warning
This extension should be more or less functional, but the code is not yet feature complete. No guarantees are given for stability or random breaking changes.
Installing
Python 3.8 or higher is required.
python -m pip install git+https://github.com/imayhaveborkedit/discord-ext-voice-recv
This package will be uploaded to PyPI eventually.
Naturally, this extension depends on discord.py being installed with voice support.
Example
See the example script.
Usage
VoiceRecvClient
The class voice_recv.VoiceRecvClient must be passed to VoiceChannel.connect() to use voice receive functionality.
voice_client = await voice_channel.connect(cls=voice_recv.VoiceRecvClient)
New voice client functions
def listen(sink: voice_recv.AudioSink, *, after=None) -> None
Receives audio data into an AudioSink. A sink is similar to the AudioSource class, where most of the logic is done in a single callback function, but in reverse. Sinks are explained in detail in the Sinks section below.
The finalizer, after, is called once the sink has been exhausted or an error has occurred. The callback signature is the same as the after callback for play(): one parameter for an optional Exception object.
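As a minimal sketch of that callback shape, the finalizer below takes a single optional Exception. The function name after_listening and the finished list are hypothetical names used only for illustration, and the usage line in the trailing comment assumes an already connected voice client and an existing sink instance.

```python
from typing import List, Optional

# Hypothetical log, used only so this illustration has an observable effect.
finished: List[str] = []


def after_listening(error: Optional[Exception]) -> None:
    """Finalizer with one parameter: None on a clean finish, else the error."""
    if error is None:
        finished.append("sink exhausted")
    else:
        finished.append(f"listening failed: {error!r}")


# Assumed usage, given a connected voice client and a sink instance:
#     voice_client.listen(sink, after=after_listening)
```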
def is_listening() -> bool
Returns True if the voice client is currently receiving audio. Specifically, if the bot is reading from the voice socket.
def stop() -> None
This function has been altered to stop both receiving and sending of audio.
def stop_listening() -> None
Stops receiving audio.
def stop_playing() -> None
Stops playing audio. This function is identical to discord.VoiceClient.stop().
Sinks
The API of this extension is designed to mirror the discord.py voice send API. Sending audio uses the AudioSource class, while receiving audio uses the AudioSink class. A sink is designed to be the inverse of a source. Essentially, a source is a callback called by discord.py to produce a chunk of audio data. Conversely, a sink is a callback called by the library to handle a chunk of audio. Sinks can be composed in the same fashion as sources, creating an audio processing pipeline. Sources and sinks can even be combined into one object to handle both tasks, such as creating a feedback loop.
Special care should be taken not to write excessively computationally expensive code, as Python is not particularly well suited to realtime audio processing.
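To put a rough number on that, the sketch below assumes the frame format discord.py uses for sending audio (20 ms frames of 48 kHz, 16-bit stereo PCM); this document does not state the receive format, so treat the constants as assumptions. The helper within_budget is a hypothetical name, and a single timing is only a coarse indication.

```python
import time

# Assumed frame format (taken from discord.py's send side, not from this README).
SAMPLE_RATE = 48_000  # samples per second
CHANNELS = 2          # stereo
SAMPLE_WIDTH = 2      # bytes per 16-bit sample
FRAME_MS = 20         # duration of one frame in milliseconds

SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000
FRAME_BYTES = SAMPLES_PER_FRAME * CHANNELS * SAMPLE_WIDTH


def within_budget(handler, frame: bytes, speakers: int = 1) -> bool:
    """Time one handler(frame) call and compare it to the per-frame budget.

    With several simultaneous speakers, each frame period has to cover one
    handler call per speaker, so the measured time is scaled by `speakers`.
    """
    start = time.perf_counter()
    handler(frame)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return elapsed_ms * speakers < FRAME_MS
```

Under these assumptions a sink has roughly 20 ms per frame, shared between every member who is speaking at the same time.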
Due to voice receive being somewhat more complex than voice sending, sinks have additional functionality compared to sources. However, the core sink functions should look relatively familiar.
class MySink(voice_recv.AudioSink):
    def __init__(self):
        super().__init__()

    def wants_opus(self) -> bool:
        return False

    def write(self, user: Optional[User | Member], data: VoiceData):
        ...

    def cleanup(self):
        ...
These are the main functions of a sink, with names and purposes reflecting those of their source counterparts. It is important to note that super().__init__() must be called when inheriting from AudioSink, in contrast to AudioSource, which does not have a default __init__ function.
- The wants_opus() function determines if the sink should receive opus packets or decoded PCM packets. Care should be taken not to unintentionally mix sinks that want different types.
- The write() function is the main callback, where the sink logic takes place. In a sink pipeline, this could alter, inspect, or log a packet, and then write it to a child sink. VoiceData is a simple container class with attributes for the origin member, opus data, optionally pcm data, and raw audio packet.
- The cleanup() function is identical to AudioSource.cleanup(), a finalizer to clean up any loose ends when the sink has finished its job.
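The pipeline idea can be sketched without a Discord connection. The classes below are stand-ins written for this illustration only: FakeSink and FakeVoiceData are hypothetical and merely imitate the shape of AudioSink and VoiceData described above, they are not the extension's real classes. A filtering sink inspects each packet and forwards it to a child sink, and cleanup is passed down the chain.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FakeVoiceData:
    """Stand-in for VoiceData; only a pcm field is modelled here."""
    pcm: bytes


class FakeSink:
    """Stand-in for AudioSink with an optional child sink (hypothetical)."""

    def __init__(self, child: Optional["FakeSink"] = None) -> None:
        self.child = child

    def wants_opus(self) -> bool:
        return False  # every sink in this sketch wants decoded PCM

    def write(self, user: Optional[str], data: FakeVoiceData) -> None:
        raise NotImplementedError

    def cleanup(self) -> None:
        # Pass cleanup down the pipeline.
        if self.child is not None:
            self.child.cleanup()


class DropEmpty(FakeSink):
    """Inspects each packet and forwards only non-empty ones to the child."""

    def write(self, user: Optional[str], data: FakeVoiceData) -> None:
        if data.pcm and self.child is not None:
            self.child.write(user, data)


class Collect(FakeSink):
    """Terminal sink that stores whatever reaches it."""

    def __init__(self) -> None:
        super().__init__()
        self.frames: List[Tuple[Optional[str], bytes]] = []
        self.closed = False

    def write(self, user: Optional[str], data: FakeVoiceData) -> None:
        self.frames.append((user, data.pcm))

    def cleanup(self) -> None:
        self.closed = True


tail = Collect()
head = DropEmpty(tail)
head.write("alice", FakeVoiceData(b"\x01\x02"))
head.write("bob", FakeVoiceData(b""))  # dropped by the filter
head.cleanup()
```

Both sinks in the chain return the same value from wants_opus(), which is the consistency the first bullet asks for.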
In addition, sinks also have properties for their voice_client, as well as parent and child/children sinks. Furthermore, sinks will be able to receive events in a similar manner to cogs, but this has not been implemented yet. (TODO)
This extension comes with several useful built-in sinks, which I will briefly explain another time. For now, just source dive. (TODO)
New events
async def on_voice_member_speak(member: discord.Member, ssrc: int)
Called when a member first speaks (transmits audio) in a voice channel. This event is only called once per voice session (ssrc assignment). Any packets received from this member before this event fires can (probably) be safely ignored since they are likely just silence packets. The main purpose of this event is to reveal the ssrc of a member, so that packets can be mapped to their originating member.
This is NOT a speaking indicator event. The speaking indicator is determined by packet activity. This functionality will be added in the future.
async def on_voice_member_disconnect(member: discord.Member, ssrc: int)
Called when a member disconnects from a voice channel. The ssrc parameter is the unique id a member has to identify which packets belong to them. This is useful when using custom sinks, particularly those that handle packets from multiple members.
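One way a custom sink might use these two events is to keep a table from ssrc to member. The sketch below is a plain-Python illustration under that assumption: SsrcRegistry and its method names are hypothetical, and member ids stand in for discord.Member objects so the example runs on its own.

```python
from typing import Dict, Optional


class SsrcRegistry:
    """Maps an ssrc to the id of the member it was assigned to (hypothetical)."""

    def __init__(self) -> None:
        self._members: Dict[int, int] = {}

    def on_speak(self, member_id: int, ssrc: int) -> None:
        # Would be called from on_voice_member_speak: the ssrc is now known.
        self._members[ssrc] = member_id

    def on_disconnect(self, member_id: int, ssrc: int) -> None:
        # Would be called from on_voice_member_disconnect: forget the mapping.
        self._members.pop(ssrc, None)

    def member_for(self, ssrc: int) -> Optional[int]:
        # Packets whose ssrc has not been revealed yet map to None.
        return self._members.get(ssrc)


registry = SsrcRegistry()
registry.on_speak(member_id=1234, ssrc=42)
known = registry.member_for(42)
unknown = registry.member_for(99)
registry.on_disconnect(member_id=1234, ssrc=42)
after_disconnect = registry.member_for(42)
```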
async def on_voice_member_video(member: discord.Member, data: voice_recv.VoiceVideoStreams)
Called when a member in a voice channel toggles their webcam on or off, NOT screenshare. Screenshare status is only indicated in the self_video attribute of discord.VoiceState.
async def on_voice_member_flags(member: discord.Member, flags: Optional[int])
An undocumented event, dispatched when a member joins a voice channel, that carries a flags bitfield. Only the values 0, 2, and None have been observed so far, and their meaning remains unknown.
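Since the meaning of the bits is unknown, about all a handler can do is record which bit positions are set. The helper below is a hypothetical sketch for that purpose; it assigns no meaning to any bit and treats None as "no flags".

```python
from typing import List, Optional


def set_bits(flags: Optional[int]) -> List[int]:
    """Return the positions of the bits set in a flags value (None gives [])."""
    if flags is None:
        return []
    return [bit for bit in range(flags.bit_length()) if (flags >> bit) & 1]
```

For the values observed so far, 0 and None yield an empty list, and 2 yields a single set bit at position 1.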
async def on_voice_member_platform(member: discord.Member, platform: Optional[int | str])
An undocumented event, dispatched when a member joins a voice channel, that carries a platform key, presumably indicating which platform the member joined on. However, this field has only ever been seen to contain None.
Currently missing features
- Sink events (similar to cog event handlers)
- Silence generation (will be implemented as an included AudioSink)
- Member speaking state status/event (design not yet decided)
- Various internal impl details to maintain audio stability and consistency
Future plans
- Muxer AudioSink (mixes multiple audio streams into a single stream)
- Rust implementations of some components for improved performance
- Alternative voice client implementation with a minimal interface intended for use with external data processing
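To illustrate what mixing multiple streams into one means, here is a minimal sketch of sample-wise mixing. It is not the planned Muxer sink: the function name mix_pcm16 is hypothetical, and the sketch assumes equally sized frames of signed 16-bit PCM in native byte order, summed and clipped to the 16-bit range.

```python
from array import array
from typing import Sequence


def mix_pcm16(frames: Sequence[bytes]) -> bytes:
    """Sum equally sized signed 16-bit PCM frames, clipping to the int16 range."""
    if not frames:
        return b""
    decoded = [array("h", frame) for frame in frames]
    mixed = array("h", bytes(len(frames[0])))  # zero-filled output frame
    for i in range(len(mixed)):
        total = sum(samples[i] for samples in decoded)
        mixed[i] = max(-32768, min(32767, total))  # clip instead of wrapping
    return mixed.tobytes()
```

A per-sample Python loop like this is slow, which is in line with the performance caveat in the Sinks section and the plan to move some components to Rust.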
Download files
Source Distribution
Hashes for discord-ext-voice_recv-0.1.7a94.tar.gz
Algorithm | Hash digest
---|---
SHA256 | eec7fae1851f7d566a956a70fd0508ded4bf53172904ad1d0829c84c81b0a8f1
MD5 | 9d4a3dc6a176bf3d63106e9260f0c37f
BLAKE2b-256 | 223a38c2ef63d0da44cd3dd3ed68db14356cf723cd5652cfdc1053a5836bd254
Built Distribution
Hashes for discord_ext_voice_recv-0.1.7a94-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | ea79721b327974a842b1f6b262e2980528fe1a4c2bacf8d8b90e8c71cab3f4b3
MD5 | b8e776019a7e95223a6358f69d594208
BLAKE2b-256 | 6eb0e884729c1043febfc00e8bcbafa409b9290e0d1efe46f751456cfa3dd192