Speech-to-text for Flet apps via OS-native recognition (Android/iOS)
flet-stt
Speech-to-text for Flet apps. Wraps the speech_to_text Flutter plugin through a custom Flet extension, giving your Python code access to the OS-native speech recognizer on Android and iOS.
On Android this uses Google Speech Services (on-device for ~50 languages, no API key needed). On iOS it uses Apple's SFSpeechRecognizer.
Features
- real-time speech recognition with partial results and alternate transcriptions
- locale selection (50+ languages, on-device where available)
- sound level monitoring during listening
- listen modes: confirmation (short), search, dictation (long-form)
- configurable pause detection, sample rate, and error handling
- system locale detection, permission checking, listening state queries
- error and status callbacks (on_error, on_status)
Install
pip install flet-stt
# or
poetry add flet-stt
In your app's pyproject.toml, declare the Android permission and tell Flet where to find the extension for APK builds:
[project]
dependencies = [
"flet>=0.82.0",
"flet-stt",
]
[tool.flet.android.permission]
"android.permission.RECORD_AUDIO" = true
[tool.flet.app]
exclude = ["flet_stt"]
[tool.flet.dev_packages]
flet-stt = "flet_stt"
The exclude line prevents the extension source from being raw-copied into the APK, which would shadow the installed package and break imports.
Usage
import json

import flet as ft
from flet_stt import FletStt


def main(page: ft.Page):
    # FletStt is a service: instantiate it, but do not add it to the page.
    stt = FletStt()

    def on_result(e):
        # e.data is a JSON string; only final results are printed here.
        data = json.loads(e.data)
        if data["final"]:
            print(f"Recognized: {data['text']} (confidence: {data['confidence']:.0%})")

    stt.on_result = on_result

    async def start_listening(e):
        # initialize() must run before listen(); it also requests mic permission.
        await stt.initialize()
        await stt.listen(partial_results=True, listen_mode="dictation")

    async def stop_listening(e):
        await stt.stop()

    page.add(
        ft.Column([
            ft.Button(content="Listen", on_click=start_listening),
            ft.Button(content="Stop", on_click=stop_listening),
        ])
    )


ft.run(main)
Just instantiate FletStt. Do not add it to page.overlay or page.controls - it's a service, not a visual control, and it registers itself automatically.
See the examples/ folder for more: basic.py, simple.py, continuous.py, locale_picker.py, diagnostic.py.
API
FletStt(on_result=..., on_sound_level=..., on_error=..., on_status=...)
The service. Instantiate once. All callbacks receive an event where e.data is a JSON string.
Events
| Event | e.data format | Description |
|---|---|---|
| on_result | {"text": "hello", "final": true, "confidence": 0.95, "alternates": [...]} | Recognition result. final=false for partial results. |
| on_sound_level | {"level": -6.5} | Microphone dB level during listening. |
| on_error | {"error": "error_speech_timeout", "permanent": false} | Recognition error. Permanent errors require re-initialization. |
| on_status | {"status": "listening"} | Status change: "listening", "notListening", "done". |
The alternates field in on_result is a list of {"text": "...", "confidence": 0.0} objects. The first entry matches the main text field. Additional entries are alternative transcriptions when the engine provides them (depends on platform/locale).
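As a minimal sketch of reading that payload, the helper below separates the main text from the remaining alternatives. The name split_result and the sample payload are made up for illustration; only the JSON shape comes from the table above.

```python
import json


def split_result(event_data: str) -> tuple[str, list[str]]:
    """Return (main text, other transcriptions) from an on_result payload."""
    payload = json.loads(event_data)
    alternates = payload.get("alternates", [])
    # The first entry repeats the main text, so only the rest are alternatives.
    others = [alt["text"] for alt in alternates[1:]]
    return payload["text"], others


# Hypothetical payload in the documented shape.
sample = json.dumps({
    "text": "hello",
    "final": True,
    "confidence": 0.95,
    "alternates": [
        {"text": "hello", "confidence": 0.95},
        {"text": "hallo", "confidence": 0.40},
    ],
})

main_text, others = split_result(sample)
print(main_text, others)
```

Inside an on_result handler this would be called as split_result(e.data). The list of alternatives may be empty on platforms or locales that report only one transcription.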
await initialize() -> bool
Initialize the speech recognizer and check availability. Must be called before listen(). Requests microphone permission on first call. Returns True if speech recognition is available.
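Because permanent errors require re-initialization, it can help to track whether initialize() needs to run again before the next listen(). The class below is one possible way to do that bookkeeping; RecognizerState and its methods are hypothetical names and not part of flet-stt.

```python
import json


class RecognizerState:
    """Tracks whether initialize() must run before the next listen()."""

    def __init__(self) -> None:
        # Nothing has been initialized yet.
        self.needs_init = True

    def mark_initialized(self, available: bool) -> None:
        # initialize() returns True when speech recognition is available.
        self.needs_init = not available

    def handle_error(self, event_data: str) -> str:
        """Record an on_error payload and return the error code."""
        payload = json.loads(event_data)
        if payload.get("permanent"):
            # Permanent errors require re-initialization.
            self.needs_init = True
        return payload["error"]


state = RecognizerState()
state.mark_initialized(True)
state.handle_error('{"error": "error_speech_timeout", "permanent": false}')
print(state.needs_init)
```

In an app, a start handler could check state.needs_init, call state.mark_initialized(await stt.initialize()) when it is set, and only then call listen().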
await listen(...)
| Parameter | Type | Default | Description |
|---|---|---|---|
| locale_id | str | "" | Locale ID (e.g. "en_US", "ro_RO"), in the form returned by locales(). Empty = system default. |
| listen_for_seconds | int | 0 | Max listen duration. 0 = platform default (~60s on Android). |
| pause_for_seconds | int | 0 | Auto-stop after this many seconds of silence. 0 = platform default. |
| partial_results | bool | True | Fire on_result for partial (non-final) results during recognition. |
| on_device | bool | True | Prefer on-device recognition when available. |
| cancel_on_error | bool | False | Cancel recognition on error instead of continuing. |
| sample_rate | int | 0 | Audio sample rate in Hz. 0 = platform default. 16000 is recommended for speech. |
| listen_mode | str | "confirmation" | One of "confirmation", "search", "dictation". |
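To keep listen() settings in one place, the parameters above could be grouped into a small options object. This is only a sketch: ListenOptions is a hypothetical helper, and the validation it performs is an assumption about what is sensible, not a description of what listen() itself checks.

```python
from dataclasses import asdict, dataclass

LISTEN_MODES = ("confirmation", "search", "dictation")


@dataclass(frozen=True)
class ListenOptions:
    """Mirrors the documented listen() parameters and their defaults."""

    locale_id: str = ""
    listen_for_seconds: int = 0
    pause_for_seconds: int = 0
    partial_results: bool = True
    on_device: bool = True
    cancel_on_error: bool = False
    sample_rate: int = 0
    listen_mode: str = "confirmation"

    def __post_init__(self) -> None:
        # Catch typos early instead of passing them to the recognizer.
        if self.listen_mode not in LISTEN_MODES:
            raise ValueError(f"listen_mode must be one of {LISTEN_MODES}")
        numeric = (self.listen_for_seconds, self.pause_for_seconds, self.sample_rate)
        if min(numeric) < 0:
            raise ValueError("durations and sample_rate must be >= 0")

    def to_kwargs(self) -> dict:
        """Return the options as keyword arguments for listen()."""
        return asdict(self)


DICTATION = ListenOptions(listen_mode="dictation", pause_for_seconds=3, sample_rate=16000)
print(DICTATION.to_kwargs())
```

A handler would then call await stt.listen(**DICTATION.to_kwargs()).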
await stop()
Stop listening and trigger the final recognition result.
await cancel()
Cancel listening without triggering a final result.
await locales() -> list[dict]
Get available locales. Returns list of {"id": "en_US", "name": "English (United States)"}.
await system_locale() -> dict
Get the system's default speech recognition locale. Returns {"id": "en_US", "name": "English (United States)"}.
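The two calls can be combined to choose a locale_id for listen(). The sketch below works on the returned dictionaries only; choose_locale and language_of are hypothetical helpers and the locale list is made-up sample data in the documented shape.

```python
from typing import Optional


def language_of(locale_id: str) -> str:
    """Return the language part of an ID such as "en_US" or "en-US"."""
    return locale_id.replace("-", "_").split("_")[0].lower()


def choose_locale(available: list, system: dict, language: str) -> Optional[str]:
    """Pick a locale ID for the given language, preferring the system locale."""
    matches = [loc["id"] for loc in available if language_of(loc["id"]) == language.lower()]
    if not matches:
        return None
    # Prefer the system locale when it already uses the requested language.
    if system["id"] in matches:
        return system["id"]
    return matches[0]


# Sample data standing in for the results of locales() and system_locale().
available = [
    {"id": "en_US", "name": "English (United States)"},
    {"id": "en_GB", "name": "English (United Kingdom)"},
    {"id": "ro_RO", "name": "Romanian (Romania)"},
]
system = {"id": "en_GB", "name": "English (United Kingdom)"}

print(choose_locale(available, system, "en"))
```

In an app the inputs would come from await stt.locales() and await stt.system_locale(), and a None result could fall back to the empty string so that listen() uses the system default.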
await is_listening() -> bool
Check whether the speech recognizer is currently listening.
await has_permission() -> bool
Check whether the app has microphone permission.
Building the APK
flet build apk -v
Or use build.py for automated build + deploy (dev-only, has hardcoded local paths):
python build.py
python build.py --skip-flet # reuse existing build dir
python build.py --skip-install # build only, don't deploy
On Windows, set PYTHONIOENCODING=utf-8 before building to avoid Unicode crashes from Rich's spinner characters.
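If the build is launched from a Python script rather than a shell, the variable can be set on the child process only. This is a sketch under that assumption; build_env and run_build are hypothetical helpers and are unrelated to the project's build.py.

```python
import os
import subprocess
from typing import Optional


def build_env(base: Optional[dict] = None) -> dict:
    """Return a copy of the environment with UTF-8 output forced."""
    env = dict(os.environ if base is None else base)
    env["PYTHONIOENCODING"] = "utf-8"
    return env


def run_build() -> None:
    """Run the documented build command with the adjusted environment."""
    subprocess.run(["flet", "build", "apk", "-v"], env=build_env(), check=True)
```

Calling run_build() leaves the parent shell's environment untouched, which avoids changing the encoding for unrelated tools.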
Installing on device
Always do a full uninstall before installing a new APK. Flet's serious_python caches the extracted Python environment and won't pick up code changes with adb install -r:
adb uninstall com.yourapp.package
adb install build/apk/app-release.apk
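The same two steps could be scripted so that the uninstall is never skipped. The helper names below are hypothetical, and the sketch assumes adb is on PATH; the package name and APK path are the placeholders from the commands above.

```python
import subprocess


def reinstall_commands(package: str, apk_path: str) -> list:
    """Build the adb commands for a clean reinstall, uninstall first."""
    return [
        ["adb", "uninstall", package],
        ["adb", "install", apk_path],
    ]


def reinstall(package: str, apk_path: str) -> None:
    """Uninstall the old app, then install the new APK."""
    uninstall, install = reinstall_commands(package, apk_path)
    # Uninstall may fail when the app is not installed yet; that is fine.
    subprocess.run(uninstall, check=False)
    # A failed install should stop the script.
    subprocess.run(install, check=True)


print(reinstall_commands("com.yourapp.package", "build/apk/app-release.apk"))
```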
Platform notes
Android
- Uses Google Speech Services, pre-installed on all phones with Google Play Services
- On-device recognition available for ~50 languages (no internet needed)
- Auto-stops after ~5s of silence or ~60s total - use the continuous example pattern to work around this
- Requires RECORD_AUDIO permission (declared in pyproject.toml, requested at runtime by initialize())
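One way to work around the auto-stop is to restart listening whenever a session ends while the user still wants recognition. The sketch below is an illustration of that idea and is not the code from examples/continuous.py. ContinuousListener is a hypothetical wrapper, FakeStt is a stand-in used only so the example runs without a device, and restarting on the "done" status with an async handler is an assumption about how the service behaves.

```python
import asyncio
import json
from types import SimpleNamespace


class ContinuousListener:
    """Restarts listen() each time the recognizer reports "done"."""

    def __init__(self, stt, **listen_kwargs):
        self.stt = stt
        self.listen_kwargs = listen_kwargs
        self.active = False

    async def start(self):
        self.active = True
        await self.stt.listen(**self.listen_kwargs)

    async def stop(self):
        # Clear the flag first so the final "done" does not restart listening.
        self.active = False
        await self.stt.stop()

    async def on_status(self, e):
        status = json.loads(e.data)["status"]
        if self.active and status == "done":
            await self.stt.listen(**self.listen_kwargs)


class FakeStt:
    """Stand-in for FletStt that only counts listen() calls."""

    def __init__(self):
        self.listen_calls = 0

    async def listen(self, **kwargs):
        self.listen_calls += 1

    async def stop(self):
        pass


async def demo() -> int:
    fake = FakeStt()
    listener = ContinuousListener(fake, listen_mode="dictation")
    await listener.start()
    # Simulated status events in the documented JSON shape.
    await listener.on_status(SimpleNamespace(data='{"status": "done"}'))
    await listener.on_status(SimpleNamespace(data='{"status": "listening"}'))
    await listener.stop()
    await listener.on_status(SimpleNamespace(data='{"status": "done"}'))
    return fake.listen_calls


print(asyncio.run(demo()))
```

With a real service the wiring would be stt.on_status = listener.on_status. Text recognized in each session still arrives through on_result, so the app has to join the final results itself.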
iOS
- Uses Apple SFSpeechRecognizer
- On-device recognition since iOS 15 for major languages
- Requires NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription in Info.plist
Desktop
Does nothing. The service instantiates without error but recognition won't work since there's no native plugin backing it.
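An app that also runs on desktop may want to hide or disable its speech controls there. A minimal sketch of such a check follows; stt_supported is a hypothetical helper, and passing it the lowercase platform name (for example from page.platform.value) is an assumption about how the platform is exposed.

```python
MOBILE_PLATFORMS = {"android", "ios"}


def stt_supported(platform_name: str) -> bool:
    """Return True only on the platforms backed by a native recognizer."""
    return platform_name.strip().lower() in MOBILE_PLATFORMS


for name in ("android", "ios", "windows"):
    print(name, stt_supported(name))
```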
How it works
your Python app
-> FletStt (ft.Service)
-> _invoke_method() over Flet protocol
-> SttService (FletService, Dart)
-> speech_to_text plugin
-> Android SpeechRecognizer / iOS SFSpeechRecognizer
The extension is packaged as a standard Python package with a flutter/ namespace directory containing the Dart code. When you run flet build apk, Flet discovers the Dart code in site-packages and includes it as a path dependency in the generated Flutter project.
License
MIT