A multilingual text and voice processing toolkit

LinguaLab

LinguaLab is a multilingual text and voice processing toolkit for language translation, speech recognition, and text processing. It translates text between languages with the Google Translate API, falling back to alternative services, and transcribes audio/video files with IBM Watson Speech-to-Text.

Features

Text Translation:
- Multi-language text translation using Google Translate API
- Automatic language detection
- Fallback to alternative translation services
- Support for bulk translations and nested text structures
- Configurable translation providers and parameters

Speech Recognition:
- Audio/video file transcription using IBM Watson Speech-to-Text
- Support for multiple audio formats (WAV, MP3, etc.)
- High-accuracy transcription with confidence scoring
- Batch processing capabilities
- Configurable transcription parameters

Language Processing:
- Comprehensive language detection
- Pronunciation assistance
- Confidence scoring for translations
- Error handling and fallback mechanisms

Defensive Programming:
- Automatic nested list flattening for text inputs
- Comprehensive parameter validation
- Enhanced error handling with detailed diagnostics
- Type safety with modern Python annotations

Installation

Prerequisites

Before installing, please ensure the following dependencies are available on your system:

External Tools (required for full functionality):
- Microphone access (for speech recognition features)
- Internet connection (for translation services)

Required Third-Party Libraries:

pip install numpy pandas SpeechRecognition googletrans gTTS

Or via Anaconda (recommended channel: conda-forge):

conda install -c conda-forge numpy pandas
pip install SpeechRecognition googletrans gTTS

Internal Package Dependencies:

pip install filewise paramlib
pip install pygenutils                    # Core functionality
pip install "pygenutils[arrow]"           # With arrow support (optional; quoted so shells such as zsh do not expand the brackets)

For regular users (from PyPI)

pip install lingualab

For contributors/developers (with latest Git versions)

# Install with development dependencies (includes latest Git versions)
pip install -e ".[dev]"

# Alternative: Use requirements-dev.txt for explicit Git dependencies
pip install -r requirements-dev.txt
pip install -e .

Benefits of this setup:

- Regular users: a single pip install lingualab installs all dependencies
- Developers: access to the latest Git versions for development and testing
- PyPI compatibility: all packages can be published without Git dependency issues

If you encounter import errors:

For PyPI users: The package should install all dependencies automatically. If you get import errors, try:
```
pip install --upgrade lingualab
```
For developers: Make sure you've installed the development dependencies:
```
pip install -e ".[dev]"
```
Common issues:
- Missing dependencies: PyPI installs pull in all dependencies automatically. For a development checkout, run pip install -e ".[dev]"
- Python version: Ensure you're using Python 3.10 or higher
- Speech recognition: Ensure microphone access is granted for speech features
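
The checks in this list can be scripted. The following is a minimal sketch, assuming the import names speech_recognition, googletrans, and gtts for the SpeechRecognition, googletrans, and gTTS distributions; check_environment is a hypothetical helper and not part of LinguaLab.

```python
import importlib.util
import sys

# Import names of the third-party libraries listed under Prerequisites.
REQUIRED_MODULES = ("numpy", "pandas", "speech_recognition", "googletrans", "gtts")


def check_environment(min_version=(3, 10), modules=REQUIRED_MODULES):
    """Report whether Python is recent enough and which modules cannot be found."""
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    return {
        "python_ok": sys.version_info >= min_version,
        "missing": missing,
    }


if __name__ == "__main__":
    print(check_environment())
```

An empty "missing" list together with "python_ok" set to True suggests the import errors come from somewhere else, for example a stale editable install.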

Verify Installation

To verify that your installation is working correctly:

try:
    import LinguaLab
    from filewise.file_operations.path_utils import find_files
    from pygenutils.arrays_and_lists.data_manipulation import flatten_list
    from paramlib.global_parameters import COMMON_DELIMITER_LIST
    
    print("✅ All imports successful!")
    print(f"✅ LinguaLab version: {LinguaLab.__version__}")
    print("✅ Installation is working correctly.")
    
except ImportError as e:
    print(f"❌ Import error: {e}")
    print("💡 For regular users: pip install lingualab")
    print("💡 For developers: pip install -e .[dev]")

Usage

Text Translation Example

from LinguaLab.text_translations import translate_string

# Translate a single phrase
result = translate_string(
    phrase_or_words="Hello, how are you?",
    lang_origin="en",
    lang_translation="es"
)
print(result.text)  # "Hola, ¿cómo estás?"

# Translate multiple phrases
phrases = ["Good morning", "Good afternoon", "Good evening"]
results = translate_string(
    phrase_or_words=phrases,
    lang_origin="en",
    lang_translation="fr"
)
for result in results:
    print(result.text)

# Handle nested lists automatically
nested_phrases = [
    ["Hello", "Goodbye"],
    ["Thank you", "Please"],
    "Welcome"
]
results = translate_string(
    phrase_or_words=nested_phrases,
    lang_origin="en",
    lang_translation="de"
)

Language Detection Example

from LinguaLab.text_translations import translate_string

# Detect language of text
detection = translate_string(
    phrase_or_words="Bonjour, comment allez-vous?",
    lang_origin="auto",
    procedure="detect",
    text_which_language_to_detect="Bonjour, comment allez-vous?"
)
print(f"Detected language: {detection.lang}")
print(f"Confidence: {detection.confidence}")
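
A caller may want to ignore low-confidence detections. The sketch below is illustrative only: the Detection dataclass is a stand-in for whatever object translate_string returns (only the .lang and .confidence attributes are taken from the example above), accept_language is a hypothetical helper, and the 0.8 threshold is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    """Stand-in for a detection result; only .lang and .confidence are assumed."""
    lang: str
    confidence: Optional[float]


def accept_language(detection, threshold=0.8):
    """Return the language code, or None if confidence is missing or below threshold."""
    if detection.confidence is None or detection.confidence < threshold:
        return None
    return detection.lang


print(accept_language(Detection("fr", 0.97)))  # fr
print(accept_language(Detection("fr", 0.42)))  # None
```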

Speech Recognition Example

from LinguaLab.transcribe_video_files import save_transcription_in_file

# Note: Requires IBM Watson API credentials
# Set up your API_KEY and SERVICE_ID in the module

# The module automatically processes WAV files in the specified directory
# and can save transcriptions to text files
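
One way to avoid hard-coding credentials is to read them from environment variables before assigning them to API_KEY and SERVICE_ID. This is a minimal sketch: the variable names IBM_WATSON_API_KEY and IBM_WATSON_SERVICE_ID and the helper load_watson_credentials are assumptions for illustration, not names defined by LinguaLab.

```python
import os


def load_watson_credentials(environ=os.environ):
    """Read the API key and service ID from the environment, failing loudly if absent."""
    # Hypothetical variable names; adapt them to your own deployment.
    names = ("IBM_WATSON_API_KEY", "IBM_WATSON_SERVICE_ID")
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    api_key, service_id = (environ[name] for name in names)
    return api_key, service_id
```

Passing a plain dictionary as environ makes the helper easy to test without touching the real environment.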

Project Structure

The package is organised as a focused language processing toolkit:

LinguaLab/
├── text_translations.py      # Text translation and language detection
├── transcribe_video_files.py # Speech recognition and transcription
├── __init__.py               # Package initialisation
└── README.md                 # Package documentation

Key Functions

`translate_string()`

Purpose: Translate text between languages using multiple translation services

Key Features:

- Supports single strings, lists, and nested lists of text
- Automatic fallback between translation services
- Language detection capabilities
- Configurable translation parameters
- Comprehensive error handling

Parameters:

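- phrase_or_words: Text to translate; a string, a list of strings, or a nested list
- lang_origin: Source language code (the detection example passes "auto")
- lang_translation: Target language code (default: "en")
- procedure: "translate" or "detect"
- text_which_language_to_detect: Text whose language is detected when procedure is "detect" (as in the Language Detection Example)
- provider: Translation service provider
- print_attributes: Whether to print detailed results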
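
The parameters above lend themselves to a few checks before any service is contacted. The following is a minimal sketch; validate_request and the small KNOWN_CODES set are hypothetical and do not reflect LinguaLab's internal validation.

```python
VALID_PROCEDURES = {"translate", "detect"}
# Illustrative subset only; real services support many more codes.
KNOWN_CODES = {"auto", "en", "es", "fr", "de"}


def validate_request(lang_origin, lang_translation="en", procedure="translate"):
    """Raise ValueError for an unknown procedure or language code."""
    if procedure not in VALID_PROCEDURES:
        raise ValueError(
            f"procedure must be one of {sorted(VALID_PROCEDURES)}, got {procedure!r}"
        )
    for label, code in (("lang_origin", lang_origin), ("lang_translation", lang_translation)):
        if code not in KNOWN_CODES:
            raise ValueError(f"{label}: unknown language code {code!r}")


validate_request("en", "es")                  # passes silently
validate_request("auto", procedure="detect")  # passes silently
```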

`save_transcription_in_file()`

Purpose: Save speech transcription results to text files

Key Features:

- Automatic file extension handling
- Progress reporting
- Error handling and validation
- Flexible output formatting
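
Automatic file extension handling could look like the sketch below. write_transcript is a hypothetical helper and not the signature of save_transcription_in_file(); it derives the output name from the audio file name and accepts the extension with or without a leading dot.

```python
import tempfile
from pathlib import Path


def write_transcript(audio_path, text, ext="txt"):
    """Write text next to audio_path, replacing the audio suffix with ext."""
    suffix = "." + ext.lstrip(".")  # accept both "txt" and ".txt"
    out_path = Path(audio_path).with_suffix(suffix)
    out_path.write_text(text, encoding="utf-8")
    return out_path


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    saved = write_transcript(Path(tmp) / "interview.wav", "Hello world", ext=".txt")
    print(saved.name)  # interview.txt
```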

Advanced Features

Defensive Programming

- Nested List Support: Automatically flattens complex nested text structures
- Parameter Validation: Comprehensive input validation with detailed error messages
- Type Safety: Modern Python type annotations (PEP 604 union syntax, for example str | None) for better IDE support
- Error Handling: Detailed error reporting for debugging
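
To illustrate what flattening a nested text structure means, here is a minimal recursive sketch. flatten_texts is a hypothetical function written for this example; the package itself relies on flatten_list from pygenutils, whose behaviour may differ.

```python
def flatten_texts(items):
    """Recursively flatten nested lists or tuples of strings into one flat list."""
    if isinstance(items, str):
        return [items]
    if not isinstance(items, (list, tuple)):
        raise TypeError(f"expected str, list or tuple, got {type(items).__name__}")
    flat = []
    for item in items:
        flat.extend(flatten_texts(item))
    return flat


nested = [["Hello", "Goodbye"], ["Thank you", "Please"], "Welcome"]
print(flatten_texts(nested))
# ['Hello', 'Goodbye', 'Thank you', 'Please', 'Welcome']
```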

Service Integration

- Google Translate: Primary translation service with automatic fallback
- IBM Watson: Speech-to-text transcription service
- Alternative Services: Support for multiple translation providers
- Connection Management: Robust handling of service availability
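
The fallback idea can be sketched generically: try each provider in order and return the first result that succeeds. Everything named here (translate_with_fallback and the two stub providers) is hypothetical and stands in for real service clients; it is not how LinguaLab implements its fallback.

```python
def translate_with_fallback(text, providers):
    """Try each (name, callable) provider in order; return the first success."""
    errors = []
    for name, translate in providers:
        try:
            return name, translate(text)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All providers failed: " + "; ".join(errors))


def broken_provider(text):
    """Stub that simulates an unreachable service."""
    raise ConnectionError("service unavailable")


def echo_provider(text):
    """Stub standing in for a real translation call."""
    return text.upper()


providers = [("primary", broken_provider), ("backup", echo_provider)]
print(translate_with_fallback("hola", providers))
# ('backup', 'HOLA')
```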

Performance Optimisation

- Batch Processing: Efficient handling of multiple texts
- Service Fallback: Automatic switching between translation services
- Resource Management: Proper cleanup and memory management
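
When a service limits request size, a caller can split a long list of texts into batches before translating. A minimal sketch, assuming a hypothetical helper chunk_texts and an arbitrary batch size; LinguaLab's own batching may work differently.

```python
def chunk_texts(items, size):
    """Yield consecutive slices of items, each at most size elements long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


phrases = ["Good morning", "Good afternoon", "Good evening", "Good night", "Hello"]
for batch in chunk_texts(phrases, 2):
    print(batch)
```

Each batch could then be passed as phrase_or_words in a separate translate_string call.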

Supported Languages

Translation Services

- Google Translate: 100+ languages supported
- Microsoft Translator: Enterprise-grade translation
- MyMemory: Free translation service
- LibreTranslate: Open-source translation

Speech Recognition

- IBM Watson: 20+ languages supported
- Multiple Audio Formats: WAV, MP3, FLAC, etc.
- Real-time Processing: Stream-based transcription

Version Information

Current version: 3.6.1

Recent updates

- 3.6.x: NumPy / pandas ≥2.2.3 packaging; aligned minimum versions of filewise, pygenutils, and paramlib; requests, beautifulsoup4, and lxml added to pyproject.toml; conda post-link step for the PyPI-only speech/translation dependencies (see changelog).
- 3.5.x: Type-hint and docstring standardisation, defensive list handling, and broader NLP workflow polish (see changelog for detail).

Error Handling

The package signals problems with the following exception types:

- ValueError: For invalid language codes or parameters
- RuntimeError: For service connection issues
- AttributeError: For service availability problems
- SyntaxError: For malformed input parameters
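
Calling code can map these exception types to short diagnostics. The sketch below uses a hypothetical wrapper, call_with_diagnostics, and a stub fake_translate in place of a real LinguaLab call, so it runs without network access.

```python
def call_with_diagnostics(func, *args, **kwargs):
    """Call func and return (result, None) or (None, diagnostic message)."""
    try:
        return func(*args, **kwargs), None
    except (ValueError, SyntaxError) as exc:
        return None, f"invalid input: {exc}"
    except (RuntimeError, AttributeError) as exc:
        return None, f"service problem: {exc}"


def fake_translate(text, lang_translation="en"):
    """Stub for a translation call; rejects anything but two-letter codes."""
    if len(lang_translation) != 2:
        raise ValueError(f"unknown language code {lang_translation!r}")
    return f"[{lang_translation}] {text}"


print(call_with_diagnostics(fake_translate, "Hello", lang_translation="es"))
print(call_with_diagnostics(fake_translate, "Hello", lang_translation="spanish"))
```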

System Requirements

- Python: 3.10 or higher
- Internet Connection: Required for translation and speech services
- Memory: Sufficient RAM for processing large text batches
- Storage: Space for transcription output files

Dependencies

Core Dependencies

- SpeechRecognition: Speech recognition capabilities
- googletrans: Google Translate integration
- gTTS: Google Text-to-Speech (if needed)

Internal Dependencies

- filewise: File operations and path utilities
- pygenutils: Utility functions and data manipulation
- paramlib: Parameter and configuration management

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Guidelines

- Follow existing code structure and language processing best practices
- Add comprehensive docstrings with parameter descriptions
- Include error handling for all service operations
- Test with various languages and text formats
- Update changelog for significant changes

License

This project is licensed under the MIT License; see the LICENSE file in the repository for details.

Acknowledgments

- Google Translate Team for the translation API
- IBM Watson Team for speech recognition services
- Python NLP Community for ecosystem development
- Open Source Translation Providers for free services

Contact

For any questions or suggestions, please open an issue on GitHub or contact the maintainers.

Troubleshooting

Common Issues

Translation Service Errors:
- Check internet connection
- Verify language codes are valid
- Try alternative translation providers

Speech Recognition Issues:
- Ensure IBM Watson credentials are set
- Check audio file format compatibility
- Verify API service availability

Import Errors:
- Run pip install -e . for development setup
- Check Python version compatibility
- Verify all dependencies are installed

Getting Help

- Check function docstrings for parameter details
- Review service provider documentation
- Open an issue on GitHub for bugs or feature requests

Release history

- 3.6.1 (this version): Apr 3, 2026
- 3.5.10: Aug 19, 2025
- 3.5.9: Jul 28, 2025
- 3.5.8: Jul 17, 2025
- 3.5.6: Jul 4, 2025
- 3.5.5: Jul 4, 2025
- 3.5.3: Jun 27, 2025
- 3.5.2: Jun 27, 2025
- 3.5.0: Jun 24, 2025
- 3.4.3: May 5, 2025
- 3.4.2: May 1, 2025

Download files

Source Distribution

lingualab-3.6.1.tar.gz (14.5 kB), uploaded Apr 3, 2026

Built Distribution

lingualab-3.6.1-py3-none-any.whl (14.6 kB, Python 3), uploaded Apr 3, 2026

File details

Details for the file lingualab-3.6.1.tar.gz.

File metadata

Download URL: lingualab-3.6.1.tar.gz
Upload date: Apr 3, 2026
Size: 14.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for lingualab-3.6.1.tar.gz
Algorithm	Hash digest
SHA256	`1c50e6b324abd073ee32eb76e7b88b02b14ac671edefabf0c21cabe7b4eb2549`
MD5	`b1f786a579ef803c8606ec9e5c9dc97d`
BLAKE2b-256	`9eac05691426e3c1600ac55a0bcae5af818be176cd2d02a2c9cb06b94c04a54a`

File details

Details for the file lingualab-3.6.1-py3-none-any.whl.

File metadata

Download URL: lingualab-3.6.1-py3-none-any.whl
Upload date: Apr 3, 2026
Size: 14.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for lingualab-3.6.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`12e085b9b951c98511df69bf431d2c6ce0a4e2f6d788e13b2103251db915d6de`
MD5	`647b7c1df0f5175f532a0725b7e93542`
BLAKE2b-256	`d5060754a72c4df2089c5761d2c1d4421c3ffbefa29dac424d59bd055fccf852`
