AI Voice Detection System
VoiceAuth - AI Voice Detection System
A desktop application that uses machine learning to distinguish AI-generated voices from human voices in audio samples. VoiceAuth extracts audio features from each sample and classifies it with a logistic regression model.
Features
- ✅ Audio Processing: Extract features from various audio formats (WAV, MP3, OGG, FLAC)
- 🧠 Machine Learning: Logistic regression model with hyperparameter optimization
- 🔍 Real-time Classification: Analyze and classify voice samples instantly
- ⚡ Performance: Hardware acceleration through parallel processing
- 📊 Visualization: Feature importance and model performance metrics
- 🔄 Feedback System: Continuously improve model accuracy with user feedback
- 💾 Cross-Platform Support: Works on Windows, macOS, and Linux
Screenshots
[Screenshots would be included here]
Installation
Option 1: Install as a Package (Recommended)
VoiceAuth is now available as an installable Python package:
# Install directly from PyPI
pip install voiceauth
# Or install from GitHub
pip install git+https://github.com/zohaiblazuli/VoiceAuth.git
After installation, you can start the application by simply running:
voiceauth
For detailed installation instructions, see the INSTALL.md file.
Option 2: Manual Setup
Prerequisites
- Python 3.8 or higher
Setup
- Clone the repository:
git clone https://github.com/zohaiblazuli/VoiceAuth.git
cd VoiceAuth
- Install dependencies:
pip install -r requirements.txt
- Run the application:
python run_voiceauth.py
Usage
Running the Application
VoiceAuth features a user-friendly interface with several tabs:
- Import Sample Tab: Import audio files for classification
- Record Sample Tab: Record your voice directly for classification
- Feedback Tab: Provide feedback on classification results to improve the model
- Information Tab: View model statistics and performance metrics
Sample Dataset Structure
For training the model, organize your dataset as follows:
dataset_folder/
├── ai_generated/    (folder containing AI-generated voice samples)
│   ├── sample1.wav
│   ├── sample2.wav
│   └── ...
└── human/           (folder containing human voice samples)
    ├── sample1.wav
    ├── sample2.wav
    └── ...
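As a rough illustration of how this layout maps to training labels, the sketch below walks the two class folders and pairs each audio file with a label. The function name `list_samples`, the label encoding (1 for AI-generated, 0 for human), and the extension filter are assumptions made for this example, not VoiceAuth's actual loader.

```python
from pathlib import Path

# Assumed label encoding and extension filter; VoiceAuth's own choices may differ.
LABELS = {"ai_generated": 1, "human": 0}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".ogg", ".flac"}


def list_samples(dataset_dir):
    """Return (path, label) pairs for every audio file in the two class folders."""
    root = Path(dataset_dir)
    samples = []
    for folder, label in LABELS.items():
        class_dir = root / folder
        if not class_dir.is_dir():
            raise FileNotFoundError(f"Missing class folder: {class_dir}")
        # Sort for a reproducible order; skip anything that is not an audio file.
        for path in sorted(class_dir.iterdir()):
            if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
                samples.append((path, label))
    return samples
```

Calling `list_samples("dataset_folder")` on the tree above would yield the AI-generated files first, then the human ones, each tagged with its label.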
Technical Details
Path Management System
VoiceAuth uses a robust path management system to ensure compatibility across different environments, particularly when shared via GitHub. The system:
- Automatically determines the base directory: Works whether running from source, as a compiled executable, or from any relative directory
- Standardizes path access: All file paths are accessed through utility functions, not hardcoded strings
- Creates necessary directories: Output and model directories are automatically created if they don't exist
- Cross-platform compatibility: Paths are normalized for the operating system in use
Key path utility functions:
- get_base_dir(): Gets the base application directory
- get_resource_path(relative_path): Gets the absolute path to any resource
- get_media_path(filename): Gets the path to media files
- get_output_path(filename): Gets the path to output files or directories
- get_model_path(filename): Gets the path to model files
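To make the idea concrete, here is a minimal sketch of how helpers with these names could be written. The bodies are assumptions for illustration only: the `sys.frozen` check follows the convention used by bundlers such as PyInstaller, and the `media` and `output` folder names are taken from the project structure described in this README. The real implementation in utils.py may differ.

```python
import os
import sys


def get_base_dir():
    """Return the directory the application is running from."""
    if getattr(sys, "frozen", False):
        # Compiled executable: resources are assumed to sit next to the binary.
        return os.path.dirname(os.path.abspath(sys.executable))
    # Running from source: use this file's location, falling back to the
    # working directory in interactive sessions where __file__ is undefined.
    module_file = globals().get("__file__")
    if module_file:
        return os.path.dirname(os.path.abspath(module_file))
    return os.getcwd()


def get_resource_path(relative_path):
    """Return a normalized absolute path for a path relative to the base directory."""
    return os.path.normpath(os.path.join(get_base_dir(), relative_path))


def get_media_path(filename):
    """Return the path of a file inside the media directory."""
    return get_resource_path(os.path.join("media", filename))


def get_output_path(filename=""):
    """Return a path inside the output directory, creating the directory if needed."""
    output_dir = get_resource_path("output")
    os.makedirs(output_dir, exist_ok=True)
    return os.path.join(output_dir, filename) if filename else output_dir
```

Routing every lookup through `get_resource_path` keeps the base-directory decision in one place, so the rest of the code never needs a hardcoded path.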
Feature Extraction
VoiceAuth extracts various audio features using the librosa library:
- MFCCs (Mel-Frequency Cepstral Coefficients)
- Spectral Centroid, Contrast, Rolloff
- Zero Crossing Rate
- Chroma Features
- Spectral Bandwidth
- Tempo and Beat Features
- Mel Spectrogram
Machine Learning Model
- Preprocessing: Standard scaling for feature normalization
- Feature Selection: Optional, using scikit-learn's SelectFromModel
- Model: Logistic regression with hyperparameter tuning
- Evaluation: Uses accuracy, precision, recall, and F1-score
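The four steps above fit naturally into a scikit-learn pipeline. The sketch below shows one possible arrangement; the function name `train_classifier`, the grid of `C` values, the 75/25 split, and the use of `GridSearchCV` for tuning are assumptions for illustration, not necessarily what simple_model.py does.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def train_classifier(X, y, use_feature_selection=True, random_state=0):
    """Fit a scaled, optionally feature-selected logistic regression and score it."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )
    selector = (
        SelectFromModel(LogisticRegression(max_iter=1000))
        if use_feature_selection
        else "passthrough"  # skip the selection step entirely
    )
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("select", selector),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    # Assumed search space: only the regularization strength is tuned here.
    search = GridSearchCV(pipeline, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, cv=5, scoring="f1")
    search.fit(X_train, y_train)

    # Evaluate the refit best model on the held-out split.
    y_pred = search.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="binary"
    )
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
    return search.best_estimator_, metrics
```

Keeping the scaler and selector inside the pipeline means they are refit on each cross-validation fold, which avoids leaking statistics from the validation data into training.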
Feedback System
The application includes an adaptive learning system that:
- Collects user feedback on classification results
- Stores correctly labeled samples
- Automatically retrains the model when sufficient feedback data is collected
- Updates the model in real-time
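The collect, store, and retrain loop can be sketched as a small buffer that fires a callback once enough feedback has accumulated. Everything here is hypothetical: the class name `FeedbackBuffer`, the default threshold of 20 samples, and the binary label convention (1 for AI-generated, 0 for human) are assumptions, since this README does not state the actual values.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class FeedbackBuffer:
    """Collect user-corrected samples and trigger retraining at a threshold."""

    retrain: Callable[[List[Tuple[str, int]]], None]  # hypothetical retraining hook
    threshold: int = 20  # assumed value; the real threshold is not documented here
    pending: List[Tuple[str, int]] = field(default_factory=list)

    def add(self, sample_path, predicted_label, prediction_was_correct):
        """Store the sample with its true label; return True if retraining ran."""
        # With binary labels, a wrong prediction means the true label is the other one.
        true_label = predicted_label if prediction_was_correct else 1 - predicted_label
        self.pending.append((sample_path, true_label))
        if len(self.pending) >= self.threshold:
            # Hand a copy to the callback, then start a fresh batch.
            self.retrain(list(self.pending))
            self.pending.clear()
            return True
        return False
```

In use, `retrain` would be whatever function refits the model on the original dataset plus the corrected samples; passing it in as a callback keeps the buffer independent of the model code.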
Development
Project Structure
- voiceauth.py: Main application module
- ui_components.py: UI components and widgets
- simple_model.py: Machine learning model implementation
- audio_processor.py: Audio processing and feature extraction
- batch_process.py: Batch processing of audio samples
- tabs.py: Implementation of application tabs
- utils.py: Utility functions, including path management
- media/: Contains graphics and media assets
- output/: Contains generated files (features, models)
Extending the Application
To add new features:
- For new UI components, add them to ui_components.py
- For new tabs, extend the functionality in tabs.py
- For new model features, modify simple_model.py
- For additional audio processing, update audio_processor.py
Troubleshooting
Common Issues
- Media files not found: If you see errors about missing media files, ensure the media directory is in the same location as the application.
- Model loading errors: Make sure the model has been trained and the appropriate model files exist in the output directory.
- Audio recording issues: Check your microphone permissions and settings.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgements
- librosa for audio feature extraction
- scikit-learn for machine learning components
- PyQt5 for the GUI framework