Noise reduction for audio and video files using DeepFilterNet
DeepFilter Multimedia
Remove noise from audio and video files using DeepFilterNet. This CLI tool (dfm) provides a simple interface for applying state-of-the-art deep learning-based noise reduction to multimedia files.
Features
- Audio Support: WAV, MP3, FLAC, OGG, M4A, AAC, WMA
- Video Support: MP4, MKV, AVI, MOV, WebM, FLV, WMV, M4V
- Easy CLI: Simple command-line interface (dfm)
- Auto-detection: Automatically detects file type
- Batch Processing: Process multiple files at once
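The format lists above hint at how auto-detection could work. The following is a minimal sketch, assuming detection relies only on the file extension. The `classify_media` helper and its return values are hypothetical and are not part of the `dfm` API; the real tool may inspect files differently.

```python
from pathlib import Path

# Extensions copied from the feature list above.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".m4a", ".aac", ".wma"}
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".avi", ".mov", ".webm", ".flv", ".wmv", ".m4v"}


def classify_media(path: str) -> str:
    """Return "audio", "video", or "unknown" based on the file extension.

    Hypothetical helper for illustration only.
    """
    suffix = Path(path).suffix.lower()  # case-insensitive match
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return "unknown"


if __name__ == "__main__":
    for name in ["talk.MP4", "podcast.mp3", "notes.txt"]:
        print(name, "->", classify_media(name))
```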
Installation
From PyPI
pip install deepfilter-multimedia
From Source
# Clone the repository
git clone https://github.com/svemyh/deepfilter-multimedia.git
cd deepfilter-multimedia
# Install the package and its dependencies in editable mode
pip install -e .
Requirements
- Python 3.8+
- PyTorch 1.9+
- FFmpeg (for video processing)
- DeepFilterNet
Install FFmpeg:
# Ubuntu/Debian
sudo apt install ffmpeg
# macOS
brew install ffmpeg
# Windows
# Download from https://ffmpeg.org/download.html
Usage
Basic Usage
Process a single audio or video file:
dfm input.mp4
The enhanced file will be saved to output/input_enhanced.mp4.
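The default naming rule can be sketched in a few lines. This assumes the output name is always the input stem plus `_enhanced` and the original extension, placed in `output/`, as the example above suggests. The `default_output_path` helper is hypothetical and not taken from the package.

```python
from pathlib import Path


def default_output_path(input_path: str, output_dir: str = "output") -> Path:
    """Build <output_dir>/<stem>_enhanced<suffix> for a given input file.

    Hypothetical helper that mirrors the naming described above.
    """
    source = Path(input_path)
    return Path(output_dir) / f"{source.stem}_enhanced{source.suffix}"


if __name__ == "__main__":
    print(default_output_path("input.mp4").as_posix())
```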
Specify Output Path
dfm input.mp4 -o output/clean.mp4
Process Multiple Files
dfm video1.mp4 video2.mkv audio1.wav
All files will be saved to the output/ directory.
Quiet Mode
Disable progress messages:
dfm input.mp4 -q
Help
dfm --help
Examples
Clean noisy interview recording
dfm noisy_interview.mp4 -o clean_interview.mp4
Process podcast audio
dfm podcast_episode.mp3
Batch process multiple videos
dfm video1.mkv video2.mp4 video3.avi
Use as Python module
from deepfilter_multimedia.core import process_file
# Process a file
output_path = process_file("noisy_video.mp4", "clean_video.mp4")
print(f"Enhanced video saved to: {output_path}")
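For batch work from Python, one option is a small wrapper that calls a processing function once per file and keeps going when a file fails. This is a sketch, not part of the package: `enhance_many` is a hypothetical helper, and it assumes the processing function takes an input path and an output path and returns the output path, as `process_file` does in the example above.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def enhance_many(
    inputs: Iterable[str],
    process: Callable[[str, str], str],
    output_dir: str = "output",
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Run process(input, output) on each file and collect the results.

    Returns (written_paths, failures), where failures holds
    (input path, error message) pairs.
    """
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[str] = []
    failures: List[Tuple[str, str]] = []
    for item in inputs:
        source = Path(item)
        target = out_dir / f"{source.stem}_enhanced{source.suffix}"
        try:
            written.append(str(process(str(source), str(target))))
        except Exception as exc:
            # Record the error and continue so one bad file does not stop the batch.
            failures.append((item, str(exc)))
    return written, failures
```

With the package installed, `enhance_many(["video1.mkv", "audio1.wav"], process_file)` would be the intended call; any callable with the same shape also works, which makes the wrapper easy to test without running the model.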
How It Works
- For Videos:
  - Extracts audio track (48kHz, stereo)
  - Applies DeepFilterNet noise reduction
  - Reassembles video with enhanced audio
  - Keeps original video quality
- For Audio:
  - Loads audio file
  - Applies DeepFilterNet noise reduction
  - Saves enhanced audio
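The extract and reassemble steps for video can be approximated with two FFmpeg invocations. The sketch below only builds the argument lists. It assumes the audio is written as 16-bit PCM WAV and that the video stream is copied without re-encoding, which is one way to keep the original video quality. The exact commands `dfm` runs are not documented here, and the function names are hypothetical.

```python
from typing import List


def extract_audio_cmd(video: str, wav_out: str) -> List[str]:
    """FFmpeg arguments that write the audio track as 48 kHz stereo PCM WAV."""
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-vn",                # ignore the video stream
        "-ac", "2",           # stereo
        "-ar", "48000",       # 48 kHz sample rate
        "-c:a", "pcm_s16le",  # uncompressed 16-bit PCM (assumed intermediate format)
        wav_out,
    ]


def remux_cmd(video: str, enhanced_wav: str, out: str) -> List[str]:
    """FFmpeg arguments that pair the original video with the enhanced audio."""
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-i", enhanced_wav,
        "-map", "0:v:0",      # video stream from the first input
        "-map", "1:a:0",      # audio stream from the second input
        "-c:v", "copy",       # copy video as-is, no re-encoding
        "-shortest",          # stop at the shorter of the two streams
        out,                  # audio codec is left to FFmpeg's container default
    ]


if __name__ == "__main__":
    print(" ".join(extract_audio_cmd("noisy.mp4", "audio.wav")))
    print(" ".join(remux_cmd("noisy.mp4", "audio_enhanced.wav", "clean.mp4")))
```

Each list can be passed to `subprocess.run(cmd, check=True)`; the noise reduction step would run on the WAV file between the two commands.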
Important Notes
Sample Rate: DeepFilterNet is optimized for 48kHz audio. Input files at other sample rates (16kHz, 44.1kHz, etc.) will be automatically resampled to 48kHz before processing. Output files are saved at 48kHz.
- Quality: The model works best with 48kHz audio as it was trained on this sample rate
- Upsampling: Files below 48kHz (e.g., 16kHz phone recordings) will be upsampled - results may vary
- Downsampling: Files above 48kHz (e.g., 96kHz studio recordings) will be downsampled - some high-frequency information may be lost
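The effect of resampling on sample counts and audio bandwidth follows from simple arithmetic: a signal sampled at rate R can only carry frequencies up to R/2, so the usable bandwidth after conversion is limited by the lower of the two rates. The `resample_info` helper below is a hypothetical illustration of that arithmetic and does not resample any audio.

```python
from fractions import Fraction

TARGET_RATE = 48_000  # Hz, the processing rate stated above


def resample_info(source_rate: int, duration_s: float = 1.0) -> dict:
    """Summarize what converting source_rate to 48 kHz implies."""
    return {
        # Exact resampling ratio (output samples per input sample).
        "ratio": Fraction(TARGET_RATE, source_rate),
        "samples_in": round(source_rate * duration_s),
        "samples_out": round(TARGET_RATE * duration_s),
        # Nyquist limit of the lower rate bounds the content that survives.
        "usable_bandwidth_hz": min(source_rate, TARGET_RATE) // 2,
    }


if __name__ == "__main__":
    for rate in (16_000, 44_100, 96_000):
        print(rate, resample_info(rate))
```

For a 16kHz recording the ratio is 3 and the usable bandwidth stays at 8kHz, so upsampling adds samples but no new high-frequency content. For a 96kHz recording the bandwidth drops from 48kHz to 24kHz.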
Model Download: On first run, DeepFilterNet will download the pretrained model (~50MB). This may take a few moments.
Attribution
This project is based on DeepFilterNet by Hendrik Schröter et al.
If you use this tool, please cite the original DeepFilterNet paper:
@inproceedings{schroeter2022deepfilternet,
title={{DeepFilterNet}: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering},
author={Schröter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
booktitle={ICASSP 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year={2022},
organization={IEEE}
}
License
This project is licensed under the MIT License - see the LICENSE file for details.
DeepFilterNet is dual-licensed under MIT and Apache 2.0 licenses.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Troubleshooting
Sample Rate Issues
Q: My audio is 44.1kHz (CD quality). Will it work?
A: Yes. It will be automatically resampled to 48kHz. Quality loss should be negligible because the sample rate changes only slightly (44.1kHz to 48kHz).
Q: My recording is 16kHz (phone/voice). Will noise reduction work?
A: Yes, but results may vary. The model is trained on 48kHz, so upsampling from 16kHz may not capture all the detail the model expects. Try it and see - many users report good results even with lower sample rates.
Q: Why is my output always 48kHz?
A: DeepFilterNet is specifically trained on 48kHz audio and cannot operate at other sample rates. This is a fundamental limitation of the model architecture.
FFmpeg Issues
FFmpeg not found:
Make sure FFmpeg is installed and available in your PATH:
ffmpeg -version
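The same check can be done from Python before starting a long batch. This is a small sketch using only the standard library; the helper names are hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line() -> Optional[str]:
    """Return the first line of `ffmpeg -version`, or None if FFmpeg is unavailable."""
    path = find_ffmpeg()
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = ffmpeg_version_line()
    print(version if version else "FFmpeg not found on PATH")
```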
CUDA/GPU Support
For GPU acceleration, install a PyTorch build with CUDA support that matches your CUDA version (the example below uses the CUDA 11.8 wheels):
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118
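To confirm that the installed PyTorch build can see a GPU, a quick check like the one below may help. It uses `torch.cuda.is_available()` and `torch.cuda.get_device_name()`. The `torch_device_summary` helper is hypothetical, and whether `dfm` picks up the GPU automatically is an assumption not confirmed here.

```python
def torch_device_summary() -> str:
    """Describe whether PyTorch is installed and whether it can use CUDA."""
    try:
        import torch  # imported lazily so the check works without PyTorch
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "CUDA not available, PyTorch will run on CPU"


if __name__ == "__main__":
    print(torch_device_summary())
```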