A CLI tool that produces frequency animations for audio files with FFMPEG.
Frequensee
A command line interface (CLI) tool that creates animations based on audio frequencies by using the Fast Fourier Transform (FFT) of the audio data. Frequensee uses FFMPEG to create the animations, so FFMPEG either needs to be installed or the ffmpeg.exe executable needs to be present in the working directory.
| Bar animation |
|---|
| FFT animation |
|---|
Contents
- Installation
- Dependencies
- How to use
- Supported file formats
- Common output formats: Advantages and Disadvantages
- Command line options
Installation
Frequensee is released as a package on PyPI, so you can install it with pip using the following command:
pip install frequensee
or by downloading the source code from the GitHub repository.
Dependencies
Frequensee has 3 major dependencies:
- Soundfile: Used to read the data of the audio file and return it as a numpy array.
- Matplotlib: Used to create the frames of the animation based on the data above.
- FFMPEG: Used to combine the frames and produce the animation file.
The first two are installed automatically when installing frequensee through PyPI. You can also install them with pip using the requirements.txt file provided.
As for FFMPEG, you can either compile the source code yourself or use an already compiled version. Personally, I use the executable provided by yt-dlp. The installation process is not too straightforward, but there are several tutorials available online. In any case, you can also place the FFMPEG executable file in the current working directory and it should work the same.
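Before running frequensee, it can help to confirm that FFMPEG is reachable in one of the two ways described above. The following is only a minimal sketch using the Python standard library; the `find_ffmpeg` helper is hypothetical and not part of frequensee, and frequensee's own lookup may behave differently.

```python
import shutil
from pathlib import Path


def find_ffmpeg(working_dir="."):
    """Return the path to an FFMPEG executable, or None if none is found.

    Hypothetical helper for illustration, not part of frequensee.
    """
    # First look for an installed ffmpeg on the system PATH.
    on_path = shutil.which("ffmpeg")
    if on_path:
        return on_path
    # Otherwise look for an executable placed in the working directory.
    for name in ("ffmpeg.exe", "ffmpeg"):
        candidate = Path(working_dir) / name
        if candidate.is_file():
            return str(candidate)
    return None


print(find_ffmpeg() or "FFMPEG not found")
```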
How to use
CLI version
If you installed through PyPI, the easiest way to run frequensee is with either of the following commands:
fqc -i "input file path" -o "output file path"
frequensee -i "input file path" -o "output file path"
Examples:
fqc -i "input.wav" -o "videos/output.gif"
frequensee -i "audio/input.mp3" -o "output.mp4"
If you installed from source, you can run the main.py file instead (remember to install the dependencies in requirements.txt first):
python main.py -i "input file path" -o "output file path"
or copy the code inside the main.py file to your desired python file.
Additional command line options can be found in the command line options section.
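If you want to call the CLI from a script, for example to process several files, one option is to build the argument list in Python and pass it to `subprocess.run`. This is only a sketch: the `build_command` helper is hypothetical, and it uses only the `-i`, `-o`, `-r`, `-b` and `-fft` flags documented in the command line options section.

```python
import subprocess


def build_command(input_path, output_path, framerate=None, bars=None, animate_fft=False):
    """Build the argument list for the fqc command (hypothetical helper)."""
    command = ["fqc", "-i", input_path, "-o", output_path]
    if framerate is not None:
        command += ["-r", str(framerate)]
    if bars is not None:
        command += ["-b", str(bars)]
    if animate_fft:
        # Animate the raw fft over time instead of the bars.
        command.append("-fft")
    return command


command = build_command("audio/input.mp3", "output.mp4", framerate=30, bars=24)
print(command)

# To actually run it (requires frequensee and FFMPEG to be installed):
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file paths that contain spaces.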
Importing from source
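Instead of running the CLI version, it is possible to import the components of frequensee from the source code, create them, and customize them. For example:
from frequensee.config import Config
from frequensee.audio_viz import AudioVisualizer

# Create a visualizer with the default configuration.
config = Config()
viz = AudioVisualizer(config)

# Load an audio file and create the bar animation.
viz.load("path to audio file")
viz.create_bar_animation("path to output file")

# Load another audio file and switch to a different configuration.
viz.load("path to other audio file")
config2 = Config()  # pass your own options here, see the Config class documentation
viz.load_config(config2)

# Create the FFT animation, or export the bar data as json.
viz.create_fft_animation("path to output file")
viz.extract_json("path to output file")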
The Config class contains values required for the AudioVisualizer instance to create the animations/export the data. Please refer to the class documentation if you wish to configure the default values.
The load method will read the audio data from the audio file in the provided filepath.
The load_config method can be used to update the configuration of the AudioVisualizer instance after it has been initialized.
There are 2 main AudioVisualizer methods used to produce the animations:
- create_bar_animation: Creates an animation of bar shaped gradient images based on the relative amplitude. The frequency range is not visible in the resulting animation and the result is purely aesthetic.
- create_fft_animation: Creates an animation of the relative amplitude over frequency, over time, as it was created from the Fast Fourier Transform (FFT) of the input audio data.
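To illustrate what "relative amplitude over frequency" means for a single window of audio, here is a rough sketch that is not taken from the frequensee source. It uses a naive discrete Fourier transform from the standard library, which gives the same values as an FFT but much more slowly; in practice a function such as `numpy.fft.rfft` would be used. The `relative_amplitudes` name and the test signal are assumptions made for the example.

```python
import cmath
import math


def relative_amplitudes(samples, sample_rate):
    """Return (frequencies, relative amplitudes) for one window of samples.

    Naive DFT for illustration only, not frequensee's implementation.
    """
    n = len(samples)
    half = n // 2 + 1  # real input: only non-negative frequencies are needed
    amplitudes = []
    for k in range(half):
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        amplitudes.append(abs(total))
    peak = max(amplitudes) or 1.0  # avoid dividing by zero for silence
    freqs = [k * sample_rate / n for k in range(half)]
    return freqs, [a / peak for a in amplitudes]


# One second of an 8 Hz sine wave sampled at 64 Hz.
sample_rate = 64
samples = [math.sin(2 * math.pi * 8 * t / sample_rate) for t in range(sample_rate)]
freqs, rel = relative_amplitudes(samples, sample_rate)
print(freqs[rel.index(max(rel))])  # 8.0
```

Repeating this calculation for consecutive windows of the audio gives one amplitude curve per frame, which is the kind of data an animation like this is drawn from.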
Additionally, if you prefer to use a different visualization method or tool, you can export the bar amplitude data in json format by using the extract_json method. The structure is as follows:
{
"audio_filepath": "path to input audio file (str)",
"bars": "amount of bars created for the visuals (int)",
"bar_graph": [
[array of relative bar amplitudes for first frame (floats)],
...
[array of relative bar amplitudes for last frame (floats)]
]
}
The bar_graph array's length is equal to the number of frames generated, while each frame array's length is equal to the number of bars.
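As a sketch of how the exported data could be consumed by another tool, the snippet below checks the two length relationships described above. The `summarize_export` helper and the sample values are made up for the example; only the `bars` and `bar_graph` keys come from the structure shown.

```python
import json


def summarize_export(data):
    """Return (number of frames, number of bars) after checking each frame."""
    frames = data["bar_graph"]
    bars = data["bars"]
    for index, frame in enumerate(frames):
        if len(frame) != bars:
            raise ValueError(
                f"frame {index} has {len(frame)} values, expected {bars}"
            )
    return len(frames), bars


# Small made-up sample with 2 frames of 3 bars each.
sample = json.loads(
    '{"audio_filepath": "input.wav", "bars": 3, '
    '"bar_graph": [[0.1, 0.5, 1.0], [0.0, 0.2, 0.4]]}'
)
print(summarize_export(sample))  # (2, 3)
```

For a real export, replace the sample with `json.load` on the file written by extract_json.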
Supported file formats
Input
While it has not been tested, all formats compatible with the python package soundfile should be compatible with frequensee.
AIFF: AIFF (Apple/SGI)
AU: AU (Sun/NeXT)
AVR: AVR (Audio Visual Research)
CAF: CAF (Apple Core Audio File)
FLAC: FLAC (Free Lossless Audio Codec)
HTK: HTK (HMM Tool Kit)
SVX: IFF (Amiga IFF/SVX8/SV16)
MAT4: MAT4 (GNU Octave 2.0 / Matlab 4.2)
MAT5: MAT5 (GNU Octave 2.1 / Matlab 5.0)
MPC2K: MPC (Akai MPC 2k)
MP3: MPEG-1/2 Audio
OGG: OGG (OGG Container format)
PAF: PAF (Ensoniq PARIS)
PVF: PVF (Portable Voice Format)
RAW: RAW (header-less)
RF64: RF64 (RIFF 64)
SD2: SD2 (Sound Designer II)
SDS: SDS (Midi Sample Dump Standard)
IRCAM: SF (Berkeley/IRCAM/CARL)
VOC: VOC (Creative Labs)
W64: W64 (SoundFoundry WAVE 64)
WAV: WAV (Microsoft)
NIST: WAV (NIST Sphere)
WAVEX: WAVEX (Microsoft)
WVE: WVE (Psion Series 3)
XI: XI (FastTracker 2)
Output
The only output formats that have been tested and are natively supported for frequensee are: mp4, gif and webp.
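A script that wraps frequensee could check the output extension against these three formats before starting a long render. This is a minimal sketch; the `is_natively_supported` helper is hypothetical and simply compares the file extension, which is an assumption about how the format would be detected.

```python
from pathlib import Path

# The tested, natively supported output formats listed above.
NATIVE_FORMATS = {"mp4", "gif", "webp"}


def is_natively_supported(output_path):
    """Return True if the output file extension is mp4, gif or webp."""
    extension = Path(output_path).suffix.lower().lstrip(".")
    return extension in NATIVE_FORMATS


print(is_natively_supported("videos/output.gif"))  # True
print(is_natively_supported("output.mkv"))  # False
```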
While they have not been tested, all formats compatible with your version of FFMPEG should be compatible with frequensee.
You can check the formats and codecs supported in your version of FFMPEG by running the commands:
ffmpeg -formats
ffmpeg -codecs
Formats other than the natively supported ones might require additional FFMPEG options, which can be passed to the CLI version with the -f command line option.
Unfortunately, the order of FFMPEG parameters matters, so it might be difficult to use the cli version of frequensee for some formats. In such cases, please look into the writers.py module, which contains a custom FFMpegWriter class. It's possible to add options in the _args method in the correct place in the command.
Common output formats: Advantages and Disadvantages
- mp4: The creation of mp4 files is the default configuration for video formats. The advantages include low memory usage, embedded audio, and great compatibility. On the other hand, the biggest disadvantage is the lack of background transparency. Of course, with a proper choice of background color, it is possible to introduce transparency with video editing software.
- gif: The biggest advantage of gifs is the option to have a transparent background on the resulting animation without having to use additional editing software. This comes with many disadvantages, however. Memory usage for gif creation with FFMPEG rises rapidly and can easily fill up your system's available memory. Additionally, the maximum framerate is limited to 30 fps, the gif cannot include the source audio, and the resulting file size can be orders of magnitude larger than that of an mp4 file.
- webp: A format similar to gif, with the added advantages of smaller file size and low memory usage during creation. Unfortunately, it has many compatibility issues even on modern systems and is mostly useful in web development.
Currently, only gif and webp are supported for image formats.
For the above reasons, it is recommended to use mp4 or webp as the output format if possible. If a gif is needed, please make sure to limit the number of frames included in each resulting gif part with the -g command line option, taking into account the input audio length, the available memory of your system, and the resulting framerate.
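To get a feel for the numbers involved when choosing a value for -g, the arithmetic can be sketched as follows. This is an estimate under stated assumptions, not frequensee's actual splitting logic: it assumes the frame count is the audio length multiplied by the framerate (capped at 30 for gifs) and that the frames are split evenly into parts of at most the chosen size. The `gif_parts` helper is hypothetical.

```python
import math

GIF_MAX_FRAMERATE = 30  # maximum gif framerate stated in this document


def gif_parts(duration_sec, framerate=60, max_frames_per_gif=1000):
    """Estimate (effective framerate, total frames, number of gif parts)."""
    effective_rate = min(framerate, GIF_MAX_FRAMERATE)
    total_frames = math.ceil(duration_sec * effective_rate)
    parts = math.ceil(total_frames / max_frames_per_gif)
    return effective_rate, total_frames, parts


# A 2 minute audio file with the default framerate and -g value.
print(gif_parts(120))  # (30, 3600, 4)
```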
Command line options
You can get the following overview by using the help cli flag:
fqc -h
-h, --help
Show this help message and exit.
-i INPUT_PATH, --input_path INPUT_PATH
Filepath to the audio file.
-r FRAMERATE, --framerate FRAMERATE
Animation framerate (frames per second, default: 60). For GIFs maximum is 30, adjusted automatically.
-fw FFT_WINDOW_SEC, --fft_window_sec FFT_WINDOW_SEC
Window size for fft calculation (smaller -> more accurate, default: 0.25).
-b BARS, --bars BARS
Amount of bars showing on graph (Default: 20).
-bp BAR_PARTS, --bar_parts BAR_PARTS
Amount of parts to split each bar to, with 0 being a gradient (Positive integer or 0, default: 0).
-pg PART_GAP, --part_gap PART_GAP
Gap between bar parts as a percentage of the bar length (Between 0 and 1, excluding 1, default: 0.2).
-t AMPLITUDE_THRESHOLD, --amplitude_threshold AMPLITUDE_THRESHOLD
Minimum relative amplitude for frequencies, used to calculate the edges of the graph (between 0 and 1, default: 0.2).
-g MAX_FRAMES_PER_GIF, --max_frames_per_gif MAX_FRAMES_PER_GIF
Maximum frames per GIF. Due to high memory usage, please select according to your RAM size and framerate (Default: 1000).
-d DPI, --dpi DPI
Represents image quality (dots per inch, default: 100).
-w WIDTH, --width WIDTH
Width of resulting animation in pixels (Default: 1080).
-ht HEIGHT, --height HEIGHT
Height of resulting animation in pixels (Default: 1920).
-bg BACKGROUND, --background BACKGROUND
Figure background colour as a string with format: 'red,green,blue,alpha', alpha is optional. Red/Green/Blue values: Between 0 and 255. Alpha value between 0 and 1. (Default: '0,0,0,0')
-bb BAR_COLOUR_BOTTOM, --bar_colour_bottom BAR_COLOUR_BOTTOM, --bar_color_bottom BAR_COLOUR_BOTTOM
RGB colour for the bottom of the bar gradient in the format `red,green,blue`. Red/Green/Blue values: Between 0 and 255. (Default: `0,0,255`)
-bt BAR_COLOUR_TOP, --bar_colour_top BAR_COLOUR_TOP, --bar_color_top BAR_COLOUR_TOP
RGB colour for the top of the bar gradient in the format `red,green,blue`. Red/Green/Blue values: Between 0 and 255. (Default: `255,0,0`)
-f FFMPEG_OPTIONS, --ffmpeg_options FFMPEG_OPTIONS
Additional options for FFMPEG as a string separated by space. Do not include spaces in the arguments.
-j, --export_json
If specified, exports the bar graph data over time in json format instead of producing an animation file. Includes audio filepath and framerate for which the data was created.
-fft, --animate_fft
If specified, creates an animation of the raw fft over time instead of the bars.
-o OUTPUT_PATH, --output_path OUTPUT_PATH
Path or filename of output file (including extension compatible with FFMPEG or json).
The only required flags are the input and output filepaths, provided the output file is of a natively supported format.
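The colour options above take a comma separated string. The following sketch shows one way such a string could be parsed and validated in your own scripts. The `parse_colour` helper is hypothetical and is not frequensee's parser; in particular, treating a missing alpha as 1.0 (fully opaque) is an assumption.

```python
def parse_colour(value):
    """Parse 'red,green,blue' or 'red,green,blue,alpha' into a tuple."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) not in (3, 4):
        raise ValueError("expected 'red,green,blue' or 'red,green,blue,alpha'")
    red, green, blue = (int(part) for part in parts[:3])
    # Assumption: a missing alpha means fully opaque.
    alpha = float(parts[3]) if len(parts) == 4 else 1.0
    if not all(0 <= channel <= 255 for channel in (red, green, blue)):
        raise ValueError("red, green and blue must be between 0 and 255")
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    return red, green, blue, alpha


print(parse_colour("0,0,0,0"))  # (0, 0, 0, 0.0)
print(parse_colour("255,0,0"))  # (255, 0, 0, 1.0)
```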