Audio steganography methods, attacks, and speech-quality evaluation metrics.

The A-Files

  _______ _                                   ______ _ _           
 |__   __| |                 /\              |  ____(_) |          
    | |  | |__   ___        /  \     ______  | |__   _| | ___  ___ 
    | |  | '_ \ / _ \      / /\ \   |______| |  __| | | |/ _ \/ __|
    | |  | | | |  __/     / ____ \           | |    | | |  __/\__ \
    |_|  |_| |_|\___|    /_/    \_\          |_|    |_|_|\___||___/

The A-Files is audio steganography software that lets users embed secret data within an audio signal, measure speech quality metrics, and test the audio signal's robustness against different types of attacks.

With The A-Files, users can ensure that their sensitive information remains private and protected.

About
Package structure
Publishing
Migration note
Steganography algorithms
Metrics
Attacks
Diagram
References
Licence
Authors

About

The A-Files project contains:

Audio watermarking methods
Speech quality metrics
Audio attacks

Loading

To hide secret data within an audio signal, users load both the audio file and the data they want to hide into the software. The A-Files supports several types of audio files, such as WAV and FLAC. Once the audio file and the secret data are loaded, the software can embed the data within the audio signal.

Capacity

The A-Files offers several techniques for embedding secret data within an audio signal, such as LSB (Least Significant Bit) insertion, phase coding, and echo hiding. Each technique has its own strengths and weaknesses, and users can choose the technique that best fits their application's requirements.
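As an illustration of the simplest of these techniques, LSB insertion can be sketched in a few lines of NumPy. This is a minimal sketch that assumes 16-bit PCM samples and one message bit per sample; the function names `lsb_embed` and `lsb_extract` are hypothetical and are not the API of LsbMethod.py.

```python
import numpy as np


def lsb_embed(samples: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Write one message bit into the least significant bit of each sample."""
    if samples.dtype != np.int16:
        raise ValueError("this sketch expects 16-bit PCM samples")
    if len(bits) > len(samples):
        raise ValueError("message is longer than the cover signal")
    stego = samples.copy()
    n = len(bits)
    # Clear the lowest bit of the first n samples, then OR in the message bits.
    stego[:n] = (stego[:n] & ~1) | bits.astype(np.int16)
    return stego


def lsb_extract(stego: np.ndarray, n_bits: int) -> np.ndarray:
    """Read the least significant bit of the first n_bits samples."""
    return (stego[:n_bits] & 1).astype(np.uint8)


rng = np.random.default_rng(0)
cover = rng.integers(-2000, 2000, size=64, dtype=np.int16)  # stand-in for audio
message = rng.integers(0, 2, size=32, dtype=np.uint8)

stego = lsb_embed(cover, message)
recovered = lsb_extract(stego, len(message))
print(np.array_equal(recovered, message))
```

Each sample changes by at most one quantization step, which is why LSB insertion offers high capacity and good transparency but, in general, little robustness to signal processing.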

Transparency

The A-Files includes tools for measuring speech quality metrics, such as PESQ (Perceptual Evaluation of Speech Quality), SNR (Signal-to-Noise Ratio), and STOI (Short-Time Objective Intelligibility). By measuring these metrics, users can check how much the embedding process has degraded the audio signal.
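For intuition, the plain SNR between a cover signal and its stego version is the ratio of signal energy to the energy of the difference, expressed in decibels. The snippet below is a minimal sketch of that definition; `snr_db` is a hypothetical helper, not the package's SNR metric class, and the test signal is synthetic.

```python
import numpy as np


def snr_db(original: np.ndarray, processed: np.ndarray) -> float:
    """Signal-to-noise ratio in dB, treating (original - processed) as noise."""
    original = np.asarray(original, dtype=np.float64)
    processed = np.asarray(processed, dtype=np.float64)
    noise_power = np.sum((original - processed) ** 2)
    if noise_power == 0:
        return float("inf")  # identical signals
    return float(10.0 * np.log10(np.sum(original ** 2) / noise_power))


fs = 16000
t = np.arange(fs) / fs
cover = np.sin(2 * np.pi * 440.0 * t)  # one second of a 440 Hz tone
# Deterministic +/- 0.001 perturbation standing in for embedding distortion.
stego = cover + 1e-3 * np.where(np.arange(fs) % 2 == 0, 1.0, -1.0)

print(f"SNR: {snr_db(cover, stego):.2f} dB")
```

A higher value means the stego signal is closer to the cover; perceptual metrics such as PESQ and STOI complement SNR because equal SNR values can sound quite different.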

Robustness

The A-Files provides tools for testing the audio signal's robustness against various types of attacks, such as noising, frequency cutting, and filtering. Attackers may attempt to remove the hidden data or alter the audio signal to render the hidden data useless. By testing the audio signal's robustness, users can determine the effectiveness of an audio steganography technique and adjust it to improve its strength and resilience.
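A robustness test typically embeds a message, applies an attack, extracts the message again, and reports the bit error rate (BER). The following is a minimal sketch of that loop under stated assumptions: the attack is additive white Gaussian noise at a target SNR, the embedding is a plain LSB write on 16-bit samples, and the helper names `add_noise_at_snr` and `bit_error_rate` are hypothetical rather than part of the package.

```python
import numpy as np


def add_noise_at_snr(samples, snr_db, rng):
    """Additive white Gaussian noise attack at a target SNR (in dB)."""
    samples = np.asarray(samples, dtype=np.float64)
    signal_power = np.mean(samples ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return samples + rng.normal(0.0, np.sqrt(noise_power), size=samples.shape)


def bit_error_rate(sent, received) -> float:
    """Fraction of bits that differ between the sent and received messages."""
    return float(np.mean(np.asarray(sent) != np.asarray(received)))


rng = np.random.default_rng(1)
n = 8000
cover = (3000 * np.sin(2 * np.pi * 440 * np.arange(n) / n)).astype(np.int16)
bits = rng.integers(0, 2, size=n, dtype=np.int16)

stego = (cover & ~1) | bits                      # plain LSB embedding
attacked = add_noise_at_snr(stego, snr_db=30.0, rng=rng)
attacked_pcm = np.clip(np.round(attacked), -32768, 32767).astype(np.int16)

recovered = attacked_pcm & 1                     # LSB extraction
ber = bit_error_rate(bits, recovered)
print(f"BER after 30 dB noise attack: {ber:.3f}")
```

A BER near 0.5 means the extracted bits are no better than guessing, which is the expected outcome for plain LSB embedding under this attack; more robust methods trade capacity or transparency for a lower BER.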

Powered by some great GitHub repositories

Package structure

The canonical Python package now lives under src/taf/.

Repository layout:

src/taf/ contains the importable library code.
src/taf/resources/ contains packaged resources such as example audio, bundled sample datasets, and shipped model files.
tests/ contains import- and behavior-level tests.
Documentation/ remains outside the package as a project asset.

Supported Python versions: 3.10, 3.11, 3.12

For local development, install the package in editable mode:

python -m pip install -e ".[dev]"

TensorFlow-dependent features (FgasMethod, MosNetMetric) require the optional [ai] extra:

pip install "the-a-files[ai]"
# or, for local development:
pip install -e ".[dev,ai]"
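Code that uses the TensorFlow-dependent features may want to check for the optional dependency before importing them. The snippet below is one possible guard using only the standard library; `has_module` and `AI_AVAILABLE` are hypothetical names, and how the package itself reports a missing extra is not specified here.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


# Assumption: the [ai] extra installs a module importable as "tensorflow".
AI_AVAILABLE = has_module("tensorflow")

if not AI_AVAILABLE:
    print('TensorFlow not found; install "the-a-files[ai]" '
          "to use FgasMethod and MosNetMetric.")
```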

Publishing

Packaging and release instructions are in docs/publishing.md. The PyPI distribution name is the-a-files; the import package remains taf.

SRMRpy is vendored under src/taf/metrics/speech_reverberation/srmrpy/ because it is required by the SRMR metric and is not published as a standard PyPI dependency.

Technical specification

Implementation contracts and code-level extension notes are documented in docs/technical_specification.md.

Packaged resources

Runtime assets that must work from source, tests, editable installs, and built distributions live under src/taf/resources/.

Use this layout:

src/taf/resources/audio/ for standalone example WAV files.
src/taf/resources/datasets/ for the bundled VCTK and LibriSpeech sample corpora used by the smoke workflow.
src/taf/resources/models/ for packaged model weights such as .h5.

Example usage:

from taf.models.WavFile import WavFile
from taf.resources.paths import example_wav_path, mosnet_model_path, packaged_dataset_audio_paths

with example_wav_path() as wav_path:
    wav_file = WavFile.load(wav_path)

with mosnet_model_path() as model_path:
    print(model_path)

with packaged_dataset_audio_paths() as dataset_paths:
    print(len(dataset_paths["vctk"]))

Migration note

The main packaging change is the introduction of a src/ layout so the core library is imported as taf instead of relying on the repository root being on PYTHONPATH.

The implementation modules were copied into src/taf/ with their existing internal layout preserved to avoid algorithm changes. The legacy top-level TAF/ tree was left in place for now because a direct filesystem move was blocked on this checkout, so the new canonical code lives in src/taf/ and future work should target that package.

Steganography algorithms

List of implemented methods:

Standard LSB coding (LsbMethod.py) [1]
Echo Hiding technique with single echo kernel (EchoMethod.py) [1]
Phase coding technique (PhaseCodingMethod.py) [1]
Improved Phase Coding technique (ImprovedPhaseCodingMethod.py) [19]
DCT Delta LSB (DctDeltaLsbMethod.py) [1]
DWT LSB based (DwtLsbMethod.py) [1]
First band of DCT coefficients, DCT-b1 (DctB1Method.py) [2]
Patchwork-Based multilayer (PatchworkMultilayerMethod.py) [3]
Norm space method (NormSpaceMethod.py) [4]
Frequency Singular Value Coefficient Modification, FSVC (FsvcMethod.py) [5]
Direct Sequence Spread Spectrum technique (DsssMethod.py) [6]
Blind SVD-based using entropy and log-polar transformation method (BlindSvdMethod.py) [20]
Prime Factor Interpolated method (PrimeFactorInterpolatedMethod.py) [21]
LWT method (LwtMethod.py) [22]
Foreground-Background Segmentation LSB, FBS-LSB (ForegroundBackgroundSegmentationMethod.py) [23]
Fixed-decoder network with adversarial perturbation generation, FGAS (FgasMethod.py) [24]
Adaptive ±1 LSB via AAC perceptual residual and Syndrome-Trellis Codes (AacStcMethod.py) [25]
Wireless-channel DWT LSB message embedding (WirelessDwtLsbMethod.py) [27]

Technical details for method implementation are in docs/technical_specification.md.
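To give a feel for one entry in the list, the core idea of direct sequence spread spectrum embedding is to add a low-amplitude pseudo-noise (PN) sequence whose sign carries the bit, and to recover the bit from the sign of the correlation with the same sequence. This is a simplified sketch, not the implementation in DsssMethod.py: the cover is synthetic noise, and `dsss_embed`, `dsss_extract`, `alpha`, and the chip length are illustrative assumptions.

```python
import numpy as np


def dsss_embed(cover, bits, pn, alpha):
    """Add +alpha*pn for a 1 bit and -alpha*pn for a 0 bit, one frame per bit."""
    stego = np.asarray(cover, dtype=np.float64).copy()
    chip_len = pn.size
    if len(bits) * chip_len > stego.size:
        raise ValueError("cover is too short for this message")
    for i, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        stego[i * chip_len:(i + 1) * chip_len] += alpha * sign * pn
    return stego


def dsss_extract(stego, n_bits, pn):
    """Correlate each frame with the PN sequence; the sign gives the bit."""
    chip_len = pn.size
    frames = np.asarray(stego, dtype=np.float64)[: n_bits * chip_len]
    correlation = frames.reshape(n_bits, chip_len) @ pn
    return (correlation > 0).astype(np.uint8)


rng = np.random.default_rng(7)
chip_len, n_bits = 4096, 16
pn = rng.choice([-1.0, 1.0], size=chip_len)       # shared secret key
cover = rng.normal(0.0, 1.0, size=chip_len * n_bits)
bits = rng.integers(0, 2, size=n_bits, dtype=np.uint8)

stego = dsss_embed(cover, bits, pn, alpha=0.15)
print(np.array_equal(dsss_extract(stego, n_bits, pn), bits))
```

Longer PN sequences allow a smaller `alpha` for the same detection reliability, at the cost of fewer bits per second.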

Metrics

AI Based

MOSNet: Deep Learning based Objective Assessment for Voice Conversion [16]

Speech Reverberation

Bark spectral distortion (BSD) [7]
Speech-to-reverberation modulation energy ratio (SRMR) [10]

Speech Intelligibility

Coherence and speech intelligibility index (CSII) [7]
Normalized-covariance measure (NCM) [7]
Short-time objective intelligibility (STOI) [9]

Speech Quality

Signal-to-Noise Ratio (SNR) [12]
Mel-cepstral distance measure for objective speech quality assessment [11]
Segmental Signal-to-Noise Ratio (SNRseg) [7]
Frequency-weighted Segmental SNR (fwSNRseg) [7]
Cepstrum Distance Objective Speech Quality Measure (CD) [7]
Log-likelihood Ratio (LLR) [7]
Weighted Spectral Slope (WSS) [7]
Perceptual Evaluation of Speech Quality (PESQ) [8]
Speech Enhancement Metrics (Csig, Covl, Cbak, Composite) [13]
Weighted Spectro-Temporal Modulation Index (wSTMI) [14]
Spectro-Temporal Glimpsing Index (STGI) [15]
Scale-invariant SDR (SI-SDR) [17]
BSSEval v4 [18]

Each metric extends the abstract class Metric:

from abc import ABC, abstractmethod
from numbers import Number

import numpy as np


class Metric(ABC):

    @abstractmethod
    def calculate(self,
                  samples_original: np.ndarray,
                  samples_processed: np.ndarray,
                  fs: int,
                  frame_len: float = 0.03,
                  overlap: float = 0.75) -> Number | np.ndarray:
        ...

    @abstractmethod
    def name(self) -> str:
        ...
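A custom metric can be added by subclassing Metric and implementing both methods. The sketch below is illustrative only: it repeats the abstract class locally so that it runs on its own (the import path of Metric inside taf is not assumed), and `SegmentalSnrSketch` is a hypothetical class, not the package's SNRseg implementation. Clamping per-frame values to [-10, 35] dB is a common convention for segmental SNR and is an assumption here.

```python
from abc import ABC, abstractmethod
from numbers import Number
from typing import Union

import numpy as np


class Metric(ABC):
    """Local copy of the interface above, so this sketch is self-contained."""

    @abstractmethod
    def calculate(self, samples_original: np.ndarray,
                  samples_processed: np.ndarray, fs: int,
                  frame_len: float = 0.03,
                  overlap: float = 0.75) -> Union[Number, np.ndarray]:
        ...

    @abstractmethod
    def name(self) -> str:
        ...


class SegmentalSnrSketch(Metric):
    """Mean of per-frame SNR values, each clamped to [-10, 35] dB."""

    def calculate(self, samples_original, samples_processed, fs,
                  frame_len=0.03, overlap=0.75):
        x = np.asarray(samples_original, dtype=np.float64)
        y = np.asarray(samples_processed, dtype=np.float64)
        n = min(len(x), len(y))
        win = int(round(frame_len * fs))                 # frame length in samples
        hop = max(1, int(round(win * (1.0 - overlap))))  # frame step in samples
        eps = np.finfo(np.float64).eps
        values = []
        for start in range(0, n - win + 1, hop):
            sig = x[start:start + win]
            err = sig - y[start:start + win]
            snr = 10.0 * np.log10((np.sum(sig ** 2) + eps) / (np.sum(err ** 2) + eps))
            values.append(np.clip(snr, -10.0, 35.0))
        if not values:
            raise ValueError("signal is shorter than one frame")
        return float(np.mean(values))

    def name(self) -> str:
        return "SNRseg (sketch)"


fs = 16000
clean = np.sin(2 * np.pi * 440.0 * np.arange(fs) / fs)
noisy = clean + 0.05 * np.where(np.arange(fs) % 2 == 0, 1.0, -1.0)

metric = SegmentalSnrSketch()
score = metric.calculate(clean, noisy, fs=fs)
print(metric.name(), round(score, 2))
```

Keeping `frame_len` and `overlap` in the signature, even for metrics that ignore them, lets callers loop over a list of Metric objects with one uniform call.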

Attacks

List of attacks on audio samples:

Low pass filter
Additive noise
Frequency filter
Flip random samples
Cut random samples
Resample (downsampling, upsampling)
Amplitude scaling
Pitch shift
Time stretch
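Three of the attacks above can be approximated with plain NumPy, which is useful for quick experiments. These are simplified stand-ins under stated assumptions, not the package's attack classes: the function names are hypothetical, and the low-pass filter is a basic moving average rather than a designed filter.

```python
import numpy as np


def amplitude_scale(samples, factor):
    """Amplitude scaling: multiply every sample by a constant gain."""
    return np.asarray(samples, dtype=np.float64) * factor


def cut_random_samples(samples, n_cut, rng):
    """Cut random samples: delete n_cut samples at random positions."""
    samples = np.asarray(samples)
    if not 0 <= n_cut <= samples.size:
        raise ValueError("n_cut must be between 0 and the signal length")
    drop = rng.choice(samples.size, size=n_cut, replace=False)
    keep = np.ones(samples.size, dtype=bool)
    keep[drop] = False
    return samples[keep]


def moving_average_lowpass(samples, width):
    """Crude low-pass filter: average each sample with its neighbours."""
    kernel = np.ones(width) / width
    return np.convolve(np.asarray(samples, dtype=np.float64), kernel, mode="same")


rng = np.random.default_rng(3)
fs = 8000
t = np.arange(fs) / fs
low = np.sin(2 * np.pi * 100.0 * t)    # 100 Hz component
high = np.sin(2 * np.pi * 3000.0 * t)  # 3 kHz component
signal = low + high

scaled = amplitude_scale(signal, 0.5)
shortened = cut_random_samples(signal, n_cut=80, rng=rng)
filtered_high = moving_average_lowpass(high, width=8)

print(len(signal), len(shortened))
print(np.sum(filtered_high ** 2) < np.sum(high ** 2))
```

Cutting samples changes the signal length and desynchronizes frame-based methods, which is why it is often harder to survive than scaling or mild filtering.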

References

Articles

[1] Alsabhany, Ahmed A., Ahmed Hussain Ali, Farida Ridzuan, A. H. Azni, and Mohd Rosmadi Mokhtar. Digital Audio Steganography: Systematic Review, Classification, and Analysis of the Current State of the Art. Computer Science Review 38 (2020): 100316. https://doi.org/10.1016/j.cosrev.2020.100316
[2] Hu, Hwai Tsu, and Ling Yuan Hsu. Robust, Transparent and High-Capacity Audio Watermarking in DCT Domain. Signal Processing 109 (2015): 226–35. https://doi.org/10.1016/j.sigpro.2014.11.011
[3] Natgunanathan, Iynkaran, Yong Xiang, Guang Hua, Gleb Beliakov, and John Yearwood. Patchwork-Based Multilayer Audio Watermarking. IEEE/ACM Transactions on Audio Speech and Language Processing 25, no. 11 (2017): 2176–87. https://doi.org/10.1109/TASLP.2017.2749001
[4] Saadi, Slami, Ahmed Merrad, and Ali Benziane. Novel Secured Scheme for Blind Audio/Speech Norm-Space Watermarking by Arnold Algorithm. Signal Processing 154 (2019): 74–86. https://doi.org/10.1016/j.sigpro.2018.08.011
[5] Zhao, Juan, Tianrui Zong, Yong Xiang, Longxiang Gao, Wanlei Zhou, and Gleb Beliakov. Desynchronization Attacks Resilient Watermarking Method Based on Frequency Singular Value Coefficient Modification. IEEE/ACM Transactions on Audio Speech and Language Processing 29 (2021): 2282–95. https://doi.org/10.1109/TASLP.2021.3092555
[6] Nugraha, Rizky M. Implementation of Direct Sequence Spread Spectrum Steganography on Audio Data. Proceedings of the 2011 International Conference on Electrical Engineering and Informatics, ICEEI 2011, no. July (2011). https://doi.org/10.1109/ICEEI.2011.6021662
[7] Philipos C. Loizou. Speech Enhancement: Theory and Practice, Second Edition. CRC Press (2013). https://doi.org/10.1201/b14529
[8] Miao Wang, Christoph Boeddeker, Rafael G. Dantas and ananda seelan. PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users. Zenodo, 2022. https://doi.org/10.5281/zenodo.6549559
[9] C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen. A Short-Time Objective Intelligibility Measure for Time-Frequency Weighted Noisy Speech. ICASSP 2010, Dallas, Texas. https://doi.org/10.1109/ICASSP.2010.5495701
[10] Tiago H. Falk, Chenxi Zheng, and Wai-Yip Chan. A Non-Intrusive Quality and Intelligibility Measure of Reverberant and Dereverberated Speech. IEEE Trans. Audio Speech Lang. Process., Vol. 18, No. 7, pp. 1766-1774, Sept. 2010. https://doi.org/10.1109/TASL.2010.2052247
[11] R. Kubichek, Mel-cepstral distance measure for objective speech quality assessment, Proceedings of IEEE Pacific Rim Conference on Communications Computers and Signal Processing, Victoria, BC, Canada, 1993, pp. 125-128 vol.1, https://doi.org/10.1109/PACRIM.1993.407206.
[12] https://en.wikipedia.org/wiki/Signal-to-noise_ratio
[13] Yi Hu and Philipos C. Loizou, Evaluation of Objective Quality Measures for Speech Enhancement, IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 16, NO. 1, 229, JANUARY 2008, https://doi.org/10.1109/TASL.2007.911054
[14] A. Edraki, W.-Y. Chan, J. Jensen, & D. Fogerty, Speech Intelligibility Prediction Using Spectro-Temporal Modulation Analysis. IEEE/ACM Trans. Audio, Speech, & Language Processing, vol. 29, pp. 210-225, 2021, https://doi.org/10.1109/taslp.2020.3039929
[15] A. Edraki, W.-Y. Chan, J. Jensen, & D. Fogerty, A Spectro-Temporal Glimpsing Index (STGI) for Speech Intelligibility Prediction, Proc. Interspeech, 5 pages, Aug 2021, http://dx.doi.org/10.21437/Interspeech.2021-605
[16] Lo, Chen-Chou and Fu, Szu-Wei and Huang, Wen-Chin and Wang, Xin and Yamagishi, Junichi and Tsao, Yu and Wang, Hsin-Min, MOSNet: Deep Learning based Objective Assessment for Voice Conversion, arXiv preprint arXiv:1904.08352, 2019, https://arxiv.org/abs/1904.08352
[17] Roux, Jonathan Le and Wisdom, Scott and Erdogan, Hakan and Hershey, John R, SDR – Half-baked or Well Done?, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, https://dx.doi.org/10.1109/ICASSP.2019.8683855
[18] Stöter, Fabian-Robert and Liutkus, Antoine and Ito, Nobutaka, The 2018 Signal Separation Evaluation Campaign, Latent Variable Analysis and Signal Separation: 14th International Conference, LVA/ICA 2018, Surrey, UK, 2018, pp. 293–305, https://doi.org/10.5281/zenodo.3376621
[19] Yang, Guang, An Improved Phase Coding Audio Steganography Algorithm, arXiv preprint arXiv:2408.13277, 2024, https://doi.org/10.48550/arXiv.2408.13277
[20] Dhar, Pranab Kumar, and Shimamura, Tetsuya, Blind SVD-based audio watermarking using entropy and log-polar transformation, Journal of Information Security and Applications, Volume 20, 2015, Pages 74-83, https://doi.org/10.1016/j.jisa.2014.10.007.
[21] Adhiyaksa, F. A., Ahmad, T., Shiddiqi, A. M., Jati Santoso, B., Studiawan, H., & Pratomo, B. A. (2022). Reversible Audio Steganography using Least Prime Factor and Audio Interpolation. In 2021 International Seminar on Machine Learning, Optimization, and Data Science (ISMODE) (pp. 97–102). IEEE. https://doi.org/10.1109/ISMODE53584.2022.9743066
[22] Mushtaq, S., Mehraj, S., & Parah, S. A. (2024). Blind and Robust Watermarking Framework for Audio Signals. In 2024 11th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) (pp. 1–5). IEEE. https://doi.org/10.1109/ICRITO61523.2024.10522195
[23] Wang, J., & Wang, K. (2025). A novel audio steganography based on the segmentation of the foreground and background of audio. Computers & Electrical Engineering, 117, 109247. https://doi.org/10.1016/j.compeleceng.2024.109247
[24] Yan, J., Cheng, Y., Yin, Z., Zhang, X., Wang, S., Sun, T., & Jiang, X. (2025). FGAS: Fixed Decoder Network-Based Audio Steganography with Adversarial Perturbation Generation. arXiv preprint arXiv:2505.22266. https://arxiv.org/abs/2505.22266
[25] Luo, W., Zhang, Y., & Li, H. (2017). Adaptive Audio Steganography Based on Advanced Audio Coding and Syndrome-Trellis Coding. In C. Kraetzer et al. (Eds.), Digital Forensics and Watermarking, IWDW 2017, Lecture Notes in Computer Science, vol 10431, pp. 177-186. Springer. https://doi.org/10.1007/978-3-319-64185-0_14
[26] Yan, Y., Li, Y., Xiao, Q., & Ren, Y. (2026). PRoADS: Provably Secure and Robust Audio Diffusion Steganography with Latent Optimization and Backward Euler Inversion. arXiv preprint arXiv:2603.10314 (ICASSP 2026). https://arxiv.org/abs/2603.10314
[27] Hamdi, A. A., Eyssa, A. A., Abdalla, M. I., ElAffendi, M., AlQahtani, A. A. S., Ateya, A. A., & Elsayed, R. A. Improving Audio Steganography Transmission over Various Wireless Channels. Journal of Sensor and Actuator Networks, 14(6), 106 (2025). https://doi.org/10.3390/jsan14060106

Links

Licence

The A-Files is open-source software released under the GPLv3 license.

Dependencies

On Windows, PESQ requires Microsoft Visual C++ 14.0 or later. You can install it via Microsoft C++ Build Tools: https://visualstudio.microsoft.com/visual-cpp-build-tools/

In some cases, you may also need FFmpeg: https://ffmpeg.org/

Authors

Paweł Kaczmarek (@pawel-kaczmarek) - Military University of Technology, Faculty of Electronics
Zbigniew Piotrowski - Military University of Technology, Faculty of Electronics

Release history

0.1.1 (May 25, 2026)
0.1.0 (May 22, 2026)
