An easy-to-use package for parsing media and transforming it for your machine learning projects.

Project description

ML Formatter

A simple set of scripts that act as a middleman for media intake in machine learning pipelines.

Given input arguments, this program aims to produce the required output format for specific use cases.

Currently supported output formats:

  • DeepSpeech

Usage

> python -m formatter

To view all arguments:

> python -m formatter --help
usage: __main__.py [-h] [-v | -q] [--dont-shuffle] [--train TRAIN] [--test TEST] [--val VAL] [--parser {deepspeech}] [--media_type {wav}]
                   [--transcript_type {txt}] [--media MEDIA] [--transcript TRANSCRIPT] [--output OUTPUT]

Given a set of media, create and output to the required spec of certain ML programs

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose
  -q, --quiet
  --dont-shuffle        Don't shuffle before splitting into runs
  --train TRAIN         Training part of train/test/val split. Out of 1
  --test TEST           Testing part of train/test/val split. Out of 1
  --val VAL             Validation part of train/test/val split. Out of 1
  --parser {deepspeech}
                        The format you wish to receive as output
  --media_type {wav}    The file extension of media files
  --transcript_type {txt}
                        The file extension of text transcript files
  --media MEDIA         Path to the directory containing media files
  --transcript TRANSCRIPT
                        Path to files containing text transcripts
  --output OUTPUT       Path to directory to use as an output folder
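The --train, --test and --val options describe fractions of the data set that add up to 1, and --dont-shuffle disables shuffling before the split. The following is a minimal sketch of how such a split could work, not the package's actual implementation. The function name split_dataset, its seed parameter, and the rounding behaviour (any leftover items go to the validation part) are assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    items: Sequence[T],
    train: float,
    test: float,
    val: float,
    shuffle: bool = True,
    seed: Optional[int] = None,
) -> Tuple[List[T], List[T], List[T]]:
    """Split items into train/test/val parts given fractions out of 1.

    Hypothetical helper: ML Formatter may split differently.
    """
    if abs(train + test + val - 1.0) > 1e-9:
        raise ValueError("train, test and val must add up to 1")

    pool = list(items)
    if shuffle:
        # A dedicated Random instance keeps the global RNG state untouched.
        random.Random(seed).shuffle(pool)

    n_train = int(len(pool) * train)
    n_test = int(len(pool) * test)
    # Whatever remains after train and test becomes the validation part.
    return (
        pool[:n_train],
        pool[n_train:n_train + n_test],
        pool[n_train + n_test:],
    )


if __name__ == "__main__":
    files = [f"clip_{i}.wav" for i in range(10)]
    train_set, test_set, val_set = split_dataset(files, 0.6, 0.2, 0.2, seed=0)
    print(len(train_set), len(test_set), len(val_set))
```

Passing shuffle=False mirrors the effect described for --dont-shuffle: the items are split in their original order.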

Example usage

> python -m formatter --media ./media --transcript ./transcripts --output ./output --verbose
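To illustrate what the example above works with, here is a rough sketch of pairing each media file with a transcript and writing a CSV with the columns DeepSpeech expects for training data (wav_filename, wav_filesize, transcript). It is not taken from the package source. Matching files by base name, the helper names build_rows and write_csv, and skipping media without a transcript are all assumptions.

```python
import csv
from pathlib import Path
from typing import Dict, List, Union

# Column names used by DeepSpeech training CSV files.
FIELDNAMES = ["wav_filename", "wav_filesize", "transcript"]


def build_rows(
    media_dir: Union[str, Path],
    transcript_dir: Union[str, Path],
    media_ext: str = "wav",
    transcript_ext: str = "txt",
) -> List[Dict[str, object]]:
    """Pair media files with transcripts that share the same base name.

    Hypothetical helper: the pairing rule is an assumption.
    """
    rows = []
    for media_path in sorted(Path(media_dir).glob(f"*.{media_ext}")):
        transcript_path = Path(transcript_dir) / f"{media_path.stem}.{transcript_ext}"
        if not transcript_path.is_file():
            # Assumption: media without a transcript is skipped.
            continue
        rows.append(
            {
                "wav_filename": str(media_path.resolve()),
                "wav_filesize": media_path.stat().st_size,
                "transcript": transcript_path.read_text(encoding="utf-8").strip(),
            }
        )
    return rows


def write_csv(rows: List[Dict[str, object]], csv_path: Union[str, Path]) -> None:
    """Write the paired rows to a CSV file with a header line."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
```

Each part of a train/test/val split could then be written to its own CSV file in the output directory.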

Download files

Source Distribution

ML-Formatter-1.0.1.tar.gz (6.1 kB)

Uploaded Source

Built Distribution

ML_Formatter-1.0.1-py3-none-any.whl (7.4 kB)

Uploaded Python 3

File details

Details for the file ML-Formatter-1.0.1.tar.gz.

File metadata

  • Download URL: ML-Formatter-1.0.1.tar.gz
  • Upload date:
  • Size: 6.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.5

File hashes

Hashes for ML-Formatter-1.0.1.tar.gz
Algorithm Hash digest
SHA256 e19acbbe404a1429d952251cb825fbd2ae0d404c0cba0af494295666d026f34c
MD5 6b482eb23b92547f68c59091542bc263
BLAKE2b-256 1733925d24e528bc01cd5d8e781895e8bf707dbc9e6df97c9b09b18c1ac0de74

File details

Details for the file ML_Formatter-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: ML_Formatter-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 7.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.5

File hashes

Hashes for ML_Formatter-1.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 6bfb8ef533eda400af2365a1be05fa54f7536779ac8f991b44136fc6b0b10f17
MD5 8b6ba13ce031663b4c2f01a2befebe19
BLAKE2b-256 13d2475c243c9989798aa3086a9940da4c3e5e40dadc38a00dc7e5763d4334b9
