An easy-to-use package for parsing media and transforming it for your machine learning projects.

ML Formatter

A simple set of scripts that act as a middleman for media intake in machine learning pipelines.

Given input arguments, this program aims to produce the required output format for specific use cases.

Currently supported output formats:

  • DeepSpeech

Usage

> python -m formatter

To view all arguments:

> python -m formatter --help
usage: __main__.py [-h] [-v | -q] [--dont-shuffle] [--train TRAIN] [--test TEST] [--val VAL] [--parser {deepspeech}] [--media_type {wav}]
                   [--transcript_type {txt}] [--media MEDIA] [--transcript TRANSCRIPT] [--output OUTPUT]

Given a set of media, create and output to the required spec of certain ML programs

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose
  -q, --quiet
  --dont-shuffle        Don't shuffle before splitting into runs
  --train TRAIN         Training part of train/test/val split. Out of 1
  --test TEST           Testing part of train/test/val split. Out of 1
  --val VAL             Validation part of train/test/val split. Out of 1
  --parser {deepspeech}
                        The format you wish to receive as output
  --media_type {wav}    The file extension of media files
  --transcript_type {txt}
                        The file extension of text transcript files
  --media MEDIA         Path to the directory containing media files
  --transcript TRANSCRIPT
                        Path to files containing text transcripts
  --output OUTPUT       Path to directory to use as an output folder
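
The `--train`, `--test`, `--val` and `--dont-shuffle` options describe a shuffled fractional split. The following is a minimal sketch of that idea, not the package's actual implementation: the function name `split_dataset`, the `seed` parameter and the default fractions are all hypothetical, since the help text does not list defaults.

```python
import random


def split_dataset(items, train=0.8, test=0.1, val=0.1, shuffle=True, seed=None):
    """Split items into (train, test, val) lists by fractional weight.

    Hypothetical helper: the default fractions are assumptions, not the
    documented defaults of ML Formatter.
    """
    total = train + test + val
    if total <= 0:
        raise ValueError("at least one split fraction must be positive")

    items = list(items)
    if shuffle:
        # A seeded Random instance keeps the shuffle reproducible.
        random.Random(seed).shuffle(items)

    # Normalise by the total so the fractions need not sum to exactly 1.
    n_train = int(len(items) * train / total)
    n_test = int(len(items) * test / total)

    train_part = items[:n_train]
    test_part = items[n_train:n_train + n_test]
    # The validation set takes whatever remains after rounding down.
    val_part = items[n_train + n_test:]
    return train_part, test_part, val_part
```

Passing `shuffle=False` corresponds to `--dont-shuffle`: the items keep their original order and are cut into three consecutive runs.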

Example usage

> python -m formatter --media ./media --transcript ./transcripts --output ./output --verbose
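
To illustrate what a run like this has to do, here is a rough sketch that pairs each media file with a transcript of the same base name and writes a CSV with the `wav_filename`, `wav_filesize` and `transcript` columns that DeepSpeech training expects. The helper names and the matching-by-file-stem rule are assumptions made for illustration; the files ML Formatter actually writes to the output folder may be laid out differently.

```python
import csv
from pathlib import Path


def pair_media_with_transcripts(media_dir, transcript_dir,
                                media_ext="wav", transcript_ext="txt"):
    """Return (media_path, transcript_text) pairs matched by file stem.

    Hypothetical helper: matching by stem is an assumption about how
    media files and transcripts correspond.
    """
    transcript_dir = Path(transcript_dir)
    pairs = []
    for media_path in sorted(Path(media_dir).glob(f"*.{media_ext}")):
        transcript_path = transcript_dir / f"{media_path.stem}.{transcript_ext}"
        # Media files without a transcript are skipped.
        if transcript_path.is_file():
            text = transcript_path.read_text(encoding="utf-8").strip()
            pairs.append((media_path, text))
    return pairs


def write_deepspeech_csv(pairs, csv_path):
    """Write pairs as a CSV with DeepSpeech's three training columns."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for media_path, text in pairs:
            size_in_bytes = media_path.stat().st_size
            writer.writerow([str(media_path.resolve()), size_in_bytes, text])
```

For example, `write_deepspeech_csv(pair_media_with_transcripts("./media", "./transcripts"), "./output/train.csv")` would produce one CSV row per matched pair, provided `./output` already exists.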

Download files

Source Distribution

ML-Formatter-1.0.1.tar.gz (6.1 kB)

Built Distribution

ML_Formatter-1.0.1-py3-none-any.whl (7.4 kB, Python 3)
