
Natural language processing augmentation library for deep neural networks



nlpaug

This Python library helps you augment NLP data for your machine learning projects. Visit this introduction to learn about data augmentation in NLP.

Features

  • Provides character-level, word-level, and speech recognition augmentations:
    • Character Augmentation: OCR, QWERTY (keyboard distance), Random Behavior
    • Word Augmentation:
    • Speech Recognition Augmentation:
      • Spectrogram: Frequency Masking, Time Masking
      • Audio: Noise, Pitch, Shift, Speed
  • Flow orchestration is supported. Flow includes:
    • Sequential: Apply data augmentations one by one
    • Sometimes: Apply some augmentations randomly
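
The two flow types above can be sketched in plain Python. This is a minimal illustration of the idea, not nlpaug's own classes: the Sequential and Sometimes classes below, their augment method, the probability parameter p, and the two toy augmenters are all hypothetical names chosen for this example.

```python
import random


def swap_case(text):
    # Hypothetical stand-in augmenter: flips upper and lower case.
    return text.swapcase()


def drop_vowels(text):
    # Hypothetical stand-in augmenter: removes ASCII vowels.
    return "".join(ch for ch in text if ch.lower() not in "aeiou")


class Sequential:
    """Apply every augmenter in order, feeding each output to the next."""

    def __init__(self, augmenters):
        self.augmenters = list(augmenters)

    def augment(self, text):
        for aug in self.augmenters:
            text = aug(text)
        return text


class Sometimes:
    """Apply each augmenter independently with probability p (assumed semantics)."""

    def __init__(self, augmenters, p=0.5, rng=None):
        self.augmenters = list(augmenters)
        self.p = p
        self.rng = rng or random.Random()

    def augment(self, text):
        for aug in self.augmenters:
            # Each augmenter is skipped or applied by an independent coin flip.
            if self.rng.random() < self.p:
                text = aug(text)
        return text
```

With p=1.0 the Sometimes sketch behaves like Sequential, and with p=0.0 it returns the input unchanged, which makes the difference between the two flows easy to check.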

Example

[Image: Frequency Masking]

[Image: Time Masking]
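
To make the two masking operations concrete, here is a rough pure-Python sketch. It assumes the spectrogram is a list of frequency rows, each a list of values over time, and that masked cells are set to zero; whether nlpaug uses the same layout, mask value, or width sampling is an assumption. The function names and the max_width parameter are hypothetical.

```python
import random


def frequency_mask(spec, max_width, rng=None):
    """Zero a random band of consecutive frequency rows; spec is [freq][time]."""
    rng = rng or random.Random()
    n_freq = len(spec)
    width = rng.randint(0, min(max_width, n_freq))
    start = rng.randint(0, n_freq - width)
    # Rows inside [start, start + width) are blanked; the rest are copied.
    return [
        [0.0] * len(row) if start <= i < start + width else list(row)
        for i, row in enumerate(spec)
    ]


def time_mask(spec, max_width, rng=None):
    """Zero a random band of consecutive time columns; spec is [freq][time]."""
    rng = rng or random.Random()
    n_time = len(spec[0]) if spec else 0
    width = rng.randint(0, min(max_width, n_time))
    start = rng.randint(0, n_time - width)
    # The same column range is blanked in every frequency row.
    return [
        [0.0 if start <= t < start + width else v for t, v in enumerate(row)]
        for row in spec
    ]
```

Both functions return a new nested list and leave the input untouched, so the original spectrogram can be reused to generate several augmented variants.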

Installation

The library supports Python 3.5+ on Linux and Windows.

To install the library:

pip install nlpaug

Download word2vec or GloVe files if you use Word2VecAug or GloVeAug:

Recent Changes

0.0.3 May 23, 2019: Added Speed, Noise, Shift and Pitch augmenters for Audio

0.0.2 Apr 30, 2019: Added Frequency Masking and Time Masking for Speech Recognition (Spectrogram). Added librosa library dependency for converting wav to spectrogram.

0.0.1 Mar 20, 2019: Project initialization

Test

Word2vec and GloVe models are used for word insertion and substitution, so the model files are required to run the test cases. Add a ".env" file to the root directory with the following content:
	- MODEL_DIR={MODEL FILE PATH}
The model folder should have the following structure:
	-- root directory
		- glove.6B.50d.txt
		- GoogleNews-vectors-negative300.bin
		- wiki-news-300d-1M.vec
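
Before running the tests, it can help to confirm that the setup above is in place. The sketch below reads MODEL_DIR from a ".env" file using only the standard library and reports which of the three listed model files are missing. The helper names read_model_dir and missing_model_files are hypothetical, and the simple KEY=VALUE parsing is an assumption about the file format rather than the project's own loader.

```python
import os

# File names taken from the folder structure listed above.
EXPECTED_FILES = [
    "glove.6B.50d.txt",
    "GoogleNews-vectors-negative300.bin",
    "wiki-news-300d-1M.vec",
]


def read_model_dir(env_path=".env"):
    """Return the MODEL_DIR value from a KEY=VALUE style .env file, or None."""
    with open(env_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            if key.strip() == "MODEL_DIR":
                return value.strip().strip('"').strip("'")
    return None


def missing_model_files(model_dir):
    """List the expected model files that are not present in model_dir."""
    return [
        name
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

A typical check would call read_model_dir() from the project root and then pass the result to missing_model_files(); an empty list means all three files were found.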

