Fast NLP augmentation library with a Rust backend

Project description

fast-aug - python bindings

fast-aug is a library for fast text augmentation, available for both Rust and Python as fast-aug.
It is designed with a focus on performance and real-time usage (e.g. during training), while providing a wide range of text augmentation methods.

Note: 25x faster than nlpaug!


Installation

fast-aug is available on PyPI.

pip install fast-aug

Usage

from fast_aug import CharsRandomSwapAugmenter

text_data = "Some text!"
augmenter = CharsRandomSwapAugmenter(
    0.5,  # probability of words selection
    0.5,  # probability of characters selection
    None,  # stopwords
)
assert augmenter.augment(text_data) != text_data
assert augmenter.augment([text_data]) != [text_data]
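
Since the library targets real-time use during training, the list form of `augment` shown above can be wrapped in a small batching helper. The following is a minimal sketch, not part of fast-aug: the helper name `augmented_batches` and the `Augmenter` protocol are hypothetical, and it assumes only what the usage example shows, namely that `augment` accepts a list of strings and returns a list.

```python
from typing import Iterator, List, Protocol


class Augmenter(Protocol):
    """Hypothetical interface: any object whose ``augment`` accepts a list of strings."""

    def augment(self, texts: List[str]) -> List[str]: ...


def augmented_batches(
    texts: List[str], augmenter: Augmenter, batch_size: int
) -> Iterator[List[str]]:
    """Yield augmented copies of ``texts`` in batches; the input list is not modified."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # One call per batch, using the list form of augment().
        yield augmenter.augment(batch)
```

With fast-aug installed, the `augmenter` built in the usage example could be passed in directly, for instance `for batch in augmented_batches(train_texts, augmenter, 32): ...`, so that each epoch sees freshly augmented text.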

TBA

Performance Comparison

Comparison of the fast-aug library with other NLP augmentation libraries.

  • fast-aug - this library: a fast augmentation library written in Rust, with Python bindings
  • nlpaug - the most popular NLP augmentation library
  • fasttextaug - a rewrite of some of nlpaug's augmenters in Rust, with Python bindings
  • augly - not included, because "Our text augmentations use nlpaug as their backbone"
  • augmenty - not included, because it is too slow (2-8 times slower than nlpaug)

This is an end-to-end comparison, including dataset loading, class initialization, and augmentation of all samples (one by one or provided as a list).
See ./benchmarks/compare_text.py for details of the comparison.
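
To make the end-to-end setup concrete, here is a minimal timing sketch. It is not the project's `./benchmarks/compare_text.py`: the function name `benchmark` and its parameters `load_texts` and `make_augmenter` are hypothetical, and the only API assumed is an `augment` method that accepts either a single string or a list of strings. Note that `tracemalloc` tracks only memory allocated through Python's allocators, so memory held by a native (e.g. Rust) backend would not show up in this measurement.

```python
import time
import tracemalloc
from typing import Any, Callable, Dict, List


def benchmark(
    load_texts: Callable[[], List[str]],
    make_augmenter: Callable[[], Any],
    as_list: bool,
) -> Dict[str, float]:
    """Time dataset loading, augmenter construction, and augmentation together."""
    tracemalloc.start()
    started = time.perf_counter()

    texts = load_texts()          # dataset loading is part of the measurement
    augmenter = make_augmenter()  # so is class initialization

    if as_list:
        # All samples passed in a single call.
        results = augmenter.augment(texts)
    else:
        # Samples passed one by one.
        results = [augmenter.augment(text) for text in texts]

    elapsed = time.perf_counter() - started
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "seconds": elapsed,
        "peak_python_bytes": float(peak),
        "samples": float(len(results)),
    }
```

Running the same function once with `as_list=True` and once with `as_list=False` for each library gives comparable numbers for the two modes described above.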

[Figures: time comparison; memory comparison]

All libraries were compared on the tweeteval dataset (sentiment test set, 12k samples).
Note: the dataset text file is 1.1 MB, and it is included in the memory usage.

Contributing and Development

Any contribution is warmly welcomed!
Please see the GitHub repository README at fast-aug.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

  • fast_aug-0.1.0-cp312-none-win_amd64.whl (288.4 kB): CPython 3.12, Windows x86-64
  • fast_aug-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB): CPython 3.12, manylinux: glibc 2.17+ x86-64
  • fast_aug-0.1.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.2 MB): CPython 3.12, manylinux: glibc 2.17+ ARM64
  • fast_aug-0.1.0-cp312-cp312-macosx_11_0_arm64.whl (426.7 kB): CPython 3.12, macOS 11.0+ ARM64
  • fast_aug-0.1.0-cp312-cp312-macosx_10_12_x86_64.whl (429.4 kB): CPython 3.12, macOS 10.12+ x86-64
  • fast_aug-0.1.0-cp311-none-win_amd64.whl (287.4 kB): CPython 3.11, Windows x86-64
  • fast_aug-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB): CPython 3.11, manylinux: glibc 2.17+ x86-64
  • fast_aug-0.1.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.2 MB): CPython 3.11, manylinux: glibc 2.17+ ARM64
  • fast_aug-0.1.0-cp311-cp311-macosx_11_0_arm64.whl (426.5 kB): CPython 3.11, macOS 11.0+ ARM64
  • fast_aug-0.1.0-cp311-cp311-macosx_10_12_x86_64.whl (430.3 kB): CPython 3.11, macOS 10.12+ x86-64
  • fast_aug-0.1.0-cp310-none-win_amd64.whl (287.3 kB): CPython 3.10, Windows x86-64
  • fast_aug-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • fast_aug-0.1.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.2 MB): CPython 3.10, manylinux: glibc 2.17+ ARM64
  • fast_aug-0.1.0-cp310-cp310-macosx_11_0_arm64.whl (426.0 kB): CPython 3.10, macOS 11.0+ ARM64
  • fast_aug-0.1.0-cp310-cp310-macosx_10_12_x86_64.whl (429.8 kB): CPython 3.10, macOS 10.12+ x86-64
  • fast_aug-0.1.0-cp39-none-win_amd64.whl (287.7 kB): CPython 3.9, Windows x86-64
  • fast_aug-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB): CPython 3.9, manylinux: glibc 2.17+ x86-64
  • fast_aug-0.1.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.2 MB): CPython 3.9, manylinux: glibc 2.17+ ARM64
  • fast_aug-0.1.0-cp39-cp39-macosx_11_0_arm64.whl (426.4 kB): CPython 3.9, macOS 11.0+ ARM64
  • fast_aug-0.1.0-cp39-cp39-macosx_10_12_x86_64.whl (429.6 kB): CPython 3.9, macOS 10.12+ x86-64
  • fast_aug-0.1.0-cp38-none-win_amd64.whl (287.4 kB): CPython 3.8, Windows x86-64
  • fast_aug-0.1.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB): CPython 3.8, manylinux: glibc 2.17+ x86-64
  • fast_aug-0.1.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.2 MB): CPython 3.8, manylinux: glibc 2.17+ ARM64
  • fast_aug-0.1.0-cp38-cp38-macosx_11_0_arm64.whl (426.0 kB): CPython 3.8, macOS 11.0+ ARM64
  • fast_aug-0.1.0-cp38-cp38-macosx_10_12_x86_64.whl (429.4 kB): CPython 3.8, macOS 10.12+ x86-64
