Fast NLP augmentation library with a Rust backend
fast-aug - Python bindings
fast-aug is a library for fast text augmentation, available for both Rust and Python as fast-aug.
It is designed with a focus on performance and real-time usage (e.g. during training), while providing a wide range of text augmentation methods.
Note: 25x faster than nlpaug!
Installation
fast-aug is available on PyPI.
pip install fast-aug
Usage
from fast_aug import CharsRandomSwapAugmenter

text_data = "Some text!"
augmenter = CharsRandomSwapAugmenter(
    0.5,  # probability of words selection
    0.5,  # probability of characters selection
    None,  # stopwords
)
# augment() accepts a single string or a list of strings
assert augmenter.augment(text_data) != text_data
assert augmenter.augment([text_data]) != [text_data]
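To make the three constructor arguments above more concrete, here is a minimal pure-Python sketch of what a character random-swap augmentation could look like. It is not the library's implementation: the function name swap_chars_sketch, the seed argument, and the choice to swap adjacent characters are all assumptions made for illustration.

```python
import random


def swap_chars_sketch(text, word_prob, char_prob, stopwords=None, seed=None):
    """Illustration only (hypothetical helper, not part of fast-aug).

    Each word is selected with probability word_prob; inside a selected
    word, each adjacent character pair is swapped with probability char_prob.
    Words listed in stopwords are never modified.
    """
    rng = random.Random(seed)
    stopwords = stopwords or set()
    augmented = []
    for word in text.split(" "):
        # Leave stopwords, one-character words, and unselected words as-is.
        if word in stopwords or len(word) < 2 or rng.random() >= word_prob:
            augmented.append(word)
            continue
        chars = list(word)
        i = 0
        while i < len(chars) - 1:
            if rng.random() < char_prob:
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
                i += 2  # move past the pair that was just swapped
            else:
                i += 1
        augmented.append("".join(chars))
    return " ".join(augmented)


print(swap_chars_sketch("Some text!", 0.5, 0.5, seed=0))
```

With both probabilities set to 0.0 the text comes back unchanged, which is a quick way to sanity-check the selection logic.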
TBA
Performance Comparison
Comparison of the fast-aug library with other NLP augmentation libraries.
- fast-aug - this library: fast augmentation written in Rust, with Python bindings
- nlpaug - the most popular NLP augmentation library
- fasttextaug - a rewrite of some of nlpaug's augmenters in Rust, with Python bindings
- augly - not included, as "Our text augmentations use nlpaug as their backbone"
- augmenty - not included, as it is too slow (2-8 times slower than nlpaug)
This is an end-to-end comparison, including dataset loading, class initialization, and augmentation of all samples (one by one or provided as a list).
See ./benchmarks/compare_text.py for details of the comparison.
All libraries were compared on the tweeteval dataset (sentiment test set, 12k samples).
Note: the dataset text file is 1.1 MB, and it is included in the memory usage.
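As a rough illustration of the two modes mentioned above (samples passed one by one versus as a single list), the sketch below times both and records peak memory. It is not the contents of ./benchmarks/compare_text.py: the names benchmark and stand_in_augment are hypothetical, the stand-in simply reverses strings so the snippet runs without any augmentation library installed, and tracemalloc only sees Python-level allocations, so memory used inside a native extension would not be counted.

```python
import time
import tracemalloc


def benchmark(augment, samples, batched):
    """Run augment over samples; return (results, seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    if batched:
        results = augment(samples)  # one call with the whole list
    else:
        results = [augment(sample) for sample in samples]  # one call per sample
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return results, elapsed, peak


def stand_in_augment(data):
    """Placeholder augmenter: accepts a string or a list of strings."""
    if isinstance(data, str):
        return data[::-1]
    return [item[::-1] for item in data]


samples = ["Some text!"] * 1000
one_by_one, t_single, mem_single = benchmark(stand_in_augment, samples, batched=False)
as_list, t_batch, mem_batch = benchmark(stand_in_augment, samples, batched=True)
print(f"one-by-one: {t_single:.4f}s, peak {mem_single} bytes")
print(f"as list:    {t_batch:.4f}s, peak {mem_batch} bytes")
```

To try this against a real augmenter, one would pass its augment method in place of stand_in_augment, assuming it accepts both a string and a list as in the Usage section.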
Contributing and Development
Any contribution is warmly welcomed!
Please see the GitHub repository README at fast-aug.