Data augmentation tool for named entity recognition

neraug

This Python library helps you augment text data for named entity recognition.

Augmentation Example

[Figure: augmentation example, taken from "An Analysis of Simple Data Augmentation for Named Entity Recognition"]

Installation

To install the library:

pip install neraug

Usage

The following example uses one of the supported algorithms, DictionaryReplacement:

>>> from neraug.augmentator import DictionaryReplacement
>>> from neraug.scheme import IOBES

>>> ne_dic = {'Tokyo Big Sight': 'LOC'}
>>> augmentator = DictionaryReplacement(ne_dic, str.split, IOBES)
>>> x = ['I', 'went', 'to', 'Tokyo']
>>> y = ['O', 'O', 'O', 'S-LOC']
>>> x_augs, y_augs = augmentator.augment(x, y, n=1)   
>>> x_augs
[['I', 'went', 'to', 'Tokyo', 'Big', 'Sight']]
>>> y_augs
[['O', 'O', 'O', 'B-LOC', 'I-LOC', 'E-LOC']]
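In the example above, the single-token mention Tokyo (S-LOC) is replaced by a three-token dictionary entry, so its tags have to be regenerated as B-LOC, I-LOC, E-LOC. The following is a minimal sketch of that re-tagging step in plain Python. It does not use neraug, and the helper name iobes_tags is hypothetical; the library may do this differently internally.

```python
def iobes_tags(n_tokens, label):
    """Return IOBES tags for one mention with n_tokens tokens.

    Hypothetical helper for illustration only; not part of neraug.
    """
    if n_tokens < 1:
        raise ValueError("a mention needs at least one token")
    if n_tokens == 1:
        # A single-token mention gets the S- (single) prefix.
        return [f"S-{label}"]
    # Longer mentions: B- (begin), zero or more I- (inside), E- (end).
    middle = [f"I-{label}"] * (n_tokens - 2)
    return [f"B-{label}", *middle, f"E-{label}"]


mention = "Tokyo Big Sight".split()
tags = iobes_tags(len(mention), "LOC")
```

Here tags lines up one-to-one with the tokens of the replacement mention, which keeps the augmented x and y sequences the same length.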

The library supports the following algorithms:

  • DictionaryReplacement
  • LabelWiseTokenReplacement
  • MentionReplacement
  • ShuffleWithinSegment
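To give a feel for what such an algorithm does, here is a rough sketch of the shuffle-within-segments idea: the sentence is split into segments (each mention, or each run of O tokens), and the tokens inside a segment are shuffled with some probability while the tag sequence stays unchanged. This is plain Python written for illustration. The function names, the parameter p, and the segment-boundary rule are assumptions, not neraug's actual implementation or API.

```python
import random


def split_segments(tags):
    """Return (start, end) index pairs, end exclusive, one per segment.

    Assumption: a new segment starts when the label changes or when a tag
    carries a mention-initial prefix (B-, S- or U-).
    """
    segments = []
    start = 0
    for i in range(1, len(tags)):
        prev_label = tags[i - 1].split("-", 1)[-1]
        label = tags[i].split("-", 1)[-1]
        if label != prev_label or tags[i][:2] in ("B-", "S-", "U-"):
            segments.append((start, i))
            start = i
    if tags:
        segments.append((start, len(tags)))
    return segments


def shuffle_within_segments(tokens, tags, p=0.5, rng=None):
    """Shuffle tokens inside each segment with probability p; tags are kept."""
    rng = rng if rng is not None else random.Random()
    new_tokens = list(tokens)
    for start, end in split_segments(tags):
        if end - start > 1 and rng.random() < p:
            segment = new_tokens[start:end]
            rng.shuffle(segment)
            new_tokens[start:end] = segment
    return new_tokens, list(tags)


tokens = ["I", "went", "to", "Tokyo", "Big", "Sight"]
tags = ["O", "O", "O", "B-LOC", "I-LOC", "E-LOC"]
new_tokens, new_tags = shuffle_within_segments(tokens, tags, p=1.0, rng=random.Random(0))
```

Because tokens never cross a segment boundary, every token keeps a tag of the same entity type, so the labels remain valid after augmentation.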

It supports the following tagging schemes:

  • IOB2
  • IOBES
  • BILOU
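These schemes differ only in how they mark mention boundaries: IOB2 uses B/I/O, IOBES adds E (end) and S (single), and BILOU uses L (last) and U (unit) for the same two roles. If your data is in IOB2 and you want to try another scheme, a conversion along the following lines may help. This is a standalone sketch with a hypothetical function name, not a neraug utility.

```python
def convert_from_iob2(tags, single="S", end="E"):
    """Convert IOB2 tags to IOBES (default) or BILOU (single="U", end="L").

    Hypothetical helper for illustration only; assumes well-formed IOB2 input.
    """
    converted = []
    for i, tag in enumerate(tags):
        if tag == "O":
            converted.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The mention continues only if the next tag is I- with the same label.
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            converted.append(tag if continues else f"{single}-{label}")
        else:  # prefix == "I"
            converted.append(tag if continues else f"{end}-{label}")
    return converted


iob2 = ["O", "O", "O", "B-LOC", "I-LOC", "I-LOC"]
iobes = convert_from_iob2(iob2)
bilou = convert_from_iob2(iob2, single="U", end="L")
```

With the default arguments the result uses IOBES prefixes; passing single="U" and end="L" yields BILOU instead.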

Reference

Thanks to the authors of the following research:

  • An Analysis of Simple Data Augmentation for Named Entity Recognition

Citation

@misc{neraug,
  title={neraug: A data augmentation tool for named entity recognition},
  author={Hiroki Nakayama},
  url={https://github.com/Hironsan/neraug},
  year={2021}
}
