Skip to main content

Data augmentation tool for named entity recognition

Project description


This python library helps you with augmenting text data for named entity recognition.

Augmentation Example

Reference from An Analysis of Simple Data Augmentation for Named Entity Recognition


To install the library:

pip install neraug


One of the example algorithms: DictionaryReplacement:

>>> from neraug.augmentator import DictionaryReplacement
>>> from neraug.scheme import IOBES

>>> ne_dic = {'Tokyo Big Sight': 'LOC'}
>>> augmentator = DictionaryReplacement(ne_dic, str.split, IOBES)
>>> x = ['I', 'went', 'to', 'Tokyo']
>>> y = ['O', 'O', 'O', 'S-LOC']
>>> x_augs, y_augs = augmentator.augment(x, y, n=1)   
>>> x_augs
[['I', 'went', 'to', 'Tokyo', 'Big', 'Sight']]
>>> y_augs
[['O', 'O', 'O', 'B-LOC', 'I-LOC', 'E-LOC']]

The library supports the following algorithms:

  • DictionaryReplacement
  • LabelWiseTokenReplacement
  • MentionReplacement
  • ShuffleWithinSegment

and supports the following scheme:

  • IOB2


Appreciate for the following research:


  title={neraug: A data augmentation tool for named entity recognition},
  author={Hiroki Nakayama},

Project details

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

neraug-0.1.1.tar.gz (123.6 kB view hashes)

Uploaded source

Built Distribution

neraug-0.1.1-py3-none-any.whl (4.6 kB view hashes)

Uploaded py3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page