EasyNERTag: Easy tagging for annotating NER corpora

This tool helps you create a named entity recognition (NER) corpus in CoNLL-2002 format. All it needs is BBCode-like tags around each entity.

Install

pip install easynertag

How to use

I will see you at 10.04 A.M.
10.04 A.M. is the time for me.

From this simple data, I want to build an NER corpus for time tagging, so the data needs a time tag. I just add [TIME] before the start of each entity and [/TIME] after its end, like this:

I will see you at [TIME]10.04 A.M.[/TIME]
[TIME]10.04 A.M.[/TIME] is the time for me.
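
When the entity positions are already known, the tags can also be inserted programmatically. The following is a minimal sketch; `wrap_entity` is a hypothetical helper, not part of EasyNERTag, and it upper-cases the label only to match the example above.

```python
def wrap_entity(text, start, end, label):
    """Wrap text[start:end] in [LABEL]...[/LABEL] tags (hypothetical helper)."""
    if not 0 <= start < end <= len(text):
        raise ValueError("start and end must delimit a non-empty span inside text")
    label = label.upper()  # assumption: labels are written in upper case, as in the example
    return f"{text[:start]}[{label}]{text[start:end]}[/{label}]{text[end:]}"


sentence = "I will see you at 10.04 A.M."
start = sentence.index("10.04")
print(wrap_entity(sentence, start, len(sentence), "time"))
# prints: I will see you at [TIME]10.04 A.M.[/TIME]
```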

Next, build the NER corpus:

data = """I will see you at [TIME]10.04 A.M.[/TIME]
[TIME]10.04 A.M.[/TIME] is the time for me."""

list_data = data.splitlines()

# Convert each tagged line to CoNLL-2002 format with EasyNERTag
from easynertag import Engine
build = Engine()

conll2002_list = []

for i in list_data:
    conll2002_list.append(build.text2conll2002(i))

print('\n'.join(conll2002_list))

Output:

I       O
will    O
see     O
you     O
at      O
        O
10.04   B-TIME
A.M.    I-TIME

10.04   B-TIME
A.M.    I-TIME
        O
is      O
the     O
time    O
for     O
me.     O
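
To make the mapping from tags to B-/I-/O labels concrete, here is a rough standalone sketch of the idea. It is not EasyNERTag's implementation: `tagged_to_bio` and `TAG_PATTERN` are hypothetical names, tokens are split on whitespace, and, unlike the output above, empty tokens are dropped.

```python
import re

# Matches [LABEL]entity text[/LABEL]; the backreference \1 requires matching labels.
TAG_PATTERN = re.compile(r"\[([A-Za-z_]+)\](.*?)\[/\1\]")


def tagged_to_bio(text):
    """Return (token, tag) pairs for one tagged sentence (illustrative only)."""
    pairs = []
    position = 0
    for match in TAG_PATTERN.finditer(text):
        # Text before the entity lies outside any entity, so each token gets "O".
        pairs.extend((token, "O") for token in text[position:match.start()].split())
        label = match.group(1).upper()
        # The first entity token gets B-LABEL, the following ones I-LABEL.
        for index, token in enumerate(match.group(2).split()):
            prefix = "B" if index == 0 else "I"
            pairs.append((token, f"{prefix}-{label}"))
        position = match.end()
    # Whatever remains after the last entity is also outside any entity.
    pairs.extend((token, "O") for token in text[position:].split())
    return pairs


for token, tag in tagged_to_bio("I will see you at [TIME]10.04 A.M.[/TIME]"):
    print(f"{token}\t{tag}")
```

For real corpora, use `Engine.text2conll2002` as shown above; the sketch only illustrates the tagging scheme.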

You can customize word_tokenize and pos_tag in the Engine class.

Engine(
    word_tokenize: function that performs word tokenization (default is white_space_split),
    pos_tag: function that performs part-of-speech tagging
)

You can see a custom pos_tag in tests/test_make_tag.py.
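
As a hedged illustration of the word_tokenize option, the sketch below passes a custom tokenizer to Engine. It assumes the tokenizer receives a string and returns a list of tokens, which the README does not spell out; `drop_empty_split` is a hypothetical name. The import is guarded so the snippet still runs when EasyNERTag is not installed.

```python
def drop_empty_split(text):
    """Split on runs of whitespace and drop empty tokens (hypothetical tokenizer)."""
    return text.split()


print(drop_empty_split("I will see you at "))
# prints: ['I', 'will', 'see', 'you', 'at']

try:
    from easynertag import Engine
except ImportError:
    Engine = None  # EasyNERTag is not installed; skip the library call.

if Engine is not None:
    # Assumption: word_tokenize is called with a string and must return a list of tokens.
    build = Engine(word_tokenize=drop_empty_split)
    print(build.text2conll2002("I will see you at [TIME]10.04 A.M.[/TIME]"))
```

A tokenizer like this may be useful if the empty tokens tagged O in the output above are unwanted, though the exact effect depends on how Engine calls the function.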

License

   Copyright 2022 Wannaphong Phatthiyaphaibun

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
