EasyNERTag: Easy tagging for annotating NER corpora
This tool helps you create a named entity recognition (NER) corpus in CoNLL-2002 format. All it needs is BBCode-like tags around each entity.
Install
pip install easynertag
How to use
I will see you at 10.04 A.M.
10.04 A.M. is the time for me.
Starting from this plain text, suppose I want to build an NER corpus for time tagging, so I need a TIME tag. Add [TIME] before the start of each entity and [/TIME] after its end, like this:
I will see you at [TIME]10.04 A.M.[/TIME]
[TIME]10.04 A.M.[/TIME] is the time for me.
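Because the tags are typed by hand, it can help to check that every opening tag has a matching closing tag before building the corpus. The following is a minimal sketch of such a check, not part of EasyNERTag: the function name find_tag_errors and the tag pattern are hypothetical, and it assumes tags contain only letters and underscores and are not nested.

```python
import re

# Matches an opening tag such as [TIME] or a closing tag such as [/TIME].
# Assumption: tag names contain only letters and underscores.
TAG_PATTERN = re.compile(r"\[(/?)([A-Za-z_]+)\]")


def find_tag_errors(line):
    """Return a list of human-readable problems with the tags in one line."""
    errors = []
    open_tag = None  # name of the tag currently open, if any
    for match in TAG_PATTERN.finditer(line):
        is_closing = match.group(1) == "/"
        name = match.group(2)
        if not is_closing:
            if open_tag is not None:
                errors.append(f"[{name}] opened while [{open_tag}] is still open")
            open_tag = name
        elif open_tag != name:
            errors.append(f"[/{name}] has no matching [{name}]")
        else:
            open_tag = None
    if open_tag is not None:
        errors.append(f"[{open_tag}] is never closed")
    return errors


print(find_tag_errors("I will see you at [TIME]10.04 A.M.[/TIME]"))
# A backslash instead of a slash is not recognised as a closing tag.
print(find_tag_errors(r"I will see you at [TIME]10.04 A.M.[\TIME]"))
```

The first call prints an empty list; the second reports that [TIME] is never closed, which catches the common slip of typing a backslash in the closing tag.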
Next, build the NER corpus:
from easynertag import Engine

# Tagged input: one sentence per line
data = """I will see you at [TIME]10.04 A.M.[/TIME]
[TIME]10.04 A.M.[/TIME] is the time for me."""
list_data = data.splitlines()

# Convert each line to CoNLL-2002 format
build = Engine()
conll2002_list = []
for i in list_data:
    conll2002_list.append(build.text2conll2002(i))
print('\n'.join(conll2002_list))
Output:
I O
will O
see O
you O
at O
O
10.04 B-TIME
A.M. I-TIME
10.04 B-TIME
A.M. I-TIME
O
is O
the O
time O
for O
me. O
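To make the relationship between the bracket tags and the B-/I-/O labels concrete, here is a small standalone sketch of the idea. It is not EasyNERTag's implementation: the function to_bio and the regular expression are hypothetical, it splits on whitespace only, and its output is not guaranteed to match the library's line for line.

```python
import re

# Finds tagged spans such as [TIME]...[/TIME]; \1 requires the same tag name
# in the closing tag. Assumption: tags are not nested.
SPAN_PATTERN = re.compile(r"\[([A-Za-z_]+)\](.*?)\[/\1\]")


def to_bio(line):
    """Convert one tagged line into (token, label) pairs using BIO labels."""
    pairs = []
    position = 0
    for match in SPAN_PATTERN.finditer(line):
        # Text before the entity lies outside any entity: label it O.
        for token in line[position:match.start()].split():
            pairs.append((token, "O"))
        label, entity_text = match.group(1), match.group(2)
        # The first entity token gets B-<label>, the remaining ones I-<label>.
        for index, token in enumerate(entity_text.split()):
            prefix = "B" if index == 0 else "I"
            pairs.append((token, f"{prefix}-{label}"))
        position = match.end()
    # Remaining text after the last entity.
    for token in line[position:].split():
        pairs.append((token, "O"))
    return pairs


for token, label in to_bio("I will see you at [TIME]10.04 A.M.[/TIME]"):
    print(token, label)
```

Each token becomes one row, with the label in the last column, which is the same two-column layout as the output above.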
You can customize the word_tokenize and pos_tag functions in the Engine class.
Engine(
    word_tokenize = function that performs word tokenization (default is white_space_split),
    pos_tag = function that performs part-of-speech tagging
)
For an example of a custom pos_tag, see tests/test_make_tag.py.
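As an illustration of plugging in a tokenizer, the sketch below defines a regex-based one and passes it through the word_tokenize parameter. The function names regex_tokenize and make_engine are hypothetical, and the sketch assumes that word_tokenize receives a string and returns a list of tokens; the exact contract (including whether the text still contains bracket tags when the tokenizer is called) is not stated here, so check it against tests/test_make_tag.py before relying on this.

```python
import re


def regex_tokenize(text):
    """Split text into words, keeping dotted forms such as 10.04 or A.M. whole
    and emitting other punctuation marks as separate tokens."""
    return re.findall(r"\w+(?:\.\w+)*\.?|[^\w\s]", text)


def make_engine():
    """Build an Engine that uses the custom tokenizer.

    Requires easynertag to be installed. Assumption: word_tokenize accepts
    any callable that maps a string to a list of tokens.
    """
    from easynertag import Engine
    return Engine(word_tokenize=regex_tokenize)


print(regex_tokenize("See you at 10.04 A.M., okay?"))
# ['See', 'you', 'at', '10.04', 'A.M.', ',', 'okay', '?']
```

The tokenizer can be tried on its own first; make_engine() only needs to be called once easynertag is available.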
License
Copyright 2022 Wannaphong Phatthiyaphaibun
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.