Easy tagging for annotating an NER corpus

EasyNERTag: Easy tagging for annotating an NER corpus

This tool helps you create a named entity recognition (NER) corpus in CoNLL-2002 format. All it needs is BBCode-style tags around each entity.

Install

pip install easynertag

How to use

I will see you at 10.04 A.M.
10.04 A.M. is the time for me.

From this simple data, I want to build an NER corpus for time tagging, so the data needs a time tag. I add [TIME] before the start of each entity and [/TIME] after its end, like this:

I will see you at [TIME]10.04 A.M.[/TIME]
[TIME]10.04 A.M.[/TIME] is the time for me.

Next, build the NER corpus:

from easynertag import Engine

data = """I will see you at [TIME]10.04 A.M.[/TIME]
[TIME]10.04 A.M.[/TIME] is the time for me."""

# One tagged sentence per line
list_data = data.splitlines()

# Create the EasyNERTag engine with its default settings
build = Engine()

# Convert each tagged sentence to CoNLL-2002 text
conll2002_list = []

for line in list_data:
    conll2002_list.append(build.text2conll2002(line))

print('\n'.join(conll2002_list))

Output:

I       O
will    O
see     O
you     O
at      O
        O
10.04   B-TIME
A.M.    I-TIME

10.04   B-TIME
A.M.    I-TIME
        O
is      O
the     O
time    O
for     O
me.     O
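
To make the mapping from bracket tags to B-/I-/O labels concrete, here is a minimal standalone sketch of the idea. It is not EasyNERTag's implementation: the names TAG_PATTERN, bracket_to_bio and to_conll are hypothetical, and it assumes tags are uppercase or lowercase letters, are not nested, and that tokens are separated by whitespace. Unlike the output above, it drops empty tokens.

```python
import re

# Matches one [TAG]...[/TAG] span; the closing tag must repeat the opening name.
# (Hypothetical pattern for illustration; nested tags are not handled.)
TAG_PATTERN = re.compile(r"\[([A-Za-z_]+)\](.*?)\[/\1\]")


def bracket_to_bio(line):
    """Return (token, label) pairs for one bracket-tagged line."""
    pairs = []
    pos = 0
    for match in TAG_PATTERN.finditer(line):
        # Text before the entity is outside any tag, so each token gets "O".
        pairs += [(tok, "O") for tok in line[pos:match.start()].split()]
        label = match.group(1)
        # The first token of an entity gets B-, the remaining ones get I-.
        for i, tok in enumerate(match.group(2).split()):
            pairs.append((tok, ("B-" if i == 0 else "I-") + label))
        pos = match.end()
    # Whatever follows the last entity is also outside any tag.
    pairs += [(tok, "O") for tok in line[pos:].split()]
    return pairs


def to_conll(line):
    """Format the pairs as tab-separated token/label lines."""
    return "\n".join(f"{tok}\t{lab}" for tok, lab in bracket_to_bio(line))


if __name__ == "__main__":
    print(to_conll("I will see you at [TIME]10.04 A.M.[/TIME]"))
```

Each entity's first token is labeled B-TIME and the following tokens I-TIME, which matches the labeling shown in the output above.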

You can customize word_tokenize and pos_tag in the Engine class.

Engine(
    word_tokenize = function that tokenizes words (default is white_space_split),
    pos_tag = function that does part-of-speech tagging
)

For an example of a custom pos_tag, see tests/test_make_tag.py.
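
As an illustration of a custom tokenizer, the sketch below passes a function that never returns empty strings, which might avoid the empty token tagged O in the sample output. It assumes word_tokenize receives one string and returns a list of token strings; this README does not state that signature, so treat it as a guess. The name drop_empty_split is hypothetical. pos_tag is left at its default because its expected signature is not described here.

```python
def drop_empty_split(text):
    """Split on any run of whitespace, never returning empty strings.

    Assumed signature: str -> list of str (not confirmed by the README).
    """
    return text.split()


try:
    from easynertag import Engine  # available once the package is installed
except ImportError:
    Engine = None

if Engine is not None:
    # word_tokenize is the keyword named in the Engine description above.
    build = Engine(word_tokenize=drop_empty_split)
```

If the assumption holds, build.text2conll2002 can then be called on each tagged line exactly as in the earlier example.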

License

   Copyright 2022 Wannaphong Phatthiyaphaibun

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
