Japanese Entity Linker.
jel: Japanese Entity Linker
- jel (Japanese Entity Linker) is a bi-encoder-based entity linker for Japanese.
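As a rough illustration of the general bi-encoder idea (not jel's actual implementation), a mention and each candidate entity are embedded independently and then compared with a similarity score. The sketch below assumes inner-product scoring and uses tiny hand-written vectors in place of learned encoders; `rank_candidates`, the vectors, and the entity names are all hypothetical.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_candidates(
    mention_vec: List[float],
    entity_vecs: Dict[str, List[float]],
) -> List[Tuple[str, float]]:
    """Score every entity against the mention and sort by descending score."""
    scored = [(title, dot(mention_vec, vec)) for title, vec in entity_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings; a real bi-encoder would compute these
# with two separate neural encoders (one for mentions, one for entities).
mention = [1.0, 0.0, 0.5]
entities = {
    "entity_a": [0.9, 0.1, 0.4],
    "entity_b": [0.0, 1.0, 0.0],
    "entity_c": [0.5, 0.5, 0.5],
}

ranking = rank_candidates(mention, entities)
print(ranking[0][0])
```

Because the two sides are encoded separately, entity vectors can be computed once and indexed ahead of time, which is what makes nearest-neighbour search over a large title set practical.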
Usage
- Currently, the link and question methods are supported.
el.link
- This returns named entities and, for each one, its candidate entities from Wikipedia titles.
from jel import EntityLinker
el = EntityLinker()
el.link('今日は東京都のマックにアップルを買いに行き、スティーブジョブスとドナルドに会い、堀田区に引っ越した。')
>> [
{
"text": "東京都",
"label": "GPE",
"span": [
3,
6
],
"predicted_normalized_entities": [
[
"東京都庁",
0.1084
],
[
"東京",
0.0633
],
[
"国家地方警察東京都本部",
0.0604
],
[
"東京都",
0.0598
],
...
]
},
{
"text": "アップル",
"label": "ORG",
"span": [
11,
15
],
"predicted_normalized_entities": [
[
"アップル",
0.2986
],
[
"アップル インコーポレイテッド",
0.1792
],
…
]
}
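A minimal sketch of how the returned structure might be post-processed, assuming it is a list of dicts shaped like the example output above. The data below is copied from that output and truncated to two candidates per mention; `top_candidates` is a hypothetical helper, not part of jel.

```python
from typing import Any, Dict, List, Optional, Tuple

sentence = "今日は東京都のマックにアップルを買いに行き、スティーブジョブスとドナルドに会い、堀田区に引っ越した。"

# Sample data copied from the example output, truncated to two candidates each.
sample_result: List[Dict[str, Any]] = [
    {
        "text": "東京都",
        "label": "GPE",
        "span": [3, 6],
        "predicted_normalized_entities": [["東京都庁", 0.1084], ["東京", 0.0633]],
    },
    {
        "text": "アップル",
        "label": "ORG",
        "span": [11, 15],
        "predicted_normalized_entities": [
            ["アップル", 0.2986],
            ["アップル インコーポレイテッド", 0.1792],
        ],
    },
]


def top_candidates(
    mentions: List[Dict[str, Any]],
) -> Dict[str, Optional[Tuple[str, float]]]:
    """Map each mention text to its highest-scoring candidate (None if it has no candidates)."""
    best: Dict[str, Optional[Tuple[str, float]]] = {}
    for mention in mentions:
        candidates = mention.get("predicted_normalized_entities", [])
        best[mention["text"]] = (
            tuple(max(candidates, key=lambda c: c[1])) if candidates else None
        )
    return best


best = top_candidates(sample_result)
for text, candidate in best.items():
    print(text, "->", candidate)

# In this example the spans are half-open character offsets into the input string.
for mention in sample_result:
    start, end = mention["span"]
    print(sentence[start:end] == mention["text"])
```

Note that the highest-scoring candidate is not always the exact title match: in the example, 東京都庁 outranks 東京都, so downstream code may want to inspect several candidates rather than only the first.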
el.question
- This returns candidate entities for any question, drawn from Wikipedia titles.
>>> el.question('日本の総理大臣は?')
[('菅内閣', 0.05791765857101555), ('枢密院', 0.05592481946602986), ('党', 0.05430194711042564), ('総選挙', 0.052795400668513175)]
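The candidates come back as (title, score) pairs, and in the example the scores are low and close together. A minimal sketch of filtering them by a score threshold follows, using the pairs from the output shown above; `filter_by_score` and the cutoff of 0.055 are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Candidate pairs copied from the example output.
sample_candidates: List[Tuple[str, float]] = [
    ("菅内閣", 0.05791765857101555),
    ("枢密院", 0.05592481946602986),
    ("党", 0.05430194711042564),
    ("総選挙", 0.052795400668513175),
]


def filter_by_score(
    candidates: List[Tuple[str, float]], min_score: float
) -> List[Tuple[str, float]]:
    """Keep only the candidates whose score is at least min_score, preserving order."""
    return [(title, score) for title, score in candidates if score >= min_score]


kept = filter_by_score(sample_candidates, min_score=0.055)
print([title for title, _ in kept])
```

A suitable threshold depends on the model and the data, so it would need to be tuned rather than taken from this sketch.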
Setup
$ pip install jel
$ python -m spacy download ja_core_news_md
Test
$ python -m pytest
Notes
- Installing faiss==1.5.3 from pip causes a _swigfaiss error.
- To solve this, see this issue.
LICENSE
Apache 2.0 License.
CITATION
@INPROCEEDINGS{manabe2019chive,
author = {真鍋陽俊, 岡照晃, 海川祥毅, 髙岡一馬, 内田佳孝, 浅原正幸},
title = {複数粒度の分割結果に基づく日本語単語分散表現},
booktitle = "言語処理学会第25回年次大会(NLP2019)",
year = "2019",
pages = "NLP2019-P8-5",
publisher = "言語処理学会",
}