A tool for learning vector representations of words and entities from Wikipedia
Wikipedia2Vec
=============
[![Fury badge](https://badge.fury.io/py/wikipedia2vec.png)](http://badge.fury.io/py/wikipedia2vec)
[![CircleCI](https://circleci.com/gh/wikipedia2vec/wikipedia2vec.svg?style=svg)](https://circleci.com/gh/wikipedia2vec/wikipedia2vec)
Wikipedia2Vec is a tool for obtaining embeddings (vector representations) of words and entities from Wikipedia.
It is developed and maintained by [Studio Ousia](http://www.ousia.jp).
This tool enables you to learn embeddings that map words and entities into a unified continuous vector space.
The resulting vectors can be used as word embeddings, as entity embeddings, or as unified embeddings of both words and entities.
They have been used in state-of-the-art models for tasks such as [entity linking](https://arxiv.org/abs/1601.01343), [named entity recognition](http://www.aclweb.org/anthology/I17-2017), [entity relatedness](https://arxiv.org/abs/1601.01343), and [question answering](https://arxiv.org/abs/1803.08652).
Documentation and pretrained embeddings are available online at [http://wikipedia2vec.github.io/](http://wikipedia2vec.github.io/).
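To make the idea of a unified vector space concrete, here is a minimal sketch in plain Python. It does not use the Wikipedia2Vec API: the vectors are made-up three-dimensional toy values, and the names `words`, `entities`, `cosine`, `nearest`, and the `ENTITY/` key prefix are assumptions chosen for illustration. The point is only that, once words and entities live in the same space, a single similarity function can compare a word with an entity directly.

```python
import math

# Toy vectors, invented for illustration. Real embeddings are learned from
# Wikipedia and usually have a few hundred dimensions.
words = {
    "film": [0.9, 0.1, 0.0],
    "physics": [0.0, 0.2, 0.9],
}
entities = {
    "ENTITY/Star Wars": [0.8, 0.3, 0.1],
    "ENTITY/Albert Einstein": [0.1, 0.1, 0.95],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(query, space, k=2):
    """Return the k items in `space` most similar to `query`, best first."""
    q = space[query]
    scored = [(name, cosine(q, vec)) for name, vec in space.items() if name != query]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Words and entities share one space, so they can be merged and searched together.
space = {**words, **entities}

for name, score in nearest("film", space):
    print(name, round(score, 3))
```

With these toy values, the nearest neighbor of the word `film` is an entity rather than another word, which is the kind of cross-type lookup a unified space allows.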
Reference
---------
If you use Wikipedia2Vec in a scientific publication, please cite the following paper:

    @InProceedings{yamada-EtAl:2016:CoNLL,
      author    = {Yamada, Ikuya and Shindo, Hiroyuki and Takeda, Hideaki and Takefuji, Yoshiyasu},
      title     = {Joint Learning of the Embedding of Words and Entities for Named Entity Disambiguation},
      booktitle = {Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning},
      month     = {August},
      year      = {2016},
      address   = {Berlin, Germany},
      pages     = {250--259},
      publisher = {Association for Computational Linguistics}
    }

License
-------
[Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0)