A package to run wikification

Project description

wiki_node_disambiguation

What’s this?

  • You can run “wikification” as easily as possible.
    • According to Wikipedia, wikification is, in computer science, entity linking with Wikipedia as the target knowledge base.
  • You get the disambiguated result together with its score.
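
To make "disambiguated result with its score" concrete, here is a minimal sketch of the general idea, not this package's implementation. It assumes each candidate Wikipedia entity has a vector, scores candidates by cosine similarity against a context vector, and ranks them. The function names (`cosine`, `disambiguate`), the article titles, and the 3-dimensional vectors are all made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def disambiguate(context_vector, candidates):
    """Return (title, score) pairs sorted from best to worst match.

    candidates maps an article title to its (hypothetical) entity vector.
    """
    scored = [(title, cosine(context_vector, vec)) for title, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up vectors, for illustration only.
candidates = {
    "Apple Inc.": [0.9, 0.1, 0.0],
    "Apple (fruit)": [0.1, 0.9, 0.2],
}
context = [0.8, 0.2, 0.1]  # hypothetical vector for a sentence about computers
ranking = disambiguate(context, candidates)
print(ranking[0][0])  # title of the best-scoring candidate
```

With these toy numbers the company article ranks first because its vector points in nearly the same direction as the context vector.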

Please also visit the GitHub page. Bug reports via GitHub issues are appreciated, and pull requests are welcome.

Requirement

  • Python 3.x (checked under )
    • I recommend using the “Anaconda” distribution.

Setup

python setup.py install

Get the Wikipedia entity vector model

Go to this page and download the model file, or run download_model.sh.

For those who use interface.predict_japanese_wiki_names()

You need a MySQL server running somewhere you can reach.

The steps before you can use it:

  1. Start a MySQL server.
  2. Download the latest Wikipedia dump files (SQL dumps).
  3. Initialize the Wikipedia database in MySQL.

To download the Wikipedia dump files, run the following commands:

wget https://dumps.wikimedia.org/jawiki/latest/jawiki-latest-redirect.sql.gz
wget https://dumps.wikimedia.org/jawiki/latest/jawiki-latest-page.sql.gz
gunzip jawiki-latest-redirect.sql.gz
gunzip jawiki-latest-page.sql.gz
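
If you would rather do the download from Python than from the shell, the commands above can be sketched as follows. This is a minimal sketch using only the standard library. The helper names `gunzip` and `fetch_dumps` are hypothetical and not part of this package.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path

BASE_URL = "https://dumps.wikimedia.org/jawiki/latest/"
DUMP_FILES = ["jawiki-latest-redirect.sql.gz", "jawiki-latest-page.sql.gz"]


def gunzip(gz_path):
    """Decompress gz_path next to itself, remove the archive, return the new path."""
    gz_path = Path(gz_path)
    sql_path = gz_path.with_suffix("")  # drops the trailing ".gz"
    with gzip.open(gz_path, "rb") as src, open(sql_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    gz_path.unlink()  # like the gunzip command, delete the compressed file
    return sql_path


def fetch_dumps(target_dir="."):
    """Download each dump file into target_dir and decompress it."""
    target = Path(target_dir)
    paths = []
    for name in DUMP_FILES:
        gz_path = target / name
        urllib.request.urlretrieve(BASE_URL + name, gz_path)
        paths.append(gunzip(gz_path))
    return paths
```

Calling `fetch_dumps()` in the directory where you plan to run the MySQL import leaves the two `.sql` files there. The dumps are large, so the download can take a while.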

To initialize the Wikipedia database in MySQL, run:

% mysql -u [user_name] -p[password] -e "CREATE DATABASE wikipedia;"
% mysql -u [user_name] -p[password] wikipedia < jawiki-latest-redirect.sql
% mysql -u [user_name] -p[password] wikipedia < jawiki-latest-page.sql
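
The same initialization can be scripted. The sketch below assumes the `mysql` command-line client is on your PATH and that the dump files are in the current directory. The helper names `build_import_steps` and `run_import` are hypothetical. It passes a bare `-p` so that the client prompts for the password instead of taking it on the command line.

```python
import subprocess


def build_import_steps(user, database="wikipedia",
                       dumps=("jawiki-latest-redirect.sql", "jawiki-latest-page.sql")):
    """Return (argv, dump_file) pairs: create the database, then load each dump."""
    base = ["mysql", "-u", user, "-p"]  # bare -p: the client asks for the password
    steps = [(base + ["-e", f"CREATE DATABASE {database};"], None)]
    steps += [(base + [database], dump) for dump in dumps]
    return steps


def run_import(user):
    """Run every step, feeding each dump file to mysql on standard input."""
    for argv, dump in build_import_steps(user):
        if dump is None:
            subprocess.run(argv, check=True)
        else:
            with open(dump, "rb") as sql:
                subprocess.run(argv, stdin=sql, check=True)
```

Keeping the command construction separate from the execution lets you print and inspect the steps before anything touches the database.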

Change log

  • version 0.1
    • Initial release
    • Supports only Japanese Wikipedia

Release history

  • 0.17 (this version)
  • 0.16
  • 0.15
  • 0.14
  • 0.11

Download files

  • word2vec_wikification_py-0.17.tar.gz (30.5 kB), source distribution, uploaded Mar 15, 2017
