Parsing, Processing, Annotating and Storing WikiTables in JSON

Project description

wikitablewrapper

Install a Python 3 virtual environment

sudo apt-get install curl
sudo apt install python3-pip
sudo apt install virtualenv
virtualenv env --python=python3
cd env && source bin/activate

Install dependencies

pip3 install simplejson
pip3 install xmltodict

Install DBpedia Spotlight locally

DBpedia Spotlight runs in Docker, so install Docker first. On Ubuntu:

sudo snap install docker

or on Debian:

wget https://download.docker.com/linux/debian/dists/jessie/pool/stable/amd64/docker-ce_17.03.0~ce-0~debian-jessie_amd64.deb
sudo dpkg -i docker-ce_17.03.0~ce-0~debian-jessie_amd64.deb

Check whether the image is already on your computer:

sudo docker image ls

If you don't have it, pull the image and start the container:

sudo docker pull dbpedia/spotlight-english
sudo docker run -d -p 2222:80 dbpedia/spotlight-english spotlight.sh

To stop the container, look up its CONTAINER ID and pass it to docker stop:

sudo docker container ls
sudo docker stop CONTAINER_ID

Test it:

curl http://localhost:2222/rest/annotate -H "Accept: application/json" --data-urlencode "text=Brazilian state-run giant oil company Petrobras signed a three-year technology and research cooperation agreement with oil service provider Halliburton." --data "confidence=0.3" --data "support=20"
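
The same check can be made from Python with only the standard library. The following is a minimal sketch, not part of wikitablewrapper: the helper names (build_annotate_request, extract_entities, annotate) are hypothetical, and it assumes the container is published on port 2222 as above and that the JSON response lists entities under "Resources" with "@surfaceForm" and "@URI" keys.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the container started above, published on port 2222.
SPOTLIGHT_URL = "http://localhost:2222/rest/annotate"


def build_annotate_request(text, confidence=0.3, support=20, url=SPOTLIGHT_URL):
    """Build a POST request equivalent to the curl call above."""
    body = urllib.parse.urlencode(
        {"text": text, "confidence": confidence, "support": support}
    ).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Accept": "application/json"}, method="POST"
    )


def extract_entities(response):
    """Return (surface form, URI) pairs from a decoded JSON response.

    Assumes entities are listed under "Resources"; an empty or missing
    list yields no pairs.
    """
    return [
        (r.get("@surfaceForm"), r.get("@URI"))
        for r in response.get("Resources", [])
    ]


def annotate(text, **kwargs):
    """Send the request to the local service and return the entity pairs."""
    req = build_annotate_request(text, **kwargs)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return extract_entities(json.load(resp))
```

With the container running, annotate("Petrobras signed an agreement with Halliburton.") should return the recognized surface forms paired with their DBpedia URIs. If the container is stopped, the call raises urllib.error.URLError.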

To install this package, just run:

pip3 install -i https://test.pypi.org/simple/ wikitablewrapper

How to use it

wt = WikitableWrapper(20)
wt.createHtmlFile = True

wt.includeBabelfy = True
wt.includeTagme = True
wt.includeFremeNer = True
wt.includeDBpediaSpotlightLocal = True

# Either process a single JSON file
wt.outputFolder="1Out"  # Optional
wt.processJson("1Tablas/1003231_2.json")

# or process a whole folder of JSON files
wt.ProcessFolderOfJson("100Tablas","100Out")

Below is an example of how to use the WikitableBenchmark class:

wb = WikitableBenchmark()
dSys = {"Babelfy":"_babelfy", "TagME":"_tagme", "FremeNER":"_fremener", "DBpedia Spotlight":"_dbpst"}
wb.MeasureF1_and_Summarize("100Tablas","100Out", dSys, "100Benchmark")
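
For readers unfamiliar with the metric named in MeasureF1_and_Summarize, here is a minimal sketch of how precision, recall and F1 are conventionally computed from two sets of annotations. It is not the package's implementation. The function name and the (row, column, URI) tuple layout are assumptions made for illustration, as is the idea of comparing one system's output against a reference set.

```python
def precision_recall_f1(reference, predicted):
    """Compute precision, recall and F1 between two annotation collections.

    Each annotation is any hashable item; here a hypothetical
    (row, column, URI) tuple identifies an entity linked in a table cell.
    """
    reference, predicted = set(reference), set(predicted)
    true_positives = len(reference & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: three reference links, two predicted, one in common.
reference = {(0, 1, "dbr:Petrobras"), (0, 2, "dbr:Brazil"), (1, 1, "dbr:Halliburton")}
predicted = {(0, 1, "dbr:Petrobras"), (1, 1, "dbr:Houston")}
p, r, f1 = precision_recall_f1(reference, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, so a system scores well only when it links most of the reference entities without adding many wrong ones.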

Download files

Source Distribution

wikitablewrapper-0.2.6.tar.gz (33.6 kB)

Built Distribution

wikitablewrapper-0.2.6-py3-none-any.whl (33.3 kB, Python 3)
