Parsing, Processing, Annotating and Storing WikiTables in JSON
Project description
wikitablewrapper
Install a Python 3 virtual environment
sudo apt-get install curl
sudo apt install python3-pip
sudo apt install virtualenv
virtualenv env --python=python3
cd env && source bin/activate
Install dependencies
pip3 install simplejson
pip3 install xmltodict
Install DBpedia Spotlight locally
On Ubuntu:
sudo snap install docker
Or on Debian:
wget https://download.docker.com/linux/debian/dists/jessie/pool/stable/amd64/docker-ce_17.03.0~ce-0~debian-jessie_amd64.deb
sudo dpkg -i docker-ce_17.03.0~ce-0~debian-jessie_amd64.deb
Check whether the image is already on your computer
sudo docker image ls
If you don't have it, pull the image and start the container
sudo docker pull dbpedia/spotlight-english
sudo docker run -d -p 2222:80 dbpedia/spotlight-english spotlight.sh
To stop it, look up the CONTAINER ID and pass it to docker stop
sudo docker container ls
sudo docker stop CONTAINER_ID
Test it
curl http://localhost:2222/rest/annotate -H "Accept: application/json" --data-urlencode "text=Brazilian state-run giant oil company Petrobras signed a three-year technology and research cooperation agreement with oil service provider Halliburton." --data "confidence=0.3" --data "support=20"
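The same check can be made from Python. The sketch below uses only the standard library and assumes the container is reachable on port 2222, as mapped by the docker run command above. The helper names (build_annotate_request, extract_entities, annotate) are hypothetical and are not part of wikitablewrapper. It also assumes the JSON reply lists entities under "Resources" with "@surfaceForm" and "@URI" keys, which is what the Spotlight annotate endpoint returns.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Spotlight listens on the port mapped by the docker run command.
SPOTLIGHT_URL = "http://localhost:2222/rest/annotate"


def build_annotate_request(text, confidence=0.3, support=20, url=SPOTLIGHT_URL):
    """Build the same form-encoded POST request that the curl command sends."""
    body = urllib.parse.urlencode(
        {"text": text, "confidence": confidence, "support": support}
    ).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Accept": "application/json"}, method="POST"
    )


def extract_entities(response):
    """Return (surface form, DBpedia URI) pairs from a decoded JSON reply."""
    # "Resources" is missing when Spotlight finds no entity in the text.
    return [(r["@surfaceForm"], r["@URI"]) for r in response.get("Resources", [])]


def annotate(text, **kwargs):
    """Send the text to the local Spotlight container and list its entities."""
    request = build_annotate_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as reply:
        return extract_entities(json.loads(reply.read().decode("utf-8")))
```

With the container running, calling annotate("Petrobras signed an agreement with Halliburton.") should return a list of (surface form, URI) pairs. If the container is stopped, urlopen raises urllib.error.URLError.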
To install this package, just run:
pip3 install -i https://test.pypi.org/simple/ wikitablewrapper
How to use it
wt = WikitableWrapper(20)
wt.createHtmlFile = True
wt.includeBabelfy = True
wt.includeTagme = True
wt.includeFremeNer = True
wt.includeDBpediaSpotlightLocal = True
# Then either process a single JSON file
wt.outputFolder="1Out" # Optional
wt.processJson("1Tablas/1003231_2.json")
# or process a whole folder of JSON files
wt.ProcessFolderOfJson("100Tablas","100Out")
Below is an example of how to use the WikitableBenchmark class
wb = WikitableBenchmark()
dSys = {"Babelfy":"_babelfy", "TagME":"_tagme", "FremeNER":"_fremener", "DBpedia Spotlight":"_dbpst"}
wb.MeasureF1_and_Summarize("100Tablas","100Out", dSys, "100Benchmark")
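For readers unfamiliar with the metric named in MeasureF1_and_Summarize, here is a minimal sketch of the standard set-based precision, recall and F1 computation. It is not taken from the package and may differ from what WikitableBenchmark does internally. The function name and the (row, column, URI) annotation tuples are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Compare two collections of annotations and return (precision, recall, F1)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: (row, column, DBpedia URI) for each linked cell.
gold = {
    (0, 0, "dbr:Brazil"),
    (0, 1, "dbr:Petrobras"),
    (1, 0, "dbr:Chile"),
    (1, 1, "dbr:Codelco"),
}
predicted = {
    (0, 0, "dbr:Brazil"),
    (0, 1, "dbr:Petrobras"),
    (1, 1, "dbr:Copper"),
}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Treating each annotation as a hashable tuple keeps the comparison an exact match on both cell position and linked entity.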
Hashes for wikitablewrapper-0.2.6-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 96c6169870ec77b7acd3ffe49768ac37ada6db89922c988379e89e2621014f3a
MD5 | a11dc3118323995e2f36f67dfe7c4002
BLAKE2b-256 | 20a65c860d770b9abc1597a3ec846ad564338499f18222608ff38473c3a70486