Parsing, Processing, Annotating and Storing WikiTables in JSON

wikitablewrapper

Install a Python 3 virtual environment

sudo apt-get install curl
sudo apt install python3-pip
sudo apt install virtualenv
virtualenv env --python=python3
cd env && source bin/activate

Install dependencies

pip3 install simplejson
pip3 install xmltodict

Install DBpedia Spotlight locally. First install Docker, on Ubuntu:

sudo snap install docker

or, on Debian:

wget https://download.docker.com/linux/debian/dists/jessie/pool/stable/amd64/docker-ce_17.03.0~ce-0~debian-jessie_amd64.deb
sudo dpkg -i docker-ce_17.03.0~ce-0~debian-jessie_amd64.deb

Check whether the image is already on your computer

sudo docker image ls

If you don't have it, pull it and start the container

sudo docker pull dbpedia/spotlight-english
sudo docker run -d -p 2222:80 dbpedia/spotlight-english spotlight.sh

To stop the container, look up its CONTAINER ID and pass it to docker stop

sudo docker container ls
sudo docker stop CONTAINER_ID

Test that the service responds

curl http://localhost:2222/rest/annotate -H "Accept: application/json" --data-urlencode "text=Brazilian state-run giant oil company Petrobras signed a three-year technology and research cooperation agreement with oil service provider Halliburton." --data "confidence=0.3" --data "support=20"
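
The same request can be sent from Python. The following is a minimal sketch using only the standard library, not part of wikitablewrapper: the helper names build_request, extract_uris and annotate are hypothetical, the port 2222 comes from the docker run command above, and the "Resources" and "@URI" keys are assumed from the JSON that DBpedia Spotlight usually returns.

```python
import json
import urllib.parse
import urllib.request

# Port 2222 matches the "docker run -p 2222:80" command above.
SPOTLIGHT_URL = "http://localhost:2222/rest/annotate"


def build_request(text, confidence=0.3, support=20, url=SPOTLIGHT_URL):
    """Build the same form-encoded POST request that the curl command sends."""
    body = urllib.parse.urlencode(
        {"text": text, "confidence": confidence, "support": support}
    ).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Accept": "application/json"}
    )


def extract_uris(response):
    """Return the DBpedia URIs listed under "Resources" (assumed key names)."""
    return [resource["@URI"] for resource in response.get("Resources", [])]


def annotate(text, **kwargs):
    """Send the text to the local service and return the decoded JSON reply."""
    request = build_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

With the container running, extract_uris(annotate("Petrobras signed an agreement with Halliburton.")) should list the entities that Spotlight recognized. If the connection is refused, the container may still be loading its models.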

To install this package, just run:

pip3 install -i https://test.pypi.org/simple/ wikitablewrapper

How to use it

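# Assumed import path (not stated in the original); adjust it to the package's actual layout
from wikitablewrapper import WikitableWrapper

wt = WikitableWrapper(20)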
wt.createHtmlFile = True

wt.includeBabelfy = True
wt.includeTagme = True
wt.includeFremeNer = True
wt.includeDBpediaSpotlightLocal = True

# Either process a single JSON file
wt.outputFolder="1Out"  # Optional
wt.processJson("1Tablas/1003231_2.json")

# or process every JSON file in a folder (input folder, output folder)
wt.ProcessFolderOfJson("100Tablas","100Out")

See below an example of how to use the Benchmark class

wb = WikitableBenchmark()
dSys = {"Babelfy":"_babelfy", "TagME":"_tagme", "FremeNER":"_fremener", "DBpedia Spotlight":"_dbpst"}
wb.MeasureF1_and_Summarize("100Tablas","100Out", dSys, "100Benchmark")
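
How MeasureF1_and_Summarize scores each system is not described here. As a rough illustration only, the standard F1 measure between a reference set of annotations and a system's annotations can be sketched as follows. The function f1_score and the (cell, URI) pair representation are hypothetical and are not taken from the package.

```python
def f1_score(gold, predicted):
    """Return (precision, recall, F1) for two collections of annotations."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    # Precision: share of the system's annotations that are in the reference.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of the reference annotations that the system found.
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical annotations: (cell position, linked entity) pairs.
gold = {("r1c1", "dbr:Petrobras"), ("r1c2", "dbr:Halliburton")}
system = {("r1c1", "dbr:Petrobras"), ("r1c2", "dbr:Brazil")}
precision, recall, f1 = f1_score(gold, system)
```

Here one of the two system annotations matches the reference, so precision, recall and F1 all equal 0.5.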
