🍰 Making Wikipedia and Wikidata Processing Easy, Like Eating a Piece of Cake

wake

installation

Install with pip3 install wake, or with pipenv install wake.

methods

get_wikidata_entities

Streams Wikidata entities:

from wake import get_wikidata_entities

for entity in get_wikidata_entities():
    print(entity)

You can also filter entities by their type. For example, to get all entities that are instances of human (Q5), run:

from wake import get_wikidata_entities

for human in get_wikidata_entities(instance_of="Q5"):
    print(human)
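
For context on what such a filter means: in the Wikidata JSON data model, "instance of" is property P31, and an entity's statements are listed under its "claims" key. The sketch below shows how the same check could be done by hand on a single entity. It assumes the entity is a dict in the standard Wikidata JSON layout (whether wake yields exactly that shape is an assumption), and is_instance_of is a hypothetical helper, not part of wake.

```python
def is_instance_of(entity, class_id):
    """Return True if the entity has a P31 (instance of) statement pointing at class_id.

    Hypothetical helper; assumes `entity` is a dict in the standard Wikidata JSON layout.
    """
    for statement in entity.get("claims", {}).get("P31", []):
        mainsnak = statement.get("mainsnak", {})
        # "novalue" and "somevalue" snaks carry no datavalue, so skip them
        if mainsnak.get("snaktype") != "value":
            continue
        value = mainsnak.get("datavalue", {}).get("value", {})
        if isinstance(value, dict) and value.get("id") == class_id:
            return True
    return False


# Trimmed-down example entity (Q42), keeping only the fields the helper reads
example_entity = {
    "id": "Q42",
    "claims": {
        "P31": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P31",
                    "datavalue": {
                        "value": {"entity-type": "item", "id": "Q5"},
                        "type": "wikibase-entityid",
                    },
                }
            }
        ]
    },
}

print(is_instance_of(example_entity, "Q5"))
```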

clean_title

Takes the title of a Wikipedia page as a string, then escapes it and strips unusual characters so that it can be stored in an ordinary database.

download_if_necessary

Downloads a URL to the system's temp directory unless a file with the same name is already there.
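
The caching idea described above can be sketched with the standard library alone. This is a minimal illustration, not wake's implementation: temp_path_for and download_if_missing are hypothetical names, and the real function's signature and return value may differ.

```python
import os
import tempfile
from urllib.parse import urlparse
from urllib.request import urlretrieve


def temp_path_for(url):
    """Map a URL to a path in the system temp directory, keyed on the URL's file name."""
    filename = os.path.basename(urlparse(url).path)
    return os.path.join(tempfile.gettempdir(), filename)


def download_if_missing(url):
    """Download url into the temp directory unless a file with that name already exists.

    Hypothetical stand-in for illustration; wake's real function may differ.
    """
    path = temp_path_for(url)
    if not os.path.isfile(path):
        urlretrieve(url, path)  # network access happens only when the file is absent
    return path
```

One consequence of keying on the file name alone is that a stale or partially downloaded file is reused as-is, so deleting it from the temp directory forces a fresh download.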

get_most_recent_available_dump

Determines the most recent Wikipedia dump for which certain subdumps are complete.

tokenize

Pass in the page text from a dump and get a list of tokens in return.

get_links

Gets the links in an article (i.e. what is between '[[' and ']]').
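
As an illustration of what "between '[[' and ']]'" means in wikitext, here is a minimal regex-based sketch. extract_links is a hypothetical helper, not wake's code; in particular, dropping the display label after "|" is an assumption about the desired behavior.

```python
import re

# Matches the text between "[[" and "]]" without crossing another bracket
WIKILINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def extract_links(wikitext):
    """Return link targets found in wikitext (hypothetical helper, not wake's code)."""
    targets = []
    for inner in WIKILINK.findall(wikitext):
        # "[[Target|label]]" displays "label" but points at "Target"
        target = inner.split("|", 1)[0].strip()
        if target:
            targets.append(target)
    return targets


sample = "The [[Eiffel Tower]] is in [[Paris|the French capital]]."
print(extract_links(sample))  # ['Eiffel Tower', 'Paris']
```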

test

python3 -m unittest wake.test

license

CC0-1.0 / Public Domain

contact

Post an issue! Thank you!

Release history

Version	Release date
0.11.0 (this version)	May 8, 2020
0.10.1	May 3, 2020
0.10	May 3, 2020
0.9	May 24, 2018
0.8	May 24, 2018
0.7	May 22, 2018
0.6	Jan 27, 2018
0.5	Dec 2, 2017
0.4	Nov 30, 2017
0.3	Nov 30, 2017
0.2	Nov 30, 2017

Download files

Source Distribution

wake-0.11.0.tar.gz (6.2 kB), uploaded May 8, 2020

File details

Details for the file wake-0.11.0.tar.gz.

File metadata

Download URL: wake-0.11.0.tar.gz
Upload date: May 8, 2020
Size: 6.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.15.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/44.1.0 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/2.7.17

File hashes

Hashes for wake-0.11.0.tar.gz
Algorithm	Hash digest
SHA256	`5794764bda56dcedab3b56680855462462bbc1c7649bc018a013fb0bf1edc371`
MD5	`2a905fc0a262dba913146697f281d8da`
BLAKE2b-256	`549a461d1514f2ffe67acaa5c5b51db312e8ccdf933a43dea50879f3783e9481`
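
A downloaded archive can be checked against the SHA256 digest in the table above. The following is a minimal sketch using Python's hashlib; sha256_of and matches_published_hash are hypothetical helper names, and the file path passed in is whatever location you saved the archive to.

```python
import hashlib

# SHA256 digest listed for wake-0.11.0.tar.gz in the table above
EXPECTED_SHA256 = "5794764bda56dcedab3b56680855462462bbc1c7649bc018a013fb0bf1edc371"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Compare a file's SHA256 digest with the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, matches_published_hash("wake-0.11.0.tar.gz") should return True for an intact copy of the source distribution.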
