Query Language for Wikipedia
Project description
WikipediaQL: querying structured data from Wikipedia
WikipediaQL is an experimental query language and Python library for querying structured data from Wikipedia. It looks like this:
from wikipedia_ql import media_wiki
wikipedia = media_wiki.Wikipedia()
print(wikipedia.query(r'''
from "Guardians of the Galaxy (film)" {
    page@title as "title";
    section[heading="Cast"] as "cast" {
        li >> text["^(.+?) as (.+?):"] {
            text-group[1] as "actor";
            text-group[2] as "character"
        }
    };
    section[heading="Critical response"] {
        sentence["Rotten Tomatoes"] as "RT ratings" {
            text["\d+%"] as "percent";
            text["(\d+) (critic|review)"] >> text-group[1] as "reviews";
            text["[\d.]+/10"] as "overall"
        }
    }
}
'''))
# {
# 'title': 'Guardians of the Galaxy (film)',
# 'cast': [
#   {'actor': 'Chris Pratt', 'character': 'Peter Quill / Star-Lord'},
#   {'actor': 'Zoe Saldana', 'character': 'Gamora'},
#   {'actor': 'Dave Bautista', 'character': 'Drax the Destroyer'},
#   {'actor': 'Vin Diesel', 'character': 'Groot'},
#   {'actor': 'Bradley Cooper', 'character': 'Rocket'},
#   {'actor': 'Lee Pace', 'character': 'Ronan the Accuser'},
#   {'actor': 'Michael Rooker', 'character': 'Yondu Udonta'},
#   {'actor': 'Karen Gillan', 'character': 'Nebula'},
#   {'actor': 'Djimon Hounsou', 'character': 'Korath'},
#   {'actor': 'John C. Reilly', 'character': 'Rhomann Dey'},
#   {'actor': 'Glenn Close', 'character': 'Irani Rael'},
#   {'actor': 'Benicio del Toro', 'character': 'Taneleer Tivan / The Collector'}
# ],
# 'RT ratings': {'percent': '92%', 'reviews': '328', 'overall': '7.82/10'}
# }
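The text patterns in the query above read like ordinary regular expressions, with text-group[N] picking out the N-th capture group. The sketch below reproduces that extraction step with Python's standard re module only, so it runs without network access. It assumes WikipediaQL's patterns behave like Python regular expressions, which the page does not state. The sample strings and the helper names parse_cast and parse_ratings are hypothetical: they were written to agree with the output shown above and are not quoted from the live Wikipedia page or taken from the library.

```python
import re

# Hypothetical sample texts, written to match the output shown above;
# they are NOT quoted from the live Wikipedia page.
cast_item = "Chris Pratt as Peter Quill / Star-Lord: The half-human leader of the Guardians."
rt_sentence = (
    "On Rotten Tomatoes, the film has an approval rating of 92% "
    "based on 328 reviews, with an average rating of 7.82/10."
)


def parse_cast(text):
    """Mimic: li >> text["^(.+?) as (.+?):"] with text-group[1] and text-group[2]."""
    match = re.search(r"^(.+?) as (.+?):", text)
    if match is None:
        return None  # the list item does not follow the "Actor as Character:" shape
    return {"actor": match.group(1), "character": match.group(2)}


def parse_ratings(sentence):
    """Mimic the three text[...] selectors applied to the "Rotten Tomatoes" sentence."""
    percent = re.search(r"\d+%", sentence)
    reviews = re.search(r"(\d+) (critic|review)", sentence)
    overall = re.search(r"[\d.]+/10", sentence)
    return {
        "percent": percent.group(0) if percent else None,
        # text-group[1] keeps only the number, dropping the word "reviews"
        "reviews": reviews.group(1) if reviews else None,
        "overall": overall.group(0) if overall else None,
    }


cast_entry = parse_cast(cast_item)
ratings = parse_ratings(rt_sentence)
print(cast_entry)
print(ratings)
```

The lazy quantifiers (.+?) matter here: the first group stops at the first " as " and the second stops at the first colon, which is why a character name containing a slash is still captured whole.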
Download files
Source Distribution
wikipedia_ql-0.0.4.tar.gz (24.0 kB)
Built Distribution
Hashes for wikipedia_ql-0.0.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 3dfcd4d2b2e87a4ef0b188235b22a650eb2d09433193424fa560c8e6e50e6cc9
MD5 | 22572f6ff868af0854b036cce33a82c4
BLAKE2b-256 | 2d10894e77e87d0e9c74f8af9b551f610e89564b2e23d57b9009a2ee293b8f40