Query Language for Wikipedia
WikipediaQL: querying structured data from Wikipedia
WikipediaQL is an experimental query language and Python library for querying structured data from Wikipedia. It looks like this:
from wikipedia_ql import media_wiki
wikipedia = media_wiki.Wikipedia()
print(wikipedia.query(r'''
    from "Guardians of the Galaxy (film)" {
        page@title as "title";
        section[heading="Cast"] as "cast" {
            li >> text["^(.+?) as (.+?):"] {
                text-group[1] as "actor";
                text-group[2] as "character"
            }
        };
        section[heading="Critical response"] {
            sentence["Rotten Tomatoes"] as "RT ratings" {
                text["\d+%"] as "percent";
                text["(\d+) (critic|review)"] >> text-group[1] as "reviews";
                text["[\d.]+/10"] as "overall"
            }
        }
    }
'''))
# {
#   'title': 'Guardians of the Galaxy (film)',
#   'cast': [
#     {'actor': 'Chris Pratt', 'character': 'Peter Quill / Star-Lord'},
#     {'actor': 'Zoe Saldana', 'character': 'Gamora'},
#     {'actor': 'Dave Bautista', 'character': 'Drax the Destroyer'},
#     {'actor': 'Vin Diesel', 'character': 'Groot'},
#     {'actor': 'Bradley Cooper', 'character': 'Rocket'},
#     {'actor': 'Lee Pace', 'character': 'Ronan the Accuser'},
#     {'actor': 'Michael Rooker', 'character': 'Yondu Udonta'},
#     {'actor': 'Karen Gillan', 'character': 'Nebula'},
#     {'actor': 'Djimon Hounsou', 'character': 'Korath'},
#     {'actor': 'John C. Reilly', 'character': 'Rhomann Dey'},
#     {'actor': 'Glenn Close', 'character': 'Irani Rael'},
#     {'actor': 'Benicio del Toro', 'character': 'Taneleer Tivan / The Collector'}
#   ],
#   'RT ratings': {'percent': '92%', 'reviews': '328', 'overall': '7.82/10'}
# }
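The text selectors in the query above are written as regular expressions, and the result values come back as strings. The following is a minimal sketch of what those patterns capture, using only Python's standard `re` module rather than WikipediaQL itself. It assumes the query patterns behave like Python regexes. The two input strings are illustrative samples built from the values in the output above, not text fetched from Wikipedia, and `first_match` is a hypothetical helper.

```python
import re


def first_match(pattern, text, group=0):
    """Return the requested group of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(group) if m else None


# Illustrative inputs (assumed wording, not fetched from Wikipedia).
cast_item = "Chris Pratt as Peter Quill / Star-Lord: (description omitted)"
rt_sentence = (
    "Rotten Tomatoes reports 92% approval based on 328 reviews, "
    "with an average rating of 7.82/10."
)

# Same pattern as `text["^(.+?) as (.+?):"]`; groups 1 and 2 play the role
# of text-group[1] and text-group[2].
cast_match = re.search(r"^(.+?) as (.+?):", cast_item)
cast_entry = {"actor": cast_match.group(1), "character": cast_match.group(2)}

# Same patterns as the three selectors under "RT ratings".
ratings = {
    "percent": first_match(r"\d+%", rt_sentence),
    "reviews": first_match(r"(\d+) (critic|review)", rt_sentence, group=1),
    "overall": first_match(r"[\d.]+/10", rt_sentence),
}

# The extracted values are strings; convert them when numbers are needed.
numeric = {
    "percent": int(ratings["percent"].rstrip("%")),
    "reviews": int(ratings["reviews"]),
    "overall": float(ratings["overall"].split("/")[0]),
}

print(cast_entry)
print(ratings)
print(numeric)
```

The same conversion step can be applied to the `'RT ratings'` dictionary returned by a real query, since its values have the same string shape.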
Download files
Source Distribution
wikipedia_ql-0.0.2.tar.gz (18.0 kB)
Built Distribution
Hashes for wikipedia_ql-0.0.2-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 80d5a1a7ce987d1fd19dee6498900579d42f9b6f5f1c90c85bf816c42d838f15 |
MD5 | 5f5debdbb74de57f22943b7ef33c048d |
BLAKE2b-256 | 74dc588472ea977b54f1d07c24515da489a8d92451d4858b14f6c6eac92688b9 |
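A downloaded file can be checked against the published SHA256 digest with Python's standard `hashlib` module. The sketch below is one way to do that. The function names `sha256_of_file` and `matches_published_sha256` are hypothetical helpers, not part of WikipediaQL or of any packaging tool.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling `matches_published_sha256` with the local path of `wikipedia_ql-0.0.2-py3-none-any.whl` and the SHA256 value from the table should return `True` for an intact download.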