Python Wrapper for Wikipedia
Wikipedia API
This package provides a Python API for accessing Wikipedia.
Installation
pip3 install wikipedia-api
Usage
import wikipediaapi
# Extract data in Wiki format
wiki_wiki = wikipediaapi.Wikipedia('en')
page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Exists: %s" % page_py.exists())
# Page - Exists: True
print("Page - Id: %s" % page_py.pageid)
# Page - Id: 23862
print("Page - Title: %s" % page_py.title)
# Page - Title: Python (programming language)
print("Page - Summary: %s" % page_py.summary[0:60])
# Page - Summary: Python is a widely used high-level programming language for
def print_sections(sections, level=0):
    for s in sections:
        print("%s: %s - %s" % ("*" * (level + 1), s.title, s.text[0:40]))
        print_sections(s.sections, level + 1)
print_sections(page_py.sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable
# **: Indentation - Python uses whitespace indentation, rath
# **: Statements and control flow - Python's statements include (among other
# **: Expressions - Some Python expressions are similar to l
# ...
def print_langlinks(page):
    langlinks = page.langlinks
    for k in sorted(langlinks.keys()):
        v = langlinks[k]
        print("%s: %s - %s: %s" % (k, v.language, v.title, v.fullurl))
print_langlinks(page_py)
# af: af - Python (programmeertaal): https://af.wikipedia.org/wiki/Python_(programmeertaal)
# als: als - Python (Programmiersprache): https://als.wikipedia.org/wiki/Python_(Programmiersprache)
# an: an - Python: https://an.wikipedia.org/wiki/Python
# ar: ar - بايثون: https://ar.wikipedia.org/wiki/%D8%A8%D8%A7%D9%8A%D8%AB%D9%88%D9%86
# as: as - পাইথন: https://as.wikipedia.org/wiki/%E0%A6%AA%E0%A6%BE%E0%A6%87%E0%A6%A5%E0%A6%A8
# ...
def print_links(page):
    links = page.links
    for title in sorted(links.keys()):
        print("%s: %s" % (title, links[title]))
print_links(page_py)
# 3ds Max: 3ds Max (id: ??, ns: 0)
# ?:: ?: (id: ??, ns: 0)
# ABC (programming language): ABC (programming language) (id: ??, ns: 0)
# ALGOL 68: ALGOL 68 (id: ??, ns: 0)
# Abaqus: Abaqus (id: ??, ns: 0)
# ...
def print_categories(page):
    categories = page.categories
    for title in sorted(categories.keys()):
        print("%s: %s" % (title, categories[title]))
print("Categories")
print_categories(page_py)
# Category:All articles containing potentially dated statements: ...
# Category:All articles with unsourced statements: ...
# Category:Articles containing potentially dated statements from August 2016: ...
# Category:Articles containing potentially dated statements from March 2017: ...
# Category:Articles containing potentially dated statements from September 2017: ...
# ...
section_py = page_py.section_by_title('Features and philosophy')
print("Section - Title: %s" % section_py.title)
# Section - Title: Features and philosophy
print("Section - Text: %s" % section_py.text[0:60])
# Section - Text: Python is a multi-paradigm programming language. Object-orie
# Now let's extract text with HTML tags
wiki_html = wikipediaapi.Wikipedia(
    language='cs',
    extract_format=wikipediaapi.ExtractFormat.HTML
)
page_ostrava = wiki_html.page('Ostrava')
print("Page - Summary: %s" % page_ostrava.summary[0:60])
# Page - Summary: <p><b>Ostrava</b> (polsky <span lang="pl" title="polština" x
page_nonexisting = wiki_wiki.page('Wikipedia-API-FooBar')
print("Page - Exists: %s" % page_nonexisting.exists())
# Page - Exists: False
print("Page - Id: %s" % page_nonexisting.pageid)
# Page - Id: -1
# Create a Wikipedia object for the German-language edition
wiki_de = wikipediaapi.Wikipedia('de')
de_page = wiki_de.page('Deutsche Sprache')
print(de_page.title + ": " + de_page.fullurl)
# Deutsche Sprache: https://de.wikipedia.org/wiki/Deutsche_Sprache
print(de_page.summary[0:60])
# Die deutsche Sprache bzw. Deutsch [dɔʏ̯t͡ʃ], abgekürzt Dt. o
# But you can still fetch data from the English version
en_page = de_page.langlinks['en']
print(en_page.title + ": " + en_page.fullurl)
# German language: https://en.wikipedia.org/wiki/German_language
print(en_page.summary[0:60])
# German (Deutsch [ˈdɔʏt͡ʃ] ( listen)) is a West Germanic lang
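
The recursive print_sections helper in the usage listing prints as it walks the tree. A generator variant can be handy when the outline is needed as data instead. The following is a minimal sketch, not part of the package: FakeSection is a hypothetical stand-in that only mimics the title, text and sections attributes used above, so the example runs without network access.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class FakeSection:
    """Hypothetical stand-in for a section object (title, text, nested sections)."""
    title: str
    text: str = ""
    sections: List["FakeSection"] = field(default_factory=list)


def walk_sections(sections, level=0) -> Iterator[Tuple[int, str]]:
    """Yield (nesting level, title) pairs in document order, depth first."""
    for s in sections:
        yield level, s.title
        yield from walk_sections(s.sections, level + 1)


# Small hand-built tree that imitates part of the outline shown above.
tree = [
    FakeSection("History"),
    FakeSection(
        "Syntax and semantics",
        sections=[FakeSection("Indentation"), FakeSection("Expressions")],
    ),
]

outline = list(walk_sections(tree))
for level, title in outline:
    print("%s: %s" % ("*" * (level + 1), title))
```

Because walk_sections only reads the title and sections attributes, it should also accept page_py.sections from the listing above, assuming those objects expose the same attributes as they do in print_sections.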
External Links
Changelog
0.3.3
Added support for request timeout
Added header: Accept-Encoding: gzip
0.3.2
Added support for property Categories
0.3.1
Removed WikipediaLangLink
Page keeps track of its own language, so it’s easier to jump between different translations of the same page
0.3.0
Rename directory from wikipedia to wikipediaapi to avoid collisions
0.2.4
Handle redirects properly
0.2.3
Use method page instead of article in Wikipedia
0.2.2
Added support for property Links
0.2.1
Added support for property Langlinks
0.2.0
Use properties instead of functions
Added support for property Info
0.1.6
Support for extracting texts with HTML markup
Added initial version of unit tests
0.1.4
It’s possible to extract summary and sections of the page
Added support for property Extracts
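
The 0.3.3 entries above mention a request timeout and an Accept-Encoding: gzip header. How the package wires these internally is not shown here, so the following is only a standard-library sketch of the same two ideas. The helper names (build_request, decode_body, fetch), the endpoint URL and the User-Agent string are assumptions made for illustration.

```python
import gzip
import urllib.request

# Assumption: the public MediaWiki API endpoint for English Wikipedia.
API_URL = "https://en.wikipedia.org/w/api.php?action=query&format=json&titles=Ostrava"


def build_request(url):
    """Create a request that tells the server gzip-compressed bodies are accepted."""
    return urllib.request.Request(
        url,
        headers={
            "Accept-Encoding": "gzip",
            "User-Agent": "example-sketch/0.0",  # hypothetical identifier
        },
    )


def decode_body(raw, content_encoding):
    """Decompress the body if the server used gzip, then decode it as UTF-8."""
    if content_encoding == "gzip":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")


def fetch(url, timeout=10.0):
    """Fetch a URL, giving up if the server does not respond within `timeout` seconds."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return decode_body(resp.read(), resp.headers.get("Content-Encoding", ""))
```

Calling fetch(API_URL, timeout=5.0) performs a real network request; the two helper functions can be exercised offline.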
Source Distribution
File details
Details for the file Wikipedia-API-0.3.3.tar.gz.
File metadata
- Download URL: Wikipedia-API-0.3.3.tar.gz
- Upload date:
- Size: 11.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 216f24239e7d703403e5d82cdeb324a6e58387e63b9e36a18c6f9924c33f4d19
MD5 | 71a34d88a5e6b514fbac328cec5478b9
BLAKE2b-256 | 9fa098debe09405e33dfe10c70909d6c057c084a3184bfdb74574fd76d9184cf
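
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only hashlib; the function names and the local file path passed to them are assumptions, and the expected digest is copied from the table.

```python
import hashlib

# Digest copied from the SHA256 row of the table above.
EXPECTED_SHA256 = "216f24239e7d703403e5d82cdeb324a6e58387e63b9e36a18c6f9924c33f4d19"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_expected("Wikipedia-API-0.3.3.tar.gz") would return True only if a file of that name in the current directory has the listed digest.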