A script that scrapes Wikipedia
Project description
wikipedia-searcher
wikipedia-searcher is a library for scraping information from Wikipedia.
>>> from wikisearch.wikisearcher import WikiSearcher
>>> searcher = WikiSearcher()
>>> searcher.search('React.js')
'React (also known as React.js or ReactJS) is a free and open-source front-end JavaScript library[3] for building user interfaces or UI components.'
If a search term matches several articles, search returns a dictionary that maps each section title to a list of articles. You can also pass one of those articles back to search.
>>> from pprint import pprint
>>> search_result = searcher.search('The Hills')
>>> pprint(search_result)
{'Places': [<wikisearch.article.ArticleLink object at 0x00000208A22EFAC0>,
            <wikisearch.article.ArticleLink object at 0x00000208A22EF7F0>,
            <wikisearch.article.ArticleLink object at 0x00000208A22EF6A0>,
            <wikisearch.article.ArticleLink object at 0x00000208A22EF400>,
            <wikisearch.article.ArticleLink object at 0x00000208A22EF3D0>,
            <wikisearch.article.ArticleLink object at 0x00000208A22EF370>],
 'Popular culture': [<wikisearch.article.ArticleLink object at 0x00000208A22EF940>,
                     <wikisearch.article.ArticleLink object at 0x00000208A22EF880>],
 'See also': [<wikisearch.article.ArticleLink object at 0x00000208A22EF340>,
              <wikisearch.article.ArticleLink object at 0x00000208A22EF8E0>]}
>>> article = search_result['Places'][0]
>>> article.name
'Santa Monica Mountains'
>>> article.link
'/wiki/Santa_Monica_Mountains'
>>> searcher.search(article)
'The Santa Monica Mountains is a coastal mountain range in Southern....'
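Because search can return either a summary string or a dictionary of articles, calling code usually has to handle both cases. The sketch below is one possible way to do that, not part of the library. The helpers section_names and first_summary are hypothetical names introduced here, and FakeSearcher and FakeLink are stand-ins that only imitate the behaviour shown in the sessions above so the example runs offline. With the real package you would pass a WikiSearcher instance instead.

```python
def section_names(result):
    """Map each section title to the names of the articles it lists."""
    return {section: [link.name for link in links]
            for section, links in result.items()}


def first_summary(searcher, query):
    """Return a summary string, following the first listed article if needed."""
    result = searcher.search(query)
    if isinstance(result, str):
        # Unambiguous query: search() already returned the summary.
        return result
    # Ambiguous query: follow the first link of the first non-empty section.
    for links in result.values():
        if links:
            return searcher.search(links[0])
    raise LookupError(f"no articles listed for {query!r}")


# Stand-ins (hypothetical, NOT part of wikipedia-searcher) that mimic the
# documented behaviour: a string for a plain query, a dict for an ambiguous one.
class FakeLink:
    def __init__(self, name, link):
        self.name = name
        self.link = link


class FakeSearcher:
    def search(self, query):
        if isinstance(query, FakeLink):
            return f"Summary of {query.name}"
        if query == "The Hills":
            return {"Places": [FakeLink("Santa Monica Mountains",
                                        "/wiki/Santa_Monica_Mountains")]}
        return f"Summary of {query}"


searcher = FakeSearcher()
print(section_names(searcher.search("The Hills")))
print(first_summary(searcher, "The Hills"))
print(first_summary(searcher, "React.js"))
```

Listing the names first is more readable than the raw pprint output, which only shows object addresses.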
How to Install
wikipedia-searcher is available on PyPI:
$ pip install wikipedia-searcher
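To confirm that the install worked, you can ask Python for the installed version of the distribution. This is a small sketch using only the standard library (Python 3.8+). The helper name installed_version is an assumption introduced here, not something the package provides.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="wikipedia-searcher"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string after a successful install, otherwise None.
print(installed_version())
```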
Hashes for wikipedia_searcher-0.0.5-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | b17ea3c80c28d67c81e93345d9e5371153d08bb35e181e80721e57bcef7df87f
MD5 | f4116f2d36bb86a28d7313e0044847bf
BLAKE2b-256 | e91ffb764621f32281d83283bec35ac94f621b2328f172a9838bab3f1436a7e3