Scrapewiki - Wikipedia Scraper

A simple Wikipedia scraper that can scrape Wikipedia both synchronously and asynchronously.
scrapewiki.Scrapewiki has two methods: search and wiki.

wiki
Scrapes a Wikipedia page.

search
Searches Wikipedia for a query. An optional limit parameter caps the number of results.
Examples
Asynchronous:
import scrapewiki
import trio

wiki = scrapewiki.Scrapewiki()

async def main():
    async with wiki.search("python") as results:
        async for search_result in results:
            ...

    # equivalent of
    searcher = wiki.search("python")
    results = await searcher.async_method()

trio.run(main)
import scrapewiki
import trio

wiki = scrapewiki.Scrapewiki()

async def main():
    async with wiki.wiki("python") as page:
        ...

    # equivalent of
    page_scraper = wiki.wiki("python")
    page = await page_scraper.async_method()

trio.run(main)
Synchronous:
import scrapewiki

wiki = scrapewiki.Scrapewiki()

with wiki.search("python", limit=45) as results:
    for search_result in results:
        ...

# equivalent of
searcher = wiki.search("python")
results = searcher.sync_method()
import scrapewiki

wiki = scrapewiki.Scrapewiki()

with wiki.wiki("python") as page:
    ...

# equivalent of
page_scraper = wiki.wiki("python")
page = page_scraper.sync_method()
Extras
The module also provides some utility functions for ease of use (currently just one).
Plans
There are a lot of things that need to be parsed, and a lot of bugs that need to be fixed. I'm pretty sure there are some typos in docstrings and wrong annotations as well. My plan for now is to fix these problems.
Note
This library is English-only due to how some things are parsed. I'm sure there are better ways to do them that would support all languages; this is on my TODO list.
Documentation
I don't have any plans for online documentation as of now. Please read the source code; all the dataclasses can be found there.
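Since the dataclasses are only documented in the source, here is a purely hypothetical sketch of what a search-result dataclass might look like — none of these names or fields are confirmed by the library; check the actual source for the real definitions:

```python
from dataclasses import dataclass

# Hypothetical illustration only -- scrapewiki's real dataclasses may use
# entirely different names and fields. This just shows the general shape
# a scraped search result could take.
@dataclass
class SearchResult:
    title: str  # page title of the result
    url: str    # full URL of the Wikipedia page

result = SearchResult(
    title="Python (programming language)",
    url="https://en.wikipedia.org/wiki/Python_(programming_language)",
)
print(result.title)
```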
Hashes for scrapewiki-0.1.5b0-py3-none-any.whl

Algorithm   | Hash digest
------------|------------
SHA256      | ac43764ae644cf1e05ce38092c353c5e1330a466018b698781d1780de4926507
MD5         | 92af898bf95f2512714ac3a5b5f2bfe9
BLAKE2b-256 | 0bcd670b5309109fc7447339704d1d6294befc71bd4582b8e6b43097c01f82fb