Asyncio search engine scraping package
Searchit is a library for async scraping of search engines. The library supports multiple search engines (currently Google, Yandex, and Bing) with support for other search engines to come.
pip install searchit
Searchit can be installed with pip by running the command above.
import asyncio

from searchit import GoogleScraper, YandexScraper, BingScraper
from searchit import ScrapeRequest

# Search term and total number of results to collect
request = ScrapeRequest("watch movies online", 30)

# max_results_per_page = number of results per page
google = GoogleScraper(max_results_per_page=10)
yandex = YandexScraper(max_results_per_page=10)

loop = asyncio.get_event_loop()
google_results = loop.run_until_complete(google.scrape(request))
yandex_results = loop.run_until_complete(yandex.scrape(request))
To use Searchit, first create a ScrapeRequest object, with the search term and the number of results as required fields. The same object can then be passed to several search engine scrapers and scraped asynchronously.
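Because each scraper exposes an awaitable `scrape(request)` method, one request can be dispatched to several engines at once with `asyncio.gather`. The sketch below illustrates that pattern under stated assumptions: `FakeRequest` and `FakeScraper` are hypothetical stand-ins (not part of Searchit) so that the example runs without network access. With the real library, the `ScrapeRequest`, `GoogleScraper`, and `YandexScraper` objects from the example above would presumably take their place.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeRequest:
    """Hypothetical stand-in for ScrapeRequest (term and count only)."""
    term: str
    count: int


class FakeScraper:
    """Hypothetical stand-in exposing the same awaitable scrape() shape."""

    def __init__(self, name: str, delay: float) -> None:
        self.name = name
        self.delay = delay

    async def scrape(self, request: FakeRequest) -> list:
        # Simulate network latency instead of issuing real HTTP requests.
        await asyncio.sleep(self.delay)
        return [f"{self.name}:{request.term}:{i}" for i in range(request.count)]


async def scrape_all(scrapers, request):
    # gather() runs all scrape() coroutines concurrently and returns
    # their results in the same order as the scrapers were passed in.
    return await asyncio.gather(*(s.scrape(request) for s in scrapers))


def run_demo():
    request = FakeRequest("watch movies online", 3)
    scrapers = [FakeScraper("google", 0.02), FakeScraper("yandex", 0.01)]
    return asyncio.run(scrape_all(scrapers, request))
```

Calling `run_demo()` returns one result list per scraper. The total wait is roughly that of the slowest scraper rather than the sum of all of them, which is the main benefit over calling `run_until_complete` once per engine.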
Scrape Request - Object
- term - Required str - the term to be searched for
- count - Required int - the total number of results
- domain - Optional[str] - the domain to search, e.g. .com
- sleep - Optional[int] - time to wait between paginating pages; important to prevent getting blocked
- proxy - Optional[str] - proxy to be used to make the request; default none
- language - Optional[str] - language to conduct the search in (only Google at the moment)
- geo - Optional[str] - geo location to conduct the search from (Yandex and Qwant)
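The fields above can be pictured as a small data class. This is only an illustrative sketch, not Searchit's actual implementation: the class name `ScrapeRequestSketch` and the helper `pages_needed` are hypothetical, and the defaults are assumptions, since the description states a default only for `proxy`. The helper shows how `count` and a scraper's `max_results_per_page` would presumably determine the number of pages fetched, which is where `sleep` comes into play.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScrapeRequestSketch:
    """Hypothetical mirror of the documented ScrapeRequest fields."""
    term: str                       # required: the term to search for
    count: int                      # required: total number of results
    domain: Optional[str] = None    # optional: domain to search
    sleep: Optional[int] = 0        # optional: wait between pages (default assumed)
    proxy: Optional[str] = None     # optional: proxy for requests (default none)
    language: Optional[str] = None  # optional: search language
    geo: Optional[str] = None       # optional: geo location for the search


def pages_needed(request: ScrapeRequestSketch, per_page: int) -> int:
    # Ceiling division: 30 results at 10 per page -> 3 pages.
    return -(-request.count // per_page)
```

For example, a request for 30 results at 10 results per page spans three pages, so a non-zero `sleep` would introduce a pause between consecutive page fetches.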
- Add additional search engines
- Blocking non-async scrape method
- Add support for page rendering (Selenium and Puppeteer)