Search Engines Scraper

# search_engines
A Python library that queries Google, Bing, Yahoo and other search engines and collects the results from multiple search engine results pages.
Please note that web scraping may violate the terms of service of some search engines and may result in a temporary ban.

Supported search engines

Google
Bing
Yahoo
Duckduckgo
Startpage
Aol
Dogpile
Ask
Mojeek
Brave
Torch

Features

  • Creates output files (html, csv, json).
  • Supports search filters (url, title, text).
  • HTTP and SOCKS proxy support.
  • Collects dark web links with Torch.
  • Easy to add new search engines. To add an engine, create a new class in search_engines/engines/ and add it to the search_engines_dict dictionary in search_engines/engines/__init__.py. The new class should subclass SearchEngine and override the following methods: _selectors, _first_page, _next_page.
  • Compatible with Python 2 and Python 3.
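
The steps for adding an engine can be sketched roughly as follows. This is a minimal illustration, not the library's code: the `SearchEngine` class below is a simplified stand-in so the example runs on its own, and `ExampleEngine`, its URL, its CSS selectors, the return shape of `_first_page` and `_next_page`, and the `"example"` dictionary key are all assumptions made for illustration. In the real package you would subclass the library's own `SearchEngine` instead.

```python
from urllib.parse import quote_plus


class SearchEngine(object):
    """Simplified stand-in for the library's base class (hypothetical)."""

    def __init__(self):
        self._base_url = ""
        self._query = ""

    def _selectors(self, element):
        raise NotImplementedError

    def _first_page(self):
        raise NotImplementedError

    def _next_page(self, tags):
        raise NotImplementedError


class ExampleEngine(SearchEngine):
    """Hypothetical engine; the host and selectors are made up."""

    def __init__(self):
        super(ExampleEngine, self).__init__()
        self._base_url = "https://search.example.com"

    def _selectors(self, element):
        # CSS selectors used to locate each part of a result (assumed names).
        selectors = {
            "links": "div.result",
            "url": "a[href]",
            "title": "h2",
            "text": "p",
            "next": "a.next",
        }
        return selectors[element]

    def _first_page(self):
        # Build the URL of the first results page for the current query.
        url = "{}/search?q={}".format(self._base_url, quote_plus(self._query))
        return {"url": url, "data": None}

    def _next_page(self, tags):
        # In this sketch `tags` is a plain dict standing in for the parsed
        # page; "next_href" is an assumed key holding the next-page link.
        href = tags.get("next_href")
        url = self._base_url + href if href else None
        return {"url": url, "data": None}


# Registration step: map a name to the new class (key name is an assumption).
search_engines_dict = {"example": ExampleEngine}
```

Returning `None` as the next URL when no link is found gives a caller a simple way to stop paginating.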

Requirements

Python 2.7 or 3.x, with Requests and BeautifulSoup.

Installation

Run the setup file: $ python setup.py install.
Done!

Usage

As a library:

from search_engines import Google

engine = Google()
results = engine.search("my query")
links = results.links()

print(links)
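
Once the links are collected, they can be post-processed with the standard library. The sketch below assumes that `results.links()` yields a list of URL strings, which is not confirmed here; `group_links_by_host` and the sample data are hypothetical and only illustrate removing duplicates and grouping by hostname.

```python
from collections import OrderedDict
from urllib.parse import urlsplit


def group_links_by_host(links):
    """Drop duplicate URLs and group the rest by hostname, keeping order."""
    grouped = OrderedDict()
    seen = set()
    for link in links:
        if link in seen:
            continue  # skip exact duplicates
        seen.add(link)
        host = urlsplit(link).netloc.lower()
        grouped.setdefault(host, []).append(link)
    return grouped


# Sample data standing in for the value returned by results.links().
sample_links = [
    "https://example.com/a",
    "https://example.org/b",
    "https://example.com/a",
    "https://Example.com/c",
]
grouped = group_links_by_host(sample_links)

for host, urls in grouped.items():
    print(host, len(urls))
```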

As a CLI script:

$ python search_engines_cli.py -e google,bing -q "my query" -o json,print

Download files

Source Distribution

search_engines_kit-1.0.tar.gz (18.4 kB view details)

Uploaded Source

Built Distribution

search_engines_kit-1.0-py3-none-any.whl (27.9 kB view details)

Uploaded Python 3

File details

Details for the file search_engines_kit-1.0.tar.gz.

File metadata

  • Download URL: search_engines_kit-1.0.tar.gz
  • Size: 18.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.19

File hashes

Hashes for search_engines_kit-1.0.tar.gz
Algorithm Hash digest
SHA256 56b7059f7dab2328b6a6b0e0e50f6964e1a47fae517cd5691b3c8784d4e628b2
MD5 3b1054cf426e82e2c7c324d8c7d9557c
BLAKE2b-256 7ca50986109189093947d78b5f182a5791970a14ccaf0d3f04e267606e34f7d5
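
A downloaded archive can be checked against the SHA256 digest listed above with Python's `hashlib`. This is a minimal sketch: the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file path is whatever location you saved the download to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("search_engines_kit-1.0.tar.gz", "56b7059f7dab2328b6a6b0e0e50f6964e1a47fae517cd5691b3c8784d4e628b2")` should return `True` for an intact copy of the source distribution.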

File details

Details for the file search_engines_kit-1.0-py3-none-any.whl.

File hashes

Hashes for search_engines_kit-1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 668f1fdce321e498c3ba2204aa4e5f892b40e96a3170b76ac0a07d57c4a43388
MD5 ffdeb19d819634543c2a7103d13917c1
BLAKE2b-256 2ebd2a050203ce7603ea1d813e7c8c3ad204032a80fce3b8fb7a653563cc8e43
