Skip to main content

Simple Google parser

Project description

GOOgle spiDER

Google search engine parser on python3

Instruction

Requirement python 3.10+

pip install gooder

from gooder import Gooder

gooder = Gooder()
# Make request on google.com/search?q=Hello+World
parsed = gooder.parse(query="Hello World")

# Print only result links
print(gooder.get_links())

# Print only result titles
print(gooder.get_titles())

# Print all results list[tuple[link,title]]
print(gooder.raw_results)

# If TRUE = parsed, else = captcha/rate limit
if (parsed)
    # Save urls to json file
    gooder.save_to_file(only_urls=True, to_json=True, override=True, file="results.json")

Methods & Fields

Method/Field Args Example Result
Gooder.parse query: str, page: int=0, ignore_google: bool=True, clear_old: bool=True gooder.parse("hello", clear_old=False) True | False
Gooder.raw_results Field Field [[link, title], ...]
Gooder.get_links repeats: bool = False gooder.get_links() [unique_link, ...]
Gooder.get_titles None gooder.get_titles() [title, title, ...]
Gooder.save_to_file only_urls: bool = True override: bool = True to_json: bool = False file: str = "urls.txt" gooder.save_to_file() New file with urls
Gooder.get_hostname links: str | list gooder.get_hostname(https://google.com/) google.com
Gooder.get_captcha_url None gooder.get_captcha_url() None | google.com/sorry/...
Gooder.get_headers None gooder.get_headers() None | HTTPHeaderDict({...})

Todo:

  • Add proxy manager
  • Replace raw_results: list(list()) to dict()

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gooder-0.3.2.tar.gz (3.5 kB view hashes)

Uploaded Source

Built Distribution

gooder-0.3.2-py3-none-any.whl (18.3 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page