Not another Google searching tool.

Not Another Google Search

Not another Google searching library. Just kidding - it is.

Made for educational purposes. I hope it will help!

How to Install

Standard Install

pip3 install nagooglesearch

pip3 install --upgrade nagooglesearch

Build and Install From the Source

git clone https://github.com/ivan-sincek/nagooglesearch && cd nagooglesearch

python3 -m pip install --upgrade build

python3 -m build

python3 -m pip install dist/nagooglesearch-8.7-py3-none-any.whl

Usage

Standard

Default values:

nagooglesearch.GoogleClient(
	tld = "com",
	homepage_parameters = {
		"btnK": "Google+Search",
		"source": "hp"
	},
	search_parameters = {
	},
	cookies = {
	},
	user_agent = "",
	proxy = "",
	max_results = 100,
	min_sleep = 8,
	max_sleep = 18,
	debug = False
)

Only domains that do not contain the keyword google and do not end with the keyword goo.gl are accepted as valid results. The final output is a unique and sorted list of URLs.
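To make the filtering rule concrete, here is a minimal sketch using only the standard library. The helper names `is_valid_result` and `clean_results` are hypothetical and are not part of nagooglesearch; the library's actual implementation may differ in detail.

```python
from urllib.parse import urlsplit

def is_valid_result(url: str) -> bool:
	# hypothetical helper: extract the domain and apply the two rules described above
	domain = (urlsplit(url).hostname or "").lower()
	if not domain:
		return False
	return "google" not in domain and not domain.endswith("goo.gl")

def clean_results(urls):
	# a set removes duplicates, sorted() returns an ordered list
	return sorted({url for url in urls if is_valid_result(url)})

results = clean_results([
	"https://b.example.com/x",
	"https://www.google.com/search?q=1",
	"https://goo.gl/abc",
	"https://a.example.com/",
	"https://b.example.com/x"
])

for url in results:
	print(url)
```

Only the two example.com URLs survive, each listed once.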

Google frequently changes cookies, so default ones might not work; specify new ones using the cookies parameter.

Default cookies can be found here.

Example, standard:

import nagooglesearch

# the following query string parameters are set only if 'start' query string parameter is not set or is equal to zero
# simulate a homepage search
homepage_parameters = {
	"btnK": "Google+Search",
	"source": "hp"
}

# search the internet for additional query string parameters
# https://brightdata.com/blog/web-data/google-search-url-parameters
search_parameters = {
	"q": "site:*.example.com intext:password", # search query
	"tbs": "li:1", # specify 'li:1' for verbatim search (no alternate spellings, etc.)
	"hl": "en",
	"lr": "lang_en",
	"cr": "countryUS",
	"udm": "14", # only web results
	"filter": "0", # specify '0' to display hidden results
	"safe": "images" # specify 'images' to turn off safe search, or specify 'active' to turn on safe search
}

# if the default cookies no longer work, specify new ones here
# if left empty, the default ones will be used
cookies = {
}

client = nagooglesearch.GoogleClient(
	tld = "com", # top level domain, e.g., www.google.com or www.google.hr
	homepage_parameters = homepage_parameters, # 'search_parameters' will override 'homepage_parameters'
	search_parameters = search_parameters,
	cookies = cookies,
	user_agent = "curl/3.30.1", # a random user agent will be set if none is provided
	proxy = "socks5://127.0.0.1:9050", # supported URL schemes are 'http[s]', 'socks4[h]', and 'socks5[h]'
	max_results = 200, # maximum unique URLs to return
	min_sleep = 15, # minimum sleep between page requests
	max_sleep = 30, # maximum sleep between page requests
	debug = True # enable debug output
)

urls = client.search()

if client.get_error() == nagooglesearch.Error.REQUEST:
	print("[ Request Exception ]")
	# do something
elif client.get_error() == nagooglesearch.Error.RATE_LIMIT:
	print("[ HTTP 429 Too Many Requests ]")
	# do something

for url in urls:
	print(url)
	# do something

Check the list of user agents here. For more user agents, check scrapeops.io.
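The comments in the standard example say that the homepage parameters apply only when 'start' is not set or is equal to zero, and that 'search_parameters' override 'homepage_parameters'. The following is a rough sketch of that merge logic, assuming a plain dictionary merge; `merge_parameters` is a hypothetical helper, not a function exposed by the library.

```python
def merge_parameters(homepage_parameters: dict, search_parameters: dict) -> dict:
	# hypothetical helper: decide whether this is the first results page
	start = search_parameters.get("start")
	first_page = start is None or str(start).strip() in ("", "0")
	# homepage parameters are included only for the first page
	merged = dict(homepage_parameters) if first_page else {}
	# search parameters win on any key collision
	merged.update(search_parameters)
	return merged

homepage = {"btnK": "Google+Search", "source": "hp"}

first = merge_parameters(homepage, {"q": "test", "source": "custom"})
later = merge_parameters(homepage, {"q": "test", "start": "10"})

print(first)
print(later)
```

On the first page the merged dictionary keeps "btnK" and takes "source" from the search parameters; on later pages the homepage parameters are left out entirely.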

Shortest Possible

Example, shortest possible:

import nagooglesearch

urls = nagooglesearch.GoogleClient(search_parameters = {"q": "site:*.example.com intext:password"}).search()

# do something

Time Sensitive Search

Example, do not show results older than 6 months:

import datetime, nagooglesearch, dateutil.relativedelta as relativedelta

def get_tbs(months: int):
	today = datetime.datetime.today()
	return nagooglesearch.get_tbs(today, today - relativedelta.relativedelta(months = months))

search_parameters = {
	"tbs": get_tbs(6)
}

# do something
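For reference, Google's custom date range filter is commonly written as a 'tbs' value of the form 'cdr:1,cd_min:MM/DD/YYYY,cd_max:MM/DD/YYYY'. The sketch below builds such a string with the standard library only. `build_tbs` is a hypothetical helper, and whether nagooglesearch.get_tbs() returns exactly this string is an assumption.

```python
import datetime

def build_tbs(date_from: datetime.date, date_to: datetime.date) -> str:
	# hypothetical helper: format both dates as MM/DD/YYYY and join them into a 'tbs' value
	date_format = "%m/%d/%Y"
	return "cdr:1,cd_min:{0},cd_max:{1}".format(
		date_from.strftime(date_format),
		date_to.strftime(date_format)
	)

tbs = build_tbs(datetime.date(2024, 1, 15), datetime.date(2024, 7, 15))
print(tbs)
```

The resulting string can be passed as the "tbs" search parameter in the same way as in the example above.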

User Agents

Example, get all user agents:

import nagooglesearch

user_agents = nagooglesearch.get_all_user_agents()
print(user_agents)

# do something

Example, get a random user agent:

import nagooglesearch

user_agent = nagooglesearch.get_random_user_agent()
print(user_agent)

# do something
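The standard example notes that a random user agent is set if none is provided. A minimal sketch of that fallback is shown below; the `pick_user_agent` helper and the sample pool are hypothetical, and in practice the pool could come from nagooglesearch.get_all_user_agents().

```python
import random

def pick_user_agent(user_agent: str, pool: list) -> str:
	# hypothetical helper: keep an explicit user agent, otherwise choose one at random
	return user_agent if user_agent else random.choice(pool)

# placeholder values for illustration only
pool = ["example-agent/1.0", "example-agent/2.0", "example-agent/3.0"]

explicit = pick_user_agent("my-agent/1.0", pool)
fallback = pick_user_agent("", pool)

print(explicit)
print(fallback)
```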


