Not another Google searching tool.
Not Another Google Search
Not another Google searching library. Just kidding - it is.
Made for educational purposes. I hope it will help!
How to Install
Standard Install
pip3 install nagooglesearch
pip3 install --upgrade nagooglesearch
Build and Install From the Source
git clone https://github.com/ivan-sincek/nagooglesearch && cd nagooglesearch
python3 -m pip install --upgrade build
python3 -m build
python3 -m pip install dist/nagooglesearch-8.7-py3-none-any.whl
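To confirm which version ended up installed after either install path, a minimal sketch using only the standard library is shown below. The helper name `installed_version` is hypothetical and not part of nagooglesearch.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed version, or None when the package is missing.
print(installed_version("nagooglesearch"))
```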
Usage
Standard
Default values:
nagooglesearch.GoogleClient(
    tld = "com",
    homepage_parameters = {
        "btnK": "Google+Search",
        "source": "hp"
    },
    search_parameters = {
    },
    cookies = {
    },
    user_agent = "",
    proxy = "",
    max_results = 100,
    min_sleep = 8,
    max_sleep = 18,
    debug = False
)
Only domains that do not contain the keyword google and do not end with the keyword goo.gl are accepted as valid results. The final output is a unique and sorted list of URLs.
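The filtering rule above can be illustrated with a short sketch. This is not the library's actual implementation, only one plausible reading of the rule; the function name `filter_results` and the sample URLs are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def filter_results(urls):
    """Drop URLs whose host contains 'google' or ends with 'goo.gl'; return unique, sorted URLs."""
    accepted = set()
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        if not host:
            continue  # skip anything without a parsable host
        if "google" in host or host.endswith("goo.gl"):
            continue  # rejected by the rule described above
        accepted.add(url)
    return sorted(accepted)


sample = [
    "https://www.example.com/b",
    "https://www.example.com/a",
    "https://www.example.com/a",
    "https://accounts.google.com/login",
    "https://goo.gl/abc",
]
filtered = filter_results(sample)
print(filtered)
```

Using a set before sorting is what makes the output both unique and ordered, matching the description of the final list.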
Google frequently changes cookies, so default ones might not work; specify new ones using the cookies parameter.
Default cookies can be found here.
Example, standard:
import nagooglesearch
# the following query string parameters are set only if 'start' query string parameter is not set or is equal to zero
# simulate a homepage search
homepage_parameters = {
    "btnK": "Google+Search",
    "source": "hp"
}
# search the internet for additional query string parameters
# https://brightdata.com/blog/web-data/google-search-url-parameters
search_parameters = {
    "q": "site:*.example.com intext:password", # search query
    "tbs": "li:1", # specify 'li:1' for verbatim search (no alternate spellings, etc.)
    "hl": "en",
    "lr": "lang_en",
    "cr": "countryUS",
    "udm": "14", # only web results
    "filter": "0", # specify '0' to display hidden results
    "safe": "images" # specify 'images' to turn off safe search, or specify 'active' to turn on safe search
}
# if the default cookies no longer work, specify new ones here
# if left empty, the default ones will be used
cookies = {
}
client = nagooglesearch.GoogleClient(
    tld = "com", # top level domain, e.g., www.google.com or www.google.hr
    homepage_parameters = homepage_parameters, # 'search_parameters' will override 'homepage_parameters'
    search_parameters = search_parameters,
    cookies = cookies,
    user_agent = "curl/3.30.1", # a random user agent will be set if none is provided
    proxy = "socks5://127.0.0.1:9050", # supported URL schemes are 'http[s]', 'socks4[h]', and 'socks5[h]'
    max_results = 200, # maximum unique URLs to return
    min_sleep = 15, # minimum sleep between page requests
    max_sleep = 30, # maximum sleep between page requests
    debug = True # enable debug output
)
urls = client.search()
if client.get_error() == nagooglesearch.Error.REQUEST:
    print("[ Request Exception ]")
    # do something
elif client.get_error() == nagooglesearch.Error.RATE_LIMIT:
    print("[ HTTP 429 Too Many Requests ]")
    # do something
for url in urls:
    print(url)
    # do something
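The comments in the example state that the homepage parameters apply only when 'start' is unset or zero, and that 'search_parameters' override 'homepage_parameters'. A minimal sketch of that precedence is below. It is an assumption about the behavior described in the comments, not the library's code, and `merge_parameters` is a hypothetical helper.

```python
from urllib.parse import urlencode


def merge_parameters(homepage_parameters, search_parameters):
    """Combine both dictionaries, letting search parameters win on key conflicts."""
    start = str(search_parameters.get("start", "0")).strip()
    # Homepage parameters are kept only for the first page ('start' unset or zero).
    merged = dict(homepage_parameters) if start in ("", "0") else {}
    merged.update(search_parameters)
    return merged


homepage = {"btnK": "Google+Search", "source": "hp"}
first_page = merge_parameters(homepage, {"q": "test", "source": "web"})
next_page = merge_parameters(homepage, {"q": "test", "start": "10"})

# Encode the merged dictionary as a query string.
print(urlencode(first_page))
print(urlencode(next_page))
```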
Check the list of user agents here. For more user agents, check scrapeops.io.
Shortest Possible
Example, shortest possible:
import nagooglesearch
urls = nagooglesearch.GoogleClient(search_parameters = {"q": "site:*.example.com intext:password"}).search()
# do something
Time Sensitive Search
Example, do not show results older than 6 months:
import datetime
import nagooglesearch, dateutil.relativedelta as relativedelta

def get_tbs(months: int):
    # 'datetime' must be imported for the call below to work
    today = datetime.datetime.today()
    return nagooglesearch.get_tbs(today, today - relativedelta.relativedelta(months = months))

search_parameters = {
    "tbs": get_tbs(6)
}
# do something
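The exact string returned by nagooglesearch.get_tbs() is not shown here. For orientation, Google's custom date range is commonly written as `cdr:1,cd_min:M/D/YYYY,cd_max:M/D/YYYY`, and the sketch below builds such a value by hand. The helper name `custom_date_range` is hypothetical, and whether the library emits exactly this format is an assumption.

```python
import datetime


def custom_date_range(date_min, date_max):
    """Build a 'tbs' value for a custom date range in the commonly documented Google format."""
    def fmt(d):
        # Month/day/year without zero padding.
        return "{}/{}/{}".format(d.month, d.day, d.year)
    return "cdr:1,cd_min:{},cd_max:{}".format(fmt(date_min), fmt(date_max))


tbs = custom_date_range(datetime.date(2024, 1, 15), datetime.date(2024, 7, 15))
print(tbs)
```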
User Agents
Example, get all user agents:
import nagooglesearch
user_agents = nagooglesearch.get_all_user_agents()
print(user_agents)
# do something
Example, get a random user agent:
import nagooglesearch
user_agent = nagooglesearch.get_random_user_agent()
print(user_agent)
# do something
Download files
File details
Details for the file nagooglesearch-8.7.tar.gz.
File metadata
- Download URL: nagooglesearch-8.7.tar.gz
- Upload date:
- Size: 6.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c25419e36544538c82d0310885de7d98818719fd45a069c092cfd0854c078b19 |
| MD5 | 64cbffff413330929c05279c4516caa1 |
| BLAKE2b-256 | f859e3d536bf54723933e74d025455aa246029815925db3e54d777ba5fa8a74a |
File details
Details for the file nagooglesearch-8.7-py3-none-any.whl.
File metadata
- Download URL: nagooglesearch-8.7-py3-none-any.whl
- Upload date:
- Size: 7.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1e3a260b24e17014f5216d7ecf600c73ed226f28557028059beb4b416e030397 |
| MD5 | 1de0722e89f34949aa8b0ef21dfe293d |
| BLAKE2b-256 | d8b5f96c9acc10fd82dde545170425c218c9ca8ecab9c5dae4db48b735a063d1 |
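A downloaded file can be checked against the SHA256 digests listed above. The sketch below uses only the standard library. The helper names `sha256_of` and `matches` are hypothetical, and the file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("nagooglesearch-8.7.tar.gz", "c25419e36544538c82d0310885de7d98818719fd45a069c092cfd0854c078b19")` should return True for an intact copy of the source distribution saved in the current directory.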