
A comprehensive asynchronous library for scraping and parsing Search Engine Results Pages (SERPs).

Project description

PySerp

PySerp is an asynchronous Python library for automated, flexibly configurable scraping and parsing of Search Engine Results Pages (SERPs).

Purpose

Scraping search engine results is a common first step whenever you need to automatically analyze those results, or to collect content from the links they contain.

Examples:

  • Competitive analysis for keywords (SEO)
  • Searching for and extracting structured information from page content (phone numbers, emails, addresses, etc.)
  • Collecting page content to generate summaries (AI search)
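As a sketch of the extraction use case above, pulling structured information out of scraped page content typically comes down to pattern matching. The snippet below is library-agnostic; the sample text and regular expressions are illustrative only, not part of PySerp:

```python
import re

# Illustrative page content, as might be collected from SERP links
page_text = """
Contact us at sales@example.com or support@example.org.
Call +1-555-0123 for details.
"""

# Simple patterns for emails and phone-like strings (not exhaustive)
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d-]{7,}\d")

emails = EMAIL_RE.findall(page_text)
phones = PHONE_RE.findall(page_text)

print(emails)  # → ['sales@example.com', 'support@example.org']
print(phones)  # → ['+1-555-0123']
```

Real-world extraction usually needs more robust patterns (or a dedicated parser), but the pipeline shape is the same: scrape, then match.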

Key Features

This library:

  • Is asynchronous by default for maximum efficiency
  • Supports Google and Bing as search engines (more will be added in the future)
  • Applies strict typing to results using Pydantic

Installation

From PyPI (Recommended)

Simply run:

pip install pyserp

From Source (Development)

If you want to contribute or use the latest git version, download the source code:

git clone https://github.com/whode/pyserp
cd pyserp

Create a virtual environment (recommended):

python -m venv venv

Then activate it:

On Linux:

source venv/bin/activate

On Windows:

venv\Scripts\activate

Install the library:

pip install -e .

Usage

A simple, idiomatic usage example demonstrating how to retrieve the top 10 Google search results for a given query:

import asyncio

from pyserp.providers import GoogleSearcherManager, GoogleSearchSessionsManager


async def main():
    query = "how to learn python"
    print("Searching for:", query, end="\n\n")

    # Get the NID cookie from your browser (not Chrome): F12 -> Application -> Cookies
    cookies = {"NID": "YOUR_NID_COOKIE"}
    manager = GoogleSearchSessionsManager(cookies=cookies)
    async with GoogleSearcherManager(search_sessions_manager=manager) as searcher:
        search_top_result = await searcher.search_top(query=query,
                                                      limit=10,
                                                      include_page_errors=False)

        print("----- Results -----", end="\n\n")
        for page in search_top_result.pages:
            for result in page.results.organic:
                print(result.title, result.url, sep="\n", end="\n\n")


if __name__ == "__main__":
    asyncio.run(main())

The library offers much more than this. Full documentation will be added in the future.
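Building on the result shape shown above (pages → results.organic → title/url), a common follow-up step for SEO analysis is aggregating results, for example counting how often each domain appears. The snippet below mocks the result objects with `SimpleNamespace` purely for illustration; with PySerp you would iterate the real `search_top_result.pages` instead:

```python
from collections import Counter
from types import SimpleNamespace
from urllib.parse import urlparse

# Mocked objects mirroring the attribute layout used above (illustrative only)
organic = [
    SimpleNamespace(title="Learn Python", url="https://docs.python.org/3/tutorial/"),
    SimpleNamespace(title="Python Course", url="https://realpython.com/start-here/"),
    SimpleNamespace(title="Python Docs", url="https://docs.python.org/3/"),
]
pages = [SimpleNamespace(results=SimpleNamespace(organic=organic))]

# Count how often each domain appears across all result pages
domains = Counter(
    urlparse(result.url).netloc
    for page in pages
    for result in page.results.organic
)
print(domains.most_common())  # → [('docs.python.org', 2), ('realpython.com', 1)]
```
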

Download files

Download the file for your platform.

Source Distribution

pyserp-1.0.1.tar.gz (23.3 kB)

Uploaded Source

Built Distribution


pyserp-1.0.1-py3-none-any.whl (35.7 kB)

Uploaded Python 3

File details

Details for the file pyserp-1.0.1.tar.gz.

File metadata

  • Download URL: pyserp-1.0.1.tar.gz
  • Upload date:
  • Size: 23.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for pyserp-1.0.1.tar.gz:

  • SHA256: 3c3e356bb548daefdf4747eb4e8099c5750d86b5b28cdefaedbc63039a1c5340
  • MD5: b61caa7de02859763e2a9940e858f61c
  • BLAKE2b-256: 9734b5ddf9f787c69921e64fd09d597caac35e8fdb3de315e47988ca830c3395

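Verifying a downloaded archive against the published SHA256 digest can be done with the standard library alone. The snippet hashes in-memory bytes for brevity; in practice you would read the downloaded file (e.g. pyserp-1.0.1.tar.gz) and compare the digest before installing:

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Return the hex SHA256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()

# In practice: data = open("pyserp-1.0.1.tar.gz", "rb").read()
digest = sha256_of(b"example archive bytes")

# Compare against the digest published on PyPI before installing
expected = "3c3e356bb548daefdf4747eb4e8099c5750d86b5b28cdefaedbc63039a1c5340"
print(digest == expected)  # False for this placeholder data
```
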

File details

Details for the file pyserp-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: pyserp-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 35.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for pyserp-1.0.1-py3-none-any.whl:

  • SHA256: 514678f5aeaa72a6b7bff070d24a48a7010145e2b4b02410c68c5d9ad96cc3a5
  • MD5: e32bf7ea6f1a12006770e0079f178d1c
  • BLAKE2b-256: 45c3231aec979878f73ee747e931d2938a268798eefca55b663daba0e3631696

