A comprehensive asynchronous library for scraping and parsing Search Engine Results Pages (SERPs).

PySerp

PySerp is an asynchronous Python library for automated, flexibly configurable scraping and parsing of Search Engine Results Pages (SERPs).

Purpose

Scraping search engine results is necessary whenever a task involves automatically analyzing those results or collecting content from the pages they link to.

Examples:

  • Competitive analysis for keywords (SEO)
  • Searching for and extracting structured information from page content (phone numbers, emails, addresses, etc.)
  • Collecting page content to generate summaries (AI search)

Key Features

This library:

  • Is asynchronous by default, so many requests can be in flight at once
  • Supports Google and Bing as search engines (more will be added in the future)
  • Applies strict typing to results using Pydantic
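To illustrate why an asynchronous design helps with this kind of workload, the sketch below runs several simulated searches concurrently with asyncio.gather. It uses only the standard library: fake_search is a hypothetical stand-in for a network-bound request and is not part of PySerp.

```python
import asyncio
import time


async def fake_search(query: str, delay: float = 0.1) -> str:
    """Hypothetical stand-in for one network-bound search request."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return f"results for {query!r}"


async def search_all(queries: list[str]) -> list[str]:
    """Run all simulated searches concurrently; results keep input order."""
    return await asyncio.gather(*(fake_search(q) for q in queries))


def timed_run(queries: list[str]) -> tuple[list[str], float]:
    """Return the results and the wall-clock time the whole batch took."""
    start = time.perf_counter()
    results = asyncio.run(search_all(queries))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run(["python", "asyncio", "pydantic"])
    print(results)
    print(f"took {elapsed:.2f}s")
```

Because the simulated waits overlap, the batch should take roughly as long as the slowest single request rather than the sum of all of them.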

Installation

From PyPI (Recommended)

Simply run:

pip install pyserp

From Source (Development)

If you want to contribute or use the latest git version, download the source code:

git clone https://github.com/whode/pyserp
cd pyserp

Create a virtual environment (recommended):

python -m venv venv

Then activate it.

On Linux or macOS:

source venv/bin/activate

On Windows:

venv\Scripts\activate

Install the library:

pip install -e .
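After either installation route, one way to confirm that the package is visible to the active interpreter is to query its installed metadata. This is a minimal sketch using the standard library's importlib.metadata; the helper name installed_version is illustrative and not part of PySerp.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pyserp")
    print(f"pyserp {found}" if found else "pyserp is not installed")
```

Run it inside the activated virtual environment so that it inspects the same environment pip installed into.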

Usage

The following example retrieves the top 10 Google search results for a given query:

import asyncio

from pyserp.providers import GoogleSearcherManager, GoogleSearchSessionsManager


async def main():
    query = "how to learn python"
    print("Searching for:", query, end="\n\n")

    # Replace the placeholder with your own NID cookie. Get it in your browser
    # (but not Chrome): F12 -> Application -> Cookies.
    cookies = {"NID": "YOUR_NID_COOKIE"}
    manager = GoogleSearchSessionsManager(cookies=cookies)
    async with GoogleSearcherManager(search_sessions_manager=manager) as searcher:
        search_top_result = await searcher.search_top(query=query,
                                                      limit=10,
                                                      include_page_errors=False)

        print("----- Results -----", end="\n\n")
        for page in search_top_result.pages:
            for result in page.results.organic:
                print(result.title, result.url, sep="\n", end="\n\n")


if __name__ == "__main__":
    asyncio.run(main())

The library offers much more than this. Full documentation will be added in the future.
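The nested loop in the usage example above can be wrapped in a small helper that also drops duplicate URLs across pages. The dataclasses below are hypothetical stand-ins that only mirror the attribute names used in that example (pages, results.organic, title, url); they are not PySerp's actual Pydantic models, so treat this as a sketch of the pattern rather than the library's API.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator


@dataclass
class Organic:
    """Stand-in for one organic search result."""
    title: str
    url: str


@dataclass
class Results:
    """Stand-in for the parsed results of one page."""
    organic: list[Organic] = field(default_factory=list)


@dataclass
class Page:
    """Stand-in for one results page."""
    results: Results


def unique_organic(pages: Iterable[Page]) -> Iterator[Organic]:
    """Yield organic results in order, skipping URLs already seen."""
    seen: set[str] = set()
    for page in pages:
        for result in page.results.organic:
            if result.url not in seen:
                seen.add(result.url)
                yield result


if __name__ == "__main__":
    demo_pages = [
        Page(Results([Organic("A", "https://a.example"),
                      Organic("B", "https://b.example")])),
        Page(Results([Organic("A again", "https://a.example")])),
    ]
    for item in unique_organic(demo_pages):
        print(item.title, item.url, sep="\n", end="\n\n")
```

Since the helper relies only on attribute access, the same function would work on any objects that expose those attributes.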

Development & Testing

Tests are included in the source repository only (not in the PyPI package). To run them, clone the repository, install pytest, and run the suite:

git clone https://github.com/whode/pyserp.git
cd pyserp
pip install -U pytest
python -m pytest
