A comprehensive asynchronous library for scraping and parsing Search Engine Results Pages (SERPs).

PySerp

PySerp is an asynchronous Python library for automated, flexibly configurable scraping and parsing of Search Engine Results Pages (SERPs).

Purpose

Scraping search engine results is useful whenever you need to analyze those results automatically or collect content from the pages they link to.

Examples:

  • Competitive analysis for keywords (SEO)
  • Finding and extracting structured information from page content (phone numbers, emails, addresses, etc.)
  • Collecting page content to generate summaries (AI search)
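
As a rough illustration of the second use case, the sketch below pulls email addresses out of page text with a regular expression. It uses only the Python standard library and is not part of PySerp; the `extract_emails` helper and its deliberately simplified pattern are assumptions made for this example.

```python
import re

# Deliberately simple pattern; real-world email validation is more involved.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text: str) -> list[str]:
    """Return unique, lowercased email addresses in order of first appearance."""
    seen: dict[str, None] = {}
    for match in EMAIL_RE.findall(text):
        # A dict keeps insertion order, so it doubles as an ordered set.
        seen.setdefault(match.lower(), None)
    return list(seen)


if __name__ == "__main__":
    sample = "Contact sales@example.com or Support@Example.com for help."
    print(extract_emails(sample))
```

In practice the input text would be the content of a page reached through one of the result URLs.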

Key Features

This library:

  • Is asynchronous by default, so multiple requests can run concurrently
  • Supports Google and Bing as search engines (more will be added in the future)
  • Applies strict typing to results using Pydantic
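
To illustrate why an asynchronous design suits I/O-bound work such as fetching result pages, here is a minimal sketch using only `asyncio`. It does not use PySerp: `fake_fetch` is a hypothetical stand-in that sleeps instead of making a network request, and `fetch_all` is a name chosen for this example.

```python
import asyncio
import time


async def fake_fetch(query: str, delay: float = 0.1) -> str:
    """Hypothetical stand-in for a network request: waits, then returns a label."""
    await asyncio.sleep(delay)
    return f"results for {query!r}"


async def fetch_all(queries: list[str]) -> list[str]:
    """Run all fetches concurrently; results keep the order of `queries`."""
    return await asyncio.gather(*(fake_fetch(q) for q in queries))


if __name__ == "__main__":
    started = time.perf_counter()
    results = asyncio.run(fetch_all(["python", "asyncio", "pydantic"]))
    elapsed = time.perf_counter() - started
    print(results)
    # The waits overlap, so the total is close to one delay rather than three.
    print(f"elapsed: {elapsed:.2f}s")
```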

Installation

From PyPI (Recommended)

Simply run:

pip install pyserp

From Source (Development)

If you want to contribute or use the latest development version, clone the repository:

git clone https://github.com/whode/pyserp
cd pyserp

Create a virtual environment (recommended):

python -m venv venv

Then activate it.

On Linux or macOS:

source venv/bin/activate

On Windows:

venv\Scripts\activate

Install the library:

pip install -e .
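
One way to confirm that the installation succeeded is to query the installed package metadata. The sketch below uses only the standard library; `installed_version` is a helper name invented for this example, and it assumes the distribution is registered under the same name used on PyPI (`pyserp`).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pyserp")
    print(f"pyserp {found}" if found else "pyserp is not installed in this environment")
```

Run it with the virtual environment activated so that it inspects the same environment `pip` installed into.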

Usage

The following example retrieves the top 10 Google search results for a given query:

import asyncio

from pyserp.providers import GoogleSearcherManager, GoogleSearchSessionsManager


async def main():
    query = "how to learn python"
    print("Searching for:", query, end="\n\n")

    # Copy the NID cookie value from your browser (but not Chrome):
    # F12 -> Application -> Cookies
    cookies = {"NID": "YOUR_NID_COOKIE"}
    manager = GoogleSearchSessionsManager(cookies=cookies)
    async with GoogleSearcherManager(search_sessions_manager=manager) as searcher:
        search_top_result = await searcher.search_top(
            query=query,
            limit=10,
            include_page_errors=False,
        )

        print("----- Results -----", end="\n\n")
        for page in search_top_result.pages:
            for result in page.results.organic:
                print(result.title, result.url, sep="\n", end="\n\n")


if __name__ == "__main__":
    asyncio.run(main())

The library offers much more than this. Full documentation will be added in the future.
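
If you need the results as plain data rather than printed output, the nested loop from the example above can be wrapped in a small helper. This is only a sketch: `collect_organic` is not part of PySerp, it relies solely on the attributes the example already uses (`pages`, `results.organic`, `title`, `url`), and it assumes `url` is hashable (for example, a string). The `SimpleNamespace` objects are hypothetical stand-ins so the snippet runs without a network connection.

```python
from types import SimpleNamespace


def collect_organic(search_top_result) -> list[tuple[str, str]]:
    """Flatten all pages into (title, url) pairs, dropping repeated URLs."""
    seen = set()
    pairs = []
    for page in search_top_result.pages:
        for result in page.results.organic:
            if result.url not in seen:
                seen.add(result.url)
                pairs.append((result.title, result.url))
    return pairs


if __name__ == "__main__":
    # Hypothetical stand-ins mimicking only the attributes used above.
    def fake_page(*items):
        organic = [SimpleNamespace(title=t, url=u) for t, u in items]
        return SimpleNamespace(results=SimpleNamespace(organic=organic))

    demo = SimpleNamespace(pages=[
        fake_page(("Python.org", "https://www.python.org/")),
        fake_page(("Python.org", "https://www.python.org/"),
                  ("Docs", "https://docs.python.org/3/")),
    ])
    for title, url in collect_organic(demo):
        print(title, url, sep="\n", end="\n\n")
```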

Development & Testing

Tests are included in the source repository only (not in the PyPI package). To run tests, first follow the steps in "From Source (Development)" above. Then:

pip install -U pytest
python -m pytest
