A comprehensive asynchronous library for scraping and parsing Search Engine Results Pages (SERPs).

PySerp

PySerp is an asynchronous Python library for automated, flexibly configurable scraping and parsing of Search Engine Results Pages (SERPs).

Purpose

Scraping search engine results is usually the first step when you need to analyze those results automatically or to collect content from the pages they link to.

Examples:

  • Competitive analysis for keywords (SEO)
  • Finding and extracting structured information from page content (phone numbers, emails, addresses, etc.)
  • Collecting page content to generate summaries (AI search)
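
As a rough illustration of the second use case, the sketch below pulls emails and phone-like strings out of page text that has already been downloaded. It uses only the standard library and is not part of PySerp; the function name `extract_contacts` and both regular expressions are assumptions made for this example, and the patterns are deliberately simple.

```python
import re

# Deliberately simple patterns for illustration only; real-world email and
# phone formats are more varied than these expressions cover.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def extract_contacts(text: str) -> dict[str, list[str]]:
    """Return unique emails and phone-like strings, in order of appearance."""
    # dict.fromkeys() removes duplicates while keeping the first-seen order
    emails = list(dict.fromkeys(EMAIL_RE.findall(text)))
    phones = list(dict.fromkeys(m.strip() for m in PHONE_RE.findall(text)))
    return {"emails": emails, "phones": phones}


# Hypothetical page text standing in for content fetched from a result URL
sample = "Contact sales@example.com or call +1 (555) 010-9999. Backup: sales@example.com"
contacts = extract_contacts(sample)
print(contacts)
```

In practice the input text would come from the pages behind the scraped result URLs, and stricter validation would likely be needed before storing anything.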

Key Features

This library:

  • Is asynchronous by default for maximum efficiency
  • Supports Google and Bing as search engines (more will be added in the future)
  • Applies strict typing to results using Pydantic
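
To show why an asynchronous design helps with network-bound work such as search requests, here is a minimal standard-library sketch. `fake_search` is a hypothetical stand-in that only sleeps; it is not a PySerp function and makes no real requests.

```python
import asyncio
import time


async def fake_search(query: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for a network-bound search call (not PySerp)."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return f"results for {query!r}"


async def run_all(queries: list[str]) -> list[str]:
    # gather() runs the coroutines concurrently and preserves input order
    return await asyncio.gather(*(fake_search(q) for q in queries))


queries = ["python asyncio", "pydantic models", "serp scraping"]
start = time.perf_counter()
results = asyncio.run(run_all(queries))
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Because the three simulated requests wait at the same time, the total run time should be close to one delay rather than the sum of all three.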

Installation

From PyPI (Recommended)

Simply run:

pip install pyserp

From Source (Development)

If you want to contribute or use the latest git version, download the source code:

git clone https://github.com/whode/pyserp
cd pyserp

Create a virtual environment (recommended):

python -m venv venv

Then activate it.

On Linux or macOS:

source venv/bin/activate

On Windows:

venv\Scripts\activate

Install the library:

pip install -e .

Usage

The following example retrieves the top 10 Google search results for a given query:

import asyncio

from pyserp.providers import GoogleSearcherManager, GoogleSearchSessionsManager


async def main():
    query = "how to learn python"
    print("Searching for:", query, end="\n\n")

    # Replace the placeholder with your own NID cookie. Get it in your browser
    # (but not Chrome): F12 -> Application -> Cookies.
    cookies = {"NID": "YOUR_NID_COOKIE"}
    manager = GoogleSearchSessionsManager(cookies=cookies)
    async with GoogleSearcherManager(search_sessions_manager=manager) as searcher:
        search_top_result = await searcher.search_top(query=query,
                                                      limit=10,
                                                      include_page_errors=False)

        print("----- Results -----", end="\n\n")
        for page in search_top_result.pages:
            for result in page.results.organic:
                print(result.title, result.url, sep="\n", end="\n\n")


if __name__ == "__main__":
    asyncio.run(main())
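
The nested loop in the example can be wrapped in a small helper that flattens all pages into a list of (title, url) pairs. This is only a sketch: `collect_organic` is a hypothetical name, not a PySerp function, and it assumes nothing about the result object beyond the attributes used in the example above (`pages`, `results.organic`, `title`, `url`) and that `url` values are hashable. The stand-in objects built with `SimpleNamespace` only mimic that shape so the snippet runs without network access.

```python
from types import SimpleNamespace
from typing import Any, Optional


def collect_organic(search_top_result: Any, limit: Optional[int] = None) -> list[tuple[str, str]]:
    """Flatten pages into (title, url) pairs, skipping URLs already seen."""
    seen = set()
    pairs = []
    for page in search_top_result.pages:
        for result in page.results.organic:
            if result.url in seen:
                continue  # the same link can appear on more than one page
            seen.add(result.url)
            pairs.append((result.title, result.url))
            if limit is not None and len(pairs) >= limit:
                return pairs
    return pairs


def _page(*items):
    """Build a stand-in page with the attribute shape used in the example."""
    organic = [SimpleNamespace(title=t, url=u) for t, u in items]
    return SimpleNamespace(results=SimpleNamespace(organic=organic))


# Stand-in data only; a real run would pass the object returned by search_top()
fake_result = SimpleNamespace(pages=[
    _page(("Python Tutorial", "https://example.com/a"),
          ("Learn Python", "https://example.com/b")),
    _page(("Learn Python", "https://example.com/b"),
          ("Python Docs", "https://example.com/c")),
])

pairs = collect_organic(fake_result)
for title, url in pairs:
    print(title, url, sep="\n", end="\n\n")
```

Inside `main()`, the same helper could be called as `collect_organic(search_top_result)` in place of the two nested loops.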

The library offers much more than this. Full documentation will be added in the future.

Development & Testing

Tests are included in the source repository only (not in the PyPI package). To run them, first follow the steps in the From Source (Development) section above. Then:

pip install -U pytest
python -m pytest
