Add crawling capabilities to Pydantic AI agents

Project description

pydantic-ai-crawling

Seamlessly integrate Pydantic AI with Crawl4AI to empower your AI agents with advanced web crawling and scraping capabilities.

Features

  • 🔎 Crawling & Scraping: High-performance web content extraction tailored for AI agents.
  • 🖼️ Media Support: Extract images, audio, videos, and responsive formats (srcset, picture).
  • 🚀 Dynamic Crawling: Execute JavaScript and handle async/sync content extraction.
  • 📸 Screenshots: Capture page screenshots for debugging or visual analysis.
  • 📂 Raw Data Crawling: Process raw HTML (raw:) or local files (file://) directly.
  • 🔗 Link Extraction: Comprehensive extraction of internal, external, and iframe links.
  • 🛠️ Customizable Hooks: Define hooks at every step to customize crawling behavior.
  • 💾 Caching: Built-in caching for improved speed and efficiency.
  • 📄 Metadata Extraction: Retrieve structured metadata from any web page.
  • 📡 IFrame Support: Seamless extraction from embedded iframe content.
  • 🕵️ Lazy Load Handling: Automatically waits for images and content to load.
  • 🔄 Full-Page Scanning: Simulates scrolling for infinite-scroll and dynamic pages.
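
To make the link-extraction feature above concrete, here is a minimal standard-library sketch of sorting links into internal, external, and iframe groups. It is not this package's implementation or Crawl4AI's API; the names LinkCollector and extract_links are hypothetical, and "internal" is assumed to mean "same host as the page URL".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect <a href> and <iframe src> targets."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links = {"internal": [], "external": [], "iframe": []}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("href"):
            # Resolve relative links against the page URL.
            url = urljoin(self.base_url, attr_map["href"])
            # Assumption: a link is internal when it shares the page's host.
            same_host = urlparse(url).netloc == urlparse(self.base_url).netloc
            self.links["internal" if same_host else "external"].append(url)
        elif tag == "iframe" and attr_map.get("src"):
            self.links["iframe"].append(urljoin(self.base_url, attr_map["src"]))


def extract_links(html: str, base_url: str) -> dict:
    """Parse an HTML string and return the grouped, absolute link URLs."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

Calling extract_links(page_html, page_url) returns a dictionary with the three groups, each holding absolute URLs in document order.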

Installation

pip install pydantic-ai-crawling

Usage

import pydantic_ai_crawling

# Example usage
pydantic_ai_crawling.greet()
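
The usage snippet imports pydantic_ai_crawling, while the distribution files listed on this page are named pydantic_ai_crawlers. If the import fails, a small check like the following can show which top-level module is actually installed. This is only a sketch: both candidate names come from this page, and the helper name installed_candidates is hypothetical.

```python
from importlib.util import find_spec

# Both names appear on this page; which one is importable is an assumption to verify.
CANDIDATE_MODULES = ("pydantic_ai_crawling", "pydantic_ai_crawlers")


def installed_candidates(names=CANDIDATE_MODULES):
    """Return the candidate top-level module names that can be imported."""
    return [name for name in names if find_spec(name) is not None]


if __name__ == "__main__":
    print(installed_candidates() or "no candidate module is installed")
```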

CLI

After installation, you can use the built-in CLI:

crawler

Development

To set up the development environment:

uv sync

To run tests:

uv run pytest

License

MIT

Download files

Source Distribution

pydantic_ai_crawlers-0.1.0.tar.gz (346.4 kB)

Uploaded Source

Built Distribution

pydantic_ai_crawlers-0.1.0-py3-none-any.whl (3.2 kB)

Uploaded Python 3

File details

Details for the file pydantic_ai_crawlers-0.1.0.tar.gz.

File metadata

  • Download URL: pydantic_ai_crawlers-0.1.0.tar.gz
  • Upload date:
  • Size: 346.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.17

File hashes

Hashes for pydantic_ai_crawlers-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       1ddcdc111bb799ecbead9ba61de263663e2583858563391c488d4933255d6ef9
MD5          15a6843fa3019cd00c5c5c5bc855b4e6
BLAKE2b-256  011073e1150fa2f44597f235d9b8397996b313509ea67e3801111dd9783bbd73

File details

Details for the file pydantic_ai_crawlers-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pydantic_ai_crawlers-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 3.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.17

File hashes

Hashes for pydantic_ai_crawlers-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       edbf2fb3ac7d6a86e579e965d3a82dade1cc60681265891b917f90ba6dbf99bc
MD5          db2b20fd6e1a99f41c1844aa19ca7551
BLAKE2b-256  70f129e3478d183967e3bc7d6716dd8f673e3e7981dfcf1302d2abe2f92ede79
