Comprehensive self-hosted web browsing and data extraction platform for developers

BrowseFn Python SDK

The Python SDK for BrowseFn provides a unified interface for:

  • Web scraping and crawling (HTML, Markdown, Text)
  • Image search and download
  • Geolocation services (Geocoding, Reverse Geocoding)
  • Provider-agnostic usage (swap providers easily)

Status

🚧 Alpha

Features

  • Type-safe: Built with Pydantic for robust data validation.
  • Async: Fully asynchronous API using httpx.
  • Extensible: Easy to add custom providers.
  • Batteries included: Comes with basic providers (BeautifulSoup).
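The SDK's actual provider base class is not shown here, so the following is only a minimal sketch of the registry pattern that "extensible" and "provider-agnostic" suggest. `PageProvider`, `ProviderRegistry`, `StaticProvider`, and `UpperCaseProvider` are hypothetical names for illustration and are not part of BrowseFn's API; the stand-in providers return canned strings instead of doing network I/O.

```python
import asyncio
from typing import Dict, Optional, Protocol, Tuple


class PageProvider(Protocol):
    """Hypothetical provider interface (not BrowseFn's real base class)."""

    async def get_page(self, url: str) -> str: ...


class StaticProvider:
    """Stand-in provider that returns canned HTML instead of fetching."""

    async def get_page(self, url: str) -> str:
        return f"<html><title>{url}</title></html>"


class UpperCaseProvider:
    """Second stand-in provider, used to show that providers are swappable."""

    async def get_page(self, url: str) -> str:
        return f"<HTML><TITLE>{url.upper()}</TITLE></HTML>"


class ProviderRegistry:
    """Maps provider names to instances and dispatches to the default one."""

    def __init__(self) -> None:
        self._providers: Dict[str, PageProvider] = {}
        self.default_provider: Optional[str] = None

    def register_provider(self, name: str, provider: PageProvider) -> None:
        self._providers[name] = provider
        # The first registered provider becomes the default unless one is set.
        if self.default_provider is None:
            self.default_provider = name

    async def get_page(self, url: str) -> str:
        if self.default_provider is None:
            raise LookupError("no provider registered")
        return await self._providers[self.default_provider].get_page(url)


async def demo() -> Tuple[str, str]:
    registry = ProviderRegistry()
    registry.register_provider("static", StaticProvider())
    registry.register_provider("upper", UpperCaseProvider())
    first = await registry.get_page("https://example.com")
    registry.default_provider = "upper"  # swap providers; calling code is unchanged
    second = await registry.get_page("https://example.com")
    return first, second


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the design is that calling code only talks to the registry, so changing the default provider name is the only edit needed to switch backends.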

Installation

pip install browsefn

Usage

Web Scraping

import asyncio
from browsefn import browse_fn
from browsefn.web.providers.bs4 import BeautifulSoupProvider

async def main():
    # Initialize
    browse = browse_fn()
    
    # Register a provider (e.g., BeautifulSoup)
    bs4_provider = BeautifulSoupProvider()
    browse.web.register_provider("beautifulsoup", bs4_provider)
    browse.web.config.default_provider = "beautifulsoup"
    
    # Get a page
    page = await browse.web.get_page("https://example.com")
    
    print(f"Title: {page.metadata.title}")
    print(f"Content length: {len(page.content)}")

if __name__ == "__main__":
    asyncio.run(main())
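Because the API is asynchronous, several pages can be fetched concurrently. The sketch below shows one way to do that with `asyncio.gather` and a semaphore that caps the number of requests in flight. `fetch_all` and `fake_get_page` are hypothetical helpers, not part of the SDK; `fake_get_page` stands in for `browse.web.get_page` so the example runs without network access or a registered provider.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def fetch_all(
    get_page: Callable[[str], Awaitable[str]],
    urls: List[str],
    max_concurrency: int = 5,
) -> Dict[str, str]:
    """Fetch every URL concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(url: str) -> str:
        async with semaphore:
            return await get_page(url)

    # gather preserves input order, so results line up with urls.
    results = await asyncio.gather(*(fetch_one(u) for u in urls))
    return dict(zip(urls, results))


async def fake_get_page(url: str) -> str:
    """Stand-in for browse.web.get_page; returns canned content after a short delay."""
    await asyncio.sleep(0.01)
    return f"content of {url}"


if __name__ == "__main__":
    pages = asyncio.run(
        fetch_all(fake_get_page, ["https://example.com/a", "https://example.com/b"])
    )
    for url, content in pages.items():
        print(url, len(content))
```

With the SDK, passing `browse.web.get_page` as the first argument should behave the same way, assuming it accepts a single URL as in the example above; the dictionary values would then be page objects rather than strings.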

Configuration

To configure BrowseFn, build a BrowseFnConfig object and pass it to browse_fn().

from browsefn import browse_fn, BrowseFnConfig, WebConfig

config = BrowseFnConfig(
    web=WebConfig(
        default_provider="firecrawl",
        # ...
    )
)
browse = browse_fn(config)
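In deployments it is often convenient to choose the default provider without editing code. The following is a minimal sketch of that idea using only the standard library. `pick_default_provider` and the `BROWSEFN_PROVIDER` environment variable are made up for this example; the SDK is not known to read any environment variables itself.

```python
import os
from typing import Mapping, Optional


def pick_default_provider(
    env: Optional[Mapping[str, str]] = None,
    fallback: str = "beautifulsoup",
) -> str:
    """Return the provider name from the environment, or the fallback."""
    env = os.environ if env is None else env
    # BROWSEFN_PROVIDER is a hypothetical variable name for this sketch.
    name = env.get("BROWSEFN_PROVIDER", "").strip().lower()
    return name or fallback


if __name__ == "__main__":
    # With the SDK installed, the result could be passed on as
    # WebConfig(default_provider=pick_default_provider()).
    print(pick_default_provider({"BROWSEFN_PROVIDER": "Firecrawl"}))
    print(pick_default_provider({}))
```

Accepting the environment as a parameter keeps the helper easy to test; the chosen name still has to match a provider that has been registered.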

Development

  1. Install the package in editable mode with its test dependencies:

    pip install -e ".[test]"
    
  2. Run tests:

    pytest
    
