BrowseFn Python SDK
Comprehensive self-hosted web browsing and data extraction platform for developers.
The Python SDK for BrowseFn provides a unified interface for:
- Web scraping and crawling (HTML, Markdown, Text)
- Image search and download
- Geolocation services (Geocoding, Reverse Geocoding)
- Provider-agnostic access (swap providers easily)
Status
🚧 Alpha
Features
- Type-safe: Built with Pydantic for robust data validation.
- Async: Fully asynchronous API using httpx.
- Extensible: Easy to add custom providers.
- Batteries included: Comes with basic providers (BeautifulSoup).
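To make the "Extensible" point concrete, here is a minimal, self-contained sketch of how a provider registry of this kind could work. It does not import BrowseFn and is not its implementation: the `Page`, `WebProvider`, `StaticProvider`, and `ProviderRegistry` names are hypothetical, and only `register_provider`, `default_provider`, and `get_page` are borrowed from the Usage section. The real provider interface may look different.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Page:
    """Hypothetical result type; BrowseFn's real page model may differ."""
    url: str
    content: str


class WebProvider(Protocol):
    """Hypothetical provider contract: any object with an async get_page."""

    async def get_page(self, url: str) -> Page: ...


class StaticProvider:
    """Toy provider that returns canned content instead of doing network I/O."""

    def __init__(self, content: str) -> None:
        self._content = content

    async def get_page(self, url: str) -> Page:
        return Page(url=url, content=self._content)


class ProviderRegistry:
    """Illustrative registry: maps names to providers and tracks a default."""

    def __init__(self) -> None:
        self._providers: Dict[str, WebProvider] = {}
        self.default_provider: Optional[str] = None

    def register_provider(self, name: str, provider: WebProvider) -> None:
        self._providers[name] = provider
        # The first registered provider becomes the default unless changed later.
        if self.default_provider is None:
            self.default_provider = name

    async def get_page(self, url: str, provider: Optional[str] = None) -> Page:
        # An explicit provider name wins; otherwise fall back to the default.
        name = provider or self.default_provider
        if name is None or name not in self._providers:
            raise KeyError(f"no provider registered under {name!r}")
        return await self._providers[name].get_page(url)
```

Because callers only depend on the `get_page` contract, swapping providers in this sketch means registering a different object under a name and changing `default_provider`.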
Installation
pip install browsefn
Usage
Web Scraping
import asyncio

from browsefn import browse_fn
from browsefn.web.providers.bs4 import BeautifulSoupProvider


async def main():
    # Initialize
    browse = browse_fn()

    # Register a provider (e.g., BeautifulSoup)
    bs4_provider = BeautifulSoupProvider()
    browse.web.register_provider("beautifulsoup", bs4_provider)
    browse.web.config.default_provider = "beautifulsoup"

    # Get a page
    page = await browse.web.get_page("https://example.com")
    print(f"Title: {page.metadata.title}")
    print(f"Content length: {len(page.content)}")


if __name__ == "__main__":
    asyncio.run(main())
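Since the API is asynchronous, several pages can be fetched concurrently. The helper below is a sketch using only the standard library; `fetch_all` and its `limit` parameter are hypothetical names, not part of BrowseFn. It accepts any async callable that takes a URL, on the assumption that `browse.web.get_page` can be called with a single URL argument as in the example above.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable, List


async def fetch_all(
    get_page: Callable[[str], Awaitable[Any]],
    urls: Iterable[str],
    limit: int = 5,
) -> List[Any]:
    """Fetch every URL concurrently, with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def fetch_one(url: str) -> Any:
        # The semaphore caps how many get_page calls run at the same time.
        async with semaphore:
            return await get_page(url)

    # return_exceptions=True returns failures as exception objects in the
    # result list instead of raising the first one. Results keep input order.
    return await asyncio.gather(
        *(fetch_one(url) for url in urls), return_exceptions=True
    )
```

Inside `main()`, this could be used as `pages = await fetch_all(browse.web.get_page, urls)`, followed by a check of each result with `isinstance(result, Exception)` before reading its fields.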
Configuration
You can configure BrowseFn using the BrowseFnConfig object.
from browsefn import browse_fn, BrowseFnConfig, WebConfig

config = BrowseFnConfig(
    web=WebConfig(
        default_provider="firecrawl",
        # ...
    )
)

browse = browse_fn(config)
Development
- Install dependencies:
  pip install -e ".[test]"
- Run tests:
  pytest