Browserbase Haystack Fetcher
Browserbase is a developer platform to reliably run, manage, and monitor headless browsers.
Power your AI data retrievals with:
- Serverless Infrastructure providing reliable browsers to extract data from complex UIs
- Stealth Mode with included fingerprinting tactics and automatic captcha solving
- Session Debugger to inspect your Browser Session with a network timeline and logs
- Live Debug to quickly debug your automation
Installation and setup
- Get an API key and Project ID from browserbase.com and set them as environment variables (BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID).
- Install the required dependencies:
pip install browserbase-haystack
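Because the fetcher relies on both environment variables being set, a quick pre-flight check can catch a missing credential before any browser session is requested. The following is a minimal sketch using only the standard library; the helper name `missing_browserbase_vars` is hypothetical and not part of browserbase-haystack.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the setup step above.
REQUIRED_VARS = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID")


def missing_browserbase_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of browserbase-haystack.
    """
    source = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not source.get(name)]


if __name__ == "__main__":
    missing = missing_browserbase_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Browserbase credentials found.")
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise in tests.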
Usage
You can load webpages into Haystack using BrowserbaseFetcher. Optionally, you can set the text_content parameter to convert the pages to a text-only representation.
Standalone
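from browserbase_haystack import BrowserbaseFetcher
# Credentials are expected in BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID
browserbase_fetcher = BrowserbaseFetcher()
# Keep the returned dict; its "documents" entry holds the fetched pages
result = browserbase_fetcher.run(urls=["https://example.com"], text_content=False)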
In a pipeline
from haystack import Pipeline
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from browserbase_haystack import BrowserbaseFetcher
# Prompt that receives the fetched documents
prompt_template = (
    "Tell me the titles of the given pages. Pages: {{ documents }}"
)
prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator()  # reads OPENAI_API_KEY from the environment by default
browserbase_fetcher = BrowserbaseFetcher()
# Register the components
pipe = Pipeline()
pipe.add_component("fetcher", browserbase_fetcher)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
# Wire fetcher -> prompt builder -> LLM
pipe.connect("fetcher.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "llm.prompt")
# Run the pipeline on a list of URLs
result = pipe.run(data={"fetcher": {"urls": ["https://example.com"]}})
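The pipeline returns a dictionary keyed by component name. Assuming the usual Haystack 2.x output shape for `OpenAIGenerator`, where the generated strings sit in a `replies` list, the answer could be pulled out as sketched below. `first_reply` is a hypothetical helper, and the sample dictionary is a hand-written stand-in rather than real model output.

```python
from typing import Any, Dict, Optional


def first_reply(result: Dict[str, Any], component: str = "llm") -> Optional[str]:
    """Return the first generated reply for `component`, or None if absent.

    Assumes the result looks like {"llm": {"replies": ["..."]}}.
    """
    replies = result.get(component, {}).get("replies") or []
    return replies[0] if replies else None


# Hand-written stand-in for a pipeline result, for illustration only.
sample_result = {"llm": {"replies": ["Example Domain"]}}
print(first_reply(sample_result))
```

Returning `None` instead of raising keeps the caller in control when the generator produced no reply.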
Parameters
- urls: Required. A list of URLs to fetch.
- text_content: Optional. Only return page text content.
- session_id: Optional. The Session ID.
- proxy: Optional. Enable Proxy.
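To show how the parameters above might be combined, here is a minimal sketch that assembles the arguments for a fetcher run and leaves out any optional value that was not set. The helper `build_fetcher_inputs` is hypothetical, and the types are assumptions: `session_id` is treated as a string and `proxy` as a boolean, which the parameter list suggests but does not state.

```python
from typing import Any, Dict, List, Optional


def build_fetcher_inputs(
    urls: List[str],
    text_content: Optional[bool] = None,
    session_id: Optional[str] = None,  # assumed to be a string
    proxy: Optional[bool] = None,  # assumed to be a boolean flag
) -> Dict[str, Any]:
    """Collect run arguments, leaving out optional ones that were not set."""
    if not urls:
        raise ValueError("urls is required and must contain at least one URL")
    inputs: Dict[str, Any] = {"urls": list(urls)}
    optional = {"text_content": text_content, "session_id": session_id, "proxy": proxy}
    # Only forward options the caller set explicitly.
    inputs.update({key: value for key, value in optional.items() if value is not None})
    return inputs
```

The resulting dictionary could then be passed as `browserbase_fetcher.run(**inputs)` or as `pipe.run(data={"fetcher": inputs})`, assuming the component accepts these names as run arguments.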