Browserbase Haystack Fetcher
Browserbase is a developer platform to reliably run, manage, and monitor headless browsers.
Power your AI data retrievals with:
- Serverless Infrastructure providing reliable browsers to extract data from complex UIs
- Stealth Mode with included fingerprinting tactics and automatic captcha solving
- Session Debugger to inspect your Browser Session with a network timeline and logs
- Live Debug to quickly debug your automation
Installation and setup
- Get an API key and Project ID from browserbase.com and set them as environment variables (BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID).
- Install the required dependencies:
pip install browserbase-haystack
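Before creating the fetcher, it can help to confirm that both variables from the setup step are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_browserbase_env` is hypothetical and not part of the package.

```python
import os

# Variable names taken from the setup step above.
REQUIRED_VARS = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID")


def missing_browserbase_env(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of browserbase-haystack.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_browserbase_env()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Browserbase credentials found.")
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise in tests.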
Usage
You can load webpages into Haystack using BrowserbaseFetcher. Optionally, set the text_content parameter to convert the pages to a text-only representation.
Standalone
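from browserbase_haystack import BrowserbaseFetcher

# Create the fetcher (credentials come from the environment variables set during setup)
browserbase_fetcher = BrowserbaseFetcher()
# Fetch the page; text_content=False keeps the full page rather than text only
result = browserbase_fetcher.run(urls=["https://example.com"], text_content=False)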
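The pipeline example in this README connects `fetcher.documents` to the next component, which suggests that `run` returns a dictionary with a `documents` entry. Assuming each entry is a Haystack `Document` with a `content` attribute, a short preview of the fetched pages could be produced as sketched below. The helper `preview_documents` is hypothetical, and the stand-in object only exists so the sketch runs without network access.

```python
from types import SimpleNamespace


def preview_documents(result, max_chars=80):
    """Return a short preview of each fetched document's content.

    Hypothetical helper; assumes result["documents"] holds objects
    with a `content` attribute, as Haystack Document objects have.
    """
    previews = []
    for doc in result.get("documents", []):
        text = (doc.content or "").strip()
        previews.append(text[:max_chars])
    return previews


# Stand-in for the fetcher output so the sketch runs offline.
# With the real component, pass the return value of browserbase_fetcher.run(...).
fake_result = {
    "documents": [
        SimpleNamespace(content="  Example Domain\nThis domain is for use in examples.  ")
    ]
}
print(preview_documents(fake_result, max_chars=14))
```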
In a pipeline
from haystack import Pipeline
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from browserbase_haystack import BrowserbaseFetcher

# The fetched documents are rendered into the prompt through the Jinja template
prompt_template = (
    "Tell me the titles of the given pages. Pages: {{ documents }}"
)
prompt_builder = PromptBuilder(template=prompt_template)
# OpenAIGenerator reads OPENAI_API_KEY from the environment by default
llm = OpenAIGenerator()
browserbase_fetcher = BrowserbaseFetcher()

# Register the components, then wire fetcher -> prompt builder -> LLM
pipe = Pipeline()
pipe.add_component("fetcher", browserbase_fetcher)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("fetcher.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "llm.prompt")

result = pipe.run(data={"fetcher": {"urls": ["https://example.com"]}})
# The generator's answers are available under result["llm"]["replies"]
print(result["llm"]["replies"][0])
Parameters
- urls: Required. A list of URLs to fetch.
- text_content: Optional. Only return page text content.
- session_id: Optional. The Session ID.
- proxy: Optional. Enable Proxy.
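As an illustration of how these parameters fit together, the sketch below assembles the keyword arguments for a `run` call and leaves out the optional ones that were not provided. The helper `build_fetch_kwargs` is hypothetical, and the value types of `session_id` (a string) and `proxy` (a boolean) are assumptions, since the list above only names the parameters.

```python
def build_fetch_kwargs(urls, text_content=False, session_id=None, proxy=None):
    """Assemble keyword arguments for BrowserbaseFetcher.run().

    Hypothetical helper; the types of session_id (str) and proxy (bool)
    are assumptions and are not confirmed by the parameter list.
    """
    if isinstance(urls, str):
        # A bare string would otherwise be split into single characters by list().
        urls = [urls]
    urls = list(urls)
    if not urls:
        raise ValueError("urls is required and must contain at least one URL")

    kwargs = {"urls": urls, "text_content": text_content}
    # Only forward the optional parameters when they were explicitly provided.
    if session_id is not None:
        kwargs["session_id"] = session_id
    if proxy is not None:
        kwargs["proxy"] = proxy
    return kwargs


kwargs = build_fetch_kwargs("https://example.com", text_content=True, proxy=True)
print(kwargs)
# With the package installed: BrowserbaseFetcher().run(**kwargs)
```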
File details
Details for the file browserbase_haystack-0.0.4.tar.gz.
File metadata
- Download URL: browserbase_haystack-0.0.4.tar.gz
- Upload date:
- Size: 3.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 423eda63c7e01a2e91e7b61bf924dc6d298e4213436492d98caefbd65c8f1178 |
| MD5 | 96f840bb6f75b5a532e59b841f40fba9 |
| BLAKE2b-256 | c5821396c07589b56ee600fc39851f058942e93723542e0cf26df15a45c1f3b4 |
File details
Details for the file browserbase_haystack-0.0.4-py3-none-any.whl.
File metadata
- Download URL: browserbase_haystack-0.0.4-py3-none-any.whl
- Upload date:
- Size: 3.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 96e0762550ab6fae51c30fd632fe3d12d31ef6641fcf34ed8da3568c5f825712 |
| MD5 | c9527b0aba7ffcc42d6fd8942cd71d8c |
| BLAKE2b-256 | 777872bea2ff73abbd9abf22f26bb349bfb25cdb86ccc656bf88381acf8b497b |
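A downloaded file can be checked against the SHA256 digests listed for it. The following is a minimal sketch using only Python's standard library; the function name `sha256_matches` is hypothetical, and the file path in the usage note assumes the archive was saved to the current directory.

```python
import hashlib
import hmac


def sha256_matches(path, expected_hex, chunk_size=65536):
    """Return True if the file's SHA256 digest equals expected_hex."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return hmac.compare_digest(digest.hexdigest(), expected_hex.strip().lower())
```

For example, `sha256_matches("browserbase_haystack-0.0.4.tar.gz", "423eda63c7e01a2e91e7b61bf924dc6d298e4213436492d98caefbd65c8f1178")` should return `True` for an unmodified copy of the source distribution.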