Browserbase Haystack Fetcher

Browserbase is a developer platform to reliably run, manage, and monitor headless browsers.

Power your AI data retrievals with:

Installation and setup

  • Get an API key and Project ID from browserbase.com and set them as environment variables (BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID).
  • Install the required dependencies:
pip install browserbase-haystack
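
Before creating the fetcher, it can help to confirm that both variables are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_browserbase_env` is hypothetical and not part of browserbase-haystack.

```python
import os

# Variable names taken from the setup step above.
REQUIRED_ENV_VARS = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID")


def missing_browserbase_env(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of browserbase-haystack.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_browserbase_env()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Browserbase credentials found.")
```

Passing a plain dict instead of relying on `os.environ` keeps the check easy to exercise in tests.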

Usage

You can load webpages into Haystack using BrowserbaseFetcher. Optionally, set the text_content parameter to convert the pages to a text-only representation.

Standalone

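from browserbase_haystack import BrowserbaseFetcher

# Credentials are expected in the environment variables described in the setup step.
browserbase_fetcher = BrowserbaseFetcher()
# Keep the return value so the fetched pages can be inspected afterwards.
result = browserbase_fetcher.run(urls=["https://example.com"], text_content=False)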
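
To inspect what the fetcher returned, a short preview of each page is often enough. The sketch below assumes the return value of `run` is a dict with a "documents" list (consistent with the `fetcher.documents` output used in the pipeline example) and that each item exposes a `content` attribute, as Haystack Document objects do. The helper `preview_documents` is hypothetical, and the sample data is a stand-in so the snippet runs without network access.

```python
from types import SimpleNamespace


def preview_documents(result, width=60):
    """Return a one-line preview of each document's content.

    Assumes `result` is a dict with a "documents" list whose items expose
    a `content` attribute, as Haystack Document objects do.
    """
    previews = []
    for doc in result.get("documents", []):
        text = " ".join((doc.content or "").split())  # collapse whitespace
        previews.append(text[:width])
    return previews


# Stand-in for a real fetcher result (assumed shape, not real output).
fake_result = {
    "documents": [
        SimpleNamespace(content="<html>\n  <title>Example Domain</title>\n</html>")
    ]
}
print(preview_documents(fake_result))
```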

In a pipeline

from haystack import Pipeline
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from browserbase_haystack import BrowserbaseFetcher

prompt_template = (
    "Tell me the titles of the given pages. Pages: {{ documents }}"
)
prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator()  # reads the OpenAI key from the OPENAI_API_KEY environment variable

browserbase_fetcher = BrowserbaseFetcher()

pipe = Pipeline()
pipe.add_component("fetcher", browserbase_fetcher)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)

pipe.connect("fetcher.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "llm.prompt")
result = pipe.run(data={"fetcher": {"urls": ["https://example.com"]}})
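
The pipeline result can then be unpacked to get the model's answer. This is a minimal sketch, assuming the result is a dict keyed by component name and that the generator emits a "replies" list, as Haystack's OpenAIGenerator does. The helper `first_reply` is hypothetical, and `sample_result` is a stand-in with the assumed shape rather than real output.

```python
def first_reply(result, component="llm"):
    """Return the first generated reply, or None if there is none.

    Assumes the pipeline result is keyed by component name and that the
    generator emits a "replies" list.
    """
    replies = result.get(component, {}).get("replies", [])
    return replies[0] if replies else None


# Stand-in result with the assumed shape; a real run needs API credentials.
sample_result = {"llm": {"replies": ["Example Domain"], "meta": [{}]}}
print(first_reply(sample_result))
```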

Parameters

  • urls: Required. A list of URLs to fetch.
  • session_id: Optional. The session ID.
  • proxy: Optional. Enable the proxy.
  • text_content: Optional. Return only the text content of the page.
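
The parameters above can be assembled into keyword arguments for `run`. The following sketch is illustrative only: `build_run_kwargs` is a hypothetical helper, and the value types (a string for session_id, booleans for proxy and text_content) are assumptions inferred from the descriptions, not confirmed by the package.

```python
def build_run_kwargs(urls, session_id=None, proxy=None, text_content=None):
    """Assemble keyword arguments for BrowserbaseFetcher.run.

    Hypothetical helper. The value types (str for session_id, bool for
    proxy and text_content) are assumptions based on the parameter list.
    """
    if isinstance(urls, str) or not urls:
        raise ValueError("urls must be a non-empty list of URL strings")
    kwargs = {"urls": list(urls)}
    optional = {"session_id": session_id, "proxy": proxy, "text_content": text_content}
    # Leave out anything unset so the fetcher's own defaults apply.
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs


print(build_run_kwargs(["https://example.com"], text_content=True))
```

The resulting dict can be unpacked into the call, for example `browserbase_fetcher.run(**kwargs)`.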
