No project description provided
Project description
Dendrite
Dendrite is a developer tool that makes it very easy to interact with and scrape websites using AI.
This project is still in it's infancy and is susceptible to many changes in the coming weeks.
If you want to chat with us developers, give feedback and report bugs, our discord is the place to be! Invite link
Installation:
pip install dendrite-python-sdk && playwright install
Quick start:
Here is a very simple example of how to use Dendrite. In the example we go to google.com, close the cookie popup, find the search bar and enter 'hello world' into it.
To do this we install dendrite and asyncio like so: pip install asyncio dendrite-python-sdk && playwright install
and we then create a file called main.py
which we can run with python main.py
from the terminal.
main.py
from dendrite_python_sdk import DendriteBrowser
import asyncio
async def main():
dendrite_browser = DendriteBrowser(openai_api_key=...) # Use your own OpenAI API key here
page = await dendrite_browser.goto("https://google.com")
close_cookies = await page.get_interactable_element("reject cookies popup button")
await close_cookies.click(expected_outcome="That the cookies popup closed.")
search_bar = await page.get_interactable_element("Return the search bar")
await search_bar.fill("hello world")
await dendrite_browser.close()
asyncio.run(main())
More Advanced Example:
With Dendrite we can do more than just finding and using elements.
We can scrape data from pages and ensure that the returned data followed a certain structure by prompting, using a JSON schema or passing in a pydantic model.
Below is an example of how he can, see more examples in dendrite_python_sdk/examples
.
import os
import asyncio
from dendrite_python_sdk import DendriteBrowser
from dotenv import load_dotenv, find_dotenv
import pandas as pd
load_dotenv(find_dotenv())
async def main():
dendrite_browser = DendriteBrowser(
openai_api_key=os.environ.get("OPENAI_API_KEY", ""),
dendrite_api_key=os.environ.get("DENDRITE_API_KEY", "") # Generate a dendrite API at https://dendrite.se to get better rate limits and latency.
)
page = await dendrite_browser.goto(
"https://ycombinator.com/companies", scroll_through_entire_page=False
)
# Get elements with prompts instead of using brittle selectors
search_bar = await page.get_interactable_element(
"The search bar used to find startups"
)
await search_bar.fill("AI agent")
# Scraping data is as easy as writing a good prompt
await page.scroll_through_entire_page()
startup_urls = await page.scrape(
"Extract a list of valid urls for each listed startup's YC page"
)
# Let's go to every AI agent startup's page and fetch information about the founders
startup_info = []
for url in startup_urls:
page = await dendrite_browser.goto(url)
startup_details = await page.scrape(
"Extract a dict like this: {name: str, slogan: str, founders_social_media_urls: a string where each url is separated by a linebreak}"
)
print("startup_details: ", startup_details)
startup_info.append(startup_details)
df = pd.DataFrame(startup_info)
df.to_excel("ai_agent_founders.xlsx", index=False)
asyncio.run(main())
Class Documentation
DendriteBrowser
DendriteBrowser is an abstraction on top of playwright's browser. You can do everything you would normally do with Playwright but with a few new handy AI functions. Access
Methods
__init__(self, openai_api_key: str, id=None, playwright_options: Any = {"headless": False})
Initializes the DendriteBrowser class.
Params:
openai_api_key
(str): Your OpenAI API Key.id
(Optional): The unique ID for the browser session.playwright_options
(dict): Options to configure Playwright browser.
get_active_page(self) -> DendritePage
Returns the active page.
goto(self, url: str, scroll_through_entire_page: Optional[bool] = True) -> DendritePage
Navigates to the specified URL.
Params:
url
(str): URL of the webpage.scroll_through_entire_page
(Optional[bool]): Whether to scroll through the entire page.
google_search(self, query: str, filter_results_prompt: Optional[str] = None, load_all_results: Optional[bool] = True) -> List[SearchResult]
Performs a Google search and returns search results.
Params:
query
(str): The search query.filter_results_prompt
(Optional[str]): Prompt to filter results.load_all_results
(Optional[bool]): Whether to load all results.
new_page(self) -> DendritePage
Opens a new page in the browser.
close(self)
Closes the browser context.
DendritePage
Abstraction layer on top of Playwright's Page class. Use get_playwright_locator
to get the playwright page.
Methods
get_playwright_page(self) -> Page
Returns the Playwright page instance.
scrape(self, prompt: str, expected_return_data: Optional[str] = None, return_data_json_schema: Optional[Any] = None, pydantic_return_model: Optional[Type[BaseModel]] = None) -> Any
Scrapes data from the page based on a prompt.
Params:
prompt
(str): The prompt to gather data.expected_return_data
(Optional[str]): Expected return data.return_data_json_schema
(Optional[Any]): JSON schema for return data.pydantic_return_model
(Optional[Type[BaseModel]]): Pydantic model for return data.
scroll_through_entire_page(self) -> None
Scrolls through the entire page.
get_interactable_element(self, prompt: str) -> DendriteLocator
Returns an interactable element based on a prompt.
DendriteLocator
Abstraction layer on top of Playwright's Locator class. Use get_playwright_locator
to get the playwright locator.
Methods
get_playwright_locator(self) -> Locator
Returns the Playwright locator.
wait_for_page_changes(self, old_url: str, timeout: float = 2)
Waits for changes in the page. Params:
old_url
(str): The URL before the interaction.timeout
(float): The time to wait for changes.
click(self, expected_outcome="", *args, **kwargs) -> InteractionResponse
Clicks the element and handles the interaction.
Params:
expected_outcome
(str): Expected outcome after clicking.
fill(self, value: str, expected_outcome="", *args, **kwargs)
Fills the element with the provided value.
Params:
value
(str): The value to fill.expected_outcome
(str): Expected outcome after filling.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file dendrite_python_sdk-1.0.0.tar.gz
.
File metadata
- Download URL: dendrite_python_sdk-1.0.0.tar.gz
- Upload date:
- Size: 46.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.10.4 Darwin/23.5.0
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 4892d7d5fdb29dd2f9133f3a5d3f96cab4f3fbd018334a89110f101736293704 |
|
MD5 | babc36c7cd8cf580f8e97e475e4e78fe |
|
BLAKE2b-256 | fc973f6995c05bf7f2c9804d6d457cea71f4541d2de4acc8c9682f440135233a |
File details
Details for the file dendrite_python_sdk-1.0.0-py3-none-any.whl
.
File metadata
- Download URL: dendrite_python_sdk-1.0.0-py3-none-any.whl
- Upload date:
- Size: 25.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.10.4 Darwin/23.5.0
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 6fb144e555a0451085689b1162de3185b661a27da05915f23f3ad3b28fe7d3fc |
|
MD5 | da9f4872248fa576661c2c0ee20dbf7e |
|
BLAKE2b-256 | 995779c515c45658bb6ed78ddb07901f665bdd526d17c583005d0c23216263dd |