Get HTML elements locators using natural language.
Locatr
The Locatr package helps you find HTML element locators on a webpage using natural-language prompts and LLMs.
Overview
- LLM-based HTML locator finder.
- Re-rank support for improved accuracy.
- Supports Playwright, Selenium, and CDP.
- Uses a cache to reduce calls to LLM APIs.
- Generates results and statistics for LLM API calls.
Example:
starButtonLocator, err := locatr.GetLocatr("Star button on the page")
// handle err before using the locator
starButtonLocator.Click()
Install Locatr with
go get github.com/vertexcover-io/locatr
Table of Contents
- Quick Example
- LLM Client
- Re-ranking Client
- Locatr Options
- Locatrs
- Cache Schema & Management
- Logging
- Locatr Results
- Contributing
Quick Example
Here's a quick example of how to use the project:
package main

import (
	"fmt"
	"log"
	"os"
	"time"

	"github.com/playwright-community/playwright-go"
	"github.com/vertexcover-io/locatr"
)

func main() {
	pw, err := playwright.Run()
	if err != nil {
		log.Fatalf("could not start playwright: %v", err)
	}
	defer pw.Stop()

	browser, err := pw.Chromium.Launch(
		playwright.BrowserTypeLaunchOptions{
			Headless: playwright.Bool(false),
		},
	)
	if err != nil {
		log.Fatalf("could not launch browser: %v", err)
	}
	defer browser.Close()

	page, err := browser.NewPage()
	if err != nil {
		log.Fatalf("could not create page: %v", err)
	}
	if _, err := page.Goto("https://hub.docker.com/"); err != nil {
		log.Fatalf("could not navigate to docker hub: %v", err)
	}
	time.Sleep(5 * time.Second) // wait for page to load

	llmClient, err := locatr.NewLlmClient(
		locatr.OpenAI, // (openai | anthropic),
		os.Getenv("LLM_MODEL_NAME"),
		os.Getenv("LLM_API_KEY"),
	)
	if err != nil {
		log.Fatalf("could not create llm client: %v", err)
	}

	options := locatr.BaseLocatrOptions{
		UseCache:  true,
		LogConfig: locatr.LogConfig{Level: locatr.Silent},
		LlmClient: llmClient,
	}
	playWrightlocatr := locatr.NewPlaywrightLocatr(page, options)

	searchBarLocator, err := playWrightlocatr.GetLocatr("Search Docker Hub input field")
	if err != nil {
		log.Fatalf("could not get locator: %v", err)
	}
	fmt.Println(searchBarLocator.InnerHTML())
}
Please check the examples directory for more examples.
LLM Client
The LlmClient is a wrapper around the LLM provider you want to use. Supported providers are locatr.OpenAI and locatr.Anthropic. The client is optional; if it is not provided in the options, Locatr automatically creates an LlmClient from environment variables.
The following environment variables will be read to create a default LlmClient:
- LLM_PROVIDER: Defines which provider's LLM should be utilized (openai, anthropic).
- LLM_MODEL: Specifies the model to use.
- LLM_API_KEY: The API key required to authenticate with the LLM provider.
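Since the default client depends on the variables listed above, it can help to check them before starting a run. The following is a minimal sketch using only the standard library; the helper names (requiredLLMEnv, missingEnv) are hypothetical and not part of Locatr, and treating an empty value as missing is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// requiredLLMEnv lists the variables this README says are read when no
// LlmClient is supplied in the options.
var requiredLLMEnv = []string{"LLM_PROVIDER", "LLM_MODEL", "LLM_API_KEY"}

// missingEnv returns the names in keys for which lookup reports no value or
// an empty value (treating empty as missing is an assumption). The lookup
// function is injected so the check can run without the real environment.
func missingEnv(keys []string, lookup func(string) (string, bool)) []string {
	var missing []string
	for _, k := range keys {
		if v, ok := lookup(k); !ok || v == "" {
			missing = append(missing, k)
		}
	}
	return missing
}

func main() {
	if missing := missingEnv(requiredLLMEnv, os.LookupEnv); len(missing) > 0 {
		fmt.Println("set these variables before relying on the default LlmClient:", missing)
		return
	}
	fmt.Println("all LLM environment variables are set")
}
```

Running this before constructing the options gives an early, readable failure instead of an error at the first GetLocatr call.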
To create a new LLM client, call the locatr.NewLlmClient function:
import (
	"os"

	"github.com/vertexcover-io/locatr"
)

llmClient, err := locatr.NewLlmClient(
	locatr.OpenAI, // Supported providers: "openai" | "anthropic"
	os.Getenv("LLM_MODEL_NAME"),
	os.Getenv("LLM_API_KEY"),
)
options := locatr.BaseLocatrOptions{
	LlmClient: llmClient,
}
To run without creating an LLM client, leave LlmClient out of the options:
import (
	"github.com/vertexcover-io/locatr"
)

options := locatr.BaseLocatrOptions{
	UseCache: true,
}
Re-ranking Client
ReRankClient is a wrapper around the re-ranking provider you want to use. Currently, only the Cohere re-ranker is supported.
Note: No re-ranking client is created by default if one is not provided in BaseLocatrOptions.
- Only re-ranked HTML chunks with a score greater than 0.9 are sent to the LLM.
- The default Cohere re-ranking model is rerank-english-v3.0.
To create a Cohere re-ranker, use the following code:
import (
	"os"

	"github.com/vertexcover-io/locatr"
)

reRankClient, err := locatr.NewCohereClient(
	os.Getenv("COHERE_API_KEY"),
)
options := locatr.BaseLocatrOptions{
	ReRankClient: reRankClient,
}
Advantages of using re-ranking in Locatr
- Using re-ranking reduces the input context sent to the LLM.
- Re-ranked chunks will contain only the most relevant HTML chunks, improving the accuracy.
- Sending less input context to the LLM reduces response time and lowers the cost per LLM call.
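The score threshold mentioned in this section can be pictured as a simple filter over scored chunks. The sketch below is illustrative only: scoredChunk and filterByScore are hypothetical names, the scores are made up, and this is not Locatr's internal implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// scoredChunk is a hypothetical pairing of an HTML chunk with the relevance
// score a re-ranker assigned to it; it is not a Locatr type.
type scoredChunk struct {
	HTML  string
	Score float64
}

// filterByScore keeps chunks whose score is strictly greater than threshold
// and returns them ordered from most to least relevant.
func filterByScore(chunks []scoredChunk, threshold float64) []scoredChunk {
	kept := make([]scoredChunk, 0, len(chunks))
	for _, c := range chunks {
		if c.Score > threshold {
			kept = append(kept, c)
		}
	}
	sort.SliceStable(kept, func(i, j int) bool { return kept[i].Score > kept[j].Score })
	return kept
}

func main() {
	// Made-up scores for illustration.
	chunks := []scoredChunk{
		{HTML: `<footer>...</footer>`, Score: 0.12},
		{HTML: `<input id="search">`, Score: 0.97},
		{HTML: `<nav>...</nav>`, Score: 0.91},
	}
	for _, c := range filterByScore(chunks, 0.9) {
		fmt.Printf("%.2f %s\n", c.Score, c.HTML)
	}
}
```

With these sample values only two of the three chunks pass the 0.9 cut, which is how re-ranking shrinks the context sent to the LLM.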
Locatr Options
locatr.BaseLocatrOptions is a struct with multiple fields used to configure caching, logging, and output file paths in locatr.
Fields
- CachePath (string):
  - Path where the cache will be saved.
  - Example: "/path/to/cache/file"
- UseCache (bool):
  - Default is false. Set to true to enable caching.
- LogConfig (LogConfig):
  - Configuration for logging behavior.
  - Level (LogLevel):
    - Sets the log level. Controls the verbosity of logging.
    - Example: locatr.Info to log errors, warnings, and info messages.
  - Writer (Writer):
    - Destination for log output. Implement the Printf function for custom log handling.
- ResultsFilePath (string):
  - Path to the file where locatr results will be saved.
  - If not provided, results will be saved to DEFAULT_LOCATR_RESULTS_FILE.
- LlmClient (LlmClientInterface):
  - Optional value; if not provided, it will be created by default (see the LLM Client section).
- ReRankClient (ReRankInterface):
  - The ReRankClient you want to use. When this is passed, Locatr uses the re-ranking client to re-rank the HTML chunks (see the Re-ranking Client section).
Locatrs
Locatrs are wrappers around the underlying automation plugin (Playwright, Selenium, or CDP).
PlaywrightLocatr
Create an instance of PlaywrightLocatr using:
playWrightLocatr := locatr.NewPlaywrightLocatr(page, options)
CdpLocatr
To use Locatr through CDP, we first need to start the browser with a CDP server. This can be achieved by running:
google-chrome --remote-debugging-port=9222
We can pass the same arguments when using Selenium or Playwright:
- Selenium:
chrome_options = Options()
chrome_options.add_argument("--remote-debugging-port=9222")
- Playwright:
browser = playwright.chromium.launch(headless=False, args=["--remote-debugging-port=9222"])
After starting the browser with CDP, we need the page ID. The page ID is essential to run Locatr scripts on the correct page. This can be achieved in two ways:
- Directly getting it from the CDP server:
  - Send a GET request to http://localhost:9222/json.
  - You will receive the following response:
[
  {
    "description": "",
    "devtoolsFrontendUrl": "/devtools/inspector.html?ws=localhost:9222/devtools/page/215947B924E9C4D232ADE7331FDBEBA6",
    "faviconUrl": "https://www.youtube.com/s/desktop/e718aa11/img/logos/favicon_32x32.png",
    "id": "215947B924E9C4D232ADE7331FDBEBA6",
    "title": "YouTube",
    "type": "page",
    "url": "https://www.youtube.com/",
    "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/215947B924E9C4D232ADE7331FDBEBA6"
  }
]
  - The id field contains the page ID.
- Get it through playwright:
const browser = await chromium.launch({ headless: false });
const context = await browser.newContext();
const page = await context.newPage();
const cdpSession = await context.newCDPSession(page);
const response = await cdpSession.send('Page.getFrameTree');
const pageId = response.frameTree.frame.id;
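The first approach above (querying the /json endpoint) can also be done from Go with the standard library. This is a minimal sketch: cdpTarget, pageTargets, and fetchPageTargets are hypothetical helper names, and the struct only maps the response fields shown in the sample response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// cdpTarget holds the fields of a /json entry that matter for picking a page.
type cdpTarget struct {
	ID    string `json:"id"`
	Title string `json:"title"`
	Type  string `json:"type"`
	URL   string `json:"url"`
}

// pageTargets decodes a /json response body and keeps only "page" targets.
func pageTargets(body []byte) ([]cdpTarget, error) {
	var all []cdpTarget
	if err := json.Unmarshal(body, &all); err != nil {
		return nil, fmt.Errorf("decode targets: %w", err)
	}
	var pages []cdpTarget
	for _, target := range all {
		if target.Type == "page" {
			pages = append(pages, target)
		}
	}
	return pages, nil
}

// fetchPageTargets queries the CDP HTTP endpoint on localhost at the given port.
func fetchPageTargets(port int) ([]cdpTarget, error) {
	resp, err := http.Get(fmt.Sprintf("http://localhost:%d/json", port))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return pageTargets(body)
}

func main() {
	pages, err := fetchPageTargets(9222)
	if err != nil {
		fmt.Println("could not list CDP targets:", err)
		return
	}
	for _, p := range pages {
		fmt.Printf("%s  %s  %s\n", p.ID, p.Title, p.URL)
	}
}
```

The ID printed for the page you want is the value to use as the page ID when connecting.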
Once we have the page ID, we can establish a connection with CDP:
connectionOpts := locatr.CdpConnectionOptions{
	Port:   9222,
	PageId: "177AE4272FC8BBE48190C697A27942DA",
}
connection, err := locatr.CreateCdpConnection(connectionOpts)
if err != nil {
	log.Fatalf("could not connect to CDP: %v", err)
}
defer connection.Close()
Now we can create the CDP Locatr with:
cdpLocatr, err := locatr.NewCdpLocatr(connection, options)
Selenium Locatr
Selenium Locatr can be created in two ways:
- Through selenium server url:
seleniumLocatr, err := locatr.NewRemoteConnSeleniumLocatr("http://localhost:4444/wd/hub", driver.SessionID(), options)
note: the path must always be /wd/hub
- Directly passing the selenium driver:
seleniumLocatr, err := locatr.NewSeleniumLocatr(driver, options)
Methods
- GetLocatr: Locates an element using a descriptive string and returns a Locator object.
searchBarLocator, err := playWrightLocatr.GetLocatr("Search Docker Hub input field")
Cache
Cache Schema
The cache is stored in JSON format. The schema is as follows:
{
"Page Full Url" : [
{
"locatr_name": "The description of the element you gave",
"locatrs": [
"input#search"
]
}
]
}
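To inspect a cache file by hand, the schema above maps onto a small set of Go types. The sketch below assumes the file matches that schema exactly; cacheEntry, cacheFile, parseCache, and lookup are hypothetical names rather than Locatr's own types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cacheEntry is one cached description together with its locators.
type cacheEntry struct {
	LocatrName string   `json:"locatr_name"`
	Locatrs    []string `json:"locatrs"`
}

// cacheFile maps a full page URL to the entries cached for that page.
type cacheFile map[string][]cacheEntry

// parseCache decodes the raw bytes of a cache file.
func parseCache(data []byte) (cacheFile, error) {
	var c cacheFile
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, fmt.Errorf("decode cache: %w", err)
	}
	return c, nil
}

// lookup returns the locators cached for description on pageURL, if any.
func (c cacheFile) lookup(pageURL, description string) ([]string, bool) {
	for _, e := range c[pageURL] {
		if e.LocatrName == description {
			return e.Locatrs, true
		}
	}
	return nil, false
}

func main() {
	sample := []byte(`{"https://hub.docker.com/": [
		{"locatr_name": "Search Docker Hub input field", "locatrs": ["input#search"]}
	]}`)
	c, err := parseCache(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	locs, ok := c.lookup("https://hub.docker.com/", "Search Docker Hub input field")
	fmt.Println(locs, ok)
}
```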
Cache Management
To remove the cache, delete the file at the path specified in BaseLocatrOptions's CachePath.
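Clearing the cache from a test setup step might look like the sketch below. clearCache is a hypothetical helper, not a Locatr function, and the path in main is a placeholder for whatever you set as CachePath.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// clearCache deletes the cache file at path. A file that does not exist is
// treated as already cleared rather than as an error.
func clearCache(path string) error {
	err := os.Remove(path)
	if err != nil && !errors.Is(err, fs.ErrNotExist) {
		return fmt.Errorf("remove cache %q: %w", path, err)
	}
	return nil
}

func main() {
	// Placeholder path; use the same value you pass as CachePath.
	if err := clearCache("/path/to/cache/file"); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("cache cleared")
}
```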
Logging
Logging is enabled by default in Locatr at the Error log level. To change it, pass a LogConfig value in the BaseLocatrOptions struct:
options := locatr.BaseLocatrOptions{UseCache: true, LogConfig: locatr.LogConfig{Level: locatr.Debug}}
Available Log Levels
The following log levels are available, in increasing order of priority:
- Debug: Logs all messages, including info, warn, and error.
- Info: Logs informational messages, warnings, and errors.
- Warning: Logs warnings and errors only.
- Error (Default): Logs only error messages.
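For custom log handling, the Locatr Options section says the LogConfig Writer needs a Printf function. The sketch below shows one possible writer that keeps log lines in memory, for example to assert on them in tests. The fmt-style signature Printf(format string, v ...any) is an assumption about what Locatr expects, and memoryWriter is a hypothetical type.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// memoryWriter collects formatted log lines in memory. Its Printf method uses
// the usual fmt-style signature; that this matches Locatr's Writer is an
// assumption.
type memoryWriter struct {
	mu    sync.Mutex
	lines []string
}

// Printf formats the message and stores it without its trailing newline.
func (w *memoryWriter) Printf(format string, v ...any) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.lines = append(w.lines, strings.TrimRight(fmt.Sprintf(format, v...), "\n"))
}

// Lines returns a copy of the collected log lines.
func (w *memoryWriter) Lines() []string {
	w.mu.Lock()
	defer w.mu.Unlock()
	return append([]string(nil), w.lines...)
}

func main() {
	w := &memoryWriter{}
	w.Printf("cache hit for %q\n", "Search Docker Hub input field")
	fmt.Println(w.Lines())
}
```

If the signature matches, a value like this would be set as the Writer field of LogConfig.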
Locatr Results
Locatr records information about each locatr request (each call to the GetLocatr function). Each result has the following schema:
- LocatrDescription (string): Description of the locatr passed to the request.
- Url (string): The URL associated with the locatr.
- CacheHit (bool): Indicates if the result was retrieved from the cache (true) or freshly generated (false).
- Locatr (string): The locatr generated by the operation.
- InputTokens (int): Number of input tokens processed by the LLM call.
- OutputTokens (int): Number of tokens generated in the output by the LLM call.
- TotalTokens (int): Sum of input and output tokens.
- LlmErrorMessage (string): The error message from the LLM, if any.
- ChatCompletionTimeTaken (int): Time taken, in seconds, for the LLM to complete locatr generation.
- AttemptNo (int): The attempt number when re-ranking is used.
- LocatrRequestInitiatedAt (time.Time): The timestamp when the request was initiated.
- LocatrRequestCompletedAt (time.Time): The timestamp when the request was completed.
- AllLocatrs ([]string): All the locatrs for the located elements.
Saving Results
Results can be saved to a file specified by locatr.BaseLocatrOptions.ResultsFilePath (locatr_results.json). If no file path is specified, results are written to locatr.DEFAULT_LOCATR_RESULTS_PATH.
- To write results to a file: Use the playwrightLocatr.WriteResultsToFile function.
Schema of the JSON file:
{
"locatr_description": "",
"url": "",
"cache_hit": false,
"locatr": "",
"input_tokens": 8399,
"output_tokens": 22,
"total_tokens": 8421,
"llm_error_message": "",
"llm_locatr_generation_time_taken": 1,
"attempt_no": 0,
"request_initiated_at": "",
"request_completed_at": "",
"all_locatrs": []
}
- To retrieve results as a slice: Use the playwrightLocatr.GetLocatrResults function.
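Once results are written to disk, simple statistics such as cache-hit count and token usage can be computed from the JSON keys shown in the schema. The sketch below assumes the results file contains a JSON array of such objects, which this README does not state explicitly; locatrResult, summary, and summarize are hypothetical names, and the sample data is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// locatrResult mirrors a subset of the JSON keys in the results schema.
type locatrResult struct {
	LocatrDescription string `json:"locatr_description"`
	CacheHit          bool   `json:"cache_hit"`
	TotalTokens       int    `json:"total_tokens"`
	LlmErrorMessage   string `json:"llm_error_message"`
}

// summary aggregates a batch of results.
type summary struct {
	Requests    int
	CacheHits   int
	LlmErrors   int
	TotalTokens int
}

// summarize decodes data, assumed to be a JSON array of result objects,
// and counts requests, cache hits, LLM errors, and total tokens.
func summarize(data []byte) (summary, error) {
	var results []locatrResult
	if err := json.Unmarshal(data, &results); err != nil {
		return summary{}, fmt.Errorf("decode results: %w", err)
	}
	s := summary{Requests: len(results)}
	for _, r := range results {
		if r.CacheHit {
			s.CacheHits++
		}
		if r.LlmErrorMessage != "" {
			s.LlmErrors++
		}
		s.TotalTokens += r.TotalTokens
	}
	return s, nil
}

func main() {
	// Illustrative sample, not real output.
	sample := []byte(`[
		{"locatr_description": "Search Docker Hub input field", "cache_hit": false, "total_tokens": 8421, "llm_error_message": ""},
		{"locatr_description": "Search Docker Hub input field", "cache_hit": true, "total_tokens": 0, "llm_error_message": ""}
	]`)
	s, err := summarize(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", s)
}
```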
Contributing
We welcome contributions! Please read our CONTRIBUTING.md guide to get started.