
pyvigate

Pyvigate is a Python framework that combines headless browsing with LLMs to help you build data solutions, product tours, RAG applications, web automation, functional tests, and more.


Installation

Pyvigate can be installed using pip or directly from the source for the latest version.

Using pip

pip install pyvigate

Installing from source

git clone https://github.com/kindsmiles/pyvigate.git
cd pyvigate
pip install .

Components

Pyvigate consists of several key components designed to work together seamlessly for web automation tasks.

PlaywrightEngine

PlaywrightEngine wraps Playwright, the library Pyvigate uses for headless browsing and other browser-automation tasks.

from pyvigate.core.engine import PlaywrightEngine

engine = PlaywrightEngine(headless=True)
await engine.start_browser()

QueryEngine (with Azure OpenAI)

QueryEngine incorporates AI to dynamically detect web page elements, significantly improving the efficiency and reliability of automated interactions. It can also help you navigate pages and build your own applications, such as curating data, creating RAG applications, product tours, and functional tests.

from pyvigate.ai.query_engine import QueryEngine

query_engine = QueryEngine(
    api_key=os.getenv("OPENAI_API_KEY"),
    azure_api_version=os.getenv("AZURE_API_VERSION"),
    azure_endpoint=os.getenv("AZURE_ENDPOINT"),
    azure_llm_deployment_name=os.getenv("LLM_DEPLOYMENT_NAME"),
    azure_embedding_deployment_name=os.getenv("EMBEDDING_DEPLOYMENT_NAME")
)
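The constructor above reads its settings from environment variables. If you load them with python-dotenv (as in the full example below), a `.env` file matching those variable names might look like this. All values are placeholders, and the API version shown is just one example:

```shell
# Placeholder Azure OpenAI credentials — replace with your own
OPENAI_API_KEY=your-azure-openai-key
AZURE_API_VERSION=2024-02-01
AZURE_ENDPOINT=https://your-resource.openai.azure.com/
LLM_DEPLOYMENT_NAME=your-gpt-deployment
EMBEDDING_DEPLOYMENT_NAME=your-embedding-deployment
```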

Login

Some products are accessible in the browser only after logging in. You can either identify the login selectors manually or let the AI detect the UI elements where the credentials should be entered. The Login component uses QueryEngine to intelligently identify login forms and fields, streamlining the login process.

from pyvigate.core.login import Login

login = Login(query_engine)
await login.perform_login(engine.page, "https://example.com/login", "username", "password")
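For the manual route, a common heuristic is to locate the password input and the text or email input just before it. Here is a pyvigate-independent sketch of that heuristic using only Python's built-in `html.parser`; the selectors pyvigate's AI detection actually produces may differ:

```python
from html.parser import HTMLParser

class LoginFieldFinder(HTMLParser):
    """Collects <input> elements so we can guess login selectors."""
    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.inputs.append(dict(attrs))

def guess_login_selectors(html):
    """Return (username_selector, password_selector), or (None, None)."""
    finder = LoginFieldFinder()
    finder.feed(html)
    password = next((i for i in finder.inputs
                     if i.get("type") == "password"), None)
    if password is None:
        return None, None
    # The username field is usually the last text/email input before the password.
    idx = finder.inputs.index(password)
    username = next((i for i in reversed(finder.inputs[:idx])
                     if i.get("type") in (None, "text", "email")), None)
    to_sel = lambda i: f'input[name="{i["name"]}"]' if i and i.get("name") else None
    return to_sel(username), to_sel(password)

page = '''
<form action="/login" method="post">
  <input type="email" name="user_email">
  <input type="password" name="user_pass">
  <button type="submit">Sign in</button>
</form>
'''
print(guess_login_selectors(page))
# → ('input[name="user_email"]', 'input[name="user_pass"]')
```

This is the kind of brittle, per-site logic that QueryEngine's AI-based detection is meant to replace.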

Scraping

The Scraping service provides powerful data extraction capabilities, collecting content from web pages after login or navigation.

from pyvigate.services.scraping import Scraping

scraping = Scraping(data_dir="data")
content = await scraping.extract_data_from_page(engine.page)
print("Scraped content:", content)
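Scraping takes a `data_dir` argument; if you want to persist the extracted content yourself, a stdlib-only sketch follows. This is an illustration, not pyvigate's actual on-disk format:

```python
import json
from pathlib import Path

def save_scraped(content, data_dir="data", name="page"):
    """Write scraped content to <data_dir>/<name>.json and return the path."""
    out_dir = Path(data_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{name}.json"
    path.write_text(json.dumps({"content": content}, indent=2), encoding="utf-8")
    return path

saved = save_scraped("<h1>Dashboard</h1>")
print(saved)  # e.g. data/page.json
```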

Caching

The Caching component allows for the local storage of web page content, facilitating offline analysis and reducing bandwidth usage.

from pyvigate.services.caching import Caching

caching = Caching(cache_dir="html_cache")
await caching.cache_page_content(engine.page, "https://example.com/page")
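Once pages are cached they can be read back offline. Pyvigate's actual cache layout isn't documented here, so this sketch assumes a simple URL-to-filename scheme to show the round trip:

```python
import hashlib
from pathlib import Path

def cache_path(url, cache_dir="html_cache"):
    """Map a URL to a stable filename inside the cache directory."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{digest}.html"

def write_cache(url, html, cache_dir="html_cache"):
    path = cache_path(url, cache_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding="utf-8")
    return path

def read_cache(url, cache_dir="html_cache"):
    """Return cached HTML, or None if the page was never cached."""
    path = cache_path(url, cache_dir)
    return path.read_text(encoding="utf-8") if path.exists() else None

write_cache("https://example.com/page", "<html>cached copy</html>")
print(read_cache("https://example.com/page"))  # <html>cached copy</html>
```

Hashing the URL keeps filenames filesystem-safe regardless of query strings or path separators in the URL.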

Full Example

Bringing it all together, here's how you can use Pyvigate to log in, scrape content, and cache it:

import asyncio
from dotenv import load_dotenv
from pyvigate.core.engine import PlaywrightEngine
from pyvigate.core.login import Login
from pyvigate.services.scraping import Scraping
from pyvigate.services.caching import Caching
from pyvigate.ai.query_engine import QueryEngine
import os

load_dotenv()

async def login_and_scrape():
    engine = PlaywrightEngine(headless=True)
    await engine.start_browser()

    query_engine = QueryEngine(
        api_key=os.getenv("OPENAI_API_KEY"),
        azure_api_version=os.getenv("AZURE_API_VERSION"),
        azure_endpoint=os.getenv("AZURE_ENDPOINT"),
        azure_llm_deployment_name=os.getenv("LLM_DEPLOYMENT_NAME"),
        azure_embedding_deployment_name=os.getenv("EMBEDDING_DEPLOYMENT_NAME")
    )
    login = Login(query_engine)
    await login.perform_login(engine.page, "https://example.com/login", os.getenv("USERNAME"), os.getenv("PASSWORD"))

    scraping = Scraping(data_dir="data")
    content = await scraping.extract_data_from_page(engine.page)
    print("Scraped content:", content)

    caching = Caching(cache_dir="html_cache")
    await caching.cache_page_content(engine.page, "https://example.com/dashboard")

    await engine.stop_browser()

if __name__ == "__main__":
    asyncio.run(login_and_scrape())

Donations

Donations help us cover our API costs, so any contribution is appreciated:

Sponsor me on Buy Me a Coffee

Latest release: pyvigate 0.0.3, available on PyPI as a source distribution and a Python 3 wheel.
