DoNew

A Python package for web processing and vision tasks with browser automation capabilities.

A powerful Python package designed for AI agents to perform web processing, document navigation, and autonomous task execution. DoNew provides a high-level, agentic interface that makes it easy for AI systems to interact with web content and documents.

Quick Install

pip install donew
donew-install-browsers  # Install required browsers
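
To confirm the installation, importing the DO entry point should succeed; the one-liner below is just a quick sanity check:

python -c "from donew import DO; print(DO)"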

Why DoNew?

DoNew is built with AI agents in mind, providing intuitive interfaces for:

  • Autonomous web navigation and interaction
  • Document understanding and processing
  • Task execution and decision making
  • State management and context awareness

Features

  • Browser automation using Playwright
  • Web page processing and interaction
  • Vision-related tasks and image processing
  • Easy-to-use API for web automation
  • Async support for better performance (see the sketch after this list)
  • AI-friendly interfaces for autonomous operation
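
To illustrate the async support, the sketch below fetches two pages concurrently. It assumes that separate DO.Browse calls yield independent browser sessions; the URLs are placeholders.

import asyncio
from donew import DO

async def fetch_text(url: str) -> str:
    # Open a page, read its text content, and always close the session
    browser = await DO.Browse(url)
    try:
        return await browser.text()
    finally:
        await browser.close()

async def main():
    # asyncio.gather schedules both fetches on the same event loop
    pages = await asyncio.gather(
        fetch_text("https://example.com"),
        fetch_text("https://example.org"),
    )
    for content in pages:
        print(content[:80])

if __name__ == "__main__":
    asyncio.run(main())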

Roadmap

Current Features

  • DO.Browse: Agentic web navigation
    • Autonomous webpage interaction
    • Element detection and manipulation
    • State awareness and context management
    • Cookie and storage handling
    • Visual debugging tools

Coming Soon

  • DO.Read: Agentic document navigation
    • PDF processing and understanding
    • Document structure analysis
    • Content extraction and processing
    • Cross-document reference handling
  • DO(...).New: Agentic behavior execution
    • Task planning and execution
    • Decision making based on content
    • Multi-step operation handling
    • Context-aware actions

Quick Start

import asyncio
from donew import DO

async def main():
    # Configure browser settings (optional)
    DO.Config(headless=True)  # Run in headless mode
    
    # Start agentic web navigation
    browser = await DO.Browse("https://example.com")
    
    try:
        # Analyze page content
        content = await browser.text()
        print("Page content:", content)
        
        # Get all interactive elements with their context
        elements = browser.elements()
        
        # Smart element detection (finds relevant input fields by context)
        input_fields = {
            elem.element_label or elem.attributes.get("name", ""): id
            for id, elem in elements.items()
            if elem.element_type == "input"
            and elem.attributes.get("type") in ["text", "email"]
        }
        
        # Autonomous form interaction
        for label, element_id in input_fields.items():
            await browser.type(element_id, f"test_{label}")
        
        # State management
        cookies = await browser.cookies()
        print("Current browser state (cookies):", cookies)
        
        # Context persistence
        await browser.storage({
            "localStorage": {"agent_context": "form_filling"},
            "sessionStorage": {"task_state": "in_progress"}
        })
        
        # Visual debugging (helps AI understand page state)
        await browser.toggle_annotation(True)
        
        # Get current state for decision making
        state = await browser._get_state_dict()
        print("Current agent state:", state)
        
    finally:
        await browser.close()

if __name__ == "__main__":
    asyncio.run(main())

Example: AI Agent Task Execution

from donew import DO

async def search_and_extract(query: str):
    browser = await DO.Browse("https://example.com/search")
    try:
        # Find and interact with search form
        elements = browser.elements()
        search_input = next(
            (id for id, elem in elements.items()
             if elem.element_type == "input"
             and elem.element_label
             and "search" in elem.element_label.lower()),
            None
        )
        
        if search_input:
            # Execute search
            await browser.type(search_input, query)
            await browser.press("Enter")
            
            # Wait for and analyze results
            content = await browser.text()
            
            # Extract structured data
            return {
                "query": query,
                "results": content,
                "page_state": await browser._get_state_dict()
            }
    finally:
        await browser.close()
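
As with the Quick Start, the coroutine is driven by an event loop; the query string here is only a placeholder:

import asyncio

if __name__ == "__main__":
    results = asyncio.run(search_and_extract("donew"))
    print(results)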

Development Setup

Requirements

  • Python 3.11 (required for Knowledge Graph functionality)
  • uv package manager (recommended over pip)

Installation Steps

  1. Clone the repository:
git clone https://github.com/DONEWio/donew.git
cd donew
  2. Install uv if you haven't already:
curl -LsSf https://astral.sh/uv/install.sh | sh
  3. Create and activate a virtual environment:
uv venv -p python3.11
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  4. Install dependencies:

    • For basic usage:
    uv pip install pip
    uv pip install -e ".[dev]"

    • For Knowledge Graph functionality:
    uv pip install pip
    uv pip install -e "."
    uv pip install -e ".[kg,dev]"
    uv run -- spacy download en_core_web_md
    #uv run -- spacy download en_core_web_lg  # Large web model
    #uv run -- spacy download en_core_web_sm  # Small web model

  5. Install Playwright browsers:

playwright install chromium
playwright install  # or all browsers

Testing

Run the test suite:

pytest tests/ --httpbin-url=https://httpbin.org

For more detailed testing options, including running against a local or remote httpbin instance, see the Testing Documentation. (#TODO)
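
For example, one way to point the suite at a locally hosted httpbin (assuming Docker is available; the port mapping is illustrative):

docker run -d -p 8080:80 kennethreitz/httpbin
pytest tests/ --httpbin-url=http://localhost:8080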

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Knowledge Graph Component

The Knowledge Graph component (donew.see.graph) provides entity and relationship extraction from text, with persistent storage in KuzuDB. This implementation is inspired by and adapted from the GraphGeeks.org talk and strwythura.

Features

  • Named Entity Recognition using GLiNER
  • Relationship Extraction using GLiREL
  • Graph storage and querying with KuzuDB
  • Text processing and chunking with spaCy

Graph Construction

The graph is built in layers:

  1. Base Layer: Textual analysis using spaCy parse trees
  2. Entity Layer: Named entities and noun chunks from GLiNER
  3. Relationship Layer: Semantic relationships from GLiREL
  4. Storage Layer: Persistent graph storage in KuzuDB
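
For illustration, here is a minimal sketch of the first two layers (spaCy parsing plus GLiNER entities). It assumes the spacy and gliner packages and the en_core_web_md model are installed; the internal pipeline in donew.see.graph may differ in model choice and chunking.

import spacy
from gliner import GLiNER

text = (
    "OpenAI CEO Sam Altman has partnered with Microsoft. "
    "The collaboration was announced in San Francisco."
)

# Base layer: spaCy parse for sentences and noun chunks
nlp = spacy.load("en_core_web_md")
doc = nlp(text)
noun_chunks = [chunk.text for chunk in doc.noun_chunks]

# Entity layer: GLiNER zero-shot named entity recognition
# (the model name below is an assumption, not necessarily what DoNew uses)
model = GLiNER.from_pretrained("urchade/gliner_medium-v2.1")
entities = model.predict_entities(text, ["Person", "Company", "City"])

print(noun_chunks)
print([(e["text"], e["label"]) for e in entities])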

Usage

from donew.see.graph import KnowledgeGraph

# Initialize KG (in-memory or with persistent storage)
kg = KnowledgeGraph(db_path="path/to/db")  # or None for in-memory

# Analyze text
result = kg.analyze("""
OpenAI CEO Sam Altman has partnered with Microsoft.
The collaboration was announced in San Francisco.
""")

# Query the graph
ceo_relations = kg.query("""
MATCH (p:Entity)-[r:Relation]->(o:Entity)
WHERE p.label = 'Person' AND o.label = 'Company'
AND r.type = 'FOUNDER'
RETURN p.text as Founder, o.text as Company
ORDER BY Founder;
""") 
