Julia Browser 🌐

A comprehensive Python-based CLI web browser with JavaScript support and modern web compatibility.

Julia Browser renders web pages in the terminal: it executes page JavaScript through an embedded engine, simulates the browser APIs that modern sites expect, and prints the result as clean, readable output.

Features

  • Enhanced JavaScript Engine: Mozilla SpiderMonkey integration via PythonMonkey
  • Modern Web Compatibility: Full HTML DOM API, CSS Object Model (CSSOM), and modern JavaScript APIs
  • Interactive CLI Interface: Rich terminal interface with comprehensive web interaction support
  • Advanced Navigation: Back/forward, bookmarks, history, and smart link following
  • Intelligent Content Processing: Dynamic content filtering and clean markdown output
  • Performance Optimizations: Caching, asynchronous execution, and connection pooling
  • Real Web Interactions: Form submission, file uploads, authentication flows
  • Multiple Output Formats: Markdown, HTML, and JSON rendering
  • Responsive Design Detection: Breakpoint analysis and mobile-first patterns
  • Web Components Support: Extracts content from custom elements and Shadow DOM
  • CSS Generated Content: Captures ::before/::after pseudo-element text
  • Modern Layout Support: Proper text flow for CSS Grid and Flexbox layouts
  • Shadow DOM Access: Accesses encapsulated component content from modern frameworks

Installation

pip install julia-browser

Quick Start

Command Line Interface

# Browse a website
julia-browser browse https://example.com

# Start interactive mode
julia-browser interactive

# Render to different formats
julia-browser render https://api.github.com --format json

AI Agent SDK 🤖

Julia Browser includes an AI Agent SDK that lets AI systems drive websites the way a person would. Each function does one thing and mirrors a CLI browser command.

Perfect for: Web automation, AI agents, research workflows, data collection, form automation, and human-like web interactions.

Simple Functions - Just Like CLI Commands

from julia_browser import AgentSDK

# Initialize AI agent
agent = AgentSDK()

# Open a website (like CLI 'browse' command)
result = agent.open_website("https://example.com")
print(f"Opened: {result['title']}")

# List interactive elements (like CLI 'elements' command)
elements = agent.list_elements()
print(f"Found {elements['total_clickable']} clickable elements")

# Type into input fields (like CLI 'type' command)
agent.type_text(1, "search query")

# Click buttons or links (like CLI 'click' command)
agent.click_element(1)

# Submit forms (like CLI form submission)
result = agent.submit_form()
print(f"Form submitted: {result['title']}")

Complete AI Agent SDK Functions

Core Website Functions:

  • open_website(url) - Open any website and get page content
  • list_elements() - List all clickable elements and input fields with numbers
  • get_page_info() - Get current page title, URL, and full content
  • search_page(term) - Search for text within the current page

Human-like Interactions:

  • click_element(number) - Click buttons or links by their number
  • type_text(field_number, text) - Type text into input fields by number
  • submit_form() - Submit forms with typed data
  • follow_link(number) - Navigate to links by their number

Page Navigation & Scrolling:

  • scroll_down(chunks=1) - Scroll down to see more content (like browser scrolling)
  • scroll_up(chunks=1) - Scroll up to see previous content
  • scroll_to_top() - Jump to the top of the page
  • scroll_to_bottom() - Jump to the bottom of the page
  • get_scroll_info() - Get current scroll position and page info

Real-World Usage Examples

1. Simple Web Search:

agent = AgentSDK()

# Open search engine
agent.open_website("https://duckduckgo.com")

# Type search query into first input field
agent.type_text(1, "Python programming")

# Submit the form
result = agent.submit_form()
print(f"Search results: {result['title']}")

2. Form Automation:

agent = AgentSDK()

# Open a form page
agent.open_website("https://httpbin.org/forms/post")

# List all available elements
elements = agent.list_elements()
print(f"Found {elements['total_inputs']} input fields")

# Fill form fields by number
agent.type_text(1, "John Doe")      # First field
agent.type_text(2, "555-1234")      # Second field

# Submit the completed form
result = agent.submit_form()

3. Scrolling Through Long Pages:

agent = AgentSDK()

# Open a documentation site
agent.open_website("https://api.python.langchain.com/en/latest/langchain_api_reference.html")

# Check scroll info
info = agent.get_scroll_info()
print(f"Page has {info['total_elements']} content sections")
print(f"Currently at {info['progress_percentage']}% of page")

# Scroll down to see more content
result = agent.scroll_down(2)  # Scroll down 2 chunks
print(f"Viewing sections {result['visible_range']}")

# Jump to bottom for conclusions/references
bottom = agent.scroll_to_bottom()
print("Now viewing the end of the document")

# Go back to top to review
agent.scroll_to_top()
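
The scrolling calls above can be combined into a loop that reads a page chunk by chunk. This is a minimal sketch, assuming `progress_percentage` is numeric and reaches 100 at the bottom of the page; `read_to_end` is a hypothetical helper, not part of the SDK.

```python
def read_to_end(agent, max_steps: int = 50) -> int:
    """Scroll one chunk at a time until the page reports 100% progress.

    `agent` is any object exposing get_scroll_info() and scroll_down(),
    as listed in the SDK function reference. Returns the number of
    scroll steps taken. `max_steps` guards against pages that never
    report completion (an assumption, not documented behaviour).
    """
    steps = 0
    while steps < max_steps:
        info = agent.get_scroll_info()
        if info["progress_percentage"] >= 100:
            break
        agent.scroll_down(1)
        steps += 1
    return steps
```

Call it as `read_to_end(agent)` after `agent.open_website(...)`; the step cap keeps an agent from looping forever on an endless feed.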

4. Website Navigation:

agent = AgentSDK()

# Open website
agent.open_website("https://example.com")

# List clickable elements
elements = agent.list_elements()
for button in elements['buttons']:
    print(f"Button {button['number']}: {button['text']}")

# Click the first link/button
result = agent.click_element(1)

# Search current page for specific content
search = agent.search_page("information")
print(f"Found '{search['search_term']}' {search['matches_found']} times")
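
Agents usually want to click an element by its label rather than by a hard-coded number. The sketch below looks up a button number from the `list_elements()` result, assuming each entry in `elements['buttons']` carries the `number` and `text` keys used in the loop above; `find_button_number` is a hypothetical helper.

```python
from typing import Optional


def find_button_number(elements: dict, text: str) -> Optional[int]:
    """Return the number of the first button whose text contains `text`.

    Matching is case-insensitive. Returns None when nothing matches.
    """
    needle = text.casefold()
    for button in elements.get("buttons", []):
        if needle in str(button.get("text", "")).casefold():
            return button["number"]
    return None
```

The returned number can then be passed to `agent.click_element(...)`, after checking that it is not `None`.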

AI Agent Response Format

AI Agent SDK methods return structured dictionaries. The exact keys vary by method; a representative response looks like this:

{
    'success': True,
    'page_title': 'Example Domain',
    'url': 'https://example.com',
    'markdown': '# Page Content...',
    'forms': 1,
    'buttons': 3,
    'inputs': 5,
    'links': 12,
    'matches_found': 2,
    'error': None
}
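
Since responses carry `success` and `error`, a small wrapper can turn failed calls into exceptions so that a workflow stops at the first problem. A minimal sketch, assuming both keys are also present on failed calls; `unwrap` and `AgentError` are hypothetical names, not part of the SDK.

```python
class AgentError(RuntimeError):
    """Raised when an SDK response reports failure (hypothetical helper)."""


def unwrap(response: dict) -> dict:
    """Return the response unchanged if it succeeded, else raise AgentError.

    Relies on the 'success' and 'error' keys from the sample response.
    A missing 'success' key is treated as failure.
    """
    if not response.get("success", False):
        raise AgentError(response.get("error") or "unknown error")
    return response
```

Usage would look like `page = unwrap(agent.open_website("https://example.com"))`.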

Burp Suite-Style Intercepting Proxy 🔍

Julia Browser includes an HTTP/HTTPS intercepting proxy that lets AI agents inspect and modify web traffic. It is similar in spirit to Burp Suite, but rules are plain dictionaries, so no Python callbacks are required.

Proxy Features

  • 🎯 Zero Code Interception: Use simple dictionaries - no Python functions required!
  • Request Modification: Change headers, URLs, methods, and body data
  • Response Modification: Alter response content, headers, and status codes
  • Traffic Logging: Comprehensive logging with filtering support
  • Declarative Rules: AI-friendly rule system using JSON-like dictionaries
  • Security Testing: Analyze security headers and test API endpoints
  • API Monitoring: Automatically detect and log API calls
  • Zero Impact: Completely optional - all existing functions work with proxy disabled

Quick Start with Proxy

from julia_browser import AgentSDK

agent = AgentSDK()

# Start the intercepting proxy
agent.proxy_start()

# Browse websites - all traffic is automatically logged
agent.open_website("example.com")

# Get traffic logs
traffic = agent.proxy_get_traffic(limit=10)
print(f"Captured {traffic['traffic_count']} HTTP requests")

# View request/response details
for entry in traffic['traffic']:
    req = entry['request']
    resp = entry['response']
    print(f"{req['method']} {req['url']} -> {resp['status_code']}")

# Stop proxy
agent.proxy_stop()

Simple Request Modification - No Python Functions Needed!

Just use dictionaries - perfect for AI agents:

agent = AgentSDK()
agent.proxy_start()

# Add authentication header to all API requests
agent.proxy_add_rule({
    "name": "add_auth",
    "type": "request",
    "match": {"url_contains": "api"},
    "actions": {"set_headers": {"Authorization": "Bearer token123"}}
})

# Now browse - auth header is automatically added to API requests
agent.open_website("api.example.com")

Simple Response Modification

Modify responses with simple dictionaries:

agent = AgentSDK()
agent.proxy_start()

# Remove sensitive data from all JSON responses
agent.proxy_add_rule({
    "name": "privacy_filter",
    "type": "response",
    "match": {"content_type": "application/json"},
    "actions": {"find_replace": {"password": "***", "api_key": "***"}}
})

# Browse - responses are automatically cleaned
agent.open_website("dashboard.example.com")

Security Testing Example

agent = AgentSDK()
agent.proxy_start()

# Browse target website
agent.open_website("example.com")

# Analyze security headers
traffic = agent.proxy_get_traffic(limit=1)
headers = traffic['traffic'][0]['response']['headers']

security_headers = {
    'Strict-Transport-Security': 'HSTS',
    'Content-Security-Policy': 'CSP',
    'X-Frame-Options': 'Clickjacking Protection'
}

print("Security Analysis:")
for header, name in security_headers.items():
    if header in headers:
        print(f"✓ {name}: {headers[header]}")
    else:
        print(f"✗ {name}: Missing")

agent.proxy_stop()
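
One caveat with the check above: HTTP header names are case-insensitive, so `header in headers` can miss `strict-transport-security` if the headers arrive as a plain dict. Whether the proxy returns a case-insensitive mapping is not stated here, so the sketch below assumes a plain dict and normalises names first; `missing_security_headers` is a hypothetical helper.

```python
def missing_security_headers(headers: dict, expected: dict) -> list:
    """Return the labels of expected headers that are absent from `headers`.

    `expected` maps header names to human-readable labels, like the
    `security_headers` dict in the example. Names are compared lowercased.
    """
    present = {name.lower() for name in headers}
    return [label for name, label in expected.items() if name.lower() not in present]
```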

Traffic Filtering Example

agent = AgentSDK()
agent.proxy_start()

# Browse multiple sites
agent.open_website("example.com")
agent.open_website("api.github.com")

# Get only API traffic
api_traffic = agent.proxy_get_traffic(
    limit=50,
    filter={"url_contains": "api"}
)

# Get only POST requests
post_traffic = agent.proxy_get_traffic(
    limit=50,
    filter={"method": "POST"}
)

print(f"API requests: {api_traffic['traffic_count']}")
print(f"POST requests: {post_traffic['traffic_count']}")

Proxy API Reference

Proxy Control:

  • proxy_start() - Enable traffic interception
  • proxy_stop() - Disable traffic interception
  • proxy_status() - Get proxy status and statistics
  • proxy_get_traffic(limit, filter) - Retrieve traffic logs with optional filtering
  • proxy_clear_traffic() - Clear all traffic logs

Simple Rule-Based Interception (Recommended for AI Agents):

  • proxy_add_rule(rule_dict) - Add declarative rule using simple dictionary
  • proxy_remove_rule(name) - Remove rule by name
  • proxy_list_rules() - List all active rules

Advanced Interception (For Power Users):

  • proxy_add_request_interceptor(func, name) - Add Python function as request interceptor
  • proxy_add_response_interceptor(func, name) - Add Python function as response interceptor
  • proxy_remove_interceptor(name) - Remove interceptor by name
  • proxy_list_interceptors() - List all active interceptors

Rule Configuration Reference

Match Conditions (for filtering):

  • url_contains: Match URLs containing specific text
  • url_matches: Match URLs with regex pattern
  • method: Match specific HTTP method (GET, POST, etc.)
  • header_equals: Match specific header values
  • header_contains: Match headers containing text
  • status_code: Match response status code (responses only)
  • content_type: Match content type (responses only)
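
To make the match conditions concrete, here is an illustrative matcher for three of them. It is not the library's implementation: it assumes that all conditions in a rule must hold (logical AND) and that a request looks like `{"method": ..., "url": ...}`. `rule_matches` is a hypothetical name.

```python
import re


def rule_matches(match: dict, request: dict) -> bool:
    """Return True only if every condition in `match` holds for `request`.

    Covers url_contains, url_matches and method. Other conditions raise
    ValueError so that unsupported keys are not silently ignored.
    """
    checks = {
        "url_contains": lambda want: want in request["url"],
        "url_matches": lambda want: re.search(want, request["url"]) is not None,
        "method": lambda want: request["method"].upper() == want.upper(),
    }
    for key, want in match.items():
        if key not in checks:
            raise ValueError(f"unsupported condition in this sketch: {key}")
        if not checks[key](want):
            return False
    return True
```

Under these assumptions an empty `match` dictionary matches every request, which is worth keeping in mind when writing broad rules.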

Actions (for modification):

Request Actions:

  • set_headers: Add or modify request headers
  • remove_headers: Remove specific headers
  • set_method: Change HTTP method
  • rewrite_url: Rewrite URL with regex
  • replace_body: Replace entire request body
  • append_body: Append to request body
  • find_replace: Find and replace text in request body
  • block_request: Block request from being sent (returns 403)

Response Actions:

  • set_headers: Add or modify response headers
  • remove_headers: Remove specific headers
  • set_status: Change HTTP status code
  • replace_body: Replace entire response body
  • append_body: Append to response body
  • find_replace: Find and replace text in response body
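
The response actions can be pictured as a series of small edits applied to a response. The sketch below is illustrative only and not the proxy's actual code; it assumes a response with `status_code`, `headers` (dict) and `body` (str) keys, and treats `find_replace` as plain substring replacement. `apply_response_actions` is a hypothetical name.

```python
def apply_response_actions(response: dict, actions: dict) -> dict:
    """Return a modified copy of `response` after applying `actions`.

    Handles set_headers, remove_headers, set_status and find_replace.
    The input response is left untouched.
    """
    result = {**response, "headers": dict(response["headers"])}
    result["headers"].update(actions.get("set_headers", {}))
    for name in actions.get("remove_headers", []):
        result["headers"].pop(name, None)
    if "set_status" in actions:
        result["status_code"] = actions["set_status"]
    for old, new in actions.get("find_replace", {}).items():
        result["body"] = result["body"].replace(old, new)
    return result
```

With substring semantics, a `find_replace` of `{"password": "***"}` rewrites the word `password` itself rather than the value stored under that key, so it is worth checking how the real proxy interprets the action before relying on it to redact secrets.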

Complete Rule Example:

# Complex rule with multiple conditions and actions
agent.proxy_add_rule({
    "name": "api_modifier",
    "type": "request",
    "match": {
        "url_contains": "api",
        "method": "POST",
        "header_contains": {"Content-Type": "json"}
    },
    "actions": {
        "set_headers": {
            "Authorization": "Bearer token123",
            "X-API-Version": "v2"
        },
        "set_method": "PUT"
    }
})
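
When rules are generated by an AI agent, it can help to validate them before calling `proxy_add_rule`. A minimal sketch, assuming all four top-level keys used in the examples (`name`, `type`, `match`, `actions`) are required; the source does not state this, and `validate_rule` is a hypothetical helper.

```python
REQUIRED_KEYS = {"name", "type", "match", "actions"}  # assumed, based on the examples
VALID_TYPES = {"request", "response"}


def validate_rule(rule: dict) -> list:
    """Return a list of problems found in a rule dictionary (empty if none)."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(rule))]
    if "type" in rule and rule["type"] not in VALID_TYPES:
        problems.append(f"invalid type: {rule['type']!r}")
    # status_code and content_type are documented as response-only conditions.
    if rule.get("type") == "request":
        for key in ("status_code", "content_type"):
            if key in rule.get("match", {}):
                problems.append(f"{key} only applies to response rules")
    return problems
```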

Traffic Analysis Example

agent = AgentSDK()
agent.proxy_start()

# Browse multiple pages
for site in ["example.com", "httpbin.org", "api.github.com"]:
    agent.open_website(site)

# Get comprehensive statistics
status = agent.proxy_status()
stats = status['proxy_status']['stats']

print(f"Total Requests: {stats['total_requests']}")
print(f"Total Responses: {stats['total_responses']}")
print(f"Modified Requests: {stats['modified_requests']}")
print(f"Modified Responses: {stats['modified_responses']}")

# Get detailed traffic log
traffic = agent.proxy_get_traffic(limit=20)
for entry in traffic['traffic']:
    print(f"{entry['request']['method']:4} {entry['request']['url']}")

agent.proxy_stop()
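
For a quick overview of a captured session, the traffic entries can be tallied by method and status code. The sketch assumes each entry has the `request['method']` and `response['status_code']` fields used in the examples above; `summarize_traffic` is a hypothetical helper.

```python
from collections import Counter


def summarize_traffic(entries: list) -> dict:
    """Count logged entries by HTTP method and by response status code."""
    methods = Counter(e["request"]["method"] for e in entries)
    statuses = Counter(e["response"]["status_code"] for e in entries)
    return {"by_method": dict(methods), "by_status": dict(statuses)}
```

It would be called as `summarize_traffic(traffic['traffic'])` on the result of `proxy_get_traffic`.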

Integration with AI Systems

The AI Agent SDK is specifically designed for integration with AI frameworks and automation platforms:

LangChain Integration:

from julia_browser import AgentSDK
from langchain.tools import BaseTool

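class JuliaBrowserTool(BaseTool):
    # Annotated so the fields validate under pydantic-based LangChain releases
    name: str = "julia_browser"
    description: str = "Navigate and interact with websites"

    def _run(self, url: str) -> str:
        # Open the page and return its markdown rendering
        agent = AgentSDK()
        result = agent.open_website(url)
        return result['markdown']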

OpenAI Function Calling:

import openai
from julia_browser import AgentSDK

def browse_website(url: str):
    """Browse a website and return its content"""
    agent = AgentSDK()
    return agent.open_website(url)

# Register as OpenAI function
functions = [{
    "name": "browse_website",
    "description": "Browse and analyze website content",
    "parameters": {
        "type": "object",
        "properties": {
            "url": {"type": "string", "description": "Website URL to browse"}
        }
    }
}]
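
The schema above only describes the function to the model. When the model asks for a call, it supplies a function name and a JSON string of arguments, which your code has to route to the Python function. A minimal dispatcher sketch follows; `dispatch_function_call` and the `registry` mapping are hypothetical names, and no OpenAI API call is made here.

```python
import json


def dispatch_function_call(name: str, arguments: str, registry: dict):
    """Run the registered Python callable for a model-requested function call.

    `arguments` is the JSON string of keyword arguments supplied by the
    model; `registry` maps function names to callables.
    """
    if name not in registry:
        raise KeyError(f"unknown function: {name}")
    kwargs = json.loads(arguments or "{}")
    return registry[name](**kwargs)
```

With the function defined above, the registry would be `{"browse_website": browse_website}`.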

Autonomous Agent Workflows:

# Multi-step research workflow
agent = AgentSDK()

# Step 1: Search for information
agent.open_website("https://duckduckgo.com")
agent.type_text(1, "artificial intelligence trends 2024")
agent.submit_form()

# Step 2: Follow first result
agent.follow_link(0)

# Step 3: Extract specific data
insights = agent.search_page("machine learning")
print(f"Found {insights['matches_found']} ML references")

# Step 4: Continue to related pages
elements = agent.list_elements()
if elements['links'] > 0:
    agent.follow_link(0)  # Follow related link

Why Choose Julia Browser AI Agent SDK?

  • 🎯 Human-like Interaction: Functions mirror exactly how humans browse websites
  • 🔢 Simple Numbering: All elements are numbered - just click_element(1), type_text(1, "hello")
  • 📊 Clean Responses: Consistent JSON responses with success/error handling
  • 🌐 Real Websites: Handles modern sites with JavaScript, forms, and dynamic content
  • 🚀 Zero Setup: No browser installation or complex configuration needed
  • ⚡ Direct Commands: Each function does exactly one thing, like CLI commands

Python API

from julia_browser import BrowserEngine, BrowserSDK

# Initialize browser
sdk = BrowserSDK()

# Browse a website
result = sdk.browse_url("https://example.com")
print(result['markdown'])

# Interactive CLI
from julia_browser import CLIBrowser
browser = CLIBrowser()
browser.start_interactive_mode()

Interactive Mode Commands

  • browse <url> - Navigate to a website
  • elements - Show all interactive elements (buttons, links, forms)
  • click <number> - Click on numbered elements
  • type <text> - Type into input fields
  • submit - Submit forms
  • back/forward - Navigate browser history
  • bookmark add <name> - Save bookmarks
  • help - Show all commands

Advanced Features

JavaScript Support

  • Full ES6+ compatibility with Mozilla SpiderMonkey
  • React, Vue, Angular framework support
  • Real API calls and network requests
  • Modern browser API simulation

Web Interaction

  • Smart form handling with real submission
  • File upload support
  • Authentication flows and session management
  • Cookie handling and persistent sessions

Performance

  • Intelligent caching with SQLite backend
  • Asynchronous request handling
  • Connection pooling and optimization
  • Lazy loading for large websites
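
To illustrate what a SQLite-backed page cache involves, here is a small standalone sketch using Python's built-in `sqlite3` module. It is not Julia Browser's implementation; the `PageCache` class, its table layout, and the time-to-live policy are all assumptions made for the example.

```python
import sqlite3
import time
from typing import Optional


class PageCache:
    """Tiny URL -> body cache with expiry, stored in SQLite (illustrative)."""

    def __init__(self, path: str = ":memory:", ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pages "
            "(url TEXT PRIMARY KEY, body TEXT, stored_at REAL)"
        )

    def put(self, url: str, body: str) -> None:
        """Store or overwrite the cached body for a URL."""
        self.db.execute(
            "INSERT OR REPLACE INTO pages VALUES (?, ?, ?)",
            (url, body, time.time()),
        )
        self.db.commit()

    def get(self, url: str) -> Optional[str]:
        """Return the cached body, or None if missing or older than the TTL."""
        row = self.db.execute(
            "SELECT body, stored_at FROM pages WHERE url = ?", (url,)
        ).fetchone()
        if row is None or time.time() - row[1] > self.ttl:
            return None
        return row[0]
```

Passing a file path instead of `:memory:` makes the cache persist between runs.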

Examples

Browse and Interact

julia-browser interactive
> browse github.com
> elements
> click 1  # Click login button
> type username myuser
> type password mypass
> submit

API Integration

from julia_browser import BrowserSDK

sdk = BrowserSDK()

# Handle JSON APIs
result = sdk.browse_url("https://api.github.com/users/octocat")
user_data = result['json_data']

# Process forms
result = sdk.submit_form("https://httpbin.org/post", {
    "username": "test",
    "email": "test@example.com"
})

Requirements

  • Python 3.8+
  • PythonMonkey (Mozilla SpiderMonkey)
  • Rich (terminal formatting)
  • Click (CLI framework)
  • BeautifulSoup4 (HTML parsing)
  • Requests (HTTP client)

License

MIT License - see LICENSE file for details.

Contributing

Contributions welcome! Please read our contributing guidelines and submit pull requests to our GitHub repository.
