A comprehensive Python-based CLI web browser with JavaScript support and modern web compatibility
Julia Browser 🌐
Julia Browser brings web browsing to the command line, with JavaScript simulation and page rendering built in.
Features
- Enhanced JavaScript Engine: Mozilla SpiderMonkey integration via PythonMonkey
- Modern Web Compatibility: Full HTML DOM API, CSS Object Model (CSSOM), and modern JavaScript APIs
- Interactive CLI Interface: Rich terminal interface with comprehensive web interaction support
- Advanced Navigation: Back/forward, bookmarks, history, and smart link following
- Intelligent Content Processing: Dynamic content filtering and clean markdown output
- Performance Optimizations: Caching, asynchronous execution, and connection pooling
- Real Web Interactions: Form submission, file uploads, authentication flows
- Multiple Output Formats: Markdown, HTML, and JSON rendering
- Responsive Design Detection: Breakpoint analysis and mobile-first patterns
- Web Components Support: Extracts content from custom elements and Shadow DOM
- CSS Generated Content: Captures ::before/::after pseudo-element text
- Modern Layout Support: Proper text flow for CSS Grid and Flexbox layouts
- Shadow DOM Access: Accesses encapsulated component content from modern frameworks
Installation
pip install julia-browser
Quick Start
Command Line Interface
# Browse a website
julia-browser browse https://example.com
# Start interactive mode
julia-browser interactive
# Render to different formats
julia-browser render https://api.github.com --format json
AI Agent SDK 🤖
Julia Browser includes an AI Agent SDK that lets AI systems operate websites the way a person would, through direct functions that mirror the CLI browser commands.
Perfect for: Web automation, AI agents, research workflows, data collection, form automation, and human-like web interactions.
Simple Functions - Just Like CLI Commands
from julia_browser import AgentSDK
# Initialize AI agent
agent = AgentSDK()
# Open a website (like CLI 'browse' command)
result = agent.open_website("https://example.com")
print(f"Opened: {result['title']}")
# List interactive elements (like CLI 'elements' command)
elements = agent.list_elements()
print(f"Found {elements['total_clickable']} clickable elements")
# Type into input fields (like CLI 'type' command)
agent.type_text(1, "search query")
# Click buttons or links (like CLI 'click' command)
agent.click_element(1)
# Submit forms (like CLI form submission)
result = agent.submit_form()
print(f"Form submitted: {result['title']}")
Complete AI Agent SDK Functions
Core Website Functions:
- open_website(url) - Open any website and get page content
- list_elements() - List all clickable elements and input fields with numbers
- get_page_info() - Get current page title, URL, and full content
- search_page(term) - Search for text within the current page
Human-like Interactions:
- click_element(number) - Click buttons or links by their number
- type_text(field_number, text) - Type text into input fields by number
- submit_form() - Submit forms with typed data
- follow_link(number) - Navigate to links by their number
Page Navigation & Scrolling:
- scroll_down(chunks=1) - Scroll down to see more content (like browser scrolling)
- scroll_up(chunks=1) - Scroll up to see previous content
- scroll_to_top() - Jump to the top of the page
- scroll_to_bottom() - Jump to the bottom of the page
- get_scroll_info() - Get current scroll position and page info
Real-World Usage Examples
1. Simple Web Search:
agent = AgentSDK()
# Open search engine
agent.open_website("https://duckduckgo.com")
# Type search query into first input field
agent.type_text(1, "Python programming")
# Submit the form
result = agent.submit_form()
print(f"Search results: {result['title']}")
2. Form Automation:
agent = AgentSDK()
# Open a form page
agent.open_website("https://httpbin.org/forms/post")
# List all available elements
elements = agent.list_elements()
print(f"Found {elements['total_inputs']} input fields")
# Fill form fields by number
agent.type_text(1, "John Doe") # First field
agent.type_text(2, "555-1234") # Second field
# Submit the completed form
result = agent.submit_form()
3. Scrolling Through Long Pages:
agent = AgentSDK()
# Open a documentation site
agent.open_website("https://api.python.langchain.com/en/latest/langchain_api_reference.html")
# Check scroll info
info = agent.get_scroll_info()
print(f"Page has {info['total_elements']} content sections")
print(f"Currently at {info['progress_percentage']}% of page")
# Scroll down to see more content
result = agent.scroll_down(2) # Scroll down 2 chunks
print(f"Viewing sections {result['visible_range']}")
# Jump to bottom for conclusions/references
bottom = agent.scroll_to_bottom()
print("Now viewing the end of the document")
# Go back to top to review
agent.scroll_to_top()
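The scrolling calls above can be combined into a loop that reads a page chunk by chunk. The following is a minimal sketch, assuming `get_scroll_info()` reports a `progress_percentage` that reaches 100 at the end of the page, as the example suggests. `collect_all_chunks` and `FakeAgent` are hypothetical names introduced here so the loop can run offline; they are not part of the SDK.

```python
from typing import Any, Dict, List


def collect_all_chunks(agent: Any, max_steps: int = 50) -> List[Dict[str, Any]]:
    """Scroll down one chunk at a time until progress reaches 100%."""
    results: List[Dict[str, Any]] = []
    for _ in range(max_steps):  # hard cap so a stuck page cannot loop forever
        info = agent.get_scroll_info()
        # Assumption: a missing key is treated as "already at the end".
        if info.get("progress_percentage", 100) >= 100:
            break
        results.append(agent.scroll_down(1))
    return results


class FakeAgent:
    """Hypothetical stand-in for AgentSDK; returns simplified dictionaries."""

    def __init__(self, total_chunks: int) -> None:
        self.total = total_chunks
        self.position = 1

    def get_scroll_info(self) -> Dict[str, Any]:
        return {"progress_percentage": 100 * self.position // self.total}

    def scroll_down(self, chunks: int = 1) -> Dict[str, Any]:
        self.position = min(self.total, self.position + chunks)
        return {"visible_range": self.position}


steps = collect_all_chunks(FakeAgent(total_chunks=4))
print(f"Scrolled {len(steps)} times")
```

Passing a real `AgentSDK()` instance in place of `FakeAgent` should work the same way if the response keys match; the `max_steps` cap is a design choice to keep the loop bounded either way.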
4. Website Navigation:
agent = AgentSDK()
# Open website
agent.open_website("https://example.com")
# List clickable elements
elements = agent.list_elements()
for button in elements['buttons']:
    print(f"Button {button['number']}: {button['text']}")
# Click the first link/button
result = agent.click_element(1)
# Search current page for specific content
search = agent.search_page("information")
print(f"Found '{search['search_term']}' {search['matches_found']} times")
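Clicking by number works best when the number is looked up from the element text rather than hard-coded. A minimal sketch, assuming each entry in `elements['buttons']` carries `number` and `text` keys as in the loop above; `find_button_number` is a hypothetical helper, not an SDK function.

```python
from typing import Any, Dict, List, Optional


def find_button_number(buttons: List[Dict[str, Any]], text: str) -> Optional[int]:
    """Return the number of the first button whose text contains `text`."""
    needle = text.lower()
    for button in buttons:
        # Case-insensitive substring match on the visible text.
        if needle in str(button.get("text", "")).lower():
            return button.get("number")
    return None


# Hand-written sample shaped like elements['buttons'].
buttons = [
    {"number": 1, "text": "Sign in"},
    {"number": 2, "text": "More information..."},
]
print(find_button_number(buttons, "more"))
```

The returned number could then be passed to `agent.click_element(...)`, after checking that it is not `None`.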
AI Agent Response Format
All AI Agent SDK methods return structured dictionaries with consistent formats:
{
'success': True,
'page_title': 'Example Domain',
'url': 'https://example.com',
'markdown': '# Page Content...',
'forms': 1,
'buttons': 3,
'inputs': 5,
'links': 12,
'matches_found': 2,
'error': None
}
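A caller will usually want to branch on `success` before reading any other key. A minimal sketch, assuming the `success`, `page_title`, `url`, and `error` keys shown above; `summarize_response` is a hypothetical helper, and it reads keys defensively because not every method necessarily returns every key.

```python
from typing import Any, Dict


def summarize_response(result: Dict[str, Any]) -> str:
    """Return a one-line summary of an SDK-style response dictionary."""
    # Treat a missing 'success' key as failure rather than raising KeyError.
    if not result.get("success", False):
        return f"Failed: {result.get('error') or 'unknown error'}"
    title = result.get("page_title", "(untitled)")
    url = result.get("url", "(no url)")
    return f"OK: {title} <{url}>"


# Hand-written sample shaped like the response format above.
sample = {
    "success": True,
    "page_title": "Example Domain",
    "url": "https://example.com",
    "error": None,
}
print(summarize_response(sample))
```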
Burp Suite-Style Intercepting Proxy 🔍
Julia Browser includes an HTTP/HTTPS intercepting proxy that lets AI agents inspect and manipulate web traffic. It is similar in spirit to Burp Suite, but rules are written as plain dictionaries, so no Python functions are needed.
Proxy Features
- 🎯 Zero Code Interception: Use simple dictionaries - no Python functions required!
- Request Modification: Change headers, URLs, methods, and body data
- Response Modification: Alter response content, headers, and status codes
- Traffic Logging: Comprehensive logging with filtering support
- Declarative Rules: AI-friendly rule system using JSON-like dictionaries
- Security Testing: Analyze security headers and test API endpoints
- API Monitoring: Automatically detect and log API calls
- Zero Impact: Completely optional - all existing functions work with proxy disabled
Quick Start with Proxy
from julia_browser import AgentSDK
agent = AgentSDK()
# Start the intercepting proxy
agent.proxy_start()
# Browse websites - all traffic is automatically logged
agent.open_website("example.com")
# Get traffic logs
traffic = agent.proxy_get_traffic(limit=10)
print(f"Captured {traffic['traffic_count']} HTTP requests")
# View request/response details
for entry in traffic['traffic']:
    req = entry['request']
    resp = entry['response']
    print(f"{req['method']} {req['url']} → {resp['status_code']}")
# Stop proxy
agent.proxy_stop()
Simple Request Modification - No Python Functions Needed!
Just use dictionaries - perfect for AI agents:
agent = AgentSDK()
agent.proxy_start()
# Add authentication header to all API requests
agent.proxy_add_rule({
"name": "add_auth",
"type": "request",
"match": {"url_contains": "api"},
"actions": {"set_headers": {"Authorization": "Bearer token123"}}
})
# Now browse - auth header is automatically added to API requests
agent.open_website("api.example.com")
Simple Response Modification
Modify responses with simple dictionaries:
agent.proxy_start()
# Remove sensitive data from all JSON responses
agent.proxy_add_rule({
"name": "privacy_filter",
"type": "response",
"match": {"content_type": "application/json"},
"actions": {"find_replace": {"password": "***", "api_key": "***"}}
})
# Browse - responses are automatically cleaned
agent.open_website("dashboard.example.com")
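Whether `find_replace` works on literal text or on JSON fields is not stated here. If it is a literal substitution, the strings `password` and `api_key` themselves would be replaced rather than their values. Where the goal is to mask values, a JSON-aware pass is one option. The sketch below uses only the standard `json` module and is independent of the proxy; `redact_json_fields` is a hypothetical helper.

```python
import json
from typing import Any, Iterable


def redact_json_fields(body: str, fields: Iterable[str], mask: str = "***") -> str:
    """Mask the values of the named keys anywhere in a JSON document."""
    targets = set(fields)

    def walk(node: Any) -> Any:
        # Recurse through objects and arrays; leave scalars untouched.
        if isinstance(node, dict):
            return {k: (mask if k in targets else walk(v)) for k, v in node.items()}
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    return json.dumps(walk(json.loads(body)))


raw = '{"user": "ana", "password": "hunter2", "keys": [{"api_key": "abc"}]}'
print(redact_json_fields(raw, ["password", "api_key"]))
```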
Security Testing Example
agent = AgentSDK()
agent.proxy_start()
# Browse target website
agent.open_website("example.com")
# Analyze security headers
traffic = agent.proxy_get_traffic(limit=1)
headers = traffic['traffic'][0]['response']['headers']
security_headers = {
'Strict-Transport-Security': 'HSTS',
'Content-Security-Policy': 'CSP',
'X-Frame-Options': 'Clickjacking Protection'
}
print("Security Analysis:")
for header, name in security_headers.items():
    if header in headers:
        print(f"✓ {name}: {headers[header]}")
    else:
        print(f"✗ {name}: Missing")
agent.proxy_stop()
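HTTP header names are case-insensitive, so a plain `header in headers` test can miss a header that a server sends in lower case. Whether the proxy returns a case-insensitive mapping is not stated here, so the sketch below normalizes names first. `audit_security_headers` is a hypothetical helper operating on any header mapping.

```python
from typing import Dict, Mapping

# Lower-cased header names mapped to the labels used in the example above.
SECURITY_HEADERS = {
    "strict-transport-security": "HSTS",
    "content-security-policy": "CSP",
    "x-frame-options": "Clickjacking Protection",
}


def audit_security_headers(headers: Mapping[str, str]) -> Dict[str, bool]:
    """Map each label to True if the corresponding header is present."""
    present = {name.lower() for name in headers}
    return {label: key in present for key, label in SECURITY_HEADERS.items()}


# Hand-written sample with mixed-case header names.
report = audit_security_headers({
    "content-security-policy": "default-src 'self'",
    "X-Frame-Options": "DENY",
})
print(report)
```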
Traffic Filtering Example
agent.proxy_start()
# Browse multiple sites
agent.open_website("example.com")
agent.open_website("api.github.com")
# Get only API traffic
api_traffic = agent.proxy_get_traffic(
limit=50,
filter={"url_contains": "api"}
)
# Get only POST requests
post_traffic = agent.proxy_get_traffic(
limit=50,
filter={"method": "POST"}
)
print(f"API requests: {api_traffic['traffic_count']}")
print(f"POST requests: {post_traffic['traffic_count']}")
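The same filtering can be applied to a traffic list that has already been retrieved. The sketch below is an assumption about what `url_contains` and `method` filters mean (substring match on the URL, exact match on the method), written against the entry shape used in the examples above. `filter_traffic` is a hypothetical helper, not the proxy's own implementation.

```python
from typing import Any, Dict, List, Optional


def filter_traffic(entries: List[Dict[str, Any]],
                   url_contains: Optional[str] = None,
                   method: Optional[str] = None) -> List[Dict[str, Any]]:
    """Keep entries whose request satisfies every condition that is given."""
    kept = []
    for entry in entries:
        request = entry.get("request", {})
        if url_contains is not None and url_contains not in request.get("url", ""):
            continue
        # Method comparison ignores case, e.g. "post" matches "POST".
        if method is not None and request.get("method", "").upper() != method.upper():
            continue
        kept.append(entry)
    return kept


# Hand-written sample entries shaped like traffic['traffic'].
entries = [
    {"request": {"method": "GET", "url": "https://example.com/"}},
    {"request": {"method": "GET", "url": "https://api.github.com/"}},
    {"request": {"method": "POST", "url": "https://api.github.com/graphql"}},
]
print(len(filter_traffic(entries, url_contains="api")))
```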
Proxy API Reference
Proxy Control:
- proxy_start() - Enable traffic interception
- proxy_stop() - Disable traffic interception
- proxy_status() - Get proxy status and statistics
- proxy_get_traffic(limit, filter) - Retrieve traffic logs with optional filtering
- proxy_clear_traffic() - Clear all traffic logs
Simple Rule-Based Interception (Recommended for AI Agents):
- proxy_add_rule(rule_dict) - Add declarative rule using simple dictionary
- proxy_remove_rule(name) - Remove rule by name
- proxy_list_rules() - List all active rules
Advanced Interception (For Power Users):
- proxy_add_request_interceptor(func, name) - Add Python function as request interceptor
- proxy_add_response_interceptor(func, name) - Add Python function as response interceptor
- proxy_remove_interceptor(name) - Remove interceptor by name
- proxy_list_interceptors() - List all active interceptors
Rule Configuration Reference
Match Conditions (for filtering):
- url_contains: Match URLs containing specific text
- url_matches: Match URLs with regex pattern
- method: Match specific HTTP method (GET, POST, etc.)
- header_equals: Match specific header values
- header_contains: Match headers containing text
- status_code: Match response status code (responses only)
- content_type: Match content type (responses only)
Actions (for modification):
Request Actions:
- set_headers: Add or modify request headers
- remove_headers: Remove specific headers
- set_method: Change HTTP method
- rewrite_url: Rewrite URL with regex
- replace_body: Replace entire request body
- append_body: Append to request body
- find_replace: Find and replace text in request body
- block_request: Block request from being sent (returns 403)
Response Actions:
- set_headers: Add or modify response headers
- remove_headers: Remove specific headers
- set_status: Change HTTP status code
- replace_body: Replace entire response body
- append_body: Append to response body
- find_replace: Find and replace text in response body
Complete Rule Example:
# Complex rule with multiple conditions and actions
agent.proxy_add_rule({
"name": "api_modifier",
"type": "request",
"match": {
"url_contains": "api",
"method": "POST",
"header_contains": {"Content-Type": "json"}
},
"actions": {
"set_headers": {
"Authorization": "Bearer token123",
"X-API-Version": "v2"
},
"set_method": "PUT"
}
})
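Because rules are plain dictionaries, they can be checked before being handed to `proxy_add_rule`. The sketch below validates a rule against the match conditions listed in the reference above. `validate_rule` is a hypothetical helper, and treating `name`, `type`, `match`, and `actions` as required keys is an assumption based on the examples.

```python
from typing import Any, Dict, List

# Match condition names taken from the rule configuration reference.
MATCH_KEYS = {"url_contains", "url_matches", "method", "header_equals",
              "header_contains", "status_code", "content_type"}
RESPONSE_ONLY = {"status_code", "content_type"}


def validate_rule(rule: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a rule (empty list if none)."""
    problems = []
    for key in ("name", "type", "match", "actions"):
        if key not in rule:
            problems.append(f"missing key: {key}")
    rule_type = rule.get("type")
    if "type" in rule and rule_type not in ("request", "response"):
        problems.append(f"unknown type: {rule_type!r}")
    for key in rule.get("match", {}):
        if key not in MATCH_KEYS:
            problems.append(f"unknown match condition: {key}")
        elif key in RESPONSE_ONLY and rule_type == "request":
            problems.append(f"{key} applies to responses only")
    return problems


good_rule = {
    "name": "add_auth",
    "type": "request",
    "match": {"url_contains": "api"},
    "actions": {"set_headers": {"Authorization": "Bearer token123"}},
}
print(validate_rule(good_rule))
```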
Traffic Analysis Example
agent = AgentSDK()
agent.proxy_start()
# Browse multiple pages
for site in ["example.com", "httpbin.org", "api.github.com"]:
    agent.open_website(site)
# Get comprehensive statistics
status = agent.proxy_status()
stats = status['proxy_status']['stats']
print(f"Total Requests: {stats['total_requests']}")
print(f"Total Responses: {stats['total_responses']}")
print(f"Modified Requests: {stats['modified_requests']}")
print(f"Modified Responses: {stats['modified_responses']}")
# Get detailed traffic log
traffic = agent.proxy_get_traffic(limit=20)
for entry in traffic['traffic']:
    print(f"{entry['request']['method']:4} {entry['request']['url']}")
agent.proxy_stop()
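Beyond printing each entry, the traffic log can be aggregated with the standard library. A minimal sketch that counts requests per host, assuming each entry has a `request` dictionary with a full `url` as in the examples above; `requests_per_host` is a hypothetical helper.

```python
from collections import Counter
from typing import Any, Dict, List
from urllib.parse import urlsplit


def requests_per_host(entries: List[Dict[str, Any]]) -> Counter:
    """Count traffic entries by the host part of the request URL."""
    hosts: Counter = Counter()
    for entry in entries:
        url = entry.get("request", {}).get("url", "")
        # URLs without a scheme have no netloc and are grouped as unknown.
        host = urlsplit(url).netloc or "(unknown)"
        hosts[host] += 1
    return hosts


# Hand-written sample entries shaped like traffic['traffic'].
log = [
    {"request": {"method": "GET", "url": "https://example.com/"}},
    {"request": {"method": "GET", "url": "https://api.github.com/"}},
    {"request": {"method": "GET", "url": "https://api.github.com/users/octocat"}},
]
for host, count in requests_per_host(log).most_common():
    print(f"{count:3} {host}")
```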
Integration with AI Systems
The AI Agent SDK is specifically designed for integration with AI frameworks and automation platforms:
LangChain Integration:
from julia_browser import AgentSDK
from langchain.tools import BaseTool
class JuliaBrowserTool(BaseTool):
    name: str = "julia_browser"
    description: str = "Navigate and interact with websites"
    def _run(self, url: str) -> str:
        agent = AgentSDK()
        # open_website is the page-loading call listed in the SDK function reference
        result = agent.open_website(url)
        return result['markdown']
OpenAI Function Calling:
import openai
from julia_browser import AgentSDK
def browse_website(url: str):
    """Browse a website and return its content"""
    agent = AgentSDK()
    # open_website is the page-loading call listed in the SDK function reference
    return agent.open_website(url)
# Register as OpenAI function
functions = [{
"name": "browse_website",
"description": "Browse and analyze website content",
"parameters": {
"type": "object",
"properties": {
"url": {"type": "string", "description": "Website URL to browse"}
}
}
}]
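Once a model returns a function call, the function name and its JSON-encoded arguments still have to be routed to a Python callable. The sketch below shows one way to do that with a small registry. `dispatch_function_call` and `fake_browse` are hypothetical names, and `fake_browse` stands in for `browse_website` so the example runs without network access or an API key.

```python
import json
from typing import Any, Callable, Dict


def dispatch_function_call(name: str, arguments_json: str,
                           registry: Dict[str, Callable[..., Any]]) -> Any:
    """Look up `name` in the registry and call it with decoded arguments."""
    if name not in registry:
        raise KeyError(f"unknown function: {name}")
    # Arguments arrive as a JSON string; an empty string means no arguments.
    kwargs = json.loads(arguments_json or "{}")
    return registry[name](**kwargs)


def fake_browse(url: str) -> Dict[str, Any]:
    """Offline stand-in for browse_website."""
    return {"success": True, "url": url}


result = dispatch_function_call(
    "browse_website",
    '{"url": "https://example.com"}',
    {"browse_website": fake_browse},
)
print(result)
```

Keeping the registry explicit means only functions that were deliberately registered can be invoked by the model.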
Autonomous Agent Workflows:
# Multi-step research workflow
agent = AgentSDK()
# Step 1: Search for information
agent.open_website("https://duckduckgo.com")
agent.type_text(1, "artificial intelligence trends 2024")
agent.submit_form()
# Step 2: Follow first result
agent.follow_link(1)
# Step 3: Extract specific data
insights = agent.search_page("machine learning")
print(f"Found {insights['matches_found']} ML references")
# Step 4: Continue to related pages
elements = agent.list_elements()
if elements.get('links'): # Truthy for a non-zero count or a non-empty list
    agent.follow_link(1) # Follow related link
Why Choose Julia Browser AI Agent SDK?
- 🎯 Human-like Interaction: Functions mirror exactly how humans browse websites
- 🔢 Simple Numbering: All elements are numbered - just click_element(1), type_text(1, "hello")
- 📊 Clean Responses: Consistent dictionary responses with success and error fields
- 🌐 Real Websites: Handles modern sites with JavaScript, forms, and dynamic content
- 🚀 Zero Setup: No browser installation or complex configuration needed
- ⚡ Direct Commands: Each function does exactly one thing, like CLI commands
Advanced Features
Python API
from julia_browser import BrowserEngine, BrowserSDK
# Initialize browser
sdk = BrowserSDK()
# Browse a website
result = sdk.browse_url("https://example.com")
print(result['markdown'])
# Interactive CLI
from julia_browser import CLIBrowser
browser = CLIBrowser()
browser.start_interactive_mode()
Interactive Mode Commands
- browse <url> - Navigate to a website
- elements - Show all interactive elements (buttons, links, forms)
- click <number> - Click on numbered elements
- type <text> - Type into input fields
- submit - Submit forms
- back/forward - Navigate browser history
- bookmark add <name> - Save bookmarks
- help - Show all commands
JavaScript Support
- Full ES6+ compatibility with Mozilla SpiderMonkey
- React, Vue, Angular framework support
- Real API calls and network requests
- Modern browser API simulation
Web Interaction
- Smart form handling with real submission
- File upload support
- Authentication flows and session management
- Cookie handling and persistent sessions
Performance
- Intelligent caching with SQLite backend
- Asynchronous request handling
- Connection pooling and optimization
- Lazy loading for large websites
Examples
Browse and Interact
julia-browser interactive
> browse github.com
> elements
> click 1 # Click login button
> type username myuser
> type password mypass
> submit
API Integration
from julia_browser import BrowserSDK
sdk = BrowserSDK()
# Handle JSON APIs
result = sdk.browse_url("https://api.github.com/users/octocat")
user_data = result['json_data']
# Process forms
result = sdk.submit_form("https://httpbin.org/post", {
"username": "test",
"email": "test@example.com"
})
Requirements
- Python 3.8+
- PythonMonkey (Mozilla SpiderMonkey)
- Rich (terminal formatting)
- Click (CLI framework)
- BeautifulSoup4 (HTML parsing)
- Requests (HTTP client)
License
MIT License - see LICENSE file for details.
Contributing
Contributions welcome! Please read our contributing guidelines and submit pull requests to our GitHub repository.