A Model Context Protocol server that provides web automation tools using Playwright

Project description

Amazon Bedrock Web Tools Example

This project demonstrates how to use Amazon Bedrock with Anthropic Claude and Amazon Nova models to create a web automation assistant with tool use, human-in-the-loop interaction, and vision capabilities.

Setup Instructions

Project Overview

This project contains a series of progressive examples that demonstrate different capabilities:

Step 1: Basic Setup (No Tools)

01-no-tools - A minimal example that sends a request to Claude via Amazon Bedrock without any tools.

Step 2: Tool Definition

02-tool-definition - Introduces tool definitions for web navigation and screenshots.

Step 3: Tool Loop

03-loop - Implements a loop to handle tool requests with simulated responses.

Step 4: Tool Invocation

04-invoke-tool - Adds actual tool implementation functions for navigation and screenshots.

Step 5: Headless Browser

05-headless-browser - Integrates Playwright to control a real headless browser.

Step 6: Human-in-the-Loop

06-human-in-loop - Adds a tool that allows the model to ask the user questions during execution.

Step 7: Vision Capabilities

07-vision - Enhances the system with vision capabilities, allowing the model to:

See screenshots it takes
Analyze visual content
Click on specific elements based on visual analysis

Step 8: Text Input with Form Submission

08-type-scroll-tools - Adds text input and scrolling tools that enable the model to:

First click on elements like form fields using vision
Then type text into the last clicked element
Submit forms by pressing Enter after typing (optional)
Scroll up and down to see more content on the page
Demonstrates a complete e-commerce search workflow

Step 9: File Writing

09-write-file - Adds a file writing tool that enables the model to:

Search for information on the web
Scroll through search results to find relevant information
Compile and organize the findings
Write the results to a markdown file
Create permanent documentation of web search results

Step 10: MCP Refactoring

10-mcp-client - Refactors the application to use the Model Context Protocol:

Separates the application into client and server components
Uses FastMCP framework for structured tool definitions
Implements proper resource lifecycle management with lifespan API
Provides type-safe context passing between components
Uses stdio transport for client-server communication
Maintains all previous functionality with improved architecture
Automatically starts the MCP server (10-mcp-server.py)

Step 11: Conversation History Management

11-mcp-client - Adds conversation history management features:

Implements media content removal to reduce token usage
Adds conversation summarization for long interactions
Uses a smaller model (Amazon Nova Micro) for efficient summarization
Preserves tool use/result pairs for API validation compliance
Maintains context while reducing token usage
Enables longer, more complex conversations without hitting token limits
Automatically starts the MCP server (11-mcp-server.py)

Architecture Diagrams

The project includes a Mermaid diagram that illustrates the architecture and components of each step:

Simplified Components Overview

components-overview.md - A simplified version with just the essential components for each step

The components overview includes:

One diagram per step showing only the key components
Model evolution across steps
Tool evolution across steps
Architecture evolution across steps

This diagram helps visualize the progression from a simple API call to a sophisticated web automation assistant with multiple capabilities.

Key Features

Web Navigation: Navigate to specified URLs using Playwright
Screenshots: Capture screenshots of web pages
Human-in-the-Loop: Ask the user questions during execution
Vision Capabilities: Allow the model to see and analyze screenshots
Interactive Clicking: Click on elements based on visual analysis
Text Input & Form Submission: Type text into previously clicked elements and optionally submit forms
Page Scrolling: Scroll up or down to see more content on long pages
File Writing: Save search results and findings to markdown files for documentation
MCP Architecture: Client-server architecture with standardized protocol (Step 10)
Conversation Management: Efficient token usage through media removal and conversation summarization (Step 11)

Amazon Bedrock Configuration

This example assumes you have:

AWS CLI configured with appropriate credentials
Access to Claude 3 models through Amazon Bedrock
Appropriate permissions to use the Bedrock API

Example Workflow

The model navigates to a website
It takes a screenshot and analyzes the visual content
It identifies interactive elements like search boxes
It clicks on an element (e.g., a search box) using vision-based coordinates
It types text into the clicked element (e.g., search query)
It can submit forms by pressing Enter after typing
It can scroll through search results to find relevant information
It can analyze search results and compile findings
It can write results to a markdown file for documentation
It can ask the user for input when needed

Notes

Screenshots are saved with random UUIDs as filenames
The browser is automatically cleaned up after execution
The system prompt instructs the model to use its vision capabilities and ask for help when needed
Step 8 demonstrates a complete e-commerce search workflow for AAA Amazon Basics batteries
Step 9 extends the workflow to save search results in markdown format to a file
Step 10 refactors the application to use the Model Context Protocol (MCP)
Step 11 adds conversation history management to handle long interactions efficiently
The type tool includes a submit option that presses Enter after typing, useful for form submissions
The scroll tool allows the model to navigate through long pages by scrolling up or down
The approach of clicking first, then typing mimics how humans interact with web interfaces
The run_step.sh script is designed to be extensible - it will automatically discover new step files as they are added
Interactive scripts (those that ask for user input) are detected automatically and run in a way that preserves input functionality

License

This project is licensed under the MIT License - see the LICENSE file for details.

Authors

This project was created and is maintained by:

Project details

Release history Release notifications | RSS feed

This version

1.0.0

Feb 9, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

iflow_mcp_aws_samples_sample_agentic_ai_web-1.0.0.tar.gz (63.8 kB view details)

Uploaded Feb 9, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

iflow_mcp_aws_samples_sample_agentic_ai_web-1.0.0-py3-none-any.whl (54.9 kB view details)

Uploaded Feb 9, 2026 Python 3

File details

Details for the file iflow_mcp_aws_samples_sample_agentic_ai_web-1.0.0.tar.gz.

File metadata

Download URL: iflow_mcp_aws_samples_sample_agentic_ai_web-1.0.0.tar.gz
Upload date: Feb 9, 2026
Size: 63.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.0 {"installer":{"name":"uv","version":"0.10.0","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Debian GNU/Linux","version":"13","id":"trixie","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for iflow_mcp_aws_samples_sample_agentic_ai_web-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`289abd7f4c529a429a7784fbc57a3c6f076af1737da7730b15ab90db821ec1b1`
MD5	`558cdfc82711dcc97e5d29bde0a6725f`
BLAKE2b-256	`bbc0cc1b1f88cf644f21d3ce6846424ee48b978a614c7376daec93fc92ce9b63`

See more details on using hashes here.

File details

Details for the file iflow_mcp_aws_samples_sample_agentic_ai_web-1.0.0-py3-none-any.whl.

File metadata

Download URL: iflow_mcp_aws_samples_sample_agentic_ai_web-1.0.0-py3-none-any.whl
Upload date: Feb 9, 2026
Size: 54.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.0 {"installer":{"name":"uv","version":"0.10.0","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Debian GNU/Linux","version":"13","id":"trixie","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for iflow_mcp_aws_samples_sample_agentic_ai_web-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`28819bb6d4914ea7048d6e1d9390b5fdbfd18c4b04b7b0297b4496fa482880b2`
MD5	`885c76e309dab2a8f43a40c5c46b2bd5`
BLAKE2b-256	`5044ec71f3ba8f0ae05a3ca7d3647672f9acb1b9ade00f84f585149d96300fad`

See more details on using hashes here.

iflow-mcp_aws-samples-sample-agentic-ai-web 1.0.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

Amazon Bedrock Web Tools Example

Setup Instructions

Project Overview

Step 1: Basic Setup (No Tools)

Step 2: Tool Definition

Step 3: Tool Loop

Step 4: Tool Invocation

Step 5: Headless Browser

Step 6: Human-in-the-Loop

Step 7: Vision Capabilities

Step 8: Text Input with Form Submission

Step 9: File Writing

Step 10: MCP Refactoring

Step 11: Conversation History Management

Architecture Diagrams

Simplified Components Overview

Key Features

Amazon Bedrock Configuration

Example Workflow

Notes

License

Authors

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes