Programming with Pixels (PwP) - A framework for computer-use software engineering agents

Programming with Pixels (PwP)

Overview

Programming with Pixels (PwP) is a framework for evaluating and developing software engineering (SWE) agents that interact with computers as humans do: through visual perception and basic actions such as typing and clicking.

Our motivating hypothesis is that achieving general-purpose SWE agents requires a shift to computer-use agents, which interact with any IDE through screenshots and primitive actions rather than through specialized tool APIs.

Installation

Prerequisites

  • Python 3.6+
  • Docker
  • (Optional) NVIDIA GPU with CUDA support

Using pip (Recommended)

pip install programming-with-pixels

Development Installation

git clone https://github.com/ProgrammingWithPixels/pwp.git
cd pwp
pip install -e .

Quick Start

from pwp import PwP
from pwp import PwPBench

# Create a basic environment
env = PwP(image_name='pwp_env')

# Take a screenshot
observation = env.render()
observation.save('screenshot.png')

# Execute a command
result = env.step("echo 'Hello, World!'")
print(result['output'])

# Try a benchmark task
bench = PwPBench('humaneval')
dataset = bench.get_dataset()
task_env = bench.get_env(dataset[0])
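
The observe-then-act pattern behind the Quick Start can be sketched as a small loop. This is a minimal illustration, not the package's agent implementation: `StubEnv`, `run_episode`, and `scripted_policy` are hypothetical names introduced here, and the stub only imitates the two calls shown above (`render()` and `step()` returning a dict with an `'output'` key) so that the loop runs without Docker.

```python
from typing import Any, Callable, Dict, List, Optional


class StubEnv:
    """Hypothetical stand-in for an environment; it only mimics render() and step()."""

    def __init__(self) -> None:
        self.actions: List[str] = []

    def render(self) -> str:
        # A real environment would return a screenshot image here.
        return "<screenshot after {} actions>".format(len(self.actions))

    def step(self, action: str) -> Dict[str, Any]:
        # Record the action and return a dict shaped like the Quick Start result.
        self.actions.append(action)
        return {"output": "ran: " + action}


def run_episode(env: Any,
                policy: Callable[[Any], Optional[str]],
                max_steps: int = 5) -> List[str]:
    """Alternate between observing and acting until the policy returns None."""
    outputs = []
    for _ in range(max_steps):
        observation = env.render()
        action = policy(observation)
        if action is None:
            break
        outputs.append(env.step(action)["output"])
    return outputs


def scripted_policy(commands: List[str]) -> Callable[[Any], Optional[str]]:
    """Return a policy that replays fixed commands and ignores the observation."""
    remaining = iter(commands)

    def policy(_observation: Any) -> Optional[str]:
        return next(remaining, None)

    return policy


env = StubEnv()
outputs = run_episode(env, scripted_policy(["echo 'Hello, World!'", "ls"]))
print(outputs)
```

Assuming a real environment exposes the same two calls, it could be passed to `run_episode` in place of the stub, with the scripted policy swapped for a model that reads the screenshot.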

Command Line Interface

For quick testing, PwP also provides a command-line interface:

# Start an environment
pwp env --vnc

# List available benchmark tasks
pwp list

# Run a benchmark
pwp bench humaneval

Examples

Check out the examples directory for demonstration scripts:

  • Quickstart: Complete walkthrough of PwP's capabilities, including environment interaction, benchmarks, and advanced features
  • Basic Demo: Simple environment setup and interaction showcase
  • Demo2: Additional demonstration of PwP features

Benchmark Tasks

PwP-Bench comes with a wide range of benchmark tasks for evaluating agents:

  • HumanEval: Python coding problems
  • Design2Code: Converting design mockups to code
  • ChartMimic: Recreating charts from visual references
  • SWE-bench: Software engineering tasks
  • And many more!

You will first need to set up the benchmarks before evaluating agents. See the benchmark documentation for more details.
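
Once a benchmark is set up, an evaluation run amounts to iterating over its tasks and aggregating the outcomes. The sketch below shows that aggregation only. `StubBench`, `success_rate`, and the `solve` callback are hypothetical names introduced for illustration; the stub mirrors the `get_dataset()` and `get_env()` calls from the Quick Start, and how PwP-Bench actually scores a task is not shown here.

```python
from typing import Any, Callable, Dict, List


class StubBench:
    """Hypothetical stand-in that mirrors get_dataset() and get_env()."""

    def __init__(self, tasks: List[Dict[str, Any]]) -> None:
        self._tasks = tasks

    def get_dataset(self) -> List[Dict[str, Any]]:
        return self._tasks

    def get_env(self, task: Dict[str, Any]) -> Dict[str, Any]:
        # A real environment would be a running container; here it is a plain dict.
        return {"task": task}


def success_rate(bench: Any,
                 solve: Callable[[Any, Dict[str, Any]], bool]) -> float:
    """Return the fraction of tasks for which solve(env, task) reports success."""
    dataset = bench.get_dataset()
    if not dataset:
        return 0.0
    solved = sum(1 for task in dataset if solve(bench.get_env(task), task))
    return solved / len(dataset)


bench = StubBench([{"id": 0, "solvable": True}, {"id": 1, "solvable": False}])
rate = success_rate(bench, lambda env, task: task["solvable"])
print("success rate: {:.0%}".format(rate))
```

In practice, `solve` would run an agent in the task environment and return whether the benchmark's own check passed.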

Evaluating Agents

For detailed examples, check out the agent implementations in the src/pwp/agents directory. Each agent type can be customized with different LLM backends and system prompts to optimize for various tasks.
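
One way to keep backend and prompt choices explicit is to bundle them into a small immutable record and derive task-specific variants from it. This is only a sketch of that idea: `AgentConfig`, `for_task`, the field names, and the `"example-llm"` backend string are all hypothetical and are not taken from the `pwp.agents` module.

```python
from typing import NamedTuple


class AgentConfig(NamedTuple):
    """Hypothetical settings bundle; not part of the pwp package."""
    backend: str
    system_prompt: str
    max_steps: int = 20


def for_task(base: AgentConfig, task_name: str, extra: str) -> AgentConfig:
    """Return a copy of base whose system prompt is extended for one task type."""
    prompt = "{}\n\nTask type: {}. {}".format(base.system_prompt, task_name, extra)
    return base._replace(system_prompt=prompt)


BASE = AgentConfig(
    backend="example-llm",  # placeholder name, not a real backend identifier
    system_prompt="You are a software engineering agent working in an IDE.",
)

chart_config = for_task(BASE, "ChartMimic", "Recreate the reference chart.")
print(chart_config.system_prompt)
```

Because the record is immutable, the base configuration stays unchanged while each task gets its own variant.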

Building Custom Environments

Build the Base Environment

# Build the base PwP environment
cd src/pwp/docker/
docker build -t pwp_env .

Custom Environment

You can create custom Docker environments by extending the base image:

FROM pwp_env

# Install additional dependencies
RUN apt-get update && apt-get install -y \
    your-package-here \
    && rm -rf /var/lib/apt/lists/*

# Add custom files
COPY your-files /home/devuser/your-files
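
Building the extended image can also be scripted. The following is a minimal sketch that only assembles the `docker build` command; the tag `my_pwp_env`, the directory `./my_env`, and the helper `docker_build_command` are hypothetical names chosen for illustration.

```python
from typing import List


def docker_build_command(tag: str, context_dir: str = ".") -> List[str]:
    """Assemble a `docker build -t <tag> <context>` argument list."""
    return ["docker", "build", "-t", tag, context_dir]


cmd = docker_build_command("my_pwp_env", "./my_env")
print(" ".join(cmd))

# To run the build, pass the list to subprocess.run(cmd, check=True)
# on a machine where Docker is installed.
```

Assuming `image_name` accepts any locally available tag, as with `'pwp_env'` in the Quick Start, the resulting image could then be used as `PwP(image_name='my_pwp_env')`.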

Package Structure

The PwP package consists of several modules:

  • pwp.env: Core environment module for managing Docker containers
  • pwp.bench: Benchmark module with various programming tasks
  • pwp.agents: Agent implementations for solving tasks
  • pwp.utils: Utility functions for image processing and other helpers
  • pwp.tools: Tools for agent interaction with environments
  • pwp.functions: Function implementations for tools
  • pwp.prompts: Prompt templates for different agent types

See the package documentation for more details on each module.

Contributing

We welcome contributions to the PwP project! Please see our contribution guidelines for more information.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use PwP in your research, please cite our paper:

@article{pwp2025,
  title={Programming with Pixels: Computer-Use Meets Software Engineering},
  author={Aggarwal, Pranjal and Welleck, Sean},
  journal={arXiv preprint arXiv:2502.00000},
  year={2025}
}

Acknowledgments

  • This project builds on various open-source tools and libraries
  • Thanks to all contributors who have helped shape the project
