High-performance, omnimodal desktop recorder for Windows

ocap

High-performance desktop recorder for Windows. Captures screen, audio, keyboard, mouse, and window events.

https://github.com/user-attachments/assets/4e94782c-02ae-4f64-bb52-b08be69d33da

What is ocap?

ocap (Omnimodal CAPture) captures all essential desktop signals in a synchronized format: screen video, audio, keyboard/mouse input, and window events. It was built for the open-world-agents project but works for any desktop recording need.

TL;DR: Complete, high-performance desktop recording tool for Windows. Captures everything in one command.

📊 Working with recorded data? See the OWAMcap Format Guide for analysis, processing, and ML integration.

Key Features

  • Complete desktop recording: Video, audio, keyboard/mouse events, window events
  • High performance: Hardware-accelerated with Windows APIs and GStreamer
  • Efficient encoding: H265/HEVC for high quality and small file size
  • Simple operation: ocap FILE_LOCATION (stop with Ctrl+C)
  • Clean architecture: Core logic in a single 400-line recorder.py
  • Modern formats: MKV with embedded timestamps, OWAMcap format for events (built on MCAP)

System Requirements

Based on OBS Studio recommended specs + NVIDIA GPU requirements:

Component | Specification
OS | Windows 11 (64-bit)
Processor | Intel i7 8700K / AMD Ryzen 1600X
Memory | 8 GB RAM
Graphics | NVIDIA GeForce 10 Series or newer ⚠️
DirectX | Version 11
Storage | 600 MB + ~100 MB per minute of recording

⚠️ NVIDIA GPU Required: Hardware acceleration currently works only on NVIDIA GPUs. AMD/Intel GPU support is possible through the GStreamer framework; contributions welcome!

🖥️ OS Support: Currently only Windows is supported. Because ocap is built on GStreamer, support for Linux and macOS should be relatively easy to add by swapping in platform-specific GStreamer pipelines. Contributions welcome!
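
As a rough planning aid, the storage figures above can be turned into a quick estimate. This is a minimal sketch: the function name `estimate_disk_mb` is hypothetical, and it assumes the 600 MB figure is a fixed base footprint and ~100 MB per minute is an average rate. Actual usage will vary with resolution, screen content, and encoder settings.

```python
# Rough disk-space estimate based on the storage figures in the requirements.
# Assumption: 600 MB is a fixed base footprint, 100 MB/min an average rate.
BASE_FOOTPRINT_MB = 600
MB_PER_MINUTE = 100


def estimate_disk_mb(minutes: float, include_base: bool = True) -> float:
    """Return an approximate disk requirement in MB for one recording."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    total = minutes * MB_PER_MINUTE
    if include_base:
        total += BASE_FOOTPRINT_MB
    return total


# A 30-minute session: 30 * 100 + 600
print(estimate_disk_mb(30))
```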

Installation & Usage

Option 1: Download Release

  1. Download ocap.zip from releases
  2. Unzip and run:
    • Double-click run.bat (opens terminal with virtual environment)
    • Or in CLI: run.bat --help

Option 2: Package Install

All OWA packages are available on PyPI:

# Install GStreamer dependencies first (for video recording)
$ conda install open-world-agents::gstreamer-bundle

# Install ocap
$ pip install ocap

Basic Usage

# Start recording (stop with Ctrl+C)
$ ocap my-recording

# Show all options
$ ocap --help

# Advanced options
$ ocap FILENAME --window-name "App"   # Record specific window
$ ocap FILENAME --monitor-idx 1       # Record specific monitor
$ ocap FILENAME --fps 60              # Set framerate
$ ocap FILENAME --no-record-audio     # Disable audio
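
For scripted recordings, the same command line can be assembled from Python. The sketch below is illustrative only: `build_ocap_command` is a hypothetical helper, not part of ocap, and it uses only the flags listed above.

```python
from typing import List, Optional


def build_ocap_command(
    filename: str,
    window_name: Optional[str] = None,
    monitor_idx: Optional[int] = None,
    fps: Optional[int] = None,
    record_audio: bool = True,
) -> List[str]:
    """Assemble an ocap argument list from the options shown above."""
    cmd = ["ocap", filename]
    if window_name is not None:
        cmd += ["--window-name", window_name]
    if monitor_idx is not None:
        cmd += ["--monitor-idx", str(monitor_idx)]
    if fps is not None:
        cmd += ["--fps", str(fps)]
    if not record_audio:
        cmd.append("--no-record-audio")
    return cmd


print(build_ocap_command("my-recording", fps=60, record_audio=False))
```

The resulting list can be passed to `subprocess.run`; stopping the recording still relies on Ctrl+C in the terminal.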

Output Files

  • .mcap — Event log (keyboard, mouse, windows) in OWAMcap format
  • .mkv — Video/audio with embedded timestamps

Your recording files will be ready immediately!
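
A script can check that both output files exist before post-processing. This sketch assumes the outputs share the base name given on the command line (for example `my-recording.mcap` and `my-recording.mkv`); that naming and the helper names are assumptions, so verify them against an actual recording.

```python
from pathlib import Path
from typing import Dict, List


def expected_outputs(file_location: str) -> Dict[str, Path]:
    """Map each output kind to its assumed path (base name + extension)."""
    base = Path(file_location)
    # Append the extension instead of replacing a suffix, so base names
    # that already contain dots are preserved.
    return {
        "events": base.with_name(base.name + ".mcap"),
        "video": base.with_name(base.name + ".mkv"),
    }


def missing_outputs(file_location: str) -> List[str]:
    """Return the output kinds whose files are not on disk."""
    return [
        kind
        for kind, path in expected_outputs(file_location).items()
        if not path.exists()
    ]


paths = expected_outputs("my-recording")
print(paths["events"].name, paths["video"].name)
```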

Feature Comparison

Feature | ocap | OBS | wcap | pillow/mss
Advanced data formats (OWAMcap) | ✅ Yes | ❌ No | ❌ No | ❌ No
Timestamp aligned logging | ✅ Yes | ❌ No | ❌ No | ❌ No
Customizable event definition & Listener | ✅ Yes | ❌ No | ❌ No | ❌ No
Single python file | ✅ Yes | ❌ No | ❌ No | ❌ No
Audio + Window + Keyboard + Mouse | ✅ Yes | ⚠️ Partial | ❌ No | ❌ No
Hardware-accelerated encoder | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No
Supports latest Windows APIs | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No (legacy APIs only)
Optional mouse cursor capture | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No

Technical Architecture

Built on GStreamer with clean, maintainable design:

flowchart TD
    %% Input Sources
    A[owa.env.desktop] --> B[Keyboard Events]
    A --> C[Mouse Events] 
    A --> D[Window Events]
    E[owa.env.gst] --> F[Screen Capture]
    E --> G[Audio Capture]
    
    %% Core Processing
    B --> H[Event Queue]
    C --> H
    D --> H
    F --> H
    F --> I[Video/Audio Pipeline]
    G --> I
    
    %% Outputs
    H --> J[MCAP Writer]
    I --> K[MKV Pipeline]
    
    %% Files
    J --> L[📄 events.mcap]
    K --> M[🎥 video.mkv]
    
    style A fill:#e1f5fe
    style E fill:#e1f5fe
    style H fill:#fff3e0
    style L fill:#e8f5e8
    style M fill:#e8f5e8
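
The event path in the diagram (several sources feeding one queue that a single writer drains) can be sketched with the standard library. This is not the actual recorder.py: the function names, the event fields, and the list used in place of the MCAP writer are all assumptions made for illustration.

```python
import queue
import threading
import time


def producer(source: str, n_events: int, events: queue.Queue) -> None:
    """Simulate one input source pushing timestamped events."""
    for i in range(n_events):
        events.put({"topic": source, "timestamp_ns": time.time_ns(), "seq": i})


def writer(events: queue.Queue, sink: list) -> None:
    """Drain the queue until a None sentinel arrives.

    The list `sink` stands in for the MCAP writer in the diagram.
    """
    while True:
        item = events.get()
        if item is None:
            break
        sink.append(item)


def run_demo(sources=("keyboard", "mouse", "window"), n_events=5) -> list:
    events: queue.Queue = queue.Queue()
    sink: list = []
    consumer = threading.Thread(target=writer, args=(events, sink))
    consumer.start()
    producers = [
        threading.Thread(target=producer, args=(s, n_events, events))
        for s in sources
    ]
    for t in producers:
        t.start()
    for t in producers:
        t.join()
    events.put(None)  # all producers are done; tell the writer to stop
    consumer.join()
    return sink


recorded = run_demo()
print(len(recorded), "events from", sorted({e["topic"] for e in recorded}))
```

Funnelling every source through one queue keeps file writes on a single thread, so the sources never contend for the output file.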

Troubleshooting

  • Recording terminates right after start? Re-run the same command a few times. An intermittent GStreamer crash with an unknown cause is responsible.
  • GStreamer error message box appears on first run? This is a known issue where GStreamer may show error dialogs the first time you run ocap. These messages do not affect recording—simply close the dialogs and continue. ocap will function normally.
  • Audio not recording? By default, only audio from the target process is recorded. To change this, manually edit the GStreamer pipeline.
  • Large file sizes? Reduce file size by adjusting the gop-size parameter in the nvd3d11h265enc element. See pipeline.py.
  • Performance tips: Close unnecessary applications before recording, use SSD storage for better write performance, and record to a different drive than your OS drive.
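
The "re-run a few times" workaround for early termination can be automated. The sketch below treats any run shorter than a threshold as an early exit and tries again. `run_with_retries` and its parameters are hypothetical, and the 5-second default threshold is an arbitrary assumption.

```python
import time
from typing import Callable


def run_with_retries(
    launch: Callable[[], object],
    max_attempts: int = 3,
    min_runtime_s: float = 5.0,
) -> int:
    """Call launch() until one run lasts at least min_runtime_s.

    launch should block until the recorder exits. Returns the number of
    attempts used; raises RuntimeError if every attempt ended early.
    """
    for attempt in range(1, max_attempts + 1):
        started = time.monotonic()
        launch()
        if time.monotonic() - started >= min_runtime_s:
            return attempt
    raise RuntimeError(f"recorder exited early {max_attempts} times")
```

In practice `launch` could wrap something like `subprocess.run(["ocap", "my-recording"])`. A recording stopped deliberately within the threshold would also count as an early exit, so choose the threshold accordingly.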

FAQ

  • How much disk space do recordings use? ~100MB per minute for 1080p H265 recording.
  • Will ocap slow down my computer? The impact is minimal: ocap uses hardware acceleration and is designed for low overhead.
  • What is OWAMcap format? A specialized format that stores screen video (.mkv) plus synchronized events (.mcap) for AI training. It contains keyboard, mouse, and window events with nanosecond precision.
  • Can I save recordings in other formats? Yes. The only source file you need to edit is recorder.py, where you can implement JSONL, Parquet, CSV, or any other output format.
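
As an illustration of an alternative output format, a JSONL writer needs only a few lines. This is a sketch under assumptions: the event dictionaries below are hypothetical and do not reflect the real OWAMcap message schema or how recorder.py hands events to its writer.

```python
import io
import json
from typing import IO, Iterable, Mapping


def write_events_jsonl(events: Iterable[Mapping], out: IO[str]) -> int:
    """Write one JSON object per line; return the number of events written."""
    count = 0
    for event in events:
        out.write(json.dumps(event, sort_keys=True) + "\n")
        count += 1
    return count


# Hypothetical event records; the real OWAMcap messages may look different.
sample = [
    {"topic": "keyboard", "timestamp_ns": 1_000_000, "key": "A", "pressed": True},
    {"topic": "mouse", "timestamp_ns": 2_000_000, "x": 10, "y": 20},
]
buffer = io.StringIO()
written = write_events_jsonl(sample, buffer)
print(buffer.getvalue(), end="")
```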

When to Use ocap

  • AI Agent Training: Capture desktop interactions for training multimodal models
  • Workflow Documentation: Record exact steps with precise timing
  • Performance Testing: Low-overhead recording during intensive tasks
  • Research & Datasets: Generate standardized OWAMcap data for the community (HuggingFace Hub)
