
A high-performance Python screenshot library for Windows using the Desktop Duplication API


Better DXcam

Fastest Python Screenshot for Windows, Forked, and Maintained

import betterdxcam
camera = betterdxcam.create()
camera.grab()

Introduction

BetterDXcam is a fork of DXcam, a high-performance Python screenshot library for Windows built on the Desktop Duplication API and capable of capturing at 240Hz+. DXcam was originally built as part of a deep learning pipeline for FPS games, to perform better than existing Python solutions (python-mss, D3DShot).

BetterDXcam provides these improvements over DXcam:

  • Fixed a crash when the screen resolution changes

Compared to these existing solutions, DXcam provides:

  • Much faster screen capture (> 240Hz)
  • Capture of Direct3D exclusive full-screen applications without interruption, even when you alt+tab.
  • Automatic handling of scaled / stretched resolutions.
  • Accurate FPS targeting in capture mode, which makes it suitable for video output.
  • Seamless integration with NumPy, OpenCV, PyTorch, etc.

Installation

From PyPI:

pip install betterdxcam

Note: betterDXcam requires OpenCV for colorspace conversion. If you don't already have OpenCV, install it with pip install betterdxcam[cv2].

From source:

pip install --editable .

# to also install OpenCV
pip install --editable .[cv2]

Usage

In betterDXcam, each output (monitor) is associated with a betterDXCamera instance. To create a betterDXCamera instance:

import betterdxcam
camera = betterdxcam.create()  # returns a betterDXCamera instance on primary monitor

Screenshot

To take a screenshot, use .grab:

frame = camera.grab()

The returned frame is a numpy.ndarray with shape (Height, Width, 3[RGB]). RGB is the default color mode, and numpy.ndarray is the only supported data format (for now). Note that .grab returns None if no new frame has arrived since the last call to .grab. Usually this means there has been nothing new to render since then (e.g. the desktop is idle).
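Because .grab can return None, a caller that needs a frame has to retry. The following is a minimal sketch of one way to do that, not part of the library: grab_new_frame, timeout and poll_interval are hypothetical names, and the only thing assumed about camera is the .grab behavior described above.

```python
import time


def grab_new_frame(camera, timeout=1.0, poll_interval=0.001):
    """Poll camera.grab() until it returns a frame or the timeout expires.

    `camera` is any object whose .grab() returns None when no new frame is
    available. Returns the frame, or None if the timeout is reached.
    (Hypothetical helper, not a betterdxcam API.)
    """
    deadline = time.perf_counter() + timeout
    while True:
        frame = camera.grab()
        if frame is not None:
            return frame
        if time.perf_counter() >= deadline:
            return None
        # Yield the CPU briefly instead of spinning at full speed.
        time.sleep(poll_interval)
```

With a real camera this would be called as frame = grab_new_frame(camera). A None result then means the screen did not change within the timeout.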

To view the captured screenshot:

from PIL import Image
Image.fromarray(frame).show()

To screenshot a specific region, use the region parameter: it takes a tuple[int, int, int, int] of the left, top, right, bottom coordinates of the bounding box, similar to PIL.ImageGrab.grab.

left, top = (1920 - 640) // 2, (1080 - 640) // 2
right, bottom = left + 640, top + 640
region = (left, top, right, bottom)
frame = camera.grab(region=region)  # numpy.ndarray of shape (640, 640, 3), i.e. (H, W, C)

The above code will take a screenshot of the center 640x640 portion of a 1920x1080 monitor.
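The same arithmetic can be wrapped in a small function for other screen and box sizes. This is only a sketch: centered_region is a hypothetical helper, not a betterdxcam function, and it assumes the (left, top, right, bottom) convention described above.

```python
def centered_region(screen_w, screen_h, box_w, box_h):
    """Return (left, top, right, bottom) for a box centered on the screen."""
    if box_w > screen_w or box_h > screen_h:
        raise ValueError("box does not fit on the screen")
    left = (screen_w - box_w) // 2
    top = (screen_h - box_h) // 2
    return (left, top, left + box_w, top + box_h)


print(centered_region(1920, 1080, 640, 640))  # (640, 220, 1280, 860)
```

The resulting tuple can be passed directly as region=... to .grab or .start.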

Screen Capture

To start a screen capture, use .start: the capture runs in a separate thread, at 60Hz by default. Use .stop to stop the capture.

camera.start(region=(left, top, right, bottom))  # Optional argument to capture a region
camera.is_capturing  # True
# ... Do Something
camera.stop()
camera.is_capturing  # False
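If the code between .start and .stop can raise, the capture thread would keep running. One possible pattern is a context manager that always calls .stop. The sketch below assumes only the .start and .stop methods shown above; capturing is a hypothetical name, not something the library provides.

```python
from contextlib import contextmanager


@contextmanager
def capturing(camera, **start_kwargs):
    """Start a capture and guarantee camera.stop() runs afterwards.

    Keyword arguments are forwarded unchanged to camera.start().
    (Hypothetical helper, not a betterdxcam API.)
    """
    camera.start(**start_kwargs)
    try:
        yield camera
    finally:
        # Runs on normal exit and when the body raises.
        camera.stop()
```

Usage would look like: with capturing(camera, region=region) as cam: frame = cam.get_latest_frame().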

Consume the Screen Capture Data

While the betterDXCamera instance is in capture mode, you can use .get_latest_frame to get the latest frame in the frame buffer:

camera.start()
for i in range(1000):
    image = camera.get_latest_frame()  # Will block until new frame available
camera.stop()

Note that by default .get_latest_frame blocks until a new frame has become available since the last call to .get_latest_frame. To change this behavior, use video_mode=True.

Advanced Usage and Remarks

Multiple monitors / GPUs

cam1 = betterdxcam.create(device_idx=0, output_idx=0)
cam2 = betterdxcam.create(device_idx=0, output_idx=1)
cam3 = betterdxcam.create(device_idx=1, output_idx=1)
img1 = cam1.grab()
img2 = cam2.grab()
img3 = cam3.grab()

The above code creates three betterDXCamera instances for: [monitor0, GPU0], [monitor1, GPU0], [monitor1, GPU1], and subsequently takes three full-screen screenshots. (cross GPU untested, but I hope it works.) To get a complete list of devices and outputs:

>>> import betterdxcam
>>> betterdxcam.device_info()
'Device[0]:<Device Name:NVIDIA GeForce RTX 3090 Dedicated VRAM:24348Mb VendorId:4318>\n'
>>> betterdxcam.output_info()
'Device[0] Output[0]: Res:(1920, 1080) Rot:0 Primary:True\nDevice[0] Output[1]: Res:(1920, 1080) Rot:0 Primary:False\n'
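Since output_info() returns a plain string, it may be convenient to turn it into structured data before choosing device_idx and output_idx. The sketch below assumes the exact line format shown in the sample output above, which may differ between versions. parse_output_info is a hypothetical helper.

```python
import re

# Pattern matching one line of the sample output_info() string shown above.
_OUTPUT_LINE = re.compile(
    r"Device\[(\d+)\] Output\[(\d+)\]: "
    r"Res:\((\d+), (\d+)\) Rot:(\d+) Primary:(True|False)"
)


def parse_output_info(text):
    """Parse an output_info() string into a list of dicts (assumed format)."""
    outputs = []
    for match in _OUTPUT_LINE.finditer(text):
        device, output, width, height, rotation = (
            int(group) for group in match.groups()[:5]
        )
        outputs.append(
            {
                "device_idx": device,
                "output_idx": output,
                "resolution": (width, height),
                "rotation": rotation,
                "primary": match.group(6) == "True",
            }
        )
    return outputs
```

For example, the primary monitor could then be found with next(o for o in parse_output_info(betterdxcam.output_info()) if o["primary"]).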

Output Format

You can specify the output color mode upon creation of the betterDXCamera instance:

betterdxcam.create(output_idx=0, output_color="BGRA")

We currently support "RGB", "RGBA", "BGR", "BGRA", and "GRAY", with "GRAY" being grayscale. As for the data format, betterDXCamera currently supports only numpy.ndarray with shape (Height, Width, Channels). We will soon add support for other output formats.
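As a quick reference, the expected frame shape for each color mode can be written down as a lookup. This is a sketch under assumptions: the channel counts follow from the mode names, and "GRAY" is assumed to keep a trailing channel axis of size 1 so that the shape stays (Height, Width, Channels). expected_shape is a hypothetical helper, not a library function.

```python
# Channels per output_color. The GRAY entry assumes a trailing axis of size 1.
CHANNELS = {"RGB": 3, "RGBA": 4, "BGR": 3, "BGRA": 4, "GRAY": 1}


def expected_shape(height, width, output_color="RGB"):
    """Return the (Height, Width, Channels) shape expected for a color mode."""
    try:
        channels = CHANNELS[output_color]
    except KeyError:
        raise ValueError(f"unsupported output_color: {output_color!r}") from None
    return (height, width, channels)


print(expected_shape(1080, 1920, "BGRA"))  # (1080, 1920, 4)
```

Comparing frame.shape against this value is a cheap sanity check before handing frames to code that expects a fixed channel count.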

Video Buffer

Captured frames are inserted into a fixed-size ring buffer; when the buffer is full, the newest frame replaces the oldest. You can specify the maximum buffer length (64 by default) with the max_buffer_len argument when creating the betterDXCamera instance.

camera = betterdxcam.create(max_buffer_len=512)

Note: Right now, get_latest_frame is the only way to consume frames during capture, and it assumes the user processes frames in a LIFO pattern. It is a read-only action and won't pop the processed frame from the buffer. We will make changes to support other consumption patterns soon.
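The buffer semantics described above (fixed size, oldest frame overwritten, read-only access to the newest frame) can be modeled with collections.deque. This is an illustrative model only, not the library's actual implementation. FrameRing and its methods are hypothetical names.

```python
from collections import deque


class FrameRing:
    """Toy model of a fixed-size frame buffer (not betterdxcam's own code)."""

    def __init__(self, max_len=64):
        # A deque with maxlen drops the oldest item when a new one is appended.
        self._frames = deque(maxlen=max_len)

    def push(self, frame):
        self._frames.append(frame)

    def latest(self):
        """Return the newest frame without removing it, or None if empty."""
        return self._frames[-1] if self._frames else None

    def __len__(self):
        return len(self._frames)


ring = FrameRing(max_len=3)
for frame_id in range(5):
    ring.push(frame_id)
print(len(ring), ring.latest())  # 3 4
```

After five pushes into a buffer of length 3, only frames 2, 3 and 4 remain, and reading the latest frame leaves the buffer unchanged.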

Target FPS

To make betterDXCamera capture close to the user-specified target_fps, we use the undocumented CREATE_WAITABLE_TIMER_HIGH_RESOLUTION flag to create a Windows Waitable Timer Object. This is far more accurate (+/- 1ms) than time.sleep in Python before 3.11 (minimum resolution 16ms). The implementation creates a periodic timer through ctypes. Python 3.11 uses a similar approach[^2].

camera.start(target_fps=120)  # Should not be made greater than 160.

However, because Windows is a preemptive OS[^1] and Python calls carry overhead, the target FPS cannot be guaranteed to be accurate above 160. (See Benchmarks)
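To illustrate the general idea of pacing a loop to a target FPS, here is a portable sketch that uses time.perf_counter and time.sleep with absolute deadlines. It is not the waitable-timer implementation described above and will be less accurate on Windows with older Python versions. run_paced is a hypothetical name.

```python
import time


def run_paced(callback, target_fps, n_ticks):
    """Call callback(i) n_ticks times, spaced 1/target_fps seconds apart.

    Deadlines are computed from the start time, so a late tick does not
    push every following tick later. Returns the total elapsed seconds.
    """
    period = 1.0 / target_fps
    start = time.perf_counter()
    for i in range(n_ticks):
        deadline = start + i * period
        while True:
            remaining = deadline - time.perf_counter()
            if remaining <= 0:
                break
            time.sleep(remaining)
        callback(i)
    return time.perf_counter() - start
```

Scheduling against absolute deadlines rather than sleeping a fixed period after each tick keeps the average rate close to the target even when individual sleeps overshoot.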

Video Mode

By default, only newly rendered frames are put in the buffer, which suits an object detection / machine learning pipeline. When recording a video, however, that is not ideal, since we want frames at a constant framerate. When video_mode=True is passed to the .start method of a betterDXCamera instance, the frame buffer is fed at the target FPS, reusing the last frame if no new frame is available. For example, the following code outputs a 5-second, 120Hz screen capture:

import cv2
import betterdxcam

target_fps = 120
camera = betterdxcam.create(output_idx=0, output_color="BGR")
camera.start(target_fps=target_fps, video_mode=True)
# The frame size is (width, height) and must match the captured frames.
writer = cv2.VideoWriter(
    "video.mp4", cv2.VideoWriter_fourcc(*"mp4v"), target_fps, (1920, 1080)
)
for i in range(600):  # 600 frames = 5 seconds at 120 fps
    writer.write(camera.get_latest_frame())
camera.stop()
writer.release()
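The hard-coded 600 and (1920, 1080) above can be derived instead. The sketch below assumes a camera already started with video_mode=True and a writer with a .write method. record and writer_frame_size are hypothetical helpers, not part of betterdxcam or OpenCV.

```python
def writer_frame_size(frame_shape):
    """Convert an ndarray shape (H, W, C) to the (width, height) a writer expects."""
    height, width = frame_shape[:2]
    return (width, height)


def record(camera, writer, seconds, target_fps):
    """Write seconds * target_fps frames from camera to writer.

    Assumes the camera was started with video_mode=True so that
    get_latest_frame() returns at the target rate. Returns the frame count.
    """
    n_frames = round(seconds * target_fps)
    for _ in range(n_frames):
        writer.write(camera.get_latest_frame())
    return n_frames


print(writer_frame_size((1080, 1920, 3)))  # (1920, 1080)
```

With these, the loop above becomes record(camera, writer, seconds=5, target_fps=120), and the writer size can be taken from the shape of a first grabbed frame.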

You can do interesting things with libraries like pyav and pynput: see examples/instant_replay.py for a rough implementation of instant replay using hot-keys.

Safely Releasing Resources

Calling .release on a betterDXCamera instance stops any active capture, frees the buffer, and releases the duplicator and staging resources. Calling .stop() stops the active capture and frees the frame buffer. If you want to recreate a betterDXCamera instance on the same output with different parameters, you can also manually delete it:

camera1 = betterdxcam.create(output_idx=0, output_color="BGR")
camera2 = betterdxcam.create(output_idx=0)  # Not allowed, camera1 will be returned
camera1 is camera2  # True
del camera1
del camera2
camera2 = betterdxcam.create(output_idx=0)  # Allowed

Benchmarks

For Max FPS Capability:

import time
import betterdxcam

title = "betterDXcam"  # label for the printed result
start_time, frame_count = time.perf_counter(), 0
cam = betterdxcam.create()
while frame_count < 1000:
    frame = cam.grab()
    if frame is not None:  # count only new frames
        frame_count += 1
elapsed = time.perf_counter() - start_time
print(f"{title}: {frame_count / elapsed}")

Using similar logic (only newly captured frames count), betterDXcam / DXcam, python-mss, and D3DShot benchmarked as follows:

|             | DXcam                   | python-mss | D3DShot |
|-------------|-------------------------|------------|---------|
| Average FPS | 238.79 :checkered_flag: | 75.87      | 118.36  |
| Std Dev     | 1.25                    | 0.5447     | 0.3224  |

The benchmark covers 5 runs, with light-to-moderate load on my PC (5900X + 3090; Chrome with ~30 tabs, VS Code open, etc.). I used the Blur Busters UFO test to constantly render 240 fps on my monitor (Zowie 2546K). DXcam captured almost every rendered frame.
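For readers who want to reproduce the mean and standard deviation over several runs, the measurement and the summary can be separated as in the sketch below. measure_fps and summarize are hypothetical helpers. The sample standard deviation is used here as an assumption, since the text does not say which estimator produced the numbers above.

```python
import statistics
import time


def measure_fps(grab, n_frames):
    """Call grab() until n_frames non-None results arrive and return frames/sec."""
    start = time.perf_counter()
    captured = 0
    while captured < n_frames:
        if grab() is not None:  # count only new frames
            captured += 1
    return captured / (time.perf_counter() - start)


def summarize(fps_runs):
    """Return (mean, sample standard deviation) of per-run FPS values."""
    return statistics.mean(fps_runs), statistics.stdev(fps_runs)
```

A five-run benchmark would then be summarize([measure_fps(cam.grab, 1000) for _ in range(5)]), where cam is a camera created as shown earlier.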

For Targeting FPS:

camera = betterdxcam.create(output_idx=0)
camera.start(target_fps=60)
for i in range(1000):
    image = camera.get_latest_frame()
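camera.stop()

| Target \ (mean, std) | betterDXcam / DXcam          | python-mss | D3DShot     |
|----------------------|------------------------------|------------|-------------|
| 60fps                | 61.71, 0.26 :checkered_flag: | N/A        | 47.11, 1.33 |
| 30fps                | 30.08, 0.02 :checkered_flag: | N/A        | 21.24, 0.17 |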

Work Referenced

D3DShot : DXcam (and by extension betterDXcam) borrows the ctypes header directly from the no-longer maintained D3DShot.

OBS Studio : Learned a lot from it.

[^1]: Preemption (computing), https://en.wikipedia.org/wiki/Preemption_(computing)

[^2]: bpo-21302: time.sleep() uses waitable timer on Windows, https://github.com/python/cpython/issues/65501

