Compare various stereo depth estimation algorithms on image files or with an OAK-D camera.
stereodemo
Small Python utility to compare and visualize the output of various stereo depth estimation algorithms:
- Make it easy to get a qualitative evaluation of several state-of-the-art models in the wild
- Feed it left/right images or capture live from an OAK-D camera
- Interactive colored point-cloud view since nice-looking disparity images can be misleading
- Try different parameters on the same image
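To illustrate why a point-cloud view is more revealing than a disparity image, here is a minimal sketch of the standard pinhole stereo geometry that turns a disparity value into a 3D point. It is not taken from the stereodemo source; the function names and the calibration values in the usage lines (focal length, principal point, baseline) are illustrative assumptions.

```python
def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Depth in meters from a disparity in pixels (Z = f * B / d).

    Returns None for non-positive disparities, which carry no depth.
    """
    if disparity_px <= 0:
        return None
    return focal_length_px * baseline_m / disparity_px


def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) at the given depth to a 3D point in meters."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Illustrative calibration only, not read from any real camera.
depth = disparity_to_depth(disparity_px=30.0, focal_length_px=400.0, baseline_m=0.075)
point = backproject(u=320, v=200, depth_m=depth, fx=400.0, fy=400.0, cx=320.0, cy=240.0)
```

Because depth is inversely proportional to disparity, a small disparity error on a distant surface becomes a large depth error, which is easy to spot in 3D but nearly invisible in a color-mapped disparity image.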
Included methods (implementation/pre-trained models taken from their respective authors):
- OpenCV stereo block matching and Semi-global block matching baselines, with all their parameters
- CREStereo: "Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation" (CVPR 2022)
- RAFT-Stereo: "Multilevel Recurrent Field Transforms for Stereo Matching" (3DV 2021)
- Hitnet: "Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching" (CVPR 2021)
- STereo TRansformers: "Revisiting Stereo Depth Estimation From a Sequence-to-Sequence Perspective with Transformers" (ICCV 2021)
- Chang et al. RealtimeStereo: "Attention-Aware Feature Aggregation for Real-time Stereo Matching on Edge Devices" (ACCV 2020)
See the credits section below for details on each method and how it was made to work here.
https://user-images.githubusercontent.com/541507/169557430-48e62510-60c2-4a2b-8747-f9606e405f74.mp4
Getting started
Installation
python3 -m pip install stereodemo
Running it
With an OAK-D camera
To capture data directly from an OAK-D camera, use:
stereodemo --oak
Then click on Next Image to capture a new one.
With image files
For convenience, a tiny subset of some popular datasets is included in this repository. Just provide a folder to stereodemo and it will look for left/right pairs (either im0/im1 or left/right in the names):
# To evaluate on the oak-d images
stereodemo datasets/oak-d
# To cycle through all images
stereodemo datasets
Then click on Next Image to cycle through the images.
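The pair lookup described above (im0/im1 or left/right in the file names) can be sketched as follows. This is not the actual stereodemo implementation, only a minimal illustration under assumptions: the function name `find_stereo_pairs`, the marker table, and the accepted image suffixes are all hypothetical.

```python
from pathlib import Path

# Hypothetical naming rules: (left marker, right marker), tried in order.
PAIR_MARKERS = [("im0", "im1"), ("left", "right")]
# Hypothetical set of accepted image extensions.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def find_stereo_pairs(folder):
    """Return (left_path, right_path) tuples found recursively under `folder`."""
    pairs = []
    for left in sorted(Path(folder).rglob("*")):
        if not left.is_file() or left.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        for left_marker, right_marker in PAIR_MARKERS:
            if left_marker in left.name:
                # Derive the expected right image name from the left one.
                right = left.with_name(left.name.replace(left_marker, right_marker))
                if right.is_file():
                    pairs.append((left, right))
                break
    return pairs
```

Calling `find_stereo_pairs("datasets")` would then return every matched pair below that folder, and a left image without a matching right image is skipped.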
Sample images included in this repository:
- drivingstereo: outdoor driving.
- middlebury_2014: high-res objects.
- eth3d: outdoor and indoor scenes.
- sceneflow: synthetic rendering of objects.
- oak-d: indoor images I captured with my OAK-D lite camera.
Dependencies
pip will install the dependencies automatically. Here is the list:
- Open3D. For the point cloud visualization and the GUI.
- OpenCV. For image loading and the traditional block matching baselines.
- onnxruntime. To run pretrained models in the ONNX format.
- pytorch. To run pretrained models exported as torch script.
- depthai. Optional, to grab images from a Luxonis OAK camera.
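If a method fails to load, it can help to check which of these packages are importable in the current environment. The snippet below is a small standalone sketch, not part of stereodemo; the helper name `missing_dependencies` is hypothetical, and the mapping uses the usual import names of the packages listed above.

```python
import importlib.util

# Import name -> label from the dependency list above.
DEPENDENCIES = {
    "open3d": "Open3D",
    "cv2": "OpenCV",
    "onnxruntime": "onnxruntime",
    "torch": "pytorch",
    "depthai": "depthai (optional)",
}


def missing_dependencies(modules=DEPENDENCIES):
    """Return the labels of the modules that cannot be found for import."""
    return [
        label
        for name, label in modules.items()
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Only the top-level module is looked up, so the check does not import the packages themselves and stays fast.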
Credits for each method
I did not implement any of these methods myself; I only collected pre-trained models or converted them to torch script / ONNX.
- CREStereo
  - Official implementation and pre-trained models: https://github.com/megvii-research/CREStereo
  - Model Zoo for the ONNX models: https://github.com/PINTO0309/PINTO_model_zoo/tree/main/284_CREStereo
  - Port to ONNX + sample loading code: https://github.com/ibaiGorordo/ONNX-CREStereo-Depth-Estimation
- RAFT-Stereo
  - Official implementation and pre-trained models: https://github.com/princeton-vl/RAFT-Stereo
  - I exported the pytorch implementation to torch script via tracing, with minor modifications of the source code.
  - Their fastest implementation was not imported.
- Hitnet
  - Official implementation and pre-trained models: https://github.com/google-research/google-research/tree/master/hitnet
  - Model Zoo for the ONNX models: https://github.com/PINTO0309/PINTO_model_zoo/tree/main/142_HITNET
  - Port to ONNX + sample loading code: https://github.com/ibaiGorordo/ONNX-HITNET-Stereo-Depth-estimation
- Stereo Transformers
  - Official implementation and pre-trained models: https://github.com/mli0603/stereo-transformer
  - Made some small changes to allow torch script export via tracing.
  - The exported model currently fails with GPU inference, so only CPU inference is enabled.
- Chang et al. RealtimeStereo
  - Official implementation and pre-trained models: https://github.com/JiaRenChang/RealtimeStereo
  - I exported the pytorch implementation to torch script via tracing, with some minor changes to the code (https://github.com/JiaRenChang/RealtimeStereo/pull/15). See chang_realtimestereo_to_torchscript_onnx.py.
License
The code of stereodemo is MIT licensed, but the pre-trained models are subject to the licenses of their respective implementations.
The sample images have the license of their respective source, except for datasets/oak-d, which is licensed under the Creative Commons Attribution 4.0 International License.