Camtools: camera tools for computer vision.

CamTools

Camtools: camera tools for computer vision. Useful for plotting, converting, projecting, and ray casting with camera parameters.

Installation

# Option 1: install from pip.
pip install camtools

# Option 2: install from git.
pip install git+https://github.com/yxlao/camtools.git

# Option 3: install from source.
git clone https://github.com/yxlao/camtools.git
cd camtools
pip install -e .  # Dev mode, if you want to modify camtools.
pip install .     # Install mode, if you want to use camtools only.

What can you do with CamTools?

Plot cameras. Useful for debugging 3D reconstruction and NeRFs!

import camtools as ct
import open3d as o3d
cameras = ct.camera.create_camera_ray_frames(Ks, Ts)
o3d.visualization.draw_geometries([cameras])

Convert camera parameters.

pose = ct.convert.T_to_pose(T)     # Convert T to pose
R, t = ct.convert.T_to_R_t(T)      # Convert T to R and t
C    = ct.convert.pose_to_C(pose)  # Convert pose to camera center
K, T = ct.convert.P_to_K_T(P)      # Decompose projection matrix to K and T
                                   # And more...

Projection and ray casting.

# Project 3D points to pixels.
pixels = ct.project.points_to_pixel(points, K, T)

# Back-project depth image to 3D points.
points = ct.project.im_depth_to_points(depth, K, T)

# Ray cast a triangle mesh to depth image.
im_depth = ct.raycast.mesh_to_depths(mesh, Ks, Ts, height, width)

# And more...
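
The projection and back-projection calls above correspond to standard pinhole-camera arithmetic. The following is a minimal NumPy sketch of that arithmetic, not the camtools implementation. The helper names `project_points` and `backproject_depth` are hypothetical, and the sketch assumes that the depth image stores the z-coordinate in the camera frame, which may differ from how camtools interprets depth.

```python
import numpy as np


def project_points(points, K, T):
    """Project (N, 3) world points to (N, 2) pixel coordinates (x, y)."""
    R, t = T[:3, :3], T[:3, 3]
    points_cam = points @ R.T + t          # World -> camera coordinates.
    uvw = points_cam @ K.T                 # Camera -> homogeneous pixels.
    return uvw[:, :2] / uvw[:, 2:3]        # Divide by the z-depth.


def backproject_depth(depth, K, T):
    """Back-project an (H, W) z-depth image to (H * W, 3) world points."""
    height, width = depth.shape
    # x indexes columns (width), y indexes rows (height).
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    pixels_h = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3)
    rays_cam = pixels_h @ np.linalg.inv(K).T       # The z-component is 1.
    points_cam = rays_cam * depth.reshape(-1, 1)   # Scale rays by z-depth.
    R, t = T[:3, :3], T[:3, 3]
    return (points_cam - t) @ R                    # Camera -> world.


# Made-up intrinsics and extrinsics for illustration.
K = np.array([[100.0, 0.0, 2.0], [0.0, 100.0, 1.5], [0.0, 0.0, 1.0]])
T = np.eye(4)
T[:3, 3] = [0.1, -0.2, 0.3]
depth = np.full((3, 4), 2.0)

points = backproject_depth(depth, K, T)
pixels = project_points(points, K, T)      # Recovers the pixel grid.
```

Back-projecting a depth image and projecting the result again should return the original pixel grid, which makes a convenient sanity check for K and T.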

Image I/O and depth I/O with no surprises.
```
ct.io.imread()
ct.io.imwrite()

ct.io.imread_depth()
ct.io.imwrite_depth()
```
Strict type checks and range checks are enforced. These APIs are specifically designed to solve the following pain points:
- Is my image float32 or uint8?
- Does it have range [0, 1] or [0, 255]?
- Is it RGB or BGR?
- Does my image have an alpha channel?
- When saving a depth image as an integer-based .png, is it correctly scaled?
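
To illustrate the kind of strict checks listed above, here is a minimal NumPy sketch. It is not the camtools implementation: the helpers `check_float_image` and `depth_to_uint16` are hypothetical, and the depth scale of 1000 is an assumed value chosen only for the example.

```python
import numpy as np


def check_float_image(im):
    """Raise unless im is a float32 (H, W, 3) image with values in [0, 1]."""
    if not isinstance(im, np.ndarray):
        raise TypeError(f"Expected np.ndarray, got {type(im)}.")
    if im.dtype != np.float32:
        raise TypeError(f"Expected float32, got {im.dtype}.")
    if im.ndim != 3 or im.shape[2] != 3:
        raise ValueError(f"Expected shape (H, W, 3), got {im.shape}.")
    if im.min() < 0.0 or im.max() > 1.0:
        raise ValueError("Expected values in [0, 1].")
    return im


def depth_to_uint16(depth, depth_scale=1000.0):
    """Scale a float depth map to uint16, refusing values that overflow."""
    scaled = np.round(depth * depth_scale)
    if scaled.min() < 0 or scaled.max() > np.iinfo(np.uint16).max:
        raise ValueError("Scaled depth does not fit in uint16.")
    return scaled.astype(np.uint16)


im_uint8 = np.zeros((4, 4, 3), dtype=np.uint8)
im_float = im_uint8.astype(np.float32) / 255.0
check_float_image(im_float)                            # Passes.
depth_png = depth_to_uint16(np.array([[0.5, 1.25]]))   # [[500, 1250]]
```

Passing `im_uint8` directly to `check_float_image` raises a `TypeError` instead of silently producing a wrongly scaled image.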

Useful command-line tools (run in terminal).

# Crop image borders.
ct crop-boarders *.png --pad_pixel 10 --skip_cropped --same_crop

# Draw synchronized bounding boxes interactively.
ct draw-bboxes path/to/a.png path/to/b.png

# For more help.
ct --help

And more.
- Solve line intersections.
- COLMAP tools.
- Points normalization.
- ...

Camera conventions

We follow the standard pinhole camera model:

Camera coordinate: right-handed, with the $Z$ axis pointing away from the camera along the view direction and the $Y$ axis pointing down. Note that this differs from the Blender convention, where the $Z$ axis points opposite to the view direction and the $Y$ axis points up.
Image coordinate: starts from the top-left corner of the image, with $x$ pointing right (corresponding to the image width) and $y$ pointing down (corresponding to the image height). This is consistent with OpenCV, but note that dimension 0 of the image array is the height (i.e., $y$) and dimension 1 is the width (i.e., $x$). That is:
- $x$ <=> width <=> column <=> dimension 1
- $y$ <=> height <=> row <=> dimension 0
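
As a small illustration of the mapping above, the following NumPy snippet (with made-up image dimensions) shows that a pixel at (x, y) is addressed as `im[y, x]`:

```python
import numpy as np

height, width = 3, 5
im = np.zeros((height, width), dtype=np.float32)

x, y = 4, 1          # Pixel at column 4, row 1.
im[y, x] = 1.0       # Index as [row, column], i.e., [y, x].

assert im.shape == (height, width)
row, col = np.argwhere(im == 1.0)[0]   # argwhere returns (row, col).
```

With these dimensions, writing `im[x, y]` by mistake raises an `IndexError`, because 4 is out of range for the height. On square images the same mistake goes unnoticed.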

K: (3, 3) camera intrinsic matrix.

K = [[fx,  s, cx],
     [ 0, fy, cy],
     [ 0,  0,  1]]
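
As a quick sketch of how K is used, the snippet below builds K from made-up focal lengths and principal point, then maps a point in camera coordinates to a pixel. The numeric values are assumptions for illustration only.

```python
import numpy as np

fx, fy = 500.0, 500.0   # Focal lengths in pixels (made-up values).
cx, cy = 320.0, 240.0   # Principal point in pixels (made-up values).
s = 0.0                 # Skew, set to zero in this example.

K = np.array([[fx, s, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

point_cam = np.array([0.2, -0.1, 2.0])   # Z = 2 in front of the camera.
uvw = K @ point_cam                      # Homogeneous pixel coordinate.
pixel = uvw[:2] / uvw[2]                 # (x, y) = (370, 215).
```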

T or W2C: (4, 4) camera extrinsic matrix.
```
T = [[R | t],  = [[R_00, R_01, R_02, t_0],
     [0 | 1]]     [R_10, R_11, R_12, t_1],
                  [R_20, R_21, R_22, t_2],
                  [   0,    0,    0,   1]]
```
- T is also known as the world-to-camera W2C matrix, which transforms a point in the world coordinate to the camera coordinate.
- T's shape is (4, 4), not (3, 4).
- T must be invertible, and its inverse is the pose, i.e., np.linalg.inv(T) == pose.
- The camera center C in world coordinate is projected to [0, 0, 0, 1] in camera coordinate, i.e.,
```
T @ np.append(C, 1) == np.array([0, 0, 0, 1])  # C is (3,); append 1 for the homogeneous form.
```
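
To make the (4, 4) layout of T concrete, here is a minimal NumPy sketch that assembles T from an example R and t and applies it to a world point. The rotation and translation values are made up for illustration.

```python
import numpy as np

# Example extrinsics: a 90-degree rotation about Z plus a translation.
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])

T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

point_world = np.array([1.0, 0.0, 0.0])
point_world_h = np.append(point_world, 1.0)   # Homogeneous (4,) vector.
point_cam = (T @ point_world_h)[:3]           # Same as R @ point_world + t.
```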
R: (3, 3) rotation matrix.
```
R = T[:3, :3]
```
- R is a rotation matrix. It is an orthogonal matrix with determinant 1, as rotations preserve volume and orientation.
  - R.T == np.linalg.inv(R), up to floating-point error.
  - np.linalg.norm(R @ x) == np.linalg.norm(x), where x is a (3,) vector.
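
In floating point, the equalities above rarely hold exactly, so a tolerance-based comparison is safer. A minimal sketch using an example rotation about the Z axis:

```python
import numpy as np

theta = np.deg2rad(30.0)
# Rotation by theta about the Z axis.
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta), np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
x = np.array([1.0, 2.0, 3.0])

is_orthogonal = np.allclose(R.T @ R, np.eye(3))
has_unit_det = np.isclose(np.linalg.det(R), 1.0)
keeps_norm = np.isclose(np.linalg.norm(R @ x), np.linalg.norm(x))
```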
t: (3,) translation vector.
```
t = T[:3, 3]
```
- t's shape is (3,), not (3, 1).
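
The shape distinction matters because NumPy broadcasting turns a (3, 1) translation into a silent bug rather than an error. A small sketch with made-up values:

```python
import numpy as np

R = np.eye(3)
x = np.array([1.0, 2.0, 3.0])

t_flat = np.array([10.0, 20.0, 30.0])   # Shape (3,), as the convention requires.
t_col = t_flat.reshape(3, 1)            # Shape (3, 1), a common mistake.

good = R @ x + t_flat   # Shape (3,): the transformed point.
bad = R @ x + t_col     # Shape (3, 3): silent broadcasting, not an error.
```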
pose or C2W: (4, 4) camera pose matrix. It is the inverse of T.
```
pose = np.linalg.inv(T)
```
- pose is also known as the camera-to-world C2W matrix, which transforms a point in the camera coordinate to the world coordinate.
- pose is the inverse of T, i.e., pose == np.linalg.inv(T).
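
Because T is a rigid transform, its inverse also has a closed form: the rotation part of pose is R.T and the translation part is -R.T @ t. The sketch below checks this against np.linalg.inv for an example T with made-up values.

```python
import numpy as np

R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

pose = np.linalg.inv(T)          # General-purpose inverse.

pose_closed = np.eye(4)          # Closed form for rigid transforms:
pose_closed[:3, :3] = R.T        #   rotation part is R.T,
pose_closed[:3, 3] = -R.T @ t    #   translation part is -R.T @ t.
```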
C: camera center.
```
C = pose[:3, 3]
```
- C's shape is (3,), not (3, 1).
- C is the camera center in world coordinate. It is also the translation vector of pose.
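
As a sketch of the relations above, the camera center can be read from pose or computed directly as -R.T @ t, and applying T to it gives the origin of the camera coordinate. The R and t values are made up for illustration.

```python
import numpy as np

R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

C_from_pose = np.linalg.inv(T)[:3, 3]        # Translation part of pose.
C_from_R_t = -R.T @ t                        # Equivalent closed form.
origin_h = T @ np.append(C_from_pose, 1.0)   # Expected: [0, 0, 0, 1].
```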
P: (3, 4) the camera projection matrix.
- P is the world-to-pixel projection matrix, which projects a point in the homogeneous world coordinate to the homogeneous pixel coordinate.
- P is the product of the intrinsic and extrinsic parameters.
```
# P = K @ [R | t]
P = K @ np.hstack([R, t[:, None]])
```
- P's shape is (3, 4), not (4, 4).
- It is possible to decompose P into intrinsic and extrinsic matrices by RQ decomposition (a variant of QR decomposition) of the left (3, 3) block of P.
- Don't confuse P with pose.
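
The decomposition mentioned above can be sketched with NumPy alone by building an RQ factorization from np.linalg.qr. This is an illustrative sketch, not the implementation behind ct.convert.P_to_K_T. The function name `decompose_P` is hypothetical, and the sketch assumes that P equals K @ [R | t] up to a positive scale, with a positive diagonal in K.

```python
import numpy as np


def decompose_P(P):
    """Split a (3, 4) projection matrix into K (3, 3), R (3, 3), and t (3,)."""
    M = P[:, :3]
    J = np.fliplr(np.eye(3))            # Exchange (row/column reversal) matrix.
    # RQ decomposition of M built from NumPy's QR: M = K_raw @ R_raw.
    Q, U = np.linalg.qr((J @ M).T)
    K_raw = J @ U.T @ J                 # Upper triangular.
    R_raw = J @ Q.T                     # Orthogonal.
    # Flip signs so that the diagonal of K is positive.
    D = np.diag(np.sign(np.diag(K_raw)))
    K_raw, R = K_raw @ D, D @ R_raw
    t = np.linalg.solve(K_raw, P[:, 3])
    K = K_raw / K_raw[2, 2]             # Normalize so that K[2, 2] == 1.
    return K, R, t


# Compose P from made-up parameters, then recover them.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
P = K @ np.hstack([R, t[:, None]])

K_out, R_out, t_out = decompose_P(P)
```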
For more details, please refer to the following blog posts: part 1, part 2, and part 3.

Future work

- Refined APIs.
- Full PyTorch/NumPy compatibility.
- Unit tests.

Release history

0.1.5

Jul 3, 2024

0.1.4

Dec 2, 2023

0.1.3

Sep 9, 2023

0.1.2

Jul 20, 2023

0.1.1

Jul 19, 2023

This version

0.1.0

Jul 18, 2023

0.0.1

Jul 18, 2023

0.0.0

Nov 2, 2022

Download files

Source Distribution

camtools-0.1.0.tar.gz (36.4 kB)

Uploaded Jul 18, 2023 Source

Built Distribution

camtools-0.1.0-py3-none-any.whl (37.1 kB)

Uploaded Jul 18, 2023 Python 3

Hashes for camtools-0.1.0.tar.gz

Algorithm	Hash digest
SHA256	`e42e9c44180bd9ec01d933bceda502a4cfd8b3368ee02f18a6f095597c66ec81`
MD5	`a70d2a5b35eadd91d891df7733661f5e`
BLAKE2b-256	`92bee07b945817321015fec8afe7ac70766731604c76cab03a8166af1e88b1cb`

Hashes for camtools-0.1.0-py3-none-any.whl

Algorithm	Hash digest
SHA256	`ae7d51ac47ba98132ef9073e82748411dac60cde141c24b672a2d61b0a1bb98c`
MD5	`9c26a698a204e227c22bf55408158d88`
BLAKE2b-256	`ee63ebd2939686fb1e91d56e953547825a50c21e4bb37a99a4a8023764057272`