image processing and stuff

image2layout_computer_vision

An image processing module for some computer vision tasks (public module for image2layout)

Package Page: pypi

Features:

  1. Text Detection and Recognition (OCR)
  2. Color extraction (background and main foreground)

Installation

Install with python/conda [Linux]

  1. (Optional) Conda
curl https://repo.anaconda.com/archive/Anaconda3-2023.03-1-Linux-x86_64.sh -o ~/conda.sh
bash ~/conda.sh -b -f -p /opt/conda
rm ~/conda.sh
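# Initialize conda for all shells, then restart the shell (with --dry-run, conda init only previews its changes)
conda init --all --verbose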

conda create -n cv python=3.10 -y
conda activate cv
  2. Python libraries (python>=3.8)

CPU

python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
python -m pip install paddleocr paddlepaddle

python -m pip install datasets transformers scikit-learn Pillow numpy pandas chardet
python -m pip install --upgrade image2layout-computer-vision

GPU

# python -m pip install 'torch>=2.0' torchvision torchaudio
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia -y
python -m pip install paddleocr paddlepaddle-gpu

python -m pip install datasets transformers scikit-learn Pillow numpy pandas chardet
python -m pip install --upgrade image2layout-computer-vision
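After either install path, a quick check can confirm that the packages landed in the active environment. The sketch below is not part of the package: `missing_modules` is a hypothetical helper, and the import names assume the usual mappings from pip package to module (paddlepaddle as `paddle`, scikit-learn as `sklearn`, Pillow as `PIL`).

```python
from importlib.util import find_spec

# Top-level import names for the packages installed above.
# Assumption: standard mappings from pip package name to import name.
REQUIRED_MODULES = [
    "torch", "torchvision", "paddleocr", "paddle",
    "datasets", "transformers", "sklearn", "PIL",
    "numpy", "pandas", "chardet",
    "image2layout_computer_vision",
]

def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as torch.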

Install with docker

For running with CPU on Ubuntu

sudo docker build --tag cv -f Dockerfile_cpu .

sudo docker run -it -p 0.0.0.0:8000:8000 -p 0.0.0.0:8001:8001 -v $(pwd):/app cv bash

From inside the container

cd deployment
conda activate cv
python api_serve.py -n CV -p 8000
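Once `api_serve.py` is running, it can help to confirm from the host that the published port accepts connections. This is a minimal sketch using only the standard library; it checks TCP reachability only and assumes nothing about the API's routes, which are not described here. `port_is_open` is a hypothetical helper name.

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

if __name__ == "__main__":
    # Port 8000 matches the `-p 8000` argument and the docker port mapping.
    print("port 8000 open:", port_is_open("127.0.0.1", 8000))
```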

Install from git

python -m pip install git+https://github.com/felix-do-wizardry/image2layout-computer-vision.git

Usage

Note: Image inputs can be given as a filepath, an Image.Image object (Pillow), or a numpy array
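Because filepaths are accepted, a folder of images can be turned into a batch input with the standard library alone. The helper below is a sketch, not part of the package: `list_image_paths` and its default extension set are assumptions chosen for illustration.

```python
from pathlib import Path

# Assumption: common raster extensions; adjust to the formats you actually use.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}

def list_image_paths(folder, extensions=IMAGE_EXTENSIONS):
    """Return sorted image filepaths (as strings) found directly in `folder`."""
    return sorted(
        str(path)
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in extensions
    )
```

The resulting list can be passed wherever a list of filepaths is accepted, such as the multi-image calls in the usage steps.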

  1. Run this Python code to pre-download the model weights
from image2layout_computer_vision import OCR
OCR._load()
  2. Recognize text
from image2layout_computer_vision import OCR

# [A] detection only (boxes, no text) -> 2 lists of dicts with keys [text (empty), box, score (empty)]
data_merged, data_raw = OCR.detect_text_data('path/to/image.png', recognition=False)

# [B] text + box from multiple images -> list of lists of dicts with keys [text, box, score]
data_raw_multi = OCR.detect_text_elements(['path/to/image.png', 'path/to/image2.png'])
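Each per-image result from [B] is a list of dicts with keys `text`, `box`, and `score`, so downstream filtering needs no extra libraries. The sketch below keeps confident detections and joins their text. It assumes `score` is a number where higher means more confident and treats `box` as opaque; the helper name `confident_text`, the 0.5 threshold, and the sample data are made up for illustration.

```python
def confident_text(elements, min_score=0.5):
    """Join the text of elements whose score is at least `min_score`.

    `elements` is one image's result: a list of dicts with keys
    [text, box, score]. Elements with empty text or score are skipped.
    """
    kept = [
        el["text"]
        for el in elements
        if el.get("text") and (el.get("score") or 0.0) >= min_score
    ]
    return " ".join(kept)

# Hypothetical sample shaped like one image's output; box values are placeholders.
sample = [
    {"text": "Hello", "box": None, "score": 0.98},
    {"text": "wor1d", "box": None, "score": 0.31},
    {"text": "World", "box": None, "score": 0.87},
]
print(confident_text(sample))  # Hello World
```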
  3. Extract colors
import image2layout_computer_vision as icv

# [A] multiple images -> list of (background, foreground) pairs of RGB tuples
# sample output: [((2, 2, 2), (4, 4, 4)), ((6, 6, 6), (8, 8, 8))]
colors_all = icv.extract_colors(['path/to/image.png', 'path/to/image2.png'])

# [B] single image -> one (background, foreground) pair of RGB tuples
# sample output: ((9, 9, 9), (6, 6, 6))
color_bg, color_fg = icv.extract_colors('path/to/image.png')
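The background/foreground pair consists of plain RGB tuples, which makes it easy to convert to CSS-style hex strings or to check legibility. The sketch below uses the WCAG 2.x relative-luminance and contrast-ratio formulas; the helper names are hypothetical and the functions are not part of the package. It assumes 8-bit channels (0 to 255), as the sample outputs suggest.

```python
def rgb_to_hex(rgb):
    """Format an (r, g, b) tuple of 0-255 ints as '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)

def relative_luminance(rgb):
    """WCAG 2.x relative luminance of an sRGB color with 0-255 channels."""
    def linearize(channel):
        c = channel / 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)

# Sample pair taken from the [B] sample output.
color_bg, color_fg = (9, 9, 9), (6, 6, 6)
print(rgb_to_hex(color_bg), rgb_to_hex(color_fg))  # #090909 #060606
```

A low ratio between the extracted pair (close to 1.0) suggests the foreground is hard to distinguish from the background.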
  4. Detect elements [work-in-progress]
import image2layout_computer_vision.yolov6 as Detection

# pd.DataFrame with columns [box, score, class_index, class_name]
df_element = Detection.detect_element('path/to/image.png')

Build

(for building and uploading this package)

python -m pip install --upgrade pip
python -m pip install --upgrade build twine "keyring<19.0"

rm -rf dist
python -m build
python -m twine upload dist/* --verbose

Source Distribution

image2layout_computer_vision-0.1.10.tar.gz (13.8 kB)
