A tool for extracting a text line image from the contour with different methods

Line image extractor

This is a command-line tool and a Python library for extracting text line images from a page image. It is built by Teklia and freely available as open source under the MIT licence.

It supports different extraction methods:

  • boundingRect - bounding rectangle of the line polygon
  • polygon - exact polygon
  • min_area_rect - minimum area rectangle containing the polygon
  • deskew_polygon - deskew the polygon
  • deskew_min_area_rect - deskew the minimum area rectangle
  • skew_polygon - skew the polygon (rotate by some angle)
  • skew_min_area_rect - skew the minimum area rectangle (rotate by some angle)
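To make the geometric terms above concrete, here is a minimal pure-Python sketch that computes an axis-aligned bounding rectangle for a line polygon and a rough skew angle from its longest edge. It is an illustration only, not the library's implementation: the helper names `bounding_rect` and `longest_edge_angle` are hypothetical, the `(x, y, width, height)` convention is an assumption, and the longest-edge heuristic is just one simple way to estimate the angle that a deskew step would correct.

```python
import math

# Line polygon taken from the usage example in this README, as (x, y) points.
polygon = [(241, 1169), (2287, 1251), (2252, 1190), (244, 1091), (241, 1169)]


def bounding_rect(points):
    """Axis-aligned rectangle enclosing all points, as (x, y, width, height)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


def longest_edge_angle(points):
    """Angle in degrees of the longest polygon edge, folded into (-90, 90]."""
    edges = zip(points, points[1:])
    (x1, y1), (x2, y2) = max(edges, key=lambda e: math.dist(e[0], e[1]))
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    # An edge and its reverse describe the same line direction.
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle


rect = bounding_rect(polygon)
skew = longest_edge_angle(polygon)
print(rect)            # (241, 1091, 2046, 160)
print(round(skew, 1))  # 2.3
```

For this polygon the bounding rectangle is 2046 pixels wide and 160 pixels high, while the line itself is tilted by roughly 2.3 degrees, which is why a minimum area rectangle or a deskewed extraction can give a tighter crop than the plain bounding rectangle.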

Install the stable version of the library from PyPI:

pip install teklia-line-image-extractor

Install the library in development mode:

pip install -e .

Test extraction:

line-image-extractor -i tests/data/page_img.jpg -o out.jpg -p tests/data/line_polygon.json -e deskew_min_area_rect --color
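To compare extraction methods on the same line, one option is to generate one command per method. The sketch below only builds the argument lists and prints them; it uses just the flags shown in the command above (`-i`, `-o`, `-p`, `-e`, `--color`). It assumes that `-e` accepts the method names from the list of extraction methods, which the command above confirms only for `deskew_min_area_rect`. The helper `build_command` and the output file naming are hypothetical.

```python
import shlex

# Method names copied from the list of extraction methods; the skew_* methods
# are left out because any extra options they may need are not shown here.
METHODS = [
    "boundingRect",
    "polygon",
    "min_area_rect",
    "deskew_polygon",
    "deskew_min_area_rect",
]


def build_command(image, polygon_json, method, out_dir="."):
    """Return the argument list for one line-image-extractor call."""
    return [
        "line-image-extractor",
        "-i", image,
        "-o", f"{out_dir}/line_{method}.jpg",
        "-p", polygon_json,
        "-e", method,
        "--color",
    ]


commands = [
    build_command("tests/data/page_img.jpg", "tests/data/line_polygon.json", m)
    for m in METHODS
]
for cmd in commands:
    # Print a shell-safe version of each command for inspection.
    print(shlex.join(cmd))
```

Each argument list can then be passed to `subprocess.run(cmd, check=True)` once the package is installed.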

Use it as a library:

from pathlib import Path

import numpy as np

from line_image_extractor.extractor import extract, read_img, save_img
from line_image_extractor.image_utils import Extraction, polygon_to_bbox

# Load the full page image.
page_img = read_img(Path("tests/data/page_img.jpg"))

# Line contour as (x, y) points; the last point repeats the first to close the polygon.
polygon = np.asarray([[241, 1169], [2287, 1251], [2252, 1190], [244, 1091], [241, 1169]])
bbox = polygon_to_bbox(polygon)

# Extract the line with the exact-polygon method and save the result.
extracted_img = extract(page_img, polygon, bbox, Extraction.polygon)
save_img("line_output.jpg", extracted_img)
