
A Non Maximal Suppression Python Package


opencv-text-detection

This is a derivative of pyimagesearch.com's OpenCV-text-detection and the OpenCV text detection C++ example.

This code began as an attempt to rotate the rectangles found by EAST. The PyImageSearch code doesn't rotate the rectangles due to a limitation in the OpenCV Python bindings: specifically, there is no NMSBoxes call for rotated rectangles (which turned out to be a bigger problem).

EAST is an Efficient and Accurate Scene Text detection pipeline. Adrian's post does a great job explaining EAST. In summary, EAST detects text in an image (or video) and provides geometry and confidence scores for each block of text it detects. It's worth noting that:

  • The geometry of the rectangles is given as offsetX, offsetY, top, right, bottom, left, theta.
  • Top is the distance from the offset point to the top of the rectangle, right is the distance from the offset point to the right edge of the rectangle, and so on. The offset point is most likely not the center of the rectangle.
  • The rectangle is rotated around the offset point by theta radians.

While the EAST paper is pretty clear about determining the position and size of the rectangle, it's not very clear about the rotation point for the rectangle. I used the offset point as it appeared to provide the best visual results.
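As a rough illustration of the geometry described above, the sketch below turns one EAST result into the four corners of a rectangle rotated about its offset point. The function name east_geometry_to_polygon is hypothetical and is not taken from this repo. The rotation direction is also an assumption: with image coordinates (y grows downward), a positive theta here turns the rectangle clockwise on screen, so the sign of theta may need flipping to match EAST's output.

```python
import math


def east_geometry_to_polygon(offset_x, offset_y, top, right, bottom, left, theta):
    """Return the four corners of an EAST box rotated about its offset point.

    Hypothetical helper for illustration; the sign convention for theta
    is an assumption, not something confirmed by the EAST paper.
    """
    # Corners relative to the offset point, in image coordinates (y down).
    corners = [
        (-left, -top),    # top-left
        (right, -top),    # top-right
        (right, bottom),  # bottom-right
        (-left, bottom),  # bottom-left
    ]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    polygon = []
    for dx, dy in corners:
        # Rotate the corner about the offset point, then translate back.
        x = offset_x + dx * cos_t - dy * sin_t
        y = offset_y + dx * sin_t + dy * cos_t
        polygon.append((x, y))
    return polygon


# With theta = 0 the result is the plain, unrotated rectangle.
unrotated = east_geometry_to_polygon(10, 20, top=2, right=5, bottom=3, left=1, theta=0.0)
```

Because the offset point is usually not the center, rotating about it moves the rectangle's center as well as tilting it, which is why the choice of rotation point matters for the drawn result.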

Modifications

In the PyImageSearch example code, Non Maximal Suppression (NMS) is performed on the results provided by EAST on the unrotated rectangles. The unrotated rectangles returned by NMS are then drawn on the original image.

Initially, I modified the code to rotate the rectangles selected by NMS and then draw them on the original image. As an example, the images below show the unrotated and rotated rectangles.

[Images: Unrotated, Rotated]

The Challenge

With my assumption that each rectangle returned by EAST was to be rotated around its offset, I wanted to see how the individual rotations would impact the results of NMS. That is, rather than applying NMS to the EAST rectangles and then drawing them rotated, could I rotate the rectangles and then run them through NMS? This became a challenge as the PyImageSearch imutils and OpenCV Python bindings don't support NMS applied to rectangles rotated about an arbitrary point.

The code in this repo is a result of that challenge. nms.py has three functions for performing NMS:

  • nms_rboxes(rotated_rects, scores)

    rotated_rects is a list of rotated rectangles, each described by ((cx, cy), (w, h), deg), where (cx, cy) is the center of the rectangle, (w, h) are its width and height, and deg is the rotation angle in degrees about the center. This format was chosen to match the OpenCV C++ implementation of NMSBoxes.

    scores is the corresponding list of confidence scores for the rrects.

    This function converts the list of rotated rectangles into a list of polygons (contours), each described by its vertices, and passes this list along with the other received parameters to nms_polygons.

    Returns a list of indices of the highest scoring, non-overlapping rotated_rects.

  • nms_polygons(polys, scores)

    polys is a list of polygons, each described by its vertices.

    scores is the corresponding list of confidence scores for the polys.

    Returns a list of indices of the highest scoring, non-overlapping polys.

  • nms_rects(rects, scores)

    rects is a list of unrotated, upright rectangles, each described by (x, y, w, h), where (x, y) is the upper-left corner of the rectangle and w, h are its width and height.

    scores is the corresponding list of confidence scores for the rects.

    Returns a list of indices of the highest scoring, non-overlapping rects.
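To make the input formats above concrete, here is a minimal sketch of how a rotated rectangle in ((cx, cy), (w, h), deg) form, or an upright rectangle in (x, y, w, h) form, could be expanded into a list of vertices. The helper names rrect_to_polygon and rect_to_polygon are hypothetical and are not the functions in nms.py. The rotation direction is an assumption, and the vertex order may differ from what OpenCV produces.

```python
import math


def rrect_to_polygon(rrect):
    """Expand ((cx, cy), (w, h), deg) into four (x, y) vertices.

    Hypothetical helper; assumes a positive angle rotates the +x axis
    toward the +y axis (clockwise on screen, since y grows downward).
    """
    (cx, cy), (w, h), deg = rrect
    theta = math.radians(deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = w / 2.0, h / 2.0
    # Corners relative to the center, before rotation.
    corners = [(-half_w, -half_h), (half_w, -half_h),
               (half_w, half_h), (-half_w, half_h)]
    return [(cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
            for dx, dy in corners]


def rect_to_polygon(rect):
    """Expand an upright (x, y, w, h) rectangle into four (x, y) vertices."""
    x, y, w, h = rect
    return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]


# An unrotated rrect and the equivalent upright rect give the same corners.
from_rrect = rrect_to_polygon(((5, 5), (4, 2), 0))
from_rect = rect_to_polygon((3, 4, 4, 2))
```

Once everything is a polygon, a single polygon-based NMS routine can serve all three input formats, which matches the way nms_rboxes is described as delegating to nms_polygons.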

Each of the above functions has an optional named parameter nms_function and accepts additional parameters as **kwargs.

  • nms_function, if specified, must be one of felzenswalb, malisiewicz or fast. If omitted, it defaults to malisiewicz. Note that the value for nms_function is not quoted.

The felzenswalb implementation was transmogrified from this PyImageSearch blog post.

indicies = nms_polygons(polygons, scores, nms_function=felzenswalb)

The malisiewicz implementation was transmogrified from this PyImageSearch blog post. Per Adrian's post, this implementation is much faster than felzenswalb, but be aware that when running nms_rrects or nms_polygons, some of the vectorization is lost :( and performance suffers a bit.

indicies = nms_polygons(polygons, scores, nms_function=malisiewicz)
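One way to see why vectorization is lost: for upright rectangles, overlap reduces to a few min/max operations over whole arrays, whereas the overlap of two arbitrary polygons has to be computed pair by pair. The sketch below computes intersection-over-union for two convex polygons using Sutherland-Hodgman clipping. It illustrates the idea only and is not the overlap code used in nms.py; the function names are hypothetical, and it assumes both polygons are convex (true for rotated rectangles).

```python
def _signed_area(poly):
    """Shoelace formula; the sign depends on the vertex winding order."""
    pairs = zip(poly, list(poly[1:]) + list(poly[:1]))
    return sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs) / 2.0


def _clip(subject, clip):
    """Sutherland-Hodgman: clip `subject` against the convex polygon `clip`."""
    clip = list(clip)
    if _signed_area(clip) < 0:
        clip = clip[::-1]  # normalize winding so "inside" means side >= 0
    output = list(subject)
    for a, b in zip(clip, clip[1:] + clip[:1]):
        if not output:
            break
        points, output = output, []

        def side(p):
            # Cross product: >= 0 when p lies on the inner side of edge a->b.
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        prev = points[-1]
        for cur in points:
            s_prev, s_cur = side(prev), side(cur)
            if (s_cur >= 0) != (s_prev >= 0):
                # The segment prev->cur crosses the edge; add the crossing point.
                t = s_prev / (s_prev - s_cur)
                output.append((prev[0] + t * (cur[0] - prev[0]),
                               prev[1] + t * (cur[1] - prev[1])))
            if s_cur >= 0:
                output.append(cur)
            prev = cur
    return output


def polygon_iou(p, q):
    """Intersection-over-union of two convex polygons."""
    inter = _clip(p, q)
    inter_area = abs(_signed_area(inter)) if len(inter) >= 3 else 0.0
    union = abs(_signed_area(p)) + abs(_signed_area(q)) - inter_area
    return inter_area / union if union > 0 else 0.0


square_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
square_b = [(1, 1), (3, 1), (3, 3), (1, 3)]
overlap = polygon_iou(square_a, square_b)  # intersection area 1, union area 7
```

Each call handles exactly one pair of polygons, so an NMS pass over polygons ends up looping in Python rather than operating on whole arrays at once.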

The fast implementation is an approximation of the OpenCV C++ NMSFAST routine which, as the inline comments there will tell you, was inspired by Piotr Dollar's NMS implementation in EdgeBox.

indicies = nms_polygons(polygons, scores, nms_function=fast)

  • kwargs are used to pass in custom values. All of these are optional; if not specified, default values are used:

nms_threshold: The value used for making NMS overlap comparisons. Defaults to 0.4 if omitted.

score_threshold: The value used to cull out rectangles/polygons based on their associated score. Defaults to 0.3.

top_k: Used to truncate the scores (after sorting) to include only the top_k scores. If top_k is 0, all scores are included. Default value is 0.

eta: Only applicable for fast. A coefficient in the adaptive threshold formula: nms_threshold_{i+1} = eta · nms_threshold_i. Defaults to 1.0.

As an example of what's possible:

indicies = nms_polygons(polygons, scores, nms_function=fast, nms_threshold=0.45, eta=0.9, score_threshold=0.6, top_k=100)
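To show how these keyword arguments could interact, here is a minimal sketch of a greedy NMS loop over upright (x, y, w, h) rectangles, loosely modelled on the structure of OpenCV's fast routine. It is an illustration under stated assumptions, not the code in nms.py: the names rect_iou and sketch_fast_nms are hypothetical, scores are culled with a strict "greater than" comparison, and the 0.5 floor on the adaptive threshold is an assumption modelled on OpenCV's routine.

```python
def rect_iou(a, b):
    """Intersection-over-union of two upright (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    return inter / (aw * ah + bw * bh - inter)


def sketch_fast_nms(rects, scores, nms_threshold=0.4, score_threshold=0.3,
                    top_k=0, eta=1.0):
    """Greedy NMS sketch; returns indices of the rectangles that are kept."""
    # 1. Cull low scores, then order the survivors from best to worst.
    order = [i for i in sorted(range(len(scores)),
                               key=lambda i: scores[i], reverse=True)
             if scores[i] > score_threshold]
    # 2. Optionally keep only the top_k highest scores (0 means keep all).
    if top_k > 0:
        order = order[:top_k]
    kept = []
    threshold = nms_threshold
    for i in order:
        # 3. Keep a rectangle only if it does not overlap any kept one too much.
        if all(rect_iou(rects[i], rects[k]) <= threshold for k in kept):
            kept.append(i)
            # 4. Adaptive threshold: shrink by eta after each kept rectangle
            #    (assumed to apply only while the threshold is above 0.5).
            if eta < 1.0 and threshold > 0.5:
                threshold *= eta
    return kept


rects = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 5, 5), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.2]
kept = sketch_fast_nms(rects, scores)
```

In this example the second rectangle overlaps the first with an IoU of about 0.68, which is above the default nms_threshold, and the fourth falls below score_threshold, so only indices 0 and 2 are kept.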

Results

As you might expect, performing NMS on the rotated rectangles doesn't change much on images with sparse text like the LeBron images above. However, with busier images there can be a difference -- EAST doesn't perform well with the image below, but it's instrumental for examining the NMS results.

[Images: Unrotated, Rotated]

NMS function                 Unrotated        Rotated
Malisiewicz (shown above)    15 rectangles    11 rectangles
Felzenswalb                  10 rectangles    9 rectangles
Fast                         10 rectangles    9 rectangles

Run the Code

This code was developed and run on Python 3.7 and OpenCV 4.0.0-pre on OSX. You can find helpful instructions for setting up this environment in yet another PyImageSearch blog post.

Clone the repo and run:

python text_detection.py --east frozen_east_text_detection.pb --image images/lebron_james.jpg

What's Next?

I have not implemented text_detection_video.py.

There are not any tests.

Thanks

A big thanks to Adrian Rosebrock (@PyImageSearch) at PyImageSearch -- he writes some amazing and inspiring content.
