Library for using Google Lens OCR via the API used in Chromium.

Chrome Lens API for Python

This project provides a Python library and CLI tool for Google Lens OCR, accessed through the API used in Chromium. It processes an image and returns the full text, the text with its coordinates, or text stitched together from blocks by one of several methods.

Features

Full Text Extraction: Extract the complete text from an image.
Coordinates Extraction: Extract text along with its coordinates.
Stitched Text: Reconstruct text from blocks using various methods:
- Default Full Text: Basic method for stitching text blocks.
- Old Method: Sequential text stitching.
- New Method: Enhanced text stitching.

P.S. Lens has a problem with the way it assembles the full text, which is why methods that stitch the text together from block coordinates have been added.
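To make the idea of stitching text from coordinates concrete, here is a minimal sketch. It is not the algorithm this package uses: the `Block` type, its `x`/`y` fields, and the `line_tolerance` parameter are all hypothetical, chosen only to illustrate grouping blocks into lines by vertical position and ordering them left to right.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical text block: a string plus the top-left corner of its box."""
    text: str
    x: float
    y: float


def stitch_blocks(blocks, line_tolerance=10.0):
    """Join blocks into lines, top to bottom and left to right.

    A block joins the current line when its y is within line_tolerance
    of the first block placed on that line; otherwise it starts a new line.
    """
    lines = []  # each entry: (anchor_y, [blocks on that line])
    for block in sorted(blocks, key=lambda b: b.y):
        if lines and abs(block.y - lines[-1][0]) <= line_tolerance:
            lines[-1][1].append(block)
        else:
            lines.append((block.y, [block]))
    return "\n".join(
        " ".join(b.text for b in sorted(row, key=lambda b: b.x))
        for _, row in lines
    )


blocks = [
    Block("world", x=60, y=12),
    Block("Hello", x=10, y=10),
    Block("Second line", x=10, y=40),
]
stitched = stitch_blocks(blocks)
print(stitched)
```

A fixed tolerance is the simplest possible choice; real OCR output would likely need a tolerance derived from the block heights.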

Installation

You can install the package using pip:

From PyPI

pip install chrome-lens-py

From GIT

pip install git+https://github.com/bropines/chrome-lens-py.git

From source

Clone the repository and install the package:

git clone https://github.com/bropines/chrome-lens-py.git
cd chrome-lens-py
pip install -r requirements.txt
pip install .

Usage

You can use the lens_scan command from the CLI to process images and extract text data, or you can use the Python API to integrate this functionality into your own projects.

CLI Usage

lens_scan <image_file> <data_type>

Data Types

all: Get all data (full text, coordinates, and stitched text using both methods).
full_text_default: Get only the default full text.
full_text_old_method: Get stitched text using the old sequential method.
full_text_new_method: Get stitched text using the new enhanced method.
coordinates: Get text along with coordinates.

Example

To extract text using the new method for stitching:

lens_scan path/to/image.jpg full_text_new_method

To get all available data:

lens_scan path/to/image.jpg all
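If you want to drive the CLI from a script rather than a shell, something like the following sketch may help. It assumes that `lens_scan` is on your `PATH`, writes its result to standard output, and exits with a non-zero code on failure; none of that is confirmed by this README. The helper names are hypothetical.

```python
import subprocess

# Data types listed in this README.
DATA_TYPES = (
    "all",
    "full_text_default",
    "full_text_old_method",
    "full_text_new_method",
    "coordinates",
)


def build_lens_scan_command(image_file, data_type):
    """Return the argv list for `lens_scan <image_file> <data_type>`."""
    if data_type not in DATA_TYPES:
        raise ValueError(f"unknown data type: {data_type!r}")
    return ["lens_scan", str(image_file), data_type]


def run_lens_scan(image_file, data_type):
    """Run lens_scan and return whatever it wrote to standard output."""
    completed = subprocess.run(
        build_lens_scan_command(image_file, data_type),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


command = build_lens_scan_command("path/to/image.jpg", "full_text_new_method")
print(command)
```

Passing the arguments as a list avoids shell quoting problems with image paths that contain spaces.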

CLI Help

You can use the -h or --help option to display usage information:

lens_scan -h

Programmatic API Usage

In addition to the CLI tool, this project provides a Python API that can be used in your scripts.

Basic Programmatic Usage

First, import the LensAPI class:

from chrome_lens_py import LensAPI

Example Programmatic Usage

Instantiate the API:
```
api = LensAPI()
```

Process an image:

Get all data:

result = api.get_all_data('path/to/image.jpg')
print(result)

Get the default full text:

result = api.get_full_text('path/to/image.jpg')
print(result)

Get stitched text using the old method:

result = api.get_stitched_text_sequential('path/to/image.jpg')
print(result)

Get stitched text using the new method:

result = api.get_stitched_text_smart('path/to/image.jpg')
print(result)

Get text with coordinates:

result = api.get_text_with_coordinates('path/to/image.jpg')
print(result)

Programmatic API Methods

get_all_data(image_path): Returns all available data for the given image.
get_full_text(image_path): Returns only the full text from the image.
get_text_with_coordinates(image_path): Returns text along with its coordinates in JSON format.
get_stitched_text_smart(image_path): Returns stitched text using the enhanced method.
get_stitched_text_sequential(image_path): Returns stitched text using the basic sequential method.
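The methods above appear to line up one-to-one with the CLI data types. The sketch below shows one way to select a method by data type name. The mapping is inferred from the descriptions in this README rather than taken from the package, and `extract` and `FakeLensAPI` are hypothetical names; the fake class only stands in for `LensAPI` so the example runs without network access.

```python
# Mapping inferred from the descriptions in this README, not from the package.
METHOD_FOR_DATA_TYPE = {
    "all": "get_all_data",
    "full_text_default": "get_full_text",
    "full_text_old_method": "get_stitched_text_sequential",
    "full_text_new_method": "get_stitched_text_smart",
    "coordinates": "get_text_with_coordinates",
}


def extract(api, image_path, data_type="all"):
    """Call the method on `api` that matches a CLI data type name."""
    try:
        method_name = METHOD_FOR_DATA_TYPE[data_type]
    except KeyError:
        raise ValueError(f"unknown data type: {data_type!r}") from None
    return getattr(api, method_name)(image_path)


class FakeLensAPI:
    """Stand-in used only to show the dispatch without network access."""

    def get_full_text(self, image_path):
        return f"full text of {image_path}"


demo = extract(FakeLensAPI(), "path/to/image.jpg", "full_text_default")
print(demo)
```

With the real class, the call would look like `extract(LensAPI(), 'path/to/image.jpg', 'coordinates')`.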

Project Structure

/chrome-lens-py
│
├── /src
│   ├── /chrome_lens_py
│   │   ├── __init__.py           # Package initialization
│   │   ├── constants.py          # Constants used in the project
│   │   ├── utils.py              # Utility functions
│   │   ├── image_processing.py   # Image processing module
│   │   ├── request_handler.py    # API request handling module
│   │   ├── text_processing.py    # Text processing module
│   │   ├── lens_api.py           # API interface for use in other scripts
│   │   └── main.py               # CLI tool entry point
│
├── setup.py                      # Installation setup
├── README.md                     # Project description and usage guide
└── requirements.txt              # Project dependencies

Acknowledgments

Special thanks to dimdenGD for the text extraction method used in this project. You can check out their work in the chrome-lens-ocr repository. This project is inspired by their approach to leveraging Google Lens OCR functionality.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Disclaimer

This project is intended for educational purposes only. The use of Google Lens OCR functionality must comply with Google's Terms of Service. The author of this project is not responsible for any misuse of this software or for any consequences arising from its use. Users are solely responsible for ensuring that their use of this software complies with all applicable laws and regulations.

Author

Bropines - Mail / Telegram

Release history

1.0.5: Sep 3, 2024
1.0.2: Aug 11, 2024
1.0.0 (this version): Aug 11, 2024

Download files

Source Distribution

chrome_lens_py-1.0.0.tar.gz (11.2 kB)

Uploaded Aug 11, 2024 Source

Built Distribution

chrome_lens_py-1.0.0-py3-none-any.whl (11.5 kB)

Uploaded Aug 11, 2024 Python 3

Hashes for chrome_lens_py-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`aec28f5a223c0a6034b04a3e9b0925a7d90c9f23944905e1abb5013fba6fa86b`
MD5	`0bc88003e11b801a88b4d22fa3cebab5`
BLAKE2b-256	`98de00c4e1778df4f08c75a27f806e29c19a67f5ac85545fd718284e742b8c31`

Hashes for chrome_lens_py-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9988234eebded0e1e204abb484e0e97753d1e3b462b82b7d5de7c4d9c5f1d54d`
MD5	`f311d50f9a8731ac9a2fd8473bb4e635`
BLAKE2b-256	`10e873144b82b3b28067699db3a63b456dca3714ef46d7dbe580ea9c71711a93`
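
The published digests can be used to check a downloaded file before installing it. Below is a minimal sketch using only the standard library `hashlib`; the helper names are hypothetical, and the local file path in the usage note is an assumption about where you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# SHA256 listed above for chrome_lens_py-1.0.0.tar.gz
EXPECTED_SDIST_SHA256 = (
    "aec28f5a223c0a6034b04a3e9b0925a7d90c9f23944905e1abb5013fba6fa86b"
)
```

For example, `matches_digest("chrome_lens_py-1.0.0.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an intact copy of the source distribution saved in the current directory.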