Detextify

What is this?

TL;DR: A Python library to remove unwanted pseudo-text from images generated by your favorite generative AI models (Stable Diffusion, Midjourney, DALL·E).

[Figure: example image before and after detextification]

So, why should I care?

We all know generative AI is the coolest thing since sliced bread 🍞.

But try using any off-the-shelf generative vision model and you'll quickly see that these systems can get... creative with interpreting your prompts.

Specifically, you'll observe all kinds of weird artifacts in your images, from extra fingers on hands to arms coming out of chests to alien text written in random places.

For generative systems to actually be usable in downstream applications, we need to better control these outputs and mitigate unwanted effects.

We believe the next frontier for generative AI is about robustness and trust. In other words, how can we architect these systems to be controllable, relevant, and predictably consistent with our needs?

Detextify is the first phase in our vision of robustifying generative AI.

If we get this right, we will unlock slews of new applications for generative systems that will change the landscape of human-AI collaboration. 🌎

Cute, but what are you actually doing?

Detextify runs text detection on your image, masks the text boxes, and in-paints the masked regions until your image is text-free. Detextify can be run entirely on your local machine (using Tesseract for text detection and Stable Diffusion for in-painting), or can call existing APIs (Azure for text detection and OpenAI or Replicate for in-painting).
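
The detect, mask, in-paint loop described above can be sketched roughly as follows. This is a toy illustration, not Detextify's actual code: `Box`, `make_mask`, `remove_text`, `toy_detect`, `toy_inpaint`, and the `max_rounds` cap are all hypothetical names, and the "image" is a plain grid of integers where 1 stands in for a text pixel.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[int]]   # toy image: 1 marks a "text" pixel, 0 is background
Mask = List[List[bool]]   # True marks a pixel that should be in-painted


@dataclass(frozen=True)
class Box:
    """Hypothetical text box: top-left corner plus width and height."""
    x: int
    y: int
    w: int
    h: int


def make_mask(image: Image, boxes: List[Box]) -> Mask:
    """Build a mask that is True inside every detected box."""
    height, width = len(image), len(image[0])
    mask = [[False] * width for _ in range(height)]
    for box in boxes:
        for r in range(max(box.y, 0), min(box.y + box.h, height)):
            for c in range(max(box.x, 0), min(box.x + box.w, width)):
                mask[r][c] = True
    return mask


def remove_text(
    image: Image,
    detect: Callable[[Image], List[Box]],
    inpaint: Callable[[Image, Mask], Image],
    max_rounds: int = 5,
) -> Image:
    """Detect, mask, and in-paint until no text is found (or the cap is hit)."""
    for _ in range(max_rounds):
        boxes = detect(image)
        if not boxes:
            break
        image = inpaint(image, make_mask(image, boxes))
    return image


def toy_detect(image: Image) -> List[Box]:
    """Stand-in detector: one bounding box around all pixels equal to 1."""
    coords = [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v == 1]
    if not coords:
        return []
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return [Box(x=min(cols), y=min(rows),
                w=max(cols) - min(cols) + 1, h=max(rows) - min(rows) + 1)]


def toy_inpaint(image: Image, mask: Mask) -> Image:
    """Stand-in in-painter: overwrite masked pixels with background (0)."""
    return [[0 if m else v for v, m in zip(row, mask_row)]
            for row, mask_row in zip(image, mask)]
```

A real detector or in-painter may miss text or introduce new artifacts on a given pass, which is why the sketch repeats the loop and caps the number of rounds instead of assuming a single pass is enough.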

Installation

pip install detextify

Additionally:

  • To run text detection locally (as opposed to using the Azure API), you need to install Tesseract.
  • To run in-painting locally (as opposed to using the OpenAI or Replicate APIs), you need a GPU with CUDA and cuDNN installed.
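
Before running everything locally, it may help to check both prerequisites from Python. The sketch below is an assumption-laden convenience, not part of detextify: `check_local_prerequisites` is a hypothetical helper, it assumes the `tesseract` binary is on your PATH, and it assumes PyTorch is the library that will use your GPU.

```python
import importlib.util
import shutil


def check_local_prerequisites() -> dict:
    """Report whether Tesseract and a CUDA-capable PyTorch setup are visible."""
    # shutil.which returns the full path to the executable, or None if not found.
    tesseract_path = shutil.which("tesseract")

    cuda_available = False
    # Only import torch if it is installed, so the check works without it.
    if importlib.util.find_spec("torch") is not None:
        try:
            import torch
            cuda_available = bool(torch.cuda.is_available())
        except Exception:
            # A broken torch install is treated the same as "no GPU available".
            cuda_available = False

    return {"tesseract_path": tesseract_path, "cuda_available": cuda_available}


if __name__ == "__main__":
    print(check_local_prerequisites())
```

If `tesseract_path` is `None` or `cuda_available` is `False`, the API-based detector and in-painters are the alternative.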

Usage

You can remove unwanted text from your image in just a few lines 💪:

from detextify.text_detector import TesseractTextDetector
from detextify.inpainter import LocalSDInpainter
from detextify.detextifier import Detextifier

text_detector = TesseractTextDetector("/path/to/tesseract/installation")
detextifier = Detextifier(text_detector, LocalSDInpainter())
detextifier.detextify("/my/input/image/path.png", "/my/output/image/path.png")

and 💣💥, just like that, your image is cleared of any bizarre text artifacts.

Or if you want to clean up a directory of PNG images, just wrap it in a for-loop:

import glob
from detextify.text_detector import TesseractTextDetector
from detextify.inpainter import LocalSDInpainter
from detextify.detextifier import Detextifier

text_detector = TesseractTextDetector("/path/to/tesseract/installation")
detextifier = Detextifier(text_detector, LocalSDInpainter())
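for img_file in glob.glob("/path/to/dir/*.png"):
    # Skip outputs of a previous run so they are not processed again.
    if img_file.endswith("_detextified.png"):
        continue
    # Replace only the trailing extension, not every ".png" in the path.
    detextifier.detextify(img_file, img_file[: -len(".png")] + "_detextified.png")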
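
For larger directories, a slightly more defensive variant of this loop may be useful. The sketch below is only one possible approach: `detextify_directory`, `output_path_for`, and the `_detextified` suffix constant are hypothetical names, and the function takes any callable with an `(input_path, output_path)` signature, such as the `detextifier.detextify` method shown above.

```python
from pathlib import Path
from typing import Callable, List

SUFFIX = "_detextified"  # marker appended to output file names


def output_path_for(img_path: Path) -> Path:
    """Return e.g. cat.png -> cat_detextified.png in the same directory."""
    return img_path.with_name(f"{img_path.stem}{SUFFIX}{img_path.suffix}")


def detextify_directory(directory: str, detextify: Callable[[str, str], None]) -> List[Path]:
    """Run `detextify` on every PNG in `directory`; return the inputs that failed."""
    failed: List[Path] = []
    for img_path in sorted(Path(directory).glob("*.png")):
        # Skip files that are themselves outputs of an earlier run.
        if img_path.stem.endswith(SUFFIX):
            continue
        try:
            detextify(str(img_path), str(output_path_for(img_path)))
        except Exception as exc:
            # Keep going so one bad image does not abort the whole batch.
            print(f"Skipping {img_path}: {exc}")
            failed.append(img_path)
    return failed


# Usage, assuming `detextifier` was built as in the example above:
# failed = detextify_directory("/path/to/dir", detextifier.detextify)
```

Passing the callable in keeps the helper independent of which text detector and in-painter you chose.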

We provide multiple implementations for text detection and in-painting (both local and API-based), and you are also free to add your own.

Text Detectors

  1. TesseractTextDetector (based on Tesseract) runs locally. Follow this guide to install the tesseract library locally. On Ubuntu:
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev

To find the path where it was installed (and pass it to the TesseractTextDetector constructor):

whereis tesseract
  2. AzureTextDetector calls a computer vision API from Microsoft Azure. You will first need to create a Computer Vision resource via the Azure portal. Once created, take note of the endpoint and the key.
from detextify.text_detector import AzureTextDetector

AZURE_CV_ENDPOINT = "https://your-endpoint.cognitiveservices.azure.com"
AZURE_CV_KEY = "your-azure-key"
text_detector = AzureTextDetector(AZURE_CV_ENDPOINT, AZURE_CV_KEY)
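
To avoid hardcoding the Azure key in your script, you could read both values from the environment instead. This is a minimal sketch under assumptions: `load_azure_credentials` is a hypothetical helper, and the environment variable names simply mirror the constants above; detextify itself does not define them.

```python
import os
from typing import Mapping, Tuple


def load_azure_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (endpoint, key) from the environment, failing loudly if either is unset."""
    names = ("AZURE_CV_ENDPOINT", "AZURE_CV_KEY")  # assumed variable names
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["AZURE_CV_ENDPOINT"], env["AZURE_CV_KEY"]


# Usage, with AzureTextDetector taking (endpoint, key) as in the example above:
# text_detector = AzureTextDetector(*load_azure_credentials())
```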

Our evaluation shows that the two text detectors produce comparable results.

In-painters

  1. LocalSDInpainter (implemented via Hugging Face's diffusers library) runs locally and requires a GPU. Defaults to Stable Diffusion v2 for in-painting.
  2. ReplicateSDInpainter calls the Replicate API. Defaults to Stable Diffusion v2 for in-painting (and requires an API key).
  3. DalleInpainter calls the DALL·E 2 API from OpenAI (and requires an API key).
from detextify.inpainter import DalleInpainter, LocalSDInpainter, ReplicateSDInpainter

# You only need to instantiate one of the following:
local_inpainter = LocalSDInpainter()
replicate_inpainter = ReplicateSDInpainter("your-replicate-key")
dalle_inpainter = DalleInpainter("your-openai-key")

Contributing

To contribute, clone the repository, make your changes, commit and push to your clone, and submit a pull request.

To build the library, you need to install poetry:

curl -sSL https://install.python-poetry.org | python3 -
# Add poetry to your PATH. Note the specific path will differ depending on your system.
export PATH="/home/ubuntu/.local/bin:$PATH"
# Check the installation was successful:
poetry --version

Install dependencies for detextify:

poetry install

To execute a script, run:

poetry run python your_script.py

Please run the unit tests to make sure that your changes are not breaking the codebase:

poetry run pytest

Authors

This project was authored by Mihail Eric and Julia Turc. If you are building in the generative AI space, we want to hear from you!

