An implementation of the DiffEdit algorithm for prompt-based mask creation and inpainting. For more information, see the README file.

DiffEdit


An unofficial implementation of DiffEdit based on 🤗 Hugging Face, this repo, and PyTorch. The method leverages the diffusion process to automatically extract a mask from an image given a prompt. The mask is then used to inpaint the image with the new content. For a clearer overview of the process, take a look at the DiffEdit.ipynb notebook.
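The core idea of the mask-extraction step can be sketched in plain NumPy. This is a simplified illustration, not the package's actual code: the real implementation works on VAE latents and uses a UNet as the noise predictor, while `predict_noise` below is a hypothetical stand-in, and the scheduler's noise scaling is omitted.

```python
import numpy as np

def diffedit_mask(latents, predict_noise, remove_prompt, add_prompt,
                  num_samples=10, threshold=0.5, rng=None):
    """DiffEdit mask step (sketch): noise the input several times, estimate
    the noise under both prompts, and threshold the averaged difference."""
    rng = rng or np.random.default_rng(0)
    diff_sum = np.zeros(latents.shape[-2:])
    for _ in range(num_samples):
        noise = rng.standard_normal(latents.shape)
        noised = latents + noise  # stand-in for the scheduler's forward step
        eps_remove = predict_noise(noised, remove_prompt)
        eps_add = predict_noise(noised, add_prompt)
        # Pixels tied to the prompt difference show a large, consistent gap
        diff_sum += np.abs(eps_remove - eps_add).mean(axis=0)
    avg = diff_sum / num_samples
    # Normalize to [0, 1] and binarize to get the inpainting mask
    avg = (avg - avg.min()) / (avg.max() - avg.min() + 1e-8)
    return avg > threshold
```

The returned boolean mask marks the region where the two prompts disagree, which is then handed to an inpainting model.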

The wheel for this repo is available on PyPI: https://pypi.org/project/DiffEdit/

Results

| Prompt (remove ⟶ add) | Original image | Mask | Edited image |
|---|---|---|---|
| "lion" ⟶ "dog" | An AI-generated lion | Mask of the image region containing the 'lion' | The 'lion' replaced with a 'dog' |
| "house" ⟶ "3-floor hotel" | An AI-generated house | Mask of the image region containing the 'house' | The 'house' replaced with a '3-floor hotel' |
| "an F1 race" ⟶ "a motogp race" | An AI-generated image of an F1 competition | Mask of the image region containing the F1 cars | The 'F1' cars replaced with 'motogp' bikes |

All of the above masks were generated with num-samples = 10.
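As a toy illustration of why averaging over several samples sharpens the mask, the snippet below thresholds a noisy difference map once with a single sample and once with ten. The signal and noise levels here are made up for the demonstration; they are not the values the package uses.

```python
import numpy as np

rng = np.random.default_rng(42)
true_diff = np.zeros((32, 32))
true_diff[8:24, 8:24] = 1.0  # the region that actually changes between prompts

def estimated_mask(num_samples):
    # Each sample's difference map is the true signal plus per-sample noise
    samples = true_diff + rng.normal(0.0, 1.0, size=(num_samples, 32, 32))
    return samples.mean(axis=0) > 0.5

for n in (1, 10):
    wrong = np.mean(estimated_mask(n) != (true_diff > 0.5))
    print(f"num_samples={n}: {wrong:.1%} of pixels misclassified")
```

Averaging shrinks the per-pixel noise by a factor of sqrt(num_samples), so the thresholded mask gets markedly cleaner with more samples, at the cost of extra diffusion passes.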

Installation

You can install the package in different ways, depending on your needs.

Optional step (recommended)

Create a virtual environment, to avoid conflicts with other packages. Here are some alternatives:

  • with venv:
python -m venv venv
source venv/bin/activate
  • with poetry:
poetry shell
  • with conda:
conda create -n diff-edit python=3.10
conda activate diff-edit

Install the package from PyPI:

pip install diff-edit
Alternative ways

Install the package from source:

poetry install

Install the package in editable mode, suggested for further development:

pip install -e .

Usage

For a quick evaluation, use the script image_edit.py:

python image_edit.py --image <path_to_image> --remove-prompt <object_to_remove> --add-prompt <object_to_add>

An example of usage is the following (resulting in this image):

python image_edit.py --remove-prompt "lion" --add-prompt "dog" --image-link "https://github.com/Gennaro-Farina/DiffEdit/blob/main/static/ai_gen_lion.jpeg" --num-samples 10

or you can use the Command Line Interface (CLI) to interact with the script:

diff-edit --remove-prompt "lion" --add-prompt "dog" --image-link "https://github.com/Gennaro-Farina/DiffEdit/blob/main/static/ai_gen_lion.jpeg" --num-samples 10

The script has the following options:

python image_edit.py --help
usage: image_edit.py [-h] [--remove-prompt REMOVE_PROMPT] [--add-prompt ADD_PROMPT] [--image IMAGE] [--image-link IMAGE_LINK] [--device {cpu,cuda,mps}]
                     [--vae-model VAE_MODEL] [--tokenizer TOKENIZER] [--text-encoder TEXT_ENCODER] [--unet UNET] [--scheduler SCHEDULER]
                     [--scheduler-start SCHEDULER_START] [--scheduler-end SCHEDULER_END] [--num-train-timesteps NUM_TRAIN_TIMESTEPS] [--beta-schedule BETA_SCHEDULE]
                     [--inpainting INPAINTING] [--seed SEED] [--num-samples N] [--save-path SAVE_PATH]

Diffusion Image Editing arguments

options:
  -h, --help            show this help message and exit
  --remove-prompt REMOVE_PROMPT
                        What you want to remove from the image
  --add-prompt ADD_PROMPT
                        What you want to add to the image
  --image IMAGE         Path to the image to edit
  --image-link IMAGE_LINK
                        Link to the image to edit
  --device {cpu,cuda,mps}
  --vae-model VAE_MODEL
                        Model name. E.g. stabilityai/sd-vae-ft-ema
  --tokenizer TOKENIZER
                        Tokenizer to tokenize the text. E.g. openai/clip-vit-large-patch14
  --text-encoder TEXT_ENCODER
                        Text encoder to encode the text. E.g. openai/clip-vit-large-patch14
  --unet UNET           UNet model for generating the latents. E.g. CompVis/stable-diffusion-v1-4
  --scheduler SCHEDULER
                        Noise scheduler. E.g. LMSDiscreteScheduler
  --scheduler-start SCHEDULER_START
                        Scheduler start value
  --scheduler-end SCHEDULER_END
                        Scheduler end value
  --num-train-timesteps NUM_TRAIN_TIMESTEPS
                        Number of training timesteps
  --beta-schedule BETA_SCHEDULE
                        Beta schedule
  --inpainting INPAINTING
                        Inpainting model. E.g. runwayml/stable-diffusion-inpainting
  --seed SEED           Random seed
  --num-samples N       Number of diffusion steps to generate the mask
  --save-path SAVE_PATH
                        Path to save the result. Default is <script_folder>/result.png
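For intuition, the scheduler-related flags above (--scheduler-start, --scheduler-end, --num-train-timesteps, --beta-schedule) typically map to a beta schedule like the one below. This follows the conventions used by 🤗 diffusers' schedulers; the default values shown are common Stable Diffusion settings, not necessarily this package's defaults.

```python
import numpy as np

def make_betas(beta_start=0.00085, beta_end=0.012,
               num_train_timesteps=1000, beta_schedule="scaled_linear"):
    """Build the per-timestep noise variances (betas) for the scheduler."""
    if beta_schedule == "linear":
        return np.linspace(beta_start, beta_end, num_train_timesteps)
    if beta_schedule == "scaled_linear":
        # Stable Diffusion's schedule: linear in sqrt(beta) space
        return np.linspace(beta_start**0.5, beta_end**0.5,
                           num_train_timesteps) ** 2
    raise ValueError(f"unknown beta schedule: {beta_schedule}")
```

The "scaled_linear" variant front-loads smaller betas, which keeps early denoising steps gentler than a plain linear schedule with the same endpoints.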
