A PyTorch implementation of DeepDream
neural-dream
This is a PyTorch implementation of DeepDream. The code is based on neural-style-pt.
Here we DeepDream a photograph of Tübingen, Germany, with a variety of settings:
Specific Channel Selection
Channel Selection Based On Activation Strength
You can select channels automatically based on their activation strength.
Clockwise from upper left: The top 10 weakest channels, the 10 most average channels, the top 10 strongest channels, and all channels
Setup:
Dependencies:
Optional dependencies:
- For CUDA backend:
  - CUDA 7.5 or above
- For cuDNN backend:
  - cuDNN v6 or above
- For ROCm backend:
  - ROCm 2.1 or above
- For MKL backend:
  - MKL 2019 or above
- For OpenMP backend:
  - OpenMP 5.0 or above
After installing the dependencies, you'll need to run the following script to download the VGG model:
python models/download_models.py
This will download the original VGG-19 and VGG-16 models. The original VGG-19 model is used by default.
If your GPU has limited memory, the NIN ImageNet model is a better choice; its results are slightly worse but comparable. You can get the details on the model from the BVLC Caffe Model Zoo. The NIN model is also downloaded when you run the download_models.py script.
You can find detailed installation instructions for Ubuntu and Windows in the installation guide.
Usage
Basic usage:
python neural_dream.py -content_image <image.jpg>
cuDNN usage with NIN Model:
python neural_dream.py -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet.pth -gpu 0 -backend cudnn -num_iterations 10 -seed 123 -dream_layers relu0,relu3,relu7,relu12 -dream_weight 10 -image_size 512 -optimizer adam
Note that paths to images should not contain the ~ character to represent your home directory; you should instead use a relative path or a full absolute path.
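If your paths are written with ~, one way to follow the note above is to expand them before calling the script. The following is a minimal sketch, not part of neural-dream: build_command is a hypothetical helper, the example file names are made up, and only the -content_image and -output_image flags from the usage above are used.

```python
import os
import shlex


def build_command(content_image, output_image="out.png"):
    """Build a neural_dream.py argument list with "~" expanded to the home directory.

    Hypothetical helper for illustration; it only prepares the arguments.
    """
    return [
        "python", "neural_dream.py",
        "-content_image", os.path.abspath(os.path.expanduser(content_image)),
        "-output_image", os.path.abspath(os.path.expanduser(output_image)),
    ]


if __name__ == "__main__":
    # Example paths are placeholders; replace them with your own files.
    cmd = build_command("~/Pictures/input.jpg", "~/Pictures/dream.png")
    print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run as-is, which avoids relying on a shell to expand ~.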
Options:
-image_size: Maximum side length (in pixels) of the generated image. Default is 512.
-gpu: Zero-indexed ID of the GPU to use; for CPU mode set -gpu to c.
Optimization options:
-dream_weight: How much to weight DeepDream. Default is 1e3.
-tv_weight: Weight of total-variation (TV) regularization; this helps to smooth the image. Default is 1e-3. Set to 0 to disable TV regularization.
-num_iterations: Number of iterations to run. Default is 10.
-init: Method for generating the generated image; one of random or image. Default is image, which initializes with the content image; random uses random noise to initialize the input image.
-init_image: Replaces the initialization image with a user-specified image.
-jitter: Apply jitter to image. Default is 32. Set to 0 to disable jitter.
-optimizer: The optimization algorithm to use; either lbfgs or adam; default is adam. L-BFGS tends to give better results, but uses more memory. ADAM uses less memory, but you will probably need to play with other parameters to get good results, especially the dream weight and learning rate.
-learning_rate: Learning rate to use with the ADAM optimizer. Default is 1e1.
-normalize_weights: If this flag is present, dream weights will be divided by the number of channels for each layer. Idea from PytorchNeuralStyleTransfer.
-loss_mode: The DeepDream loss mode; bce, mse, mean, or norm; default is mean.
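To make the effect of -tv_weight concrete, here is a minimal sketch of a total-variation penalty on a single-channel image stored as nested lists. It assumes the common anisotropic form (sum of absolute differences between neighbouring pixels); the source does not state which form neural-dream uses, and total_variation is a name chosen for this illustration.

```python
def total_variation(image):
    """Sum of absolute differences between vertically and horizontally adjacent pixels.

    Assumed anisotropic TV form; not confirmed to match neural-dream's implementation.
    """
    height, width = len(image), len(image[0])
    vertical = sum(
        abs(image[y + 1][x] - image[y][x])
        for y in range(height - 1)
        for x in range(width)
    )
    horizontal = sum(
        abs(image[y][x + 1] - image[y][x])
        for y in range(height)
        for x in range(width - 1)
    )
    return vertical + horizontal


flat = [[0.5, 0.5], [0.5, 0.5]]    # perfectly smooth: no penalty
noisy = [[0.0, 1.0], [1.0, 0.0]]   # checkerboard: every neighbour pair differs
tv_weight = 1e-3                   # the documented default

print(tv_weight * total_variation(flat))
print(tv_weight * total_variation(noisy))
```

A smooth image contributes nothing to the penalty while a noisy one is penalized, which is why a larger weight pushes the optimizer toward smoother output.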
Output options:
-output_image: Name of the output image. Default is out.png.
-print_iter: Print progress every print_iter iterations. Set to 0 to disable printing.
-print_octave_iter: Print octave progress every print_octave_iter iterations. Default is 0, which disables printing.
-save_iter: Save the image every save_iter iterations. Set to 0 to disable saving intermediate results.
-save_octave_iter: Save the image every save_octave_iter iterations. Default is 0, which disables saving intermediate results.
Layer options:
-dream_layers: Comma-separated list of layer names to use for DeepDream reconstruction.
Channel options:
-channels: Comma-separated list of channels to use for DeepDream. If -channel_mode is set to a value other than all, only the first value in the list will be used.
-channel_mode: The DeepDream channel selection mode; all, strong, avg, or weak; default is all. The strong option will select the strongest channels, while weak will do the same with the weakest channels. The avg option will select the most average channels instead of the strongest or weakest. The number of channels selected by strong, avg, or weak is based on the first value for the -channels parameter.
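The channel modes described above can be sketched as a ranking over one strength value per channel. This is an illustration only: select_channels is a hypothetical function, the strength values are made up, and "most average" is assumed here to mean closest to the mean strength across channels, which the source does not confirm.

```python
def select_channels(activations, mode="strong", count=10):
    """Return channel indices chosen by activation strength.

    activations: one scalar strength per channel (for example, its mean activation).
    mode: "all", "strong", "weak", or "avg" (assumed: closest to the mean strength).
    count: how many channels to keep for every mode except "all".
    """
    indices = range(len(activations))
    if mode == "all":
        return list(indices)
    if mode == "strong":
        ranked = sorted(indices, key=lambda c: activations[c], reverse=True)
    elif mode == "weak":
        ranked = sorted(indices, key=lambda c: activations[c])
    elif mode == "avg":
        mean = sum(activations) / len(activations)
        ranked = sorted(indices, key=lambda c: abs(activations[c] - mean))
    else:
        raise ValueError(f"unknown channel mode: {mode}")
    return ranked[:count]


strengths = [0.1, 0.9, 0.5, 0.3, 0.7]  # made-up per-channel strengths
print(select_channels(strengths, "strong", 2))  # [1, 4]
print(select_channels(strengths, "weak", 2))    # [0, 3]
print(select_channels(strengths, "avg", 1))     # [2]
```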
Octave Options:
-num_octaves: Number of octaves per iteration. Default is 4.
-octave_scale: Scale factor by which the image is resized. Default is 0.25.
-octave_iter: Number of iterations per octave. Default is 100.
Other options:
-original_colors: If you set this to 1, then the output image will keep the colors of the content image.
-model_file: Path to the .pth file for the VGG Caffe model. Default is the original VGG-19 model; you can also try the original VGG-16 model.
-model_type: Whether the model was trained using Caffe or PyTorch preprocessing; caffe or pytorch; default is caffe.
-pooling: The type of pooling layers to use; one of max or avg. Default is max. The VGG-19 model uses max pooling layers, but the paper mentions that replacing these layers with average pooling layers can improve the results. I haven't been able to get good results using average pooling, but the option is here.
-seed: An integer value that you can specify for repeatable results. By default this value is random for each run.
-multidevice_strategy: A comma-separated list of layer indices at which to split the network when using multiple devices. See Multi-GPU scaling for more details.
-backend: nn, cudnn, openmp, or mkl. Default is nn. mkl requires Intel's MKL backend.
-cudnn_autotune: When using the cuDNN backend, pass this flag to use the built-in cuDNN autotuner to select the best convolution algorithms for your architecture. This will make the first iteration a bit slower and can take a bit more memory, but may significantly speed up the cuDNN backend.
-clamp: If this flag is enabled, values will be clamped at every iteration so that they stay within the model's input range.
Frequently Asked Questions
Problem: The program runs out of memory and dies
Solution: Try reducing the image size: -image_size 512 (or lower). Note that different image sizes will likely require non-default values for -octave_scale and -num_octaves for optimal results.
If you are running on a GPU, you can also try running with -backend cudnn to reduce memory usage.
Problem: -backend cudnn is slower than the default NN backend
Solution: Add the flag -cudnn_autotune; this will use the built-in cuDNN autotuner to select the best convolution algorithms.
Problem: You get the following error message:
Missing key(s) in state_dict: "classifier.0.bias", "classifier.0.weight", "classifier.3.bias", "classifier.3.weight". Unexpected key(s) in state_dict: "classifier.1.weight", "classifier.1.bias", "classifier.4.weight", "classifier.4.bias".
Solution: Due to a mix-up with layer locations, older models require a fix to be compatible with newer versions of PyTorch. The included download_models.py script will automatically perform these fixes after downloading the models.
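For readers curious what such a fix can look like, here is a minimal sketch that renames the unexpected keys from the error message to the missing ones. It is an assumption that the download script works this way; fix_state_dict_keys and KEY_FIXES are hypothetical names, and plain strings stand in for the weight tensors so the example runs without PyTorch.

```python
from collections import OrderedDict

# Old key -> new key, read off the error message above.
KEY_FIXES = {
    "classifier.1.weight": "classifier.0.weight",
    "classifier.1.bias": "classifier.0.bias",
    "classifier.4.weight": "classifier.3.weight",
    "classifier.4.bias": "classifier.3.bias",
}


def fix_state_dict_keys(state_dict):
    """Return a copy of state_dict with the misplaced classifier keys renamed.

    Key order is preserved; keys that need no fix are passed through unchanged.
    """
    return OrderedDict(
        (KEY_FIXES.get(key, key), value) for key, value in state_dict.items()
    )


# Placeholder values; a real state_dict would map these keys to tensors.
old = OrderedDict([
    ("features.0.weight", "w0"),
    ("classifier.1.weight", "w1"),
    ("classifier.4.bias", "b4"),
])
print(list(fix_state_dict_keys(old)))
```

With a real model, the renamed dictionary would be the one passed to load_state_dict.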
Memory Usage
By default, neural-dream uses the nn backend for convolutions. The nn backend and the L-BFGS optimizer give good results, but both can use a lot of memory. You can reduce memory usage with the following:
- Use cuDNN: Add the flag -backend cudnn to use the cuDNN backend. This will only work in GPU mode.
- Use ADAM: Add the flag -optimizer adam to use ADAM instead of L-BFGS. This should significantly reduce memory usage, but may require tuning of other parameters for good results; in particular you should play with the learning rate and dream weight. This should work in both CPU and GPU modes.
- Reduce image size: If the above tricks are not enough, you can reduce the size of the generated image; pass the flag -image_size 256 to generate an image at half the default size.
With the default settings, neural-dream uses about GB of GPU memory on my system; switching to ADAM and cuDNN reduces the GPU memory footprint to about GB.
Multi-GPU scaling
You can use multiple CPU and GPU devices to process images at higher resolutions; different layers of the network will be computed on different devices. You can control which GPU and CPU devices are used with the -gpu flag, and you can control how to split layers across devices using the -multidevice_strategy flag.
For example, on a server with four GPUs, you can give the flag -gpu 0,1,2,3 to process on GPUs 0, 1, 2, and 3 in that order; by also giving the flag -multidevice_strategy 3,6,12 you indicate that the first two layers should be computed on GPU 0, layers 3 to 5 should be computed on GPU 1, layers 6 to 11 should be computed on GPU 2, and the remaining layers should be computed on GPU 3. You will need to tune the -multidevice_strategy for your setup in order to achieve maximal resolution.
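The mapping in the example above can be reproduced with a short sketch. assign_devices is a hypothetical helper, layers are numbered from 1 to match the wording of the example, and the total of 14 layers is an arbitrary value chosen for the demonstration; how the script itself counts layers is not stated here.

```python
from bisect import bisect_right


def assign_devices(num_layers, gpus, strategy):
    """Map 1-indexed layer numbers to devices, splitting at the given layer indices.

    gpus: comma-separated device list, as passed to -gpu.
    strategy: comma-separated split points, as passed to -multidevice_strategy.
    """
    devices = [d.strip() for d in gpus.split(",")]
    splits = [int(s) for s in strategy.split(",")]
    if len(splits) != len(devices) - 1:
        raise ValueError("need exactly one split point fewer than the number of devices")
    # A layer moves to the next device once its number reaches a split point.
    return {
        layer: devices[bisect_right(splits, layer)]
        for layer in range(1, num_layers + 1)
    }


placement = assign_devices(14, "0,1,2,3", "3,6,12")
print(placement[2], placement[3], placement[11], placement[12])  # 0 1 2 3
```

Printing the whole dictionary for a candidate strategy is a quick way to check which device each layer lands on before starting a long run.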
We can achieve very high quality results at high resolution by combining multi-GPU processing with multiscale generation as described in the paper Controlling Perceptual Factors in Neural Style Transfer by Leon A. Gatys, Alexander S. Ecker, Matthias Bethge, Aaron Hertzmann and Eli Shechtman.
Hashes for neural_dream-0.0.1.dev0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | ce334bf9e57286bad0abda5d4e6c753565a6993b446e2d3a0cbe80b4b17b8c07
MD5 | 14fc8ae86d29772b262150ff3650aacb
BLAKE2b-256 | d7c7bc27021f9151ed81dfb93a4b4cea831cf6774650de2233614ba582759d21