Simply put, the mission of this project is to colorize and restore old images. I'll get into the details in a bit, but first let's get to the pictures! BTW – most of these source images originally came from the TheWayWeWere subreddit, so credit to them for finding such great photos.
Maria Anderson as the Fairy Fleur de farine and Lyubov Rabtsova as her page in the ballet “Sleeping Beauty” at the Imperial Theater, St. Petersburg, Russia, 1890.
Woman relaxing in her living room (1920, Sweden)
Medical Students pose with a cadaver around 1890
Surfer in Hawaii, 1890
Whirling Horse, 1898
Interior of Miller and Shoemaker Soda Fountain, 1899
Paris in the 1880s
Edinburgh from the sky in the 1920s
Texas Woman in 1938
People watching a television set for the first time at Waterloo station, London, 1936
Geography Lessons in 1850
Chinese Opium Smokers in 1880
Deadwood, South Dakota, 1877
Siblings in 1877
Portsmouth Square in San Francisco, 1851
Samurais, circa 1860s
Seneca Native in 1908
This is a deep learning based model. More specifically, what I've done is combined the following approaches:
The beauty of this model is that it should be generally useful for all sorts of image modification, and it should do it quite well. What you're seeing above are the results of the colorization model, but that's just one component in a pipeline that I'm looking to develop here with the exact same model.
What I develop next with this model will be based on trying to solve the problem of making these old images look great, so the next item on the agenda for me is the "defade" model. I've committed initial efforts on that and it's in the early stages of training as I write this. Basically it's just training the same model to reconstruct images that have been augmented with ridiculous contrast/brightness adjustments, as a simulation of faded photos and photos taken with old/bad equipment. I've already seen some promising results on that as well:
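To make the augmentation idea concrete, here's a minimal sketch of what "ridiculous contrast/brightness adjustments" could look like with NumPy. The function names and the random ranges are assumptions picked for illustration, not the values or code actually used for training. The clean image would be the training target and the faded copy the model input.

```python
import numpy as np

def simulate_fade(image, contrast, brightness):
    """Return a faded copy of `image` (float array with values in [0, 1]).

    contrast < 1 flattens the tonal range toward mid-gray;
    brightness shifts every pixel up (> 0) or down (< 0).
    """
    faded = (image - 0.5) * contrast + 0.5 + brightness
    return np.clip(faded, 0.0, 1.0)

def random_fade(image, rng):
    """Apply a randomly chosen, deliberately exaggerated fade."""
    # Hypothetical ranges, chosen only to make the degradation obvious.
    contrast = rng.uniform(0.2, 0.8)
    brightness = rng.uniform(-0.3, 0.3)
    return simulate_fade(image, contrast, brightness)

rng = np.random.default_rng(0)
clean = rng.random((64, 64, 3))   # stand-in for a clean training image
faded = random_fade(clean, rng)   # degraded input; `clean` is the target
```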
So that's the gist of this project – I'm looking to make old photos look reeeeaaally good with GANs, and more importantly, make the project useful. And yes, I'm definitely interested in doing video, but first I need to sort out how to get this model under control with memory (it's a beast). It'd be nice if the models didn't take two to three days to train on a 1080TI as well (typical of GANs, unfortunately). In the meantime though this is going to be my baby and I'll be actively updating and improving the code over the foreseeable future. I'll try to make this as user-friendly as possible, but I'm sure there's going to be hiccups along the way.
Oh and I swear I'll document the code properly...eventually. Admittedly I'm one of those people who believes in "self documenting code" (LOL).
The easiest way to get started is to simply try out colorization here on Colab: https://colab.research.google.com/github/jantic/DeOldify/blob/master/DeOldify_colab.ipynb. This was contributed by Matt Robinson, and it's simply awesome.
You should now be able to do a simple install with Anaconda. Here are the steps:
Open the command line and navigate to the folder where you want to install the project. Then type the following commands:
git clone https://github.com/jantic/DeOldify.git DeOldify
cd DeOldify
conda env create -f environment.yml
Then start running with these commands:
source activate deoldify
jupyter lab
From there you can start running the notebooks in Jupyter Lab, via the url they provide you in the console.
Disclaimer: This conda install process is new. I did test it locally, but the classic developer's excuse is "well it works on my machine!" I'm keeping that in mind: there's a good chance it doesn't work on others' machines! I probably, most definitely did something wrong here. Definitely, in fact. Please let me know by opening an issue. Pobody's nerfect.
This project is built around the wonderful Fast.AI library. Unfortunately, it's the -old- version and I have yet to upgrade it to the new version. (That's definitely [update 11/18/2018: maybe] on the agenda.) So prereqs, in summary:
conda install -c conda-forge jupyterlab
conda install -c anaconda tensorflow-gpu
To start right away with your own images without training the model yourself, download the weights here (right click and download from this link). Then open the ColorizeVisualization.ipynb in Jupyter Lab. Make sure that there's this sort of line in the notebook referencing the weights:
colorizer_path = IMAGENET.parent/('colorize_gen_192.h5')
Then you simply pass it to this (all this should be in the notebooks already):
filters = [Colorizer(gpu=0, weights_path=colorizer_path)]
Which then feed into this:
vis = ModelImageVisualizer(filters, render_factor=render_factor, results_dir='result_images')
Just drop whatever images you want to run this against into the /test_images/ folder, and you can visualize the results inside the notebook with lines like this:
The result images will automatically go into that result_dir defined above, in addition to being displayed in Jupyter.
There's a render_factor variable that basically determines the quality of the rendered colors (but not the resolution of the output image). The higher it is, the better, but you'll also need more GPU memory to accommodate this. The max I've been able to have my GeForce 1080TI use is 42. Lower the number if you get a CUDA_OUT_OF_MEMORY error. You can customize this render_factor per image like this, overriding the default:
For older and low quality images in particular, this seems to improve the colorization pretty reliably. In contrast, more detailed and higher quality images tend to do better with a higher render_factor.
Model weight saves are also done automatically during the training runs by the GANTrainer – defaulting to saving every 1000 iterations (it's an expensive operation). They're stored in the root training data folder you provide, and the name goes by the save_base_name you provide to the training schedule. Weights are saved for each training size separately.
I'd recommend navigating the code top down – the Jupyter notebooks are the place to start. I treat them just as a convenient interface to prototype and visualize – everything else goes into .py files (and therefore a proper IDE) as soon as I can find a place for them. I already have visualization examples conveniently included – just open the xVisualization notebooks to run these – they point to test images already included in the project so you can start right away (in test_images).
The "GAN Schedules" you'll see in the notebooks are probably the ugliest looking thing I've put in the code, but they're just my version of implementing progressive GAN training, suited to a Unet generator. That's all that's going on there really.
Pretrained weights for the colorizer generator again are here (right click and download from this link). The DeFade stuff is still a work in progress so I'll try to get good weights for those up in a few days.
Generally with training, you'll start seeing good results when you get midway through size 192px (assuming you're following the progressive training examples I laid out in the notebooks). Note that this training regime is still a work in progress. I'm still trying to figure out what exactly is optimal. In other words, there's a good chance you'll find something to improve upon there.
I'm sure I screwed up something putting this up, so please let me know if that's the case.
I just put up a bunch of significant improvements! I'll just repeat what I put in Twitter, here:
So first, this image should really help visualize what is going on under the hood. Notice the smallified square image in the center.
That small square center image is what the deep learning generator actually generates now. Before, I was just shrinking the images while keeping the same aspect ratio. It turns out the model does better with squares, even if they're distorted in the process!
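As a rough illustration of the square rendering, here's a minimal NumPy sketch. The helper names (resize_nearest, to_square) and the nearest-neighbour resize are my own stand-ins for illustration, not the project's actual code. The only point is that the image gets squashed into a square for the model, distortion and all, and stretched back afterwards.

```python
import numpy as np

def resize_nearest(image, height, width):
    """Nearest-neighbour resize of an (H, W, C) array to (height, width, C).

    Aspect ratio is deliberately ignored, so the result may be distorted.
    """
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[rows][:, cols]

def to_square(image, side):
    """Squash an image of any aspect ratio into a side x side square."""
    return resize_nearest(image, side, side)

# Stand-in for a 300x400 source photo (pixel values are arbitrary).
original = np.random.default_rng(0).random((300, 400, 3))

square = to_square(original, 272)            # what the generator would see
restored = resize_nearest(square, 300, 400)  # stretched back to the source shape
```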
Note that I tried other things like keeping the core image's aspect ratio the same and doing various types of padding to make a square (reflect, symmetric, 0, etc). None of this worked as well. Two reasons why I think this works.
It turns out that the human eye doesn't perceive color (chrominance) with nearly as much sensitivity as it does intensity (luminance). Hence, we can render the color part at much lower resolution compared to the desired target res.
Before, I was having the model render the image at the same size as the end result image that you saw. So you maxed out around 550px (maybe) because the GPU couldn't handle any more. Now? Colors can be rendered at, say, a tiny 272x272 (as in the image above), then the color part of the model output is simply resized and stretched to map over the much higher resolution original image's luminance portion (we already have that!). So the end result looks fantastic, because your eyes can't tell the difference with the color anyway!
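Here's a minimal sketch of that luminance/chrominance trick, assuming a YUV color space with the standard BT.601 matrix and a nearest-neighbour upscale. Both are assumptions I'm using for illustration, and merge_color and upscale_nearest are hypothetical names rather than functions from the project. It keeps the Y (luminance) channel of the full-resolution original and takes only U and V (color) from the small render.

```python
import numpy as np

# BT.601 RGB -> YUV matrix (an assumed choice of luminance/chrominance space).
RGB_TO_YUV = np.array([
    [ 0.299,    0.587,    0.114  ],
    [-0.14713, -0.28886,  0.436  ],
    [ 0.615,   -0.51499, -0.10001],
])
YUV_TO_RGB = np.linalg.inv(RGB_TO_YUV)

def upscale_nearest(image, height, width):
    """Nearest-neighbour resize of an (H, W, C) array."""
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[rows][:, cols]

def merge_color(original_rgb, color_render_rgb):
    """Keep the original's luminance, borrow chrominance from the small render."""
    h, w = original_rgb.shape[:2]
    color_up = upscale_nearest(color_render_rgb, h, w)
    orig_yuv = original_rgb @ RGB_TO_YUV.T
    color_yuv = color_up @ RGB_TO_YUV.T
    # Y from the high-res original, U and V from the stretched color render.
    merged = np.concatenate([orig_yuv[..., :1], color_yuv[..., 1:]], axis=-1)
    return np.clip(merged @ YUV_TO_RGB.T, 0.0, 1.0)

# High-res grayscale "original" (8x8) and a tiny 2x2 color render.
gray = np.linspace(0.3, 0.7, 64).reshape(8, 8)
original = np.stack([gray] * 3, axis=-1)
color_render = np.array([
    [[0.6, 0.5, 0.4], [0.4, 0.5, 0.6]],
    [[0.5, 0.6, 0.4], [0.5, 0.4, 0.6]],
])

result = merge_color(original, color_render)
```

The result has the full 8x8 resolution and the original's luminance, with only four distinct color tints spread across it.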
With the above, we're now able to generate much more consistently good looking images, even at different GPU color rendering sizes. Basically, you do generally get a better image if you have the model take up more memory with a bigger render. BUT if you cut that memory footprint even in half by having the model render a smaller image, the difference in image quality of the end result is often pretty negligible. This effectively means the colorization is usable on a wide variety of machines now!
i.e. You don't need a GeForce 1080TI to do it anymore. You can get by with much less.
Finally, with the above, I was able to narrow down a scheme that makes the hunt for the best version of what the model can render a lot less tedious. Basically, it amounts to the user providing a render_factor (int), which gets multiplied by a base size multiplier of 16. This, combined with the square rendering, plays well together. It means that you get predictable behavior of rendering as you increase and decrease render_factor, without too many surprise glitches.
Increase render_factor: Get more details right. Decrease: Still looks good but might miss some details. Simple! So you're no longer going to deal with a clumsy sz factor. Bonus: The memory usage is consistent and predictable so you just have to figure out the render_factor that works for your gpu once and forget about it. I'll probably try to make that render_factor determination automatic eventually but this should be a big improvement in the meantime.
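The arithmetic behind render_factor is simple enough to sketch. The multiplier of 16 comes from the description above; the function names are hypothetical, and treating pixel count as a rough proxy for how much work the GPU has to do is my assumption, not a measured result.

```python
BASE_SIZE_MULTIPLIER = 16  # base size multiplier described in the text

def render_size(render_factor: int) -> int:
    """Side length, in pixels, of the square the model renders."""
    if render_factor < 1:
        raise ValueError("render_factor must be a positive integer")
    return render_factor * BASE_SIZE_MULTIPLIER

def relative_pixels(render_factor: int, reference_factor: int) -> float:
    """Rendered pixel count relative to another render_factor."""
    return (render_size(render_factor) / render_size(reference_factor)) ** 2

print(render_size(17))          # 272, the square size mentioned earlier
print(render_size(42))          # 672
print(relative_pixels(21, 42))  # 0.25
```

So halving render_factor renders a quarter as many pixels, which is why dropping it a few notches is usually enough to get past a memory error.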
You're not losing any image anymore with padding issues. That's solved as a byproduct.
I added a new generic filter interface that replaces the visualizer dealing with models directly. The visualizer loops through these filters that you provide as a list. They don't have to be backed by deep learning models- they can be any image modification you want!
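To show the general shape of the idea, here's a minimal sketch of a filter list. The ImageFilter protocol, its single-argument filter method, and the example filters are all hypothetical and are not the project's actual interface. The sketch only illustrates looping an image through a list of filters, none of which has to involve a model.

```python
from typing import Protocol, Sequence

import numpy as np

class ImageFilter(Protocol):
    """Hypothetical interface: anything with a filter(image) -> image method."""
    def filter(self, image: np.ndarray) -> np.ndarray: ...

class Brighten:
    """Plain image tweak, no deep learning involved."""
    def __init__(self, amount: float):
        self.amount = amount

    def filter(self, image: np.ndarray) -> np.ndarray:
        return np.clip(image + self.amount, 0.0, 1.0)

class Invert:
    """Another model-free filter: flip every pixel value."""
    def filter(self, image: np.ndarray) -> np.ndarray:
        return 1.0 - image

def apply_filters(image: np.ndarray, filters: Sequence[ImageFilter]) -> np.ndarray:
    """Run the image through each filter in list order."""
    for f in filters:
        image = f.filter(image)
    return image

image = np.full((2, 2, 3), 0.25)
result = apply_filters(image, [Brighten(0.25), Invert()])
```

A model-backed colorizer would just be one more entry in that list, which is what keeps the visualizer from having to know about models at all.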