Chapter 17 - Representation, Generation using Autoencoders and GANs
Chapter Summary
In this chapter, we cover how to learn features and how to generate data. The purpose of learning features is to find a transformation of the data that is compressed and/or more robust (e.g. to noise and missing data).
Autoencoders
Compression
Compression can be a useful way to bring unsupervised pretraining on an unlabeled dataset into a supervised learning problem, say classification. Since labeling is costly, being able to make use of unlabeled data is a good thing.
An autoencoder is basically a stack of layers where the output layer has the same dimensionality as the input layer. We then do our best to learn the identity mapping, i.e. we penalize deviation from the input. Compression is achieved by introducing a bottleneck hidden layer (a layer with fewer active neurons than the input layer). To come close to the identity mapping, the network must learn to discard useless information contained in the input. There are two common approaches to achieving fewer active neurons:
1) Let the bottleneck layer have fewer neurons than the input layer
2) Achieve sparsity by including an appropriate term to the cost function.
The author recommends either adding a Kullback-Leibler divergence term (see Wikipedia) or an ℓ1 term (sometimes called lasso) to the cost function.
An ℓ2 term is best avoided, as it does not really induce sparsity.
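As a sketch of approach 2, the two penalty terms might be computed like this on a batch of bottleneck activations. The activation values, batch size, and sparsity target below are made up for illustration, and the weighting factor is a hyperparameter you would have to tune:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bottleneck activations for a batch of 32 inputs and 16
# neurons, squashed into (0, 1) by e.g. a sigmoid activation.
activations = rng.uniform(0.01, 0.30, size=(32, 16))

# Approach 2a: Kullback-Leibler divergence between a target sparsity rho
# and the observed mean activation of each neuron over the batch.
rho = 0.05                            # target: neurons active 5% of the time
rho_hat = activations.mean(axis=0)    # actual mean activation per neuron
kl = np.sum(rho * np.log(rho / rho_hat)
            + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

# Approach 2b: an l1 (lasso) penalty on the activations.
l1 = np.sum(np.abs(activations))

# Either penalty is added to the reconstruction loss with some weight:
#   cost = reconstruction_loss + weight * kl    (or weight * l1)
```

Note that the KL variant penalizes the *average* activation per neuron across the batch, while ℓ1 penalizes every individual activation.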
The left side of the bottleneck (including the bottleneck) is called the encoder, the right side is called the decoder. To solve a supervised task you first train the autoencoder on an unlabeled dataset, then replace the decoder with a smaller network that you train on the labeled data.
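A minimal numpy sketch of approach 1, an undercomplete autoencoder (the book's examples use Keras; the toy data, layer sizes, and learning rate here are made up, and the network is linear to keep the gradients short):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples in 8 dimensions that really live on a 2-D subspace.
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 8))

# Linear autoencoder: 8 -> 2 (bottleneck) -> 8.
W_enc = rng.normal(scale=0.1, size=(8, 2))   # encoder weights
W_dec = rng.normal(scale=0.1, size=(2, 8))   # decoder weights
lr = 0.05

def loss(X, W_enc, W_dec):
    # Penalize deviation of the output from the input (reconstruction MSE).
    return np.mean((X @ W_enc @ W_dec - X) ** 2)

initial = loss(X, W_enc, W_dec)
for _ in range(2000):
    codings = X @ W_enc         # encoder output (the bottleneck codings)
    recon = codings @ W_dec     # decoder output
    err = recon - X
    # Gradients of the mean squared reconstruction error
    grad_dec = codings.T @ err * (2 / X.size)
    grad_enc = X.T @ (err @ W_dec.T) * (2 / X.size)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

final = loss(X, W_enc, W_dec)   # reconstruction error drops as the
                                # bottleneck finds the 2-D subspace
```

After training, `X @ W_enc` plays the role of the learned features: for a supervised task you would keep `W_enc` and train a new head on the codings.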
Denoising, missing data
Conceptually, this is quite easy. Just add noise or dropout to the input layer and train as usual. Hopefully, you end up with outputs that look more like the noise-free input.
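A sketch of the two corruption schemes (a numpy stand-in; in Keras this is typically a GaussianNoise or Dropout layer on the input, and the data here is a placeholder):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))   # a clean training batch (placeholder data)

# Corrupt the inputs, keep the clean data as the training target.
noisy = X + rng.normal(scale=0.3, size=X.shape)   # additive Gaussian noise
mask = rng.binomial(1, 0.8, size=X.shape)         # keep each value w.p. 0.8
dropped = X * mask                                # dropout-style corruption

# Train as usual, but penalize deviation of the output from the CLEAN input:
#   loss = mean((autoencoder(noisy) - X) ** 2)
```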
Variational Autoencoders
Similar to the previous ideas, but the noise is added at the bottleneck layer rather than the input. The key idea is to let the encoder learn a mapping from input to Gaussian distributions, meaning it outputs mean values and variances. On each forward pass of the network, we sample from the resulting distribution. Noise is active both during training and during application. Hopefully, this gives us a feature space (codings) where codings that are close to each other (in ℓ2 norm) are decoded to similar outputs. We can thus generate new data by sampling in the space of codings.
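The sampling step is usually written via the "reparameterization trick", so gradients can flow through the mean and variance outputs. A numpy sketch with made-up encoder outputs:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for a batch of 4 inputs with 2-D codings.
mu = np.array([[0.0, 1.0], [2.0, -1.0], [0.5, 0.5], [-1.0, 0.0]])
log_var = np.full((4, 2), -2.0)   # predicting log-variance keeps sigma > 0

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I).
# The randomness is isolated in eps, so gradients can still flow back
# through mu and log_var.
eps = rng.normal(size=mu.shape)
z = mu + np.exp(log_var / 2) * eps   # one sample per forward pass

# The cost adds a KL term pulling each coding distribution towards N(0, I):
kl = -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1)

# To generate new data, sample codings from N(0, I) and run the decoder:
z_new = rng.normal(size=(1, 2))
```

The KL term is what makes the space of codings well-behaved enough that sampling from the prior produces sensible outputs.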
Architectures
You can really use any architecture we've gone through. The author mentions (and includes code for) some simple examples using
- Fully connected nets
- Convolutional nets
- Recurrent nets
GANs
Welcome to the wild west of saddle points and deepfakes. GANs relate to adversarial training, where you train a model to fool another model. Doing this, you can generate adversarial examples.
GANs were introduced in 2014 by Ian Goodfellow [3]. Wikipedia lists quite a lot of applications, including art generation, image enhancement, upscaling low-resolution videogames (Final Fantasy, Resident Evil and Max Payne remakes), and much more.
The idea is quite simple: pick a neural network (structured like the decoder of an autoencoder). It takes noise as input and outputs something of interest (say a picture). You let this generator compete against a classifier network called the discriminator. The task of the discriminator is to classify whether its input is real or fake; the task of the generator is to fool the discriminator.
One training step includes one update of the discriminator and one update of the generator. Make sure to freeze the one you aren't training.
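A toy numpy version of this loop (the book does the freezing in Keras by setting `discriminator.trainable = False` while training the generator; here it is implicit, since each step only updates one network's parameters). The 1-D data and all sizes below are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data: samples from N(3, 1).  The generator g(z) = a*z + b maps
# noise z ~ N(0, 1) onto the data space; the discriminator is a logistic
# classifier D(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0          # generator parameters
w, c = 0.0, 0.0          # discriminator parameters
lr, batch = 0.05, 64

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

for _ in range(2000):
    real = rng.normal(loc=3.0, size=batch)
    z = rng.normal(size=batch)
    fake = a * z + b

    # --- Discriminator step (generator frozen: a, b not touched) ---
    d_real = sigmoid(w * real + c)
    d_fake = sigmoid(w * fake + c)
    # Gradients of -mean(log D(real)) - mean(log(1 - D(fake)))
    grad_w = np.mean(-(1 - d_real) * real) + np.mean(d_fake * fake)
    grad_c = np.mean(-(1 - d_real)) + np.mean(d_fake)
    w -= lr * grad_w
    c -= lr * grad_c

    # --- Generator step (discriminator frozen: w, c not touched) ---
    d_fake = sigmoid(w * fake + c)
    # Gradients of the non-saturating loss -mean(log D(fake))
    grad_a = np.mean(-(1 - d_fake) * w * z)
    grad_b = np.mean(-(1 - d_fake) * w)
    a -= lr * grad_a
    b -= lr * grad_b

# After training, generated samples should cluster around the real mean (3).
samples = a * rng.normal(size=1000) + b
```

Even in this tiny setting you can watch the saddle-point nature of the game: the discriminator's loss goes back up as the generator catches on.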
Training issues
Mode Collapse - Possible hacks to help: Experience replay and minibatch discrimination
Stability - Possible hacks to help? pray and hope
Cool stuff we found on the interwebz
Latest and hottest from NVIDIA, StyleGAN2: https://arxiv.org/abs/1912.04958 (code on GitHub)
Coursera specialization from deeplearning.ai: https://www.coursera.org/specializations/generative-adversarial-networks-gans
Generated fake images of persons, created by StyleGAN2 by NVIDIA: https://thispersondoesnotexist.com/
Pex favorite readings: https://www.gwern.net/Faces
GANs in EEG signal reconstruction: https://www.frontiersin.org/articles/10.3389/fninf.2020.00015/full
Deep EEG supersampling: https://ieeexplore.ieee.org/document/8333379
Learning to Correspond Dynamical Systems: https://sites.google.com/view/l2cds
Ganhacks: https://github.com/soumith/ganhacks
Hands on - flower generation
Origin: https://www.robots.ox.ac.uk/~vgg/data/flowers/102/
Tensorflow dataset: https://www.tensorflow.org/datasets/catalog/oxford_flowers102
Will be shown during discussions
Exercise suggestions
Exercises 1,2,...,8 are about comprehending the chapter and I expect that you go over them briefly.
We will discuss:
Exercise 1 with a twist: How have autoencoders been used in your field of study? Alt. What are conceivable applications of autoencoders in your field of study?
Exercise 7 with a twist: How have GANs been used in your field of study? Alt. What are conceivable applications of GANs in your field of study?
Hands on:
Exercise 11: Copy/paste and steal as much code as you can, try to generate some fun images. Here are some suggested datasets to try out (consider downsampling for an easier, more manageable problem):
Generate animals: https://www.kaggle.com/alessiocorrado99/animals10
Flowers: https://www.tensorflow.org/datasets/catalog/oxford_flowers102
Generating Pokemon (small dataset): https://www.kaggle.com/kvpratama/pokemon-images-dataset
Creating dogs: https://www.kaggle.com/jessicali9530/stanford-dogs-dataset
[1] https://iq.opengenus.org/implementing-autoencoder-tensorflow/
[2] Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow, A. Géron, 2019
[3] https://arxiv.org/abs/1406.2661