Chapter 10 - Introduction to Artificial Neural Networks with Keras
Responsible for the session: Frida Heskebeck
Chapter summary
Artificial neural networks are inspired by the structure of the brain. If the input to a neuron is above a threshold, the neuron fires (sends an output).
Neurons
Each input to the neuron is weighted and the weighted sum is calculated. The output of the neuron depends on the activation function; in the case of perceptrons it is a step function. Instead of a step function we can use other activation functions (see below). When the activation function has a well-defined, nonzero derivative, we can use backpropagation to calculate the weight updates in the learning step (see below). There are also bias neurons, which always output a constant value.
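The perceptron computation described above can be sketched in a few lines of numpy (a minimal illustration; the weights and inputs below are made up, not from the book):

```python
import numpy as np

def perceptron(x, w, b):
    """Weighted sum of inputs plus bias, passed through a step function."""
    z = np.dot(w, x) + b          # weighted sum; b plays the role of the bias neuron
    return 1 if z >= 0 else 0     # step activation: fire if above the threshold

# Example: a perceptron implementing logical AND
w = np.array([1.0, 1.0])
b = -1.5
print(perceptron(np.array([1, 1]), w, b))  # 1 (fires)
print(perceptron(np.array([1, 0]), w, b))  # 0 (does not fire)
```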
Activation functions
Here are some different activation functions.
The general tip is:
| Activation | When to use |
| --- | --- |
| ReLU | Hidden layers |
| Logistic (sigmoid) | Binary classification (a single output neuron), or multilabel classification (one output neuron per class) |
| Softmax | Multiclass classification (applied jointly across all output neurons) |
| None | Regression |
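The activation functions in the table can be sketched directly in numpy (a minimal illustration of their behavior, not the Keras implementations):

```python
import numpy as np

def relu(z):
    # Negative inputs are clipped to 0; positive inputs pass through.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squashes each value independently into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Applied jointly across all outputs; results sum to 1 (class probabilities).
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

z = np.array([2.0, -1.0, 0.5])
print(relu(z))
print(sigmoid(z))
print(softmax(z))
```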
Layers
| Layer | Number of neurons |
| --- | --- |
| Input | As many neurons as there are features |
| Hidden | A few hidden layers are often enough. Generally more neurons in the earlier layers (close to the input) than in the later layers (close to the output) |
| Output | As many neurons as the classification/regression task needs (e.g. one per class for multiclass classification, one per target for regression) |
Backpropagation
One form of automatic differentiation (reverse mode).
Forward pass:
Make a prediction and calculate the error (using the loss function).
Backward pass:
Calculate the derivative in every node with respect to the weights; the algorithm uses the chain rule for this. The derivative says how to change the weights to reduce the error. The algorithm then adjusts the weights using the chosen optimizer (often stochastic gradient descent).
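The forward and backward passes can be made concrete on a single sigmoid neuron with squared-error loss (a toy sketch with made-up data, applying the chain rule by hand instead of a library):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, y = 2.0, 1.0          # one made-up training example
w, b = 0.1, 0.0          # initial weight and bias
lr = 0.5                 # learning rate

for _ in range(100):
    # Forward pass: prediction and loss
    z = w * x + b
    y_hat = sigmoid(z)
    loss = (y_hat - y) ** 2
    # Backward pass: chain rule gives dloss/dw and dloss/db
    dloss_dyhat = 2 * (y_hat - y)
    dyhat_dz = y_hat * (1 - y_hat)   # derivative of the sigmoid
    dw = dloss_dyhat * dyhat_dz * x  # dz/dw = x
    db = dloss_dyhat * dyhat_dz      # dz/db = 1
    # Gradient descent step (the "optimizer")
    w -= lr * dw
    b -= lr * db

print(loss)  # the loss shrinks toward 0 as training proceeds
```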
Keras
| Step | Description |
| --- | --- |
| Create the model | Create the layers; the model can use the Sequential or the Functional API |
| Compile the model | Specify the loss, metrics, and optimizer to use |
| Fit the model | Decide the number of epochs and any callbacks |
| Evaluate the model | Test the model on the test data |
If hyperparameter tuning is needed, wrap the Keras model as a scikit-learn estimator.
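The four steps can be sketched end-to-end with tf.keras (a minimal sketch assuming TensorFlow is installed; the data here is random, made up only to exercise the API, so the accuracy is meaningless):

```python
import numpy as np
from tensorflow import keras

# Made-up toy data: 3-class classification with 4 features.
X_train = np.random.rand(100, 4).astype("float32")
y_train = np.random.randint(0, 3, size=100)

# 1. Create the model (Sequential API).
model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(4,)),  # hidden layer: ReLU
    keras.layers.Dense(3, activation="softmax"),                  # output: one neuron per class
])

# 2. Compile: specify loss, optimizer, and metrics.
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])

# 3. Fit: choose the number of epochs (and optionally callbacks).
history = model.fit(X_train, y_train, epochs=2, verbose=0)

# 4. Evaluate (here on the same toy data; normally on held-out test data).
loss, acc = model.evaluate(X_train, y_train, verbose=0)
```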
Hyper parameters
| Hyperparameter | Guidance |
| --- | --- |
| Number of layers | See above |
| Number of neurons | See above |
| Learning rate | Test different rates and look at the loss; choose one a bit lower than what seems optimal |
| Activation function | See above |
| Optimizer | More info in the next chapter |
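The learning-rate advice can be illustrated on a toy problem: run gradient descent on f(w) = w² with a few different rates and compare the final loss (a plain-Python sketch, not Keras):

```python
# Gradient descent on f(w) = w^2 (minimum at w = 0) with different learning rates.
def final_loss(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w     # the gradient of w^2 is 2w
    return w ** 2

for lr in [0.01, 0.1, 0.5, 1.1]:
    print(f"lr={lr}: final loss {final_loss(lr):.4g}")
# A small rate converges slowly, a moderate rate converges fast,
# and a too-large rate (here 1.1) makes the loss diverge.
```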
Additional resources
- The course in Convex optimization for Machine learning given at the department; it contains a lecture about backpropagation.
- This video series about deep learning.
- A page with some short information about activation functions and when to use them.
- A page about backpropagation.
Session Agenda
The plan for the meeting:
- Go through summary of the chapter.
- Discussion if there is something unclear from the chapter.
Extra:
- How should we organize the course? Schedule, who is responsible for which sessions, and so on.
Recommended exercises
Task 1 - Play around with the network. Nice visual representation.
Tasks 4 + 8 - What is the secret behind ANNs?
Task 6 - What sizes do the different objects have?
Task 10 - Practice setting up an ANN with Python. The goal is not to train a perfect network, but to create the models and start training.