Chapter 10 - Introduction to Artificial Neural Networks with Keras
Responsible for the session: Frida Heskebeck
Chapter summary
Artificial neural networks are inspired by the structure of the brain. If the input to a neuron is above a threshold, the neuron fires (sends an output).
Neurons
Each input to the neuron is weighted and the weighted sum is calculated. The output of the neuron depends on the activation function; in the case of perceptrons it is a step function. Instead of a step function we can use other activation functions (see below). When the activation function has a well-defined, nonzero derivative, we can use backpropagation to calculate the weight updates in the learning step (see below). There are also bias neurons, which always output a constant value.
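The perceptron computation described above can be sketched in a few lines of numpy (a minimal illustration; the weights and inputs below are made up, not from the book):

```python
import numpy as np

def perceptron(x, w, b):
    """Weighted sum of inputs plus bias, passed through a step function."""
    z = np.dot(w, x) + b          # weighted sum; b plays the role of the bias neuron
    return 1 if z >= 0 else 0     # step activation: fire if above the threshold

# Example: a perceptron implementing logical AND
w = np.array([1.0, 1.0])
b = -1.5
print(perceptron(np.array([1, 1]), w, b))  # 1 (fires)
print(perceptron(np.array([1, 0]), w, b))  # 0 (does not fire)
```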
Activation functions
Here are some different activation functions.
The general tip is:
| Activation | When to use |
| --- | --- |
| ReLU | Hidden layers |
| Logistic (sigmoid) | Binary classification (a single output neuron), or multilabel classification (one output neuron per class) |
| Softmax | Multiclass classification (applied jointly across all output neurons) |
| None | Regression |
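The activation functions in the table can be sketched directly in numpy (a minimal illustration of their behavior, not the Keras implementations):

```python
import numpy as np

def relu(z):
    # Negative inputs are clipped to 0; positive inputs pass through.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squashes each value independently into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Applied jointly across all outputs; results sum to 1 (class probabilities).
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

z = np.array([2.0, -1.0, 0.5])
print(relu(z))
print(sigmoid(z))
print(softmax(z))
```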
Layers
| Layer | Number of neurons |
| --- | --- |
| Input | As many neurons as there are features |
| Hidden | A few hidden layers are often enough. Generally more neurons in the earlier layers (close to the input) than in the later layers (close to the output) |
| Output | As many neurons as the classification/regression task needs (e.g. one per class for multiclass classification, one per target for regression) |
Backpropagation
One form of automatic differentiation (reverse mode).
Forward pass:
Make a prediction and calculate the error (using the loss function).
Backward pass:
Calculate the derivative in every node with respect to the weights; the algorithm uses the chain rule for this. The derivative says how to change the weights to reduce the error. The algorithm then adjusts the weights using the chosen optimizer (often stochastic gradient descent).
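The forward and backward passes can be made concrete on a single sigmoid neuron with squared-error loss (a toy sketch with made-up data, applying the chain rule by hand instead of a library):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, y = 2.0, 1.0          # one made-up training example
w, b = 0.1, 0.0          # initial weight and bias
lr = 0.5                 # learning rate

for _ in range(100):
    # Forward pass: prediction and loss
    z = w * x + b
    y_hat = sigmoid(z)
    loss = (y_hat - y) ** 2
    # Backward pass: chain rule gives dloss/dw and dloss/db
    dloss_dyhat = 2 * (y_hat - y)
    dyhat_dz = y_hat * (1 - y_hat)   # derivative of the sigmoid
    dw = dloss_dyhat * dyhat_dz * x  # dz/dw = x
    db = dloss_dyhat * dyhat_dz      # dz/db = 1
    # Gradient descent step (the "optimizer")
    w -= lr * dw
    b -= lr * db

print(loss)  # the loss shrinks toward 0 as training proceeds
```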
Keras
| Step | Description |
| --- | --- |
| Create the model | Create the layers; the model can use the Sequential or the Functional API |
| Compile the model | Specify the loss, metrics, and optimizer to use |
| Fit the model | Decide the number of epochs and any callbacks |
| Evaluate the model | Test the model on the test data |
If hyperparameter tuning is needed, wrap the Keras model as a scikit-learn estimator.
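The four steps can be sketched end-to-end with tf.keras (a minimal sketch assuming TensorFlow is installed; the data here is random, made up only to exercise the API, so the accuracy is meaningless):

```python
import numpy as np
from tensorflow import keras

# Made-up toy data: 3-class classification with 4 features.
X_train = np.random.rand(100, 4).astype("float32")
y_train = np.random.randint(0, 3, size=100)

# 1. Create the model (Sequential API).
model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(4,)),  # hidden layer: ReLU
    keras.layers.Dense(3, activation="softmax"),                  # output: one neuron per class
])

# 2. Compile: specify loss, optimizer, and metrics.
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])

# 3. Fit: choose the number of epochs (and optionally callbacks).
history = model.fit(X_train, y_train, epochs=2, verbose=0)

# 4. Evaluate (here on the same toy data; normally on held-out test data).
loss, acc = model.evaluate(X_train, y_train, verbose=0)
```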
Hyper parameters
| Hyperparameter | Guidance |
| --- | --- |
| Number of layers | See above |
| Number of neurons | See above |
| Learning rate | Test different rates and look at the loss; choose one a bit lower than what seems optimal |
| Activation function | See above |
| Optimizer | More info in the next chapter |
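The learning-rate advice can be illustrated on a toy problem: run gradient descent on f(w) = w² with a few different rates and compare the final loss (a plain-Python sketch, not Keras):

```python
# Gradient descent on f(w) = w^2 (minimum at w = 0) with different learning rates.
def final_loss(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w     # the gradient of w^2 is 2w
    return w ** 2

for lr in [0.01, 0.1, 0.5, 1.1]:
    print(f"lr={lr}: final loss {final_loss(lr):.4g}")
# A small rate converges slowly, a moderate rate converges fast,
# and a too-large rate (here 1.1) makes the loss diverge.
```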
Additional resources
- The course in Convex optimization for Machine learning given at the department; it contains a lecture about backpropagation.
- This video series about deep learning.
- A page with some short information about activation functions and when to use them.
- A page about backpropagation.
Session Agenda
The plan for the meeting:
- Go through summary of the chapter.
- Discussion if there is something unclear from the chapter.
Extra:
- How should we organize the course? Schedule, who is responsible for which sessions, and so on.
Recommended exercises
Task 1 - Play around with the network. Nice visual representation.
Tasks 4 + 8 - What is the secret behind ANNs?
Task 6 - What sizes do the different objects have?
Task 10 - Practice setting up an ANN with Python. The goal is not to train a perfect network, but to create the models and start training.