Activation Functions
You can think of the activation function as the network's decision-maker at each node. After computing the weighted sum of its inputs, a neuron asks: "How much of this signal should I pass forward?" The activation function answers that question.
Without activation functions, no matter how many layers you stack, your network would still be computing a linear function. A linear function of a linear function is still linear. Activation functions introduce the non-linearity that makes deep networks capable of approximating complex patterns.
Current output values
Toggle functions on and off, then drag the slider to see how each activation responds to different input values.
You are building a classifier that identifies one of ten different manufacturing defect types from sensor readings. Which activation function should you use in the output layer?