Multi-Layer Network Architectures:
As we saw in the previous lecture, perceptrons are limited in the kinds of concepts they can learn: they can only learn linearly separable functions. However, we can construct larger networks by combining perceptrons, and such larger networks built from step-function (perceptron) units are called multi-layer networks.
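To illustrate why multi-layer networks are worth building, here is a minimal sketch (not from the lecture itself) of a two-layer network of step-function units computing XOR, a function no single perceptron can learn because it is not linearly separable. The weights and biases are hand-chosen for illustration, not learned:

```python
def step(x):
    # Perceptron threshold: output 1 when the weighted input exceeds 0
    return 1 if x > 0 else 0

def perceptron(weights, bias, inputs):
    # A single step-function unit
    return step(sum(w * i for w, i in zip(weights, inputs)) + bias)

def xor(x1, x2):
    # Hidden layer: one unit computes OR, another computes NAND
    h_or = perceptron([1, 1], -0.5, [x1, x2])
    h_nand = perceptron([-1, -1], 1.5, [x1, x2])
    # Output unit computes AND of the hidden activations: OR AND NAND = XOR
    return perceptron([1, 1], -1.5, [h_or, h_nand])
```

The hidden layer carves the input space with two linear boundaries, and the output unit combines them, which is exactly what a single perceptron cannot do.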
As with individual perceptrons, multi-layer networks can be used for learning tasks. However, the learning algorithm we will look at, the backpropagation routine, is derived mathematically using differential calculus. The derivation relies on having a differentiable threshold function, which effectively rules out perceptron units: the step function used in perceptrons is not continuous, and hence not differentiable. An alternative unit was therefore chosen, one with properties similar to the step function but which is differentiable.
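The alternative unit is not named here, but the standard differentiable replacement for the step function in backpropagation is the sigmoid (logistic) function; a minimal sketch, assuming that choice:

```python
import math

def sigmoid(x):
    # Smooth, S-shaped curve: close to 0 for large negative x,
    # close to 1 for large positive x, like a softened step function
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    # The derivative has the convenient closed form s * (1 - s),
    # which is what makes the backpropagation derivation tractable
    s = sigmoid(x)
    return s * (1.0 - s)
```

Unlike the step function, this unit has a well-defined gradient everywhere, which is exactly the property the calculus-based derivation of backpropagation requires.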