An example of the weight training calculations:
Having calculated the error values associated with each unit (hidden and output), we can now translate this information into the weight changes Δij between units i and j. The calculation is as follows: for weights wij between input unit Ii and hidden unit Hj, we add on

Δij = η δHj xi
Remember that xi is the input to the i-th input node for example E, that η is a small value known as the learning rate, and that δHj is the error value we calculated for hidden node Hj using the formula above.
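As a concrete sketch, the input-to-hidden weight changes could be computed as below; the values of η, δHj and the inputs xi are made-up illustrative numbers, not taken from the text.

```python
# Illustrative values only (assumed, not from the text):
eta = 0.1                # learning rate (eta)
delta_Hj = 0.2           # error value for hidden node Hj
x = [1.0, 0.5, -1.0]     # inputs x_i for example E

# Weight change Delta_ij = eta * delta_Hj * x_i for each input node i
weight_changes = [eta * delta_Hj * xi for xi in x]
print(weight_changes)
```

Note how each weight change scales with both the error at the hidden node and the input flowing along that weight.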
Similarly, for weights wij between hidden unit Hi and output unit Oj, we add on

Δij = η δOj hi(E)
Here, remember that hi(E) is the output from hidden node Hi when example E is propagated through the network, and that δOj is the error value we calculated for output node Oj using the formula above.
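The hidden-to-output weight changes follow the same pattern, with the hidden-node outputs hi(E) playing the role of the inputs; again the numbers below are assumed purely for illustration.

```python
# Illustrative values only (assumed):
eta = 0.1                 # learning rate (eta)
delta_Oj = -0.15          # error value for output node Oj
h = [0.6, 0.3]            # hidden-node outputs h_i(E) for example E

# Weight change Delta_ij = eta * delta_Oj * h_i(E) for each hidden node i
weight_changes = [eta * delta_Oj * hi for hi in h]
print(weight_changes)
```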
Finally, each alteration Δ is added to the relevant weight, and this concludes the calculation for example E. The next example is then used to tweak the weights further. As with perceptrons, the learning rate is needed to ensure that the weights are moved only a short distance for each example, so that the training from previous examples is not lost. The mathematical derivation of the above calculations is based on the derivative of σ, which we saw above. For a full description, see chapter 4 of Tom Mitchell's book "Machine Learning".
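Putting the two update rules together, a minimal sketch of applying one example's weight changes to a small network might look like this. The network sizes and all numeric values are assumptions chosen for illustration, and the error values δ are taken as given rather than computed here.

```python
def apply_weight_changes(w_ih, w_ho, x, h, delta_H, delta_O, eta):
    """Add Delta_ij = eta * delta_j * (input along weight ij) to every weight.

    w_ih[i][j]: weight from input node Ii to hidden node Hj
    w_ho[i][j]: weight from hidden node Hi to output node Oj
    (All shapes and values here are illustrative assumptions.)
    """
    # Input-to-hidden weights: Delta_ij = eta * delta_Hj * x_i
    for i in range(len(x)):
        for j in range(len(delta_H)):
            w_ih[i][j] += eta * delta_H[j] * x[i]
    # Hidden-to-output weights: Delta_ij = eta * delta_Oj * h_i(E)
    for i in range(len(h)):
        for j in range(len(delta_O)):
            w_ho[i][j] += eta * delta_O[j] * h[i]

# Tiny 2-input, 2-hidden, 1-output network with made-up values:
w_ih = [[0.5, -0.5], [0.3, 0.3]]
w_ho = [[0.2], [-0.4]]
apply_weight_changes(w_ih, w_ho,
                     x=[1.0, 0.0], h=[0.7, 0.4],
                     delta_H=[0.1, -0.2], delta_O=[0.25], eta=0.1)
print(w_ih, w_ho)
```

Because x[1] is 0.0 in this sketch, the weights leaving the second input node are left unchanged, which illustrates that a weight only moves when some signal actually flowed along it for the example.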