Example of weight training calculations:
Having calculated the error values associated with each unit, hidden and output, we can now translate this information into the weight changes Δ_{ij} between units i and j. The calculation is as follows: for the weight w_{ij} between input unit I_{i} and hidden unit H_{j}, we add on:

Δ_{ij} = η δH_{j} x_{i}
Keep in mind that x_{i} is the input to the i-th input node for example E, that η is a small value known as the learning rate, and that δH_{j} is the error value we calculated for hidden node H_{j} using the formula above.
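As a concrete sketch of this update (the inputs, error values, weights, and learning rate below are all hypothetical, and w[i][j] stands in for w_{ij}):

```python
# Sketch of the input-to-hidden weight update. All values are invented
# for illustration; delta_H is assumed to have been computed already
# using the error formula described above.
eta = 0.1                     # learning rate (small, so each example nudges weights gently)

x = [1.0, 0.5, -0.2]          # inputs x_i for example E (hypothetical)
delta_H = [0.05, -0.02]       # error values delta_H_j for the hidden nodes (hypothetical)

# w[i][j] is the weight between input unit I_i and hidden unit H_j
w = [[0.3, -0.1],
     [0.2, 0.4],
     [-0.5, 0.1]]

for i in range(len(x)):
    for j in range(len(delta_H)):
        # Delta_ij = eta * delta_H_j * x_i, added onto w_ij
        w[i][j] += eta * delta_H[j] * x[i]
```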
Similarly, for the weight w_{ij} between hidden unit H_{i} and output unit O_{j}, we add on:

Δ_{ij} = η δO_{j} h_{i}(E)
Here, remember that h_{i}(E) is the output from hidden node H_{i} when example E is propagated through the network, and that δO_{j} is the error value we calculated for output node O_{j} using the formula above.
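The hidden-to-output update takes the same shape, only with the hidden-node outputs in place of the raw inputs. A sketch, again with hypothetical values (v[i][j] stands in for w_{ij} between H_{i} and O_{j}):

```python
# Sketch of the hidden-to-output weight update. h holds the hidden-node
# outputs h_i(E) after propagating example E; delta_O holds the
# output-node error values. All numbers are invented for illustration.
eta = 0.1
h = [0.6, 0.9]               # hidden-node outputs h_i(E) (hypothetical)
delta_O = [0.08]             # error values delta_O_j for the output nodes (hypothetical)

# v[i][j] is the weight between hidden unit H_i and output unit O_j
v = [[0.7],
     [-0.3]]

for i in range(len(h)):
    for j in range(len(delta_O)):
        # Delta_ij = eta * delta_O_j * h_i(E), added onto v_ij
        v[i][j] += eta * delta_O[j] * h[i]
```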
Each alteration Δ is added to the relevant weight, and this concludes the calculation for example E. The next example is then used to tweak the weights further. As with perceptrons, the learning rate is needed to ensure that the weights are moved only a short distance for each example, so that the training for previous examples is not lost. The mathematical derivation of the above calculations is based on the derivative of σ, which we saw above. For a full description, see chapter 4 of Tom Mitchell's book "Machine Learning".
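The whole per-example loop can be sketched end to end for a tiny network. Everything below is hypothetical (the 2-input, 1-hidden, 1-output shape, the starting weights, the learning rate, and the two toy examples); the error formulas are the standard backpropagation ones based on the derivative of σ, i.e. σ'(x) = σ(x)(1 − σ(x)):

```python
import math

# Minimal sketch of training example by example, as described above.
# Network shape, weights, and training data are invented for illustration.

def sigma(x):
    return 1.0 / (1.0 + math.exp(-x))

eta = 0.5                  # learning rate
w_ih = [[0.5], [0.5]]      # weights between input units I_i and hidden unit H_1
w_ho = [[0.5]]             # weight between hidden unit H_1 and output unit O_1

# Toy (inputs, target) examples, used one after another to tweak the weights
examples = [([0.0, 1.0], 1.0), ([1.0, 0.0], 0.0)]

for epoch in range(200):
    for x, t in examples:
        # Forward pass: propagate example E through the network
        h = [sigma(x[0] * w_ih[0][0] + x[1] * w_ih[1][0])]
        o = [sigma(h[0] * w_ho[0][0])]

        # Output-node error: delta_O_j = o_j (1 - o_j)(t_j - o_j)
        delta_O = [o[0] * (1 - o[0]) * (t - o[0])]

        # Hidden-node error: delta_H_i = h_i (1 - h_i) * sum_j w_ij delta_O_j
        delta_H = [h[0] * (1 - h[0]) * w_ho[0][0] * delta_O[0]]

        # Weight changes Delta, added on for this example before moving on
        w_ho[0][0] += eta * delta_O[0] * h[0]
        for i in range(2):
            w_ih[i][0] += eta * delta_H[0] * x[i]

def predict(x):
    h0 = sigma(x[0] * w_ih[0][0] + x[1] * w_ih[1][0])
    return sigma(h0 * w_ho[0][0])

p_hi = predict([0.0, 1.0])   # trained toward target 1.0
p_lo = predict([1.0, 0.0])   # trained toward target 0.0
```

Because each example only nudges the weights by a small amount (scaled by η), repeatedly cycling through the examples moves the two predictions apart without wiping out what earlier examples taught the network.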