Weight Training Calculations:
A multi-layer network has many more weights than a perceptron, so we first need some notation: w_{ij} denotes the weight between unit i and unit j. As with perceptrons, we will calculate a value Δ_{ij} to add to each weight in the network after an example has been tried. To calculate the weight changes for a particular example E, we first write down how the network should perform on E, i.e., the target values t_{i}(E) that each output unit O_{i} should produce for E. Note that, for categorisation problems, t_{i}(E) will be zero for all output units except the one associated with the correct categorisation of E; for that unit, t_{i}(E) will be 1.
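As a small illustration, the one-hot target vector for a categorisation problem can be built directly from the correct category index. The number of categories and the chosen index here are hypothetical values, just to make the idea concrete:

```python
# Targets t_i(E) for a categorisation problem with four output units,
# where the correct category of example E is category 2 (0-indexed).
num_categories = 4
correct = 2
t = [1.0 if i == correct else 0.0 for i in range(num_categories)]
print(t)  # [0.0, 0.0, 1.0, 0.0]
```

Every output unit's target is 0 except the one matching E's true category, which is 1.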
Next, E is propagated through the network and we record the observed values o_{i}(E) for the output nodes O_{i}; at the same time, we record the observed values h_{i}(E) for the hidden nodes. Then, for each output unit O_{k}, we calculate its error term as follows:

δ_{Ok} = o_{k}(E)(1 - o_{k}(E))(t_{k}(E) - o_{k}(E))
The error terms from the output units are required to calculate the error terms for the hidden units; the method gets its name because we propagate this information backwards through the network. For each hidden unit H_{k}, we calculate its error term as follows:

δ_{Hk} = h_{k}(E)(1 - h_{k}(E)) Σ_{i} w_{ki} δ_{Oi}

where the sum ranges over the output units O_{i} to which H_{k} is connected.
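The calculations above can be sketched for a single example on a tiny network. This is a minimal illustration, not the full training algorithm: the 2-2-2 network shape, the weight values, the input x, and the sigmoid activation are all assumptions made for the sake of the example.

```python
import math

def sigmoid(x):
    # Assumed activation function for hidden and output units
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical 2-2-2 network.
# w_hidden[i][j]: weight from input unit i to hidden unit H_j
# w_output[j][k]: weight from hidden unit H_j to output unit O_k
w_hidden = [[0.1, 0.4], [0.8, 0.6]]
w_output = [[0.3, 0.9], [0.5, 0.2]]

x = [0.35, 0.9]   # example E's input values (assumed)
t = [1.0, 0.0]    # target values t_k(E): one-hot for categorisation

# Forward pass: record h_j(E) and o_k(E)
h = [sigmoid(sum(x[i] * w_hidden[i][j] for i in range(2))) for j in range(2)]
o = [sigmoid(sum(h[j] * w_output[j][k] for j in range(2))) for k in range(2)]

# Error terms for the output units:
#   delta_O[k] = o_k(E) (1 - o_k(E)) (t_k(E) - o_k(E))
delta_O = [o[k] * (1 - o[k]) * (t[k] - o[k]) for k in range(2)]

# Error terms for the hidden units, propagated backwards:
#   delta_H[j] = h_j(E) (1 - h_j(E)) * sum_k w_jk * delta_O[k]
delta_H = [h[j] * (1 - h[j]) * sum(w_output[j][k] * delta_O[k] for k in range(2))
           for j in range(2)]

print("output error terms:", delta_O)
print("hidden error terms:", delta_H)
```

Note that the error term for the first output unit is positive (its target is 1, so the observed value falls short) while the second is negative, and these signs flow back into the hidden-unit terms through the weights.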