Here the weights are initially assigned randomly, and training examples are used one after another to tweak the weights in the network. All the examples in the training set are used, and the whole process (using all the examples again) is repeated until every example is correctly categorised by the network. The tweaking is known as the perceptron training rule, and works as follows: if the training example, E, is correctly categorised by the network, then no tweaking is carried out. If E is mis-classified, then each weight is tweaked by adding on a small value, Δ. Suppose that we are trying to calculate the weight wi, which lies between the i-th input unit, xi, and the output unit.
Then, given that the network should have calculated the target value t(E) for example E, but actually calculated the observed value o(E), Δ is calculated as:
Δ = η (t(E) - o(E)) xi
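As a quick numeric check of this formula, the sketch below computes Δ for a single weight. The function name `delta` and the example values (η = 0.1, xi = 0.5) are illustrative assumptions, not values taken from the text.

```python
def delta(eta, target, observed, x_i):
    """Amount to add to weight wi: eta * (t(E) - o(E)) * xi."""
    return eta * (target - observed) * x_i

# Target +1 but observed -1: the weight on a positive input is raised.
raise_weight = delta(0.1, 1, -1, 0.5)

# Target -1 but observed +1: the weight on the same input is lowered.
lower_weight = delta(0.1, -1, 1, 0.5)

# Correctly categorised example: t(E) - o(E) is 0, so nothing changes.
no_change = delta(0.1, 1, 1, 0.5)

print(raise_weight, lower_weight, no_change)
```

The two mis-classified cases give values of equal size and opposite sign, and the correctly categorised case gives zero.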
Note that η is a fixed positive constant called the learning rate. Ignoring η briefly, we see that the value Δ that we add on to our weight wi is calculated by multiplying the input value xi by t(E) - o(E). Here t(E) - o(E) will be either +2 or -2, because perceptrons output only +1 or -1, and t(E) cannot be equal to o(E), otherwise we wouldn't be doing any tweaking. So we can think of t(E) - o(E) as a movement in a particular numerical direction, i.e., positive or negative. This direction is such that, if the overall sum, S, was too low to get over the threshold and produce the correct categorisation, then the contribution to S from wi * xi will be increased.
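The whole procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the threshold is fixed at 0, each example carries a constant first input of 1 so that one weight can act as an adjustable offset, the data set is the logical AND function encoded with +1 and -1, and the names `output` and `train`, the learning rate 0.1, the seed, and the cap on the number of passes are all choices made for the example.

```python
import random

def output(weights, inputs, threshold=0.0):
    # Weighted sum S; the unit outputs +1 if S exceeds the threshold, else -1.
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s > threshold else -1

def train(examples, eta=0.1, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    n_inputs = len(examples[0][0])
    # Weights are initially assigned randomly.
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in examples:
            observed = output(weights, inputs)
            if observed != target:
                mistakes += 1
                # Perceptron training rule: wi += eta * (t(E) - o(E)) * xi
                for i, x_i in enumerate(inputs):
                    weights[i] += eta * (target - observed) * x_i
        if mistakes == 0:
            # A full pass with no mis-classified example: training is done.
            break
    return weights

# Assumed toy data: AND over +1/-1 inputs, with a constant first input of 1.
examples = [
    ((1, -1, -1), -1),
    ((1, -1, 1), -1),
    ((1, 1, -1), -1),
    ((1, 1, 1), 1),
]

weights = train(examples)
all_correct = all(output(weights, x) == t for x, t in examples)
print(weights, all_correct)
```

The cap on the number of passes is there because the loop is only guaranteed to stop when some set of weights categorises every example correctly, which holds for this toy data set but not for arbitrary data.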