Avoiding Local Minima:
The error rate of a multi-layer network over a training set could be calculated simply as the number of misclassified examples. However, keep in mind that there are many output nodes, any of which could potentially misfire (e.g., giving a value close to 1 when it should have output 0, or vice versa), so we can be more sophisticated in our error evaluation. In practice, the overall network error is calculated as a sum of squared errors over all output units and all examples.
This is not as complicated as it may first appear. The calculation simply involves working out the difference between the observed output and the target output for each output unit, squaring it to make sure it is positive, and then adding up these squared differences over every output unit and every example.
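The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name network_error and the sample target and output values are assumptions made for the example, not part of the original text. Some presentations also multiply the sum by a constant such as 1/2; that factor is left out here because the description above does not mention it.

```python
def network_error(targets, outputs):
    """Sum of squared differences between target and observed outputs.

    targets and outputs are lists of examples; each example is a list
    holding one value per output unit. (Hypothetical helper for illustration.)
    """
    total = 0.0
    for target_row, output_row in zip(targets, outputs):
        for target, observed in zip(target_row, output_row):
            # Squaring makes each difference non-negative.
            total += (target - observed) ** 2
    return total


# Two made-up training examples, each with two output units.
targets = [[1.0, 0.0], [0.0, 1.0]]
outputs = [[0.5, 0.5], [0.0, 1.0]]

print(network_error(targets, outputs))  # 0.25 + 0.25 + 0.0 + 0.0 = 0.5
```

Unlike a simple count of misclassified examples, this measure also reflects how far each output unit was from its target, so a network that is nearly right scores better than one that is badly wrong.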