The calculation of information gain is the most difficult part of this algorithm. ID3 performs a search in which the search states are decision trees and the operator adds a node to an existing tree. It uses information gain to decide which attribute to put in each node, and performs a greedy search using this measure of worth. The algorithm goes as follows. Given a set of examples, S, categorised in categories c_i:
1. Choose the root node to be the attribute, A, that scores the highest for information gain relative to S.
2. For each value v that A can possibly take, draw a branch from the node.
3. For each branch from A corresponding to value v, calculate S_v, the subset of examples in S for which A takes the value v. The same procedure is then applied along that branch, with S_v in place of S.
The algorithm terminates either when the decision tree perfectly classifies the examples or when all the attributes have been exhausted.
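
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function names (`entropy`, `information_gain`, `majority`, `id3`), the dictionary-based example format, and the toy weather data are all assumptions made for the sketch. The text does not say what to do when S_v is empty or when the attributes run out before the examples are perfectly classified; the sketch assumes the common choice of falling back to the majority category.

```python
import math
from collections import Counter


def entropy(examples, target):
    """Shannon entropy (in bits) of the category labels in `examples`."""
    counts = Counter(ex[target] for ex in examples)
    total = len(examples)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(examples, attribute, target):
    """Entropy of S minus the size-weighted entropy of each subset S_v."""
    total = len(examples)
    remainder = 0.0
    for v in {ex[attribute] for ex in examples}:
        s_v = [ex for ex in examples if ex[attribute] == v]
        remainder += (len(s_v) / total) * entropy(s_v, target)
    return entropy(examples, target) - remainder


def majority(examples, target):
    """Most common category in `examples` (assumed fallback, see lead-in)."""
    return Counter(ex[target] for ex in examples).most_common(1)[0][0]


def id3(examples, attributes, target, values):
    """Return a leaf category, or a tuple (attribute, {value: subtree})."""
    categories = {ex[target] for ex in examples}
    if len(categories) == 1:
        # The examples are perfectly classified: stop with a leaf.
        return categories.pop()
    if not attributes:
        # Attributes exhausted: stop with the majority category (assumption).
        return majority(examples, target)
    # Step 1: greedily pick the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    remaining = [a for a in attributes if a != best]
    branches = {}
    # Step 2: one branch for each value the chosen attribute can take.
    for v in values[best]:
        # Step 3: S_v is the subset of examples where `best` has value v.
        s_v = [ex for ex in examples if ex[best] == v]
        if not s_v:
            # Empty S_v: use the majority category of S (assumption).
            branches[v] = majority(examples, target)
        else:
            branches[v] = id3(s_v, remaining, target, values)
    return (best, branches)


# Hypothetical toy data: should we play, given the outlook and the wind?
examples = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
values = {"outlook": ["sunny", "rainy"], "windy": ["no", "yes"]}
tree = id3(examples, ["outlook", "windy"], "play", values)
print(tree)
```

On this toy data, "outlook" has the higher information gain relative to S, so it becomes the root. The "sunny" branch is already pure and ends in a leaf, while the "rainy" branch is split again on "windy", with gain measured relative to S_v rather than S.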