What are the major differences between information gain and entropy when they are used to determine feature importance in classification?
Information gain is the amount of information gained about a random variable or signal by observing another random variable.
Entropy is the average rate at which information is produced by a stochastic source of data; equivalently, it is a measure of the uncertainty associated with a random variable.
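To make "uncertainty" concrete, here is a minimal Python sketch (the distributions are made up for illustration): a fair coin has maximal entropy, a heavily biased coin much less, and a certain outcome none at all.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H(X) = -sum_x p(x) * log2 p(x)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # fair coin        -> 1.0 bit (maximum uncertainty)
print(entropy([0.9, 0.1]))  # biased coin      -> ~0.47 bits
print(entropy([1.0]))       # certain outcome  -> 0 bits (no uncertainty)
```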
Information gain (IG) measures how much “information” a feature gives us about the class.
Entropy is a measure of impurity, disorder, or uncertainty in a set of examples. Entropy controls how a Decision Tree decides to split the data, and it directly affects how the Decision Tree draws its boundaries.
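As a sketch of how entropy drives a split (the node and the candidate split below are hypothetical), a tree compares the impurity of the parent node with the weighted impurity of the children; the candidate split that reduces impurity the most, i.e. has the highest information gain, is the one it chooses.

```python
from collections import Counter
import math

def label_entropy(labels):
    """Impurity of a set of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# Hypothetical parent node and one candidate split into two children.
parent = ['yes', 'yes', 'yes', 'yes', 'no', 'no', 'no', 'no']
left   = ['yes', 'yes', 'yes', 'yes']   # pure child -> entropy 0.0
right  = ['no', 'no', 'no', 'no']       # pure child -> entropy 0.0

n = len(parent)
weighted_children = ((len(left) / n) * label_entropy(left)
                     + (len(right) / n) * label_entropy(right))
gain = label_entropy(parent) - weighted_children
print(gain)  # 1.0 bit: this split removes all impurity, so the tree prefers it
```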
The difference between information gain (IG) and entropy can be understood from the definitions (and representations).
The entropy of a random variable X is written H(X) and quantifies the uncertainty about that variable. In other words, it tells us how many bits we need, on average, to represent X.
Information gain, on the other hand, involves two random variables, say X and Y: IG(Y|X) = H(Y) - H(Y|X). In other words, IG tells us how many fewer bits we need to describe Y once X is already known. So, in a sense, you reduce the uncertainty about one random variable with the additional information available from the other.
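A minimal sketch of that formula, using a made-up feature X and label Y: H(Y|X) is the expected entropy of Y within each group of identical X values, and the difference from H(Y) is the information gain.

```python
from collections import Counter, defaultdict
import math

def H(values):
    """Entropy of an empirical distribution, in bits."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

# Hypothetical paired observations of a feature X and a class label Y.
X = ['sunny', 'sunny', 'rain', 'rain', 'rain', 'sunny']
Y = ['no',    'no',    'yes',  'yes',  'no',   'no']

# H(Y|X) = sum over x of p(x) * H(Y | X = x)
groups = defaultdict(list)
for x, y in zip(X, Y):
    groups[x].append(y)
H_Y_given_X = sum((len(ys) / len(Y)) * H(ys) for ys in groups.values())

ig = H(Y) - H_Y_given_X  # bits of uncertainty about Y removed by knowing X
print(ig)                # ~0.46 bits for this toy data
```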