Which programming language/library/package are you using? J48 performs pruning after growing the full tree, so overfitting should not be that large a source of error.
In a J48 decision tree, overfitting happens when the algorithm is given training data with exceptional attribute values. This causes many fragmentations in the class distribution: statistically insignificant nodes that cover very few examples are known as fragmentations. By default, J48 grows its branches "just deep enough to perfectly classify the training examples". This approach works well with noise-free data, but with noisy data it usually overfits the training examples. At present there are two strategies widely used to avoid this kind of overfitting in decision tree learning (the second is illustrated in the sketch below):

1. Stop the tree from growing before it reaches the point of perfectly classifying the training data (pre-pruning).
2. Let the tree overfit the training data, then post-prune it.
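For concreteness, here is a minimal sketch of strategy 2 using Weka's J48 (the library J48 comes from). The dataset path `data.arff` is a hypothetical placeholder; the parameter values shown are Weka's defaults, and you would tune them for your own data:

```java
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class J48PruningDemo {
    public static void main(String[] args) throws Exception {
        // Load a dataset; "data.arff" is a placeholder path, not from the question.
        Instances data = DataSource.read("data.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Strategy 2: grow the full tree, then post-prune (J48's default behavior).
        J48 pruned = new J48();
        pruned.setUnpruned(false);         // keep post-pruning enabled
        pruned.setConfidenceFactor(0.25f); // smaller values => more aggressive pruning
        pruned.setMinNumObj(2);            // minimum number of instances per leaf
        pruned.buildClassifier(data);

        // For comparison: the unpruned tree that fits the training data as closely as possible.
        J48 unpruned = new J48();
        unpruned.setUnpruned(true);
        unpruned.buildClassifier(data);

        System.out.println("Pruned tree size:   " + pruned.measureTreeSize());
        System.out.println("Unpruned tree size: " + unpruned.measureTreeSize());
    }
}
```

Comparing the two tree sizes usually shows the post-pruned tree is considerably smaller; lowering the confidence factor prunes more aggressively, which helps on noisy data at the cost of fitting the training set less exactly.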