Some doctors told me that data mining and classification techniques are old science and are no longer of interest for new research. I am not satisfied with their opinion, and I would like some evidence against it.
Classification is one of the most important tasks in machine learning, and its importance cannot be ignored. In almost every aspect of life we face classification problems, which are a form of pattern recognition.
Classification is a scale by which one can decide how and why billions of things differ from each other. For example, animal species, plants, and diseases are all distinguished from one another by some scheme of classification.
The job of science is to propose and test theories using the scientific method:
1. Generating hypotheses (often based on some theory)
2. Creating a research design to establish causality
3. Gathering data to test these hypotheses based on the research design
4. Using statistical modelling to examine relationships in the data
5. If relationships in the data support the hypotheses, then causal hypotheses are said to be supported.
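As a rough illustration of steps 1 to 5, the sketch below runs a one-sided permutation test on a hypothesis stated before looking at the data ("the treatment raises the outcome"). It is not part of the original answer: the numbers are made up purely for illustration, the function name `permutation_test` is hypothetical, and it assumes that units were randomly assigned to the two groups, which is what lets a difference be read causally.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(treated, control, n_permutations=10000, seed=0):
    """One-sided Monte Carlo permutation test for mean(treated) > mean(control)."""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Re-run the random assignment under the null hypothesis of no effect.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff >= observed:
            at_least_as_large += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical measurements (invented for this example, not real data).
treated = [5.1, 6.0, 5.8, 6.3, 5.9, 6.1, 5.7, 6.4]
control = [4.9, 5.0, 4.7, 5.2, 4.8, 5.1, 4.6, 5.3]

observed_diff, p_value = permutation_test(treated, control)
print(f"observed difference: {observed_diff:.4f}, p-value: {p_value:.4f}")
```

The point of the sketch is the order of operations: the hypothesis and the design are fixed first, and the data are only used afterwards to test them.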
Data mining does not operate in this way. Instead of generating hypotheses and using a research design to establish causality, data mining tends to use statistical modelling (often just correlational analysis) to analyse whatever data happens to be available. If relationships are found in the data:
1. There is no way of establishing whether they are causal because there is no research design
2. There is no way of knowing whether they are just a specific artefact of the data that happened to be on hand (because the data was not specifically generated based on hypotheses). Thus it cannot be known whether that relationship is likely to exist outside of that dataset.
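The second point can be illustrated with a small simulation. The sketch below is not from the original discussion: it assumes that the outcome and all 200 candidate variables are pure Gaussian noise, and the function names (`pearson`, `mine_noise`) and parameter values are hypothetical choices. It searches a small sample for the variable most correlated with the outcome, then measures the correlation again on fresh data from the same noise process.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mine_noise(n_samples=30, n_features=200, n_fresh=1000, seed=0):
    """Search unrelated noise variables for the 'best' predictor of a noise outcome."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    features = [
        [rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_features)
    ]
    correlations = [pearson(feature, target) for feature in features]
    # "Mining": keep whichever variable happens to correlate most strongly.
    best = max(range(n_features), key=lambda i: abs(correlations[i]))
    # Fresh draws from the same process: the variable is still unrelated noise.
    fresh_target = [rng.gauss(0, 1) for _ in range(n_fresh)]
    fresh_feature = [rng.gauss(0, 1) for _ in range(n_fresh)]
    return best, correlations[best], pearson(fresh_feature, fresh_target)


best, mined_r, fresh_r = mine_noise()
print(f"variable {best}: mined r = {mined_r:.2f}, fresh-data r = {fresh_r:.2f}")
```

Because every variable here is noise by construction, the strong correlation found by searching is an artefact of the sample and should shrink towards zero on new data.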
Classification, when used in an unsupervised machine learning context, is not hypothesis-driven and is therefore often a form of data mining.
Data mining can be useful for generating hypotheses to be tested by the scientific method.
So when the doctors said that data mining was old-fashioned, what they probably meant is that it is risky to rely on conclusions drawn from data mining, because those conclusions have not been properly tested using hypotheses and a research design.