What is the difference between machine learning and data mining?

11 November 2013

Machine Learning vs. Data Mining.

I think that DM does not necessarily require ML, even though they use similar techniques. DM is expected to cope with large amounts of data in order to extract hidden knowledge, e.g. to evaluate how the stock market is going to evolve, taking into account all data from the last 5 years. Whatever DM returns can be used by people or machines. If a machine acquires and uses that knowledge to make its own predictions, it is using ML techniques.

Mahsa Hassankashi

Hello,

I think the two are close to each other. Machine learning is trained on data so that it can make decisions and build knowledge to guide actions; in that respect it has the upper hand over data mining. Data mining, on the other hand, builds models when we face big data, in order to extract meaningful information from large data sets.

Mahsa.

Leroy Dyer

Machine learning is closely related to predictive analytics. Algorithms that were previously used in artificial intelligence, such as decision trees and cluster analysis, have now been brought into use by the data mining community to solve classification problems. Such techniques have been applied in facial recognition and pattern matching, and they have migrated to data mining, where the same techniques are used to find decisions and classify data. Because of constant data growth, new techniques are required to segment and filter data. The line between artificial intelligence and business intelligence has become blurred, and the same techniques are used for both purposes. The primary function of data mining is to find out what has happened in order to predict what may happen. This axiom is also used in artificial intelligence, since new experiences can be inferred from past experiences. Supervised and unsupervised learning techniques have been taken to be principles of machine learning, but is this intelligence or predictive analytics?

Data mining can also be considered as reducing entropy: unstructured data becomes information, and knowledge is gained from that information through classification, determining the decision processes implied by the data and the information. Can "making sense of data" be classed as intelligence? Or machine learning? Or is it data understanding and modeling?

Leroy

Jong-Soo Sohn

Machine learning is like gaining experience of, or studying, dangerous situations.

Data mining is like indexing the book chapters about danger from among countless books.

Mahmoud Omid

Machine Learning (ML) tries to answer two questions: (1) How to build computer systems that automatically improve with experience, and (2) What are the fundamental laws that govern all learning processes? Accordingly, ML is a branch of artificial intelligence (AI), or the science of getting computers to act without being explicitly programmed.

Data Mining (DM), however, aims to discover hidden value in your data warehouse, that is, to reveal valuable knowledge hidden in raw data. Accordingly, DM can be regarded as an application of algorithms to search for patterns and relationships that may exist in large databases.

Nowadays datasets are so large that a great many relationships are possible. To search this space of possibilities, ML techniques are often used. So DM is often regarded as a sub-field of ML.

Glen Dario Rodriguez

DM uses the techniques of ML (and other fields of AI), but DM also deals with visualization of (big) data, and in some way with methods for the storage, management, and querying of big data. So DM and ML intersect, but neither is completely a subset of the other.

Cristina Urdiales

Leroy Dyer

Glen Rodriguez, I believe you're correct. Techniques have simply been migrated from machine learning to serve similar purposes in data mining. The problem of gaining maximum entropy from existing data is a complex task. Some of the algorithms used by machine learning do not necessarily produce information from data so much as potential relationships between features of the data. These potential relationships are reanalyzed to fit hypotheses assumed by the algorithm. Take K-nearest neighbor, for example: clusters are created, yet the clusters created do not necessarily have a relationship. After reanalyzing the clusters and applying more classification, however, a relationship may be found. This type of exploratory analysis can produce results. But can data be made to tell any story, fitting the purpose of the analyzer? The danger of big data sets is that information can be generated to fit any purpose. With so many combinations available, the entropy increases instead of decreasing. This goes against the principle of gaining information from data by reducing entropy.
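
The point about clusters can be illustrated with a small sketch. K-nearest neighbor is a classifier and does not itself form clusters, so the example below swaps in k-means, a clustering algorithm; that substitution, the function name kmeans_1d, and the toy data are assumptions made for illustration. The ten points are evenly spaced, so there is no real group structure, yet the algorithm still returns two clusters.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on 1-D data, starting from the given centers."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, clusters


# Evenly spaced points: no natural grouping exists in this data.
points = list(range(10))
centers, clusters = kmeans_1d(points, centers=[0.0, 9.0])
print(centers)
print(clusters)
```

With these inputs the centers settle at 2.0 and 7.0, splitting the line into two halves. The split comes from the choice of k = 2 rather than from anything in the data, which is why a cluster by itself is only a candidate relationship.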

Leroy

Frank Veroustraete

1) Machine learning.

Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data. For example, a machine learning system could be trained on email messages to learn to distinguish between spam and non-spam messages. After learning, it can then be used to classify new email messages into spam and non-spam folders.
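
As a rough illustration of the spam example above, here is a minimal sketch of a word-count classifier in Python. The choice of naive Bayes with Laplace smoothing, the function names train and classify, and the four toy messages are all assumptions made for illustration; they are not taken from the post.

```python
import math
from collections import Counter


def train(messages):
    """Count words per label. messages is a list of (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        score = math.log(label_counts[label] / total)  # log prior
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Invented training messages, two per class.
training = [
    ("win money now", "spam"),
    ("cheap pills win prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train(training)
print(classify("win a cheap prize", model))
print(classify("agenda for lunch", model))
```

Neither test message appears in the training set, so a correct label here is a very small instance of generalization to unseen data.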

The core of machine learning deals with representation and generalization. Representation of data instances and functions evaluated on these instances are part of all machine learning systems. Generalization is the property that the system will perform well on unseen data instances; the conditions under which this can be guaranteed are a key object of study in the subfield of computational learning theory.

There is a wide variety of machine learning tasks and successful applications. Optical character recognition, in which printed characters are recognized automatically based on previous examples, is a classic example of machine learning.

2) Data mining

Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD), an interdisciplinary subfield of computer science, is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.

The term is a buzzword, and is frequently misused to mean any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics) but is also generalized to any kind of computer decision support system, including artificial intelligence, machine learning, and business intelligence. In the proper use of the word, the key term is discovery, commonly defined as "detecting something new". Even the popular book "Data mining: Practical machine learning tools and techniques with Java" (which covers mostly machine learning material) was originally to be named just "Practical machine learning", and the term "data mining" was only added for marketing reasons. Often the more general terms "(large scale) data analysis", or "analytics" – or when referring to actual methods, artificial intelligence and machine learning – are more appropriate.

The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection) and dependencies (association rule mining). This usually involves using database techniques such as spatial indices. These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis or, for example, in machine learning and predictive analytics. For example, the data mining step might identify multiple groups in the data, which can then be used to obtain more accurate prediction results by a decision support system. Neither the data collection, data preparation, nor result interpretation and reporting are part of the data mining step, but do belong to the overall KDD process as additional steps.
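
Of the three pattern types named above, association rule mining is the easiest to sketch in a few lines. The example below is a simplified illustration, not a full algorithm such as Apriori: it only considers rules between single items, and the transactions, thresholds, and function names are invented for the purpose.

```python
from itertools import combinations

# Toy market-basket data; items and transactions are made up for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def pair_rules(transactions, min_support=0.3, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_count = count({a, b}, transactions)
        if pair_count / n < min_support:
            continue  # the pair is too rare to be interesting
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence: share of transactions with lhs that also contain rhs.
            confidence = pair_count / count({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_count / n, confidence))
    return rules


for lhs, rhs, sup, conf in pair_rules(transactions):
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

On this data the sketch reports four rules, including jam -> bread and milk -> butter with confidence 1.0. Nothing was labeled in advance; the rules are discovered from co-occurrence alone, which is the sense of "previously unknown patterns" used above.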

The related terms data dredging, data fishing, and data snooping refer to the use of data mining methods to sample parts of a larger population data set that are (or may be) too small for reliable statistical inferences to be made about the validity of any patterns discovered. These methods can, however, be used in creating new hypotheses to test against the larger data populations.

Data mining uses information from past data to analyze the outcome of a particular problem or situation that may arise. It works on data stored in data warehouses, and that data may come from all parts of a business, from production to management. Managers also use data mining to decide upon marketing strategies for their products and to compare themselves with competitors. Data mining turns its data into real-time analysis that can be used to increase sales, promote new products, or drop products that do not add value to the company.

3) The difference?

While machine learning is strongly related to artificial intelligence, data mining is not; it is more closely related to the statistics of large datasets.

Cheers,

Frank

P.S.: See also the techniques used by the NSA and other intelligence agencies, which are mostly data mining techniques.

Ask the whistleblower ;-)

Jonas Gozdecki

Data mining uses machine learning techniques, but when high-level systems are used, more processing is spent; that is a natural difference.

The concept of ML fits within DM, but the way to achieve it is completely different. They each use their own resources.

It's possible to use data mining to build better machine learning algorithms.

Fernanda Dorea

http://stats.stackexchange.com/questions/5026/what-is-the-difference-between-data-mining-statistics-machine-learning-and-ai

Fernanda Dorea

http://stackoverflow.com/questions/7105428/difference-between-data-mining-and-machine-learning

Muddsair Sharif

ML is usually used for prediction on both supervised and unsupervised data, while DM only considers supervised data.

Ishwar Sethi

Check it out at http://wp.me/p3iXlv-S

Leroy Dyer

Pointing out:

Data mining is not a subset of machine learning, nor is machine learning a subset of data mining.

Data mining has its roots in gaining data from unstructured information. Detecting data in unstructured information such as the web, documents, and audio has been a task for data mining from its inception. Data mining today has graduated, or you could say "grown up": it has shifted its main focus from gaining data from information to gaining information from data. In fact, the whole process has been reversed. Data mining may have become associated with artificial intelligence and machine learning, and has even changed from data mining to business intelligence. Business intelligence encompasses the whole process from data collection to data analysis and visualization. The evolution of business analytics has encompassed many technologies that would otherwise have been stunted in their own growth. Even machine learning is to be brought into question: do the algorithms associated with machine learning actually enable a machine to learn, or do they just reduce entropy without any actual learning? This may be why they have migrated to the data mining, business analytics, and business intelligence communities.

Negar Ahmadi

check it out:

http://answers.yahoo.com/question/index?qid=20080305004107AA25hZZ

good luck

Mustansar ali Ghazanfar

DM is about finding and visualising patterns (using clustering), mining trends, and filtering rules from large amounts of data.

ML, in addition to doing more or less the same things as DM, deals with predictive analysis: training models and testing their generalization performance on the same or different domains.
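
Testing generalization performance usually means scoring a model on data it was not trained on. The following is a minimal sketch of that idea using a hold-out split and a one-nearest-neighbor predictor; the data values, labels, and function names are assumptions chosen to keep the example small.

```python
def nearest_neighbor_predict(train, x):
    """Return the label of the training point closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def accuracy(train, test):
    """Fraction of test points whose predicted label matches the true one."""
    correct = sum(nearest_neighbor_predict(train, x) == y for x, y in test)
    return correct / len(test)


# Invented 1-D data set of (feature, label) pairs.
data = [
    (1.0, "low"), (1.4, "low"), (2.0, "low"), (2.8, "low"),
    (3.4, "high"), (4.0, "high"), (4.5, "high"), (5.0, "high"),
]

# Hold-out split: every other point is kept aside for testing.
train_set = data[::2]
test_set = data[1::2]

print("accuracy on training data:", accuracy(train_set, train_set))
print("accuracy on held-out data:", accuracy(train_set, test_set))
```

The model is perfect on its own training points but misclassifies the held-out point at 2.8, so the held-out accuracy is 0.75. That gap between training and held-out scores is what a generalization test is meant to expose.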

Ryan Benton

Data mining, machine learning, and pattern recognition (just to complicate issues) arose from different fields. Loosely speaking, machine learning grew out of artificial intelligence, pattern recognition out of signal/image processing and data mining out of the database community.

Hence, they had/have somewhat different focuses as well as somewhat different terminology (which can lead to some havoc). That said, there has been a large amount of interchange between the different disciplines, which means many techniques are seen in the various groups.

Reema Singh

Machine learning focuses on prediction, based on known properties learned from the training data.

Data mining focuses on the discovery of unknown properties of the data.

This is available on the Wikipedia page: http://en.wikipedia.org/wiki/Machine_learning#Machine_learning_and_data_mining

James Dominic O'Shea

Data mining is about the discovery of knowledge, whereas machine learning tends to be the more mundane production of classifiers or predictors for a given set of data. I believe the HERO effect (hazards of electromagnetic radiation to ordnance) was the first high-impact discovery made by data mining, but I suggest you check this in the literature for yourself.

Mahmoud Omid

As I see it, the goal of both ML and DM approaches is learning from data. But DM uses ML algorithms and tools for carrying out DM tasks (such as extracting patterns or rules, classification, regression, clustering, etc.).

Abdelrazzaq Alrababa'a

For simplicity: data mining -> machine learning algorithm -> neural network. All are related to pattern recognition, that is, getting information from the data.

Thabet Slimani

Look at the following link, which gives some insight into DM and ML.

http://web.cecs.pdx.edu/~mperkows/CLASS_479/LECTURES479/PE013..pdf

Zhao Zhang

Yes, ML is a tool that DM uses for data analysis, including feature learning, etc. But ML and DM cover many of the same topics, since both mine and learn from data.

Hosahalli Doreswamy

Data mining is the process of extracting previously unknown and potentially useful knowledge from a large volume of data or information. DM is a confluence of many disciplines, such as machine learning, statistics, database technology, artificial intelligence, and soft computing approaches. DM is desirable where machine learning algorithms are infeasible to run on large data held on disk, whereas machine learning algorithms are suitable for processing data in main memory. Therefore DM is different from machine learning. Loosely speaking, machine learning is a subset of data mining.

Iping Supriana

ML is a tool for accelerating DM performance, and DM results can be used to enhance ML knowledge (rules).

Abdelrazzaq Alrababa'a

Data mining is a critical step in the knowledge discovery process. Machine learning is a more comprehensive field of study that includes classification, clustering, and regression techniques. The most important data mining tools are also clustering and classification. In addition, machine learning techniques can be applied in a supervised or unsupervised way, and the same holds for data mining. I wonder why most books and articles do not differentiate between data mining and machine learning. You can read an article titled "Evaluating the predictability of data mining techniques in the stock markets", but once you start reading you find that the author has built the methodology from a combination of data mining techniques and other machine learning techniques. So it is not pure machine learning, and we cannot mix up data mining and machine learning all the time. It sounds confusing :( . Data mining is machine learning, but not all machine learning techniques are data mining tools.

Later on you will see that they also do not differentiate between data mining and neural networks, and this is wrong as well: a neural network is a classification technique within machine learning, but it is not data mining. So all three terms are different.

Ahmed Salman Mirza

Data mining is the process of extracting data, whereas machine learning is teaching a machine to perform tasks on the basis of data.

Abdelrazzaq Alrababa'a

Dear Ahmed Salman,

I think data mining can sometimes be used as machine learning. In special cases it can be used as a forecasting technique, and then you don't need to apply any other models. Data mining is a broad topic in which many steps may be applied in the same process, starting with preprocessing the data and ending with testing and validating different models. It is considered a very important field of study and is widely used by researchers to solve problems in different areas. Two kinds of data mining method are important, namely supervised and unsupervised techniques, but the supervised method can provide more valid results because it compares inputs with known outputs, with the training process in between. Again, this is to my knowledge and my opinion.

Ahmed Salman Mirza

Dear Abdelrazzaq,

I have not denied that fact. I clearly mentioned in the definition that machine learning is performed on the basis of data. No doubt what you have stated is right: data mining is used to extract data and formulate it into understandable information.

Khurram Shehzad

Machine learning is about learning from "small scale" data that has been acquired in a laboratory and has been "carefully selected and prepared" for learning purposes. So we can say that it involves the presence of ideal conditions. Data mining, by contrast, is about learning (or discovering) patterns, trends, signal evolution etc from "real world" data. This data is generated in a wide variety of sectors and stored in large repositories and is "not generated for learning purposes" in the first place. This data is "large scale" in that it keeps multiplying over time and is always "imperfect" (has errors, missing values etc).

Therefore machine learning techniques, when used for data mining purposes, have to be adapted so that they can cope with "large" and "inherently imperfect" data which is the norm in the real world. And although data mining can also be done using techniques from other fields such as statistics etc, the techniques predominantly used for DM are from the field of ML.
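
One small example of the adaptation described above is filling in missing values before a learning algorithm sees the data. The sketch below uses column-mean imputation, which is only one of many possible strategies and is an assumption here, as are the function name impute_mean and the toy rows.

```python
def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    columns = list(zip(*rows))
    means = []
    for col in columns:
        observed = [v for v in col if v is not None]
        # A column with no observed values cannot be imputed; leave it as None.
        means.append(sum(observed) / len(observed) if observed else None)
    return [
        [means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


# Invented "real world" records with gaps marked as None.
raw = [
    [1.0, 10.0],
    [None, 20.0],
    [3.0, None],
]
print(impute_mean(raw))
```

Here the missing entries become 2.0 and 15.0, the means of the observed values in their columns. Mean imputation is simple but can hide real variation, so it is a starting point rather than a recommendation.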

Leroy Dyer

Actually, it can be said that this is a trick question, as DM and ML can be described as being the same thing, yet ML can also be said to stand alone. ML is concerned with predictive knowledge, whereas DM can be applied to both descriptive and predictive knowledge. Simply put, both can be considered data science.

Both can equally be described as data intelligence.

The distinction between the two is ambiguous, whereas ten years ago it was much clearer: DM was mainly concerned with data science, whereas ML was mainly concerned with AI. This is not the case now; both have come into common use and their descriptions have blurred as the tasks associated with each have crossed over.

Perhaps they have crossed over so far that no clear boundary exists any more; they are both forms and aspects of general data science.
