I know that one of the most important disadvantages of Naive Bayes is that it makes strong feature independence assumptions. What are the other disadvantages?
A subtle issue ("disadvantage", if you like) with Naive Bayes is that if you have no occurrences of a class label together with a certain attribute value (e.g. class="nice", shape="sphere"), then the frequency-based probability estimate will be zero. Given Naive Bayes' conditional independence assumption, when all the probabilities are multiplied you will get zero, and this will wipe out the posterior probability estimate.
This problem arises when the drawn samples are not fully representative of the population. The Laplace correction (add-one smoothing) and other smoothing schemes have been proposed to avoid this undesirable situation.
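To see this concretely, here is a minimal sketch of frequency-based estimation with and without add-one (Laplace) smoothing; the toy data, the attribute values, and the `alpha` parameter are made up purely for illustration:

```python
# Toy training pairs of (class label, shape attribute) -- purely illustrative.
data = [("nice", "cube"), ("nice", "cube"), ("nice", "pyramid"),
        ("ugly", "sphere"), ("ugly", "cube")]

shapes = {"cube", "pyramid", "sphere"}

def conditional_prob(shape, label, alpha=0.0):
    """Estimate P(shape | label); alpha=1 gives the Laplace (add-one) correction."""
    in_class = [s for (c, s) in data if c == label]
    count = sum(1 for s in in_class if s == shape)
    return (count + alpha) / (len(in_class) + alpha * len(shapes))

# No ("nice", "sphere") pair was ever observed, so the raw estimate is 0 and
# any product of likelihoods that includes it collapses to 0.
print(conditional_prob("sphere", "nice"))             # 0.0
# With add-one smoothing the estimate is small but non-zero.
print(conditional_prob("sphere", "nice", alpha=1.0))  # 1/6 ~= 0.167
```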
In classification tasks you need a large data set in order to make reliable estimates of the probability of each class. You can use the Naive Bayes classification algorithm with a small data set, but precision and recall will remain very low.
If you're using Naive Bayes only for classification, then have a look at generative vs. discriminative models (e.g. logistic regression). One of the primary problems with using a generative model for classification is that you're often only interested in the separating hyperplane between your classes, whereas with a generative model you're also spending effort modelling data points far away from that plane.
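A minimal sketch of that contrast, assuming scikit-learn and an arbitrary synthetic dataset (every parameter choice below is just an assumption for the example), fitting a generative Gaussian Naive Bayes model and a discriminative logistic regression on the same points:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

# Synthetic two-class data; the sizes and seed are arbitrary.
X, y = make_classification(n_samples=500, n_features=10, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Generative: models P(x | y) and P(y), then applies Bayes' rule.
gnb = GaussianNB().fit(X_train, y_train)
# Discriminative: models P(y | x) directly, i.e. the decision boundary itself.
logreg = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("Gaussian Naive Bayes accuracy:", gnb.score(X_test, y_test))
print("Logistic regression accuracy: ", logreg.score(X_test, y_test))
```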
The independence assumption, oddly, isn't necessarily a disadvantage. You do lose the ability to exploit interactions between features; however, for classification tasks this often isn't a problem. The same does not seem to hold for regression, where the assumption becomes more of an issue.
You may want to refer to this article:
Eibe Frank, Leonard E. Trigg, Geoffrey Holmes, and Ian H. Witten. Naive Bayes for regression (technical note). Machine Learning, 41(1):5-25, 2000.
One problem that is often overlooked is how to calculate probabilities for Naive Bayes when working with real-valued features. People often attempt either to discretize the feature (which raises the question of how many discrete values to use) or to fit a normal curve. A potential solution is to estimate a non-normal distribution instead (a small sketch follows the reference below). See this reference:
John, G. H., & Langley, P. (1995). Estimating continuous distributions in Bayesian classifiers. Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (pp. 338-345). Montreal, Quebec: Morgan Kaufmann.
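As a rough sketch of the idea in John & Langley (an illustration with invented toy data and an assumed bandwidth, not their exact procedure): estimate the class-conditional density of a continuous feature with a kernel density estimate instead of forcing a single normal curve.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)

# A one-dimensional feature whose class-conditional distribution is clearly
# non-normal: a mixture of two well-separated bumps (values are invented).
x_class_a = np.concatenate([rng.normal(-2, 0.5, 100), rng.normal(3, 0.5, 100)])

# Option 1: fit a single normal curve (what a Gaussian Naive Bayes does).
mu, sigma = x_class_a.mean(), x_class_a.std()

# Option 2: a kernel density estimate of p(x | class=a).
kde = KernelDensity(kernel="gaussian", bandwidth=0.5)
kde.fit(x_class_a.reshape(-1, 1))

x0 = np.array([[0.0]])  # a point between the two bumps, where data are sparse
normal_density = np.exp(-0.5 * ((x0 - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
kde_density = np.exp(kde.score_samples(x0))

# The single normal curve reports a sizeable density at 0 even though almost
# no training points fall there; the kernel estimate is close to zero.
print(normal_density.ravel()[0], kde_density[0])
```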
In the same vein as Ryan's and Rafael's answers above, see:
Boullé, M. (2005). A Bayes Optimal Approach for Partitioning the Values of Categorical Attributes. Journal of Machine Learning Research, 6, p. 1431-1452.
Boullé, M. (2006). MODL: A Bayes Optimal Discretization Method for Continuous Attributes. Machine Learning, 65(1), p. 131-165.
Boullé, M. (2007). Compression-Based Averaging of Selective Naive Bayes Classifiers. Journal of Machine Learning Research, 8, p. 1659-1685.
(available from the author's home page http://perso.rd.francetelecom.fr/boulle/ )
Regarding NB itself, there is an interesting paper:
D. Hand and K. Yu, "Idiot's Bayes, not so stupid after all?", International Statistical Review, Vol. 69, No. 3 (2001), pp. 385-398.
Abstract:
Folklore has it that a very simple supervised classification rule, based on the typically false assumption that the predictor variables are independent, can be highly effective, and often more effective than sophisticated rules. We examine the evidence for this, both empirical, as observed in real data applications, and theoretical, summarising explanations for why this simple rule might be effective.