Which kind of Machine learning is Hidden Markov Model ??

Graham W Pulford Popular answer

Hello

Hidden Markov models have been around for a pretty long time (1970s at least). It's a misnomer to call them machine learning algorithms. The HMM model itself is a stochastic process based on a Markov chain, usually discrete in time and space but not necessarily so. The estimation problems concerning HMMs, as covered in Rabiner's tutorial paper in the Proc IEEE, are the computation of the HMM likelihood function, the estimation of the state sequence (Forward backward or Viterbi algorihm) and the re-estimation of the HMM parameters via the EM algorithm, leading to the Baum-Welch formulae. None of these is what I would call a machine learning algorithm, except maybe EM. Once you have estimated the HMM parameters ("trained" in machine learning speak), you can use it to provide likelihoods for the noisy inputs to the HMM, which is sort of like classification. It is most useful, IMO, for state sequence estimation, which is not a machine learning problem since it is for a dynamical process, not a static classification task. Here is a link to an example of what you can do with HMMs in state sequence estimation.

Conference Paper A High Performance 1-D Hidden Markov Model Tracker for Passi...

Salil Batra

Unsupervised and primarily it is used for classification as it is dynamic form of bayseian network

Ian Martin Ajzenszmidt

https://en.wikipedia.org/wiki/Hidden_Markov_model

Alok Jain

Machine learning Algorithms are classified as "Classification" , "Clustering" or "Regression".

Support vector machines or k-means are specifically designed to solve a problem (classification in the first case, clustering in the second), and are indeed just an optimization procedure to maximize an "expected goodness of classification" or "goodness of clustering" criterion. The beauty lies in the choice of the criterion and optimization procedure. HMM are not an algorithm per se. They are a specific kind of probability distribution over sequences of vectors - for which we know good parameter estimation and marginal distribution computation algorithms. But asking whether they are in the "clustering" or "classification" family is as ridiculous as asking whether the Gaussian distribution is supervised or unsupervised learning.

Why "both classification and clustering"? Because of the following: Being probability distributions, HMM can be used for classification in a bayesian framework; and being model with hidden states, some latent clustering of the training data can be recovered from their parameters. More precisely:

HMM can be used for classification. This is a straightforward application of the bayesian classification framework, with the HMM being used as the probabilistic model describing your data. For example, you have a large database of utterances of digits ("one", "two", etc) and want to build a system capable of classifying an unknown utterance. For each class in your training data ("one", "two", you estimate the parameters of a HMM model describing the training sequences in this class - and you end up with 10 models. Then, to perform recognition, you compute the 10 likelihood scores (which indicate how likely the sequence you want to recognize has been generated by the model), and the model with the highest score gives you the digit. In the Rabiner tutorial on HMMs, the training stage is "Problem 3", the classification stage is "Problem 2".

HMM can be used in an unsupervised fashion too, to achieve something akin to clustering. Given a sequence, you can train a kk-state HMM on it, and at the end of the training process, run the Viterbi algorithm on your sequence to get the most likely state associated with each input vector (or just pull this from the γγ during the training process). This gives you a clustering of your input sequence into kk classes, but unlike what you would have obtained by running your data through k-means, your clustering is homogeneous on the time axis. For example, you can extract the color histograms of each frame of a video sequence, run this process on this sequence, and you'll end up with a break-down of the video into homogeneous temporal segments corresponding to scenes (the unpractical bit is that you have to set up the number of scenes kk in advance). This technique is commonly used in automatic, unsupervised, structure analysis of video or music.

Alok Jain

First, let's look at the three options:

Classification : Identifying which class of a set of pre-defined classes the data belongs to.

Clustering : Learning the set of classes the data belongs to.

Regression : Finding a relationship between on variable and one or more others.

The description of the HMM on Wikipedia has the following table:

as shown in https://en.wikipedia.org/wiki/Hidden_Markov_model

so the number of states (classes) is fixed.

That means that the algorithm does not try to figure out the number of classes (states) are --- so it's not open-ended clustering (where the number of states is unknown).

However, as @nikie points out, the HMM will do clustering.

There is not really an independent variable (as exists in the regression context) --- so it's not regression.

So my answer is that the HMM is a classification and a clustering algorithm, I do not believe it is a regression.

Graham W Pulford

Hello

Conference Paper A High Performance 1-D Hidden Markov Model Tracker for Passi...

What can be the alternate buffer for TAPS (PH 8.5) in Post tagmentation Wash step in Cut & Tag experiment ?

Minimize the sum of squared errors between the experimental and predicted data in order to estimate two optimum parameters by using matlab?

How much of primary antibody do you add on slides in immunohistochemistry(IHC) / Immunocytochemistry (ICC)?

How to take action against fake or duplicate research journals websites?

Is there any method to combine three dimensions of physical activity (PA) i.e. intensity, duration and frequency to make a single PA level?

Why do I get minor electric shock in my laboratory when touch something metallic?

Could anyone explain how to apply anova (statistical analysis) on the triplicate swelling data of hydrogel?

How to separate dead cells from live cells when propagating suspension cells?

How to do multiple data sequential Rietveld refinement in TOPAS?

Why does the transmittance % in FTIR become more than 100% (Opposite direction)?

Feedback defines the constitution of an organism?

Self-Organizing Superorganisms—as envisaged by Nenad Sestan (2018)?

Measuring the Intelligence of a Species?

Please explain how the plastic input value should be considered from the true stress-strain curve for the bilinear elastoplastic material model ?

"A Markov-like Model for Patient Progression"?

How can i do multivariate Time Series forecast using MLP, ANFIS and LSTM?

The Curse of Evolution and Complexity?

How to report results of Generalised Linear Mixed Models in a journal article?

Need help with my research project on open source SIEM and machine learning?

Is it necessary to covary exogenous constructs in a structural model?