How can I use HMM to classify multivariate time series. The given time series should be segmented to different-length segments, and for each segment a label (class) should be assigned.
There are a number of ways to try to tackle this problem (I'm not saying that they'll all work, just that I can think of a few ways to go about this). First a quick question to make sure that I understand what you're trying to accomplish. Is your primary objective to determine the change points? i.e., given the change points, do you have a classification scheme that works? (Based on SAX or DTW?). Second question: do you have reason to believe that there is a probabilistic (specifically Markov) structure governing the transition between the various segments, i.e. is there a stochastic finite state machine (finite Markov chain) specifying transition probabilities?
Assuming that the answer to question 2 is yes, a HMM approach can be helpful by providing a mechanism to condition segmentation/classification on this transition structure. If no, or this assumption is invalid (e.g., the transitions are random or governed by a non-Markovian process) HMM's may complicate things. Also note, HMMs can require a great deal of data to "burn in" and continuous time implementations are extremely computationally expensive and difficult to stabilize.
There are a couple of variants of HMMs that you might want to look at as well. As the dwell time at the states varies, you might want to consider that as part of the signature - Explicit State Duration HMMs (EDHMMs) provide some additional machinery in that regard. If you don't explicitly know the number of states, you might want to look at Infinite HMMs (IHMMs). Also, there are some variants based on other realizations of HMMs as stochastic Finite State Automata (HMMs can be viewed as a Moore machine). This is interesting as there is a hierarchy of automata with increasing descriptive power and within this context, switching from one FSA to another can be interpreted as an increase in computational complexity (i.e. a higher order machine such as Pushdown Stack Automata is needed to describe the transition). This increase in complexity of computation then provides the change detection. James Crutchfield demonstrated this using his epsilon-Machine Reconstruction method (inferring Mealy machine FSAs) in determining "The Attractor Basin Portrait of a Cellular Automata." Another interesting implementation of HMMs is Shalizi's Causal State Reconstruction algorithm. Both of these two are particularly well-suited to accommodating symbolic representations. (I'll also note, be careful with SAX, this technique is intimately related to Symbolic Dynamics and I don't believe that the advocates of SAX have yet made the connection - certainly I have not found anything discussing the implicit partition of phase space within that context. Long story short, if the symbolic coding is not done properly, the symbolic sequence will not be a faithful encoding of the dynamics and there can be uniqueness issues).
In a more standard HMM approach, what you're looking to do is find the Viterbi path - the most likely sequence of hidden states that results in the observed sequence. An algorithm such as the Baum-Welch (that uses the forward-backward algorithm) can be used to infer an HMM and the Viterbi algorithm can be used to find the Viterbi path. This approach is structurally similar to that taken by Crutchfield in the paper mentioned above.
Does this help or are you looking for more help on setting up problem?
You want to recognize patterns in a sequence? Simple alternative: choose appropriate segment/pattern length and classify each segment with classifier. Another alternative: Map multivariate time-series to sequence of symbols with dimensionality reduction approach (like SOM, PCA, ISOMAP, etc.) and employ string matching algorithm (or dynamic time warping if you assume insertions and deletions).
Is there a particular reason that you want to use HMM's? Do you already have a signature for the different time series segments? What are their structures? What is observable? Do you know how many different classes exist? Do you have reason to expect that structure exists governing the transition from one segment to another? It strikes me that what you've described is a clustering problem at its heart and may benefit from a clustering approach. Also, change detection methods may be fruitful. Without knowing more about the structure of the individual segments, it's difficult to provide much in the way of suggestions as to how HMM techniques might be applied except in very general terms.
There are a number of ways to try to tackle this problem (I'm not saying that they'll all work, just that I can think of a few ways to go about this). First a quick question to make sure that I understand what you're trying to accomplish. Is your primary objective to determine the change points? i.e., given the change points, do you have a classification scheme that works? (Based on SAX or DTW?). Second question: do you have reason to believe that there is a probabilistic (specifically Markov) structure governing the transition between the various segments, i.e. is there a stochastic finite state machine (finite Markov chain) specifying transition probabilities?
Assuming that the answer to question 2 is yes, a HMM approach can be helpful by providing a mechanism to condition segmentation/classification on this transition structure. If no, or this assumption is invalid (e.g., the transitions are random or governed by a non-Markovian process) HMM's may complicate things. Also note, HMMs can require a great deal of data to "burn in" and continuous time implementations are extremely computationally expensive and difficult to stabilize.
There are a couple of variants of HMMs that you might want to look at as well. As the dwell time at the states varies, you might want to consider that as part of the signature - Explicit State Duration HMMs (EDHMMs) provide some additional machinery in that regard. If you don't explicitly know the number of states, you might want to look at Infinite HMMs (IHMMs). Also, there are some variants based on other realizations of HMMs as stochastic Finite State Automata (HMMs can be viewed as a Moore machine). This is interesting as there is a hierarchy of automata with increasing descriptive power and within this context, switching from one FSA to another can be interpreted as an increase in computational complexity (i.e. a higher order machine such as Pushdown Stack Automata is needed to describe the transition). This increase in complexity of computation then provides the change detection. James Crutchfield demonstrated this using his epsilon-Machine Reconstruction method (inferring Mealy machine FSAs) in determining "The Attractor Basin Portrait of a Cellular Automata." Another interesting implementation of HMMs is Shalizi's Causal State Reconstruction algorithm. Both of these two are particularly well-suited to accommodating symbolic representations. (I'll also note, be careful with SAX, this technique is intimately related to Symbolic Dynamics and I don't believe that the advocates of SAX have yet made the connection - certainly I have not found anything discussing the implicit partition of phase space within that context. Long story short, if the symbolic coding is not done properly, the symbolic sequence will not be a faithful encoding of the dynamics and there can be uniqueness issues).
In a more standard HMM approach, what you're looking to do is find the Viterbi path - the most likely sequence of hidden states that results in the observed sequence. An algorithm such as the Baum-Welch (that uses the forward-backward algorithm) can be used to infer an HMM and the Viterbi algorithm can be used to find the Viterbi path. This approach is structurally similar to that taken by Crutchfield in the paper mentioned above.
Does this help or are you looking for more help on setting up problem?
Answer 1: The first part of the problem is to determine the change points based on many variables and not only one. Then the second part of problem is to assign a label to each segment.
Answer 2: Yes the system behind is data is a stochastic finite state machine.