Hi, I have read some papers on model-based clustering of categorical data using the EM algorithm. In these works, the frequency counts of the (input) categorical sequences are used in the E and M steps of the algorithm. As a result, the temporal order of the symbols in a sequence is ignored when computing the likelihood of that sequence.
How can the model parameters be changed so that the algorithm takes temporal order into account, that is, so that it uses the transition probabilities between categories instead of their frequencies?
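To make the question concrete, here is a rough sketch of what I have in mind, assuming each cluster is modelled as a first-order Markov chain (so the per-cluster parameters would be an initial-state distribution and a transition matrix rather than a single vector of category frequencies). The function name `em_markov_mixture`, the smoothing constant `alpha`, and the integer encoding of categories as 0..n_states-1 are my own illustrative choices and are not taken from the papers; I am not sure this is the right way to do it.

```python
import numpy as np

def em_markov_mixture(seqs, n_states, n_clusters, n_iter=50, alpha=1e-3, seed=0):
    """Sketch of EM for a mixture of first-order Markov chains.

    seqs: list of non-empty sequences of ints in 0..n_states-1 (assumed encoding).
    alpha: small additive smoothing constant (an assumption, avoids log(0)).
    """
    rng = np.random.default_rng(seed)
    n_seqs = len(seqs)

    # Sufficient statistics per sequence: first symbol and transition counts.
    first = np.array([s[0] for s in seqs])
    counts = np.zeros((n_seqs, n_states, n_states))
    for n, s in enumerate(seqs):
        for a, b in zip(s[:-1], s[1:]):
            counts[n, a, b] += 1
    first_onehot = np.eye(n_states)[first]  # shape (n_seqs, n_states)

    # Random soft assignment of sequences to clusters to start from.
    resp = rng.dirichlet(np.ones(n_clusters), size=n_seqs)

    for _ in range(n_iter):
        # M-step: responsibility-weighted counts, then normalise.
        weights = (resp.sum(axis=0) + alpha) / (n_seqs + n_clusters * alpha)
        init = alpha + resp.T @ first_onehot
        init /= init.sum(axis=1, keepdims=True)
        trans = alpha + np.einsum("nk,nab->kab", resp, counts)
        trans /= trans.sum(axis=2, keepdims=True)

        # E-step: log-likelihood of each sequence under each cluster's chain,
        # log pi_k + log init_k[x_1] + sum_t log trans_k[x_t, x_{t+1}].
        log_p = (
            np.log(weights)
            + np.log(init)[:, first].T
            + np.einsum("nab,kab->nk", counts, np.log(trans))
        )
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

    return weights, init, trans, resp
```

With this, two sequences such as 0,1,0,1,0,1,0,1 and 0,0,0,0,1,1,1,1 have identical category frequencies but different transition counts, so a transition-based likelihood could in principle tell them apart where a frequency-based one cannot.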
Thanks.