I have a dataset with many users, and each user has many sequences. How can I cluster the users in this case?
E.g.
User 1: sequence A, sequence B, ..., sequence Z
User 2: sequence A, sequence B, ..., sequence Z
...
User n: sequence A, sequence B, ..., sequence Z
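Each sequence (e.g. sequence A) is a binary string such as 0111000. It can also be written as a list of pairs (s1,l1) (s2,l2) ..., where s is the 0-based start position of a run of 1s and l is the length of that run. E.g. 011100011 is (1,3)(7,2).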
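To make the two encodings concrete, here is a minimal Python sketch of converting between them. The function names bits_to_runs and runs_to_bits are just names I made up for illustration, and I am assuming 0-based start positions, as in the example above.

```python
from itertools import groupby


def bits_to_runs(bits: str) -> list[tuple[int, int]]:
    """Return (start, length) for every run of 1s; start is 0-based."""
    runs = []
    pos = 0
    for value, group in groupby(bits):
        length = len(list(group))
        if value == "1":
            runs.append((pos, length))
        pos += length  # advance past this run of identical characters
    return runs


def runs_to_bits(runs: list[tuple[int, int]], total_length: int) -> str:
    """Rebuild the binary string from (start, length) pairs."""
    bits = ["0"] * total_length
    for start, length in runs:
        for i in range(start, start + length):
            bits[i] = "1"  # raises IndexError if a run exceeds total_length
    return "".join(bits)


print(bits_to_runs("011100011"))            # [(1, 3), (7, 2)]
print(runs_to_bits([(1, 3), (7, 2)], 9))    # 011100011
```

Note that the (s,l) form does not store the total sequence length, so it has to be supplied separately when converting back.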
How should I structure the data for clustering? I will find a similarity function myself. Thanks a lot.