Suppose we have a dataset of discrete values and want to build a random generator based on it. To do that, we need to divide the values into zones (states). But how do people normally choose the number of states a Markov chain should have? Generally speaking, the more states you have, the less random the process becomes: with a fixed amount of data, each state has fewer observed transitions, so the generator tends to replay the original sequence. So what should the criteria be?
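To make the trade-off concrete, here is a minimal sketch, not a recommended selection criterion. It assumes equal-width zones and a first-order chain, and the helper names (`to_states`, `transition_counts`, `deterministic_fraction`) and the synthetic data are hypothetical, introduced only for illustration. It measures the fraction of visited states that have exactly one observed successor, i.e. states from which the generator would have no randomness at all.

```python
import random
from collections import defaultdict


def to_states(values, n_states):
    """Map each value to an equal-width zone index in 0..n_states-1 (assumed binning)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_states or 1.0  # avoid zero width when all values are equal
    return [min(int((v - lo) / width), n_states - 1) for v in values]


def transition_counts(states):
    """Count observed first-order transitions: counts[a][b] = number of times a -> b."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    return counts


def deterministic_fraction(values, n_states):
    """Fraction of states (with at least one outgoing transition) that have a single successor."""
    counts = transition_counts(to_states(values, n_states))
    if not counts:
        return 0.0
    single = sum(1 for successors in counts.values() if len(successors) == 1)
    return single / len(counts)


# Synthetic stand-in for the real dataset (an assumption for this sketch).
rng = random.Random(0)
data = [rng.randint(0, 999) for _ in range(200)]

for k in (2, 10, 50, 250, 1000):
    print(k, round(deterministic_fraction(data, k), 2))
```

If the printed fraction climbs toward 1 as the number of states grows, the chain is mostly memorizing the data, which suggests the state count is too large for the amount of data available.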