Extending the discussion of shustrings, it occurs to me that this may well be the best measure of sequence redundancy. If a genome of arbitrary size has minimal redundancy, then all of its shustrings will be of just one or two lengths: a histogram will show just one column (or, in the worst case, two columns), and the line is vertical. On the other hand, a genome of arbitrary size that is totally redundant will give a horizontal line. It is between these two regimes that we see this graph:
So, this is a call for the opinions of others on using this curve as a formal measure of sequence redundancy.
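
For anyone who wants to try this on small sequences, here is a minimal sketch of the histogram described above. It assumes the usual definition of a shustring as the shortest substring starting at a given position that occurs exactly once in the sequence; positions with no such substring are skipped. The function names (`count_occurrences`, `shustring_lengths`, `shustring_histogram`) are my own, and the brute-force search is only meant for toy inputs. A real genome would need a suffix tree or suffix array.

```python
from collections import Counter


def count_occurrences(seq, sub):
    """Count occurrences of sub in seq, including overlapping ones."""
    count = 0
    start = seq.find(sub)
    while start != -1:
        count += 1
        start = seq.find(sub, start + 1)
    return count


def shustring_lengths(seq):
    """Return the shustring length for each position that has one.

    Assumed definition: the shustring at position i is the shortest
    substring starting at i that occurs exactly once in seq.
    Positions with no unique substring are skipped.
    """
    n = len(seq)
    lengths = []
    for i in range(n):
        for k in range(1, n - i + 1):
            if count_occurrences(seq, seq[i:i + k]) == 1:
                lengths.append(k)
                break
    return lengths


def shustring_histogram(seq):
    """Map each shustring length to the number of positions with that length."""
    return dict(sorted(Counter(shustring_lengths(seq)).items()))


if __name__ == "__main__":
    print(shustring_histogram("ACGT"))    # {1: 4}
    print(shustring_histogram("AAAA"))    # {4: 1}
    print(shustring_histogram("ACGTAC"))  # {1: 2, 2: 1, 3: 1}
```

On these toy strings, the sequence with no repeats puts every shustring in a single column, while the repeated `AC` in `ACGTAC` spreads the lengths over several columns. Whether the curve built from such a histogram behaves as a sound redundancy measure on real genomes is exactly the open question.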