Let ''I'' be a set of bit arrays of length ''n''. Given a bit array ''t'', it is desired to find the array ''i'' in ''I'' which is closest to ''t'' in terms of their Hamming distance [1] ''h'' – which may be 0 if ''t'' is in ''I'', but most likely ''h > 0''.
Clearly this is an instance of the multidimensional nearest-neighbor problem, but if ''n'' is too large (say, ''n = 256''), traditional nearest-neighbor algorithms such as k-d trees become impractical. However, it also seems certain that the domain allows for some optimizations – after all, only two values (0 and 1) are possible for each dimension along the data points.
Does anyone know of a nearest-neighbor algorithm that is optimized for this case?
[1] http://en.wikipedia.org/wiki/Hamming_distance