In almost every processor I have looked at, the L2 cache associativity is a power of 2 (L1 is generally direct-mapped or has a low power-of-2 associativity). The Alpha 21164 superscalar processor, however, had a 96 KB L2 cache that was 3-way set associative. What design rationale convinced the architects of the Alpha 21164 not to make the associativity a power of 2?
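For context, here is a rough sketch of the arithmetic behind the question. It assumes a 64-byte line size, which is my assumption and not something stated above, and the helper names `num_sets` and `is_power_of_two` are made up for illustration. It only checks how many sets the stated 96 KB capacity gives for a given number of ways; it does not claim to explain the designers' reasoning.

```python
def num_sets(cache_bytes: int, ways: int, line_bytes: int) -> int:
    """Number of sets in a set-associative cache: size / (line size * ways)."""
    total_lines, rem = divmod(cache_bytes, line_bytes)
    if rem:
        raise ValueError("cache size is not a multiple of the line size")
    sets, rem = divmod(total_lines, ways)
    if rem:
        raise ValueError("line count is not a multiple of the associativity")
    return sets


def is_power_of_two(n: int) -> bool:
    """True when n can be indexed by a plain bit field of the address."""
    return n > 0 and (n & (n - 1)) == 0


CACHE_BYTES = 96 * 1024  # 96 KB, as stated in the question
LINE_BYTES = 64          # assumed line size, not given in the question

sets_3way = num_sets(CACHE_BYTES, 3, LINE_BYTES)
sets_4way = num_sets(CACHE_BYTES, 4, LINE_BYTES)

print(sets_3way, is_power_of_two(sets_3way))
print(sets_4way, is_power_of_two(sets_4way))
```

Under that assumed line size, the 3-way organization is the one that leaves a power-of-2 number of sets for this capacity, while a 4-way split of the same 96 KB would not.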