I'm trying to work out what the typical pipeline stages of an L1 cache are. The attached file describes a 3-cycle pipeline, like those found in Silvermont, Jaguar, and Cortex-A9. The notation conventions are:

  • blue for the address computation;
  • yellow for the address translation;
  • orange for the data access.
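
To make the three stages concrete, here is a minimal functional sketch of how I picture them, one function per color. It is only an illustration under my own assumptions: 32-bit addresses, 4 KiB pages, a dictionary standing in for the TLB and another for the data array. All names (`compute_address`, `translate`, `access_data`, `load`) are hypothetical, and the sketch says nothing about timing or about which steps overlap in real hardware.

```python
PAGE_BITS = 12                      # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_BITS) - 1
ADDR_MASK = 0xFFFFFFFF              # assumption: 32-bit virtual addresses


def compute_address(base, offset):
    """Blue stage: address computation (base register + offset)."""
    return (base + offset) & ADDR_MASK


def translate(vaddr, tlb):
    """Yellow stage: address translation through a toy TLB.

    `tlb` maps a virtual page number to a physical page number.
    """
    vpn = vaddr >> PAGE_BITS
    page_offset = vaddr & PAGE_MASK
    return (tlb[vpn] << PAGE_BITS) | page_offset


def access_data(paddr, data_array):
    """Orange stage: data access, modeled as a plain lookup."""
    return data_array[paddr]


def load(base, offset, tlb, data_array):
    """Chain the three stages in the order of the color legend."""
    vaddr = compute_address(base, offset)
    paddr = translate(vaddr, tlb)
    return access_data(paddr, data_array)


# Toy state: virtual page 0x1 maps to physical page 0x80.
tlb = {0x1: 0x80}
data_array = {0x80010: 42}
value = load(0x1000, 0x10, tlb, data_array)
```

With this toy state, the load at virtual address 0x1010 translates to physical address 0x80010 and reads the value stored there.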

However, high-end CPUs such as Haswell, Bulldozer, and Cortex-A15 have a 4-cycle L1 cache access latency. Where does the fourth cycle come from? Could someone explain in detail what each of the four stages does?
