Current parallel BFS algorithms are known to have reduced time complexity. However, such cases do not take into account synchronization costs which increase exponentially with the core count. Such synchronization costs stem from communication costs due to data movement between cores, and coherence traffic if using a cache coherent multicore. What is the best parallel BFS algorithm available in this case?