Why do most articles in the NoC reliability field assume that faults occur in the NoC router, rather than in the resources, links, PEs, or other elements of the NoC?
I think it is because the router, unlike the other elements, is responsible for deciding the best route for the data. Other elements, such as a PE or a resource, have a modular function that can be treated as independent. I hope this helps.
In my humble opinion, faults are assumed to occur in the router rather than in other elements because errors are most critical in routing decisions, where a single flipped bit can lead to deadlock, livelock, or increased traffic on an already crowded link.
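To make the single-bit point concrete, here is a minimal sketch, assuming a 2D mesh with deterministic XY routing (my choice for illustration; nothing in this thread fixes the topology or the algorithm). The names `xy_next_port` and `flip_bit` are hypothetical helpers, not part of any NoC library. It shows one flipped bit in the destination field of a header sending a packet out of the wrong port.

```python
def xy_next_port(cur, dest):
    """Deterministic XY routing: correct the X coordinate first, then Y."""
    (cx, cy), (dx, dy) = cur, dest
    if dx > cx:
        return "EAST"
    if dx < cx:
        return "WEST"
    if dy > cy:
        return "NORTH"
    if dy < cy:
        return "SOUTH"
    return "LOCAL"  # packet has arrived at its destination router


def flip_bit(value, bit):
    """Model a single-bit upset by toggling one bit of an integer field."""
    return value ^ (1 << bit)


cur = (1, 1)   # router currently holding the packet (assumed 4x4 mesh)
dest = (1, 3)  # intended destination

good = xy_next_port(cur, dest)

# Flip bit 1 of the destination X field: 1 (0b01) becomes 3 (0b11).
faulty_dest = (flip_bit(dest[0], 1), dest[1])
bad = xy_next_port(cur, faulty_dest)

print(good, bad)
```

In this toy case the fault-free packet leaves through the NORTH port, while the corrupted header sends it EAST toward a node that never requested it, adding traffic to links that were not part of the original path.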
In addition, if you use error-detecting/correcting mechanisms, you have two main options:
- Put them in the network interface as an end-to-end error-detecting/correcting mechanism
- Or put them in every router as a link-level error-detecting/correcting mechanism
Each has its advantages and drawbacks; the choice depends on the robustness of the application running on top of the NoC.
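The difference between the two options can be sketched roughly as follows. This is only an illustration under my own assumptions: a CRC-32 stands in for whatever detection code a real design would use, the path is a simple chain of routers, and `send` is a hypothetical function, not an existing API.

```python
import zlib


def send(payload, hops, faulty_hop, per_hop_check):
    """Push a payload through `hops` routers and report where corruption is detected.

    Returns "router N" for link-level detection, "destination NI" for
    end-to-end detection, or None if the payload arrives intact.
    """
    checksum = zlib.crc32(payload)  # computed once at the source network interface
    data = bytearray(payload)
    for hop in range(hops):
        if hop == faulty_hop:
            data[0] ^= 0x01  # single-bit upset on the link into this router
        # Link-level option: every router re-checks the data it receives.
        if per_hop_check and zlib.crc32(bytes(data)) != checksum:
            return f"router {hop}"
    # End-to-end option: only the destination network interface checks.
    if zlib.crc32(bytes(data)) != checksum:
        return "destination NI"
    return None


link_level = send(b"flit", hops=4, faulty_hop=1, per_hop_check=True)
end_to_end = send(b"flit", hops=4, faulty_hop=1, per_hop_check=False)
print(link_level, end_to_end)
```

With per-hop checking the error is caught at the router where it appears, at the cost of a checker in every router; with end-to-end checking the corrupted data travels the rest of the path before the destination notices.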
Errors and faults occur everywhere, but they matter most in the routing process; therefore, the common assumption is to "detect" errors or faults at the routers.
A network-on-chip is an embedded network that provides scalable communication support for a multi-core system (homogeneous or heterogeneous). Since the subject is communication, the entity responsible for its reliability is indeed the router (routing algorithms, switching, arbitration, etc.). For this reason, most people dealing with this problem work at the router level.