These days, feature fusion in deep neural networks has become so popular that many academic papers use it without citing any reference.

Could you please recommend some papers that study the effects of feature fusion (i.e., concatenating hidden layers or multiple inputs) or of forking the network (for multiple outputs)?
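To make the two terms concrete, here is a minimal sketch of what I mean, written in plain Python with no framework. The layer sizes, the helper names (`dense`, `init_layer`, `forward`), and the random weights are all hypothetical and chosen only for illustration; they do not come from any particular paper.

```python
import random

def dense(x, weights, bias):
    """Affine layer followed by ReLU: y_j = max(0, sum_i x_i * W[i][j] + b_j)."""
    return [
        max(0.0, sum(xi * w_row[j] for xi, w_row in zip(x, weights)) + bias[j])
        for j in range(len(bias))
    ]

def init_layer(n_in, n_out, rng):
    """Random weights of shape (n_in, n_out) and a zero bias (illustrative only)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]
    bias = [0.0] * n_out
    return weights, bias

rng = random.Random(0)

# Two input branches, one per input (assumed sizes: 4 and 2 features).
branch_a = init_layer(4, 3, rng)
branch_b = init_layer(2, 3, rng)

# Two output heads that both read the fused 6-dimensional vector.
head_1 = init_layer(6, 2, rng)
head_2 = init_layer(6, 1, rng)

def forward(x_a, x_b):
    h_a = dense(x_a, *branch_a)      # hidden features from input A
    h_b = dense(x_b, *branch_b)      # hidden features from input B
    fused = h_a + h_b                # feature fusion: concatenate the two hidden vectors
    out_1 = dense(fused, *head_1)    # fork: first output head
    out_2 = dense(fused, *head_2)    # fork: second output head
    return fused, out_1, out_2

fused, out_1, out_2 = forward([0.1, 0.2, 0.3, 0.4], [1.0, -1.0])
print(len(fused), len(out_1), len(out_2))
```

Here "fusion" is the concatenation step and "forking" is the point where one shared representation feeds several heads. I am interested in papers that analyze either step, not in this particular toy layout.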

Papers from any field are welcome.

Thank you for your attention.
