I understand how and why one splits data into train/test/cross-validation sets; however, it is unclear to me how this works analogously when using the more rigorous nested k-fold method. Additionally, can you describe the motivational difference between the k-fold method and the nested k-fold method? The link below provides mathematical details for the nested k-fold method.
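
For concreteness, here is a minimal sketch of how the index splits are commonly organized in nested k-fold cross-validation. It assumes contiguous, unshuffled folds, and the helper names `k_fold_indices` and `nested_k_fold_indices` are hypothetical illustrations, not taken from the linked thesis. Each outer fold plays the role of the test set, and the remaining data is split again by an inner k-fold into train and validation parts.

```python
def k_fold_indices(indices, k):
    """Yield k (train, held_out) pairs over the given indices.

    Assumption: folds are contiguous and unshuffled, for readability only.
    """
    indices = list(indices)
    n = len(indices)
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(indices[start:start + size])
        start += size
    for i in range(k):
        held_out = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


def nested_k_fold_indices(n_samples, outer_k, inner_k):
    """Yield (outer_test, inner_splits) for each outer fold.

    inner_splits is a list of (train, validation) pairs drawn only from
    the outer training portion, so the outer test fold stays untouched.
    """
    for outer_train, outer_test in k_fold_indices(range(n_samples), outer_k):
        inner_splits = list(k_fold_indices(outer_train, inner_k))
        yield outer_test, inner_splits


if __name__ == "__main__":
    for outer_test, inner_splits in nested_k_fold_indices(12, outer_k=3, inner_k=2):
        print("outer test:", outer_test)
        for train, validation in inner_splits:
            print("  train:", train, "validation:", validation)
```

In this layout the inner validation folds would be used to choose hyperparameters, and the outer test fold is only used to score the model chosen that way, so the outer estimate is not influenced by the tuning step. Plain k-fold, by contrast, uses the same held-out folds for both purposes.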

Thesis: Understanding Random Forests: From Theory to Practice
