I understand how and why one splits data into training, test, and cross-validation sets; however, it is unclear to me how this works analogously when using the more rigorous nested k-folds method. Additionally, can you describe the motivational difference between a k-folds method and a nested k-folds method? The link below provides mathematical details for the nested k-folds method.
Thesis Understanding Random Forests: From Theory to Practice
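To make the question concrete, here is a minimal sketch of how I currently picture the splits in nested k-folds. It is only my assumption of the scheme, not taken from the thesis: the helper names `k_fold_indices` and `nested_k_fold_splits` are hypothetical, the folds are built without shuffling, and only indices are produced (no model is fitted).

```python
def k_fold_indices(indices, k):
    """Split a list of indices into k disjoint, nearly equal folds."""
    return [indices[i::k] for i in range(k)]


def nested_k_fold_splits(n_samples, outer_k, inner_k):
    """Yield (train, validation, test) index lists for each outer/inner fold pair."""
    all_indices = list(range(n_samples))
    for test in k_fold_indices(all_indices, outer_k):
        # The outer fold is held out as the test set and never used for tuning.
        test_set = set(test)
        outer_train = [i for i in all_indices if i not in test_set]
        # The remaining data is split again: the inner folds play the role
        # of the validation set used to choose hyperparameters.
        for validation in k_fold_indices(outer_train, inner_k):
            validation_set = set(validation)
            train = [i for i in outer_train if i not in validation_set]
            yield train, validation, test


# Assumed toy sizes: 12 samples, 3 outer folds, 2 inner folds.
splits = list(nested_k_fold_splits(n_samples=12, outer_k=3, inner_k=2))
for train, validation, test in splits:
    print("train:", train, "validation:", validation, "test:", test)
```

If this picture is right, plain k-folds would correspond to only the inner loop, so the same folds both select the hyperparameters and report the score, whereas the nested version keeps an outer test fold that the selection step never sees.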