
Reinforcement-Learning-On-NLP means using the reward to update the model directly.

Re-Label-That-Data means using the reward to re-label the related data and then re-training the model on the corrected data.
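
To make the contrast concrete, here is a minimal toy sketch in plain Python. Everything in it is an assumption made for illustration: the one-weight logistic model, the hypothetical `reward` function (which simply checks a label against the hidden rule `x > 0`), and the helper names `rl_update`, `relabel` and `supervised_train`. The first path uses a REINFORCE-style update, which is only one possible way to update a model from a reward.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reward(x, label):
    # Hypothetical reward signal: +1 if the label agrees with the
    # hidden rule "x > 0 means class 1", otherwise -1.
    return 1.0 if label == (1 if x > 0 else 0) else -1.0


def rl_update(w, xs, lr=0.1, epochs=50, seed=0):
    # Path 1: the reward updates the model directly.
    # Sample a label from the model, score it, and scale the
    # log-probability gradient (a - p) * x by the reward.
    rng = random.Random(seed)
    for _ in range(epochs):
        for x in xs:
            p = sigmoid(w * x)
            a = 1 if rng.random() < p else 0
            w += lr * reward(x, a) * (a - p) * x
    return w


def relabel(data):
    # Path 2, step 1: the reward fixes the data, not the model.
    # Keep a label with positive reward, flip one with negative reward.
    return [(x, y if reward(x, y) > 0 else 1 - y) for x, y in data]


def supervised_train(w, data, lr=0.1, epochs=50):
    # Path 2, step 2: ordinary supervised re-training
    # (cross-entropy gradient step) on the re-labeled data.
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(w * x)
            w += lr * (y - p) * x
    return w


xs = [-2, -1, 1, 2]
noisy_data = [(-2, 0), (-1, 1), (1, 0), (2, 1)]  # two wrong labels

w_rl = rl_update(0.0, xs)
clean_data = relabel(noisy_data)
w_relabel = supervised_train(0.0, clean_data)
```

In this toy setting both paths push the weight in the same direction. The practical difference is that the second path leaves behind a corrected dataset that can be inspected and reused, while the first path changes only the model.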
