There are newer models such as BERT that require pre-training and fine-tuning. And there are traditional models, such as Decision Trees and SVMs, which require us to extract features from the text and train on them.

If I want to compare BERT's results with those of the traditional models, do I need to extract features (perform feature engineering) on the text for the traditional models? Or can I somehow use the pre-trained vectors (embeddings) from BERT as the features?
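To make the second idea concrete, here is a rough sketch of what I have in mind: take a fixed-size vector from pre-trained BERT for each text and train an SVM on those vectors. This assumes the Hugging Face `transformers` and `scikit-learn` packages; the model name, the choice of the [CLS] vector, and the toy data are only placeholders for illustration.

```python
# Sketch: use pre-trained BERT vectors as features for a traditional classifier (SVM).
# Assumes: pip install transformers torch scikit-learn
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.svm import SVC

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
bert.eval()

def embed(texts):
    """Return one fixed-size vector per text, taken from the [CLS] token."""
    with torch.no_grad():
        enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        out = bert(**enc)
    # last_hidden_state has shape (batch, seq_len, 768); position 0 is [CLS]
    return out.last_hidden_state[:, 0, :].numpy()

# Hypothetical toy data, just to show the shape of the pipeline
train_texts = ["great movie", "terrible plot"]
train_labels = [1, 0]

clf = SVC()
clf.fit(embed(train_texts), train_labels)
print(clf.predict(embed(["really enjoyable film"])))
```

Is this kind of feature extraction from BERT a fair basis for comparison against hand-engineered features, or does a fair comparison require fine-tuning BERT end to end?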
