Hi,

I am new to Gensim and Doc2Vec, and I am working on a sentiment analysis problem. Previously I did the following: given a dataset (e.g. Amazon product reviews), I computed the cosine similarity between documents, scoring each document against all other documents, and took the scores of the top k most similar documents as features. I then fed those feature vectors to a classifier and computed the F1 score.
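
To make the previous approach concrete, here is a minimal sketch of the top-k cosine similarity features. The function name `top_k_similarity_features` and the toy vectors are only illustrative, and it assumes every document is already represented as a numeric vector.

```python
import numpy as np

def top_k_similarity_features(vectors, k):
    """For each document vector, return its k highest cosine
    similarities to the other documents, sorted descending."""
    vectors = np.asarray(vectors, dtype=float)
    if not 0 < k < len(vectors):
        raise ValueError("k must be between 1 and number of documents - 1")
    # Normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)
    sims = unit @ unit.T
    # Exclude each document's similarity with itself.
    np.fill_diagonal(sims, -np.inf)
    # Sort each row in descending order and keep the k largest scores.
    return -np.sort(-sims, axis=1)[:, :k]

# Toy example: three 2-dimensional document vectors.
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
features = top_k_similarity_features(docs, k=1)
print(features.shape)  # one row per document, k columns
```

Each row of `features` is then the input vector for one document in the classifier.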

Now I want to try something different. The sketch I have in mind is:

  • First, I will train my Doc2Vec model on some training dataset (e.g. the Google News dataset). Please suggest a better one if you know of any.
  • Then I will generate feature vectors for the test data (my product reviews dataset) using the trained model.
  • Then I will compute similarity scores between the vectors of the test dataset (product reviews), pick the top K scores, and use those scores as features.
  • Then I will feed these features to a classifier and compute the F1 score.
What I need from you:

  • Please check whether my flow is correct.
  • Any tutorial or help on how to do it.
  • A better training dataset, if you know of one.
  • The best classifier for my case. Previously I used an SVM.
  • Any other suggestions, tutorials, video tutorials, or links.

Thanks.
