22 June 2013

When thousands of documents from the same domain are fed into standard LSA topic modeling, the resulting topics are confusing. In some cases, many of the keywords in one topic are repeated across many other topics.

I am new to topic modeling. A sample result is shown below (the keyword "play" appears in all three topics).

topic  ratio          keyword
0       0.178252012   play
0       0.13429748    like
0       0.129169756   get
1      -0.299574178   ending
1      -0.213000647   play
2       0.395652058   ending
2      -0.296851122   play

What could be the problem?

How can I confirm that the model is built correctly?
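To make the second question concrete, here is a minimal sketch of one possible sanity check. It assumes LSA is computed as a plain SVD of a document-term count matrix using NumPy. The toy corpus, the stop-word list, and the helper name `top_terms` are hypothetical and are not taken from the actual data or tool used above.

```python
import numpy as np

# Hypothetical toy corpus; the real documents are not shown in the question.
docs = [
    "play the game and like the ending",
    "get the game and play it again",
    "like the ending of the story",
    "play music and get tickets",
    "the ending was sad but i like it",
]
stop_words = {"the", "and", "of", "it", "was", "but", "i"}

# Build a document-term count matrix A (rows: documents, columns: terms).
tokens = [[w for w in d.split() if w not in stop_words] for d in docs]
vocab = sorted({w for doc in tokens for w in doc})
index = {w: i for i, w in enumerate(vocab)}
A = np.zeros((len(docs), len(vocab)))
for row, doc in enumerate(tokens):
    for w in doc:
        A[row, index[w]] += 1.0

# LSA: the rows of Vt are the term loadings of each latent dimension.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 3
topics = Vt[:k]

def top_terms(component, n=3):
    """Return the n terms with the largest absolute loading."""
    order = np.argsort(-np.abs(component))[:n]
    return [(vocab[i], round(float(component[i]), 3)) for i in order]

for t in range(k):
    print(t, top_terms(topics[t]))

# Sanity checks on the decomposition itself.
is_orthonormal = np.allclose(topics @ topics.T, np.eye(k))  # topic vectors are orthonormal
reconstruction_ok = np.allclose((U * s) @ Vt, A)             # full SVD reproduces A
values_sorted = all(s[i] >= s[i + 1] for i in range(len(s) - 1))
```

In a sketch like this, the same term can load on several dimensions with different signs, because the dimensions are orthogonal directions rather than disjoint word groups, and the overall sign of each dimension is arbitrary. If the three checks pass, the decomposition itself is at least internally consistent.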
