How useful are Topic Models in practice?

Eric K. Ringger

For a careful analysis of the quality of topics discoverable by LDA, I highly recommend the paper "Reading Tea Leaves" by Chang, Boyd-Graber, Gerrish, Wang, and Blei. http://www.umiacs.umd.edu/~jbg/docs/nips2009-rtl.pdf

Sebastiano Panichella

Yes, I like your question, because results (empirical or otherwise) reported in academic papers often never make it into common practice.

That is the challenge for academic work.

However, there is a nice paper that discusses the use of topic models on data from Facebook and/or Twitter, and it is good to see that the ideas can be applied to concrete tasks.

Paper title: "Empirical Study of Topic Modeling in Twitter"

Authors: Liangjie Hong and Brian D. Davison

Link: http://snap.stanford.edu/soma2010/papers/soma2010_12.pdf

Mikael Lundin

I don't have an answer to this question, but I would like to ask an additional one: to what extent can topics be categorized within NLP/computational linguistics?

Frederic Andres

Did you consider the Topic Maps model (ISO 13250)?

Kristóf Csorba

I used topic modeling techniques for compact document topic representation. In this application, topic identification (and modeling) serves as a data compression technique: it identifies the rough category and allows the selection of topic-dependent, finer classifications. In this sense, the topics are a characteristic of the corpus, and changes in them reflect changes in the requirements. This is not necessarily bad. Of course, if you want to recognize topics as we humans consider them, you need classification and supervised learning with annotated training data sets. If you use an unsupervised approach, the generated topics may not exactly overlap with our original topics. In summary, I think topic models are very useful in many respects, but if you do not want to see shifts in topic boundaries, you need supervised learning techniques. Otherwise, the system may draw borders where you least expect them.
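To make the "compression" idea concrete, here is a minimal sketch of LDA fitted with a collapsed Gibbs sampler, using only the Python standard library. It is not the method used in the application described above; the function name `lda_gibbs`, the toy documents, and the hyperparameter values are all assumptions chosen for illustration. Each document, whatever its length, is reduced to a vector of `n_topics` proportions.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs: list of token lists. Returns one topic-proportion vector per document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]         # tokens in doc d assigned to topic k
    topic_word = [[0] * V for _ in range(n_topics)]    # times word v is assigned to topic k
    topic_total = [0] * n_topics                       # total tokens assigned to topic k
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token (unnormalized).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions: the compact representation.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + n_topics * alpha
        theta.append([(doc_topic[d][t] + alpha) / denom for t in range(n_topics)])
    return theta

# Hypothetical toy corpus, for illustration only.
docs = [
    "goal match team score goal".split(),
    "team coach match win".split(),
    "stock market price trade".split(),
    "market trade stock bank price".split(),
]
theta = lda_gibbs(docs, n_topics=2)
for row in theta:
    print([round(p, 2) for p in row])
```

Because the model is unsupervised, nothing guarantees that the two inferred topics line up with "sports" and "finance"; that is the shift in topic boundaries mentioned above.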

Leo Wang

I have the same question in my own application, and I have not found an answer yet.

Justin Fister

I have found topic models to be a useful feature in text classification problems. They weren't one of the most predictive features, but were still helpful to the model.
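As a rough sketch of what "topic models as a feature" can look like, the snippet below appends LDA topic proportions to bag-of-words counts before training a classifier. It assumes scikit-learn and SciPy are installed; the texts, labels, number of topics, and choice of logistic regression are made up for illustration and are not taken from the experiments described above.

```python
from scipy.sparse import csr_matrix, hstack
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data, for illustration only.
texts = [
    "the team scored a late goal in the match",
    "the coach praised the team after the match",
    "the stock market fell as prices dropped",
    "traders sold stock when the market opened",
]
labels = ["sports", "sports", "finance", "finance"]

# Bag-of-words counts: the usual sparse, high-dimensional features.
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(texts)

# Topic proportions: a dense, low-dimensional summary of each document.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_features = lda.fit_transform(counts)

# Concatenate both feature sets column-wise and train a classifier on them.
features = hstack([counts, csr_matrix(topic_features)], format="csr")
clf = LogisticRegression(max_iter=1000).fit(features, labels)
print(features.shape)
```

In a real experiment the LDA model and the classifier would be fitted on training data only, and the contribution of the topic features would be checked on held-out documents.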

Bahram Amini

Generally, this is a natural feature of text mining applications. For example, similar documents with different "themes" may have different topic models. The issue originates from the fact that the "theme" of the documents is missing when a topic model is developed. This is an ongoing research question that requires new ideas.

John Chen

Topic models are good for data exploration, when you have a new data set and don't know what kinds of structure you might find in it. But even if you do know what structure to expect in your data set, topic models are still useful when you don't have the time or resources to construct classification models based on supervised machine learning. Lastly, if you do have the time and resources to construct classification models based on supervised learning, topic models are still useful as extra features to add to the models in order to increase their accuracy. This is because topic models act as a kind of "smoothing" that helps combat the sparse data problem often seen in supervised learning.

There is previous work in automatically assigning labels to topics discovered by topic models. They work OK, but they're heuristic in nature.

So, topic models are not the be all and end all of data analysis. But they're a useful tool.

Xuan-Hieu Phan

Topic models can be used for document classification, contextual matching (like contextual advertising), content-based recommendation, and search. The CTO of Chomp said that they use LDA as an important part of their app search engine for iOS and Android. Chomp was acquired by Apple a couple of months ago. We are currently building a content-based recommendation system.

Sebastian de la Chica

Curious if you are asking the question in terms of commercial applications or research practice?

Andras Kornai

Topic classification can be very useful in practice. The Northern Light search engine used a hierarchical scheme of 22k topics (the hierarchy was manually created, but the documents were automatically classified). The general philosophy these days is that there is so much data out there that recall is irrelevant: as long as precision is good, you will find relevant answers. Under such circumstances, topic classification is really quite irrelevant. However, in those cases where recall still matters (e.g. there may only be a very small pool of qualified applicants in a huge pile of resumes, and you want to hire all of them), topic classification is still relevant, since it lets you discard 99.99% of your data and concentrate on the rest.

Hao Wang

Topic models such as LDA can be used for recommender systems.

Agung Dewandaru

While some criticisms of the question are valid, we need to underline that the primary use of topic modeling is as a basis for data exploration. In most cases its output will need to be post-processed or used as a starting point for further task-oriented customization. It can be used to assist IR or IE tasks. It should also be noted that the field is still growing, thanks to the appeal of the basic topic model (LDA).

That being said, there are some directions for further improving the quality and stability of the inferred topics. One is to combine supervision, in the form of labels, with the basic unsupervised model to "influence" the inferred topics (see Labeled LDA and Supervised LDA, for example).

A second avenue is to make the topic model itself nonparametric. That is, we are no longer bound to a predefined (and often trial-and-error) number of topics K. Instead, the number of topics is inferred from the documents via a Dirichlet process / Chinese Restaurant Process. See HDP and SHDP for examples. However, much remains to be improved in this direction.
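The Chinese Restaurant Process mentioned above can be simulated in a few lines. This is only a sketch of the prior that lets the number of clusters grow with the data, not an implementation of HDP; the function name `crp_table_sizes` and the concentration values are assumptions for illustration.

```python
import random

def crp_table_sizes(n_customers, concentration, seed=0):
    """Simulate a Chinese Restaurant Process and return the size of each table."""
    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers seated at table k
    for _ in range(n_customers):
        # An existing table is chosen with weight equal to its current size;
        # a new table is opened with weight equal to the concentration parameter.
        weights = tables + [concentration]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(tables):
            tables.append(1)
        else:
            tables[choice] += 1
    return tables

for concentration in (0.5, 2.0, 10.0):
    sizes = crp_table_sizes(1000, concentration)
    print(concentration, len(sizes))
```

The number of occupied tables (the analogue of the number of topics) is not fixed in advance: it is an outcome of the process, and it tends to increase with both the number of customers and the concentration parameter. The trial-and-error choice of K is thus replaced by a choice of concentration parameter.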

Hiba J. Aleqabie

Topic modelling is also used for information retrieval.

Fatemeh Zarmehr

Is it possible to use topic modeling to classify documents such as digital resources?

Fatemeh Zarmehr

Is anybody here able to respond?

Hiba J. Aleqabie

Fatemeh Zarmehr, yes, you can.

Mahdieh Zabihimayvan

Dear @Fatemeh Zarmehr ,

Topic modeling can be used for classification of documents based on their topics.

Junaid Rashid

Topic modeling is used for document classification, and it can also improve classification results.

José Ramón Saura

Fatemeh Zarmehr, you can apply a Latent Dirichlet Allocation (LDA) model to digital resources divided into documents. LDA is a widely used topic model with implementations available in Python; it determines the topics of the documents by analyzing their content. You can check out this link: https://pypi.org/project/lda/
