How should the low accuracy of an LSA model be interpreted?

05 May 2013

I applied LSA to 500 PDF documents retrieved from Google (for a certain feature). When I tried to infer the topic of new documents, the accuracy was low.

What could be the reason for this?

Ian Kennedy

The fact that you used a phrase to select the 500 PDFs from Google does not mean that they contain material on the same topic, or even that they are homogeneous. The first 125 results are more likely to contain your terms than the next 125. If you are using the 501st result from Google for testing, it is possible that the phrase occurs only once in the whole document, or that the document is an outlier. Try treating the first 125 PDFs as 'seen' documents and the next 125 as 'unseen'. Are the topics of all these PDFs poorly classified?
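One quick way to check the homogeneity concern is to count how often the search phrase actually occurs in each block of results. The following is a minimal sketch, assuming the PDF text has already been extracted into strings ordered by search rank; the function names, the toy documents, and the phrase "battery life" are all hypothetical and stand in for the real corpus.

```python
from statistics import mean


def phrase_count(text: str, phrase: str) -> int:
    """Count case-insensitive, non-overlapping occurrences of phrase in text."""
    return text.lower().count(phrase.lower())


def counts_by_rank_bucket(docs, phrase, bucket_size=125):
    """Return the mean phrase count for each block of consecutive results.

    docs must be ordered by search rank, best hit first.
    """
    buckets = []
    for start in range(0, len(docs), bucket_size):
        chunk = docs[start:start + bucket_size]
        buckets.append(mean(phrase_count(d, phrase) for d in chunk))
    return buckets


# Toy stand-in for extracted PDF text, ordered by rank (hypothetical data).
docs = [
    "battery life is great and battery life lasts long",
    "review of battery life; battery life test",
    "mentions battery life once",
    "unrelated report on screen size",
]

# bucket_size=2 only because the toy corpus is tiny; use 125 on real data.
bucket_means = counts_by_rank_bucket(docs, "battery life", bucket_size=2)
print(bucket_means)
```

If the mean count drops sharply from one block to the next, the later results are probably weak examples of the topic, and a test document drawn from them may say little about the model itself.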

When all else fails, doubt your implementation of the algorithm. Check that none of the assumptions of latent semantic analysis are violated. See e.g. http://en.wikipedia.org/wiki/Latent_semantic_analysis and its references.
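To test an implementation, it can help to compare it against a very small reference version on a corpus where the right answer is obvious. The sketch below is not the asker's code: it assumes NumPy is available, uses raw term counts with no tf-idf or other weighting, and all function names and toy documents are hypothetical. It builds a term-document matrix, truncates the SVD to k dimensions, and folds a new document into the latent space via q_hat = Sigma_k^{-1} U_k^T q.

```python
import numpy as np


def build_vocab(docs):
    """Map each distinct lowercase token to a row index."""
    words = sorted({w for d in docs for w in d.lower().split()})
    return {w: i for i, w in enumerate(words)}


def term_vector(doc, vocab):
    """Raw term counts for one document; out-of-vocabulary words are ignored."""
    v = np.zeros(len(vocab))
    for w in doc.lower().split():
        if w in vocab:
            v[vocab[w]] += 1.0
    return v


def fit_lsa(A, k):
    """Truncated SVD of the term-document matrix A (terms x docs)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Rows of the third return value are the k-dim document coordinates.
    return U[:, :k], s[:k], Vt[:k, :].T


def fold_in(q, U_k, s_k):
    """Project a term vector into the latent space: Sigma_k^{-1} U_k^T q."""
    return (U_k.T @ q) / s_k


def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


# Toy 'seen' corpus with two clearly separated topics (hypothetical data).
seen = [
    "battery life battery charge",
    "battery charge power",
    "screen display pixels",
    "display screen resolution",
]
vocab = build_vocab(seen)
A = np.column_stack([term_vector(d, vocab) for d in seen])
U_k, s_k, V_k = fit_lsa(A, k=2)

# Sanity check: folding in a seen document must reproduce its coordinates.
roundtrip_ok = np.allclose(fold_in(A[:, 0], U_k, s_k), V_k[0])

# 'Unseen' document: compare it with every seen document in latent space.
q_hat = fold_in(term_vector("battery power charge", vocab), U_k, s_k)
sims = [cosine(q_hat, V_k[j]) for j in range(len(seen))]
best = int(np.argmax(sims))
print(roundtrip_ok, best, [round(x, 3) for x in sims])
```

If the round-trip check fails, or an unseen document that obviously belongs to one topic lands nearer the other, the problem is more likely in the implementation (matrix orientation, scaling by the singular values, vocabulary mismatch between training and test documents) than in the data.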
