How to calculate RMSSTD (root mean square standard deviation) for text cluster analysis?

Md.Julkar Nayeen Mahi

Hi dear Avi, I hope you find these articles helpful.

http://www.imm.dtu.dk/~perbb/MAS/ST116/module02/index.html

https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_cluster_sect021.htm

You can also check out the RapidMiner tool; I think it is better than R and suits your work.

Best regards,

Mahi

http://finzi.psych.upenn.edu/library/bio3d/html/rmsd.html

https://www.youtube.com/watch?v=G8j8KAJtJlw

https://www.youtube.com/watch?v=-2Koi-caSZw

https://rstudio-pubs-static.s3.amazonaws.com/31867_8236987cf0a8444e962ccd2aec46d9c3.html

http://vancouverdata.blogspot.com/2010/11/text-analytics-with-rapidminer-part-4.html

https://rapidminer.com/

Javzmaa Tsend

Our data set has 2,447 objects and 42 disease and meteorological attributes.

For the classification method, the disease and meteorological attributes were converted to categorical classes. We then built the model on two-thirds of the data (the training set) and estimated its accuracy on the remaining one-third (the test set).

For the cluster analysis, we used the numerical tuples. Before clustering, we calculated the Hopkins statistic (H). Because H is 0.1, we concluded that the data contain statistically significant clusters (under the convention in which values of H near 0 indicate clusterable data).

With the k-medoids method, the silhouette coefficients of the resulting clusters are 0.4, 0.4, and 0.4, which we took as evidence of meaningful cluster structure. We therefore recommend the k-medoids method.
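For readers who want to see how a per-cluster silhouette coefficient like the ones quoted above can be computed, here is a minimal sketch in plain Python. It assumes Euclidean distance and at least two clusters; the function names (`euclidean`, `silhouette_per_cluster`) and the toy data are made up for illustration and are not taken from the analysis described in this answer.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def silhouette_per_cluster(points, labels):
    """Mean silhouette coefficient for each cluster label.

    Assumes at least two distinct labels (hypothetical helper, not a
    library function).
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = {lab: [] for lab in clusters}
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores[lab].append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so it does not affect the sum)
        a = sum(euclidean(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores[lab].append((b - a) / denom if denom > 0 else 0.0)

    return {lab: sum(s) / len(s) for lab, s in scores.items()}

# Toy data: two tight, well-separated clusters
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
labels = [0, 0, 1, 1]
result = silhouette_per_cluster(points, labels)
print(result)
```

Values close to 1 mean a cluster is compact and far from its neighbours, values near 0 mean clusters overlap, and negative values suggest misassigned points.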

Abhishek Verma

According to RussAlbright (SAS Employee):

Each document is a k-dimensional vector.

Similarly, the mean of the cluster is a k-dimensional vector in which each component is the average of the corresponding component across the m documents in the cluster.

A document's error is the square root of the sum of the squared differences between each of its k components and the corresponding component of the cluster mean.

The RMSSTD is an error for the entire cluster, so to incorporate all documents from the cluster in this error calculation, it becomes the sum of the squared differences for every component of every document. There are m*k components to sum over in this case.
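The description above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the quoted text gives the sum over the m*k components but not the divisor, so the sketch assumes that sum is divided by k*(m - ddof) before the square root is taken, with `ddof=0` giving a plain mean over m*k terms and `ddof=1` a pooled sample variance. The function names and the toy vectors are hypothetical, and this is not presented as the exact formula SAS uses.

```python
import math

def cluster_mean(docs):
    """Component-wise mean of m documents, each a k-dimensional vector."""
    m = len(docs)
    k = len(docs[0])
    return [sum(doc[j] for doc in docs) / m for j in range(k)]

def document_error(doc, mean):
    """Error of one document: Euclidean distance to the cluster mean."""
    return math.sqrt(sum((x - c) ** 2 for x, c in zip(doc, mean)))

def rmsstd(docs, ddof=0):
    """Root mean square standard deviation of a single cluster.

    Sums the squared differences over all m*k components, divides by
    k * (m - ddof), and takes the square root. The divisor is an
    assumption; with ddof=1 the cluster needs at least two documents.
    """
    m = len(docs)
    k = len(docs[0])
    mean = cluster_mean(docs)
    total = sum((x - c) ** 2 for doc in docs for x, c in zip(doc, mean))
    return math.sqrt(total / (k * (m - ddof)))

# Toy cluster: m = 4 documents, k = 2 term weights each
docs = [[1.0, 0.0], [3.0, 0.0], [1.0, 4.0], [3.0, 4.0]]
mean = cluster_mean(docs)
print(mean)                           # [2.0, 2.0]
print(document_error(docs[0], mean))  # sqrt(1 + 4) = sqrt(5)
print(rmsstd(docs))                   # sqrt(20 / 8)
print(rmsstd(docs, ddof=1))           # sqrt(20 / 6)
```

For text clustering, each document vector would typically hold term weights (for example TF-IDF or SVD dimensions), and the function would be applied to each cluster separately. A smaller RMSSTD indicates a more homogeneous cluster.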