Our data set has 2447 objects and 42 disease and meteorological attributes.
For the classification method, the disease and meteorological attributes were converted to categorical (class) values. The model was then built on two-thirds of the data (the training set), and its accuracy was estimated on the remaining one-third (the test set).
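The two-thirds / one-third hold-out procedure described above can be sketched as follows. This is only an illustration: the helper names `split_two_thirds` and `accuracy`, the fixed random seed, and the placeholder records are assumptions, and the classifier itself is not specified in the text.

```python
import random

def split_two_thirds(records, seed=0):
    """Shuffle the records and return (training_set, test_set) in a 2/3 : 1/3 split."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]

def accuracy(predict, test_set):
    """Fraction of (features, label) pairs in test_set that predict() labels correctly."""
    hits = sum(1 for features, label in test_set if predict(features) == label)
    return hits / len(test_set)

# Placeholder records (hypothetical), sized like the data set in the text.
records = [((i,), i % 2) for i in range(2447)]
train, test = split_two_thirds(records)
```

Any model fitted on `train` can then be passed to `accuracy` as a `predict` function to estimate its accuracy on `test`.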
For cluster analysis, the numerical tuples were used. Before clustering, the Hopkins statistic (H) was calculated; because H is 0.1, the data contain statistically significant clusters.
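A minimal sketch of one way to compute a Hopkins statistic is given below. It assumes the convention in which values near 0 indicate clustered data and values near 0.5 indicate uniformly distributed data, which is consistent with reading H = 0.1 as evidence of clusters. The function names, the sampling of artificial points from the bounding box of the data, and the demonstration data are all assumptions, not the authors' implementation.

```python
import math
import random

def nearest_distance(point, data, skip_index=None):
    """Euclidean distance from point to its nearest neighbour in data."""
    return min(
        math.dist(point, other)
        for i, other in enumerate(data)
        if i != skip_index
    )

def hopkins(data, n_samples, seed=0):
    """Hopkins statistic, with values near 0 indicating clustered data (assumed convention)."""
    rng = random.Random(seed)
    dims = len(data[0])
    lows = [min(p[d] for p in data) for d in range(dims)]
    highs = [max(p[d] for p in data) for d in range(dims)]

    # w: nearest-neighbour distances of real points sampled from the data
    sampled = rng.sample(range(len(data)), n_samples)
    w = [nearest_distance(data[i], data, skip_index=i) for i in sampled]

    # u: nearest-neighbour distances of artificial points drawn uniformly
    # from the bounding box of the data
    u = []
    for _ in range(n_samples):
        fake = [rng.uniform(lows[d], highs[d]) for d in range(dims)]
        u.append(nearest_distance(fake, data))

    return sum(w) / (sum(u) + sum(w))

# Hypothetical demonstration data: two tight, well-separated blobs.
demo_rng = random.Random(1)
blob_a = [(demo_rng.gauss(0, 0.01), demo_rng.gauss(0, 0.01)) for _ in range(50)]
blob_b = [(demo_rng.gauss(10, 0.01), demo_rng.gauss(10, 0.01)) for _ in range(50)]
h = hopkins(blob_a + blob_b, n_samples=10)
```

For strongly clustered data such as the two blobs above, `h` comes out close to 0. Some references define the statistic the other way round, so that values near 1 indicate clustering.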
With the k-medoids method, the silhouette coefficients of the resulting clusters are 0.4, 0.4, and 0.4, which indicates a valid cluster structure. We therefore propose the k-medoids method.
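The per-cluster silhouette coefficient can be sketched as below. This is a plain illustration using Euclidean distance and hypothetical names; it takes the cluster labels as given (for example, from a k-medoids run) and assumes at least two clusters.

```python
import math

def silhouette_per_cluster(points, labels):
    """Return {cluster_label: mean silhouette coefficient of its members}."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = {lab: [] for lab in clusters}
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores[lab].append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point itself contributes a distance of 0 to the sum)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores[lab].append((b - a) / max(a, b))
    return {lab: sum(vals) / len(vals) for lab, vals in scores.items()}

# Hypothetical toy data: two compact clusters far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
s = silhouette_per_cluster(points, labels)
```

Values close to 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest that points are assigned to the wrong cluster.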
Similarly, the mean of the cluster is a k-dimensional vector in which each component is the average of the corresponding component over the m documents in the cluster.
The error of a document is the square root of the sum of the squared differences between each of its k components and the corresponding component of the cluster mean.
The RMSSTD is an error for the entire cluster, so to incorporate all documents of the cluster in the error calculation, the sum of squared differences is taken over every component of every document. There are m*k components to sum over in this case.
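The cluster mean, the document error, and the cluster-wide sum over m*k components can be illustrated with a short sketch. The function names and the toy documents are hypothetical. The final normalisation by k*(m - 1) before taking the square root is an assumption based on the usual definition of RMSSTD; the text above only describes the sum.

```python
import math

def cluster_mean(docs):
    """Component-wise average of m documents, each a k-dimensional vector."""
    m, k = len(docs), len(docs[0])
    return [sum(doc[j] for doc in docs) / m for j in range(k)]

def document_error(doc, mean):
    """Square root of the summed squared differences between doc and the mean."""
    return math.sqrt(sum((x - c) ** 2 for x, c in zip(doc, mean)))

def cluster_sum_of_squares(docs):
    """Sum of squared differences over all m*k components of the cluster."""
    mean = cluster_mean(docs)
    return sum((x - c) ** 2 for doc in docs for x, c in zip(doc, mean))

def rmsstd(docs):
    """Root-mean-square standard deviation of one cluster.

    Assumed normalisation: k * (m - 1) degrees of freedom.
    """
    m, k = len(docs), len(docs[0])
    return math.sqrt(cluster_sum_of_squares(docs) / (k * (m - 1)))

# Hypothetical cluster of m = 3 documents with k = 2 components each.
docs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mean = cluster_mean(docs)
```

In this toy cluster the mean is [3.0, 4.0], the sum of squared differences over the six components is 16, and dividing by k*(m - 1) = 4 before taking the square root gives an RMSSTD of 2.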