Dear Researchers,
We are trying to impute missing values in a large data set (n = 1950) using the k-nearest neighbour (kNN) method. There are 60 missing values that need to be imputed, and the completed data will be used for PCA and PLS-DA. We are using the R package "robCompositions" for the imputation. Our question concerns the optimal value of k. Earlier researchers have noted that "The k value must be determined by calculating the error between the randomly imputed and the original values. The optimal k yields the smallest error" (Makvandi et al. 2016, Ore Geology Reviews). We do not know how to carry this out in practice, so we have not been able to determine the optimal k. If you know how to find the optimal value of k, could you please let us know? It would be a great help for our research, and we would appreciate it.
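To make the question concrete, our best guess at what the quoted procedure means is sketched below: hide some values that are actually known, impute them with several candidate k, and keep the k with the smallest error against the hidden originals. This is only a sketch in Python with plain NumPy, not our robCompositions workflow. It assumes ordinary Euclidean distance and a simple mean of the neighbours (not the Aitchison distance that compositional data would call for), and the function names `knn_impute` and `choose_k`, the candidate k values, and the synthetic data are all made up for illustration. We are not sure this is what the authors intended, nor how best to carry it over to R.

```python
import numpy as np


def knn_impute(X, k):
    """Fill each NaN with the mean of its k nearest rows.

    Assumption: Euclidean distance over the columns observed in both rows.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    missing = np.isnan(X)
    for i in np.where(missing.any(axis=1))[0]:
        obs_i = ~missing[i]
        # Root-mean-square difference to every row over shared observed columns.
        diff = X[:, obs_i] - X[i, obs_i]
        count = (~np.isnan(diff)).sum(axis=1)
        sq = np.nansum(diff ** 2, axis=1)
        dists = np.where(count > 0, np.sqrt(sq / np.maximum(count, 1)), np.inf)
        dists[i] = np.inf  # a row is never its own neighbour
        order = np.argsort(dists)
        for col in np.where(missing[i])[0]:
            # Donors must be at a finite distance and have this column observed.
            usable = np.isfinite(dists[order]) & ~missing[order, col]
            donors = order[usable][:k]
            if donors.size > 0:
                filled[i, col] = X[donors, col].mean()
            else:
                # Fallback when no donor exists: plain column mean.
                filled[i, col] = np.nanmean(X[:, col])
    return filled


def choose_k(X, k_values, n_hide=60, n_repeats=5, seed=0):
    """Hide known values at random, impute them, and score each k by RMSE."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    known = np.argwhere(~np.isnan(X))  # (row, col) of every observed cell
    errors = {k: [] for k in k_values}
    for _ in range(n_repeats):
        picks = known[rng.choice(len(known), size=n_hide, replace=False)]
        rows, cols = picks[:, 0], picks[:, 1]
        masked = X.copy()
        masked[rows, cols] = np.nan  # pretend these values are missing
        for k in k_values:
            imputed = knn_impute(masked, k)
            rmse = np.sqrt(np.mean((imputed[rows, cols] - X[rows, cols]) ** 2))
            errors[k].append(rmse)
    mean_err = {k: float(np.mean(v)) for k, v in errors.items()}
    best_k = min(mean_err, key=mean_err.get)
    return best_k, mean_err


# Synthetic demonstration data (not our real data set).
demo_rng = np.random.default_rng(1)
latent = demo_rng.normal(size=(200, 1))
X_demo = latent @ np.ones((1, 4)) + 0.3 * demo_rng.normal(size=(200, 4))
X_demo[demo_rng.integers(0, 200, size=10), demo_rng.integers(0, 4, size=10)] = np.nan

k_candidates = [1, 3, 5, 10, 20]
best_k, mean_err = choose_k(X_demo, k_candidates, n_hide=30, n_repeats=3)
print("mean RMSE per k:", mean_err)
print("k with the smallest error:", best_k)
```

In this sketch the number of hidden values (`n_hide`) is set close to the number of truly missing values, and the masking is repeated a few times so that the chosen k does not depend on a single random draw. The variables would presumably need to be on comparable scales (or suitably transformed, for compositional data) before distances are computed.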
Thanking you in advance,
Sincerely,
Abu Saeed Baidya