Hi everyone.
I have a temperature variable in my data, and I want to exclude the highest-temperature days. For that, I have to look at the distribution of temperature, find the highest temperatures, and then decide on a cut point where the data become more and more sparse.
My question is: do I have to look at the temperature distribution manually in R and decide the cutoff point myself? (The criterion for the cutoff point is where the data become more and more sparse.)
What does "sparse" actually mean? Is that where the data start to spread wider and wider?
For example, say 25 degrees Celsius is a high temperature. Would I look through the temperature values manually and delete all the days where the temperature is near 25 degrees Celsius? Is that the right procedure?
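
To make the question concrete, here is a rough sketch of what I think the procedure would look like in R. Everything in it is an assumption for illustration: the data frame `df`, its column `temp`, the simulated values, and the choice of the 99th percentile as the cutoff are placeholders, not my real data or an established rule.

```r
# Hypothetical data: 'df' and 'temp' are placeholder names, values are simulated
set.seed(1)
df <- data.frame(temp = c(rnorm(200, mean = 15, sd = 4), 27, 29, 33))

# Count observations per temperature bin without drawing a plot;
# bins with very few days in the upper tail are what I understand by "sparse"
h <- hist(df$temp, breaks = 30, plot = FALSE)
bin_counts <- data.frame(bin_start = head(h$breaks, -1), count = h$counts)
print(bin_counts)

# Upper quantiles as candidate cutoffs
print(quantile(df$temp, probs = c(0.90, 0.95, 0.99)))

# Assumed cutoff: the 99th percentile (arbitrary choice for this sketch)
cutoff <- quantile(df$temp, probs = 0.99)

# Keep only the days at or below the cutoff
df_trimmed <- df[df$temp <= cutoff, , drop = FALSE]
```

If this is roughly right, the cutoff would come from a quantile or from the bin counts rather than from reading the values by eye, but I am not sure which of the two is the accepted approach.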