Sampling can be done over the dataset you have, but the authenticity, credibility, and quality of that data are additional questions you need to think about. The choice of sampling technique depends on how well your samples represent the population or events. Thank you.
Does that mean the following? Suppose I have two datasets: a supermarket item-purchase dataset and a retail sales and orders dataset.
I want to mine frequent patterns or association rules. I did this with the Apriori algorithm and applied a sampling technique (no particular one, but the same technique on both datasets). Is it possible that on the supermarket dataset sampling improved efficiency without reducing accuracy, while on the retail sales and orders dataset the results differ, with lower accuracy and no improvement in running time?
What I found is that when I applied stratified sampling to one dataset and combined it with association rule mining, it gave accurate results, so I concluded that I had enhanced the algorithm.
But would it give better output for all datasets?
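One way to check this per dataset is to compare the frequent itemsets found on a sample with those found on the full data. The following is only a minimal sketch: it uses brute-force counting of itemsets up to size 2 rather than Apriori's candidate pruning, and the names `frequent_itemsets`, `sample_agreement`, and the toy `baskets` list are made up for illustration.

```python
import random
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets up to max_size (brute-force counting)."""
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # de-duplicate and fix the item order
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

def sample_agreement(transactions, fraction, min_support, seed=0):
    """Jaccard overlap between frequent itemsets of the full data and of a simple random sample."""
    rng = random.Random(seed)
    k = max(1, int(len(transactions) * fraction))
    sample = rng.sample(transactions, k)
    full = set(frequent_itemsets(transactions, min_support))
    sampled = set(frequent_itemsets(sample, min_support))
    union = full | sampled
    return len(full & sampled) / len(union) if union else 1.0

# Toy data (hypothetical); replace with your own transaction lists.
baskets = [["milk", "bread"], ["milk"], ["milk", "bread", "eggs"], ["bread"]] * 25
print(sample_agreement(baskets, fraction=0.2, min_support=0.5))
```

A value near 1.0 means the sample recovers almost the same frequent itemsets as the full data. Running this on each dataset, with several seeds and sample fractions, shows whether the accuracy loss differs between datasets.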
Your question: Does the sampling technique depend on the dataset?
The selection of a sampling technique depends on the type of data. For example, for a study of diabetic retinopathy (DR) you could use simple random sampling, because people with DR are relatively rare. Diabetics in general are far more numerous, in India and abroad, so for a study of diabetics we would use systematic random sampling.
If you want to do research in a district, you can use stratified or cluster sampling, treating each area as a stratum or a cluster.
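For reference, the three techniques mentioned so far can be sketched in plain Python as below. This is an illustrative sketch only: the function names, the `area` field, and the toy `rows` list are assumptions, and the stratified version uses proportional allocation with at least one row per stratum.

```python
import random
from collections import defaultdict

def simple_random_sample(rows, n, seed=0):
    """Draw n rows uniformly at random without replacement."""
    return random.Random(seed).sample(rows, n)

def systematic_sample(rows, n, seed=0):
    """Take every step-th row after a random start (assumes 0 < n <= len(rows))."""
    step = len(rows) // n
    start = random.Random(seed).randrange(step)
    return [rows[start + i * step] for i in range(n)]

def stratified_sample(rows, key, fraction, seed=0):
    """Sample the same fraction from each stratum defined by key(row)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))  # keep at least one row per stratum
        sample.extend(rng.sample(group, k))
    return sample

# Toy data (hypothetical): 90 rows from one area, 10 from another.
rows = [{"id": i, "area": "north" if i < 90 else "south"} for i in range(100)]
by_area = stratified_sample(rows, key=lambda r: r["area"], fraction=0.1)
```

With this toy data the stratified sample keeps the 9:1 ratio between the two areas, which a small simple random sample does not guarantee.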
If you want to do research on the supermarket dataset, you can study at what time intervals customers buy items or goods, along with the retail sales and order sales datasets, and so on.
This type of study takes much more time, because you are necessarily using census sampling (including all the data). Only if you include all the data in your study will you get correct and accurate results. The manpower needed for data collection is greater, and you also have to spend more money.
Thanks for the knowledge you shared, @Senthilvel Vasudevan.
I really want to make my algorithm independent of the dataset category.
I do not have much time to gather all types of datasets, convert them into the particular format my algorithm needs, and then check whether the applied sampling technique works well for all of them. Even if I tried, if it failed for some datasets my research would not be good research.
Can you tell me roughly which sampling technique is least likely to lose accuracy across most datasets?
If possible, can you give me links where I can get different datasets?