I have been searching for such establishment survey data so that I can try applying to them a methodology whose development I previously led for official energy statistics.  The idea is to impute, mostly for the smaller members of a population, by using a prediction (model-based) approach, with predictor data coming from a previous census.  This is in support of https://www.researchgate.net/project/Cutoff-and-quasi-cutoff-sampling-with-prediction-for-Official-Statistics, and its updates.

Here is an example of data I would want from Canadian agriculture: 

Monthly Miller's Survey

Over 500 metric tons per month:

https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&SDDS=3403

Annual Miller's Survey

Under 500 metric tons per month:

https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&SDDS=3443

I would have liked to call it "The Miller's Tale: Another Example of Quasi-Cutoff Sampling with Prediction."  (Sorry, Geoffrey Chaucer.)  However, the data are confidential, so I was unable to obtain them.  Twelve monthly samples and one annual sample together would have made up a census for a year, to be used as predictor data for the following monthly samples, so that population totals could be 'predicted' each month.

If I had two census establishment surveys for successive periods, I could (temporarily) delete some smaller cases from the more recent census, pretend that the remaining cases were a model-based sample, 'predict' totals (i.e., using regression), and see how well the model performed, both for prediction of totals and for variance.
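To make that test concrete, here is a minimal Python sketch of the idea, not a definitive implementation.  It assumes the classical ratio estimator (one predictor, regression through the origin, residual variance proportional to x), which is only one of the regression models that could be used.  The data are synthetic stand-ins for two successive censuses, and the names ratio_predict_total and demo, the cutoff at the median, and the simulation parameters are all illustrative assumptions.

```python
import math
import random


def ratio_predict_total(x_sample, y_sample, x_nonsample):
    """Predict a population total with the classical ratio estimator.

    Assumed model: y_i = b * x_i + e_i, with Var(e_i) = sigma^2 * x_i.
    x values come from the previous census; y values are current-period
    responses, observed only for the sampled (mostly larger) units.
    Returns (predicted total, estimated standard error of that total).
    """
    n = len(x_sample)
    xs_total = sum(x_sample)
    xr_total = sum(x_nonsample)
    b = sum(y_sample) / xs_total  # weighted least squares slope under this model
    # Estimate sigma^2 from the weighted residuals e_i / sqrt(x_i).
    sigma2 = sum((y - b * x) ** 2 / x for x, y in zip(x_sample, y_sample)) / (n - 1)
    # Observed y for sampled units plus predicted y for the deleted units.
    total = sum(y_sample) + b * xr_total
    # Variance of the prediction error of the total under the assumed model.
    variance = sigma2 * xr_total * (xr_total + xs_total) / xs_total
    return total, math.sqrt(variance)


def demo():
    random.seed(1)
    # Synthetic "previous census" x and "current census" y (made-up values).
    x = [random.lognormvariate(3.0, 1.0) for _ in range(200)]
    y = [1.05 * xi + random.gauss(0.0, 1.0) * math.sqrt(xi) for xi in x]
    # Temporarily delete the smaller cases: keep only units at or above the median x.
    cutoff = sorted(x)[len(x) // 2]
    kept = [(xi, yi) for xi, yi in zip(x, y) if xi >= cutoff]
    dropped_x = [xi for xi in x if xi < cutoff]
    total, se = ratio_predict_total([k[0] for k in kept], [k[1] for k in kept], dropped_x)
    print("true total:     ", round(sum(y), 1))
    print("predicted total:", round(total, 1))
    print("estimated SE:   ", round(se, 1))


if __name__ == "__main__":
    demo()
```

With real census microdata, x and y would simply be the same variable (or a related one) from the earlier and later census, and the comparison of the predicted total with the true total, relative to the estimated standard error, is the performance check described above.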

There are annual census data available from the US Environmental Protection Agency (EPA) for industrial releases of toxins, and I have obtained data from them, along with helpful information on those data from our friendly EPA personnel.  I am not yet sure how useful those data will be for my purposes, but I am hopeful.  So I plan to try working with those data on toxins released.  Meanwhile, with this thread I am asking for any other examples of available microdata (data at the level of data collection) for establishment censuses of which you may know.  URLs to these data would be much appreciated.  Most such data, I suppose, would likely be for official statistics.  (Note: The surveys on which these data are collected are generally multipurpose, meaning more than one question, or y-value, per survey, though the predictors are generally different for each y-value, so that even if multiple regression might sometimes be used, multivariate regression has not been used.  However, having multiple questions means a compromise as to which members of the population are included in the model-based sample, which brings us to quasi-cutoff sampling.)

Thank you. 
