I have no doubt that your paper does do small area estimation; what I doubt is that "gathering strength" defines "small area estimation".
I think the question confuses "small area estimation" with the techniques used to estimate small domains. Small area estimation has to "gather strength" using information from other sources; there is no other way. But not all gathering of strength is small area estimation.
"Small area estimation" refers to what we want to do; "gathering strength" refers to how we plan to do it. If a domain does not have enough observations for accurate enough estimates based only on data from the domain, we need to make some assumptions. We often assume smoothness, so that neighbouring regions are treated as similar, but so does all estimation. This even allows domains with no data which some might regard as black magic. But we do it all the time.
For example, imagine you are Robert Boyle. You have measured the pressure and volume of a gas at constant temperature. The graph looks like a hyperbola. To check more accurately, you plot log volume against log pressure and get an almost perfect straight line. Now you have discovered that v = a/p and can use this to interpolate. That estimates values at points where you have no data, gathering strength from the other points. I don't think anyone would call this "small area estimation".
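If you like, the same check in a few lines of Python (the readings are made up, with the constant a = 24 and a little noise):

```python
import numpy as np

# Hypothetical pressure/volume readings at constant temperature.
rng = np.random.default_rng(1)
pressure = np.array([1.0, 1.5, 2.0, 3.0, 4.0])            # atm
volume = 24.0 / pressure * (1 + rng.normal(0, 0.01, 5))   # v = a/p plus noise

# Fit log v = log a + b*log p; Boyle's law predicts slope b = -1.
b, log_a = np.polyfit(np.log(pressure), np.log(volume), 1)
print(f"slope = {b:.3f} (Boyle's law: -1), a = {np.exp(log_a):.2f}")

# Interpolate at a pressure where no measurement was taken.
p_new = 2.5
print(f"predicted volume at {p_new} atm: {np.exp(log_a) * p_new ** b:.2f}")
```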
You can regard both model-based and model-assisted survey methods as gathering strength. The term "small area estimation" is vague, but I think it refers only to domains that require special techniques because of a paucity of data.
By the way, the first time I heard the phrase "gathering strength" was in a seminar John Tukey gave at Massey University in 1974. He said that the only reasonable discussion of samples of size 1 he'd seen was in an archaeology textbook, which inferred the variation in measurements on skulls of a species from the variation in different, but related, species found in the region.
Thank you for your very informative answer. I am also retired, but this should be useful to people I know where I used to work, as well as interesting to me and others. They are going through some development there now, hopefully, so the timing of this discussion is particularly good, I think.
BTW, I'd like to comment on this from your answer:
" This even allows domains with no data which some might regard as black magic."
I have run into that problem in cases covered by the paper in the first attached link, which has been used since 1999 to produce results for perhaps a thousand or more estimated totals each month for official energy statistics. There have been many categories/cells that were empty or nearly empty for n, and sometimes for N. There was pressure to increase the sample to cover those with very small or no n and small N, but those categories were of little importance, and the overall increase in sample size would very likely have produced a larger nonsampling error in the form of measurement error. At one point the sample size became particularly unwieldy and had to be managed better, so attention came back to this methodology. Total survey error is the overall consideration.
For cases where n is zero, this small area approach still produces estimated totals and estimated standard errors of the prediction errors for those totals in all cases. That is good for obtaining higher aggregates, but I did not want management tempted to report the estimates for the empty sample cells (categories). The variance is measurable, but the bias due to any level of model misspecification is not. However, in such cases the estimated variance was generally huge, even without separately considering bias. Thus I was able to convince management not to publish data for those cells, because of the very large estimated standard errors for those estimated totals, and they just figured into the higher levels of aggregation.
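To illustrate what such a prediction looks like, here is a simplified sketch, not the exact estimator in the attached papers: it assumes a basic ratio model y_i = b*x_i + e_i with the variance of e_i proportional to x_i, and for an empty cell it borrows the slope fitted on data pooled from related "donor" cells.

```python
import numpy as np

def ratio_model_prediction(x_s, y_s, X_unsampled):
    """Predict a total under the model y_i = b*x_i + e_i, Var(e_i) = sigma2*x_i.

    x_s, y_s    : auxiliary and observed values for the sampled units
                  (here pooled from related 'donor' cells)
    X_unsampled : total auxiliary value for the units being predicted
                  (for an empty cell, that is the whole cell)
    """
    b = y_s.sum() / x_s.sum()                         # WLS slope for this model
    resid = y_s - b * x_s
    sigma2 = np.sum(resid**2 / x_s) / (len(x_s) - 1)  # residual variance scale
    total = b * X_unsampled                           # predicted total for the cell
    # Prediction-error variance: slope uncertainty plus residual variation.
    var = sigma2 * (X_unsampled**2 / x_s.sum() + X_unsampled)
    return total, np.sqrt(var)

# Hypothetical donor data from related cells, plus an empty cell with known X.
rng = np.random.default_rng(2)
x = rng.uniform(5, 50, 12)
y = 1.3 * x + rng.normal(0, 1, 12) * np.sqrt(x)
t_hat, se = ratio_model_prediction(x, y, X_unsampled=40.0)
print(f"estimated total = {t_hat:.1f}, est. SE of prediction error = {se:.1f}")
```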
So the point is that one can avoid publishing the "black magic" estimated totals and standard errors, because the estimated relative standard errors are generally hundreds or thousands of percent; what I said was effectively 'infinite.' The concern remains that managers often want to publish data with relative standard errors that are too high, and with small n, using small area estimation (SAE), there is also bias to consider. Even though we speak of model-unbiased results, that does not account for model failure, which is generally more of a concern with SAE. That is why a good deal of testing and monitoring can be helpful, but a couple of decades or so of successful use has shown this to be worthwhile.
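As a toy illustration of that screening (the 30% threshold and the cell values are made up; agencies set their own cutoffs), cells with huge estimated relative standard errors are suppressed from publication but still feed the aggregates:

```python
# Hypothetical cell-level results: (cell, estimated total, est. SE of prediction error)
cells = [("A", 120.0, 6.0), ("B", 4.2, 9.5), ("C", 0.8, 25.0)]

RSE_LIMIT = 0.30  # assumed publication threshold

published, aggregate = [], 0.0
for name, total, se in cells:
    aggregate += total                          # every cell still feeds the aggregate
    rse = se / total if total else float("inf")
    if rse <= RSE_LIMIT:
        published.append((name, total, rse))
    else:
        print(f"cell {name}: suppressed (RSE = {rse:.0%})")

print("published:", published)
print(f"aggregate total = {aggregate:.1f}")
```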
Thank you for your comments.
Jim
Article Using Prediction-Oriented Software for Survey Estimation
Article Efficacy of Quasi-Cutoff Sampling and Model-Based Estimation...
Article Using Prediction-Oriented Software for Survey Estimation - P...