First, recognize that the Accuracy of appraisals is a 95% requirement for a successful business. Ref>: https://finmodelslab.com/blogs/kpi-metrics/real-estate-appraisal-kpi-metrics
"The Accuracy of Real Estate Valuations refers to how closely a property’s estimated value aligns with its actual market value." Applying normal reliability metrics to this definition would need to include the fact that any appraisal is extremely time-dependent! "Reliability" is simply getting the same results from the same inputs and methods. "Stability" means that variations in the inputs correspond well to the changes in the outputs. "Performance" is ill-defined here, but it may refer to the duration of an appraisal so as to minimize errors due to the time-based nature of any appraisal.
Thank you, Steven, for making a clear distinction between these technical terms. These are the most useful definitions I have ever been exposed to.
In which cases can the same input and method produce different results (reliability)? How can we judge the range of correspondence between changes in input and output (stability)?
What about the other terms: generalizability, trustworthiness, and scalability in the field of real estate appraisal?
Those are some interesting additions. Generally, in engineering, we strive for some “optimum” point which is neither a maximum nor a minimum for any particular criterion. However, statistics will demonstrate that the more factors you try to include, the more “average” will be the final outcome! That may be desirable, but understand that it means that NONE of the desired attributes is likely to be exceptional in performance.
Again, the most critical factor in value assessments of any kind is timeliness! Projections rarely work well in those cases, and the degradation of the data value may even be exponential. An economic system may never be “stable” for more than the time it takes for new transactions to occur.
Reliability can be affected if either the data (input) or algorithms (method) are flawed or include a larger amount of variation than required. An example could be the assessments of properties with similar major characteristics (size, materials) without proper consideration of contributing factors like location or aesthetics.
Stability is another term for what we call “sensitivity” in engineering analysis. When you change the value of an input, how much does it affect the output value? A stable system will only change, and in some consistent proportion, when an input value is changed. An “unstable” system will change erratically – sometimes giving different outputs for the same inputs. This may be due to an erroneous “correlation” between an input and an output, or an inordinate weighting or influence of the mathematics used in an algorithm. There are many real-world systems that have any number of stable and unstable conditions. A stable condition will tend to remain in that position unless a large external force is applied. An unstable one will spontaneously move to some other stable condition.
An unfortunate example may be the impact of wars on local economies. The unpredictable nature of effects and consequences makes any realistic appraisal of local value unlikely.
In developing a model we try to account for all of the factors that we can. To develop a useful model it is also necessary to determine which factors are most significant, as trying to include every factor (if we knew them) will result in more errors in practice than simply recognizing the contribution to the accuracy error that leaving them out will have.
Trustworthiness is really a synonym for Reliability. They may have a different semantic or emotional appeal to some people, but in appraisals you can really use them interchangeably. “Reliable” means that you can “Trust” it. Trust is the belief that you will get reliable results.
Scalability means that the model you developed with a particular data set can be used with larger data sets equally well. The caveat here is that the model must include factors that reliably reflect that change in scale. Models comparing prices of small homes or those of large factories will have many similarities, but some significant differences. You could neither simply “Scale up” a model for small home prices to large factories, nor “Scale down” a good model for large factories to predict a small home market.
You could include adjusting factors in an overall model that would allow you to use it in both cases – and anything in between. The key here is how well you do your test cases while developing the models. This then relates to your term of generalization. Generalization of a model is desirable for end-users and simplicity in marketing. It usually involves many more chances for errors, requiring more extensive testing and validation.
Some things just cannot be “compared” well because of the differences in application. You cannot really say that an appraisal of a factory is accurate or not based on a comparison to an appraisal of a small family home. Trying to include them both in one general algorithm is not a very useful idea.
The list detailing models used in real estate appraisal, such as reliability, stability, scalability, and generalizability, seems endless. Other relevant terms include interpretability, explainability, uncertainty, flexibility, and adaptability. It is not always easy to distinguish between interpretability and explainability, or between performance and uncertainty, or even adoptability and adaptability. Many published papers, even peer-reviewed ones, use these terms without making clear distinctions about what they are referring to.
It is interesting that Chen et al. (2024) demonstrated the model's stability by using the results of k-fold cross-validation. The consistency between the MSE, MAE, MAPE, and R² results across the training, validation, and testing datasets serves as a strong indicator of the model's stability.
"Applicable" refers to the suitability for a particular use. "Adaptable" means that the same methods/algorithms can be used for somewhat different uses with little or no modifications. Something that is adaptable would be applicable for different situations. Something that is applicable for one situation may or may not be applicable when the conditions change. It would not be adaptable in that case.