Propensity analysis is a powerful tool to control confounding/selection bias in observational studies of drug effectiveness. However, its use for the study of causal or prognostic factors unrelated to drugs is not widespread.
Excellent question. My impression is that it is often unclear what a propensity analysis adds over other regression-based approaches. The barrier to actually conducting such an analysis is also a little higher than for a regular regression.
Propensity score (PS) matching is a more robust way to balance baseline covariates and adjust for confounding. The propensity score allows one to design and analyze an observational (nonrandomized) study so that it mimics some of the particular characteristics of a randomized controlled trial (the "strongest" study design: imagine if we had the ability to randomize patients to having a particular prognostic factor vs. not).
There may be some utility in using propensity score matching for studies of causal or prognostic factors, to balance confounding variables between 1) the group that has the prognostic factor and 2) the group that does not.
That being said, PS matching may not be a useful technique if your ultimate goal is to build a predictive model for disease progression.
There are some pitfalls in using PS:
1) It is important to match on PS as "closely" as possible; however, this may exclude some individuals, reducing sample size and power. I put "closely" in quotation marks because there are two primary methods for this: nearest neighbor matching and nearest neighbor matching within a specified caliper distance (Rosenbaum & Rubin, 1985).
2) You cannot use covariates that may be affected by the exposure of interest in the model that estimates the propensity scores.
3) Even though PS can balance observed baseline covariates between groups, it MAY NOT balance unmeasured characteristics and confounders.
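To make pitfalls 1) and 2) concrete, here is a minimal sketch of 1:1 greedy nearest neighbor matching within a caliper, on simulated data. Everything in it is an assumption for illustration: the covariates (age_z, diabetes), the simulated effect sizes, the helper names, and the caliper of 0.2 standard deviations of the logit of the PS, which is a commonly used width rather than a rule. The printout shows how many exposed subjects are lost to the caliper and how covariate balance changes.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Logistic regression via Newton-Raphson; X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

def smd(x, g):
    """Standardized mean difference of covariate x between groups g == 1 and g == 0."""
    pooled_sd = np.sqrt((x[g == 1].var(ddof=1) + x[g == 0].var(ddof=1)) / 2.0)
    return (x[g == 1].mean() - x[g == 0].mean()) / pooled_sd

def greedy_caliper_match(score, exposed, caliper):
    """1:1 greedy nearest neighbor matching without replacement, within a caliper."""
    available = list(np.flatnonzero(exposed == 0))
    pairs = []
    for i in np.flatnonzero(exposed == 1):
        if not available:
            break
        dist = np.abs(score[available] - score[i])
        j = int(np.argmin(dist))
        if dist[j] <= caliper:  # otherwise this exposed subject stays unmatched
            pairs.append((int(i), int(available.pop(j))))
    return pairs

# Simulated cohort (hypothetical data): two baseline covariates drive exposure.
rng = np.random.default_rng(0)
n = 2000
age_z = rng.normal(size=n)               # standardized age
diabetes = rng.binomial(1, 0.3, size=n)
p_exposed = 1.0 / (1.0 + np.exp(-(-1.0 + 0.5 * age_z + 0.8 * diabetes)))
exposed = rng.binomial(1, p_exposed)

# The propensity model uses baseline covariates only (nothing affected by exposure).
X = np.column_stack([np.ones(n), age_z, diabetes])
logit_ps = X @ fit_logistic(X, exposed)

caliper = 0.2 * logit_ps.std(ddof=1)     # assumed caliper width
pairs = greedy_caliper_match(logit_ps, exposed, caliper)
matched = np.array(pairs).ravel()        # indices of all matched subjects

n_exposed = int(exposed.sum())
print(f"exposed: {n_exposed}, matched pairs: {len(pairs)}")
print(f"SMD for age before matching: {smd(age_z, exposed):.3f}")
print(f"SMD for age after matching:  {smd(age_z[matched], exposed[matched]):.3f}")
```

Balance here can only be checked on the covariates that were measured, which is exactly the limitation in pitfall 3).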
Peter Austin has done some great work in the field of propensity scores. I highly recommend reading some of his work if you decide on using propensity scores.
One article of his:
Austin PC. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behav Res. 2011;46(3):399-424.
Rosenbaum PR, Rubin DB. Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. The American Statistician. 1985;39:33-38.
Many thanks to Dr. Simon and Al-Jaishi for their helpful comments. If I may, I would like to ask a more specific question. Suppose I want to analyze the prognostic value of obesity in patients with acute coronary syndrome using a cohort study. It is known that obesity is associated with other factors such as age, sex, diabetes, dyslipidemia, hypertension, etc., some of which are themselves prognostic factors. It is possible to create a propensity score for obesity from these cofactors, although it is only moderately discriminative for obesity (e.g., an area under the ROC curve of 0.70). I can choose between 2 analysis strategies. Strategy A: a regression model in which the response variable is hospital mortality and the predictors are obesity, age, sex, diabetes, etc. Strategy B: a model that includes only obesity and the propensity score (which may be analyzed using regression, stratification or matching). Which of the two methods is preferable?
I see propensity scores more as a means to achieve covariate balance in a matching situation. For the described strategies, I do not see much value in strategy B. It just makes things more complicated. Additionally, is obesity really a dichotomous variable, or is it actually just the BMI? In strategy A you could add BMI as is (keeping more information -> "better model"). However, I would expect the results of both strategies to be fairly similar, and then it becomes more a question of who your target audience is and how familiar they are with the different approaches.
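For anyone who wants to try both strategies side by side, here is a rough sketch on simulated data. All of it is assumed for illustration: the cohort is simulated, only two cofactors (standardized age and diabetes) stand in for the full list, the effect sizes are arbitrary, and strategy B is implemented as regression adjustment for the PS (not stratification or matching). The small Newton-Raphson fitter and the rank-based AUC helper are hypothetical helpers, not part of any package.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Logistic regression via Newton-Raphson; X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

def design(*cols):
    """Stack covariate columns behind an intercept column."""
    return np.column_stack([np.ones(len(cols[0])), *cols])

def auc(score, label):
    """Area under the ROC curve via the rank-sum formula (assumes no tied scores)."""
    ranks = np.argsort(np.argsort(score)) + 1
    n1 = int(label.sum())
    n0 = len(label) - n1
    return (ranks[label == 1].sum() - n1 * (n1 + 1) / 2.0) / (n1 * n0)

# Simulated cohort (hypothetical): cofactors influence both obesity and mortality.
rng = np.random.default_rng(1)
n = 5000
age_z = rng.normal(size=n)
diabetes = rng.binomial(1, 0.3, size=n)
p_obese = 1.0 / (1.0 + np.exp(-(-1.0 - 0.3 * age_z + 0.9 * diabetes)))
obese = rng.binomial(1, p_obese)
# Assumed true conditional log odds ratio for obesity: 0.7
p_death = 1.0 / (1.0 + np.exp(-(-2.5 + 0.7 * obese + 0.6 * age_z + 0.5 * diabetes)))
death = rng.binomial(1, p_death)

# Strategy A: mortality ~ obesity + cofactors
log_or_a = fit_logistic(design(obese, age_z, diabetes), death)[1]

# Strategy B: first model obesity from the cofactors, then mortality ~ obesity + PS
X_ps = design(age_z, diabetes)
ps = 1.0 / (1.0 + np.exp(-(X_ps @ fit_logistic(X_ps, obese))))
ps_auc = auc(ps, obese)
log_or_b = fit_logistic(design(obese, ps), death)[1]

print(f"AUC of the propensity score for obesity: {ps_auc:.2f}")
print(f"Strategy A log odds ratio for obesity:  {log_or_a:.3f}")
print(f"Strategy B log odds ratio for obesity:  {log_or_b:.3f}")
```

One thing to keep in mind when comparing the two numbers: odds ratios are non-collapsible, so A and B need not agree exactly even when both handle confounding adequately.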
Propensity score analysis can offer several advantages over traditional regression analysis:
1) the propensity score can be estimated with flexible machine learning algorithms, which can yield well-calibrated estimates;
2) such algorithms can capture complex and nonlinear relationships between group membership and covariates while limiting overfitting;
3) bias is reduced because the propensity score is estimated without reference to the outcome variable;
4) the results are often easier to interpret and less prone to violations of model assumptions.
Check out the work by Dan McCaffrey in particular:
McCaffrey, D., Griffin, B. A., Almirall, D., Slaughter, M. E., Ramchand, R., & Burgette, L. F. (2013). A tutorial on propensity score estimation for multiple treatments using generalized boosted models. Statistics in Medicine, 32(19), 3388–3414. doi:10.1002/sim.5753
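As a rough illustration of points 1) to 3), here is a sketch that estimates the propensity score with a boosted tree model and then checks covariate balance after inverse probability weighting. It uses scikit-learn's GradientBoostingClassifier as a stand-in for the generalized boosted models in the tutorial above; the simulated data, the fixed tuning values (200 trees, depth 2, learning rate 0.05), the clipping of scores to [0.01, 0.99] and the weighted_smd helper are all my own assumptions, not the authors' procedure. Note that the outcome never enters the code.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Simulated data (hypothetical): group membership depends nonlinearly on x2.
rng = np.random.default_rng(2)
n = 3000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
lin = -0.5 + 0.8 * x1 + 0.5 * (x2 ** 2 - 1.0)
group = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin)))
X = np.column_stack([x1, x2])

# Boosted propensity model; tuning values are assumptions, not recommendations.
gbm = GradientBoostingClassifier(
    n_estimators=200, max_depth=2, learning_rate=0.05,
    subsample=0.8, random_state=0,
)
gbm.fit(X, group)  # only covariates and group labels are used, never the outcome
ps = np.clip(gbm.predict_proba(X)[:, 1], 0.01, 0.99)

# Inverse probability weights targeting the whole population.
weights = np.where(group == 1, 1.0 / ps, 1.0 / (1.0 - ps))

def weighted_smd(x, g, w):
    """Standardized mean difference using weighted means and unweighted pooled SD."""
    m1 = np.average(x[g == 1], weights=w[g == 1])
    m0 = np.average(x[g == 0], weights=w[g == 0])
    pooled_sd = np.sqrt((x[g == 1].var(ddof=1) + x[g == 0].var(ddof=1)) / 2.0)
    return (m1 - m0) / pooled_sd

unit = np.ones(n)
for name, x in [("x1", x1), ("x2 squared", x2 ** 2)]:
    before = weighted_smd(x, group, unit)
    after = weighted_smd(x, group, weights)
    print(f"{name}: SMD before weighting {before:.3f}, after weighting {after:.3f}")
```

In practice the number of trees would be chosen to optimize balance rather than fixed in advance, so treat the settings here as placeholders.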