Is multivariable logistic regression instead of Cox proportional hazards model acceptable for survival analysis with 2 year follow-up? Some papers are based on such methodology. I wonder whether this makes sense.
The Cox PH model is for time-to-event data and accounts for the censoring process; its response variable is a positive number (the time to the event). Logistic models, by contrast, fit a binary response variable (on a binomial scale: 0 and 1).
From this point of view, they are not interchangeable, as noted by Professor Mehmet Sinan Iyisoy.
The Cox proportional hazards model is a method of time-to-event analysis, while the logistic regression model does not include a time variable. For example, imagine an intervention in a randomized trial that only delays the onset of an endpoint, so that the number of events in the two groups is the same at the end of follow-up. In such a situation, logistic regression will not reveal the benefit of the intervention, while the Cox model will. The Cox proportional hazards model therefore has an advantage over logistic regression in this respect, but it cannot be concluded that logistic regression is a poor method of analysis. The logistic regression result can be presented in addition to the Cox model, e.g. to better show the differences in the number of events between groups. However, when a survival analysis is performed, the Kaplan-Meier curve is usually also presented, so it is difficult to omit the time variable. Perhaps the studies you mention compare status at the start of the study and at the end of the study (after 2 years), where the exact time of the endpoint is not known. In such a situation, logistic regression will be a better choice than the Cox model.
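To make the "delayed onset" scenario concrete, here is a minimal sketch with made-up data (the two groups, the event times, and the 24-month follow-up are all assumptions chosen for illustration). It compares the odds ratio for having the event by the end of follow-up with a crude incidence rate ratio (events per person-month). The rate ratio is only a simple stand-in for a hazard ratio, not a Cox fit, but it shows how ignoring time hides the delay.

```python
# Hypothetical data: (follow-up time in months, event indicator), 24-month follow-up.
# Both groups have 5 events out of 10 subjects; treated events simply occur later.
control = [(2, 1), (4, 1), (6, 1), (8, 1), (10, 1)] + [(24, 0)] * 5
treated = [(14, 1), (16, 1), (18, 1), (20, 1), (22, 1)] + [(24, 0)] * 5


def odds_of_event(group):
    """Odds of having had the event by end of follow-up (time is ignored)."""
    events = sum(event for _, event in group)
    return events / (len(group) - events)


def incidence_rate(group):
    """Events per unit of person-time (time is used)."""
    events = sum(event for _, event in group)
    person_time = sum(time for time, _ in group)
    return events / person_time


odds_ratio = odds_of_event(treated) / odds_of_event(control)
rate_ratio = incidence_rate(treated) / incidence_rate(control)

print(f"odds ratio at 24 months: {odds_ratio:.3f}")
print(f"incidence rate ratio:    {rate_ratio:.3f}")
```

With these numbers the odds ratio is exactly 1 (no apparent effect), while the rate ratio is below 1 because the treated group accumulates more event-free person-time.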
Yes, the pooled logistic regression can be used instead of the Cox proportional hazard model. But there are several assumptions:
1. The Cox PH model is a semi-parametric modelling approach. The key assumption of the Cox PH model is that the hazard ratio is constant over time.
2. You can use the pooled logistic regression (which is a parametric modelling approach) to estimate hazard ratio, survival probability, and cumulative incidence (risk) difference. In this case, you need to make assumptions about the functional form of the baseline hazard and about whether the hazard ratio is constant or time-varying.
3. In addition to the assumption about the functional form, you need to assume that the outcome of interest is rare (say, an incidence of no more than about 10%) in each time interval.
4. The Cox PH model is usually applied to wide-format data (one row per person), while pooled logistic regression is applied to long-format data (one row per person per time interval).
5. If you only want to estimate the hazard ratio, the Cox PH regression makes the fewest assumptions. On the other hand, since pooled logistic regression uses multiple observations per person, you need to correct the standard errors. One option is to use robust (or sandwich) standard errors, though the better option is to use bootstrapping.
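The long-format (person-period) layout that pooled logistic regression works on can be sketched as follows. This is an illustration only: the four subjects, the function names, and the convention that a subject censored in interval k still counts as at risk in interval k are assumptions. A logistic model with nothing but one indicator per interval would reproduce the empirical per-interval hazards computed here; covariates and a smoother function of time are then added on top of this layout.

```python
from collections import defaultdict

# Hypothetical wide-format data: (subject id, number of intervals followed, event indicator).
subjects = [("A", 3, 1), ("B", 2, 0), ("C", 1, 1), ("D", 3, 0)]


def to_person_period(wide_rows):
    """Expand one row per person into one row per person per interval."""
    long_rows = []
    for subject_id, n_intervals, had_event in wide_rows:
        for k in range(1, n_intervals + 1):
            # The event indicator is 1 only in the subject's last interval, and only if an event occurred.
            event_now = 1 if (had_event and k == n_intervals) else 0
            long_rows.append((subject_id, k, event_now))
    return long_rows


def interval_hazards(long_rows):
    """Empirical discrete-time hazard: events / subjects at risk, per interval."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for _, k, event_now in long_rows:
        at_risk[k] += 1
        events[k] += event_now
    return {k: events[k] / at_risk[k] for k in sorted(at_risk)}


long_rows = to_person_period(subjects)
hazards = interval_hazards(long_rows)

for row in long_rows:
    print(row)
print(hazards)
```

Because each person contributes several rows, the rows are not independent observations, which is why the standard errors need the correction mentioned in point 5.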
See the article attached below for more details. Article: Relation of Pooled Logistic Regression to Time Dependent Cox...
According to this paper (Article: A comparison of the logistic regression and the Cox proporti...), both analyses will identify practically the same associated factors, but the logistic regression coefficients will be somewhat inflated compared with those from Cox regression. This is expected, because logistic regression is an approximation, whereas Cox regression accounts for the time to the event. If the follow-up time to event was measured accurately, use Cox regression; if not, use logistic regression, as both will probably lead to the same conclusion. For me, it is like choosing between the Fisher exact test and the chi-square test (an approximation).
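One way to see where the inflation comes from is a small calculation under simplifying assumptions that are mine, not the paper's: constant (exponential) hazards in both groups, no censoring, and a true hazard ratio of 2. The risk by time t is then 1 - exp(-hazard * t), and the odds ratio at the end of follow-up can be computed directly.

```python
import math


def odds_ratio_at(t, baseline_hazard, hazard_ratio):
    """Odds ratio for the event by time t, assuming constant hazards and no censoring."""
    risk_unexposed = 1 - math.exp(-baseline_hazard * t)
    risk_exposed = 1 - math.exp(-hazard_ratio * baseline_hazard * t)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    odds_exposed = risk_exposed / (1 - risk_exposed)
    return odds_exposed / odds_unexposed


# Hypothetical baseline hazards per year over a 2-year follow-up; true hazard ratio is 2.
common_outcome = odds_ratio_at(t=2.0, baseline_hazard=0.25, hazard_ratio=2.0)
rare_outcome = odds_ratio_at(t=2.0, baseline_hazard=0.005, hazard_ratio=2.0)

print(f"common outcome: OR = {common_outcome:.3f}")
print(f"rare outcome:   OR = {rare_outcome:.3f}")
```

When the outcome is common (about 39% risk in the unexposed group here), the odds ratio is about 2.65, well above the hazard ratio of 2. When the outcome is rare (about 1% risk), the odds ratio is about 2.01, so the two measures nearly coincide.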