The survival curves for the exposed and unexposed groups cross more than once over the 6-month follow-up period. Is the log-rank test still appropriate for testing significance?
You will have to be more specific about which aspect of the curves you are interested in. For example, is the earlier part of the curve more important than the later part, or vice versa? Are you interested in the entire curve or in cumulative survival at a certain time point? I recommend that you read up on the G-rho family of tests (Harrington, D. P. and Fleming, T. R. (1982). A class of rank test procedures for censored survival data. Biometrika 69, 553-566). This landmark paper deals with tests that compare entire curves. The commonly used logrank and Peto-Peto tests are special cases of this general family. A different situation is testing the difference at a single time point rather than across the entire curve (with whatever weight you give to different parts of the curve). So the choice of test starts with deciding clearly what hypothesis you want to test.
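To make the G-rho family concrete, here is a minimal sketch of the two-sample test in Python using only NumPy. It is an illustration under stated assumptions, not code from the Harrington and Fleming paper: the function name `g_rho_test` and its arguments are made up for this example, groups are coded 0/1, and the weight at each event time is assumed to be the pooled Kaplan-Meier estimate just before that time, raised to the power rho. With rho = 0 this reduces to the ordinary log-rank test; rho = 1 puts more weight on early events (Peto-Peto type weighting).

```python
import math
import numpy as np

def g_rho_test(time, event, group, rho=0.0):
    """Two-sample G-rho test (hypothetical helper, for illustration only).

    time  : follow-up times
    event : 1 = event observed, 0 = censored
    group : 0 or 1
    rho   : 0 gives the log-rank test, 1 weights early events more
    Returns (chi-square statistic with 1 df, two-sided p-value).
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    group = np.asarray(group, dtype=int)

    surv_left = 1.0  # assumed weight base: pooled KM just before t, S(t-)
    u = 0.0          # weighted sum of (observed - expected) events in group 1
    var = 0.0        # hypergeometric variance of u

    for t in np.unique(time[event == 1]):        # sorted distinct event times
        at_risk = time >= t
        n = at_risk.sum()                        # at risk, both groups
        n1 = (at_risk & (group == 1)).sum()      # at risk, group 1
        d = ((time == t) & (event == 1)).sum()   # events at t, both groups
        d1 = ((time == t) & (event == 1) & (group == 1)).sum()

        w = surv_left ** rho
        u += w * (d1 - d * n1 / n)
        if n > 1:
            var += w**2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

        surv_left *= 1.0 - d / n                 # update pooled KM after t

    if var == 0:
        raise ValueError("variance is zero; the test is not defined")

    chi2 = u**2 / var
    p = math.erfc(math.sqrt(chi2 / 2.0))         # chi-square upper tail, 1 df
    return chi2, p

if __name__ == "__main__":
    # Made-up toy data, only to show the call.
    t = [2, 3, 5, 7, 8, 1, 4, 6, 9, 10]
    e = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]
    g = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    for rho in (0.0, 1.0):
        print(rho, g_rho_test(t, e, g, rho=rho))
```

Running the test with more than one value of rho on the same data shows how much the result depends on which part of the curve is weighted, which is why the weighting should be chosen from the hypothesis before looking at the data.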
Thanks for your reply. I am looking to compare cumulative survival at the end of follow-up. Evidence suggests that there could be a short-term risk (earlier in the curve) in the exposed group, but the hypothesis is that they will have a better cumulative outcome in the long run.
So what is useful is to estimate the Kaplan-Meier curves along with the 95% confidence intervals (CIs) for the cumulative survival estimates. These estimates and their 95% CIs will enable you to compare the groups at the times that are of interest to you (i.e. short-term and long-term). The confidence intervals give you a range of plausible values for the population cumulative survival at a given time, and you can look at how much they overlap between or among groups. I have to admit I am not sure what the corresponding p-value for the comparison at a given time point would be. I would have to look into that; I always just show the 95% confidence intervals. Maybe someone else on ResearchGate knows the answer right away. I am guessing that you could, for example, use the standard errors of the cumulative survival estimates and develop some type of Wald test for comparing the groups at a given time point.
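The Wald-type idea can be sketched as follows. This is a rough illustration only, assuming Greenwood's formula for the standard error of the Kaplan-Meier estimate, independent groups coded 0/1, and a normal approximation on the untransformed survival scale; the function names `km_at`, `wald_ci` and `compare_at` are invented for this example. In small samples or near 0 and 1, a transformed scale such as the complementary log-log usually behaves better than this plain version.

```python
import math
import numpy as np

def km_at(time, event, t0):
    """Kaplan-Meier estimate of S(t0) and its Greenwood standard error."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    surv = 1.0
    greenwood = 0.0
    for t in np.unique(time[(event == 1) & (time <= t0)]):
        n = (time >= t).sum()                    # number at risk at t
        d = ((time == t) & (event == 1)).sum()   # events at t
        surv *= 1.0 - d / n
        if n > d:
            greenwood += d / (n * (n - d))
    return surv, surv * math.sqrt(greenwood)

def wald_ci(surv, se, z_crit=1.96):
    """Plain (untransformed) 95% CI, truncated to the interval [0, 1]."""
    return max(0.0, surv - z_crit * se), min(1.0, surv + z_crit * se)

def compare_at(time, event, group, t0):
    """Wald test of S1(t0) - S0(t0) = 0 for two independent groups (0/1).

    Returns (difference, z statistic, two-sided p-value).
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    group = np.asarray(group, dtype=int)
    s0, se0 = km_at(time[group == 0], event[group == 0], t0)
    s1, se1 = km_at(time[group == 1], event[group == 1], t0)
    se_diff = math.sqrt(se0**2 + se1**2)
    if se_diff == 0:
        raise ValueError("standard error is zero; the test is not defined")
    z = (s1 - s0) / se_diff
    p = math.erfc(abs(z) / math.sqrt(2.0))       # two-sided normal p-value
    return s1 - s0, z, p

if __name__ == "__main__":
    # Made-up toy data, only to show the calls.
    t = [2, 3, 5, 7, 8, 1, 4, 6, 9, 10]
    e = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]
    g = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    s, se = km_at(t[:5], e[:5], t0=6)
    print(s, wald_ci(s, se))
    print(compare_at(t, e, g, t0=6))
```

The time point `t0` has to be fixed in advance (for example, the end of follow-up); picking it after seeing where the curves are furthest apart would invalidate the p-value.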
Perhaps the most useful approach is to estimate the Kaplan-Meier curves along with the 95% confidence intervals (CIs) for the cumulative survival estimates. These estimates and their 95% CIs will enable you to compare the groups at the times that are of interest to you.
Thank you all. I am reporting the point-wise confidence intervals. I decided that the log-rank test might not be appropriate, since the assumption that the relative risk does not change over time might not hold in our scenario.
A method for sequential analysis of survival data with non-proportional hazards. Biometrics 1998; 54: 1072-1084.
Although this method was developed in a sequential setting, it can just as well be used for fixed-sample trials. If you let me know your email address, I can send you the paper.
As a statistician, I would say that randomization is required for any statistical test to be valid; otherwise the results would be biased. If there was any bias in the allocation of patients to treatments, then any effect found by the test could be due either to treatment differences or to the allocation bias, and it is difficult to separate one from the other. Results from any test applied therefore have to confront this problem.
Brent R. Logan, John P. Klein, and Mei-Jie Zhang. Comparing Treatments in the Presence of Crossing Survival Curves: An Application to Bone Marrow Transplantation. Biometrics. 2008; 64(3): 733–740. doi:10.1111/j.1541-0420.2007.00975.x.
John P. Klein, Brent Logan, Mette Harhoff and Per Kragh Andersen. Analyzing survival curves at a fixed point in time. Statist. Med. 2007; 26:4505–4519
In my experience with Kaplan-Meier curves, survival curves that cross rarely give significant p-values. I believe you are right to explore other statistical tests, as I doubt that the KM (or Cox proportional hazards, CPH) approach would be useful in this scenario.
You may use alternatives, such as patient-days or patient-years rather than time to event, plotted against time or cumulative events. Ref.: John P. Costella, A simple alternative to Kaplan–Meier for survival curves.
In my opinion, crossing survival curves do not necessarily invalidate the log-rank test completely. Crossing makes a significant P value much less likely (the test loses power). But if there is a large number of events, and events in one group tend to occur later than in the other group "on average", the test can give a significant result even if the curves cross. This is what the log-rank test indirectly evaluates, namely the time to the event of interest, and I think that if the P value is significant you could conclude that patients in one group died significantly later than in the other (i.e. the net effect is in favor of the new drug). Hazard ratios obtained in this situation, on the other hand, are inaccurate and actually misleading, as they do not hold for the entire study period.
Alternative tests give more weight to earlier or later events and therefore mathematically favor a significant P value in specific situations (more weight on later observations if the curves cross early, or more weight on early events if the curves cross late); in this case they would lead to the same conclusion as the log-rank test. The list of alternative approaches is extensive, and I can suggest this article, which evaluated them: https://www.researchgate.net/publication/273284453_Statistical_Inference_Methods_for_Two_Crossing_Survival_Curves_A_Comparison_of_Methods
I think it is most important that the author of such a paper (where survival curves cross) recognizes the non-proportionality of hazards and reports it, irrespective of whether the log-rank test gives a significant or non-significant P value (and even if the result is significant by an alternative test). This regularly does not happen in the medical literature. It matters because the medical readership does not necessarily understand the assumptions behind statistical tests and can easily be misled by "non-transparent" reporting. Readers usually believe what is presented to them, especially if it is published in a high-impact journal. What I find even more worrisome is that the name "the log-rank test" can be used (and is used by different statistical programs) for three different mathematical procedures that give three similar but noticeably different P values. This opens a whole new space for data manipulation in borderline-significant situations, but is not currently recognized as a problem: https://www.researchgate.net/publication/313548550_Survival_analysis_more_than_meets_the_eye
However, if the curves crossed several times during the follow-up period, as you described, this could indicate that there is no real difference in survival between the two groups, and you will probably obtain a non-significant result with the log-rank test and with some alternative tests. In my opinion, there is also a problem with the clinical interpretation of a potentially significant result from an alternative test (how would you explain that kind of "filtered" survival benefit to a patient considering the new drug?).
I find previous answers to this topic very informative and smart. I wish you luck with your future research.