I need to perform a Monte Carlo simulation in the reliability field. I know how to generate random failure times according to the Weibull CDF F(t) = 1 - exp(-a*t^b), but how do I generate right-censored times, which occur when the last measurement takes place before the failure?
If you know when measurements will be censored, this is doable. It is best explained by an example:
Suppose we run a reliability test with 5 items for 2 years. Just generate 5 samples from the lifetime distribution (Weibull in your example). If any of these 5 samples exceeds 2 years, replace it with 2 years and mark it as a censored measurement.
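A minimal sketch of this fixed-duration (Type I censoring) recipe, assuming the questioner's parameterization F(t) = 1 - exp(-a*t^b); the parameter values, the sample size and the Python/NumPy implementation are my own illustrative choices, not something from the thread:

```python
# Type I censoring sketch: the test stops after a fixed duration, so any
# lifetime exceeding it is recorded as censored at that duration.
# Values of a, b, n_items and test_duration are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(seed=1)

def sample_weibull(n, a, b, rng):
    """Inverse-transform sampling from F(t) = 1 - exp(-a * t**b)."""
    u = rng.uniform(size=n)
    return (-np.log(1.0 - u) / a) ** (1.0 / b)

n_items = 5
test_duration = 2.0                      # years
lifetimes = sample_weibull(n_items, a=0.1, b=1.5, rng=rng)

censored = lifetimes > test_duration     # True where the item outlived the test
observed = np.where(censored, test_duration, lifetimes)
print(list(zip(observed.round(2), censored)))
```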
Dear Joachim, thank you very much for your answer. Unfortunately, I do not know the censoring times in advance. Example: I have a labeled array of times, each labeled either "failure" or "censored". By means of the least-squares method I obtained the parameters of the CDF of the failure times, and now I want to generate a similar array of failure/censored event times.
I have a bit of a problem understanding what exactly you want to do. Could you please state the application (e.g. Matlab) you are using to generate the samples?
Dear Hindolo and Ilya, thanks for your answers. For example, the input array of event times is:
41.1c, 77.8, 83.3c, 88.7c, 101.8, 105.9, 117, 126.9, 138.7, 148.9, 151.3c, 157.3, 163.8, 177.2c, 194.3c, 195.6c, 207, 215.3c, 217.4, 258.8c, where c indicates a censored event. We see that this is not the situation that Joachim mentioned. By means of least squares the following values of the Weibull parameters were obtained: beta = 3.4, theta = 190.
Now I want to generate another sample (of a different size!) containing both failures and censored events, according to these parameters. For failures it is obvious, but how do I generate the censored events?
One possibility: calculate t_failure = F^(-1)(rand) and t_censored = (1-F)^(-1)(rand), where F is the CDF of the failure time and F^(-1) and (1-F)^(-1) are the inverse functions of F and (1-F). Then calculate t_event = min(t_failure, t_censored).
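A minimal sketch of this rule, again assuming the parameterization F(t) = 1 - exp(-a*t^b) from the original question; the parameter values are illustrative assumptions:

```python
# Draw a failure time via the inverse CDF and a censoring time via the inverse
# survival function, then keep the smaller of the two as the observed event.
import numpy as np

rng = np.random.default_rng(seed=2)

def inv_cdf(u, a, b):
    """t = F^(-1)(u) for F(t) = 1 - exp(-a * t**b)."""
    return (-np.log(1.0 - u) / a) ** (1.0 / b)

def inv_survival(u, a, b):
    """t = (1-F)^(-1)(u), i.e. solve exp(-a * t**b) = u."""
    return (-np.log(u) / a) ** (1.0 / b)

a, b, n = 0.1, 1.5, 10
t_failure = inv_cdf(rng.uniform(size=n), a, b)
t_censor  = inv_survival(rng.uniform(size=n), a, b)

t_event  = np.minimum(t_failure, t_censor)
censored = t_censor < t_failure          # flag the events that were censored
print(list(zip(t_event.round(1), censored)))
```

Note that for a uniform random number U, (1-F)^(-1)(U) has the same distribution as F^(-1)(U), so under this rule the censoring time follows the same Weibull law as the failure time.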
Dear Sasha, regarding parameter estimation, you should use the maximum likelihood method.
See Gertsbakh, Reliability Theory with Applications to Preventive Maintenance, Springer, 2000, p. 54. As for the generation, you must know the random censoring mechanism. Right now, it is not clear to me how to "restore" it from the observed data.
Dear Sasha, thanks for the explicit explanation. However, may I ask why you need to sample the censored times? If your aim is survival analysis, the failure time distribution is enough; you do not need to sample the censored times. For instance, to determine the number of surviving elements at time t, simply generate n failure times, n being the total number of elements. The number of failure times exceeding t is the number of surviving elements.
There is a built-in Matlab function that computes Weibull parameters with one line of code; see http://www.mathworks.com/help/stats/wblfit.html for details. This function yielded (200.24, 3.0) as the Weibull parameter set for the data you supplied above.
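A small sketch of the survivor-counting idea from the previous post, using the (theta, beta) parameterization F(t) = 1 - exp(-(t/theta)^beta) with the least-squares estimates quoted earlier in the thread; the sample size and the choice of Python/NumPy are my own assumptions:

```python
# Estimate the number of survivors at time t from n simulated failure times
# and compare with the theoretical survival function.
import numpy as np

rng = np.random.default_rng(seed=3)

theta, beta, n = 190.0, 3.4, 1000
failure_times = theta * rng.weibull(beta, size=n)   # scale * standard Weibull

t = 150.0
survivors = np.count_nonzero(failure_times > t)
print(f"survivors at t={t}: {survivors} of {n} "
      f"(theory: {n * np.exp(-(t / theta) ** beta):.0f})")
```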
Answering the previous letter: about parameter estimation, all correct. But Sasha probably wants to understand the probabilistic mechanism governing the censoring process, and this is not that simple. In my view, there might be many such mechanisms which would produce similar-looking samples.
Dear Hindolo and Ilya, thanks for your answers. The reason for taking censored events into account is the following: to simulate different situations with the same input parameters and to analyse the sensitivity. So Ilya is absolutely right, I "want to understand the probabilistic mechanism governing the censoring process".
Regarding my answer (above): it is, certainly, only one of the possible expressions. You could also use others, e.g., select some value of K and calculate: if t_failure < K*t_censored, then t_event = t_failure; else t_event = t_censored.
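A minimal sketch of this K-rule, assuming (as in the earlier rule) that the candidate censoring time follows the same Weibull law as the failure time; K and the (theta, beta) values are illustrative assumptions:

```python
# K-rule sketch: an event is recorded as a failure only if the failure time is
# smaller than K times the candidate censoring time.
import numpy as np

rng = np.random.default_rng(seed=4)
theta, beta, K, n = 190.0, 3.4, 0.8, 20

t_failure = theta * rng.weibull(beta, size=n)   # F(t) = 1 - exp(-(t/theta)**beta)
t_censor  = theta * rng.weibull(beta, size=n)

failed  = t_failure < K * t_censor
t_event = np.where(failed, t_failure, t_censor)
print(np.round(t_event, 1))
print("fraction of failures:", failed.mean())
```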
The problem is that there might be many random mechanisms producing seemingly similar results. For example: if observation x_i ends in 1, 2, 3 or 4, the next observation is censored by a random variable Z ~ F_1(t); otherwise, the censoring variable is W ~ F_2(t).
Dear Ilya, what is the (possible) conclusion? Using different rules for censoring, we can get essentially different data sets of failures and censored events, but with the same values of the theta and beta parameters! It seems to me that in this case the MLE estimates for these data sets will also be essentially different for any (even very large) sample size. So these estimates will be asymptotically biased, in contrast to regular (failures-only) data sets?
Dear Ilya and Sergey, I have been following your interesting exchange regarding the censored times. Out of curiosity, what is the physical implication of generating the censored times in survival analysis? Cheers!
I fully agree with the reason mentioned by Alex and Ilya: "to understand the probabilistic mechanism governing the censoring process". Another (possible) reason is to generate a very large sample in order to obtain and analyse some "rare-event" estimates.
Some remarks about my answers above ("if t_failure < K*t_censored, t_event = t_failure; else t_event = t_censored..." and "Using different rules for censoring, we can get essentially..."). I have performed some numerical experiments: I generated samples of 100 to 10,000 events (failures + censored) according to some fixed values of the Weibull parameters, and then estimated the parameters by MLE. Short conclusion:
If K >= 1 (fraction of failures > 50%), the results are good.
If K < 0.5 (fraction of failures < 20%), the accuracy is bad for any sample size.
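A rough reconstruction of such an experiment (not the original code): generate events with the K-rule above, then fit (theta, beta) by maximizing the right-censored Weibull likelihood; all parameter values, the sample size and the SciPy-based fitting are my own assumptions:

```python
# Generate a right-censored Weibull sample via the K-rule and recover the
# parameters by maximum likelihood (failures contribute log f(t), censored
# events contribute log S(t)).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(seed=5)

def generate_events(n, theta, beta, K, rng):
    t_failure = theta * rng.weibull(beta, size=n)
    t_censor  = theta * rng.weibull(beta, size=n)
    failed = t_failure < K * t_censor
    return np.where(failed, t_failure, t_censor), ~failed

def neg_log_lik(params, t, censored):
    log_theta, log_beta = params                 # log-scale keeps both positive
    theta, beta = np.exp(log_theta), np.exp(log_beta)
    z = (t / theta) ** beta
    log_f = np.log(beta / theta) + (beta - 1) * np.log(t / theta) - z
    return -(np.sum(log_f[~censored]) - np.sum(z[censored]))

theta_true, beta_true, K, n = 190.0, 3.4, 1.0, 1000
t, censored = generate_events(n, theta_true, beta_true, K, rng)

res = minimize(neg_log_lik, x0=np.log([np.mean(t), 1.0]),
               args=(t, censored), method="Nelder-Mead")
theta_hat, beta_hat = np.exp(res.x)
print(f"failure fraction = {1 - censored.mean():.2f}, "
      f"theta_hat = {theta_hat:.1f}, beta_hat = {beta_hat:.2f}")
```

Varying K and n in such a script is one way to check how the MLE accuracy depends on the fraction of failures in the sample.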