Summary: | Big data analytics focuses on extracting useful insights, trends and patterns from large and complex data. In biostatistics, the sample can be enlarged by resampling the data with bootstrapping techniques. Bootstrapping is a broad and growing family of methods: it is used not only to compute confidence intervals but also as a standard resampling method. Survival data are usually not normally distributed because of censored observations. The small number of samples is another reason this study applies bootstrapping, in order to overcome the issue of bias. Bootstrapping is said to be one of the best methods for handling skewed data. Using bootstrapping, this study therefore aims to find the most significant prognostic factors of lung cancer that affect survival time in the presence of censored observations, by means of parametric survival analysis. Based on sample sizes of 100, 150, 250 and 600, the exponential distribution appeared to fit the data at every sample size, whereas the Weibull and log-logistic distributions appeared to fit only at a sample size of 100. Race and two of the interaction terms in the model appeared to be the most significant prognostic factors affecting the survival time of lung cancer patients. © 2023 IEEE.
|
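
As a rough illustration of the resampling idea described in the summary, the sketch below bootstraps right-censored survival data and refits an exponential model on each resample. It is not the study's procedure: the toy data, the function names (`exponential_rate_mle`, `bootstrap_rates`, `percentile_interval`) and the choice of a percentile interval are all assumptions made for illustration. The only statistical fact relied on is that, under right censoring, the maximum-likelihood estimate of the exponential hazard rate is the number of observed events divided by the total follow-up time.

```python
import random


def exponential_rate_mle(times, events):
    """MLE of the exponential hazard rate under right censoring:
    observed events divided by total follow-up time."""
    return sum(events) / sum(times)


def bootstrap_rates(times, events, sample_size, n_boot=1000, seed=0):
    """Resample (time, event) pairs with replacement and refit the
    exponential rate on each resample of the requested size."""
    rng = random.Random(seed)
    pairs = list(zip(times, events))
    rates = []
    for _ in range(n_boot):
        resample = [rng.choice(pairs) for _ in range(sample_size)]
        t, e = zip(*resample)
        rates.append(exponential_rate_mle(t, e))
    return rates


def percentile_interval(values, level=0.95):
    """Simple percentile interval taken from the sorted bootstrap estimates."""
    ordered = sorted(values)
    lo_idx = int((1 - level) / 2 * (len(ordered) - 1))
    hi_idx = int((1 + level) / 2 * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]


# Hypothetical toy data (NOT from the study): follow-up times and
# event indicators (1 = death observed, 0 = censored).
times = [5.0, 8.0, 12.0, 3.0, 20.0, 7.0, 15.0, 9.0]
events = [1, 1, 0, 1, 0, 1, 1, 0]

point_estimate = exponential_rate_mle(times, events)
# sample_size=100 echoes one of the sizes mentioned in the summary.
rates = bootstrap_rates(times, events, sample_size=100, n_boot=200)
lower, upper = percentile_interval(rates)
print(f"rate = {point_estimate:.4f}, 95% interval = ({lower:.4f}, {upper:.4f})")
```

Resampling whole (time, event) pairs keeps each censoring indicator attached to its own follow-up time. Comparing exponential, Weibull and log-logistic fits, as the summary describes, would need a full parametric survival library rather than this closed-form estimate.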