Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

On Cox proportional hazards model performance under different sampling schemes

  • Hani Samawi ,

    Roles Writing – original draft

    hsamawi@georgiasouthern.edu

    Affiliation Department of Biostatistics, Epidemiology and Environmental Health Sciences, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, Georgia, United States of America

  • Lili Yu,

    Roles Software, Writing – review & editing

    Affiliation Department of Biostatistics, Epidemiology and Environmental Health Sciences, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, Georgia, United States of America

  • JingJing Yin

    Roles Software, Writing – review & editing

    Affiliation Department of Biostatistics, Epidemiology and Environmental Health Sciences, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, Georgia, United States of America

Abstract

Cox’s proportional hazards model (PH) is an acceptable model for survival data analysis. This work investigates PH models’ performance under different efficient sampling schemes for analyzing time to event data (survival data). We will compare a modified Extreme, and Double Extreme Ranked Set Sampling (ERSS, and DERSS) schemes with a simple random sampling scheme. Observations are assumed to be selected based on an easy-to-evaluate baseline available variable associated with the survival time. Through intensive simulations, we show that these modified approaches (ERSS and DERSS) provide more powerful testing procedures and more efficient estimates of hazard ratio than those based on simple random sampling (SRS). We also showed theoretically that Fisher’s information for DERSS is higher than that of ERSS, and ERSS is higher than SRS. We used the SEER Incidence Data for illustration. Our proposed methods are cost saving sampling schemes.

Introduction

The survival time of a particular event is called the time-to-event. The time of death and time to develop a disease are examples of survival data. Statistical methods for survival analysis have been applied to many vital fields of research. Generally, survival analysis uses data to predict survival probability and identify risk and/or prognostic factors related to subjects’ survival and disease progression. An essential aspect of survival data is not usually fully observed in all subjects under study, leading to different censored data types.

Subjects in a study are usually assumed to be selected randomly (interred the study randomly) in the sense of simple random sample (SRS) [1]. Another promising sampling method in the literature is Ranked set sampling (RSS) introduced by McIntyre [2, 3]. RSS is a useful cost affected sampling scheme for agriculture and environmental studies. Samawi and Al-Sagheer [4] applied RSS in a study involving human subjects by collecting data in the study based on bilirubin’s level in the jaundice premature babies’ blood. Also, recently Jabrah et al. [5] used RSS in a study involving human subjects, and they selected college students to analyze a psychological intervention to buttress resilience study. More applications of RSS and variations of RSS and their efficiency for estimation population means and other parameters are presents recently by several authors such as [614] among others.

RSS (Balanced) is collected by drawing m2 items randomly from a target population and then randomly dividing them into m sets of m units. Units from each set of size m then ranked virtually, or by the mean of the available auxiliary variable (concomitant) that associated with the variable of interest. Among the first set of m units, we choose the unit with the lowest rank and measure the unit’s actual value. In the second set unit with the second-lowest ranking is measured, and so on until the unit with the highest ranking is measured from the last set. It is essential to point out that the ranking error will reduce RSS’s efficiency compared with SRS in some inferential procedures. Therefore, it is crucial to choose a ranking method that improves the ranking precision [15, 16].

Furthermore, when the set size m is large, selecting an RSS sample of size m is susceptible to ranking errors, so the literature suggested m not to excide 5. However, to overcome size r of m restriction to reduce ranking errors, several modifications and variations of the RSS are suggested in the literature. Samawi et al. [17] were the first to introduce the ERSS sampling scheme, while Al-Odat and Al-Saleh [18] proposed moving Extreme Ranked Set Sampling (MERSS). Al-Saleh and Samawi [19] and Samawi and Al-Saleh [20] implemented both MERSSmax and MERSSmin to provide valid odds and more efficient estimates of the odds ratio, respectively. Samawi et al. [21] investigates the use of MERSSmin (max) to improve the Cox proportional hazards model’s efficiency. This modified sample’s advantage is only the maximum (or minimum) sets of varied sizes are identified for quantification in this procedure. Therefore, even for large m, MERSS, or ERSS, can be easily implemented.

As another modification of RSS, is the Double Extreme Ranked Set Sampling (DERSS) sampling scheme was introduced by Samawi [22] for the mean and regression estimators. Helu et al. [23] improved the AFT survival model’s analysis efficiency using a modified version of ERSS, namely ERSSmin (ERSSmax). Also, Samawi et al. [24] used the modified MERSS to improve the logistic regression analyses’ performance. Recently, Samawi et al. [25] further enhances logistic regression analysis using modified DERSS.

The DERSSmin (DERSSmax) procedure, which is an extension to ERSS procedure, is suggested here. This procedure involves randomly drawing m independent sets each contains m random samples of size m each (note that each set has m2 sample units). Each sample is drawn from a large population. We assume that the largest or the smallest sample unit within each sample with respect to the value of an auxiliary variable Z, which is associated with survival time, can be identified with no or little cost. From each set of size m2 use ERSS procedure, as described by Samawi at el. [17] to obtain m ERSSmin (or ERSSmax) samples of size m each (this completes the first stage). Again apply the same ERSSmin (or ERSSmax) procedure on the obtained m ERSSmin (or ERSSmax) samples of size m each to obtain a sample of size m, which will be called the second stage of ERSSmin (or ERSSmax) or Double Extreme Ranked Set Sample DERSSmin (or DERSSmax). This will complete one cycle. The cycle may be repeated r times to obtain a DERSS of size n = rm.

This work is the first to discusses the implementation and the performance of the proportional hazards model when the subjects are selected based on the modified ERSS (ERSSmin or ERSSmax) and DERSS (DERSSmin or DERSSmax) sampling scheme. In this work, we assume that the ranking of subjects, including in the study, is based on one of the available auxiliary variables associated with the variable of interest (time–to–event). Also, we are the first to show theoretically that the proposed sampling schemes improve the Fisher’s information compared with SRS. This implies that Fisher’s information for DERSS is higher than that of ERSS, and ERSS is higher than SRS. The rest of the paper is as follows: Section 2 provides basic survival analysis, definitions, DERSS procedure, and notation. The analysis using the Cox proportional hazards model and its properties using DERSS is discussed in section 3. In Section 4, we provide our simulation study to compare the Cox proportional hazards model’s performance using the modified DERSS to ERSS and SRS. We illustrate the proposed methods using the SEER Incidence Data in section 5. Final remarks are provided in Section 6.

Preliminaries and notations

Basic definitions

Let S(t) denote the survival function at time t with hazard rate function denoted by h(t) and f(t) as p.d.f for a random variable of time to an event T. Furthermore, survival data is usually not fully observed, and data censoring procedure occurs. The right censoring occurs when the subject has not experienced the event of interest at a given time, which means that we observe only the lower bound for the value of t of the censored individuals. On the other hand, type I censoring occurs when the event is unobserved before some pre-specified time, such as at the closing of a study. In this work, we focus on type I right censoring.

Likelihood function for type I right censored data

In survival data, we define ti = min(Ti, Ci)Ti the true survival time and Ci the censoring time for the ith individual. Let δi be an indicator variable, such that (1)

The likelihood function for n observations is given by (2) where β is the set of parameters of risk factors.

Sample notation and some basic results of the modified ERSS and DERSS

As in Samawi (2002), let denotes the m independent sets for the kth cycle, each with sample size m2 from a p.d.f. f(x) and a distribution function F(x) (absolutely continuous). Using similar notations as in Samawi et al. [21] for selecting DERSSmin, and hence also ERSSmin, we have, after ranking the sample units within each sample in each set, visually or by any not costly way, we obtain ERSSmin as, (X1(1)k, X2(1)k,…,Xm(1)k, k = 1,2,…,r) of size m.r. Then DERSSmin is obtained as , ,…., . Then Y1(1)k, Y2(1)k,…,Ym(1)k k = 1, 2,…, r, denotes DERSSmin. However, selecting DERSSmax is similar but selecting the maximum instead of the minimum. Samawi et al. [25] showed that the p.d.f of the smallest and the largest order statistics of an i.i.d sample of size m with p.d.f fX(x) are respectively given by: fX(1)(x) = m(1−FX(x))m−1 fX(x) and fX(m)(x) = m(FX(x))m−1 fX(x) Also, let Yi(1)k have p.d.f g(1)(y) and c.d.f G(1)(x) and Yi(m)k have p.d.f g(m)(y) and c.d.f G(m)(y) where i = 1,2,…,m and k = 1, 2,…, r., then

  1. ,

Cox proportional hazards model using ERSSmin and DERSSmin

This section investigates the Cox Proportional Hazards Model using ERSSmin and DERSSmin when the survival time and the ranked auxiliary variable (X) are positively correlated. When they have a negative correlation, we can use either ERRSmax or DERSSmax, and all consequence results can be derived in similar manners. We will focus on DERSSmin in our consequence discussion.

For the kth cycle, k = 1,2,…,r, let Y1(1)k,Y2(1)k,…,Ym(1)k be the measurements obtained on the auxiliary variable, using DERSSmin with size n = rm. Moreover, let the vector Vi[1]k = (Xi[1]k…,Xi[1]kp, Yi(1)k)′, i = 1,2,…,m;k = 1,2,…,r represent the p+1 explanatory variables’ observations and one auxiliary variable obtained from the sampled unit in the ith set and kth cycle. Note that the notation (.) stands for perfect ranking while the notation [.] stands for imperfect ranking.

The partial likelihood is usually used to estimate the model parameters β = (β1, β2,…,βp+1)′ in the Cox model. Under SRS assumptions, Cox and Oakes [26] showed that the estimators of β = (β1, β2,…,βp+1)′ from partial likelihood are consistent and asymptotically normal distributed. However, Cox and Oakes [26] showed a loss of precision in estimation by using partial likelihood compared to the full likelihood. Also, they indicated that the sample size and the proportion of censoring observation affect inference precision. When the percentage of censoring increases and the sample size is small, the estimators from partial likelihood lose more precision. We propose to use DERSS to improve the accuracy of inferences based on the partial likelihood estimation of Cox model parameters.

Partial likelihood for the Cox model

To estimate the parameter in the Cox proportional hazards model, we order the random sample of size n individuals according to the rank of their survival times, t1 < t2 <… < tn, assuming there are no ties for the observations. Instead of using the full likelihood function in (5), Cox [27] proposed partial likelihood as follows, (3)

It is the consequence of the conditional probability that an individual experiences an event at time ti[1]k, (i = 1,2,…,m;k = 1,2,…,r) given that the individual in the event’s risk set at ti[1]k, where R(ti[1]k) be the set of all individuals who have not experienced the event and are uncensored just prior to ti[1]k. Under DERSSmin, and as described by Samawi et al. [21] the partial likelihood function (PLHF) can be written as (4)

When the observation is censored, the product of the conditional probabilities is 1 and equivalently can be written as (5) where (rd.m) is the number of events (failed subjects) in a sample of size (n = r.m). The log-likelihood can be written as, (6)

Since the partial likelihood function depends on the ordering of survival times and then the baseline hazard distribution is inherent in ordering the survival times, and no shape is described. Therefore, the produced partial likelihood is only based on the uncensored observations, and hence this may result in a large sample size requirement. However, in contrast to SRS, ERSS and DERSS naturally provide a larger percentage of events, as indicated by Samawi et al. [21, 25] and may result in requiring a smaller sample size. For estimating the jth parameter (βj), we need to solve the following equation (7) where is the weight of the hth failed subject because . Clearly, the MLE ()is the solution of Eq (7).

For the consistency and the asymptotic properties of , see [26].

To show the superiority of using ERSS and DERSS proposed in this paper, we need to derive the Fisher’s information matrix of , say I(β). To find I(β) we take the second derivative of L(β):

For the jth covariate, we have (8) which is a function of the ranked auxiliary covariate (Y) used to select subjects from the population using the DERSSmin scheme randomly and assumed to have some distribution FY(y), with a continuous density fY(y). As suggested by [21, 25] ranking on this variable induces a corresponding ordering on the response variable (T), which leads to improved precision. When using DERSSmin, we have taken the minimum judgment ranking from each set of size m, and as shown by [25], the density function of Y(1) is . Then the Fisher information using DERSSmin is given by:

Since we assume that βP+1 > 0, then the number of events and R(ti[1]k) will be increased. This implies that gh(y(1); R(ti[1]k)) is a decreasing function of y. Also, is a decreasing function of y, thus as in See and Chen [28]

However, , therefore, (9)

Hence, the last inequality in [15] can be argued similarly. These inequalities are still held when the correlation between the survival time and Y is negative, and DERSSmax is used. The inequalities in [15] show that using DERSSmin provides more information than ERSS and ERSS provides more information than SRS, and hence DERSSmin may require a smaller sample size to achieve similar precision as for ERSS or SRS.

Simulation study and results

For comparison purposes, we used simulation to evaluate the Cox proportional hazards model’s performance when ERSSmin and DERSSmin are used. We assume that the Ranked Auxiliary Covariate is positively correlated with survival time. We compared DERSSmin with both ERSSmin and SRS. The power of testing the hypothesis of no treatment effect after controlling for the auxiliary variables is compared. We also studied the performance of estimating the conditional hazard ratios for risk factors and their confidence intervals. We used different conditional hazard ratios and different values of the association between survival time and the auxiliary variables. Two combinations of sample sizes of m and r are considered used in the simulations, i.e., (m = 20, r = 10), and (m = 30, r = 20). We used 5000 simulated samples in our simulation.

The parameter β1 is the association between T and Y, and β2 is the risk factor parameter. The simulation results are given in Tables 13 for a dichotomous risk factor, while the results of a continuous risk factor are presented in Tables 46. We estimated the empirical nominal value (α = 0.05) and the tests’ power (see Tables 1 and 4). The estimated conditional hazard ratios and their MSEs are reported in Tables 2 and 5, while the 95% confidence intervals for hazard ratios are reported in Tables 3 and 6.

thumbnail
Table 1. Evaluating the (α = 0.05) and the power of testing Ho: β2 = 0 vs Ha: β2 ≠ 0 adjusting for (Z) in the model.

(Binary risk factor) {Censoring variable = ci = U(0,1)*1.5}.

https://doi.org/10.1371/journal.pone.0278700.t001

thumbnail
Table 2. Estimation of Hazard ratio (HR) estimation and their MSE (Binary risk factor) {Censoring variable = ci = U(0,1)*1.5}.

https://doi.org/10.1371/journal.pone.0278700.t002

thumbnail
Table 3. Estimating 95% confidence Interval length and coverage probability (CP) of the Hazard Ratio (HR) (Binary risk factor) {Censoring variable = ci = U(0,1)*1.5}.

https://doi.org/10.1371/journal.pone.0278700.t003

thumbnail
Table 4. Evaluating (α = 0.05) and the power of testing Ho: β2 = 0 vs Ha: β2 ≠ 0 adjusting for the auxiliary variable (Z) in the model.

(Continuous risk factor) {Censoring variable = ci = U(0,1)*1.5}.

https://doi.org/10.1371/journal.pone.0278700.t004

thumbnail
Table 5. Estimating Hazard ratio (HR) estimation and their MSE (Continuous risk factor) {Censoring variable = ci = U(0,1)*1.5}.

https://doi.org/10.1371/journal.pone.0278700.t005

thumbnail
Table 6. Estimating 95% confidence interval length and coverage probability (CP) of the Hazard Ratio (HR) (Continuous risk factor) {Censoring variable = ci = U(0,1)*1.5}.

https://doi.org/10.1371/journal.pone.0278700.t006

From Table 1, we can conclude that, when the risk factor is dichotomous, testing the hypothesis of the risk factor’s effect, controlling for the Ranked Auxiliary Covariate Y in the model, the DERSSmin results in a more powerful test than SRS and ERSSmin. All SRS, ERSSmin, and DERSSmin achieved the test nominal value (0.05) under the null hypothesis in all cases. We demonstrated the test’s power increases as the set size m increase and/or the value of β1 increases. However, using DERSSmin in the Cox proportional hazard model provides greater power than SRS and ERSSmin, in all cases. Similarly, from Tables 2 and 3, DERSSmin has smaller MSE and narrower confidence intervals in estimating the hazard ratios. Similar results are concluded from Tables 46 when the risk factor is a continuous variable.

Application using the SEER incidents data set

We illustrate the Cox proportional hazards model based on DERSS, ERSS, and SRS using SEER Incident Data [29]. We selected histologic grade as the primary predictor of interest and month of life after a breast cancer diagnosis as the outcome. Due to breast cancer, deaths were considered death and being alive, or Deaths due to other causes deemed to be censored. In the Cox proportional hazards model, we also include age and tumor marker1 as confounders. The data included 454,517 individuals, and 233,125 have complete data for a grade (1: well-differentiated; 2: moderately differentiated; 3: poorly differentiated; 4: undifferentiated). Our analysis was based on complete data. Since grade levels 3 and 4 are crossed on the survival plots, which violates the proportional hazards assumption for the Cox model, we combined grade 3 and grade 4 into one single group as poorly differentiated or undifferentiated and denoted the new grade variable (1: well-differentiated; 2: moderately differentiated; 3: poorly differentiated or undifferentiated).

We treated the entire data set of 233,125 observations as a population and randomly selected ERSS and DERSS samples (with m = 15, r = 20). We selected ERSS and DERSS based on ranking on the lifetime directly since most of the data set variables are categorical. The only potential continuous variable for ranking is age, which is only weakly associated with lifetime with hazards ratio close to 1. We performed the hypothesis testing of no difference in survival time between the patients with a different histologic grade at diagnosis, adjusting for the covariate age. Table 7 represents the survival analysis results for a Cox proportional hazards model using all available data. The estimation of parameters and the conditional hazard ratios are treated as actual for comparison purposes in this example.

thumbnail
Table 7. Variables in the Equation using all the data (n = 233,125).

https://doi.org/10.1371/journal.pone.0278700.t007

Table 7 and Fig 1 are the results of the survival analysis based on the entire data. Fig 1 indicates that the proportional hazard assumption of the model is valid. Also, Fig 1 shows that the survival chance for grade 1 is higher than grade2 and grade 3, and grade 2 is higher than grade 3 of the disease. However, based on the p-values, the test of the effect of grade 3 on survival time compared with grade 1 and 2 was significant, controlling for age, based on the 500 bootstrap samples of SRS, ERSS, and DERSS of size n = 300. Both ERSS and DERSS used survival time for ranking and chose the samples with the minimum survival time. The DERSS samples give smaller p-values for the majority of the time and coincide with the whole data analysis. The estimated power out of the 5000 runs was 0.944, 0.930, and 0.856, respectively, under DERSS, ERSS, and SRS. Therefore, whenever possible, ERSS and/or DERSS are to be recommended for survival analysis.

thumbnail
Fig 1. Survival function for the grades (1: Well-differentiated; 2: Moderately differentiated; 3: Poorly differentiated or undifferentiated) using all complete data (n = 233,125).

https://doi.org/10.1371/journal.pone.0278700.g001

Final remarks and conclusion

It is essential in collecting data to have a sampling scheme that is cost-effective and less time-consuming. ERSS and DERSS, which are RSS modifications, are encouraging sampling techniques that can effectively have more efficient estimators with less cost than SRS. We proposed a more efficient survival regression analysis method based on Double Extreme Ranked Set Sampling (DERSS) and Extreme Ranked Set Sampling (ERSS). Subjects are assumed to be ranked based on an auxiliary variable associated with the response variable. We discussed parameter estimation based on the maximum likelihood approach and provided an expression for the estimated variance based on the inverse information matrix. Also, we discussed the asymptotic behavior of the ML estimators. Our findings conclude that using ERSS and DERSS can significantly increase power when used in a Cox proportional hazards model. We show that the test’s power increases as the set size m increases through the simulation studies. Moreover, ERSS and DERSS provide more efficient estimates of the parameters associated with hazard ratios in smaller MSEs and narrower confidence intervals than those under SRS.

References

  1. 1. Scheaffer R. L., Ill W. M., and Ott R. L. (2006). Elementary Survey Sampling. 6th edition, Thomson, Brooks/Cole, USA.
  2. 2. McIntyre G.A. (1952). A method for unbiased selective sampling, using ranked sets. Australian Journal of Agriculture Research, 3, 385–390.
  3. 3. Mode N. A., Conquest L. L., and Maeker D. A. (1999). Ranked set sampling for ecological research: Accounting for the total costs of sampling. Environmetrics, 10, 179–194.
  4. 4. Samawi H. M., and Al-Sagheer O. A. (2001). On the estimation of the distribution function using extreme and median ranked set samples. Biometrical Journal, 43(3), 357–373.
  5. 5. Jabrah R., Samawi H., Vogel R., Rochani H., Linder D. and Klibert J. (2017). Using ranked auxiliary covariate as a more efficient sampling design for ANCOVA model: Analysis of a psychological intervention to buttress resilience. Communications for Statistical Applications and Methods, 24(3), 241–257.
  6. 6. Bhushan S., Kumar A., Shahzad U., Al-Omari A.I., Almanjahie I, (2022a). On some improved class of estimators by using stratified ranked set sampling, Mathematics, 10, 3283.
  7. 7. Bhushan S., Kumar A., Shahab S., Lone S.A., and Almutlak S.A. (2022b). Modified class of estimators using ranked set sampling. Mathematics, 10, 3921.
  8. 8. Bhushan S., Kumar A. and Lone S.A. (2022c). On some novel classes of estimators under ranked set sampling. AEJ—Alexandria Engineering Journal, 61, 5465–5474.
  9. 9. Bhushan S., Kumar A., Akhtar M.T. and Lone S.A. (2022d). Logarithmic type predictive estimators under simple random sampling. AIMS Mathematics, 7(7), 11992–12010.
  10. 10. Bhushan S., Kumar A., Shahab S., Lone S.A. and Akhtar M.T. (2022e). On efficient estimation of population mean under stratified ranked set sampling. Journal of Mathematics, 2022(3), 1–20.
  11. 11. Chen Z. (2007). Ranked set sampling: Its essence and new applications. Environ. Ecol. Stat. 14, 355–363.
  12. 12. Shahzad U., Ahmad I., Oral E., Hanif M., Almanjahie I., (2021). Estimation of the population mean by successive use of an auxiliary variable in median ranked set sampling., Mathematical Population Studies, 28, 3, 176–199.
  13. 13. Shahzad U., Hanif M., Koyuncu N., Garcia Luengo A.V., (2018). A new estimator of mean under ranked set sampling alongside the sensitivity issue, Journal of Statistics and Management Systems, 21, 8, 1553–1564.
  14. 14. Shahzad, U., Hanif, M., Koyuncu, N., Garcia Luengo, A.V., (2019). A regression estimator for mean estimation under ranked set sampling alongside the sensitivity issue., Communications Faculty of Sciences University of Ankara, Series A1: Mathematics and Statistics, 65, 2, 2037–2049.
  15. 15. Chen H., Stasny E. A., and Wolfe D. A. (2005). Ranked set sampling for efficient estimation of a population proportion. Statistics in medicine, 24(21), 3319–3329. pmid:16100735
  16. 16. Stokes S. L. (1977). Ranked set sampling with concomitant variables. Communications in Statistics—Theory & Methods. 6, 1207–1211.
  17. 17. Samawi H. M., Ahmed M. S., and Abu‐Dayyeh W. (1996). Estimating the population mean using extreme ranked set sampling. Biometrical Journal, 38(5), 577–586.
  18. 18. Al-Odat M., and Al-Saleh M. F. (2001). A variation of ranked set sampling. Journal of Applied Statistical Science, 10(2), 137–146.
  19. 19. Al-Saleh M. F., and Samawi H. (2010). On estimating the odds using Moving extreme ranked set sampling. Statistical Methodology, 7(2), 133–140.
  20. 20. Samawi H. M., and Al-Saleh M. F. (2013). Valid estimation of odds ratio using two types of moving extreme ranked set sampling. Journal of the Korean Statistical Society, 42(1), 17–24.
  21. 21. Samawi H., Yu L., Rochani H. and Vogel R. (2020). Reducing Sample Size Needed for Cox-Proportional hazards model Analysis Using More Efficient Sampling Method. Communication of Statistics: Theory and Methods, 49 (6), 1281–1298.
  22. 22. Samawi H. M. (2002). On Double Extreme Ranked Set Sample with Application to Regression Estimator. Metron, LX(1–2), 53–66.
  23. 23. Helu A., Samawi H., and Rochani Haresh (2019). Further Increasing Fisher’s Information for Parameters of Association in Accelerated Failure Time Models via Double Extreme Ranks. Pakistani Journal of Statistics, 15(3), 679–700.
  24. 24. Samawi H., Rochani H., Linder D. and Chatterjee A. (2017). More Efficient Logistic Analysis Using Moving Extreme Ranked Set Sampling. Journal of Applied Statistics 44(4), 753–766.
  25. 25. Samawi H., Zhang X., and Rochani H. (2020). Further improve the performance of logistic regression analysis using double extreme ranking. Accepted: journal of Statistical Theory and practice.
  26. 26. Cox D.R and Oakes (1984). Analysis of Survival Data. London: Chapman and Hall.
  27. 27. Cox D. R. (1972). Regression models and life-table. Journal of the Royal Statistical Society. Series B (Methodological), 34 (2), 187–220.
  28. 28. See C. and Chen J. (2008). Inequalities on the variance of convex functions of random variables. Journal of inequalities in pure and applied mathematics. 9(3):1–5.
  29. 29. SEER Research Data (1973–2015).Surveillance, Epidemiology, and End Results (SEER) Program (www.seer.cancer.gov) Research Data (1973–2015), National Cancer Institute, DCCPS, Surveillance Research Program, released April 2018, based on the November 2017 submission.