A New Powerful Nonparametric Rank Test for Ordered Alternative Problem

  • Guogen Shan,

    guogen.shan@unlv.edu

    Affiliation Epidemiology and Biostatistics Program, Department of Environmental and Occupational Health, School of Community Health Sciences, University of Nevada Las Vegas, Las Vegas, Nevada, United States of America

  • Daniel Young,

    Affiliation Division of Health Sciences, University of Nevada Las Vegas, Las Vegas, Nevada, United States of America

  • Le Kang

    Affiliation Department of Biostatistics, Virginia Commonwealth University, Richmond, Virginia, United States of America

Abstract

We propose a new nonparametric test for the ordered alternative problem based on the rank difference between two observations from different groups. These groups are assumed to be independent of each other. The exact mean and variance of the test statistic under the null distribution are derived, and its asymptotic distribution is proven to be normal. Furthermore, an extensive power comparison between the new test and other commonly used tests shows that the new test is generally more powerful than the others under various conditions, including the same type of distribution and mixed distributions. A real example from an anti-hypertensive drug trial is provided to illustrate the application of the tests. The new test is therefore recommended for use in practice because it is easy to calculate and offers a substantial power gain.

Introduction

The problem of statistically testing the equality of three or more populations has been studied for decades, and many efficient nonparametric tests have been proposed. Kruskal and Wallis [1] introduced a nonparametric test for a general alternative where at least two independent populations differ in median under the alternative. This test does not identify the pairwise group differences or the number of these differences. Specific ordered alternatives, such as the trend among groups, may be more interesting to practitioners and researchers. Many tests have been proposed for different types of ordering alternatives, for example, the test proposed by Mack and Wolfe [2] for an umbrella alternative, the one proposed by Fligner and Wolfe [3] for a tree alternative, the Cochran-Armitage test [4], [5] for a monotonic alternative with binary endpoints, and the Jonckheere-Terpstra (JT) test [6], [7] for a monotonic alternative with continuous endpoints.

The monotonic ordering problem with continuous endpoints occurs frequently in a wide range of statistical and medical applications [8], [9]. For example, in typical toxicity studies, the risk of adverse events that are caused, or possibly caused, by the treatment's action is often expected to rise with increasing doses. This problem has received considerable attention in the literature. After Jonckheere [6] and Terpstra [7] developed the nonparametric test for the nondecreasing ordered alternative based on the Mann-Whitney (MW) testing procedure, many nonparametric tests have been developed for this problem based on the MW test or other tests. Neuhäuser et al. [10] introduced a modified JT (MJT) test weighted by the distance between groups, and this test was shown to be more powerful than the JT test in small sample sizes because its null sampling distribution is less discrete; the power gain vanishes as the sample size increases. This MJT test is a special case of the generalized JT test proposed by Tryon and Hettmansperger [8]. Cuzick [11] extended the Wilcoxon rank sum test to the k-sample ordered problem (referred to as the CU test). The CU test is a special case of the linear rank test, and is a locally most powerful test for location shifts under the logistic distribution [12]. Later, Le [13] proposed a test for monotonic ordering alternatives analogous to the Kruskal-Wallis test, which was shown to be equivalent to the CU test when the sample sizes were equal across groups. Mahrer and Magel [14] compared the JT test, the CU test, and the Le test numerically and found that all three tests were comparable in terms of power. Most of the aforementioned tests are constructed on pairwise comparisons. More recently, Terpstra and Magel [15] proposed a nonparametric test based on simultaneous comparisons with one observation from each group. In addition, interested readers are referred to Kössler [16] and Alonzo et al. [17].

In this article, we propose a new nonparametric test for the monotonic ordering problem based on the rank difference between two observations from different independent groups. The commonly used JT test statistic is calculated as the total number of pairs whose observation in the second group is greater than that in the first group. In addition to the sign of difference between two observations, the actual difference is also important to detect the ordered alternative. The actual difference can be measured by the rank difference in the nonparametric setting. The new nonparametric test captures not only the sign of the difference between observations, but also the value of the difference. We are the first to propose this new idea for detecting a monotonic ordering, and it can be readily extended to other important statistical problems.

The remainder of this article is organized as follows. In Section 2, we introduce the proposed new nonparametric rank test, derive the exact mean and variance of the test statistic under the null hypothesis, and prove the asymptotic null distribution. In Section 3, we compare the performance of the proposed test and other commonly used nonparametric tests with regard to power under a wide range of conditions. A real example from an anti-hypertensive drug trial is given in Section 4 to illustrate the application of the nonparametric tests. Section 5 is devoted to discussion and future work.

Nonparametric tests

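The underlying distribution functions of $k$ independent populations are assumed to be absolutely continuous and of the form $F_i(x) = F(x - \theta_i)$, where $\theta_i$ is the location parameter for the $i$th group, $i = 1, 2, \ldots, k$. The total number of subjects in the study is $N$, with $n_i$ subjects in the $i$th group, and $N = \sum_{i=1}^{k} n_i$. There is no difference among the populations under the null hypothesis, and the distributions under the monotone ordering alternative differ by their location parameters $\theta_i$. Specifically, the hypotheses are

$$H_0: \theta_1 = \theta_2 = \cdots = \theta_k$$

and

$$H_a: \theta_1 \le \theta_2 \le \cdots \le \theta_k, \text{ with at least one strict inequality.}$$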

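Let $X_{ij}$ be the $j$th observation in the $i$th group, and let $R_{ij}$ denote the rank in the combined data for the $j$th observation in the $i$th group, where $i = 1, 2, \ldots, k$ and $j = 1, 2, \ldots, n_i$. The commonly used JT test is based on the $k(k-1)/2$ possible pairwise comparisons between two groups, and within each two-group comparison the MW test statistic [18] is used. The JT test statistic is expressed as

$$JT = \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} U_{ij},$$

where $U_{ij} = \sum_{l=1}^{n_i} \sum_{m=1}^{n_j} I(X_{il} < X_{jm})$ is the MW test statistic for comparing the $i$-th and $j$-th population, $I(a) = 1$ if $a$ is true, and $I(a) = 0$ otherwise.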
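The JT statistic described above can be sketched in a few lines of Python. This is a minimal illustration, assuming continuous data without ties; the function names and the toy data are hypothetical and are not taken from the authors' R program.

```python
from itertools import combinations


def mann_whitney_count(x, y):
    """Number of pairs (a, b), a in x and b in y, with a < b."""
    return sum(1 for a in x for b in y if a < b)


def jt_statistic(groups):
    """Jonckheere-Terpstra statistic: MW counts summed over group pairs i < j."""
    return sum(mann_whitney_count(groups[i], groups[j])
               for i, j in combinations(range(len(groups)), 2))


# Hypothetical data: three groups with a perfectly increasing trend.
groups = [[1.2, 2.5], [3.1, 4.0], [5.7, 6.3]]
print(jt_statistic(groups))  # 12: every one of the 3 x 4 between-group pairs is ordered
```

Large values of the statistic support an increasing trend; a perfectly decreasing arrangement of the same data gives 0.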

2.1 Existing nonparametric tests.

In addition to the JT test, we considered three more frequently used nonparametric tests for monotonic ordering alternative problems to compare their performance with that of the new proposed test. They are the modified JT (MJT) test introduced by Neuhäuser et al. [10], the test proposed by Terpstra and Magel [15] (referred to as the TM test), and the CU test proposed by Cuzick [11] based on the Wilcoxon rank sum test. The MJT test is a special case of generalized versions of the JT test with the weight as the distance between the groups, and the test statistic is given as

$$MJT = \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} (j - i)\, U_{ij},$$

where $U_{ij}$ is the MW test statistic for comparing the $i$th and $j$th groups.

Neuhäuser et al. [10] showed that the MJT test has an actual type I error closer to the nominal level and is substantially more powerful than the common JT test in small sample sizes.
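The distance weighting can be sketched as follows. This is an illustrative implementation that assumes the weight for groups $i < j$ is the difference of group indices, $j - i$; the function name and data are hypothetical.

```python
from itertools import combinations


def mjt_statistic(groups):
    """Modified JT statistic: each pairwise MW count is weighted by j - i."""
    total = 0
    for i, j in combinations(range(len(groups)), 2):
        u_ij = sum(1 for a in groups[i] for b in groups[j] if a < b)
        total += (j - i) * u_ij  # groups further apart contribute more
    return total


# Hypothetical data: three groups with a perfectly increasing trend.
groups = [[1.2, 2.5], [3.1, 4.0], [5.7, 6.3]]
print(mjt_statistic(groups))  # 16 = 1*4 + 2*4 + 1*4
```

With only two groups the single weight equals 1, so the statistic reduces to the MW count.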

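Terpstra and Magel [15] introduced a nonparametric test based on the k-tuplet simultaneous comparison, not the pairwise comparison as in the JT test. A k-tuplet is constructed with one observation from each group, and the total number of k-tuplets is $\prod_{i=1}^{k} n_i$. The TM test statistic is

$$TM = \sum_{i_1=1}^{n_1} \sum_{i_2=1}^{n_2} \cdots \sum_{i_k=1}^{n_k} I(X_{1 i_1} \le X_{2 i_2} \le \cdots \le X_{k i_k}),$$

where $X_{ij}$ is the $j$th observation in the $i$th group and $I(\cdot)$ is the indicator function; ties have probability zero for continuous data.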

It is noted that the MW test is a special case of the TM test when $k = 2$.
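A brute-force sketch of the k-tuplet count is given below. It assumes the statistic counts the k-tuplets whose entries are nondecreasing in group order, which for continuous data is the same as strictly increasing; the function name and data are hypothetical. The enumeration grows with the product of the group sizes, so this is only practical for small samples.

```python
from itertools import product


def tm_statistic(groups):
    """Count k-tuplets (one observation per group) that are ordered by group."""
    return sum(1 for tup in product(*groups)
               if all(a <= b for a, b in zip(tup, tup[1:])))


# Hypothetical data: three groups with a perfectly increasing trend.
groups = [[1.2, 2.5], [3.1, 4.0], [5.7, 6.3]]
print(tm_statistic(groups))  # 8: all 2 * 2 * 2 tuplets are ordered

# With two groups the count coincides with the MW count.
print(tm_statistic([[1, 4], [2, 3]]))  # 2
```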

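The Wilcoxon rank sum test is one of the most popular nonparametric tests for comparing two independent populations. An extension of the Wilcoxon test was proposed by Cuzick [11]. The sum of ranks for each group is first calculated, and then the CU test statistic is computed as a weighted sum of these rank sums with the weight as the group number,

$$CU = \sum_{i=1}^{k} i\, R_{i\cdot}, \qquad R_{i\cdot} = \sum_{j=1}^{n_i} R_{ij},$$

where $R_{ij}$ is the rank of the $j$th observation of the $i$th group in the combined data.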

The CU test is generally more powerful than other tests under monotonic alternatives [17]. Although other tests may be considered, these four existing nonparametric tests are typically used in applications and are considered as representatives of the available tests for the monotonic ordering problem.
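The weighted rank-sum construction of the CU statistic can be sketched as follows. This is a minimal illustration assuming no ties (as for continuous data) and group scores 1, 2, ..., k; the helper `combined_ranks` and the data are hypothetical.

```python
def combined_ranks(groups):
    """Ranks 1..N in the pooled sample, returned group by group (no ties assumed)."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {v: r for r, v in enumerate(pooled, start=1)}
    return [[rank_of[v] for v in g] for g in groups]


def cu_statistic(groups):
    """Cuzick-type statistic: sum over groups of (group number) * (rank sum)."""
    ranks = combined_ranks(groups)
    return sum(i * sum(r) for i, r in enumerate(ranks, start=1))


# Hypothetical data: three groups with a perfectly increasing trend.
groups = [[1.2, 2.5], [3.1, 4.0], [5.7, 6.3]]
print(cu_statistic(groups))  # 50 = 1*(1+2) + 2*(3+4) + 3*(5+6)
```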

2.2 Proposed rank test.

The MW test statistic used in the JT test counts the number of pairs such that the observation from one group is greater than that from another group; however, it does not differentiate pairs using pair differences. In other words, the actual differences between observations are not well captured. We consider the actual differences to be important information that should be utilized in the testing procedure to improve the test's efficiency. Following Shan [19] for comparing two groups, the new rank-based nonparametric test incorporating the actual differences is given as

$$S = \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} D_{ij}, \qquad (1)$$

where $D_{ij} = \sum_{l=1}^{n_i} \sum_{m=1}^{n_j} Z_{ijlm}$, $Z_{ijlm} = (R_{jm} - R_{il})\, I(X_{jm} > X_{il})$, and $R_{il}$ denotes the rank of the observation $X_{il}$ in the combined data. This new test can be considered as an extension of the sign test and the Wilcoxon rank sum test, since $I(X_{jm} > X_{il})$ and $R_{il}$ are used in the sign test and the Wilcoxon test, respectively. The exact mean and variance of the null sampling distribution are given in the following theorem.
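The idea of adding up rank differences, rather than just counting ordered pairs, can be sketched as follows. This is a minimal illustration that assumes the statistic sums the positive differences of pooled ranks over all pairs of observations taken from a lower-indexed and a higher-indexed group, with no ties; the function names and data are hypothetical and this is not the authors' R program.

```python
from itertools import combinations


def combined_ranks(groups):
    """Ranks 1..N in the pooled sample, returned group by group (no ties assumed)."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {v: r for r, v in enumerate(pooled, start=1)}
    return [[rank_of[v] for v in g] for g in groups]


def rank_difference_statistic(groups):
    """Sum of positive pooled-rank differences over pairs from groups i < j."""
    ranks = combined_ranks(groups)
    total = 0
    for i, j in combinations(range(len(ranks)), 2):
        for r_low in ranks[i]:
            for r_high in ranks[j]:
                if r_high > r_low:  # pair is ordered in the hypothesized direction
                    total += r_high - r_low
    return total


# Hypothetical data: three groups with a perfectly increasing trend.
groups = [[1.2, 2.5], [3.1, 4.0], [5.7, 6.3]]
print(rank_difference_statistic(groups))  # 32 = 8 + 16 + 8
```

Unlike the JT count, a pair that is far apart in the pooled ranking contributes more than a pair of adjacent observations.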

Theorem 2.1 Under the null hypothesis, the new test statistic has the mean and variance asandwhere and

Proof. The calculation of the mean of the test statistic is straightforward.

Under the null hypothesis, the expectation of is given as
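The null mean can be worked out from first principles. The following is a sketch under an assumed reading of the statistic: each term is a positive rank difference $(R' - R)\, I(R' > R)$ for one observation from a lower-indexed group (rank $R$) and one from a higher-indexed group (rank $R'$). Under the null hypothesis, $(R, R')$ is a uniformly random ordered pair of distinct values from $\{1, \ldots, N\}$. The symbols $R$, $R'$, $a$, $b$, and $d$ are introduced here for illustration only.

```latex
\begin{align*}
E\left[(R' - R)\, I(R' > R)\right]
  &= \frac{1}{N(N-1)} \sum_{1 \le a < b \le N} (b - a) \\
  &= \frac{1}{N(N-1)} \sum_{d=1}^{N-1} d\,(N - d) \\
  &= \frac{1}{N(N-1)} \cdot \frac{(N-1)\,N\,(N+1)}{6}
   = \frac{N+1}{6}.
\end{align*}
```

Summing over all between-group pairs, of which there are $\sum_{i<j} n_i n_j$, gives a null mean of $\frac{N+1}{6} \sum_{i<j} n_i n_j$ under this reading.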

The calculation of the variance requires more effort. The variance of the test statistic can be written as a summation of covariances,

If , then and ; if , then or . We use these notations interchangeably in this article. We consider two observations as a pair when they have the same value. Because and , one observation from a pair is from and the other is from .

The covariance is non-zero only when at least one pair exists in the observations . The maximum number of pairs in is two, with and . Then is the variance of .

Thus, the under the null hypothesis is expressed as

When only one pair exists in , there are four possible outcomes: (a) , (b) , (c) and (d) . In cases (a) and (b), the observations are either from two groups where the unpaired two observations are from one group and the pair is from the other, or from three groups where the pair is from either the first group or the third group after the groups have been sorted.

The first type of covariance in the case with only one pair is

In cases (c) and (d), the observations are from three different groups and the pair is from the second group (the middle group) after sorting the groups.

Then, the second type of covariance in the case with only one pair is given as

In the case with no pair in the observations , and are independent, and .

Therefore, the variance of is given as

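Writing $S$ for the new test statistic in equation (1), with null mean $E(S)$ and null variance $Var(S)$ from Theorem 2.1, the standardized test statistic is

$$\frac{S - E(S)}{\sqrt{Var(S)}}. \qquad (2)$$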

The following theorem shows the asymptotic normality of the test statistic under the null hypothesis.

Theorem 2.2 When exists, , the proposed test has an asymptotic standard normal distribution as and .

Proof. Let be the th observation in the th group, where . Define

It should be noted that

By applying the results of the Problem 42 in the Appendix of Lehmann [20], asymptotically follows a normal distribution without scaling by the standard deviation, which can be proven by projecting the test statistic onto a sum of independent random variables [21] and then applying the central limit theorem.□

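The new proposed test can be performed by comparing the standardized test statistic in equation (2) with the appropriate quantile of the standard normal distribution. For example, at the significance level of $\alpha$, the null hypothesis is rejected in favor of an increasing ordered alternative if the standardized statistic exceeds $z_{\alpha}$, where $z_{\alpha}$ is the upper $100\alpha$ percentile of the standard normal distribution, that is, its $(1 - \alpha)$ quantile.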

The asymptotic cumulative distribution function (CDF) and the Monte Carlo simulation based exact distribution of for , are displayed in Figure 1. The simulated exact distribution was based on 20,000 iterations from the standard normal distribution for each group. As seen in the figure, the exact permutation distribution approximates the asymptotic distribution well.

Figure 1. The cumulative distribution function based on the asymptotic distribution, and based on the Monte Carlo simulation based exact distribution for .

https://doi.org/10.1371/journal.pone.0112924.g001

Numerical study

We conduct extensive exact Monte Carlo simulation studies to compare the five tests: 1) the JT test; 2) the MJT test; 3) the TM test; 4) the CU test; and 5) the new proposed test. The nominal level is set to be $\alpha = 0.05$. To make a fair comparison between tests and to avoid the unsatisfactory type I error rate control of tests that rely on asymptotic distributions, an exact permutation approach is used, with data simulated from standard normal distributions with the same location and scale, e.g., $N(0, 1)$. A total of 20,000 iterations are used to obtain the 95% cutpoint, and the same 20,000 simulated data sets are used for all the methods. For a given number of groups and given sample sizes within each group, the 95% cutpoint for each test is computed from the same simulated null distribution. In other words, the simulated null data under each configuration are used multiple times to calculate the cutpoint for each test. The same rule is applied to the simulated alternative distribution for the power comparison. This procedure reduces the bias of the cutpoint and power estimates between tests and makes for a fair comparison between them.

Numbers of groups $k = 3$ and $k = 4$ are considered in the power comparison. The simulated power is calculated as the proportion of iterations whose test statistic falls in the rejection region, based on 10,000 simulations. For $k = 3$, four sample-size configurations are examined, and five alternatives, labeled (a) to (e), are considered for normal distributions: four with unit variance, (a) to (d), and one with different variances, (e). The parameters of alternatives (a), (b), (c), and (d) are also used for the t distribution with df = 3. In addition to symmetric distributions, we also consider a skewed distribution (the exponential distribution) and a mixed distribution of normal and exponential distributions. We consider similar distributions for the case of $k = 4$, but with the sample sizes $(n_1, n_2, n_3, n_4)$: (8,8,8,8), (10,6,6,10), (20,20,10,10), and (10,20,10,20), and three alternatives. The power comparison between the five tests is examined for each configuration of sample size and alternative hypothesis.
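The simulation procedure (a cutpoint from a simulated null distribution, then power as the proportion of rejections under an alternative) can be sketched as follows. This is an illustration for the rank-difference statistic only, assuming it sums the positive pooled-rank differences over pairs from groups $i < j$. The sample sizes, location shifts, iteration counts, and seed are hypothetical choices, smaller than the 20,000 and 10,000 iterations used in the study.

```python
import random
from itertools import combinations


def rank_difference_statistic(groups):
    """Sum of positive pooled-rank differences over pairs from groups i < j."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {v: r for r, v in enumerate(pooled, start=1)}  # no ties assumed
    return sum(rank_of[b] - rank_of[a]
               for i, j in combinations(range(len(groups)), 2)
               for a in groups[i] for b in groups[j] if b > a)


def simulate_statistics(sizes, locations, n_iter, rng):
    """Statistic values for n_iter data sets of unit-variance normal groups."""
    return [rank_difference_statistic(
                [[rng.gauss(mu, 1.0) for _ in range(n)]
                 for n, mu in zip(sizes, locations)])
            for _ in range(n_iter)]


def upper_cutpoint(null_stats, alpha=0.05):
    """Value exceeded by at most a fraction alpha of the simulated null values."""
    ordered = sorted(null_stats)
    allowed = int(alpha * len(ordered))  # null values allowed above the cutpoint
    return ordered[len(ordered) - allowed - 1]


rng = random.Random(2014)  # fixed seed so the sketch is reproducible
sizes = (8, 8, 8)          # hypothetical sample sizes
null_stats = simulate_statistics(sizes, (0.0, 0.0, 0.0), 2000, rng)
cut = upper_cutpoint(null_stats)
alt_stats = simulate_statistics(sizes, (0.0, 0.5, 1.0), 1000, rng)  # hypothetical shifts
power = sum(s > cut for s in alt_stats) / len(alt_stats)
print(f"cutpoint = {cut}, estimated power = {power:.3f}")
```

To compare several tests fairly in the manner described above, the same simulated data sets would be passed to each test statistic rather than regenerated per test.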

The simulated power under normal distributions for $k = 3$ is shown in Table 1. The actual sizes were obtained by simulating samples from standard normal distributions using the simulated 95% cutpoint. Simulated sizes are generally close to the nominal level across the tests and sample sizes considered. We observe that the MJT test and the CU test have the same power, which is also observed under other distributions. Although we do not prove theoretically that the two tests have the same power under the exact permutation approach, it may be the case that they are equivalent to each other. For this reason, we present only one of them in the following power comparison results. The TM test has some power gain compared to the other tests under the convex shape alternative (c) with decreasing sample sizes across groups, a trend that also appears under the other three distributions. The TM test also has some power advantage under the normal distribution with unequal variances. In all other configurations, the power of the TM test is lower than that of the other tests. Out of the 20 configurations formed by alternatives (a) to (e) and the four different sample sizes, the new test has more power than the JT test in 19 cases, and is at least as powerful as the CU test in 15 cases.

Table 1. Simulated size and power study based on normal distribution for $k = 3$.

https://doi.org/10.1371/journal.pone.0112924.t001

Power results under other distributions for $k = 3$ are shown in Table 2 for the t distribution and in Table 3 for the exponential distribution. The exponential distribution is examined as an example of skewed distributions, with a separate set of group means for each of the alternatives (a) to (d). The new test has the highest power in 13 of the 16 configurations under the t distribution, and in 12 under the exponential distribution. The new test is generally more powerful than the other tests under the linear alternative (a) for the t distribution.

Table 2. Simulated power study based on t distributions with df = 3 for $k = 3$.

https://doi.org/10.1371/journal.pone.0112924.t002

Table 3. Simulated power study based on exponential distribution for $k = 3$.

https://doi.org/10.1371/journal.pone.0112924.t003

We also compare the tests with mixed distributions for $k = 3$ in Table 4. The mixed distribution considered here uses a normal distribution for the first group and exponential distributions for the second and third groups, with a separate set of means for each of the alternatives (a) to (d). The normal distribution has unit variance. When the same type of distribution is used for every group, as above, the actual type I error rates are close to the nominal level. In the mixed distribution, however, the actual type I error rates are conservative, especially with decreasing sample sizes. Nevertheless, the new test has more power than the other tests in 12 of the 16 configurations.

Table 4. Simulated power study based on the mixed distribution for $k = 3$.

https://doi.org/10.1371/journal.pone.0112924.t004

The power comparison results for $k = 4$ are shown in Tables 5, 6, 7, and 8 for the normal distribution, the t distribution, the exponential distribution, and the mixed distribution, respectively. The mixed distribution uses normal distributions for the first two groups and exponential distributions for the last two groups. As can be seen from these tables, the new test generally has more power than all other existing tests, and is almost uniformly more powerful than the commonly used JT test.

Table 5. Simulated size and power study based on normal distribution for $k = 4$.

https://doi.org/10.1371/journal.pone.0112924.t005

Table 6. Simulated size and power study based on t distribution with df = 3 for $k = 4$.

https://doi.org/10.1371/journal.pone.0112924.t006

Table 7. Simulated size and power study based on exponential distribution for $k = 4$.

https://doi.org/10.1371/journal.pone.0112924.t007

Table 8. Simulated size and power study based on the mixed distribution for $k = 4$.

https://doi.org/10.1371/journal.pone.0112924.t008

Example

A clinical trial for an antihypertensive drug [22] is used to illustrate the discussed tests. The primary objective of the study was to examine the effect of the selected doses on diastolic blood pressure by measuring the mean reduction in diastolic blood pressure. Patients with hypertension were randomized into four groups with different dose levels, 0, 10, 20, and 40 mg/day, where the group with 0 mg/day was the placebo group. The numbers of patients in the four groups were 17, 17, 18, and 16, respectively. The complete data can be found at the companion web site of the book by Dmitrienko et al. [22]. The mean reduction in diastolic blood pressure was expected to increase as the daily dose of the antihypertensive drug increased. Therefore, a monotonic increasing alternative is appropriate for this problem: $H_a: \theta_1 \le \theta_2 \le \theta_3 \le \theta_4$ with at least one strict inequality, where $\theta_i$ is the location parameter of the $i$th dose group. The permutation p-values for the JT test, the MJT test, the TM test, the CU test, and the new test are 0.00210, 0.00270, 0.01245, 0.00270, and 0.00250, respectively. At the significance level of 0.05, all five tests lead to the same conclusion: the mean reduction in diastolic blood pressure increases with the dose. The program is written in R and is available from the author's website: https://faculty.unlv.edu/gshan/. Questions may be directed to the corresponding author.
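A permutation p-value of the kind reported above can be sketched as follows. This is a minimal Monte Carlo illustration in Python, not the authors' R program. It assumes the rank-difference statistic sums the positive pooled-rank differences over pairs from groups $i < j$, and it uses the common (hits + 1) / (permutations + 1) estimate, which is a choice made here rather than something stated in the article. The data are hypothetical and are not the trial data.

```python
import random
from itertools import combinations


def positive_rank_difference_sum(rank_groups):
    """Sum of positive rank differences over pairs from groups i < j."""
    return sum(rb - ra
               for i, j in combinations(range(len(rank_groups)), 2)
               for ra in rank_groups[i] for rb in rank_groups[j] if rb > ra)


def split(values, sizes):
    """Cut a flat list into consecutive chunks with the given sizes."""
    out, start = [], 0
    for n in sizes:
        out.append(values[start:start + n])
        start += n
    return out


def permutation_p_value(groups, n_perm=2000, seed=0):
    """One-sided Monte Carlo permutation p-value for an increasing trend."""
    sizes = [len(g) for g in groups]
    pooled = [v for g in groups for v in g]
    # Pooled ranks are computed once; permuting group labels permutes the ranks.
    order = sorted(range(len(pooled)), key=pooled.__getitem__)
    ranks = [0] * len(pooled)
    for r, idx in enumerate(order, start=1):
        ranks[idx] = r
    observed = positive_rank_difference_sum(split(ranks, sizes))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(ranks)
        if positive_rank_difference_sum(split(ranks, sizes)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical reductions in diastolic blood pressure for four dose groups.
doses = [[1.1, 3.4, 2.0, 0.5], [2.8, 4.1, 3.9, 1.7],
         [5.2, 4.6, 6.3, 3.3], [7.4, 6.8, 5.9, 8.1]]
print(permutation_p_value(doses))
```

Because the statistic depends on the data only through the pooled ranks, shuffling the rank vector is equivalent to shuffling the observations and avoids re-ranking in every permutation.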

Conclusion

In this article we propose a new powerful nonparametric test, based on the rank difference between observations, for the monotonic ordering alternative in the k-sample problem. The rank difference between observations for two groups is analogous to the two sample t test when the parametric assumptions are satisfied. The positive rank differences used in the test statistic are motivated by the idea of the sign test. We derive the asymptotic distribution of the new test statistic and study the convergence of the simulation-based exact distribution to the asymptotic distribution. The power comparison between the new test and other existing tests shows that the new test is generally more powerful than the other tests for various distributions. We recommend using the new test in practice because of its substantial power gain.

The asymptotic distribution of the new test statistic was derived with continuous endpoints, for which no ties occur. For ordinal and binary data, one has to consider the frequency of ties in the data, and the variance of the new test needs to be investigated. However, for given data, permutation-based or simulation-based approaches can readily be used to calculate the p-value. The application of the new test to ordinal or binary data is left for future work. Other alternative hypotheses may be studied, such as the general alternative [1], the umbrella alternative [2], and the tree alternative [3]. Extensions of the new test to the exact testing framework [23], [24], [25], [26] and to repeated data from randomized block designs are also of interest.

Acknowledgments

We would like to thank the Editor and two reviewers for their valuable comments.

Author Contributions

Conceived and designed the experiments: GS. Performed the experiments: GS. Analyzed the data: GS DY LK. Contributed reagents/materials/analysis tools: GS. Wrote the paper: GS DY.

References

  1. Kruskal WH, Wallis WA (1952) Use of Ranks in One-Criterion Variance Analysis. Journal of the American Statistical Association 47: 583–621.
  2. Mack GA, Wolfe DA (1981) K-Sample Rank Tests for Umbrella Alternatives. Journal of the American Statistical Association 76: 175+.
  3. Fligner MA, Wolfe DA (1982) Distribution-free tests for comparing several treatments with a control. Statistica Neerlandica 36: 119–127.
  4. Cochran WG (1954) Some methods for strengthening the common χ2 tests. Biometrics 10: 417–451.
  5. Armitage P (1955) Tests for Linear Trends in Proportions and Frequencies. Biometrics 11: 375–386.
  6. Jonckheere AR (1954) A Distribution-Free k-Sample Test Against Ordered Alternatives. Biometrika 41: 133–145.
  7. Terpstra TJ (1952) The asymptotic normality and consistency of Kendall's test against trend, when ties are present in one ranking. Indagationes Mathematicae 14: 327–333.
  8. Tryon PV, Hettmansperger TP (1973) A Class of Non-Parametric Tests for Homogeneity Against Ordered Alternatives. The Annals of Statistics 1: 1061–1070.
  9. Shan G, Hutson AD, Wilding GE (2012) Two-stage k-sample designs for the ordered alternative problem. Pharmaceut Statist 11: 287–294.
  10. Neuhäuser M, Liu PY, Hothorn LA (1998) Nonparametric Tests for Trend: Jonckheere's Test, a Modification and a Maximum Test. Biom J 40: 899–909.
  11. Cuzick J (1985) A Wilcoxon-type test for trend. Statistics in Medicine 4: 87–90.
  12. Randles RH, Wolfe DA (1979) Introduction to the Theory of Nonparametric Statistics. Krieger Pub Co. URL http://www.worldcat.org/isbn/0894645439.
  13. Le CT (1988) A New Rank Test Against Ordered Alternatives in K-Sample Problems. Biom J 30: 87–92.
  14. Mahrer JM, Magel RC (1995) A comparison of tests for the k-sample, non-decreasing alternative. Statist Med 14: 863–871.
  15. Terpstra J, Magel R (2003) A new nonparametric test for the ordered alternative problem. Journal of Nonparametric Statistics 15: 289–301.
  16. Kössler W (2005) Some c-sample rank tests of homogeneity against ordered alternatives based on U-statistics. Journal of Nonparametric Statistics 17: 777–795.
  17. Alonzo TA, Nakas CT, Yiannoutsos CT, Bucher S (2009) A comparison of tests for restricted orderings in the three-class case. Statist Med 28: 1144–1158.
  18. Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics.
  19. Shan G (2014) New Nonparametric Rank-Based Tests for Paired Data. Open Journal of Statistics 04: 495–503.
  20. Lehmann EL (1975) Nonparametrics Statistical Methods Based on Ranks.
  21. Hajek J (1961) Some Extensions of the Wald-Wolfowitz-Noether Theorem. The Annals of Mathematical Statistics 32: 506–523.
  22. Dmitrienko A, Chuang-Stein C, D'Agostino R (2006) Pharmaceutical Statistics Using SAS: A Practical Guide (SAS Press). SAS Institute, 1 edition. Available: http://www.worldcat.org/isbn/159047886X.
  23. Shan G, Ma C, Hutson AD, Wilding GE (2012) An efficient and exact approach for detecting trends with binary endpoints. Statistics in Medicine 31: 155–164.
  24. Wilding GE, Shan G, Hutson AD (2012) Exact two-stage designs for phase II activity trials with rank-based endpoints. Contemporary Clinical Trials 33: 332–341.
  25. Shan G (2014) Exact approaches for testing non-inferiority or superiority of two incidence rates. Statistics & Probability Letters 85: 129–134.
  26. Wilding GE, Consiglio JD, Shan G (2014) Exact approaches for testing hypotheses based on the intra-class kappa coefficient. Statist Med 33: 2998–3012.