Dr. Mark Gardener


Statistics for Ecologists Edition 2 COver

Statistics for Ecologists Using R and Excel.

Edition 2 is on the way.

Statistics for Ecologists Using R and Excel:
Data collection, exploration, analysis and presentation

Original Edition 1 Available now from
Pelagic Publishing

Pelagic Publishing

Get a 20% discount on "Statistics for Ecologists" when you buy direct from the publisher! Enter the voucher code S4E20 in the shopping basket at Pelagic Publishing.

Writer's Bloc

On this page you can find out about my latest writing project. I'll post updates on progress, tables of contents and also some of the R scripts (and possibly Excel spreadsheets) I am developing in support of the new book. I'll try to keep the material reasonably up to date.

The Writer's Bloc homepage contains a table of contents and an index of the pages that contain custom R commands and R scripts.

I am working on a new edition of my book Statistics for Ecologists Using R and Excel. I am currently revising the chapter on exploring differences between more than two samples. These notes are about post hoc analysis when you use the non-parametric Kruskal-Wallis test.

Post Hoc testing for Kruskal-Wallis analysis


The Kruskal-Wallis test is a non-parametric test for differences between more than two samples. It is essentially an analogue for a one-way anova. There is no "standard" method for carrying out post hoc analysis for KW tests. These notes show you how you can use a modified form of the U-test to carry out post hoc analysis.

When you carry out a Kruskal-Wallis test you are looking at the ranks of the data and comparing them. If the ranks are sufficiently different between samples you may be able to determine that these differences are statistically significant.

However, the main analysis only tells you about the overall situation, the result tells you that "there are differences". Look at the following graph, which shows three samples.

Three samples and error bars


A Kruskal-Wallis test of these data gives a significant result, H = 6.54 p < 0.05, but does not give any information about the pair by pair comparisons. Looking at the graph you might suppose that the Upper and Mid samples are perhaps not significantly different as their error bars (IQR) overlap considerably. The Lower sample appears perhaps to be different from the other two.

The way to determine these pairwise differences is with a post hoc test. You cannot simply carry out regular U-tests because you'll need to carry out at least 3 and this "boosts" the chances of you getting a significant result by a factor of 3. You could simply adjust the p-values (e.g. using a Bonferroni correction), but this is generally too conservsative.

These notes show you how to carry out a modified version of the U-test as a post hoc tool. The approach is analogous to the Tukey HSD test you'd use with parametric data.

Use a modified version of the U-test to calculate a critical value using Q, the Studentized Range.


A modified U-test as a post-hoc tool

With a bit of tinkering you can modify the formula for the U test to produce the following:

Critical U value for KW post hoc
A critical value for U in a post hoc test for Kruskal-Wallis.
Q is the Studentized range and n the harmonic mean of sample sizes.

In the formula n is the harmonic mean of the sample sizes being compared. and Q is the value of the Studentized Range for df = Inf, and number of groups equal to the original number of samples.

Number of




















The formula calculates a critical U-value for the pairwise comparison. You simply carry out a regular U-test, then use the largest U-value as a test statistic. If your value is equal or larger than the critical value from the formula then the pairwise comparison is a statistically significant one.

Harmonic mean is a way to get an "average" sample size.


Harmonic mean

The harmonic mean is easy to determine:

Harmonic mean
Calculating the harmonic mean of two values.

The harmonic mean is a way of overcoming differences in sample size. The more different the sample sizes the more unreliable this approach will be.

Calculate Q, the Studentized Range from the result of a U-test.

R can compute an exact p-value from Q.


Calculate Q directly

If you re-arrange the formula you can calculate a value for Q:

Calculate value for Q
Calculate a value for Q from the result of a U-test.

You can now use the result of a U-test (use the larger of the two calculated U-values) to work out a value for Q. Now you can compare your Q-value to the critical value.

The Studentized range is a distribution built into the basic R distro. This gives you a way to compute an exact p-value for the pairwise comparison.

Custom functions for Kruskal-Wallis post hoc available in the file:

KW posthoc.R


Custom R functions for Kruskal-Wallis post hoc

I've produced four custom functions for use with Kruskal-Wallis post hoc tests:

  • h.mean() – Calculates the harmonic mean of two values.
  • KW.post() – Calculates the post hoc results for a pairwise comparison of two samples from a larger dataset.
  • KWp.post() – Calculates the exact p-value for a post hoc given a U-value, number of groups and sample sizes.
  • KWu.post() – Calculates a critical U-value for a post hoc given a confidence level (default 95%), number of groups and sample sizes.

These functions are contained in a single file: KW posthoc.R. If you source() the file you will set-up the functions and see a message giving some idea of what the functions do.

h.mean() calculates the harmonic mean of two values.


The h.mean() function

h.mean(n1, n2)
n1, n2 Numerical values representing sample sizes of two samples.

This function simply returns the harmonic mean of two numbers, i.e. the sample sizes of two samples.

> h.mean(5, 7)
[1] 5.833333

The function is called by the other post hoc functions (and is built-in) but it might be "handy" to have separately.

KW.post() calculates the post hoc significance between two samples from a larger dataset.

Results include:
Critical U (95%)


The KW.post() function

KW.post(x, y, data)
x, y Numeric samples to compare.
data The data object that contains the samples. It is assumed that the data is in sample format with multiple columns, each representing a separate sample.

The function returns several results as a list:

  • uval – The calculated U value for the pairwise comparison (the larger of the two calculated values).
  • crit – The critical U value at 95%.
  • p.value – The exact probability for the pairwise comparison.

The function also displays the results to the console, even if you assign the result to a named object.

> hog3
  Upper Mid Lower
1     3   4    11
2     4   3    12
3     5   7     9
4     9   9    10
5     8  11    11
6    10  NA    NA
7     9  NA    NA

> KW.post(Upper, Lower, data = hog3)
Data:  hog3
Pairwize comparison of:  Upper  and  Lower
U-value:  32.5
U-crit (95%):  31.06011
Post-hoc p-value:  0.02641968 

If your data are in scientific recording layout, that is you have a response variable and a predictor variable, then you need a slightly different approach. You'll have to work out a U-test result first, then run the KWp.post() and/or KWu.post() commands (see below).

KWp.post() calculates an exact p-value given a U-value.

You supply the original number of groups and the sample sizes too.


The KWp.post() function

KWp.post(Uval, grp, n1, n2)
Uval A calculated U-value for the pairwise comparison (the larger of the two calculated values).
grp The number of groups (samples) in the original Kruskal-Wallis test.
n1, n2 The sample sizes of the two groups being analysed.

This function returns an exact p-value for a post hoc analysis. The value is returned immediately.

> KWp.post(18, grp = 3, 7, 5)

Post-hoc p-value: 0.9851855

You can carry out a wilcox.test() on two samples to obtain a U-value. You need to know samples sizes because you need the larger of the two U values.

The wilcox.test() only gives one value so you must work out if the value you got was the largest. It so happens that:

n1 * n2 = U1 + U2

This means that you can work out the alternative U-value easily if you know sample sizes.

If your data are in recording layout, with a predictor and response, you'll need to use the subset parameter to carry out the pairwise test:

> hog2
   count  site
1      3 Upper
2      4 Upper
3      5 Upper
4      9 Upper
5      8 Upper
6     10 Upper
7      9 Upper
8      4   Mid
9      3   Mid
10     7   Mid
11     9   Mid
12    11   Mid
13    11 Lower
14    12 Lower
15     9 Lower
16    10 Lower
17    11 Lower

> wilcox.test(count ~ site, data = hog2, subset = site %in% c("Upper", "Lower"))
 Wilcoxon rank sum test with continuity correction
data:  count by site
W = 32.5, p-value = 0.01732
alternative hypothesis: true location shift is not equal to 0

Check the sample sizes:

> replications(count ~ site, data = hog2)
Lower Mid Upper
5 5 7

Now you can see if the U-value you got was the largest:

> 5*7 - 32.5
[1] 2.5

Since it is the largest, you can use it in the KWp.post() function:

> KWp.post(32.5, grp = 3, 5, 7)
Post-hoc p-value:	 0.02641968    

Generally speaking the wilcox.test() will return the largest U-value if you use the response ~ predictor format for the command. If you run the command on separate samples the returned U-value will depend on the order you specify the samples.

KWu.post() calculates a critical value for U at a given confidence interval.

You supply the original number of groups and the sample sizes too.

You can compare a pairwise U-test result to your critical value.


The KWu.post() function

KWu.post(CI = 0.95, grp, n1, n2)
CI = 0.95 The confidence interval, defaults to 0.95. This is essentially a significance level of p = 0.05.
grp The number of groups (samples) in the original Kruskal-Wallis test.
n1, n2 The sample sizes of the two groups being analysed.

This function returns a U-value for a given confidence level. You supply the number of groups (samples) in the original Kruskal-Wallis test and the sizes of the two samples being compared. The result is a critical value, which means you can carry out the wilcox.test() and compare the resulting U-value to this critical value.

> KWu.post(CI = c(0.95, 0.99), grp = 3, n1 = 5, n2 = 5)

Post-hoc critical U value: 23.71961 26.44729

In the example you see that you can set multiple confidence intervals. Here the critical U values for p = 0.05 and p = 0.01 are returned.


Providing training for:

  • Ecology
  • Data analysis
  • Statistics
  • R The statistical programming language
  • Data management
  • Data mining

My Publications

Data Management Using Excel, Cover

See my personal pages at GardenersOwn

Follow me...
Facebook Twitter Google+ Linkedin Amazon
Contact DataAnalytics Homepage