Dr. Mark Gardener

 
About

Statistics for Ecologists Edition 2 Cover
Available soon from
Pelagic Publishing


Statistics for Ecologists Using R and Excel (Edition 2)

Data Collection, Exploration, Analysis and Presentation

by: Mark Gardener

Available soon from Pelagic Publishing

Welcome to the support pages for Statistics for Ecologists. These pages provide information and support material for the book. You should be able to find an outline and table of contents as well as support datafiles and additional material.

Support Index | Exercises Index | Outline & TOC | Data files


 

Exercise 7.3.3

Pelagic Publishing Logo

Table of Contents


Section 7.3.3

 

Get exercise data here:

Wilcoxon.xlsx

Top

7.3.3 Use Excel to carry out a Wilcoxon matched pairs test

This exercise is concerned with matched pairs tests (Section 7.3) and in particular how to carry out the non-parametric Wilcoxon Matched Pairs test using Excel (Section 7.3.3).

Introduction

There is no in-built function that will carry out the Wilcoxon Matched Pairs test in Excel. However, you can rank the data and compute the rank sums you require using the RANK.AVG function.

You do need to omit zero differences from the calculations and also to separate ranks of positive differences from ranks of negative differences. In this exercise you can see how to use the IF function to help you do this separation.

You can get the sample data here: Wilcoxon.xlsx. There are two worksheets, one has the data only and the other has a completed set of calculations so you can check your progress.


Data in matched pairs

Wilcoxon matched pairs test is for non-parametric data

Top

Here's what the data look like:

Obs
A
B
1
8
11
2
6
8
3
12
4
4
2
9
5
3
3
6
3
5
7
1
2
8
7
2

You can see that there are 8 pairs of data.


Use IF to make a "decision" about which item is largest.

Then work out if the difference is + or - or zero

Top

Show the direction of the difference

Start by making a column to show the direction of the difference. Make a heading in cell D1, Dir will do.

Now in cell D2 type a formula to show if the difference between B2 and C2 is positive or negative. You're going to subtract the second column from the first. You also want to take into account possible zero differences:

=IF(B2<C2,"-",IF(B2=C2,"","+"))

So if cell C2 is larger than cell B2 you'll get a minus symbol. If the cells are equal then you get a blank. If B2 is larger than C2 you'll get a + symbol. You will use these symbols to separate the ranks later.

Copy the formula down the rest of the column to show the direction of each of the differences.

Obs
A
B
Dir
1
8
11
-
2
6
8
-
3
12
4
+
4
2
9
-
5
3
3
6
3
5
-
7
1
2
-
8
7
2
+

Note that observation 5 (row 6 of the spreadsheet) shows a blank because the ites are the same (i.e. a zero difference).


Calculate differences as absolute magnitude, use the ABS function.

For zero differences show a blank ""

Top

Calculate the differences

Make a column for the difference between samples, make the heading in cell E1, D(A-B) or something similar.

You want to subtract the values in the second column of data from the first column of data (i.e. Col B - Col C). So in cell E2 type a formula to do that. You'll need to omit any zero difference, so use an IF function to place a blank "" if the difference is zero:

=IF(B2-C2=0,"",ABS(B2-C2))

You are not interested in the sign of any difference, just the magnitude, so the ABS function is needed.

Copy the result down the rest of the column.

Obs
A
B
Dir
D(A-B)
1
8
11
-
3
2
6
8
-
2
3
12
4
+
8
4
2
9
-
7
5
3
3
6
3
5
-
2
7
1
2
-
1
8
7
2
+
5

You can see that you have 7 values, with one blank (a zero difference, observation 5).


Use RANK.AVG to work out ranks of non-zero differences.

Make sure ranks are sorted ascending, the smallest difference should get the smallest rank.

Top

Calculate ranks of differences

Now you want to work out the ranks of the differences. That is the ranks of the absolute value of the differences that you just worked out (column E) and called D(A-B).

Make a new column label in cell F1, call it Rd or something similar.

Now in cell F2 type a formula to work out the ranks of the items from column E:

=IF(E2="","",RANK.AVG(E2,$E$2:$E$9,1))

Note that you need to take care of any possible blank cells (corresponding to zero differences). You also need to "fix" the cell range E2:E9 using $ since you will be copying the cell down the rest of the column. The final 1 makes sure that your ranks are sorted ascending order, with the smallest difference getting the smallest rank.

Obs
A
B
Dir
D(A-B)
Rd
1
8
11
-
3
4
2
6
8
-
2
2.5
3
12
4
+
8
7
4
2
9
-
7
6
5
3
3
6
3
5
-
2
2.5
7
1
2
-
1
1
8
7
2
+
5
5

You can see that there are some tied ranks (each given a value of 2.5).


Separate ranks from positive differecnes from ranks of negative differences

Top

Ranks of + and of - differences

Now you need to separate the ranks due to the positive differences and the ranks due to negative differences. Mae two more column headings in cells G1 and H1, R+ and R- will do fine.

In cell G2 type a formula that shows the rank if it is due to a positive difference but leaves the cell blank is not:

=IF(D2="+",F2,"")

Copy the cell down the rest of the column G.

In cell H2 type a formula that shows the rank if it is due to a negative difference but leaves the cell blank is not:

=IF(D2="-",F2,"")

Copy the cell down the rest of the column H.

Obs
A
B
Dir
D(A-B)
Rd
R+
R-
1
8
11
-
3
4
4
2
6
8
-
2
2.5
2.5
3
12
4
+
8
7
7
4
2
9
-
7
6
6
5
3
3
6
3
5
-
2
2.5
2.5
7
1
2
-
1
1
1
8
7
2
+
5
5
5

You should now see the ranks split according to the sign of the difference between the samples.


Sum the ranks using SUM to get the final test statistic W.

The smaller of the two rank sums is the test result.

Use COUNT to get the number of non-zero differences.

Look up the critical value and compare to your test score.

Top

Rank sums

Now you need to add up the ranks for the positive and negative differences (columns G & H). You can use the SUM function for this.

In cells A11 and A12 type labels for the counts and sums, n and Sum, will do nicely.

In cell B11 type a formula to work out hte number of observations:

=COUNT(B2:B9)

Copy this into cells C11 and F11. The latter being the number of non-zero differences.

In cell B12 type a formula to sum the data in column 12:

=SUM(B2:B9)

Copy the cell into cells C12, F12, G12 and H12. Cells F12:H12 contain the ranks sums; overall, +ve and -ve. Note that the overall sum of ranks should equal the two other rank sums combined.

Obs
A
B
Dir
D(A-B)
Rd
R+
R-
1
8
11
-
3
4
4
2
6
8
-
2
2.5
2.5
3
12
4
+
8
7
7
4
2
9
-
7
6
6
5
3
3
6
3
5
-
2
2.5
2.5
7
1
2
-
1
1
1
8
7
2
+
5
5
5
n
8
8
7
42
44
28
12
16

The two rank sums are 12 and 16 so the 12 is the test statistic, W. You'll need the number of non-zero differences to look up the appropriate critical value (see Table 7.13 in the book).

For Nd = 7 the critical value is 2. The calculated value of W is 12, which is larger than the critical value so the result is not significant.


Top
Support Index | Exercises Index | Outline & TOC | Data files
 
My Publications

My Publications

Managing Data Using Excel, Cover

See my personal pages at GardenersOwn

Follow me...
Facebook Twitter Google+ Linkedin Amazon

See also:

Writer's Bloc
MonogRaphs
Tips & Tricks

Keywords

Here is a list of keywords: it is by no means complete!

T-test, U-test, Kruskal-Wallis, Analysis of Variance, Spearman Rank, Correlation, Regression, Logistic Regression, Curved linear regression, histogram, scatter plot, bar chart, box-whisker plot, pie chart, Mean, Median, Mode, Standard Deviation, Standard Error, Range, Max, Min, Inter-quartile Range, IQR

Top DataAnalytics Home
Publications
Contact GardenersOwn Homepage