Dr. Mark Gardener 



Statistics for Ecologists Using R and Excel (Edition 2)Data Collection, Exploration, Analysis and Presentationby: Mark GardenerAvailable soon from Pelagic Publishing Welcome to the support pages for Statistics for Ecologists. These pages provide information and support material for the book. You should be able to find an outline and table of contents as well as support datafiles and additional material. Support Index  Exercises Index  Outline & TOC  Data files 

Exercise 10.1.5 

Table of Contents


Twoway ANOVA incorporates two predictor variables. Excel can carry out calculations but needs data in sample format. Exercise data: 
10.1.5 TwoWay ANOVA using ExcelThis exercise is concerned with analysis of variance (ANOVA) in Chapter 10. In particular with the situation when you have two predictor variables, that is twoway ANOVA. IntroductionExcel can carry out the necessary calculations to conduct ANOVA and has several useful functions that can help you:
However, Excel is most suitable for oneway ANOVA, where you have a single predictor variable. When you have two predictor variables twoway ANOVA is possible, but can be tricky to arrange. In order to carry out the calculations you need to have your data arranged in a particular layout, let's call it sample layout or "on the ground" layout. This is not generally a good layout to record your results but it is the only way you can proceed sensibly using Excel. In this exercise you'll see how to set out your data and have a go at the necessary calculations to perform a twoway ANOVA. The data for this exercise are available as a file: Two Way online.xlsx. There are two worksheets, one with the bare data and a completed version so you can check your work. If you have Windows you can use the Analysis ToolPak to carry out the computations for you but you'll still need to arrange the data in a particular manner. 

Exercise data: Data are set out in sample format, with a column for one predictor (water) and the other predictor (plant) as separate columns. Each block has the same number of replicates. 
The exercise dataThe data you'll use for this exercise are in the file Two Way online.xlsx and are presented in the following table:
These data represent the growth of two different plant species under three different watering regimes. The first column shows the levels of the Water variable, this is one of the predictor variables and you can see that there are three levels: Lo, Mid and Hi. The next two columns show the growth results for two plant species, labelled vulgaris and sativa. These two columns form the second predictor variable (we'll call that Plant, which seems suitable). This layout is the only way that Excel can deal with the values but it is not necessarily the most useful general layout for your data. The scientific recording layout is more powerful and flexible. The other thing to note is that there are an equal number of items (replicates) in each "block". Here there are only 3 observations per block. This replication balance is important; the more unbalanced your situation is the more "unreliable" the result is. In fact if you use the Analysis ToolPak for 2way ANOVA you must have a completely balanced dataset (or the routine refuses to run). 

Calculate column Sums of Squares using sample mean and overall mean. 
Calculate Column Sums of SquaresStart by opening the example datafile: Two Way online.xlsx. Make sure you go to the Data worksheet (the worksheet Completed is there for you to check your results). The data are set out like the previous table, in sample layout. You will need to calculate the various sums of squares and to help you the worksheet has some extra areas higlighted for you. Start by calculating the column SS, that is the sums of squares for the Plant predictor. In the formula x represents each column, T is the overall data and n is the number of samples.
You should now have the two sums of squares for the columns (the Plant predictor). 

Calculate row Sums of Squares using row (block) mean and overall mean. 
Calculate Row Sums of SquaresThe row SS are calculated in a similar manner to the col SS.
You've now got the row and column SS. 

Error sums of squares are block variance multiplied by replicates per block minus 1. 
Calculate Error Sums of Squares (within groups SS)The next step is to determine the error sums of squares. You can get this by multiplying the block variance by the number of replicates for each group minus 1. This is essentially a tinkering of the formula for variance:
You now have the error term for the ANOVA. This is an important value, as you'll need it to calculate the final Fvalues. 

Interaction sums of squares are calculated by using the means of various elements (bloxk, column, row, total). 
Calculate Interaction Sums of SquaresThe final SS component is that for the interactions between the two variables (Water and Plant). To do this you use the means of the variaous groups and the replication like so: The formula looks horrendouns but in reality it is more tedious than really hard. The first mean is the mean of a single block. The next Xa, is essentially the column mean. The Xb mean is the "row" mean. The final mean (double overbar) is the overall mean. The n is the number of replicates in each block.
Now you have all the sums of squares values you need to complete the ANOVA. 

Total SS equals the sum of the SS of the components. Calculate it from overall variance and df as a test of your maths. 
Compute total SS and degrees of freedomThe total sums of squares can be calculated by adding the component SS together. On the other hand, it would be good to check your maths by calculating it from the total variance and df. You'll also need the degrees of freedom for the various components before you can construct the final ANOVA table.
Now you have everything except the total df, which you can place in the final ANOVA table shortly. 

The final ANOVA table shows the mean squares, F values and significance. It is also common to show the critical values of F for the appropriate degrees of freedom. Use FINV function to work out critical values of F. 
Make the final ANOVA tableYou can now construct the final ANOVA table and compute the Fvalues and significance of the various components. You want your table to end up looking like the following:
Now you have the final ANOVA table completed. You can see that the interaction term is highly significant (p = 0.0018). The Water treatment is also significant but the Plant variable is not (p = 0.14). 

Show results of twoway ANOVA graphically. Bar chart with error bars is one method of displaying the result. 
Graphing the resultYou should plot your result as a chart of some kind. There are several ways you might proceed. Chapter 6 gives details about constructing charts in Excel and in R for various scenarios. One option would be to make a bar chart, showing the mean for each block, and with the blocks grouped by watering treatment or by plant species. You should give an impression of the variability using error bars. The following column chart was produced using 95% confidence intervals: The critical value for t can be determine dusing the TINV funtion and the degrees of freedom: =(TINV(0.05,16)). Note that in this case df = 16 because we are splitting the blocks by plant (there are 91 + 91 = 16 degrees of freedom). The size of the error bars is then:


Top  
My Publications  My Publications See my personal pages at GardenersOwn 

Follow me... 


See also: 
KeywordsHere is a list of keywords: it is by no means complete! Ttest, Utest, KruskalWallis, Analysis of Variance, Spearman Rank, Correlation, Regression, Logistic Regression, Curved linear regression, histogram, scatter plot, bar chart, boxwhisker plot, pie chart, Mean, Median, Mode, Standard Deviation, Standard Error, Range, Max, Min, Interquartile Range, IQR 

Top  DataAnalytics Home  Contact  GardenersOwn Homepage 