Dr. Mark Gardener 



Statistics for Ecologists Using R and Excel (Edition 2)
Data Collection, Exploration, Analysis and Presentation
by: Mark Gardener
Available soon from Pelagic Publishing

Welcome to the support pages for Statistics for Ecologists. These pages provide information and support material for the book. You should be able to find an outline and table of contents as well as support data files and additional material.

Exercise 12.1.4 


Section 12.1.4: Compare the Shannon diversity of two samples with the Hutcheson t-test.

12.1.4 Comparing the Shannon diversity index of two samples using a t-test

This exercise is concerned with comparing diversity (Section 12.1.4) and in particular how to carry out a modified version of the t-test designed to compare the Shannon diversity index of two samples.

Introduction

When you have two samples of community data you can calculate a diversity index for each one. The Shannon diversity index is a commonly used measure of diversity. However, you cannot compare the two index values using classic hypothesis tests because you do not have replicated data. The Hutcheson t-test is a modified version of the classic t-test that provides a way to compare two samples. The key is the formula that determines the variance of the Shannon index.

These notes show you how to conduct the Hutcheson t-test and so assess the statistical significance of the difference in Shannon diversity between two samples. There is also a spreadsheet calculator that you can download and use for your own data.

The Hutcheson t-test allows you to compare the Shannon diversity of two samples.

The Hutcheson t-test

The Hutcheson t-test was developed as a method to compare the diversity of two community samples using the Shannon diversity index (Hutcheson 1970, J. Theor. Biol. 29 p.151). The basic formula is similar in appearance to the classic t-test formula.

In the formula, H represents the Shannon diversity index for each of the two samples (subscripted a and b). The bottom of the formula uses the variance of the Shannon index for each of the samples.
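The formula described above, in the form usually given for the Hutcheson t-test, can be written as follows. The symbol s^2_H for the variance of the Shannon index is notation assumed here; the original page may use a different symbol.

```latex
t = \frac{H_a - H_b}{\sqrt{s^2_{H_a} + s^2_{H_b}}}
```

The sign of t only tells you which sample has the larger index; use the absolute value when comparing against a critical value.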

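The Hutcheson t-test relies on the variance of the Shannon index for each sample. To compute it you need to know the number of species, the total abundance, and the proportion that each species contributes to the total.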
Computing variance for the Shannon diversity index

The variance of the Shannon diversity index is computed using a formula.

In the formula, S is the total number of species, whilst N is the total abundance. The p is the proportion that each species contributes to the total. The formula looks fairly horrid but its components are easily computed.
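Using the symbols defined above (S, N and p), the variance is commonly written in the following form. This is a sketch that assumes the standard expression attributed to Hutcheson (1970); the subscript i, indexing the species, is added here for clarity.

```latex
s^2_H = \frac{\sum_{i=1}^{S} p_i (\ln p_i)^2 - \left( \sum_{i=1}^{S} p_i \ln p_i \right)^2}{N} + \frac{S - 1}{2N^2}
```

The two sums are the components that are easy to build up column by column: one column for p ln p and one for p (ln p)^2.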

Download the example spreadsheet Shannon diversity ttest calculator.xlsx to calculate the Hutcheson t-test for yourself. The spreadsheet is protected (there is no password), but you can add extra rows for your own data.

Getting a t-value

It is easiest to use a spreadsheet for the calculations. The computations are relatively straightforward. You can use the example spreadsheet Shannon diversity ttest calculator.xlsx rather than make your own! Here is an example of how the calculations appear in the spreadsheet.

The species names are not essential but the spreadsheet has room for them. You need the total abundance in order to calculate the proportions. To compute the Shannon diversity you need the next column, labelled Ln(P), but to assist in the variance calculations it is convenient to calculate two others, which you can see in the table.
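The column-by-column calculation described above can be sketched in Python. This is a minimal illustration and not the calculator's actual formulas: the function names and the abundance counts are hypothetical, the two extra columns are assumed to be P x Ln(P) and P x Ln(P) squared, and the variance and t expressions assume the standard Hutcheson (1970) form.

```python
import math

def shannon_summary(counts):
    """Return (H, variance of H, total abundance) for one sample."""
    counts = [c for c in counts if c > 0]      # absent species add nothing
    n_total = sum(counts)                      # N, total abundance
    n_species = len(counts)                    # S, number of species
    props = [c / n_total for c in counts]      # P column
    ln_p = [math.log(p) for p in props]        # Ln(P) column
    p_ln_p = [p * l for p, l in zip(props, ln_p)]          # assumed column: P * Ln(P)
    p_ln_p_sq = [p * l ** 2 for p, l in zip(props, ln_p)]  # assumed column: P * Ln(P)^2
    h = -sum(p_ln_p)                           # Shannon diversity index
    variance = ((sum(p_ln_p_sq) - sum(p_ln_p) ** 2) / n_total
                + (n_species - 1) / (2 * n_total ** 2))
    return h, variance, n_total

def hutcheson_t(counts_a, counts_b):
    """Difference in H divided by the root of the summed variances."""
    h_a, var_a, _ = shannon_summary(counts_a)
    h_b, var_b, _ = shannon_summary(counts_b)
    return (h_a - h_b) / math.sqrt(var_a + var_b)

# Hypothetical abundance counts, for illustration only
site_a = [10, 10, 10, 10]
site_b = [37, 1, 1, 1]
t_value = hutcheson_t(site_a, site_b)
print(round(t_value, 2))
```

A positive value here means the first sample has the higher Shannon index; swapping the samples flips the sign but not the magnitude.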

Using the spreadsheet for calculations

The spreadsheet Shannon diversity ttest calculator.xlsx will carry out the calculations for you. There are worksheets for two samples. You can enter a site name and your abundance values, as well as species names if you like.

The spreadsheet is protected (although there is no password) but you can add additional rows to accommodate extra data (row 23 is empty and that's where you can start adding). I advise that you don't remove the worksheet protection, as it is easy to inadvertently change a formula! I've designed the spreadsheet so you can add extra rows without needing to remove the protection.

Enter the data for the two samples and the final t-test results will appear in the Results worksheet (see later).

Degrees of freedom are required to assess the significance of your result. The df are close to the overall total abundance from the samples combined.

Calculating degrees of freedom

Once you have a value for t you need to determine if it is statistically significant. In order to do that you'll need to work out the degrees of freedom, which are computed using a formula.

In the formula you need the variance for each sample and the total abundance for each sample. The final value is close to the total abundance for the two samples added together.
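As a sketch of the calculation just described, the degrees of freedom can be computed from the two variances and the two total abundances. The function name and the input values are hypothetical, and the expression assumes the form usually given for the Hutcheson test: the squared sum of the variances divided by the sum of each squared variance over its sample's total abundance.

```python
def hutcheson_df(var_a, n_a, var_b, n_b):
    """Approximate degrees of freedom for the Hutcheson t-test.

    df = (var_a + var_b)^2 / (var_a^2 / n_a + var_b^2 / n_b)
    """
    numerator = (var_a + var_b) ** 2
    denominator = var_a ** 2 / n_a + var_b ** 2 / n_b
    return numerator / denominator

# Hypothetical variances and total abundances, for illustration only
df = hutcheson_df(0.001, 40, 0.001, 40)
print(df)
```

With equal variances and equal abundances the result equals the combined total abundance; when the variances differ the value falls below that total.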

Use the degrees of freedom to get a critical value. You can also use =TINV in Excel. If your calculated value is larger than the critical value then your result is statistically significant.

Assessing statistical significance

Once you have your value for t and the appropriate degrees of freedom you can determine if the result is statistically significant. You can use the degrees of freedom to look up a critical value. If the absolute value of your calculated t-value exceeds the critical value then your result is significant.

You can use Excel to calculate a two-tailed critical value using the =TINV function. You need the level of significance (usually 0.05) and the degrees of freedom. You can also determine the probability directly using the =TDIST function. You need your calculated t-value (as a positive number), the degrees of freedom and the number of tails (2 for a two-tailed test).

The Shannon diversity ttest calculator.xlsx spreadsheet will display all the t-test results in the Results worksheet. The spreadsheet will also calculate confidence intervals and produce a couple of graphs (which you can edit).
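Outside Excel, the same two quantities can be approximated with the Python standard library alone. The sketch below is an illustration under stated assumptions, not the spreadsheet's method: it integrates the t density numerically with Simpson's rule (standing in for =TDIST with two tails) and finds the critical value by bisection (standing in for =TINV). All function names and the input values are hypothetical; a statistics library would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t_value, df, steps=2000):
    """Two-tailed probability via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    width = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * width, df)
    central_area = total * width / 3      # area between 0 and |t|
    return max(0.0, 1 - 2 * central_area)

def critical_t(alpha, df):
    """Two-tailed critical value, found by bisection on two_tailed_p."""
    low, high = 0.0, 100.0
    for _ in range(60):
        mid = (low + high) / 2
        if two_tailed_p(mid, df) > alpha:
            low = mid       # mid is not extreme enough yet
        else:
            high = mid
    return (low + high) / 2

# Hypothetical t-value and degrees of freedom, for illustration only
p_value = two_tailed_p(2.5, 80)
print(p_value < 0.05, round(critical_t(0.05, 80), 2))
```

The degrees of freedom need not be a whole number here, which suits the Hutcheson formula because it rarely returns an integer.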

Confidence intervals are helpful to visualize differences between samples. Get an approximate CI by multiplying the square root of the variance by 2.

Computing confidence intervals

It is useful to be able to calculate confidence intervals for your results. The confidence intervals allow you to compare multiple samples and to visualize how "different" samples are from one another.

You can compute a reasonable confidence interval by taking the standard deviation of the Shannon index (that is, the square root of the variance) and multiplying by 2. With most communities you will have fairly large degrees of freedom and the critical value (for t) will approach 2 (the critical value for t at infinity is 1.96).

Once you have your confidence interval(s) you can add them to a graph to show the variability and possible overlap between samples. The Shannon diversity ttest calculator.xlsx will produce two charts (a bar chart and a point chart) with error bars based on the confidence intervals. You can edit the charts.
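The approximation described above amounts to a couple of lines of arithmetic. In this minimal sketch the function name and the input values are hypothetical; the multiplier of 2 is the rule of thumb from the text and could be replaced by an exact critical value.

```python
import math

def approx_ci(h, variance, multiplier=2.0):
    """Approximate 95% confidence interval: H +/- multiplier * sqrt(variance)."""
    half_width = multiplier * math.sqrt(variance)
    return h - half_width, h + half_width

# Hypothetical index and variance values, for illustration only
lower, upper = approx_ci(1.39, 0.0009)
print(round(lower, 2), round(upper, 2))
```

The half-width is also the value to use for the error bars if you draw the chart yourself.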

Visualize the Shannon diversity index with a bar chart or point chart. Use 95% confidence intervals as error bars.

Graphing the results

It is useful to chart your results. The Shannon diversity ttest calculator.xlsx will produce two graphs, which you can edit. The error bars are based on the confidence interval.

The bar chart is a "classic" form of graph to show differences between samples. The point chart is a good alternative, as it allows you to see differences between samples more easily, especially if you customize the y-axis scale. The chart shown here would benefit from having the axis run from 1.0 to 1.6, for example.
