Dr. Mark Gardener

 
About

Statistics for Ecologists Edition 2 Cover
Available soon from
Pelagic Publishing


Statistics for Ecologists Using R and Excel (Edition 2)

Data Collection, Exploration, Analysis and Presentation

by: Mark Gardener

Available soon from Pelagic Publishing

Welcome to the support pages for Statistics for Ecologists. These pages provide information and support material for the book. You should be able to find an outline and table of contents as well as support datafiles and additional material.

Support Index | Exercises Index | Outline & TOC | Data files


 

Exercise 12.1.4

Pelagic Publishing Logo

Table of Contents


Section 12.1.4

Compare Shannon diversity of two samples with the Hutcheson t-test.

Top

12.1.4 Comparing the Shannon diversity index of two samples using a t-test

This exercise is concerned with comparing diversity (Section 12.1.4) and in particular how to carry out a modified version of the t-test designed to compare the Shannon diversity index of two samples.

Introduction

When you have two samples of community data you can calculate a diversity index for each one. The Shannon diversity index is a commonly used measure of diversity. However, you cannot compare the two index values using classic hypothesis tests because you do not have replicated data.

The Hutcheson t-test is a modified version of the classic t-test that provides a way to compare two samples. The key is the formula that determines the variance of the Shannon index. These notes will show you how to conduct the Hutcheson t-test and so get a statistical significance of the difference in Shannon diversity between two samples. There is also a spreadsheet calculator, that you can download and use for your own data.


The Hutcheson t-test allows you to compare the Shannon diversity of two samples.

Top

 

The Hutcheson t-test

The Hutcheson t-test was developed as a method to compare the diversity of two community samples using the Shannon diversity index (Hutcheson 1970, J. Theor. Biol. 29 p.151). The basic formula is similar in appearence to the classic t-test formula.

Formula for the Hutcheson t-test
Formula for the Hutcheson t-test

In the formula H represents the Shannon diversity index for each of the two samples (subscripted a and b). The bottom of the formula refers to the variance of each of the samples.


The Hutcheson t-test allows you to compute variance of the Shannon index for a sample.

You need to know:
Species richness
Total abundance
Abundance of each species

Top

Computing variance for the Shannon diversity index

Computing the variance of the Shannon diversity is done using the formula shown below.

Formula to calculate variance in Shannon diversity
Formula to calculate variance in Shannon diversity.
S is the species richness, N is the total number of individuals.

In the formula S is the total number of species, whilst N is the total abundance. The p is the proportion that each species makes towards the total. The formula is fairly easy to evaluate; it looks fairly horrid but the components are easily computed.


Download the example spreadsheet Shannon diversity t-test calculator.xlsx to calculate the Hutcheson t-test for yourself.

The spreadsheet is protected (but there is no password) but you can add extra rows for your own data.

Top

Getting a t-value

It is easiest to use a spreadsheet for the calculations. The computations are relatively straightforwards. You can use the example spreadsheet Shannon diversity t-test calculator.xlsx rather than make your own! Here is an example of how the calculations appear in the spreadsheet.

Site Name: E1
Taxon Count P Ln(P) P*Ln(P) P*Ln(P)2
388 0.543 -0.611 -0.332 0.203
1 0.001 -6.572 -0.009 0.060
2 0.003 -5.879 -0.016 0.097
13 0.018 -4.007 -0.073 0.292
3 0.004 -5.474 -0.023 0.126
1 0.001 -6.572 -0.009 0.060
59 0.083 -2.495 -0.206 0.514
3 0.004 -5.474 -0.023 0.126
2 0.003 -5.879 -0.016 0.097
2 0.003 -5.879 -0.016 0.097
210 0.294 -1.225 -0.360 0.441
4 0.006 -5.186 -0.029 0.150
3 0.004 -5.474 -0.023 0.126
1 0.001 -6.572 -0.009 0.060
1 0.001 -6.572 -0.009 0.060
21 0.029 -3.528 -0.104 0.366
1 0.001 -6.572 -0.009 0.060
0
0
0
Total 715 1 -83.972 -1.267 2.934
Richness 17
SS 2.934
SQ 1.606
H 1.267
S2H 0.002

The species names are not essential but the spreadsheet has room for them. You need the total abundance in order to calculate the proportions. To compute the Shannon diversity you need the next column, labelled Ln(P), but to assist in the variance calculations it is convenient to calculate two others, which you can see in the table.


Download the spreadsheet Shannon diversity t-test calculator.xlsx to calculate the Hutcheson t-test for yourself.

Top

Using the spreadsheet for calculations

The spreadsheet Shannon diversity t-test calculator.xlsx will carry out the calculations for you. There are worksheets for two samples. You can enter a site name and your abundance values as well as species names if you like. The spreadsheet is protected (although there is no password) but you can add additional rows to accommodate extra data (row 23 is empty and that's where you can start adding).

I advise that you don't remove the worksheet protection, as it is easy to inadvertantly change a formula! I've designed the spreadsheet so you can add extra rows without needing to remove the protection.

Enter the data for the two samples and the final t-test results will appear in the Results worksheet (see later).


Degrees of freedom are required to put a significance to your result.

The df are close to the overall total abundance from the samples combined.

Top

Calculating degrees of freedom

Once you have a value for t you need to determine if it is statistically significant. In order to do that you'll need to work out the degrees of freedom. This is computed using the following formula.

Formula to calculate degress of freedom for Hutcheson t-test
Formula to calculate the degrees of freedom in the Hutcheson t-test.

In the formula you need the variance for each sample and the total abundance for each sample. The final value is close to the total abundance for the two samples added together.


Use the degrees of freedom to get a critical value.

You can also use =TINV in Excel

If your calculated value is larger than the critical value then your result is statistically significant.

Top

Assessing statistical significance

Once you have your value for t and you have the appropriate degrees of freedom you can determine if the result is statistically significant. You can use the degrees of freedom to look up a critical value. If your calculated t-value exceeds the critical then your result is significant.

You can use Excel to calculate a critical value using the =TINV function. You need the level of significance (usually 0.05) and the degrees of freedom.

You can also determine the probability directly using the =TDIST function. You need your calculated t-value and the degrees of freedom.

The Shannon diversity t-test calculator.xlsx spreadsheet will display all the t-test results in the Results worksheet. The spreadsheet will also calculate confidence intervals and produce a couple of graphs (which you can edit).


Confidence intervals are helpful to visualize differences between samples.

Get approx CI from square root of variance x 2.

Top

Computing confidence intervals

It is useful to be able to calculate confidence intervals to your results. The confidence intervals allow you to compare multiple samples and to visualize how "different" samples are to one another.

You can compute a reasonable confidence interval using the standard deviaiton of the Shannon index (that is the square root of the variance) and multiplying by 2. With most communities you will have fairly large degrees of freedom and the critical value (for t) will approach 2 (the critical value for t at infinity is 1.96).

Once you have your confidence interval(s) you can add them to a graph to show the variability and possible overlap between samples. The Shannon diversity t-test calculator.xlsx will produce two charts (a bar chart and a point chart) with error bars based on the confidence intervals. You can edit the charts.


Visualize Shannon diversity index with a bar chart or point chart.

Use 95% confidence intervals as error bars.

Top

Graphing the results

It is useful to chart your results, the Shannon diversity t-test calculator.xlsx will produce two graphs, which you can edit. The error bars are based on the confidence interval.

Point chart of Shannon diversity index
Two ways to represent the difference between Shannon diversity. Error bars are 95% confidence intervals.

The bar chart is a "classic" form of graph to show differences between samples. The point chart is a good alternative, as it allows you to see difference between samples more easily, especially if you customize the y-axis scale. The chart shown here woulld benefit from having the axis run from 1.0 to 1.6 for example.


Top
Support Index | Exercises Index | Outline & TOC | Data files
 
My Publications

My Publications

Managing Data Using Excel, Cover

See my personal pages at GardenersOwn

Follow me...
Facebook Twitter Google+ Linkedin Amazon

See also:

Writer's Bloc
MonogRaphs
Tips & Tricks

Keywords

Here is a list of keywords: it is by no means complete!

T-test, U-test, Kruskal-Wallis, Analysis of Variance, Spearman Rank, Correlation, Regression, Logistic Regression, Curved linear regression, histogram, scatter plot, bar chart, box-whisker plot, pie chart, Mean, Median, Mode, Standard Deviation, Standard Error, Range, Max, Min, Inter-quartile Range, IQR

Top DataAnalytics Home
Publications
Contact GardenersOwn Homepage