Dr. Mark Gardener 



Statistics for Ecologists Using R and Excel (Edition 2)Data Collection, Exploration, Analysis and Presentationby: Mark GardenerAvailable soon from Pelagic Publishing Welcome to the support pages for Statistics for Ecologists. These pages provide information and support material for the book. You should be able to find an outline and table of contents as well as support datafiles and additional material. Support Index  Exercises Index  Outline & TOC  Data files 

Exercise 7.1.1 

Table of Contents


Welch twosample ttest modifies the degrees of freedom to produce a more conservative result Exercise 7.1 data: 
7.1.1 Welch twosample ttestThis exercise is concerned with the ttest in Chapter 7 (Section 7.1). IntroductionThe ttest is used to compare the means of two samples that have a normal (parametric or Gaussian) distribution. The "classic" ttest has two major variants:
In the first case the common variance is calculated and used in place of the variance in the regular formula. The calculation for this is relatively simple but it is also pointless, since you still have to determine the variance of the two samples. The most commonly used modification is to adjust the degrees of freedom to make the result of the ttest a little more conservative. The degrees of freedom are reduced slightly using the Satterthwaite modification. This version of the ttest is generally called the Welch 2sample ttest. The calculations are relatively easy. You can then use the modified df for looking up critical values or for computing the exact pvalue. The Welch 2sample ttest is carried out by default in R via the t.test() command. In Excel the TTEST function will give you the exact pvalue but it will not provide the modified degrees of freedom. The Excel functions TDIST and TINV will give incorrect results as they assume equal variance and use unmodified degrees of freedom. This exercise works through the ttest and shows how to use the Satterthwaite modification to alter degrees of freedom. This allows you to get the "proper" result in Excel. The calculation matches that used in the Analysis ToolPak, which is only available in Windows. The exercise uses the data that you can see in the following table: Abundance of R. repens in ridges and furrows of a mediaeval meadow
The data show the abundance of a plant species in two different habitats. The two samples are small but are normally distributed. You can get a copy of the data in Excel .xlsx format here: ridge furrow.xlsx. 

Satterthwaite modified degrees of freedom 
CalculationThe calculation of the modified degrees of freedom are in two parts. To start with you determine a statistic called u, which you can think of as a kind of proportion (it varies from 0–1).
Once you have a value for u, you can determine f, the modified degrees of freedom.
The formula gives the same value whichever sample you choose to be #1 and which #2. The df are reduced slightly from the original: original df = (n1  1 + n2  1). Once you have a modified df you can use it to look up the critical value, either in a table or using the TINV function in Excel. The equivalent in R would be the qt() function. 

Calculate a tvalue using the basic formula Compute an exact pvalue using TTEST function For calculation of critical value you need to work out modified degrees of freedom "longhand" 
Carry out basic ttestStart by opening the ridge furrow.xlsx data file. Go to the Data worksheet (the ttest completed worksheet is provided for you to check your results). The two samples are in columns B and C.
The pvalue from step 13 (p = 0.019) is based on equal variance (and unmodified degrees of freedom) and so is not really correct. Carry on and calculate a critical value:
The critical value from step 15 (t = 2.201) is also based on equal variance and unmodified degrees of freedom.
Note that the TTEST function gives a more conservative pvalue (of 0.023) because it has calculated the modified degrees of freedom. However, you cannot extract the df directly and will have to compute it if you want to report a more appropriate critical value you'll need to compute it longhand. 

Use regular maths to calculate the modified degrees of freedom With modified df you can use TINV to get a better critical value 
Compute modified degrees of freedomThe calculation of the modified degrees of freedom (Satterthwaite modification) will require some simple maths using the equations from earlier:
Your result from step 21 should be 8.733. Excel cannot deal with degrees of freedom that are not integer values so you need to round up (Excel always rounds the df value upwards):
Now you have a modified df (df = 9) you can use it to calculate a critical value. You can also see how the modified df can be used in TINV and TDIST functions.
Note that the critical value from step 25 (t = 2.262) is higher than the value from step 15, and so a bit more conservative.
Note that the exact pvalue from step 27 (p = 0.022) is very similar to that you obtained from the TTEST function in step 17. If you have a pvalue you can get the value of t but only if you have the modified degrees of freedom. 28. In A26 type a label "tvalue" for the t calculation. The final tvalue you calculated in step 29 (t = 2.758) is the same as the one from step 11. Unfortunately, because of slight rounding errors you usually get a slightly different pvalue using the TTEST function compared to the "long" method. The upshot is that it is always better to calculate the value of t using the means and variance, rather than indirectly via TTEST and TINV. 

Use the Analysis ToolPak to work out statistics for the Welch twosample ttest, for unequal variances. Set the hypothesized mean difference to 0 Report the twotail results 
Use the Analysis ToolPakIf you have a Windows version of Excel you can use the Analysis ToolPak addin to carry out a ttest. This allows you to get the value of t, the critical value and the exact pvalue all at once. 1. Open the ridge furrow.xlsx data file.
3. Choose the ttest: TwoSample Assuming Unequal Variances option.
5. Choose the location to place the results. This can be a new workbook or worksheet. In the example above the results are placed in cell E2 of the existing sheet. Now you just have to interpret the results:
You can ignore the sign for the t Stat result, if the samples were in different column order the result would be the same value but with opposite sign. Note that the df are shown as an integer and the exact value is always rounded upwards. You want the twotail results in most cases, which are in the last two rows. 

Top  
My Publications  
Follow me... 


See also: 
KeywordsHere is a list of keywords: it is by no means complete! Ttest, Utest, KruskalWallis, Analysis of Variance, Spearman Rank, Correlation, Regression, Logistic Regression, Curved linear regression, histogram, scatter plot, bar chart, boxwhisker plot, pie chart, Mean, Median, Mode, Standard Deviation, Standard Error, Range, Max, Min, Interquartile Range, IQR 

Top  DataAnalytics Home  Contact  GardenersOwn Homepage 