Dr. Mark Gardener 



Statistics for Ecologists Using R and Excel (Edition 2)
Data Collection, Exploration, Analysis and Presentation
by: Mark Gardener
Available now from Pelagic Publishing

Welcome to the support pages for Statistics for Ecologists. These pages provide information and support material for the book. You should be able to find an outline and table of contents as well as support data files and additional material.

Exercises & supplementary notes
These exercises and supplementary notes provide a few additional details that I thought would be useful. The exercises are listed with some notes on this page. Use the Quick Index to view a summary of each exercise/note or the Links to exercises/notes to go directly to the individual exercise/notes page (the sidebar also has direct links).

Index of Exercises & Notes 
Quick Index & introductory notes

Links to exercises/notes


The exercises & examples are listed more or less in the order they appear in the book. At the time of writing (early 2016) these are referred to in the book text. However, I will probably add further notes and exercises, so please look back from time to time.


Chapter 3 Data exploration – using software tools 
The following exercises and notes relate to Chapter 3, Beginning data exploration – using software tools.

Section 3.3 How to deal with an "incomplete final line" error message 
3.3 Getting data from Excel into R
These notes relate to Section 3.3, which is about how to export data from Excel and import to R. The notes show you what to do if you get an "incomplete final line" error message when importing a CSV file into R. Go to Exercise 3.3.
Introduction
R can read CSV files using the read.csv() command. You can easily make CSV files from Excel spreadsheets. Occasionally you get an "incomplete final line" error when you use the read.csv() command. The error arises because Excel sometimes does not add a complete linefeed at the end of the final data row. You have two options for dealing with this, and the notes (part of the Tips & Tricks series) show you how to apply them.
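As a rough illustration of the problem described above, the sketch below appends the missing linefeed before reading the file. The helper `fix_final_line()` and the file contents are made up for this example; this is one possible approach and not necessarily either of the options described in the notes.

```r
# Hypothetical helper (not from the notes): append a linefeed to a file
# whose last byte is not one. Returns TRUE if the file was changed.
fix_final_line <- function(path) {
  size <- file.info(path)$size
  if (is.na(size) || size == 0) return(invisible(FALSE))
  bytes <- readBin(path, what = "raw", n = size)
  if (bytes[length(bytes)] != as.raw(10)) {  # 10 is the linefeed byte
    cat("\n", file = path, append = TRUE)
    return(invisible(TRUE))
  }
  invisible(FALSE)
}

# Made-up CSV, deliberately written without a final linefeed
tmp <- tempfile(fileext = ".csv")
cat("site,count\nA,3\nB,5", file = tmp)

changed <- fix_final_line(tmp)  # repairs the file in place
dat <- read.csv(tmp)            # now reads without the complaint
print(dat)
```

Running the helper a second time on the same file leaves it untouched, so it is safe to call before every import.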

Chapter 6 Exploring data using graphs 
The following exercises & notes relate to Chapter 6, exploring data using graphs.

Section 6.2.1 Make a Tally plot in Excel in lieu of a histogram. 
6.2.1 Histograms & tally plots
These notes relate to Chapter 6, exploring data using graphs, and more specifically to Section 6.2.1 Exploratory graphs in Excel. The notes show how you can make a "quick and dirty" plot in Excel, a Tally plot. Go to Exercise 6.2.1.
Introduction
Data distribution is important. You need to know the shape of your data so that you can choose the most appropriate approach to analysing them. Some statistical tests use the properties of the normal (Gaussian) distribution, whilst others use data ranks. So, it's important to know if your data are normally distributed or otherwise. The classic way to look at the shape of your data is with a histogram, showing the frequency of observations that fall into certain size classes (called bins). However, you can make a "quick and dirty" histogram using pencil and paper, a tally plot. The tally plot is useful as it is something you can do in a notebook in the field as you collect data. These notes show how you can make a tally plot in Excel.

Section 6.2.2 Make a Tally plot in R using the hist() command. 
6.2.2 Tally plots in R
These notes relate to Chapter 6, exploring data using graphs, and more specifically to Section 6.2.2 Exploratory graphs using R. The notes show how you can coerce the hist() command to produce a histogram with dots instead of bars, thus forming a Tally plot. Go to Exercise 6.2.2.
Introduction
The hist() command is flexible and allows you to make a range of histograms in R. You can also use the stem() command to make a kind of Tally plot. The notes show you how to make a custom command that produces a histogram but with dots instead of bars, thus mimicking a Tally plot.
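A minimal sketch of the idea, using made-up data: stem() gives a text tally, and the bin counts returned by hist(plot = FALSE) can be redrawn as stacked dots. This is not the custom command from the notes, only one way the same effect might be reached.

```r
set.seed(1)                               # made-up data for illustration
x <- round(rnorm(40, mean = 10, sd = 2))

stem(x)                                   # text-based tally in the console

h <- hist(x, plot = FALSE)                # compute the bins without drawing bars
xs <- rep(h$mids, h$counts)               # one x position per observation
ys <- sequence(h$counts)                  # stack the dots 1, 2, 3, ... in each bin

plot(xs, ys, pch = 16,
     ylim = c(0, max(h$counts)),
     xlab = "Size class", ylab = "Frequency")
```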

Section 6.3 Using colour in graphs and charts 
6.3 Colour in graphs
These notes relate to Chapter 6, exploring data using graphs. The notes are mentioned in Section 6.3 Graphs to illustrate differences although they are generally relevant to graphical presentation. These are general notes with a little more information about controlling colour than is mentioned in the book. Go to Exercise 6.3.
Introduction
Colour is very important in presenting data and results. Both Excel and R have a wide range of colours you can use when creating your graphs and charts (certainly more than 50 shades of gray!). Controlling and managing the colours you display is an important element in presenting your work. With an increasing volume of work being presented via the Internet, colour is something not to take for granted. Using default colours is "easy" but for maximum impact you should think carefully about how to present the best colours for the job. Traditional journals generally use monochrome, which you can think of as just another set of colours, but even if you are "stuck" with shades of grey you need to think carefully. Pattern filling can be an especially useful option when using monochrome. These notes give a bit more information about controlling colour in Excel and R graphs than in the book.

Section 6.3.1 Using legends in R plots 
6.3.1a Legends
These notes relate to Chapter 6, exploring data using graphs. The notes are especially relevant to adding legends to barplots in R (Section 6.3.1). Go to Exercise 6.3.1a.
Introduction
You can add a legend to a barplot() using the legend parameter. You can also add a legend to any plot via the legend() command. These notes provide a few more details about how to produce and control legends in R plots.

Section 6.3.1 Gridlines in graphs and charts 
6.3.1b Gridlines in graphs & charts
These notes relate to Chapter 6, exploring data using graphs. The notes are especially relevant to adding gridlines to bar charts in Excel and R. Go to Exercise 6.3.1b.
Introduction
Gridlines are potentially useful items you might want to incorporate in your charts. Gridlines can help the reader to gauge the height of bars in a column chart more easily, for example, and so the readability is improved. On the other hand gridlines can "get in the way" and hinder readability by making your chart cluttered. In scatter plots you may require both horizontal and vertical gridlines; having gridlines on one axis only can "lead the eye". Knowing when to apply gridlines or not is part of the skill of presentation. These notes show you how to add and tweak gridlines in both Excel and R.

Section 6.3.2 Ordering the boxes in a boxplot() 
6.3.2 Ordering the boxes of a boxplot()
These notes relate to Chapter 6, exploring data using graphs. The notes are relevant to box-whisker plots, which are used to display "differences" between samples. Go to Exercise 6.3.2.
Introduction
The boxplot() command is one of the most useful graphical commands in R. The box-whisker plot is useful because it shows a lot of information concisely. However, the boxes do not always appear in the order you would prefer. These notes show you how you can take control of the ordering of the boxes in a boxplot().
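One common way to control the order is through the factor levels of the grouping variable, since boxplot() draws the boxes in level order. The data frame below is made up for illustration, and the notes themselves may use a different route.

```r
# Made-up data: three sites, three counts each
dat <- data.frame(
  count = c(4, 6, 5, 9, 11, 10, 2, 3, 1),
  site  = rep(c("Upper", "Middle", "Lower"), each = 3)
)

# Default: levels are alphabetical (Lower, Middle, Upper)
boxplot(count ~ site, data = dat)

# Explicit order: set the factor levels yourself
dat$site <- factor(dat$site, levels = c("Upper", "Middle", "Lower"))
boxplot(count ~ site, data = dat)

# Order by a statistic: reorder() sorts the levels by median count
by_median <- reorder(dat$site, dat$count, FUN = median)
bp <- boxplot(dat$count ~ by_median, xlab = "Site", ylab = "Count")
```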

Section 6.4.2 Axis labels in R plots, using expression() to make superscript etc. 
6.4.2 Axis labels using the expression() command
These notes relate to Chapter 6, exploring data using graphs. They are relevant to all types of R plot because they are concerned with the labelling of axes, especially the making of superscript and subscript elements. Go to Exercise 6.4.2.
Introduction
The labelling of your graph axes is an important element in presenting your data and results. You often want to incorporate text formatting in your labelling. Superscript and subscript are particularly important for scientific graphs. You may also need to use bold or italics (the latter especially for species names). In Excel you can simply select the text of your labels and alter the formatting. In R you must create a specially formatted string using the expression() command. These notes show you how to use the expression() command and also how to place items using text(), title() and mtext() commands.
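A short sketch of the kind of labels expression() can build, with placeholder data and label text chosen only for illustration: `^` gives superscript, `[ ]` gives subscript, and italic() and bold() change the typeface.

```r
# Labels built with expression(): ^ is superscript, [ ] is subscript,
# ~ inserts a space between elements
x_lab <- expression(Area ~ (m^2))
y_lab <- expression(CO[2] ~ flux)
main_lab <- expression(italic("Bellis perennis") ~ "abundance")

x <- 1:10
plot(x, x^2, xlab = x_lab, ylab = y_lab)   # placeholder data
title(main = main_lab)                     # italic species name in the title
mtext(expression(bold("Plot 1")), side = 3, line = 0.3, adj = 1)
text(2, 80, expression(y == x^2), pos = 4) # label placed inside the plot
```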

Section 6.5.1 Interactive text placement in R plots. Use locator() in place of x, y coordinates to use the mouse as a pointer. 
6.5.1 Interactive labels in pie() charts
These notes relate to Chapter 6, exploring data using graphs. They are especially relevant to Section 6.5.1, which is about using pie charts to show association data. However, the notes are generally relevant as they show how you can place text onto an existing R plot in an interactive manner, using your mouse as a pointer. Go to Exercise 6.5.1.
Introduction
The locator() command is used to "read" the mouse position and generate x, y coordinates. These can be used in various ways, in commands that require those x, y coordinates. For example, sometimes the default placement of labels on a plot is not quite what you want. You can use the text() command with locator() to place the labels exactly where you want. In this exercise you'll see the locator() command used to place labels on a pie() chart as well as some notes about making custom labels.

Chapter 7 Tests for differences 
The following exercises & notes relate to Chapter 7, tests for differences.

Section 7.1.1 The Welch two-sample t-test modifies the degrees of freedom to produce a more conservative result. Use the Excel function TTEST.
7.1.1 Welch two-sample t-test
This exercise is concerned with the t-test in Chapter 7 (Section 7.1). This exercise walks you through the process of a t-test in Excel and calculating the modified degrees of freedom associated with the TTEST function. Go to Exercise 7.1.
Introduction
The t-test is used to compare the means of two samples that have a normal (parametric or Gaussian) distribution. The "classic" t-test has two major variants. In the first, the common variance is calculated and used in place of the variance in the regular formula. The calculation for this is relatively simple but it is also pointless, since you still have to determine the variance of the two samples. The second, and most commonly used, modification is to adjust the degrees of freedom to make the result of the t-test a little more conservative. The degrees of freedom are reduced slightly using the Satterthwaite modification. This version of the t-test is generally called the Welch two-sample t-test.
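To make the adjustment concrete, the sketch below computes the t statistic and the Satterthwaite degrees of freedom by hand and compares them with R's t.test(), whose default is the Welch test. The two samples are made up for illustration.

```r
# Made-up samples
a <- c(12.1, 14.3, 11.8, 13.5, 15.2, 12.9)
b <- c(10.4, 9.8, 12.7, 8.9, 11.5, 10.1, 13.0)

# Variance of each sample mean
va <- var(a) / length(a)
vb <- var(b) / length(b)

# t statistic using separate (not pooled) variances
t_stat <- (mean(a) - mean(b)) / sqrt(va + vb)

# Satterthwaite approximation for the degrees of freedom
df_welch <- (va + vb)^2 /
  (va^2 / (length(a) - 1) + vb^2 / (length(b) - 1))

# Two-tailed p-value
p_val <- 2 * pt(-abs(t_stat), df = df_welch)

res <- t.test(a, b)   # Welch test is the default (var.equal = FALSE)
print(c(t = t_stat, df = df_welch, p = p_val))
```

The adjusted degrees of freedom never exceed the classic value of n1 + n2 - 2, which is why the result is slightly more conservative.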

Section 7.1.2 Use the t.test() command to carry out the t-test in R
7.1.2 Using R for the t-test
This exercise is concerned with the t-test in Chapter 7 (Section 7.1) and particularly with running the t-test using R. Go to Exercise 7.1.2.
Introduction
The t.test() command carries out the t-test in R. The default is to compute the Welch two-sample test (unequal variances). You can have your data in several forms and this exercise is a brief reminder of how to manage data in those different forms. The data used are the same as those illustrated in the book text.

Section 7.3.3 Use RANK.AVG in Excel to rank values. Use sum of ranks from positive and negative differences as the test statistic for Wilcoxon matched pair test. 
7.3.3 Using Excel for the Wilcoxon Matched Pairs test
This exercise is concerned with matched pairs tests (Section 7.3) and in particular how to carry out the non-parametric Wilcoxon Matched Pairs test using Excel (Section 7.3.3). Go to Exercise 7.3.3.
Introduction
There is no in-built function that will carry out the Wilcoxon Matched Pairs test in Excel. However, you can rank the data and compute the rank sums you require using the RANK.AVG function. You do need to omit zero differences from the calculations and also to separate ranks of positive differences from ranks of negative differences. In this exercise you can see how to use the IF function to help you do this separation. Once you have your result you'll have to look up the critical values for W (see Table 7.13 in the book), to see if your result is a significant one.
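The same steps can be mirrored in R, which may help when checking a spreadsheet. In this sketch rank() with its default average ties plays the role of RANK.AVG, and logical subsetting plays the role of the IF function. The paired data are made up.

```r
# Made-up matched pairs
before <- c(12, 15, 9, 14, 11, 16, 10, 13)
after  <- c(14, 15, 13, 13, 16, 22, 17, 21)

d <- after - before
d <- d[d != 0]                 # omit zero differences

r <- rank(abs(d))              # rank absolute differences (ties get average rank)

w_pos <- sum(r[d > 0])         # rank sum of positive differences
w_neg <- sum(r[d < 0])         # rank sum of negative differences
W <- min(w_pos, w_neg)         # test statistic to compare with critical values

print(c(positive = w_pos, negative = w_neg, W = W))
```

The two rank sums always add up to n(n + 1)/2, where n is the number of non-zero differences, which is a handy check on the arithmetic.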

Use a scatter plot in matched pairs situations. Plot one sample against the other and use an isocline. 
7.3.5 Graphs and matched pairs results
This exercise is concerned with matched pairs tests (Section 7.3), and in particular how you can represent the data/results graphically. Go to Exercise 7.3.5.
Introduction
Usually you'll use a bar chart or box-whisker plot to display data when looking at differences between samples. When you have a matched pairs situation however, these sorts of graph may not always be the best way to summarize your data/results. An alternative is to use a scatter plot, where you plot one sample against the other. If you add an isocline (a straight line with slope 1 and intercept 0) you can see more clearly how the pairs of observations match up with one another. These notes show you how to prepare such a scatter plot.

Chapter 8 Tests for linking data – correlations
The following exercises & notes relate to Chapter 8, tests for linking data – correlations.

Use Excel for Spearman Rank correlation. Use RANK.AVG to rank data then CORREL to get a Spearman Rank coefficient. 
8.3.2 Spearman Rank correlation in Excel via t approximation
This exercise is concerned with correlation (Chapter 8) and in particular how you can use Excel to calculate Spearman's Rank correlation coefficient. Go to Exercise 8.3.2.
Introduction
Excel has built-in functions that can calculate correlation, but only when data are normally distributed. The CORREL and PEARSON functions both calculate Pearson's Product Moment, a correlation coefficient. Once you have the correlation coefficient it is fairly easy to calculate the statistical significance. You compute a t-value then use TINV to compute a critical value or TDIST to get an exact p-value (see Section 8.3.1 in the book). This exercise shows how you can use the RANK.AVG function to rank the data, then use CORREL on the ranks to obtain a Spearman Rank coefficient.
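The same route can be followed in R as a cross-check: rank both variables, correlate the ranks, then convert the coefficient to a t-value. The data are made up, and the t-value formula used here is the standard one for a correlation coefficient with n - 2 degrees of freedom.

```r
# Made-up paired observations
x <- c(2.1, 3.4, 1.9, 5.6, 4.4, 6.1, 3.0, 5.0)
y <- c(11, 14, 10, 22, 15, 25, 16, 19)

# Pearson correlation on the ranks gives the Spearman coefficient
rs <- cor(rank(x), rank(y))

n <- length(x)
t_val  <- rs * sqrt((n - 2) / (1 - rs^2))     # t approximation
t_crit <- qt(0.975, df = n - 2)               # two-tailed critical value at 5%
p_val  <- 2 * pt(-abs(t_val), df = n - 2)     # two-tailed p-value

print(c(rs = rs, t = t_val, t_crit = t_crit, p = p_val))
```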

Chapter 9 Tests for linking data – Associations 
The following exercises & notes relate to Chapter 9, tests for linking data – associations.

Section 9.4 Goodness of fit tests compare one set of frequencies against another. Often used in genetic studies to compare observed phenotype ratios.
9.4 Using Excel for Chi-squared goodness of fit tests
This exercise is concerned with association (Chapter 9), in particular goodness of fit testing using Excel (Section 9.4). Go to Exercise 9.4.
Introduction
Excel has several functions related to the Chi-squared statistic. These allow you to undertake chi-squared goodness of fit testing, for example. In goodness of fit tests you have one set of frequency data in various categories. You also have a matching set of frequencies that you want to "compare". The comparison set may be a theoretical set of values or perhaps a previous set of observations. The goodness of fit test is often used in genetic studies where you match up observed phenotypes against a theoretical ratio of expected phenotypes. This exercise shows you how to carry out the calculations using the relevant Excel functions.
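For comparison, here is the goodness of fit calculation sketched in R with made-up phenotype counts tested against a 9:3:3:1 ratio. The hand calculation is followed by chisq.test() as a check; none of this is taken from the Excel exercise itself.

```r
# Made-up observed phenotype counts and a theoretical 9:3:3:1 ratio
observed <- c(90, 30, 28, 12)
ratio    <- c(9, 3, 3, 1)

# Expected frequencies scaled to the observed total
expected <- sum(observed) * ratio / sum(ratio)

# Chi-squared statistic: sum of (O - E)^2 / E
chi_sq <- sum((observed - expected)^2 / expected)
df     <- length(observed) - 1

p_val <- pchisq(chi_sq, df = df, lower.tail = FALSE)  # exact p-value
crit  <- qchisq(0.95, df = df)                        # 5% critical value

check <- chisq.test(observed, p = ratio / sum(ratio)) # built-in equivalent
print(c(chi_sq = chi_sq, df = df, p = p_val, critical = crit))
```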

Chapter 10 Differences between more than two samples. ANOVA and Kruskal-Wallis
The following exercises & notes relate to Chapter 10, Differences between more than two samples. Essentially this means analysis of variance (ANOVA) and the Kruskal-Wallis non-parametric equivalent of 1-way ANOVA.

Layout of data is important. Excel tends to use sample layout, whilst R tends to use recording layout. Use the stack() command to help you rearrange your data to best advantage. 
10.1.4 Alter sample format data to scientific recording format
These notes are related to Chapter 10, which is concerned with differences between more than two samples. The notes are relevant to other analyses too, as they show you how to alter the layout of your data from the form "preferred" by Excel to that required by R. Go to Exercise 10.1.4.
Introduction
There are two main ways you can lay out your data. In sample format each column is a separate sample, which forms some kind of logical sampling unit. This is a typical way you lay out data if you are using Excel because that's how you have to have your data to be able to make charts and carry out most forms of analysis. In scientific recording format each column is a variable; you have response variables and predictor variables. This recording layout is a more powerful and ultimately flexible layout because you can add new variables or observations easily. In R this layout is also essential for any kind of complicated analysis, such as regression or analysis of variance. I've written about scientific recording format before, see my Writer's Bloc page for a brief summary. When you have data in the "wrong" layout you need to be able to rearrange them into a more "sensible" layout so that you can unleash the power of R most effectively. The stack() command is a useful tool that can help you achieve this layout.
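A minimal sketch of stack() in action, using a made-up data frame in sample format. The column names `count` and `habitat` are chosen for this example only.

```r
# Made-up data in sample format: one column per sample
samples <- data.frame(
  grass = c(3, 5, 4),
  heath = c(8, 7, 9),
  wood  = c(12, 11, 14)
)

# stack() returns two columns: values (the data) and ind (the sample name)
long <- stack(samples)
names(long) <- c("count", "habitat")   # rename to something meaningful
print(long)

# The recording layout is what formula-based commands expect
fit <- aov(count ~ habitat, data = long)

# unstack() reverses the process
wide <- unstack(long, count ~ habitat)
```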


Section 10.1.5 Use Excel for two-way ANOVA. Set out your data in sample format.

10.1.5 Two-Way ANOVA using Excel
This exercise is related to analysis of variance and in particular how you can carry out two-way ANOVA using Excel (Section 10.1.5). The calculations are not especially hard but they can be fiddly. Go to Exercise 10.1.5.
Introduction
Excel can carry out the necessary calculations to conduct ANOVA and has several useful functions that can help you. However, it is most suitable for one-way ANOVA, where you have a single predictor variable. When you have two predictor variables two-way ANOVA is possible, but can be tricky to arrange. In order to carry out the calculations you need to have your data arranged in a particular layout, let's call it sample layout or "on the ground" layout. This is not generally a good layout to record your results but it is the only way you can proceed sensibly using Excel. In this exercise you'll see how to set out your data and have a go at the necessary calculations to perform a two-way ANOVA. If you have Windows you can use the Analysis ToolPak to carry out the computations for you but you'll still need to arrange the data in a particular manner.

Section 10.2 Critical values for the Kruskal-Wallis test.

10.2a Critical values for Kruskal-Wallis test
These notes give critical values for the Kruskal-Wallis test. The KW test is used to analyse differences between more than two samples (Chapter 10). The KW test is presented in Section 10.2. Go to Exercise 10.2a.
Introduction
The Kruskal-Wallis test is appropriate when you have non-parametric data and one predictor variable (Section 10.2). It is analogous to a one-way ANOVA but uses ranks of items in various groups to determine the likely significance. If there are at least 5 replicates in each group the critical values are close to the Chi-Squared distribution. There are exact critical values computed for when you have equal group sizes. There are also exact critical values for situations where you have unequal group sizes.

Adjusting Kruskal-Wallis when there are tied ranks.
10.2b Adjustment for tied ranks in Kruskal-Wallis test
These notes relate to the Kruskal-Wallis test for differences between more than two samples (Section 10.2). The notes show how you can adjust the test statistic in situations where there are tied ranks. Go to Exercise 10.2b.
Introduction
The Kruskal-Wallis test is appropriate when you have non-parametric data and one predictor variable (Section 10.2). It is analogous to a one-way ANOVA but uses ranks of items in various groups to determine the likely significance. When you have tied values, you will get tied ranks. In these circumstances you should apply a correction to your calculated test statistic. The notes show you how this can be done. The calculations are simple but in Excel it can be difficult to get the process "automated". In R the kruskal.test() command computes the adjustment for you.
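The sketch below works through the tie correction by hand on made-up data and checks the result against kruskal.test(). It uses the standard correction factor, 1 minus the sum of (t^3 - t) over each set of ties divided by (N^3 - N), where t is the number of values sharing a rank and N is the total number of observations.

```r
# Made-up data with some tied values
count <- c(3, 5, 5, 6,  7, 7, 8, 9,  5, 9, 10, 12)
group <- factor(rep(c("A", "B", "C"), each = 4))

r <- rank(count)                     # tied values receive the average rank
N <- length(count)

Ri <- tapply(r, group, sum)          # rank sum per group
ni <- tapply(r, group, length)       # replicates per group

# Unadjusted Kruskal-Wallis statistic
H <- 12 / (N * (N + 1)) * sum(Ri^2 / ni) - 3 * (N + 1)

# Correction factor built from the size of each set of ties
ties <- as.vector(table(count))
C <- 1 - sum(ties^3 - ties) / (N^3 - N)

H_adj <- H / C                       # adjusted statistic
kw <- kruskal.test(count ~ group)    # applies the same adjustment
print(c(H = H, C = C, H_adj = H_adj))
```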

Post hoc testing with Kruskal-Wallis tests.
10.2.3 Post hoc testing in Kruskal-Wallis using R
These notes relate to Chapter 10, differences between more than two samples. Specifically the notes deal with post hoc analysis when you use the Kruskal-Wallis test (Section 10.2). Go to Exercise 10.2.3.
Introduction
The Kruskal-Wallis test is a non-parametric test for differences between more than two samples. It is essentially an analogue for a one-way ANOVA. There is no "standard" method for carrying out post hoc analysis for KW tests. These notes show you how you can use a modified form of the U-test to carry out post hoc analysis. The notes include the custom R commands in the file KW posthoc.R.

Chapter 11 Tests for linking several factors
The following exercises and notes relate to Chapter 11, tests for linking several factors. This means regression, multiple regression, curvilinear regression and logistic regression.

Regression model diagnostics.
11.1 Graphing multiple regression
These notes relate to Chapter 11, tests for linking several factors. The notes relate especially to Sections 11.1 (multiple regression) and 11.2 (curvilinear regression). The notes show how you can summarize your regression models graphically. Go to Exercise 11.1.
Introduction
When you only have two variables (a predictor and a single response) you can use a regular scatter plot to show the relationship. Even if the relationship is logarithmic or polynomial you can represent the situation, as long as there is only one predictor variable. When you have two or more predictor variables it becomes hard to represent the situation graphically. You can try a 3D plot but they are rarely successful. Generally you'll stick to plotting the "most important" predictor variable and display the model as a standard regression table. However, it can be helpful to show some diagnostics from your regression model as graphics. These notes show you how you can produce some simple regression diagnostics and present them graphically.
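As an indication of what such diagnostics might look like, this sketch fits a two-predictor model to R's built-in mtcars data (used here only because it ships with R, not because it appears in the book) and draws two common plots: residuals against fitted values and a normal Q-Q plot of the residuals.

```r
# Multiple regression on a built-in dataset (illustrative choice)
fit <- lm(mpg ~ wt + hp, data = mtcars)

fv <- fitted(fit)    # "idealised" response values from the model
rs <- resid(fit)     # observed minus fitted

op <- par(mfrow = c(1, 2))             # two plots side by side

# Residuals vs fitted: look for curvature or a funnel shape
plot(fv, rs, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal Q-Q plot: points should lie close to the line
qqnorm(rs)
qqline(rs)

par(op)                                # restore the plotting layout
```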

Using R to calculate beta coefficients from regression models.
11.1.2 Beta Coefficients
These notes are related to Sections 11.1 (multiple regression) and 11.2 (curvilinear regression). More specifically they show how to calculate beta coefficients from regression models using R. Go to Exercise 11.1.2.
Introduction
In linear regression your aim is to describe the data in terms of a (relatively) simple equation. The simplest form of regression is between two variables:
y = mx + c
In the equation y represents the response variable and x is a single predictor variable. The slope, m, and the intercept, c, are known as coefficients. If you know the values of these coefficients then you can plug them into the formula for values of x, the predictor, and produce a value for the response. In multiple regression you "extend" the formula to obtain coefficients for each of the predictors. If you standardize the coefficients (using standard deviation of response and predictor) you can compare coefficients against one another, as they effectively assume the same units/scale. The functions for computing beta coefficients are not built in to R. In these notes you'll see some custom R commands that allow you to get the beta coefficients easily. You can download the Beta coeff calc.R file directly (the code is explained in Exercise 11.1.2).
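The standardization described above can be sketched in a few lines: each slope is multiplied by the standard deviation of its predictor and divided by the standard deviation of the response. This is not the code from Beta coeff calc.R; it is a minimal illustration using the built-in mtcars data, checked against a model fitted to z-scored variables.

```r
# Multiple regression on a built-in dataset (illustrative choice)
fit <- lm(mpg ~ wt + hp, data = mtcars)

b  <- coef(fit)[-1]                   # slopes only, drop the intercept
sx <- sapply(mtcars[names(b)], sd)    # SD of each predictor
sy <- sd(mtcars$mpg)                  # SD of the response

beta <- b * sx / sy                   # standardized (beta) coefficients
print(beta)

# Check: slopes from a model on z-scored variables should match
fit_z <- lm(scale(mpg) ~ scale(wt) + scale(hp), data = mtcars)
```

Because the beta coefficients share a common scale, their absolute sizes can be compared directly to judge the relative influence of each predictor.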

Use the results of a regression model as a predictive tool. Predict the level of the response variable in a logistic regression. 
11.3.1 Logistic regression: model prediction
These notes relate to Section 11.3, logistic regression. In particular the notes show how to predict the value of the response variable for given values of the predictor in a logistic regression. The notes are also generally relevant to regression (Chapter 11) as the methods apply to general regression models. Go to Exercise 11.3.1.
Introduction
When you carry out a regression you are looking to describe the data in terms of the variables that form the relationships. When you've got your regression model you are able to describe the relationship using a mathematical model (which is what the regression model is). The regression model can be used in several ways, for example you can calculate fitted values, which are "idealised" values of the response variable. These are used in making lines of best-fit and also in diagnostic plots (see Exercise 11.1). The differences between the idealised values and the actually observed values are called residuals, which are also used in diagnostic plots (see Exercise 11.1). You can also use the regression model to make predicted values, which is where you use "new" values of the predictor (that is, ones not observed in the original dataset) to predict the response variable. These are especially important in logistic regression, where your response is binary, that is, it only has two possibilities. The result you get when you "predict" response values in a logistic regression is a probability; the likelihood of getting a "positive" result when the predictor variable is set to a particular value.
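A brief sketch of prediction from a logistic model using predict() with type = "response", which returns probabilities. The built-in mtcars data (binary response am, predictor wt) stand in for the exercise data, and the new predictor values are arbitrary.

```r
# Logistic regression with a binary response (built-in data, illustrative)
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# "New" predictor values not necessarily present in the original data
new <- data.frame(wt = c(2.5, 3.0, 3.5))

# type = "response" gives probabilities rather than log-odds
p <- predict(fit, newdata = new, type = "response")
print(cbind(new, probability = p))

# The same values by hand: inverse logit of the linear predictor
p_hand <- plogis(coef(fit)[1] + coef(fit)[2] * new$wt)
```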

Section 11.3 Regression model building. Build a model in logistic regression. 
11.3.2 Logistic regression: model building
These notes relate to Section 11.3 logistic regression. The notes show how to build a logistic regression model that contains only the "best" components. The principles apply to all regression models but the exercise uses an example dataset that has a binary response variable. Go to Exercise 11.3.2.
Introduction
When you have several (or indeed many) predictor variables you want to find a regression model that best describes the relationship between the variables. You should not incorporate every variable that you've got. Eventually you'll explain all the variability in your response variable simply because you've got so many explanatory predictors. The process of model-building allows you to select the "best" variable to add to your current regression model. In the book you see how to carry out stepwise model building using a regular multiple regression (Section 11.1.2). In this exercise you can have a go at building a logistic regression model. The process is much the same as described in Section 11.1.2.
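One way to take a single model-building step in R is with add1(), which evaluates each candidate term in turn. The sketch below assumes that approach and uses the built-in mtcars data with an arbitrary set of candidate predictors; the exercise itself may proceed differently.

```r
# Start from a model with no predictors (built-in data, illustrative)
null <- glm(am ~ 1, data = mtcars, family = binomial)

# Evaluate each candidate term added singly to the current model
cand <- add1(null, scope = ~ wt + hp + qsec, test = "Chisq")
print(cand)

# Pick the candidate with the lowest AIC (ignoring the "<none>" row)
terms_only <- cand[rownames(cand) != "<none>", ]
best <- rownames(terms_only)[which.min(terms_only$AIC)]

# Add it to the model; repeat add1() on fit1 for the next step
fit1 <- update(null, as.formula(paste(". ~ . +", best)))
```

In practice you would stop adding terms once no remaining candidate gives a significant improvement.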

Chapter 12 Community ecology: Diversity & Similarity 
These notes and exercises relate to Chapter 12, Community ecology, and specifically to diversity and similarity.

Preparing & managing community data. Use a Pivot Table in Excel to rearrange data from scientific recording layout to community layout.
12.0.0 Preparing & managing community data
This exercise relates to all of Chapter 12 (community ecology), and is primarily aimed at helping you to prepare data and assemble it in a form that allows you to carry out further investigation. This follows on from an earlier exercise in Section 3.2.7, where you used an Excel Pivot Table. Go to Exercise 12.0.0.
Introduction
It is important that your data are arranged and set out in a manner that allows you to carry out the analyses that you require. In general a scientific recording layout is a good starting point (see Section 2.2). In the scientific recording layout (a.k.a. biological recording layout) you have a column for each variable (e.g. site name, species name, abundance). For most purposes this is the most "robust" way to record your data, as you can use the data most flexibly. For community analyses however, you'll generally want to have the data arranged by site and species, with the rows being site names and the columns the species (with the body of the table being the abundance). This exercise shows you how to take a dataset that is in recording layout and convert to community layout. The exercise data are: Preparing and managing community data exercise.xlsx. See also some notes about data layout and management from my book Managing Data Using Excel, on my Writer's Bloc page.
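The exercise uses an Excel Pivot Table, but the same rearrangement can be sketched in R with xtabs(), which cross-tabulates a value by two grouping variables. The records below are made up and the column names are assumptions for this example.

```r
# Made-up records in scientific recording layout: one row per observation
rec <- data.frame(
  site      = c("A", "A", "B", "B", "C"),
  species   = c("Bellis", "Plantago", "Bellis", "Trifolium", "Plantago"),
  abundance = c(4, 2, 7, 1, 3)
)

# Community layout: sites as rows, species as columns, abundance in the body
comm <- xtabs(abundance ~ site + species, data = rec)
print(comm)

# Convert the table to a plain data frame if later commands need one
comm_df <- as.data.frame(unclass(comm))
```

Species that were not recorded at a site appear as zero, which is what most community analyses expect.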


Comparing Shannon diversity index from two community samples. Hutcheson t-test.
12.1.4 Comparing diversity
These notes relate to Section 12.1.4 comparing diversity. The notes show how to compare the Shannon diversity index from two separate community samples. This is done using the Hutcheson t-test, a variant of the classic t-test. Go to Exercise 12.1.4.
Introduction
When you have two samples of community data you can calculate a diversity index for each one. The Shannon diversity index is a commonly used measure of diversity. However, you cannot compare the two index values using classic hypothesis tests because you do not have replicated data. The Hutcheson t-test is a modified version of the t-test that provides a way to compare two samples. The key is the formula that determines the variance of the Shannon index. These notes will show you how to conduct the Hutcheson t-test and so get a statistical significance of the difference in Shannon diversity between two samples. There is also a spreadsheet calculator that you can download and use for your own data.
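A compact sketch of the calculation, assuming the commonly quoted form of the variance of the Shannon index: [sum(p (ln p)^2) - (sum(p ln p))^2] / N + (S - 1) / (2 N^2), with N the total abundance and S the species richness. The function names and the abundance data are made up; the notes and spreadsheet calculator remain the reference.

```r
# Hypothetical helper: Shannon index, its variance, and total abundance
shannon_stats <- function(counts) {
  counts <- counts[counts > 0]
  N <- sum(counts)
  p <- counts / N
  H <- -sum(p * log(p))
  v <- (sum(p * log(p)^2) - sum(p * log(p))^2) / N +
    (length(counts) - 1) / (2 * N^2)
  c(H = H, var = v, N = N)
}

# Hypothetical helper: Hutcheson t-test for two abundance vectors
hutcheson_t <- function(a, b) {
  sa <- shannon_stats(a)
  sb <- shannon_stats(b)
  v  <- sa[["var"]] + sb[["var"]]
  t_val <- (sa[["H"]] - sb[["H"]]) / sqrt(v)
  df <- v^2 / (sa[["var"]]^2 / sa[["N"]] + sb[["var"]]^2 / sb[["N"]])
  c(t = t_val, df = df, p = 2 * pt(-abs(t_val), df))
}

# Made-up species abundances for two community samples
site1 <- c(25, 20, 18, 15, 12, 10)
site2 <- c(60, 15, 10, 8, 5, 2)

res <- hutcheson_t(site1, site2)
print(res)
```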

Section 12.2 Visualizing community similarity using a dendrogram in Excel.
12.2.1 Visualizing Similarity
This exercise is concerned with looking at similarity between ecological communities (Section 12.2). This exercise shows you how to visualize the similarity between several communities using a dendrogram drawn using Excel. Go to Exercise 12.2.1.
Introduction
When you have two or more ecological communities you can use the presence-absence of species (or abundance information, if you have it) to determine measures of similarity (or the corollary, dissimilarity). In the case of presence-absence data you use the species richness of each sample and the number of shared species to calculate an index of (dis)similarity. Once you have a matrix of (dis)similarity you can visualize the relationship between the community samples using a dendrogram. Think of it as being like a family tree, with communities most similar being "near" one another in the diagram. You can carry out calculations for (dis)similarity using Excel, although things can get rather tedious when you have more than a few samples. There is no built-in chart type that will create a dendrogram in Excel so you must use other drawing tools. R is able to carry out the calculations and draw the dendrogram rather easily but it is a worthwhile exercise to use Excel as it helps you understand how the (dis)similarity is "converted" to a dendrogram and therefore helps you to understand more clearly what you are looking at. You can get the sample data here: Dendrogram Exercise.xlsx.
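For reference, here is a sketch of the R route mentioned above, using a made-up presence-absence matrix. dist() with method = "binary" gives the Jaccard dissimilarity, 1 - J / (A + B - J), where J is the number of shared species and A and B are the richness of each sample. The choice of index and of average linkage are assumptions for this example.

```r
# Made-up presence-absence data: 4 samples (rows) by 5 species (columns)
pa <- matrix(
  c(1, 1, 1, 0, 0,
    1, 1, 0, 1, 0,
    0, 0, 1, 1, 1,
    1, 1, 1, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("S1", "S2", "S3", "S4"), paste0("sp", 1:5))
)

# Jaccard dissimilarity between every pair of samples
d <- dist(pa, method = "binary")
print(round(d, 2))

# Hierarchical clustering, then draw the dendrogram
hc <- hclust(d, method = "average")
plot(hc, main = "Community similarity", xlab = "", sub = "",
     ylab = "Jaccard dissimilarity")
```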

Community distance measures using abundance data. Calculations with Excel using various metrics.
12.2.2 Abundance-based dissimilarity metrics
This exercise is concerned with similarity between ecological communities (Section 12.2) and in particular with the calculation of distance measures when you have abundance data (Section 12.2.2). In this exercise you'll get the chance to undertake some simple calculations using Excel. Go to Exercise 12.2.2.
Introduction
When your community data samples include abundance information (as opposed to simple presence-absence) you have a wider choice of metrics to use in calculating (dis)similarity. When you have presence-absence data you use the number of shared species (J) and the species richness of each sample (A & B). Measures of (dis)similarity obtained are therefore slightly "crude". When you have abundance data your measures of (dis)similarity are a bit more "refined" and you have the potential to pick up patterns in the data that you would otherwise not see using presence-absence data. There are many metrics that you might use to explore (dis)similarity; in this exercise you'll see four of the more commonly used ones. The exercise shows you how you can carry out the calculations using Excel (in the book you also see how to do this using R). You can get the sample spreadsheet here: Distance metrics.xlsx.

Keywords
Here is a list of keywords: it is by no means complete! T-test, U-test, Kruskal-Wallis, Analysis of Variance, Spearman Rank, Correlation, Regression, Logistic Regression, Curved linear regression, histogram, scatter plot, bar chart, box-whisker plot, pie chart, Mean, Median, Mode, Standard Deviation, Standard Error, Range, Max, Min, Interquartile Range, IQR
