Microsoft Excel

From NoskeWiki
Jump to: navigation, search

About Microsoft Excel

Excel is the standard for spread sheeting, laden with features and is extremely powerful tool if you know how to use it. I even use Excel to draw flow diagrams etc, using the "snap to grid" feature and ability to resize columns / rows. To get the most out of Excel I recommend you learn about using "Formula", "Charts", "Conditional Formatting" and "Sorting". I also enjoy the fact I can write Macros to speed up work. Excel isn't the be all and end all for statistical analysis (especially in science), but most of the time it gets you by.


Formula

Standard Error of the Mean

It's strange that Excel doesn't include a formula for calculating Standard Error of the Mean (S.E.M.), but it can be done as follows:

=(STDEV(A1:A2))/(SQRT(COUNT(A1:A2)))


Unpaired Student t-Test

The formula to calculate the probability, p, associated with a "Students t-Test" in Excel is:

=TTEST(array1,array2,tails,type)
  • array1 is the range of the first data set
  • array2 is the range of the second data set
  • tails specifies the number of distribution tails where: 1 = one-tailed distribution; 2 = two-tailed distribution.
  • type is the type of t-Test to perform where: 1 = paired distribution; 2 = equal variance (homoscedastic); 3 = unequal variance (heteroscedastic).

In science you often have two sets of numbers (eg: length of fish from pond A and pond B) and want to prove one is statistically different from the other. This means we want to test the hypothesis that mean and deviations of both sets is equal; thus we want to test for two-tailed distribution and two-sample equal variance, and so your formula will look like this:

=TTEST(A1:A30,B1:B30,2,2)

Typically a p<0.05 (sometimes p<0.005) is considered "statistically significant" - meaning the hypothesis is almost certainly wrong, thus the two sets are NOT equal. The "p" value returned generally gets smaller as sample size increases (unless you are wrong, and the two sets are no different). The order of array1 and array2 is not important, and can be unequal size. For very large data sets (columns) the result may be so small it gets rounded down to 0. For a more in-depth report of t-Test results (not just probability) you might consider using the R statistics package.


Links