INTRODUCTION TO SAMPLES AND VARIABILITY


A sample consists of a set of items (e.g. plants) that is a subset of some population about which we want to draw conclusions. Because the items exhibit random variation, and our measurements may not be exact, any statistic calculated from observations on the sample (such as a mean) is also subject to variation.

To simulate measuring a feature on a sample of 16 items drawn from a population with a mean of 20 and a standard deviation of 2.0, select the following:
Tools > Data Analysis... > Random Number Generation

A dialog box like the one on the right appears. Enter '1' in the Number of Variables box and '16' in the Number of Random Numbers box, then select Normal from the pull-down list in the Distribution box. Enter '20' as the Mean and '2.0' as the Standard Deviation. Clicking OK gives you the 16 values.

To obtain the mean, select a free cell and open the Function Wizard. Choose the Statistical category and then Average. Highlight the sample values so that their range appears in the Number 1 box, then click OK.
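For readers who want to check the same two steps outside Excel, they can be sketched in Python. This is only an illustration under stated assumptions: it uses the standard library rather than Excel's own generator, the seed is an arbitrary choice, and the name draw_sample is hypothetical, so the values will not match those Excel produces.

```python
import random
import statistics

POP_MEAN = 20.0   # population mean used in the text
POP_SD = 2.0      # population standard deviation used in the text
SAMPLE_SIZE = 16  # number of items in the sample


def draw_sample(rng, n=SAMPLE_SIZE, mean=POP_MEAN, sd=POP_SD):
    """Return n values drawn from a Normal population (hypothetical helper)."""
    return [rng.gauss(mean, sd) for _ in range(n)]


rng = random.Random(1)  # arbitrary fixed seed, so that reruns give the same values
sample = draw_sample(rng)

# statistics.mean performs the same calculation as Excel's Average function.
sample_mean = statistics.mean(sample)
print(f"{len(sample)} values, sample mean = {sample_mean:.2f}")
```

The printed mean should be near 20 but will rarely equal it exactly, which is the variation described above.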

If you repeat this process and take lots of samples, you can explore the variability in the sample means and see how close a sample mean is likely to be to the population mean.
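Repeating the process by hand is slow, so the idea can be sketched as a short Python simulation. This is a sketch, not part of the Excel procedure: the number of repetitions (2000), the seed and the name one_sample_mean are assumptions made for illustration.

```python
import random
import statistics

POP_MEAN = 20.0     # population mean used in the text
POP_SD = 2.0        # population standard deviation used in the text
SAMPLE_SIZE = 16    # items per sample
NUM_SAMPLES = 2000  # assumed number of repeated samples


def one_sample_mean(rng):
    """Draw one sample of SAMPLE_SIZE values and return its mean."""
    values = [rng.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    return statistics.mean(values)


rng = random.Random(2)  # arbitrary fixed seed for repeatable runs
means = [one_sample_mean(rng) for _ in range(NUM_SAMPLES)]

# Summarise how the sample means are scattered around the population mean.
spread = statistics.stdev(means)
print(f"smallest mean = {min(means):.2f}, largest mean = {max(means):.2f}")
print(f"mean of the sample means = {statistics.mean(means):.2f}")
print(f"standard deviation of the sample means = {spread:.2f}")
```

The sample means cluster much more tightly around 20 than the individual values do.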

Mathematically, SD(sample mean) = SD / Sqrt(n), where SD is the standard deviation of the population and n is the sample size. The standard deviation of the sample mean is also known as the standard error (SE), and this idea is generalised to the SE of any other quantity calculated from a sample.
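As a worked example of this formula, the population used earlier (SD of 2.0, samples of 16 items) gives SE = 2.0 / Sqrt(16) = 0.5. The short Python sketch below does the same arithmetic. The function name standard_error and the second sample size of 64 are illustrative assumptions.

```python
import math


def standard_error(sd, n):
    """Return SD / Sqrt(n), the standard error of the mean of n items."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)


se_16 = standard_error(2.0, 16)  # 2.0 / 4 = 0.5
se_64 = standard_error(2.0, 64)  # 2.0 / 8 = 0.25
print(se_16, se_64)  # prints: 0.5 0.25
```

Because n appears under a square root, the sample size has to be quadrupled to halve the standard error.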

Variability in samples can be used to decide whether the difference between two or more samples is genuine or due to chance. If the variability between samples is about what would be expected from sampling variability alone, then there is no reason to explain it any other way. If it is clearly larger, that is evidence of genuine differences between the populations the samples came from.

Basic statistics in Excel   23.2.99   Page: 15 of 25