In order to develop confidence in the result of a measurement of a
single quantity, such as the length of a table top, we often repeat
the measurement process a number of times. The results of the
measurement vary because of difficulties in reading the meter stick
scale to the last tenth of a millimeter, and for other reasons.
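Suppose we repeated the measurement $N$ times, getting a list of
values $x_1, x_2, \ldots, x_N$. Our best guess for the true value is usually the
average of these values:
\begin{displaymath}
\bar x^* = \frac{1}{N} \sum_{i=1}^{N} x_i
\end{displaymath}
(6)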
which is also called the "mean" value of this sample set of
observations. In our notation, $\bar x^*$ indicates our best,
imperfect estimate of the true value $\bar x$. If we repeat the
measurement an infinite number of times, ideally, the mean value
should approach the "true" value of the measurement. The
statistical way to describe what is happening is that our set of
measurements is a sample of $N$ values taken from an infinite
"population". The true population mean is given by
\begin{displaymath}
\bar x = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} x_i
\end{displaymath}
(7)
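The convergence of the sample mean toward the population mean can be illustrated with simulated data. The sketch below is only an illustration under stated assumptions: the population is taken to be Gaussian purely for convenience, and the names `sample_mean`, `TRUE_MEAN`, and `TRUE_SIGMA` and the numerical values are hypothetical, not part of the course material.

```python
import random
from math import fsum

def sample_mean(values):
    """Average of a list of measurements, as in Eq. (6)."""
    return fsum(values) / len(values)

# Hypothetical population: a table top of true length 150.0 cm,
# measured with a true standard deviation of 0.05 cm (assumed values).
TRUE_MEAN = 150.0
TRUE_SIGMA = 0.05

rng = random.Random(12345)  # fixed seed so the run is repeatable

for n in (10, 1000, 100000):
    # Simulate n repeated measurements drawn from the assumed population.
    measurements = [rng.gauss(TRUE_MEAN, TRUE_SIGMA) for _ in range(n)]
    print(n, sample_mean(measurements))
```

As the sample size grows, the printed averages should scatter less and less about the assumed true value, which is the behavior Eq. (7) describes in the limit.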
We might ask of this infinite population, what is the probability of
getting a value of $x$ in the range $[x, x + dx]$ when we make a
measurement? This probability is expressed in terms of a probability
function $P(x)$ as $P(x)\,dx$. The factor $dx$ is necessary because as
the interval width $dx$ gets smaller, the probability of getting a
value in that tiny range must get smaller in proportion to $dx$. If
we make enough measurements, we can begin to construct this
probability function, but usually we don't make enough measurements to
know it very well. So we often assume, for want of any better reason,
that the probability is given by the Gaussian distribution function
(normal distribution)
\begin{displaymath}
P(x) = \exp[-(x - \bar x)^2/2\sigma^2]/(\sqrt{2 \pi} \sigma)
\end{displaymath}
(8)
In this expression the true mean of the population is $\bar x$
and $\sigma$ is the true "standard deviation". This probability is
normalized so that
\begin{displaymath}
\int_{-\infty}^{\infty} P(x)\,dx = 1
\end{displaymath}
(9)
i.e. the probability of measuring any value of $x$ is 1. The
Gaussian distribution is peaked at $x = \bar x$ and falls off on
either side of $\bar x$ over a distance in $x$ that is controlled by
the value of $\sigma$. If $\sigma$ is large, the fall off is slow and
the most probable values of $x$ are in a broad range around $\bar x$;
if $\sigma$ is small, the fall off is rapid, and the most probable
values of $x$ are narrowly clustered around $\bar x$. A property of
the Gaussian distribution is that the probability of making a
measurement and getting a value in the range between $\bar x - \sigma$
and $\bar x + \sigma$ is about 68%. (This value is found by calculating
the integral under the probability distribution from $\bar x - \sigma$
to $\bar x + \sigma$.) Thus in common usage, we say that for a single
measured value of $x$, the result is $x \pm \sigma$. The
standard deviation of a quantity is sometimes called the "error" in
that quantity, so we say the error in a single measurement is
$\sigma$. The statement that $x$ lies in the range $\bar x \pm \sigma$
is a statement we can make with 68% confidence. That means
the result of a measurement is likely to be outside this range 32% of
the times we repeat the experiment.
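The 68% figure can be checked directly. The following is a minimal sketch, not a prescribed method: it uses the standard identity that the probability of falling within $k\sigma$ of the mean of a Gaussian is $\mathrm{erf}(k/\sqrt{2})$, and compares it with a simple midpoint-rule integration of Eq. (8). The function names and the illustrative values of the mean and $\sigma$ are assumptions made for this example.

```python
from math import erf, exp, pi, sqrt

def gaussian_pdf(x, mean, sigma):
    """Gaussian probability density of Eq. (8)."""
    return exp(-(x - mean) ** 2 / (2.0 * sigma ** 2)) / (sqrt(2.0 * pi) * sigma)

def prob_within(k):
    """Probability that a Gaussian measurement lies within k*sigma of the mean."""
    return erf(k / sqrt(2.0))

def integrate_pdf(mean, sigma, lo, hi, steps=10000):
    """Midpoint-rule integral of the Gaussian density from lo to hi."""
    width = (hi - lo) / steps
    total = sum(gaussian_pdf(lo + (i + 0.5) * width, mean, sigma)
                for i in range(steps))
    return total * width

mean, sigma = 150.0, 0.05  # illustrative values only

# Analytic one-sigma probability (about 68%).
print(prob_within(1.0))
# Numerical integral over the same one-sigma range.
print(integrate_pdf(mean, sigma, mean - sigma, mean + sigma))
# Integral over a very wide range, which should be close to 1 as in Eq. (9).
print(integrate_pdf(mean, sigma, mean - 10 * sigma, mean + 10 * sigma))
```

Calling `prob_within(2.0)` in the same way gives the corresponding probability for a two-sigma range.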
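A measure of the width of the Gaussian peak is given by
\begin{displaymath}
\sigma^2 = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar x)^2
\end{displaymath}
(10)
This is just the average of $(x - \bar x)^2$ over the population.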
If we made an infinite number of measurements, we would be able to
determine the two parameters $\bar x$ and $\sigma$ of the distribution
exactly. With a finite set of measurements, however, we can only estimate
them. To estimate the mean value, we simply compute the average of
the measurements $x_i$:
\begin{displaymath}
\bar x^* = \frac{1}{N} \sum_{i=1}^{N} x_i
\end{displaymath}
(11)
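Notice that we have put a star on $\bar x^*$ to distinguish the
estimate from the true value $\bar x$. The sample also permits an
estimate of the population standard deviation $\sigma$. It is obtained from
\begin{displaymath}
\sigma^{*2} = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar x^*)^2
\end{displaymath}
(12)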
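Eqs. (11) and (12) translate into a few lines of code. The sketch below is one possible way to write them, not the required solution to any exercise; the function names and the sample data are made up for illustration.

```python
from math import fsum, sqrt

def estimate_mean(values):
    """Estimated mean of the sample, Eq. (11)."""
    return fsum(values) / len(values)

def estimate_sigma(values):
    """Estimated standard deviation of the sample, square root of Eq. (12)."""
    n = len(values)
    if n < 2:
        # With one measurement the N - 1 divisor is zero, so no estimate exists.
        raise ValueError("need at least two measurements")
    mean = estimate_mean(values)
    return sqrt(fsum((x - mean) ** 2 for x in values) / (n - 1))

# Made-up table-top length measurements in centimeters.
lengths_cm = [150.02, 149.97, 150.05, 149.99, 150.01]

print("estimated mean:", estimate_mean(lengths_cm))
print("estimated standard deviation:", estimate_sigma(lengths_cm))
```

The mean is computed first and the squared deviations second, so the list is traversed twice; this follows Eq. (12) literally.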
The quantity $\sigma^*$ is the estimated standard deviation, and its
square is called the estimated variance of $x$ from the mean value
$\bar x^*$, or just the estimated variance of $x$.
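Another useful formula
is obtained by expanding the square on the right side to give
\begin{displaymath}
\sigma^{*2} = \frac{N}{N-1} \left( \overline{x^2}^* - \bar x^{*2} \right)
\end{displaymath}
(13)
The $\overline{x^2}^*$ means the sample average of $x_i^2$. In other words, the
estimated variance is just the difference between the average of the
squares and the square of the average, multiplied by $N/(N-1)$.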
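The agreement between the defining sum in Eq. (12) and the expanded form in Eq. (13) can be checked numerically. The sketch below compares the two on a small made-up data set; the function names are hypothetical and chosen only for this illustration.

```python
from math import fsum

def variance_two_pass(values):
    """Estimated variance from the sum of squared deviations, Eq. (12)."""
    n = len(values)
    mean = fsum(values) / n
    return fsum((x - mean) ** 2 for x in values) / (n - 1)

def variance_shortcut(values):
    """Estimated variance from the averages of x and x**2, Eq. (13)."""
    n = len(values)
    mean = fsum(values) / n                     # average of the values
    mean_sq = fsum(x * x for x in values) / n   # average of the squares
    return (mean_sq - mean * mean) * n / (n - 1)

data = [1.0, 2.0, 3.0, 4.0]  # made-up values
print(variance_two_pass(data), variance_shortcut(data))
```

The two results agree to rounding error here. In floating-point arithmetic the expanded form subtracts two nearly equal numbers when the spread of the data is small compared with the mean, so it can lose precision; the two-pass form is usually the safer choice in that case.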
As an exercise in this course, you will be asked to write a program
that reads a list of values $x_i$ and calculates $\bar x^*$
and $\sigma^*$.
Carleton DeTar
2009-11-18