**NOTE**

**This manual describes the laboratory
experiment used during the 1996 - 1997 academic year. Significant
changes have been made since then, and the manual used during
the current academic year is NOT yet available on the Web.
Hardcopies can be purchased at the bookstore.**

**Purpose**

To understand statistical distributions and their appropriate errors by calculating a binomial distribution and comparing it to the Poisson and Gaussian (normal) distributions.

**Introduction**

Probability distributions are widely used, primarily in experiments which involve counting. The sampling errors which occur in counting experiments are called statistical errors. Statistical errors are one special kind of error in the class of errors known as random errors. You will find that what you learn in this laboratory is relevant not only in the natural and social sciences, but also in everyday life. Please read the theory section that follows, and then the file on Error Analysis, before proceeding to do the prelab. Bring the completed error analysis prelab with you.

This section will help the student with the prelab homework. You are probably familiar with polls conducted before a presidential election. If the sample of people who are polled is carefully chosen to represent the general population, then the error in the prediction depends on the number of people in the sample. The larger the number of people, the smaller the error. If the sample is not properly chosen, it would result in a bias (i.e. an additional systematic error).

If a fraction p of the population will vote Democratic and
a fraction q = (1 - p) will vote Republican, then one expects that
in a sample of N people, one will find on average $\bar{x} = Np$
people who say that they will vote Democratic, and $N - \bar{x} = Nq = N(1 - p)$
who say that they will vote Republican. If this poll
is taken many times for different samples, one will find that the
distribution of the results for x (which is the number of people
who say they will vote Democratic) follows a binomial distribution
with the mean of x equal to $\bar{x} = Np$. The probability distribution
B(x) for finding x in a sample of N is a function of the probabilities
p and q, and is given by the binomial distribution as follows:

$B(x) = \frac{N!}{x!\,(N-x)!}\, p^{x} q^{N-x}$ **(1.1)**

where x = 0, 1, 2, ..., N and N! = N(N-1)(N-2)...1, with 0! = 1 by definition.
Here $\frac{N!}{x!\,(N-x)!}$ is the number of combinations
of x objects taken from a sample of N, $p^{x}$ is the probability
of getting x Democratic voters, and $q^{N-x}$
is the probability of having the remaining N - x voters be Republican.
The above equation acts as a model that provides the probability
of finding a particular value of x. Often, the most needed information
provided by this distribution is the mean of x and its standard
deviation,

$\bar{x} = Np$, $\sigma = \sqrt{Npq}$. **(1.2)**

For example, if p = 0.51 and q = 0.49 and N = 900, one expects
that the poll will indicate a number close to 50% for the fraction
who say that they will vote Democratic. The pollster will find
a number x close to Np = 900 × 0.51 (i.e. around 459) with a standard
deviation expected to be $\sqrt{Npq} = \sqrt{900 \times 0.51 \times 0.49} \approx 15$.

If in a particular poll the pollster finds x = 450, he will claim that the poll indicates that 50% (450/900) will vote Democratic with a margin of error of 1.7% (15/900). (Another way to calculate the margin of error is given in an optional section at the end of this lab.) It is not likely that the pollster will find a number such as 40%. This is because 900 × 0.4 = 360, which is 99 away from the expected number of 459. It is possible but very unlikely that the result will be more than six (99/15 = 6.6) standard deviations away from the expected value.
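The arithmetic of the poll example can be checked with a few lines of Python. This is only a sketch; the function name `binomial_mean_sigma` is made up for this illustration and is not part of the lab.

```python
import math

def binomial_mean_sigma(n, p):
    """Return the mean Np and the standard deviation sqrt(Npq) of a binomial count."""
    q = 1.0 - p
    return n * p, math.sqrt(n * p * q)

N, p = 900, 0.51                          # values from the poll example above
mean, sigma = binomial_mean_sigma(N, p)
margin = sigma / N                        # margin of error as a fraction of the sample
n_sigma = (mean - N * 0.40) / sigma       # distance of a 40% result from the mean

print(f"mean = {mean:.0f}, sigma = {sigma:.1f}")
print(f"margin of error = {100 * margin:.1f}%")
print(f"a 40% result lies {n_sigma:.1f} standard deviations below the mean")
```

Changing `N` shows directly how the margin of error shrinks as the sample grows.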

For large N and small p, the binomial distribution approaches a Poisson distribution. The Poisson distribution is more commonly applied to phenomena which occur randomly at a fixed average rate. For example, suppose you stand outside and count the number of people walking by. You stand for 1 hour and count n = 900. If you repeated the experiment many times, you would find that the mean of the number of people passing by in one hour is M. The standard deviation of the Poisson distribution is given by,

$\sigma = \sqrt{M}$. **(1.3)**

The Poisson distribution for measuring n = x when the expected mean is M is given by,

$P(x) = \frac{M^{x} e^{-M}}{x!}$. **(1.4)**

where e = 2.71828, and x = 0, 1, 2, 3, .... Note that the mean M does not need to be an integer.
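The statement that the binomial distribution approaches the Poisson distribution for large N and small p can be explored numerically. The sketch below holds the mean fixed at M = Np = 2.5 (an illustrative choice) and increases N; the helper `max_difference` is a name invented for this example.

```python
import math

def binomial_pmf(x, n, p):
    """B(x) = N!/(x!(N-x)!) * p^x * q^(N-x)."""
    return math.comb(n, x) * p**x * (1.0 - p)**(n - x)

def poisson_pmf(x, m):
    """P(x) = M^x * e^(-M) / x!."""
    return m**x * math.exp(-m) / math.factorial(x)

def max_difference(n, m, x_max=10):
    """Largest |B(x) - P(x)| over x = 0..x_max when p = M/N."""
    p = m / n
    return max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, m))
               for x in range(x_max + 1))

M = 2.5  # illustrative mean
for n in (10, 100, 1000):
    print(f"N = {n:4d}, p = {M / n:.4f}, max |B - P| = {max_difference(n, M):.5f}")
```

The largest difference between the two distributions should shrink as N grows and p falls.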

For large values of N (the total number for the case of the
binomial distribution), and also for large values of M for the
case of the Poisson distribution (say M greater than 10 - 30),
both the binomial and the Poisson distributions approach a Gaussian
(normal or bell curve) distribution. The normal distribution has
a mean M and a standard deviation $\sigma$,
which are independent of each other. It is a continuous probability distribution
G(x) given by,

$G(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-z^{2}/2}$ where $z = \frac{x - M}{\sigma}$
**(1.6)**

If you take the point in the normal distribution that is one standard deviation below the mean and the point that is one standard deviation above the mean, the area under the curve between the two points is 0.6827, or 68.27%. That is, the probability of a single measurement falling within one standard deviation of the mean is 68.27%. The probabilities of it falling within ±2 and ±3 standard deviations are 0.9545 and 0.9973, respectively. To the extent that the binomial and Poisson distributions can be approximated by a normal distribution, these probabilities indicate how likely or unlikely it is for a measurement to fall outside one, two or three standard deviations of the mean.
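The quoted areas follow from the error function: for a normal distribution the probability of falling within k standard deviations of the mean is erf(k/√2). A minimal check, with `coverage` as a name chosen for this sketch:

```python
import math

def coverage(k):
    """Probability that a normally distributed value lies within k sigma of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    inside = coverage(k)
    print(f"within {k} sigma: {inside:.4f}   outside: {1.0 - inside:.4f}")
```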

| Distribution | Mean | Standard Deviation |
|---|---|---|
| binomial | $Np$ | $\sqrt{Npq}$ |
| Poisson | $M$ | $\sqrt{M}$ |
| normal (Gaussian) | $M$ | $\sigma$ |

**Table 1.1**

**Prelab Homework**

Before you do this prelab, read this lab, and the file on Error Analysis. The prelab homework must be done at home and handed to the lab TA before you start the lab.

In order to do this prelab you need to understand the concept of a standard deviation for a binomial distribution.

**Questions**

It is the month of August and a group of students are having dinner. They are discussing a recent Campus Times article on a medical study which reported that 1 in 10 of the general population suffers from allergies to ragweed pollen. Two of the students were sneezing and rubbing their eyes during the dinner. They lamented the fact that this was ragweed season and they were really suffering.

**1)** One student noticed that sitting around the table,
there were 2 students who were allergic to ragweed and 8 students
who were not. He commented that it was a ratio of 2/10 = 0.20,
in contrast to the medical study claiming that the ratio should
be 1/10 = 0.10. He said that this indicated that students at the
U of R were twice as likely to be allergic to ragweed as the
national average.

**a)** What is the standard deviation expected from the
binomial distribution and the sample size of ten students?

**b)** How many standard deviations away from the national
average are the results of this experiment?

**c)** Are the results of this experiment consistent with
the national average?

**2)** The rumor quickly spread around campus, and people
began to worry about the allergy cluster in Lattimore Hall. Some
people commented that exposure to chemicals can increase the likelihood
of developing allergies, and that those people were most likely
chemistry majors. Not having access to the original source of
the rumor, some students decided to conduct their own independent
studies. Student A stood for one hour outside Wilson Commons and
asked students if they were allergic to ragweed. After one hour
he found that 61 students did not have allergies and 3 students
did. He calculated that the ratio of the two groups was 3/64 =
0.047, and concluded that at the U of R the ragweed allergy rate
was actually half of the national average.

Are the results of this experiment consistent with the national average? Can you offer a likely explanation for the result? Be quantitative; use the concept of standard deviation.

**3)** Two other students decided to do a more elaborate
study. Student C stood outside Wilson Commons for a full day and
student D did a similar survey in Marketplace Mall. Student C
found that there were 40 students with this allergy and 605 students
without and obtained a ratio 40/645 = 0.062. Student D's sample
consisted of 702 people with no allergy and 81 people with allergy
to ragweed for a ratio of 81/783 = 0.103.

What should student C and student D conclude from their joint venture? Can you offer some likely explanation for their results? Be quantitative; use the concept of standard deviation.

**4)** A comment by a student on one of the 1992 student
TA evaluation questionnaires: "Why do we have to learn about
errors? The physics department should just buy good and accurate
equipment." What can you say about this student's comment?

**The Experiment**

**You will need to bring to this lab:**

1. A scientific calculator.

2. Linear graph paper

3. A completed error analysis pre-lab assignment

**Procedure**

The experiment consists of measuring the fraction of galvanized (silver or nickel color) washers in a mixture of both galvanized and non-galvanized (yellow brass color) 1/4-20 brass washers. The 10"x17" plastic bucket contains 24 lb (about 4500) of yellow brass washers, and 8 lb (about 1500) of galvanized (silver color) washers. The washers have been mixed, so the probability of getting a galvanized washer is about 1500/6000 = 0.25. The TA should give each student a small 6" metal bucket containing a random sample of 100 washers from the mixed large bucket (obtained by weight). The TA should do the experiment as one of the students.

**Check List**

Each student (including the TA) is given:

**1.** A 6" metal bucket containing 0.5 lb. of washers.
The bucket should contain 100 washers.

**2.** A 3" clear plastic cup containing a total of 11
plastic washers to be used as spacers.

**3.** A 9" long aluminum rod which is threaded at
both ends.

**4.** Two 1/4-20 wing nuts.

**A. Setting up the Data Sample:**

**1)** Remove one of the nuts from the end of the aluminum
rod. Place one plastic washer on the rod.

**2)** Mix the 100 washers in your metal bucket.

**3)** Without looking directly at the bucket, take one metal
washer at a time and put it on the rod. When you have counted
10 metal washers, place a plastic washer on the rod as a spacer.

**4)** Repeat until you have __ten__ groups of 10 metal
washers spaced by plastic washers. This constitutes your data
sample. If you have extra metal washers left over, return them
to the TA. If you do not have enough, ask the TA for more.
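Before handling real washers, the sampling procedure can be imitated on a computer. The sketch below assumes an idealized, perfectly mixed bucket of 1500 silver and 4500 brass washers and draws 100 of them without replacement; the function `draw_sample` and the seed value are arbitrary choices for this illustration.

```python
import random
from collections import Counter

def draw_sample(rng, n_silver=1500, n_brass=4500, sample_size=100, group_size=10):
    """Draw washers without replacement and count the silver ones in each group of ten."""
    bucket = ["silver"] * n_silver + ["brass"] * n_brass
    sample = rng.sample(bucket, sample_size)
    groups = [sample[i:i + group_size] for i in range(0, sample_size, group_size)]
    return [group.count("silver") for group in groups]

rng = random.Random(1997)            # fixed seed so the run can be repeated
silver_per_group = draw_sample(rng)  # ten numbers, one per group on the rod

# Tally how many groups show each silver/brass combination 0/10 ... 10/0.
tally = Counter(silver_per_group)
row = [tally.get(i, 0) for i in range(11)]

print("silver washers per group:", silver_per_group)
print("groups with 0..10 silver:", row)
print("total silver in the sample of 100:", sum(silver_per_group))
```

Running it with different seeds gives a feel for how much one student's row of totals can differ from another's.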

**B. Obtaining Data for a Binomial Distribution with n=10:**

$N_s$ = number of students in the lab

$s$ = number of silver washers in a group of 10

$b$ = number of brass washers in a group of 10

$S$ = total number of silver washers you have

$B$ = total number of brass washers you have

$S_c$ = total number of silver washers in class

$B_c$ = total number of brass washers in class

$N$ = 100: total number of washers you have

$N_c$: total number of washers in class

p = fraction of any sample that is silver

q = 1 - p = fraction of any sample that is brass

**5)** Check your rod and record the number of groups showing each silver-color/yellow-color combination ($s$/$b$, with $s + b = 10$).

Individual Totals

| 0/10 | 1/9 | 2/8 | 3/7 | 4/6 | 5/5 | 6/4 | 7/3 | 8/2 | 9/1 | 10/0 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | |

Record the number of groups under each combination above. You should have a total of 10 samples. You should give this data to the TA.

**6)** Total number of students (about 20) in the class
(including the TA): N_{s} =

**7)** The TA should ask each student how many groups of each $s$/$b$
combination she or he has and add up the total number of each combination
for all the students in the class, for a total of $10 \times N_s$ (about
200) samples.

**8)** Record the data for each student in the data sheet
at the end of this lab.

**9)** Copy the totals for the $10 \times N_s$ samples
in the class, silver/yellow ($s$/$b$, with $s + b = 10$).

Class Totals

| 0/10 | 1/9 | 2/8 | 3/7 | 4/6 | 5/5 | 6/4 | 7/3 | 8/2 | 9/1 | 10/0 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | |

**10)** Record the number of groups under each combination
above. You should have a total of about 200 (= $10 \times N_s$)
samples.

**C. Obtaining Data for a Binomial Distribution with n=100:**

**11)** Total number of silver washers in your sample
of 100: $S$ = _______.

Total number of yellow washers in your sample of 100: $B$ = _______.

The TA should ask the class and write on the board the
$S$/$B$ values (with $S + B = 100$) for each person.

**12)** Copy these numbers and record them in the last
row of the table at the end of this lab. You should have $N_s$
(about 20) samples.

**Data Analysis**

The following data analysis is to be done in the lab after the experiment is completed. You need the data from the other students in order to complete the analysis. The lab report is to be handed in within one week. The entire laboratory is expected to take one hour, with one additional hour for the data analysis.

$N_s$ = number of students in the lab

$s$ = number of silver washers in a group of 10; $b$ = number of brass washers in a group of 10

$S$ = total number of silver washers you have; $B$ = total number of brass washers you have

$S_c$ = total number of silver washers in class; $B_c$ = total number of brass washers in class

$N$ = 100: total number of washers you have; $N_c$: total number of washers in class

p = fraction of any sample that is silver; q = 1 - p = fraction of any sample that is brass

**13)** Determine p = (#silver/total) and q = (#yellow/total)
for the sample taken by the entire class (as done below). Determine
the uncertainty (standard deviation) in p and q.

**(a)** __Using the entire class sample:__

$p = \frac{S_c}{N_c}$, $q = \frac{B_c}{N_c}$

expected error in p = expected error in q = $\frac{\sqrt{N_c\, p q}}{N_c}$

number of standard deviations away = $\frac{p(\text{measured}) - 0.25}{\text{error in } p}$

Total number of silver washers in class sample $S_c$ = _________

Total number of yellow washers in class sample $B_c$ = _________

Total number of washers in the class sample ($S_c + B_c$) = $N_c$ = _________

(Should be $100 \times N_s$ or about 2000)

p (measured) =______ q(measured) = ______

The expected standard deviation for $S_c$
given p=0.25 and q =0.75 should be $\sqrt{N_c\, p q}$.
Use this expression to obtain the error in p(measured) and q(measured).

expected error in p =______ expected error in q =______

How many standard deviations is p away from expectation?

**(b)** __Using your own sample of 100 washers:__

$p = \frac{S}{N}$, $q = \frac{B}{N}$

expected error in p = expected error in q = $\frac{\sqrt{N p q}}{N}$

Do the same analysis as in the previous example, but this time
find p and q as measured by your single sample of 100 washers
($S$ silver and $B$ yellow).

Total number of silver washers in your sample $S$
= _________

Total number of yellow washers in your sample $B$
= _________

Total number of washers in your sample ($S + B$) = N =__________

(Should be 100)

p (measured) =_______ q(measured) = _______

The expected standard deviation for $S$
given p=0.25 and q =0.75 should be $\sqrt{N p q}$.
Use this expression to obtain the error in p(measured) and q(measured).

expected error in p =_______ expected error in q =________

How many standard deviations is p away from expectation?
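The calculation in step 13 can be sketched in code as follows. The counts used here (28 silver out of 100, and 520 out of 2000) are invented purely for illustration, and `fraction_with_error` is a hypothetical helper name; substitute your own totals.

```python
import math

def fraction_with_error(n_silver, n_total, p_expected=0.25):
    """Measured p and q, the expected error in p, and the deviation in units of that error."""
    p = n_silver / n_total
    q = 1.0 - p
    q_expected = 1.0 - p_expected
    sigma_count = math.sqrt(n_total * p_expected * q_expected)  # sqrt(Npq) for the count
    sigma_p = sigma_count / n_total                             # error in the fraction
    n_sigma = (p - p_expected) / sigma_p
    return p, q, sigma_p, n_sigma

# Hypothetical counts, not real lab data.
for label, silver, total in (("own sample", 28, 100), ("class sample", 520, 2000)):
    p, q, sigma_p, n_sigma = fraction_with_error(silver, total)
    print(f"{label}: p = {p:.3f}, q = {q:.3f}, "
          f"error = {sigma_p:.4f}, {n_sigma:+.2f} standard deviations from 0.25")
```

Note how the expected error in p drops by a factor of about √20 when going from one sample of 100 to a class sample of 2000.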

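**14)** Tests of the binomial distribution with N=100.

**(a)** Make a table and then plot the distribution of $S$, the number of
silver washers in a sample of 100, using the $N_s$ samples of
the class.

Mean of $S$: $\bar{S} = \frac{1}{N_s} \sum_{j=1}^{N_s} S_j$

standard deviation of $S$:
$\sigma_S = \sqrt{\frac{1}{N_s} \sum_{j=1}^{N_s} (S_j - \bar{S})^{2}}$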

**(b)** Find the mean of $S$
and the standard deviation of the distribution for the data taken
by the class. For the calculation of the standard deviation of the
sample use the formula from the file on Error Analysis. Note that
if there are 20 students there should be 20 samples of $S$.

Is the standard deviation consistent with the expected standard deviation?

A better estimate of the expected standard deviation from a
set of 20 measurements is given by the standard deviation of the
sample times $\sqrt{N_s/(N_s - 1)} = \sqrt{20/19}$ (see file
on Error Analysis).
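A sketch of this calculation is given below. It assumes that the standard deviation of the sample divides by the number of measurements n, so that the better estimate is obtained by multiplying by √(n/(n-1)); check this against the file on Error Analysis. The list of 20 counts is invented for illustration.

```python
import math
import statistics

# Hypothetical silver counts (out of 100) for a class of 20; not real data.
silver_counts = [25, 22, 28, 31, 19, 24, 27, 26, 23, 30,
                 21, 25, 29, 24, 20, 26, 27, 22, 25, 28]

n = len(silver_counts)
mean = statistics.mean(silver_counts)
sigma_sample = statistics.pstdev(silver_counts)        # divides by n
sigma_better = sigma_sample * math.sqrt(n / (n - 1))   # assumed correction factor
sigma_expected = math.sqrt(100 * 0.25 * 0.75)          # binomial sqrt(Npq)

print(f"mean = {mean:.2f}")
print(f"sample sigma = {sigma_sample:.2f}, corrected = {sigma_better:.2f}")
print(f"expected binomial sigma = {sigma_expected:.2f}")
```

With only 20 measurements the measured standard deviation itself fluctuates noticeably, so exact agreement with the expected value should not be expected.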

**(c)** For the distribution of $S$
(mean should be around $Np$ = 25)
the normal distribution should be a good approximation to the
Poisson and binomial distributions. Plot a normal distribution
with a mean equal to that of the data, but with a standard deviation of
5. Use the attached table of the values of the normal distribution
for mean = 25 and standard deviation of 5. Shift the x position
of the plotted curve such that it agrees with the mean for the
data. The attached table is a probability distribution which is
normalized to 1.0. Therefore, the y values need to be multiplied
by $N_s$ in order to normalize the distribution to the
total number of samples. In terms of the tabulated x and y values:

Let $u = x + \bar{S} - 25$

Let $v = N_s\, y$

Then plot v (vertical axis) against u (horizontal axis).
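The shifting and scaling of the tabulated curve can be sketched as below. The class mean of 26.1 and the class size of 20 are made-up inputs, and `shifted_scaled_curve` is a name introduced only for this example.

```python
import math

def gaussian(x, mean, sigma):
    """Normal probability density evaluated at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def shifted_scaled_curve(data_mean, n_samples, table_mean=25.0, sigma=5.0):
    """Shift the mean-25 table to the data mean and scale it to the number of samples."""
    points = []
    for x in range(0, 51):                      # x values of the attached table
        y = gaussian(x, table_mean, sigma)      # tabulated probability
        u = x + data_mean - table_mean          # shifted horizontal position
        v = n_samples * y                       # expected number of samples
        points.append((u, v))
    return points

# Hypothetical class mean and class size, for illustration only.
curve = shifted_scaled_curve(data_mean=26.1, n_samples=20)
peak_u, peak_v = max(curve, key=lambda point: point[1])
print(f"curve peaks at u = {peak_u:.1f} with v = {peak_v:.2f} samples")
```

The list `curve` holds the points to draw on top of the histogram of the class data.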

**15)** Tests of the binomial distribution with N=10 (using
$10 \times N_s$ samples).

**(a)** Repeat the same analysis as in the previous example
14(a), but now do it for the distribution of $s$
(mean should be about 2.5), using the $10 \times N_s$ samples
of ten washers each.

Mean of $s$: $\bar{s} = \frac{\sum_{i=0}^{10} i\, k_i}{\sum_{i=0}^{10} k_i}$, where $k_i$
is the number of groups of ten washers in the class having i silver
washers in them.

**(b)** The attached tables give the binomial distribution
for N=10, p = 0.25 and q = 0.75. They also give the Poisson distribution
for M = 2.5, and the Gaussian distribution with mean = 2.5 and
standard deviation = $\sqrt{2.5} \approx 1.58$.
Plot the binomial, Poisson and Gaussian distributions with a mean
of 2.5 and compare to the data.

Note that the probability distributions must be multiplied
by the number in the sample (about 200). In order to get an idea
of how well the distribution fits the data, you must plot the
data with error bars. For the measured distribution, the error
in each point on the distribution can be obtained by assuming
that the error on k is $\sqrt{k}$,
where k is the number of samples with that value of $s$. This makes the assumption that counting
experiments are Poisson distributed and have a typical error of
$\sqrt{k}$.

Multiply the y values of the binomial, Poisson, and Gaussian data
by the number of samples, $10 \times N_s$, and plot them on the graph
of experimental points ($k_i$
vs. i).
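The comparison between the measured counts and the scaled binomial prediction can be sketched as follows. The class tally below is invented for illustration (it sums to 200 groups); replace it with the class totals from Section B.

```python
import math

def binomial_pmf(x, n, p):
    """B(x) = N!/(x!(N-x)!) * p^x * q^(N-x)."""
    return math.comb(n, x) * p**x * (1.0 - p)**(n - x)

# Hypothetical class tally k_i for i = 0..10 silver washers per group of ten.
tally = [12, 36, 58, 50, 28, 12, 3, 1, 0, 0, 0]
n_groups = sum(tally)

rows = []
for i, k in enumerate(tally):
    expected = n_groups * binomial_pmf(i, 10, 0.25)  # probability scaled to the sample
    error = math.sqrt(k)                             # counting error on k (zero if k = 0)
    within = abs(k - expected) <= error              # does the error bar reach the curve?
    rows.append((i, k, error, expected, within))
    print(f"i = {i:2d}  k = {k:3d} +/- {error:4.1f}  expected = {expected:6.2f}  "
          f"{'within' if within else 'outside'} one error bar")
```

If the model describes the data, roughly two thirds of the well-populated points should lie within one error bar of the prediction. The √k rule gives a zero error bar for empty bins, so those points need separate judgment.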

**16)** Have your **plots** and **data sheet** signed
by the TA. These should be handed in as part of your lab report.

**Lab Homework** (Due one week after the lab)

Finish a complete lab report for this experiment. Follow the example given in the file Writing a Lab Report. In addition, hand in the following Lab homework:

**1)** Read the file on Error Analysis and learn to do combination
of errors either by differentiation or by using the error table
at the end of the file. Learn the difference between statistical
and systematic errors.

**2)** For $g = 2s/t^{2}$, what is the contribution
to the error in g ($\Delta g_s$)
from an error in s ($\Delta s$), and what is the contribution
to the error in g ($\Delta g_t$)
from an error in t ($\Delta t$)? What is the total error
in g ($\Delta g_{\text{total}}$)?

**3)** The error $\Delta t$ in a single measurement
of time for a falling body is 2 seconds. Four measurements of
the time are performed and averaged. What is the error of the
mean, the average of the __four__ times, (a) if $\Delta t$ is
a random error, and (b) if $\Delta t$ is a systematic (e.g. scale) error?
(Hint: You should get real values (in seconds), and (a) is less
than (b).)

__Optional__

There is another way of calculating the margin of error for the presidential poll result described at the beginning of this lab. We choose the case in which the total number of people sampled is 900, and there is a probability of p = 0.5 of voting Democratic and q = 0.5 of voting Republican.

If the sampling number N (900 in our example) is not fixed,
but is chosen randomly, then one can say that the number x of people who say they will vote Democratic
and the number y who say they will vote Republican are independent variables
and are randomly distributed. Therefore, x
and y are two independent
measurements and each is Poisson distributed with standard errors
$\sqrt{x}$ and $\sqrt{y}$,
respectively. The fraction of people voting Democratic is

$F = \frac{x}{x + y}$.

By taking the derivative of F with respect to x
and with respect to y,
and by adding the errors in F from x
and y in quadrature (i.e.,
using the standard rules for the addition of independent errors),
one finds that the standard error in F is equal to $\sqrt{F(1 - F)/(x + y)}$. The details are left as an exercise for
the student.
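The quadrature sum can also be checked numerically without doing the algebra. The sketch below assumes F = x/(x + y) with Poisson errors √x and √y, estimates the two partial derivatives by finite differences, and uses the illustrative counts x = y = 450; the function names are invented for this example.

```python
import math

def fraction(x, y):
    """Fraction F of the sample counted in the first group."""
    return x / (x + y)

def propagated_error(x, y, h=1e-4):
    """Combine the Poisson errors sqrt(x) and sqrt(y) in quadrature using numerical derivatives."""
    dF_dx = (fraction(x + h, y) - fraction(x - h, y)) / (2.0 * h)
    dF_dy = (fraction(x, y + h) - fraction(x, y - h)) / (2.0 * h)
    return math.sqrt((dF_dx * math.sqrt(x)) ** 2 + (dF_dy * math.sqrt(y)) ** 2)

x, y = 450, 450   # illustrative counts for p = q = 0.5 and N = 900
sigma_F = propagated_error(x, y)
print(f"F = {fraction(x, y):.3f} +/- {sigma_F:.4f}")
print(f"binomial margin of error sqrt(Npq)/N = {math.sqrt(900 * 0.5 * 0.5) / 900:.4f}")
```

Both lines should report a margin of error of about 1.7%, in agreement with the poll example at the beginning of the lab.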

**References**

**1.** Murray R. Spiegel, "Statistics", Schaum's Outline Series, McGraw-Hill Book Company.

**2.** William Lichten, "Data and Error Analysis in the Introductory Physics Laboratory", Allyn and Bacon Inc., Newton, MA, 1988.

**3.** See also references in file: Error Analysis.

**DATA SHEET FOR RECORDING CLASS SAMPLE**

(record number of combinations for each student in the class)

Combination: (Silver-Color/Yellow-Color)

0/10, 1/9, 2/8, 3/7, 4/6, 5/5, 6/4, 7/3, 8/2, 9/1, 10/0, (Σ = 10), Silver

Student 1: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 2: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 3: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 4: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 5: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 6: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 7: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 8: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 9: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 10: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 11: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 12: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 13: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 14: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 15: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 16: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 17: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 18: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 19: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 20: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 21: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 22: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 23: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 24: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 25: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

Student 26: ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

TOTAL : ____, ____, ____, ____, ____, ____, ____, ____, ____, ____, ____: ____,: ____

* Check that the sum total is $10 \times N_s$, where $N_s$
is the number of students.

* Copy the results of the Total to Section B.

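__Binomial Distribution__:

$B(x) = \frac{N!}{x!\,(N-x)!}\, p^{x} q^{N-x}$

N=10, p=0.25, q=0.75,

$\sigma = \sqrt{Npq} \approx 1.37$

__Poisson Distribution__:

$P(x) = \frac{M^{x} e^{-M}}{x!}$

M=2.5,

$\sigma = \sqrt{M} \approx 1.58$

__Gaussian Distribution__:

$G(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-z^{2}/2}$

where

$z = \frac{x - M}{\sigma}$

M=2.5,

$\sigma = \sqrt{M} \approx 1.58$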
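The three formulas can be evaluated directly, which is a useful check on the tabulated data points. The sketch below assumes N = 10, p = 0.25 for the binomial, M = 2.5 for the Poisson, and mean 2.5 with σ = √2.5 for the Gaussian; the function names are chosen for this illustration.

```python
import math

N, P = 10, 0.25          # binomial parameters
M = 2.5                  # Poisson and Gaussian mean
SIGMA = math.sqrt(M)     # Gaussian standard deviation, about 1.58

def binomial(x):
    return math.comb(N, x) * P**x * (1.0 - P)**(N - x)

def poisson(x):
    return M**x * math.exp(-M) / math.factorial(x)

def gaussian(x):
    z = (x - M) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))

table = [(x, binomial(x), poisson(x), gaussian(x)) for x in range(11)]

print(" x  binomial   Poisson    Gaussian")
for x, b, p, g in table:
    print(f"{x:2d}  {b:.6f}   {p:.6f}   {g:.6f}")
```

Multiplying each column by the number of groups in the class gives the expected counts to plot against the data.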

(Graph to be handed out in lab)

Graph of binomial, Poisson, and Gaussian Distributions with mean=2.5

**Data Points**

| x | binomial | Poisson | Gaussian |
|---|---|---|---|
| 0 | 0.056314 | 0.082085 | 0.072289 |
| 1 | 0.187712 | 0.205212 | 0.160882 |
| 2 | 0.281568 | 0.256516 | 0.240008 |
| 3 | 0.250282 | 0.213763 | 0.240008 |
| 4 | 0.145998 | 0.133602 | 0.160882 |
| 5 | 0.058399 | 0.066801 | 0.072289 |
| 6 | 0.016222 | 0.027834 | 0.021773 |
| 7 | 0.003090 | 0.009941 | 0.004396 |
| 8 | 0.000386 | 0.003106 | 0.000595 |
| 9 | 0.000029 | 0.000863 | 0.000054 |
| 10 | 0.000001 | 0.000216 | 0.000003 |

**Gaussian Distribution (mean = 25, dx = 1, $\sigma = 5$)**

| x | y | x | y |
|---|---|---|---|
| 0 | 0.000000 | 26 | 0.078209 |
| 1 | 0.000001 | 27 | 0.073654 |
| 2 | 0.000002 | 28 | 0.066645 |
| 3 | 0.000005 | 29 | 0.057938 |
| 4 | 0.000012 | 30 | 0.048394 |
| 5 | 0.000027 | 31 | 0.038837 |
| 6 | 0.000058 | 32 | 0.029945 |
| 7 | 0.000122 | 33 | 0.022184 |
| 8 | 0.000246 | 34 | 0.015790 |
| 9 | 0.000477 | 35 | 0.010798 |
| 10 | 0.000886 | 36 | 0.007095 |
| 11 | 0.001583 | 37 | 0.004479 |
| 12 | 0.002717 | 38 | 0.002717 |
| 13 | 0.004479 | 39 | 0.001583 |
| 14 | 0.007095 | 40 | 0.000886 |
| 15 | 0.010798 | 41 | 0.000477 |
| 16 | 0.015790 | 42 | 0.000246 |
| 17 | 0.022184 | 43 | 0.000122 |
| 18 | 0.029945 | 44 | 0.000058 |
| 19 | 0.038837 | 45 | 0.000027 |
| 20 | 0.048394 | 46 | 0.000012 |
| 21 | 0.057938 | 47 | 0.000005 |
| 22 | 0.066645 | 48 | 0.000002 |
| 23 | 0.073654 | 49 | 0.000001 |
| 24 | 0.078209 | 50 | 0.000000 |
| 25 | 0.079788 | | |