Unit 10: Hypothesis Tests

Text reading and homework:

Read chapter 26 of FPP and do the following review exercises:
Chapter 26 (pages 497-501): 2, 3, 4, 5, 7

Reading:

"Who'll Stop the Rain?," by Sharon Begley, Newsweek, "On Science" column, August 4, 2008, page 56.
Available via LexisNexis Academic by searching for "Begley stop the rain".

Possible Essay Questions:

What is a "hydrometeor"?
An article in the Irish Times shortly after the Olympics began reported that there was no rain on Beijing during the Friday opening ceremonies, but there was rain on Sunday. Does that mean Begley is wrong?

Note:

The Final Project Progress Report is due with this unit assignment. See the final project guidelines for details.

Computer project:

Suppose we want to test whether a particular coin is fair by tossing it 100 times and then using a test of significance. (The null hypothesis will be that the coin is fair.)

Preliminary writeup. Find the number of heads that corresponds to the cutoff for rejecting the null hypothesis of a fair coin using a two-tailed test at significance levels of 0.10, 0.05 and 0.01. [For example, to get a significance level of 0.10 in a two-tailed test, the desired z-value corresponds to an area of 100 - 2(5) = 90 percent, which the normal table says is z = 1.65. Since the null hypothesis implies an EV of 50 and an SE of 5, we should reject a result outside of the interval 50 plus or minus 1.65(5), i.e., from 41.8 to 58.3; these are the values that should go in the "bin array".]
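
As a cross-check on the hand calculation, the cutoffs can also be computed in a few lines of Python. This is only a sketch and not part of the assignment, which asks for the normal table: the names `N_TOSSES`, `P_NULL` and `cutoffs` are made up for illustration. It uses the exact normal quantile rather than the rounded table value, so the 0.10 cutoffs may differ from 41.8 and 58.3 in the last digit.

```python
from statistics import NormalDist

N_TOSSES = 100   # tosses per test (from the assignment)
P_NULL = 0.5     # null hypothesis: the coin is fair

ev = N_TOSSES * P_NULL                          # expected number of heads (50)
se = (N_TOSSES * P_NULL * (1 - P_NULL)) ** 0.5  # standard error of the count (5)

def cutoffs(alpha):
    """Return (low, high): reject the null if the head count falls outside."""
    # Two-tailed test: put alpha/2 in each tail of the normal curve.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ev - z * se, ev + z * se

for alpha in (0.10, 0.05, 0.01):
    low, high = cutoffs(alpha)
    print(f"significance level {alpha:.2f}: reject outside {low:.1f} to {high:.1f}")
```

The two numbers printed for each level play the role of the "bin array" values.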

What percentage of the tests should fail to get the correct result if the coin is actually fair when the significance level is 0.10, 0.05 and 0.01? Now a much more difficult question: Can we tell what percentage of the tests should fail to get the correct result when the coin is actually unfair? Explain.

Simulations: For each problem below, use a spreadsheet to implement 50 simulations of 1) tossing a coin 100 times and 2) applying a test of significance. For each problem, report the percentage of simulations where the test of significance failed to reach the correct decision. The letter p denotes the probability of a head on any given toss of the coin.

Use a fair coin (p=0.5) and a significance level of 0.1.
Use a fair coin (p=0.5) and a significance level of 0.05.
Use a fair coin (p=0.5) and a significance level of 0.01.

Does this agree with your preliminary answers?
Now we use the tests on an unfair coin. Note that the test of significance DOESN'T give the chance of correctly rejecting the coin when the null hypothesis is not true! The simulations give us an idea of this chance for this particular setting. Repeat the above simulations, only now for an unfair coin:

Use a coin with p=0.55 and a significance level of 0.05.
Use a coin with p=0.6 and a significance level of 0.05.
Use a coin with p=0.65 and a significance level of 0.05.
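
For comparison with the simulated percentages, the chance that a single test fails to reject can be computed directly from the binomial distribution once a value of p is assumed. The sketch below is illustrative only: the function name `chance_not_rejected` is hypothetical, and the cutoffs 40.2 and 59.8 assume z = 1.96 for the 0.05 level, which you should confirm against your own preliminary writeup.

```python
from math import comb

def chance_not_rejected(p, low, high, n=100):
    """Chance that the number of heads in n tosses lands inside [low, high]."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)   # binomial chance of exactly k heads
        for k in range(n + 1)
        if low <= k <= high
    )

# Assumed cutoffs for the 0.05 level: 50 plus or minus 1.96(5).
LOW, HIGH = 40.2, 59.8

for p in (0.55, 0.60, 0.65):
    miss = chance_not_rejected(p, LOW, HIGH)
    print(f"p={p:.2f}: chance the test fails to reject = {100 * miss:.1f} percent")
```

With only 50 simulations, the spreadsheet percentages can be expected to wander around these values rather than match them exactly.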

Simulation Tips:

Use each column to simulate 100 flips of a coin. We wish to count the number of heads, so let 1 be a head and 0 be a tail [e.g., for the first unfair coin, with p=0.55, use =IF(RAND()<.55,1,0)]. At the bottom of each column use the SUM command to count the number of heads. You can then see in how many of the 50 simulations (sets of 100 flips) the null hypothesis is rejected by using FREQUENCY or HISTOGRAM to count how many simulations gave too many or too few heads: Construct the "bin array" with the values for the lower and upper acceptable range (calculated by hand using the normal distribution). The bin array should contain just two numbers, so the FREQUENCY command should produce 3 counts.

The FREQUENCY or HISTOGRAM function then can give the number of simulations "below the lower level" (i.e., unfair, too few heads), "in the acceptable range" (fair, about the right number of heads) and "above the higher level" (unfair, too many heads). In the first three questions, the coin is fair, so any simulation with a count of heads outside the acceptable range, so that the null hypothesis is rejected, got the wrong answer. In the last three questions, the coin is unfair, so any simulation with a count of heads inside the acceptable range, i.e., in which the null hypothesis is not rejected, got the wrong answer.
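
The spreadsheet layout described above can be mirrored in Python as a sanity check on your workbook. This is a rough sketch under stated assumptions, not a substitute for the spreadsheet: `count_heads` and `tally` are hypothetical names, and the three counts follow the way FREQUENCY treats bin values as inclusive upper limits.

```python
import random

def count_heads(p, n_tosses=100, rng=random):
    """One column of the spreadsheet: sum of n_tosses cells of IF(RAND()<p,1,0)."""
    return sum(1 if rng.random() < p else 0 for _ in range(n_tosses))

def tally(p, low, high, n_sims=50, seed=None):
    """Return (below, inside, above) counts over n_sims simulated tests."""
    rng = random.Random(seed)   # pass a seed to make a run repeatable
    below = inside = above = 0
    for _ in range(n_sims):
        heads = count_heads(p, rng=rng)
        if heads <= low:        # first FREQUENCY bin: values up to the lower cutoff
            below += 1
        elif heads <= high:     # second bin: the acceptable range
            inside += 1
        else:                   # third count: values above the upper cutoff
            above += 1
    return below, inside, above

# Fair coin at the 0.10 level, using the cutoffs from the worked example.
below, inside, above = tally(0.5, 41.8, 58.3)
wrong = below + above           # fair coin: rejecting is the wrong decision
print(f"below={below}, inside={inside}, above={above}, "
      f"wrong decisions = {100 * wrong / 50:.0f} percent")
```

For the unfair-coin questions the wrong decisions are the `inside` count instead, since failing to reject is then the error.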