Unit 3: Variation and the Normal Curve

Text reading and review exercises:

Review Chapters 5 and 6 of FPP and do the following review exercises:
Chapter 5 [pages 93-95]: 1, 2, 3, 8, 9, 11
Chapter 6 [pages 104-105]: 4, 5
Special Review Exercises [pages 105-108]: 9, 10, 13

Reading:

Reading between the (noble) lies, by Phil Brand, The Washington Times, September 7, 2008, Books section, M28.
For whom the bell curve tolls; students vary widely in academic ability, but we pretend everyone can be above average, by Margaret Wente, The Globe and Mail (Canada), September 6, 2008, Comment column, page A25.
Document source: LexisNexis.

Possible essay questions:

Computer project

This week we work with salary data for 2002 Major League Baseball players.

Preliminary Writeup
Before you do any computations with Excel, what percent of the players do you expect to have below-average salaries? And what percent of the players do you expect to have salaries below the median? Do you expect the data to be normally distributed?

Now, suppose the data is normally distributed. What percent of the players would have salaries below the 70th percentile? What value would the 70th percentile be for a normally distributed dataset with an average of $2.387 million and an SD of $3.067 million?
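
If you want to check your normal-table work for the question above, the calculation can be sketched in Python. This is only an illustration, not part of the assignment: it assumes the standard library's statistics.NormalDist in place of the normal table in the text, and the names AVERAGE, SD, z70 and p70 are made up for this sketch.

```python
from statistics import NormalDist

# Values taken from the question above, in millions of dollars.
AVERAGE = 2.387
SD = 3.067

# The normal curve with this average and SD (an idealization, not the real data).
salaries = NormalDist(mu=AVERAGE, sigma=SD)

# z-score with 70% of the area under the standard normal curve to its left.
z70 = NormalDist().inv_cdf(0.70)

# Convert from standard units back to dollars: value = average + z * SD.
p70 = AVERAGE + z70 * SD

print(f"z for the 70th percentile: {z70:.2f}")
print(f"70th percentile of the normal curve: ${p70:.3f} million")
```

The same two steps apply when you work from the table by hand: find the z that puts 70% of the area to its left, then convert from standard units to dollars.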

On the Computer
Copy the 2002 baseball salary data below into a spreadsheet program:
2002 Baseball salaries (salaries-only list): Move these into the spreadsheet.
Baseball salaries: More details from the source, if you are interested (but you don't need them).

  1. Create a histogram for the data using class intervals of length $1,000,000, up to $20,000,000.
  2. Compute the average, median, mode, 70th percentile and standard deviation of the salaries. (The Excel function for what we call SD is "stdevp" [the "p" stands for population]. The function "stdev" gives what we will later in the course call "SD+", the "sample standard deviation". The Excel "percentile" function takes two arguments: first the location of the list of numbers, and second the desired percent written as a decimal, .7 rather than 70.)
  3. Determine the percentage of players that have below-average salaries.
  4. Compare the actual data with your predictions. Is salary data for baseball players approximately normally distributed?
  5. Create an accurate one- or two-sentence summary of the salary data suitable for a newspaper article. When you state the value of a statistic, explain to your reader how it should be interpreted.
  6. Create a misleading summary of the data using the value of at least one TRUE statistic.
If you have trouble with the spreadsheet program, consult the supplement Using Excel 1: Excel Basics.
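
For readers who would like to cross-check their spreadsheet, steps 1 through 3 above can be sketched in Python. This is a rough sketch under stated assumptions, not the required method: the ten salaries below are invented for illustration and are NOT the 2002 data, the function names summarize and histogram_counts are hypothetical, and the "inclusive" quantile method is assumed to interpolate the same way as the Excel "percentile" function.

```python
import statistics

# Hypothetical salaries in dollars, made up for illustration (NOT the real 2002 data).
sample_salaries = [200_000, 200_000, 300_000, 450_000, 750_000,
                   1_200_000, 2_500_000, 4_000_000, 9_000_000, 15_000_000]


def summarize(salaries):
    """Return the statistics asked for in steps 2 and 3."""
    average = statistics.mean(salaries)
    # Cut points at 10%, 20%, ..., 90%; index 6 is the 70th percentile.
    deciles = statistics.quantiles(salaries, n=10, method="inclusive")
    below = sum(s < average for s in salaries)
    return {
        "average": average,
        "median": statistics.median(salaries),
        "mode": statistics.mode(salaries),
        "70th percentile": deciles[6],
        "SD": statistics.pstdev(salaries),   # plays the role of Excel's stdevp
        "SD+": statistics.stdev(salaries),   # plays the role of Excel's stdev
        "percent below average": 100 * below / len(salaries),
    }


def histogram_counts(salaries, width=1_000_000, top=20_000_000):
    """Count salaries in class intervals of the given width (step 1).

    Each interval includes its left endpoint. Assumption of this sketch:
    any salary at or above `top` is counted in the last interval.
    """
    counts = [0] * (top // width)
    for s in salaries:
        index = min(int(s // width), len(counts) - 1)
        counts[index] += 1
    return counts


summary = summarize(sample_salaries)
counts = histogram_counts(sample_salaries)

for name, value in summary.items():
    print(f"{name}: {value:,.1f}")
print("players per $1,000,000 class interval:", counts)
```

To try it on the real list, replace sample_salaries with the salaries you copied into the spreadsheet, then compare each printed value with the corresponding Excel result.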


Last revised: September 23, 2008. Mail to dlantz@mail.colgate.edu
Copyright 2008 © Colgate University. All rights reserved.