Unit 4 Homework Solutions

Chapter 8, pages 134-139

1. For (a), the averages of the IQ's, for both husbands and wives, are too low, nearer to 50 than to 100. For (b), the SD's are too small, since the deviations average nearer to 5 than to 15. For (c), the averages are too low, the SD's are too high and the correlation coefficient is too high, closer to 0.8 or 0.9 than to 0.6. So (d) must be the correct scatter diagram.

3. 1, because all the data points would lie exactly on the line y = 0.92x.
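
A quick numerical check of this claim, as a sketch only: the x-values below are hypothetical (they are not taken from the textbook), and the helper `correlation` is written here just for illustration. It computes r as the average product of the two variables in standard units.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient: average product of standard units."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
               for x, y in zip(xs, ys)) / n

# Hypothetical x-values; every y is exactly 0.92 times its x.
xs = [60, 64, 68, 70, 75]
ys = [0.92 * x for x in xs]

r = correlation(xs, ys)
print(round(r, 6))
```

Any other set of x-values that are not all equal should give the same result, since the points still fall exactly on a line with positive slope.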

5. (i) 0.60, (ii) 0.30 and (iii) 0.95. The correlation between length and weight of two-by-four boards is likely to be very high, since they are (probably) of the same material, and the width and thickness do not change. The correlation between GPA's for the first and second years is likely to be higher than that between the first and fourth years, since they are closer in time, so there is less change in the student. All three correlations are positive.

8. (a) 42 inches, the middle of the x-values of the data points. (b) 2.5 inches, roughly the average deviation in the y-values of the data points from their average. (c) 0.80, because the data follow a line roughly as closely as the data in the middle left diagram on page 127. (d) The solid line; it goes through the center of the data.

9. For (c), the correlation coefficient is +1, because all the points lie on the line y=2x. For the others, computation is needed: my spreadsheet tells me the answers are (a) -0.8 and (b) 0.3.

10. The predictions at 75 are likely to be more accurate, because there is less variability in the y-values for that x-value than at 125.

11. -1: All the points lie on the line y=10-x, which has a negative slope.
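
The point that only the sign of the slope matters, not its size, can be sketched as follows. The x-values 0 through 10 and the helper names `correlation` and `r_for_line` are assumptions made for this illustration; only the line y=10-x comes from the exercise.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient: average product of standard units."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
               for x, y in zip(xs, ys)) / n

def r_for_line(slope, intercept):
    """Correlation for points lying exactly on y = slope*x + intercept."""
    xs = list(range(11))  # hypothetical x-values 0, 1, ..., 10
    ys = [slope * x + intercept for x in xs]
    return correlation(xs, ys)

r_down = r_for_line(-1, 10)   # the line y = 10 - x from this exercise
r_steep = r_for_line(-5, 10)  # a steeper downward line, for comparison
r_up = r_for_line(2, 0)       # an upward line, y = 2x

print(round(r_down, 6), round(r_steep, 6), round(r_up, 6))
```

Both downward lines should give -1 and the upward line +1, however steep they are.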

12. (a) The three students who all got 91 on the first count and 82 on the second probably worked together, since their counts are equal and far from the correct counts. (The three who got 85 both times were probably just good point counters.)
12. (b) False: There does not seem to be an upward tendency in the data points. (In fact, there is a slight downward tendency: the correlation coefficient is -0.28.)

Chapter 9, pages 153-157

2. (a) False: With a strong negative correlation, below-average x-values correspond mostly to above-average y-values.
2. (b) False, or at least not true in general: The condition described just means that the data points are all below and to the right of the diagonal line y=x, but the tendency of the points may be either upward or downward.
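
To make 2(b) concrete, here is a small sketch with two made-up datasets. In both, every point lies below the line y=x, yet one trends upward and the other downward. The numbers and the helper `correlation` are hypothetical and chosen only for illustration.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient: average product of standard units."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
               for x, y in zip(xs, ys)) / n

# Hypothetical data: every y is smaller than its x in both datasets.
xs = [10, 11, 12, 13, 14]
ys_up = [1, 3, 2, 5, 4]     # tends to rise with x
ys_down = [5, 3, 4, 1, 2]   # tends to fall with x

r_up = correlation(xs, ys_up)
r_down = correlation(xs, ys_down)
print(round(r_up, 2), round(r_down, 2))
```

The first correlation comes out positive and the second negative, so knowing that the points sit below the diagonal says nothing about the sign of r.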

3. (a) The correlation between heights at age 16 and age 18 should be higher than the correlation between heights at age 4 and age 18 -- a tall preschooler need not be a tall adult, but a tall teen is likelier to be a tall adult.
3. (b) Since height is largely genetically determined and weight is strongly influenced by lifestyle, heights are more likely to correlate closely at different ages than are weights.
3. (c) Here is the answer in the instructor's manual: Height and weight are likelier to correlate better at 4 than at 18, because very light and very heavy preschoolers are rare, but weight varies greatly for adults of the same height. Prof. Tucker points out, however, that the correlation may in fact be higher at age 18, because there are two distinct populations, male and female, at age 18 that do not appear at age 4.

4. The correlation will be higher if two very separate data "clouds" are combined into one dataset; see Exercise Set B, page 145, where the regression lines are the same for different data subsets. In this case, it may appear less clear since the data sets are not very separate, but it is still true that correlation for the combined group would be somewhat higher.
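
The effect of combining two separate clouds can be sketched with made-up numbers. The two clouds below are hypothetical: each has zero correlation on its own, and the second is the first shifted up and to the right, so that the clouds are offset along a rising diagonal. The helper `correlation` is written only for this illustration.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient: average product of standard units."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
               for x, y in zip(xs, ys)) / n

# Hypothetical cloud A: four points around (0, 0) with no trend.
cloud_a = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
# Hypothetical cloud B: the same shape, shifted to sit around (10, 10).
cloud_b = [(x + 10, y + 10) for x, y in cloud_a]

def r_of(points):
    """Correlation for a list of (x, y) pairs."""
    return correlation([p[0] for p in points], [p[1] for p in points])

r_a = r_of(cloud_a)
r_b = r_of(cloud_b)
r_combined = r_of(cloud_a + cloud_b)
print(round(r_combined, 3))
```

Within each cloud the correlation is 0, but the combined dataset has a correlation above 0.9, because most of the variation is now between the clouds rather than within them. If the second cloud were shifted only sideways, the combined correlation would stay at 0, so the direction of the offset matters.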

5. (a) Since the first three points lie on the line y=2x-1, if we fill in the blank with 2(4)-1=7, the correlation coefficient will be 1.
5. (b) Since the three given points are already not on the same line, there is no way to fill in the blank to get a correlation coefficient of 1.
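
Both parts can be checked numerically. This is a sketch under assumptions: the x-values 1, 2, 3 for the given points in (a) and the three non-collinear points used for (b) are hypothetical, not copied from the textbook; only the line y=2x-1 and the blank at x=4 come from the answer above.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient: average product of standard units."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
               for x, y in zip(xs, ys)) / n

xs = [1, 2, 3, 4]  # assumed x-values; the blank is the y-value at x = 4

# (a) The first three points lie on y = 2x - 1, so fill the blank with 7.
r_on_line = correlation(xs, [1, 3, 5, 7])
r_off_line = correlation(xs, [1, 3, 5, 8])  # any other value falls short of 1

# (b) Hypothetical non-collinear points (1,1), (2,3), (3,4):
# try many candidate values for the blank and keep the best correlation.
candidates = [k / 2 for k in range(-40, 41)]  # -20.0, -19.5, ..., 20.0
best_r = max(correlation(xs, [1, 3, 4, y]) for y in candidates)

print(round(r_on_line, 6), round(r_off_line, 4), round(best_r, 4))
```

In (b) the best correlation found stays below 1, as expected: no fourth point can put three non-collinear points onto a single line.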

8. False: Again, different ages means different people. The older women in the study grew up at a time when formal education was considered less necessary and usually ended earlier than it does now, so they did not have the same opportunity.

10. (a) True: The smaller the percentage of students who take the test, the more likely it is that they are among their school's better students. So a state in which a high percentage of students takes the test will have more middle-level students taking it, lowering the state's average score.
10. (b) False: If the explanation in (a) is correct, it is probable that in Iowa only the best students are taking the test, while in Connecticut a broader range of students are taking it.


Revised: September 11, 2007. Questions to: dlantz@mail.colgate.edu
Copyright 2007 © Colgate University. All rights reserved.