1. For (a), the averages of the IQ's, for both husbands and wives, are too low, nearer to 50 than to 100. For (b), the SD's are too small, since the deviations average nearer to 5 than to 15. For (c), the averages are too low, the SD's are too high and the correlation coefficient is too high, closer to 0.8 or 0.9 than to 0.6. So (d) must be the correct scatter diagram.
3. 1, because all the data points would lie exactly on the line y = 0.92 x .
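A quick numerical check of this answer, as a minimal sketch: the x-values below are made up for illustration (they are not data from the exercise), and `pearson_r` is a small helper written here, not a library function.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical x-values; any set of distinct values would do.
xs = [60, 64, 68, 72, 76]
# Every point is placed exactly on the line y = 0.92 x.
ys = [0.92 * x for x in xs]

r = pearson_r(xs, ys)
print(round(r, 6))
```

The result is 1 up to floating-point rounding, whatever x-values are chosen, because the line has a positive slope.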
5. (i) 0.60, (ii) 0.30 and (iii) 0.95. The correlation between length and weight of two-by-four boards is likely to be very high, since they are (probably) of the same material, and the width and thickness do not change. The correlation between GPA's for the first and second years is likely to be higher than that between first and fourth, since they are closer in time, so there is less change in the student. All three correlations are positive.
8. (a) 42 inches, the middle of the x-values of the data points. (b) 2.5 inches, roughly the average deviation in the y-values of the data points from their average. (c) 0.80, because the data follow a line roughly as closely as the data in the middle left diagram on page 127. (d) The solid line; it goes through the center of the data.
9. For (c), the correlation coefficient is +1, because all the points lie on the line y=2x. For the others, computation is needed: My spreadsheet tells me the answers are (a) -0.8 and (b) 0.3.
10. The predictions at 75 are likely to be more accurate, because there is less variability in the y-values for that x-value than at 125.
11. -1: All the points lie on the line y=10-x, which has a negative slope.
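The same conclusion can be reached by converting to standard units and averaging the products. The sketch below assumes made-up x-values (the exercise's own table is not reproduced here), and `standard_units` is a helper defined only for this illustration.

```python
from math import sqrt

def standard_units(values):
    """Convert a list to standard units: (value - average) / SD."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

# Hypothetical x-values; the y-values are forced onto the line y = 10 - x.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [10 - x for x in xs]

zx = standard_units(xs)
zy = standard_units(ys)

# r is the average of the products of the standard units.
r = sum(a * b for a, b in zip(zx, zy)) / len(xs)
print(round(r, 6))
```

On a line with negative slope, each y in standard units is the negative of the matching x in standard units, so the products average to -1.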
12. (a) The three students who all got 91 on the first count and
82 on the second probably worked together, since their counts
are equal and far from the correct counts. (The three who got
85 both times were probably just good point counters.)
12. (b) False: There does not seem to be an upward
tendency in the data points. (In fact, there is a slight downward
tendency: the correlation coefficient is -0.28.)
2. (a) False: With a strong negative correlation, below-average
x-values correspond mostly to above-average
y-values.
2. (b) False, or at least not true in general: The condition
described just means that the data points are all below and to
the right of the diagonal line y=x, but the tendency
of the points may be either upward or downward.
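To make this concrete, here is a small sketch with two invented data sets (neither comes from the exercise). In both, every point lies below the line y=x, yet one slopes upward and the other downward. `pearson_r` is a helper written for this illustration.

```python
from math import sqrt

def pearson_r(points):
    """Correlation coefficient for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / sqrt(sxx * syy)

# Invented data: in both sets every y is smaller than its x.
upward = [(2, 1), (4, 2), (6, 3), (8, 5)]
downward = [(6, 5), (7, 3), (8, 2), (9, 1)]

r_up = pearson_r(upward)
r_down = pearson_r(downward)
print(r_up > 0, r_down < 0)
```

Both data sets satisfy the condition in the statement, but their correlation coefficients have opposite signs.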
3. (a) The correlation between heights at age 16 and age 18
should be higher than the correlation between heights at age 4
and age 18 -- a tall preschooler need not be a tall adult, but
a tall teen is likelier to be a tall adult.
3. (b) Since height is largely genetically determined and weight
is strongly influenced by lifestyle, heights are more likely to
correlate closely at different ages than are weights.
3. (c) Here is the answer in the instructor's manual: Height and
weight are likelier to correlate better at 4
than at 18, because very light and very heavy preschoolers are
rare, but weight varies greatly for adults of the same height.
Prof. Tucker points out, however, that the correlation may in
fact be higher at age 18,
because there are two distinct populations, male and female, at
age 18 that do not appear at age 4.
4. The correlation will be higher when two well-separated data "clouds" are combined into one dataset; see Exercise Set B, page 145, where the regression lines are the same for different data subsets. Here the effect may be less obvious, since the two data sets are not very separate, but the correlation for the combined group would still be somewhat higher.
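The effect of pooling two separate clouds can be seen in an extreme, invented example (these points are not from the textbook): each cloud alone has correlation 0, but the pooled data have a correlation close to 1. `pearson_r` is a helper defined for this sketch.

```python
from math import sqrt

def pearson_r(points):
    """Correlation coefficient for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / sqrt(sxx * syy)

# Two invented clouds, each a small square with no trend inside it.
cloud_a = [(0, 1), (1, 0), (0, 0), (1, 1)]
cloud_b = [(10, 11), (11, 10), (10, 10), (11, 11)]

r_a = pearson_r(cloud_a)
r_b = pearson_r(cloud_b)
r_pooled = pearson_r(cloud_a + cloud_b)
print(r_a, r_b, round(r_pooled, 3))
```

When the clouds overlap more, as they do in the exercise, the increase is smaller, but it goes in the same direction.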
5. (a) Since the first three points lie on the line y=2x-1,
if we fill in the blank with 2(4)-1=7, the correlation coefficient
will be 1.
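As a check, the sketch below assumes the x-values are 1, 2, 3, 4 (only the blank's x-value of 4 is implied by the answer above; the first three are an assumption) and compares filling the blank with 7 against filling it with 6. `pearson_r` is a helper written for this illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Assumed x-values; the first three y-values follow y = 2x - 1.
xs = [1, 2, 3, 4]
known_ys = [2 * x - 1 for x in xs[:3]]

r_with_7 = pearson_r(xs, known_ys + [7])  # blank filled with 2(4) - 1 = 7
r_with_6 = pearson_r(xs, known_ys + [6])  # any other value leaves the line
print(round(r_with_7, 6), r_with_6 < 1)
```

Only the value 7 keeps all four points on one line, so it is the only choice that gives a correlation of exactly 1.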
5. (b) Since the three given points are already not on the same line,
there is no way to fill in the blank to get a correlation coefficient
of 1.
8. False: Again, different ages means different people. The older women in the study grew up at a time when formal education was considered less necessary and did not last as long as it does now, so they did not have the same opportunity.
10. (a) True: The fewer students (in percentage) who take the test,
the more likely it is that they are among their schools' better students.
So a state in which a high percentage of students takes the test will
have more middle-level students taking it, lowering the state's
average score.
10. (b) False: If the explanation in (a) is correct, it is probable that
in Iowa only the best students are taking the test, while in Connecticut
a broader range of students is taking it.