Excel Slope/Intercept Instructions
Excel has functions which calculate the SLOPE and INTERCEPT of a
regression line given a dataset. It also has a function, FORECAST,
which creates predicted values (using the "regression" line). These
instructions walk you through
the steps needed to do the assignment in Unit 5. Similar steps
will work to create predicted values for any dataset.
Overview
It is a good idea to set up your spreadsheet in some logical fashion.
While it is possible to create a nice graph from data with the
x-component data at the top of the worksheet and the
y-component 1000 cells below, it's not a good way to set things up.
Organization of a worksheet is an art, but any effort you make is
very helpful, especially when you have to come back to a spreadsheet
you created last year and update a report before your boss's
meeting at noon. (And who knows when your
boss will want to use your spreadsheet while you're gone?)
In addition to logical arrangement of data, headings and comments
are very helpful additions to any complex spreadsheet.
Here we set up two extra columns of data to produce straight
lines that pass through the original data. One line is the SD-line
and the other is the regression line. So, we will set up columns of
data with x-values next to y-values.
I suggest leaving some room at the top of the spreadsheet to
put comments, calculated values like the average, and titles.
The bottom is OK too, but harder for others to find if they borrow
your spreadsheet.
Make the SD line data
- Copy the data into the worksheet and move the columns (cut and
paste) so that the variables are lined up next to each other.
- Find the averages and standard deviations for each column
and the correlation coefficient;
place them in cells adjacent to labels so that you (and others)
know what they are.
- We now have all the information needed to create the SD-line.
put an appropriate label at the top of a new column and enter a
formula in the first row of data which uses the
SD-line formula to calculate the correct y-value for the
x-value
of that row. Each row corresponds to an x-value and its related
y-values (one for the data point, one on the SD-line that we are
now computing, and one on the regression line that we will compute
below). Remember to use dollar signs before the letter and number
of any cells that shouldn't change when you copy from one row to the next.
For example, the average and SDs are the same for each row. If they
appear in cell D4, then use $D$4 in place of D4.
The SD-line of interest will be downward sloping if the correlation
coefficient is negative (so you'll need to have the correlation
coefficient computed -- and, I hope, labelled -- somewhere on the
sheet). So, for example, use the function SIGN($F$6) to get a plus or
minus 1 depending on the sign of the cell F6.
- Check that your formula gives the correct answer. Then
copy the formula down the column to the other rows. Check to
make sure that the correct cell names (like A10) are getting
incremented while the ones for the average, SD, etc are not.
If necessary, add more dollar signs and copy again.
Make the regression line data
Now to add the column for the regression line: (Did you label it first?)
There are three ways to get the regression line.
- You can enter the formula for the regression line just
like you did for the SD-line. You have all the needed averages
and SDs and the correlation coefficient.
- You can have Excel calculate the slope and intercept for
you, putting them into special cells (don't forget to label them).
Then use the formula for a line in each row of the column.
The functions SLOPE and INTERCEPT give you the values you need.
They each require two data ranges as inputs, one for the x values
and one for the y values -- but for some reason Excel wants the
y-values first.
- You can use the FORECAST function. This function requires
three inputs:
the particular x value for which you want a regression result,
all the y values, and all the x values.
Remember to put dollar signs in the data ranges that you don't
want to change when you copy.
Whichever way you choose, copy the formula down the column and check that
it incremented (or left alone) the cell ranges correctly.
Make a scatter plot with SD and regression lines
You now have four columns of data! You can make a scatter plot
with lines for the SD-line and regression line.
- Highlight the four data columns (with their labels at the top)
and choose the "chart wizard" (a button at the top with a bar chart).
Choose the scatter plot format without lines connecting the points.
- The next screen asks you to verify that the four columns that
Excel thinks you want are really the ones you want. Select the
Series tab at the top and verify that the x and y
ranges are correct for each of the three data series
(a data series is a column of y values) All OK? Then move to
Next screen.
-
Put titles as you wish and/or remove the legend (i.e., the table
telling which symbol represents which series). There will be three
sets of
data points. Click FINISH to put the chart into your worksheet.
- By clicking on the emphasized points on the rim of the chart
and moving the mouse, you can change the size or position of the
chart.
- Initially, the axes will cross at (0,0), while all the data
points appear well above and to the right. To make the graph focus
on the
data points, change the scale of your plot by double-clicking on
the x- (or y-) axis and then choosing the "Scale" Tab
in the window
that appears. Choose appropriate values for the minimum and
maximum values of the x and y variables, near the minima
and maxima for the data points. You may also want to change the major
units to some smaller value, to add more grid lines to the graph.
- Make the second (SD line) points and third (predicted) data
points appear as lines: Double-click on one
of the SD-line points (respectively, "predicted" data points).
Then, under the "Pattern" Tab, set the line style to "automatic"
and the marker style to "none". Click OK, then click somewhere
off the chart (to unselect the data) and look at your chart.
Revised: January 7, 2000. Questions to:
dlantz@mail.colgate.edu
Copyright 2000 © Colgate University. All rights reserved.