Using Excel 3: General Instructions for Simulations using Excel
Overview
These instructions are general. They describe how to create random
numbers
for use in simulations, how to process the output of the simulation,
and
how to organize the spreadsheet. Several functions are discussed under
the following topics:
Random numbers
There are at least three ways to create random numbers with Excel.
-
The RAND() function returns a random number between zero and one. You
may
notice that the numbers in any cell with the RAND function change each
time you do anything in the spreadsheet. This is because the
spreadsheet
is "recalculating" automatically each time you do something. You can
turn
this off if you wish from the Preferences page.
What if we don't want zero to one? If you want numbers
between
zero and five, multiply by five: =5*RAND(). If you want numbers between
3 and 8, use =3+5*RAND(). If you want only INTEGERS, you need to round
off the decimals somehow. The function to do this is TRUNC(). TRUNC()
truncates
all decimals from a number. So, TRUNC(2*RAND()) returns zero half the
time,
and one the other half of the time [Note: this is the same result as
IF(RAND()<0.5,0,1)].
If you want an equal chance for the numbers 1, 2 or 3, use
=1+TRUNC(3*RAND()).
-
The RANDBETWEEN(lo,hi) function returns a random integer between lo and
hi. [For example, RANDBETWEEN(1,5) returns an integer between 1 and 5.]
As with RAND() this function recalculates each time you do anything in
the spreadsheet. This is probably the one you will want to use the most.
-
In the Data Analysis Toolpak [under the Tools menu--see the Excel
instructions
attached to Unit 3 if it does not appear in the Tools menu] there is a
tool called "Random Number Generation". You can specify how many cells
you want, what distribution they should follow, and many other features
of the data. More than what we need, but you might try it out.
Logic operators
You can interpret the answers you get from a simulation by using
logical
operators. For example, to see if cells A3 and B3 are the same, use
=IF(A3=B3,1,0).
The first argument to IF() is an expression to be evaluated as TRUE or
FALSE. The second argument is the number to return if TRUE and the
third
is the number to return if FALSE. These do not have to be numbers; they
can also be expressions themselves: =IF(A3=B3,IF(A3=1,1,A1),RAND())
would
give the value 1 if the entries in A3 and B3 are both 1, would give the
value in A1 if A3 and B3 were equal but not 1, and would give an
entirely
random number if the entries in A3 and B3 were unequal.
Sometimes you will want to use AND() and OR() to build up your
expression
for IF(). For example, =IF(AND(A3=B3,C3>2000),1,0). The function
AND()
returns TRUE is both expressions are TRUE. The function OR() returns
TRUE
is either or both of the expressions are TRUE.
Counting and frequency
If you want to know how many cells have values greater or less than
some
cutoff values, you can use the FREQUENCY() function. For example, if
you
want to know how many simulations ended up less than -100 and greater
than
zero and greater than 100, then FREQUENCY() is the function for you!
To use FREQUENCY(), you need to have a range of data cells and a
range
of cutoffs which are the borderlines for the "bins" we wish to sort
into.
The list of cutoffs is called the "bin array". The data array and the
bin
array must be entered somewhere. For the example above the bin array
would
be three cells with the values -100,0,100 respectively. Notice that the
number of bins is actually one more than the number of cells in the
"bin
array" because the numbers are really cutoffs between bins. The extra
bin
is labelled MORE, because it includes the numbers with values more than
the largest bin cutoff [in this case, 100].
You need to be aware that FREQUENCY is an "array
functions".
We have to have enough room to put all the output numbers, or they
don't
appear in the worksheet. The easiest way to ensure
this
works, is to start by entering the function into the first cell you
wish
to put the values [as for a usual function]. One number will appear.
Warning: If you copy this function to a range of cells you will
not
get correct values!
So, starting with the cell that contains the output for the first
interval, simply highlight the range into which you wish to put the
list of results. Click
in
the formula bar somewhere as if you were going to type something there,
then press Control-Shift-Return [on PCs] or
Command-Return
[on Macs].
The output of the FREQUENCY() function shows the number of cells
between
the previous bin value and the current bin value. For example, with the
bin array -100,0,100, the first output cell contains the number of
cells
in the data range with value less than [or equal to] -100; the next
cell
contains the number between -100 and 0; the next between 0 and 100; and
the last cell contains the number of cells in the data array that have
a value greater than the last cutoff, 100.
Organizing simulations
Try to organize your spreadsheet so that you can follow what you have
done
even if you look at it a year from now. Keep all cells corresponding to
a single trial in either a column or a row. Make summary statistics
appear
above, below or to the right of the corresponding trial with labels for
what value that is [for example, the average or SD].
Use one spreadsheet as a template for other similar simulations. You
can copy it, change a few things and read off the new results. This is
especially helpful when doing the same experiment with 10 then 40 then
160 flips of a coin. All you need to do is copy the first 10 rows and
recalculate.
Last revised: February 2004. Mail to kvalente@mail.colgate.edu
Copyright 1999 © Colgate University. All rights reserved.