Using Excel 3: General Instructions for Simulations using Excel


Overview

These instructions are general. They describe how to create random numbers for use in simulations, how to process the output of the simulation, and how to organize the spreadsheet. Several functions are discussed under the following topics:
 

Random numbers

There are at least three ways to create random numbers with Excel.
  1. The RAND() function returns a random number between zero and one. You may notice that the numbers in any cell with the RAND function change each time you do anything in the spreadsheet. This is because the spreadsheet is "recalculating" automatically each time you do something. You can turn this off if you wish from the Preferences page.
    1.  
      What if we don't want zero to one? If you want numbers between zero and five, multiply by five: =5*RAND(). If you want numbers between 3 and 8, use =3+5*RAND(). If you want only INTEGERS, you need to round off the decimals somehow. The function to do this is TRUNC(). TRUNC() truncates all decimals from a number. So, TRUNC(2*RAND()) returns zero half the time, and one the other half of the time [Note: this is the same result as IF(RAND()<0.5,0,1)].

      If you want an equal chance for the numbers 1, 2 or 3, use =1+TRUNC(3*RAND()).
       

  2. The RANDBETWEEN(lo,hi) function returns a random integer between lo and hi. [For example, RANDBETWEEN(1,5) returns an integer between 1 and 5.] As with RAND() this function recalculates each time you do anything in the spreadsheet. This is probably the one you will want to use the most.
  3. In the Data Analysis Toolpak [under the Tools menu--see the Excel instructions attached to Unit 3 if it does not appear in the Tools menu] there is a tool called "Random Number Generation". You can specify how many cells you want, what distribution they should follow, and many other features of the data. More than what we need, but you might try it out.

Logic operators

You can interpret the answers you get from a simulation by using logical operators. For example, to see if cells A3 and B3 are the same, use =IF(A3=B3,1,0). The first argument to IF() is an expression to be evaluated as TRUE or FALSE. The second argument is the number to return if TRUE and the third is the number to return if FALSE. These do not have to be numbers; they can also be expressions themselves: =IF(A3=B3,IF(A3=1,1,A1),RAND()) would give the value 1 if the entries in A3 and B3 are both 1, would give the value in A1 if A3 and B3 were equal but not 1, and would give an entirely random number if the entries in A3 and B3 were unequal.

Sometimes you will want to use AND() and OR() to build up your expression for IF(). For example, =IF(AND(A3=B3,C3>2000),1,0). The function AND() returns TRUE is both expressions are TRUE. The function OR() returns TRUE is either or both of the expressions are TRUE.
 

Counting and frequency

If you want to know how many cells have values greater or less than some cutoff values, you can use the FREQUENCY() function. For example, if you want to know how many simulations ended up less than -100 and greater than zero and greater than 100, then FREQUENCY() is the function for you!

To use FREQUENCY(), you need to have a range of data cells and a range of cutoffs which are the borderlines for the "bins" we wish to sort into. The list of cutoffs is called the "bin array". The data array and the bin array must be entered somewhere. For the example above the bin array would be three cells with the values -100,0,100 respectively. Notice that the number of bins is actually one more than the number of cells in the "bin array" because the numbers are really cutoffs between bins. The extra bin is labelled MORE, because it includes the numbers with values more than the largest bin cutoff [in this case, 100].

 You need to be aware that FREQUENCY is an "array functions". We have to have enough room to put all the output numbers, or they don't appear in the worksheet.  The easiest way to ensure this works, is to start by entering the function into the first cell you wish to put the values [as for a usual function]. One number will appear.

Warning: If you copy this function to a range of cells you will not get correct values!

So, starting with the cell that contains the output for the first interval, simply highlight the range into which you wish to put the list of results.  Click in the formula bar somewhere as if you were going to type something there, then press Control-Shift-Return [on PCs] or Command-Return [on Macs].

The output of the FREQUENCY() function shows the number of cells between the previous bin value and the current bin value. For example, with the bin array -100,0,100, the first output cell contains the number of cells in the data range with value less than [or equal to] -100; the next cell contains the number between -100 and 0; the next between 0 and 100; and the last cell contains the number of cells in the data array that have a value greater than the last cutoff, 100.
 

Organizing simulations

Try to organize your spreadsheet so that you can follow what you have done even if you look at it a year from now. Keep all cells corresponding to a single trial in either a column or a row. Make summary statistics appear above, below or to the right of the corresponding trial with labels for what value that is [for example, the average or SD].

Use one spreadsheet as a template for other similar simulations. You can copy it, change a few things and read off the new results. This is especially helpful when doing the same experiment with 10 then 40 then 160 flips of a coin. All you need to do is copy the first 10 rows and recalculate.


Last revised: February 2004. Mail to kvalente@mail.colgate.edu
Copyright 1999 © Colgate University. All rights reserved.