General Instructions for Simulations using Spreadsheets

These instructions are general. They describe how to create random numbers for use in simulations, how to organize the spreadsheet and some about how to process the output of the simulation. The functions discussed here are: RAND, TRUNC, IF, AND, OR, and FREQUENCY.


Random Numbers

There are at least three ways to create random numbers with Excel.
  1. The RAND() function returns a random number between zero and one. You may notice that the numbers in any cell with the RAND function change each time you do anything in the spreadsheet. This is because the spreadsheet is "recalculating" automatically each time you do something. You can turn this off if you wish from the Preferences page.

    What if we don't want zero to one? If you want numbers between zero and five, multiply by five: =5*RAND(). If you want numbers between 3 and 8, use =3+5*RAND(). If you want only INTEGERS, you need to round off the decimals somehow. The function to do this is TRUNC(). TRUNC() truncates all decimals from a number. So, TRUNC(2*RAND()) returns zero half the time, and one the other half of the time [Note: this is the same result as IF(RAND()<0.5,0,1)].

    If you want an equal chance for the numbers 1, 2 or 3, use =1+TRUNC(3*RAND()).

  2. The RANDBETWEEN(lo,hi) function returns a random integer between lo and hi. (for example, RANDBETWEEN(1,5) returns an integer between 1 and 5. As with RAND() this function recalculates each time you do anything in the spreadsheet. This is probably the one you will want to use the most.
  3. In the Data Analysis Toolpak (under the Tools menu--see the "histogram" instructions if it does not appear in the Tools menu) there is a tool called "Random Number Generation". You can specify how many cells you want, what distribution they should follow, and many other features of the data. It is more than we need, but you might try it out.

Logic Operators

You can interpret the answers you get from a simulation by using logical operators. For example, to see if cells A3 and B3 are the same, use =IF(A3=B3,1,0). The first argument to IF() is an expression to be evaluated as TRUE or FALSE. The second argument is the number to return if TRUE and the third is the number to return if FALSE. These do not have to be numbers; they can also be expressions themselves: =IF(A3=B3,IF(A3=1,1,A1),RAND()) would give the value 1 if the entries in A3 and B3 are both 1, would give the value in A1 if A3 and B3 were equal but not 1, and would give an entirely random number if the entries in A3 and B3 were unequal.

Sometimes you will want to use AND() and OR() to build up your expression for IF(). For example, =IF(AND(A3=B3,C3>2000),1,0). The function AND() returns TRUE is both expressions are TRUE. The function OR() returns TRUE is either or both of the expressions are TRUE.

Counting and Frequency

If you want to know how many cells have values greater or less than some cutoff values, you can use the FREQUENCY() function. For example, if you want to know how many simulations ended up less than -100 and greater than zero and greater than 100, then FREQUENCY() is the function for you!

To use FREQUENCY(), you need to have a range of data cells and a range of cutoffs which are the borderlines for the "bins" we wish to sort into. The list of cutoffs is called the "bin array". The data array and the bin array must be entered somewhere. For the example above the bin array would be three cells with the values -100,0,100 respectively. Notice that the number of bins is actually one more than the number of cells in the "bin array" because the numbers are really cutoffs between bins. The extra bin is labelled MORE, because it includes the numbers with values more than the largest bin cutoff (in this case, 100).

The command looks like =FREQUENCY(A101:G101,A102:C102) where A101:G101 is the data array and A102:C102 is the bin array. But don't use the "function wizard" (the button with the fx symbol on it), or the "Insert-->Function" menu item! For some reason they don't work right. The way to get the desired output is to select (highlight) the cells into which the output is to go (one more than the bin array, remember), type the command as above, and then type Control-Shift-Enter (i.e., hold down the control and shift keys and type enter), not just the Enter key.

The output of the FREQUENCY() function shows the number of cells between the previous bin value and the current bin value. For example, with the bin array -100,0,100, the first output cell contains the number of cells in the data range with value less than (or equal to) -100; the next cell contains the number between -100 and 0; the next between 0 and 100; and the last cell contains the number of cells in the data array that have a value greater than the last cutoff, 100.

Organization of simulations

Try to organize your spreadsheet so that you can follow what you have done even if you look at it a year from now. Keep all cells corresponding to a single trial in either a column or a row. Make summary statistics appear above, below or to the right of the corresponding trial with labels for what value that is (for example, the average or SD).

Use one spreadsheet as a template for other similar simulations. You can copy it, change a few things and read off the new results. This is especially helpful when doing the same experiment with 10, then 40, then 160 flips of a coin. All you need to do is copy the first 10 rows and recalculate.


Revised: January 7, 2000. Questions to: dlantz@mail.colgate.edu
Copyright 2000 © Colgate University. All rights reserved.