Example 1: Simulating rolls of a die

We begin with a simple programming problem:

Simulate one thousand rolls of a six-sided die, and report the frequencies of 1's, 2's, 3's, 4',s 5's, and 6's rolled.

In a procedural language such as Pascal or C, the code might look something like this:

program RollDie1;

var roll: integer;
    frequencyTable: array[1..6] of integer;            { Define our variables }

Procedure PrintResults
begin
  [ PRINTING ROUTINES HERE ]
end;

begin

  For x:=1 to 6 do
    frequencyTable[x]=0;                            { Clear out the array }
    
  For x:=1 to 1000 do                               { Loop 1000 times }
  begin
    roll:=RandomInteger[1,6];                         { Roll the die    }
    frequencyTable[roll]:=frequencyTable[roll]+1    { Add to the tally }
  end;
  
  PrintResults;                                     { Print the results}
  
end.

One way to code this in Mathematica is to convert the above code directly:

frequencyTable={0,0,0,0,0,0};
Do[
  roll=Random[Integer,{1,6}];
  Increment[frequencyTable[[roll]]],
{i, 1000}]
[Graphics:../Images/index_gr_1.gif]
[Graphics:../Images/index_gr_2.gif]

Even in this form, the Mathematica program is a bit simpler and more elegant. Most notably, it is not necessary to define the types and lengths of each variable. Furthermore, one now can take advantage of the many built-in Mathematica routines to display the data in various forms, do statistics on the results, etc. For example,

<<Graphics`Graphics`
[Graphics:../Images/index_gr_3.gif]

[Graphics:../Images/index_gr_4.gif]

[Graphics:../Images/index_gr_5.gif]

[Graphics:../Images/index_gr_6.gif]

[Graphics:../Images/index_gr_7.gif]
[Graphics:../Images/index_gr_8.gif]
[Graphics:../Images/index_gr_9.gif]
[Graphics:../Images/index_gr_10.gif]



However, we can do without the looping structure entirely, and instead write a functional program to perform the same task. For an example this simple, the advantages of this approach are less obvious than for some of the more complicated examples we will consider later.

[Graphics:../Images/index_gr_11.gif]
[Graphics:../Images/index_gr_12.gif]

Notice here that the entire program consists of a series of operations involving lists. The first line creates a table (simply a list) of 1000 random integers between 1 and 6 inclusive. The second line constructs a second list with six elements. The first element is the result of  this one the result of applying the Count function to the previous list with the argument 1 (counting all the 1's in the list of 1000 rolls). The second is number of 2's, etc. This gives us a our desired result: a list of  the number of occurences of 1's, 2's,...6's in the list of 1000 rolls.

Philosophical aside #1: Operations on lists

The very simple program that we wrote above illustrates a nice general strategy when writing simulations in Mathematica:

Store all of the data generated in the course of a simulation in a list.
Use operations on this list to summarize the results of the simulation in whatever manner is desired.


A simulation is really a type of experiment. The first step of coding a simulation in Mathematica is to write the functions which actually "execute" the experiment. In this case, this is a matter of generating 1000 consecutive random integers. The results of the experiment will generally be a series of events. In this case, each event is simply the number rolled on the die in each roll.  It is generally a good idea, when not prohibited by memory restrictions, to store all of the information about these events.

Philosophical aside #2: Generating and storing simulation data.

Why am I advocating that we save all the data? First, I've found it leads to a clean programming style, in which the simulation - basically, a series of stochastic events - is separated from the analysis: the results of these stochastic events on the system of interest, and statistical treatment of these results. Of more immediate practical importance, once you've seen the results of a simulation presented in one way, you often decide that you want to see the results in another form. If you've saved the data, you can do this without re-running the simulation. Of greater importance, the necessary modifications to the code are usually much simpler as well. Again an example will be useful for illustration.

Suppose that once we've seen the frequencies with which each number on the die is rolled, we become interested in a second question: what is the distribution of runs of the same number, i.e., how many times do we see the same number rolled twice in a row? Three times in a row? Four times? In the procedural version of our initial simulation, we didn't bother to keep track of the order of the rolls. Modifying the code to evaluate run lengths is a relatively complicated process. A typical solution might look something like this:

program RollDie2;

var rolls: array[1..1000] of integer;
    numbers: array[1..6] of integer;
    runs: array[1..1000] of integer;
    
Procedure CountRuns;
var loop,position, runlength: integer;
begin
  for loop:=1 to 1000 do
    runs[loop]:=0
  position:=1
  repeat
    CurrentValue:=rolls[position];
    runlength:=0;
    repeat
      position:=position+1
      runlength:=runlength+1
    until (position>1000) or (CurrentValue<>rolls[position])
    runs[runlength]:=runs[runlength]+1;
   until (position>1000)
end;

Procedure CountNumbers;
var loop1,loop2: integer
begin
  for loop1:=1 to 6 do
  begin
    numbers[loop1]:=0;
    for loop2:=1 to 1000 do
      if rolls[loop2]=loop1
        then numbers[loop1]:=numbers[loop1]+1;
  end;
end;
      
Procedure PrintResults;
begin
  [ PRINTING ROUTINES HERE ]
end;

begin
  For x:=1 to 1000 do
    rolls[x]:=RandomInteger[1,6]
  CountRuns;
  CountNumbers;
  PrintResults;
end.

In our functional Mathematica code, by contrast, we first generated the results of the simulation, then compiled the statistics of interest. We can simply add a new line to the code, in which the runs are counted from the original list of simulation results. (As an exercise, can you see how this line works?)

[Graphics:../Images/index_gr_13.gif]
[Graphics:../Images/index_gr_14.gif]

Is this behaving as expected? Let's see if the table of log values is linear. The following command plots the log values, and fits a straight line through the points.

[Graphics:../Images/index_gr_15.gif]

[Graphics:../Images/index_gr_16.gif]

Looks good, at first glance, anyway. We could of course do the statistical test of fit if we wanted.

In this particular example, the gains from this approach are modest; after all, the process of generating the simulation results was single simple line of code, so we did not gain much from not having to mess with it. If the process were a more complicated one (as is usually the case in scientific applications) , it would be even more beneficial to do as we have done here, to leave the simulation code itself intact and simply write a new function to extract statistical information from the list of simulation results.


Converted by Mathematica      November 5, 1999