Fun with Functions II

Lets start this section by by first introducing array formulas. These are formulas that operate on one or more arrays and return an array of values . These are often called CSE (CTRL+SHIFT+ENTER) formulas since the user must type CTRL+SHIFT+ENTER when you complete typing the formula instead of just enter as you would normally do with a standard formula. As a very simple example suppose that you would like to multiple a column of numbers times another column of numbers. Create the following arrays.

Note the two column A and B on the left. Our desire at this point is to multiply the two ‘Arrays’ together and place their answers in the adjacent column C. Here is the array formula method. First select array C1:C6. Now type in =A1:A6*B1:B6 and follow it with a CSE. Column C should contain the values shown on the right. It is important that you take note of the fact that the formula has been renamed {=A1:A6*B1:B6}, in other words two braces { and } and been placed around the formula. This is done to remind the programmer that this is an array formula and not a regular one.

This process can be simplified by first naming the array. Select the A array and then type in the name, say A in the space one the left side of the formula bar, hitting return when finished. Do the same for columns B and C. If you have done this then the formula you enter is =A*B, followed by a CSE. You should get the same results. There many time the naming of columns or rows can seriously simplify things. Try it you will like it!

Irritating Obstacle: if you attempt to modify a cell in the C array the system will get upset with you. Each time you attempt to do so you will get a popup that says “You cannot change part of an array”. If you encounter this problem type the Exc key in the upper left hand corner of the keyboard. This should clear it up.

The next example of a CSE formula will concern itself with the FREQUENCY function. Look it up in Excel. We will use FREQUENCY to simulate a stochastic process. You may recall that a stochastic process, aka random process, is one that has built in random variation. Each time you run a program the outcome is different. The programs that you have written in Perl have all be deterministic, ie. each time you run the program on the same data the result is the same. The main thing to remember is that stochastic programs (whether they be Python Programs, Excel functions or as we will see later in the course randomized NetLogo programs) use a random number function within there code.

Problem: You may recall that when you flip a fair coin that the probability that you will get a head is .5 which is the same as the probability that you get for getting a tail. Similarly if we throw a die we will get a 1 on the die 1/6(.16666) of the time, a 2 1/6 of the time and so on. Lets perform an empirical study and see how true this actually. Excel has a special function called RANDBETWEEN(low, high) that will generate a random integer between low and high inclusive. For example RANDBETWEEN(1,6) will generate random die throws by returning random numbers between 1 and 6. Here is our process

  1. Insert the function =RANDBETWEEN(1,6) in cell A1 and copy it to the cells A2 thru A100. You should see a long column of die throw.
  2. Insert a 1 in C1, a 2 into C2 down to a 6 in C6
  3. Now here is where it gets cool. First we select the range array D1:D6 and enter =FREQUENCY(A1:A30,C1:C6) in the function bar. IMPORTANT NOTE. When you finish typing the function you type CTRL+SHIFT+ENTER by hitting all three keys at once. This will enter the array function into the cells D1:D6. You should immediately see the count of each die value as it occurs in the random column.
  4. Draw a histogram by inserting a Column Graph underneath the valu es in columns C and D. Here is what I obtained.

First note column A. It contains the random die throws. This is the data array. Array C which contains the values 1 thru 6 is called the bin array. The D column is where we placed the =FREQUENCY array contains the counts of the dies 1 thru 6. For example the die 1 what thrown 16 times, the die 2 what thrown 20 times and so on. The included graph is a barchart of column D. Now it is clear that the random predicted value (16.6) is some what different form what we obtained.

Here is an interesting issue with this experiment. Look at column A closely, now click on a random cell say G2 and enter a space. What happened? Do it again. Does the Graph change also. What is going on here?

What could you do to stop this from happening?

As a final note: what do you think the outcome would be if column A was much longer, say 1000 slots. 10000 slots?

Additional Problems (Use Help if you need to)

1. Solving Simultaneous Linear Equations (advanced) Suppose that you have the following equations

	5x +7y +3z = 10
           -2y + z = 3
       -3x    + 4z = 5

You might recall from Algebra ( or some other math class that discussed matrices ) that we can solve this using the matrix equation Ax = b. where, A is the 3×3 matrix of coefficients and be is the [10,3,5] vector. The solution of the equation is x=(A)*b. Build and name these matrices and then apply the excel function =MMULT(MINVERSE(A),b) the x array of answers. MMULT is matrix multiplication and MINVERSE(A) returns the inverse matrix of A. If you apply this to the above case your answer for x should be [1.253.-.04051,2.189].

2. Suppose that an excel chart has an array of numbers each of which between 0 and 1. The construction of the data that filled this array was prone to error so every so often an error occurred and the value -1 was placed into the cell. Our problem boils down to a simple process. We would like to average all the valid ( IE not = -1) values in this array and return the result.

Comments are closed.