## Tuesday, August 10, 2010

### Our friend Bayes -- Part II

In Part I of this series, we developed the idea that Thomas Bayes was a rebel in his time, looking at probability problems in a different light, specifically from the standpoint of dependencies between probabilistic events.

In Part I we posed the project situation of 'A' and 'B', where 'A' is a probabilistic event--in our example 'A' is the weather--and 'B' is another probabilistic event, the results of tests. We hypothesized that 'B' had a dependency on 'A', but not the other way 'round.

Bayes' Grid

The Figure below is a Bayes' Grid for this situation. 'A+' is good weather, and 'B+' is a good test result. 'A' does not depend on 'B', but 'B' has dependencies on 'A'. The notation 'B+ | A' means a good test result given any condition of the weather, whereas 'B+ | A+' [shown in another figure] means a good test result given the condition of good weather. 'B+ and A+' means a good test result when, at the same time, the weather is good. Note that 'B+ | A+' is a dependency and 'B+ and A+' is an intersection of two conditions; they are not the same.

The blue cells all contain probabilities; some will be from empirical observations, and others will be calculated to fill in the blanks. The dark blue cells are intersections of specific conditions of 'A' and 'B', such as 'B+ and A+'. The light blue cells are the probabilities of 'A' alone or of 'B' alone.

Grid Math

There are a few basic math rules that govern Bayes' Grid.
• The dark blue space [4 cells] is every combination of the conditions of 'A' and 'B', so the numbers in this 'space' must sum to 1.0, representing the whole of the 'A' and 'B' space
• The light blue row just under the 'A' is every condition of 'A', so this row must sum to 1.0
• The light blue column just adjacent to 'B' is every condition of 'B' so this column must sum to 1.0
• The dark blue columns or rows must sum to their light blue counterparts
Now, we are not going to guess or rely on a hunch to fill out this grid. Only empirical observations and calculations based on those observations will be used.
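
The four rules above can be sketched as a small check in Python. This is only an illustration: the function name `check_grid`, the dictionary layout, and the example numbers are all assumptions made up for the sketch, not values from the project grid.

```python
import math


def check_grid(joint, p_a, p_b, tol=1e-9):
    """Return True if a grid obeys the four sum rules listed above.

    joint maps (a, b) pairs to the dark blue cells,
    p_a and p_b map conditions to the light blue cells.
    """
    def close(x, y):
        return math.isclose(x, y, abs_tol=tol)

    # Rule 1: the four dark blue cells sum to 1.0
    if not close(sum(joint.values()), 1.0):
        return False
    # Rules 2 and 3: each light blue row or column sums to 1.0
    if not close(sum(p_a.values()), 1.0):
        return False
    if not close(sum(p_b.values()), 1.0):
        return False
    # Rule 4: dark blue cells sum to their light blue counterparts
    for a, pa in p_a.items():
        if not close(sum(joint[(a, b)] for b in p_b), pa):
            return False
    for b, pb in p_b.items():
        if not close(sum(joint[(a, b)] for a in p_a), pb):
            return False
    return True


# Hypothetical numbers, chosen only so the sums work out
example_joint = {
    ("A+", "B+"): 0.3, ("A-", "B+"): 0.2,
    ("A+", "B-"): 0.1, ("A-", "B-"): 0.4,
}
example_p_a = {"A+": 0.4, "A-": 0.6}
example_p_b = {"B+": 0.5, "B-": 0.5}

print(check_grid(example_joint, example_p_a, example_p_b))  # True
```

A check like this is handy once the grid is filled in: if any cell was miscalculated, one of the sums will fail.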

Empirical Data

First, let's say the empirical observations of the weather are that 60% of the time it is good and 40% of the time it is bad. Going forward, using the empirical observations, we can say that our 'confidence' of good weather is 60%-or-less. We can begin to fill in the grid, as shown below.

In spite of the intersections of A and B shown on the grid, it's very rare for the project to observe them. More commonly, observations are made of conditional results.  Suppose we observe that given good weather, 90% of the test results are good. This is a conditional statement of the form P(B+ | A+) which is read: "probability of B+ given the condition of A+".  Now, the situation of 'B+ | A+' per se is not shown on the grid.  What is shown is 'B+ and A+'.  However, our friend Bayes gave us this equation:
P(B+ | A+) * P(A+) = P(B+ and A+) = 0.9 * 0.6 = 0.54

Take note: B+ is not 90%; in fact, we don't know yet what B+ is.  However, we know the value of 'B+ and A+' is 0.54 because of Bayes' equation given above.
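
For readers who like to see the arithmetic run, here is a minimal sketch of that equation in Python. The function and variable names are assumptions chosen for the illustration; the two inputs are the empirical observations given in the text.

```python
def joint_probability(p_b_given_a, p_a):
    """P(B and A) = P(B | A) * P(A)."""
    return p_b_given_a * p_a


p_good_weather = 0.6             # P(A+), observed
p_good_test_given_good_wx = 0.9  # P(B+ | A+), observed

p_good_test_and_good_wx = joint_probability(
    p_good_test_given_good_wx, p_good_weather
)
print(round(p_good_test_and_good_wx, 2))  # 0.54
```

Note that the 0.9 goes in as a conditional; the value that lands on the grid is the product, 0.54.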

Now, since the grid has to add up in every direction, we also know that the second number in the A+ column, P(B- and A+), is 0.06; that is, 0.6 - 0.54.
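
The state of the grid at this point can be sketched in Python as follows. The dictionary layout and the use of `None` for cells not yet known are assumptions made for the illustration; only the 0.6, 0.4, and 0.9 figures come from the observations above.

```python
# Light blue row: observed weather probabilities
p_a = {"A+": 0.6, "A-": 0.4}

# Dark blue cells; None marks a cell we cannot fill in yet
joint = {
    ("A+", "B+"): None, ("A-", "B+"): None,
    ("A+", "B-"): None, ("A-", "B-"): None,
}

# Observed conditional: P(B+ | A+) = 0.9
joint[("A+", "B+")] = 0.9 * p_a["A+"]

# The A+ column must sum to P(A+), so the other cell is the remainder
joint[("A+", "B-")] = p_a["A+"] - joint[("A+", "B+")]

for cell, value in joint.items():
    shown = "unknown" if value is None else round(value, 2)
    print(cell, shown)
```

Running this shows the two A+ cells filled in with 0.54 and 0.06, and the two A- cells still unknown, which is exactly where the grid stalls without another observation.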

However, we can go no further until we obtain another independent empirical observation.

To be continued

In the next posting in this series, we will examine how the project risk manager uses the rest of the grid to estimate other conditional situations. 