Saturday, September 1, 2012

Porting success


 What God hath woven together, even multiple regression analysis cannot tear asunder.
Anonymous

For those who are a little vague on regression analysis, here's a quick refresher example so you will understand the point we're making:
  • Let's say you have a bunch of observations of real outcomes, like unit test results.
  • And, let's say you've grouped them as they occur: Test set 1, Test set 2, Test set 3, and so forth out to TS 'N', for the Nth test set.
  • And, let's say that each test set itself has a metric, like some kind of scalar size, so that the size of TS 1 is less than TS 2, and so forth.
  • And, for each test set, let's say there is a metric you are interested in, like "discovered but unresolved errors".
  • That's a bit of an awkward phrase, so let's shorten it to "quality factor 1", or QF for short.
We could then ask the project data analyst who lives in the PMO to plot QF (errors) versus TS (size) on a graph. And, we could ask the analyst to "fit" a line through the data such that the overall distance between the observed QF values and the line is minimized (the usual choice is to minimize the sum of the squared vertical distances, a "least squares" fit). The analyst would give us back something that looks like the following:


Now, there are two questions you should be interested in:
  1. Is the variability in quality (metric A) strongly related to the TS size (metric B) or not?
  2. And, for the next TS, with a size within the sizes already observed, will its QF be on or near the line?
If you've not already guessed, the line is a "regression" line or curve. Here's the "tear asunder" part: does the regression line fit to the observations (of God's work?) really reveal the constituent influences on the outcomes?

In less grandiose terms, regression analysis is simply used to predict the next outcome, given that the next outcome occurs in the same circumstances as the prior observations. (You can't do regression predictions outside of the domain or limits you have in the observations.) Given another value for metric B, regression predicts the value of metric A.
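
For readers who like to see the arithmetic, here is a minimal sketch of the idea in Python. Everything in it is an assumption made for illustration: the sizes and error counts are invented, the function names (fit_line, r_squared, predict) are hypothetical, and the fit is an ordinary least-squares straight line, which may not be what your analyst would actually choose. It fits the line, reports how much of the variability in QF the line explains (question 1), and refuses to predict for a TS size outside the observed range (question 2).

```python
# Hypothetical observations, invented for illustration only:
# sizes[i] is the size of test set i (metric B),
# errors[i] is its count of discovered but unresolved errors (QF, metric A).
sizes = [10.0, 20.0, 30.0, 40.0, 50.0]
errors = [3.0, 5.0, 6.0, 9.0, 10.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variability in ys that the fitted line explains."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def predict(x, xs, slope, intercept):
    """Predict QF for size x, but only inside the observed range of sizes."""
    if not min(xs) <= x <= max(xs):
        raise ValueError("size is outside the observed range; no prediction")
    return slope * x + intercept


slope, intercept = fit_line(sizes, errors)
fit_quality = r_squared(sizes, errors, slope, intercept)
next_qf = predict(35.0, sizes, slope, intercept)

print(f"line: QF = {slope:.2f} * size + {intercept:.2f}")
print(f"share of variability explained: {fit_quality:.3f}")
print(f"predicted QF for a TS of size 35: {next_qf:.1f}")
```

A value of fit_quality near 1 says the line tracks these particular observations closely; it says nothing about why, which is the "tear asunder" point above.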

But, here's the next big thing: Can you take your regression curve with you to your next project? In other words, if you understand all the parts that went into the success of the outcomes, can you expect the same results if all the parts port over to the next project?

There's actually no closed-form answer on this; the best you can say is maybe. The most important thing to understand is that you probably don't know or understand all the constituents that went into the former success. Thus, regression is helpful, but often incomplete in revealing the true secrets of success.