The Prediction Registry describes events and specifies the analytical parameters that document the details of each prediction. Since most events are processed in the same way, we use the term "standard analysis" (#1 below) for a definite set of steps, presented in the procedures section, so that they need not be repeated for each item in the registry. For a small subset (roughly 5 to 10%, depending on the criteria for standardization) some other analysis recipe is used; in these cases sufficient detail is usually given in place. Most of these "other" analyses are what we call "device variance analyses," and the algorithmic recipe for these is also given in a separate, general description (#3 below), to which the event descriptions can refer. Here, with some contextual discussion, are the major recipes.
As of this writing in late August 2002, we are considering whether to change the default procedure from the correlated meanshift (standard) analysis to the device variance analysis. There are good arguments for doing so, including that the latter may be more sensitive. In any case, we intend to apply both algorithms to all events where this is feasible, in order to learn more about the question. For an interim period, we will use the composite probability of the two measures as the formal output probability; this will, in effect, give an average outcome.
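A minimal sketch of how such a composite probability might be formed. The text above does not name the combination method, so this assumes a Stouffer-style z combination, which in effect averages the two outcomes; `composite_probability` is a hypothetical helper, not part of the GCP software.

```python
# Hypothetical sketch: combine the p-values from the two analyses
# (correlated meanshift and device variance) into one composite
# probability via Stouffer's z-based method (an assumption; the
# registry text does not specify the combination rule).
from math import sqrt
from statistics import NormalDist

def composite_probability(p1: float, p2: float) -> float:
    """Combine two one-tailed p-values via Stouffer's method."""
    nd = NormalDist()
    z1 = nd.inv_cdf(1.0 - p1)     # convert each p-value to a z-score
    z2 = nd.inv_cdf(1.0 - p2)
    z_comb = (z1 + z2) / sqrt(2)  # averaged z, renormalized to unit variance
    return 1.0 - nd.cdf(z_comb)

# Example: one analysis at p = 0.05, the other at p = 0.20
p = composite_probability(0.05, 0.20)
```

The composite lands between the two inputs on the z scale, so a strong result from one measure and a null result from the other yields an intermediate outcome, matching the "average outcome" described above.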
beginning in mid-2002, Peter Bancel posed the question, "Without any a priori's, how many different "recipes" are in the prediction registry and how do results look in subgroups?
"By 'recipe' I simply mean a precise procedure that will get me from the raw data to a stated formal GCP df and Chisquare for each event. I want to count how many of these recipes there are and count how many predictions go with each recipe. And eventually ask if effect size changes with group."
We begin to answer these questions here, by describing the analysis algorithms. This is work in progress.
Recipe #1: The "Standard Analysis"
If there is a need to modify this recipe for a given prediction, then it becomes a new recipe and the prediction goes into a new group. A good, if very specialized, example is the formal prediction for event 38, which requires an appropriate modification of Recipe #1 to replicate the stated GCP result. The prediction for event 38 was specified as two contiguous segments, predefined to show positive and negative expected deviations, respectively. Duplicating the analysis and the GCP "bottom line" result therefore requires something extra to be added to Recipe #1.
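A sketch of the standard analysis under stated assumptions: each egg reports one 200-bit trial sum per second (expectation 100, variance 50); the per-second egg z-scores are combined into a Stouffer Z, whose square is Chi-square with 1 df, and these are summed over the event period. The function name and data layout here are illustrative, not the GCP code.

```python
# Hedged sketch of Recipe #1 (correlated meanshift), assuming
# 200-bit trial sums with mean 100 and variance 50 per egg-second.
from math import sqrt

MEAN, VAR = 100.0, 50.0  # expectation and variance of a 200-bit trial sum

def standard_analysis(trials):
    """trials: list of per-second lists of egg trial sums.
    Returns (chisquare, df) for the event period, 1 df per second."""
    chisq, df = 0.0, 0
    for second in trials:
        if not second:                        # skip seconds with no reporting eggs
            continue
        zs = [(t - MEAN) / sqrt(VAR) for t in second]
        stouffer_z = sum(zs) / sqrt(len(zs))  # combine across eggs
        chisq += stouffer_z ** 2              # squared Stouffer Z, 1 df
        df += 1
    return chisq, df

# Example: two seconds of data from three eggs
chisq, df = standard_analysis([[103, 98, 110], [95, 101, 99]])
```

For a two-segment prediction like event 38, the natural extra step would be to accumulate the segments separately, with the expected sign of the deviation taken into account, before composing a single result; the exact rule is specified in the event's own registry entry.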
Recipe #1.5 (follows Recipe #1, with an additional step)
Predictions for several events in the formal database were specified in a fundamentally different manner, which examines the variability among the individual egg scores. One way to express this is as the concatenation, across eggs, of their squared z-scores. The result is a Chi-square distributed quantity that can be composed over the period of interest much as in Recipe #1. The formal analyses use a direct computation of the variance among the eggs, which is essentially the same measure.
Recipe #3: The "Device Variance Analysis"