Comments on Critiques
I received an email from Ivar Fernemo with some questions he still had after careful reading of the GCP website. The issues he raises are of potential interest to others, especially with regard to some of the criticisms made by serious people about the GCP data and analyses.
The XOR removes bias in the mean. The typical design compares each voltage measurement in the analog part of the hardware REG (e.g., the sum of electrons in a sample is measured as a voltage) against a threshold level that is set so that half the voltage measures are higher and half are lower. You can imagine that various influences like temperature and component aging might shift the threshold relative to the signal, resulting in a bias. This is what the XOR cancels, logically. The resulting bit-stream is still fully random, and deviations can and do occur, with the statistics predicted by theory (albeit with a possible change in the second-order statistics as the price for keeping the first-order statistics unbiased).
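To make the logic of the XOR concrete, here is a minimal sketch in Python. It is an illustration only, not the firmware of any actual REG: the alternating 0/1 mask, the 52% bias, and the function names are all assumptions chosen for the example. A simulated source that produces too many 1s is XORed against the mask, so half the bits are inverted and the excess cancels in the mean, while each individual bit remains random.

```python
import random

def biased_bits(n, p_one, seed=0):
    """Simulate n bits from a source whose threshold has drifted,
    so that a 1 occurs with probability p_one instead of 0.5."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_one else 0 for _ in range(n)]

def xor_with_alternating_mask(bits):
    """XOR each bit with an alternating 0,1,0,1,... mask (assumed pattern).
    Bits at odd positions are inverted; bits at even positions pass through."""
    return [b ^ (i & 1) for i, b in enumerate(bits)]

def mean(bits):
    return sum(bits) / len(bits)

# Hypothetical drift: the source emits 1s 52% of the time.
raw = biased_bits(200_000, p_one=0.52)
masked = xor_with_alternating_mask(raw)

raw_mean = mean(raw)
masked_mean = mean(masked)
print(f"mean before XOR: {raw_mean:.4f}")
print(f"mean after XOR:  {masked_mean:.4f}")
```

In this sketch the mean before the XOR stays near the biased value, and the mean after the XOR sits close to 0.5. The bits still vary from trial to trial; only the systematic offset is removed.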
No. As I said, the statistics are really unchanged -- there is variation exactly as before around the *unbiased* mean. Scargle's point is that the XOR must prevent any effect. But he is working from a model that is inappropriate, namely a physical force model, in which he assumes that the effect (if any) must be "caused" by something like an EM field. Our XOR explicitly precludes such physical fields from affecting the REG, because we do not want it to be vulnerable to spurious sources like radio, the power grid, cell phone radiation, etc. The fact is, we see changes from expectation in the laboratory experiments, the FieldREG experiments, and in the GCP data (and this is not happening after the data are transmitted to Princeton for archiving).
The Spottiswoode and May criticism is itself questionable. They looked at one event (9/11) and they chose to ask whether a half hour more or less would have produced a significant effect. Who knows how they arrived at this specific (post facto) choice? Looking at the actual data, we can see that the measurement began to depart from expectation around the time of the attacks and continued with a strong trend (slope) for 50 hours, while showing usual random variation. Take a look at the first figure on the 9/11 explorations page.
I think that figure shows non-normal data by any reasonable standard, and while it is inappropriate to make formal claims because it does not test a pre-defined hypothesis, it certainly suggests that we need to look at longer spans of time when big events occur (our formal hypothesis specified 4 hours and 10 minutes, based on previous experience with what we regarded as similar events). As for S&M's criticism, if we use the same standards of evidence we apply to our own formal protocols, their claims are vulnerable. Looking at random data after the fact, one can pick a moment to make any point -- to say it is significant, or to say it is not. That is exactly what they have done.
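The general point about picking a moment after the fact can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of any GCP data: the sample count, window length, and the 1.96 cutoff are arbitrary choices made for the example, and the data are pure simulated noise. It compares how often a window fixed in advance looks "significant" with how often at least one window does when the analyst is free to choose its position afterward.

```python
import math
import random

def window_z(prefix, start, length):
    """Z-score of the sum of `length` standard normal samples beginning
    at `start`, computed from a prefix-sum array."""
    return (prefix[start + length] - prefix[start]) / math.sqrt(length)

def simulate(trials=1000, n=200, length=50, z_crit=1.96, seed=1):
    """Return (fixed_rate, posthoc_rate) for purely random data."""
    rng = random.Random(seed)
    fixed_hits = 0
    posthoc_hits = 0
    for _trial in range(trials):
        # Build prefix sums of n independent standard normal samples.
        prefix = [0.0]
        for _sample in range(n):
            prefix.append(prefix[-1] + rng.gauss(0.0, 1.0))
        # Window specified in advance: always the first `length` samples.
        if abs(window_z(prefix, 0, length)) > z_crit:
            fixed_hits += 1
        # Window chosen after the fact: the most extreme of all positions.
        best = max(abs(window_z(prefix, s, length))
                   for s in range(n - length + 1))
        if best > z_crit:
            posthoc_hits += 1
    return fixed_hits / trials, posthoc_hits / trials

fixed_rate, posthoc_rate = simulate()
print(f"pre-specified window exceeds cutoff: {fixed_rate:.3f}")
print(f"best post hoc window exceeds cutoff: {posthoc_rate:.3f}")
```

The pre-specified window should exceed the cutoff at roughly the nominal 5% rate, while the freely chosen window does so far more often. The same freedom works in the other direction: one can equally well choose a window that shows nothing.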
In the main body of the Spottiswoode and May criticism, they confuse the formal, pre-defined trials with the explorations of data we use to provide context and to learn how to formulate better hypotheses. Our formal series of hypothesis tests now has about 225 events, and when they are combined in the equivalent of a meta-analysis, the result lies well beyond the 0.05 significance level Spottiswoode and May are worried about. Moreover, we have enough events in the formal series now to get a good estimate of the average effect size, equivalent to a Z of 0.3 per event. This means that no single event should be expected to show significance (which was S&M's complaint), but instead that we must patiently assemble many replications if we wish to learn something.
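The arithmetic behind this can be sketched as follows. The sketch assumes an unweighted Stouffer combination, in which the composite Z is the sum of the per-event Z-scores divided by the square root of the number of events; the formal GCP analysis may differ in detail. Under that assumption only the average Z and the event count matter, and the numbers used are the approximate ones quoted above.

```python
import math

def stouffer_z(mean_z, n_events):
    """Unweighted Stouffer composite: sum(z_i) / sqrt(N),
    which equals mean_z * sqrt(N)."""
    return mean_z * math.sqrt(n_events)

def one_tailed_p(z):
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

mean_z = 0.3      # approximate average effect size per event (from the text)
n_events = 225    # approximate number of formal events (from the text)

combined_z = stouffer_z(mean_z, n_events)   # 0.3 * 15 = 4.5
p_single = one_tailed_p(mean_z)
p_combined = one_tailed_p(combined_z)

print(f"typical single event: Z = {mean_z}, one-tailed p = {p_single:.3f}")
print(f"combined series:      Z = {combined_z:.2f}, one-tailed p = {p_combined:.2e}")
```

A single event with Z near 0.3 is nowhere near the 0.05 level, whereas the composite over roughly 225 events, at a Z of about 4.5, is far beyond it. That is why the accumulation of replications carries the evidence, not any one event.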