
I am performing a goodness-of-fit test on a 64×32 matrix where the expected frequency of every entry a[i,j] is 500000 and the observed frequency can lie between 0 and 1000000.

I am taking DoF = (32×64) − 1. The problem is that I am getting a chi-squared statistic of the order of 10^7, and the p-value comes out as 0.

Am I going wrong somewhere? Could you please give some advice? I am a novice at statistics. I am using the following formula to calculate the chi-squared statistic: formula

I am trying to measure the strict avalanche criterion for different hash functions. An entry a[i,j] in the matrix is the number of times the jth bit of the output changed when the ith bit of the input was flipped. According to the strict avalanche criterion, every output bit should change with probability 0.5. I ran the experiment with 1000000 strings, so the observed frequency can lie between 0 and 1000000, whereas the expected frequency is 500000.
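To make the size of the statistic concrete, here is a minimal sketch. The formula in the question did not survive, so this assumes the usual Pearson form, the sum over all cells of (O − E)^2 / E. The function name and the matrix contents are hypothetical: the data below pretends every cell was observed 550000 times, which is not a real measurement.

```python
def chi_square_statistic(observed, expected):
    """Pearson statistic (assumed form): sum over all cells of (O - E)^2 / E."""
    return sum((o - expected) ** 2 / expected for row in observed for o in row)


ROWS, COLS = 64, 32      # input bits x output bits, as in the question
EXPECTED = 500_000       # 1,000,000 trials * probability 0.5

# Hypothetical data: every cell is 550,000, i.e. a flip probability of 0.55.
observed = [[550_000] * COLS for _ in range(ROWS)]

stat = chi_square_statistic(observed, EXPECTED)
dof = ROWS * COLS - 1    # degrees of freedom as taken in the question

print(f"chi-square statistic = {stat:.0f}, DoF = {dof}")
```

In this made-up case each cell contributes 50000^2 / 500000 = 5000, and 2048 cells give about 1.0 × 10^7. So a statistic of that order is what this formula produces when cells are typically off by around 50000 counts.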

  • One thing about hypothesis tests is that they become exceedingly sensitive as the sample size grows. Your cell frequencies suggest a huge sample. –  Apr 10 '16 at 00:38
  • Don't forget, a chi-square test essentially gives a p-value for the hypothesis of "no effect", so even a small effect gets magnified as the sample size increases. –  Apr 10 '16 at 00:39
  • So, do you recommend any other test, or any other way the goodness-of-fit test should be applied? Maybe some kind of binning or normalization before applying the test? –  Apr 10 '16 at 00:41
  • Just so there is no useless duplication, this is a more specific version of Question http://math.stackexchange.com/questions/1733269/. – BruceET Apr 10 '16 at 00:42
  • No, with that sample size it's unlikely that any of the "stock" distributions will give a satisfactorily high p-value. –  Apr 10 '16 at 00:42
  • One way to evaluate would be to calculate the largest deviation of your theoretical cell probabilities from the actual cell probabilities. This will give a practical difference between your model and data. –  Apr 10 '16 at 00:48
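The two suggestions in these comments can be sketched in a few lines of Python. This is only an illustration under assumptions: both function names are hypothetical, the matrix is made-up data, and the statistic is again assumed to be the Pearson sum of (O − E)^2 / E. The first function reports the largest gap between an observed flip probability and 0.5; the second shows how the same fixed bias yields a statistic that grows in proportion to the number of trials.

```python
def max_probability_deviation(observed, n_trials, p_expected=0.5):
    """Return (largest |observed probability - p_expected|, row, column)."""
    worst = (0.0, 0, 0)
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            dev = abs(count / n_trials - p_expected)
            if dev > worst[0]:
                worst = (dev, i, j)
    return worst


def statistic_for_fixed_bias(n_trials, p_actual, cells=64 * 32, p_expected=0.5):
    """Pearson statistic if every cell had exactly probability p_actual."""
    expected = n_trials * p_expected
    observed = n_trials * p_actual
    return cells * (observed - expected) ** 2 / expected


# Hypothetical matrix: perfect except one cell that flipped 501,000 times.
N_TRIALS = 1_000_000
matrix = [[500_000] * 32 for _ in range(64)]
matrix[3][7] = 501_000

dev, i, j = max_probability_deviation(matrix, N_TRIALS)
print(f"largest deviation {dev:.4f} at a[{i},{j}]")

# The same bias (0.51 instead of 0.5) at two sample sizes.
for n in (1_000, 1_000_000):
    print(n, statistic_for_fixed_bias(n, 0.51))
```

The maximum deviation is on the probability scale, so it can be judged directly against whatever tolerance is acceptable for the hash function, independently of how many strings were tested.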
