
I have two programs that play a two-player, zero-sum, perfect-information game.

The game has a very high "branching factor".

No luck is involved, but results are highly sensitive to the starting state, and there are a great many starting states, so when the two programs play each other, the better one may win only slightly more than half the time.

My question: how many games must I let them play, using random starting states, before I can be $95\%$ confident that I have identified the better program?

This might be simple, but my statistics course was back in the 70s ;)

Alternate form of question: $X$ wins $255$ games and $Y$ only $245$.

How certain am I that $X$ is the better player?

Sam Houston
  • 2,277
Conrad
  • 111
  • This is some sort of binomial thing I bet. – Conrad Apr 28 '15 at 22:17
  • The result of pitting one program against the other is just X wins or Y wins? They don't get scores? When you say it is "chaotic" is that implying that each iteration of this game is NOT independent from previous iterations? If they are independent then what is "chaotic" ? – futurebird Apr 28 '15 at 22:20
  • It sounds like a binomial distribution, I agree. I'm unclear on how the "branching factor" or the fact that each game is chaotic (if you mean it is sensitively dependent on initial conditions) might matter-- what are the initial conditions? How would the results be different than the distribution from a coin flip if the games were evenly matched? – futurebird Apr 28 '15 at 22:23
  • By chaotic I'm only referring to high sensitivity to the starting states. Once the pieces are set up you can think of it as chess, with 10 to 20 times as many moves available each turn. Games are indeed independent, as I randomly generate starting states. Coin flipping is a good model, I guess. – Conrad Apr 28 '15 at 22:28
  • Maybe I should use this: http://stattrek.com/online-calculator/binomial.aspx – Conrad Apr 28 '15 at 22:29

1 Answer


Looks like the "Sign Test" is what I need, and there's an online version at http://www.fon.hum.uva.nl/Service/Statistics/Sign_Test.html
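
For anyone who wants to check the numbers without the online form, here is a rough sketch of what a sign test computes, using only the Python standard library. It assumes the coin-flip model from the comments: games are independent and every game ends in a win for X or Y (no draws). The function names `sign_test_p_value` and `games_needed` are made up for illustration, and the one-sided 5% level and 80% power in the second function are my own assumptions, not something stated in the question.

```python
from math import ceil, comb, sqrt
from statistics import NormalDist


def sign_test_p_value(wins_x, wins_y):
    """Exact two-sided sign test p-value.

    Null hypothesis: the programs are equally strong, so each game
    is a fair coin flip (assumes independent games and no draws).
    """
    n = wins_x + wins_y
    k = max(wins_x, wins_y)
    # Probability that one given player wins at least k of n fair games.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    # Double it because either player could have been the one ahead.
    return min(1.0, 2 * upper_tail)


def games_needed(true_win_rate, alpha=0.05, power=0.80):
    """Approximate number of games for a one-sided test of p = 0.5.

    Uses the normal approximation to the binomial. `true_win_rate` is
    the (unknown) long-run win rate of the better program, so this is
    only a planning estimate under an assumed edge.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    p = true_win_rate
    numerator = z_alpha * 0.5 + z_beta * sqrt(p * (1 - p))
    return ceil((numerator / (p - 0.5)) ** 2)


if __name__ == "__main__":
    # The 255 vs 245 example from the question.
    print(sign_test_p_value(255, 245))
    # Games needed if the better program truly wins 51% of the time.
    print(games_needed(0.51))
```

A small p-value (say below 0.05) would be evidence that one program is better. For 255 vs 245 the p-value comes out far above that, so 500 games with this split does not separate the programs. The second function shows why: the required number of games grows like one over the square of the edge, so a 51% win rate needs games in the tens of thousands.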

Conrad
  • 111