"off by 1" lottery probability

Question

Suppose there is a lottery in which 6 random balls are drawn from a set of 50, numbered 1 through 50. The lottery officials find that ticket sales are sluggish, so to make it easier to win they decide to allow "off by 1" for each of the 6 numbers drawn. For example, suppose that in the original lottery the winning numbers drawn were (sorted) 5, 11, 12, 18, 37, and 44. Of course, the only way to win would be to match those 6 numbers exactly. With the off by 1 variation, however, each number chosen by the contestant can be off by as much as 1. So, for example, to match the number 5, the player can have a 4, 5, or 6, and that would be considered a match. Let's also assume that the player's ticket has randomly chosen numbers as well.

So the question is, how much easier is it to win with the "off by 1" variation compared to the original version?

Things to be careful of are things like matching the 11 and 12 as in the example. If the player chooses say 11, it cannot match both the 11 and the 12 actual numbers drawn. However to match both the 11 and 12, there are several ways the player can do that... (10,11), (10,12), (10,13), (11,12), (11,13), (12,13).

We will say that the player cannot choose the same number more than once per play, so they cannot, for example, choose the number 11 twice in the same 6 number game. As with the original version, all numbers chosen must be unique.

Update: For clarity, I should mention that the chosen numbers must "map" to the drawn numbers with both in sorted order. For example, if the chosen numbers are 5, 10, 15, 16, 20, and 44 and the drawn numbers are 5, 10, 16, 17, 20, and 45, then the pair of 16s will not "map" to each other, because the numbers get mapped in sorted (ascending) order: the chosen 15 maps to the drawn 16, and the chosen 16 maps to the drawn 17. This example is a winning combination (all 6 numbers "match" using the off by 1 rule).
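The sorted mapping rule can be written down directly. The following is a minimal sketch of that rule only; the function name `is_off_by_one_win` is made up for illustration and is not taken from the simulation program mentioned in this question.

```python
def is_off_by_one_win(ticket, draw):
    """Return True if, with both lists sorted ascending, every ticket number
    is within 1 of the drawn number in the same position."""
    t, d = sorted(ticket), sorted(draw)
    # Both sides must be the same size and contain no repeated numbers.
    if len(t) != len(d) or len(set(t)) != len(t) or len(set(d)) != len(d):
        return False
    return all(abs(a - b) <= 1 for a, b in zip(t, d))


# The example from the update: 15 maps to 16 and 16 maps to 17.
print(is_off_by_one_win([5, 10, 15, 16, 20, 44], [5, 10, 16, 17, 20, 45]))  # True
```

Because the comparison is positional, a chosen number can never be used to match two drawn numbers at once.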

In my simulation program which picks 6 random unique numbers, sorts them, then checks how many possible ticket combinations can match those 6 numbers, I have buckets to record how many ways there are and I tally them up. It appears 486, 648, and 729 ways are common and the lowest I've seen so far is 72 and the highest is 729. This may be useful information for those analyzing this problem. These results were acquired from only 1000 simulated drawings. I can let the simulation program run overnight and get a larger sample then I can post the results.

I can now easily and quickly simulate millions of randomly generated tickets and check how many possible winning tickets there are to cover those numbers. The long term average I appear to be getting is 503. I wish someone else would write a simulation program too, to help verify my results. Also, mathematically, there are a lot of "buckets", indicating that many different cases contribute different probability "boosts" vs. the original game (where all chosen numbers must match exactly). So this may be a pure simulation type problem, as the "on paper" complexity appears to be too high. It is amazing how only a little variation makes the problem go from a piece of cake on paper to unwieldy.

So to clarify my findings... whereas the original lottery had only 1 winning ticket combination (all 6 numbers must match exactly), the off by 1 simulation is showing me 503 winning tickets (on average) for any given randomly drawn 6 balls (out of the 50 possible).
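For anyone who wants to cross-check these numbers, here is a minimal Python sketch of such a simulation. It is not the program described above: the function names, the fixed seed, and the trial count are all assumptions made for illustration. It counts the winning tickets for one drawing by stepping through the drawn numbers in sorted order and tracking how many partial tickets end at each candidate value, then averages that count over random drawings.

```python
import random


def count_winning_tickets(draw, n=50):
    """Count tickets (strictly increasing, values in 1..n) whose i-th number
    is within 1 of the i-th drawn number."""
    draw = sorted(draw)
    first = draw[0]
    # ways[t] = number of valid partial tickets whose last number is t
    ways = {t: 1 for t in (first - 1, first, first + 1) if 1 <= t <= n}
    for w in draw[1:]:
        nxt = {}
        for t in (w - 1, w, w + 1):
            if 1 <= t <= n:
                # The ticket must stay strictly increasing.
                nxt[t] = sum(c for prev, c in ways.items() if prev < t)
        ways = nxt
    return sum(ways.values())


def estimate_average(trials, n=50, k=6, seed=0):
    """Monte Carlo estimate of the mean number of winning tickets per drawing."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += count_winning_tickets(rng.sample(range(1, n + 1), k), n)
    return total / trials


print(count_winning_tickets([5, 10, 11, 20, 30, 44]))  # 486
print(count_winning_tickets([1, 2, 3, 4, 5, 6]))       # 7
print(count_winning_tickets([2, 3, 4, 5, 6, 7]))       # 28
print(estimate_average(20_000))
```

The three fixed drawings reproduce the 486, 7 and 28 figures quoted in the comments. The last line prints an estimate that varies with the seed and trial count, so it is only a rough check against the long-run average reported here.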

I think this would be so hard to solve on paper because of the many different scenarios, such as a ticket with 1 "neighbor" (such as 5, 10, 15, 20, 21, 25), 2 neighbors, 3 neighbors, and so on. Also, 1 and 50 are special cases because they can only go in one direction. It would be interesting in itself to find out how many different cases there are that contribute a different boost to the final answer. I suspect there are maybe 100 or so classes/categories (buckets) that all the tickets fall into. I could try running maybe 1 billion decisions overnight and count them up. With 1 million decisions I am already seeing over 110 buckets. Half of those appear instantly on my screen and the other half take a while to pop up.
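One way to organize these cases, offered as a sketch rather than a full solution: write the sorted drawn numbers as $w_1 < \dots < w_6$ and a candidate ticket as $t_i = w_i + d_i$, with each offset $d_i \in \{-1, 0, 1\}$. The symbols $w_i$, $t_i$, $d_i$ and $g_i$ are introduced here for illustration. The ticket is valid exactly when it stays strictly increasing and inside 1 to 50, and neighboring offsets interact only through the gap $g_i = w_{i+1} - w_i$:

$$t_{i+1} > t_i \iff d_{i+1} - d_i > -g_i, \qquad \text{so} \qquad \begin{cases} g_i \ge 3: & \text{no restriction,} \\ g_i = 2: & (d_i, d_{i+1}) \ne (1, -1), \\ g_i = 1: & d_{i+1} \ge d_i, \end{cases}$$

together with $d_1 \ge 0$ when $w_1 = 1$ and $d_6 \le 0$ when $w_6 = 50$.

Under this description the number of winning tickets depends only on which gaps equal 1, 2, or at least 3, and on whether 1 or 50 was drawn, which gives at most $3^5 \cdot 2 \cdot 2 = 972$ patterns. The count is a product over the blocks separated by gaps of 3 or more. A run of $m$ consecutive numbers away from the ends contributes $\binom{m+2}{2}$, the number of non-decreasing offset sequences, and a run that includes 1 or 50 contributes $m + 1$. For example, 5, 10, 11, 20, 30, 44 gives $3 \cdot 6 \cdot 3 \cdot 3 \cdot 3 = 486$, and six consecutive numbers give $\binom{8}{2} = 28$, or 7 when the run includes 1 or 50, in agreement with the figures quoted in the comments.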

I am currently running 1 billion simulated decisions and so far have 137 buckets so this problem would not be easy to solve on paper.
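Even if a case-by-case count is unwieldy, the average itself may be computable exactly without enumerating buckets. The sketch below is one possible approach, not something taken from the simulation described here: it counts all matching (drawing, ticket) pairs at once with a small dynamic program over the last drawn number and the last ticket number, then divides by $\binom{50}{6}$. The function name `exact_average` and its parameters are assumptions for illustration.

```python
from fractions import Fraction
from math import comb


def exact_average(n=50, k=6):
    """Exact mean number of winning tickets per drawing, found by counting
    every (drawing, ticket) pair that matches under the sorted off-by-1 rule."""
    # state[(w, t)] = number of matching prefixes ending with drawn w, ticket t
    state = {(w, t): 1
             for w in range(1, n + 1)
             for t in (w - 1, w, w + 1) if 1 <= t <= n}
    for _ in range(k - 1):
        nxt = {}
        for (w, t), c in state.items():
            for w2 in range(w + 1, n + 1):        # next drawn number is larger
                for t2 in (w2 - 1, w2, w2 + 1):   # next ticket number within 1
                    if t < t2 <= n:               # ticket stays increasing
                        nxt[(w2, t2)] = nxt.get((w2, t2), 0) + c
        state = nxt
    return Fraction(sum(state.values()), comb(n, k))


print(exact_average(3, 2))       # 3
print(float(exact_average()))
```

For the tiny 2-from-3 game every ticket matches every drawing, so the average is 3, which serves as a sanity check. The value printed for the 6-from-50 game can be compared with the simulated average of about 503.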

In "things to be careful of", is it equivalent to -- a # chosen can either stand for itself or a neighbor, but not both ? — true blue anil, Jun 20 '15 at 04:43
Yes that is a correct interpretation. For example, if the player has an 11 as one of their number choices and 10,11,and 12 are 3 of the 6 winning numbers drawn, then 11 can only match one of them, not 2 and not all 3. — David, Jun 20 '15 at 11:48
I am writing a computer simulation to approximate the correct answer but I would like to know how to solve this mathematically. The original version had $50 \choose 6$ ways to draw 6 unique balls from 50 but only 1 way to win (exact matches on all 6 balls). The "off by 1" variation has many more ways to win than just 1 but the question is how many on average? — David, Jun 20 '15 at 12:19
I ran a "crude" simulation of only 10 sets of 6 random numbers and the # of winning possible tickets ranged from 288 (low) to 729 (high). The actual number of reported winning tickets per iteration was 486, 729, 432, 288, 648, 729, 729, 486, 729, 486. The average for the 10 drawings was 574.2. Many more than 10 is needed. Also notice there are patterns in the output such as 729 happened 4 of the 10 times and 486 happened 3 of the 10 times. — David, Jun 20 '15 at 14:06
I reran the simulation for 100 "decisions" and I am seeing a 493x boost in the chances of winning vs. the original game, where the numbers have to match exactly. So the "off by 1" rule makes it roughly 500 times easier to win this lottery game according to my simulation program, but I would like to know the exact figure mathematically. I could run thousands of decisions but it would have to run overnight. I could also try to use a better algorithm to speed it up, so I will do that too. — David, Jun 20 '15 at 14:34
Just check for choosing 3 out of 9 lottery. I choose numbers 2,5 & 8 which cover {1,2,3}, {4,5,6}, {7,8,9}. In a normal lottery, there would only be one way to win, here there will be $3^3$ = 27, which can easily be enumerated ! — true blue anil, Jun 20 '15 at 15:42
True but that is not what I am asking here. You are only showing 1 of the many cases that may happen. You may be assuming that the player will always pick his/her own numbers and that for "maximum coverage" they will use the "skip at least 3 between numbers chosen " method but that is not how lotteries usually work. Many players let the computer choose random numbers for them. So with that information, try to rethink your solution. My simulation is showing about a 500x increase in winning odds so roughly 1 in 32,000 chance to win vs. roughly 1 in 16 million for the original game. — David, Jun 21 '15 at 04:13
Here is an easy way to see why the 729x increase in winning probability fails (doesn't happen all the time). Suppose both the winning numbers and the ticket numbers are randomly chosen. Winning numbers happen to be 5, 10, 11, 20, 30, and 44. To match the first number (5), we can have either a 4, 5, or a 6 so we have increased our winning odds by 3x already. To match the 10, we can have a 9, 10, or 11 so again a 3x gain for 9x total so far. To match the 11, we have to be careful cuz we cannot choose 10 or 11 twice so in that case, it is NOT a 3x gain thus the 729x total fails in this case. — David, Jun 21 '15 at 13:32
In fact, when I plug the winning numbers 5, 10, 11, 20, 30, and 44 into my simulation program and just check that, it shows me 486 winning ticket combinations are possible. The worst case is if all the player's numbers are adjacent and contain either a 1 or a 50 (such as 1,2,3,4,5,6 or 45,46,47,48,49,50). In those 2 cases there is only a 7x boost in winning odds. If the chosen numbers are adjacent but do not contain a 1 or a 50, then there is a 28x boost (such as 2,3,4,5,6,7 or 44,45,46,47,48,49). I see maybe 30 or more different cases in my simulation and there may be more. — David, Jun 21 '15 at 13:44
I suggest you repost the question, clearly specifying that the player does not know that "off by one" is operating and chooses 6 numbers randomly. — true blue anil, Jun 21 '15 at 15:18
Voting to close, you're not stating the question in your post. We have to guess it and you're angry at us if we guess wrong. — Yuval Filmus, Jun 21 '15 at 15:38
I already updated the question to state random ticket. Look at the end of the first paragraph. I am not angry. If the question is ambiguous, responders are supposed to ask for clarification before answering a "wrong" interpretation. I agree I could have worded it better. Sorry about the confusion. — David, Jun 21 '15 at 15:42
Re your latest update, I can understand that it would be tiresome to enumerate. I experimented with only 3 winning balls, and the concept that the balls could have 0 gaps, 1 gap, 2 gaps between balls distributed in various ways, and count for each pattern ("bucket") the increased number of wins possible, but decided that computer help was needed. By the way, how many of a particular pattern exist could easily be found out using stars and bars — true blue anil, Jun 29 '15 at 07:58

Yuval Filmus · Answer 1 · 2015-06-20T05:51:06.293

2

To maximize her probability of winning, the gambler should choose any sequence $a_1 < a_2 < a_3 < a_4 < a_5 < a_6$ such that $|a_i - a_{i+1}| \geq 3$, $a_1 > 1$, $a_6 < 50$ (such sequences do exist, for example $3,6,9,12,15,18$). Each such sequence matches $3^6$ sequences using the off-by-one rule, so the winning probability grows by a factor of $3^6 = 729$.
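The $3^6$ count for a well-spaced ticket can be checked by brute force. The following is a minimal sketch, with `winning_draws` a hypothetical helper name: it tries all $3^6$ offset patterns for a given ticket and keeps those that yield a strictly increasing draw inside 1 to 50.

```python
from itertools import product


def winning_draws(ticket, n=50):
    """Enumerate every sorted draw that `ticket` matches under the off-by-1 rule."""
    ticket = sorted(ticket)
    draws = set()
    for offsets in product((-1, 0, 1), repeat=len(ticket)):
        draw = tuple(t + d for t, d in zip(ticket, offsets))
        in_range = all(1 <= x <= n for x in draw)
        increasing = all(a < b for a, b in zip(draw, draw[1:]))
        if in_range and increasing:
            draws.add(draw)
    return draws


print(len(winning_draws([3, 6, 9, 12, 15, 18])))  # 729
print(len(winning_draws([1, 2, 3, 4, 5, 6])))     # 7
```

The second line shows how far the count drops for a ticket that violates the spacing and endpoint conditions, which is why the factor 729 applies to this specific choice of ticket rather than to an arbitrary one.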

edited Jun 20 '15 at 05:51

answered Jun 20 '15 at 04:57

Yuval Filmus


They need to differ by at least 3 or you have overlap, and you don't want to pick 1 or 50 which cover 2, e.g. 1,3,5,7,9,11 only cover 12 numbers not 18. – BruceZ Jun 20 '15 at 05:44
np, 1 is still problematic – BruceZ Jun 20 '15 at 05:48
Right again, thanks! – Yuval Filmus Jun 20 '15 at 05:50
729 times as likely to win doesn't seem right. For example, if someone has the lottery machine pick random numbers for them, there is a chance that numbers 1 or 50 or both are chosen. Also, there is a chance that adjacent (neighboring) numbers will be chosen (such as in worst case, 5,6,7,8,9,10). In that case, the chances of winning are not "magnified" by anywhere close to 729. We would only pick up coverage on the numbers 4 and 11 in that case. For example, 4,6,7,8,9,11 would be a winner. – David Jun 20 '15 at 12:04
@David I'm afraid I don't agree with your argument. The probability of winning in straight lottery is $1/\binom{50}{6}$, whereas the probability of winning in off-by-one lottery is $729/\binom{50}{6}$. Compare this to the strategy which only bets on $1,2,3,4,5,6$ – its winning probability in straight lottery is $1/\binom{50}{6}$, even though conditional on the bet, the winning probability is either $0$ or $1$. – Yuval Filmus Jun 20 '15 at 16:07
If someone bets numbers 1,2,3,4,5 and 6 for the off by one lottery, then the only number they pick up for additional "coverage" is 7. There are not now 729 ways to win so your constant of 729 is not accurate in all cases. I agree in SOME cases a 729x boost in winning chances occurs but that is not the final correct answer because my simulation program is showing me otherwise and it is apparent by inspection. Just enumerate all the winning possible drawings that match chosen numbers 1,2,3,4,5 and 6 and you will see. I'll start you off. 123456, 123457, 123467, 123567...234567. – David Jun 21 '15 at 02:15
Your "bet every 3rd number" scheme won't work in all cases. I agree it covers many more combinations than in the straight lottery but not always 729 times as many. In fact, if you bet numbers 2,5,8,11,14, and 17, you won't even win if the drawn numbers are 1,2,3,4,5,6. The problem is that you are missing the close neighbor winning numbers, where a "block" of numbers close together is drawn. Even an embedded block will cause your method to fail, such as if the winning numbers were 2, 5, 8, 9, 10, and 17. – David Jun 21 '15 at 02:42
@David I still don't agree with your argument. Of course the bettor can bet non-optimally, but why would she? With my optimal betting scheme, her probability of winning, assuming that the numbers are drawn uniformly at random, is 729 times as much as in straight lottery. It is true that the lottery operator can foil my scheme. Perhaps you should define your goal more formally; are you interested in a betting scheme that guarantees a probability $p$ of winning under any lottery scheme, aiming at maximizing $p$? – Yuval Filmus Jun 21 '15 at 05:29
I am asking for randomly chosen numbers, what is the increase in winning odds using the off by 1 allowance vs. the straight betting system. I am not asking how to optimize the betting system. You can just assume the players 6 number choices are randomly generated by the computer such as 5, 10, 11, 20, 25, and 44. – David Jun 21 '15 at 11:50
The probability of winning the off by 1 lottery is NOT 729x (vs. the original format) for a randomly picked players ticket. It is only 729x in certain circumstances but not all. This is why I am seeing about a 503x boost in my simulation. The person stating 729x boost is stating the best case only but not the average case which is really what I am interested in. Please do not oversimplify and/or alter the question. I did not ask about optimal betting scheme. I asked what is the increase in chances of winning using off by 1, meaning average case (for a randomly generated players ticket). – David Jun 21 '15 at 12:03

@David, I think it would help if you clarified in the question itself that you are assuming that the player's six numbers are also chosen at random. When you asked "how much easier is it to win?" just about everyone assumed you were asking about an optimal player's strategy. I know I did, until I read your comments here. – Barry Cipra Jun 21 '15 at 13:42
But you have to realize that many people who play lotteries do not pick their own numbers; they use computer generated numbers. Also, even if the player picks their own numbers, there is no guarantee they will pick something like 2,5,8,11,14,17. They may pick something like my example, 5, 10, 11, 20, 30, and 44, which I have calculated to have only a 486x boost in winning chances vs. the original lottery game. Sorry I did not make that clear, but interpreting the question to mean optimal betting strategy would make it a trivial problem, since it is just a $3^6$ = 729x boost. – David Jun 21 '15 at 13:48
I didn't explicitly say the players numbers are chosen at random cuz they don't have to be. For example, many people play birthdays, ages of their kids... Those are not really random. The question becomes most interesting if it is interpreted to mean random though because I am interested in the "on average boost" and I am sure so would the lottery officials. It is interesting to me that allowing each chosen number to be off by 1 (at most) increases the odds of winning by slightly over 500 times. Many more people would win but the jackpots would be much smaller. – David Jun 21 '15 at 13:52
Also, it was implied in the question that I wasn't looking for an optimal betting strategy, because I stated to watch out for things like adjacent numbers such as 10 and 11 in the bettor's numbers. It should have been clear just from that that I was looking for an "on average" boost. – David Jun 21 '15 at 13:55

true blue anil · Answer 2 · 2015-06-20T14:35:12.623

0

I think the optimal strategy is to choose one of three sets like

[ A ] 2,5,8,11,14,17 covers 1 thru 18

[ B ] 18,21,24,27,30,33 covers 17 thru 34

[ C ] 34,37,40,43,46,49 covers 33 thru 50

Ways of winning are $3\cdot {}^{18}C_6$ = 55,692

Improvement in win probability = $^{50}C_6 / 55692$ ≈ 285 times

Edit:

Each of the chosen #s in each set cover 3#s, and there is no overlap,

e.g. set [ A ] covers {1, 2, 3}, {4, 5, 6}, .......

Thus each of [A], [ B] or [C] cover $3^6$ = 729 possible wins only.

We have decided to choose from only one set,

and in a normal lottery, there is only one way of winning,

thus win ratio = 729

The funny thing is, the figure has again become 729 !

edited Jun 20 '15 at 14:35

answered Jun 20 '15 at 13:19

true blue anil


Something seems "fishy" with this answer. You are describing 3 possible ticket combinations, one that covers the "lows, one that covers the "mids", and one that covers the "highs". However for those to be winners, all 6 drawn numbers have to be within that subset of numbers. For example, 6 "lows". Also, the coverage is not complete. For example, let's take your 2,5,8,11,14,17 scenario which "covers" 1 thru 18. What if the winning drawn numbers were 1, 3, 8, 11, 14, and 17? Your combination would be a losing ticket. There is no match for the drawn 3. Your "coverage" has gaps. – David Jun 20 '15 at 13:27
An easy way to see that 2,5,8,11,14,17 only covers a small subset of the winning "low" numbers is consider winning drawn numbers with neighbors such as 2,5,8,9,14,17. There is a 1 to 1 mapping of the chosen numbers and the winning drawn numbers. You will have no match for the number 9 since your 8 was used to match the drawn 8 and 11 is more than 1 away from 9. – David Jun 20 '15 at 13:32
Hmm, true. Let me brood over how to fix it ! In the meantime, it would be good to see the simulation result. – true blue anil Jun 20 '15 at 13:33
I didn't get the simulation running yet. I have to go to work but I can try it later. The simulation is just an approximation anyway. There should be a way to do this mathematically although I don't know at what level of difficulty. I think the original version is 1 out of 15,890,700 chance to win which is 1 out of (50 choose 6). The off by one rule should help that a lot but how much is not clear to me just yet. – David Jun 20 '15 at 13:35
If someone else would like to try a simulation I welcome that. Not sure how many iterations would be needed to give an accurate result, but 1 million draws minimum perhaps? I would just pick 6 random unique balls and then check how many different possible tickets (of the 15,890,700 of them) actually match. For any given set of 6 drawn balls, there may be hundreds of winning possible tickets, but I would like to know on average what that number is. It seems the off by 1 variation has a "variable boost" in winning depending on what numbers are actually drawn. I will post my results when I get them. – David Jun 20 '15 at 13:42
After a while, I will return and try to work out the difference overlap between A, B and C can make. – true blue anil Jun 20 '15 at 14:02
I think the 729 factor comes into play if the numbers chosen are all at least 3 apart from each other and other conditions are true such as no "end" numbers such as 1 and 50 are chosen either by the player or by the lottery drawing. I see that 729 case happen frequently in the simulation. Something tells me there are a small number of cases that can happen here so this is solvable mathematically by isolating the cases and computing the probabilities. I can do some analysis of the # of distinct cases and that should help someone solve this mathematically. – David Jun 20 '15 at 14:26
I have come to the conclusion that 729 is the correct answer. There can be no overlap because we decided to choose one of [A], [B] or [C]. I think you are looking at how you may not win. That is not necessary. There is only one way of winning under a normal lottery, here just count the number of ways you can win – true blue anil Jun 20 '15 at 14:39
729x boost in expected winning is a case that happens frequently but not all the time. My simulation is showing me "boosts" of anywhere between 216 and 729 with an average of around 500 or so. The tricky part of this problem is there are many cases, each one contributing a different amount of "boost" in expected win probability. I would have to run a simulation overnight to isolate all the possible scenarios. However I think it is safe to say there are at least 10 different boost cases and likely more, which tells me this problem may not be an easy one to do on paper. – David Jun 20 '15 at 14:50
