Odds of pulling four of the same champ from fourteen featured crystals
DNA3000
Member, Guardian › Posts: 19,693
Saw a recent video which asked this question, and thought it would be worth going through the math because I suspect this lands in very unintuitive territory. Also, with the new featured crystal rolling over, I thought this might be topical.
Note: I'm not calling out anyone here. If anyone knows the video I'm referencing, this is strictly an exercise in probability I thought would be interesting. I don't actually believe the video author was making any accusations or anything; they were just saying something off the cuff.
How do you even calculate something like this? There is the direct way, which is actually much more complicated than you might think, and then there's the cop-out way using a computer. I'm going with the easy way. But first, for those reading this, wanna take a guess what the odds of pulling at least four of the same champ from a featured crystal with twenty-four different random possibilities would be if you opened fourteen of them? My guess is the average guess would be one percent or lower. Having done a ton of these calculations, I know that's way too low.
You might think there's an easy closed form formula for this, but those formulas run into a problem: with fourteen pulls, when you try counting what four of them do, you end up miscounting what happens in the other ten in lots of very subtle ways. Fourteen pulls can literally pack three four of a kinds in there in theory. There are two ways to get around this: do an iterative calculation where you calculate the specific odds of exactly four of a kind, five of a kind, and so on, and sum, or do an inclusion/exclusion calculation that accounts for the set permutations directly. Both of these suck, so I'm instead going to just whip up a Python script to do this Monte Carlo style.
Basically, we just want to roll a 24-sided die fourteen times, and determine how many times the most common number came up. We record that, repeat the whole thing, say, a million times, and generate a frequency table: how often do we get four of a kind, five of a kind, and so on. We only care if the fourteen pulls generated *at least* one four of a kind or higher, so recording the highest frequency result is good enough. And we get:
[0, 8120, 595188, 345642, 46690, 4091, 254, 15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
So, out of a million tries, we get 8120 where the highest multiple was one, which is to say they were all different. That's 0.00812, or 0.812%, or one in 123. Getting all different pulls is actually quite uncommon. The odds of four of a kind are 46690 out of a million, or 4.7%. But that's exactly four of a kind. The odds for four of a kind or more are the sum of four of a kind, five of a kind, six of a kind, and so on. Which, for this estimate, is 46690 + 4091 + 254 + 15 = 51050.
So the odds of getting at least four of a kind in fourteen pulls are about 5%, or about one in twenty. This falls into the category of not common, but not exactly rare either.
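For anyone who wants to check the arithmetic, here is a small sketch that redoes the tallies from the table above and compares the simulated all-different share against the exact value, which is easy to get in this one case: 24 × 23 × ... × 11 ordered pulls out of 24^14. The variable names are mine, and the trailing zeros of the table are left off.

```python
from math import perm

# Frequency table copied from the post (trailing zero entries omitted).
# table[k] = number of trials whose most-pulled champ showed up k times.
table = [0, 8120, 595188, 345642, 46690, 4091, 254, 15]
trials = sum(table)

all_different_sim = table[1] / trials        # no champ repeated at all
at_least_four_sim = sum(table[4:]) / trials  # four of a kind or better

# Exact all-different probability: ordered picks of 14 distinct champs
# out of 24, divided by every possible sequence of 14 pulls.
all_different_exact = perm(24, 14) / 24**14

print(f"all different (simulated): {all_different_sim:.5f}")
print(f"all different (exact):     {all_different_exact:.5f}")
print(f"at least four of a kind:   {at_least_four_sim:.5f}")
```

The two all-different numbers should agree to within normal sampling noise, which is a decent sign the table itself is sane.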
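A minimal sketch of the kind of simulation described above, for anyone who wants to rerun it. This is not the original script: the function name, parameters and seed handling are my own assumptions, and it assumes every one of the 24 champs is equally likely on each pull.

```python
import random
from collections import Counter

def max_multiplicity_table(pulls=14, champs=24, trials=100_000, seed=None):
    """Simulate `trials` batches of `pulls` crystals.

    Returns a list where table[k] is the number of batches in which the
    most-pulled champ appeared exactly k times.
    """
    rng = random.Random(seed)
    table = [0] * (pulls + 1)  # highest multiple can be anywhere from 1 to pulls
    for _ in range(trials):
        # One batch: each pull is a uniform pick among the champs.
        counts = Counter(rng.randrange(champs) for _ in range(pulls))
        table[max(counts.values())] += 1
    return table

if __name__ == "__main__":
    trials = 100_000  # bump to 1_000_000 to match the post; it just takes longer
    table = max_multiplicity_table(trials=trials)
    print(table)
    print("at least four of a kind:", sum(table[4:]) / trials)
```

Every run gives slightly different counts unless you pass a seed, so expect the at-least-four share to wobble around 5% rather than match the table digit for digit.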
I suspect most people would have guessed way lower than that. It *seems* like it should be much rarer. I do this a lot and my original guess was one in thirty which is still lower than the actual result.
Full disclosure: I did a back-of-envelope calculation for this first, got an answer I thought was sus, then did the Monte Carlo estimation script, which showed I made an error in my first lazy calculation. Always be careful about statistics calculations: everyone makes mistakes with those.
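If you want an exact cross-check instead of an estimate, one way is to count sequences champ by champ. It is a variation on the iterative idea mentioned earlier, not the back-of-envelope calculation and not anything from the original script: it counts every sequence of fourteen pulls in which no champ shows up more than three times, and takes the complement. The function name and parameters are made up for illustration, and it assumes all 24 champs are equally likely.

```python
from fractions import Fraction
from math import comb

def prob_max_at_most(cap, pulls=14, champs=24):
    """Exact probability that no champ appears more than `cap` times."""
    # ways[n] = number of ordered pull sequences of length n that use only
    # the champs handled so far, each appearing at most `cap` times.
    ways = [1] + [0] * pulls
    for _ in range(champs):
        new = [0] * (pulls + 1)
        for used, w in enumerate(ways):
            if w == 0:
                continue
            for k in range(min(cap, pulls - used) + 1):
                # Give this champ k pulls: choose which k of the (used + k)
                # positions it occupies; the rest keep their existing order.
                new[used + k] += w * comb(used + k, k)
        ways = new
    return Fraction(ways[pulls], champs ** pulls)

p_at_least_four = 1 - prob_max_at_most(3)
print(float(p_at_least_four))
print(float(prob_max_at_most(1)))  # all fourteen pulls different
```

The first number should land close to the simulated 5.1%, and the second close to the 0.812% all-different share, which makes this a useful sanity check on both the simulation and any shortcut formula.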
Comments
This is improbable, so why does it happen, and (I am being bold in saying this) relatively frequently? How does this happen? The RNG might be set to a 0.3% payout, but assuming there is only one crystal RNG in MCOC with, let's say, 100k players, if a group of them are opening crystals at the same time, these rates are artificially inflated, as there are only 300 champs.
Put this in context with Powerball (lotto), where there is a 1 in 134 million chance to win given 134 million combinations of sequence and number: wins aren't normally distributed, which calls your Monte Carlo simulation into question. Sometimes no one wins for weeks; other times two or three people win.
What's the solution?
Kabam could (if not done already) put people in groups of x to ensure their median monthly crystal opening is similar, which would standardise the weighting of pulling unique and multiple champs.
" on the back of an envelope"