Some reviewers of my Dice Roller app have commented that the distribution of rolls isn't as random as it could (or should) be. This is a record of my in-progress attempts to test the randomness of the app. I appreciate (almost) all the reviews I get, even the negative ones.
One of the complaints about the app is that D20 rolls seem to be skewed toward the bottom end of the range. Another comment was that the distribution of rolls on a D20 doesn't seem to be adequately random, and that rolling a group of D20s resulted in a surprising number of duplicate values within the group.
First, let me say that there is no intentional biasing of results. I am making every effort to ensure that rolls are random. On default settings (in particular, when using the "throw" roll method), this includes:
Once the initial random forces above are imparted to each die, the rest of the roll is handled by the physics engine used in Unity (the game engine I'm using).
The first test was to use the app to roll 100 D20s twice, and then roll a physical D20 100 times, to see what kind of distribution they would result in. A disclaimer: I have no idea if my physical dice are fair. I have no reason to believe they aren't.
Before you scroll down too far, compare the three bar charts below. The x axis is the die result (1-20) and the y axis is the number of results in a test of 100 rolls of the die. Two of the charts are the results of the app, and one of the charts is the results of rolling a physical die. Pick the one that you think is the physical die, and scroll down to see if you're right.
If you picked the second chart as the physical die rolls, you are correct. While shockingly unscientific, this first test suggests to me that the app is providing no worse a distribution than rolling a real, physical D20.
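For comparison, here is a minimal sketch of the same kind of experiment with no physics involved: 100 simulated D20 rolls printed as a text bar chart. It uses Python's built-in pseudo-random generator, not the app's Unity physics, and the function name and seed are placeholders chosen for illustration. With only 100 rolls spread over 20 faces, the expected count is 5 per face, but individual faces typically land well above or below that, so a lumpy chart is normal even for a fair source.

```python
import random
from collections import Counter


def roll_d20_tally(rolls=100, seed=None):
    """Tally simulated D20 results from Python's pseudo-random generator."""
    rng = random.Random(seed)
    return Counter(rng.randint(1, 20) for _ in range(rolls))


# The seed is arbitrary; change it to see a different batch of 100 rolls.
tally = roll_d20_tally(rolls=100, seed=1)

# One row per face, one '#' per time that face came up.
for face in range(1, 21):
    print(f"{face:2d} {'#' * tally[face]}")
```

Running it with a few different seeds gives a feel for how uneven 100 fair rolls can look.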
The second test uses Pearson's chi-squared test as explained on Stack Exchange. Feel free to follow those links if you want to know more. Note that I'm merely following along with the Stack Exchange post here; if there's something wrong with this methodology, please do let me know.
For this test, I rolled four times the minimum number of dice, did the calculations, and here are my results:
Die | D4 | D6 | D8 | D10 | D12 | D20 |
---|---|---|---|---|---|---|
Rolls | 80 | 120 | 180 | 200 | 240 | 400 |
Result | 0.70 | 5.90 | 10.49 | 13.00 | 13.80 | 9.50 |
If we cross-check each result against the "upper-tails critical values" table (reproduced below from the Stack Exchange site), we can see that every result falls below the critical value in the 0.90 column. In other words, the test finds no evidence that any of the dice are biased.
v | 0.90 | 0.95 | 0.975 | 0.99 | 0.999 |
---|---|---|---|---|---|
3 (D4) | 6.251 | 7.815 | 9.348 | 11.345 | 16.266 |
5 (D6) | 9.236 | 11.070 | 12.833 | 15.086 | 20.515 |
7 (D8) | 12.017 | 14.067 | 16.013 | 18.475 | 24.322 |
9 (D10) | 14.684 | 16.919 | 19.023 | 21.666 | 27.877 |
11 (D12) | 17.275 | 19.675 | 21.920 | 24.725 | 31.264 |
19 (D20) | 27.204 | 30.144 | 32.852 | 36.191 | 43.820 |
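The calculation behind each "Result" value can be sketched in a few lines. This is a minimal illustration of Pearson's statistic against a uniform expectation, not the exact spreadsheet or script used for the numbers above. The per-face tallies in `d6_counts` are made up for the example, and the critical values are the 0.90 column copied from the table above.

```python
# Critical values at the 0.90 level, copied from the table above,
# keyed by number of faces (degrees of freedom = faces - 1).
CRITICAL_090 = {4: 6.251, 6: 9.236, 8: 12.017, 10: 14.684, 12: 17.275, 20: 27.204}


def chi_squared(observed):
    """Pearson's statistic for per-face counts against a uniform expectation."""
    total = sum(observed)
    expected = total / len(observed)  # a fair die expects the same count per face
    return sum((count - expected) ** 2 / expected for count in observed)


# Hypothetical tallies for 120 rolls of a D6 (illustrative only, not app data).
d6_counts = [22, 17, 20, 26, 16, 19]

stat = chi_squared(d6_counts)
threshold = CRITICAL_090[len(d6_counts)]
print(f"chi-squared = {stat:.2f}, 0.90 critical value = {threshold}")
print("no evidence of bias" if stat < threshold else "possible bias")
```

A statistic below the critical value means the observed counts are within the spread a fair die would plausibly produce; it does not prove the die is fair.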
As I mentioned earlier, one user emailed me and said that he rolled 5D20 twice, and found that the first roll contained three 10s and the second roll contained three 3s. This surprised him (and me), and it just doesn't seem right.
I'm not sure how to test this. Just for fun, I rolled 50 5D20 groups using the app and also 50 5D20 groups using physical dice (again, I don't know if my physical dice are fair), and these are the results:
Source | Pairs | Two pairs | Three of a kind |
---|---|---|---|
App | 18 (36%) | 2 (4%) | 1 (2%) |
Physical | 17 (34%) | 2 (4%) | 0 (0%) |
I'm not sure if we can draw any firm conclusions from a test this small, but it seems to me that the app isn't producing noticeably more duplicates than real dice.
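One way to sanity-check these counts is to compute what fair dice should produce. The sketch below calculates the exact probability of each duplicate pattern for five fair D20s by counting outcomes. It assumes "Pairs" in the table means exactly one pair with no other duplicates, and "Three of a kind" means a triple plus two unmatched dice. The function name and pattern labels are my own.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def pattern_probability(pattern, sides=20):
    """Exact chance that fair dice show the given group sizes, e.g. (2, 1, 1, 1)."""
    dice = sum(pattern)
    # Ways to pick distinct face values for the groups (ordered picks first).
    values = 1
    for i in range(len(pattern)):
        values *= sides - i
    # Groups of equal size are interchangeable, so divide out their orderings.
    for repeats in Counter(pattern).values():
        values //= factorial(repeats)
    # Ways to assign the individual dice to the groups (multinomial coefficient).
    arrangements = factorial(dice)
    for size in pattern:
        arrangements //= factorial(size)
    return Fraction(values * arrangements, sides ** dice)


patterns = {
    "no duplicates": (1, 1, 1, 1, 1),
    "one pair": (2, 1, 1, 1),
    "two pairs": (2, 2, 1),
    "three of a kind": (3, 1, 1),
    "full house": (3, 2),
    "four of a kind": (4, 1),
    "five of a kind": (5,),
}
probabilities = {name: pattern_probability(p) for name, p in patterns.items()}

for name, prob in probabilities.items():
    print(f"{name:16s} {float(prob):.4%}")
```

Under those assumptions, fair dice give roughly 36.3% for one pair, 3.2% for two pairs, and 2.1% for three of a kind, which is close to both rows of the table. Over 40% of fair 5D20 rolls contain at least one duplicate.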
It seems like the app is rolling fair dice. The Pearson test does indeed suggest that the dice are probably not biased, and when comparing the app results to physical dice results, I don't see anything unusual.
We as human beings see patterns everywhere, even when they're created by random chance. Sometimes crazy things happen as a result of randomness, and it's tempting to say that there's an order to it, or a bias, but often there just isn't.
And remember, if we held a global coin-flipping championship, someone would win.