Dice Roller Testing

Some reviewers of my Dice Roller app have commented that the distribution of rolls isn't as random as it could (or should) be. This is a record of my in-progress attempts to test the randomness of the app. I appreciate (almost) all the reviews I get, even the negative ones.

One of the complaints regarding the app is that the numbers on D20 rolls seem to be skewed toward the bottom end of the range. Another comment was that the distribution of rolls on a D20 doesn't seem to be adequately random, and that rolling a group of D20s resulted in a surprising number of duplicate values within the group.

First, let me say that there is no intentional biasing of results. I am making every effort to ensure that rolls are random. On default settings (in particular, when using the "throw" roll method), this includes:

A random initial rotation on the die,
a random forward force on the die,
a random left/right force on the die, and
a random spin on the die.

Once the initial random forces above are imparted to each die, the rest of the roll is handled by the physics engine used in Unity (the game engine I'm using).

100D20

The first test was to use the app to roll 100 D20s twice, and then roll a physical D20 100 times, to see what kind of distribution each would produce. A disclaimer: I have no idea if my physical dice are fair, but I have no reason to believe they aren't.

Before you scroll down too far, compare the three bar charts below. The x-axis is the die result (1-20) and the y-axis is the number of times that result came up in a test of 100 rolls of the die. Two of the charts show results from the app, and one shows the results of rolling a physical die. Pick the one that you think is the physical die, and scroll down to see if you're right.

If you picked the second chart as the physical die rolls, you are correct. While shockingly unscientific, this first test suggests to me that the app is providing no worse a distribution than rolling a real, physical D20.
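To get a feel for how uneven 100 fair rolls can look, here is a minimal Python sketch that simulates 100 D20 rolls and prints a rough text bar chart. It uses Python's built-in pseudo-random generator, not the app's physics, and the function name roll_histogram is made up for this illustration. With 100 rolls over 20 faces, each face is expected about 5 times, but individual faces often land well above or below that.

```python
import random
from collections import Counter


def roll_histogram(num_rolls=100, sides=20, seed=None):
    """Simulate fair die rolls and return the tally for faces 1..sides."""
    rng = random.Random(seed)  # seeded generator so a run can be repeated
    counts = Counter(rng.randint(1, sides) for _ in range(num_rolls))
    return [counts.get(face, 0) for face in range(1, sides + 1)]


# Print one row per face, with one '#' per time that face came up.
for face, count in enumerate(roll_histogram(seed=1), start=1):
    print(f"{face:2d} {'#' * count}")
```

Changing the seed gives a different, equally lumpy chart each time.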

Pearson

The second test uses Pearson's chi-squared test as explained on Stack Exchange. Feel free to follow those links if you want to know more. Note that I'm merely following along with the Stack Exchange post here; if there's something wrong with this methodology, please do let me know.

For this test, I rolled four times the minimum number of dice, did the calculations, and here are my results:

	D4	D6	D8	D10	D12	D20
Rolls	80	120	180	200	240	400
Result	0.70	5.90	10.49	13.00	13.80	9.50
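
For anyone who wants to repeat the calculation: Pearson's statistic is the sum, over every face, of (observed - expected)^2 / expected, where the expected count for a fair die is the number of rolls divided by the number of faces. The sketch below is one way to compute it in Python. The function name and the example tallies are invented for illustration and are not the app's recorded rolls.

```python
def chi_squared_statistic(observed_counts):
    """Pearson's chi-squared statistic for a die assumed to be fair.

    observed_counts: how many times each face came up, one entry per face.
    """
    total_rolls = sum(observed_counts)
    expected = total_rolls / len(observed_counts)  # same for every face
    return sum((obs - expected) ** 2 / expected for obs in observed_counts)


# Hypothetical tallies for 80 rolls of a D4 (made up for this example).
example_d4 = [22, 17, 20, 21]
print(round(chi_squared_statistic(example_d4), 2))
```

A perfectly even tally gives 0, and the statistic grows as the tallies drift further from the expected count.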

If we cross-check each result against the "upper-tail critical values" table (reproduced below from the Stack Exchange site), we can see that every result falls below the critical value in the 0.90 column for its die. In other words, the test finds no evidence of bias, which suggests that the dice are probably fair.

v	0.90	0.95	0.975	0.99	0.999
3 (D4)	6.251	7.815	9.348	11.345	16.266
5 (D6)	9.236	11.070	12.833	15.086	20.515
7 (D8)	12.017	14.067	16.013	18.475	24.322
9 (D10)	14.684	16.919	19.023	21.666	27.877
11 (D12)	17.275	19.675	21.920	24.725	31.264
19 (D20)	27.204	30.144	32.852	36.191	43.820
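
The cross-check can also be written out as a few lines of Python. This sketch simply copies the results and the 0.90 column from the tables above; the variable names are made up for the example.

```python
# Chi-squared results from the test, keyed by die.
results = {"D4": 0.70, "D6": 5.90, "D8": 10.49,
           "D10": 13.00, "D12": 13.80, "D20": 9.50}

# Critical values from the 0.90 column of the table above.
critical_090 = {"D4": 6.251, "D6": 9.236, "D8": 12.017,
                "D10": 14.684, "D12": 17.275, "D20": 27.204}

# A result below the critical value means no evidence of bias at that level.
below = {die: results[die] < critical_090[die] for die in results}

for die, ok in below.items():
    verdict = "below" if ok else "at or above"
    print(f"{die}: {results[die]:.2f} is {verdict} {critical_090[die]:.3f}")
```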

Dupes in Groups

As I mentioned earlier, one user emailed me and said that he rolled 5D20 twice, and found that the first roll contained three 10s and the second roll contained three threes. This surprised him (and me), and it just doesn't seem right.

I'm not sure how to test this. Just for fun, I rolled 50 5D20 groups using the app and also 50 5D20 groups using physical dice (again, I don't know if my physical dice are fair), and these are the results:

	Pairs	Two pairs	Three of a kind
App	18 (36%)	2 (4%)	1 (2%)
Physical	17 (34%)	2 (4%)	0 (0%)

I'm not sure we can draw any firm conclusions from this test, but it seems to me that the app isn't producing noticeably more duplicates than real dice do.
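
One way to put these counts in context is to work out how often each pattern should appear with perfectly fair dice. The Python sketch below counts every possible 5D20 roll exactly, grouping rolls by their pattern of repeated faces. It assumes fair, independent dice, and the function name is invented for this example. Under that assumption, exactly one pair shows up in about 36.3% of rolls, two pairs in about 3.2%, and three of a kind (with no accompanying pair) in about 2.1%, which is close to both rows of the table above if its columns are read the same way.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations_with_replacement
from math import factorial


def pattern_probabilities(sides=20, dice=5):
    """Exact probability of each duplicate pattern for fair dice.

    Patterns are tuples of repeat counts, e.g. (2, 1, 1, 1) is one pair.
    """
    totals = Counter()
    # Walk every unordered set of faces, then weight it by the number
    # of ordered rolls that produce it (a multinomial coefficient).
    for faces in combinations_with_replacement(range(1, sides + 1), dice):
        counts = sorted(Counter(faces).values(), reverse=True)
        orderings = factorial(dice)
        for c in counts:
            orderings //= factorial(c)
        totals[tuple(counts)] += orderings
    total_rolls = sides ** dice
    return {p: Fraction(n, total_rolls) for p, n in totals.items()}


probs = pattern_probabilities()
for pattern in [(2, 1, 1, 1), (2, 2, 1), (3, 1, 1)]:
    print(pattern, f"{float(probs[pattern]):.1%}")
```

By the same count, a single 5D20 roll contains three or more matching dice about 2.3% of the time, so it is uncommon but far from impossible.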

Conclusions

It seems like the app is rolling fair dice. The Pearson test does indeed suggest that the dice are probably not biased, and when comparing the app results to physical dice results, I don't see anything unusual.

We as human beings see patterns everywhere, even when they're created by random chance. Sometimes crazy things happen as a result of randomness, and it's tempting to say that there's an order to it, or a bias, but often there just isn't.

And remember, if we held a global coin-flipping championship, someone would win.