6.4 Fisher’s exact test

This was all nice, but there is a small problem with the \(\chi^2\) test: it relies on some approximations and works well only for large sample sizes. How large? Well, I’ve heard about the rule of fives (that’s what I call it). The rule states that there should be >= 50 (not quite 5) observations per matrix and >= 5 expected observations per cell (this applies to every cell). If this assumption does not hold, one should use, e.g. Fisher’s exact test (Fisher, yes, I think I’ve heard that name before).

So let’s assume for a moment that we were able to collect somewhat less data, as in the matrix below:

mEyeColorSmall = round.(Int, mEyeColor ./ 20)
mEyeColorSmall
2×2 Matrix{Int64}:
 11   8
 14  16

Here, we reduced the number of observations 20 times compared to the original mEyeColor matrix from the previous section. Since the test we are going to apply (Htests.FisherExactTest) requires integers, instead of rounding a number to 0 digits [e.g. round(12.3, digits = 0) would return 12.0, so a Float64] we asked the round function to deliver the closest integers (e.g. 12).
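To see where the small matrix stands with respect to the rule of fives, we can compute the expected counts by hand. The snippet below is only a minimal sketch: getExpectedCounts is a hypothetical helper written for this illustration (it is not part of Htests), and the counts are typed in manually from the matrix above.

```julia
# counts typed in by hand from the small matrix above
mSmall = [11 8; 14 16]

# hypothetical helper: expected count per cell under H0, i.e.
# (row total * column total) / grand total
function getExpectedCounts(m::Matrix{Int})::Matrix{Float64}
    rowSums = sum(m, dims = 2) # 2x1 matrix with row totals
    colSums = sum(m, dims = 1) # 1x2 matrix with column totals
    return (rowSums * colSums) ./ sum(m)
end

mExpected = getExpectedCounts(mSmall)
nTotal = sum(mSmall)

# the rule of fives: >= 50 observations in total and >= 5 expected per cell
isRuleOfFivesMet = nTotal >= 50 && all(mExpected .>= 5)
```

Here every expected count stays above 5, but the total number of observations (49) falls just short of 50, so isRuleOfFivesMet ends up false.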

OK, let’s run the said Htests.FisherExactTest. Right away we see a problem, since the test requires separate integers as input: Htests.FisherExactTest(a::Integer, b::Integer, c::Integer, d::Integer).

Note: Just like the Real type from Section 3.4, Integer is a supertype. It encompasses, e.g. Int and BigInt that we met in Section 3.9.5.
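The relations between the types can be checked in the REPL with Base Julia alone. The variable names below are made up for this sketch. The last two lines illustrate why we preferred round(Int, ...) over round(..., digits = 0) in the first place.

```julia
# <: asks "is the type on the left a subtype of the type on the right?"
isIntAnInteger = Int <: Integer              # true
isBigIntAnInteger = BigInt <: Integer        # true
isIntegerAReal = Integer <: Real             # true
isIntegerAbstract = isabstracttype(Integer)  # true, it only groups other types

roundedToInt = round(Int, 12.3)              # 12 (an Int)
roundedToFloat = round(12.3, digits = 0)     # 12.0 (a Float64)

isFirstOk = roundedToInt isa Integer         # true, accepted as a::Integer
isSecondOk = roundedToFloat isa Integer      # false, not accepted as a::Integer
```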

Still, we can obtain the necessary results very simply, by:

# assignment goes column by column (left to right), value by value
a, c, b, d = mEyeColorSmall

Htests.FisherExactTest(a, b, c, d)
Fisher's exact test
-------------------
Population details:
    parameter of interest:   Odds ratio
    value under h_0:         1.0
    point estimate:          1.55691
    95% confidence interval: (0.4263, 5.899)

Test summary:
    outcome with 95% confidence: fail to reject h_0
    two-sided p-value:           0.6373

Details:
    contingency table:
        11   8
        14  16
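As an aside, for those curious where such a p-value may come from, here is a rough sketch in Base Julia. It rests on two assumptions that are mine and are not taken from the package's source: that with fixed row and column totals the top-left cell follows the hypergeometric distribution, and that the two-sided p-value is twice the smaller one-sided tail (capped at 1). The functions getTableProb and getFisherPval are hypothetical helpers written for this illustration only.

```julia
# counts typed in by hand from the small matrix above
mSmall = [11 8; 14 16]
# Julia walks through a matrix column by column, hence the a, c, b, d order
a, c, b, d = mSmall

# probability of exactly k observations in the top-left cell,
# given fixed row and column totals (hypergeometric distribution)
function getTableProb(k::Int, a::Int, b::Int, c::Int, d::Int)::Float64
    row1, col1, col2 = a + b, a + c, b + d
    return binomial(col1, k) * binomial(col2, row1 - k) /
           binomial(col1 + col2, row1)
end

# ASSUMPTION: two-sided p-value = 2 * the smaller one-sided tail, capped at 1
function getFisherPval(a::Int, b::Int, c::Int, d::Int)::Float64
    row1, col1, col2 = a + b, a + c, b + d
    # smallest and largest count possible in the top-left cell
    kMin, kMax = max(0, row1 - col2), min(row1, col1)
    pLeft = sum(getTableProb(k, a, b, c, d) for k in kMin:a)
    pRight = sum(getTableProb(k, a, b, c, d) for k in a:kMax)
    return min(1.0, 2 * min(pLeft, pRight))
end

pvalSmall = getFisherPval(a, b, c, d)
```

Under these assumptions pvalSmall should come out close to the two-sided p-value printed above. Note that binomial on plain Int throws an OverflowError for large counts, so the sketch is meant for small tables like this one.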

We are not going to discuss the output in detail. Still, we can see that here, due to the small sample size, we don’t have enough evidence to reject \(H_{0}\) (p > 0.05) in favor of \(H_{A}\). Interestingly, due to the small sample size we came to a different conclusion despite the same underlying populations and the same proportions.

Let’s make an analogy here and take it to an extreme. Imagine I have two coins in my pocket, one fair (50/50 heads to tails ratio) and one biased (70/30 heads to tails ratio). I give you one to find out which coin it is. That’s easy to settle with 1’000 tosses (since you would get, e.g. a 688/312 heads to tails ratio instead of 494/506), but it is not possible to do it with just one toss (no matter the outcome). With three tosses and two heads we still cannot be sure, since a fair coin would have produced this exact outcome with a probability of 37.5% (HHT, or THH, or HTH, each with p = \(\left(\frac{1}{2}\right)^3 = \frac{1}{8} = 0.125\)) and a more extreme one (HHH) with a probability of 12.5% (\(\left(\frac{1}{2}\right)^3 = \frac{1}{8} = 0.125\)). So, there just wouldn’t be enough evidence.
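The coin arithmetic can be double-checked with a few lines of Base Julia. This is a minimal sketch: getBinomProb is a hypothetical helper (not a library function) that applies the binomial formula directly.

```julia
# probability of exactly k heads in n tosses of a coin with P(heads) = pHeads
function getBinomProb(k::Int, n::Int, pHeads::Float64)::Float64
    return binomial(n, k) * pHeads^k * (1 - pHeads)^(n - k)
end

# fair coin, three tosses
fairTwoHeads = getBinomProb(2, 3, 0.5)     # 0.375 (HHT, THH, HTH)
fairThreeHeads = getBinomProb(3, 3, 0.5)   # 0.125 (HHH)

# biased coin (70/30), three tosses
biasedTwoHeads = getBinomProb(2, 3, 0.7)   # roughly 0.441
biasedThreeHeads = getBinomProb(3, 3, 0.7) # roughly 0.343
```

So two or more heads in three tosses happen about half of the time with the fair coin (0.375 + 0.125) and roughly 78% of the time with the biased one. Both coins produce such an outcome readily, which is why three tosses cannot tell them apart.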



CC BY-NC-SA 4.0 Bartlomiej Lukaszuk