The analysis of categorical data: Fisher's Exact test
Jenny V Freeman, Michael J Campbell
Introduction
In the previous tutorial we outlined some simple methods for analysing binary data, including the comparison of two proportions using the Normal approximation to the binomial and the Chi-squared test (1). However, these methods are only approximations, although the approximations are good when the sample size is large. When the sample size is small we can evaluate all possible combinations of the data and compute what are known as exact P-values.
Fisher's Exact test
When one of the expected values (note: not the observed values) in a 2x2 table is less than 5, and especially when it is less than 1, then Yates' correction can be improved upon. In this case Fisher's Exact test, proposed in the mid-1930s almost simultaneously by Fisher, Irwin and Yates (2), can be applied. The null hypothesis for the test is that there is no association between the rows and columns of the 2x2 table, such that the probability of a subject being in a particular row is not influenced by being in a particular column. If the columns represent the study group and the rows represent the outcome, then the null hypothesis can be interpreted as saying that the probability of having a particular outcome is not influenced by the study group, and the test evaluates whether the two study groups differ in the proportions with each outcome.
An important assumption for all of the methods outlined, including Fisher's exact test, is that the binary data are independent. If the proportions are correlated, then more advanced techniques should be applied. For example, in the leg ulcer example of the previous tutorial (1), if there were more than one leg ulcer per patient, we could not treat the outcomes as independent.
The test is based upon calculating directly the probability of obtaining the results that we have obtained (or results more extreme) if the null hypothesis is actually true, using all possible 2x2 tables that could have been observed for the same row and column totals as the observed data. These row and column totals are also known as marginal totals. What we are trying to establish is how extreme our particular table (combination of cell frequencies) is in relation to all the possible ones that could have occurred given the marginal totals.
This is best explained by a simple worked example. The data below come from an RCT comparing intra-muscular magnesium injections with placebo for the treatment of chronic fatigue syndrome (3). Of the 15 patients who had the intra-muscular magnesium injections 12 felt better (80%), whereas, of the 17 on placebo, only 3 felt better (18%).
Table 1: Results of the study to examine whether intramuscular magnesium is better than placebo for the treatment of chronic fatigue syndrome
                       Magnesium   Placebo   Total
Felt better                12          3       15
Did not feel better         3         14       17
Total                      15         17       32
There are 16 different ways of rearranging the cell frequencies for the above table, whilst keeping the marginal totals the same, as illustrated below in figure 1. The result that corresponds to our observed cell frequencies is (xiii):
Figure 1: Illustration of all the different ways of rearranging cell frequencies in table 1, but with the marginal totals remaining the same
(i)       (ii)      (iii)     (iv)
 0 15      1 14      2 13      3 12
15  2     14  3     13  4     12  5

(v)       (vi)      (vii)     (viii)
 4 11      5 10      6  9      7  8
11  6     10  7      9  8      8  9

(ix)      (x)       (xi)      (xii)
 8  7      9  6     10  5     11  4
 7 10      6 11      5 12      4 13

(xiii)    (xiv)     (xv)      (xvi)
12  3     13  2     14  1     15  0
 3 14      2 15      1 16      0 17
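The enumeration in figure 1 can be reproduced with a few lines of code. The following is a minimal sketch in Python, not part of the original tutorial; the function name tables_with_fixed_margins and its argument names are illustrative assumptions. Once the marginal totals are fixed, the whole table is determined by the top-left cell a, so it is enough to step a through its permitted range.

```python
def tables_with_fixed_margins(row1_total, row2_total, col1_total):
    """Yield every 2x2 table (a, b, c, d) that has the given marginal totals.

    Hypothetical helper for illustration: the layout is
        a  b   | row1_total
        c  d   | row2_total
    with a + c = col1_total.
    """
    # a cannot be negative, and c = col1_total - a cannot exceed row2_total.
    a_min = max(0, col1_total - row2_total)
    # a cannot exceed either its row total or its column total.
    a_max = min(row1_total, col1_total)
    for a in range(a_min, a_max + 1):
        b = row1_total - a
        c = col1_total - a
        d = row2_total - c
        yield (a, b, c, d)


# Margins of table 1: 15 felt better, 17 did not, 15 received magnesium.
tables = list(tables_with_fixed_margins(15, 17, 15))
print(len(tables))
for index, table in enumerate(tables, start=1):
    print(index, table)
```

With the margins of table 1 this lists the 16 tables of figure 1 in order, and the thirteenth is the observed table (12, 3, 3, 14).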
The general form of table 1 is given in table 2 and under the null hypothesis of no association Fisher showed that the probability of obtaining the frequencies, a, b, c and d in table 2 is
\[ P = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{(a+b+c+d)!\,a!\,b!\,c!\,d!} \qquad (1) \]
where x! is the product of all the integers between 1 and x, e.g. 5! = 1×2×3×4×5 = 120 (note that for the purpose of this calculation, we define 0! as 1). Thus for each of the results (i) to (xvi) the exact probability of obtaining that result can be calculated (table 3). For example, the probability of obtaining table (i) in figure 1 is
\[ \frac{15!\,17!\,15!\,17!}{32!\,0!\,15!\,15!\,2!} = 0.0000002 \]
Table 2: General form of table 1
           Column 1   Column 2   Total
Row 1         a          b       a+b
Row 2         c          d       c+d
Total        a+c        b+d      a+b+c+d
Table 3: Probabilities associated with each of the frequency tables above, calculated using formula 1
Table     a     b     c     d    Probability
i         0    15    15     2    0.0000002
ii        1    14    14     3    0.0000180
iii       2    13    13     4    0.0004417
iv        3    12    12     5    0.0049769
v         4    11    11     6    0.0298613
vi        5    10    10     7    0.1032349
vii       6     9     9     8    0.2150728
viii      7     8     8     9    0.2765221
ix        8     7     7    10    0.2212177
x         9     6     6    11    0.1094916
xi       10     5     5    12    0.0328475
xii      11     4     4    13    0.0057426
xiii     12     3     3    14    0.0005469
xiv      13     2     2    15    0.0000252
xv       14     1     1    16    0.0000005
xvi      15     0     0    17    0.0000000
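The probabilities in table 3 can be checked directly from formula 1. The sketch below is an illustration in Python rather than the authors' own calculation; the function name table_probability is an assumption made for this example. It applies the factorial formula to each of the 16 tables with the margins of table 1.

```python
from math import factorial


def table_probability(a, b, c, d):
    """Probability of a 2x2 table with cells a, b, c, d under formula 1."""
    numerator = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    denominator = (factorial(a + b + c + d)
                   * factorial(a) * factorial(b) * factorial(c) * factorial(d))
    # Python integers are exact, so only this final division is rounded.
    return numerator / denominator


# Tables (i) to (xvi): a runs from 0 to 15, and the fixed margins
# (rows 15 and 17, columns 15 and 17) determine b, c and d.
probabilities = [table_probability(a, 15 - a, 15 - a, 2 + a) for a in range(16)]

for a, p in enumerate(probabilities):
    print(f"a={a:2d}  b={15 - a:2d}  c={15 - a:2d}  d={2 + a:2d}  {p:.7f}")
print(f"sum = {sum(probabilities):.7f}")
```

Because the 16 tables are the only ones possible given the marginal totals, their probabilities sum to 1, which is a useful check on any hand calculation.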
From table 3 we can see that the probability of obtaining the observed frequencies for our data is that which corresponds to table (xiii), which gives p=0.0005469, and the probability of obtaining our results or results more extreme (a difference that is at least as large) is the sum of the probabilities for (xiii) to (xvi) = 0.000573. This gives the one-sided P-value for obtaining our results or results more extreme, and in order to obtain the two-sided P-value there are several approaches. The first is simply to double this value, which gives p=0.001145 (twice the unrounded one-sided value of 0.0005726). A second approach is to add together all the probabilities that are the same size as or smaller than the one for our particular result, in this case all probabilities that are less than or equal to 0.0005469, which are those for tables (i), (ii), (iii), (xiii), (xiv), (xv) and (xvi). This gives a two-sided value of p=0.001033. Generally the difference is not great; here the first approach gives the larger value. A third approach, which is recommended by Swinscow and Campbell (4), is a compromise and is known as the mid-p method. All the probabilities smaller than that of the observed table are added up, and one half of the probability of the observed table is added to this sum. This gives p=0.000759.
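The three two-sided approaches can be compared side by side. The following Python sketch is an illustration only; the variable names are assumptions for this example. It counts the number of ways of obtaining each table using binomial coefficients, which is algebraically equivalent to formula 1, and keeps the counts as integers so that the comparison "same size or smaller" is exact.

```python
from math import comb

# Margins of table 1: 15 on magnesium, 17 on placebo, 15 felt better.
N_MAGNESIUM, N_PLACEBO, N_BETTER = 15, 17, 15
OBSERVED_A = 12  # patients on magnesium who felt better, table (xiii)

# Number of ways of obtaining each table, indexed by a = 0..15.
# Dividing by `total` gives the same probabilities as formula 1.
ways = [comb(N_MAGNESIUM, a) * comb(N_PLACEBO, N_BETTER - a)
        for a in range(N_BETTER + 1)]
total = comb(N_MAGNESIUM + N_PLACEBO, N_BETTER)
observed = ways[OBSERVED_A]

# One-sided: the observed table plus the more extreme tables (xiv) to (xvi).
one_sided = sum(ways[OBSERVED_A:]) / total
# Approach 1: double the one-sided value.
doubled = 2 * one_sided
# Approach 2: all tables no more probable than the observed one.
sum_small = sum(w for w in ways if w <= observed) / total
# Approach 3 (mid-p): strictly less probable tables plus half the observed.
mid_p = (sum(w for w in ways if w < observed) + observed / 2) / total

print(f"one-sided      {one_sided:.6f}")
print(f"doubled        {doubled:.6f}")
print(f"sum of small   {sum_small:.6f}")
print(f"mid-p          {mid_p:.6f}")
```

Working with integer counts avoids the small floating-point differences that could otherwise misclassify two tables with equal probability in opposite tails.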
The criticism of the first two methods is that they are too conservative, i.e. if the null hypothesis were true, over repeated studies they would reject the null hypothesis less often than 5% of the time. They are conditional on both sets of marginal totals being fixed, i.e. exactly 15 people being treated with magnesium and 15 feeling better. However, if the study were repeated, even with 15 and 17 in the magnesium and placebo groups respectively, we would not necessarily expect exactly 15 to feel better. The mid-p value method is less conservative, and gives approximately the correct rate of type I errors (false positives).
Whichever approach is used, for our example the P-value is less than 0.05, the nominal level for statistical significance, and we can conclude that there is evidence of a statistically significant difference in the proportions feeling better between the two treatment groups. However, in common with other non-parametric tests, Fisher's exact test is simply a hypothesis test. It will merely tell you whether the observed difference is likely, given the null hypothesis (of no difference). It gives you no information about the likely size of the difference, and so whilst we can conclude that there is a significant difference between the two treatments with respect to feeling better or not, we can draw no conclusions about the possible size of the difference.
Example data from last week
Table 1 shows the data from the previous tutorial. It is from a randomised controlled trial of community leg ulcer clinics (5), comparing the cost-effectiveness of community leg ulcer clinics with standard nursing care. The columns represent the two treatment groups, specialist leg ulcer clinic (clinic) and standard care (home), and the rows represent the outcome variable, in this case whether the leg ulcer has healed or not.
Table 1: 2 x 2 contingency table of Treatment (clinic/home) by Outcome (ulcer healed / not healed) for the Leg ulcer study
                        Treatment
Outcome          Clinic        Home          Total
Healed           22 (18%)      17 (15%)       39
Not healed       98 (82%)      96 (85%)      194
Total           120 (100%)    113 (100%)     233
For this example the two-sided P-value from Fisher's exact test is 0.599, and in this case we would not reject the null hypothesis and would conclude that there is insufficient evidence of a difference between the two treatment groups in the proportions of leg ulcers healed.
Summary
This tutorial has described in detail Fisher's exact test, for analysing simple 2x2 contingency tables when the assumptions for the chi-squared test are not met. It is tedious to do by hand, but nowadays is easily computed by most statistical packages.
Reference List
(1) Freeman JV, Julious SA. The analysis of categorical data. Scope 2007;16(1):18-21.
(2) Armitage P, Berry PJ, Matthews JNS. Statistical Methods in Medical Research. 4 ed. Oxford: Blackwells; 2002.
(3) Cox IM, Campbell MJ, Dowson D. Red blood cell magnesium and chronic fatigue syndrome. Lancet 1991;337:757-60.
(4) Swinscow TDV, Campbell MJ. Statistics at square one. 10 ed. London: BMJ Books; 2002.
(5) Morrell CJ, Walters SJ, Dixon S, Collins K, Brereton LML, Peters J, et al. Cost effectiveness of community leg ulcer clinic: randomised controlled trial. British Medical Journal 1998;316:1487-91.
When organising data such as this it is good practice to arrange the table with the grouping variable forming the columns and the outcome variable forming the rows.