## Factorial ANOVA

If all ANOVA had to offer was a small edge over t tests in looking after significance levels, it wouldn't be worth all the effort involved in calculating it. But by an extension of the approach, called factorial ANOVA, we can include any number of factors in a single experiment and look at the independent effect of each factor without committing the cardinal sin of distorting the overall probability of a chance difference. As a bonus, by examining interactions between factors, we can also see whether, for example, some treatments work better on some types of subjects or have synergistic effects with other treatments.

To illustrate, let's examine that mainstay of midwinter television, the cough and cold remedy. Colds seem to come in two varieties, runny noses and hacking coughs. Some remedies like to take a broad-spectrum approach; at last count, Driptame was supposed to knock out 26 symptoms. Other brands try for "specificity"; Try-a-mine-ic eats up about 6 feet of drugstore shelf space with all the permutations to make you dry up or drip more, cough less or cough loose. All in all, an ideal situation for factorial ANOVA. The original question remains "Is there any difference overall among brands?" But some other testable questions come to mind, for example, "Do broad-spectrum remedies work better or worse than specific ones?", "Which kind of cold is more uncomfortable?" and "Do remedies work better or worse on runny noses or hacking coughs?" Believe it or not, it's possible to have a swing at all of these questions at one time, using factorial ANOVA.

Here's how. Start out with, say, 100 folks with runny noses and 100 others with bad coughs. Half of each group uses a broad-spectrum (BS) agent, and half uses a specific (SP) remedy. In turn, half of these groups get Brand A, and half get Brand B, or 25 in each subgroup. The experimental design would look like Table 5-3.

Table 5-3. Experimental Design for Cold Remedy Study

| | Brand A (BS) | Brand B (BS) | Brand C (SP) | Brand D (SP) | Mean |
|---|---|---|---|---|---|
| Runny noses (RNs) | 7.5 | 8.5 | 9.0 | 7.0 | 8.0 |
| Hacking coughs (HCs) | 4.5 | 5.5 | 5.0 | 11.0 | 6.5 |
| Mean (brands) | 6.0 | 7.0 | 7.0 | 9.0 | 7.25 |
| Mean (BS/SP) | BS: 6.5 | | SP: 8.0 | | |

Now if we get everyone to score the degree of relief on a 15-point scale, the mean of runny noses (RNs) is shown as 8.0 on the right, and the mean of hacking coughs (HCs) is 6.5. Similarly, the mean for Brand A is at the bottom, 6.0, as are the other brand means. Means for BS and SP drugs are shown on the last line and are the averages of the two brands in each group. Finally, we have indicated all the individual subgroup means. Sums of Squares for each factor can be developed as before by taking differences between individual group means and the grand mean, squaring, and summing. This is then multiplied by a constant related to the number of levels of each factor in the design. For example, the Sum of Squares for BS versus SP drugs is as follows:

$$
\text{Sum of Squares (BS/SP)} = \left[(6.5 - 7.25)^2 + (8.0 - 7.25)^2\right] \times 100 = 112.5
$$
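The same arithmetic can be checked with a short Python sketch. It assumes the cell means of Table 5-3 and 25 subjects in each of the 8 cells, so that each drug type (and each kind of cold) covers 100 subjects; that is where the multiplier of 100 comes from. The names `cell_means`, `drug_type`, and `sum_of_squares` are illustrative and do not come from the text.

```python
# Cell means from Table 5-3: rows are symptoms, columns are brands A-D.
cell_means = {
    "RN": {"A": 7.5, "B": 8.5, "C": 9.0, "D": 7.0},
    "HC": {"A": 4.5, "B": 5.5, "C": 5.0, "D": 11.0},
}
drug_type = {"A": "BS", "B": "BS", "C": "SP", "D": "SP"}
N_PER_CELL = 25  # 200 subjects spread evenly over 8 cells


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def sum_of_squares(level_means, grand_mean, n_per_level):
    """Squared deviations of level means from the grand mean, times n per level."""
    return n_per_level * sum((m - grand_mean) ** 2 for m in level_means)


all_cells = [m for row in cell_means.values() for m in row.values()]
grand_mean = mean(all_cells)  # valid because every cell has the same n

# Marginal means for drug type (BS vs SP), averaging over brands and symptoms.
type_means = {
    t: mean(m for row in cell_means.values()
            for brand, m in row.items() if drug_type[brand] == t)
    for t in ("BS", "SP")
}
# Each drug type covers 2 brands x 2 symptoms = 4 cells.
ss_type = sum_of_squares(type_means.values(), grand_mean, 4 * N_PER_CELL)

# Marginal means for symptom (RN vs HC); each symptom also covers 4 cells.
symptom_means = {s: mean(row.values()) for s, row in cell_means.items()}
ss_symptom = sum_of_squares(symptom_means.values(), grand_mean, 4 * N_PER_CELL)

print(grand_mean, type_means, ss_type)
print(symptom_means, ss_symptom)
```

Under these assumptions both Sums of Squares come out to 112.5, because the RN and HC means (8.0 and 6.5) sit just as far from the grand mean of 7.25 as the BS and SP means do.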

Mean squares can then be developed by dividing by the degrees of freedom (in this case, 1), as we did before. But there is still more gold in them thar hills, called interaction terms. As discussed, one would hope that relievers that are specific for drippy noses work better in the RN group and that those specific to coughs would be more helpful to HC members. That's a hypothesis about how the two factors go together or interact.

In Figure 5-2, we have displayed, on the left, the four cell means for the BS remedies. The center picture contains the cell means for the SP cures. Finally, the right graph looks at BS remedies against SP remedies by averaging across brands and plotting.

In the left picture, we see that overall, the BS drugs are more effective for RNs than HCs (8.0 vs 5.0) and that Brand B has an edge of 1 unit on Brand A. But there is no evidence of interaction, because the lines are parallel. In the jargon, this overall difference is called a main effect (in this case, a main effect of brand and a main effect of RN vs HC). By contrast, in the middle picture, Brand C, which was specific for RNs, works much better for RNs, and Brand D works better for HCs. Overall, Brand D is only a bit better than C. So this picture shows a strong interaction but little in the way of main effects. Finally, the right picture has a bit of both because SP and BS drugs are equally effective for RNs, but SP drugs work better for HCs. The important point is that useful information is often contained in the interaction terms: in this case, information about the very strong effect of SP drugs if they are used as intended, which would be lost in examining the overall effect of each factor.

Figure 5-2. Graphs of interactions. Panels, left to right: BS brands, SP brands, and BS versus SP remedies (averaged across brands).
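One way to put a number on "parallel lines" is a difference of differences: how much the gap between two remedies changes from the RN group to the HC group. The sketch below applies that idea to the cell means of Table 5-3. The function name `interaction_contrast` is made up for illustration, and the contrast is a descriptive check only, not the significance test itself.

```python
# Cell means from Table 5-3 (rows: symptom, columns: brand).
cell_means = {
    "RN": {"A": 7.5, "B": 8.5, "C": 9.0, "D": 7.0},
    "HC": {"A": 4.5, "B": 5.5, "C": 5.0, "D": 11.0},
}


def interaction_contrast(rn_pair, hc_pair):
    """Difference of differences: (second - first) among HCs minus the same among RNs.

    Zero means the two lines in the plot are parallel (no interaction).
    """
    return (hc_pair[1] - hc_pair[0]) - (rn_pair[1] - rn_pair[0])


rn, hc = cell_means["RN"], cell_means["HC"]

# Left panel: Brand A vs Brand B (broad-spectrum).
bs_panel = interaction_contrast((rn["A"], rn["B"]), (hc["A"], hc["B"]))

# Center panel: Brand C vs Brand D (specific).
sp_panel = interaction_contrast((rn["C"], rn["D"]), (hc["C"], hc["D"]))

# Right panel: BS vs SP, averaging the two brands of each type.
bs_rn, sp_rn = (rn["A"] + rn["B"]) / 2, (rn["C"] + rn["D"]) / 2
bs_hc, sp_hc = (hc["A"] + hc["B"]) / 2, (hc["C"] + hc["D"]) / 2
type_panel = interaction_contrast((bs_rn, sp_rn), (bs_hc, sp_hc))

print(bs_panel, sp_panel, type_panel)  # 0.0 8.0 3.0
```

The zero for the BS brands matches the parallel lines in the left panel, the 8.0 for the SP brands reflects the crossing lines in the center panel, and the 3.0 for BS versus SP is the "bit of both" in the right panel.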

The calculations become a bit hairy, but the general principle remains the same. The numerators for each effect are based on squared differences among mean values. The denominators, or error terms, are derived from squared differences between individual values and the appropriate mean. The end result is an expanded ANOVA table with one line for each main effect and interaction and an F ratio for each effect that indicates whether or not it was significant. Table 5-4 is an example of an ANOVA table from the present study.
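To make the numerator and denominator concrete, here is a minimal Python sketch of a two-factor ANOVA (kind of cold by drug type, ignoring brand). The individual scores are simulated, not study data: it is assumed that scores scatter normally with a standard deviation of 2.0 around cell means taken from Table 5-3 after averaging brands, with 50 subjects per cell. The resulting numbers therefore will not match Table 5-4; only the structure of the calculation is the point.

```python
import random

random.seed(0)  # fixed seed so the simulated scores are reproducible

# ASSUMPTIONS (not from the text): normally distributed scores with SD 2.0
# around each cell mean. The cell means and n = 50 per cell follow from
# Table 5-3 after averaging the two brands within each drug type.
CELL_MEANS = {("RN", "BS"): 8.0, ("RN", "SP"): 8.0,
              ("HC", "BS"): 5.0, ("HC", "SP"): 8.0}
N_PER_CELL, NOISE_SD = 50, 2.0

scores = {cell: [random.gauss(mu, NOISE_SD) for _ in range(N_PER_CELL)]
          for cell, mu in CELL_MEANS.items()}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


symptoms, types = ("RN", "HC"), ("BS", "SP")
grand = mean(x for xs in scores.values() for x in xs)
cell_mean = {cell: mean(xs) for cell, xs in scores.items()}
symptom_mean = {s: mean(cell_mean[s, t] for t in types) for s in symptoms}
type_mean = {t: mean(cell_mean[s, t] for s in symptoms) for t in types}

# Numerators: squared differences among means, weighted by group size.
ss_symptom = 2 * N_PER_CELL * sum((m - grand) ** 2 for m in symptom_mean.values())
ss_type = 2 * N_PER_CELL * sum((m - grand) ** 2 for m in type_mean.values())
ss_inter = N_PER_CELL * sum(
    (cell_mean[s, t] - symptom_mean[s] - type_mean[t] + grand) ** 2
    for s in symptoms for t in types)

# Denominator: squared differences between each score and its own cell mean.
ss_error = sum((x - cell_mean[cell]) ** 2 for cell, xs in scores.items() for x in xs)
df_error = 4 * (N_PER_CELL - 1)
ms_error = ss_error / df_error

print(f"{'Source':<12}{'SS':>10}{'df':>5}{'MS':>10}{'F':>9}")
table = {}
for name, ss in [("Symptom", ss_symptom), ("Drug type", ss_type),
                 ("Interaction", ss_inter)]:
    df = 1  # each effect here compares two levels
    table[name] = (ss / df) / ms_error
    print(f"{name:<12}{ss:>10.1f}{df:>5}{ss / df:>10.1f}{table[name]:>9.2f}")
print(f"{'Error':<12}{ss_error:>10.1f}{df_error:>5}{ms_error:>10.2f}")
```

Because the design is balanced, the three effect Sums of Squares and the error Sum of Squares add up to the total Sum of Squares around the grand mean, which is a handy check on the arithmetic.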

The F ratio is like all our other tests in that the larger it is, the more statistically significant is the result. However, the associated probabilities are dependent on the number of terms in both numerator and denominator.
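A rough simulation can show why the probability attached to an F ratio depends on the degrees of freedom. The sketch below draws F values under the null hypothesis and estimates the value that chance alone exceeds 5% of the time. The two pairs of degrees of freedom are illustrative choices, not figures from the text, and the simulation stands in for the published F tables one would normally consult.

```python
import random

random.seed(1)  # fixed seed for reproducibility


def simulate_f(df_num, df_den):
    """Draw one F value under the null hypothesis.

    A chi-square variable with k degrees of freedom is a gamma variable
    with shape k/2 and scale 2; an F value is the ratio of two such
    variables, each divided by its degrees of freedom.
    """
    numerator = random.gammavariate(df_num / 2, 2) / df_num
    denominator = random.gammavariate(df_den / 2, 2) / df_den
    return numerator / denominator


def critical_value(df_num, df_den, alpha=0.05, draws=50_000):
    """Estimate the F value exceeded by chance with probability alpha."""
    sample = sorted(simulate_f(df_num, df_den) for _ in range(draws))
    return sample[int((1 - alpha) * draws)]


# Same numerator (1 df), different error df. Both pairs are illustrative.
small_study = critical_value(1, 10)
large_study = critical_value(1, 192)
print(f"F needed for p < .05 with (1, 10) df:  about {small_study:.2f}")
print(f"F needed for p < .05 with (1, 192) df: about {large_study:.2f}")
```

The threshold for the small error term should come out noticeably higher than the one for the large error term, so the same F ratio can be significant in a large study and not in a small one.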
