The beta weights are, in essence, partial correlations. We'll briefly review what this means because it has major implications for what follows. If we have two predictor variables, such as age (A) and serum cholesterol (B), and one criterion variable, such as degree of stenosis (Y), then we can compute simple run-of-the-mill correlations between each of the predictors and the criterion. Let's make up some figures and assume rAY is 0.75 and rBY is 0.70. But we know that age and cholesterol are themselves correlated, say, at the 0.65 level. Then, the partial correlation between B and Y is the correlation after removing the contribution of age to both cholesterol and stenosis. In this case, the figure drops from 0.70 to 0.42.
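The drop from 0.70 to 0.42 can be checked with the standard formula for a first-order partial correlation. The sketch below is only an illustration using the made-up figures above; the function name `partial_correlation` and its argument names are our own labels, not anything defined in the text.

```python
import math

def partial_correlation(r_by, r_ay, r_ab):
    """Correlation between B and Y after removing A from both.

    r_by: simple correlation between predictor B and criterion Y
    r_ay: simple correlation between predictor A and criterion Y
    r_ab: simple correlation between the two predictors
    """
    numerator = r_by - r_ay * r_ab
    denominator = math.sqrt((1 - r_ay ** 2) * (1 - r_ab ** 2))
    return numerator / denominator

# Made-up figures from the text: age (A), cholesterol (B), stenosis (Y).
r_partial = partial_correlation(r_by=0.70, r_ay=0.75, r_ab=0.65)
print(round(r_partial, 2))  # 0.42
```

The numerator subtracts the part of the B-Y correlation that runs through age; the denominator rescales by the variance left in B and in Y once age has been removed.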
So, the beta weight for variable x1 is, in essence, the correlation between x1 and xA after eliminating the effects of x2 and x3; the beta weight for x2 eliminates the effects of x1 and x3, and so forth.
What this means for canonical correlation is that the two canonical variates that we've derived, xA and xB, do not account for all of the variance since only that portion of the variance uncorrelated with the other variables in the set was used. We can extract another pair of canonical variates that accounts for at least some of the remaining variability. How many pairs of equations can we get? If we have n variables in set A and m variables in set B, then the number of canonical correlations is the smaller of the two numbers. In our example, n is 3 and m is 2, so there will be two canonical correlations, or two pairs of canonical variates, xA1 paired with xB1 and xA2 paired with xB2.
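To see the "smaller of the two numbers" rule in action, here is a minimal sketch that computes canonical correlations for three variables in set A and two in set B. The data are randomly generated purely for illustration, and the approach (a QR decomposition of each centred set followed by a singular value decomposition) is one common numerical route, assumed here rather than taken from the text. The names `canonical_correlations`, `set_a`, and `set_b` are hypothetical.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Return the canonical correlations between column sets X and Y, largest first."""
    # Centre each column so that cross-products behave like covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the two sets.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx' Qy are the canonical correlations.
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)

# Invented data: 200 subjects, 3 variables in set A, 2 in set B,
# all sharing one common component so the sets are related.
rng = np.random.default_rng(0)
n_subjects = 200
shared = rng.normal(size=(n_subjects, 1))
set_a = shared + rng.normal(size=(n_subjects, 3))
set_b = shared + rng.normal(size=(n_subjects, 2))

r = canonical_correlations(set_a, set_b)
print(len(r))  # one correlation per pair of canonical variates
print(r)
```

With n = 3 and m = 2 the function returns two values, one for each pair of canonical variates, and they come out in descending order.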
In some ways, these variates are similar to the factors in exploratory factor analysis. First, they are extracted in order, so that the first one accounts for more of the variance than the second, the second more than the third (if we had a third one), and so on. Second, each pair is uncorrelated with every other pair ("orthogonal," to use the jargon). Third, not all of them need to be statistically significant. We usually hope that at least the first one is, but we may reach a point where one pair of variates, and all of the succeeding ones, are not. There is, in fact, a statistical test, based on the chi-square, that can tell us how many of the correlations are significant.
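One widely used chi-square test of this kind is Bartlett's approximation, which we assume here for illustration; the text does not say which test it has in mind. It asks, step by step, whether the correlations remaining after the first k have been removed are jointly different from zero. In the sketch below, the sample size and the two canonical correlations are invented, and `bartlett_steps` is a hypothetical helper name.

```python
import math

def bartlett_steps(canonical_rs, n_subjects, n_a, n_b):
    """Chi-square statistic and degrees of freedom for each step of the test.

    Step k tests whether canonical correlations k+1, k+2, ... are all zero.
    canonical_rs must be sorted from largest to smallest.
    """
    multiplier = n_subjects - 1 - (n_a + n_b + 1) / 2
    results = []
    for k in range(len(canonical_rs)):
        # Wilks' lambda for the correlations still remaining at this step.
        wilks_lambda = math.prod(1 - r ** 2 for r in canonical_rs[k:])
        chi_square = -multiplier * math.log(wilks_lambda)
        df = (n_a - k) * (n_b - k)
        results.append((chi_square, df))
    return results

# Invented figures: 100 subjects, 3 variables in set A, 2 in set B,
# and canonical correlations of 0.60 and 0.20.
steps = bartlett_steps([0.60, 0.20], n_subjects=100, n_a=3, n_b=2)
for k, (chi_square, df) in enumerate(steps):
    print(f"correlations {k + 1} onward: chi-square = {chi_square:.2f}, df = {df}")
```

Each statistic is compared with the critical chi-square value for its degrees of freedom. As soon as one step is non-significant, that pair of variates and all the succeeding ones are treated as non-significant.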