General Linear Hypothesis Testing (GLHT) and Matrices, Contrasts, and F-Tests

Author
Affiliation

Alex Kaizer

University of Colorado-Anschutz Medical Campus

This page is part of the University of Colorado-Anschutz Medical Campus’ BIOS 6618 Recitation collection. To view other questions, you can view the BIOS 6618 Recitation collection page or use the search bar to look for keywords.

Contrasts and General Linear Hypothesis Testing

We’ve discussed various ways we can compare different variables in our regression model:

  • With categorical variables we can either change the reference category and refit the model to compare two groups if neither is the reference (e.g., never smokers as reference, with former and current smokers as the indicators included in the model). Or we can use the variance-covariance matrix to calculate these differences from the original model.
  • We can calculate the difference for more than a 1-unit change in a continuous predictor (e.g., what if \(X\) increases by 5 units).

As noted above, we could potentially refit the models the change the reference category or otherwise transform some aspect of the data/model to try and answer hypotheses that may stem from these questions:

  • Is there a significant difference between former and heavy smokers?
  • Is the change for a 5-unit increase in \(X\) significantly different from 0 (or any other null hypothesis)?

Let’s examine some of these questions with our carotenoids data set:

On our homework you are given a dataset on a cross-sectional study was designed to investigate the relationship between personal characteristics, dietary factors, and plasma concentrations of retinol, beta-carotene and other carotenoids:

Code
library(multcomp) # package to load for contrasts/general linear hypothesis testing

dat <- read.table("../../.data/carotenoids.dat")

colnames(dat) <- c('age','sex','smoke','bmi','vitamins','calories','fat',
                    'fiber','alcohol','chol','betadiet','retdiet','betaplas','retplas')

dat$AK_smoke_f <- factor(dat$smoke, levels=c(1,2,3), labels=c('Never','Former','Current'))

Contrasts

In the most general sense, contrasts are differences between regression coefficients. Further, the comparison of coefficients must sum to 0: \(\sum c_i = 0\).

For example, if our regression model was comparing some outcome between never, former, and current smokers we might have:

\[ \hat{Y} = \hat{\beta}_{0} + \hat{\beta}_{\text{former}} I_{F} + \hat{\beta}_{\text{current}} I_{C} \]

If we wanted to compare (1) never and former or (2) never and current smokers we would be able to simply evaluate our output’s regression coefficient table and interpret the \(t\)-test/p-value provided.

If we wanted to compare if the mean levels were the same between former and current smokers, we could propose a contrast: \((0, 1, -1)\). This would translate to \(H_0 \colon \hat{\beta}_{\text{former}} - \hat{\beta}_{\text{current}} = 0\).

We can make a more direct connection to the mean for each group if we recall:

  • \(\mu_{\text{never}} = \hat{\beta}_{0}\)
  • \(\mu_{\text{former}} = \hat{\beta}_{0} + \hat{\beta}_{\text{former}}\)
  • \(\mu_{\text{current}} = \hat{\beta}_{0} + \hat{\beta}_{\text{current}}\)

Then we can see what is being compared for each pairwise comparison (and a more direct connection to ANOVA):

  • Never vs. Former: \((\hat{\beta}_{0} + \hat{\beta}_{\text{former}}) - \hat{\beta}_{0} = \hat{\beta}_{\text{former}}\)
  • Never vs. Current: \((\hat{\beta}_{0} + \hat{\beta}_{\text{current}}) - \hat{\beta}_{0} = \hat{\beta}_{\text{current}}\)
  • Current vs. Former: \((\hat{\beta}_{0} + \hat{\beta}_{\text{former}}) - (\hat{\beta}_{0} + \hat{\beta}_{\text{current}}) = \hat{\beta}_{\text{former}} - \hat{\beta}_{\text{current}}\)

Our default hypothesis test for each of these is that there is no difference (e.g., 0).

While this is meant to be a general comparison of regression coefficients, it is restricted in that the contrast must sum to 0. We are also restricted to testing if the difference is 0.

Contrasts Example

Let’s fit a reference cell model first between beta-carotene in the diet as our outcome and smoking status as our predictor:

Code
mod1 <- glm(betadiet ~ AK_smoke_f, data=dat)
summary(mod1)

Call:
glm(formula = betadiet ~ AK_smoke_f, data = dat)

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)         2.1934     0.1168  18.771   <2e-16 ***
AK_smoke_fFormer    0.1618     0.1797   0.901   0.3685    
AK_smoke_fCurrent  -0.4899     0.2520  -1.944   0.0528 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 2.143603)

    Null deviance: 682.12  on 314  degrees of freedom
Residual deviance: 668.80  on 312  degrees of freedom
AIC: 1139.1

Number of Fisher Scoring iterations: 2

We see from our output, that there is no statistically significant difference between never smokers and former or current smokers (p=0.3685, p=0.0528, respectively). However, if we wished to compare the former and current smokers we would need to specify our contrast:

Code
c <- matrix( c(0,1,-1), nrow=1 )
summary(glht(mod1,linfct=c))

     Simultaneous Tests for General Linear Hypotheses

Fit: glm(formula = betadiet ~ AK_smoke_f, data = dat)

Linear Hypotheses:
       Estimate Std. Error z value Pr(>|z|)  
1 == 0   0.6517     0.2617    2.49   0.0128 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

We see here that p=0.0128, so we’d reject the null hypothesis that former and current smokers have equal means for the dietary beta-carotene.

Contrasts with Cell Means Models

One of the natural applications for contrasts is with cell means models for regression:

Code
mod1_cellmeans <- glm(betadiet ~ AK_smoke_f - 1, data=dat)
summary(mod1_cellmeans)

Call:
glm(formula = betadiet ~ AK_smoke_f - 1, data = dat)

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
AK_smoke_fNever     2.1934     0.1168   18.77  < 2e-16 ***
AK_smoke_fFormer    2.3552     0.1365   17.25  < 2e-16 ***
AK_smoke_fCurrent   1.7035     0.2233    7.63 2.87e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 2.143603)

    Null deviance: 2186.8  on 315  degrees of freedom
Residual deviance:  668.8  on 312  degrees of freedom
AIC: 1139.1

Number of Fisher Scoring iterations: 2

We see in our summary output that all of our means are significantly different from 0, but we haven’t completed any pairwise comparisons. We can specify three contrasts to replicate our results above from the reference cell model:

Code
c_cellmeans <- matrix( c(1,-1,0, 
                         1,0,-1, 
                         0,1,-1), 
                       nrow=3, byrow=T ) # specify a matrix with 3 rows (one for each contrast)

pairwise_cellmeans <- glht(mod1_cellmeans, linfct=c_cellmeans)
summary(pairwise_cellmeans, test=adjusted("none"))

     Simultaneous Tests for General Linear Hypotheses

Fit: glm(formula = betadiet ~ AK_smoke_f - 1, data = dat)

Linear Hypotheses:
       Estimate Std. Error z value Pr(>|z|)  
1 == 0  -0.1618     0.1797  -0.901   0.3678  
2 == 0   0.4899     0.2520   1.944   0.0519 .
3 == 0   0.6517     0.2617   2.490   0.0128 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- none method)

This reflects our results from before!

We can also note that summary.glht includes options to adjust for multiplicity (the default is a “single-step method” adjustment):

Code
summary(pairwise_cellmeans)

     Simultaneous Tests for General Linear Hypotheses

Fit: glm(formula = betadiet ~ AK_smoke_f - 1, data = dat)

Linear Hypotheses:
       Estimate Std. Error z value Pr(>|z|)  
1 == 0  -0.1618     0.1797  -0.901   0.6353  
2 == 0   0.4899     0.2520   1.944   0.1233  
3 == 0   0.6517     0.2617   2.490   0.0333 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

With these specific results, we see that our conclusions would not change for each pairwise test even after accounting for multiple testing.

General Linear Hypothesis Testing

Testing Differences Other Than 0

We can generalize contrasts to test for differences that aren’t just 0. For example, we might wish to revise our cell means result to compare to a difference of 0.5 between groups:

Code
c_cellmeans <- matrix( c(1,-1,0, 
                         1,0,-1, 
                         0,1,-1), 
                       nrow=3, byrow=T ) # specify a matrix with 3 rows (one for each contrast)
pairwise_cellmeans_glht <- glht(mod1_cellmeans, linfct=c_cellmeans, rhs=c(0.5,0.5,0.5))
summary(pairwise_cellmeans_glht, test=adjusted("none"))

     Simultaneous Tests for General Linear Hypotheses

Fit: glm(formula = betadiet ~ AK_smoke_f - 1, data = dat)

Linear Hypotheses:
         Estimate Std. Error z value Pr(>|z|)    
1 == 0.5  -0.1618     0.1797  -3.683 0.000231 ***
2 == 0.5   0.4899     0.2520  -0.040 0.967883    
3 == 0.5   0.6517     0.2617   0.580 0.562171    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- none method)

We can see that the linear hypothesis test is now comparing our estimates to 0.5 instead of 0. It makes sense that the first pairwise difference (Never-Former) is now significant, but the other two are not. However, direction does matter since we aren’t testing if the difference is either positive or negative 0.5, just +0.5.

Comparisons with \(\sum c_i \neq 0\)

With GLHT, we can also compare any combination of interest, even if it doesn’t sum to 0.

For example, we can compare if the means for current smokers is equal to half that of never smokers. This null hypothesis is \(\frac{1}{2}\mu_{N} = \mu_{C} \implies \frac{1}{2}\mu_{N} + 0 \mu_F + -1 \mu_{C} = 0\):

Code
c3_cellmeans <- matrix(c(0.5,0,-1), nrow=1)
summary(glht(mod1_cellmeans,linfct=c3_cellmeans))

     Simultaneous Tests for General Linear Hypotheses

Fit: glm(formula = betadiet ~ AK_smoke_f - 1, data = dat)

Linear Hypotheses:
       Estimate Std. Error z value Pr(>|z|)   
1 == 0  -0.6068     0.2308  -2.629  0.00855 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

We can also specify this in terms of the reference cell model with

\[ \frac{1}{2}\mu_{N} + 0 \mu_F + -1 \mu_{C} = 0 \implies \frac{1}{2}\hat{\beta}_{0} - (\hat{\beta}_{0} + \hat{\beta}_{C}) = 0 \implies -\frac{1}{2}\hat{\beta}_{0} - \hat{\beta}_{C} = 0 \]

Code
c3 <- matrix(c(-0.5,0,-1), nrow=1)
summary(glht(mod1,linfct=c3))

     Simultaneous Tests for General Linear Hypotheses

Fit: glm(formula = betadiet ~ AK_smoke_f, data = dat)

Linear Hypotheses:
       Estimate Std. Error z value Pr(>|z|)   
1 == 0  -0.6068     0.2308  -2.629  0.00855 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

We see the reference cell and cell means model results match, even though our contrasts were:

\[ \begin{matrix} \text{Reference Cell} \\ \text{Cell Means} \end{matrix} \begin{matrix} = \\ = \end{matrix} \begin{pmatrix} -0.5 & 0 & -1 \\ 0.5 & 0 & -1 \end{pmatrix} \]

More Complex Models

We can also play around with far more complex models and use our GLHT to estimate various effects, combinations of predictors, and/or compare to hypotheses that are equal to 0 or any other value. Consider:

Code
glm_be_cray <- glm(betadiet ~ AK_smoke_f + bmi + age, data=dat)
summary(glm_be_cray)

Call:
glm(formula = betadiet ~ AK_smoke_f + bmi + age, data = dat)

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)        2.012354   0.495050   4.065  6.1e-05 ***
AK_smoke_fFormer   0.161407   0.180366   0.895   0.3715    
AK_smoke_fCurrent -0.461002   0.257023  -1.794   0.0738 .  
bmi               -0.003575   0.013861  -0.258   0.7966    
age                0.005400   0.005750   0.939   0.3484    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 2.150732)

    Null deviance: 682.12  on 314  degrees of freedom
Residual deviance: 666.73  on 310  degrees of freedom
AIC: 1142.1

Number of Fisher Scoring iterations: 2

If we wanted to compare a 35-year old nonsmoker and 50-year old former smoker with the same BMI, we would have:

\[ \begin{align} &\hat{\beta}_0 &+ 0 \hat{\beta}_{F} &+ 0 \hat{\beta}_{C} &+ X_{BMI}\hat{\beta}_{BMI} &+ 35 \hat{\beta}_{age} \\ - (&\hat{\beta}_0 &+ 1 \hat{\beta}_{F} &+ 0 \hat{\beta}_{C} &+ X_{BMI}\hat{\beta}_{BMI} &+ 50 \hat{\beta}_{age}) \\ \hline &0\hat{\beta}_0 & -1 \hat{\beta}_{F} & \;\;\;\;0 \hat{\beta}_{C} & 0 \hat{\beta}_{BMI} & -15 \hat{\beta}_{age} \end{align} \]

Code
c4 <- matrix(c(0,-1,0,0,-15), nrow=1)
summary(glht(glm_be_cray,linfct=c4))

     Simultaneous Tests for General Linear Hypotheses

Fit: glm(formula = betadiet ~ AK_smoke_f + bmi + age, data = dat)

Linear Hypotheses:
       Estimate Std. Error z value Pr(>|z|)
1 == 0  -0.2424     0.2012  -1.205    0.228
(Adjusted p values reported -- single-step method)

From this test result we see that there is no significant difference between these two combinations based on our regression output.

We could also calculate a confidence interval around our estimate:

Code
confint(glht(glm_be_cray,linfct=c4))

     Simultaneous Confidence Intervals

Fit: glm(formula = betadiet ~ AK_smoke_f + bmi + age, data = dat)

Quantile = 1.96
95% family-wise confidence level
 

Linear Hypotheses:
       Estimate lwr     upr    
1 == 0 -0.2424  -0.6367  0.1519

Note that it summarize our results as “simultaneous CIs” to control the 95% family-wise confidence level. In other words, if we were testing multiple hypotheses, it would apply a correction to not just our p-value, but also the confidence intervals:

Code
c4_plus2random <- matrix(c(0,-1,0,0,-15, 
                           1,0,0,10,0, 
                           0,-1,1,10,10), 
                         nrow=3, byrow=T)
confint(glht(glm_be_cray, linfct=c4_plus2random))

     Simultaneous Confidence Intervals

Fit: glm(formula = betadiet ~ AK_smoke_f + bmi + age, data = dat)

Quantile = 2.3513
95% family-wise confidence level
 

Linear Hypotheses:
       Estimate lwr     upr    
1 == 0 -0.2424  -0.7155  0.2307
2 == 0  1.9766   1.0396  2.9136
3 == 0 -0.6042  -1.3575  0.1492

The first CI is now wider, reflecting the correction for multiple testing to maintain the family-wise type I error rate.

Joint Testing

One aspect we haven’t covered yet, what if you wish to jointly test multiple hypotheses at the same time (e.g., like an overall \(F\)-test)? This would account for the potential of multiple testing and may better reflect specific combinations of hypotheses.

In one case, we might have a specific question of interest that relies on specifying multiple hypotheses. For instance, what if we believed that our dietary beta-carotene should have a linear trend where former smokers have a 0.5 higher level than never smokers, and current smokers are 1.0 higher level than never smokers:

Code
# returning to our simpler reference cell model with just smoking status
K <- matrix( c(0,1,0,
               0,0,1),
             nrow=2, byrow=T)
hyp_val <- c(0.5,1.0)

joint_test <- glht(mod1, linfct = K, rhs = hyp_val)

# Specify a joint test with test=Ftest() that conducts an overall F-test for the two hypotheses
summary(joint_test, test=Ftest())

     General Linear Hypotheses

Linear Hypotheses:
         Estimate
1 == 0.5   0.1618
2 == 1    -0.4899

Global Test:
      F DF1 DF2    Pr(>F)
1 17.48   2 312 6.366e-08

This \(F\)-test indicates that we would reject the joint null hypothesis, at least one of the null hypotheses would be rejected.

A natural follow-up question would be, which one(s) do we reject? To correct for multiple comparisons, we see

Code
# For comparison, summarize the family-wise corrected individual tests
summary(joint_test)

     Simultaneous Tests for General Linear Hypotheses

Fit: glm(formula = betadiet ~ AK_smoke_f, data = dat)

Linear Hypotheses:
         Estimate Std. Error z value Pr(>|z|)    
1 == 0.5   0.1618     0.1797  -1.882    0.113    
2 == 1    -0.4899     0.2520  -5.912 6.76e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

In other words, we would reject our second null hypothesis that \(H_0 \colon \beta_{C} = 1\), but not the first that \(H_0 \colon \beta_{F} = 0.5\).

In practice, I tend to favor conducting the hypothesis test that directly addresses the question of interest (especially if you end up correcting for multiple comparisons anyway). However, there are likely cases where a global test of a joint hypothesis is most appropriate and efficient.

F-tests with Matrices versus anova and Connections to Joint Testing

The \(F\)-test can be used to test the GLH: \[\begin{equation*} F = \frac{[(\mathbf{c}\hat{\boldsymbol{\beta}}-\mathbf{d})^T(\mathbf{c}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{c}^T)^{-1}(\mathbf{c}\hat{\boldsymbol{\beta}}-\mathbf{d})/r]}{\hat{\sigma}^2_{Y|X}} \sim F_{r,n-p-1} \end{equation*}\]

where \(\hat{\sigma}^2_{Y|X}\) is our MSE from the full model. This reduces to our Partial \(F\)-test for testing a group of variables, because

\[\begin{equation*} (\mathbf{c}\hat{\boldsymbol{\beta}})^T(\mathbf{c}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{c}^T)^{-1}(\mathbf{c}\hat{\boldsymbol{\beta}}) = SS_{model}(full)-SS_{model}(reduced) \end{equation*}\]

Therefore, a test of a single linear hypothesis for a single parameter also reduces to our \(t\)-test.

Note, the above formulas do NOT require us to refit a “reduced” model to compare to our full model. Instead, all of our needed information is contained within our design matrix and our \(\mathbf{c}_{r \times p}\) matrix of rank \(r\) where \(r \leq p\) (i.e., number of rows that are “unique” (i.e., not made up of other combinations of rows)).

Let’s compare the overall F-test for the following model using matrices or by fitting a reduced model and using the anova() function:

Code
### Model-based approach
glm_be_cray <- glm(betadiet ~ AK_smoke_f + bmi + age, data=dat)
glm_be_null <- glm(betadiet ~ 1, data=dat)
anova(glm_be_cray, glm_be_null, test='F')
Analysis of Deviance Table

Model 1: betadiet ~ AK_smoke_f + bmi + age
Model 2: betadiet ~ 1
  Resid. Df Resid. Dev Df Deviance      F Pr(>F)
1       310     666.73                          
2       314     682.12 -4  -15.388 1.7887 0.1309
Code
### Matrix approach
# Create indicators for smoking categories
dat$AK_smoke_former <- dat$smoke==2
dat$AK_smoke_current <- dat$smoke==3

Y <- dat$betadiet
YtY <- t(Y) %*% Y
X <- cbind(1, dat$AK_smoke_former, dat$AK_smoke_current, dat$bmi, dat$age)
Xt <- t(X)
XtX <- t(X) %*% X
XtX_inv <- solve(XtX)
beta_vec <- XtX_inv %*% t(X) %*% Y
Yhat <- X %*% beta_vec
mse <- ( YtY - t(beta_vec) %*% t(X) %*% Y ) / ( length(Y) - ncol(X) )
c <- matrix( c(0,1,0,0,0,
               0,0,1,0,0,
               0,0,0,1,0,
               0,0,0,0,1), ncol=ncol(X), byrow=T)
d <- matrix( c(0,0,0,0), ncol=1 )
Fstat_num <- ( t(c%*%beta_vec - d) %*% solve(c %*% XtX_inv %*% t(c)) %*% (c%*%beta_vec - d) ) / nrow(c)
Fstat <- Fstat_num / mse
Fstat
         [,1]
[1,] 1.788729

We can see that we’ve arrived at the same result, but without having to fit the reduced model!

The connection between the partial F-test and the GLHT may help us better draw connections between our formulas and the two approaches:

\[ F = \frac{[SS_{model}(full) - SS_{model}(reduced)]/k^{*}}{MS_{error}(full)} \sim F_{k^{*},n-p^{*}-k^{*}-1} \] \[ F = \frac{[(\mathbf{c}\hat{\boldsymbol{\beta}}-\mathbf{d})^T(\mathbf{c}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{c}^T)^{-1}(\mathbf{c}\hat{\boldsymbol{\beta}}-\mathbf{d})/r]}{\hat{\sigma}^2_{Y|X}} \sim F_{r,n-p-1} \]

From our notation we see for the partial F-test that the total number of parameters in the full model is \(p^{*}+k^{*}\); whereas in the GLH F-test it is just \(p\).

Further, for the partial F-test \(k^{*}\) is the number of parameters removed from the full model; whereas in the GLH F-test \(r\) is the rank of \(\mathbf{c}\):

\[\begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}\]

where each column represents a beta coefficient, including the intercept, and each row represents a different hypothesis test being conducted jointly/simultaneously).

For completeness, the null hypothesis the GLH is testing is:

\[ H_0 \colon \; \mathbf{c}\boldsymbol{\beta} = \mathbf{d} \implies \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_{former} \\ \beta_{current} \\ \beta_{bmi} \\ \beta_{age} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]