Interpreting interaction term in a regression model

Interaction with two binary variables

In a regression model with interaction term, people tend to pay attention to only the coefficient of the interaction term.

Let’s start with the simpliest situation: \(x_1\) and \(x_2\) are binary and coded 0/1.

\[ E(y) = \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1x_2 \]

In this case, we have a saturated model; that is, we have three coefficients representing additive effects from the baseline situation (both \(x_1\) and \(x_2\) being 0). There are four different situations, with four combinations of \(x_1\) and \(x_2\).

A lot of people just pay attention to the interaction term. In the case of studying treatment effects between two groups, say female and male, that makes sense, the interaction term representing the difference between male and female in terms of treatment effect.

In this model:

\[ E(y) = \beta_1 female + \beta_2 treatment + \beta_{12} female*treatment \]

The two dummy-coded binary variables, female and treatment, form four combinations. The following 2x2 table represents the expected means of the four cells(combinations).

male female
control \(\beta_0\) \(\beta_0 + \beta_1\)
treatment \(\beta_0 + \beta_2\) \(\beta_0 + \beta_1 + \beta_2 + \beta_{12}\)

We can see from this table that, for example,

\[\beta_0=E(Y|(0,0))\]

that is, \(\beta_0\) is the expected mean of the cell (0,0) (male and control).

\[\beta_0 + \beta_1 =E(Y|(1,0))\]

that is ,\(\beta_0 + \beta_1\) is the expected mean of the cell (1,0) (female and control). And so on.

Now,

\[ \beta_{12} = (E(Y|(1,1))-E(Y|(0,1)))-(E(Y|(1,0))-E(Y|(0,0))) \]

that is, the coefficient on the interaction term is actually the difference in difference. That’s why in many situations, people are only interested in the interaction coefficient, since they are only interested in the diff-in-diff estimates. The usually diff-in-diff estimator in causal inference literature refer to something similar, instead of female vs. male, people are interested in the treatment effect difference in before and after treatment. If we simply replace female/male dummy with before/after dummy, we can use the same logic. In those situations, it’s fine to mainly focus on the interaction term coefficient.

In some other situations, the three coefficients are equally important. It depends on your interest. For example, if we are interested in studying differences between union member and non-union member and black vs. non-black, we may not be only interested in the interaction effect. Instead, we might be interested in all four cells, maybe all possible pairwise comparisons. In that case, we should pay attention to all three coefficients. Stata’s “margins” command is of great help if we’d like to compare the cell means.

Let’s take a look from a sample example in Stata:

webuse union3
reg ln_wage i.union##i.black, r
margins union#black
margins union#black, pwcompare
## 
## . webuse union3
## (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
## 
## . reg ln_wage i.union##i.black, r
## 
## Linear regression                               Number of obs     =      1,244
##                                                 F(3, 1240)        =      34.76
##                                                 Prob > F          =     0.0000
##                                                 R-squared         =     0.0762
##                                                 Root MSE          =     .37699
## 
## ------------------------------------------------------------------------------
##              |               Robust
##      ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##      1.union |   .2045053   .0291682     7.01   0.000     .1472808    .2617298
##      1.black |  -.1709034   .0308067    -5.55   0.000    -.2313425   -.1104644
##              |
##  union#black |
##         1 1  |   .0386275   .0516609     0.75   0.455     -.062725      .13998
##              |
##        _cons |   1.657525   .0138278   119.87   0.000     1.630396    1.684653
## ------------------------------------------------------------------------------
## 
## . margins union#black
## 
## Adjusted predictions                            Number of obs     =      1,244
## Model VCE    : Robust
## 
## Expression   : Linear prediction, predict()
## 
## ------------------------------------------------------------------------------
##              |            Delta-method
##              |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##  union#black |
##         0 0  |   1.657525   .0138278   119.87   0.000     1.630396    1.684653
##         0 1  |   1.486621    .027529    54.00   0.000     1.432613     1.54063
##         1 0  |    1.86203   .0256822    72.50   0.000     1.811644    1.912415
##         1 1  |   1.729754   .0325611    53.12   0.000     1.665873    1.793635
## ------------------------------------------------------------------------------
## 
## . margins union#black, pwcompare
## 
## Pairwise comparisons of adjusted predictions
## Model VCE    : Robust
## 
## Expression   : Linear prediction, predict()
## 
## -----------------------------------------------------------------
##                 |            Delta-method         Unadjusted
##                 |   Contrast   Std. Err.     [95% Conf. Interval]
## ----------------+------------------------------------------------
##     union#black |
## (0 1) vs (0 0)  |  -.1709034   .0308067     -.2313425   -.1104644
## (1 0) vs (0 0)  |   .2045053   .0291682      .1472808    .2617298
## (1 1) vs (0 0)  |   .0722294   .0353756      .0028268     .141632
## (1 0) vs (0 1)  |   .3754087   .0376487      .3015466    .4492709
## (1 1) vs (0 1)  |   .2431328   .0426388      .1594807     .326785
## (1 1) vs (1 0)  |  -.1322759   .0414705     -.2136359   -.0509159
## -----------------------------------------------------------------
## 
## .

What we get by using “margins union#black” is the four cell means of \(E(Y)\), in this case, log of wage. Then “margins union#black, pwcompare” tells us all pairwise comparison of these four cell means. Instead of only paying attention to the interaction coefficient, in this case we might be interested in some comparisons of the four different situations of union and black. In fact, in this example, despite the interaction term being insignificant, all six comparisons of the cell means turn out to have 95% confidence intervals that do not include zero.

Interaction with continuous variables

Let’s start with the simpliest situation: \(x_1\) and \(x_2\) are continuous.

\[ E(y) = \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1*x_2 \]

In this case, we recommend “centering” \(x_1\) and \(x_2\) if they are continuous; that is, subtracting the mean value from each continuous independent variable when they are involved in the interaction term. There are two reason for it:

  1. To reduce multi-collinearity. If the range of \(x_1\) and \(x_2\) include only positive numbers, then \(x_1*x_2\) can be highly correlated with both or one of \(x_1\) and \(x_2\). This can lead to numerical problems and unstable coefficient estimates (multi-collinearity problem).

“Centering” can reduce the correlation between the interaction term and the independent variables. If the original variables are normally distributed, interaction term after centering is actually uncorrelated with the original variables. When they are not normally distributed, centering will still reduce the correlation to a large degree.

  1. To help with interpretation. In a model with interaction, \(\beta_1\) represents the effect of \(x_1\) when \(x_2\) is zero. However, in many situations, zero is not within the range of \(x_2\). After centering, centered \(x_2\) at zero simply means original \(x_2\) at its mean value.

When we have dummy variable interacting with continuous variable, only continuous variable should be centered.

Again, Stata’s margins command is helpful.

sysuse auto
sum mpg
gen mpg_centered=mpg-r(mean)
sum mpg_centered
reg price i.foreign##c.mpg_centered
margins foreign, at(mpg_centered=(-3 (1) 3))
marginsplot
graph export marginsplot.eps, replace
## 
## . sysuse auto
## (1978 Automobile Data)
## 
## . sum mpg
## 
##     Variable |        Obs        Mean    Std. Dev.       Min        Max
## -------------+---------------------------------------------------------
##          mpg |         74     21.2973    5.785503         12         41
## 
## . gen mpg_centered=mpg-r(mean)
## 
## . sum mpg_centered
## 
##     Variable |        Obs        Mean    Std. Dev.       Min        Max
## -------------+---------------------------------------------------------
## mpg_centered |         74   -4.03e-08    5.785503  -9.297297    19.7027
## 
## . reg price i.foreign##c.mpg_centered
## 
##       Source |       SS           df       MS      Number of obs   =        74
## -------------+----------------------------------   F(3, 70)        =      9.48
##        Model |   183435285         3  61145094.9   Prob > F        =    0.0000
##     Residual |   451630112        70  6451858.74   R-squared       =    0.2888
## -------------+----------------------------------   Adj R-squared   =    0.2584
##        Total |   635065396        73  8699525.97   Root MSE        =    2540.1
## 
## ------------------------------------------------------------------------------
##        price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##      foreign |
##     Foreign  |   1666.519    717.217     2.32   0.023     236.0751    3096.963
## mpg_centered |  -329.2551   74.98545    -4.39   0.000    -478.8088   -179.7013
##              |
##      foreign#|
##           c. |
## mpg_centered |
##     Foreign  |   78.88826   112.4812     0.70   0.485    -145.4485     303.225
##              |
##        _cons |   5588.295   369.0945    15.14   0.000     4852.159    6324.431
## ------------------------------------------------------------------------------
## 
## . margins foreign, at(mpg_centered=(-3 (1) 3))
## 
## Adjusted predictions                            Number of obs     =         74
## Model VCE    : OLS
## 
## Expression   : Linear prediction, predict()
## 
## 1._at        : mpg_centered    =          -3
## 
## 2._at        : mpg_centered    =          -2
## 
## 3._at        : mpg_centered    =          -1
## 
## 4._at        : mpg_centered    =           0
## 
## 5._at        : mpg_centered    =           1
## 
## 6._at        : mpg_centered    =           2
## 
## 7._at        : mpg_centered    =           3
## 
## ------------------------------------------------------------------------------
##              |            Delta-method
##              |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
## -------------+----------------------------------------------------------------
##  _at#foreign |
##  1#Domestic  |    6576.06    370.446    17.75   0.000     5837.229    7314.891
##   1#Foreign  |   8005.915   766.8178    10.44   0.000     6476.545    9535.284
##  2#Domestic  |   6246.805   354.4734    17.62   0.000      5539.83     6953.78
##   2#Foreign  |   7755.548   709.9327    10.92   0.000     6339.632    9171.464
##  3#Domestic  |    5917.55   354.0032    16.72   0.000     5211.513    6623.587
##   3#Foreign  |   7505.181   658.8306    11.39   0.000     6191.185    8819.177
##  4#Domestic  |   5588.295   369.0945    15.14   0.000     4852.159    6324.431
##   4#Foreign  |   7254.814   614.9548    11.80   0.000     6028.325    8481.303
##  5#Domestic  |    5259.04    397.981    13.21   0.000     4465.292    6052.788
##   5#Foreign  |   7004.447   579.9479    12.08   0.000     5847.778    8161.117
##  6#Domestic  |   4929.785   437.9413    11.26   0.000     4056.338    5803.231
##   6#Foreign  |   6754.081   555.4891    12.16   0.000     5646.192    7861.969
##  7#Domestic  |    4600.53    486.253     9.46   0.000     3630.729    5570.331
##   7#Foreign  |   6503.714   543.0057    11.98   0.000     5420.723    7586.704
## ------------------------------------------------------------------------------
## 
## . marginsplot
## 
##   Variables that uniquely identify margins: mpg_centered foreign
## 
## . graph export marginsplot.eps, replace
## (note: file marginsplot.eps not found)
## (file marginsplot.eps written in EPS format)

In this example, the graph shows the predicted price for foreign and domestic cars at different level of mpg.

Senior Statistician / Data Scientist

I enjoy statistics and programming.

Related