Appendix to: Common Practice (misuse of) Moderation in Regression

Simulate Data

We’ll simulate a trivariate data set: two uncorrelated variables (V2 and V3) that are each correlated with a third variable (V1). To start, all variables are centered at 0 (mu) and scaled to 1 (the diagonal of Sigma):

library(tidyverse)

# Population correlation matrix: V2 and V3 are uncorrelated,
# and each correlates 0.6 with V1.
Sigma <- matrix(c(1.0, 0.6, 0.6,
                  0.6, 1.0, 0.0,
                  0.6, 0.0, 1.0),
                nrow = 3)

# empirical = TRUE makes the *sample* means and covariances
# reproduce mu and Sigma exactly, not just in expectation.
data <- MASS::mvrnorm(n = 1000,
                      mu = rep(0, 3),
                      Sigma = Sigma,
                      empirical = TRUE) %>% as.data.frame()
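
Because empirical = TRUE is used, the sample means and standard deviations should match mu and the diagonal of Sigma exactly; here is a quick check (my own addition, not in the original appendix):

colMeans(data)    # all (numerically) 0 at this point
sapply(data, sd)  # all 1 at this point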

Let’s rescale V2 to change its unstandardized slope when predicting V1 (multiplying V2 by 5 shrinks that slope by a factor of 5, while leaving the correlations untouched):

data <- data %>% 
  mutate(V2 = 5*V2+10)
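
As a quick sanity check (my own addition), an ordinary lm() fit should now show the raw-scale slope of V2 at 0.6/5 = 0.12, while the slope of V3 stays at 0.6:

# Both predictors still have the same standardized effect on V1,
# but V2's unstandardized slope is now a fifth of V3's.
coef(lm(V1 ~ V2 + V3, data = data))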

Let’s look at the correlation matrix (it should be the same as Sigma, since shifting and rescaling V2 does not change the correlations):

knitr::kable(cor(data))
      V1    V2    V3
V1   1.0   0.6   0.6
V2   0.6   1.0   0.0
V3   0.6   0.0   1.0

and the covariance matrix (it should differ from Sigma only in the entries involving V2, reflecting its new scale):

knitr::kable(cov(data))
      V1    V2    V3
V1   1.0   3.0   0.6
V2   3.0  25.0   0.0
V3   0.6   0.0   1.0
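
As a reminder of why only the scale changes (a sketch I’m adding, not from the original text), the covariance matrix is just the correlation matrix pre- and post-multiplied by a diagonal matrix of the standard deviations:

# cov = D %*% cor %*% D, where D = diag(sd of each variable) = diag(1, 5, 1)
D <- diag(sapply(data, sd))
all.equal(unname(D %*% cor(data) %*% D), unname(cov(data)))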

Fit lavaan model

library(lavaan)

# 'a' and 'b' label the two regression slopes; 'diff' is a user-defined
# parameter (their difference), whose standard error lavaan computes
# with the delta method.
my_model <- '
V1 ~ a*V2 + b*V3
diff := a - b
'
fit <- sem(my_model, data = data)

If diff is computed on the standardized coefficients, we expect it to be 0, since both standardized slopes should equal 0.6.
If diff is computed on the unstandardized coefficients, we expect it to be non-zero, since rescaling V2 changed its raw-scale slope.
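
To make that expectation concrete, here is a quick hand calculation (my own addition), using the standard conversion b_unstd = b_std * sd(y) / sd(x):

0.6 * sd(data$V1) / sd(data$V2)  # slope of V2: 0.6 * 1/5 = 0.12
0.6 * sd(data$V1) / sd(data$V3)  # slope of V3: 0.6 * 1/1 = 0.60
# So on the raw scale diff = a - b should be about 0.12 - 0.60 = -0.48.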

summary(fit, standardized = TRUE)
lavaan 0.6.13 ended normally after 1 iteration

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                         3

  Number of observations                          1000

Model Test User Model:
                                                      
  Test statistic                                 0.000
  Degrees of freedom                                 0

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  V1 ~                                                                  
    V2         (a)    0.120    0.003   35.857    0.000    0.120    0.600
    V3         (b)    0.600    0.017   35.857    0.000    0.600    0.600

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .V1                0.280    0.013   22.361    0.000    0.280    0.280

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    diff             -0.480    0.017  -28.128    0.000   -0.480    0.000

We can see that the Estimate of diff is non-zero, implying that it was computed from the unstandardized coefficients. BUT the Std.all value of diff is 0, implying that that column was computed from the standardized coefficients.
What about the significance test?

parameterEstimates(fit)[7,] %>% knitr::kable()
    lhs  op  rhs  label    est         se          z  pvalue   ci.lower   ci.upper
7  diff  :=  a-b   diff  -0.48  0.0170646  -28.12843       0  -0.513446  -0.446554
standardizedSolution(fit)[7,] %>% knitr::kable()
    lhs  op  rhs  label  est.std         se  z  pvalue    ci.lower   ci.upper
7  diff  :=  a-b   diff        0  0.0236643  0       1  -0.0463812  0.0463812

We get two different z-tests, depending on which type of estimate we ask for. The z-value and p-value that summary() reports for diff match those from parameterEstimates(), and are thus based on the unstandardized coefficients, not on the standardized solution.
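
If the comparison of interest is between the standardized slopes, one option (my own sketch, not part of the original appendix) is to standardize the variables before fitting, so that the unstandardized and standardized solutions coincide and the test of diff reported by summary() refers to the standardized scale:

# Hypothetical workaround: fit the same model to z-scored data.
data_z <- as.data.frame(scale(data))
fit_z <- sem(my_model, data = data_z)
# Now the estimate of 'diff' is ~0 and its test concerns the standardized slopes.
subset(parameterEstimates(fit_z), label == "diff")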