Indistinguishable dyads: SEM in wide format

Wide format, lavaan, equality-constrained actor and partner slopes

The multilevel model in the previous tutorial is the workhorse for APIM work, but the SEM specification is more transparent about which slopes are constrained equal, and it gives you standard fit indices (CFI, TLI, RMSEA, SRMR) that you can report alongside the parameter estimates. This tutorial fits the indistinguishable APIM in lavaan with the Olsen & Kenny (2006) specification.

Setup

Code

library(lavaan)
library(dplyr)

load("../../../data/dyad_data.RData")

cat("Wide format:", nrow(ddw), "rows,", ncol(ddw), "columns\n")

Wide format: 100 rows, 9 columns

The model

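In wide format, each dyad is one row, and the two members' variables sit in separate columns with the suffixes _a and _p. The Olsen & Kenny (2006) specification makes the members interchangeable by giving corresponding paths the same label: the two actor slopes share one label, and the two partner slopes share another. Actor and partner slopes are not constrained equal to each other.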

Code

model_indist <- '
  # Actor (the "_a" member)
  satisfaction_a ~ a_wnc*wnc_a + p_wnc*wnc_p +
                   a_rec*recovery_a + p_rec*recovery_p +
                   c_child*has_children + c_dual*dual_earner

  # Partner (the "_p" member)
  satisfaction_p ~ a_wnc*wnc_p + p_wnc*wnc_a +
                   a_rec*recovery_p + p_rec*recovery_a +
                   c_child*has_children + c_dual*dual_earner

  # Equal intercepts
  satisfaction_a ~ alpha*1
  satisfaction_p ~ alpha*1

  # Equal residual variances
  satisfaction_a ~~ sigma2*satisfaction_a
  satisfaction_p ~~ sigma2*satisfaction_p

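  # Within-dyad predictor covariances. Naming these predictors in ~~
  # makes lavaan treat them as random; their variances and means are free.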
  wnc_a ~~ wnc_p
  recovery_a ~~ recovery_p

  # Residual covariance (the dyad-level non-independence)
  satisfaction_a ~~ res_cov*satisfaction_p
'

Reading the labels

a_wnc is used for both actor paths: wnc_a → satisfaction_a (the _a member's own predictor on their own satisfaction) and wnc_p → satisfaction_p (the _p member's own predictor on their own satisfaction). Same label = constrained equal.
p_wnc is used for both partner paths: wnc_p → satisfaction_a and wnc_a → satisfaction_p. The labels do not impose a_wnc = p_wnc; whether actor and partner effects are equal has to be tested separately, and the distinguishable SEM wide tutorial shows how.
alpha and sigma2 are shared labels that equate the two intercepts and the two residual variances. res_cov only names the single residual covariance; it constrains nothing.
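The labelling logic can be checked on simulated data. The following is a minimal sketch, not the tutorial's data: the variable names (x_a, x_p, y_a, y_p), the labels (act, part, int, rv), and the generating values are all hypothetical.

```r
library(lavaan)

set.seed(1)
n <- 200                              # hypothetical number of dyads

# Hypothetical predictors, correlated within dyad
x_a <- rnorm(n)
x_p <- 0.5 * x_a + rnorm(n)

# A shared dyad effect makes the two outcomes non-independent
u   <- rnorm(n, sd = 0.5)
y_a <- 0.4 * x_a + 0.2 * x_p + u + rnorm(n)
y_p <- 0.4 * x_p + 0.2 * x_a + u + rnorm(n)
toy <- data.frame(x_a, x_p, y_a, y_p)

toy_model <- '
  y_a ~ act*x_a + part*x_p    # actor and partner paths for member a
  y_p ~ act*x_p + part*x_a    # same labels, so same estimates, for member p
  y_a ~ int*1
  y_p ~ int*1
  y_a ~~ rv*y_a
  y_p ~~ rv*y_p
  y_a ~~ y_p                  # residual covariance, left free
'
toy_fit <- sem(toy_model, data = toy)

# Rows that share a label carry one estimate; act and part remain distinct
toy_pe <- parameterEstimates(toy_fit)
toy_pe[toy_pe$label %in% c("act", "part"), c("lhs", "op", "rhs", "label", "est")]
```

The two rows labelled act report the same estimate, as do the two rows labelled part, while act and part differ from each other.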

Fit the model

Code

fit_indist <- sem(model_indist, data = ddw)

summary(fit_indist, standardized = TRUE, fit.measures = TRUE)

lavaan 0.6-21 ended normally after 27 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        27
  Number of equality constraints                     8

  Number of observations                           100

Model Test User Model:
                                                      
  Test statistic                                66.150
  Degrees of freedom                                20
  P-value (Chi-square)                           0.000

Model Test Baseline Model:

  Test statistic                               284.840
  Degrees of freedom                                27
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.821
  Tucker-Lewis Index (TLI)                       0.758

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)               -657.477
  Loglikelihood unrestricted model (H1)       -624.402
                                                      
  Akaike (AIC)                                1352.955
  Bayesian (BIC)                              1402.453
  Sample-size adjusted Bayesian (SABIC)       1342.446

Root Mean Square Error of Approximation:

  RMSEA                                          0.152
  90 Percent confidence interval - lower         0.112
  90 Percent confidence interval - upper         0.193
  P-value H_0: RMSEA <= 0.050                    0.000
  P-value H_0: RMSEA >= 0.080                    0.998

Standardized Root Mean Square Residual:

  SRMR                                           0.150

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  satisfaction_a ~                                                      
    wnc_a   (a_wn)   -0.256    0.037   -6.971    0.000   -0.256   -0.394
    wnc_p   (p_wn)   -0.178    0.037   -4.846    0.000   -0.178   -0.252
    recvry_ (a_rc)    0.226    0.036    6.300    0.000    0.226    0.302
    rcvry_p (p_rc)    0.125    0.036    3.476    0.001    0.125    0.177
    hs_chld (c_ch)    0.026    0.072    0.364    0.716    0.026    0.019
    dul_rnr (c_dl)    0.305    0.076    4.010    0.000    0.305    0.209
  satisfaction_p ~                                                      
    wnc_p   (a_wn)   -0.256    0.037   -6.971    0.000   -0.256   -0.363
    wnc_a   (p_wn)   -0.178    0.037   -4.846    0.000   -0.178   -0.275
    rcvry_p (a_rc)    0.226    0.036    6.300    0.000    0.226    0.321
    recvry_ (p_rc)    0.125    0.036    3.476    0.001    0.125    0.167
    hs_chld (c_ch)    0.026    0.072    0.364    0.716    0.026    0.019
    dul_rnr (c_dl)    0.305    0.076    4.010    0.000    0.305    0.209

Covariances:
                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  wnc_a ~~                                                               
    wnc_p              0.516    0.110    4.682    0.000    0.516    0.530
  recovery_a ~~                                                          
    rcvry_p            0.218    0.087    2.502    0.012    0.218    0.258
 .satisfaction_a ~~                                                      
   .stsfct_ (rs_c)     0.027    0.022    1.246    0.213    0.027    0.126

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .stsfct_ (alph)    4.824    0.164   29.359    0.000    4.824    7.221
   .stsfct_ (alph)    4.824    0.164   29.359    0.000    4.824    7.236
    wnc_a             0.247    0.103    2.397    0.017    0.247    0.240
    wnc_p            -0.033    0.095   -0.353    0.724   -0.033   -0.035
    recvry_           3.005    0.089   33.684    0.000    3.005    3.368
    rcvry_p           3.235    0.095   34.131    0.000    3.235    3.413

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .stsfct_ (sgm2)    0.215    0.022    9.922    0.000    0.215    0.482
   .stsfct_ (sgm2)    0.215    0.022    9.922    0.000    0.215    0.484
    wnc_a             1.058    0.150    7.071    0.000    1.058    1.000
    wnc_p             0.896    0.127    7.071    0.000    0.896    1.000
    recvry_           0.796    0.113    7.071    0.000    0.796    1.000
    rcvry_p           0.898    0.127    7.071    0.000    0.898    1.000
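The 20 degrees of freedom in the summary can be accounted for by counting sample moments and free parameters. The sketch below assumes that lavaan treats has_children and dual_earner as fixed exogenous covariates (their moments are not modelled) and treats the four predictors named in ~~ statements as random; under that assumption the count reproduces the printed value.

```r
# Eight observed variables: 2 outcomes, 4 random predictors, 2 fixed covariates
n_vars  <- 8
moments <- n_vars + n_vars * (n_vars + 1) / 2   # means + (co)variances

# Moments that belong only to the 2 fixed covariates are not modelled
fixed_x_moments <- 2 + 2 * (2 + 1) / 2

n_par <- 27   # "Number of model parameters" in the summary
n_eq  <- 8    # "Number of equality constraints" in the summary

df <- (moments - fixed_x_moments) - (n_par - n_eq)
df

# Where the restrictions come from, under the same assumption:
eq_constraints   <- 6 + 1 + 1   # slopes, intercept, residual variance
zero_covariances <- 4 + 8       # wnc with recovery; predictors with covariates
eq_constraints + zero_covariances
```

Both counts give 20, which suggests that only eight of the degrees of freedom reflect the indistinguishability constraints; the remaining twelve come from covariances the syntax does not mention.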

Compare to the multilevel model

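The point estimates should match the MLM tutorial to about three decimal places. The differences are in estimation and inference: lavaan uses full maximum likelihood and reports z-tests, while lmer defaults to REML and lmerTest adds t-tests with Satterthwaite degrees of freedom. Model comparisons also run through different machinery (lavTestLRT in lavaan, anova on refitted models in lme4).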
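The MLM uses long format, where each member is a row and the other member's predictor appears as a partner_ column. How ddl was built is not shown in this tutorial, so the following is only a sketch of the assumed correspondence between the two layouts, on a three-dyad toy data frame with made-up values.

```r
# Toy wide data: one row per dyad (values are made up for illustration)
wide <- data.frame(
  dyad_id        = 1:3,
  wnc_a          = c(0.5, -1.0, 0.2),
  wnc_p          = c(0.1, 0.4, -0.3),
  satisfaction_a = c(4.8, 5.2, 4.1),
  satisfaction_p = c(5.0, 4.6, 4.4)
)

# Stack the two members; each row keeps its own score and the partner's score
long <- rbind(
  data.frame(dyad_id = wide$dyad_id, member = "a",
             satisfaction = wide$satisfaction_a,
             wnc = wide$wnc_a, partner_wnc = wide$wnc_p),
  data.frame(dyad_id = wide$dyad_id, member = "p",
             satisfaction = wide$satisfaction_p,
             wnc = wide$wnc_p, partner_wnc = wide$wnc_a)
)
long <- long[order(long$dyad_id, long$member), ]
long
```

In this layout a single wnc coefficient plays the role of a_wnc and a single partner_wnc coefficient plays the role of p_wnc, which is why the MLM needs no explicit equality labels.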

Code

library(lme4)
library(lmerTest)

apim_mlm <- lmer(
  satisfaction ~ wnc + partner_wnc + recovery + partner_recovery +
    has_children + dual_earner + (1 | dyad_id),
  data = ddl
)

cat("--- MLM coefficients ---\n")

--- MLM coefficients ---

Code

print(round(fixef(apim_mlm), 4))

     (Intercept)              wnc      partner_wnc         recovery 
          4.8240          -0.2560          -0.1779           0.2261 
partner_recovery     has_children      dual_earner 
          0.1248           0.0262           0.3047

Code

cat("\n--- SEM coefficients (regression) ---\n")


--- SEM coefficients (regression) ---

Code

pe <- parameterEstimates(fit_indist, standardized = TRUE)
pe[pe$op == "~", c("lhs", "rhs", "est", "se", "pvalue")]

The two sets of estimates should be nearly identical.

Fit indices

Code

fits <- c("chisq", "df", "pvalue", "cfi", "tli", "rmsea", "srmr")
fitMeasures(fit_indist, fits)

 chisq     df pvalue    cfi    tli  rmsea   srmr 
66.150 20.000  0.000  0.821  0.758  0.152  0.150

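The chi-square test of exact fit is strict, and here it rejects the model (χ² = 66.15, df = 20, p < .001). With N = 100 dyads this cannot be attributed to sample-size sensitivity: that concern applies to large samples, where trivial misfit becomes significant. CFI, TLI, and RMSEA are less sensitive to sample size, and they also indicate poor fit (0.821, 0.758, and 0.152). A likely source is the set of predictor covariances that the syntax leaves out. Naming wnc and recovery in ~~ statements makes lavaan treat them as random, so any covariance that is not listed (wnc with recovery, and each of them with has_children and dual_earner) is fixed to zero, and those restrictions count toward the 20 degrees of freedom alongside the eight equality constraints.

For an indistinguishable model on dyadic data, the SRMR is the most informative index. Values below 0.08 are conventionally considered acceptable (see Hu & Bentler, 1999, for the cutoffs); the 0.150 obtained here is well above that threshold.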
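The printed indices can be recomputed by hand from the chi-square values and log-likelihoods in the summary output. The sketch below uses the usual ML definitions of CFI, TLI, and RMSEA, with N (rather than N - 1) in the RMSEA denominator, a choice that reproduces the printed value here.

```r
# Values copied from the summary output above
ll_h0   <- -657.477   # user model
ll_h1   <- -624.402   # unrestricted model
chisq_m <- 66.150; df_m <- 20   # user model
chisq_b <- 284.840; df_b <- 27  # baseline model
n       <- 100

# Model chi-square as twice the log-likelihood gap
chisq_from_ll <- 2 * (ll_h1 - ll_h0)

# Incremental indices compare the user model with the baseline model
cfi <- 1 - max(chisq_m - df_m, 0) / max(chisq_b - df_b, chisq_m - df_m, 0)
tli <- (chisq_b / df_b - chisq_m / df_m) / (chisq_b / df_b - 1)

# RMSEA rescales the excess chi-square by df and sample size
rmsea <- sqrt(max(chisq_m - df_m, 0) / (df_m * n))

round(c(chisq = chisq_from_ll, cfi = cfi, tli = tli, rmsea = rmsea), 3)
```

The rounded results agree with the fitMeasures() output: 66.150, 0.821, 0.758, and 0.152.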

Test the equality constraint

The most informative test for indistinguishable dyads is whether the equality-constrained model fits as well as a model with those constraints released. Fit the unconstrained model and compare.

Code

model_unconstrained <- '
  # Every label is unique, so no parameter is constrained equal across members
  satisfaction_a ~ a_wnc_h*wnc_a + p_wnc_h*wnc_p +
                   a_rec_h*recovery_a + p_rec_h*recovery_p +
                   c_child_h*has_children + c_dual_h*dual_earner
  satisfaction_p ~ a_wnc_w*wnc_p + p_wnc_w*wnc_a +
                   a_rec_w*recovery_p + p_rec_w*recovery_a +
                   c_child_w*has_children + c_dual_w*dual_earner

  # Separate intercepts and residual variances
  satisfaction_a ~ int_a*1
  satisfaction_p ~ int_p*1
  satisfaction_a ~~ var_a*satisfaction_a
  satisfaction_p ~~ var_p*satisfaction_p

  # Same predictor and residual covariances as the constrained model
  wnc_a ~~ wnc_p
  recovery_a ~~ recovery_p
  satisfaction_a ~~ res_cov*satisfaction_p
'

fit_unconstrained <- sem(model_unconstrained, data = ddw)

cat("--- LRT: unconstrained vs indistinguishable ---\n")

--- LRT: unconstrained vs indistinguishable ---

Code

lavTestLRT(fit_unconstrained, fit_indist)

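A non-significant p-value means the eight equality constraints (six slopes, the intercept, and the residual variance) cannot be rejected, which is consistent with indistinguishability. In our data, the constraints are tenable, in line with the data-generating process (equal slopes across gender).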
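For ML fits like these, the comparison is a chi-square difference test: the difference in model chi-squares is referred to a chi-square distribution with degrees of freedom equal to the difference in model df. The sketch below checks that arithmetic against lavTestLRT() on simulated data; the variable names, labels, and generating values are hypothetical and unrelated to ddw.

```r
library(lavaan)

set.seed(2)
n   <- 200                            # hypothetical number of dyads
x_a <- rnorm(n)
x_p <- 0.5 * x_a + rnorm(n)
u   <- rnorm(n, sd = 0.5)             # shared dyad effect
y_a <- 0.4 * x_a + 0.2 * x_p + u + rnorm(n)
y_p <- 0.4 * x_p + 0.2 * x_a + u + rnorm(n)
toy <- data.frame(x_a, x_p, y_a, y_p)

# Constrained: actor and partner slopes equated across members
m_con <- '
  y_a ~ act*x_a + part*x_p
  y_p ~ act*x_p + part*x_a
  y_a ~~ y_p
'
# Free: the same paths without shared labels
m_free <- '
  y_a ~ x_a + x_p
  y_p ~ x_p + x_a
  y_a ~~ y_p
'
fit_con  <- sem(m_con,  data = toy)
fit_free <- sem(m_free, data = toy)

# Chi-square difference test by hand
d_chisq  <- as.numeric(fitMeasures(fit_con, "chisq") - fitMeasures(fit_free, "chisq"))
d_df     <- as.numeric(fitMeasures(fit_con, "df") - fitMeasures(fit_free, "df"))
p_manual <- pchisq(d_chisq, df = d_df, lower.tail = FALSE)

# The same test through lavaan
lrt <- lavTestLRT(fit_free, fit_con)
c(manual = p_manual, lavaan = lrt[2, "Pr(>Chisq)"])
```

Two slopes are equated here, so the test has two degrees of freedom; in the tutorial's models the corresponding difference is eight.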

What to take away

Key takeaways

The wide-format SEM gives you the same point estimates as the MLM, with explicit labels for which slopes are constrained equal.
The Olsen & Kenny (2006) specification is the standard reference for this model.
The likelihood ratio test against the unconstrained model is the gold standard for testing distinguishability.
