
Fit a Confirmatory Factor Analysis (CFA) model with the lavaan.pl() engine, within the lavaan framework. By default, a quasi-Newton BFGS optimiser from the ucminf package is used. For large data sets, the stochastic approximation algorithm can be used instead.

Usage

cfa(
  model,
  data,
  std.lv = FALSE,
  estimator = "PML",
  estimator.args = list(
    method = c("ucminf", "SA"),
    init_method = c("SA", "custom", "standard"),
    cpp_control_init = NULL,
    ncores = 1,
    valdata = NULL,
    computevar_numderiv = FALSE
  ),
  start = NULL,
  control = list(),
  verbose = FALSE,
  ...
)

Arguments

model

A description of the user-specified model. Typically, the model is described using the lavaan model syntax. See lavaan::model.syntax for more information. Alternatively, a parameter table (e.g. the output of the lavaan::lavaanify() function) is also accepted.

data

A data frame containing the observed variables used in the model. Variables must be declared as ordered factors.
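As a minimal sketch of the ordered-factor requirement, the base R snippet below converts items stored as 0/1 integers into ordered factors. The data frame `raw` and its simulated contents are hypothetical and stand in for real item responses.

```r
# Hypothetical raw data: five binary items stored as 0/1 integers.
set.seed(1)
raw <- as.data.frame(matrix(rbinom(50, size = 1, prob = 0.6), ncol = 5))
names(raw) <- paste0("y", 1:5)

# Declare every item as an ordered factor, as the `data` argument requires.
dat <- as.data.frame(
  lapply(raw, function(x) factor(x, levels = 0:1, ordered = TRUE))
)

# Each column should now report the class "ordered" "factor".
str(dat)
```

Setting `levels` explicitly keeps the category order under control even when a category is absent from a particular sample.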

std.lv

If TRUE, the metric of each latent variable is determined by fixing its (residual) variance to 1.0. If FALSE, the metric of each latent variable is determined by fixing the factor loading of its first indicator to 1.0.

estimator

The estimator to use; the default is "PML". If any other estimator is provided, then lavaan::cfa() is used instead.

estimator.args

A list of arguments for fit_plFA(); see its help file for more details. Possible options are:

method

One of "ucminf" (default) for the quasi-Newton BFGS optimiser, or "SA" for stochastic approximation.

init_method

One of "SA" (default) for stochastic approximation, "custom" for user-defined starting values, or "standard" for standard starting values.

cpp_control_init

A list of control parameters for the initialisation algorithm.

ncores

The number of cores to use for parallel computation.

valdata

Validation data.

computevar_numderiv

If TRUE, the asymptotic variance-covariance matrix is computed using numerical derivatives.
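The options above might be combined as in the following sketch, which switches the main optimiser to stochastic approximation. It is an illustration under assumptions, not a recommended configuration: every element of the list is spelled out because this page does not say whether a partial list is merged with the defaults, and the data set is assumed to be reachable as `lavaan.pl::LSAT`. The fitting call runs only if lavaan.pl is installed.

```r
# All elements of estimator.args as listed in Usage; only `method` departs
# from its default ("ucminf") by selecting stochastic approximation.
est_args <- list(
  method = "SA",
  init_method = "SA",
  cpp_control_init = NULL,
  ncores = 1,
  valdata = NULL,
  computevar_numderiv = FALSE
)

# Guarded call: skipped when lavaan.pl is not available.
# Assumption: the LSAT data from the Examples section is exported by lavaan.pl.
if (requireNamespace("lavaan.pl", quietly = TRUE)) {
  fit_sa <- lavaan.pl::cfa(
    "eta =~ y1 + y2 + y3 + y4 + y5",
    data = lavaan.pl::LSAT,
    std.lv = TRUE,
    estimator.args = est_args
  )
  print(lavaan::parameterEstimates(fit_sa))
}
```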

start

A vector of starting values to use (in the order of free loadings, thresholds, and then factor correlations). If not provided, the starting point is computed according to fit_plFA()'s INIT_METHOD.
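The ordering described above (free loadings, then thresholds, then factor correlations) can be sketched in base R for a one-factor model with five binary items. The simulated data and the particular starting values (0.5 for loadings, probit-transformed marginal proportions for thresholds) are illustrative assumptions, not values prescribed by the package.

```r
# Hypothetical data: five binary items, 200 respondents.
set.seed(123)
n <- 200
dat <- as.data.frame(matrix(rbinom(n * 5, size = 1, prob = 0.7), ncol = 5))
names(dat) <- paste0("y", 1:5)

# 1. Free loadings: an arbitrary moderate value for each item
#    (all five are free when std.lv = TRUE).
start_loadings <- rep(0.5, 5)

# 2. Thresholds: probit of the proportion falling in the lowest category.
start_thresholds <- vapply(dat, function(x) qnorm(mean(x == 0)), numeric(1))

# 3. Factor correlations: none in a one-factor model.
start_corr <- numeric(0)

# Concatenate in the documented order.
start <- c(start_loadings, unname(start_thresholds), start_corr)
length(start)
```

The resulting vector has one entry per free parameter, matching the 10 model parameters reported for the one-factor LSAT example on this page.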

control

A list of control parameters for the estimation algorithm. See fit_plFA() for more information.

verbose

If TRUE, print additional information during the estimation process.

...

Additional arguments to be passed to lavaan().

Value

A plFAlavaan object, which is a subclass of the lavaan class. Therefore, all methods available for lavaan objects are expected to be compatible with plFAlavaan objects.

Details

Not all lavaan options can be used at present. Some options of interest are:

information

The information matrix to use. Only "observed" is currently supported.

se

The standard error method to use. Only "robust.huber.white" is currently supported.

test

No goodness-of-fit (GOF) tests are currently available, so this is set to "none".

References

Katsikatsou, M., Moustaki, I., Yang-Wallentin, F., & Jöreskog, K. G. (2012). Pairwise likelihood estimation for factor analysis models with ordinal data. Computational Statistics & Data Analysis, 56(12), 4243–4258. https://doi.org/10.1016/j.csda.2012.04.010

Alfonzetti, G., Bellio, R., Chen, Y., & Moustaki, I. (2025). Pairwise stochastic approximation for confirmatory factor analysis of categorical data. British Journal of Mathematical and Statistical Psychology, 78(1), 22–43. https://doi.org/10.1111/bmsp.12347

Examples


# A simple binary factor model using the LSAT data
fit <- cfa("eta =~ y1 + y2 + y3 + y4 + y5", LSAT, std.lv = TRUE)
summary(fit)
#> lavaan.pl 0.1.0.9002 
#>
#> lavaan 0.6-19 ended normally after 22 iterations
#> 
#>   Estimator                                        PML
#>   Optimization method                           UCMINF
#>   Number of model parameters                        10
#> 
#>   Number of observations                          1000
#> 
#> 
#> Parameter Estimates:
#> 
#>   Parameterization                               Delta
#>   Standard errors                             Sandwich
#>   Information bread                           Observed
#>   Observed information based on                Hessian
#> 
#> Latent Variables:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   eta =~                                              
#>     y1                0.389    0.103    3.767    0.000
#>     y2                0.397    0.080    4.937    0.000
#>     y3                0.472    0.097    4.871    0.000
#>     y4                0.376    0.087    4.332    0.000
#>     y5                0.340    0.103    3.310    0.001
#> 
#> Thresholds:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>     y1|t1            -1.433    0.068  -21.195    0.000
#>     y2|t1            -0.550    0.041  -13.401    0.000
#>     y3|t1            -0.133    0.040   -3.343    0.001
#>     y4|t1            -0.716    0.044  -16.457    0.000
#>     y5|t1            -1.126    0.052  -21.549    0.000
#> 
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>    .y1                0.849                           
#>    .y2                0.842                           
#>    .y3                0.778                           
#>    .y4                0.859                           
#>    .y5                0.885                           
#>     eta               1.000                           
#>
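The Variances block in this output can be cross-checked by hand. Under lavaan's Delta parameterization the latent response underlying each item has unit variance, so with the factor variance fixed at 1 (std.lv = TRUE) each residual variance equals one minus the squared loading. The sketch below recomputes them from the rounded loadings printed above; small discrepancies in the third decimal are expected because the inputs are already rounded.

```r
# Loadings and residual variances copied from the summary output above.
lambda   <- c(y1 = 0.389, y2 = 0.397, y3 = 0.472, y4 = 0.376, y5 = 0.340)
reported <- c(y1 = 0.849, y2 = 0.842, y3 = 0.778, y4 = 0.859, y5 = 0.885)

# Residual variance implied by the Delta parameterization with var(eta) = 1.
implied <- 1 - lambda^2

# Side-by-side comparison; differences are due to rounding of the loadings.
print(round(cbind(implied, reported, diff = implied - reported), 4))
```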