| Title: | A General Framework for Latent Classify and Profile Analysis |
|---|---|
| Description: | A unified latent class modeling framework that encompasses both latent class analysis (LCA) and latent profile analysis (LPA), offering a one-stop solution for latent class modeling. It implements state-of-the-art parameter estimation methods, including the expectation–maximization (EM) algorithm, neural network estimation (NNE; requires users to have 'Python' and its dependent libraries installed on their computer), and integration with 'Mplus' (requires users to have 'Mplus' installed on their computer). In addition, it provides commonly used model fit indices such as the Akaike information criterion (AIC) and Bayesian information criterion (BIC), as well as classification accuracy measures such as entropy. The package also includes fully functional likelihood ratio tests (LRT) and bootstrap likelihood ratio tests (BLRT) to facilitate model comparison, along with bootstrap-based and observed information matrix-based standard error estimation. Furthermore, it supports the standard three-step approach for LCA, LPA, and latent transition analysis (LTA) with covariates, enabling detailed covariate analysis. Finally, it includes several user-friendly auxiliary functions to enhance interactive usability. |
| Authors: | Haijiang Qin [aut, cre, cph] (ORCID: <https://orcid.org/0009-0000-6721-5653>), Lei Guo [aut, cph] (ORCID: <https://orcid.org/0000-0002-8273-3587>) |
| Maintainer: | Haijiang Qin <[email protected]> |
| License: | GPL-3 |
| Version: | 1.0.2 |
| Built: | 2026-06-09 05:58:09 UTC |
| Source: | https://github.com/cran/LCPA |
This function reorders the latent classes/profiles of object2 to best match those in object1
by minimizing the total assignment cost based on posterior class membership (MAP classification).
It uses the Linear Sum Assignment Problem (LSAP) solver to find the optimal one-to-one mapping between latent classes.
Useful for comparing or averaging models across replications, initializations, or algorithms where class labels may be permuted.
adjust.model(object1, object2)adjust.model(object1, object2)
object1 |
An object of class |
object2 |
An object of class |
The alignment is performed by:
Computing Maximum A Posteriori (MAP) classification matrices for both models.
Calculating a distance matrix between classes (typically Euclidean distance between binary MAP vectors).
Solving the Linear Sum Assignment Problem (LSAP) via solve_LSAP to find the permutation minimizing total mismatch cost.
Reordering all class-specific components of object2 according to this optimal assignment.
A modified version of object2, with all parameters and posterior probabilities reordered
to best match the latent class structure of object1. The returned object retains its original class
("LCA" or "LPA") and includes aligned:
Prior probabilities (P.Z)
Posterior probabilities (P.Z.Xn)
MAP classifications (Z)
Profile/Class-specific parameters:
For "LPA": means, covs
For "LCA": par, probability
All relevant dimnames and names are synchronized with object1
Both models must have identical numbers of observations (), latent classes (), and indicators ().
Designed for use after fitting multiple models (e.g., different random starts) to ensure consistent class labeling.
Does not modify object1; only returns a reordered object2.
## Not run: # need Mplus and Pyrthon library(LCPA) set.seed(123) data.obj <- sim.LCA(N = 500, I = 4, L = 3) # Fit two models with different random seeds fit1 <- LCA(data.obj$response, L = 3, method = "Mplus", nrep = 1) fit2 <- LCA(data.obj$response, L = 3, method = "NNE", nrep = 1) # Align fit2 to fit1's class ordering fit2_aligned <- adjust.model(fit1, fit2) # Compare prior probabilities before and after print("Before alignment:") print(fit2$params$P.Z) print("After alignment:") print(fit2_aligned$params$P.Z) ## End(Not run)## Not run: # need Mplus and Pyrthon library(LCPA) set.seed(123) data.obj <- sim.LCA(N = 500, I = 4, L = 3) # Fit two models with different random seeds fit1 <- LCA(data.obj$response, L = 3, method = "Mplus", nrep = 1) fit2 <- LCA(data.obj$response, L = 3, method = "NNE", nrep = 1) # Align fit2 to fit1's class ordering fit2_aligned <- adjust.model(fit1, fit2) # Compare prior probabilities before and after print("Before alignment:") print(fit2$params$P.Z) print("After alignment:") print(fit2_aligned$params$P.Z) ## End(Not run)
Standardizes polytomous response data by converting raw category values to consecutive integers starting from 0. Records original category values for potential reverse transformation. Handles varying numbers of response categories across indicators.
adjust.response(response)adjust.response(response)
response |
A matrix or data frame containing response data where:
Non-numeric columns will be coerced to numeric with warning. |
The function processes each indicator column independently:
Extracts unique response values and sorts them in ascending order
Maps smallest value to 0, second smallest to 1, etc.
Records original values in poly.orig for possible reverse transformation
Handles indicators with different numbers of categories through NA-padding
Missing values (NA) in input are preserved as NA in output.
A named list containing:
poly.orig matrix. Original sorted category values for each indicator.
Rows correspond to indicators, columns to category positions. Empty cells filled with NA.
poly.valueInteger vector of length . Number of unique response categories per indicator.
poly.maxScalar integer. Maximum number of categories across all indicators, i.e., .
response matrix. Adjusted response data where original values are replaced by
zero-based category indices (0 to for categories).
# Simulate response data with 3 indicators and varying categories set.seed(123) resp <- data.frame( indicator1 = sample(1:3, 10, replace = TRUE), indicator2 = sample(c(0, 5, 10), 10, replace = TRUE), indicator3 = sample(1:2, 10, replace = TRUE) ) # Apply adjustment adjusted <- adjust.response(resp) # Inspect results str(adjusted) print(adjusted$poly.orig) # Original category values print(adjusted$response) # Standardized responses# Simulate response data with 3 indicators and varying categories set.seed(123) resp <- data.frame( indicator1 = sample(1:3, 10, replace = TRUE), indicator2 = sample(c(0, 5, 10), 10, replace = TRUE), indicator3 = sample(1:2, 10, replace = TRUE) ) # Apply adjustment adjusted <- adjust.response(resp) # Inspect results str(adjusted) print(adjusted$poly.orig) # Original category values print(adjusted$response) # Standardized responses
Checks whether each column in the response matrix contains exactly the number
of unique response categories specified in poly.value. Handles edge cases
where all indicators have identical category counts efficiently.
check.response(response, poly.value)check.response(response, poly.value)
response |
A numeric matrix of dimension
Each cell contains the observed response value for a subject on an indicator. |
poly.value |
An integer vector of length |
Logical value indicating validation status:
TRUE if either:
All columns have identical numbers of unique values (regardless of poly.value specification)
Each column's unique value count matches its corresponding poly.value entry
FALSE if any column's unique value count mismatches its specified poly.value
(when columns have varying category counts)
This function contains a specific behavior: When all indicators have identical numbers of
unique response categories, it returns TRUE immediately without validating against
poly.value. This may lead to unexpected results if poly.value contains
inconsistent expectations. Users should ensure poly.value accurately reflects
their measurement model.
# Valid case: Matching category counts resp_matrix <- matrix(c(1,1,2,2, 1,2,3,1), ncol = 2) check.response(resp_matrix, poly.value = c(2, 3)) # Returns TRUE # Invalid case: Mismatched category counts check.response(resp_matrix, poly.value = c(2, 2)) # Returns FALSE # Special case: Uniform category counts bypass poly.value check uniform_resp <- matrix(rep(1:2, each = 4), ncol = 2) check.response(uniform_resp, poly.value = c(2, 5)) # Returns TRUE (bypass behavior)# Valid case: Matching category counts resp_matrix <- matrix(c(1,1,2,2, 1,2,3,1), ncol = 2) check.response(resp_matrix, poly.value = c(2, 3)) # Returns TRUE # Invalid case: Mismatched category counts check.response(resp_matrix, poly.value = c(2, 2)) # Returns FALSE # Special case: Uniform category counts bypass poly.value check uniform_resp <- matrix(rep(1:2, each = 4), ncol = 2) check.response(uniform_resp, poly.value = c(2, 5)) # Returns TRUE (bypass behavior)
Compares two nested latent class/profile models using multiple fit indices, likelihood ratio tests, and classification metrics.
compare.model(object1, object2, n.Bootstrap = 0)compare.model(object1, object2, n.Bootstrap = 0)
object1 |
An object of class |
object2 |
An object of class |
n.Bootstrap |
Integer specifying the number of bootstrap replications for the parametric
bootstrap likelihood ratio test (BLRT). Default is |
This function performs comprehensive model comparison between two nested LCA/LPA models. Key features include:
Automatically orders models by parameter count (smaller model first)
Computes multiple fit indices via get.fit.index
Calculates classification quality metrics (entropy, average posterior probabilities)
Performs three types of likelihood ratio tests:
Standard LRT, see LRT.test
VLMR adjusted LRT, see LRT.test.VLMR
Parametric bootstrap LRT (computationally intensive but robust), see LRT.test.Bootstrap
Computes Bayes Factor using Sample-Size Adjusted BIC (SIC)
Important Requirements:
Both models must be of the same type (LCA or LPA)
Models must be nested (one model is a constrained version of the other)
n.Bootstrap > 0 requires significant computational resources
An object of class compare.model containing:
nparNamed vector with number of free parameters for each model
entropyNamed vector with entropy values (classification accuracy measure) for each model
AvePPList containing average posterior probabilities per latent class/profile
fit.indexList of get.fit.index objects for both models
BFBayes Factor for model comparison (based on SIC)
LRT.objLikelihood ratio test (LRT) results
LRT.VLMR.objVuong-Lo-Mendell-Rubin (VLMR) adjusted LRT results
LRT.Bootstrap.objBootstrap LRT results (if n.Bootstrap > 0)
callThe matched function call
argumentsList containing the original arguments passed to the function
LCA, LPA, get.fit.index,
extract, LRT.test, LRT.test.VLMR
library(LCPA) set.seed(123) data.obj <- sim.LPA(N = 500, I = 5, L = 3, constraint = "V0") response <- data.obj$response # need Mplus ## Not run: # Compare 3-class vs 4-class LPA models object1 <- LPA(response, L = 3, method = "Mplus", constraint = "V0") object2 <- LPA(response, L = 4, method = "Mplus", constraint = "V0") compare.model.obj <- compare.model(object1, object2) print(compare.model.obj) ## End(Not run)library(LCPA) set.seed(123) data.obj <- sim.LPA(N = 500, I = 5, L = 3, constraint = "V0") response <- data.obj$response # need Mplus ## Not run: # Compare 3-class vs 4-class LPA models object1 <- LPA(response, L = 3, method = "Mplus", constraint = "V0") object2 <- LPA(response, L = 4, method = "Mplus", constraint = "V0") compare.model.obj <- compare.model(object1, object2) print(compare.model.obj) ## End(Not run)
A generic S3 extractor function designed to retrieve internal components from various model and simulation objects
produced by the LCPA package. This function provides a consistent interface across different classes,
allowing users to access estimated parameters, fit statistics, simulation truths, standard errors, and more.
extract(object, what, ...) ## S3 method for class 'LCA' extract(object, what, ...) ## S3 method for class 'LPA' extract(object, what, ...) ## S3 method for class 'LCPA' extract(object, what, ...) ## S3 method for class 'LTA' extract(object, what, ...) ## S3 method for class 'sim.LCA' extract(object, what, ...) ## S3 method for class 'sim.LPA' extract(object, what, ...) ## S3 method for class 'sim.LTA' extract(object, what, ...) ## S3 method for class 'fit.index' extract(object, what, ...) ## S3 method for class 'compare.model' extract(object, what, ...) ## S3 method for class 'SE' extract(object, what, ...)extract(object, what, ...) ## S3 method for class 'LCA' extract(object, what, ...) ## S3 method for class 'LPA' extract(object, what, ...) ## S3 method for class 'LCPA' extract(object, what, ...) ## S3 method for class 'LTA' extract(object, what, ...) ## S3 method for class 'sim.LCA' extract(object, what, ...) ## S3 method for class 'sim.LPA' extract(object, what, ...) ## S3 method for class 'sim.LTA' extract(object, what, ...) ## S3 method for class 'fit.index' extract(object, what, ...) ## S3 method for class 'compare.model' extract(object, what, ...) ## S3 method for class 'SE' extract(object, what, ...)
object |
An object of one of the following classes:
|
what |
A character string specifying the name of the component to extract.
Valid choices depend on the class of |
... |
Additional arguments passed to methods (currently ignored). |
This function supports extraction from ten primary object classes. Below are available components for each:
LCALatent Class Analysis model results. Available components:
paramsList containing all estimated model parameters.
par3D array () of conditional response probabilities.
P.ZVector of length with latent class prior probabilities.
nparNumber of free parameters in the model.
Log.LikLog-likelihood of the final model.
AICAkaike Information Criterion.
BICBayesian Information Criterion.
best_BICBest BIC value across replication runs (if nrep > 1).
P.Z.Xn matrix of posterior class probabilities.
ZVector of length with MAP-classified latent class memberships.
probabilityList of formatted conditional probability matrices per item.
Log.Lik.historyVector tracking log-likelihood at each EM iteration.
Log.Lik.nrepVector of log-likelihoods from each replication run.
modelTrained neural network model object (only when method="NNE").
callThe original function call used for model estimation.
argumentsList containing all input arguments passed to the LCA function.
LPALatent Profile Analysis model results. Available components:
paramsList containing all estimated model parameters.
means matrix of estimated mean vectors for each profile.
covs array of estimated covariance matrices.
P.ZVector of length with profile prior probabilities.
nparNumber of free parameters (depends on constraint).
Log.LikLog-likelihood of the final model.
AICAkaike Information Criterion.
BICBayesian Information Criterion.
best_BICBest BIC value across replication runs (if nrep > 1).
P.Z.Xn matrix of posterior profile probabilities.
ZVector of length with MAP-classified profile memberships.
Log.Lik.historyVector tracking log-likelihood at each EM iteration.
Log.Lik.nrepVector of log-likelihoods from each replication run.
modelTrained model object (neural network or Mplus).
callThe original function call used for model estimation.
argumentsList containing all input arguments passed to the LPA function.
constraintCovariance structure constraints applied during estimation (from original arguments).
LCPALatent Class/Profile Analysis (with covariates). Available components:
betaInitial class coefficients (p1 x L matrix).
beta.seStandard errors for beta.
beta.Z.staZ-statistics for beta.
beta.p.value.tail1One-tailed p-values for beta.
beta.p.value.tail2Two-tailed p-values for beta.
P.Z.XnPosterior probabilities (N x L).
P.ZPrior proportions (length L).
ZModal class assignments (length N).
nparNumber of free parameters.
Log.LikLog-likelihood.
AICAIC.
BICBIC.
iterationsOptimization iterations in Step 3.
coveragedLogical: did optimization converge early?
paramsStep 1 model parameters (LCA/LPA output).
callFunction call.
argumentsInput arguments list.
LTALatent Transition Analysis model results. Available components:
betaInitial class coefficients (p1 x L matrix).
gammaTransition coefficients (nested list).
beta.seStandard errors for beta.
gamma.seStandard errors for gamma.
beta.Z.staZ-statistics for beta.
gamma.Z.staZ-statistics for gamma.
beta.p.value.tail1One-tailed p-values for beta.
gamma.p.value.tail1One-tailed p-values for gamma.
beta.p.value.tail2Two-tailed p-values for beta.
gamma.p.value.tail2Two-tailed p-values for gamma.
P.Z.XnsList of posterior probabilities per time (each N x L).
P.ZsList of prior proportions per time (each length L).
ZsList of modal class assignments per time (each length N).
nparNumber of free parameters.
Log.LikLog-likelihood.
AICAIC.
BICBIC.
iterationsOptimization iterations in Step 3.
coveragedLogical: did optimization converge early?
paramsStep 1 model parameters (LCA/LPA output).
callFunction call.
argumentsInput arguments list.
sim.LCASimulated Latent Class Analysis data. Available components:
responseInteger matrix () of simulated categorical observations.
parArray () of true class-specific category probabilities.
ZInteger vector (length ) of true latent class assignments.
P.ZNumeric vector (length ) of true class proportions.
poly.valueInteger vector (length ) specifying categories per variable.
P.Z.XnBinary matrix () of true class membership indicators.
callThe original function call used for simulation.
argumentsList containing all input arguments passed to sim.LCA.
sim.LPASimulated Latent Profile Analysis data. Available components:
responseNumeric matrix () of simulated continuous observations.
means matrix of true class-specific means.
covs array of true class-specific covariance matrices.
P.Z.Xn matrix of true class membership indicators.
P.ZNumeric vector (length ) of true class proportions.
ZInteger vector (length ) of true profile assignments.
constraintOriginal constraint specification passed to sim.LPA.
callThe original function call used for simulation.
argumentsList containing all input arguments passed to sim.LPA.
sim.LTASimulated Latent Transition Analysis data. Available components:
responsesList of response matrices per time point.
ZsList of true latent class assignments per time.
P.ZsList of true class proportions per time.
parTrue conditional probabilities (for categorical items).
meansTrue profile means (for continuous variables).
covsTrue covariance matrices per class and time.
poly.valueCategories per variable (for categorical items).
rateTransition rate matrix or structure.
covariatesSimulated covariate matrix.
betaTrue initial class coefficients.
gammaTrue transition coefficients.
callOriginal simulation function call.
argumentsInput arguments used in simulation.
fit.indexModel fit indices object. Available components:
nparNumber of free parameters in the model.
Log.LikLog-likelihood of the model.
-2LLDeviance statistic (-2 times log-likelihood).
AICAkaike Information Criterion.
BICBayesian Information Criterion.
SICSample-Size Adjusted BIC (-0.5 * BIC).
CAICConsistent AIC.
AWEApproximate Weight of Evidence.
SABICSample-Size Adjusted BIC (alternative formulation).
callOriginal function call that generated the fit indices.
argumentsList containing input arguments (includes original model object).
compare.modelModel comparison results. Available components:
nparNamed numeric vector with free parameters for each model (model1, model2).
entropyNamed numeric vector with entropy values for each model.
AvePPList of average posterior probabilities per class/profile for each model.
fit.indexList of get.fit.index objects for both models.
BFBayes Factor comparing models (based on SIC differences).
LRT.objStandard likelihood ratio test results (requires nested models).
LRT.VLMR.objVuong-Lo-Mendell-Rubin adjusted likelihood ratio test results.
LRT.Bootstrap.objParametric bootstrap likelihood ratio test results (if n.Bootstrap > 0).
callThe matched function call used for comparison.
argumentsList containing original input arguments (object1, object2, n.Bootstrap).
SEStandard error estimation results. Available components:
seList containing standard errors for parameters (components depend on model type).
vcovVariance-covariance matrix (only for method="Obs").
hessianObserved information matrix (only for method="Obs").
diagnosticsMethod-specific diagnostic information (e.g., estimation method).
callFunction call that generated the object.
argumentsInput arguments used in estimation.
meansStandard errors for profile means (LPA models only — accessed via se list).
covsStandard errors for covariance parameters (LPA models only — accessed via se list).
P.ZStandard errors for class proportions (both LCA/LPA — accessed via se list).
parStandard errors for conditional probabilities (LCA models only — accessed via se list).
The requested component. Return type varies depending on what and the class of object.
If an invalid what is provided, an informative error is thrown listing valid options.
extract(LCA): Extract fields from a LCA object
extract(LPA): Extract fields from a LPA object
extract(LCPA): Extract fields from a LCPA object
extract(LTA): Extract fields from a LTA object
extract(sim.LCA): Extract fields from a sim.LCA object
extract(sim.LPA): Extract fields from a sim.LPA object
extract(sim.LTA): Extract fields from a sim.LTA object
extract(fit.index): Extractor method for fit.index objects
extract(compare.model): Extract fields from a compare.model object
extract(SE): Extract fields from a SE object
For LCA, LPA, LCPA, and LTA objects, components reflect estimated parameters.
For sim.LCA, sim.LPA, and sim.LTA objects, components reflect true data-generating parameters.
In SE objects:
Top-level components like vcov and hessian are only available when method = "Obs".
Requesting them under Bootstrap triggers a warning and returns NULL.
Parameter-specific SEs (e.g., means, par) are stored within the se list.
You can extract them directly by name (e.g., extract(se_obj, "means")).
Attempting to extract unavailable parameter SEs (e.g., par from an LPA model) triggers an error with available options.
For fit.index and compare.model objects, valid components are dynamically determined from the object’s names.
All methods ignore additional arguments (...).
set.seed(123) # Simulate LPA data: 500 observations, 3 continuous variables, 2 latent profiles # Constraint "E0": Equal variances across classes, zero covariances data.obj <- sim.LPA(N = 500, I = 3, L = 2, constraint = "E0") # Extract the simulated response matrix (N x I) for model fitting response <- extract(data.obj, "response") # Extract the TRUE covariance matrices (I x I x L array) extract(data.obj, "covs") # Fit an LPA model to the simulated data using the SAME constraint ("E0") fit_E0 <- LPA(response, L = 2, constraint = "E0") # Extract the ESTIMATED covariance matrices from the fitted model extract(fit_E0, "covs") # Simulate LCA data: 30 observations, 5 categorical items, 3 latent classes sim_data <- sim.LCA(N = 30, I = 5, L = 3) # Extract the TRUE conditional probability array extract(sim_data, "par")set.seed(123) # Simulate LPA data: 500 observations, 3 continuous variables, 2 latent profiles # Constraint "E0": Equal variances across classes, zero covariances data.obj <- sim.LPA(N = 500, I = 3, L = 2, constraint = "E0") # Extract the simulated response matrix (N x I) for model fitting response <- extract(data.obj, "response") # Extract the TRUE covariance matrices (I x I x L array) extract(data.obj, "covs") # Fit an LPA model to the simulated data using the SAME constraint ("E0") fit_E0 <- LPA(response, L = 2, constraint = "E0") # Extract the ESTIMATED covariance matrices from the fitted model extract(fit_E0, "covs") # Simulate LCA data: 30 observations, 5 categorical items, 3 latent classes sim_data <- sim.LCA(N = 30, I = 5, L = 3) # Extract the TRUE conditional probability array extract(sim_data, "par")
Computes the average posterior probability for the most likely class assignment
in latent class/profile analysis. This metric quantifies classification precision.
The total average posterior probability (Nylund-Gibson & Choi, 2018) indicate adequate classification quality.
get.AvePP(object)get.AvePP(object)
object |
An object of class
|
A matrix with the following structure:
Rows: Represent each latent class (1 to L) and a final "Total" row.
Columns: Represent each latent class (1 to L) and a final "Total" column.
Diagonal elements : Average posterior probability for observations assigned to class .
That is,
where is the number of observations assigned to class , and .
Off-diagonal elements (): Average posterior probability of class
among observations assigned to class . Useful for assessing classification confusion.
Bottom-right corner : Overall average posterior probability across all observations,
Classification quality is considered acceptable if (Nylund-Gibson & Choi, 2018).
Nylund-Gibson, K., & Choi, A. Y. (2018). Ten frequently asked questions about latent class analysis. Translational Issues in Psychological Science, 4(4), 440-461. https://doi.org/10.1037/tps0000176
# Example with simulated data set.seed(123) data.obj <- sim.LCA(N = 500, I = 4, L = 2, IQ=0.9) response <- data.obj$response # Fit 2-class model with EM algorithm fit.em <- LCA(response, L = 2, method = "EM", nrep = 10) AvePP_value <- get.AvePP(fit.em) print(AvePP_value)# Example with simulated data set.seed(123) data.obj <- sim.LCA(N = 500, I = 4, L = 2, IQ=0.9) response <- data.obj$response # Fit 2-class model with EM algorithm fit.em <- LCA(response, L = 2, method = "EM", nrep = 10) AvePP_value <- get.AvePP(fit.em) print(AvePP_value)
Computes the Classification Error Probability (CEP) matrices (Liang et al., 2023) used in the bias-corrected three-step estimation of Latent Class/Profile Analysis with Covariates.
get.CEP(P.Z.Xns, time.cross = TRUE)get.CEP(P.Z.Xns, time.cross = TRUE)
P.Z.Xns |
A list of length
The list must be ordered chronologically (e.g., time 1 to |
time.cross |
Logical. If |
The CEP matrix at time gives the probability that an individual truly belongs
to latent class given that they were assigned (via modal assignment)
to class at time .
Formally, for time point :
where:
is the true latent class of individual at time ;
is the posterior probability from the first-step model;
is the modal (most likely) assigned class;
is the observed proportion assigned to class at time ;
is the total sample size.
If time.cross = TRUE (default), a single pooled CEP matrix is computed by
aggregating counts across all time points. This assumes the classification error
structure is invariant over time (i.e., measurement invariance), as in
Liang et al. (2023). The same pooled matrix is then returned for every time point.
A named list of length . Each element is an matrix:
Row : true latent class;
Column : individuals assigned to class ;
Entry : estimated
.
When time.cross = TRUE, all matrices in the list are identical.
Names are "t1", "t2", ..., "tT".
Assumes complete data (no missing values in posterior matrices).
All matrices in P.Z.Xns must have identical dimensions
(same and ).
Assignment is based on modal class (which.max).
If no individual is assigned to a class at a time point, division by zero may occur.
Liang, Q., la Torre, J. d., & Law, N. (2023). Latent Transition Cognitive Diagnosis Model With Covariates: A Three-Step Approach. Journal of Educational and Behavioral Statistics, 48(6), 690-718. https://doi.org/10.3102/10769986231163320
# Simulate posterior probabilities for 2 time points, 3 classes, 100 individuals set.seed(123) N <- 100; L <- 3; times <- 2 P.Z.Xns <- replicate(times, t(apply(matrix(runif(N * L), N, L), 1, function(x) x / sum(x))), simplify = FALSE) # Compute time-specific CEP matrices cep_time_specific <- get.CEP(P.Z.Xns, time.cross = FALSE) # Compute time-invariant (pooled) CEP matrix cep_pooled <- get.CEP(P.Z.Xns, time.cross = TRUE)# Simulate posterior probabilities for 2 time points, 3 classes, 100 individuals set.seed(123) N <- 100; L <- 3; times <- 2 P.Z.Xns <- replicate(times, t(apply(matrix(runif(N * L), N, L), 1, function(x) x / sum(x))), simplify = FALSE) # Compute time-specific CEP matrices cep_time_specific <- get.CEP(P.Z.Xns, time.cross = FALSE) # Compute time-invariant (pooled) CEP matrix cep_pooled <- get.CEP(P.Z.Xns, time.cross = TRUE)
Computes the relative entropy statistic to evaluate classification quality in Latent Class Analysis (LCA) or Latent Profile Analysis (LPA) models. Entropy measures how accurately cases are assigned to latent classes based on posterior probabilities, with values closer to 1 indicating better separation between classes.
get.entropy(object)get.entropy(object)
object |
An object of class
|
A numeric value between 0 and 1 representing the relative entropy (Nylund-Gibson et al., 2018; Clark et al., 2013):
1.0: Perfect classification (each case belongs exclusively to one class)
0.8-1.0: Good classification quality
0.6-0.8: Moderate classification quality
< 0.6: Poor classification quality (consider model simplification)
Calculated using the formula:
where:
= Sample size
= Number of latent classes
= Posterior probability of observation belonging to class
The calculation includes a small constant (1e-10) to avoid log(0)
instability when posterior probabilities are exactly zero.
Values should be interpreted alongside other diagnostics (BIC, bootstrapped LRT) as high entropy alone doesn't guarantee model validity. Low entropy may indicate:
Overly complex model (too many classes)
Poorly measured latent constructs
Violation of local independence assumption
Nylund-Gibson, K., & Choi, A. Y. (2018). Ten frequently asked questions about latent class analysis. Translational Issues in Psychological Science, 4(4), 440-461. https://doi.org/10.1037/tps0000176
Clark, S. L., Muthén, B., Kaprio, J., D'Onofrio, B. M., Viken, R., & Rose, R. J. (2013). Models and Strategies for Factor Mixture Analysis: An Example Concerning the Structure Underlying Psychological Disorders. Structural Equation Modeling: A Multidisciplinary Journal, 20(4), 681-703. https://doi.org/10.1080/10705511.2013.824786
# Example with simulated data set.seed(123) data.obj <- sim.LCA(N = 500, I = 4, L = 2, IQ=0.9) response <- data.obj$response # Fit 2-class model with EM algorithm fit.em <- LCA(response, L = 2, method = "EM", nrep = 10) entropy_value <- get.entropy(fit.em) cat("Classification entropy:", round(entropy_value, 3), "\n")# Example with simulated data set.seed(123) data.obj <- sim.LCA(N = 500, I = 4, L = 2, IQ=0.9) response <- data.obj$response # Fit 2-class model with EM algorithm fit.em <- LCA(response, L = 2, method = "EM", nrep = 10) entropy_value <- get.entropy(fit.em) cat("Classification entropy:", round(entropy_value, 3), "\n")
Computes a comprehensive set of model fit indices for objects returned by
LCA or LPA. These indices balance model
fit (log-likelihood) with model complexity (number of parameters) to facilitate
model selection. All indices are derived from the observed-data log-likelihood
and parameter count.
get.fit.index(object)get.fit.index(object)
object |
An object of class
|
An object of class "fit.index" containing:
Number of free parameters in the model
Log-likelihood of the model:
Deviance statistic:
,
where is the prior probability of class ,
is the probability density/mass function (multivariate normal for LPA,
multinomial for LCA), and are class-specific parameters.
Akaike Information Criterion:
,
where = number of free parameters. Lower values indicate better fit.
Bayesian Information Criterion:
,
where = sample size. Incorporates stronger penalty for complexity than AIC.
Sample-Size Adjusted BIC:
.
Equivalent to . Often used in latent class modeling.
Consistent AIC:
.
Consistent version of AIC that converges to true model as .
Approximate Weight of Evidence:
.
Penalizes complexity more heavily than CAIC.
Sample-Size Adjusted BIC:
.
Recommended for latent class/profile analysis with moderate sample sizes.
# Fit LPA model set.seed(123) data.obj <- sim.LPA(N = 100, I = 3, L = 2, constraint = "E0") fit <- LPA(data.obj$response, L = 2, constraint = "VV", method = "EM") # Compute fit indices fit_indices <- get.fit.index(fit) fit_indices extract(fit_indices, "SABIC")# Fit LPA model set.seed(123) data.obj <- sim.LPA(N = 100, I = 3, L = 2, constraint = "E0") fit <- LPA(data.obj$response, L = 2, constraint = "VV", method = "EM") # Compute fit indices fit_indices <- get.fit.index(fit) fit_indices extract(fit_indices, "SABIC")
Computes the log-likelihood of observed categorical data under a Latent Class Analysis (LCA) model given class probabilities and conditional response probabilities. The calculation assumes local independence of responses conditional on latent class membership.
get.Log.Lik.LCA(response, P.Z, par)get.Log.Lik.LCA(response, P.Z, par)
response |
A numeric matrix of dimension
|
P.Z |
A numeric vector of length
|
par |
A 3-dimensional array of dimension
|
The log-likelihood calculation follows these steps:
Response Standardization:
Original responses are converted to 0-based integers
using adjust.response.
For example, original values {1,2,5} become {0,1,2}
(ordered and relabeled sequentially).
Class-Specific Likelihood:
For each observation and class , compute:
where is the standardized response value, and probabilities are taken from par[l, i, x_{ni}+1].
Marginal Likelihood:
For each observation , combine class-specific likelihoods weighted by class probabilities:
Log Transformation: Sum log-transformed marginal likelihoods across all observations:
A single numeric value representing the total log-likelihood:
where is the standardized (0-based) response for person on indicator .
Computes the log-likelihood of observed continuous data under a Latent Profile Analysis (LPA) model with multivariate normal distributions within each latent profile. Implements robust numerical techniques to handle near-singular covariance matrices.
get.Log.Lik.LPA(response, P.Z, means, covs, jitter = 1e-10)get.Log.Lik.LPA(response, P.Z, means, covs, jitter = 1e-10)
response |
A numeric matrix of dimension |
P.Z |
A numeric vector of length
|
means |
A matrix of dimension |
covs |
An array of dimension |
jitter |
A small positive constant (default: 1e-10) added to diagonal elements of covariance matrices to ensure numerical stability during Cholesky decomposition. |
The log-likelihood calculation follows these steps:
Covariance Stabilization:
Each covariance matrix is symmetrized as .
If Cholesky decomposition fails:
Add jitter to diagonal elements iteratively (up to 10 attempts, scaling jitter by 10x each attempt)
Fall back to diagonal covariance matrix if decomposition still fails
Profile-Specific Density for observation in profile :
Computed efficiently using Cholesky decomposition where applicable.
Joint Probability for observation and profile :
uses to avoid undefined values.
Marginal Likelihood per observation using log-sum-exp trick for numerical stability:
where .
Total Log-Likelihood: Sum of across all observations .
A single numeric value representing the total log-likelihood:
where denotes the multivariate normal density function.
Critical implementation details:
Cholesky Decomposition: For non-degenerate cases (), used to compute:
and quadratic form
Univariate Handling: When , computes density directly without decomposition
Numerical Safeguards:
Densities clamped to when non-finite
Marginal likelihoods clamped to when non-finite
Explicit dimension checks for means and covs
Assumptions:
Multivariate normality within profiles
No missing data in response
Positive-definite covariance matrices (after stabilization)
Computes the observed-data log-likelihood for a Latent Transition Analysis (LTA) model
using the three-step approach with measurement error correction. The likelihood integrates over
all possible latent class paths while incorporating classification uncertainty via
Classification Error Probability (CEP) matrices. This function is designed to work with
parameters estimated from the LTA function.
get.Log.Lik.LTA( params, CEP, P.Z.Xns, Zs, covariates, covariates.timeCross = FALSE )get.Log.Lik.LTA( params, CEP, P.Z.Xns, Zs, covariates, covariates.timeCross = FALSE )
params |
A named
|
CEP |
A
where |
P.Z.Xns |
A
the posterior probability of individual |
Zs |
A |
covariates |
A |
covariates.timeCross |
Logical. If |
The log-likelihood calculation follows these steps:
Latent Path Enumeration:
All possible latent class trajectories are generated and cached.
Initial Class Probabilities (time 1):
For individual , compute using multinomial logit with covariates :
where (reference class constraint). Numerical stabilization
is applied via subtraction of the maximum linear predictor.
Transition Probabilities (times ):
For transition from class at time to class at time :
where for all (reference class constraint).
Path-Specific Likelihood:
For each path and individual :
Compute path probability:
Apply CEP weights:
Multiply path probability by CEP weights
Marginalization:
Sum path-specific likelihoods over all paths for each individual ,
then sum log-transformed marginal likelihoods across all individuals.
A single numeric value representing the total observed-data log-likelihood:
where is a latent class path, is the modal assignment,
and denotes all model parameters (beta, gama).
When no covariates are included:
Initial probabilities reduce to (multinomial probabilities)
Transition probabilities reduce to
(time-specific Markov transition probabilities)
LTA for three-step LTA estimation,
get.CEP for CEP matrix computation
Computes the total number of free parameters in an LCA model based on the number of categories per observed variable and the number of latent classes. This follows standard LCA parameterization with local independence assumption.
get.npar.LCA(poly.value, L)get.npar.LCA(poly.value, L)
poly.value |
A numeric vector of length |
L |
Integer specifying the number of latent classes. |
Parameter count derivation:
Conditional response probabilities: parameters
Independent class proportions: parameters (since )
For each observed variable with categories:
Each latent class requires conditional probabilities
With constraints for each class
Global constraints reduce total parameters to per variable
Integer representing the total number of free parameters in the model:
# Example 1: 3 binary variables (K_i=2), 2 latent classes poly.value <- c(2, 2, 2) # Three binary variables L <- 2 npar <- sum(poly.value * L - 1) + (L - 1) # = (4-1)+(4-1)+(4-1) + 1 = 3+3+3+1 = 10 get.npar.LCA(poly.value, L) # Returns 10 # Example 2: Mixed variable types (binary, ternary, quaternary) poly.value <- c(2, 3, 4) # Variables with 2, 3, and 4 categories L <- 3 npar <- sum(poly.value * L - 1) + (L - 1) # = (6-1)+(9-1)+(12-1) + 2 = 5+8+11+2 = 26 get.npar.LCA(poly.value, L) # Returns 26 # Example 3: Single polytomous variable with 5 categories, 4 latent classes poly.value <- 5 L <- 4 npar <- sum(poly.value * L - 1) + (L - 1) # = (20-1) + 3 = 19+3 = 22 get.npar.LCA(poly.value, L) # Returns 22# Example 1: 3 binary variables (K_i=2), 2 latent classes poly.value <- c(2, 2, 2) # Three binary variables L <- 2 npar <- sum(poly.value * L - 1) + (L - 1) # = (4-1)+(4-1)+(4-1) + 1 = 3+3+3+1 = 10 get.npar.LCA(poly.value, L) # Returns 10 # Example 2: Mixed variable types (binary, ternary, quaternary) poly.value <- c(2, 3, 4) # Variables with 2, 3, and 4 categories L <- 3 npar <- sum(poly.value * L - 1) + (L - 1) # = (6-1)+(9-1)+(12-1) + 2 = 5+8+11+2 = 26 get.npar.LCA(poly.value, L) # Returns 26 # Example 3: Single polytomous variable with 5 categories, 4 latent classes poly.value <- 5 L <- 4 npar <- sum(poly.value * L - 1) + (L - 1) # = (20-1) + 3 = 19+3 = 22 get.npar.LCA(poly.value, L) # Returns 22
Computes the total number of free parameters in an LPA model based on the number of observed
variables (), number of latent profiles (), and covariance structure constraints.
get.npar.LPA(I, L, constraint = "VV")get.npar.LPA(I, L, constraint = "VV")
I |
Integer specifying the number of continuous observed variables. |
L |
Integer specifying the number of latent profiles. |
constraint |
Character string specifying covariance structure constraints. Supported options:
Default: |
Parameter count breakdown:
Fixed components (always present):
Profile-specific means: parameters
Independent class proportions: parameters (since )
Covariance parameters (varies by constraint):
:"UE": 1 shared variance parameter
"UV": profile-specific variance parameters
:"E0": shared variance parameters (no covariances)
"V0": profile-specific variance parameters (no covariances)
"EE": parameters for one shared full covariance matrix
"VE": diagonal variances (free per profile) + off-diagonal covariances (shared across profiles)
"EV": diagonal variances (shared across profiles) + off-diagonal covariances (free per profile)
"VV": parameters for distinct full covariance matrices
Integer representing the total number of free parameters in the model:
Important considerations:
For , only "UE" and "UV" are meaningful; "EE", "E0", "VV", "V0", etc., are treated as "UE" or "UV" respectively.
Covariance parameters count only free elements in symmetric matrices (diagonal + upper triangle).
If an user-defined constraint is provided, the function defaults to "VV"
behavior but subtracts .
"VE" and "EV" constraints require to be meaningful (otherwise no covariances exist).
# Univariate examples (I=1) get.npar.LPA(I = 1, L = 2, constraint = "UE") get.npar.LPA(I = 1, L = 3, constraint = "UV") # Multivariate examples (I=3) get.npar.LPA(I = 3, L = 2, constraint = "E0") get.npar.LPA(I = 3, L = 2, constraint = "V0") get.npar.LPA(I = 3, L = 2, constraint = "EE") get.npar.LPA(I = 3, L = 2, constraint = "VV") get.npar.LPA(I = 3, L = 2, constraint = "VE") get.npar.LPA(I = 3, L = 2, constraint = "EV") # User defined example get.npar.LPA(I = 3, L = 2, constraint = list(c(1, 2), c(3, 3)))# Univariate examples (I=1) get.npar.LPA(I = 1, L = 2, constraint = "UE") get.npar.LPA(I = 1, L = 3, constraint = "UV") # Multivariate examples (I=3) get.npar.LPA(I = 3, L = 2, constraint = "E0") get.npar.LPA(I = 3, L = 2, constraint = "V0") get.npar.LPA(I = 3, L = 2, constraint = "EE") get.npar.LPA(I = 3, L = 2, constraint = "VV") get.npar.LPA(I = 3, L = 2, constraint = "VE") get.npar.LPA(I = 3, L = 2, constraint = "EV") # User defined example get.npar.LPA(I = 3, L = 2, constraint = list(c(1, 2), c(3, 3)))
Computes the total number of free parameters in a Latent Transition Analysis (LTA) model estimated via the three-step approach. The count depends on the number of latent classes, the number of time points, the number of covariates at each time point, and whether transition coefficients are constrained to be equal across time.
get.npar.LTA(covariates.ncol, L, covariates.timeCross = FALSE)get.npar.LTA(covariates.ncol, L, covariates.timeCross = FALSE)
covariates.ncol |
An integer vector of length |
L |
Integer scalar. Number of latent classes ( |
covariates.timeCross |
Logical. If |
Parameterization:
Multinomial logit model with classes (last class is reference).
Number of free parameters: .
):For each transition, a multinomial logit model conditioned on previous class.
For each origin class and destination class (),
there is a coefficient vector of length .
Total per transition: parameters.
The constraint covariates.timeCross determines whether these parameters
are shared across transitions.
Integer representing the total number of free parameters:
where:
time-invariant effects corresponds to covariates.timeCross = TRUE
time-varying effects corresponds to covariates.timeCross = FALSE
Critical assumptions:
The last latent class () is always the reference category for all logits.
When covariates.timeCross = TRUE, it is assumed that all time points after the first
have identical covariate structures (). If violated, the function
uses for all transitions to match LTA's internal behavior.
For , no transition parameters are estimated (pure latent class/profile analysis).
# Example 1: 2 time points, 2 classes, time-invariant transition coefficients # Time1: 2 covariates (intercept + 1 predictor) # Time2: 3 covariates (but constrained to match Time1 due to timeCross=TRUE) covariates.ncol <- c(2, 3) L <- 2 get.npar.LTA(covariates.ncol, L, covariates.timeCross = TRUE) # Example 2: Same as above but time-varying coefficients get.npar.LTA(covariates.ncol, L, covariates.timeCross = FALSE) # Example 3: 3 time points, 3 classes, time-invariant coefficients covariates.ncol <- c(2, 2, 2) # All time points have identical covariates L <- 3 get.npar.LTA(covariates.ncol, L, covariates.timeCross = TRUE) # Example 4: 3 time points, 3 classes, time-varying coefficients covariates.ncol <- c(2, 3, 4) L <- 3 get.npar.LTA(covariates.ncol, L, covariates.timeCross = FALSE) # Example 5: Single time point (equivalent to LCA) covariates.ncol <- c(3) L <- 4 get.npar.LTA(covariates.ncol, L)# Example 1: 2 time points, 2 classes, time-invariant transition coefficients # Time1: 2 covariates (intercept + 1 predictor) # Time2: 3 covariates (but constrained to match Time1 due to timeCross=TRUE) covariates.ncol <- c(2, 3) L <- 2 get.npar.LTA(covariates.ncol, L, covariates.timeCross = TRUE) # Example 2: Same as above but time-varying coefficients get.npar.LTA(covariates.ncol, L, covariates.timeCross = FALSE) # Example 3: 3 time points, 3 classes, time-invariant coefficients covariates.ncol <- c(2, 2, 2) # All time points have identical covariates L <- 3 get.npar.LTA(covariates.ncol, L, covariates.timeCross = TRUE) # Example 4: 3 time points, 3 classes, time-varying coefficients covariates.ncol <- c(2, 3, 4) L <- 3 get.npar.LTA(covariates.ncol, L, covariates.timeCross = FALSE) # Example 5: Single time point (equivalent to LCA) covariates.ncol <- c(3) L <- 4 get.npar.LTA(covariates.ncol, L)
Computes posterior probabilities of latent class membership for each observation
using fixed conditional response probabilities (par) and fixed class prior
probabilities (P.Z).
get.P.Z.Xn.LCA(response, par, P.Z)get.P.Z.Xn.LCA(response, par, P.Z)
response |
Numeric matrix ( |
par |
3D array (
|
P.Z |
Vector of length |
Unlike an EM algorithm, this function does NOT iteratively update class prevalences. It performs a single calculation step based on Bayes' theorem:
where are the fixed priors provided in the P.Z argument.
Numeric matrix () of posterior probabilities.
Rows sum to 1. Columns are named "Class.1", "Class.2", etc.
library(LCPA) set.seed(123) # Simulate data data.obj <- sim.LCA(N = 200, I = 3, L = 2, IQ = 0.85) # Fit a model to get parameters fit <- LCA(data.obj$response, L = 2, method = "EM", nrep = 5) # Calculate posteriors using fixed parameters from the fitted model P.Z.Xn <- get.P.Z.Xn.LCA( response = data.obj$response, par = fit$params$par, P.Z = fit$params$P.Z ) head(P.Z.Xn)library(LCPA) set.seed(123) # Simulate data data.obj <- sim.LCA(N = 200, I = 3, L = 2, IQ = 0.85) # Fit a model to get parameters fit <- LCA(data.obj$response, L = 2, method = "EM", nrep = 5) # Calculate posteriors using fixed parameters from the fitted model P.Z.Xn <- get.P.Z.Xn.LCA( response = data.obj$response, par = fit$params$par, P.Z = fit$params$P.Z ) head(P.Z.Xn)
Computes posterior probabilities of latent profile membership for each observation using fixed profile parameters (means, covariances) and fixed prior probabilities.
get.P.Z.Xn.LPA(response, means, covs, P.Z)get.P.Z.Xn.LPA(response, means, covs, P.Z)
response |
Numeric matrix ( |
means |
Numeric matrix (
Row |
covs |
3D array (
Each slice must be symmetric and positive definite. |
P.Z |
Vector of length |
Unlike an EM algorithm, this function does NOT iteratively update profile prevalences. It performs a single E-step calculation:
where are the fixed priors provided in the P.Z argument.
Numeric matrix () of posterior probabilities.
Rows sum to 1. Columns are named "Class.1", "Class.2", etc.
library(LCPA) set.seed(123) data.obj <- sim.LPA(N = 300, I = 2, L = 2, constraint = "VV") fit <- LPA(data.obj$response, L = 2, method = "EM", nrep = 5) # Calculate posteriors using fixed parameters from a fitted model P.Z.Xn <- get.P.Z.Xn.LPA( response = data.obj$response, means = fit$params$means, covs = fit$params$covs, P.Z = fit$params$P.Z ) head(P.Z.Xn)library(LCPA) set.seed(123) data.obj <- sim.LPA(N = 300, I = 2, L = 2, constraint = "VV") fit <- LPA(data.obj$response, L = 2, method = "EM", nrep = 5) # Calculate posteriors using fixed parameters from a fitted model P.Z.Xn <- get.P.Z.Xn.LPA( response = data.obj$response, means = fit$params$means, covs = fit$params$covs, P.Z = fit$params$P.Z ) head(P.Z.Xn)
Computes approximate standard errors (SEs) for estimated parameters in Latent Class Analysis (LCA) or Latent Profile Analysis (LPA) models using two methods:
"Bootstrap": Non-parametric bootstrap with label-switching correction.
McLachlan & Peel (2004) suggest that 50–100 replicates often provide adequate accuracy
for practical purposes, though more (e.g., 500–1000) may be preferred for publication-quality inference.
"Obs": Numerical evaluation of the observed information matrix (Hessian of negative log-likelihood)
Users should note that get.SE computes standard errors based on the observed
information matrix via numerical differentiation, which may lack precision and often yields
ill-conditioned matrices. Therefore, we recommend using method = "Bootstrap".
get.SE(object, method = "Bootstrap", n.Bootstrap = 100, vis = TRUE)get.SE(object, method = "Bootstrap", n.Bootstrap = 100, vis = TRUE)
object |
|
method |
Character specifying SE calculation method: |
n.Bootstrap |
Integer. Number of bootstrap replicates when |
vis |
Logical. If |
A list of class "SE" containing:
seNamed list of SEs matching parameter structure of input model:
LPA: means (matrix: classes x variables),
covs (array: vars x vars x classes),
P.Z (vector: class prob SEs)
LCA: par (array: classes x indicators x categories),
P.Z (vector: class prob SEs)
Critical Note for "Obs" method:
Only free parameters have non-zero SEs. Non-free parameters (e.g., last class probability in P.Z due
to sum-to-1 constraint; last category probability in LCA indicators) have SE=0.
Bootstrap provides SEs for all parameters.
vcovNULL for bootstrap. For "Obs": variance-covariance matrix (may be regularized). Diagonal
contains squared SEs of free parameters.
hessianNULL for bootstrap. For "Obs": observed information matrix (pre-regularization).
Dimension = number of free parameters.
diagnosticsMethod-specific diagnostics:
Bootstrap: n.Bootstrap.requested, n.Bootstrap.completed
Obs: Hessian computation details, condition number, regularization status, step sizes
callFunction call that generated the object
argumentsList of input arguments
McLachlan, G. J., & Peel, D. (2004). Finite Mixture Models. Wiley. https://books.google.com.sg/books?id=c2_fAox0DQoC
library(LCPA) set.seed(123) # LPA with Bootstrap (minimal replicates for example) lpa_data <- sim.LPA(N = 500, I = 4, L = 3) lpa_fit <- LPA(lpa_data$response, L = 3) se_boot <- get.SE(lpa_fit, method = "Bootstrap", n.Bootstrap = 10) print(se_boot) extract(se_boot, "covs") # LCA with Observed Information (note zeros for constrained parameters) lca_data <- sim.LCA(N = 500, I = 4, L = 3, poly.value = 5) lca_fit <- LCA(lca_data$response, L = 3) se_obs <- get.SE(lca_fit, method = "Obs") print(se_obs) extract(se_obs, "par")library(LCPA) set.seed(123) # LPA with Bootstrap (minimal replicates for example) lpa_data <- sim.LPA(N = 500, I = 4, L = 3) lpa_fit <- LPA(lpa_data$response, L = 3) se_boot <- get.SE(lpa_fit, method = "Bootstrap", n.Bootstrap = 10) print(se_boot) extract(se_boot, "covs") # LCA with Observed Information (note zeros for constrained parameters) lca_data <- sim.LCA(N = 500, I = 4, L = 3, poly.value = 5) lca_fit <- LCA(lca_data$response, L = 3) se_obs <- get.SE(lca_fit, method = "Obs") print(se_obs) extract(se_obs, "par")
Checks whether essential Python packages required to run neural latent variable models
(e.g., LCAnet, LPAnet) are installed in the current Python environment. If any are missing,
the user is interactively prompted to install them via reticulate::py_install().
The targeted packages are:
numpy — Fundamental package for numerical computing in Python.
torch — PyTorch deep learning framework (supports CPU/GPU computation).
matplotlib — 2D plotting and visualization library.
scikit-learn — Machine learning utilities.
scipy — Scientific computing and advanced linear algebra routines.
six — Python 3 compatibility library.
For torch, users can choose between CPU-only or GPU-enabled versions (with CUDA support).
Available CUDA versions are filtered by OS compatibility.
This function is especially useful when deploying models that bridge R and Python via reticulate, ensuring all backend dependencies are met before model execution.
install_python_dependencies()install_python_dependencies()
The function performs the following steps for each dependency:
Uses reticulate::py_module_available() to test if the module is importable.
If not available, prints a message describing the package's purpose.
Prompts the user interactively (via readline) whether to proceed with installation.
For torch, offers CPU/GPU choice and CUDA version selection if GPU is chosen.
Installs the package using reticulate::py_install() with appropriate index URL if needed.
Returns a logical list indicating initial installation status of each package.
Note: This function requires reticulate to be loaded and a valid Python environment configured. It does NOT automatically install reticulate or configure Python — that must be done separately.
A named list of logical values indicating whether each package was already installed before running this function:
numpy_installed |
Logical. Was |
torch_installed |
Logical. Was |
matplotlib_installed |
Logical. Was |
sklearn_installed |
Logical. Was |
scipy_installed |
Logical. Was |
six_installed |
Logical. Was |
library(reticulate) # Ensure reticulate is loaded and Python is configured # need python ## Not run: # Run dependency installer deps <- install_python_dependencies() # Check which were missing print(deps) ## End(Not run)library(reticulate) # Ensure reticulate is loaded and Python is configured # need python ## Not run: # Run dependency installer deps <- install_python_dependencies() # Check which were missing print(deps) ## End(Not run)
Performs hard clustering of observations using K-means algorithm to generate
initial parameter estimates for Latent Class Analysis (LCA) models. This
provides a data-driven initialization strategy that often outperforms random
starts when the number of observed categorical variables is large
(i.e., ).
Kmeans.LCA(response, L, nrep = 10)Kmeans.LCA(response, L, nrep = 10)
response |
A numeric matrix of dimension |
L |
Integer specifying the number of latent classes. Must be |
nrep |
Integer specifying the number of random starts for K-means algorithm (default: 10). The solution with the lowest within-cluster sum of squares is retained. |
The function executes the following steps:
Data preprocessing: Automatically adjusts non-sequential category values to sequential integers (e.g., categories {1,3,5} become {1,2,3}) using internal adjustment routines.
K-means clustering: Scales variables to mean=0 and SD=1 before clustering. Uses Lloyd's algorithm with Euclidean distance.
Parameter estimation:
For each cluster , computes empirical response probabilities
for all indicators and categories .
Handles singleton clusters by assigning near-deterministic probabilities
(e.g., for observed category, for others).
Posterior probabilities: Constructs hard-classification matrix where
for the assigned cluster and 0 otherwise.
A list containing:
paramsList of initialized parameters:
parAn array of initial conditional probabilities,
where is the maximum number of categories across indicators.
Dimension order: latent classes (1:L), indicators (1:I), response categories (1:K_max).
P.ZNumeric vector of length containing initial class prior probabilities
derived from cluster proportions.
P.Z.XnAn matrix of posterior class probabilities. Contains
hard assignments (0/1 values) based on K-means cluster memberships.
Requires at least one observation per cluster. If a cluster has only one observation,
probabilities are set to avoid zero values (using ) for numerical stability.
Data scaling is applied internally. Variables with zero variance are automatically excluded from clustering.
This function is primarily designed as an initialization method for LCA and not
intended for final model estimation.
# Simulate response data set.seed(123) response <- matrix(sample(1:4, 200, replace = TRUE), ncol = 5) # Generate K-means initialization for 3-class LCA init_params <- Kmeans.LCA(response, L = 3, nrep = 5) # Inspect initial class probabilities print(init_params$params$P.Z)# Simulate response data set.seed(123) response <- matrix(sample(1:4, 200, replace = TRUE), ncol = 5) # Generate K-means initialization for 3-class LCA init_params <- Kmeans.LCA(response, L = 3, nrep = 5) # Inspect initial class probabilities print(init_params$params$P.Z)
This function estimates parameters of a Latent Class Analysis (LCA; Hagenaars & McCutcheon, 2002) model using either the Expectation-Maximization (EM) algorithm or Neural Network Estimation (NNE). It supports flexible initialization strategies and provides comprehensive model diagnostics.
LCA( response, L = 2, par.ini = "random", method = "EM", is.sort = TRUE, nrep = 20, starts = 100, maxiter.wa = 20, vis = TRUE, control.EM = NULL, control.Mplus = NULL, control.NNE = NULL )LCA( response, L = 2, par.ini = "random", method = "EM", is.sort = TRUE, nrep = 20, starts = 100, maxiter.wa = 20, vis = TRUE, control.EM = NULL, control.Mplus = NULL, control.NNE = NULL )
response |
A numeric matrix of dimension |
L |
Integer specifying the number of latent classes (default: 2). |
par.ini |
Specification for parameter initialization. Options include:
|
method |
Character string specifying estimation algorithm:
|
is.sort |
A logical value. If |
nrep |
Integer controlling replication behavior:
|
starts |
Number of random initializations to explore during warm-up phase (default: 100). |
maxiter.wa |
Maximum number of training iterations allowed per warm-up run. After completion,
the top |
vis |
Logical. If |
control.EM |
List of control parameters for EM algorithm:
|
control.Mplus |
List of control parameters for Mplus estimation:
|
control.NNE |
List of control parameters for NNE algorithm:
|
An object of class "LCA" containing:
paramsList with estimated parameters:
par array of conditional response probabilities per latent class.
P.ZVector of length with latent class prior probabilities.
nparNumber of free parameters in the model. see get.npar.LCA
Log.LikLog-likelihood of the final model. see get.Log.Lik.LCA
AICAkaike Information Criterion value.
BICBayesian Information Criterion value.
best_BICBest BIC value across nrep runs (if applicable).
P.Z.Xn matrix of posterior class probabilities for each observation.
P.ZVector of length containing the prior probabilities/structural parameters/proportions for each latent class.
ZVector of length with MAP-classified latent class memberships.
probabilitySame as params$par (redundant storage for convenience).
Log.Lik.historyVector tracking log-likelihood at each EM iteration.
Log.Lik.nrepVector of log-likelihoods from each replication run.
modelThe optimal neural network model object (only for method="NNE").
Contains the trained transformer architecture corresponding to best_loss.
This object can be used for further predictions or model inspection.
argumentsA list containing all input arguments
When method = "EM", parameters are estimated via the Expectation-Maximization algorithm, which iterates between:
E-step: Compute posterior class probabilities given current parameters:
where is the standardized (0-based) response for person on indicator (see adjust.response).
M-step: Update parameters by maximizing expected complete-data log-likelihood:
Class probabilities:
Conditional probabilities:
Convergence: Stops when or maximum iterations reached.
When method = "NNE", parameters are estimated using a hybrid neural network architecture
that combines feedforward layers with transformer-based attention mechanisms. This approach jointly
optimizes profile parameters and posterior probabilities through stochastic optimization enhanced
with simulated annealing. See install_python_dependencies. Key components include:
Architecture:
Observed categorical responses are converted to 0-based integer indices per indicator (not one-hot encoded).
For example, original responses become .
A fully-connected neural network with layer sizes specified by hidden.layers and activation
function activation.function processes the integer-indexed responses. This network outputs
unnormalized logits for posterior class membership ( matrix).
A transformer encoder with nhead attention heads that learns latent class prior probabilities
directly from observed responses.
Input: response matrix (), where = observations, = continuous variables.
Mechanism: Self-attention dynamically weighs variable importance during profile assignment, capturing complex multivariate interactions.
Output: Class prior vector computed as the mean of posteriors:
This ensures probabilistic consistency with the mixture model framework.
Global conditional probability parameters () are stored as learnable
parameters par (an tensor). A masked softmax is applied
along categories to enforce:
Probabilities sum to 1 within each indicator-class pair
Non-existent categories (beyond indicator's actual max response) are masked to zero probability
Optimization Strategy:
Hybrid Training Protocol: Alternates between:
Gradient-based phase: AdamW optimizer minimizes negative log-likelihood with weight decay regularization:
where is controlled by lambda (default: 1e-5). Learning rate decays adaptively when
loss plateaus (controlled by scheduler.patience and scheduler.factor).
Simulated annealing phase: After gradient-based early stopping (maxiter.early), parameters
are perturbed with noise scaled by temperature:
Temperature decays geometrically () from
initial.temperature until threshold.sa is reached. This escapes poor local minima.
Each full cycle (gradient descent + annealing) repeats up to maxcycle times.
Model Selection: Across nrep random restarts (using Dirichlet-distributed initializations
or K-means), the solution with lowest BIC is retained.
Diagnostics: Training loss, annealing points, and global best solution are plotted when vis=TRUE.
When method = "Mplus", estimation is delegated to external Mplus software.
The function automates the entire workflow:
Workflow:
Creates inst/Mplus to store:
Mplus input syntax (.inp)
Data file in Mplus format (.dat)
Posterior probabilities output (.dat)
Files are automatically deleted after estimation unless control.Mplus$clean.files = FALSE.
Constructs Mplus syntax with:
CLASSES = c1(L) specification for latent classes
CATEGORICAL declaration for all indicator variables
ANALYSIS block with optimization controls:
TYPE = mixtureStandard mixture modeling setup
STARTS = starts nrepRandom starts and final stage optimizations
STITERATIONS = maxiter.wamax itertions during starts.
MITERATIONS = maxiterMaximum EM iterations
CONVERGENCE = tolLog-likelihood convergence tolerance
MODEL block with %OVERALL%
Calls Mplus via MplusAutomation::mplusModeler(), which:
Converts R data to Mplus-compatible format with automatic recoding
Invokes Mplus executable (requires valid license and system PATH configuration)
Hagenaars, J. A. , & McCutcheon, A. L. (2002). Applied Latent Class Analysis. United Kingdom: Cambridge University Press.
McLachlan, G. J., & Peel, D. (2004). Finite Mixture Models. Wiley. https://books.google.com.sg/books?id=c2_fAox0DQoC
library(LCPA) # Example with simulated data set.seed(123) data.obj <- sim.LCA(N = 500, I = 4, L = 2, IQ=0.9) response <- data.obj$response # Fit 2-class model with EM algorithm fit.em <- LCA(response, L = 2, method = "EM", nrep = 10) # Fit 2-profile model using Mplus # need Mplus # NOTE: 'files.path' in control.Mplus is REQUIRED — function will error if not provided. # Example creates a timestamped subfolder (e.g., "Mplus_LCA_YYYY-MM-DD_HH-MM-SS") under './' # to store all temporary Mplus files (.inp, .dat, .out, etc.). ## Not run: fit.mplus <- LCA(response, L = 2, method = "Mplus", nrep = 3, control.Mplus = list(files.path = "")) ## End(Not run) # Fit 2-class model with neural network estimation # need Python ## Not run: fit.nne <- LCA(response, L = 2, method = "NNE", nrep = 3) ## End(Not run)library(LCPA) # Example with simulated data set.seed(123) data.obj <- sim.LCA(N = 500, I = 4, L = 2, IQ=0.9) response <- data.obj$response # Fit 2-class model with EM algorithm fit.em <- LCA(response, L = 2, method = "EM", nrep = 10) # Fit 2-profile model using Mplus # need Mplus # NOTE: 'files.path' in control.Mplus is REQUIRED — function will error if not provided. # Example creates a timestamped subfolder (e.g., "Mplus_LCA_YYYY-MM-DD_HH-MM-SS") under './' # to store all temporary Mplus files (.inp, .dat, .out, etc.). ## Not run: fit.mplus <- LCA(response, L = 2, method = "Mplus", nrep = 3, control.Mplus = list(files.path = "")) ## End(Not run) # Fit 2-class model with neural network estimation # need Python ## Not run: fit.nne <- LCA(response, L = 2, method = "NNE", nrep = 3) ## End(Not run)
Implements the three-step estimation method (Vermunt, 2010; Liang et al., 2023) for latent class/profile analysis
with covariates, treating latent class membership as an observed variable with measurement error.
This is mathematically equivalent to a latent transition analysis (LTA) with times=1.
LCPA( response, L = 2, ref.class = L, type = "LCA", covariate = NULL, CEP.error = TRUE, par.ini = "random", params = NULL, is.sort = TRUE, constraint = "VV", method = "EM", tol = 1e-04, lower = -10, upper = 10, method.SE = "Bootstrap", n.Bootstrap = 100, maxiter = 5000, nrep = 20, starts = 100, maxiter.wa = 20, vis = TRUE, control.EM = NULL, control.Mplus = NULL, control.NNE = NULL )LCPA( response, L = 2, ref.class = L, type = "LCA", covariate = NULL, CEP.error = TRUE, par.ini = "random", params = NULL, is.sort = TRUE, constraint = "VV", method = "EM", tol = 1e-04, lower = -10, upper = 10, method.SE = "Bootstrap", n.Bootstrap = 100, maxiter = 5000, nrep = 20, starts = 100, maxiter.wa = 20, vis = TRUE, control.EM = NULL, control.Mplus = NULL, control.NNE = NULL )
response |
A matrix or data frame of observed responses.
Rows of the matrix represent individuals/participants/observations ( |
L |
Integer scalar. Number of latent classes/profiles. Must satisfy |
ref.class |
Integer |
type |
Character string. Specifies the type of latent variable model for Step 1:
|
covariate |
Optional. A matrix or data frame of covariates for modeling latent class membership.
Must include an intercept column (all 1s) as the first column.
If |
CEP.error |
Logical. If |
par.ini |
Specification for parameter initialization. Options include:
|
params |
Optional
|
is.sort |
A logical value. If |
constraint |
Character (LPA only). Specifies structure of within-class covariance matrices:
|
method |
Character. Estimation algorithm for Step 1 models:
|
tol |
Convergence tolerance for log-likelihood difference (default: 1e-4). |
lower |
The upper bound for the estimation of regression coefficients, default is -10 |
upper |
The lower bound for the estimation of regression coefficients, default is 10 |
method.SE |
Character. Method for estimating standard errors of parameter estimates:
Default is |
n.Bootstrap |
Integer. Number of bootstrap replicates used when |
maxiter |
Maximum number of iterations for optimizing the regression coefficients. Default: 5000. |
nrep |
Integer controlling replication behavior:
|
starts |
Number of random initializations to explore during warm-up phase (default: 100). |
maxiter.wa |
Maximum number of training iterations allowed per warm-up run. After completion,
the top |
vis |
Logical. If |
control.EM |
List of control parameters for EM algorithm:
|
control.Mplus |
List of control parameters for Mplus estimation:
|
control.NNE |
List of control parameters for NNE algorithm:
|
An object of class LCPA, a named list containing:
betaMatrix of size . Coefficients for class membership multinomial logit model.
Columns 1 to are free parameters; column (reference class) is constrained to .
beta.seStandard errors for beta (if Hessian is invertible). Same dimensions as beta.
May contain NA if variance-covariance matrix is not positive definite.
beta.Z.staZ-statistics for testing null hypothesis that each beta coefficient equals zero.
Computed as beta / beta.se. Same structure as beta.
beta.p.value.tail1One-tailed p-values based on standard normal distribution: .
Useful for directional hypotheses. Same structure as beta.
beta.p.value.tail2Two-tailed p-values: .
Standard test for non-zero effect. Same structure as beta.
P.Z.XnMatrix of size of posterior class probabilities
for each individual and class .
P.ZVector of length containing prior class proportions
estimated at Step 1.
ZVector of length containing modal class assignments
(MAP classifications) for each individual.
nparNumber of free parameters in the model (depends on covariates).
Log.LikObserved-data log-likelihood value at convergence.
Log.Lik.historyVector tracking log-likelihood at each iteration.
AICAkaike Information Criterion value.
BICBayesian Information Criterion value.
iterationsInteger. Number of optimization iterations in Step 3.
coveragedLogical. TRUE if optimization terminated before reaching maxiter (suggesting convergence).
Note: This is a heuristic indicator; formal convergence diagnostics should check Hessian properties.
paramsList. Step 1 model parameters (output from LCA() or LPA()).
callThe matched function call.
argumentsList of all input arguments passed to the function (useful for reproducibility).
The three-step procedure follows the same principles as LTA but for a single time point:
Step 1 — Unconditional Latent Class/Profile Model:
Fit an unconditional LCA or LPA model (ignoring covariates). Obtain posterior class membership probabilities
for each individual and class using Bayes' theorem.
Step 2 — Classification Error Probabilities (equal to get.CEP):
Compute the CEP matrix where element estimates:
using a non-parametric approximation:
where is the modal class assignment.
Step 3 — Class Membership Model with Measurement Error Correction: Estimate the multinomial logit model for class membership:
where is the covariate vector for individual
(with intercept as first column), and
contains intercept and regression coefficients. Class is the reference category
().
The observed-data likelihood integrates over latent classes:
Parameters are estimated via maximum likelihood using the BOBYQA algorithm.
Reference Class: Coefficients for the reference class (ref.class) are ALWAYS fixed to
zero () in the multinomial
logit model.
CEP Matrices: When CEP.error = TRUE, misclassification probabilities are estimated
non-parametrically using Step 1 posterior probabilities. This corrects for
classification uncertainty. See in get.CEP.
Covariate Requirements: Covariate matrix MUST include an intercept column (all 1s) as the first
column. Dimensions must be , where represents
the number of covariate and is the Intercept.
Optimization & Standard Errors:
Step 3 uses BOBYQA algorithm (nloptr::nloptr) for stable optimization with box constraints.
For method.SE = "Obs": Standard errors derived from inverse Hessian (hessian). If Hessian is singular:
Uses Moore-Penrose pseudoinverse (ginv)
Sets negative variances to NA
For method.SE = "Bootstrap": Each replicate independently re-estimates Steps 1-3.
Failed bootstrap runs yield NA in SEs and derived statistics. Progress messages include
replicate index and optimization diagnostics.
Computational Notes:
Step 1 complexity increases with and .
Bootstrap is computationally intensive: 100 replicates = 100 full re-estimations of Steps 1-3.
Bootstrap Reproducibility: Always set a seed (e.g., set.seed(123)) before
calling LCPA() when using method.SE = "Bootstrap".
Monitor convergence in bootstrap runs via progress messages.
Vermunt, J. K. (2010). Latent class modeling with covariates: Two improved three-step approaches. Political Analysis, 18(4), 450–469. https://doi.org/10.1093/pan/mpq025
Liang, Q., la Torre, J. d., & Law, N. (2023). Latent Transition Cognitive Diagnosis Model With Covariates: A Three-Step Approach. Journal of Educational and Behavioral Statistics, 48(6), 690-718. https://doi.org/10.3102/10769986231163320
library(LCPA) set.seed(123) N <- 2000 # Sample size L <- 3 # Number of latent classes I <- 6 # Number of indicators # Create covariates (intercept + 2 covariates + 1 interaction) Intercept = rep(1, N) X1 <- rnorm(N) X2 <- rbinom(N, 1, 0.5) X1.X2 <- X1 * X2 covariate <- cbind(Intercept, X1, X2, X1.X2) # Simulate data for LPA sim_data <- sim.LTA( N = N, I = I, L = L, times = 1, type = "LPA", covariates = list(covariate), is.sort=TRUE, beta = matrix(c( -0.2, 0.0, -0.1, ## fix reference class to class 2 0.2, 0.0, -0.3, 0.8, 0.0, -0.6, -0.1, 0.0, 0.3 ), ncol = L, byrow = TRUE) ) response <- sim_data$responses[[1]] ## It is strongly recommended to perform the following ## standardization to obtain more stable results when LPA. ## Standardization is not performed here in order to ## compare estimated values with true values. # response <- normalize(response) # Fit cross-sectional LPA with covariates ## fix reference class to class 2 # need Mplus ## Not run: fit <- LCPA( response = response, L = L, ref.class = 2, type = "LPA", is.sort=TRUE, covariate = covariate, method.SE = "Obs", CEP.error = TRUE, method = "Mplus", control.Mplus = list(files.path = ""), vis = TRUE ) print(fit) ## End(Not run)library(LCPA) set.seed(123) N <- 2000 # Sample size L <- 3 # Number of latent classes I <- 6 # Number of indicators # Create covariates (intercept + 2 covariates + 1 interaction) Intercept = rep(1, N) X1 <- rnorm(N) X2 <- rbinom(N, 1, 0.5) X1.X2 <- X1 * X2 covariate <- cbind(Intercept, X1, X2, X1.X2) # Simulate data for LPA sim_data <- sim.LTA( N = N, I = I, L = L, times = 1, type = "LPA", covariates = list(covariate), is.sort=TRUE, beta = matrix(c( -0.2, 0.0, -0.1, ## fix reference class to class 2 0.2, 0.0, -0.3, 0.8, 0.0, -0.6, -0.1, 0.0, 0.3 ), ncol = L, byrow = TRUE) ) response <- sim_data$responses[[1]] ## It is strongly recommended to perform the following ## standardization to obtain more stable results when LPA. ## Standardization is not performed here in order to ## compare estimated values with true values. # response <- normalize(response) # Fit cross-sectional LPA with covariates ## fix reference class to class 2 # need Mplus ## Not run: fit <- LCPA( response = response, L = L, ref.class = 2, type = "LPA", is.sort=TRUE, covariate = covariate, method.SE = "Obs", CEP.error = TRUE, method = "Mplus", control.Mplus = list(files.path = ""), vis = TRUE ) print(fit) ## End(Not run)
This function computes the logistic (also known as sigmoid) transformation of the input. The logistic function maps real-valued numbers to the open interval (0, 1), and is widely used in machine learning, statistical modeling (e.g., logistic regression), and neural networks as an activation function or link function.
logit(x)logit(x)
x |
A numeric vector, matrix, or array. Accepts any real number, including |
The logistic function is defined as:
Note: Despite the name "logit", this function actually computes the inverse logit (i.e., the
logistic function). The true logit function is the inverse: .
However, in many applied contexts—especially in software—the term "logit" is sometimes
informally used to refer to the sigmoid. For clarity, this implementation follows the
conventional definition of the logistic/sigmoid function.
A numeric object of the same dimension as x, where each element is the
logistic transformation of the corresponding input:
If x = 0, returns 0.5
As x -> Inf, output approaches 1
As x -> -Inf, output approaches 0
NA values remain NA
logit(0) # 0.5 logit(c(-Inf, 0, Inf)) # c(0, 0.5, 1) logit(c(-2, -1, 0, 1, 2))logit(0) # 0.5 logit(c(-Inf, 0, Inf)) # c(0, 0.5, 1) logit(c(-2, -1, 0, 1, 2))
This function estimates parameters of a Latent Profile Analysis (LPA) model for continuous observed variables using one of three methods: Expectation-Maximization (EM) algorithm, Neural Network Estimation (NNE), or external Mplus software. It supports flexible covariance structures and initialization strategies.
LPA( response, L = 2, par.ini = "random", constraint = "VV", method = "EM", is.sort = TRUE, nrep = 20, starts = 100, maxiter.wa = 20, vis = TRUE, control.EM = NULL, control.Mplus = NULL, control.NNE = NULL )LPA( response, L = 2, par.ini = "random", constraint = "VV", method = "EM", is.sort = TRUE, nrep = 20, starts = 100, maxiter.wa = 20, vis = TRUE, control.EM = NULL, control.Mplus = NULL, control.NNE = NULL )
response |
A numeric matrix of dimension |
L |
Integer specifying the number of latent profiles (default: 2). |
par.ini |
Specification for parameter initialization. Options include:
|
constraint |
Character string specifying covariance structure constraints:
|
method |
Character string specifying estimation algorithm:
|
is.sort |
A logical value. If |
nrep |
Integer controlling replication behavior:
|
starts |
Number of random initializations to explore during warm-up phase (default: 100). |
maxiter.wa |
Maximum number of training iterations allowed per warm-up run. After completion,
the top |
vis |
Logical. If |
control.EM |
List of control parameters for EM algorithm:
|
control.Mplus |
List of control parameters for Mplus estimation:
|
control.NNE |
List of control parameters for NNE algorithm:
|
An object of class "LPA" containing:
paramsList with estimated profile parameters:
means matrix of estimated mean vectors for each profile.
covs array of estimated covariance matrices for each profile.
P.ZVector of length with profile prior probabilities.
nparNumber of free parameters in the model (depends on constraint).
Log.LikLog-likelihood of the final model.
AICAkaike Information Criterion value.
BICBayesian Information Criterion value.
best_BICBest BIC value across nrep runs (if applicable).
P.Z.Xn matrix of posterior profile probabilities for each observation.
P.ZVector of length containing the prior probabilities/structural parameters/proportions for each latent class.
ZVector of length with MAP-classified profile memberships.
Log.Lik.historyVector tracking log-likelihood at each EM iteration (only for method="EM").
Log.Lik.nrepVector of log-likelihoods from each replication run.
modelThe optimal model object:
For method="NNE": Trained neural network model.
For method="Mplus": Estimated Mplus model.
When method = "EM", parameter estimation uses the Expectation-Maximization (EM) algorithm to maximize the observed-data log-likelihood:
The algorithm iterates between two steps until convergence (change in log-likelihood < tol or max iterations reached):
Compute posterior class probabilities (responsibilities) for observation and class :
where is the multivariate normal density, is the prior class probability, and ,
are current parameters. Numerical stability is ensured via the log-sum-exp trick.
Update parameters using responsibilities :
Class probabilities:
Class means:
Class covariances: Updated under constraints:
"VV""EE"Shared covariance:
"VE" (default) / "EV"
Hybrid constraints (e.g., "VE": varying variances, equal correlations).
Off-diagonal elements use weighted averages across classes; diagonals retain class-specific values.
User-specified variances/covariances (e.g., list(c(1,2), c(2, 2)), meaning the
covariates of observed variable 1 and observed variable 2 are equal across latent classes,
and the variance of observed variable 2 is equal across classes) are forced equal across
classes via weighted averaging.
Covariance matrices are regularized to ensure positive definiteness:
Eigenvalues < jitter (1e-10) are replaced with jitter
Failed Cholesky decompositions trigger diagonal jittering or perturbation of non-constrained elements
Edge Handling:
Empty classes () are reinitialized by redistributing responsibilities.
Non-finite likelihoods trigger fallback to previous valid parameters or covariance perturbation.
Univariate cases () bypass Cholesky decomposition for direct variance updates.
When method = "NNE", parameters are estimated using a hybrid neural network architecture
combining fully-connected layers with transformer-based attention mechanisms. This approach jointly
optimizes profile parameters and posterior probabilities through stochastic optimization with
simulated annealing. See install_python_dependencies. Key components include:
Architecture:
Continuous observed variables are standardized (mean-centered, unit-variance)
internally during training to improve numerical stability. No encoding is needed — raw values are fed directly.
A multi-layer perceptron with architecture defined by hidden.layers and activation.function
maps the continuous input vector into a latent space of dimension d.model. This layer learns non-linear
feature combinations predictive of latent profile membership.
A transformer encoder with nhead attention heads that learns latent class prior probabilities
directly from observed responses.
Input: response matrix (), where = observations, = continuous variables.
Mechanism: Self-attention dynamically weighs variable importance during profile assignment, capturing complex multivariate interactions.
Output: Class prior vector computed as the mean of posteriors:
This ensures probabilistic consistency with the mixture model framework.
Two separate projection heads branch from the transformer output:
Means Head: Linear projection to matrix .
Covariance Head: Outputs lower-triangular elements of Cholesky factors
for each profile. Diagonal elements are passed through softplus to ensure positivity;
off-diagonals use tanh scaled by 1.2 to bound magnitude and promote stability.
The full covariance is reconstructed via .
After reconstruction, covariance constraints (e.g., "EE", "V0", or custom lists) are applied by
averaging constrained elements across profiles and re-symmetrizing.
Optimization Strategy:
Hybrid Training Protocol: Alternates between:
Gradient-based phase: AdamW optimizer minimizes negative log-likelihood with weight decay regularization:
where is controlled by lambda (default: 1e-5). Learning rate decays adaptively when
loss plateaus (controlled by scheduler.patience and scheduler.factor).
Simulated annealing phase: After gradient-based early stopping (maxiter.early), parameters
are perturbed with noise scaled by temperature:
Temperature decays geometrically () from
initial.temperature until threshold.sa is reached. This escapes poor local minima.
Each full cycle (gradient descent + annealing) repeats up to maxcycle times.
Model Selection: Across nrep random restarts (using Dirichlet-distributed initializations
or K-means), the solution with lowest BIC is retained.
Diagnostics: Training loss, annealing points, and global best solution are plotted when vis=TRUE.
Constraint Handling:
Covariance constraints (constraint) are enforced after activation via:
Shared Parameters: Variances/covariances marked for equality are replaced by their average across profiles.
Positive Definiteness: Non-positive definite matrices are corrected via eigenvalue clamping, diagonal jittering, or adaptive Cholesky decomposition.
Custom constraints: e.g., list(c(1,2), c(3,3)), force equality of specific covariance elements
across profiles, with symmetry () automatically enforced.
When method = "Mplus", estimation is delegated to external Mplus software.
The function automates the entire workflow:
Workflow:
Creates inst/Mplus to store:
Mplus input syntax (.inp)
Data file in Mplus format (.dat)
Posterior probabilities output (.dat)
Files are automatically deleted after estimation unless control.Mplus$clean.files = FALSE.
Constructs Mplus syntax with:
CLASSES = c1(L) specification for latent classes
ANALYSIS block with optimization controls:
TYPE = mixtureStandard mixture modeling setup
STARTS = starts nrepRandom starts and final stage optimizations
STITERATIONS = maxiter.wamax itertions during starts.
MITERATIONS = maxiterMaximum EM iterations
CONVERGENCE = tolLog-likelihood convergence tolerance
MODEL block reflecting the specified constraint structure
Calls Mplus via MplusAutomation::mplusModeler()
, which:
Writes data to disk in Mplus-compatible format
Invokes the Mplus executable (requires valid license)
Captures convergence status and output
Constraint Handling:
Covariance constraints (constraint) are enforced after activation via:
Shared Parameters: Variances/covariances marked for equality are replaced by their average across profiles.
Positive Definiteness: Non-positive definite matrices are corrected via eigenvalue clamping, diagonal jittering, or adaptive Cholesky decomposition.
Custom constraints: e.g., list(c(1,2), c(3,3)), force equality of specific covariance elements
across profiles, with symmetry () automatically enforced.
McLachlan, G. J., & Peel, D. (2004). Finite Mixture Models. Wiley. https://books.google.com.sg/books?id=c2_fAox0DQoC
library(LCPA) # Simulate bivariate continuous data for 2 profiles set.seed(123) data.obj <- sim.LPA(N = 500, I = 3, L = 2, constraint = "VV") response <- data.obj$response ## It is strongly recommended to perform the following ## standardization to obtain more stable results. ## Standardization is not performed here in order to ## compare estimated values with true values. # response <- normalize(response) # Fit 2-profile model with VV constraint (default) fit_vv <- LPA(response, L = 2, constraint = "VV") # Fit 2-profile model with E0 constraint using neural network estimation # need Python ## Not run: fit_e0_nne <- LPA(response, L = 2, constraint = "E0", method = "NNE", nrep = 2) ## End(Not run) # Fit 2-profile model using Mplus # Requires Mplus to be installed and available in system PATH. # NOTE: 'files.path' in control.Mplus is REQUIRED — the function will # throw an error if not provided. # This example creates a timestamped subdirectory # (e.g., "Mplus_LPA_YYYY-MM-DD_HH-MM-SS") under './inst' # to store all temporary Mplus files (.inp, .dat, .out, etc.). # The 'inst' directory will be created if it does not exist. # Setting files.clean=FALSE means temporary files will be preserved after execution. ## Not run: fit_mplus <- LPA(response, L = 2, method = "Mplus", constraint = list(c(1, 2), c(3, 3)), control.Mplus = list(files.path = "inst", files.clean=FALSE)) ## End(Not run)library(LCPA) # Simulate bivariate continuous data for 2 profiles set.seed(123) data.obj <- sim.LPA(N = 500, I = 3, L = 2, constraint = "VV") response <- data.obj$response ## It is strongly recommended to perform the following ## standardization to obtain more stable results. ## Standardization is not performed here in order to ## compare estimated values with true values. # response <- normalize(response) # Fit 2-profile model with VV constraint (default) fit_vv <- LPA(response, L = 2, constraint = "VV") # Fit 2-profile model with E0 constraint using neural network estimation # need Python ## Not run: fit_e0_nne <- LPA(response, L = 2, constraint = "E0", method = "NNE", nrep = 2) ## End(Not run) # Fit 2-profile model using Mplus # Requires Mplus to be installed and available in system PATH. # NOTE: 'files.path' in control.Mplus is REQUIRED — the function will # throw an error if not provided. # This example creates a timestamped subdirectory # (e.g., "Mplus_LPA_YYYY-MM-DD_HH-MM-SS") under './inst' # to store all temporary Mplus files (.inp, .dat, .out, etc.). # The 'inst' directory will be created if it does not exist. # Setting files.clean=FALSE means temporary files will be preserved after execution. ## Not run: fit_mplus <- LPA(response, L = 2, method = "Mplus", constraint = list(c(1, 2), c(3, 3)), control.Mplus = list(files.path = "inst", files.clean=FALSE)) ## End(Not run)
Conducts a likelihood ratio test to compare the fit of two models. The test evaluates whether a model with more parameters provides a significantly better fit than a model with fewer parameters.
LRT.test(object1, object2)LRT.test(object1, object2)
object1 |
Fitted model object with fewer parameters (i.e., fewer |
object2 |
Fitted model object with more parameters (i.e., more |
Note that since the small model may be nested within the large model, the result
of LRT.test may not be accurate and is provided for reference only.
More reliable conclusions should be based on a combination of fit indices (i.e., get.fit.index),
classification accuracy measures (i.e., get.entropy, get.AvePP), and a bootstrapped
likelihood-ratio test (i.e., BLRT, LRT.test.Bootstrap, which is very time-consuming).
Above all and the most important criterion, is that the better model is the one that aligns with theoretical
expectations and offers clear interpretability.
The LRT.test test statistic is defined as:
The models must be nested (i.e., the model with fewer parameters is a constrained version of the more one).
Both models must be fit on the identical dataset with the same response variables.
The test statistic asymptotically follows a chi-square distribution.
where:
: Log-likelihood of the smaller model (fewer parameters).
: Log-likelihood of the larger model (more parameters).
Under the null hypothesis (H_0: small model is true), LRT asymptotically follows
a chi-square distribution with degrees of freedom.
An object of class "htest" containing:
statistic: VLMR adjusted test statistic
parameter: Degrees of freedom ()
p.value: P-value from distribution
method: Name of the test
data.name: Model comparison description
Conducts a bootstrap likelihood ratio test (BLRT) to compare two nested latent class analysis (LCA) or latent profile analysis (LPA) models. The test evaluates whether a model with more classes (the alternative model) provides significantly better fit than a model with fewer classes (the null model). Implements both fixed-replicate and sequential stopping procedures for computational efficiency.
LRT.test.Bootstrap( object1, object2, n.Bootstrap = 100, vis = TRUE, use.sequential = TRUE )LRT.test.Bootstrap( object1, object2, n.Bootstrap = 100, vis = TRUE, use.sequential = TRUE )
object1 |
Fitted model object representing the null model (e.g., |
object2 |
Fitted model object representing the alternative model (e.g., |
n.Bootstrap |
Maximum number of bootstrap replicates (default = 100).
If |
vis |
Logical. If |
use.sequential |
Logical. If |
Core Workflow (Parametric Bootstrap):
Parameter Extraction: Parameters from the null model (object1) are treated as the population truth.
Data Simulation: Generates datasets using sim.LCA or sim.LPA,
preserving the original sample size and indicator structure.
Model Re-fitting: For each simulated dataset, both the null and alternative models are re-estimated.
Initial parameters are set to random values (par.ini = "random") to avoid convergence to local maxima;
the EM algorithm includes fallback strategies upon non-convergence.
LRT Distribution: Computes bootstrap LRT statistics:
.
P-value Calculation: The p-value is the proportion of bootstrap LRTs that are greater than or equal to the observed LRT statistic.
Sequential Stopping Rule (Nylund et al., 2007):
When use.sequential = TRUE, the algorithm checks three stopping criteria after each replication:
Upper Stopping Points: Stop if the current estimated p-value () is
greater than or equal to a pre-specified threshold at replication ,
indicating insufficient evidence against the null model.
Lower Stopping Points: Stop if at replication ,
indicating strong evidence in favor of the alternative model.
Conditional Lower Stopping Points: Stop if and
the observed LRT exceeds the mean of the current bootstrap LRTs by more than standard deviations.
This leverages the approximate normality of the LRT distribution under the null (Nylund et al., 2007).
The exact stopping thresholds follow Appendix A of Nylund et al. (2007):
Upper: (n, 2/n) for n = 2–3; (n, 3/n) for n = 4–9; (n, 4/n) for n = 10–17; (n, 5/n) for n = 18–26; (n, 6/n) for n = 27–99; (100, 0).
Lower: (49, 0), (78, 1/78).
Conditional: (5, 0, 20), (10, 0, 10), (20, 0, 5), where the third value denotes the required z-score threshold.
This rule ensures >95% agreement with the decision that would be made using an infinite number of replicates
when the true p-value is below 0.10. Near , the algorithm continues until the maximum
number of replicates to avoid premature conclusions.
Critical Interpretation Notes:
The BLRT should not be used in isolation. Always integrate results with:
Information criteria: e.g., BIC, aBIC, CAIC via get.fit.index.
Classification quality metrics: e.g., entropy (get.entropy),
average posterior probability (get.AvePP).
Substantive interpretability and theoretical coherence of the extracted classes.
The BLRT is especially valuable in LCA/LPA contexts where the standard asymptotic LRT is invalid due to boundary parameter issues (e.g., class probabilities approaching 0 or 1; McLachlan & Peel, 2000).
Early termination under sequential stopping (e.g., at ) yields reliable p-values
when the evidence is strong ( or ).
However, if stopping occurs near n.Bootstrap with ,
the result should be considered inconclusive.
In small samples () or with poorly separated classes,
the BLRT may exhibit inflated Type I error rates; corroborate findings with BIC and other indices.
An object of class "htest" containing:
statistic: Observed likelihood ratio test statistic,
.
parameter: Degrees of freedom (set to NA; the bootstrap p-value does not rely on asymptotic chi-square distribution).
p.value: Bootstrap p-value, computed as the proportion of bootstrap LRT statistics
greater than or equal to the observed LRT statistic.
method: Character string indicating the procedure used:
either "Bootstrap LRT with Sequential Stopping" or "Bootstrap LRT (Fixed Replicates)".
data.name: Descriptive string summarizing the model comparison,
including the actual number of bootstrap replicates performed and, if applicable,
the reason for early stopping under the sequential rule.
LRT.Bootstrap: Numeric vector of length (where is the actual number of
bootstrap replicates performed) containing the bootstrap LRT statistics:
.
These are generated under the null hypothesis (i.e., data simulated from the smaller model).
This vector can be used for diagnostic purposes, such as inspecting the empirical null distribution
or computing alternative p-value estimates.
McLachlan, G. J., & Peel, D. (2000). Finite Mixture Models. John Wiley & Sons.
Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the Number of Classes in Latent Class Analysis and Growth Mixture Modeling: A Monte Carlo Simulation Study. Structural Equation Modeling: A Multidisciplinary Journal, 14(4), 535–569. https://doi.org/10.1080/10705510701575396
Implements the ad-hoc adjusted likelihood ratio test (LRT) described in Formula 15 of Lo, Mendell,
& Rubin (2001), also known as VLMR LRT (Vuong-Lo-Mendell-Rubin Adjusted LRT). This method is typically
used to compare models with L-1 and L classes. If the difference in the number of
classes exceeds 1, conclusions should be interpreted with extreme caution.
LRT.test.VLMR(object1, object2)LRT.test.VLMR(object1, object2)
object1 |
Fitted model object with fewer parameters (i.e., fewer |
object2 |
Fitted model object with more parameters (i.e., more |
Note that since the small model may be nested within the large model, the result
of LRT.test.VLMR may not be accurate and is provided for reference only.
More reliable conclusions should be based on a combination of fit indices (i.e., get.fit.index),
classification accuracy measures (i.e., get.entropy, get.AvePP), and a bootstrapped
likelihood-ratio test (i.e., BLRT, LRT.test.Bootstrap, which is very time-consuming).
Above all and the most important criterion, is that the better model is the one that aligns with theoretical
expectations and offers clear interpretability.
The LRT.test.VLMR statistic is defined as:
where:
is the standard likelihood ratio statistic. see LRT.test
is the difference in number of free parameters between models.
is the sample size.
Under the null hypothesis (H_0: small model is true), VLMR asymptotically follows
a chi-square distribution with degrees of freedom.
An object of class "htest" containing:
statistic: VLMR adjusted test statistic
parameter: Degrees of freedom ()
p.value: P-value from distribution
method: Name of the test
data.name: Model comparison description
Lo, Y., Mendell, N. R., & Rubin, D. B. (2001). Testing the number of components in a normal mixture. Biometrika, 88(3), 767-778. https://doi.org/10.1093/biomet/88.3.767
Nylund-Gibson, K., & Choi, A. Y. (2018). Ten frequently asked questions about latent class analysis. Translational Issues in Psychological Science, 4(4), 440-461. https://doi.org/10.1037/tps0000176
Implements the three-step estimation method (Vermunt, 2010; Liang et al., 2023) for Latent Transition Analysis (LTA), treating latent class memberships at each time point as observed variables with measurement error. Classification uncertainty from Step 1 (latent class/profile analysis) is explicitly incorporated into the transition model estimation in Step 3, ensuring asymptotically unbiased estimates of transition probabilities and covariate effects. This avoids the bias introduced by "hard" modal-class assignment.
LTA( responses, L = 2, ref.class = L, type = "LCA", covariates = NULL, CEP.timeCross = FALSE, CEP.error = TRUE, covariates.timeCross = FALSE, par.ini = "random", params = NULL, is.sort = TRUE, constraint = "VV", method = "EM", tol = 1e-04, lower = -10, upper = 10, method.SE = "Bootstrap", n.Bootstrap = 100, maxiter = 5000, nrep = 20, starts = 100, maxiter.wa = 20, vis = TRUE, control.EM = NULL, control.Mplus = NULL, control.NNE = NULL )LTA( responses, L = 2, ref.class = L, type = "LCA", covariates = NULL, CEP.timeCross = FALSE, CEP.error = TRUE, covariates.timeCross = FALSE, par.ini = "random", params = NULL, is.sort = TRUE, constraint = "VV", method = "EM", tol = 1e-04, lower = -10, upper = 10, method.SE = "Bootstrap", n.Bootstrap = 100, maxiter = 5000, nrep = 20, starts = 100, maxiter.wa = 20, vis = TRUE, control.EM = NULL, control.Mplus = NULL, control.NNE = NULL )
responses |
A |
L |
Integer scalar. Number of latent classes/profiles at each time point. Must satisfy |
ref.class |
Integer |
type |
Character string. Specifies the type of latent variable model for Step 1:
|
covariates |
Optional. A |
CEP.timeCross |
Logical. If |
CEP.error |
Logical. If |
covariates.timeCross |
Logical. If |
par.ini |
Specification for parameter initialization. Options include:
|
params |
Optional
|
is.sort |
A logical value. If |
constraint |
Character (LPA only). Specifies structure of within-class covariance matrices:
|
method |
Character. Estimation algorithm for Step 1 models:
|
tol |
Convergence tolerance for log-likelihood difference (default: 1e-4). |
lower |
The upper bound for the estimation of regression coefficients, default is -10 |
upper |
The lower bound for the estimation of regression coefficients, default is 10 |
method.SE |
Character. Method for estimating standard errors of parameter estimates:
Default is |
n.Bootstrap |
Integer. Number of bootstrap replicates used when |
maxiter |
Maximum number of iterations for optimizing the regression coefficients. Default: 5000. |
nrep |
Integer controlling replication behavior:
|
starts |
Number of random initializations to explore during warm-up phase (default: 100). |
maxiter.wa |
Maximum number of training iterations allowed per warm-up run. After completion,
the top |
vis |
Logical. If |
control.EM |
List of control parameters for EM algorithm:
|
control.Mplus |
List of control parameters for Mplus estimation:
|
control.NNE |
List of control parameters for NNE algorithm:
|
An object of class LTA, a named list containing:
betaMatrix of size . Coefficients for initial class membership multinomial logit model.
Columns 1 to are free parameters; column (reference class) is constrained to .
gammaList of length . Each element gamma[[t]] (for transition from time to )
is a nested list: gamma[[t]][[from_class]][[to_class]] returns coefficient vector of length .
The last class () is reference → coefficients fixed to for all .
beta.seStandard errors for beta (if Hessian is invertible). Same dimensions as beta.
May contain NA if variance-covariance matrix is not positive definite.
gamma.seStandard errors for gamma, same nested structure. May contain NAs.
beta.Z.staZ-statistics for testing null hypothesis that each beta coefficient equals zero.
Computed as beta / beta.se. Same structure as beta.
gamma.Z.staZ-statistics for gamma coefficients. Same nested structure as gamma.
Used for testing significance of transition effects.
beta.p.value.tail1One-tailed p-values based on standard normal distribution: .
Useful for directional hypotheses. Same structure as beta.
gamma.p.value.tail1One-tailed p-values for gamma coefficients. Same nested structure as gamma.
beta.p.value.tail2Two-tailed p-values: .
Standard test for non-zero effect. Same structure as beta.
gamma.p.value.tail2Two-tailed p-values for gamma coefficients. Same nested structure as gamma.
P.Z.XnsList of length . Each element is an matrix of posterior class probabilities
for each individual at time .
P.ZsList of length . Each element is a vector of length containing prior class proportions
estimated at Step 1 for time .
ZsList of length . Each element is a vector of length containing modal class assignments
(MAP classifications) for each individual at time .
nparNumber of free parameters in the model (depends on covariates).
Log.LikObserved-data log-likelihood value at convergence.
Log.Lik.historyVector tracking log-likelihood at each iteration.
AICAkaike Information Criterion value.
BICBayesian Information Criterion value.
iterationsInteger. Number of optimization iterations in Step 3.
coveragedLogical. TRUE if optimization terminated before reaching maxiter (suggesting convergence).
Note: This is a heuristic indicator; formal convergence diagnostics should check Hessian properties.
paramsList. Step 1 model parameters (output from LCA() or LPA()).
callThe matched function call.
argumentsList of all input arguments passed to the function (useful for reproducibility).
The three-step LTA proceeds as follows:
Step 1 — Unconditional Latent Class/Profile Model:
At each time point , fit an unconditional LCA or LPA model (ignoring transitions and covariates).
Obtain posterior class membership probabilities for each individual
and class using Bayes' theorem.
Step 2 — Classification Error Probabilities (equal to get.CEP):
Compute the CEP matrix for each time point , where element estimates:
using a non-parametric approximation based on posterior weights:
where is the modal (most likely) class assignment for individual at time .
Step 3 — Transition Model with Measurement Error Correction: Estimate the multinomial logit models for:
Initial class membership (time 1):
Transitions (time ):
where is the covariate vector for observation/participant at time 1,
with (intercept term) and () representing the value of the -th covariate.
The coefficient vector corresponds element-wise to ,
where is the intercept and () are regression coefficients for covariates.
Class is the reference class ().
is the covariate vector at time ,
with (intercept) and () as the -th covariate value.
The coefficient vector
corresponds element-wise to , where is the intercept and ()
are regression coefficients. Class is the reference class ( for all ).
The full observed-data likelihood integrates over all possible latent class paths :
Parameters are estimated via maximum likelihood using
the BOBYQA algorithm (box-constrained derivative-free optimization). Reference class satisfies and for all when .
When method.SE = "Bootstrap", standard errors are estimated using a nonparametric bootstrap procedure:
Draw (=n.Bootstrap) independent samples of size with replacement from the original data.
For each bootstrap sample , re-estimate the full three-step LTA model (Steps 1–3), yielding parameter vector .
Compute the bootstrap standard error for each parameter as the sample standard deviation across replicates:
where .
This approach does not rely on large-sample normality or correct specification of the information matrix,
making it particularly suitable for complex models like LTA where analytic derivatives are difficult or unstable.
However, it increases computational cost linearly with .
Reference Class: The last latent class () is always treated as the reference category.
All corresponding coefficients in beta and gamma are fixed to zero (, for ).
CEP Matrices: When CEP.error = TRUE, misclassification probabilities are estimated
non-parametrically using Step 1 posterior probabilities. This corrects for
classification uncertainty. Setting CEP.timeCross = TRUE assumes these
error structures are identical across time (measurement invariance).
See in get.CEP.
Covariate Handling: Covariates for initial status (time 1) and transitions (time ) can differ.
For transitions to time , the covariate matrix must have dimensions ,
i.e., an intercept column of all plus columns of covariates in time .
Optimization: Step 3 uses L-BFGS-B via nloptr to ensure numerical stability.
Standard errors are derived from the inverse Hessian (via hessian). If singular,
Moore-Penrose pseudoinverse (ginv) is used, and negative variances are set to NA.
Computational Complexity: Likelihood evaluation requires enumerating possible latent paths.
Bootstrap Computation: Each bootstrap iteration re-fits Steps 1–3 independently, including re-estimation of
P.Z.Xns, CEP matrices and transition parameters.
To ensure reproducibility, set a seed before calling LTA() when using method.SE = "Bootstrap".
Progress messages during bootstrapping include current replicate index and optimization diagnostics.
Users should monitor convergence in each bootstrap run; failed runs will result in NA entries in SEs and derived statistics.
Liang, Q., la Torre, J. d., & Law, N. (2023). Latent Transition Cognitive Diagnosis Model With Covariates: A Three-Step Approach. Journal of Educational and Behavioral Statistics, 48(6), 690-718. https://doi.org/10.3102/10769986231163320
McLachlan, G. J., & Peel, D. (2004). Finite Mixture Models. Wiley. https://books.google.com.sg/books?id=c2_fAox0DQoC
Vermunt, J. K. (2010). Latent class modeling with covariates: Two improved three-step approaches. Political Analysis, 18(4), 450–469. https://doi.org/10.1093/pan/mpq025
library(LCPA) set.seed(123) N <- 2000 ## sample size L <- 3 ## number of latent class I <- 6 ## number of variables/items ## Covariates at time point T1 covariates.inter <- rep(1, N) # Intercept term is always 1 for each individual covariates.X11 <- rnorm(N) # Covariate X1 is a continuous variable # Combine into covariates at T1 covariates.T1 <- cbind(Intercept=covariates.inter, X1=covariates.X11) ## Covariates at time point T2 covariates.inter <- rep(1, N) # Intercept term is always 1 for each individual covariates.X21 <- rnorm(N) # Covariate X1 is a continuous variable # Combine into covariates at T1 covariates.T2 <- cbind(Intercept=covariates.inter, X1=covariates.X21) # Combine into final covariates list covariates <- list(t1=covariates.T1, t2=covariates.T2) ## Simulate beta coefficients # last column is zero because the last category is used as reference ## fix reference class to class 2 beta <- matrix(c( 0.8, 0.0, 0.1, -0.3, 0.0, -0.5), ncol=L, byrow=TRUE) ## Simulate gamma coefficients gamma <- list( lapply(1:L, function(l) { lapply(1:L, function(k) if(k < L) runif(2, -2.0, 2.0) else c(0, 0)) # Last class as reference }) ) ## Simulate the data sim_custom <- sim.LTA( N=N, I=I, L=L, times=2, type="LCA", IQ=0.9, covariates=covariates, beta=beta, gamma=gamma ) summary(sim_custom) responses <- sim_custom$responses covariates <- sim_custom$covariates ## fix reference class to class 2 LTA.obj <- LTA(responses, L=L, ref.class=2, type="LCA", covariates=covariates, method.SE="Bootstrap", n.Bootstrap=10, CEP.timeCross=FALSE, CEP.error=TRUE, covariates.timeCross=FALSE, par.ini = "random", method="EM", vis = TRUE) print(LTA.obj)library(LCPA) set.seed(123) N <- 2000 ## sample size L <- 3 ## number of latent class I <- 6 ## number of variables/items ## Covariates at time point T1 covariates.inter <- rep(1, N) # Intercept term is always 1 for each individual covariates.X11 <- rnorm(N) # Covariate X1 is a continuous variable # Combine into covariates at T1 covariates.T1 <- cbind(Intercept=covariates.inter, X1=covariates.X11) ## Covariates at time point T2 covariates.inter <- rep(1, N) # Intercept term is always 1 for each individual covariates.X21 <- rnorm(N) # Covariate X1 is a continuous variable # Combine into covariates at T1 covariates.T2 <- cbind(Intercept=covariates.inter, X1=covariates.X21) # Combine into final covariates list covariates <- list(t1=covariates.T1, t2=covariates.T2) ## Simulate beta coefficients # last column is zero because the last category is used as reference ## fix reference class to class 2 beta <- matrix(c( 0.8, 0.0, 0.1, -0.3, 0.0, -0.5), ncol=L, byrow=TRUE) ## Simulate gamma coefficients gamma <- list( lapply(1:L, function(l) { lapply(1:L, function(k) if(k < L) runif(2, -2.0, 2.0) else c(0, 0)) # Last class as reference }) ) ## Simulate the data sim_custom <- sim.LTA( N=N, I=I, L=L, times=2, type="LCA", IQ=0.9, covariates=covariates, beta=beta, gamma=gamma ) summary(sim_custom) responses <- sim_custom$responses covariates <- sim_custom$covariates ## fix reference class to class 2 LTA.obj <- LTA(responses, L=L, ref.class=2, type="LCA", covariates=covariates, method.SE="Bootstrap", n.Bootstrap=10, CEP.timeCross=FALSE, CEP.error=TRUE, covariates.timeCross=FALSE, par.ini = "random", method="EM", vis = TRUE) print(LTA.obj)
Standardizes each column of a numeric matrix or data frame to have mean zero and standard deviation one. This transformation is essential for many multivariate techniques that assume standardized inputs. The function preserves all dimension names and returns a pure numeric matrix with attributes storing original column means and standard deviations.
normalize(response)normalize(response)
response |
A numeric matrix or data frame of dimension
Non-numeric columns will be coerced to numeric with a warning. Missing values are not allowed
and will cause the function to fail. Constant columns (zero variance) will produce |
A standardized numeric matrix of dimension with attributes:
scaled:center: Vector of original column means ()
scaled:scale: Vector of original column standard deviations ()
Row names: Preserved from original input's row names or row indices
Column names: Preserved from original input's column names
Values: Z-scores calculated as
where:
= original value for observation and variable
= sample mean of variable :
= sample standard deviation of variable :
The denominator provides an unbiased estimator of population variance.
For each column in the input matrix , the standardization is performed as:
where:
is the -th column vector of
is the sample mean of column
is the sample standard deviation of column
The resulting matrix has the properties:
for all .
# Basic usage with matrix set.seed(123) mat <- matrix(rnorm(30, mean = 5:7, sd = 1:3), ncol = 3, dimnames = list(paste0("Obs", 1:10), paste0("Var", 1:3))) norm_mat <- normalize(mat) # Verify attributes attr(norm_mat, "scaled:center") # Original column means attr(norm_mat, "scaled:scale") # Original column standard deviations # Verify properties apply(norm_mat, 2, mean) # Should be near zero apply(norm_mat, 2, sd) # Should be exactly 1 # With data frame input df <- as.data.frame(mat) norm_df <- normalize(df) all.equal(norm_mat, norm_df, check.attributes = FALSE) # Should be identical # Handling constant columns (produces NaN) const_mat <- cbind(mat, Constant = rep(4.2, 10)) normalize(const_mat)# Basic usage with matrix set.seed(123) mat <- matrix(rnorm(30, mean = 5:7, sd = 1:3), ncol = 3, dimnames = list(paste0("Obs", 1:10), paste0("Var", 1:3))) norm_mat <- normalize(mat) # Verify attributes attr(norm_mat, "scaled:center") # Original column means attr(norm_mat, "scaled:scale") # Original column standard deviations # Verify properties apply(norm_mat, 2, mean) # Should be near zero apply(norm_mat, 2, sd) # Should be exactly 1 # With data frame input df <- as.data.frame(mat) norm_df <- normalize(df) all.equal(norm_mat, norm_df, check.attributes = FALSE) # Should be identical # Handling constant columns (produces NaN) const_mat <- cbind(mat, Constant = rep(4.2, 10)) normalize(const_mat)
Generates user-friendly, publication-ready visualizations for objects generated by the LCPA package.
This generic function dispatches to class-specific methods that produce diagnostic and interpretive plots
tailored to each model type. Designed for interactive exploration, model interpretation, and presentation-quality output.
## S3 method for class 'LCA' plot(x, ncol = 2, ...) ## S3 method for class 'LPA' plot(x, ncol = 2, ...)## S3 method for class 'LCA' plot(x, ncol = 2, ...) ## S3 method for class 'LPA' plot(x, ncol = 2, ...)
x |
An object of one of the following classes: |
ncol |
Number of columns in the multi-panel layout (default: 2). Controls arrangement of latent class/profile panels. |
... |
Additional arguments passed to methods. |
Each method produces a structured, visually intuitive plot optimized for its object type.
See specific method documentation (plot.LCA, plot.LPA) for detailed parameter options.
Invisibly returns the final ggplot or patchwork object. No data is modified.
plot(LCA): Plot method for LCA objects
plot(LPA): Plot method for LPA objects (with covariance heatmap)
Creates a publication-quality density plot showing the distribution of responses across multiple indicators/items/variables. Automatically handles variable ordering, color scaling, and legend layout based on the number of variables.
plotResponse(response)plotResponse(response)
response |
A matrix or data frame containing response data where:
Non-numeric columns (except row identifiers) will cause errors. |
A ggplot object containing:
Density curves for each variable colored by indicator
Adaptive color schemes based on number of variables
Optimized legend layout for large numbers of indicators
Publication-ready theme with grid lines and clean styling
The plot can be further customized using standard ggplot2 syntax.
The plot uses a minimal theme with:
Light grey grid lines for readability
Black axis lines and ticks (0.8pt thickness)
White background with no panel border
Optimized font sizes (13pt axis titles, 11pt tick labels)
Legend positioned on right with adaptive sizing
# Simulate response data for 5 indicators set.seed(42) resp_data <- data.frame( indicator1 = rnorm(100, mean = 3, sd = 1), indicator2 = rnorm(100, mean = 2, sd = 0.8), indicator3 = rnorm(100, mean = 4, sd = 1.2), indicator4 = rnorm(100, mean = 3.5, sd = 0.9), indicator5 = rnorm(100, mean = 2.5, sd = 1.1) ) library(LCPA) # Generate and display plot p <- plotResponse(resp_data) print(p) # For data with many indicators (18 indicators example) many_indicators <- as.data.frame(replicate(18, rnorm(50, mean = runif(1, 1, 5), sd = 1))) names(many_indicators) <- paste0("Q", 1:18) p_large <- plotResponse(many_indicators) print(p_large)# Simulate response data for 5 indicators set.seed(42) resp_data <- data.frame( indicator1 = rnorm(100, mean = 3, sd = 1), indicator2 = rnorm(100, mean = 2, sd = 0.8), indicator3 = rnorm(100, mean = 4, sd = 1.2), indicator4 = rnorm(100, mean = 3.5, sd = 0.9), indicator5 = rnorm(100, mean = 2.5, sd = 1.1) ) library(LCPA) # Generate and display plot p <- plotResponse(resp_data) print(p) # For data with many indicators (18 indicators example) many_indicators <- as.data.frame(replicate(18, rnorm(50, mean = runif(1, 1, 5), sd = 1))) names(many_indicators) <- paste0("Q", 1:18) p_large <- plotResponse(many_indicators) print(p_large)
Provides user-friendly, formatted console output for objects generated by the LCPA package.
This generic function dispatches to class-specific methods that display concise summaries of model results,
simulated datasets, fit indices, model comparisons, and standard errors. Designed for interactive use
and quick diagnostic inspection.
## S3 method for class 'LCA' print(x, ...) ## S3 method for class 'summary.LCA' print(x, ...) ## S3 method for class 'LPA' print(x, ...) ## S3 method for class 'summary.LPA' print(x, ...) ## S3 method for class 'LTA' print(x, ...) ## S3 method for class 'summary.LTA' print(x, ...) ## S3 method for class 'LCPA' print(x, ...) ## S3 method for class 'summary.LCPA' print(x, ...) ## S3 method for class 'sim.LCA' print(x, ...) ## S3 method for class 'summary.sim.LCA' print(x, ...) ## S3 method for class 'sim.LPA' print(x, ...) ## S3 method for class 'summary.sim.LPA' print(x, ...) ## S3 method for class 'sim.LTA' print(x, ...) ## S3 method for class 'summary.sim.LTA' print(x, ...) ## S3 method for class 'fit.index' print(x, ...) ## S3 method for class 'summary.fit.index' print(x, ...) ## S3 method for class 'compare.model' print(x, ...) ## S3 method for class 'summary.compare.model' print(x, ...) ## S3 method for class 'SE' print(x, digits = 4, I.max = 5, L.max = 3, ...) ## S3 method for class 'summary.SE' print(x, ...)## S3 method for class 'LCA' print(x, ...) ## S3 method for class 'summary.LCA' print(x, ...) ## S3 method for class 'LPA' print(x, ...) ## S3 method for class 'summary.LPA' print(x, ...) ## S3 method for class 'LTA' print(x, ...) ## S3 method for class 'summary.LTA' print(x, ...) ## S3 method for class 'LCPA' print(x, ...) ## S3 method for class 'summary.LCPA' print(x, ...) ## S3 method for class 'sim.LCA' print(x, ...) ## S3 method for class 'summary.sim.LCA' print(x, ...) ## S3 method for class 'sim.LPA' print(x, ...) ## S3 method for class 'summary.sim.LPA' print(x, ...) ## S3 method for class 'sim.LTA' print(x, ...) ## S3 method for class 'summary.sim.LTA' print(x, ...) ## S3 method for class 'fit.index' print(x, ...) ## S3 method for class 'summary.fit.index' print(x, ...) ## S3 method for class 'compare.model' print(x, ...) ## S3 method for class 'summary.compare.model' print(x, ...) ## S3 method for class 'SE' print(x, digits = 4, I.max = 5, L.max = 3, ...) ## S3 method for class 'summary.SE' print(x, ...)
x |
An object of one of the following classes:
|
... |
Additional arguments passed to methods (currently ignored in most cases). |
digits |
Number of decimal places for numeric output (default: varies by method, often 4).
Used by: |
I.max |
Maximum number of variables/items to display before truncation (default: varies, e.g., 5).
Used by: |
L.max |
Maximum number of latent classes/profiles to display before truncation (default: varies, e.g., 3).
Used by: |
Each method produces a structured, human-readable summary optimized for its object type:
LCA/LPA/LCPA/LTA)Invokes summary() internally and prints comprehensive output including:
Model call and configuration (method, constraints, reference class)
Data characteristics (N, I, time points, distribution)
Fit statistics (LogLik, AIC, BIC, entropy, npar)
Class/profile prior probabilities and frequencies
Item-response probabilities (LCA) or profile means (LPA)
For LCPA/LTA: regression coefficients with significance markers and 95% CIs
Convergence diagnostics (iterations, tolerance, hardware)
Replication details (if nrep > 1)
sim.LCA/sim.LPA/sim.LTA)Displays simulation design and true parameter structure:
Configuration (N, I, L, times, constraint, distribution)
True class/profile proportions and observed frequencies
For sim.LCA: item category structure and conditional probabilities
For sim.LPA: covariance constraint description and mean ranges
For sim.LTA: transition mode (fixed/covariate), initial/transition coefficients
Output is truncated for high-dimensional structures using I.max and L.max.
fit.index)Presents a clean table of model fit criteria:
Header with dimensions (N, I, L, npar)
Formatted table: AIC, BIC, SABIC, CAIC, AWE, -2LL, SIC
Interpretation note: “Lower values preferred for ICs”
Values rounded to digits decimal places
compare.model)Compares two nested models with statistical tests:
Comparative fit table (npar, LogLik, AIC, BIC, entropy)
Classification quality (AvePP per class, overall entropy)
Bayes Factor with interpretive guidance
Likelihood ratio tests (standard, VLMR, Bootstrap) with p-values and significance codes
Clear section headers and visual separators
SE)Displays uncertainty estimates for model parameters:
Class probability SEs (always fully shown)
Profile means SEs (LPA) or item-response SEs (LCA), truncated by L.max/I.max
Covariance SE summary (non-zero count only; full access via extract())
Diagnostics: Bootstrap completion % or Hessian condition number with stability warnings
All summary.* methods are called internally by their corresponding print.* methods.
They pre-compute and structure output for consistent formatting. Direct calls are also supported.
Invisibly returns the input object x. No data is modified.
print(LCA): Print method for LCA objects
print(summary.LCA): Print method for summary.LCA objects
print(LPA): Print method for LPA objects
print(summary.LPA): Print method for summary.LPA objects
print(LTA): Print method for LTA objects
print(summary.LTA): Print method for summary.LTA objects
print(LCPA): Print method for LCPA objects
print(summary.LCPA): Print method for summary.LCPA objects
print(sim.LCA): Print method for sim.LCA objects
print(summary.sim.LCA): Print method for summary.sim.LCA objects
print(sim.LPA): Print method for sim.LPA objects
print(summary.sim.LPA): Print method for summary.sim.LPA objects
print(sim.LTA): Print method for sim.LTA objects
print(summary.sim.LTA): Print method for summary.sim.LTA objects
print(fit.index): Print method for fit.index objects
print(summary.fit.index): Print method for summary.fit.index objects
print(compare.model): Print method for compare.model objects
print(summary.compare.model): Print method for summary.compare.model objects
print(SE): Print method for SE objects
print(summary.SE): Print method for summary.SE objects
Numeric values are typically rounded to 4 decimal places unless overridden by digits.
Large matrices (e.g., item parameters, transition coefficients) are truncated with clear messages.
Significance markers: *** (<0.001), ** (<0.01), * (<0.05), . (<0.1).
95% confidence intervals computed as: Estimate ± 1.96 × Std_Error.
Reference classes (for multinomial models) are explicitly stated.
Warnings appear for unstable SEs (high condition number) or incomplete Bootstrap runs.
rdirichlet generates n random observations from a Dirichlet distribution
with a specified concentration parameter vector alpha.
rdirichlet(n, alpha)rdirichlet(n, alpha)
n |
Integer. The number of random vectors to generate. |
alpha |
Numeric vector. The concentration parameters (must be positive).
The length of this vector determines the number of dimensions |
The Dirichlet distribution is a family of continuous multivariate probability distributions
parameterized by a vector of positive reals. It is the multivariate
generalization of the beta distribution and is commonly used as a conjugate prior
to the multinomial distribution in Bayesian statistics.
Probability Density Function:
For a vector on the unit simplex (where
and ), the density is given by:
where the normalizing constant is the multivariate beta function:
Simulation Method:
The function utilizes the property that if are independent
Gamma random variables such that , then:
The resulting vector follows a Dirichlet distribution with parameters .
A matrix with n rows and length(alpha) columns.
Each row sums to 1, representing a single sample from the Dirichlet distribution.
# Generate 5 samples from a 3-dimensional Dirichlet distribution set.seed(123) alpha_params <- c(1, 2, 5) result <- rdirichlet(n = 5, alpha = alpha_params) print(result) # Check that rows sum to 1 rowSums(result)# Generate 5 samples from a 3-dimensional Dirichlet distribution set.seed(123) alpha_params <- c(1, 2, 5) result <- rdirichlet(n = 5, alpha = alpha_params) print(result) # Check that rows sum to 1 rowSums(result)
This function generates a random correlation matrix using the C-vine partial correlation
parameterization described in Joe & Kurowicka (2026). The method constructs the matrix recursively using
partial correlations organized in a C-vine structure, with distributional properties controlled by LKJ
concentration and skewness parameters.
sim.correlation( I, eta = 1, skew = 0, positive = FALSE, permute = TRUE, maxiter = 10 )sim.correlation( I, eta = 1, skew = 0, positive = FALSE, permute = TRUE, maxiter = 10 )
I |
Dimension of the correlation matrix (must be |
eta |
LKJ concentration parameter ( |
skew |
Skewness parameter (
|
positive |
Logical. If |
permute |
Logical. If |
maxiter |
Integer. Maximum number of generation attempts before numerical adjustment when |
The algorithm follows four key steps:
Partial correlation sampling:
For tree level and node , partial correlations are sampled as:
If positive = FALSE:
If positive = TRUE:
Recursive matrix construction (C-vine):
The correlation matrix is built without matrix inversion using backward recursion:
Tree 1 (raw correlations): for
Trees : For pairs where and :
This implements the dynamic programming approach from Joe & Kurowicka (2026, Section 2.1).
Positive definiteness enforcement (when positive = TRUE):
Attempt up to maxiter generations
On failure, project to nearest positive definite correlation matrix using nearPD with corr = TRUE
Final matrix has minimum eigenvalue > 1e-8
Exchangeability (optional):
If permute = TRUE, rows/columns are randomly permuted before returning the matrix.
An positive definite correlation matrix with unit diagonal.
When positive = TRUE, the function guarantees positive definiteness either through direct generation
(with retries) or numerical projection. The theoretical guarantee is recommended for high dimensions.
Joe, H., & Kurowicka, D. (2026). Random correlation matrices generated via partial correlation C-vines. Journal of Multivariate Analysis, 211, 105519. https://doi.org/10.1016/j.jmva.2025.105519
# Default 3x3 correlation matrix sim.correlation(3) # 5x5 matrix concentrated near identity (eta=3) sim.correlation(5, eta = 3) # Skewed toward positive correlations (no permutation) sim.correlation(4, skew = 0.7, permute = FALSE) # Positive partial correlations (enforced positive definiteness) R <- sim.correlation(6, positive = TRUE) min(eigen(R, symmetric = TRUE, only.values = TRUE)$values) # > 1e-8 # High-dimensional case (I=20) with theoretical guarantee R <- sim.correlation(20, eta = 10) # eta=10 > (20-2)/2=9 min(eigen(R, symmetric = TRUE, only.values = TRUE)$values)# Default 3x3 correlation matrix sim.correlation(3) # 5x5 matrix concentrated near identity (eta=3) sim.correlation(5, eta = 3) # Skewed toward positive correlations (no permutation) sim.correlation(4, skew = 0.7, permute = FALSE) # Positive partial correlations (enforced positive definiteness) R <- sim.correlation(6, positive = TRUE) min(eigen(R, symmetric = TRUE, only.values = TRUE)$values) # > 1e-8 # High-dimensional case (I=20) with theoretical guarantee R <- sim.correlation(20, eta = 10) # eta=10 > (20-2)/2=9 min(eigen(R, symmetric = TRUE, only.values = TRUE)$values)
Generates synthetic multivariate categorical data from a latent class model with L latent classes.
Each observed variable follows a multinomial distribution within classes, with flexible control over
class separation via the IQ parameter and class size distributions.
sim.LCA( N = 1000, I = 10, L = 3, poly.value = 5, IQ = "random", distribution = "random", params = NULL, is.sort = TRUE )sim.LCA( N = 1000, I = 10, L = 3, poly.value = 5, IQ = "random", distribution = "random", params = NULL, is.sort = TRUE )
N |
Integer; total number of observations to simulate. Must be |
I |
Integer; number of categorical observed variables. Must be |
L |
Integer; number of latent classes. Must be |
poly.value |
Integer or integer vector; number of categories (levels) for each observed variable.
If scalar, all variables share the same number of categories. If vector, must have length |
IQ |
Character or numeric; controls category probability distributions:
|
distribution |
Character; distribution of class sizes. Options: |
params |
List with fixed parameters for simulation:
|
is.sort |
A logical value. If |
Probability Generation:
Dirichlet Sampling (IQ="random"):
For each variable-class combination, probabilities are drawn from
where .
High-Discrimination Mode (IQ=numeric):
For each variable i:
Generate special probabilities par.special of length L containing:
, , and values uniformly sampled from .
For each class l, assign par.special[l] to one category, distribute
remaining probability uniformly (via Dirichlet) across other categories.
Shuffle category assignments to avoid position bias.
Data Generation:
Class assignments Z are generated first according to distribution.
For each observation p and variable i:
Retrieve cumulative probabilities for class Z[p]
Draw uniform random number
Assign category k where
Entire dataset is regenerated if any category of any variable has zero observations.
Critical Constraints:
When IQ is numeric: and min(poly.value) >= 2
N must be sufficiently large to observe all categories, especially when IQ is high
or poly.value is large. Simulation may fail for small N.
For distribution="uniform", empty classes possible when .
A list containing:
Integer matrix () of simulated observations. Rows are observations (named "O1", "O2", ...),
columns are variables named "I1", "I2", ... Values range from 0 to poly.value[i]-1.
Array () of true class-specific category probabilities,
where (i.e., the maximum number of categories across variables).
Dimensions: classes x variables x categories.
Note: For variables with poly.value[i] < K, unused category dimensions contain NA.
Dimension names: "L1", "L2", ... (classes); "I1", "I2", ... (variables);
"poly0", "poly1", ... (categories).
Integer vector (length ) of true class assignments (1 to L). Named with observation IDs (e.g., "O1").
Numeric vector (length ) of true class proportions. Named with class labels (e.g., "L1").
Integer vector (length ) specifying number of categories per variable.
Binary matrix () of true class membership indicators (one-hot encoded).
Row i, column l = 1 if observation i belongs to class l, else 0.
Row/column names match Z and class labels.
A list containing all input arguments.
Controls the discriminative power of observed variables:
IQ = "random"(Default)
Category probabilities for each variable-class combination are drawn from a symmetric Dirichlet distribution
(), resulting in moderate class separation.
IQ = numeric(0.5 < IQ < 1) Forces high discriminative power for each variable:
For each variable, two categories per class are assigned extreme probabilities:
one category gets probability , another gets .
Remaining categories share the residual probability .
Note: This requires poly.value >= 2 for all variables.
Category assignments are randomized within classes to avoid structural patterns.
Higher IQ values (closer to 1) yield stronger class separation but increase simulation failure risk.
"random"(Default) Class proportions drawn from Dirichlet distribution ( for all classes),
ensuring no empty classes. Sizes are rounded to integers with adjustment for exact N.
"uniform"Equal probability of class membership ( per class), sampled with replacement.
May produce empty classes if N is small relative to L.
The simulation enforces a critical constraint: every category of every observed variable must appear at least once in the dataset. If initial generation violates this (e.g., a rare category is missing), parameters and responses are regenerated until satisfied. This ensures compatibility with standard LCA estimation.
# Example 1: Default settings (moderate separation, random class sizes) sim_data <- sim.LCA(N = 30, I = 5, L = 3) # Example 2: High-discrimination indicators (IQ=0.85), uniform class sizes sim_high_disc <- sim.LCA( N = 30, I = 4, L = 2, poly.value = c(3,4,3,5), # Variable category counts IQ = 0.85, distribution = "uniform" ) # Example 3: Binary indicators (poly.value=2) with high separation sim_binary <- sim.LCA(N = 300, I = 10, L = 2, poly.value = 2, IQ = 0.9)# Example 1: Default settings (moderate separation, random class sizes) sim_data <- sim.LCA(N = 30, I = 5, L = 3) # Example 2: High-discrimination indicators (IQ=0.85), uniform class sizes sim_high_disc <- sim.LCA( N = 30, I = 4, L = 2, poly.value = c(3,4,3,5), # Variable category counts IQ = 0.85, distribution = "uniform" ) # Example 3: Binary indicators (poly.value=2) with high separation sim_binary <- sim.LCA(N = 300, I = 10, L = 2, poly.value = 2, IQ = 0.9)
Generates synthetic multivariate continuous data from a latent profile model with L latent classes.
Supports flexible covariance structure constraints (including custom equality constraints) and
class size distributions. All covariance matrices are ensured to be positive definite.
sim.LPA( N = 1000, I = 5, L = 2, constraint = "VV", distribution = "random", mean.range = c(-2, 2), covs.range = c(0.01, 4), params = NULL, is.sort = TRUE )sim.LPA( N = 1000, I = 5, L = 2, constraint = "VV", distribution = "random", mean.range = c(-2, 2), covs.range = c(0.01, 4), params = NULL, is.sort = TRUE )
N |
Integer; total number of observations to simulate. Must be |
I |
Integer; number of continuous observed variables. Must be |
L |
Integer; number of latent profiles (classes). Must be |
constraint |
Character string or list specifying covariance constraints. See detailed description below.
Default is |
distribution |
Character; distribution of class sizes. Options: |
mean.range |
Numeric vector of length 2; range for sampling class-specific means.
Each variable's means are sampled uniformly from |
covs.range |
Numeric vector of length 2; range for sampling variance parameters (diagonal elements).
Must satisfy |
params |
List with fixed parameters for simulation:
|
is.sort |
A logical value. If |
Mean Generation: For each variable, candidate means are sampled uniformly from mean.range.
distinct means are selected without replacement to ensure separation between classes.
Covariance Generation:
Positive Definiteness: All covariance matrices are adjusted using Matrix::nearPD
and eigenvalue thresholds () to guarantee validity. Failed attempts trigger explicit errors.
Univariate Case (I=1): Constraints "UE" and "UV" are enforced automatically.
Predefined constraints like "E0" map to "UE".
VE Constraint: Requires special handling—base off-diagonal elements are fixed, and diagonals
are sampled above a minimum threshold to maintain positive definiteness. May fail if covs.range is too narrow.
Class Assignment:
"random": Uses Dirichlet distribution () to avoid extremely small classes.
Sizes are rounded and adjusted to sum exactly to N.
"uniform": Simple random sampling with equal probability. May produce empty classes if N is small.
Data Generation: Observations are simulated using mvtnorm::rmvnorm per class.
Final data and class labels are shuffled to remove ordering artifacts.
A list containing:
Numeric matrix () of simulated observations. Rows are observations,
columns are variables named "V1", "V2", ..., or "UV" for univariate data.
Numeric matrix () of true class-specific means.
Row names: "Class1", "Class2", ...; column names match response.
Array () of true class-specific covariance matrices.
Dimensions: variables x variables x classes. Constrained parameters have identical values across class slices.
Dimension names match response and class labels.
Numeric matrix () of true class membership probabilities (one-hot encoded).
Row i, column l = 1 if observation i belongs to class l, else 0.
Row names: "O1", "O2", ...; column names: "Class1", "Class2", ...
Numeric vector (length ) of true class proportions.
Named with class labels (e.g., "Class1").
Integer vector (length ) of true class assignments (1 to L).
Named with observation IDs (e.g., "O1").
Original constraint specification (character string or list) passed to the function.
The constraint parameter controls equality constraints on covariance parameters across classes:
"UE" (Univariate only)Equal variance across all classes.
"UV" (Univariate only)Varying variances across classes.
"E0"Equal variances across classes, zero covariances (diagonal matrix with shared variances).
"V0"Varying variances across classes, zero covariances (diagonal matrix with free variances).
"EE"Equal full covariance matrix across all classes (homogeneous).
"EV"Equal variances but varying covariances (equal diagonal, free off-diagonal).
"VE"Varying variances but equal correlations (free diagonal, equal correlation structure).
"VV"Varying full covariance matrices across classes (heterogeneous; default).
Each element specifies a pair of variables whose covariance parameters are constrained equal across classes:
c(i,i)Constrains variance of variable i to be equal across all classes.
c(i,j)Constrains covariance between variables i and j to be equal across all classes
(symmetric: automatically includes c(j,i)).
Unconstrained parameters vary freely. The algorithm ensures positive definiteness by:
Generating a base positive definite matrix S0.
Applying constraints via a logical mask.
Adjusting unconstrained variances to maintain positive definiteness.
Critical requirements for custom constraints:
I.I=1), only list(c(1,1)) is valid."random"(Default) Class proportions drawn from Dirichlet distribution ( for all classes),
ensuring no empty classes. Sizes are rounded to integers with adjustment for exact N.
"uniform"Equal probability of class membership ( per class), sampled with replacement.
# Example 1: Bivariate data, 3 classes, heterogeneous covariances (default) sim_data <- sim.LPA(N = 500, I = 2, L = 3, constraint = "VV") # Example 2: Univariate data, equal variances # 'E0' automatically maps to 'UE' for I=2 sim_uni <- sim.LPA(N = 200, I = 2, L = 2, constraint = "E0") # Example 3: Custom constraints # - Equal covariance between V1 and V2 across classes # - Equal variance for V3 across classes sim_custom <- sim.LPA( N = 300, I = 3, L = 4, constraint = list(c(1, 2), c(3, 3)) ) # Example 4: VE constraint (varying variances, equal correlations) sim_ve <- sim.LPA(N = 400, I = 3, L = 3, constraint = "VE") # Example 5: Uniform class sizes sim_uniform <- sim.LPA(N = 300, I = 4, L = 5, distribution = "uniform")# Example 1: Bivariate data, 3 classes, heterogeneous covariances (default) sim_data <- sim.LPA(N = 500, I = 2, L = 3, constraint = "VV") # Example 2: Univariate data, equal variances # 'E0' automatically maps to 'UE' for I=2 sim_uni <- sim.LPA(N = 200, I = 2, L = 2, constraint = "E0") # Example 3: Custom constraints # - Equal covariance between V1 and V2 across classes # - Equal variance for V3 across classes sim_custom <- sim.LPA( N = 300, I = 3, L = 4, constraint = list(c(1, 2), c(3, 3)) ) # Example 4: VE constraint (varying variances, equal correlations) sim_ve <- sim.LPA(N = 400, I = 3, L = 3, constraint = "VE") # Example 5: Uniform class sizes sim_uniform <- sim.LPA(N = 300, I = 4, L = 5, distribution = "uniform")
Simulates longitudinal latent class/profile data where initial class membership and transition probabilities may be influenced by time-varying covariates. Supports both Latent Class Analysis (LCA) for categorical outcomes and Latent Profile Analysis (LPA) for continuous outcomes. Measurement invariance is assumed by default (identical indicator parameters across time).
sim.LTA( N = 500, I = 5, L = 3, distribution = "random", times = 2, type = "LCA", rate = NULL, constraint = "VV", mean.range = c(-2, 2), covs.range = c(0.01, 4), poly.value = 5, IQ = "random", params = NULL, is.sort = TRUE, covariates = NULL, beta = NULL, gamma = NULL )sim.LTA( N = 500, I = 5, L = 3, distribution = "random", times = 2, type = "LCA", rate = NULL, constraint = "VV", mean.range = c(-2, 2), covs.range = c(0.01, 4), poly.value = 5, IQ = "random", params = NULL, is.sort = TRUE, covariates = NULL, beta = NULL, gamma = NULL )
N |
Integer; sample size. |
I |
Integer; number of observed indicators/items/indicators per time point. |
L |
Integer; number of latent classes/profiles. |
distribution |
Character; distribution of initial class probabilities when not using covariates or |
times |
Integer; number of time points (must be |
type |
Character; type of latent model. |
rate |
List of matrices or NULL; transition probability matrices for non-covariate mode.
Each matrix is |
constraint |
Character; covariance structure for LPA ( |
mean.range |
Numeric vector; range for randomly generated class means in LPA (default: |
covs.range |
Numeric vector; range for covariance matrix diagonals in LPA (default: |
poly.value |
Integer; number of categories for polytomous LCA indicators (default: 5). |
IQ |
Character; method for generating indicator discrimination in LCA. |
params |
List or NULL; pre-specified parameters for reproducibility (see Details). |
is.sort |
A logical value. If |
covariates |
List of matrices or NULL; covariate matrices for each time point. Each matrix must have
dimensions |
beta |
Matrix or NULL; initial state regression coefficients of dimension |
gamma |
List or NULL; transition regression coefficients. Must be a list of length |
Covariate Requirements:
Covariate matrices must include an intercept (first column = 1). If omitted, the function adds an intercept and issues a warning.
When covariates is provided but beta or gamma is NULL, coefficients are
randomly generated from (non-reference classes only).
The reference class () always has zero coefficients (,
).
Parameter Compatibility:
Use params to fix indicator parameters (LCA) or class means/covariances (LPA) across simulations.
In non-covariate mode, rate must be a list of valid transition matrices (ignored when times=1).
In covariate mode with times>=2, all three (covariates, beta, gamma) must be consistent in dimensions.
A list of class "sim.LTA" containing:
responsesList of length times; observed data matrices ().
ZsList of length times; true latent class memberships ( vectors).
P.ZsList of length times; marginal class probabilities at each time.
parIndicator parameters for LCA (if type="LCA").
meansClass means for LPA (if type="LPA").
covsClass covariance matrices for LPA (if type="LPA").
rateTrue transition matrices (non-covariate mode only; NULL when times=1).
covariatesList of covariate matrices used (covariate mode only).
betaTrue initial state coefficients (covariate mode only).
gammaTrue transition coefficients (covariate mode only; NULL when times=1).
callFunction call.
argumentsInput arguments.
For observation/participant at time 1, the probability of belonging to latent class is:
where is the covariate vector for observation/participant at time 1,
with (intercept term) and () representing the value of the -th covariate.
The coefficient vector corresponds element-wise to ,
where is the intercept and () are regression coefficients for covariates.
Class is the reference class ().
For observation/participant transitioning from class at time to class at time ():
where is the covariate vector at time ,
with (intercept) and () as the -th covariate value.
The coefficient vector
corresponds element-wise to , where is the intercept and ()
are regression coefficients. Class is the reference class ( for all ).
Initial probabilities follow a multinomial distribution with probabilities .
When , transitions follow a Markov process with fixed probabilities ,
where for each and .
####################### Example 1: Single time point (times=1) ###################### library(LCPA) set.seed(123) sim_single <- sim.LTA(N = 200, I = 4, L = 3, times = 1, type = "LCA") print(sim_single) ####################### Example 2: LPA without covariates ###################### set.seed(123) sim_lta <- sim.LTA(N = 200, I = 3, L = 3, times = 3, type = "LPA", constraint = "VE") print(sim_lta) ################## Example 3: With custom covariates (times>=2) ###################### set.seed(123) N <- 200 ## sample size ## Covariates at time point T1 covariates.inter <- rep(1, N) # Intercept term is always 1 for each n covariates.X1 <- rnorm(N) # Covariate X1 is a continuous variable covariates.X2 <- rbinom(N, 1, 0.5) # Covariate X2 is a binary variable covariates.X1.X2 <- covariates.X1 * covariates.X2 # Interaction between covariates X1 and X2 covariates.T1 <- cbind(inter=covariates.inter, X1=covariates.X1, X2=covariates.X2, X1.X2=covariates.X1.X2) # Combine into covariates at T1 ## Covariates at time point T2 covariates.inter <- rep(1, N) # Intercept term is always 1 for each n covariates.X1 <- rnorm(N) # Covariate X1 is a continuous variable covariates.X2 <- rbinom(N, 1, 0.5) # Covariate X2 is a binary variable covariates.X1.X2 <- covariates.X1 * covariates.X2 # Interaction between covariates X1 and X2 covariates.T2 <- cbind(inter=covariates.inter, X1=covariates.X1, X2=covariates.X2, X1.X2=covariates.X1.X2) # Combine into covariates at T2 covariates <- list(t1=covariates.T1, t2=covariates.T2) # Combine into final covariates list ## Simulate beta coefficients # 3x3 matrix (last column is zero because the last category is used as reference) beta <- matrix(c( 0.8, -0.5, 0.0, -0.3, -0.4, 0.0, 0.2, 0.8, 0.0, -0.1, 0.2, 0.0), ncol=3, byrow=TRUE) ## Simulate gamma coefficients (only needed when times>=2) gamma <- list( lapply(1:3, function(l) { lapply(1:3, function(k) if(k < 3) runif(4, -1.0, 1.0) else c(0, 0, 0, 0)) # Last class as reference }) ) ## Simulate the data sim_custom <- sim.LTA( N=N, I=4, L=3, times=2, type="LPA", covariates=covariates, beta=beta, gamma=gamma ) summary(sim_custom)####################### Example 1: Single time point (times=1) ###################### library(LCPA) set.seed(123) sim_single <- sim.LTA(N = 200, I = 4, L = 3, times = 1, type = "LCA") print(sim_single) ####################### Example 2: LPA without covariates ###################### set.seed(123) sim_lta <- sim.LTA(N = 200, I = 3, L = 3, times = 3, type = "LPA", constraint = "VE") print(sim_lta) ################## Example 3: With custom covariates (times>=2) ###################### set.seed(123) N <- 200 ## sample size ## Covariates at time point T1 covariates.inter <- rep(1, N) # Intercept term is always 1 for each n covariates.X1 <- rnorm(N) # Covariate X1 is a continuous variable covariates.X2 <- rbinom(N, 1, 0.5) # Covariate X2 is a binary variable covariates.X1.X2 <- covariates.X1 * covariates.X2 # Interaction between covariates X1 and X2 covariates.T1 <- cbind(inter=covariates.inter, X1=covariates.X1, X2=covariates.X2, X1.X2=covariates.X1.X2) # Combine into covariates at T1 ## Covariates at time point T2 covariates.inter <- rep(1, N) # Intercept term is always 1 for each n covariates.X1 <- rnorm(N) # Covariate X1 is a continuous variable covariates.X2 <- rbinom(N, 1, 0.5) # Covariate X2 is a binary variable covariates.X1.X2 <- covariates.X1 * covariates.X2 # Interaction between covariates X1 and X2 covariates.T2 <- cbind(inter=covariates.inter, X1=covariates.X1, X2=covariates.X2, X1.X2=covariates.X1.X2) # Combine into covariates at T2 covariates <- list(t1=covariates.T1, t2=covariates.T2) # Combine into final covariates list ## Simulate beta coefficients # 3x3 matrix (last column is zero because the last category is used as reference) beta <- matrix(c( 0.8, -0.5, 0.0, -0.3, -0.4, 0.0, 0.2, 0.8, 0.0, -0.1, 0.2, 0.0), ncol=3, byrow=TRUE) ## Simulate gamma coefficients (only needed when times>=2) gamma <- list( lapply(1:3, function(l) { lapply(1:3, function(k) if(k < 3) runif(4, -1.0, 1.0) else c(0, 0, 0, 0)) # Last class as reference }) ) ## Simulate the data sim_custom <- sim.LTA( N=N, I=4, L=3, times=2, type="LPA", covariates=covariates, beta=beta, gamma=gamma ) summary(sim_custom)
Generates structured, comprehensive summaries of objects produced by the LCPA package.
This generic function dispatches to class-specific methods that extract and organize key information
including model configurations, fit statistics, parameter estimates, simulation truths, and diagnostics.
Designed for programmatic access and downstream reporting.
## S3 method for class 'LCA' summary(object, digits = 4, I.max = 5, ...) ## S3 method for class 'LPA' summary(object, digits = 4, I.max = 5, ...) ## S3 method for class 'LTA' summary(object, digits = 4, ...) ## S3 method for class 'LCPA' summary(object, digits = 4, ...) ## S3 method for class 'sim.LCA' summary(object, digits = 4, I.max = 5, ...) ## S3 method for class 'sim.LPA' summary(object, digits = 4, I.max = 5, ...) ## S3 method for class 'sim.LTA' summary(object, digits = 4, I.max = 5, L.max = 5, ...) ## S3 method for class 'fit.index' summary(object, digits = 4, ...) ## S3 method for class 'compare.model' summary(object, digits = 4, ...) ## S3 method for class 'SE' summary(object, ...)## S3 method for class 'LCA' summary(object, digits = 4, I.max = 5, ...) ## S3 method for class 'LPA' summary(object, digits = 4, I.max = 5, ...) ## S3 method for class 'LTA' summary(object, digits = 4, ...) ## S3 method for class 'LCPA' summary(object, digits = 4, ...) ## S3 method for class 'sim.LCA' summary(object, digits = 4, I.max = 5, ...) ## S3 method for class 'sim.LPA' summary(object, digits = 4, I.max = 5, ...) ## S3 method for class 'sim.LTA' summary(object, digits = 4, I.max = 5, L.max = 5, ...) ## S3 method for class 'fit.index' summary(object, digits = 4, ...) ## S3 method for class 'compare.model' summary(object, digits = 4, ...) ## S3 method for class 'SE' summary(object, ...)
object |
An object of one of the following classes:
|
digits |
Number of decimal places for numeric output (default: 4). Applied universally across all methods. |
I.max |
Maximum number of variables/items to display ( |
... |
Additional arguments passed to or from other methods (currently ignored). |
L.max |
Maximum number of latent classes/profiles to display before truncation ( |
Each method returns a named list with class-specific components optimized for structured access:
LCAReturns a summary.LCA object with components:
callOriginal function call.
model.configList: latent_classes, method.
data.infoList: N, I, poly.value, uniform_categories.
fit.statsList: LogLik, AIC, BIC, entropy, npar.
class.probsData frame: Class, Count, Proportion.
item.probsList of matrices (first I.max items) with conditional probabilities per class/category.
convergenceList: algorithm, iterations, tolerance, loglik change, hardware (if applicable).
replicationList: nrep, best_BIC (if multiple replications performed).
digits, I.max.shown, total.items
Metadata for printing/formatting.
LPAReturns a summary.LPA object with components:
callOriginal function call.
model.configList: latent_profiles, constraint, cov_structure, method.
data.infoList: N, I, distribution.
fit.statsList: LogLik, AIC, BIC, entropy, npar.
class.probsData frame: Profile, Count, Proportion.
class.meansMatrix (first I.max variables) of profile-specific means.
convergenceList: algorithm, iterations, tolerance, loglik change, hardware (if applicable).
replicationList: nrep, best_BIC (if multiple replications performed).
digits, I.max.shown, total.vars
Metadata for printing/formatting.
LCPAReturns a summary.LCPA object with components:
callOriginal function call.
model.configList: latent_classes, model_type, reference_class,
covariates_mode, CEP_handling.
data.infoList: sample_size, variables.
fit.statsList: LogLik, AIC, BIC, npar.
class.probsData frame: Class, Probability, Proportion, Frequency.
coefficientsData frame: regression coefficients for non-reference classes (Estimate, Std_Error, z_value, p_value, 95% CI).
reference_classInteger: reference class for multinomial logit.
convergenceList: iterations, coveraged, converg_note.
digits, I.max.shown, total.vars, has.covariates
Metadata for printing/formatting.
LTAReturns a summary.LTA object with components:
callOriginal function call.
model.configList: time_points, latent_classes, model_type,
reference_class, covariates_mode, CEP_handling, transition_mode.
data.infoList: sample_size, variables, time_points.
fit.statsList: LogLik, AIC, BIC, npar.
class.probsList of data frames (per time point): Class, Probability, Proportion, Frequency.
initial_modelList: coefficients (data frame), covariate_names, reference_class.
transition_modelsNamed list of data frames: transition coefficients per time interval (From_Class, To_Class, Estimate, Std_Error, etc.).
reference_classInteger: reference destination class for transitions.
convergenceList: iterations, coveraged, converg_note.
digits, I.max.shown, total.vars, covariates.timeCross
Metadata for printing/formatting.
sim.LCAReturns a summary.sim.LCA object with components:
callOriginal simulation call.
configList: N, I, L, poly.value, uniform_categories, IQ, distribution.
class.probsData frame: Class, Probability, Frequency.
item.probsList of matrices (first I.max items) with true conditional probabilities per class/category.
digits, I.max.shown, total.vars
Metadata for printing/formatting.
sim.LPAReturns a summary.sim.LPA object with components:
callOriginal simulation call.
configList: N, I, L, constraint, constraint_desc, distribution.
class.probsData frame: Profile, Probability, Frequency.
class.meansMatrix (first I.max variables) of true profile-specific means.
cov_structureCharacter: detailed description of covariance constraints.
digits, I.max.shown, total.vars
Metadata for printing/formatting.
sim.LTAReturns a summary.sim.LTA object with components:
callOriginal simulation call.
configList: N, I, L, times, type, distribution, constraint (if LPA).
class.probsList of data frames (per time point): Class, Probability, Frequency.
item.probsNested list (by time/item) of true conditional probabilities (if type="LCA").
class.meansList of matrices (by time) of true profile means (if type="LPA").
transitionList: mode ("fixed" or "covariate"), rate or beta/gamma coefficients, time_points.
covariatesList of data frames (per time point) with covariate summaries (Min, Max, Mean), if present.
digits, I.max.shown, L.max.shown, total.vars, total.classes
Metadata for printing/formatting.
fit.indexReturns a summary.fit.index object with components:
callFunction call that generated the fit indices.
data.infoList: N.
fit.tableData frame: Statistic, Value, Description for -2LL, AIC, BIC, SIC, CAIC, AWE, SABIC.
digitsNumeric: precision used for formatting.
compare.modelReturns a summary.compare.model object with components:
callFunction call that generated the comparison.
data.infoList: N, I, L (named vector for two models).
fit.tableData frame comparing fit indices for both models.
model_comparisonData frame: Classes, npar, AvePP, Entropy.
BFNumeric: Bayes Factor value (if computed).
BF_interpretationCharacter: interpretive guidance for Bayes Factor.
lrt_tableData frame: Test, Statistic, DF, p-value, Sig (significance markers).
lrt_objectsList: raw hypothesis test objects for further inspection.
digitsNumeric: precision used for formatting.
SEReturns a summary.SE object with components:
callOriginal function call.
methodCharacter: "Obs" or "Bootstrap".
diagnosticsList: method-specific diagnostic info (e.g., n.Bootstrap, hessian_cond_number).
model_typeCharacter: "LCA" or "LPA".
LInteger: number of latent classes/profiles.
IInteger: number of variables/items (NA if unknown).
nonzero_countsList: counts of non-zero SEs by parameter type (P.Z, means/par, covs).
total_PZInteger: total number of class probability parameters.
Invisibly returns a structured list containing summary components.
The exact structure depends on the class of object. All returned objects carry an appropriate
S3 class (e.g., summary.LCA, summary.LPA) for use with corresponding print methods.
summary(LCA): Summary method for LCA objects
summary(LPA): Summary method for LPA objects
summary(LTA): Summary method for LTA objects
summary(LCPA): Summary method for LCPA objects
summary(sim.LCA): Summary method for sim.LCA objects
summary(sim.LPA): Summary method for sim.LPA objects
summary(sim.LTA): Summary method for sim.LTA objects
summary(fit.index): Summary method for fit.index objects
summary(compare.model): Summary method for compare.model objects
summary(SE): Summary method for summary.SE objects
The update function provides a unified and convenient interface to refresh or modify
existing objects generated by the LCPA package. It allows users to re-run model fitting
or data simulation with new parameter settings while preserving all other original configurations.
Supported classes include: LCA, LPA,
LCPA, LTA, sim.LCA,
sim.LPA, and sim.LTA.
update(x, ...) ## S3 method for class 'LCA' update(x, ...) ## S3 method for class 'LPA' update(x, ...) ## S3 method for class 'LCPA' update(x, ...) ## S3 method for class 'LTA' update(x, ...) ## S3 method for class 'sim.LCA' update(x, ...) ## S3 method for class 'sim.LPA' update(x, ...) ## S3 method for class 'sim.LTA' update(x, ...)update(x, ...) ## S3 method for class 'LCA' update(x, ...) ## S3 method for class 'LPA' update(x, ...) ## S3 method for class 'LCPA' update(x, ...) ## S3 method for class 'LTA' update(x, ...) ## S3 method for class 'sim.LCA' update(x, ...) ## S3 method for class 'sim.LPA' update(x, ...) ## S3 method for class 'sim.LTA' update(x, ...)
x |
An object of one of the following classes: |
... |
Additional named arguments passed to override or extend the original call.
Valid arguments depend on the class of
|
Internally, each method extracts the stored arguments list from the input object x,
merges it with user-provided ... using modifyList, then re-invokes
the corresponding constructor function (LCA(), LPA(), LCPA(), LTA(),
sim.LCA(), etc.) with the merged argument list.
This ensures that:
Only explicitly overridden parameters are changed.
Default values from the original call remain intact.
Complex nested structures (e.g., control lists) can be partially updated.
Note: If an invalid argument is passed (e.g., constraint to LCA), it will be silently ignored
unless the underlying constructor validates inputs.
An object of the same class as x, reconstructed using the original arguments
updated with any provided in .... All unchanged parameters are preserved from the original call.
update(LCA): Update method for LCA objects
update(LPA): Update method for LPA objects
update(LCPA): Update method for LCPA objects
update(LTA): Update method for LTA objects
update(sim.LCA): Update method for sim.LCA objects
update(sim.LPA): Update method for sim.LPA objects
update(sim.LTA): Update method for sim.LTA objects
library(LCPA) # --- Update LCA --- data <- sim.LCA(N=500, I=5, L=3) lca.obj <- LCA(data$response, L=3) lca.updated <- update(lca.obj, method="EM", nrep=5) # --- Update LPA --- data2 <- sim.LPA(N=300, I=4, L=2) lpa.obj <- LPA(data2$response, L=2, constraint="VE") lpa.updated <- update(lpa.obj, constraint="VV") # --- Update Simulation Objects --- sim.obj1 <- sim.LCA(N=1000) sim.obj1_updated <- update(sim.obj1, N=2000, IQ=0.8) sim.obj2 <- sim.LPA(I=6) sim.obj2_updated <- update(sim.obj2, I=8, mean.range=c(-2,2)) sim.obj3 <- sim.LTA(N=200, I=5, L=2, times=3) sim.obj3_updated <- update(sim.obj3, N=300, times=4, constraint="ER")library(LCPA) # --- Update LCA --- data <- sim.LCA(N=500, I=5, L=3) lca.obj <- LCA(data$response, L=3) lca.updated <- update(lca.obj, method="EM", nrep=5) # --- Update LPA --- data2 <- sim.LPA(N=300, I=4, L=2) lpa.obj <- LPA(data2$response, L=2, constraint="VE") lpa.updated <- update(lpa.obj, constraint="VV") # --- Update Simulation Objects --- sim.obj1 <- sim.LCA(N=1000) sim.obj1_updated <- update(sim.obj1, N=2000, IQ=0.8) sim.obj2 <- sim.LPA(I=6) sim.obj2_updated <- update(sim.obj2, I=8, mean.range=c(-2,2)) sim.obj3 <- sim.LTA(N=200, I=5, L=2, times=3) sim.obj3_updated <- update(sim.obj3, N=300, times=4, constraint="ER")