cSEM: predict – R documentation

Predict indicator scores

Description

Usage

predict(
 .object               = NULL,
 .benchmark            = c("lm", "unit", "PLS-PM", "GSCA", "PCA", "MAXVAR"),
 .cv_folds             = 10,
 .handle_inadmissibles = c("stop", "ignore", "set_NA"),
 .r                    = 10,
 .test_data            = NULL
 )

Arguments

`.object`	An R object of class cSEMResults resulting from a call to `csem()`.
`.benchmark`	Character string. The procedure to obtain benchmark predictions. One of "lm", "unit", "PLS-PM", "GSCA", "PCA", or "MAXVAR". Defaults to "lm".
`.cv_folds`	Integer. The number of cross-validation folds to use. Setting `.cv_folds` to `N` (the number of observations) produces leave-one-out cross-validation samples. Defaults to `10`.
`.handle_inadmissibles`	Character string. How should inadmissible results be treated? One of "stop", "ignore", or "set_NA". If "stop", `predict()` stops immediately if estimation yields an inadmissible result. If "ignore", all results are returned even if some or all of the estimates are inadmissible. If "set_NA", predictions based on inadmissible parameter estimates are set to `NA`. Defaults to "stop".
`.r`	Integer. The number of repetitions to use. Defaults to `10`.
`.test_data`	A matrix or data frame of test data with the same column names as the training data. Defaults to `NULL`.

Details

Predict the indicator scores of endogenous constructs.

predict() uses the procedure introduced by Shmueli et al. (2016) in the context of PLS, commonly called "PLSpredict" (Shmueli et al. 2019). It uses k-fold cross-validation to randomly split the data into training and test data and then predicts the relevant values in the test data based on the model parameter estimates obtained from the training data. The number of cross-validation folds is 10 by default but may be changed using the .cv_folds argument. By default, the procedure is repeated .r = 10 times to avoid irregularities due to a particular split. See Shmueli et al. (2019) for details.
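The repeated splitting scheme can be sketched in base R. This is only an illustration under stated assumptions: make_folds() is a hypothetical helper and not part of cSEM, the sample size is made up, and the estimation and prediction steps are left as a comment.

```r
# Hypothetical helper (not part of cSEM): randomly assign n rows to k folds
make_folds <- function(n, k = 10) {
  # Shuffle the row indices, then deal them into k roughly equal groups
  split(sample.int(n), rep_len(seq_len(k), n))
}

set.seed(123)
n <- 25  # made-up number of observations
k <- 5   # number of folds (plays the role of .cv_folds)
r <- 3   # number of repetitions (plays the role of .r)

# Count how often each observation ends up in a test fold
times_in_test <- integer(n)

for (i in seq_len(r)) {
  folds <- make_folds(n, k)
  for (test_idx in folds) {
    train_idx <- setdiff(seq_len(n), test_idx)
    # A real run would estimate the model on rows `train_idx` and
    # predict the indicator scores for rows `test_idx` here.
    times_in_test[test_idx] <- times_in_test[test_idx] + 1L
  }
}

# Each observation is used as test data exactly once per repetition
all(times_in_test == r)
```

Calling make_folds(n, k = n) yields n folds of size one, which corresponds to the leave-one-out case mentioned for .cv_folds.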

Alternatively, users may supply a matrix or a data frame of .test_data with the same column names as those in the data used to obtain .object (the training data). In this case, arguments .cv_folds and .r are ignored and predict uses the estimated coefficients from .object to predict the values in the columns of .test_data.
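Because the test data must share its column names with the training data, a quick check before calling predict() can catch mismatches early. The sketch below uses made-up data frames; the column names x1, x2, and y1 are hypothetical, and whether predict() itself requires a particular column order is not stated here, so the reordering step is merely a precaution.

```r
# Made-up training and test data with the same columns in a different order
train <- data.frame(x1 = 1:4, x2 = c(2, 4, 1, 3), y1 = c(1.5, 3, 2, 4))
test  <- data.frame(y1 = c(2, 3), x2 = c(1, 2), x1 = c(5, 6))

# Columns present in the training data but absent from the test data
missing_cols <- setdiff(colnames(train), colnames(test))
if (length(missing_cols) > 0) {
  stop("Test data lacks columns: ", paste(missing_cols, collapse = ", "))
}

# Optional precaution: bring the test columns into the training order
test <- test[, colnames(train), drop = FALSE]
```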

In Shmueli et al. (2016), PLS-based predictions for indicator i are compared to predictions based on a multiple regression of indicator i on all available exogenous indicators (.benchmark = "lm") and to a simple mean-based prediction; the latter comparison is summarized in the Q2_predict metric. predict() is more general in that it allows users to compare the predictions based on a so-called target model/specification to predictions based on an alternative benchmark. Available benchmarks include predictions based on a linear model, PLS-PM weights, unit weights (i.e. sum scores), GSCA weights, PCA weights, and MAXVAR weights.
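To make the comparison concrete, the following base R sketch computes an "lm"-style benchmark prediction and a Q2_predict value for a single indicator. The data are simulated and the names x1, x2, and y1 are hypothetical. The formula used for Q2_predict (one minus the ratio of squared prediction errors to the squared errors of the training-sample mean) is an assumption based on the description in Shmueli et al. (2019); it is not taken from the cSEM source.

```r
set.seed(1)
n <- 100

# Simulated data: two exogenous indicators and one endogenous indicator
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y1 <- 0.6 * dat$x1 + 0.3 * dat$x2 + rnorm(n, sd = 0.5)

train <- dat[1:80, ]
test  <- dat[81:100, ]

# Benchmark: regress the endogenous indicator on all exogenous indicators.
# stats::predict() is namespaced to avoid any clash with cSEM's predict()
# should that package be attached.
fit_lm  <- lm(y1 ~ x1 + x2, data = train)
pred_lm <- stats::predict(fit_lm, newdata = test)

# Naive prediction: the training-sample mean of the indicator
pred_mean <- mean(train$y1)

# Assumed definition: Q2_predict = 1 - SSE(model) / SSE(mean-based prediction)
sse_model  <- sum((test$y1 - pred_lm)^2)
sse_mean   <- sum((test$y1 - pred_mean)^2)
q2_predict <- 1 - sse_model / sse_mean
q2_predict
```

Under this definition, a positive value means the model predicts the held-out indicator scores better than the training mean does, and a negative value means it does worse.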

Each estimation run is checked for admissibility using verify(). By default, predict() stops with an error if the estimation yields inadmissible results ("stop"). Users may instead choose to "ignore" inadmissible results or to set the predictions to NA ("set_NA") for the particular run that failed.
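The three options can be pictured with a small base R sketch. handle_run() and its arguments are hypothetical and only mimic the behaviour described above; this is not the internal cSEM implementation.

```r
# Hypothetical helper: decide what to do with the predictions of one run
handle_run <- function(predictions, admissible,
                       .handle_inadmissibles = c("stop", "ignore", "set_NA")) {
  # Pick the first option ("stop") unless the caller chooses another one
  .handle_inadmissibles <- match.arg(.handle_inadmissibles)

  # Admissible runs are passed through unchanged
  if (admissible) {
    return(predictions)
  }

  switch(.handle_inadmissibles,
    stop   = stop("Estimation yielded an inadmissible result."),
    ignore = predictions,
    set_NA = {
      predictions[] <- NA  # keep the dimensions, blank out the values
      predictions
    }
  )
}

# Made-up predictions from a run that turned out to be inadmissible
p <- matrix(c(0.1, 0.2, 0.3, 0.4), nrow = 2)
handle_run(p, admissible = FALSE, .handle_inadmissibles = "set_NA")
```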

Value

An object of class cSEMPredict with print and plot methods. Technically, cSEMPredict is a named list containing the following list elements:

$Actual: A matrix of the actual values/indicator scores of the endogenous constructs.
$Predictions_target: A matrix of the predicted indicator scores of the endogenous constructs based on the target model. Target refers to the procedure used to estimate the parameters in .object.
$Residuals_target: A matrix of the residual indicator scores of the endogenous constructs based on the target model.
$Residuals_benchmark: A matrix of the residual indicator scores of the endogenous constructs based on a model estimated by the procedure given to .benchmark.
$Prediction_metrics: A data frame containing the prediction metrics MAE, RMSE, and Q2_predict.
$Information: A list with elements Target, Benchmark, Number_of_observations_training, Number_of_observations_test, Number_of_folds, Number_of_repetitions, and Handle_inadmissibles.
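As a rough guide to how the MAE and RMSE entries relate to the matrices of actual and predicted scores, the sketch below computes both per indicator in base R. The matrices are made up, the indicator names y1 and y2 are hypothetical, and the residual sign convention (actual minus predicted) is an assumption that does not affect either metric.

```r
# Made-up actual and predicted indicator scores (rows = observations)
actual <- matrix(c(3, 5, 4, 2, 1, 4), ncol = 2,
                 dimnames = list(NULL, c("y1", "y2")))
predicted <- matrix(c(2, 5, 6, 2, 2, 4), ncol = 2,
                    dimnames = list(NULL, c("y1", "y2")))

# Residuals, assuming the convention actual minus predicted
residuals <- actual - predicted

# Mean absolute error and root mean squared error per indicator
mae  <- colMeans(abs(residuals))
rmse <- sqrt(colMeans(residuals^2))

data.frame(Name = colnames(actual), MAE = mae, RMSE = rmse, row.names = NULL)
```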

References

Shmueli G, Ray S, Estrada JMV, Chatla SB (2016). “The Elephant in the Room: Predictive Performance of PLS Models.” Journal of Business Research, 69(10), 4552–4564. doi: 10.1016/j.jbusres.2016.03.049, https://doi.org/10.1016/j.jbusres.2016.03.049.

Shmueli G, Sarstedt M, Hair JF, Cheah J, Ting H, Vaithilingam S, Ringle CM (2019). “Predictive Model Assessment in PLS-SEM: Guidelines for Using PLSpredict.” European Journal of Marketing, 53(11), 2322–2347. doi: 10.1108/ejm-02-2019-0189, https://doi.org/10.1108/ejm-02-2019-0189.

Examples

### Anime example taken from https://github.com/ISS-Analytics/pls-predict

# Load data
data(Anime) # data is similar to the Anime.csv found on 
            # https://github.com/ISS-Analytics/pls-predict but with irrelevant
            # columns removed

# Split into training and test data the same way as it is done on 
# https://github.com/ISS-Analytics/pls-predict
set.seed(123)

index     <- sample.int(dim(Anime)[1], 83, replace = FALSE)
dat_train <- Anime[-index, ]
dat_test  <- Anime[index, ]

# Specify model
model <- "
# Structural model

ApproachAvoidance ~ PerceivedVisualComplexity + Arousal

# Measurement/composite model

ApproachAvoidance         =~ AA0 + AA1 + AA2 + AA3
PerceivedVisualComplexity <~ VX0 + VX1 + VX2 + VX3 + VX4
Arousal                   <~ Aro1 + Aro2 + Aro3 + Aro4
"

# Estimate (replicating the results of the `simplePLS()` function)
res <- csem(dat_train, 
            model, 
            .disattenuate = FALSE, # original PLS
            .iter_max = 300, 
            .tolerance = 1e-07, 
            .PLS_weight_scheme_inner = "factorial"
)

# Predict using a user-supplied test data set
pp <- predict(res, .test_data = dat_test)
pp$Predictions_target[1:6, ]
pp

### Compute prediction metrics  ------------------------------------------------
res2 <- csem(Anime, # whole data set
            model, 
            .disattenuate = FALSE, # original PLS
            .iter_max = 300, 
            .tolerance = 1e-07, 
            .PLS_weight_scheme_inner = "factorial"
)

# Predict using 10-fold cross-validation with 10 repetitions (the defaults)
## Not run: 
pp2 <- predict(res2, .benchmark = "lm")
pp2
## There is a plot method available
plot(pp2)
## End(Not run)

cSEM

Composite-Based Structural Equation Modeling

v0.4.0

GPL-3

Authors

Manuel E. Rademaker [aut, cre] (<https://orcid.org/0000-0002-8902-3561>), Florian Schuberth [aut] (<https://orcid.org/0000-0002-2110-9086>), Tamara Schamberger [ctb] (<https://orcid.org/0000-0002-7845-784X>), Michael Klesel [ctb] (<https://orcid.org/0000-0002-2884-1819>), Theo K. Dijkstra [ctb], Jörg Henseler [ctb] (<https://orcid.org/0000-0002-9736-3048>)

Initial release

2021-04-09
