Title:  Bayesian Applied Regression Modeling via Stan 

Description:  Estimates previously compiled regression models using the 'rstan' package, which provides the R interface to the Stan C++ library for Bayesian estimation. Users specify models via the customary R syntax with a formula and data.frame plus some additional arguments for priors. 
Authors:  Jonah Gabry [aut], Imad Ali [ctb], Sam Brilleman [ctb], Jacqueline Buros Novik [ctb] (R/stan_jm.R), AstraZeneca [ctb] (R/stan_jm.R), Eren Elci [ctb] (R/stan_surv.R), Trustees of Columbia University [cph], Simon Wood [cph] (R/stan_gamm4.R), R Core Deveopment Team [cph] (R/stan_aov.R), Douglas Bates [cph] (R/pp_data.R), Martin Maechler [cph] (R/pp_data.R), Ben Bolker [cph] (R/pp_data.R), Steve Walker [cph] (R/pp_data.R), Brian Ripley [cph] (R/stan_aov.R, R/stan_polr.R), William Venables [cph] (R/stan_polr.R), PaulChristian Burkner [cph] (R/misc.R), Ben Goodrich [cre, aut] 
Maintainer:  Ben Goodrich <[email protected]> 
License:  GPL (>=3) 
Version:  2.35.0.9000 
Built:  20241021 06:34:08 UTC 
Source:  https://github.com/standev/rstanarm 
Stan Development Team
The rstanarm package is an appendage to the rstan package that
enables many of the most common applied regression models to be estimated
using Markov Chain Monte Carlo, variational approximations to the posterior
distribution, or optimization. The rstanarm package allows these models
to be specified using the customary R modeling syntax (e.g., like that of
glm
with a formula
and a data.frame
).
The sections below provide an overview of the modeling functions and estimation algorithms used by rstanarm.
The set of models supported by rstanarm is large (and will continue to
grow), but also limited enough so that it is possible to integrate them
tightly with the pp_check
function for graphical posterior
predictive checks with bayesplot and the
posterior_predict
function to easily estimate the effect of
specific manipulations of predictor variables or to predict the outcome in a
training set.
The objects returned by the rstanarm modeling functions are called
stanreg
objects. In addition to all of the
typical methods
defined for fitted model
objects, stanreg objects can be passed to the loo
function
in the loo package for model comparison or to the
launch_shinystan
function in the shinystan
package in order to visualize the posterior distribution using the ShinyStan
graphical user interface. See the rstanarm vignettes for more details
about the entire process.
See priors help page and the vignette Prior Distributions for rstanarm Models for an overview of the various choices the user can make for prior distributions. The package vignettes for the modeling functions also provide examples of using many of the available priors as well as more detailed descriptions of some of the novel priors used by rstanarm.
The model estimating functions are described in greater detail in their individual help pages and vignettes. Here we provide a very brief overview:
stan_lm
, stan_aov
, stan_biglm
Similar to lm
or aov
but with
novel regularizing priors on the model parameters that are driven by prior
beliefs about $R^2$
, the proportion of variance in the outcome
attributable to the predictors in a linear model.
stan_glm
, stan_glm.nb
Similar to glm
but with various possible prior
distributions for the coefficients and, if applicable, a prior distribution
for any auxiliary parameter in a Generalized Linear Model (GLM) that is
characterized by a family
object (e.g. the shape
parameter in Gamma models). It is also possible to estimate a negative
binomial model in a similar way to the glm.nb
function
in the MASS package.
stan_glmer
, stan_glmer.nb
, stan_lmer
Similar to the glmer
, glmer.nb
and
lmer
functions in the lme4 package in that GLMs
are augmented to have groupspecific terms that deviate from the common
coefficients according to a meanzero multivariate normal distribution with
a highlystructured but unknown covariance matrix (for which rstanarm
introduces an innovative prior distribution). MCMC provides more
appropriate estimates of uncertainty for models that consist of a mix of
common and groupspecific parameters.
stan_mvmer
A multivariate form of stan_glmer
, whereby the user can
specify one or more submodels each consisting of a GLM with groupspecific
terms. If more than one submodel is specified (i.e. there is more than one
outcome variable) then a dependence is induced by assuming that the
groupspecific terms for each grouping factor are correlated across submodels.
stan_nlmer
Similar to nlmer
in the lme4 package for
nonlinear "mixedeffects" models, but the groupspecific coefficients
have flexible priors on their unknown covariance matrices.
stan_gamm4
Similar to gamm4
in the gamm4 package, which
augments a GLM (possibly with groupspecific terms) with nonlinear smooth
functions of the predictors to form a Generalized Additive Mixed Model
(GAMM). Rather than calling glmer
like
gamm4
does, stan_gamm4
essentially calls
stan_glmer
, which avoids the optimization issues that often
crop up with GAMMs and provides better estimates for the uncertainty of the
parameter estimates.
stan_polr
Similar to polr
in the MASS package in that it
models an ordinal response, but the Bayesian model also implies a prior
distribution on the unknown cutpoints. Can also be used to model binary
outcomes, possibly while estimating an unknown exponent governing the
probability of success.
stan_betareg
Similar to betareg
in that it models an outcome that
is a rate (proportion) but, rather than performing maximum likelihood
estimation, full Bayesian estimation is performed by default, with
customizable prior distributions for all parameters.
stan_clogit
Similar to clogit
in that it models an binary outcome
where the number of successes and failures is fixed within each stratum by
the research design. There are some minor syntactical differences relative
to clogit
that allow stan_clogit
to accept
groupspecific terms as in stan_glmer
.
stan_surv
Fits models to survival (i.e. timetoevent) data using either a hazard scale or accelerated failure time (AFT) formulation. The user can choose between either a flexible splinebased approximation for the baseline hazard or a standard parametric distributional assumption for the baseline hazard. Covariate effects can then be accommodated under proportional or nonproportional hazards assumptions (for models on the hazard scale) or using timefixed or timevarying acceleration factors (for models on the AFT scale).
stan_jm
Fits shared parameter joint models for longitudinal and survival (i.e. timetoevent) data. The joint model can be univariate (i.e. one longitudinal outcome) or multivariate (i.e. more than one longitudinal outcome). A variety of parameterisations are available for linking the longitudinal and event processes (i.e. a variety of association structures).
The modeling functions in the rstanarm package take an algorithm
argument that can be one of the following:
algorithm="sampling"
)Uses Markov Chain Monte Carlo (MCMC) — in particular, Hamiltonian Monte
Carlo (HMC) with a tuned but diagonal mass matrix — to draw from the
posterior distribution of the parameters. See sampling
(rstan) for more details. This is the slowest but most reliable of the
available estimation algorithms and it is the default and
recommended algorithm for statistical inference.
algorithm="meanfield"
)Uses meanfield variational inference to draw from an approximation to the
posterior distribution. In particular, this algorithm finds the set of
independent normal distributions in the unconstrained space that — when
transformed into the constrained space — most closely approximate the
posterior distribution. Then it draws repeatedly from these independent
normal distributions and transforms them into the constrained space. The
entire process is much faster than HMC and yields independent draws but
is not recommended for final statistical inference. It can be
useful to narrow the set of candidate models in large problems, particularly
when specifying QR=TRUE
in stan_glm
,
stan_glmer
, and stan_gamm4
, but is only
an approximation to the posterior distribution.
algorithm="fullrank"
)Uses fullrank variational inference to draw from an approximation to the posterior distribution by finding the multivariate normal distribution in the unconstrained space that — when transformed into the constrained space — most closely approximates the posterior distribution. Then it draws repeatedly from this multivariate normal distribution and transforms the draws into the constrained space. This process is slower than meanfield variational inference but is faster than HMC. Although still an approximation to the posterior distribution and thus not recommended for final statistical inference, the approximation is more realistic than that of meanfield variational inference because the parameters are not assumed to be independent in the unconstrained space. Nevertheless, fullrank variational inference is a more difficult optimization problem and the algorithm is more prone to nonconvergence or convergence to a local optimum.
algorithm="optimizing"
)Finds the posterior mode using a C++ implementation of the LBGFS algorithm.
See optimizing
for more details. If there is no prior
information, then this is equivalent to maximum likelihood, in which case
there is no great reason to use the functions in the rstanarm package
over the emulated functions in other packages. However, if priors are
specified, then the estimates are penalized maximum likelihood estimates,
which may have some redeeming value. Currently, optimization is only
supported for stan_glm
.
Maintainer: Ben Goodrich [email protected]
Authors:
Jonah Gabry [email protected]
Other contributors:
Imad Ali [contributor]
Sam Brilleman [contributor]
Jacqueline Buros Novik (R/stan_jm.R) [contributor]
AstraZeneca (R/stan_jm.R) [contributor]
Eren Elci (R/stan_surv.R) [contributor]
Trustees of Columbia University [copyright holder]
Simon Wood (R/stan_gamm4.R) [copyright holder]
R Core Deveopment Team (R/stan_aov.R) [copyright holder]
Douglas Bates (R/pp_data.R) [copyright holder]
Martin Maechler (R/pp_data.R) [copyright holder]
Ben Bolker (R/pp_data.R) [copyright holder]
Steve Walker (R/pp_data.R) [copyright holder]
Brian Ripley (R/stan_aov.R, R/stan_polr.R) [copyright holder]
William Venables (R/stan_polr.R) [copyright holder]
PaulChristian Burkner [email protected] (R/misc.R) [copyright holder]
Bates, D., Maechler, M., Bolker, B., and Walker, S. (2015). Fitting linear mixedEffects models using lme4. Journal of Statistical Software. 67(1), 1–48.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian Data Analysis. Chapman & Hall/CRC Press, London, third edition. https://stat.columbia.edu/~gelman/book/
Gelman, A. and Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, Cambridge, UK. https://stat.columbia.edu/~gelman/arm/
Stan Development Team. Stan Modeling Language Users Guide and Reference Manual. https://mcstan.org/users/documentation/.
Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leaveoneout crossvalidation and WAIC. Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s1122201696964. arXiv preprint: https://arxiv.org/abs/1507.04544
Yao, Y., Vehtari, A., Simpson, D., and Gelman, A. (2018) Using stacking to average Bayesian predictive distributions. Bayesian Analysis, advance publication, doi:10.1214/17BA1091.
Gabry, J. , Simpson, D. , Vehtari, A. , Betancourt, M. and Gelman, A. (2019), Visualization in Bayesian workflow. J. R. Stat. Soc. A, 182: 389402. doi:10.1111/rssa.12378, arXiv preprint, code on GitHub)
Muth, C., Oravecz, Z., and Gabry, J. (2018) Userfriendly Bayesian regression modeling: A tutorial with rstanarm and shinystan. The Quantitative Methods for Psychology. 14(2), 99–119. https://www.tqmp.org/RegularArticles/vol142/p099/p099.pdf
https://mcstan.org/ for more information on the Stan C++ package used by rstanarm for model fitting.
https://github.com/standev/rstanarm/issues/ to submit a bug report or feature request.
https://discourse.mcstan.org to ask a question about rstanarm on the Stanusers forum.
adapt_delta
: Target average acceptance probabilityDetails about the adapt_delta
argument to rstanarm's modeling
functions.
For the NoUTurn Sampler (NUTS), the variant of Hamiltonian Monte
Carlo used used by rstanarm, adapt_delta
is the target average
proposal acceptance probability during Stan's adaptation period.
adapt_delta
is ignored by rstanarm if the algorithm
argument
is not set to "sampling"
.
The default value of adapt_delta
is 0.95, except when the prior for
the regression coefficients is R2
, hs
, or
hs_plus
, in which case the default is 0.99.
These defaults are higher (more conservative) than the default of
adapt_delta=0.8
used in the rstan package, which may result in
slower sampling speeds but will be more robust to posterior distributions
with high curvature.
In general you should not need to change adapt_delta
unless you see
a warning message about divergent transitions, in which case you can
increase adapt_delta
from the default to a value closer to 1
(e.g. from 0.95 to 0.99, or from 0.99 to 0.999, etc). The step size used by
the numerical integrator is a function of adapt_delta
in that
increasing adapt_delta
will result in a smaller step size and fewer
divergences. Increasing adapt_delta
will typically result in a
slower sampler, but it will always lead to a more robust sampler.
Stan Development Team. Stan Modeling Language Users Guide and Reference Manual. https://mcstan.org/users/documentation/.
Brief Guide to Stan's Warnings: https://mcstan.org/misc/warnings.html#divergenttransitionsafterwarmup
For models fit using MCMC (algorithm="sampling"
), the posterior sample
—the postwarmup draws from the posterior distribution— can be extracted
from a fitted model object as a matrix, data frame, or array. The
as.matrix
and as.data.frame
methods merge all chains together,
whereas the as.array
method keeps the chains separate. For models fit
using optimization ("optimizing"
) or variational inference
("meanfield"
or "fullrank"
), there is no posterior sample but
rather a matrix (or data frame) of 1000 draws from either the asymptotic
multivariate Gaussian sampling distribution of the parameters or the
variational approximation to the posterior distribution.
## S3 method for class 'stanreg' as.matrix(x, ..., pars = NULL, regex_pars = NULL) ## S3 method for class 'stanreg' as.array(x, ..., pars = NULL, regex_pars = NULL) ## S3 method for class 'stanreg' as.data.frame(x, ..., pars = NULL, regex_pars = NULL)
## S3 method for class 'stanreg' as.matrix(x, ..., pars = NULL, regex_pars = NULL) ## S3 method for class 'stanreg' as.array(x, ..., pars = NULL, regex_pars = NULL) ## S3 method for class 'stanreg' as.data.frame(x, ..., pars = NULL, regex_pars = NULL)
x 
A fitted model object returned by one of the
rstanarm modeling functions. See 
... 
Ignored. 
pars 
An optional character vector of parameter names. 
regex_pars 
An optional character vector of regular
expressions to use for parameter selection. 
A matrix, data.frame, or array, the dimensions of which depend on
pars
and regex_pars
, as well as the model and estimation
algorithm (see the Description section above).
stanregdrawsformats
, stanregmethods
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) # Extract posterior sample after MCMC draws < as.matrix(example_model) print(dim(draws)) # For example, we can see that the median of the draws for the intercept # is the same as the point estimate rstanarm uses print(median(draws[, "(Intercept)"])) print(example_model$coefficients[["(Intercept)"]]) # The as.array method keeps the chains separate draws_array < as.array(example_model) print(dim(draws_array)) # iterations x chains x parameters # Extract draws from asymptotic Gaussian sampling distribution # after optimization fit < stan_glm(mpg ~ wt, data = mtcars, algorithm = "optimizing") draws < as.data.frame(fit) print(colnames(draws)) print(nrow(draws)) # 1000 draws are taken # Extract draws from variational approximation to the posterior distribution fit2 < update(fit, algorithm = "meanfield") draws < as.data.frame(fit2, pars = "wt") print(colnames(draws)) print(nrow(draws)) # 1000 draws are taken }
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) # Extract posterior sample after MCMC draws < as.matrix(example_model) print(dim(draws)) # For example, we can see that the median of the draws for the intercept # is the same as the point estimate rstanarm uses print(median(draws[, "(Intercept)"])) print(example_model$coefficients[["(Intercept)"]]) # The as.array method keeps the chains separate draws_array < as.array(example_model) print(dim(draws_array)) # iterations x chains x parameters # Extract draws from asymptotic Gaussian sampling distribution # after optimization fit < stan_glm(mpg ~ wt, data = mtcars, algorithm = "optimizing") draws < as.data.frame(fit) print(colnames(draws)) print(nrow(draws)) # 1000 draws are taken # Extract draws from variational approximation to the posterior distribution fit2 < update(fit, algorithm = "meanfield") draws < as.data.frame(fit2, pars = "wt") print(colnames(draws)) print(nrow(draws)) # 1000 draws are taken }
Estimation algorithms available for rstanarm models
The modeling functions in the rstanarm package take an algorithm
argument that can be one of the following:
algorithm="sampling"
)Uses Markov Chain Monte Carlo (MCMC) — in particular, Hamiltonian Monte
Carlo (HMC) with a tuned but diagonal mass matrix — to draw from the
posterior distribution of the parameters. See sampling
(rstan) for more details. This is the slowest but most reliable of the
available estimation algorithms and it is the default and
recommended algorithm for statistical inference.
algorithm="meanfield"
)Uses meanfield variational inference to draw from an approximation to the
posterior distribution. In particular, this algorithm finds the set of
independent normal distributions in the unconstrained space that — when
transformed into the constrained space — most closely approximate the
posterior distribution. Then it draws repeatedly from these independent
normal distributions and transforms them into the constrained space. The
entire process is much faster than HMC and yields independent draws but
is not recommended for final statistical inference. It can be
useful to narrow the set of candidate models in large problems, particularly
when specifying QR=TRUE
in stan_glm
,
stan_glmer
, and stan_gamm4
, but is only
an approximation to the posterior distribution.
algorithm="fullrank"
)Uses fullrank variational inference to draw from an approximation to the posterior distribution by finding the multivariate normal distribution in the unconstrained space that — when transformed into the constrained space — most closely approximates the posterior distribution. Then it draws repeatedly from this multivariate normal distribution and transforms the draws into the constrained space. This process is slower than meanfield variational inference but is faster than HMC. Although still an approximation to the posterior distribution and thus not recommended for final statistical inference, the approximation is more realistic than that of meanfield variational inference because the parameters are not assumed to be independent in the unconstrained space. Nevertheless, fullrank variational inference is a more difficult optimization problem and the algorithm is more prone to nonconvergence or convergence to a local optimum.
algorithm="optimizing"
)Finds the posterior mode using a C++ implementation of the LBGFS algorithm.
See optimizing
for more details. If there is no prior
information, then this is equivalent to maximum likelihood, in which case
there is no great reason to use the functions in the rstanarm package
over the emulated functions in other packages. However, if priors are
specified, then the estimates are penalized maximum likelihood estimates,
which may have some redeeming value. Currently, optimization is only
supported for stan_glm
.
Modeling functions available in rstanarm
The model estimating functions are described in greater detail in their individual help pages and vignettes. Here we provide a very brief overview:
stan_lm
, stan_aov
, stan_biglm
Similar to lm
or aov
but with
novel regularizing priors on the model parameters that are driven by prior
beliefs about $R^2$
, the proportion of variance in the outcome
attributable to the predictors in a linear model.
stan_glm
, stan_glm.nb
Similar to glm
but with various possible prior
distributions for the coefficients and, if applicable, a prior distribution
for any auxiliary parameter in a Generalized Linear Model (GLM) that is
characterized by a family
object (e.g. the shape
parameter in Gamma models). It is also possible to estimate a negative
binomial model in a similar way to the glm.nb
function
in the MASS package.
stan_glmer
, stan_glmer.nb
, stan_lmer
Similar to the glmer
, glmer.nb
and
lmer
functions in the lme4 package in that GLMs
are augmented to have groupspecific terms that deviate from the common
coefficients according to a meanzero multivariate normal distribution with
a highlystructured but unknown covariance matrix (for which rstanarm
introduces an innovative prior distribution). MCMC provides more
appropriate estimates of uncertainty for models that consist of a mix of
common and groupspecific parameters.
stan_mvmer
A multivariate form of stan_glmer
, whereby the user can
specify one or more submodels each consisting of a GLM with groupspecific
terms. If more than one submodel is specified (i.e. there is more than one
outcome variable) then a dependence is induced by assuming that the
groupspecific terms for each grouping factor are correlated across submodels.
stan_nlmer
Similar to nlmer
in the lme4 package for
nonlinear "mixedeffects" models, but the groupspecific coefficients
have flexible priors on their unknown covariance matrices.
stan_gamm4
Similar to gamm4
in the gamm4 package, which
augments a GLM (possibly with groupspecific terms) with nonlinear smooth
functions of the predictors to form a Generalized Additive Mixed Model
(GAMM). Rather than calling glmer
like
gamm4
does, stan_gamm4
essentially calls
stan_glmer
, which avoids the optimization issues that often
crop up with GAMMs and provides better estimates for the uncertainty of the
parameter estimates.
stan_polr
Similar to polr
in the MASS package in that it
models an ordinal response, but the Bayesian model also implies a prior
distribution on the unknown cutpoints. Can also be used to model binary
outcomes, possibly while estimating an unknown exponent governing the
probability of success.
stan_betareg
Similar to betareg
in that it models an outcome that
is a rate (proportion) but, rather than performing maximum likelihood
estimation, full Bayesian estimation is performed by default, with
customizable prior distributions for all parameters.
stan_clogit
Similar to clogit
in that it models an binary outcome
where the number of successes and failures is fixed within each stratum by
the research design. There are some minor syntactical differences relative
to clogit
that allow stan_clogit
to accept
groupspecific terms as in stan_glmer
.
stan_surv
Fits models to survival (i.e. timetoevent) data using either a hazard scale or accelerated failure time (AFT) formulation. The user can choose between either a flexible splinebased approximation for the baseline hazard or a standard parametric distributional assumption for the baseline hazard. Covariate effects can then be accommodated under proportional or nonproportional hazards assumptions (for models on the hazard scale) or using timefixed or timevarying acceleration factors (for models on the AFT scale).
stan_jm
Fits shared parameter joint models for longitudinal and survival (i.e. timetoevent) data. The joint model can be univariate (i.e. one longitudinal outcome) or multivariate (i.e. more than one longitudinal outcome). A variety of parameterisations are available for linking the longitudinal and event processes (i.e. a variety of association structures).
Compute a Bayesian version of Rsquared or LOOadjusted Rsquared for regression models.
## S3 method for class 'stanreg' bayes_R2(object, ..., re.form = NULL) ## S3 method for class 'stanreg' loo_R2(object, ...)
## S3 method for class 'stanreg' bayes_R2(object, ..., re.form = NULL) ## S3 method for class 'stanreg' loo_R2(object, ...)
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
... 
Currently ignored. 
re.form 
For models with grouplevel terms, 
A vector of Rsquared values with length equal to the posterior sample size (the posterior distribution of Rsquared).
Andrew Gelman, Ben Goodrich, Jonah Gabry, and Aki Vehtari (2018). Rsquared for Bayesian regression models. The American Statistician, to appear. doi:10.1080/00031305.2018.1549100 (Preprint, Notebook)
if (.Platform$OS.type != "windows") { fit < stan_glm( mpg ~ wt + cyl, data = mtcars, QR = TRUE, chains = 2, refresh = 0 ) rsq < bayes_R2(fit) print(median(rsq)) hist(rsq) loo_rsq < loo_R2(fit) print(median(loo_rsq)) # multilevel binomial model if (!exists("example_model")) example(example_model) print(example_model) median(bayes_R2(example_model)) median(bayes_R2(example_model, re.form = NA)) # exclude grouplevel }
if (.Platform$OS.type != "windows") { fit < stan_glm( mpg ~ wt + cyl, data = mtcars, QR = TRUE, chains = 2, refresh = 0 ) rsq < bayes_R2(fit) print(median(rsq)) hist(rsq) loo_rsq < loo_R2(fit) print(median(loo_rsq)) # multilevel binomial model if (!exists("example_model")) example(example_model) print(example_model) median(bayes_R2(example_model)) median(bayes_R2(example_model, re.form = NA)) # exclude grouplevel }
Collapse the linear predictor across the lower level units clustered an individual, using the function specified in the 'grp_assoc' argument
collapse_within_groups(eta, grp_idx, grp_assoc = "sum")
collapse_within_groups(eta, grp_idx, grp_assoc = "sum")
eta 
The linear predictor evaluated for all lower level groups at the quadrature points. 
grp_idx 
An N*2 array providing the indices of the first (col 1) and last (col 2) observations in eta that correspond to individuals i = 1,...,N. 
grp_assoc 
Character string, the function to use to collapse across the lower level units clustered within individuals. 
A vector or matrix, depending on the method called.
Collapse the linear predictor across the lower level units clustered an individual, using the function specified in the 'grp_assoc' argument
## Default S3 method: collapse_within_groups(eta, grp_idx, grp_assoc)
## Default S3 method: collapse_within_groups(eta, grp_idx, grp_assoc)
eta 
The linear predictor evaluated for all lower level groups at the quadrature points. 
grp_idx 
An N*2 array providing the indices of the first (col 1) and last (col 2) observations in eta that correspond to individuals i = 1,...,N. 
grp_assoc 
Character string, the function to use to collapse across the lower level units clustered within individuals. 
A vector or matrix, depending on the method called.
Collapse the linear predictor across the lower level units clustered an individual, using the function specified in the 'grp_assoc' argument
## S3 method for class 'matrix' collapse_within_groups(eta, grp_idx, grp_assoc)
## S3 method for class 'matrix' collapse_within_groups(eta, grp_idx, grp_assoc)
eta 
The linear predictor evaluated for all lower level groups at the quadrature points. 
grp_idx 
An N*2 array providing the indices of the first (col 1) and last (col 2) observations in eta that correspond to individuals i = 1,...,N. 
grp_assoc 
Character string, the function to use to collapse across the lower level units clustered within individuals. 
A vector or matrix, depending on the method called.
Evaluate the log baseline hazard at the specified times given the vector or matrix of MCMC draws for the baseline hazard coeffients / parameters
evaluate_log_survival(log_haz, qnodes, qwts)
evaluate_log_survival(log_haz, qnodes, qwts)
log_haz 
A vector containing the log hazard for each individual, evaluated at each of the quadrature points. The vector should be ordered such that the first N elements contain the log_haz evaluated for each individual at quadrature point 1, then the next N elements are the log_haz evaluated for each individual at quadrature point 2, and so on. 
qnodes 
Integer specifying the number of quadrature nodes at which the log hazard was evaluated for each individual. 
qwts 
A vector of unstandardised GK quadrature weights. 
A vector or matrix of log survival probabilities.
Evaluate the log baseline hazard at the specified times given the vector or matrix of MCMC draws for the baseline hazard coeffients / parameters
## Default S3 method: evaluate_log_survival(log_haz, qnodes, qwts)
## Default S3 method: evaluate_log_survival(log_haz, qnodes, qwts)
log_haz 
A vector containing the log hazard for each individual, evaluated at each of the quadrature points. The vector should be ordered such that the first N elements contain the log_haz evaluated for each individual at quadrature point 1, then the next N elements are the log_haz evaluated for each individual at quadrature point 2, and so on. 
qnodes 
Integer specifying the number of quadrature nodes at which the log hazard was evaluated for each individual. 
qwts 
A vector of unstandardised GK quadrature weights. 
A vector or matrix of log survival probabilities.
Evaluate the log baseline hazard at the specified times given the vector or matrix of MCMC draws for the baseline hazard coeffients / parameters
## S3 method for class 'matrix' evaluate_log_survival(log_haz, qnodes, qwts)
## S3 method for class 'matrix' evaluate_log_survival(log_haz, qnodes, qwts)
log_haz 
A vector containing the log hazard for each individual, evaluated at each of the quadrature points. The vector should be ordered such that the first N elements contain the log_haz evaluated for each individual at quadrature point 1, then the next N elements are the log_haz evaluated for each individual at quadrature point 2, and so on. 
qnodes 
Integer specifying the number of quadrature nodes at which the log hazard was evaluated for each individual. 
qwts 
A vector of unstandardised GK quadrature weights. 
A vector or matrix of log survival probabilities.
A model for use in the rstanarm examples related to stan_jm
.
Calling example("example_jm")
will run the model in the
Examples section, below, and the resulting stanmvreg object will then be
available in the global environment. The chains
and iter
arguments are specified to make this example be small in size. In practice,
we recommend that they be left unspecified in order to use the default
values or increased if there are convergence problems. The cores
argument is optional and on a multicore system, the user may well want
to set that equal to the number of chains being executed.
# set.seed(123) if (.Platform$OS.type != "windows"  .Platform$r_arch !="i386") example_jm < stan_jm(formulaLong = logBili ~ year + (1  id), dataLong = pbcLong[1:101,], formulaEvent = survival::Surv(futimeYears, death) ~ sex + trt, dataEvent = pbcSurv[1:15,], time_var = "year", # this next line is only to keep the example small in size! chains = 1, seed = 12345, iter = 100, refresh = 0)
# set.seed(123) if (.Platform$OS.type != "windows"  .Platform$r_arch !="i386") example_jm < stan_jm(formulaLong = logBili ~ year + (1  id), dataLong = pbcLong[1:101,], formulaEvent = survival::Surv(futimeYears, death) ~ sex + trt, dataEvent = pbcSurv[1:15,], time_var = "year", # this next line is only to keep the example small in size! chains = 1, seed = 12345, iter = 100, refresh = 0)
A model for use in rstanarm examples.
Calling example("example_model")
will run the model in the
Examples section, below, and the resulting stanreg object will then be
available in the global environment. The chains
and iter
arguments are specified to make this example be small in size. In practice,
we recommend that they be left unspecified in order to use the default
values (4 and 2000 respectively) or increased if there are convergence
problems. The cores
argument is optional and on a multicore system,
the user may well want to set that equal to the number of chains being
executed.
cbpp
for a description of the data.
if (.Platform$OS.type != "windows") { example_model < stan_glmer(cbind(incidence, size  incidence) ~ size + period + (1herd), data = lme4::cbpp, family = binomial, QR = TRUE, # this next line is only to keep the example small in size! chains = 2, cores = 1, seed = 12345, iter = 1000, refresh = 0) example_model }
if (.Platform$OS.type != "windows") { example_model < stan_glmer(cbind(incidence, size  incidence) ~ size + period + (1herd), data = lme4::cbpp, family = binomial, QR = TRUE, # this next line is only to keep the example small in size! chains = 2, cores = 1, seed = 12345, iter = 1000, refresh = 0) example_model }
Extract parameters from stanmat and return as a list
extract_pars(object, ...)
extract_pars(object, ...)
object 
A stanmvreg or stansurv object 
... 
Additional arguments passed to the method 
A named list
Extract parameters from stanmat and return as a list
## S3 method for class 'stanmvreg' extract_pars(object, stanmat = NULL, means = FALSE, ...)
## S3 method for class 'stanmvreg' extract_pars(object, stanmat = NULL, means = FALSE, ...)
object 
A stanmvreg or stansurv object 
stanmat 
A matrix of posterior draws, may be provided if the desired stanmat is only a subset of the draws from as.matrix(object$stanfit) 
means 
Logical, if TRUE then the posterior means are returned 
... 
Additional arguments passed to the method 
A named list
Extract parameters from stanmat and return as a list
## S3 method for class 'stansurv' extract_pars(object, stanmat = NULL, means = FALSE, ...)
## S3 method for class 'stansurv' extract_pars(object, stanmat = NULL, means = FALSE, ...)
object 
A stanmvreg or stansurv object 
stanmat 
A matrix of posterior draws, may be provided if the desired stanmat is only a subset of the draws from as.matrix(object$stanfit) 
means 
Logical, if TRUE then the posterior means are returned 
... 
Additional arguments passed to the method 
A named list
It is necessary to drop unneeded variables though so that errors are not encountered if the original data contained NA values for variables unrelated to the model formula. We generate a data frame here for insample predictions rather than using a model frame, since some quantities will need to be recalculated at quadrature points etc, for example in posterior_survfit.
get_model_data(object, ...)
get_model_data(object, ...)
object 
A stansurv, stanmvreg or stanjm object. 
... 
Additional arguments passed to the method 
A data frame or list of data frames with all the (unevaluated) variables required for predictions.
It is necessary to drop unneeded variables though so that errors are not encountered if the original data contained NA values for variables unrelated to the model formula. We generate a data frame here for insample predictions rather than using a model frame, since some quantities will need to be recalculated at quadrature points etc, for example in posterior_survfit.
## S3 method for class 'stanmvreg' get_model_data(object, m = NULL, ...)
## S3 method for class 'stanmvreg' get_model_data(object, m = NULL, ...)
object 
A stanmvreg. 
m 
Integer specifying which submodel to get the prediction data frame for (for stanmvreg or stanjm objects). 
... 
Additional arguments passed to the method 
A data frame or list of data frames with all the (unevaluated) variables required for predictions.
It is necessary to drop unneeded variables though so that errors are not encountered if the original data contained NA values for variables unrelated to the model formula. We generate a data frame here for insample predictions rather than using a model frame, since some quantities will need to be recalculated at quadrature points etc, for example in posterior_survfit.
## S3 method for class 'stansurv' get_model_data(object, ...)
## S3 method for class 'stansurv' get_model_data(object, ...)
object 
A stansurv object. 
... 
Additional arguments passed to the method 
A data frame or list of data frames with all the (unevaluated) variables required for predictions.
The kfold
method performs exact $K$
fold crossvalidation. First
the data are randomly partitioned into $K$
subsets of equal size (or as close
to equal as possible), or the user can specify the folds
argument
to determine the partitioning. Then the model is refit $K$
times, each time
leaving out one of the $K$
subsets. If $K$
is equal to the total
number of observations in the data then $K$
fold crossvalidation is
equivalent to exact leaveoneout crossvalidation (to which
loo
is an efficient approximation).
## S3 method for class 'stanreg' kfold( x, K = 10, ..., folds = NULL, save_fits = FALSE, cores = getOption("mc.cores", 1) )
## S3 method for class 'stanreg' kfold( x, K = 10, ..., folds = NULL, save_fits = FALSE, cores = getOption("mc.cores", 1) )
x 
A fitted model object returned by one of the rstanarm modeling functions. See stanregobjects. 
K 
For 
... 
Currently ignored. 
folds 
For 
save_fits 
For 
cores 
The number of cores to use for parallelization. Instead fitting
separate Markov chains for the same model on different cores, by default

An object with classes 'kfold' and 'loo' that has a similar structure
as the objects returned by the loo
and waic
methods and is compatible with the loo_compare
function for
comparing models.
Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leaveoneout crossvalidation and WAIC. Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s1122201696964. arXiv preprint: https://arxiv.org/abs/1507.04544
Yao, Y., Vehtari, A., Simpson, D., and Gelman, A. (2018) Using stacking to average Bayesian predictive distributions. Bayesian Analysis, advance publication, doi:10.1214/17BA1091.
if (.Platform$OS.type != "windows") { fit1 < stan_glm(mpg ~ wt, data = mtcars, refresh = 0) fit2 < stan_glm(mpg ~ wt + cyl, data = mtcars, refresh = 0) fit3 < stan_glm(mpg ~ disp * as.factor(cyl), data = mtcars, refresh = 0) # 10fold crossvalidation # (if possible also specify the 'cores' argument to use multiple cores) (kfold1 < kfold(fit1, K = 10)) kfold2 < kfold(fit2, K = 10) kfold3 < kfold(fit3, K = 10) loo_compare(kfold1, kfold2, kfold3) # stratifying by a grouping variable # (note: might get some divergences warnings with this model but # this is just intended as a quick example of how to code this) fit4 < stan_lmer(mpg ~ disp + (1cyl), data = mtcars, refresh = 0) table(mtcars$cyl) folds_cyl < loo::kfold_split_stratified(K = 3, x = mtcars$cyl) table(cyl = mtcars$cyl, fold = folds_cyl) kfold4 < kfold(fit4, folds = folds_cyl, cores = 2) print(kfold4) } # Example code demonstrating the different ways to specify the number # of cores and how the cores are used # # options(mc.cores = NULL) # # # spread the K models over N_CORES cores (method 1) # kfold(fit, K, cores = N_CORES) # # # spread the K models over N_CORES cores (method 2) # options(mc.cores = N_CORES) # kfold(fit, K) # # # fit K models sequentially using N_CORES cores for the Markov chains each time # options(mc.cores = N_CORES) # kfold(fit, K, cores = 1)
if (.Platform$OS.type != "windows") { fit1 < stan_glm(mpg ~ wt, data = mtcars, refresh = 0) fit2 < stan_glm(mpg ~ wt + cyl, data = mtcars, refresh = 0) fit3 < stan_glm(mpg ~ disp * as.factor(cyl), data = mtcars, refresh = 0) # 10fold crossvalidation # (if possible also specify the 'cores' argument to use multiple cores) (kfold1 < kfold(fit1, K = 10)) kfold2 < kfold(fit2, K = 10) kfold3 < kfold(fit3, K = 10) loo_compare(kfold1, kfold2, kfold3) # stratifying by a grouping variable # (note: might get some divergences warnings with this model but # this is just intended as a quick example of how to code this) fit4 < stan_lmer(mpg ~ disp + (1cyl), data = mtcars, refresh = 0) table(mtcars$cyl) folds_cyl < loo::kfold_split_stratified(K = 3, x = mtcars$cyl) table(cyl = mtcars$cyl, fold = folds_cyl) kfold4 < kfold(fit4, folds = folds_cyl, cores = 2) print(kfold4) } # Example code demonstrating the different ways to specify the number # of cores and how the cores are used # # options(mc.cores = NULL) # # # spread the K models over N_CORES cores (method 1) # kfold(fit, K, cores = N_CORES) # # # spread the K models over N_CORES cores (method 2) # options(mc.cores = N_CORES) # kfold(fit, K) # # # fit K models sequentially using N_CORES cores for the Markov chains each time # options(mc.cores = N_CORES) # kfold(fit, K, cores = 1)
The ShinyStan interface provides visual and numerical summaries of model parameters and convergence diagnostics.
## S3 method for class 'stanreg' launch_shinystan( object, ppd = TRUE, seed = 1234, model_name = NULL, note = NULL, rstudio = getOption("shinystan.rstudio"), ... )
## S3 method for class 'stanreg' launch_shinystan( object, ppd = TRUE, seed = 1234, model_name = NULL, note = NULL, rstudio = getOption("shinystan.rstudio"), ... )
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
ppd 
Should rstanarm draw from the posterior predictive
distribution before launching ShinyStan? The default is 
seed 
Passed to pp_check if

model_name , note

Optional arguments passed to

rstudio 
Only relevant for 'RStudio' users. The default ( 
... 
Optional arguments passed to 
The launch_shinystan
function will accept a
stanreg
object as input. Currently, almost
any model fit using one of rstanarm's modelfitting functions can be
used with ShinyStan. The only exception is that ShinyStan does not
currently support rstanarm models fit using
algorithm='optimizing'
. See the
shinystan package documentation for more
information.
For some rstanarm models ShinyStan may take a very long time to launch.
If this is the case with one of your models you may be able to speed up
launch_shinystan
in one of several ways:
When used with a stanreg
object
(rstanarm model object) ShinyStan will draw from the posterior
predictive distribution and prepare graphical posterior predictive checks
before launching. That way when you go to the PPcheck page the plots are
immediately available. This can be time consuming for models fit to very
large datasets and you can prevent this behavior by creating a shinystan
object before calling launch_shinystan
. To do this use
as.shinystan
with optional argument ppd
set
to FALSE
(see the Examples section below). When you then launch
ShinyStan and go to the PPcheck page the plots will no longer be
automatically generated and you will be presented with the standard
interface requiring you to first specify the appropriate $y$
and
$yrep$
, which can be done for many but not all rstanarm models.
Even if you don't want to prevent ShinyStan from preparing graphical
posterior predictive checks, first creating a shinystan object using
as.shinystan
can reduce future launch
times. That is, launch_shinystan(sso)
will be faster than
launch_shinystan(fit)
, where sso
is a shinystan object and
fit
is a stanreg object. It still may take some time for
as.shinystan
to create sso
initially, but each time you
subsequently call launch_shinystan(sso)
it will reuse sso
instead of internally creating a shinystan object every time. See the
Examples section below.
Gabry, J. , Simpson, D. , Vehtari, A. , Betancourt, M. and Gelman, A. (2019), Visualization in Bayesian workflow. J. R. Stat. Soc. A, 182: 389402. doi:10.1111/rssa.12378, arXiv preprint, code on GitHub)
Muth, C., Oravecz, Z., and Gabry, J. (2018) Userfriendly Bayesian regression modeling: A tutorial with rstanarm and shinystan. The Quantitative Methods for Psychology. 14(2), 99–119. https://www.tqmp.org/RegularArticles/vol142/p099/p099.pdf
if (.Platform$OS.type != "windows") { ## Not run: if (!exists("example_model")) example(example_model) # Launch the ShinyStan app without saving the resulting shinystan object if (interactive()) launch_shinystan(example_model) # Launch the ShinyStan app (saving resulting shinystan object as sso) if (interactive()) sso < launch_shinystan(example_model) # First create shinystan object then call launch_shinystan sso < shinystan::as.shinystan(example_model) if (interactive()) launch_shinystan(sso) # Prevent ShinyStan from preparing graphical posterior predictive checks that # can be time consuming. example_model is small enough that it won't matter # much here but in general this can help speed up launch_shinystan sso < shinystan::as.shinystan(example_model, ppd = FALSE) if (interactive()) launch_shinystan(sso) ## End(Not run) }
if (.Platform$OS.type != "windows") { ## Not run: if (!exists("example_model")) example(example_model) # Launch the ShinyStan app without saving the resulting shinystan object if (interactive()) launch_shinystan(example_model) # Launch the ShinyStan app (saving resulting shinystan object as sso) if (interactive()) sso < launch_shinystan(example_model) # First create shinystan object then call launch_shinystan sso < shinystan::as.shinystan(example_model) if (interactive()) launch_shinystan(sso) # Prevent ShinyStan from preparing graphical posterior predictive checks that # can be time consuming. example_model is small enough that it won't matter # much here but in general this can help speed up launch_shinystan sso < shinystan::as.shinystan(example_model, ppd = FALSE) if (interactive()) launch_shinystan(sso) ## End(Not run) }
Make linear predictor vector from x and point estimates for beta, or linear predictor matrix from x and full posterior sample of beta.
linear_predictor(beta, x, offset = NULL)
linear_predictor(beta, x, offset = NULL)
beta 
A vector or matrix or parameter estimates. 
x 
Predictor matrix. 
offset 
Optional offset vector. 
A vector or matrix.
Make linear predictor vector from x and point estimates for beta, or linear predictor matrix from x and full posterior sample of beta.
## Default S3 method: linear_predictor(beta, x, offset = NULL)
## Default S3 method: linear_predictor(beta, x, offset = NULL)
beta 
A vector or matrix or parameter estimates. 
x 
Predictor matrix. 
offset 
Optional offset vector. 
A vector or matrix.
Make linear predictor vector from x and point estimates for beta, or linear predictor matrix from x and full posterior sample of beta.
## S3 method for class 'matrix' linear_predictor(beta, x, offset = NULL)
## S3 method for class 'matrix' linear_predictor(beta, x, offset = NULL)
beta 
A vector or matrix or parameter estimates. 
x 
Predictor matrix. 
offset 
Optional offset vector. 
A vector or matrix.
Get inverse link function
linkinv(x, ...)
linkinv(x, ...)
x 
A stanreg object, family object, or string. 
... 
Other arguments passed to methods. For a 
The inverse link function associated with x.
Get inverse link function
## S3 method for class 'character' linkinv(x, ...)
## S3 method for class 'character' linkinv(x, ...)
x 
A character string 
... 
Other arguments passed to methods. For a 
The inverse link function associated with x.
Get inverse link function
## S3 method for class 'family' linkinv(x, ...)
## S3 method for class 'family' linkinv(x, ...)
x 
A family object 
... 
Other arguments passed to methods. For a 
The inverse link function associated with x.
Get inverse link function
## S3 method for class 'stanmvreg' linkinv(x, m = NULL, ...)
## S3 method for class 'stanmvreg' linkinv(x, m = NULL, ...)
x 
A stanmvreg object 
m 
An integer specifying the submodel. 
... 
Other arguments passed to methods. For a 
The inverse link function associated with x.
Get inverse link function
## S3 method for class 'stanreg' linkinv(x, ...)
## S3 method for class 'stanreg' linkinv(x, ...)
x 
A stanreg object 
... 
Other arguments passed to methods. For a 
The inverse link function associated with x.
get arguments needed for ll_fun
ll_args(object, ...)
ll_args(object, ...)
object 
stanreg object 
... 
For models without groupspecific terms (i.e., not stan_[g]lmer), if reloo_or_kfold is TRUE and 'newdata' is specified then ... is used to pass 'newx' and 'stanmat' from reloo or kfold (bypassing pp_data). This is a workaround in case there are issues with newdata containing factors with only a single level. Or for stanmvreg objects, then ... can be used to pass 'stanmat', which may be a matrix with a reduced number of draws (potentially just a single MCMC draw). 
a named list with elements data, draws, S (posterior sample size) and N = number of observations
Alternative ll_args method for stanjm objects that allows data and pars to be passed directly, rather than constructed using pp_data within the ll_args method. This can be much faster when used in the MH algorithm within posterior_survfit, since it doesn't require repeated calls to pp_data.
## S3 method for class 'stanjm' ll_args(object, data, pars, m = 1, reloo_or_kfold = FALSE, ...)
## S3 method for class 'stanjm' ll_args(object, data, pars, m = 1, reloo_or_kfold = FALSE, ...)
object 
A stanmvreg object 
data 
Output from .pp_data_jm 
pars 
Output from extract_pars 
m 
Integer specifying which submodel 
reloo_or_kfold 
logical. TRUE if ll_args is for reloo or kfold 
... 
Additional arguments passed to the method 
get arguments needed for ll_fun
## S3 method for class 'stanreg' ll_args( object, newdata = NULL, offset = NULL, m = NULL, reloo_or_kfold = FALSE, ... )
## S3 method for class 'stanreg' ll_args( object, newdata = NULL, offset = NULL, m = NULL, reloo_or_kfold = FALSE, ... )
object 
stanreg object 
newdata 
same as posterior predict 
offset 
vector of offsets (only required if model has offset term and 
m 
Integer specifying which submodel for stanmvreg objects 
reloo_or_kfold 
logical. TRUE if ll_args is for reloo or kfold 
... 
For models without groupspecific terms (i.e., not stan_[g]lmer), if reloo_or_kfold is TRUE and 'newdata' is specified then ... is used to pass 'newx' and 'stanmat' from reloo or kfold (bypassing pp_data). This is a workaround in case there are issues with newdata containing factors with only a single level. Or for stanmvreg objects, then ... can be used to pass 'stanmat', which may be a matrix with a reduced number of draws (potentially just a single MCMC draw). 
a named list with elements data, draws, S (posterior sample size) and N = number of observations
get arguments needed for ll_fun
## S3 method for class 'stansurv' ll_args(object, newdata = NULL, ...)
## S3 method for class 'stansurv' ll_args(object, newdata = NULL, ...)
object 
stansurv object 
newdata 
same as posterior predict 
... 
For models without groupspecific terms (i.e., not stan_[g]lmer), if reloo_or_kfold is TRUE and 'newdata' is specified then ... is used to pass 'newx' and 'stanmat' from reloo or kfold (bypassing pp_data). This is a workaround in case there are issues with newdata containing factors with only a single level. Or for stanmvreg objects, then ... can be used to pass 'stanmat', which may be a matrix with a reduced number of draws (potentially just a single MCMC draw). 
a named list with elements data, draws, S (posterior sample size) and N = number of observations
For models fit using MCMC only, the log_lik
method returns the
$S$
by $N$
pointwise loglikelihood matrix, where $S$
is the size
of the posterior sample and $N$
is the number of data points, or in the
case of the stanmvreg
method (when called on stan_jm
model objects) an $S$
by $Npat$
matrix where $Npat$
is the number
of individuals.
## S3 method for class 'stanreg' log_lik(object, newdata = NULL, offset = NULL, ...) ## S3 method for class 'stanmvreg' log_lik(object, m = 1, newdata = NULL, ...) ## S3 method for class 'stanjm' log_lik(object, newdataLong = NULL, newdataEvent = NULL, ...)
## S3 method for class 'stanreg' log_lik(object, newdata = NULL, offset = NULL, ...) ## S3 method for class 'stanmvreg' log_lik(object, m = 1, newdata = NULL, ...) ## S3 method for class 'stanjm' log_lik(object, newdataLong = NULL, newdataEvent = NULL, ...)
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
newdata 
An optional data frame of new data (e.g. holdout data) to use
when evaluating the loglikelihood. See the description of 
offset 
A vector of offsets. Only required if 
... 
Currently ignored. 
m 
Integer specifying the number or name of the submodel 
newdataLong , newdataEvent

Optional data frames containing new data
(e.g. holdout data) to use when evaluating the loglikelihood for a
model estimated using 
For the stanreg
and stanmvreg
methods an $S$
by
$N$
matrix, where $S$
is the size of the posterior sample and
$N$
is the number of data points. For the stanjm
method
an $S$
by $Npat$
matrix where $Npat$
is the number of individuals.
if (.Platform$OS.type != "windows") { roaches$roach100 < roaches$roach1 / 100 fit < stan_glm( y ~ roach100 + treatment + senior, offset = log(exposure2), data = roaches, family = poisson(link = "log"), prior = normal(0, 2.5), prior_intercept = normal(0, 10), iter = 500, # just to speed up example, refresh = 0 ) ll < log_lik(fit) dim(ll) all.equal(ncol(ll), nobs(fit)) # using newdata argument nd < roaches[1:2, ] nd$treatment[1:2] < c(0, 1) ll2 < log_lik(fit, newdata = nd, offset = c(0, 0)) head(ll2) dim(ll2) all.equal(ncol(ll2), nrow(nd)) }
if (.Platform$OS.type != "windows") { roaches$roach100 < roaches$roach1 / 100 fit < stan_glm( y ~ roach100 + treatment + senior, offset = log(exposure2), data = roaches, family = poisson(link = "log"), prior = normal(0, 2.5), prior_intercept = normal(0, 10), iter = 500, # just to speed up example, refresh = 0 ) ll < log_lik(fit) dim(ll) all.equal(ncol(ll), nobs(fit)) # using newdata argument nd < roaches[1:2, ] nd$treatment[1:2] < c(0, 1) ll2 < log_lik(fit, newdata = nd, offset = c(0, 0)) head(ll2) dim(ll2) all.equal(ncol(ll2), nrow(nd)) }
Logit and inverse logit
logit(x) invlogit(x)
logit(x) invlogit(x)
x 
Numeric vector. 
A numeric vector the same length as x
.
These functions are wrappers around the E_loo
function
(loo package) that provide compatibility for rstanarm models.
## S3 method for class 'stanreg' loo_predict( object, type = c("mean", "var", "quantile"), probs = 0.5, ..., psis_object = NULL ) ## S3 method for class 'stanreg' loo_linpred( object, type = c("mean", "var", "quantile"), probs = 0.5, transform = FALSE, ..., psis_object = NULL ) ## S3 method for class 'stanreg' loo_predictive_interval(object, prob = 0.9, ..., psis_object = NULL)
## S3 method for class 'stanreg' loo_predict( object, type = c("mean", "var", "quantile"), probs = 0.5, ..., psis_object = NULL ) ## S3 method for class 'stanreg' loo_linpred( object, type = c("mean", "var", "quantile"), probs = 0.5, transform = FALSE, ..., psis_object = NULL ) ## S3 method for class 'stanreg' loo_predictive_interval(object, prob = 0.9, ..., psis_object = NULL)
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
type 
The type of expectation to compute. The options are

probs 
For computing quantiles, a vector of probabilities. 
... 
Currently unused. 
psis_object 
An object returned by 
transform 
Passed to 
prob 
For 
A list with elements value
and pareto_k
.
For loo_predict
and loo_linpred
the value component is a
vector with one element per observation.
For loo_predictive_interval
the value
component is a matrix
with one row per observation and two columns (like
predictive_interval
). loo_predictive_interval(..., prob
= p)
is equivalent to loo_predict(..., type = "quantile", probs =
c(a, 1a))
with a = (1  p)/2
, except it transposes the result and
adds informative column names.
See E_loo
and paretokdiagnostic
for
details on the pareto_k
diagnostic.
Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leaveoneout crossvalidation and WAIC. Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s1122201696964. arXiv preprint: https://arxiv.org/abs/1507.04544
Yao, Y., Vehtari, A., Simpson, D., and Gelman, A. (2018) Using stacking to average Bayesian predictive distributions. Bayesian Analysis, advance publication, doi:10.1214/17BA1091.
Gabry, J. , Simpson, D. , Vehtari, A. , Betancourt, M. and Gelman, A. (2019), Visualization in Bayesian workflow. J. R. Stat. Soc. A, 182: 389402. doi:10.1111/rssa.12378, arXiv preprint, code on GitHub)
if (.Platform$OS.type != "windows") { ## Not run: if (!exists("example_model")) example(example_model) # optionally, logweights can be precomputed and reused psis_result < loo::psis(log_ratios = log_lik(example_model)) loo_probs < loo_linpred(example_model, type = "mean", transform = TRUE, psis_object = psis_result) str(loo_probs) loo_pred_var < loo_predict(example_model, type = "var", psis_object = psis_result) str(loo_pred_var) loo_pred_ints < loo_predictive_interval(example_model, prob = 0.8, psis_object = psis_result) str(loo_pred_ints) ## End(Not run) }
if (.Platform$OS.type != "windows") { ## Not run: if (!exists("example_model")) example(example_model) # optionally, logweights can be precomputed and reused psis_result < loo::psis(log_ratios = log_lik(example_model)) loo_probs < loo_linpred(example_model, type = "mean", transform = TRUE, psis_object = psis_result) str(loo_probs) loo_pred_var < loo_predict(example_model, type = "var", psis_object = psis_result) str(loo_pred_var) loo_pred_ints < loo_predictive_interval(example_model, prob = 0.8, psis_object = psis_result) str(loo_pred_ints) ## End(Not run) }
For models fit using MCMC, compute approximate leaveoneout
crossvalidation (LOO, LOOIC) or, less preferably, the Widely Applicable
Information Criterion (WAIC) using the loo
package. (For $K$
fold crossvalidation see kfold.stanreg
.)
Functions for model comparison, and model weighting/averaging are also
provided.
Note: these functions are not guaranteed to work
properly unless the data
argument was specified when the model was
fit. Also, as of loo version 2.0.0
the default number of cores
is now only 1, but we recommend using as many (or close to as many) cores
as possible by setting the cores
argument or using
options(mc.cores = VALUE)
to set it for an entire session.
## S3 method for class 'stanreg' loo( x, ..., cores = getOption("mc.cores", 1), save_psis = FALSE, k_threshold = NULL ) ## S3 method for class 'stanreg' waic(x, ...) ## S3 method for class 'stanreg' loo_compare(x, ..., criterion = c("loo", "kfold", "waic"), detail = FALSE) ## S3 method for class 'stanreg_list' loo_compare(x, ..., criterion = c("loo", "kfold", "waic"), detail = FALSE) ## S3 method for class 'stanreg_list' loo_model_weights(x, ..., cores = getOption("mc.cores", 1), k_threshold = NULL) compare_models(..., loos = list(), detail = FALSE)
## S3 method for class 'stanreg' loo( x, ..., cores = getOption("mc.cores", 1), save_psis = FALSE, k_threshold = NULL ) ## S3 method for class 'stanreg' waic(x, ...) ## S3 method for class 'stanreg' loo_compare(x, ..., criterion = c("loo", "kfold", "waic"), detail = FALSE) ## S3 method for class 'stanreg_list' loo_compare(x, ..., criterion = c("loo", "kfold", "waic"), detail = FALSE) ## S3 method for class 'stanreg_list' loo_model_weights(x, ..., cores = getOption("mc.cores", 1), k_threshold = NULL) compare_models(..., loos = list(), detail = FALSE)
x 
For For the 
... 
For For 
cores , save_psis

Passed to 
k_threshold 
Threshold for flagging estimates of the Pareto shape
parameters 
criterion 
For 
detail 
For 
loos 
a list of objects produced by the 
The structure of the objects returned by loo
and waic
methods are documented in detail in the Value section in
loo
and waic
(from the loo
package).
loo_compare
returns a matrix with class 'compare.loo'. See the
Comparing models section below for more details.
The loo
method for stanreg objects
provides an interface to the loo package for
approximate leaveoneout crossvalidation (LOO). The LOO Information
Criterion (LOOIC) has the same purpose as the Akaike Information Criterion
(AIC) that is used by frequentists. Both are intended to estimate the
expected log predictive density (ELPD) for a new dataset. However, the AIC
ignores priors and assumes that the posterior distribution is multivariate
normal, whereas the functions from the loo package do not make this
distributional assumption and integrate over uncertainty in the parameters.
This only assumes that any one observation can be omitted without having a
major effect on the posterior distribution, which can be judged using the
diagnostic plot provided by the plot.loo
method and the
warnings provided by the print.loo
method (see the
How to Use the rstanarm Package vignette for an example of this
process).
loo
gives warnings (k_threshold)The k_threshold
argument to the loo
method for rstanarm
models is provided as a possible remedy when the diagnostics reveal
problems stemming from the posterior's sensitivity to particular
observations. Warnings about Pareto $k$
estimates indicate observations
for which the approximation to LOO is problematic (this is described in
detail in Vehtari, Gelman, and Gabry (2017) and the
loo package documentation). The
k_threshold
argument can be used to set the $k$
value above
which an observation is flagged. If k_threshold
is not NULL
and there are $J$
observations with $k$
estimates above
k_threshold
then when loo
is called it will refit the
original model $J$
times, each time leaving out one of the $J$
problematic observations. The pointwise contributions of these observations
to the total ELPD are then computed directly and substituted for the
previous estimates from these $J$
observations that are stored in the
object created by loo
. Another option to consider is Kfold
crossvalidation, which is documented on a separate page (see
kfold
).
Note: in the warning messages issued by loo
about large
Pareto $k$
estimates we recommend setting k_threshold
to at
least $0.7$
. There is a theoretical reason, explained in Vehtari,
Gelman, and Gabry (2017), for setting the threshold to the stricter value
of $0.5$
, but in practice they find that errors in the LOO
approximation start to increase nonnegligibly when $k > 0.7$
.
"loo" (or "waic" or "kfold") objects can be passed
to the loo_compare
function in the loo package to
perform model comparison. rstanarm also provides a
loo_compare.stanreg
method that can be used if the "loo" (or "waic"
or "kfold") object has been added to the fitted model object (see the
Examples section below for how to do this). This second method
allows rstanarm to perform some extra checks that can't be done by
the loo package itself (e.g., verifying that all models to be
compared were fit using the same outcome variable).
loo_compare
will return a matrix with one row per model and columns
containing the ELPD difference and the standard error of the difference. In
the first row of the matrix will be the model with the largest ELPD
(smallest LOOIC) and will contain zeros (there is no difference between
this model and itself). For each of the remaining models the ELPD
difference and SE are reported relative to the model with the best ELPD
(the first row). See the Details section at the
loo_compare
page in the loo package for more
information.
The loo_model_weights
method can be used to
compute model weights for a "stanreg_list"
object, which is a list
of fitted model objects made with stanreg_list
. The end of
the Examples section has a demonstration. For details see the
loo_model_weights
documentation in the loo
package.
Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leaveoneout crossvalidation and WAIC. Statistics and Computing. 27(5), 1413–1432. doi:10.1007/s1122201696964. arXiv preprint: https://arxiv.org/abs/1507.04544
Yao, Y., Vehtari, A., Simpson, D., and Gelman, A. (2018) Using stacking to average Bayesian predictive distributions. Bayesian Analysis, advance publication, doi:10.1214/17BA1091.
Gabry, J. , Simpson, D. , Vehtari, A. , Betancourt, M. and Gelman, A. (2019), Visualization in Bayesian workflow. J. R. Stat. Soc. A, 182: 389402. doi:10.1111/rssa.12378, arXiv preprint, code on GitHub)
The new loo package vignettes
and various rstanarm vignettes
for more examples using loo
and related functions with rstanarm models.
paretokdiagnostic
in the loo package for
more on Pareto $k$
diagnostics.
log_lik.stanreg
to directly access the pointwise
loglikelihood matrix.
if (.Platform$OS.type != "windows") { fit1 < stan_glm(mpg ~ wt, data = mtcars, refresh = 0) fit2 < stan_glm(mpg ~ wt + cyl, data = mtcars, refresh = 0) # (for bigger models use as many cores as possible) loo1 < loo(fit1, cores = 1) print(loo1) loo2 < loo(fit2, cores = 1) print(loo2) # when comparing models the loo objects can be passed to loo_compare # as individual arguments or as a list of loo objects loo_compare(loo1, loo2) loo_compare(list(loo1, loo2)) # if the fitted model objects contain a loo object in the component "loo" # then the model objects can be passed directly or as a stanreg_list fit1$loo < loo1 fit2$loo < loo2 loo_compare(fit1, fit2) # if the fitted model objects contain a loo object _and_ a waic or kfold # object, then the criterion argument determines which of them the comparison # is based on fit1$waic < waic(fit1) fit2$waic < waic(fit2) loo_compare(fit1, fit2, criterion = "waic") # the models can also be combined into a stanreg_list object, and more # informative model names can be provided to use when printing model_list < stanreg_list(fit1, fit2, model_names = c("Fewer predictors", "More predictors")) loo_compare(model_list) fit3 < stan_glm(mpg ~ disp * as.factor(cyl), data = mtcars, refresh = 0) loo3 < loo(fit3, cores = 2, k_threshold = 0.7) loo_compare(loo1, loo2, loo3) # setting detail=TRUE will also print model formulas if used with # loo_compare.stanreg or loo_compare.stanreg_list fit3$loo < loo3 model_list < stanreg_list(fit1, fit2, fit3) loo_compare(model_list, detail=TRUE) # Computing model weights # # if the objects in model_list already have 'loo' components then those # will be used. otherwise loo will be computed for each model internally # (in which case the 'cores' argument may also be used and is passed to loo()) loo_model_weights(model_list) # defaults to method="stacking" loo_model_weights(model_list, method = "pseudobma") loo_model_weights(model_list, method = "pseudobma", BB = FALSE) # you can also pass precomputed loo objects directly to loo_model_weights loo_list < list(A = loo1, B = loo2, C = loo3) # names optional (affects printing) loo_model_weights(loo_list) }
if (.Platform$OS.type != "windows") { fit1 < stan_glm(mpg ~ wt, data = mtcars, refresh = 0) fit2 < stan_glm(mpg ~ wt + cyl, data = mtcars, refresh = 0) # (for bigger models use as many cores as possible) loo1 < loo(fit1, cores = 1) print(loo1) loo2 < loo(fit2, cores = 1) print(loo2) # when comparing models the loo objects can be passed to loo_compare # as individual arguments or as a list of loo objects loo_compare(loo1, loo2) loo_compare(list(loo1, loo2)) # if the fitted model objects contain a loo object in the component "loo" # then the model objects can be passed directly or as a stanreg_list fit1$loo < loo1 fit2$loo < loo2 loo_compare(fit1, fit2) # if the fitted model objects contain a loo object _and_ a waic or kfold # object, then the criterion argument determines which of them the comparison # is based on fit1$waic < waic(fit1) fit2$waic < waic(fit2) loo_compare(fit1, fit2, criterion = "waic") # the models can also be combined into a stanreg_list object, and more # informative model names can be provided to use when printing model_list < stanreg_list(fit1, fit2, model_names = c("Fewer predictors", "More predictors")) loo_compare(model_list) fit3 < stan_glm(mpg ~ disp * as.factor(cyl), data = mtcars, refresh = 0) loo3 < loo(fit3, cores = 2, k_threshold = 0.7) loo_compare(loo1, loo2, loo3) # setting detail=TRUE will also print model formulas if used with # loo_compare.stanreg or loo_compare.stanreg_list fit3$loo < loo3 model_list < stanreg_list(fit1, fit2, fit3) loo_compare(model_list, detail=TRUE) # Computing model weights # # if the objects in model_list already have 'loo' components then those # will be used. otherwise loo will be computed for each model internally # (in which case the 'cores' argument may also be used and is passed to loo()) loo_model_weights(model_list) # defaults to method="stacking" loo_model_weights(model_list, method = "pseudobma") loo_model_weights(model_list, method = "pseudobma", BB = FALSE) # you can also pass precomputed loo objects directly to loo_model_weights loo_list < list(A = loo1, B = loo2, C = loo3) # names optional (affects printing) loo_model_weights(loo_list) }
Specifies the information required to fit a Negative Binomial GLM in a
similar way to negative.binomial
. However, here the
overdispersion parameter theta
is not specified by the user and always
estimated (really the reciprocal of the dispersion parameter is
estimated). A call to this function can be passed to the family
argument of stan_glm
or stan_glmer
to estimate a
Negative Binomial model. Alternatively, the stan_glm.nb
and
stan_glmer.nb
wrapper functions may be used, which call
neg_binomial_2
internally.
neg_binomial_2(link = "log")
neg_binomial_2(link = "log")
link 
The same as for 
An object of class family
very similar to
that of poisson
but with a different family name.
if (.Platform$OS.type != "windows") stan_glm(Days ~ Sex/(Age + Eth*Lrn), data = MASS::quine, seed = 123, family = neg_binomial_2, QR = TRUE, algorithm = "optimizing") # or, equivalently, call stan_glm.nb() without specifying the family
if (.Platform$OS.type != "windows") stan_glm(Days ~ Sex/(Age + Eth*Lrn), data = MASS::quine, seed = 123, family = neg_binomial_2, QR = TRUE, algorithm = "optimizing") # or, equivalently, call stan_glm.nb() without specifying the family
The methods documented on this page are actually some of the least important methods defined for stanreg objects. The most important methods are documented separately, each with its own page. Links to those pages are provided in the See Also section, below.
## S3 method for class 'stanmvreg' nobs(object, ...) ## S3 method for class 'stanreg' coef(object, ...) ## S3 method for class 'stanreg' confint(object, parm, level = 0.95, ...) ## S3 method for class 'stanreg' fitted(object, ...) ## S3 method for class 'stanreg' nobs(object, ...) ## S3 method for class 'stanreg' residuals(object, ...) ## S3 method for class 'stanreg' se(object, ...) ## S3 method for class 'stanreg' update(object, formula., ..., evaluate = TRUE) ## S3 method for class 'stanreg' vcov(object, correlation = FALSE, ...) ## S3 method for class 'stanreg' fixef(object, ...) ## S3 method for class 'stanreg' ngrps(object, ...) ## S3 method for class 'stanreg' nsamples(object, ...) ## S3 method for class 'stanreg' ranef(object, ...) ## S3 method for class 'stanreg' sigma(object, ...) ## S3 method for class 'stanreg' VarCorr(x, sigma = 1, ...)
## S3 method for class 'stanmvreg' nobs(object, ...) ## S3 method for class 'stanreg' coef(object, ...) ## S3 method for class 'stanreg' confint(object, parm, level = 0.95, ...) ## S3 method for class 'stanreg' fitted(object, ...) ## S3 method for class 'stanreg' nobs(object, ...) ## S3 method for class 'stanreg' residuals(object, ...) ## S3 method for class 'stanreg' se(object, ...) ## S3 method for class 'stanreg' update(object, formula., ..., evaluate = TRUE) ## S3 method for class 'stanreg' vcov(object, correlation = FALSE, ...) ## S3 method for class 'stanreg' fixef(object, ...) ## S3 method for class 'stanreg' ngrps(object, ...) ## S3 method for class 'stanreg' nsamples(object, ...) ## S3 method for class 'stanreg' ranef(object, ...) ## S3 method for class 'stanreg' sigma(object, ...) ## S3 method for class 'stanreg' VarCorr(x, sigma = 1, ...)
object , x

A fitted model object returned by one of the
rstanarm modeling functions. See 
... 
Ignored, except by the 
parm 
For 
level 
For 
formula. , evaluate

See 
correlation 
For 
sigma 
Ignored (included for compatibility with

The methods documented on this page are similar to the methods defined for objects of class 'lm', 'glm', 'glmer', etc. However there are a few key differences:
residuals
Residuals are always of type "response"
(not "deviance"
residuals or any other type). However, in the case of stan_polr
with more than two response categories, the residuals are the difference
between the latent utility and its linear predictor.
coef
Medians are used for point estimates. See the Point estimates section
in print.stanreg
for more details.
se
The se
function returns standard errors based on
mad
. See the Uncertainty estimates section in
print.stanreg
for more details.
confint
For models fit using optimization, confidence intervals are returned via a
call to confint.default
. If algorithm
is
"sampling"
, "meanfield"
, or "fullrank"
, the
confint
will throw an error because the
posterior_interval
function should be used to compute Bayesian
uncertainty intervals.
nsamples
The number of draws from the posterior distribution obtained
The print
,
summary
, and prior_summary
methods for stanreg objects for information on the fitted model.
launch_shinystan
to use the ShinyStan GUI to explore a
fitted rstanarm model.
The plot
method to plot estimates and
diagnostics.
The pp_check
method for graphical posterior predictive
checking.
The posterior_predict
and predictive_error
methods for predictions and predictive errors.
The posterior_interval
and predictive_interval
methods for uncertainty intervals for model parameters and predictions.
The loo
, kfold
, and
log_lik
methods for leaveoneout or Kfold crossvalidation,
model comparison, and computing the loglikelihood of (possibly new) data.
The as.matrix
, as.data.frame
,
and as.array
methods to access posterior draws.
Interface to bayesplot's
mcmc_pairs
function for use with
rstanarm models. Be careful not to specify too many parameters to
include or the plot will be both hard to read and slow to render.
## S3 method for class 'stanreg' pairs( x, pars = NULL, regex_pars = NULL, condition = pairs_condition(nuts = "accept_stat__"), ... )
## S3 method for class 'stanreg' pairs( x, pars = NULL, regex_pars = NULL, condition = pairs_condition(nuts = "accept_stat__"), ... )
x 
A fitted model object returned by one of the
rstanarm modeling functions. See 
pars 
An optional character vector of parameter names. All parameters are included by default, but for models with more than just a few parameters it may be far too many to visualize on a small computer screen and also may require substantial computing time. 
regex_pars 
An optional character vector of regular
expressions to use for parameter selection. 
condition 
Same as the 
... 
Optional arguments passed to

if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) bayesplot::color_scheme_set("purple") # see 'condition' argument above for details on the plots below and # above the diagonal. default is to split by accept_stat__. pairs(example_model, pars = c("(Intercept)", "logposterior")) # for demonstration purposes, intentionally fit a model that # will (almost certainly) have some divergences fit < stan_glm( mpg ~ ., data = mtcars, iter = 1000, # this combo of prior and adapt_delta should lead to some divergences prior = hs(), adapt_delta = 0.9, refresh = 0 ) pairs(fit, pars = c("wt", "sigma", "logposterior")) # requires hexbin package # pairs( # fit, # pars = c("wt", "sigma", "logposterior"), # transformations = list(sigma = "log"), # show log(sigma) instead of sigma # off_diag_fun = "hex" # use hexagonal heatmaps instead of scatterplots # ) bayesplot::color_scheme_set("brightblue") pairs( fit, pars = c("(Intercept)", "wt", "sigma", "logposterior"), transformations = list(sigma = "log"), off_diag_args = list(size = 3/4, alpha = 1/3), # size and transparency of scatterplot points np_style = pairs_style_np(div_color = "black", div_shape = 2) # color and shape of the divergences ) # Using the condition argument to show divergences above the diagonal pairs( fit, pars = c("(Intercept)", "wt", "logposterior"), condition = pairs_condition(nuts = "divergent__") ) }
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) bayesplot::color_scheme_set("purple") # see 'condition' argument above for details on the plots below and # above the diagonal. default is to split by accept_stat__. pairs(example_model, pars = c("(Intercept)", "logposterior")) # for demonstration purposes, intentionally fit a model that # will (almost certainly) have some divergences fit < stan_glm( mpg ~ ., data = mtcars, iter = 1000, # this combo of prior and adapt_delta should lead to some divergences prior = hs(), adapt_delta = 0.9, refresh = 0 ) pairs(fit, pars = c("wt", "sigma", "logposterior")) # requires hexbin package # pairs( # fit, # pars = c("wt", "sigma", "logposterior"), # transformations = list(sigma = "log"), # show log(sigma) instead of sigma # off_diag_fun = "hex" # use hexagonal heatmaps instead of scatterplots # ) bayesplot::color_scheme_set("brightblue") pairs( fit, pars = c("(Intercept)", "wt", "sigma", "logposterior"), transformations = list(sigma = "log"), off_diag_args = list(size = 3/4, alpha = 1/3), # size and transparency of scatterplot points np_style = pairs_style_np(div_color = "black", div_shape = 2) # color and shape of the divergences ) # Using the condition argument to show divergences above the diagonal pairs( fit, pars = c("(Intercept)", "wt", "logposterior"), condition = pairs_condition(nuts = "divergent__") ) }
This generic plot
method for predict.stanjm
objects will
plot the estimated subjectspecific or marginal longitudinal trajectory
using the data frame returned by a call to posterior_traj
.
To ensure that enough data points are available to plot the longitudinal
trajectory, it is assumed that the call to posterior_traj
would have used the default interpolate = TRUE
, and perhaps also
extrapolate = TRUE
(the latter being optional, depending on
whether or not the user wants to see extrapolation of the longitudinal
trajectory beyond the last observation time).
## S3 method for class 'predict.stanjm' plot( x, ids = NULL, limits = c("ci", "pi", "none"), xlab = NULL, ylab = NULL, vline = FALSE, plot_observed = FALSE, facet_scales = "free_x", ci_geom_args = NULL, grp_overlay = FALSE, ... )
## S3 method for class 'predict.stanjm' plot( x, ids = NULL, limits = c("ci", "pi", "none"), xlab = NULL, ylab = NULL, vline = FALSE, plot_observed = FALSE, facet_scales = "free_x", ci_geom_args = NULL, grp_overlay = FALSE, ... )
x 
A data frame and object of class 
ids 
An optional vector providing a subset of subject IDs for whom the predicted curves should be plotted. 
limits 
A quoted character string specifying the type of limits to
include in the plot. Can be one of: 
xlab , ylab

An optional axis label passed to

vline 
A logical. If 
plot_observed 
A logical. If 
facet_scales 
A character string passed to the 
ci_geom_args 
Optional arguments passed to

grp_overlay 
Only relevant if the model had lower level units
clustered within an individual. If 
... 
Optional arguments passed to

A ggplot
object, also of class plot.predict.stanjm
.
This object can be further customised using the ggplot2 package.
It can also be passed to the function plot_stack_jm
.
posterior_traj
, plot_stack_jm
,
posterior_survfit
, plot.survfit.stanjm
# Run example model if not already loaded if (!exists("example_jm")) example(example_jm) # For a subset of individuals in the estimation dataset we will # obtain subjectspecific predictions for the longitudinal submodel # at evenly spaced times between 0 and their event or censoring time. pt1 < posterior_traj(example_jm, ids = c(7,13,15), interpolate = TRUE) plot(pt1) # credible interval for mean response plot(pt1, limits = "pi") # prediction interval for raw response plot(pt1, limits = "none") # no uncertainty interval # We can also extrapolate the longitudinal trajectories. pt2 < posterior_traj(example_jm, ids = c(7,13,15), interpolate = TRUE, extrapolate = TRUE) plot(pt2) plot(pt2, vline = TRUE) # add line indicating event or censoring time plot(pt2, vline = TRUE, plot_observed = TRUE) # overlay observed longitudinal data # We can change or add attributes to the plot plot1 < plot(pt2, ids = c(7,13,15), xlab = "Follow up time", vline = TRUE, plot_observed = TRUE, facet_scales = "fixed", color = "blue", linetype = 2, ci_geom_args = list(fill = "red")) plot1 # Since the returned plot is also a ggplot object, we can # modify some of its attributes after it has been returned plot1 + ggplot2::theme(strip.background = ggplot2::element_blank()) + ggplot2::labs(title = "Some plotted longitudinal trajectories")
# Run example model if not already loaded if (!exists("example_jm")) example(example_jm) # For a subset of individuals in the estimation dataset we will # obtain subjectspecific predictions for the longitudinal submodel # at evenly spaced times between 0 and their event or censoring time. pt1 < posterior_traj(example_jm, ids = c(7,13,15), interpolate = TRUE) plot(pt1) # credible interval for mean response plot(pt1, limits = "pi") # prediction interval for raw response plot(pt1, limits = "none") # no uncertainty interval # We can also extrapolate the longitudinal trajectories. pt2 < posterior_traj(example_jm, ids = c(7,13,15), interpolate = TRUE, extrapolate = TRUE) plot(pt2) plot(pt2, vline = TRUE) # add line indicating event or censoring time plot(pt2, vline = TRUE, plot_observed = TRUE) # overlay observed longitudinal data # We can change or add attributes to the plot plot1 < plot(pt2, ids = c(7,13,15), xlab = "Follow up time", vline = TRUE, plot_observed = TRUE, facet_scales = "fixed", color = "blue", linetype = 2, ci_geom_args = list(fill = "red")) plot1 # Since the returned plot is also a ggplot object, we can # modify some of its attributes after it has been returned plot1 + ggplot2::theme(strip.background = ggplot2::element_blank()) + ggplot2::labs(title = "Some plotted longitudinal trajectories")
The plot
method for stanregobjects provides a convenient
interface to the MCMC module in the
bayesplot package for plotting MCMC draws and diagnostics. It is also
straightforward to use the functions from the bayesplot package directly rather than
via the plot
method. Examples of both methods of plotting are given
below.
## S3 method for class 'stanreg' plot(x, plotfun = "intervals", pars = NULL, regex_pars = NULL, ...) ## S3 method for class 'stansurv' plot( x, plotfun = "basehaz", pars = NULL, regex_pars = NULL, ..., prob = 0.95, limits = c("ci", "none"), ci_geom_args = NULL, n = 1000 )
## S3 method for class 'stanreg' plot(x, plotfun = "intervals", pars = NULL, regex_pars = NULL, ...) ## S3 method for class 'stansurv' plot( x, plotfun = "basehaz", pars = NULL, regex_pars = NULL, ..., prob = 0.95, limits = c("ci", "none"), ci_geom_args = NULL, n = 1000 )
x 
A fitted model object returned by one of the
rstanarm modeling functions. See 
plotfun 
A character string naming the bayesplot
MCMC function to use. The default is to call

pars 
An optional character vector of parameter names. 
regex_pars 
An optional character vector of regular
expressions to use for parameter selection. 
... 
Additional arguments to pass to 
prob 
A scalar between 0 and 1 specifying the width to use for the
plotted posterior uncertainty interval when 
limits 
A quoted character string specifying the type of limits to
include in the plot. Can be 
ci_geom_args 
Optional arguments passed to

n 
Integer specifying the number of points to interpolate along when plotting the baseline hazard or timevarying hazard ratio. Each of the points are joined using a line. 
Either a ggplot object that can be further customized using the
ggplot2 package, or an object created from multiple ggplot objects
(e.g. a gtable object created by arrangeGrob
).
Gabry, J. , Simpson, D. , Vehtari, A. , Betancourt, M. and Gelman, A. (2019), Visualization in Bayesian workflow. J. R. Stat. Soc. A, 182: 389402. doi:10.1111/rssa.12378, arXiv preprint, code on GitHub)
The vignettes in the bayesplot package for many examples.
MCMCoverview
(bayesplot) for links to
the documentation for all the available plotting functions.
color_scheme_set
(bayesplot) to change
the color scheme used for plotting.
pp_check
for graphical posterior predictive checks.
plot_nonlinear
for models with nonlinear smooth
functions fit using stan_gamm4
.
if (.Platform$OS.type != "windows") { # Use rstanarm example model if (!exists("example_model")) example(example_model) fit < example_model ##################################### ### Intervals and point estimates ### ##################################### plot(fit) # same as plot(fit, "intervals"), plot(fit, "mcmc_intervals") p < plot(fit, pars = "size", regex_pars = "period", prob = 0.5, prob_outer = 0.9) p + ggplot2::ggtitle("Posterior medians \n with 50% and 90% intervals") # Shaded areas under densities bayesplot::color_scheme_set("brightblue") plot(fit, "areas", regex_pars = "period", prob = 0.5, prob_outer = 0.9) # Make the same plot by extracting posterior draws and calling # bayesplot::mcmc_areas directly x < as.array(fit, regex_pars = "period") bayesplot::mcmc_areas(x, prob = 0.5, prob_outer = 0.9) # Ridgelines version of the areas plot bayesplot::mcmc_areas_ridges(x, regex_pars = "period", prob = 0.9) ################################## ### Histograms & density plots ### ################################## plot_title < ggplot2::ggtitle("Posterior Distributions") plot(fit, "hist", regex_pars = "period") + plot_title plot(fit, "dens_overlay", pars = "(Intercept)", regex_pars = "period") + plot_title #################### ### Scatterplots ### #################### bayesplot::color_scheme_set("teal") plot(fit, "scatter", pars = paste0("period", 2:3)) plot(fit, "scatter", pars = c("(Intercept)", "size"), size = 3, alpha = 0.5) + ggplot2::stat_ellipse(level = 0.9) #################################################### ### Rhat, effective sample size, autocorrelation ### #################################################### bayesplot::color_scheme_set("red") # rhat plot(fit, "rhat") plot(fit, "rhat_hist") # ratio of effective sample size to total posterior sample size plot(fit, "neff") plot(fit, "neff_hist") # autocorrelation by chain plot(fit, "acf", pars = "(Intercept)", regex_pars = "period") plot(fit, "acf_bar", pars = "(Intercept)", regex_pars = "period") ################## ### Traceplots ### ################## # NOTE: rstanarm doesn't store the warmup draws (to save space because they # are not so essential for diagnosing the particular models implemented in # rstanarm) so the iterations in the traceplot are postwarmup iterations bayesplot::color_scheme_set("pink") (trace < plot(fit, "trace", pars = "(Intercept)")) # change traceplot colors to ggplot defaults or custom values trace + ggplot2::scale_color_discrete() trace + ggplot2::scale_color_manual(values = c("maroon", "skyblue2")) # changing facet layout plot(fit, "trace", pars = c("(Intercept)", "period2"), facet_args = list(nrow = 2)) # same plot by calling bayesplot::mcmc_trace directly x < as.array(fit, pars = c("(Intercept)", "period2")) bayesplot::mcmc_trace(x, facet_args = list(nrow = 2)) ############ ### More ### ############ # regex_pars examples plot(fit, regex_pars = "herd:1\\]") plot(fit, regex_pars = "herd:[279]") plot(fit, regex_pars = "herd:[279]period2") plot(fit, regex_pars = c("herd:[279]", "period2")) # For graphical posterior predictive checks see # help("pp_check.stanreg") }
if (.Platform$OS.type != "windows") { # Use rstanarm example model if (!exists("example_model")) example(example_model) fit < example_model ##################################### ### Intervals and point estimates ### ##################################### plot(fit) # same as plot(fit, "intervals"), plot(fit, "mcmc_intervals") p < plot(fit, pars = "size", regex_pars = "period", prob = 0.5, prob_outer = 0.9) p + ggplot2::ggtitle("Posterior medians \n with 50% and 90% intervals") # Shaded areas under densities bayesplot::color_scheme_set("brightblue") plot(fit, "areas", regex_pars = "period", prob = 0.5, prob_outer = 0.9) # Make the same plot by extracting posterior draws and calling # bayesplot::mcmc_areas directly x < as.array(fit, regex_pars = "period") bayesplot::mcmc_areas(x, prob = 0.5, prob_outer = 0.9) # Ridgelines version of the areas plot bayesplot::mcmc_areas_ridges(x, regex_pars = "period", prob = 0.9) ################################## ### Histograms & density plots ### ################################## plot_title < ggplot2::ggtitle("Posterior Distributions") plot(fit, "hist", regex_pars = "period") + plot_title plot(fit, "dens_overlay", pars = "(Intercept)", regex_pars = "period") + plot_title #################### ### Scatterplots ### #################### bayesplot::color_scheme_set("teal") plot(fit, "scatter", pars = paste0("period", 2:3)) plot(fit, "scatter", pars = c("(Intercept)", "size"), size = 3, alpha = 0.5) + ggplot2::stat_ellipse(level = 0.9) #################################################### ### Rhat, effective sample size, autocorrelation ### #################################################### bayesplot::color_scheme_set("red") # rhat plot(fit, "rhat") plot(fit, "rhat_hist") # ratio of effective sample size to total posterior sample size plot(fit, "neff") plot(fit, "neff_hist") # autocorrelation by chain plot(fit, "acf", pars = "(Intercept)", regex_pars = "period") plot(fit, "acf_bar", pars = "(Intercept)", regex_pars = "period") ################## ### Traceplots ### ################## # NOTE: rstanarm doesn't store the warmup draws (to save space because they # are not so essential for diagnosing the particular models implemented in # rstanarm) so the iterations in the traceplot are postwarmup iterations bayesplot::color_scheme_set("pink") (trace < plot(fit, "trace", pars = "(Intercept)")) # change traceplot colors to ggplot defaults or custom values trace + ggplot2::scale_color_discrete() trace + ggplot2::scale_color_manual(values = c("maroon", "skyblue2")) # changing facet layout plot(fit, "trace", pars = c("(Intercept)", "period2"), facet_args = list(nrow = 2)) # same plot by calling bayesplot::mcmc_trace directly x < as.array(fit, pars = c("(Intercept)", "period2")) bayesplot::mcmc_trace(x, facet_args = list(nrow = 2)) ############ ### More ### ############ # regex_pars examples plot(fit, regex_pars = "herd:1\\]") plot(fit, regex_pars = "herd:[279]") plot(fit, regex_pars = "herd:[279]period2") plot(fit, regex_pars = c("herd:[279]", "period2")) # For graphical posterior predictive checks see # help("pp_check.stanreg") }
This generic plot
method for survfit.stanjm
objects will
plot the estimated subjectspecific or marginal survival function
using the data frame returned by a call to posterior_survfit
.
The call to posterior_survfit
should ideally have included an
"extrapolation" of the survival function, obtained by setting the
extrapolate
argument to TRUE
.
The plot_stack_jm
function takes arguments containing the plots of the estimated
subjectspecific longitudinal trajectory (or trajectories if a multivariate
joint model was estimated) and the plot of the estimated subjectspecific
survival function and combines them into a single figure. This is most
easily understood by running the Examples below.
## S3 method for class 'survfit.stanjm' plot( x, ids = NULL, limits = c("ci", "none"), xlab = NULL, ylab = NULL, facet_scales = "free", ci_geom_args = NULL, ... ) ## S3 method for class 'survfit.stansurv' plot( x, ids = NULL, limits = c("ci", "none"), xlab = NULL, ylab = NULL, facet_scales = "free", ci_geom_args = NULL, ... ) plot_stack_jm(yplot, survplot)
## S3 method for class 'survfit.stanjm' plot( x, ids = NULL, limits = c("ci", "none"), xlab = NULL, ylab = NULL, facet_scales = "free", ci_geom_args = NULL, ... ) ## S3 method for class 'survfit.stansurv' plot( x, ids = NULL, limits = c("ci", "none"), xlab = NULL, ylab = NULL, facet_scales = "free", ci_geom_args = NULL, ... ) plot_stack_jm(yplot, survplot)
x 
A data frame and object of class 
ids 
An optional vector providing a subset of subject IDs for whom the predicted curves should be plotted. 
limits 
A quoted character string specifying the type of limits to
include in the plot. Can be one of: 
xlab , ylab

An optional axis label passed to

facet_scales 
A character string passed to the 
ci_geom_args 
Optional arguments passed to

... 
Optional arguments passed to

yplot 
An object of class 
survplot 
An object of class 
The plot method returns a ggplot
object, also of class
plot.survfit.stanjm
. This object can be further customised using the
ggplot2 package. It can also be passed to the function
plot_stack_jm
.
plot_stack_jm
returns an object of class
bayesplot_grid
that includes plots of the
estimated subjectspecific longitudinal trajectories stacked on top of the
associated subjectspecific survival curve.
posterior_survfit
, plot_stack_jm
,
posterior_traj
, plot.predict.stanjm
plot.predict.stanjm
, plot.survfit.stanjm
,
posterior_predict
, posterior_survfit
# Run example model if not already loaded if (!exists("example_jm")) example(example_jm) # Obtain subjectspecific conditional survival probabilities # for all individuals in the estimation dataset. ps1 < posterior_survfit(example_jm, extrapolate = TRUE) # We then plot the conditional survival probabilities for # a subset of individuals plot(ps1, ids = c(7,13,15)) # We can change or add attributes to the plot plot(ps1, ids = c(7,13,15), limits = "none") plot(ps1, ids = c(7,13,15), xlab = "Follow up time") plot(ps1, ids = c(7,13,15), ci_geom_args = list(fill = "red"), color = "blue", linetype = 2) plot(ps1, ids = c(7,13,15), facet_scales = "fixed") # Since the returned plot is also a ggplot object, we can # modify some of its attributes after it has been returned plot1 < plot(ps1, ids = c(7,13,15)) plot1 + ggplot2::theme(strip.background = ggplot2::element_blank()) + ggplot2::coord_cartesian(xlim = c(0, 15)) + ggplot2::labs(title = "Some plotted survival functions") # We can also combine the plot(s) of the estimated # subjectspecific survival functions, with plot(s) # of the estimated longitudinal trajectories for the # same individuals ps1 < posterior_survfit(example_jm, ids = c(7,13,15)) pt1 < posterior_traj(example_jm, , ids = c(7,13,15)) plot_surv < plot(ps1) plot_traj < plot(pt1, vline = TRUE, plot_observed = TRUE) plot_stack_jm(plot_traj, plot_surv) # Lastly, let us plot the standardised survival function # based on all individuals in our estimation dataset ps2 < posterior_survfit(example_jm, standardise = TRUE, times = 0, control = list(epoints = 20)) plot(ps2) if (.Platform$OS.type != "windows") { if (!exists("example_jm")) example(example_jm) ps1 < posterior_survfit(example_jm, ids = c(7,13,15)) pt1 < posterior_traj(example_jm, ids = c(7,13,15), extrapolate = TRUE) plot_surv < plot(ps1) plot_traj < plot(pt1, vline = TRUE, plot_observed = TRUE) plot_stack_jm(plot_traj, plot_surv) }
# Run example model if not already loaded if (!exists("example_jm")) example(example_jm) # Obtain subjectspecific conditional survival probabilities # for all individuals in the estimation dataset. ps1 < posterior_survfit(example_jm, extrapolate = TRUE) # We then plot the conditional survival probabilities for # a subset of individuals plot(ps1, ids = c(7,13,15)) # We can change or add attributes to the plot plot(ps1, ids = c(7,13,15), limits = "none") plot(ps1, ids = c(7,13,15), xlab = "Follow up time") plot(ps1, ids = c(7,13,15), ci_geom_args = list(fill = "red"), color = "blue", linetype = 2) plot(ps1, ids = c(7,13,15), facet_scales = "fixed") # Since the returned plot is also a ggplot object, we can # modify some of its attributes after it has been returned plot1 < plot(ps1, ids = c(7,13,15)) plot1 + ggplot2::theme(strip.background = ggplot2::element_blank()) + ggplot2::coord_cartesian(xlim = c(0, 15)) + ggplot2::labs(title = "Some plotted survival functions") # We can also combine the plot(s) of the estimated # subjectspecific survival functions, with plot(s) # of the estimated longitudinal trajectories for the # same individuals ps1 < posterior_survfit(example_jm, ids = c(7,13,15)) pt1 < posterior_traj(example_jm, , ids = c(7,13,15)) plot_surv < plot(ps1) plot_traj < plot(pt1, vline = TRUE, plot_observed = TRUE) plot_stack_jm(plot_traj, plot_surv) # Lastly, let us plot the standardised survival function # based on all individuals in our estimation dataset ps2 < posterior_survfit(example_jm, standardise = TRUE, times = 0, control = list(epoints = 20)) plot(ps2) if (.Platform$OS.type != "windows") { if (!exists("example_jm")) example(example_jm) ps1 < posterior_survfit(example_jm, ids = c(7,13,15)) pt1 < posterior_traj(example_jm, ids = c(7,13,15), extrapolate = TRUE) plot_surv < plot(ps1) plot_traj < plot(pt1, vline = TRUE, plot_observed = TRUE) plot_stack_jm(plot_traj, plot_surv) }
For models fit using MCMC (algorithm="sampling"
) or one of the
variational approximations ("meanfield"
or "fullrank"
), the
posterior_interval
function computes Bayesian posterior uncertainty
intervals. These intervals are often referred to as credible
intervals, but we use the term uncertainty intervals to highlight the
fact that wider intervals correspond to greater uncertainty.
## S3 method for class 'stanreg' posterior_interval( object, prob = 0.9, type = "central", pars = NULL, regex_pars = NULL, ... )
## S3 method for class 'stanreg' posterior_interval( object, prob = 0.9, type = "central", pars = NULL, regex_pars = NULL, ... )
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
prob 
A number 
type 
The type of interval to compute. Currently the only option is

pars 
An optional character vector of parameter names. 
regex_pars 
An optional character vector of regular
expressions to use for parameter selection. 
... 
Currently ignored. 
Unlike for a frenquentist confidence interval, it is valid to say that,
conditional on the data and model, we believe that with probability $p$
the value of a parameter is in its $100p$
% posterior interval. This
intuitive interpretation of Bayesian intervals is often erroneously applied
to frequentist confidence intervals. See Morey et al. (2015) for more details
on this issue and the advantages of using Bayesian posterior uncertainty
intervals (also known as credible intervals).
We default to reporting $90$
% intervals rather than $95$
% intervals
for several reasons:
Computational stability: $90$
% intervals are more stable than
$95$
% intervals (for which each end relies on only $2.5$
% of the
posterior draws).
Relation to TypeS errors (Gelman and Carlin, 2014):
$95$
% of the mass in a $90$
% central interval is above the lower
value (and $95$
% is below the upper value). For a parameter
$\theta$
, it is therefore easy to see if the posterior probability that
$\theta > 0$
(or $\theta < 0$
) is larger or smaller than $95$
%.
Of course, if $95$
% intervals are desired they can be computed by
specifying prob=0.95
.
Currently posterior_interval
only computes central intervals because
other types of intervals are rarely useful for the models that rstanarm
can estimate. Additional possibilities may be provided in future releases as
more models become available.
A matrix with two columns and as many rows as model parameters (or
the subset of parameters specified by pars
and/or
regex_pars
). For a given value of prob
, $p$
, the columns
correspond to the lower and upper $100p$
% interval limits and have the
names $100\alpha/2$
% and $100(1  \alpha/2)$
%, where $\alpha
= 1p$
. For example, if prob=0.9
is specified (a $90$
%
interval), then the column names will be "5%"
and "95%"
,
respectively.
Gelman, A. and Carlin, J. (2014). Beyond power calculations: assessing Type S (sign) and Type M (magnitude) errors. Perspectives on Psychological Science. 9(6), 641–51.
Morey, R. D., Hoekstra, R., Rouder, J., Lee, M. D., and Wagenmakers, E. (2016). The fallacy of placing confidence in confidence intervals. Psychonomic Bulletin & Review. 23(1), 103–123.
confint.stanreg
, which, for models fit using optimization, can
be used to compute traditional confidence intervals.
predictive_interval
for predictive intervals.
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) posterior_interval(example_model) posterior_interval(example_model, regex_pars = "herd") posterior_interval(example_model, pars = "period2", prob = 0.5) }
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) posterior_interval(example_model) posterior_interval(example_model, regex_pars = "herd") posterior_interval(example_model, pars = "period2", prob = 0.5) }
Extract the posterior draws of the linear predictor, possibly transformed by
the inverselink function. This function is occasionally useful, but it
should be used sparingly: inference and model checking should generally be
carried out using the posterior predictive distribution (i.e., using
posterior_predict
).
## S3 method for class 'stanreg' posterior_linpred( object, transform = FALSE, newdata = NULL, draws = NULL, re.form = NULL, offset = NULL, XZ = FALSE, ... ) ## S3 method for class 'stanreg' posterior_epred( object, newdata = NULL, draws = NULL, re.form = NULL, offset = NULL, XZ = FALSE, ... )
## S3 method for class 'stanreg' posterior_linpred( object, transform = FALSE, newdata = NULL, draws = NULL, re.form = NULL, offset = NULL, XZ = FALSE, ... ) ## S3 method for class 'stanreg' posterior_epred( object, newdata = NULL, draws = NULL, re.form = NULL, offset = NULL, XZ = FALSE, ... )
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
transform 
Should the linear predictor be transformed using the
inverselink function? The default is 
newdata , draws , re.form , offset

Same as for 
XZ 
If 
... 
Currently ignored. 
The posterior_linpred
function returns the posterior
distribution of the linear predictor, while the posterior_epred
function returns the posterior distribution of the conditional expectation.
In the special case of a Gaussian likelihood with an identity link
function, these two concepts are the same. The posterior_epred
function is a less noisy way to obtain expectations over the output of
posterior_predict
.
The default is to return a draws
by nrow(newdata)
matrix of simulations from the posterior distribution of the (possibly
transformed) linear predictor. The exception is if the argument XZ
is set to TRUE
(see the XZ
argument description above).
For models estimated with stan_clogit
, the number of
successes per stratum is ostensibly fixed by the research design. Thus,
when calling posterior_linpred
with new data and transform =
TRUE
, the data.frame
passed to the newdata
argument must
contain an outcome variable and a stratifying factor, both with the same
name as in the original data.frame
. Then, the probabilities will
condition on this outcome in the new data.
posterior_predict
to draw from the posterior
predictive distribution of the outcome, which is typically preferable.
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) print(family(example_model)) # linear predictor on logodds scale linpred < posterior_linpred(example_model) colMeans(linpred) # probabilities # same as posterior_linpred(example_model, transform = TRUE) probs < posterior_epred(example_model) colMeans(probs) # not conditioning on any grouplevel parameters probs2 < posterior_epred(example_model, re.form = NA) apply(probs2, 2, median) }
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) print(family(example_model)) # linear predictor on logodds scale linpred < posterior_linpred(example_model) colMeans(linpred) # probabilities # same as posterior_linpred(example_model, transform = TRUE) probs < posterior_epred(example_model) colMeans(probs) # not conditioning on any grouplevel parameters probs2 < posterior_epred(example_model, re.form = NA) apply(probs2, 2, median) }
The posterior predictive distribution is the distribution of the outcome implied by the model after using the observed data to update our beliefs about the unknown parameters in the model. Simulating data from the posterior predictive distribution using the observed predictors is useful for checking the fit of the model. Drawing from the posterior predictive distribution at interesting values of the predictors also lets us visualize how a manipulation of a predictor affects (a function of) the outcome(s). With new observations of predictor variables we can use the posterior predictive distribution to generate predicted outcomes.
## S3 method for class 'stanreg' posterior_predict( object, newdata = NULL, draws = NULL, re.form = NULL, fun = NULL, seed = NULL, offset = NULL, ... ) ## S3 method for class 'stanmvreg' posterior_predict( object, m = 1, newdata = NULL, draws = NULL, re.form = NULL, fun = NULL, seed = NULL, offset = NULL, ... )
## S3 method for class 'stanreg' posterior_predict( object, newdata = NULL, draws = NULL, re.form = NULL, fun = NULL, seed = NULL, offset = NULL, ... ) ## S3 method for class 'stanmvreg' posterior_predict( object, m = 1, newdata = NULL, draws = NULL, re.form = NULL, fun = NULL, seed = NULL, offset = NULL, ... )
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
newdata 
Optionally, a data frame in which to look for variables with
which to predict. If omitted, the model matrix is used. If 
draws 
An integer indicating the number of draws to return. The default and maximum number of draws is the size of the posterior sample. 
re.form 
If 
fun 
An optional function to apply to the results. 
seed 
An optional 
offset 
A vector of offsets. Only required if 
... 
For 
m 
Integer specifying the number or name of the submodel 
A draws
by nrow(newdata)
matrix of simulations from the
posterior predictive distribution. Each row of the matrix is a vector of
predictions generated using a single draw of the model parameters from the
posterior distribution.
For binomial models with a number of trials greater than one (i.e., not
Bernoulli models), if newdata
is specified then it must include all
variables needed for computing the number of binomial trials to use for the
predictions. For example if the lefthand side of the model formula is
cbind(successes, failures)
then both successes
and
failures
must be in newdata
. The particular values of
successes
and failures
in newdata
do not matter so
long as their sum is the desired number of trials. If the lefthand side of
the model formula were cbind(successes, trials  successes)
then
both trials
and successes
would need to be in newdata
,
probably with successes
set to 0
and trials
specifying
the number of trials. See the Examples section below and the
How to Use the rstanarm Package for examples.
For models estimated with stan_clogit
, the number of
successes per stratum is ostensibly fixed by the research design. Thus, when
doing posterior prediction with new data, the data.frame
passed to
the newdata
argument must contain an outcome variable and a stratifying
factor, both with the same name as in the original data.frame
. Then, the
posterior predictions will condition on this outcome in the new data.
pp_check
for graphical posterior predictive checks.
Examples of posterior predictive checking can also be found in the
rstanarm vignettes and demos.
predictive_error
and predictive_interval
.
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) yrep < posterior_predict(example_model) table(yrep) # Using newdata counts < c(18,17,15,20,10,20,25,13,12) outcome < gl(3,1,9) treatment < gl(3,3) dat < data.frame(counts, treatment, outcome) fit3 < stan_glm( counts ~ outcome + treatment, data = dat, family = poisson(link="log"), prior = normal(0, 1, autoscale = FALSE), prior_intercept = normal(0, 5, autoscale = FALSE), refresh = 0 ) nd < data.frame(treatment = factor(rep(1,3)), outcome = factor(1:3)) ytilde < posterior_predict(fit3, nd, draws = 500) print(dim(ytilde)) # 500 by 3 matrix (draws by nrow(nd)) ytilde < data.frame( count = c(ytilde), outcome = rep(nd$outcome, each = 500) ) ggplot2::ggplot(ytilde, ggplot2::aes(x=outcome, y=count)) + ggplot2::geom_boxplot() + ggplot2::ylab("predicted count") # Using newdata with a binomial model. # example_model is binomial so we need to set # the number of trials to use for prediction. # This could be a different number for each # row of newdata or the same for all rows. # Here we'll use the same value for all. nd < lme4::cbpp print(formula(example_model)) # cbind(incidence, size  incidence) ~ ... nd$size < max(nd$size) + 1L # number of trials nd$incidence < 0 # set to 0 so size  incidence = number of trials ytilde < posterior_predict(example_model, newdata = nd) # Using fun argument to transform predictions mtcars2 < mtcars mtcars2$log_mpg < log(mtcars2$mpg) fit < stan_glm(log_mpg ~ wt, data = mtcars2, refresh = 0) ytilde < posterior_predict(fit, fun = exp) }
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) yrep < posterior_predict(example_model) table(yrep) # Using newdata counts < c(18,17,15,20,10,20,25,13,12) outcome < gl(3,1,9) treatment < gl(3,3) dat < data.frame(counts, treatment, outcome) fit3 < stan_glm( counts ~ outcome + treatment, data = dat, family = poisson(link="log"), prior = normal(0, 1, autoscale = FALSE), prior_intercept = normal(0, 5, autoscale = FALSE), refresh = 0 ) nd < data.frame(treatment = factor(rep(1,3)), outcome = factor(1:3)) ytilde < posterior_predict(fit3, nd, draws = 500) print(dim(ytilde)) # 500 by 3 matrix (draws by nrow(nd)) ytilde < data.frame( count = c(ytilde), outcome = rep(nd$outcome, each = 500) ) ggplot2::ggplot(ytilde, ggplot2::aes(x=outcome, y=count)) + ggplot2::geom_boxplot() + ggplot2::ylab("predicted count") # Using newdata with a binomial model. # example_model is binomial so we need to set # the number of trials to use for prediction. # This could be a different number for each # row of newdata or the same for all rows. # Here we'll use the same value for all. nd < lme4::cbpp print(formula(example_model)) # cbind(incidence, size  incidence) ~ ... nd$size < max(nd$size) + 1L # number of trials nd$incidence < 0 # set to 0 so size  incidence = number of trials ytilde < posterior_predict(example_model, newdata = nd) # Using fun argument to transform predictions mtcars2 < mtcars mtcars2$log_mpg < log(mtcars2$mpg) fit < stan_glm(log_mpg ~ wt, data = mtcars2, refresh = 0) ytilde < posterior_predict(fit, fun = exp) }
This function allows us to generate predicted quantities for survival
models at specified times. These quantities include the hazard rate,
cumulative hazard, survival probability, or failure probability (i.e. CDF).
Note that the cumulative hazard, survival probability, or failure
probability may be conditional on a last known survival time (see the
condition
argument discussed below). Predictions are obtained
using unique draws from the posterior distribution of each of the model
parameters and then summarised into a median and posterior uncertainty
interval. For stan_jm
models "dynamic" predictions are allowed and
are in fact the default when new data is provided (see the dynamic
argument discussed below).
posterior_survfit(object, ...) ## S3 method for class 'stansurv' posterior_survfit( object, newdata = NULL, type = "surv", extrapolate = TRUE, control = list(), condition = FALSE, last_time = NULL, prob = 0.95, times = NULL, standardise = FALSE, draws = NULL, seed = NULL, return_matrix = FALSE, ... ) ## S3 method for class 'stanjm' posterior_survfit( object, newdataLong = NULL, newdataEvent = NULL, type = "surv", extrapolate = TRUE, control = list(), condition = NULL, last_time = NULL, prob = 0.95, ids, times = NULL, standardise = FALSE, dynamic = TRUE, scale = 1.5, draws = NULL, seed = NULL, return_matrix = FALSE, ... )
posterior_survfit(object, ...) ## S3 method for class 'stansurv' posterior_survfit( object, newdata = NULL, type = "surv", extrapolate = TRUE, control = list(), condition = FALSE, last_time = NULL, prob = 0.95, times = NULL, standardise = FALSE, draws = NULL, seed = NULL, return_matrix = FALSE, ... ) ## S3 method for class 'stanjm' posterior_survfit( object, newdataLong = NULL, newdataEvent = NULL, type = "surv", extrapolate = TRUE, control = list(), condition = NULL, last_time = NULL, prob = 0.95, ids, times = NULL, standardise = FALSE, dynamic = TRUE, scale = 1.5, draws = NULL, seed = NULL, return_matrix = FALSE, ... )
object 
A fitted model object returned by the

... 
Currently unused. 
newdata 
Optionally, a data frame in which to look for variables with
which to predict. If omitted, the model matrix is used. If 
type 
The type of prediction to return. The following are currently allowed:

extrapolate 
A logical specifying whether to extrapolate the estimated
survival probabilities beyond the times specified in the 
control 
A named list with parameters controlling extrapolation
of the estimated survival function when

condition 
A logical specifying whether the estimated
subjectspecific survival probabilities at time 
last_time 
A scalar, character string, or 
prob 
A scalar between 0 and 1 specifying the width to use for the
uncertainty interval (sometimes called credible interval) for the predictions.
For example 
times 
A scalar, a character string, or 
standardise 
A logical specifying whether the estimated
subjectspecific survival probabilities should be averaged
across all individuals for whom the subjectspecific predictions are
being obtained. This can be used to average over the covariate and random effects
distributions of the individuals used in estimating the model, or the individuals
included in the 
draws 
An integer specifying the number of MCMC draws to use when
evaluating the predicted quantities. For 
seed 
An optional 
return_matrix 
A logical. If 
newdataLong , newdataEvent

An optional data frame (or in the case of

ids 
For 
dynamic 
A logical that is only relevant for 
scale 
Only relevant for 
By default, the predicted quantities are evaluated conditional on observed values of the fixed effect covariates. That is, predictions will be obtained using either:
the design matrices used in the original stan_surv
or stan_jm
model call, or
the covariate values provided in the newdata
argument
(or newdataLong
and newdataEvent
arugments for the
stanjm
method).
However, if you wish to average over the observed distribution
of the fixed effect covariates then this is possible – such predictions
are sometimes referred to as standardised survival probabilties – see the
standardise
argument.
For stansurv
objects, the predicted quantities are calculated for
each row of the prediction data, at the specified times
as
well as any times generated through extrapolation (when
extrapolate = TRUE
).
For stanjm
objects, the predicted quantities are calculated for
each individual, at the specified times
as well as any times
generated through extrapolation (when extrapolate = TRUE
).
The following also applies for stanjm
objects.
By default the survival probabilities are conditional on an individual's
groupspecific coefficients (i.e. their individuallevel random
effects). If prediction data is provided via the newdataLong
and newdataEvent
arguments, then the default behaviour is to
sample new groupspecific coefficients for the individuals in the
new data using a Monte Carlo scheme that conditions on their
longitudinal outcome data provided in newdataLong
(sometimes referred to as "dynamic predictions", see Rizopoulos
(2011)). This default behaviour can be stopped by specifying
dynamic = FALSE
, in which case the predicted survival
probabilities will be marginalised over the distribution of the
groupspecific coefficients. This has the benefit that the user does
not need to provide longitudinal outcome measurements for the new
individuals, however, it does mean that the predictions will incorporate
all the uncertainty associated with betweenindividual variation in the
biomarker (longitudinal outcome) values since the predictions aren't
conditional on any observed biomarker (longitudinal outcome) data for
the individual.
When return_matrix = FALSE
(the default), a data frame of
class survfit.stansurv
or survfit.stanjm
. The data frame
includes columns for each of the following:
(i) the median of the posterior predictions (median
);
(ii) each of the lower and upper limits of the corresponding uncertainty
interval for the posterior predictions (ci_lb
and ci_ub
);
(iii) an observation identifier (for stan_surv
models) or an
individual identifier (for stan_jm
models), unless standardised
predictions were requested;
(iv) the time that the prediction corresponds to (time
).
(v) the last known survival time on which the prediction is conditional
(cond_time
); this will be set to NA if not relevant.
The returned object also includes a number of additional attributes.
When return_matrix = TRUE
a list of matrices is returned. Each
matrix contains the predictions evaluated at one step of the
extrapolation time sequence (note that if extrapolate = FALSE
then the list will be of length one, i.e. the predictions are only
evaluated at times
which corresponds to just one time point
for each individual). Each matrix will have draws
rows and
nrow(newdata)
columns, such that each row contains a
vector of predictions generated using a single draw of the model
parameters from the posterior distribution. The returned
list also includes a number of additional attributes.
Note that if any variables were transformed (e.g. rescaled) in the data
used to fit the model, then these variables must also be transformed in
newdataLong
and newdataEvent
. This only applies if variables
were transformed before passing the data to one of the modeling functions
and not if transformations were specified inside the model formula.
Rizopoulos, D. (2011). Dynamic predictions and prospective accuracy in joint models for longitudinal and timetoevent data. Biometrics 67, 819.
plot.survfit.stanjm
for plotting the estimated survival
probabilities ps_check
for for graphical checks of the estimated
survival function posterior_traj
for estimating the
marginal or subjectspecific longitudinal trajectories plot_stack_jm
for combining plots of the estimated
subjectspecific longitudinal trajectory and survival function
# Run example model if not already loaded if (!exists("example_jm")) example(example_jm) # Obtain subjectspecific survival probabilities for a few # selected individuals in the estimation dataset who were # known to survive up until their censoring time. By default # the posterior_survfit function will estimate the conditional # survival probabilities, that is, conditional on having survived # until the event or censoring time, and then by default will # extrapolate the survival predictions forward from there. ps1 < posterior_survfit(example_jm, ids = c(7,13,15)) # We can plot the estimated survival probabilities using the # associated plot function plot(ps1) # If we wanted to estimate the survival probabilities for the # same three individuals as the previous example, but this time # we won't condition on them having survived up until their # censoring time. Instead, we will estimate their probability # of having survived between 0 and 5 years given their covariates # and their estimated random effects. # The easiest way to achieve the time scale we want (ie, 0 to 5 years) # is to specify that we want the survival time estimated at time 0 # and then extrapolated forward 5 years. We also specify that we # do not want to condition on their last known survival time. ps2 < posterior_survfit(example_jm, ids = c(7,13,15), times = 0, extrapolate = TRUE, condition = FALSE, control = list(edist = 5)) # Instead we may want to estimate subjectspecific survival probabilities # for a set of new individuals. To demonstrate this, we will simply take # the first two individuals in the estimation dataset, but pass their data # via the newdata arguments so that posterior_survfit will assume we are # predicting survival for new individuals and draw new random effects # under a Monte Carlo scheme (see Rizopoulos (2011)). ndL < pbcLong[pbcLong$id %in% c(1,2),] ndE < pbcSurv[pbcSurv$id %in% c(1,2),] ps3 < posterior_survfit(example_jm, newdataLong = ndL, newdataEvent = ndE, last_time = "futimeYears", seed = 12345) head(ps3) # We can then compare the estimated random effects for these # individuals based on the fitted model and the Monte Carlo scheme ranef(example_jm)$Long1$id[1:2,,drop=FALSE] # from fitted model colMeans(attr(ps3, "b_new")) # from Monte Carlo scheme # Lastly, if we wanted to obtain "standardised" survival probabilities, # (by averaging over the observed distribution of the fixed effect # covariates, as well as averaging over the estimated random effects # for individuals in our estimation sample or new data) then we can # specify 'standardise = TRUE'. We can then plot the resulting # standardised survival curve. ps4 < posterior_survfit(example_jm, standardise = TRUE, times = 0, extrapolate = TRUE) plot(ps4)
# Run example model if not already loaded if (!exists("example_jm")) example(example_jm) # Obtain subjectspecific survival probabilities for a few # selected individuals in the estimation dataset who were # known to survive up until their censoring time. By default # the posterior_survfit function will estimate the conditional # survival probabilities, that is, conditional on having survived # until the event or censoring time, and then by default will # extrapolate the survival predictions forward from there. ps1 < posterior_survfit(example_jm, ids = c(7,13,15)) # We can plot the estimated survival probabilities using the # associated plot function plot(ps1) # If we wanted to estimate the survival probabilities for the # same three individuals as the previous example, but this time # we won't condition on them having survived up until their # censoring time. Instead, we will estimate their probability # of having survived between 0 and 5 years given their covariates # and their estimated random effects. # The easiest way to achieve the time scale we want (ie, 0 to 5 years) # is to specify that we want the survival time estimated at time 0 # and then extrapolated forward 5 years. We also specify that we # do not want to condition on their last known survival time. ps2 < posterior_survfit(example_jm, ids = c(7,13,15), times = 0, extrapolate = TRUE, condition = FALSE, control = list(edist = 5)) # Instead we may want to estimate subjectspecific survival probabilities # for a set of new individuals. To demonstrate this, we will simply take # the first two individuals in the estimation dataset, but pass their data # via the newdata arguments so that posterior_survfit will assume we are # predicting survival for new individuals and draw new random effects # under a Monte Carlo scheme (see Rizopoulos (2011)). ndL < pbcLong[pbcLong$id %in% c(1,2),] ndE < pbcSurv[pbcSurv$id %in% c(1,2),] ps3 < posterior_survfit(example_jm, newdataLong = ndL, newdataEvent = ndE, last_time = "futimeYears", seed = 12345) head(ps3) # We can then compare the estimated random effects for these # individuals based on the fitted model and the Monte Carlo scheme ranef(example_jm)$Long1$id[1:2,,drop=FALSE] # from fitted model colMeans(attr(ps3, "b_new")) # from Monte Carlo scheme # Lastly, if we wanted to obtain "standardised" survival probabilities, # (by averaging over the observed distribution of the fixed effect # covariates, as well as averaging over the estimated random effects # for individuals in our estimation sample or new data) then we can # specify 'standardise = TRUE'. We can then plot the resulting # standardised survival curve. ps4 < posterior_survfit(example_jm, standardise = TRUE, times = 0, extrapolate = TRUE) plot(ps4)
This function allows us to generate an estimated longitudinal trajectory (either subjectspecific, or by marginalising over the distribution of the groupspecific parameters) based on draws from the posterior predictive distribution.
posterior_traj( object, m = 1, newdata = NULL, newdataLong = NULL, newdataEvent = NULL, interpolate = TRUE, extrapolate = FALSE, control = list(), last_time = NULL, prob = 0.95, ids, dynamic = TRUE, scale = 1.5, draws = NULL, seed = NULL, return_matrix = FALSE, ... )
posterior_traj( object, m = 1, newdata = NULL, newdataLong = NULL, newdataEvent = NULL, interpolate = TRUE, extrapolate = FALSE, control = list(), last_time = NULL, prob = 0.95, ids, dynamic = TRUE, scale = 1.5, draws = NULL, seed = NULL, return_matrix = FALSE, ... )
object 
A fitted model object returned by the

m 
Integer specifying the number or name of the submodel 
newdata 
Deprecated: please use 
newdataLong , newdataEvent

Optionally, a data frame (or in the case of

interpolate 
A logical specifying whether to interpolate the estimated
longitudinal trajectory in between the observation times. This can be used
to achieve a smooth estimate of the longitudinal trajectory across the
entire follow up time. If 
extrapolate 
A logical specifying whether to extrapolate the estimated
longitudinal trajectory beyond the time of the last known observation time.
If 
control 
A named list with parameters controlling the interpolation or
extrapolation of the estimated longitudinal trajectory when either

last_time 
A scalar, character string, or 
prob 
A scalar between 0 and 1 specifying the width to use for the
uncertainty interval (sometimes called credible interval) for the predicted
mean response and the prediction interval for the predicted (raw) response.
For example 
ids 
An optional vector specifying a subset of subject IDs for whom the
predictions should be obtained. The default is to predict for all individuals
who were used in estimating the model or, if 
dynamic 
A logical that is only relevant if new data is provided
via the 
scale 
A scalar, specifying how much to multiply the asymptotic
variancecovariance matrix for the random effects by, which is then
used as the "width" (ie. variancecovariance matrix) of the multivariate
Studentt proposal distribution in the MetropolisHastings algorithm. This
is only relevant when 
draws 
An integer indicating the number of MCMC draws to return. The default is to set the number of draws equal to 200, or equal to the size of the posterior sample if that is less than 200. 
seed 
An optional 
return_matrix 
A logical. If 
... 
Other arguments passed to 
The posterior_traj
function acts as a wrapper to the
posterior_predict
function, but allows predictions to be
easily generated at time points that are interpolated and/or extrapolated
between time zero (baseline) and the last known survival time for the
individual, thereby providing predictions that correspond to a smooth estimate
of the longitudinal trajectory (useful for the plotting via the associated
plot.predict.stanjm
method). In addition it returns a data
frame by default, whereas the posterior_predict
function
returns a matrix; see the Value section below for details. Also,
posterior_traj
allows predictions to only be generated for a subset
of individuals, via the ids
argument.
When return_matrix = FALSE
, a data frame
of class predict.stanjm
. The data frame includes a column for the median
of the posterior predictions of the mean longitudinal response (yfit
),
a column for each of the lower and upper limits of the uncertainty interval
corresponding to the posterior predictions of the mean longitudinal response
(ci_lb
and ci_ub
), and a column for each of the lower and upper
limits of the prediction interval corresponding to the posterior predictions
of the (raw) longitudinal response. The data frame also includes columns for
the subject ID variable, and each of the predictor variables. The returned
object also includes a number of attributes.
When return_matrix = TRUE
, the returned object is the same as that
described for posterior_predict
.
plot.predict.stanjm
, posterior_predict
,
posterior_survfit
# Run example model if not already loaded if (!exists("example_jm")) example(example_jm) # Obtain subjectspecific predictions for all individuals # in the estimation dataset pt1 < posterior_traj(example_jm, interpolate = FALSE, extrapolate = FALSE) head(pt1) # Obtain subjectspecific predictions only for a few selected individuals pt2 < posterior_traj(example_jm, ids = c(1,3,8)) # If we wanted to obtain subjectspecific predictions in order to plot the # longitudinal trajectories, then we might want to ensure a full trajectory # is obtained by interpolating and extrapolating time. We can then use the # generic plot function to plot the subjectspecific predicted trajectories # for the first three individuals. Interpolation and extrapolation is # carried out by default. pt3 < posterior_traj(example_jm) head(pt3) # predictions at additional time points compared with pt1 plot(pt3, ids = 1:3) # If we wanted to extrapolate further in time, but decrease the number of # discrete time points at which we obtain predictions for each individual, # then we could specify a named list in the 'control' argument pt4 < posterior_traj(example_jm, control = list(ipoints = 10, epoints = 10, eprop = 0.5)) # If we have prediction data for a new individual, and we want to # estimate the longitudinal trajectory for that individual conditional # on this new data (perhaps extrapolating forward from our last # longitudinal measurement) then we can do that. It requires drawing # new individualspecific parameters, based on the full likelihood, # so we must supply new data for both the longitudinal and event # submodels. These are sometimes known as dynamic predictions. ndL < pbcLong[pbcLong$id == 8, , drop = FALSE] ndE < pbcSurv[pbcSurv$id == 8, , drop = FALSE] ndL$id < "new_subject" # new id can't match one used in training data ndE$id < "new_subject" pt5 < posterior_traj(example_jm, newdataLong = ndL, newdataEvent = ndE) # By default it is assumed that the last known survival time for # the individual is the time of their last biomarker measurement, # but if we know they survived to some later time then we can # condition on that information using the last_time argument pt6 < posterior_traj(example_jm, newdataLong = ndL, newdataEvent = ndE, last_time = "futimeYears") # Alternatively we may want to estimate the marginal longitudinal # trajectory for a given set of covariates. To do this, we can pass # the desired covariate values in a new data frame (however the only # covariate in our fitted model was the time variable, year). To make sure # that we marginalise over the random effects, we need to specify an ID value # which does not correspond to any of the individuals who were used in the # model estimation and specify the argument dynamic=FALSE. # The marginal prediction is obtained by generating subjectspecific # predictions using a series of random draws from the random # effects distribution, and then integrating (ie, averaging) over these. # Our marginal prediction will therefore capture the betweenindividual # variation associated with the random effects. nd < data.frame(id = rep("new1", 11), year = (0:10 / 2)) pt7 < posterior_traj(example_jm, newdataLong = nd, dynamic = FALSE) head(pt7) # note the greater width of the uncertainty interval compared # with the subjectspecific predictions in pt1, pt2, etc # Alternatively, we could have estimated the "marginal" trajectory by # ignoring the random effects (ie, assuming the random effects were set # to zero). This will generate a predicted longitudinal trajectory only # based on the fixed effect component of the model. In essence, for a # linear mixed effects model (ie, a model that uses an identity link # function), we should obtain a similar point estimate ("yfit") to the # estimates obtained in pt5 (since the mean of the estimated random effects # distribution will be approximately 0). However, it is important to note that # the uncertainty interval will be much more narrow, since it completely # ignores the betweenindividual variability captured by the random effects. # Further, if the model uses a nonidentity link function, then the point # estimate ("yfit") obtained only using the fixed effect component of the # model will actually provide a biased estimate of the marginal prediction. # Nonetheless, to demonstrate how we can obtain the predictions only using # the fixed effect component of the model, we simply specify 're.form = NA'. # (We will use the same covariate values as used in the prediction for # example for pt5). pt8 < posterior_traj(example_jm, newdataLong = nd, dynamic = FALSE, re.form = NA) head(pt8) # note the much narrower ci, compared with pt5
# Run example model if not already loaded if (!exists("example_jm")) example(example_jm) # Obtain subjectspecific predictions for all individuals # in the estimation dataset pt1 < posterior_traj(example_jm, interpolate = FALSE, extrapolate = FALSE) head(pt1) # Obtain subjectspecific predictions only for a few selected individuals pt2 < posterior_traj(example_jm, ids = c(1,3,8)) # If we wanted to obtain subjectspecific predictions in order to plot the # longitudinal trajectories, then we might want to ensure a full trajectory # is obtained by interpolating and extrapolating time. We can then use the # generic plot function to plot the subjectspecific predicted trajectories # for the first three individuals. Interpolation and extrapolation is # carried out by default. pt3 < posterior_traj(example_jm) head(pt3) # predictions at additional time points compared with pt1 plot(pt3, ids = 1:3) # If we wanted to extrapolate further in time, but decrease the number of # discrete time points at which we obtain predictions for each individual, # then we could specify a named list in the 'control' argument pt4 < posterior_traj(example_jm, control = list(ipoints = 10, epoints = 10, eprop = 0.5)) # If we have prediction data for a new individual, and we want to # estimate the longitudinal trajectory for that individual conditional # on this new data (perhaps extrapolating forward from our last # longitudinal measurement) then we can do that. It requires drawing # new individualspecific parameters, based on the full likelihood, # so we must supply new data for both the longitudinal and event # submodels. These are sometimes known as dynamic predictions. ndL < pbcLong[pbcLong$id == 8, , drop = FALSE] ndE < pbcSurv[pbcSurv$id == 8, , drop = FALSE] ndL$id < "new_subject" # new id can't match one used in training data ndE$id < "new_subject" pt5 < posterior_traj(example_jm, newdataLong = ndL, newdataEvent = ndE) # By default it is assumed that the last known survival time for # the individual is the time of their last biomarker measurement, # but if we know they survived to some later time then we can # condition on that information using the last_time argument pt6 < posterior_traj(example_jm, newdataLong = ndL, newdataEvent = ndE, last_time = "futimeYears") # Alternatively we may want to estimate the marginal longitudinal # trajectory for a given set of covariates. To do this, we can pass # the desired covariate values in a new data frame (however the only # covariate in our fitted model was the time variable, year). To make sure # that we marginalise over the random effects, we need to specify an ID value # which does not correspond to any of the individuals who were used in the # model estimation and specify the argument dynamic=FALSE. # The marginal prediction is obtained by generating subjectspecific # predictions using a series of random draws from the random # effects distribution, and then integrating (ie, averaging) over these. # Our marginal prediction will therefore capture the betweenindividual # variation associated with the random effects. nd < data.frame(id = rep("new1", 11), year = (0:10 / 2)) pt7 < posterior_traj(example_jm, newdataLong = nd, dynamic = FALSE) head(pt7) # note the greater width of the uncertainty interval compared # with the subjectspecific predictions in pt1, pt2, etc # Alternatively, we could have estimated the "marginal" trajectory by # ignoring the random effects (ie, assuming the random effects were set # to zero). This will generate a predicted longitudinal trajectory only # based on the fixed effect component of the model. In essence, for a # linear mixed effects model (ie, a model that uses an identity link # function), we should obtain a similar point estimate ("yfit") to the # estimates obtained in pt5 (since the mean of the estimated random effects # distribution will be approximately 0). However, it is important to note that # the uncertainty interval will be much more narrow, since it completely # ignores the betweenindividual variability captured by the random effects. # Further, if the model uses a nonidentity link function, then the point # estimate ("yfit") obtained only using the fixed effect component of the # model will actually provide a biased estimate of the marginal prediction. # Nonetheless, to demonstrate how we can obtain the predictions only using # the fixed effect component of the model, we simply specify 're.form = NA'. # (We will use the same covariate values as used in the prediction for # example for pt5). pt8 < posterior_traj(example_jm, newdataLong = nd, dynamic = FALSE, re.form = NA) head(pt8) # note the much narrower ci, compared with pt5
Plot medians and central intervals comparing parameter draws from the prior
and posterior distributions. If the plotted priors look different than the
priors you think you specified it is likely either because of internal
rescaling or the use of the QR
argument (see the documentation for the
prior_summary
method for details on
these special cases).
posterior_vs_prior(object, ...) ## S3 method for class 'stanreg' posterior_vs_prior( object, pars = NULL, regex_pars = NULL, prob = 0.9, color_by = c("parameter", "vs", "none"), group_by_parameter = FALSE, facet_args = list(), ... )
posterior_vs_prior(object, ...) ## S3 method for class 'stanreg' posterior_vs_prior( object, pars = NULL, regex_pars = NULL, prob = 0.9, color_by = c("parameter", "vs", "none"), group_by_parameter = FALSE, facet_args = list(), ... )
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
... 
The S3 generic uses 
pars 
An optional character vector specifying a subset of parameters to
display. Parameters can be specified by name or several shortcuts can be
used. Using In addition, for If 
regex_pars 
An optional character vector of regular
expressions to use for parameter selection. 
prob 
A number 
color_by 
How should the estimates be colored? Use 
group_by_parameter 
Should estimates be grouped together by parameter
( 
facet_args 
A named list of arguments passed to

A ggplot object that can be further customized using the ggplot2 package.
Gabry, J. , Simpson, D. , Vehtari, A. , Betancourt, M. and Gelman, A. (2019), Visualization in Bayesian workflow. J. R. Stat. Soc. A, 182: 389402. doi:10.1111/rssa.12378, arXiv preprint, code on GitHub)
if (.Platform$OS.type != "windows") { ## Not run: if (!exists("example_model")) example(example_model) # display nonvarying (i.e. not grouplevel) coefficients posterior_vs_prior(example_model, pars = "beta") # show grouplevel (varying) parameters and group by parameter posterior_vs_prior(example_model, pars = "varying", group_by_parameter = TRUE, color_by = "vs") # group by parameter and allow axis scales to vary across facets posterior_vs_prior(example_model, regex_pars = "period", group_by_parameter = TRUE, color_by = "none", facet_args = list(scales = "free")) # assign to object and customize with functions from ggplot2 (gg < posterior_vs_prior(example_model, pars = c("beta", "varying"), prob = 0.8)) gg + ggplot2::geom_hline(yintercept = 0, size = 0.3, linetype = 3) + ggplot2::coord_flip() + ggplot2::ggtitle("Comparing the prior and posterior") # compare very wide and very narrow priors using roaches example # (see help(roaches, "rstanarm") for info on the dataset) roaches$roach100 < roaches$roach1 / 100 wide_prior < normal(0, 10) narrow_prior < normal(0, 0.1) fit_pois_wide_prior < stan_glm(y ~ treatment + roach100 + senior, offset = log(exposure2), family = "poisson", data = roaches, prior = wide_prior) posterior_vs_prior(fit_pois_wide_prior, pars = "beta", prob = 0.5, group_by_parameter = TRUE, color_by = "vs", facet_args = list(scales = "free")) fit_pois_narrow_prior < update(fit_pois_wide_prior, prior = narrow_prior) posterior_vs_prior(fit_pois_narrow_prior, pars = "beta", prob = 0.5, group_by_parameter = TRUE, color_by = "vs", facet_args = list(scales = "free")) # look at cutpoints for ordinal model fit_polr < stan_polr(tobgp ~ agegp, data = esoph, method = "probit", prior = R2(0.2, "mean"), init_r = 0.1) (gg_polr < posterior_vs_prior(fit_polr, regex_pars = "\\", color_by = "vs", group_by_parameter = TRUE)) # flip the x and y axes gg_polr + ggplot2::coord_flip() ## End(Not run) }
if (.Platform$OS.type != "windows") { ## Not run: if (!exists("example_model")) example(example_model) # display nonvarying (i.e. not grouplevel) coefficients posterior_vs_prior(example_model, pars = "beta") # show grouplevel (varying) parameters and group by parameter posterior_vs_prior(example_model, pars = "varying", group_by_parameter = TRUE, color_by = "vs") # group by parameter and allow axis scales to vary across facets posterior_vs_prior(example_model, regex_pars = "period", group_by_parameter = TRUE, color_by = "none", facet_args = list(scales = "free")) # assign to object and customize with functions from ggplot2 (gg < posterior_vs_prior(example_model, pars = c("beta", "varying"), prob = 0.8)) gg + ggplot2::geom_hline(yintercept = 0, size = 0.3, linetype = 3) + ggplot2::coord_flip() + ggplot2::ggtitle("Comparing the prior and posterior") # compare very wide and very narrow priors using roaches example # (see help(roaches, "rstanarm") for info on the dataset) roaches$roach100 < roaches$roach1 / 100 wide_prior < normal(0, 10) narrow_prior < normal(0, 0.1) fit_pois_wide_prior < stan_glm(y ~ treatment + roach100 + senior, offset = log(exposure2), family = "poisson", data = roaches, prior = wide_prior) posterior_vs_prior(fit_pois_wide_prior, pars = "beta", prob = 0.5, group_by_parameter = TRUE, color_by = "vs", facet_args = list(scales = "free")) fit_pois_narrow_prior < update(fit_pois_wide_prior, prior = narrow_prior) posterior_vs_prior(fit_pois_narrow_prior, pars = "beta", prob = 0.5, group_by_parameter = TRUE, color_by = "vs", facet_args = list(scales = "free")) # look at cutpoints for ordinal model fit_polr < stan_polr(tobgp ~ agegp, data = esoph, method = "probit", prior = R2(0.2, "mean"), init_r = 0.1) (gg_polr < posterior_vs_prior(fit_polr, regex_pars = "\\", color_by = "vs", group_by_parameter = TRUE)) # flip the x and y axes gg_polr + ggplot2::coord_flip() ## End(Not run) }
Interface to the PPC (posterior predictive checking) module
in the bayesplot package, providing various plots comparing the
observed outcome variable $y$
to simulated datasets $y^{rep}$
from the posterior predictive distribution. The pp_check
method for
stanregobjects prepares the arguments required for the specified
bayesplot PPC plotting function and then calls that function. It is
also straightforward to use the functions from the bayesplot package
directly rather than via the pp_check
method. Examples of both are
given below.
## S3 method for class 'stanreg' pp_check(object, plotfun = "dens_overlay", nreps = NULL, seed = NULL, ...)
## S3 method for class 'stanreg' pp_check(object, plotfun = "dens_overlay", nreps = NULL, seed = NULL, ...)
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
plotfun 
A character string naming the bayesplot
PPC function to use. The default is to call

nreps 
The number of 
seed 
An optional 
... 
Additonal arguments passed to the bayesplot function
called. For many plotting functions 
pp_check
returns a ggplot object that can be further
customized using the ggplot2 package.
For binomial data, plots of $y$
and $y^{rep}$
show the
proportion of 'successes' rather than the raw count. Also for binomial
models see ppc_error_binned
for binned residual
plots.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian Data Analysis. Chapman & Hall/CRC Press, London, third edition. (Ch. 6)
Gabry, J. , Simpson, D. , Vehtari, A. , Betancourt, M. and Gelman, A. (2019), Visualization in Bayesian workflow. J. R. Stat. Soc. A, 182: 389402. doi:10.1111/rssa.12378, arXiv preprint, code on GitHub)
The vignettes in the bayesplot package for many examples. Examples of posterior predictive checks can also be found in the rstanarm vignettes and demos.
PPCoverview
(bayesplot) for links to
the documentation for all the available plotting functions.
posterior_predict
for drawing from the posterior
predictive distribution.
color_scheme_set
to change the color scheme
of the plots.
if (.Platform$OS.type != "windows") { fit < stan_glmer( mpg ~ wt + am + (1cyl), data = mtcars, iter = 400, # iter and chains small just to keep example quick chains = 2, refresh = 0 ) # Compare distribution of y to distributions of multiple yrep datasets pp_check(fit) pp_check(fit, plotfun = "boxplot", nreps = 10, notch = FALSE) pp_check(fit, plotfun = "hist", nreps = 3) # Same plot (up to RNG noise) using bayesplot package directly bayesplot::ppc_hist(y = mtcars$mpg, yrep = posterior_predict(fit, draws = 3)) # Check histograms of test statistics by level of grouping variable 'cyl' pp_check(fit, plotfun = "stat_grouped", stat = "median", group = "cyl") # Defining a custom test statistic q25 < function(y) quantile(y, probs = 0.25) pp_check(fit, plotfun = "stat_grouped", stat = "q25", group = "cyl") # Scatterplot of two test statistics pp_check(fit, plotfun = "stat_2d", stat = c("mean", "sd")) # Scatterplot of y vs. average yrep pp_check(fit, plotfun = "scatter_avg") # y vs. average yrep # Same plot (up to RNG noise) using bayesplot package directly bayesplot::ppc_scatter_avg(y = mtcars$mpg, yrep = posterior_predict(fit)) # Scatterplots of y vs. several individual yrep datasets pp_check(fit, plotfun = "scatter", nreps = 3) # Same plot (up to RNG noise) using bayesplot package directly bayesplot::ppc_scatter(y = mtcars$mpg, yrep = posterior_predict(fit, draws = 3)) # yrep intervals with y points overlaid # by default 1:length(y) used on xaxis but can also specify an x variable pp_check(fit, plotfun = "intervals") pp_check(fit, plotfun = "intervals", x = "wt") + ggplot2::xlab("wt") # Same plot (up to RNG noise) using bayesplot package directly bayesplot::ppc_intervals(y = mtcars$mpg, yrep = posterior_predict(fit), x = mtcars$wt) + ggplot2::xlab("wt") # predictive errors pp_check(fit, plotfun = "error_hist", nreps = 6) pp_check(fit, plotfun = "error_scatter_avg_vs_x", x = "wt") + ggplot2::xlab("wt") # Example of a PPC for ordinal models (stan_polr) fit2 < stan_polr(tobgp ~ agegp, data = esoph, method = "probit", prior = R2(0.2, "mean"), init_r = 0.1, refresh = 0) pp_check(fit2, plotfun = "bars", nreps = 500, prob = 0.5) pp_check(fit2, plotfun = "bars_grouped", group = esoph$agegp, nreps = 500, prob = 0.5) }
if (.Platform$OS.type != "windows") { fit < stan_glmer( mpg ~ wt + am + (1cyl), data = mtcars, iter = 400, # iter and chains small just to keep example quick chains = 2, refresh = 0 ) # Compare distribution of y to distributions of multiple yrep datasets pp_check(fit) pp_check(fit, plotfun = "boxplot", nreps = 10, notch = FALSE) pp_check(fit, plotfun = "hist", nreps = 3) # Same plot (up to RNG noise) using bayesplot package directly bayesplot::ppc_hist(y = mtcars$mpg, yrep = posterior_predict(fit, draws = 3)) # Check histograms of test statistics by level of grouping variable 'cyl' pp_check(fit, plotfun = "stat_grouped", stat = "median", group = "cyl") # Defining a custom test statistic q25 < function(y) quantile(y, probs = 0.25) pp_check(fit, plotfun = "stat_grouped", stat = "q25", group = "cyl") # Scatterplot of two test statistics pp_check(fit, plotfun = "stat_2d", stat = c("mean", "sd")) # Scatterplot of y vs. average yrep pp_check(fit, plotfun = "scatter_avg") # y vs. average yrep # Same plot (up to RNG noise) using bayesplot package directly bayesplot::ppc_scatter_avg(y = mtcars$mpg, yrep = posterior_predict(fit)) # Scatterplots of y vs. several individual yrep datasets pp_check(fit, plotfun = "scatter", nreps = 3) # Same plot (up to RNG noise) using bayesplot package directly bayesplot::ppc_scatter(y = mtcars$mpg, yrep = posterior_predict(fit, draws = 3)) # yrep intervals with y points overlaid # by default 1:length(y) used on xaxis but can also specify an x variable pp_check(fit, plotfun = "intervals") pp_check(fit, plotfun = "intervals", x = "wt") + ggplot2::xlab("wt") # Same plot (up to RNG noise) using bayesplot package directly bayesplot::ppc_intervals(y = mtcars$mpg, yrep = posterior_predict(fit), x = mtcars$wt) + ggplot2::xlab("wt") # predictive errors pp_check(fit, plotfun = "error_hist", nreps = 6) pp_check(fit, plotfun = "error_scatter_avg_vs_x", x = "wt") + ggplot2::xlab("wt") # Example of a PPC for ordinal models (stan_polr) fit2 < stan_polr(tobgp ~ agegp, data = esoph, method = "probit", prior = R2(0.2, "mean"), init_r = 0.1, refresh = 0) pp_check(fit2, plotfun = "bars", nreps = 500, prob = 0.5) pp_check(fit2, plotfun = "bars_grouped", group = esoph$agegp, nreps = 500, prob = 0.5) }
The pp_validate
function is based on the methods described in
Cook, Gelman, and Rubin (2006) for validating software developed to fit
particular Bayesian models. Here we take the perspective that models
themselves are software and thus it is useful to apply this validation
approach to individual models.
pp_validate(object, nreps = 20, seed = 12345, ...)
pp_validate(object, nreps = 20, seed = 12345, ...)
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
nreps 
The number of replications to be performed. 
seed 
A seed passed to Stan to use when refitting the model. 
... 
Currently ignored. 
We repeat nreps
times the process of simulating parameters and data
from the model and refitting the model to this simulated data. For each of
the nreps
replications we do the following:
Refit the model but without conditioning on the data (setting
prior_PD=TRUE
), obtaining draws $\theta^{true}$
from the prior distribution of the model parameters.
Given $\theta^{true}$
, simulate data $y^\ast$
from the prior predictive distribution (calling
posterior_predict
on the fitted model object obtained in step
1).
Fit the model to the simulated outcome $y^\ast$
, obtaining
parameters $\theta^{post}$
.
For any individual parameter, the quantile of the "true" parameter value with respect to its posterior distribution should be uniformly distributed. The validation procedure entails looking for deviations from uniformity by computing statistics for a test that the quantiles are uniformly distributed. The absolute values of the computed test statistics are plotted for batches of parameters (e.g., nonvarying coefficients are grouped into a batch called "beta", parameters that vary by group level are in batches named for the grouping variable, etc.). See Cook, Gelman, and Rubin (2006) for more details on the validation procedure.
A ggplot object that can be further customized using the ggplot2 package.
In order to make it through nreps
replications without running
into numerical difficulties you may have to restrict the range for randomly
generating initial values for parameters when you fit the original
model. With any of rstanarm's modeling functions this can be done by
specifying the optional argument init_r
as some number less than the
default of $2$
.
Cook, S., Gelman, A., and Rubin, D. (2006). Validation of software for Bayesian models using posterior quantiles. Journal of Computational and Graphical Statistics. 15(3), 675–692.
pp_check
for graphical posterior predictive checks and
posterior_predict
to draw from the posterior predictive
distribution.
color_scheme_set
to change the color scheme of the
plot.
if (.Platform$OS.type != "windows") { ## Not run: if (!exists("example_model")) example(example_model) try(pp_validate(example_model)) # fails with default seed / priors ## End(Not run) }
if (.Platform$OS.type != "windows") { ## Not run: if (!exists("example_model")) example(example_model) try(pp_validate(example_model)) # fails with default seed / priors ## End(Not run) }
This method is primarily intended to be used only for models fit using
optimization. For models fit using MCMC or one of the variational
approximations, see posterior_predict
.
## S3 method for class 'stanreg' predict( object, ..., newdata = NULL, type = c("link", "response"), se.fit = FALSE )
## S3 method for class 'stanreg' predict( object, ..., newdata = NULL, type = c("link", "response"), se.fit = FALSE )
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
... 
Ignored. 
newdata 
Optionally, a data frame in which to look for variables with which to predict. If omitted, the model matrix is used. 
type 
The type of prediction. The default 
se.fit 
A logical scalar indicating if standard errors should be
returned. The default is 
A vector if se.fit
is FALSE
and a list if se.fit
is TRUE
.
This is a convenience function for computing $y  y^{rep}$
(insample, for observed $y$
) or $y  \tilde{y}$
(outofsample, for new or heldout $y$
). The method for stanreg objects
calls posterior_predict
internally, whereas the method for
matrices accepts the matrix returned by posterior_predict
as input and
can be used to avoid multiple calls to posterior_predict
.
## S3 method for class 'stanreg' predictive_error( object, newdata = NULL, draws = NULL, re.form = NULL, seed = NULL, offset = NULL, ... ) ## S3 method for class 'matrix' predictive_error(object, y, ...) ## S3 method for class 'ppd' predictive_error(object, y, ...) ## S3 method for class 'stanmvreg' predictive_error( object, newdataLong = NULL, newdataEvent = NULL, m = "Event", draws = NULL, re.form = NULL, seed = NULL, offset = NULL, t, u, lossfn = "square", ... )
## S3 method for class 'stanreg' predictive_error( object, newdata = NULL, draws = NULL, re.form = NULL, seed = NULL, offset = NULL, ... ) ## S3 method for class 'matrix' predictive_error(object, y, ...) ## S3 method for class 'ppd' predictive_error(object, y, ...) ## S3 method for class 'stanmvreg' predictive_error( object, newdataLong = NULL, newdataEvent = NULL, m = "Event", draws = NULL, re.form = NULL, seed = NULL, offset = NULL, t, u, lossfn = "square", ... )
object 
Either a fitted model object returned by one of the
rstanarm modeling functions (a stanreg
object) or, for the matrix method, a matrix of draws from the
posterior predictive distribution returned by

newdata , draws , seed , offset , re.form

Optional arguments passed to

... 
Currently ignored. 
y 
For the matrix method only, a vector of 
newdataLong 
Data frame containing the longitudinal data 
newdataEvent 
Data frame containing the event data 
m 
For 
t , u

Only relevant for 
lossfn 
A character string specifying the loss function to use. 
A draws
by nrow(newdata)
matrix. If newdata
is
not specified then it will be draws
by nobs(object)
.
The Note section in posterior_predict
about
newdata
for binomial models also applies for
predictive_error
, with one important difference. For
posterior_predict
if the lefthand side of the model formula is
cbind(successes, failures)
then the particular values of
successes
and failures
in newdata
don't matter, only
that they add to the desired number of trials. This is not the case
for predictive_error
. For predictive_error
the particular
value of successes
matters because it is used as $y$
when
computing the error.
posterior_predict
to draw
from the posterior predictive distribution without computing predictive
errors.
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) err1 < predictive_error(example_model, draws = 50) hist(err1) # Using newdata with a binomial model formula(example_model) nd < data.frame( size = c(10, 20), incidence = c(5, 10), period = factor(c(1,2)), herd = c(1, 15) ) err2 < predictive_error(example_model, newdata = nd, draws = 10, seed = 1234) # stanreg vs matrix methods fit < stan_glm(mpg ~ wt, data = mtcars, iter = 300) preds < posterior_predict(fit, seed = 123) all.equal( predictive_error(fit, seed = 123), predictive_error(preds, y = fit$y) ) }
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) err1 < predictive_error(example_model, draws = 50) hist(err1) # Using newdata with a binomial model formula(example_model) nd < data.frame( size = c(10, 20), incidence = c(5, 10), period = factor(c(1,2)), herd = c(1, 15) ) err2 < predictive_error(example_model, newdata = nd, draws = 10, seed = 1234) # stanreg vs matrix methods fit < stan_glm(mpg ~ wt, data = mtcars, iter = 300) preds < posterior_predict(fit, seed = 123) all.equal( predictive_error(fit, seed = 123), predictive_error(preds, y = fit$y) ) }
For models fit using MCMC (algorithm="sampling"
) or one of the
variational approximations ("meanfield"
or "fullrank"
), the
predictive_interval
function computes Bayesian predictive intervals.
The method for stanreg objects calls posterior_predict
internally, whereas the method for matrices accepts the matrix returned by
posterior_predict
as input and can be used to avoid multiple calls to
posterior_predict
.
## S3 method for class 'stanreg' predictive_interval( object, prob = 0.9, newdata = NULL, draws = NULL, re.form = NULL, fun = NULL, seed = NULL, offset = NULL, ... ) ## S3 method for class 'matrix' predictive_interval(object, prob = 0.9, ...) ## S3 method for class 'ppd' predictive_interval(object, prob = 0.9, ...)
## S3 method for class 'stanreg' predictive_interval( object, prob = 0.9, newdata = NULL, draws = NULL, re.form = NULL, fun = NULL, seed = NULL, offset = NULL, ... ) ## S3 method for class 'matrix' predictive_interval(object, prob = 0.9, ...) ## S3 method for class 'ppd' predictive_interval(object, prob = 0.9, ...)
object 
Either a fitted model object returned by one of the
rstanarm modeling functions (a stanreg
object) or, for the matrix method, a matrix of draws from the
posterior predictive distribution returned by

prob 
A number 
newdata , draws , fun , offset , re.form , seed

Passed to

... 
Currently ignored. 
A matrix with two columns and as many rows as are in newdata
.
If newdata
is not provided then the matrix will have as many rows as
the data used to fit the model. For a given value of prob
, $p$
,
the columns correspond to the lower and upper $100p$
% central interval
limits and have the names $100\alpha/2$
% and $100(1 
\alpha/2)$
%, where $\alpha = 1p$
. For example, if prob=0.9
is
specified (a $90$
% interval), then the column names will be
"5%"
and "95%"
, respectively.
predictive_error
, posterior_predict
,
posterior_interval
if (.Platform$OS.type != "windows") { fit < stan_glm(mpg ~ wt, data = mtcars, iter = 300) predictive_interval(fit) predictive_interval(fit, newdata = data.frame(wt = range(mtcars$wt)), prob = 0.5) # stanreg vs matrix methods preds < posterior_predict(fit, seed = 123) all.equal( predictive_interval(fit, seed = 123), predictive_interval(preds) ) }
if (.Platform$OS.type != "windows") { fit < stan_glm(mpg ~ wt, data = mtcars, iter = 300) predictive_interval(fit) predictive_interval(fit, newdata = data.frame(wt = range(mtcars$wt)), prob = 0.5) # stanreg vs matrix methods preds < posterior_predict(fit, seed = 123) all.equal( predictive_interval(fit, seed = 123), predictive_interval(preds) ) }
The print
method for stanreg objects displays a compact summary of the
fitted model. See the Details section below for descriptions of the
different components of the printed output. For additional summary statistics
and diagnostics use the summary
method.
## S3 method for class 'stanreg' print(x, digits = 1, detail = TRUE, ...) ## S3 method for class 'stanmvreg' print(x, digits = 3, detail = TRUE, ...)
## S3 method for class 'stanreg' print(x, digits = 1, detail = TRUE, ...) ## S3 method for class 'stanmvreg' print(x, digits = 3, detail = TRUE, ...)
x 
A fitted model object returned by one of the
rstanarm modeling functions. See 
digits 
Number of digits to use for formatting numbers. 
detail 
Logical, defaulting to 
... 
Ignored. 
Regardless of the estimation algorithm, point estimates are medians computed
from simulations. For models fit using MCMC ("sampling"
) the posterior
sample is used. For optimization ("optimizing"
), the simulations are
generated from the asymptotic Gaussian sampling distribution of the
parameters. For the "meanfield"
and "fullrank"
variational
approximations, draws from the variational approximation to the posterior are
used. In all cases, the point estimates reported are the same as the values
returned by coef
.
The standard deviations reported (labeled MAD_SD
in the print output)
are computed from the same set of draws described above and are proportional
to the median absolute deviation (mad
) from the median.
Compared to the raw posterior standard deviation, the MAD_SD will be
more robust for longtailed distributions. These are the same as the values
returned by se
.
For GLMs with groupspecific terms (see stan_glmer
) the printed
output also shows point estimates of the standard deviations of the group
effects (and correlations if there are both intercept and slopes that vary by
group).
For analysis of variance models (see stan_aov
) models, an
ANOVAlike table is also displayed.
For joint longitudinal and timetoevent (see stan_jm
) models
the estimates are presented separately for each of the distinct submodels.
Returns x
, invisibly.
summary.stanreg
, stanregmethods
The prior_summary
method provides a summary of the prior distributions
used for the parameters in a given model. In some cases the userspecified
prior does not correspond exactly to the prior used internally by
rstanarm (see the sections below). Especially in these cases, but also
in general, it can be much more useful to visualize the priors. Visualizing
the priors can be done using the posterior_vs_prior
function,
or alternatively by fitting the model with the prior_PD
argument set
to TRUE
(to draw from the prior predictive distribution instead of
conditioning on the outcome) and then plotting the parameters.
## S3 method for class 'stanreg' prior_summary(object, digits = 2, ...)
## S3 method for class 'stanreg' prior_summary(object, digits = 2, ...)
object 
A fitted model object returned by one of the
rstanarm modeling functions. See 
digits 
Number of digits to use for rounding. 
... 
Currently ignored by the method for stanreg objects. 
A list of class "prior_summary.stanreg", which has its own print method.
For rstanarm modeling functions that accept a prior_intercept
argument, the specified prior for the intercept term applies to the
intercept after rstanarm internally centers the predictors so they
each have mean zero. The estimate of the intercept returned to the user
correspond to the intercept with the predictors as specified by the user
(unmodified by rstanarm), but when specifying the prior the
intercept can be thought of as the expected outcome when the predictors are
set to their means. The only exceptions to this are for models fit using
the stan_surv
modelling function, or models fit with the
sparse
argument set to TRUE
(which is only possible with a
subset of the modeling functions and never the default).
For some models you may see "adjusted scale
"
in the printed output and adjusted scales included in the object returned
by prior_summary
. These adjusted scale values are the prior scales
actually used by rstanarm and are computed by adjusting the prior
scales specified by the user to account for the scales of the predictors
(as described in the documentation for the autoscale
argument). To disable internal prior scale adjustments set the
autoscale
argument to FALSE
when setting a prior using one of
the distributions that accepts an autoscale
argument. For example,
normal(0, 5, autoscale=FALSE)
instead of just normal(0, 5)
.
For the models fit with an rstanarm modeling function that supports
the QR
argument (see e.g, stan_glm
), if QR
is
set to TRUE
then the prior distributions for the regression
coefficients specified using the prior
argument are not relative to
the original predictor variables $X$
but rather to the variables in the
matrix $Q$
obtained from the $QR$
decomposition of $X$
.
In particular, if prior = normal(location,scale)
, then this prior on
the coefficients in $Q$
space can be easily translated into a joint
multivariate normal (MVN) prior on the coefficients on the original
predictors in $X$
. Letting $\theta$
denote the coefficients on
$Q$
and $\beta$
the coefficients on $X$
then if $\theta
\sim N(\mu, \sigma)$
the corresponding prior on
$\beta$
is $\beta \sim MVN(R\mu, R'R\sigma^2)$
, where $\mu$
and $\sigma$
are vectors of the
appropriate length. Technically, rstanarm uses a scaled $QR$
decomposition to ensure that the columns of the predictor matrix used to
fit the model all have unit scale, when the autoscale
argument
to the function passed to the prior
argument is TRUE
(the
default), in which case the matrices actually used are
$Q^\ast = Q \sqrt{n1}$
and $R^\ast =
\frac{1}{\sqrt{n1}} R$
. If autoscale = FALSE
we instead scale such that the lowerright element of $R^\ast$
is
$1$
, which is useful if you want to specify a prior on the coefficient
of the last predictor in its original units (see the documentation for the
QR
argument).
If you are interested in the prior on $\beta$
implied by the prior on
$\theta$
, we strongly recommend visualizing it as described above in
the Description section, which is simpler than working it out
analytically.
The priors help page and the Prior Distributions vignette.
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) prior_summary(example_model) priors < prior_summary(example_model) names(priors) priors$prior$scale priors$prior$adjusted_scale # for a glm with adjusted scales (see Details, above), compare # the default (rstanarm adjusting the scales) to setting # autoscale=FALSE for prior on coefficients fit < stan_glm(mpg ~ wt + am, data = mtcars, prior = normal(0, c(2.5, 4)), prior_intercept = normal(0, 5), iter = 10, chains = 1) # only for demonstration prior_summary(fit) fit2 < update(fit, prior = normal(0, c(2.5, 4), autoscale=FALSE), prior_intercept = normal(0, 5, autoscale=FALSE)) prior_summary(fit2) }
if (.Platform$OS.type != "windows") { if (!exists("example_model")) example(example_model) prior_summary(example_model) priors < prior_summary(example_model) names(priors) priors$prior$scale priors$prior$adjusted_scale # for a glm with adjusted scales (see Details, above), compare # the default (rstanarm adjusting the scales) to setting # autoscale=FALSE for prior on coefficients fit < stan_glm(mpg ~ wt + am, data = mtcars, prior = normal(0, c(2.5, 4)), prior_intercept = normal(0, 5), iter = 10, chains = 1) # only for demonstration prior_summary(fit) fit2 < update(fit, prior = normal(0, c(2.5, 4), autoscale=FALSE), prior_intercept = normal(0, 5, autoscale=FALSE)) prior_summary(fit2) }
The functions described on this page are used to specify the
priorrelated arguments of the various modeling functions in the
rstanarm package (to view the priors used for an existing model see
prior_summary
).
The default priors used in the various rstanarm modeling functions
are intended to be weakly informative in that they provide moderate
regularization and help stabilize computation. For many applications the
defaults will perform well, but prudent use of more informative priors is
encouraged. Uniform prior distributions are possible (e.g. by setting
stan_glm
's prior
argument to NULL
) but, unless
the data is very strong, they are not recommended and are not
noninformative, giving the same probability mass to implausible values as
plausible ones.
More information on priors is available in the vignette Prior Distributions for rstanarm Models as well as the vignettes for the various modeling functions. For details on the priors used for multilevel models in particular see the vignette Estimating Generalized (Non)Linear Models with GroupSpecific Terms with rstanarm and also the Covariance matrices section lower down on this page.
normal(location = 0, scale = NULL, autoscale = FALSE) student_t(df = 1, location = 0, scale = NULL, autoscale = FALSE) cauchy(location = 0, scale = NULL, autoscale = FALSE) hs(df = 1, global_df = 1, global_scale = 0.01, slab_df = 4, slab_scale = 2.5) hs_plus( df1 = 1, df2 = 1, global_df = 1, global_scale = 0.01, slab_df = 4, slab_scale = 2.5 ) laplace(location = 0, scale = NULL, autoscale = FALSE) lasso(df = 1, location = 0, scale = NULL, autoscale = FALSE) product_normal(df = 2, location = 0, scale = 1) exponential(rate = 1, autoscale = FALSE) decov(regularization = 1, concentration = 1, shape = 1, scale = 1) lkj(regularization = 1, scale = 10, df = 1, autoscale = TRUE) dirichlet(concentration = 1) R2(location = NULL, what = c("mode", "mean", "median", "log")) default_prior_intercept(family) default_prior_coef(family)
normal(location = 0, scale = NULL, autoscale = FALSE) student_t(df = 1, location = 0, scale = NULL, autoscale = FALSE) cauchy(location = 0, scale = NULL, autoscale = FALSE) hs(df = 1, global_df = 1, global_scale = 0.01, slab_df = 4, slab_scale = 2.5) hs_plus( df1 = 1, df2 = 1, global_df = 1, global_scale = 0.01, slab_df = 4, slab_scale = 2.5 ) laplace(location = 0, scale = NULL, autoscale = FALSE) lasso(df = 1, location = 0, scale = NULL, autoscale = FALSE) product_normal(df = 2, location = 0, scale = 1) exponential(rate = 1, autoscale = FALSE) decov(regularization = 1, concentration = 1, shape = 1, scale = 1) lkj(regularization = 1, scale = 10, df = 1, autoscale = TRUE) dirichlet(concentration = 1) R2(location = NULL, what = c("mode", "mean", "median", "log")) default_prior_intercept(family) default_prior_coef(family)
location 
Prior location. In most cases, this is the prior mean, but
for 
scale 
Prior scale. The default depends on the family (see Details). 
autoscale 
If 
df , df1 , df2

Prior degrees of freedom. The default is 
global_df , global_scale , slab_df , slab_scale

Optional arguments for the hierarchical shrinkage priors. See the Hierarchical shrinkage family section below. 
rate 
Prior rate for the exponential distribution. Defaults to

regularization 
Exponent for an LKJ prior on the correlation matrix in
the 
concentration 
Concentration parameter for a symmetric Dirichlet
distribution. The default is 
shape 
Shape parameter for a gamma prior on the scale parameter in the

what 
A character string among 
family 
Not currently used. 
The details depend on the family of the prior being used:
Family members:
normal(location, scale)
student_t(df, location, scale)
cauchy(location, scale)
Each of these functions also takes an argument autoscale
.
For the prior distribution for the intercept, location
,
scale
, and df
should be scalars. For the prior for the other
coefficients they can either be vectors of length equal to the number of
coefficients (not including the intercept), or they can be scalars, in
which case they will be recycled to the appropriate length. As the
degrees of freedom approaches infinity, the Student t distribution
approaches the normal distribution and if the degrees of freedom are one,
then the Student t distribution is the Cauchy distribution.
If scale
is not specified it will default to $2.5$
, unless the
probit link function is used, in which case these defaults are scaled by a
factor of dnorm(0)/dlogis(0)
, which is roughly $1.6$
.
If the autoscale
argument is TRUE
, then the
scales will be further adjusted as described above in the documentation of
the autoscale
argument in the Arguments section.
Family members:
hs(df, global_df, global_scale, slab_df, slab_scale)
hs_plus(df1, df2, global_df, global_scale, slab_df, slab_scale)
The hierarchical shrinkage priors are normal with a mean of zero and a
standard deviation that is also a random variable. The traditional
hierarchical shrinkage prior utilizes a standard deviation that is
distributed half Cauchy with a median of zero and a scale parameter that is
also half Cauchy. This is called the "horseshoe prior". The hierarchical
shrinkage (hs
) prior in the rstanarm package instead utilizes
a regularized horseshoe prior, as described by Piironen and Vehtari (2017),
which recommends setting the global_scale
argument equal to the ratio
of the expected number of nonzero coefficients to the expected number of
zero coefficients, divided by the square root of the number of observations.
The hierarhical shrinkpage plus (hs_plus
) prior is similar except
that the standard deviation that is distributed as the product of two
independent half Cauchy parameters that are each scaled in a similar way
to the hs
prior.
The hierarchical shrinkage priors have very tall modes and very fat tails.
Consequently, they tend to produce posterior distributions that are very
concentrated near zero, unless the predictor has a strong influence on the
outcome, in which case the prior has little influence. Hierarchical
shrinkage priors often require you to increase the
adapt_delta
tuning parameter in order to diminish the number
of divergent transitions. For more details on tuning parameters and
divergent transitions see the Troubleshooting section of the How to
Use the rstanarm Package vignette.
Family members:
laplace(location, scale)
lasso(df, location, scale)
Each of these functions also takes an argument autoscale
.
The Laplace distribution is also known as the doubleexponential distribution. It is a symmetric distribution with a sharp peak at its mean / median / mode and fairly long tails. This distribution can be motivated as a scale mixture of normal distributions and the remarks above about the normal distribution apply here as well.
The lasso approach to supervised learning can be expressed as finding the
posterior mode when the likelihood is Gaussian and the priors on the
coefficients have independent Laplace distributions. It is commonplace in
supervised learning to choose the tuning parameter by crossvalidation,
whereas a more Bayesian approach would be to place a prior on “it”,
or rather its reciprocal in our case (i.e. smaller values correspond
to more shrinkage toward the prior location vector). We use a chisquare
prior with degrees of freedom equal to that specified in the call to
lasso
or, by default, 1. The expectation of a chisquare random
variable is equal to this degrees of freedom and the mode is equal to the
degrees of freedom minus 2, if this difference is positive.
It is also common in supervised learning to standardize the predictors
before training the model. We do not recommend doing so. Instead, it is
better to specify autoscale = TRUE
, which
will adjust the scales of the priors according to the dispersion in the
variables. See the documentation of the autoscale
argument above
and also the prior_summary
page for more information.
Family members:
product_normal(df, location, scale)
The productnormal distribution is the product of at least two independent
normal variates each with mean zero, shifted by the location
parameter. It can be shown that the density of a productnormal variate is
symmetric and infinite at location
, so this prior resembles a
“spikeandslab” prior for sufficiently large values of the
scale
parameter. For better or for worse, this prior may be
appropriate when it is strongly believed (by someone) that a regression
coefficient “is” equal to the location
, parameter even though
no true Bayesian would specify such a prior.
Each element of df
must be an integer of at least $2$
because
these “degrees of freedom” are interpreted as the number of normal
variates being multiplied and then shifted by location
to yield the
regression coefficient. Higher degrees of freedom produce a sharper
spike at location
.
Each element of scale
must be a nonnegative real number that is
interpreted as the standard deviation of the normal variates being
multiplied and then shifted by location
to yield the regression
coefficient. In other words, the elements of scale
may differ, but
the kth standard deviation is presumed to hold for all the normal deviates
that are multiplied together and shifted by the kth element of
location
to yield the kth regression coefficient. The elements of
scale
are not the prior standard deviations of the regression
coefficients. The prior variance of the regression coefficients is equal to
the scale raised to the power of $2$
times the corresponding element of
df
. Thus, larger values of scale
put more prior volume on
values of the regression coefficient that are far from zero.
Family members:
dirichlet(concentration)
The Dirichlet distribution is a multivariate generalization of the beta distribution. It is perhaps the easiest prior distribution to specify because the concentration parameters can be interpreted as prior counts (although they need not be integers) of a multinomial random variable.
The Dirichlet distribution is used in stan_polr
for an
implicit prior on the cutpoints in an ordinal regression model. More
specifically, the Dirichlet prior pertains to the prior probability of
observing each category of the ordinal outcome when the predictors are at
their sample means. Given these prior probabilities, it is straightforward
to add them to form cumulative probabilities and then use an inverse CDF
transformation of the cumulative probabilities to define the cutpoints.
If a scalar is passed to the concentration
argument of the
dirichlet
function, then it is replicated to the appropriate length
and the Dirichlet distribution is symmetric. If concentration
is a
vector and all elements are $1$
, then the Dirichlet distribution is
jointly uniform. If all concentration parameters are equal but greater than
$1$
then the prior mode is that the categories are equiprobable, and
the larger the value of the identical concentration parameters, the more
sharply peaked the distribution is at the mode. The elements in
concentration
can also be given different values to represent that
not all outcome categories are a priori equiprobable.
Family members:
decov(regularization, concentration, shape, scale)
lkj(regularization, scale, df)
(Also see vignette for stan_glmer
,
Estimating
Generalized (Non)Linear Models with GroupSpecific Terms with rstanarm)
Covariance matrices are decomposed into correlation matrices and
variances. The variances are in turn decomposed into the product of a
simplex vector and the trace of the matrix. Finally, the trace is the
product of the order of the matrix and the square of a scale parameter.
This prior on a covariance matrix is represented by the decov
function.
The prior for a correlation matrix is called LKJ whose density is
proportional to the determinant of the correlation matrix raised to the
power of a positive regularization parameter minus one. If
regularization = 1
(the default), then this prior is jointly
uniform over all correlation matrices of that size. If
regularization > 1
, then the identity matrix is the mode and in the
unlikely case that regularization < 1
, the identity matrix is the
trough.
The trace of a covariance matrix is equal to the sum of the variances. We
set the trace equal to the product of the order of the covariance matrix
and the square of a positive scale parameter. The particular
variances are set equal to the product of a simplex vector — which is
nonnegative and sums to $1$
— and the scalar trace. In other words,
each element of the simplex vector represents the proportion of the trace
attributable to the corresponding variable.
A symmetric Dirichlet prior is used for the simplex vector, which has a
single (positive) concentration
parameter, which defaults to
$1$
and implies that the prior is jointly uniform over the space of
simplex vectors of that size. If concentration > 1
, then the prior
mode corresponds to all variables having the same (proportion of total)
variance, which can be used to ensure the the posterior variances are not
zero. As the concentration
parameter approaches infinity, this
mode becomes more pronounced. In the unlikely case that
concentration < 1
, the variances are more polarized.
If all the variables were multiplied by a number, the trace of their
covariance matrix would increase by that number squared. Thus, it is
reasonable to use a scaleinvariant prior distribution for the positive
scale parameter, and in this case we utilize a Gamma distribution, whose
shape
and scale
are both $1$
by default, implying a
unitexponential distribution. Set the shape
hyperparameter to some
value greater than $1$
to ensure that the posterior trace is not zero.
If regularization
, concentration
, shape
and / or
scale
are positive scalars, then they are recycled to the
appropriate length. Otherwise, each can be a positive vector of the
appropriate length, but the appropriate length depends on the number of
covariance matrices in the model and their sizes. A onebyone covariance
matrix is just a variance and thus does not have regularization
or
concentration
parameters, but does have shape
and
scale
parameters for the prior standard deviation of that
variable.
Note that for stan_mvmer
and stan_jm
models an
additional prior distribution is provided through the lkj
function.
This prior is in fact currently used as the default for those modelling
functions (although decov
is still available as an option if the user
wishes to specify it through the prior_covariance
argument). The
lkj
prior uses the same decomposition of the covariance matrices
into correlation matrices and variances, however, the variances are not
further decomposed into a simplex vector and the trace; instead the
standard deviations (square root of the variances) for each of the group
specific parameters are given a half Student t distribution with the
scale and df parameters specified through the scale
and df
arguments to the lkj
function. The scale parameter default is 10
which is then autoscaled, whilst the df parameter default is 1
(therefore equivalent to a half Cauchy prior distribution for the
standard deviation of each group specific parameter). This prior generally
leads to similar results as the decov
prior, but it is also likely
to be **less** diffuse compared with the decov
prior; therefore it
sometimes seems to lead to faster estimation times, hence why it has
been chosen as the default prior for stan_mvmer
and
stan_jm
where estimation times can be long.
Family members:
R2(location, what)
The stan_lm
, stan_aov
, and
stan_polr
functions allow the user to utilize a function
called R2
to convey prior information about all the parameters.
This prior hinges on prior beliefs about the location of $R^2$
, the
proportion of variance in the outcome attributable to the predictors,
which has a Beta
prior with first shape
hyperparameter equal to half the number of predictors and second shape
hyperparameter free. By specifying what
to be the prior mode (the
default), mean, median, or expected log of $R^2$
, the second shape
parameter for this Beta distribution is determined internally. If
what = 'log'
, location should be a negative scalar; otherwise it
should be a scalar on the $(0,1)$
interval.
For example, if $R^2 = 0.5$
, then the mode, mean, and median of
the Beta
distribution are all the same and thus the
second shape parameter is also equal to half the number of predictors.
The second shape parameter of the Beta
distribution
is actually the same as the shape parameter in the LKJ prior for a
correlation matrix described in the previous subsection. Thus, the smaller
is $R^2$
, the larger is the shape parameter, the smaller are the
prior correlations among the outcome and predictor variables, and the more
concentrated near zero is the prior density for the regression
coefficients. Hence, the prior on the coefficients is regularizing and
should yield a posterior distribution with good outofsample predictions
if the prior location of $R^2$
is specified in a reasonable
fashion.
A named list to be used internally by the rstanarm model fitting functions.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian Data Analysis. Chapman & Hall/CRC Press, London, third edition. https://stat.columbia.edu/~gelman/book/
Gelman, A., Jakulin, A., Pittau, M. G., and Su, Y. (2008). A weakly informative default prior distribution for logistic and other regression models. Annals of Applied Statistics. 2(4), 1360–1383.
Piironen, J., and Vehtari, A. (2017). Sparsity information and regularization in the horseshoe and other shrinkage priors. https://arxiv.org/abs/1707.01694
Stan Development Team. Stan Modeling Language Users Guide and Reference Manual. https://mcstan.org/users/documentation/.
The various vignettes for the rstanarm package also discuss and demonstrate the use of some of the supported prior distributions.
if (.Platform$OS.type != "windows") { fmla < mpg ~ wt + qsec + drat + am # Draw from prior predictive distribution (by setting prior_PD = TRUE) prior_pred_fit < stan_glm(fmla, data = mtcars, prior_PD = TRUE, chains = 1, seed = 12345, iter = 250, # for speed only prior = student_t(df = 4, 0, 2.5), prior_intercept = cauchy(0,10), prior_aux = exponential(1/2)) plot(prior_pred_fit, "hist") # Can assign priors to names N05 < normal(0, 5) fit < stan_glm(fmla, data = mtcars, prior = N05, prior_intercept = N05) # Visually compare normal, student_t, cauchy, laplace, and product_normal compare_priors < function(scale = 1, df_t = 2, xlim = c(10, 10)) { dt_loc_scale < function(x, df, location, scale) { 1/scale * dt((x  location)/scale, df) } dlaplace < function(x, location, scale) { 0.5 / scale * exp(abs(x  location) / scale) } dproduct_normal < function(x, scale) { besselK(abs(x) / scale ^ 2, nu = 0) / (scale ^ 2 * pi) } stat_dist < function(dist, ...) { ggplot2::stat_function(ggplot2::aes_(color = dist), ...) } ggplot2::ggplot(data.frame(x = xlim), ggplot2::aes(x)) + stat_dist("normal", size = .75, fun = dnorm, args = list(mean = 0, sd = scale)) + stat_dist("student_t", size = .75, fun = dt_loc_scale, args = list(df = df_t, location = 0, scale = scale)) + stat_dist("cauchy", size = .75, linetype = 2, fun = dcauchy, args = list(location = 0, scale = scale)) + stat_dist("laplace", size = .75, linetype = 2, fun = dlaplace, args = list(location = 0, scale = scale)) + stat_dist("product_normal", size = .75, linetype = 2, fun = dproduct_normal, args = list(scale = 1)) } # Cauchy has fattest tails, followed by student_t, laplace, and normal compare_priors() # The student_t with df = 1 is the same as the cauchy compare_priors(df_t = 1) # Even a scale of 5 is somewhat large. It gives plausibility to rather # extreme values compare_priors(scale = 5, xlim = c(20,20)) # If you use a prior like normal(0, 1000) to be "noninformative" you are # actually saying that a coefficient value of e.g. 500 is quite plausible compare_priors(scale = 1000, xlim = c(1000,1000)) }
if (.Platform$OS.type != "windows") { fmla < mpg ~ wt + qsec + drat + am # Draw from prior predictive distribution (by setting prior_PD = TRUE) prior_pred_fit < stan_glm(fmla, data = mtcars, prior_PD = TRUE, chains = 1, seed = 12345, iter = 250, # for speed only prior = student_t(df = 4, 0, 2.5), prior_intercept = cauchy(0,10), prior_aux = exponential(1/2)) plot(prior_pred_fit, "hist") # Can assign priors to names N05 < normal(0, 5) fit < stan_glm(fmla, data = mtcars, prior = N05, prior_intercept = N05) # Visually compare normal, student_t, cauchy, laplace, and product_normal compare_priors < function(scale = 1, df_t = 2, xlim = c(10, 10)) { dt_loc_scale < function(x, df, location, scale) { 1/scale * dt((x  location)/scale, df) } dlaplace < function(x, location, scale) { 0.5 / scale * exp(abs(x  location) / scale) } dproduct_normal < function(x, scale) { besselK(abs(x) / scale ^ 2, nu = 0) / (scale ^ 2 * pi) } stat_dist < function(dist, ...) { ggplot2::stat_function(ggplot2::aes_(color = dist), ...) } ggplot2::ggplot(data.frame(x = xlim), ggplot2::aes(x)) + stat_dist("normal", size = .75, fun = dnorm, args = list(mean = 0, sd = scale)) + stat_dist("student_t", size = .75, fun = dt_loc_scale, args = list(df = df_t, location = 0, scale = scale)) + stat_dist("cauchy", size = .75, linetype = 2, fun = dcauchy, args = list(location = 0, scale = scale)) + stat_dist("laplace", size = .75, linetype = 2, fun = dlaplace, args = list(location = 0, scale = scale)) + stat_dist("product_normal", size = .75, linetype = 2, fun = dproduct_normal, args = list(scale = 1)) } # Cauchy has fattest tails, followed by student_t, laplace, and normal compare_priors() # The student_t with df = 1 is the same as the cauchy compare_priors(df_t = 1) # Even a scale of 5 is somewhat large. It gives plausibility to rather # extreme values compare_priors(scale = 5, xlim = c(20,20)) # If you use a prior like normal(0, 1000) to be "noninformative" you are # actually saying that a coefficient value of e.g. 500 is quite plausible compare_priors(scale = 1000, xlim = c(1000,1000)) }
This function plots the estimated standardised survival curve for the estimation sample based on draws from the posterior distribution of the fitted model, and then overlays a KaplanMeier survival curve based on the observed data.
ps_check( object, check = "survival", limits = c("ci", "none"), draws = NULL, npoints = 101, seed = NULL, xlab = NULL, ylab = NULL, ci_geom_args = NULL, ... )
ps_check( object, check = "survival", limits = c("ci", "none"), draws = NULL, npoints = 101, seed = NULL, xlab = NULL, ylab = NULL, ci_geom_args = NULL, ... )
object 
A fitted model object returned by the

check 
The type of plot to show. Currently only "survival" is allowed, which compares the estimated standardised survival curve based on the fitted model to the estimated KaplanMeier survival curve based on the observed data. 
limits 
A quoted character string specifying the type of limits to
include in the plot. Can be one of: 
draws 
Passed to the 
npoints 
The number of time points at which to predict the survival
function. The plot of the survival curve is generated by interpolating
along these points using 
seed 
An optional 
xlab , ylab

An optional axis label passed to

ci_geom_args 
Optional arguments passed to

... 
Optional arguments passed to

A ggplot object that can be further customized using the ggplot2 package.
posterior_survfit
for the estimated marginal or
subjectspecific survival function based on draws of the model parameters
from the posterior distribution posterior_predict
for drawing from the posterior
predictive distribution for the longitudinal submodel (for
stan_jm
models only) pp_check
for graphical checks of the longitudinal submodel
(for stan_jm
models only)
if (!exists("example_jm")) example(example_jm) # Compare estimated survival function to KaplanMeier curve ps < ps_check(example_jm) ps + ggplot2::scale_color_manual(values = c("red", "black")) + # change colors ggplot2::scale_size_manual (values = c(0.5, 3)) + # change line sizes ggplot2::scale_fill_manual (values = c(NA, NA)) # remove fill
if (!exists("example_jm")) example(example_jm) # Compare estimated survival function to KaplanMeier curve ps < ps_check(example_jm) ps + ggplot2::scale_color_manual(values = c("red", "black")) + # change colors ggplot2::scale_size_manual (values = c(0.5, 3)) + # change line sizes ggplot2::scale_fill_manual (values = c(NA, NA)) # remove fill
Perform importance sampling with stanreg objects
## S3 method for class 'stanreg' psis(log_ratios, ...)
## S3 method for class 'stanreg' psis(log_ratios, ...)
log_ratios 
log_ratios for PSIS computation 
... 
Additional arguments passed to 
QR
argumentDetails about the QR
argument to rstanarm's modeling
functions.
The QR
argument is a logical scalar defaulting to
FALSE
, but if TRUE
applies a scaled qr
decomposition to the design matrix, $X = Q^\ast R^\ast$
.
If autoscale = TRUE
(the default)
in the call to the function passed to the prior
argument, then
$Q^\ast = Q \sqrt{n1}$
and
$R^\ast = \frac{1}{\sqrt{n1}} R$
. When
autoscale = FALSE
, $R$
is scaled such that the lowerright
element of $R^\ast$
is $1$
.
The coefficients relative to $Q^\ast$
are obtained and then
premultiplied by the inverse of $R^{\ast}$
to obtain coefficients
relative to the original predictors, $X$
. Thus, when
autoscale = FALSE
, the coefficient on the last column of $X$
is the same as the coefficient on the last column of $Q^\ast$
.
These transformations do not change the likelihood of the data but are
recommended for computational reasons when there are multiple predictors.
Importantly, while the columns of $X$
are almost generally correlated,
the columns of $Q^\ast$
are uncorrelated by design, which often makes
sampling from the posterior easier. However, because when QR
is
TRUE
the prior
argument applies to the coefficients relative to
$Q^\ast$
(and those are not very interpretable), setting QR=TRUE
is only recommended if you do not have an informative prior for the regression
coefficients or if the only informative prior is on the last regression
coefficient (in which case you should set autoscale = FALSE
when
specifying such priors).
For more details see the Stan case study The QR Decomposition For Regression Models at https://mcstan.org/users/documentation/casestudies/qr_regression.html.
Stan Development Team. Stan Modeling Language Users Guide and Reference Manual. https://mcstan.org/users/documentation/.
Apply quadrature weights and sum over nodes
quadrature_sum(x, qnodes, qwts)
quadrature_sum(x, qnodes, qwts)
x 
Inputs 
qnodes 
Nodes 
qwts 
Quadrature weights 
Apply quadrature weights and sum over nodes
## Default S3 method: quadrature_sum(x, qnodes, qwts)
## Default S3 method: quadrature_sum(x, qnodes, qwts)
x 
Inputs 
qnodes 
Nodes 
qwts 
Quadrature weights 
Apply quadrature weights and sum over nodes
## S3 method for class 'matrix' quadrature_sum(x, qnodes, qwts)
## S3 method for class 'matrix' quadrature_sum(x, qnodes, qwts)
x 
Inputs 
qnodes 
Nodes 
qwts 
Quadrature weights 
loo/waic/kfold objects created by rstanarm have a model_name attribute. when a stanreg_list is created those attributes should be changed to match the names of the models used for the stanreg_list in case user has specified the model_names argument
rename_loos(x, ...)
rename_loos(x, ...)
x 
loo/waic/kfold object 
... 
Additional arguments 
Renamed object
Change model_name attributes of a loo/waic/kfold object stored in a stanreg object,
## S3 method for class 'stanreg' rename_loos(x, new_model_name, ...)
## S3 method for class 'stanreg' rename_loos(x, new_model_name, ...)
x 
loo/waic/kfold object 
new_model_name 
new name for object 
... 
Additional arguments 
Renamed object
Change model_name attributes of loo/waic/kfold objects to correspond to model names used for stanreg_list
## S3 method for class 'stanreg_list' rename_loos(x, ...)
## S3 method for class 'stanreg_list' rename_loos(x, ...)
x 
loo/waic/kfold object 
... 
Additional arguments 
Renamed object
Small datasets for use in rstanarm examples and vignettes.
bball1970
Data on hits and atbats from the 1970 Major League Baseball season for 18 players.
Source: Efron and Morris (1975).
18 obs. of 5 variables
Player
Player's last name
Hits
Number of hits in the first 45 atbats of the season
AB
Number of atbats (45 for all players)
RemainingAB
Number of remaining atbats (different for most players)
RemainingHits
Number of remaining hits
bball2006
Hits and atbats for the entire 2006 American League season of Major League Baseball.
Source: Carpenter (2009)
302 obs. of 2 variables
y
Number of hits
K
Number of atbats
bcancer
The German Breast Cancer Study Group dataset, containing time to death or recurrence for 686 patients with primary node positive breast cancer recruited between 19841989.
Source: Royston and Parmar (2002)
686 obs. of 4 variables
recdays
Time to death or censoring (in days)
recyrs
Time to death or censoring (in years)
status
Event indicator (0 = right censored, 1 = event)
group
Prognostic group, based on a regression model developed
by Sauerbrei and Royston (1999) (Good
, Medium
, Poor
)
frail
A simulated dataset of event times (i.e. survival data) for 200 patients
clustered within 20 hospital sites (10 patients per hospital site).
The event times are simulated from a parametric proportional hazards model
under the following assumptions: (i) a constant (i.e. exponential) baseline
hazard rate of 0.1; (ii) a fixed treatment effect with log hazard ratio of
0.3; and (iii) a sitespecific random intercept (specified on the log
hazard scale) drawn from a $N(0,1)$
distribution.
200 obs. of 6 variables
id
ID unique to each patient
site
ID unique to each hospital site (i.e. cluster)
trt
Treatment indicator (0 = untreated, 1 = treated)
b
Clusterspecific random intercept used to simulate the
event times
eventtime
Event or censoring time
status
Event indicator (0 = right censored, 1 = event)
kidiq
Data from a survey of adult American women and their children (a subsample from the National Longitudinal Survey of Youth).
Source: Gelman and Hill (2007)
434 obs. of 4 variables
kid_score
Child's IQ score
mom_hs
Indicator for whether the mother has a high school degree
mom_iq
Mother's IQ score
mom_age
Mother's age
mice
Lung tumor development in 144 RFM mice allocated to either a conventional
environment or germfree environment. Mice were sacrificed and examined
for presence of a lung tumor. The outcome variables in the dataset
(l
and u
) denote a leftcensored or rightcensored time
interval within which the development of the first lung tumor must have
occurred.
Source: Hoel and Walburg (1972)
144 obs. of 3 variables
l
Lower limit of the interval.
u
Upper limit of the interval.
grp
Experimental group (ce
= conventional environment,
ge
= germfree environment).
mortality
Surgical mortality rates in 12 hospitals performing cardiac surgery in babies.
Source: Spiegelhalter et al. (1996).
12 obs. of 2 variables
y
Number of deaths
K
Number of surgeries
pbcLong,pbcSurv
Longitudinal biomarker and timetoevent survival data for 40 patients with primary biliary cirrhosis who participated in a randomised placebo controlled trial of Dpenicillamine conducted at the Mayo Clinic between 1974 and 1984.
Source: Therneau and Grambsch (2000)
304 obs. of 8 variables (pbcLong
) and 40 obs. of 7 variables (pbcSurv
)
age
Age (in years)
albumin
Serum albumin (g/dl)
logBili
Logarithm of serum bilirubin
death
Indicator of death at endpoint
futimeYears
Time (in years) between baseline and
the earliest of death, transplantion or censoring
id
Numeric ID unique to each individual
platelet
Platelet count
sex
Gender (m = male, f = female)
status
Status at endpoint (0 = censored, 1 = transplant,
2 = dead)
trt
Binary treatment code (0 = placebo, 1 = Dpenicillamine)
year
Time (in years) of the longitudinal measurements,
taken as time since baseline
radon
Data on radon levels in houses in the state of Minnesota.
Source: Gelman and Hill (2007)
919 obs. of 4 variables
log_radon
Radon measurement from the house (log scale)
log_uranium
Uranium level in the county (log scale)
floor
Indicator for radon measurement made on the first floor of
the house (0 = basement, 1 = first floor)
county
County name (factor
)
roaches
Data on the efficacy of a pest management system at reducing the number of roaches in urban apartments.
Source: Gelman and Hill (2007)
262 obs. of 6 variables
y
Number of roaches caught
roach1
Pretreatment number of roaches
treatment
Treatment indicator
senior
Indicator for only elderly residents in building
exposure2
Number of days for which the roach traps were used
tumors
Tarone (1982) provides a data set of tumor incidence in historical control groups of rats; specifically endometrial stromal polyps in female lab rats of type F344.
Source: Gelman and Hill (2007)
71 obs. of 2 variables
y
Number of rats with tumors
K
Number of rats
wells
A survey of 3200 residents in a small area of Bangladesh suffering from arsenic contamination of groundwater. Respondents with elevated arsenic levels in their wells had been encouraged to switch their water source to a safe public or private well in the nearby area and the survey was conducted several years later to learn which of the affected residents had switched wells.
Souce: Gelman and Hill (2007)
3020 obs. of 5 variables
switch
Indicator for wellswitching
arsenic
Arsenic level in respondent's well
dist
Distance (meters) from the respondent's house to the
nearest well with safe drinking water.
assoc
Indicator for member(s) of household participate
in community organizations
educ
Years of education (head of household)
Carpenter, B. (2009) Bayesian estimators for the betabinomial model of batting ability. https://web.archive.org/web/20220618114439/https://lingpipeblog.com/2009/09/23/
Efron, B. and Morris, C. (1975) Data analysis using Stein's estimator and its generalizations. Journal of the American Statistical Association 70(350), 311–319.
Gelman, A. and Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, Cambridge, UK. https://stat.columbia.edu/~gelman/arm/
Hoel, D. and Walburg, H. (1972) Statistical analysis of survival experiments. The Annals of Statistics 18:1259–1294.
Royston, P. and Parmar, M. (2002) Flexible parametric proportionalhazards and proportionalodds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Statistics in Medicine 21(1):2175–2197.
Sauerbrei, W. and Royston, P. (1999) Building multivariable prognostic and diagnostic models: transformation of the predictors using fractional polynomials. Journal of the Royal Statistical Society, Series A 162:71–94.
Spiegelhalter, D., Thomas, A., Best, N., and Gilks, W. (1996) BUGS 0.5 Examples. MRC Biostatistics Unit, Institute of Public health, Cambridge, UK.
Tarone, R. E. (1982) The use of historical control information in testing for a trend in proportions. Biometrics 38(1):215–220.
Therneau, T. and Grambsch, P. (2000) Modeling Survival Data: Extending the Cox Model. SpringerVerlag, New York, US.
if (.Platform$OS.type != "windows") { # Using 'kidiq' dataset fit < stan_lm(kid_score ~ mom_hs * mom_iq, data = kidiq, prior = R2(location = 0.30, what = "mean"), # the next line is only to make the example go fast enough chains = 1, iter = 500, seed = 12345) pp_check(fit, nreps = 20) bayesplot::color_scheme_set("brightblue") pp_check(fit, plotfun = "stat_grouped", stat = "median", group = factor(kidiq$mom_hs, labels = c("No HS", "HS"))) }
if (.Platform$OS.type != "windows") { # Using 'kidiq' dataset fit < stan_lm(kid_score ~ mom_hs * mom_iq, data = kidiq, prior = R2(location = 0.30, what = "mean"), # the next line is only to make the example go fast enough chains = 1, iter = 500, seed = 12345) pp_check(fit, nreps = 20) bayesplot::color_scheme_set("brightblue") pp_check(fit, plotfun = "stat_grouped", stat = "median", group = factor(kidiq$mom_hs, labels = c("No HS", "HS"))) }
These functions are deprecated and will be removed in a future release. The Arguments section below provides details on how the functionality obtained via each of the arguments has been replaced.
prior_options( prior_scale_for_dispersion = 5, min_prior_scale = 1e12, scaled = TRUE )
prior_options( prior_scale_for_dispersion = 5, min_prior_scale = 1e12, scaled = TRUE )
prior_scale_for_dispersion , min_prior_scale , scaled

Arguments to
deprecated

Split a vector or matrix into a specified number of segments and return each segment as an element of a list. The matrix method allows splitting across the column (bycol = TRUE) or row margin (bycol = FALSE).
split2(x, n_segments = 1, ...)
split2(x, n_segments = 1, ...)
x 
A vector or matrix. 
n_segments 
Integer specifying the number of segments. 
... 
Additional arguments passed to the method. 
A list with n_segments elements.
Split a vector or matrix into a specified number of segments and return each segment as an element of a list. The matrix method allows splitting across the column (bycol = TRUE) or row margin (bycol = FALSE).
## S3 method for class 'matrix' split2(x, n_segments = 1, bycol = TRUE, ...)
## S3 method for class 'matrix' split2(x, n_segments = 1, bycol = TRUE, ...)
x 
A matrix 
n_segments 
Integer specifying the number of segments. 
bycol 
Logical, should a matrix be split along the column or row margin? 
... 
Additional arguments passed to the method. 
Split a vector or matrix into a specified number of segments and return each segment as an element of a list. The matrix method allows splitting across the column (bycol = TRUE) or row margin (bycol = FALSE).
## S3 method for class 'vector' split2(x, n_segments = 1, ...)
## S3 method for class 'vector' split2(x, n_segments = 1, ...)
x 
A vector 
n_segments 
Integer specifying the number of segments. 
... 
Additional arguments passed to the method. 
Bayesian inference for linear modeling with regularizing priors on the model
parameters that are driven by prior beliefs about $R^2$
, the proportion
of variance in the outcome attributable to the predictors. See
priors
for an explanation of this critical point.
stan_glm
with family="gaussian"
also estimates a linear
model with normallydistributed errors and allows for various other priors on
the coefficients.
stan_aov( formula, data, projections = FALSE, contrasts = NULL, ..., prior = R2(stop("'location' must be specified")), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL ) stan_lm( formula, data, subset, weights, na.action, model = TRUE, x = FALSE, y = FALSE, singular.ok = TRUE, contrasts = NULL, offset, ..., prior = R2(stop("'location' must be specified")), prior_intercept = NULL, prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL ) stan_lm.wfit( x, y, w, offset = NULL, singular.ok = TRUE, ..., prior = R2(stop("'location' must be specified")), prior_intercept = NULL, prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL ) stan_lm.fit( x, y, offset = NULL, singular.ok = TRUE, ..., prior = R2(stop("'location' must be specified")), prior_intercept = NULL, prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL )
stan_aov( formula, data, projections = FALSE, contrasts = NULL, ..., prior = R2(stop("'location' must be specified")), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL ) stan_lm( formula, data, subset, weights, na.action, model = TRUE, x = FALSE, y = FALSE, singular.ok = TRUE, contrasts = NULL, offset, ..., prior = R2(stop("'location' must be specified")), prior_intercept = NULL, prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL ) stan_lm.wfit( x, y, w, offset = NULL, singular.ok = TRUE, ..., prior = R2(stop("'location' must be specified")), prior_intercept = NULL, prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL ) stan_lm.fit( x, y, offset = NULL, singular.ok = TRUE, ..., prior = R2(stop("'location' must be specified")), prior_intercept = NULL, prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL )
formula , data , subset

Same as 
projections 
For 
... 
Further arguments passed to the function in the rstan
package ( Another useful argument that can be passed to rstan via 
prior 
Must be a call to 
prior_PD 
A logical scalar (defaulting to 
algorithm 
A string (possibly abbreviated) indicating the
estimation approach to use. Can be 
adapt_delta 
Only relevant if 
na.action , singular.ok , contrasts

Same as 
model , offset , weights

Same as 
x , y

In 
prior_intercept 
Either Note: If using a dense representation of the design matrix
—i.e., if the 
w 
Same as in 
The stan_lm
function is similar in syntax to the
lm
function but rather than choosing the parameters to
minimize the sum of squared residuals, samples from the posterior
distribution are drawn using MCMC (if algorithm
is
"sampling"
). The stan_lm
function has a formulabased
interface and would usually be called by users but the stan_lm.fit
and stan_lm.wfit
functions might be called by other functions that
parse the data themselves and are analogous to lm.fit
and lm.wfit
respectively.
In addition to estimating sigma
— the standard deviation of the
normallydistributed errors — this model estimates a positive parameter
called logfit_ratio
. If it is positive, the marginal posterior
variance of the outcome will exceed the sample variance of the outcome
by a multiplicative factor equal to the square of fit_ratio
.
Conversely if logfit_ratio
is negative, then the model underfits.
Given the regularizing nature of the priors, a slight underfit is good.
Finally, the posterior predictive distribution is generated with the predictors fixed at their sample means. This quantity is useful for checking convergence because it is reasonably normally distributed and a function of all the parameters in the model.
The stan_aov
function is similar to aov
, but
does a Bayesian analysis of variance that is basically equivalent to
stan_lm
with dummy variables. stan_aov
has a somewhat
customized print
method that prints an ANOVAlike table in
addition to the output printed for stan_lm
models.
A stanreg object is returned
for stan_lm, stan_aov
.
A stanfit object (or a slightly modified
stanfit object) is returned if stan_lm.fit or stan_lm.wfit
is called directly.
Lewandowski, D., Kurowicka D., and Joe, H. (2009). Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis. 100(9), 1989–2001.
The vignettes for stan_lm
and stan_aov
, which have more
thorough descriptions and examples.
https://mcstan.org/rstanarm/articles/
Also see stan_glm
, which — if family =
gaussian(link="identity")
— also estimates a linear model with
normallydistributed errors but specifies different priors.
if (.Platform$OS.type != "windows") { op < options(contrasts = c("contr.helmert", "contr.poly")) fit_aov < stan_aov(yield ~ block + N*P*K, data = npk, prior = R2(0.5), seed = 12345) options(op) print(fit_aov) } if (.Platform$OS.type != "windows"  .Platform$r_arch !="i386") { (fit < stan_lm(mpg ~ wt + qsec + am, data = mtcars, prior = R2(0.75), # the next line is only to make the example go fast enough chains = 1, iter = 300, seed = 12345, refresh = 0)) plot(fit, "hist", pars = c("wt", "am", "qsec", "sigma"), transformations = list(sigma = "log")) }
if (.Platform$OS.type != "windows") { op < options(contrasts = c("contr.helmert", "contr.poly")) fit_aov < stan_aov(yield ~ block + N*P*K, data = npk, prior = R2(0.5), seed = 12345) options(op) print(fit_aov) } if (.Platform$OS.type != "windows"  .Platform$r_arch !="i386") { (fit < stan_lm(mpg ~ wt + qsec + am, data = mtcars, prior = R2(0.75), # the next line is only to make the example go fast enough chains = 1, iter = 300, seed = 12345, refresh = 0)) plot(fit, "hist", pars = c("wt", "am", "qsec", "sigma"), transformations = list(sigma = "log")) }
Beta regression modeling with optional prior distributions for the
coefficients, intercept, and auxiliary parameter phi
(if applicable).
stan_betareg( formula, data, subset, na.action, weights, offset, link = c("logit", "probit", "cloglog", "cauchit", "log", "loglog"), link.phi = NULL, model = TRUE, y = TRUE, x = FALSE, ..., prior = normal(autoscale = TRUE), prior_intercept = normal(autoscale = TRUE), prior_z = normal(autoscale = TRUE), prior_intercept_z = normal(autoscale = TRUE), prior_phi = exponential(autoscale = TRUE), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE ) stan_betareg.fit( x, y, z = NULL, weights = rep(1, NROW(x)), offset = rep(0, NROW(x)), link = c("logit", "probit", "cloglog", "cauchit", "log", "loglog"), link.phi = NULL, ..., prior = normal(autoscale = TRUE), prior_intercept = normal(autoscale = TRUE), prior_z = normal(autoscale = TRUE), prior_intercept_z = normal(autoscale = TRUE), prior_phi = exponential(autoscale = TRUE), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE )
stan_betareg( formula, data, subset, na.action, weights, offset, link = c("logit", "probit", "cloglog", "cauchit", "log", "loglog"), link.phi = NULL, model = TRUE, y = TRUE, x = FALSE, ..., prior = normal(autoscale = TRUE), prior_intercept = normal(autoscale = TRUE), prior_z = normal(autoscale = TRUE), prior_intercept_z = normal(autoscale = TRUE), prior_phi = exponential(autoscale = TRUE), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE ) stan_betareg.fit( x, y, z = NULL, weights = rep(1, NROW(x)), offset = rep(0, NROW(x)), link = c("logit", "probit", "cloglog", "cauchit", "log", "loglog"), link.phi = NULL, ..., prior = normal(autoscale = TRUE), prior_intercept = normal(autoscale = TRUE), prior_z = normal(autoscale = TRUE), prior_intercept_z = normal(autoscale = TRUE), prior_phi = exponential(autoscale = TRUE), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE )
formula , data , subset

Same as 

na.action 
Same as 

link 
Character specification of the link function used in the model
for mu (specified through 

link.phi 
If applicable, character specification of the link function
used in the model for 

model , offset , weights

Same as 

x , y

In 

... 
Further arguments passed to the function in the rstan
package ( Another useful argument that can be passed to rstan via 

prior 
The prior distribution for the (nonhierarchical) regression coefficients. The default priors are described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default,
See the priors help page for details on the families and
how to specify the arguments for all of the functions in the table above.
To omit a prior —i.e., to use a flat (improper) uniform prior—
Note: Unless 

prior_intercept 
The prior distribution for the intercept (after centering all predictors, see note below). The default prior is described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default, Note: If using a dense representation of the design matrix
—i.e., if the 

prior_z 
Prior distribution for the coefficients in the model for


prior_intercept_z 
Prior distribution for the intercept in the model
for 

prior_phi 
The prior distribution for 

prior_PD 
A logical scalar (defaulting to 

algorithm 
A string (possibly abbreviated) indicating the
estimation approach to use. Can be 

adapt_delta 
Only relevant if 

QR 
A logical scalar defaulting to 

z 
For 
The stan_betareg
function is similar in syntax to
betareg
but rather than performing maximum
likelihood estimation, full Bayesian estimation is performed (if
algorithm
is "sampling"
) via MCMC. The Bayesian model adds
priors (independent by default) on the coefficients of the beta regression
model. The stan_betareg
function calls the workhorse
stan_betareg.fit
function, but it is also possible to call the
latter directly.
A stanreg object is returned
for stan_betareg
.
A stanfit object (or a slightly modified
stanfit object) is returned if stan_betareg.fit
is called directly.
Ferrari, SLP and CribariNeto, F (2004). Beta regression for modeling rates and proportions. Journal of Applied Statistics. 31(7), 799–815.
stanregmethods
and
betareg
.
The vignette for stan_betareg
.
https://mcstan.org/rstanarm/articles/
if (.Platform$OS.type != "windows") { ### Simulated data N < 200 x < rnorm(N, 2, 1) z < rnorm(N, 2, 1) mu < binomial(link = "logit")$linkinv(1 + 0.2*x) phi < exp(1.5 + 0.4*z) y < rbeta(N, mu * phi, (1  mu) * phi) hist(y, col = "dark grey", border = FALSE, xlim = c(0,1)) fake_dat < data.frame(y, x, z) fit < stan_betareg( y ~ x  z, data = fake_dat, link = "logit", link.phi = "log", algorithm = "optimizing" # just for speed of example ) print(fit, digits = 2) }
if (.Platform$OS.type != "windows") { ### Simulated data N < 200 x < rnorm(N, 2, 1) z < rnorm(N, 2, 1) mu < binomial(link = "logit")$linkinv(1 + 0.2*x) phi < exp(1.5 + 0.4*z) y < rbeta(N, mu * phi, (1  mu) * phi) hist(y, col = "dark grey", border = FALSE, xlim = c(0,1)) fake_dat < data.frame(y, x, z) fit < stan_betareg( y ~ x  z, data = fake_dat, link = "logit", link.phi = "log", algorithm = "optimizing" # just for speed of example ) print(fit, digits = 2) }
This is the same model as with stan_lm
but it utilizes the
output from biglm
in the biglm package in order to
proceed when the data is too large to fit in memory.
stan_biglm( biglm, xbar, ybar, s_y, ..., prior = R2(stop("'location' must be specified")), prior_intercept = NULL, prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL ) stan_biglm.fit( b, R, SSR, N, xbar, ybar, s_y, has_intercept = TRUE, ..., prior = R2(stop("'location' must be specified")), prior_intercept = NULL, prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank", "optimizing"), adapt_delta = NULL, importance_resampling = TRUE, keep_every = 1 )
stan_biglm( biglm, xbar, ybar, s_y, ..., prior = R2(stop("'location' must be specified")), prior_intercept = NULL, prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL ) stan_biglm.fit( b, R, SSR, N, xbar, ybar, s_y, has_intercept = TRUE, ..., prior = R2(stop("'location' must be specified")), prior_intercept = NULL, prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank", "optimizing"), adapt_delta = NULL, importance_resampling = TRUE, keep_every = 1 )
biglm 
The list output by 
xbar 
A numeric vector of column means in the implicit design matrix excluding the intercept for the observations included in the model. 
ybar 
A numeric scalar indicating the mean of the outcome for the observations included in the model. 
s_y 
A numeric scalar indicating the unbiased sample standard deviation of the outcome for the observations included in the model. 
... 
Further arguments passed to the function in the rstan
package ( Another useful argument that can be passed to rstan via 
prior 
Must be a call to 
prior_intercept 
Either Note: If using a dense representation of the design matrix
—i.e., if the 
prior_PD 
A logical scalar (defaulting to 
algorithm 
A string (possibly abbreviated) indicating the
estimation approach to use. Can be 
adapt_delta 
Only relevant if 
b 
A numeric vector of OLS coefficients, excluding the intercept 
R 
A square uppertriangular matrix from the QR decomposition of the design matrix, excluding the intercept 
SSR 
A numeric scalar indicating the sumofsquared residuals for OLS 
N 
A integer scalar indicating the number of included observations 
has_intercept 
A logical scalar indicating whether to add an intercept to the model when estimating it. 
importance_resampling 
Logical scalar indicating whether to use
importance resampling when approximating the posterior distribution with
a multivariate normal around the posterior mode, which only applies
when 
keep_every 
Positive integer, which defaults to 1, but can be higher
in order to thin the importance sampling realizations and also only
apples when 
The stan_biglm
function is intended to be used in the same
circumstances as the biglm
function in the biglm
package but with an informative prior on the $R^2$
of the regression.
Like biglm
, the memory required to estimate the model
depends largely on the number of predictors rather than the number of
observations. However, stan_biglm
and stan_biglm.fit
have
additional required arguments that are not necessary in
biglm
, namely xbar
, ybar
, and s_y
.
If any observations have any missing values on any of the predictors or the
outcome, such observations do not contribute to these statistics.
The output of both stan_biglm
and stan_biglm.fit
is an
object of stanfitclass
rather than
stanregobjects
, which is more limited and less convenient
but necessitated by the fact that stan_biglm
does not bring the full
design matrix into memory. Without the full design matrix,some of the
elements of a stanregobjects
object cannot be calculated,
such as residuals. Thus, the functions in the rstanarm package that
input stanregobjects
, such as
posterior_predict
cannot be used.
if (.Platform$OS.type != "windows") { # create inputs ols < lm(mpg ~ wt + qsec + am, data = mtcars, # all row are complete so ... na.action = na.exclude) # not necessary in this case b < coef(ols)[1] R < qr.R(ols$qr)[1,1] SSR < crossprod(ols$residuals)[1] not_NA < !is.na(fitted(ols)) N < sum(not_NA) xbar < colMeans(mtcars[not_NA,c("wt", "qsec", "am")]) y < mtcars$mpg[not_NA] ybar < mean(y) s_y < sd(y) post < stan_biglm.fit(b, R, SSR, N, xbar, ybar, s_y, prior = R2(.75), # the next line is only to make the example go fast chains = 1, iter = 500, seed = 12345) cbind(lm = b, stan_lm = rstan::get_posterior_mean(post)[13:15,]) # shrunk }
if (.Platform$OS.type != "windows") { # create inputs ols < lm(mpg ~ wt + qsec + am, data = mtcars, # all row are complete so ... na.action = na.exclude) # not necessary in this case b < coef(ols)[1] R < qr.R(ols$qr)[1,1] SSR < crossprod(ols$residuals)[1] not_NA < !is.na(fitted(ols)) N < sum(not_NA) xbar < colMeans(mtcars[not_NA,c("wt", "qsec", "am")]) y < mtcars$mpg[not_NA] ybar < mean(y) s_y < sd(y) post < stan_biglm.fit(b, R, SSR, N, xbar, ybar, s_y, prior = R2(.75), # the next line is only to make the example go fast chains = 1, iter = 500, seed = 12345) cbind(lm = b, stan_lm = rstan::get_posterior_mean(post)[13:15,]) # shrunk }
A model for casecontrol studies with optional prior distributions for the coefficients, intercept, and auxiliary parameters.
stan_clogit( formula, data, subset, na.action = NULL, contrasts = NULL, ..., strata, prior = normal(autoscale = TRUE), prior_covariance = decov(), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE, sparse = FALSE )
stan_clogit( formula, data, subset, na.action = NULL, contrasts = NULL, ..., strata, prior = normal(autoscale = TRUE), prior_covariance = decov(), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE, sparse = FALSE )
formula , data , subset , na.action , contrasts

Same as for 

... 
Further arguments passed to the function in the rstan
package ( Another useful argument that can be passed to rstan via 

strata 
A factor indicating the groups in the data where the number of
successes (possibly one) is fixed by the research design. It may be useful
to use 

prior 
The prior distribution for the (nonhierarchical) regression coefficients. The default priors are described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default,
See the priors help page for details on the families and
how to specify the arguments for all of the functions in the table above.
To omit a prior —i.e., to use a flat (improper) uniform prior—
Note: Unless 

prior_covariance 
Cannot be 

prior_PD 
A logical scalar (defaulting to 

algorithm 
A string (possibly abbreviated) indicating the
estimation approach to use. Can be 

adapt_delta 
Only relevant if 

QR 
A logical scalar defaulting to 

sparse 
A logical scalar (defaulting to 
The stan_clogit
function is mostly similar in syntax to
clogit
but rather than performing maximum
likelihood estimation of generalized linear models, full Bayesian
estimation is performed (if algorithm
is "sampling"
) via
MCMC. The Bayesian model adds priors (independent by default) on the
coefficients of the GLM.
The data.frame
passed to the data
argument must be sorted by
the variable passed to the strata
argument.
The formula
may have groupspecific terms like in
stan_glmer
but should not allow the intercept to vary by the
stratifying variable, since there is no information in the data with which
to estimate such deviations in the intercept.
A stanreg object is returned
for stan_clogit
.
stanregmethods
and
clogit
.
The vignette for Bernoulli and binomial models, which has more
details on using stan_clogit
.
https://mcstan.org/rstanarm/articles/
if (.Platform$OS.type != "windows") { dat < infert[order(infert$stratum), ] # order by strata post < stan_clogit(case ~ spontaneous + induced + (1  education), strata = stratum, data = dat, subset = parity <= 2, QR = TRUE, chains = 2, iter = 500) # for speed only nd < dat[dat$parity > 2, c("case", "spontaneous", "induced", "education", "stratum")] # next line would fail without case and stratum variables pr < posterior_epred(post, newdata = nd) # get predicted probabilities # not a random variable b/c probabilities add to 1 within strata all.equal(rep(sum(nd$case), nrow(pr)), rowSums(pr)) }
if (.Platform$OS.type != "windows") { dat < infert[order(infert$stratum), ] # order by strata post < stan_clogit(case ~ spontaneous + induced + (1  education), strata = stratum, data = dat, subset = parity <= 2, QR = TRUE, chains = 2, iter = 500) # for speed only nd < dat[dat$parity > 2, c("case", "spontaneous", "induced", "education", "stratum")] # next line would fail without case and stratum variables pr < posterior_epred(post, newdata = nd) # get predicted probabilities # not a random variable b/c probabilities add to 1 within strata all.equal(rep(sum(nd$case), nrow(pr)), rowSums(pr)) }
Bayesian inference for GAMMs with flexible priors.
stan_gamm4( formula, random = NULL, family = gaussian(), data, weights = NULL, subset = NULL, na.action, knots = NULL, drop.unused.levels = TRUE, ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_smooth = exponential(autoscale = FALSE), prior_aux = exponential(autoscale = TRUE), prior_covariance = decov(), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE, sparse = FALSE ) plot_nonlinear( x, smooths, ..., prob = 0.9, facet_args = list(), alpha = 1, size = 0.75 )
stan_gamm4( formula, random = NULL, family = gaussian(), data, weights = NULL, subset = NULL, na.action, knots = NULL, drop.unused.levels = TRUE, ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_smooth = exponential(autoscale = FALSE), prior_aux = exponential(autoscale = TRUE), prior_covariance = decov(), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE, sparse = FALSE ) plot_nonlinear( x, smooths, ..., prob = 0.9, facet_args = list(), alpha = 1, size = 0.75 )
formula , random , family , data , knots , drop.unused.levels

Same as for


subset , weights , na.action

Same as 

... 
Further arguments passed to 

prior 
The prior distribution for the (nonhierarchical) regression coefficients. The default priors are described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default,
See the priors help page for details on the families and
how to specify the arguments for all of the functions in the table above.
To omit a prior —i.e., to use a flat (improper) uniform prior—
Note: Unless 

prior_intercept 
The prior distribution for the intercept (after centering all predictors, see note below). The default prior is described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default, Note: If using a dense representation of the design matrix
—i.e., if the 

prior_smooth 
The prior distribution for the hyperparameters in GAMs, with lower values yielding less flexible smooth functions.


prior_aux 
The prior distribution for the "auxiliary" parameter (if
applicable). The "auxiliary" parameter refers to a different parameter
depending on the The default prior is described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default, 

prior_covariance 
Cannot be 

prior_PD 
A logical scalar (defaulting to 

algorithm 
A string (possibly abbreviated) indicating the
estimation approach to use. Can be 

adapt_delta 
Only relevant if 

QR 
A logical scalar defaulting to 

sparse 
A logical scalar (defaulting to 

x 
An object produced by 

smooths 
An optional character vector specifying a subset of the smooth
functions specified in the call to 

prob 
For univarite smooths, a scalar between 0 and 1 governing the width of the uncertainty interval. 

facet_args 
An optional named list of arguments passed to


alpha , size

For univariate smooths, passed to

The stan_gamm4
function is similar in syntax to
gamm4
in the gamm4 package. But rather than performing
(restricted) maximum likelihood estimation with the lme4 package,
the stan_gamm4
function utilizes MCMC to perform Bayesian
estimation. The Bayesian model adds priors on the common regression
coefficients (in the same way as stan_glm
), priors on the
standard deviations of the smooth terms, and a prior on the decomposition
of the covariance matrices of any groupspecific parameters (as in
stan_glmer
). Estimating these models via MCMC avoids
the optimization issues that often crop up with GAMMs and provides better
estimates for the uncertainty in the parameter estimates.
See gamm4
for more information about the model
specicification and priors
for more information about the
priors on the main coefficients. The formula
should include at least
one smooth term, which can be specified in any way that is supported by the
jagam
function in the mgcv package. The
prior_smooth
argument should be used to specify a prior on the unknown
standard deviations that govern how smooth the smooth function is. The
prior_covariance
argument can be used to specify the prior on the
components of the covariance matrix for any (optional) groupspecific terms.
The gamm4
function in the gamm4 package uses
groupspecific terms to implement the departure from linearity in the smooth
terms, but that is not the case for stan_gamm4
where the groupspecific
terms are exactly the same as in stan_glmer
.
The plot_nonlinear
function creates a ggplot object with one facet for
each smooth function specified in the call to stan_gamm4
in the case
where all smooths are univariate. A subset of the smooth functions can be
specified using the smooths
argument, which is necessary to plot a
bivariate smooth or to exclude the bivariate smooth and plot the univariate
ones. In the bivariate case, a plot is produced using
geom_contour
. In the univariate case, the resulting
plot is conceptually similar to plot.gam
except the
outer lines here demark the edges of posterior uncertainty intervals
(credible intervals) rather than confidence intervals and the inner line
is the posterior median of the function rather than the function implied
by a point estimate. To change the colors used in the plot see
color_scheme_set
.
A stanreg object is returned
for stan_gamm4
.
plot_nonlinear
returns a ggplot object.
Crainiceanu, C., Ruppert D., and Wand, M. (2005). Bayesian analysis for penalized spline regression using WinBUGS. Journal of Statistical Software. 14(14), 1–22. doi:10.18637/jss.v014.i14
stanregmethods
and
gamm4
.
The vignette for stan_glmer
, which also discusses
stan_gamm4
. https://mcstan.org/rstanarm/articles/
if (.Platform$OS.type != "windows") { # from example(gamm4, package = "gamm4"), prefixing gamm4() call with stan_ dat < mgcv::gamSim(1, n = 400, scale = 2) ## simulate 4 term additive truth ## Now add 20 level random effect `fac'... dat$fac < fac < as.factor(sample(1:20, 400, replace = TRUE)) dat$y < dat$y + model.matrix(~ fac  1) %*% rnorm(20) * .5 br < stan_gamm4(y ~ s(x0) + x1 + s(x2), data = dat, random = ~ (1  fac), chains = 1, iter = 500) # for example speed print(br) plot_nonlinear(br) plot_nonlinear(br, smooths = "s(x0)", alpha = 2/3) }
if (.Platform$OS.type != "windows") { # from example(gamm4, package = "gamm4"), prefixing gamm4() call with stan_ dat < mgcv::gamSim(1, n = 400, scale = 2) ## simulate 4 term additive truth ## Now add 20 level random effect `fac'... dat$fac < fac < as.factor(sample(1:20, 400, replace = TRUE)) dat$y < dat$y + model.matrix(~ fac  1) %*% rnorm(20) * .5 br < stan_gamm4(y ~ s(x0) + x1 + s(x2), data = dat, random = ~ (1  fac), chains = 1, iter = 500) # for example speed print(br) plot_nonlinear(br) plot_nonlinear(br, smooths = "s(x0)", alpha = 2/3) }
Generalized linear modeling with optional prior distributions for the coefficients, intercept, and auxiliary parameters.
stan_glm( formula, family = gaussian(), data, weights, subset, na.action = NULL, offset = NULL, model = TRUE, x = FALSE, y = TRUE, contrasts = NULL, ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), mean_PPD = algorithm != "optimizing" && !prior_PD, adapt_delta = NULL, QR = FALSE, sparse = FALSE ) stan_glm.nb( formula, data, weights, subset, na.action = NULL, offset = NULL, model = TRUE, x = FALSE, y = TRUE, contrasts = NULL, link = "log", ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), mean_PPD = algorithm != "optimizing", adapt_delta = NULL, QR = FALSE ) stan_glm.fit( x, y, weights = rep(1, NROW(y)), offset = rep(0, NROW(y)), family = gaussian(), ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_smooth = exponential(autoscale = FALSE), prior_ops = NULL, group = list(), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), mean_PPD = algorithm != "optimizing" && !prior_PD, adapt_delta = NULL, QR = FALSE, sparse = FALSE, importance_resampling = algorithm != "sampling", keep_every = algorithm != "sampling" )
stan_glm( formula, family = gaussian(), data, weights, subset, na.action = NULL, offset = NULL, model = TRUE, x = FALSE, y = TRUE, contrasts = NULL, ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), mean_PPD = algorithm != "optimizing" && !prior_PD, adapt_delta = NULL, QR = FALSE, sparse = FALSE ) stan_glm.nb( formula, data, weights, subset, na.action = NULL, offset = NULL, model = TRUE, x = FALSE, y = TRUE, contrasts = NULL, link = "log", ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), mean_PPD = algorithm != "optimizing", adapt_delta = NULL, QR = FALSE ) stan_glm.fit( x, y, weights = rep(1, NROW(y)), offset = rep(0, NROW(y)), family = gaussian(), ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_smooth = exponential(autoscale = FALSE), prior_ops = NULL, group = list(), prior_PD = FALSE, algorithm = c("sampling", "optimizing", "meanfield", "fullrank"), mean_PPD = algorithm != "optimizing" && !prior_PD, adapt_delta = NULL, QR = FALSE, sparse = FALSE, importance_resampling = algorithm != "sampling", keep_every = algorithm != "sampling" )
formula , data , subset

Same as 

family 
Same as 

na.action , contrasts

Same as 

model , offset , weights

Same as 

x 
In 

y 
In 

... 
Further arguments passed to the function in the rstan
package ( Another useful argument that can be passed to rstan via 

prior 
The prior distribution for the (nonhierarchical) regression coefficients. The default priors are described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default,
See the priors help page for details on the families and
how to specify the arguments for all of the functions in the table above.
To omit a prior —i.e., to use a flat (improper) uniform prior—
Note: Unless 

prior_intercept 
The prior distribution for the intercept (after centering all predictors, see note below). The default prior is described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default, Note: If using a dense representation of the design matrix
—i.e., if the 

prior_aux 
The prior distribution for the "auxiliary" parameter (if
applicable). The "auxiliary" parameter refers to a different parameter
depending on the The default prior is described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default, 

prior_PD 
A logical scalar (defaulting to 

algorithm 
A string (possibly abbreviated) indicating the
estimation approach to use. Can be 

mean_PPD 
A logical value indicating whether the sample mean of the
posterior predictive distribution of the outcome should be calculated in
the 

adapt_delta 
Only relevant if 

QR 
A logical scalar defaulting to 

sparse 
A logical scalar (defaulting to 

link 
For 

prior_smooth 
The prior distribution for the hyperparameters in GAMs, with lower values yielding less flexible smooth functions.


prior_ops 
Deprecated. See rstanarmdeprecated for details. 

group 
A list, possibly of length zero (the default), but otherwise
having the structure of that produced by 

importance_resampling 
Logical scalar indicating whether to use
importance resampling when approximating the posterior distribution with
a multivariate normal around the posterior mode, which only applies
when 

keep_every 
Positive integer, which defaults to 1, but can be higher
in order to "thin" the importance sampling realizations. Applies only
when 
The stan_glm
function is similar in syntax to
glm
but rather than performing maximum likelihood
estimation of generalized linear models, full Bayesian estimation is
performed (if algorithm
is "sampling"
) via MCMC. The Bayesian
model adds priors (independent by default) on the coefficients of the GLM.
The stan_glm
function calls the workhorse stan_glm.fit
function, but it is also possible to call the latter directly.
The stan_glm.nb
function, which takes the extra argument
link
, is a wrapper for stan_glm
with family =
neg_binomial_2(link)
.
A stanreg object is returned
for stan_glm, stan_glm.nb
.
A stanfit object (or a slightly modified
stanfit object) is returned if stan_glm.fit
is called directly.
Gelman, A. and Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, Cambridge, UK. (Ch. 36)
Muth, C., Oravecz, Z., and Gabry, J. (2018) Userfriendly Bayesian regression modeling: A tutorial with rstanarm and shinystan. The Quantitative Methods for Psychology. 14(2), 99–119. https://www.tqmp.org/RegularArticles/vol142/p099/p099.pdf
stanregmethods
and
glm
.
The various vignettes for stan_glm
at
https://mcstan.org/rstanarm/articles/.
if (.Platform$OS.type != "windows") { ### Linear regression mtcars$mpg10 < mtcars$mpg / 10 fit < stan_glm( mpg10 ~ wt + cyl + am, data = mtcars, QR = TRUE, # for speed of example only (default is "sampling") algorithm = "fullrank", refresh = 0 ) plot(fit, prob = 0.5) plot(fit, prob = 0.5, pars = "beta") plot(fit, "hist", pars = "sigma") ### Logistic regression head(wells) wells$dist100 < wells$dist / 100 fit2 < stan_glm( switch ~ dist100 + arsenic, data = wells, family = binomial(link = "logit"), prior_intercept = normal(0, 10), QR = TRUE, refresh = 0, # for speed of example only chains = 2, iter = 200 ) print(fit2) prior_summary(fit2) # ?bayesplot::mcmc_areas plot(fit2, plotfun = "areas", prob = 0.9, pars = c("(Intercept)", "arsenic")) # ?bayesplot::ppc_error_binned pp_check(fit2, plotfun = "error_binned") ### Poisson regression (example from help("glm")) count_data < data.frame( counts = c(18,17,15,20,10,20,25,13,12), outcome = gl(3,1,9), treatment = gl(3,3) ) fit3 < stan_glm( counts ~ outcome + treatment, data = count_data, family = poisson(link="log"), prior = normal(0, 2), refresh = 0, # for speed of example only chains = 2, iter = 250 ) print(fit3) bayesplot::color_scheme_set("viridis") plot(fit3) plot(fit3, regex_pars = c("outcome", "treatment")) plot(fit3, plotfun = "combo", regex_pars = "treatment") # ?bayesplot::mcmc_combo posterior_vs_prior(fit3, regex_pars = c("outcome", "treatment")) ### Gamma regression (example from help("glm")) clotting < data.frame(log_u = log(c(5,10,15,20,30,40,60,80,100)), lot1 = c(118,58,42,35,27,25,21,19,18), lot2 = c(69,35,26,21,18,16,13,12,12)) fit4 < stan_glm( lot1 ~ log_u, data = clotting, family = Gamma(link="log"), iter = 500, # for speed of example only refresh = 0 ) print(fit4, digits = 2) fit5 < update(fit4, formula = lot2 ~ log_u) # ?bayesplot::ppc_dens_overlay bayesplot::bayesplot_grid( pp_check(fit4, seed = 123), pp_check(fit5, seed = 123), titles = c("lot1", "lot2") ) ### Negative binomial regression fit6 < stan_glm.nb( Days ~ Sex/(Age + Eth*Lrn), data = MASS::quine, link = "log", prior_aux = exponential(1.5, autoscale=TRUE), chains = 2, iter = 200, # for speed of example only refresh = 0 ) prior_summary(fit6) bayesplot::color_scheme_set("brightblue") plot(fit6) pp_check(fit6, plotfun = "hist", nreps = 5) # ?bayesplot::ppc_hist # 80% interval of estimated reciprocal_dispersion parameter posterior_interval(fit6, pars = "reciprocal_dispersion", prob = 0.8) plot(fit6, "areas", pars = "reciprocal_dispersion", prob = 0.8) }
if (.Platform$OS.type != "windows") { ### Linear regression mtcars$mpg10 < mtcars$mpg / 10 fit < stan_glm( mpg10 ~ wt + cyl + am, data = mtcars, QR = TRUE, # for speed of example only (default is "sampling") algorithm = "fullrank", refresh = 0 ) plot(fit, prob = 0.5) plot(fit, prob = 0.5, pars = "beta") plot(fit, "hist", pars = "sigma") ### Logistic regression head(wells) wells$dist100 < wells$dist / 100 fit2 < stan_glm( switch ~ dist100 + arsenic, data = wells, family = binomial(link = "logit"), prior_intercept = normal(0, 10), QR = TRUE, refresh = 0, # for speed of example only chains = 2, iter = 200 ) print(fit2) prior_summary(fit2) # ?bayesplot::mcmc_areas plot(fit2, plotfun = "areas", prob = 0.9, pars = c("(Intercept)", "arsenic")) # ?bayesplot::ppc_error_binned pp_check(fit2, plotfun = "error_binned") ### Poisson regression (example from help("glm")) count_data < data.frame( counts = c(18,17,15,20,10,20,25,13,12), outcome = gl(3,1,9), treatment = gl(3,3) ) fit3 < stan_glm( counts ~ outcome + treatment, data = count_data, family = poisson(link="log"), prior = normal(0, 2), refresh = 0, # for speed of example only chains = 2, iter = 250 ) print(fit3) bayesplot::color_scheme_set("viridis") plot(fit3) plot(fit3, regex_pars = c("outcome", "treatment")) plot(fit3, plotfun = "combo", regex_pars = "treatment") # ?bayesplot::mcmc_combo posterior_vs_prior(fit3, regex_pars = c("outcome", "treatment")) ### Gamma regression (example from help("glm")) clotting < data.frame(log_u = log(c(5,10,15,20,30,40,60,80,100)), lot1 = c(118,58,42,35,27,25,21,19,18), lot2 = c(69,35,26,21,18,16,13,12,12)) fit4 < stan_glm( lot1 ~ log_u, data = clotting, family = Gamma(link="log"), iter = 500, # for speed of example only refresh = 0 ) print(fit4, digits = 2) fit5 < update(fit4, formula = lot2 ~ log_u) # ?bayesplot::ppc_dens_overlay bayesplot::bayesplot_grid( pp_check(fit4, seed = 123), pp_check(fit5, seed = 123), titles = c("lot1", "lot2") ) ### Negative binomial regression fit6 < stan_glm.nb( Days ~ Sex/(Age + Eth*Lrn), data = MASS::quine, link = "log", prior_aux = exponential(1.5, autoscale=TRUE), chains = 2, iter = 200, # for speed of example only refresh = 0 ) prior_summary(fit6) bayesplot::color_scheme_set("brightblue") plot(fit6) pp_check(fit6, plotfun = "hist", nreps = 5) # ?bayesplot::ppc_hist # 80% interval of estimated reciprocal_dispersion parameter posterior_interval(fit6, pars = "reciprocal_dispersion", prob = 0.8) plot(fit6, "areas", pars = "reciprocal_dispersion", prob = 0.8) }
Bayesian inference for GLMs with groupspecific coefficients that have unknown covariance matrices with flexible priors.
stan_glmer( formula, data = NULL, family = gaussian, subset, weights, na.action = getOption("na.action", "na.omit"), offset, contrasts = NULL, ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_covariance = decov(), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE, sparse = FALSE ) stan_lmer( formula, data = NULL, subset, weights, na.action = getOption("na.action", "na.omit"), offset, contrasts = NULL, ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_covariance = decov(), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE ) stan_glmer.nb( formula, data = NULL, subset, weights, na.action = getOption("na.action", "na.omit"), offset, contrasts = NULL, link = "log", ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_covariance = decov(), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE )
stan_glmer( formula, data = NULL, family = gaussian, subset, weights, na.action = getOption("na.action", "na.omit"), offset, contrasts = NULL, ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_covariance = decov(), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE, sparse = FALSE ) stan_lmer( formula, data = NULL, subset, weights, na.action = getOption("na.action", "na.omit"), offset, contrasts = NULL, ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_covariance = decov(), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE ) stan_glmer.nb( formula, data = NULL, subset, weights, na.action = getOption("na.action", "na.omit"), offset, contrasts = NULL, link = "log", ..., prior = default_prior_coef(family), prior_intercept = default_prior_intercept(family), prior_aux = exponential(autoscale = TRUE), prior_covariance = decov(), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL, QR = FALSE )
formula , data

Same as for 

family 
Same as for 

subset , weights , offset

Same as 

na.action , contrasts

Same as 

... 
For 

prior 
The prior distribution for the (nonhierarchical) regression coefficients. The default priors are described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default,
See the priors help page for details on the families and
how to specify the arguments for all of the functions in the table above.
To omit a prior —i.e., to use a flat (improper) uniform prior—
Note: Unless 

prior_intercept 
The prior distribution for the intercept (after centering all predictors, see note below). The default prior is described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default, Note: If using a dense representation of the design matrix
—i.e., if the 

prior_aux 
The prior distribution for the "auxiliary" parameter (if
applicable). The "auxiliary" parameter refers to a different parameter
depending on the The default prior is described in the vignette
Prior
Distributions for rstanarm Models.
If not using the default, 

prior_covariance 
Cannot be 

prior_PD 
A logical scalar (defaulting to 

algorithm 
A string (possibly abbreviated) indicating the
estimation approach to use. Can be 

adapt_delta 
Only relevant if 

QR 
A logical scalar defaulting to 

sparse 
A logical scalar (defaulting to 

link 
For 
The stan_glmer
function is similar in syntax to
glmer
but rather than performing (restricted) maximum
likelihood estimation of generalized linear models, Bayesian estimation is
performed via MCMC. The Bayesian model adds priors on the
regression coefficients (in the same way as stan_glm
) and
priors on the terms of a decomposition of the covariance matrices of the
groupspecific parameters. See priors
for more information
about the priors.
The stan_lmer
function is equivalent to stan_glmer
with
family = gaussian(link = "identity")
.
The stan_glmer.nb
function, which takes the extra argument
link
, is a wrapper for stan_glmer
with family =
neg_binomial_2(link)
.
A stanreg object is returned
for stan_glmer, stan_lmer, stan_glmer.nb
.
A list with classes stanreg
, glm
, lm
,
and lmerMod
. The conventions for the parameter names are the
same as in the lme4 package with the addition that the standard
deviation of the errors is called sigma
and the variancecovariance
matrix of the groupspecific deviations from the common parameters is
called Sigma
, even if this variancecovariance matrix only has
one row and one column (in which case it is just the grouplevel variance).
Gelman, A. and Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, Cambridge, UK. (Ch. 1115)
Muth, C., Oravecz, Z., and Gabry, J. (2018) Userfriendly Bayesian regression modeling: A tutorial with rstanarm and shinystan. The Quantitative Methods for Psychology. 14(2), 99–119. https://www.tqmp.org/RegularArticles/vol142/p099/p099.pdf
stanregmethods
and
glmer
.
The vignette for stan_glmer
and the Hierarchical
Partial Pooling vignette. https://mcstan.org/rstanarm/articles/
if (.Platform$OS.type != "windows") { # see help(example_model) for details on the model below if (!exists("example_model")) example(example_model) print(example_model, digits = 1) }
if (.Platform$OS.type != "windows") { # see help(example_model) for details on the model below if (!exists("example_model")) example(example_model) print(example_model, digits = 1) }
Fits a shared parameter joint model for longitudinal and timetoevent (e.g. survival) data under a Bayesian framework using Stan.
stan_jm( formulaLong, dataLong, formulaEvent, dataEvent, time_var, id_var, family = gaussian, assoc = "etavalue", lag_assoc = 0, grp_assoc, scale_assoc = NULL, epsilon = 1e05, basehaz = c("bs", "weibull", "piecewise"), basehaz_ops, qnodes = 15, init = "prefit", weights, priorLong = normal(autoscale = TRUE), priorLong_intercept = normal(autoscale = TRUE), priorLong_aux = cauchy(0, 5, autoscale = TRUE), priorEvent = normal(autoscale = TRUE), priorEvent_intercept = normal(autoscale = TRUE), priorEvent_aux = cauchy(autoscale = TRUE), priorEvent_assoc = normal(autoscale = TRUE), prior_covariance = lkj(autoscale = TRUE), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL, max_treedepth = 10L, QR = FALSE, sparse = FALSE, ... )
stan_jm( formulaLong, dataLong, formulaEvent, dataEvent, time_var, id_var, family = gaussian, assoc = "etavalue", lag_assoc = 0, grp_assoc, scale_assoc = NULL, epsilon = 1e05, basehaz = c("bs", "weibull", "piecewise"), basehaz_ops, qnodes = 15, init = "prefit", weights, priorLong = normal(autoscale = TRUE), priorLong_intercept = normal(autoscale = TRUE), priorLong_aux = cauchy(0, 5, autoscale = TRUE), priorEvent = normal(autoscale = TRUE), priorEvent_intercept = normal(autoscale = TRUE), priorEvent_aux = cauchy(autoscale = TRUE), priorEvent_assoc = normal(autoscale = TRUE), prior_covariance = lkj(autoscale = TRUE), prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"), adapt_delta = NULL, max_treedepth = 10L, QR = FALSE, sparse = FALSE, ... )
formulaLong 
A twosided linear formula object describing both the
fixedeffects and randomeffects parts of the longitudinal submodel,
similar in vein to formula specification in the lme4 package
(see 
dataLong 
A data frame containing the variables specified in

formulaEvent 
A twosided formula object describing the event
submodel. The left hand side of the formula should be a 
dataEvent 
A data frame containing the variables specified in

time_var 
A character string specifying the name of the variable
in 
id_var 
A character string specifying the name of the variable in

family 
The family (and possibly also the link function) for the
longitudinal submodel(s). See 
assoc 
A character string or character vector specifying the joint
model association structure. Possible association structures that can
be used include: "etavalue" (the default); "etaslope"; "etaauc";
"muvalue"; "muslope"; "muauc"; "shared_b"; "shared_coef"; or "null".
These are described in the Details section below. For a multivariate
joint model, different association structures can optionally be used for
each longitudinal submodel by specifying a list of character
vectors, with each element of the list specifying the desired association
structure for one of the longitudinal submodels. Specifying 
lag_assoc 
A nonnegative scalar specifying the time lag that should be
used for the association structure. That is, the hazard of the event at
time t will be assumed to be associated with the value/slope/auc of
the longitudinal marker at time tu, where u is the time lag.
If fitting a multivariate joint model, then a different time lag can be used
for each longitudinal marker by providing a numeric vector of lags, otherwise
if a scalar is provided then the specified time lag will be used for all
longitudinal markers. Note however that only one time lag can be specified
for linking each longitudinal marker to the
event, and that that time lag will be used for all association structure
types (e.g. 