Title: | Tools for Developing R Packages Interfacing with 'Stan' |
---|---|
Description: | Provides various tools for developers of R packages interfacing with 'Stan' <https://mc-stan.org>, including functions to set up the required package structure, S3 generics and default methods to unify function naming across 'Stan'-based R packages, and vignettes with recommendations for developers. |
Authors: | Jonah Gabry [aut, cre], Ben Goodrich [aut], Martin Lysy [aut], Andrew Johnson [aut], Hamada S. Badr [ctb], Marco Colombo [ctb], Stefan Siegert [ctb], Trustees of Columbia University [cph] |
Maintainer: | Jonah Gabry <[email protected]> |
License: | GPL (>=3) |
Version: | 2.4.0.9000 |
Built: | 2024-12-16 03:29:21 UTC |
Source: | https://github.com/stan-dev/rstantools |
Stan Development Team
The rstantools package provides various tools for developers of R packages interfacing with Stan (https://mc-stan.org), including functions to set up the required package structure, S3 generic methods to unify function naming across Stan-based R packages, and vignettes with guidelines for developers. To get started building a package, see rstan_create_package().
Maintainer: Jonah Gabry [email protected]
Authors:
Ben Goodrich [email protected]
Martin Lysy [email protected]
Andrew Johnson
Other contributors:
Hamada S. Badr [contributor]
Marco Colombo [contributor]
Stefan Siegert [contributor]
Trustees of Columbia University [copyright holder]
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
After reading the guidelines for developers, if you have trouble setting up your package, let us know on the Stan Forums or at the rstantools GitHub issue tracker.
Useful links:
Report bugs at https://github.com/stan-dev/rstantools/issues
Generic function and default method for Bayesian version of R-squared for regression models. A generic for LOO-adjusted R-squared is also provided. See the bayes_R2.stanreg() method in the rstanarm package for an example of defining a method.
bayes_R2(object, ...)

## Default S3 method:
bayes_R2(object, y, ...)

loo_R2(object, ...)
object |
The object to use. |
... |
Arguments passed to methods. See the methods in the rstanarm package for examples. |
y |
For the default method, a vector of |
bayes_R2() and loo_R2() methods should return a vector of length equal to the posterior sample size.
The default bayes_R2() method just takes object to be a matrix of y-hat values (one column per observation, one row per posterior draw) and y to be a vector with length equal to ncol(object).
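As a rough illustration of the shapes involved, the base R sketch below computes one R-squared value per posterior draw from a matrix of fitted values, using the variance-ratio definition from Gelman et al. (2018). The helper name bayes_r2_sketch and the simulated data are hypothetical, and the sketch is not claimed to reproduce the internals of the default method.

```r
# Hypothetical helper: per-draw Bayesian R-squared from fitted values.
# yhat: matrix with one row per posterior draw, one column per observation.
# y:    vector of observed outcomes, length ncol(yhat).
bayes_r2_sketch <- function(yhat, y) {
  stopifnot(is.matrix(yhat), length(y) == ncol(yhat))
  resid <- sweep(yhat, 2, y)       # subtract y[j] from column j
  var_fit <- apply(yhat, 1, var)   # variance of fitted values, per draw
  var_res <- apply(resid, 1, var)  # variance of residuals, per draw
  var_fit / (var_fit + var_res)
}

set.seed(1)
n_draws <- 50
n_obs <- 10
y <- rnorm(n_obs)
yhat <- matrix(rnorm(n_draws * n_obs), n_draws, n_obs)  # fake y-hat draws

r2 <- bayes_r2_sketch(yhat, y)
length(r2)  # one value per posterior draw
```

The result has one element per row of the input matrix, which is the shape that bayes_R2() methods are expected to return.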
Andrew Gelman, Ben Goodrich, Jonah Gabry, and Aki Vehtari (2018). R-squared for Bayesian regression models. The American Statistician, to appear. DOI: 10.1080/00031305.2018.1549100. (Preprint, Notebook)
The rstanarm package (mc-stan.org/rstanarm) for example methods (CRAN, GitHub).
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
We define a new function log_lik() rather than a stats::logLik() method because (in addition to the conceptual difference) the documentation for logLik() states that the return value will be a single number, whereas log_lik() returns a matrix. See the log_lik.stanreg() method in the rstanarm package for an example of defining a method.
log_lik(object, ...)
object |
The object to use. |
... |
Arguments passed to methods. See the methods in the rstanarm package for examples. |
log_lik() methods should return an $S$ by $N$ matrix, where $S$ is the size of the posterior sample (the number of draws from the posterior distribution) and $N$ is the number of data points.
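The expected shape can be sketched in base R for a simple normal model. The simulated draws below stand in for real posterior draws, and every object name is hypothetical; a real method would extract the draws from the fitted model object.

```r
set.seed(2)
y <- rnorm(8)                            # 8 observed data points
n_draws <- 100                           # 100 posterior draws
mu_draws <- rnorm(n_draws, 0, 0.3)       # fake posterior draws of the mean
sigma_draws <- runif(n_draws, 0.8, 1.2)  # fake posterior draws of the sd

# Column n holds the log-density of y[n] evaluated under each draw.
log_lik_matrix <- sapply(y, function(y_n) {
  dnorm(y_n, mean = mu_draws, sd = sigma_draws, log = TRUE)
})

dim(log_lik_matrix)  # rows = draws, columns = data points
```

Keeping draws in rows and observations in columns matches the layout described above for log_lik() methods.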
The rstanarm package (mc-stan.org/rstanarm) for example methods (CRAN, GitHub).
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
# See help("log_lik", package = "rstanarm")
See the methods in the rstanarm package for examples.
loo_linpred(object, ...)

loo_predict(object, ...)

loo_predictive_interval(object, ...)

loo_pit(object, ...)

## Default S3 method:
loo_pit(object, y, lw, ...)
object |
The object to use. |
... |
Arguments passed to methods. See the methods in the rstanarm package for examples. |
y |
For the default method of |
lw |
For the default method of |
loo_predict(), loo_linpred(), and loo_pit() (probability integral transform) methods should return a vector with length equal to the number of observations in the data. For discrete observations, the probability integral transform is randomised to ensure theoretical uniformity, so fix the random seed to get reproducible results with discrete data. For more details, see Czado et al. (2009).
loo_predictive_interval() methods should return a two-column matrix formatted in the same way as for predictive_interval().
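To see why the seed matters for discrete data, the base R sketch below applies the randomised probability integral transform of Czado et al. (2009) to Poisson counts with a known rate. It is a simplified, non-LOO illustration: it ignores the lw weights used by loo_pit(), and the function name randomised_pit is hypothetical.

```r
# Hypothetical helper: randomised PIT for Poisson data with a known rate.
# Each value is drawn uniformly between P(Y <= y - 1) and P(Y <= y).
randomised_pit <- function(y, lambda) {
  lower <- ppois(y - 1, lambda)  # equals 0 when y is 0
  upper <- ppois(y, lambda)
  lower + runif(length(y)) * (upper - lower)
}

set.seed(123)
y <- rpois(200, lambda = 3)

# Same seed, same result; without set.seed() the two calls would differ.
set.seed(42)
pit_a <- randomised_pit(y, lambda = 3)
set.seed(42)
pit_b <- randomised_pit(y, lambda = 3)

identical(pit_a, pit_b)
```

Because the transform draws uniform random numbers, two calls agree only when the random number generator starts from the same state.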
Czado, C., Gneiting, T., and Held, L. (2009). Predictive Model Assessment for Count Data. Biometrics. 65(4), 1254-1261. doi:10.1111/j.1541-0420.2009.01191.x. Journal version: https://doi.org/10.1111/j.1541-0420.2009.01191.x
The rstanarm package (mc-stan.org/rstanarm) for example methods (CRAN, GitHub).
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
Extract the posterior draws of the conditional expectation. See the rstanarm package for an example.
posterior_epred(object, ...)
object |
The object to use. |
... |
Arguments passed to methods. See the methods in the rstanarm package for examples. |
posterior_epred() methods should return a $D$ by $N$ matrix, where $D$ is the number of draws from the posterior distribution and $N$ is the number of data points.
The rstanarm package (mc-stan.org/rstanarm) for example methods (CRAN, GitHub).
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
These intervals are often referred to as credible intervals, but we use the term uncertainty intervals to highlight the fact that wider intervals correspond to greater uncertainty. See posterior_interval.stanreg() in the rstanarm package for an example.
posterior_interval(object, ...)

## Default S3 method:
posterior_interval(object, prob = 0.9, ...)
object |
The object to use. |
... |
Arguments passed to methods. See the methods in the rstanarm package for examples. |
prob |
A number |
posterior_interval() methods should return a matrix with two columns and as many rows as model parameters (or a subset of parameters specified by the user). For a given value of prob, $p$, the columns correspond to the lower and upper $100p$% interval limits and have the names $100\alpha/2$% and $100(1 - \alpha/2)$%, where $\alpha = 1 - p$. For example, if prob=0.9 is specified (a 90% interval), then the column names would be "5%" and "95%", respectively.
The default method just takes object to be a matrix (one column per parameter) and computes quantiles, with prob defaulting to 0.9.
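The quantile computation can be sketched in base R as follows. This is a minimal sketch assuming central intervals with equal tail probabilities; the helper name interval_sketch is hypothetical and it is not the package's own implementation.

```r
# Hypothetical helper: central intervals from a matrix of draws
# (one column per parameter, one row per draw).
interval_sketch <- function(draws, prob = 0.9) {
  stopifnot(is.matrix(draws), prob > 0, prob < 1)
  alpha <- (1 - prob) / 2  # probability mass left in each tail
  # apply() returns one column per parameter, so transpose to get
  # one row per parameter and one column per interval limit.
  t(apply(draws, 2, quantile, probs = c(alpha, 1 - alpha)))
}

set.seed(4)
draws <- matrix(rnorm(100 * 3), 100, 3)  # fake draws
colnames(draws) <- paste0("theta_", 1:3)

ints <- interval_sketch(draws, prob = 0.9)
dim(ints)
colnames(ints)
```

The column names come from quantile() and reflect the two tail probabilities, while the row names are the parameter names taken from the columns of the input matrix.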
The rstanarm package (mc-stan.org/rstanarm) for example methods (CRAN, GitHub).
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
# Default method takes a numeric matrix (of posterior draws)
draws <- matrix(rnorm(100 * 5), 100, 5) # fake draws
colnames(draws) <- paste0("theta_", 1:5)
posterior_interval(draws)

# Also see help("posterior_interval", package = "rstanarm")
Extract the posterior draws of the linear predictor, possibly transformed by the inverse-link function. See posterior_linpred.stanreg() in the rstanarm package for an example.
posterior_linpred(object, transform = FALSE, ...)
object |
The object to use. |
transform |
Should the linear predictor be transformed using the
inverse-link function? The default is |
... |
Arguments passed to methods. See the methods in the rstanarm package for examples. |
posterior_linpred() methods should return a $D$ by $N$ matrix, where $D$ is the number of draws from the posterior distribution and $N$ is the number of data points.
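The effect of the transform argument can be sketched in base R for a logistic regression with a single predictor. The simulated coefficient draws and all object names are hypothetical; the point is only that the untransformed and transformed results share the same draws-by-observations shape.

```r
set.seed(5)
n_draws <- 200
x <- c(-2, 0, 1, 3)                # four data points
alpha <- rnorm(n_draws, 0.5, 0.1)  # fake posterior draws of the intercept
beta <- rnorm(n_draws, 1, 0.1)     # fake posterior draws of the slope

# Linear predictor: row d uses draw d, column n uses x[n].
eta <- alpha + outer(beta, x)

# Analogue of transform = TRUE for a logit link: apply the inverse link.
mu <- plogis(eta)

dim(eta)
range(mu)
```

Applying the inverse link elementwise keeps the matrix dimensions and maps every value onto the probability scale.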
The rstanarm package (mc-stan.org/rstanarm) for example methods (CRAN, GitHub).
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
# See help("posterior_linpred", package = "rstanarm")
Draw from the posterior predictive distribution of the outcome. See posterior_predict.stanreg() in the rstanarm package for an example.
posterior_predict(object, ...)
object |
The object to use. |
... |
Arguments passed to methods. See the methods in the rstanarm package for examples. |
posterior_predict() methods should return a $D$ by $N$ matrix, where $D$ is the number of draws from the posterior predictive distribution and $N$ is the number of data points being predicted per draw.
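A base R sketch of what such a method might compute for a linear regression with normal errors is shown below. The simulated parameter draws and all object names are hypothetical stand-ins for draws extracted from a fitted model.

```r
set.seed(6)
n_draws <- 200
x_new <- c(-1, 0, 1, 2)             # data points to predict
n_new <- length(x_new)
alpha <- rnorm(n_draws, 1, 0.1)     # fake posterior draws of the intercept
beta <- rnorm(n_draws, 2, 0.1)      # fake posterior draws of the slope
sigma <- runif(n_draws, 0.4, 0.6)   # fake posterior draws of the error sd

# Conditional mean for each draw (rows) and data point (columns).
mu <- alpha + outer(beta, x_new)

# Add observation noise; sigma[d] scales every entry in row d.
noise <- matrix(rnorm(n_draws * n_new), n_draws, n_new)
ytilde <- mu + sigma * noise

dim(ytilde)
```

Each row of ytilde is one simulated data set for the new data points, drawn using a single posterior draw of the parameters.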
The rstanarm package (mc-stan.org/rstanarm) for example methods (CRAN, GitHub).
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
# See help("posterior_predict", package = "rstanarm")
Generic function and default method for computing predictive errors $y - y^{rep}$ (in-sample, for observed $y$) or $y - \tilde{y}$ (out-of-sample, for new or held-out $y$). See predictive_error.stanreg() in the rstanarm package for an example.
predictive_error(object, ...)

## Default S3 method:
predictive_error(object, y, ...)
object |
The object to use. |
... |
Arguments passed to methods. See the methods in the rstanarm package for examples. |
y |
For the default method, a vector of |
predictive_error() methods should return a $D$ by $N$ matrix, where $D$ is the number of draws from the posterior predictive distribution and $N$ is the number of data points being predicted per draw.
The default method just takes object to be a matrix and y to be a vector.
The rstanarm package (mc-stan.org/rstanarm) for example methods (CRAN, GitHub).
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
# default method
y <- rnorm(10)
ypred <- matrix(rnorm(500), 50, 10)
pred_errors <- predictive_error(ypred, y)
dim(pred_errors)
head(pred_errors)

# Also see help("predictive_error", package = "rstanarm")
See predictive_interval.stanreg() in the rstanarm package for an example.
predictive_interval(object, ...)

## Default S3 method:
predictive_interval(object, prob = 0.9, ...)
object |
The object to use. |
... |
Arguments passed to methods. See the methods in the rstanarm package for examples. |
prob |
A number |
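predictive_interval() methods should return a matrix with two columns and as many rows as data points being predicted. For a given value of prob, $p$, the columns correspond to the lower and upper $100p$% central interval limits and have the names $100\alpha/2$% and $100(1 - \alpha/2)$%, where $\alpha = 1 - p$. For example, if prob=0.9 is specified (a 90% interval), then the column names would be "5%" and "95%", respectively.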
The default method just takes object to be a matrix and computes quantiles, with prob defaulting to 0.9.
The rstanarm package (mc-stan.org/rstanarm) for example methods (CRAN, GitHub).
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
# Default method takes a numeric matrix (of draws from posterior
# predictive distribution)
ytilde <- matrix(rnorm(100 * 5, sd = 2), 100, 5) # fake draws
predictive_interval(ytilde, prob = 0.8)

# Also see help("predictive_interval", package = "rstanarm")
See prior_summary.stanreg() in the rstanarm package for an example.
prior_summary(object, ...)

## Default S3 method:
prior_summary(object, ...)
object |
The object to use. |
... |
Arguments passed to methods. See the methods in the rstanarm package for examples. |
prior_summary() methods should return an object containing information about the prior distribution(s) used for the given model. The structure of this object will depend on the method.
The default method just returns object$prior.info, which is NULL if there is no 'prior.info' element.
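The behaviour described for the default method can be mimicked in base R with plain lists. The helper name prior_info_sketch, the two fake fit objects, and the prior strings are all hypothetical.

```r
# Hypothetical stand-in for the default behaviour: return the
# 'prior.info' element of the object, or NULL if there is none.
prior_info_sketch <- function(object) object$prior.info

fit_with <- list(
  prior.info = list(intercept = "normal(0, 10)", slope = "normal(0, 2.5)")
)
fit_without <- list(coefficients = c(a = 1))

prior_info_sketch(fit_with)              # the stored prior information
is.null(prior_info_sketch(fit_without))  # no 'prior.info' element
```

A package can therefore get a working prior summary simply by storing a prior.info element in its fitted model objects, or define its own method for richer output.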
The rstanarm package (mc-stan.org/rstanarm) for example methods (CRAN, GitHub).
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
# See help("prior_summary", package = "rstanarm")
Creates or updates package-specific system files to compile .stan model files found in inst/stan.
rstan_config(pkgdir = ".")
pkgdir |
Path to package root folder. |
The Stan source files for the package should be stored in:
inst/stan for .stan files containing instructions to build a stanmodel object.
inst/stan/any_subfolder for files to be included via the #include "/my_subfolder/mylib.stan" directive.
inst/stan/any_subfolder for a license.stan file.
inst/include for the stan_meta_header.hpp file, to be used for directly interacting with the Stan C++ libraries.
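The layout above can be mocked up in a temporary directory with base R, which may help when checking where each file belongs. The package name mypkg, the file names foo.stan and mylib.stan, and the placeholder file contents are hypothetical; in a real package rstantools creates these pieces for you.

```r
pkgdir <- file.path(tempdir(), "mypkg")  # hypothetical package root

# Folders for Stan programs, included Stan files, and C++ headers.
dir.create(file.path(pkgdir, "inst", "stan", "include"),
           recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkgdir, "inst", "include"),
           recursive = TRUE, showWarnings = FALSE)

# A model file, an include file, and the meta header (placeholder contents).
writeLines("// Stan program defining a model",
           file.path(pkgdir, "inst", "stan", "foo.stan"))
writeLines("// Stan code pulled in via #include",
           file.path(pkgdir, "inst", "stan", "include", "mylib.stan"))
writeLines("// C++ includes for Stan go here",
           file.path(pkgdir, "inst", "include", "stan_meta_header.hpp"))

list.files(pkgdir, recursive = TRUE)
```

Only files directly under inst/stan are treated as model definitions according to the description above; files in subfolders are meant to be included from those models.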
Invisibly, whether or not any files were added/removed/modified by the function.
The rstan_create_package() function helps get you started developing a new R package that interfaces with Stan via the rstan package. First the basic package structure is set up via usethis::create_package(). Then several adjustments are made so the package can include Stan programs that can be built into binary versions (i.e., pre-compiled Stan C++ code).
The Details section below describes the process and the See Also section provides links to recommendations for developers and a step-by-step walk-through.
As of version 2.0.0 of rstantools the rstan_package_skeleton() function is defunct and only rstan_create_package() is supported.
rstan_create_package(
  path,
  fields = NULL,
  rstudio = TRUE,
  open = TRUE,
  stan_files = character(),
  roxygen = TRUE,
  travis = FALSE,
  license = TRUE,
  auto_config = TRUE
)
path |
The path to the new package to be created (terminating in the package name). |
fields , rstudio , open
|
Same as |
stan_files |
A character vector with paths to |
roxygen |
Should roxygen2 be used for documentation? Defaults to
|
travis |
Should a |
license |
Logical or character; whether or not to paste the contents of
a |
auto_config |
Whether to automatically configure Stan functionality
whenever the package gets installed (see Details). Defaults to |
This function first creates a regular R package using usethis::create_package(), then adds the infrastructure required to compile and export stanmodel objects. In the package root directory, the user's Stan source code is located in:
inst/
  |_stan/
  |   |_include/
  |
  |_include/
All .stan files containing instructions to build a stanmodel object must be placed in inst/stan. Other .stan files go in any stan/ subdirectory, to be invoked by Stan's #include mechanism, e.g.,
#include "include/mylib.stan"
#include "data/preprocess.stan"
See rstanarm for many examples.
The folder inst/include is for all user C++ files associated with the Stan programs. In this folder, the only file to directly interact with the Stan C++ library is stan_meta_header.hpp; all other #include directives must be channeled through here.
The final step of the package creation is to invoke rstan_config(), which creates the following files for interfacing with Stan objects from R:
src contains the stan_ModelName{.cc/.hpp} pairs associated with all ModelName.stan files in inst/stan which define stanmodel objects.
src/Makevars[.win] links to the StanHeaders and Boost (BH) libraries.
R/stanmodels.R loads the C++ modules containing the stanmodel class definitions, and assigns an R instance of each stanmodel object to a stanmodels list (with names corresponding to the names of the Stan files).
When auto_config = TRUE, a configure[.win] file is added to the package, calling rstan_config() whenever the package is installed. Consequently, the package must list rstantools in the DESCRIPTION Imports field for this mechanism to work. Setting auto_config = FALSE removes the package's dependency on rstantools, but the package then must be manually configured by running rstan_config() whenever stanmodel files in inst/stan are added, removed, or modified.
In order to enable Stan functionality, rstantools copies some files to your package. Since these files are licensed as GPL >= 3, the same license applies to your package should you choose to distribute it. Even if you don't use rstantools to create your package, it is likely that you will be linking to Rcpp to export the Stan C++ stanmodel objects to R. Since Rcpp is released under GPL >= 2, the same license would apply to your package upon distribution.
Authors willing to license their Stan programs of general interest under the GPL are invited to contribute their .stan files and supporting R code to the rstanarm package.
The stanmodel objects corresponding to the Stan programs included with your package are stored in a list called stanmodels. To run one of the Stan programs from within an R function in your package just pass the appropriate element of the stanmodels list to one of the rstan functions for model fitting (e.g., sampling()). For example, for a Stan program "foo.stan" you would use rstan::sampling(stanmodels$foo, ...).
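A package-level wrapper around that call might look like the sketch below. The function name fit_foo_sketch and the data names N, x, and y are hypothetical and would need to match the data block of your own Stan program; the function only runs inside a package where the stanmodels list exists and rstan is installed.

```r
# Hypothetical wrapper for a Stan program stored as inst/stan/foo.stan.
# 'stanmodels' is the list created for the package in R/stanmodels.R.
fit_foo_sketch <- function(x, y, ...) {
  stopifnot(length(x) == length(y))
  # Names must match the variables declared in the Stan data block
  # (N, x, and y are assumed here for illustration).
  standata <- list(N = length(y), x = x, y = y)
  rstan::sampling(stanmodels$foo, data = standata, ...)
}
```

Passing ... through to rstan::sampling() lets users of the wrapper control arguments such as chains and iter without the wrapper having to list them.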
For devtools users, because of changes in the latest versions of roxygen2 it may be necessary to run pkgbuild::compile_dll() once before devtools::document() will work.
use_rstan() for adding Stan functionality to an existing R package and rstan_config() for updating an existing package when its Stan files are changed.
The rstanarm package repository on GitHub.
Guidelines and recommendations for developers of R packages interfacing with Stan and a demonstration getting a simple package working can be found in the vignettes included with rstantools and at mc-stan.org/rstantools/articles.
After reading the guidelines for developers, if you have trouble setting up your package, let us know on the Stan Forums or at the rstantools GitHub issue tracker.
Adapted from the sourceDir function defined by example(source).
rstantools_load_code(path, trace = TRUE, ...)
path |
Path to directory containing code to load |
trace |
Whether to print file names as they are loaded |
... |
Additional arguments passed to |
NULL
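For readers unfamiliar with the sourceDir pattern, the base R sketch below sources every R file in a directory and optionally prints each file name. It is an illustration of the general idea only, not the package's implementation; the function name load_code_sketch and the demo file are hypothetical.

```r
# Hypothetical helper: source all .R files found in a directory.
load_code_sketch <- function(path, trace = TRUE, ...) {
  files <- list.files(path, pattern = "[.][Rr]$", full.names = TRUE)
  for (f in files) {
    if (trace) cat("Loading", basename(f), "\n")
    source(f, ...)  # extra arguments are passed on to source()
  }
  invisible(NULL)
}

# Demo: write one small file, then load it into a separate environment.
code_dir <- file.path(tempdir(), "load_code_demo")
dir.create(code_dir, showWarnings = FALSE)
writeLines("demo_value <- 42", file.path(code_dir, "a.R"))

env <- new.env()
load_code_sketch(code_dir, trace = FALSE, local = env)
get("demo_value", envir = env)
```

Passing local = env through ... keeps the sourced objects out of the global environment, which is convenient when experimenting.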
Add Stan infrastructure to an existing R package. To create a new package containing Stan programs use rstan_create_package() instead.
use_rstan(pkgdir = ".", license = TRUE, auto_config = TRUE)
pkgdir |
Path to package root folder. |
license |
Logical or character; whether or not to paste the contents of
a |
auto_config |
Whether to automatically configure Stan functionality
whenever the package gets installed (see Details). Defaults to |
Prepares a package to compile and use Stan code by performing the following steps:
Create inst/stan folder where all .stan files defining Stan models should be stored.
Create inst/stan/include where optional license.stan file is stored.
Create inst/include/stan_meta_header.hpp to include optional header files used by Stan code.
Create src folder (if it doesn't exist) to contain the Stan C++ code.
Create R folder (if it doesn't exist) to contain wrapper code to expose Stan C++ classes to R.
Update DESCRIPTION file to contain all needed dependencies to compile Stan C++ code.
If NAMESPACE file is generic (i.e., created by rstan_create_package()), append import(Rcpp, methods), importFrom(rstan, sampling), importFrom(rstantools, rstan_config), importFrom(RcppParallel, RcppParallelLibs), and useDynLib directives. If NAMESPACE is not generic, display message telling user what to add to NAMESPACE for themselves.
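If you maintain NAMESPACE by hand, a quick base R check like the sketch below can list which of the directives named above are still missing. The example NAMESPACE contents and the exported function name my_fit are hypothetical, the useDynLib directive is left out because its exact form depends on your package name, and the check assumes each directive is written on its own line exactly as shown.

```r
# Directives named in the steps above (useDynLib omitted; it is
# package-specific).
required <- c(
  "import(Rcpp, methods)",
  "importFrom(rstan, sampling)",
  "importFrom(rstantools, rstan_config)",
  "importFrom(RcppParallel, RcppParallelLibs)"
)

# Hypothetical NAMESPACE contents; in practice use
# readLines(file.path(pkgdir, "NAMESPACE")).
namespace_lines <- c(
  "export(my_fit)",
  "import(Rcpp, methods)",
  "importFrom(rstan, sampling)"
)

missing_directives <- setdiff(required, trimws(namespace_lines))
missing_directives
```

An empty result means all four import directives are present; anything listed still needs to be added.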
When auto_config = TRUE, a configure[.win] file is added to the package, calling rstan_config() whenever the package is installed. Consequently, the package must list rstantools in the DESCRIPTION Imports field for this mechanism to work. Setting auto_config = FALSE removes the package's dependency on rstantools, but the package then must be manually configured by running rstan_config() whenever stanmodel files in inst/stan are added, removed, or modified.
Invisibly, TRUE or FALSE indicating whether or not any files or folders were created or modified.
The stanmodel objects corresponding to the Stan programs included with your package are stored in a list called stanmodels. To run one of the Stan programs from within an R function in your package just pass the appropriate element of the stanmodels list to one of the rstan functions for model fitting (e.g., sampling()). For example, for a Stan program "foo.stan" you would use rstan::sampling(stanmodels$foo, ...).