We can easily customise the summary statistics reported by $summary() and $print().
fit <- cmdstanr::cmdstanr_example("schools", method = "sample")
fit$summary()
Warning: 302 of 4000 (8.0%) transitions ended with a divergence.
See https://mc-stan.org/misc/warnings for details.
Warning: 1 of 4 chains had an E-BFMI less than 0.2.
See https://mc-stan.org/misc/warnings for details.
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
1 lp__ -56.7 -57.1 6.1 6.4 -66.008 -46 1.1 33 19
2 mu 6.6 6.7 4.2 4.5 0.025 13 1.0 124 879
3 tau 4.7 3.9 3.6 3.4 0.790 12 1.1 33 22
4 theta[1] 9.1 8.5 6.8 6.0 -0.082 21 1.0 163 405
5 theta[2] 7.0 6.9 5.5 5.8 -1.456 16 1.0 236 1810
6 theta[3] 5.7 6.1 6.4 6.2 -4.835 16 1.0 313 1466
7 theta[4] 6.8 6.9 5.9 5.8 -2.355 16 1.0 246 1356
8 theta[5] 4.9 5.2 5.8 5.5 -5.038 13 1.0 219 932
9 theta[6] 5.7 5.9 5.8 5.8 -4.073 15 1.0 278 1208
10 theta[7] 8.9 8.6 6.0 5.7 0.094 19 1.0 176 352
11 theta[8] 7.0 7.1 6.5 5.9 -3.088 18 1.0 317 1709
By default all variables are summarised with the following functions:
posterior::default_summary_measures()
[1] "mean" "median" "sd" "mad" "quantile2"
To change the variables summarised, we use the variables argument:
fit$summary(variables = c("mu", "tau"))
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
1 mu 6.6 6.7 4.2 4.5 0.025 13 1.0 124 879
2 tau 4.7 3.9 3.6 3.4 0.790 12 1.1 33 22
We can additionally change which functions are used:
fit$summary(variables = c("mu", "tau"), mean, sd)
variable mean sd
1 mu 6.6 4.2
2 tau 4.7 3.6
To summarise all variables with non-default functions, it is necessary to set the variables argument explicitly, either to NULL or to the full vector of variable names.
fit$metadata()$model_params
fit$summary(variables = NULL, "mean", "median")
[1] "lp__" "mu" "tau" "theta[1]" "theta[2]" "theta[3]"
[7] "theta[4]" "theta[5]" "theta[6]" "theta[7]" "theta[8]"
variable mean median
1 lp__ -56.7 -57.1
2 mu 6.6 6.7
3 tau 4.7 3.9
4 theta[1] 9.1 8.5
5 theta[2] 7.0 6.9
6 theta[3] 5.7 6.1
7 theta[4] 6.8 6.9
8 theta[5] 4.9 5.2
9 theta[6] 5.7 5.9
10 theta[7] 8.9 8.6
11 theta[8] 7.0 7.1
Summary functions can be specified by character string, function, or formula (or anything else supported by rlang::as_function()). If these arguments are named, those names will be used in the tibble output. If the results returned by a summary function are themselves named, those names take precedence.
my_sd <- function(x) c(My_SD = sd(x))
fit$summary(
  c("mu", "tau"),
  MEAN = mean,
  "median",
  my_sd,
  ~quantile(.x, probs = c(0.1, 0.9)),
  Minimum = function(x) min(x)
)
variable MEAN median My_SD 10% 90% Minimum
1 mu 6.6 6.7 4.2 1.3 11.7 -11.23
2 tau 4.7 3.9 3.6 1.1 9.6 0.53
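As a rough illustration of the formula shorthand, the sketch below calls rlang::as_function() directly on toy inputs. It is not taken from the cmdstanr sources and assumes only that the rlang package is installed; the object names (q_fun, mean_fun, toy_draws) are made up for the example.

```r
# Toy illustration with hypothetical names: how a formula becomes a function.
library(rlang)

# A one-sided formula is converted to a function whose first argument is `.x`.
q_fun <- as_function(~ quantile(.x, probs = c(0.1, 0.9)))

# An ordinary function is passed through as is.
mean_fun <- as_function(mean)

toy_draws <- 0:10
q_res <- q_fun(toy_draws)       # named result, like the 10% / 90% columns above
mean_res <- mean_fun(toy_draws) # unnamed scalar

print(q_res)
print(mean_res)
```

Because quantile() returns a named vector, its names are what appear as column headers in the summary, which matches the 10% and 90% columns in the output above.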
Arguments to all summary functions can also be specified with .args.
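The call that produced the output below is not shown. Judging by the column names, it was probably along the lines of fit$summary(c("mu", "tau"), quantile, .args = list(probs = c(0.025, 0.05, 0.95, 0.975))). The sketch below shows the same pattern in self-contained form, assuming the posterior package is installed and using its bundled eight_schools example draws as a stand-in for fit, so the numbers will differ from those reported here.

```r
library(posterior)

# Bundled example draws (assumption: a stand-in for the draws in `fit`)
ex_draws <- example_draws("eight_schools")
ex_draws <- subset_draws(ex_draws, variable = c("mu", "tau"))

# `.args` is a named list of extra arguments passed to every summary function
ex_summary <- summarise_draws(
  ex_draws,
  quantile,
  .args = list(probs = c(0.025, 0.05, 0.95, 0.975))
)
print(ex_summary)
```

Using .args avoids wrapping each function in a formula or anonymous function just to fix one argument.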
variable 2.5% 5% 95% 97.5%
1 mu -1.17 0.025 13 15
2 tau 0.59 0.790 12 13
The summary functions are applied to the array of sample values, with dimension iter_sampling x chains.
fit$summary(variables = NULL, dim, colMeans)
variable dim.1 dim.2 1 2 3 4
1 lp__ 1000 4 -58.0 -55.7 -55.7 -57.3
2 mu 1000 4 6.9 7.5 5.4 6.8
3 tau 1000 4 5.2 4.3 4.4 4.9
4 theta[1] 1000 4 10.0 9.7 7.6 9.1
5 theta[2] 1000 4 7.1 8.0 5.8 7.2
6 theta[3] 1000 4 5.7 6.6 4.5 5.9
7 theta[4] 1000 4 7.2 7.7 5.6 6.7
8 theta[5] 1000 4 4.9 6.0 4.0 4.9
9 theta[6] 1000 4 5.7 6.7 4.8 5.7
10 theta[7] 1000 4 9.3 9.5 7.5 9.2
11 theta[8] 1000 4 7.0 8.0 5.9 7.0
For this reason, users may get unexpected results if they use stats::var() directly, as it will return a covariance matrix. An alternative is the distributional::variance() function, which can also be accessed via posterior::variance().
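The call that produced the output below is not shown. The column names suggest something like fit$summary(c("mu", "tau"), posterior::variance, ~var(as.vector(.x))). The base R sketch below uses a simulated matrix in place of real draws (an assumption made purely for illustration) to show why var() needs care when it receives an iterations x chains matrix.

```r
# Simulated stand-in for one variable's draws: 1000 iterations x 4 chains
set.seed(1)
draws_mat <- matrix(rnorm(4000), nrow = 1000, ncol = 4)

# var() on a matrix returns the covariance matrix between columns (chains)
cov_mat <- var(draws_mat)
print(dim(cov_mat))

# Flattening to a vector first gives a single pooled variance
pooled_var <- var(as.vector(draws_mat))
print(pooled_var)
```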
variable posterior::variance ~var(as.vector(.x))
1 mu 18 18
2 tau 13 13
Summary functions need not return numeric values, but non-numeric summaries won't work with $print().
strict_pos <- function(x) if (all(x > 0)) "yes" else "no"
fit$summary(variables = NULL, "Strictly Positive" = strict_pos)
# fit$print(variables = NULL, "Strictly Positive" = strict_pos)
variable Strictly Positive
1 lp__ no
2 mu no
3 tau yes
4 theta[1] no
5 theta[2] no
6 theta[3] no
7 theta[4] no
8 theta[5] no
9 theta[6] no
10 theta[7] no
11 theta[8] no
For more information, see posterior::summarise_draws(), which is called by $summary().
The $draws() method can be used to extract the posterior draws in formats provided by the posterior package. Here we demonstrate only the draws_array and draws_df formats, but the posterior package supports other useful formats as well.
# default is a 3-D draws_array object from the posterior package
# iterations x chains x variables
draws_arr <- fit$draws() # or format="array"
str(draws_arr)
'draws_array' num [1:1000, 1:4, 1:11] -66.1 -68.2 -67.1 -62.4 -65.6 ...
- attr(*, "dimnames")=List of 3
..$ iteration: chr [1:1000] "1" "2" "3" "4" ...
..$ chain : chr [1:4] "1" "2" "3" "4"
..$ variable : chr [1:11] "lp__" "mu" "tau" "theta[1]" ...
# draws x variables data frame
draws_df <- fit$draws(format = "df")
str(draws_df)
draws_df [4,000 × 14] (S3: draws_df/draws/tbl_df/tbl/data.frame)
$ lp__ : num [1:4000] -66.1 -68.2 -67.1 -62.4 -65.6 ...
$ mu : num [1:4000] -2.42 9.44 2.99 2.91 6.73 ...
$ tau : num [1:4000] 12.21 6.46 17.66 8.04 8.8 ...
$ theta[1] : num [1:4000] 5.57 11.03 -2.77 1.5 8.91 ...
$ theta[2] : num [1:4000] 6.97 3.31 6.77 12.84 5.79 ...
$ theta[3] : num [1:4000] 8.21 15.21 -8.08 -5.34 -19.54 ...
$ theta[4] : num [1:4000] 19.75 19.47 -7.42 -5.76 7.54 ...
$ theta[5] : num [1:4000] -4.12 -5.77 6.01 5.63 -3.23 ...
$ theta[6] : num [1:4000] -4.03 2.55 2.99 2.86 15.21 ...
$ theta[7] : num [1:4000] -0.186 -2.004 10.11 7.803 14.427 ...
$ theta[8] : num [1:4000] 0.0702 -3.005 11.0116 14.5279 14.1928 ...
$ .chain : int [1:4000] 1 1 1 1 1 1 1 1 1 1 ...
$ .iteration: int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
$ .draw : int [1:4000] 1 2 3 4 5 6 7 8 9 10 ...
print(draws_df)
# A draws_df: 1000 iterations, 4 chains, and 11 variables
lp__ mu tau theta[1] theta[2] theta[3] theta[4] theta[5]
1 -66 -2.4 12.2 5.6 6.97 8.2 19.75 -4.12
2 -68 9.4 6.5 11.0 3.31 15.2 19.47 -5.77
3 -67 3.0 17.7 -2.8 6.77 -8.1 -7.42 6.01
4 -62 2.9 8.0 1.5 12.84 -5.3 -5.76 5.63
5 -66 6.7 8.8 8.9 5.79 -19.5 7.54 -3.23
6 -64 5.3 11.4 18.5 13.37 -1.4 15.97 -0.61
7 -60 7.3 9.1 8.6 7.82 3.5 0.34 -2.00
8 -60 6.3 8.5 7.7 0.51 7.5 -0.99 1.51
9 -59 1.9 6.9 3.0 8.63 1.4 3.70 4.72
10 -63 9.1 9.3 16.0 5.77 3.9 4.14 -10.34
# ... with 3990 more draws, and 3 more variables
# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
To convert an existing draws object to a different format, use the posterior::as_draws_*() functions.
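A minimal sketch of such conversions, assuming the posterior package is installed and using its bundled eight_schools example draws rather than the fit above (the ex_* names are made up for the example):

```r
library(posterior)

# Bundled example draws: a draws_array (iterations x chains x variables)
ex_arr <- example_draws("eight_schools")

# Convert to other formats
ex_mat <- as_draws_matrix(ex_arr)  # draws x variables matrix
ex_df  <- as_draws_df(ex_arr)      # data frame with .chain, .iteration, .draw
ex_lst <- as_draws_list(ex_arr)    # one list element per chain

print(class(ex_mat))
print(dim(ex_mat))
```

The same functions should apply to the objects returned by $draws(), for example posterior::as_draws_matrix(draws_arr).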
To manipulate the draws objects, use the various methods described in the posterior package vignettes and documentation.
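As one hedged example of such manipulation, the sketch below subsets and extends the bundled eight_schools example draws from the posterior package. The derived variable name tau_sq is hypothetical and chosen only for illustration.

```r
library(posterior)

ex_arr <- example_draws("eight_schools")

# Keep two variables and the first two chains
ex_sub <- subset_draws(ex_arr, variable = c("mu", "tau"), chain = 1:2)

# Add a derived quantity computed draw by draw (hypothetical name `tau_sq`)
ex_sub <- mutate_variables(ex_sub, tau_sq = tau^2)

print(variables(ex_sub))
print(nchains(ex_sub))
print(niterations(ex_sub))
```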