Skip to contents

Conduct a cluster-based permutation test

Usage

clusterpermute(
  jlmer_spec,
  family = c("gaussian", "binomial"),
  statistic = c("t", "chisq"),
  threshold,
  nsim = 100L,
  predictors = NULL,
  binned = FALSE,
  top_n = Inf,
  add1 = TRUE,
  ...,
  progress = TRUE
)

Arguments

jlmer_spec

Data prepped for jlmer from make_jlmer_spec()

family

A GLM family. Currently supports "gaussian" and "binomial".

statistic

Test statistic for calculating cluster mass. Can be one of "t" (default) from the regression model output or "chisq" from a likelihood ratio test (takes about twice as long to calculate).

threshold

The threshold value that the statistic must pass to contribute to cluster mass. Interpretation differs on the choice of statistic (more below):

  • If statistic = "t", the threshold for t-value (beta/std.err) from the regression model.

  • If statistic = "chisq", the threshold for the p-value of chi-squared statistics from likelihood ratio tests.

nsim

Number of simulations description

predictors

(Optional) a subset of predictors to test. Defaults to NULL which tests all predictors.

binned

Whether the data has been aggregated/collapsed into time bins. Defaults to FALSE, which requires a cluster to span at least two time points. If TRUE, allows length-1 clusters to exist.

top_n

How many clusters to return, in the order of the size of the cluster-mass statistic. Defaults to Inf which return all detected clusters.

add1

Whether to add 1 to the numerator and denominator when calculating the p-value. Use TRUE to effectively count the observed statistic as part of the permuted null distribution (recommended with larger nsim prior to publishing results).

...

Optional arguments passed to Julia for model fitting. Defaults to fast = TRUE (when family = "binomial") and progress = FALSE.

progress

Defaults to TRUE, which prints progress on each step of the cluster permutation test.

Value

A list of null_cluster_dists and empirical_clusters with p-values

Examples

# \donttest{

library(dplyr, warn.conflicts = FALSE)

# Specification object
spec <- make_jlmer_spec(
  weight ~ 1 + Diet, filter(ChickWeight, Time <= 20),
  subject = "Chick", time = "Time"
)
spec
#> ── jlmer specification ───────────────────────────────────────── <jlmer_spec> ──
#> Formula: weight ~ 1 + Diet2 + Diet3 + Diet4
#> Predictors:
#>   Diet: Diet2, Diet3, Diet4
#> Groupings:
#>   Subject: Chick
#>   Trial:
#>   Time: Time
#> Data:
#> # A tibble: 533 × 6
#>   weight Diet2 Diet3 Diet4 Chick  Time
#>    <dbl> <dbl> <dbl> <dbl> <ord> <dbl>
#> 1     42     0     0     0 1         0
#> 2     51     0     0     0 1         2
#> 3     59     0     0     0 1         4
#> # ℹ 530 more rows
#> ────────────────────────────────────────────────────────────────────────────────

# Should minimally provide `threshold` and `nsim`, in addition to the spec object
reset_rng_state()
CPA <- clusterpermute(spec, threshold = 2, nsim = 100, progress = FALSE)
CPA
#> $null_cluster_dists
#> ── Null cluster-mass distribution (t > 2) ────────────── <null_cluster_dists> ──
#> Diet2 (n = 100)
#>   Mean (SD): 0.266 (2.13)
#>   Coverage intervals: 95% [0.000, 6.579]
#> Diet3 (n = 100)
#>   Mean (SD): -0.124 (2.75)
#>   Coverage intervals: 95% [-4.777, 5.707]
#> Diet4 (n = 100)
#>   Mean (SD): 0.007 (2.50)
#>   Coverage intervals: 95% [0.000, 0.000]
#> ────────────────────────────────────────────────────────────────────────────────
#> 
#> $empirical_clusters
#> ── Empirical clusters (t > 2) ────────────────────────── <empirical_clusters> ──
#> Diet2
#>   [4, 8]: 8.496 (p=0.0297)
#> Diet3
#>   [2, 20]: 34.628 (p=0.0099)
#> Diet4
#>   [2, 20]: 39.371 (p=0.0099)
#> ────────────────────────────────────────────────────────────────────────────────

# CPA is a list of `<null_cluster_dists>` and `<empirical_clusters>` objects
sapply(CPA, class)
#>   null_cluster_dists   empirical_clusters 
#> "null_cluster_dists" "empirical_clusters" 

# You can extract the individual components for further inspection
CPA$null_cluster_dists
#> ── Null cluster-mass distribution (t > 2) ────────────── <null_cluster_dists> ──
#> Diet2 (n = 100)
#>   Mean (SD): 0.266 (2.13)
#>   Coverage intervals: 95% [0.000, 6.579]
#> Diet3 (n = 100)
#>   Mean (SD): -0.124 (2.75)
#>   Coverage intervals: 95% [-4.777, 5.707]
#> Diet4 (n = 100)
#>   Mean (SD): 0.007 (2.50)
#>   Coverage intervals: 95% [0.000, 0.000]
#> ────────────────────────────────────────────────────────────────────────────────
CPA$empirical_clusters
#> ── Empirical clusters (t > 2) ────────────────────────── <empirical_clusters> ──
#> Diet2
#>   [4, 8]: 8.496 (p=0.0297)
#> Diet3
#>   [2, 20]: 34.628 (p=0.0099)
#> Diet4
#>   [2, 20]: 39.371 (p=0.0099)
#> ────────────────────────────────────────────────────────────────────────────────

# }