Configuration

This vignette documents the options that control how mutate_package() runs: which test strategy is used, how mutant timeouts are calibrated, which tests are selected, how parallel workers are isolated, how to exclude code from mutation, and how the optional coverage-guided, precise-location, and equivalent-mutant-detection features behave. See ?mutate_package for the full argument and return-value documentation.

Mutation testing modes (`strategy`)

mutator selects a package test strategy automatically:

If tests/testthat/ exists, mutator loads the mutant in-process with pkgload::load_all() and mirrors the package’s own tests/testthat.R harness by forwarding extractable arguments (notably any filter) from testthat::test_check() to testthat::test_dir(), without paying for an install per mutant.
Otherwise, if inst/tinytest/ exists, mutator loads the mutant in-process with pkgload::load_all() and runs tinytest::run_test_dir("inst/tinytest"), also without an install per mutant.
Otherwise, if tests/ exists, mutator falls back to tools::testInstalledPackage(..., types = "tests") after installing each mutant with --install-tests.

Both in-process strategies use pkgload::load_all(), which in some cases does not load a package the way an installed copy would (see pkgload issue #340). A tinytest package affected by this can be run against an installed copy with strategy = "tinytest-installed" (R CMD INSTALL plus tinytest::test_package()), which is slower but matches an installed package and still supports coverage-guided selection. Pass strategy to mutate_package() to override auto-detection:

mutate_package("path/to/pkg", strategy = "tinytest-installed")

Timeouts and contention in parallel mode

Each mutant is run with a wall-clock timeout; exceeding it is reported as HANG. The timeout is derived from how long the package’s own test suite takes, unless you pass timeout_seconds explicitly.

Because mutants run in parallel (cores at a time), the timeout must account for contention: with many workers, each test run is slower than it would be alone for packages that load many dependencies, or do heavy per-mutant install work. A timeout based on a single, uncontended baseline run would then fire on nearly every mutant (leading to numerous false HANGs).

To avoid this, mutate_package() first runs the baseline suite once on its own (to confirm it passes and fail fast otherwise), then runs it cores times concurrently and takes the slowest of those as the contended baseline. The timeout is max(contended_baseline * 1.5, 5s). This self-calibrates to the machine, the chosen parallelism, and the package’s real load/compile cost, avoiding manual tuning. Pass timeout_seconds to override it entirely.

When cores = 1, or when forking is unavailable, the timing for the baseline suite is used.

CRAN mode (test selection)

By default mutate_package() runs each mutant’s tests in CRAN mode (cran = TRUE): the NOT_CRAN environment variable is set to "false" in the test subprocess, so testthat::skip_on_cran() and skip_if_offline() guards take effect and mutator runs the same tests CRAN would. This skips the slow, flaky, or network-dependent tests that packages mark as CRAN-skippable, which keeps mutation runs fast and avoids spurious timeouts/kills from, e.g., tests that hit the network.

Set cran = FALSE to run the full suite instead (NOT_CRAN = "true", which is what happens by default when running devtools::test()):

mutate_package("path/to/pkg", cran = FALSE)

This applies to every test strategy (testthat, tinytest, tinytest-installed, and the installed-tests fallback; the tinytest strategies map it to run_test_dir(at_home = )). Note that it only affects tests the package explicitly guards with skip_on_cran() / skip_if_offline() (or at_home() for tinytest); a package whose network test has no such guard will still run that test in either mode.

Fail-fast (stop at the first failing test)

A mutant is KILLED the instant any of the tests detects it, so running the rest of its suite is wasted work. By default (fail_fast = TRUE), each mutant’s test run stops at the first failing test instead of finishing the suite, which speeds up the test-running phase (often substantially for packages with large suites) without changing any mutant’s verdict.

Set fail_fast = FALSE to run the full suite for every mutant:

mutate_package("path/to/pkg", fail_fast = FALSE)

This applies to the testthat strategy: it sets TESTTHAT_MAX_FAILS = 1 in the test subprocess and uses the progress reporter, which aborts the run at the first failing test_that() block. SURVIVED mutants are unaffected, as they have no failure to short-circuit on, so baseline timing and timeout calibration are unchanged. The tinytest strategies run the full set of selected test files (a mutant is still KILLED on the first failing file), and the installed-tests fallback already stops at the first failing test file regardless of this flag.

Parallel execution and isolation (`isolate`)

By default, each mutant package copy symlinks the unchanged directories of the original package and only the mutated R/ file is materialised, so all parallel workers point at the same physical src/, tests/, etc. This is fast, and two design choices keep it correct:

Compiled code is built once, not per mutant. For the in-process strategies (testthat and tinytest), pkgload::load_all() reuses the baseline’s compiled src/ since C code is never mutated. The installed and tinytest-installed strategies compile once into a template library and install each mutant with --no-libs, which never writes into src/. This prevents parallel workers from fighting over the same compiled objects.
Snapshot references are not shared. For testthat packages with _snaps directories, mutator gives each mutant its own snapshot copy while symlinking the rest of tests/, so filtered or parallel runs cannot rewrite shared snapshot references.

The one remaining hazard is non-hermetic tests that write files into a shared directory (most often tests/). When parallel workers’ tests fight over the same files you can still see spurious KILLED/HANG. Two ways to attenuate it:

Run without parallelism: cores = 1. No contention, no extra disk, but the slowest option.
Isolate file state: isolate = TRUE. Each mutant gets its own deep copy of src/ and tests/ instead of a symlink, so file-writing tests can’t collide:
```
mutate_package("path/to/pkg", isolate = TRUE)
```

The default (isolate = FALSE) is fast and correct for hermetic test suites; reach for isolate = TRUE (or cores = 1) only when a package’s tests are not hermetic and you see parallel-only KILLED/HANG results.

Progress bar

When the optional pbmcapply package is installed, the multi-core test run shows a live progress bar.

install.packages("pbmcapply")

Excluding code from mutation

Not all code should be mutation-tested. Vendored/standalone files, generated code, or deprecated paths a suite is not meant to cover will mostly produce SURVIVED mutants and depress the score without telling you anything useful. There are a few ways to exclude code.

1. By file, at the call site, with exclude_files. A character vector of shell-style glob patterns matched against the base names of the .R files in R/. Matching files are skipped entirely before any mutants are generated:

mutate_package("path/to/scales", exclude_files = c("import-standalone-*"))

2. In the source, with # mutator:ignore directives. Place markers in the .R file itself:

# mutator:ignore-file anywhere in a file excludes the whole file.
# mutator:ignore-start / # mutator:ignore-end exclude the line region between them (wrap a function to exclude it):
```
# mutator:ignore-start
legacy_helper <- function(x) {
  # not worth mutation-testing
  x * 2
}
# mutator:ignore-end
```
An unmatched -start excludes through the end of the file.

3. With covr’s # nocov annotations. mutator also honours covr’s coverage-exclusion comments, so code you have already marked as untested-by-design needs no separate mutator directive:

# nocov start / # nocov end exclude the line region between them.

A bare # nocov excludes its own line, and may trail code:

if (impossible_state) {
  stop("unreachable") # nocov
}

4. With a covr .covrignore file. If the package root has a .covrignore (covr’s file-level coverage-exclusion list), mutator reads it too: each line is a glob expanded relative to the package root (a matched directory expands to the files under it), and matching R/ files are skipped before generation, using the same mechanism covr uses. So files you already exclude from coverage need no extra mutator configuration.

Granularity. Excluding whole files and whole functions is reliable. More fine-grained than that is not, for operator mutations: R mostly only attaches source references to blocks {}, so the engine resolves an operator mutant’s position only to its enclosing block. A region directive inside a function therefore excludes that function’s operator mutants as a group: you cannot single out one operator mid-function. Line-deletion mutants are excluded line-precisely. In practice, wrap whole functions, not fragments. This only affects the surviving mutant output, which shows a larger context than necessary, and prevents the coverage-guided test selection from being as precise as it could be. The granularity can be improved with the optional imputesrcref package.

Coverage-guided test selection (`coverage_guided`)

Most mutants are settled by a small subset of the suite, and a mutant on a line that no test exercises can never be killed. With coverage_guided = TRUE (the default) mutator measures coverage once with covr and then, for each mutant, runs only the test files that cover its mutated line, and skips running tests altogether for mutants on uncovered lines (reported SURVIVED immediately).

Coverage guidance applies to the testthat, tinytest, and tinytest-installed strategies. covr builds the coverage-to-tests map against an instrumented, installed copy of the package; per test file, it maps each covered source line to the test files that reach it (for tinytest, by running each inst/tinytest file individually). When the resolved strategy is the generic installed-tests fallback, mutator cannot attribute coverage to test files, so it emits a warning and runs the full suite for every mutant. Pass coverage_guided = FALSE to disable the optimisation (and that warning).

# On by default; pass FALSE to run the full suite for every mutant.
mutate_package("path/to/pkg", coverage_guided = FALSE)

The single coverage run also serves as the baseline check (it runs the package’s own tests/testthat.R harness, which fails if any test fails), so the suite is not run twice. Selection is at the test-file level, as testthat filters tests by file, and assumes the suite deterministically exercises the code, so it changes which tests run without changing a mutant’s verdict.

Requires the covr package. Coverage attribution (and therefore speed-up) depends on the backend (coverage_backend):

"record_tests" (default) uses covr’s record_tests in a single run and relies only on covr’s public output. Its limitation: covr credits a covered line to the deepest test-directory frame on the call stack, so when a test reaches package code through a function defined in a helper-*.R / setup-*.R file (a common pattern), covr attributes it to the helper, not the originating test-*.R. The true triggering test is then unknown, so mutator conservatively runs the full suite for that mutant. Packages that wrap their API in shared test helpers therefore see less speed-up.
"per_file" instruments the package once and runs the suite a single time through a reporter that snapshots coverage per test file, giving exact file-level attribution with no helper fallback, at roughly the same cost as the single record_tests run (often faster overall, since more mutants get a narrowed test set). It reaches into covr internals, so it is opt-in:
```
mutate_package("path/to/pkg", coverage_guided = TRUE, coverage_backend = "per_file")
```

Either way the pay-off is largest when the suite is big, many lines are uncovered, and tests exercise the code directly.

Precise mutant locations (optional `imputesrcref`)

Each reported mutant carries a source Range: (start:col-end:col). For statement- and line-deletion mutants this is precise, but operator mutants (+, <, &&, …) are different: R attaches no srcref to nested call objects, so the engine can only report the bounds of the enclosing block ({}), effectively a loop body or even the whole function. A surviving ==-to-!= mutant in a 40-line function therefore points at all 40 lines.

If the optional imputesrcref package is installed, mutator uses it to recover precise spans for many operator mutants, typically narrowing a whole-function range down to the exact sub-expression on a single line. It is a GitHub-only optional package listed in Enhances; install it yourself to opt in:

# install.packages("remotes")
remotes::install_github("PRL-PRG/imputesrcref")

When it is not installed, mutator behaves exactly as before (coarser operator ranges); nothing else changes. The refinement is used only as a read-only source-location oracle: mutant files are deparsed from the original code and are byte-for-byte identical whether or not imputesrcref is present, and # mutator:ignore-* directives keep their function-granular semantics regardless. For best results, install the package under test from source with parse data retained, since imputesrcref reads it:

install.packages("<pkg>", INSTALL_opts = c("--with-keep.source", "--with-keep.parse.data"))

Equivalent Mutant Detection

Equivalent-mutant detection calls an OpenAI-compatible Chat Completions API. Configure it in any of these ways (listed highest precedence first; each setting is resolved independently, so they can be mixed):

Programmatically, in your R session:

set_openai_config(
  api_key  = "api-key...",
  model    = "gpt-4",
  base_url = "https://api.openai.com/v1" # any OpenAI-compatible endpoint
)

A .openai_config file in the working directory (or in a directory you pass via get_openai_config(dir = ...)). It is a plain, human-readable file of field: value lines and is parsed, never executed:
```
api_key: api-key...
model: gpt-4
base_url: https://api.openai.com/v1
```
Only the given directory is consulted. Parent directories are not searched.
Environment variables OPENAI_API_KEY, OPENAI_MODEL and OPENAI_BASE_URL.

If nothing is configured, the model defaults to gpt-4 and the base URL to the public OpenAI API. Set base_url to target a self-hosted or alternative OpenAI-compatible service (for example http://localhost:11434/v1).

Enable the analysis when running package mutation tests:

mutate_package("path/to/pkg", detectEqMutants = TRUE)

Equivalence detection runs before the test suites. Because an equivalent mutant is behaviorally identical to the original, no test can kill it, so running its test suite is wasted work. mutate_package() therefore analyzes every generated mutant up front and skips the test run for those judged equivalent: they are recorded as survived directly. Mutants judged NOT_EQUIVALENT or DONT_KNOW are tested as usual. The trade-off is that the equivalence pass now covers all mutants, not only survivors, so it makes more API calls; in exchange it avoids the far more expensive test runs on equivalent mutants.

Mutants are analyzed in bounded batches (default 25 per request), and each mutant is shown to the model as a small unified diff of its edit (plus a short change label) rather than its full mutated source. This is compact, unambiguous, and in a format LLMs read natively. When mutate_package() runs with cores > 1 the batches are sent concurrently, which (with the bounded size) keeps the equivalence pass fast and avoids the truncated responses that otherwise drop verdicts.

For mutants the model flags as EQUIVALENT (they are excluded from the adjusted mutation score), it also returns a one-sentence reason, stored as equivalence_reason on the mutant so the call can be audited. No reason is requested for NOT_EQUIVALENT/DONT_KNOW, keeping responses small.

Progress and the results summary are emitted via message() (so they can be silenced with suppressMessages()). The full prompts and model responses are not printed by default; set options(mutator.verbose = TRUE) to log them.

Mutation testing modes (strategy)