Screenshot Testing
Screenshot tests catch visual regressions by rendering your GUI to an image and comparing it against a committed reference. If something changes unexpectedly — a widget moves, a color shifts, a label disappears — the test fails and points at the freshly-rendered PNG so you can compare visually.
The test API is a builder constructed by the screenshot! macro,
which takes the plugin type and the explicit reference-PNG path:
#[test]
fn gui_screenshot() {
truce_test::screenshot!(Plugin, "screenshots/default.png").run();
}
The path is resolved relative to your crate's Cargo.toml directory
(or used as-is if absolute). The macro never picks a path on your
behalf — every test names its own reference, in whatever directory
layout suits the project.
#Quick start
Add truce-test to [dev-dependencies]:
[dev-dependencies]
truce-test = { workspace = true }
Drop the test into your lib.rs (or wherever your mod tests lives):
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn gui_screenshot() {
truce_test::screenshot!(Plugin, "screenshots/default.png").run();
}
}
If the reference doesn't exist, the test fails and points at the CLI command that creates one:
No screenshot baseline at /path/to/your-plugin/screenshots/default.png.
Create one with: cargo truce screenshot --out /path/to/.../screenshots/default.png
then inspect the rendered PNG and commit it.
The standard flow:
cargo truce screenshot --out screenshots/default.png # writes the PNG
git add screenshots/ # eyeball + commit
cargo test # the gate
cargo truce screenshot is decoupled from the test surface — it
works on any crate built with truce::plugin! whether or not
the codebase has a gui_screenshot test.
#How it works
truce_test::screenshot!(Plugin, "<path>")constructs aScreenshotTest<Plugin>with the reference path resolved against the calling crate'sCARGO_MANIFEST_DIR(when relative) or used as-is (when absolute)..run()callsPlugin::create()(plusinit()and anysetupclosure you supplied), then asks the editor for(pixels, width, height)viaEditor::screenshot().- The current render is always written to
<workspace>/target/screenshots/<basename>.png— gitignored, so it never accidentally gets committed. - The committed reference is loaded from the path you passed.
- If the reference doesn't exist, the test panics with a
cargo truce screenshot --out <path>hint. - If the reference exists and the diff exceeds tolerance, the
test fails with both PNG paths and the exact
cpcommand to accept the new render as the new baseline.
References save at 2× resolution with 144 DPI metadata so they look right on GitHub and in image viewers.
#State-dependent screenshots
The default test renders against Plugin::create() output (a
fresh plugin at default params). For shots that need a specific
configuration — a knob at a particular value, a panel toggled
open, a meter reading a particular level — the builder's
setup closure runs after init() and before the render:
#[test]
fn gui_screenshot_max_gain() {
truce_test::screenshot!(Plugin, "screenshots/max_gain.png")
.set_param(MyParamId::Gain, 1.0)
.run();
}
The builder applies, in order, state_file (if any), set_param
shortcuts, then the setup closure — same lifecycle the audio
PluginDriver uses, so the same vocabulary
works for both audio and GUI tests.
For more than one-shot param tweaks, drop down to the setup
closure (&mut Plugin):
- Drive
p.process(…)to populate meters or animations. - Mutate custom (non-param) state on the plugin struct.
For state you'd rather author interactively than spell out in
code, the standalone host's Cmd+S / Ctrl+S saves a
.pluginstate file. Load it via state_file:
#[test]
fn gui_screenshot_evening() {
truce_test::screenshot!(Plugin, "screenshots/evening.png")
.state_file("test_states/evening.pluginstate")
.run();
}
state_file paths are crate-manifest-relative or absolute. The
.pluginstate blob just gets fed to plugin.load_state(&bytes) —
the same path CLAP / VST3 / AU hosts use to restore session state.
#Promoting a render
The test's failure log includes the exact cp command. A typical
flow:
# 1. Run the test. Either it fails (regression) or panics with the
# "no baseline" message (test added but not yet baselined).
cargo test -p my-plugin gui_screenshot
# 2. Inspect target/screenshots/default.png. If it looks correct,
# promote it (the cp command from the failure message):
cp target/screenshots/default.png screenshots/default.png
# 3. Commit the updated reference.
git add screenshots/default.png
cargo truce screenshot --out <path> is the faster path when
you want to write directly to the baseline location:
cargo truce screenshot --out screenshots/default.png
git diff screenshots/ # eyeball the visual change
git add screenshots/ && git commit
For state-dependent baselines, --state lets the CLI feed a
.pluginstate blob to the renderer:
cargo truce screenshot \
--state cool.pluginstate \
--out screenshots/cool.png
(The CLI can't run setup closures — those live in the test
binary, not the cdylib cargo truce dlopens. Use cargo test gui_screenshot_<name> + cp for closure-driven baselines.)
#Tolerance
Comparison runs at pixel granularity (one count per RGBA pixel, not per byte) with two independent knobs:
.tolerance(n)— how many noticeably different pixels are allowed.0= strict..pixel_threshold(d: u8)— how big a per-channel delta has to be before a pixel counts as "noticeable" at all.0= strict (any byte difference counts).
A pixel only consumes the tolerance budget if at least one of
its R/G/B/A channels differs from the reference by more than
pixel_threshold. This lets you absorb sub-perceptual drift
(rasterizer / driver / font-hinting wobble of 1–3/255) without
inflating tolerance to numbers that would also hide real
regressions:
truce_test::screenshot!(Plugin, "screenshots/default.png")
.pixel_threshold(2) // ignore ≤2/255 channel drift
.tolerance(50) // up to 50 visibly-different pixels OK
.run();
Practical values for pixel_threshold: 1–3 rides out the
drift you get across machines; 8+ starts to mask things a human
would notice. Typical tolerance bumps are 50–500 pixels for
local AA flake on top of a sane threshold.
When a tolerance gate fails, the panic message includes the
largest single channel delta seen in the diff — useful for
deciding whether to bump pixel_threshold (drift was sub-
perceptual but slightly above your current threshold) or fix the
regression (a few pixels jumped by a lot).
#Cross-OS rendering
Per-backend wgpu rasterization differs across GPU/OS combinations (Metal / DX12 / Vulkan each have their own anti-aliasing and text rasterization quirks). A reference PNG rendered on macOS won't be pixel-identical when re-rendered on Linux or Windows even if every parameter and shader is the same.
The framework doesn't try to paper over this — strict pixel match on every host. If you intend to gate screenshots cross-platform, you have two options:
Option A — single reference platform. Pick one (typically
your CI host); only run the test there. Skip it elsewhere with a
cfg:
#[cfg(target_os = "macos")]
#[test]
fn gui_screenshot() {
truce_test::screenshot!(Plugin, "screenshots/default.png").run();
}
Option B — per-platform references with sub-perceptual slack.
One test per OS, each gated by cfg(target_os = …), each with
its own committed reference. The non-baseline platforms (the ones
where you didn't bake the reference) bump pixel_threshold to
absorb the small per-channel drift you get between driver
versions / GPUs on the same OS:
#[cfg(target_os = "macos")]
#[test]
fn gui_screenshot_macos() {
// Baseline host: strict. Drift here means a real regression.
truce_test::screenshot!(Plugin, "screenshots/default_macos.png").run();
}
#[cfg(target_os = "linux")]
#[test]
fn gui_screenshot_linux() {
truce_test::screenshot!(Plugin, "screenshots/default_linux.png")
.pixel_threshold(2)
.run();
}
#[cfg(target_os = "windows")]
#[test]
fn gui_screenshot_windows() {
truce_test::screenshot!(Plugin, "screenshots/default_windows.png")
.pixel_threshold(2)
.run();
}
cargo test on each platform compiles only its variant; cross-OS
rasterizer drift can't fail the wrong test. pixel_threshold(2)
on the non-baseline runs absorbs sub-perceptual driver / GPU
drift without masking visible regressions. The in-tree examples
(examples/truce-example-*) use this pattern.
#API reference
#screenshot! macro
truce_test::screenshot!($plugin:ty, $path:expr)
Constructs a ScreenshotTest<$plugin>. $path is the reference
PNG path; resolved against CARGO_MANIFEST_DIR when relative.
Both args are required — there's no zero-arg form, no
auto-derived path, no implicit directory.
#ScreenshotTest<P> builder
impl<P: PluginExport> ScreenshotTest<P> {
pub fn state_file<S: Into<PathBuf>>(self, path: S) -> Self;
pub fn set_param(self, id: impl Into<u32>, normalized: f32) -> Self;
pub fn setup<F: FnOnce(&mut P) + 'static>(self, f: F) -> Self;
pub fn tolerance(self, t: usize) -> Self;
pub fn pixel_threshold(self, d: u8) -> Self;
pub fn run(self);
}
| Method | Effect |
|---|---|
state_file("path") |
Load a .pluginstate blob (the standalone host's Cmd+S save format) via plugin.load_state(&bytes). Applied first. |
set_param(id, v) |
Set a parameter to a normalized [0, 1] value via params().set_normalized(id, v). Applied after state load. Multiple calls compose. |
setup(|p| …) |
Mutate the plugin between P::create() and the render. Drive process(), mutate custom state. Applied last. |
tolerance(n) |
Max allowed differing-pixel count. 0 = strict. Composes with pixel_threshold — only pixels above the threshold count toward this budget. |
pixel_threshold(d) |
Per-pixel "different enough to count" knob: a pixel only consumes tolerance if at least one R/G/B/A channel differs from the reference by more than d. 0 = strict (any byte difference counts). 1–3 ignores sub-perceptual rasterizer drift; 8+ starts hiding things a human would notice. |
run() |
Build, render, compare. |
#Editor::screenshot trait method
fn screenshot(
&mut self,
params: Arc<dyn truce_params::Params>,
) -> Option<(Vec<u8>, u32, u32)>;
Built-in backends (truce-gpu, truce-egui, truce-iced,
truce-slint) all implement this. Custom editor implementations
need to override it to be testable through screenshot!.
#cargo truce screenshot
Render a plugin's GUI from the command line, no #[test] required.
Fully self-contained — works on any crate built with
truce::plugin!.
cargo truce screenshot --out shots/hero.png # one-off render
cargo truce screenshot -p my-plugin --out shots/a.png # workspace mode
cargo truce screenshot --state s.pluginstate --out shots/cool.png
cargo truce screenshot --check --out screenshots/default.png # CI gate
| Flag | Meaning |
|---|---|
--out <path> (required) |
Output path. CWD-relative or absolute. The CLI never picks a path for you. |
-p <crate> |
Plugin crate. Required when the project has multiple plugins (each gets its own --out). |
--state <path> |
Load a .pluginstate blob (the standalone host's Cmd+S save format) before rendering. CWD-relative or absolute. |
--check |
Diff against the existing baseline at --out; exit non-zero on regression. Strict pixel match. |
--debug |
Cargo dev profile (faster compile). Default is release. |
The CLI dlopens the plugin's cdylib and calls a hidden
__truce_screenshot symbol that truce::plugin! exports. No
per-plugin scaffolding required.
#Texture format gotchas
Each backend's Editor::screenshot() impl is responsible for
returning RGBA8 bytes; the comparator just does a pixel-byte
diff. The built-in backends already convert from their native
formats — the table below is for debugging color mismatches
when you're hand-rolling a renderer.
| Backend | Live format | Screenshot bytes returned |
|---|---|---|
Built-in (truce-gpu) |
Non-sRGB surface default | RGBA8 |
egui (truce-egui) |
Rgba8UnormSrgb |
RGBA8 (sRGB) |
Iced (truce-iced) |
Bgra8UnormSrgb |
RGBA8 (sRGB, swizzled) |
Slint (truce-slint) |
CPU pixels (premultiplied) | RGBA8 (un-premultiplied) |
Mismatches usually look like a uniform tint shift (everything darker / lighter / wrong red-blue) — that's a sign the renderer returns bytes in a format the comparator can't compare against the reference.