RR: Config knobs and validation #12375

arjunr2 · 2026-01-20T17:15:12Z

Initial PR adding Config knobs for the record/replay feature.

github-actions · 2026-01-20T20:06:35Z

Label Messager: wasmtime:config

It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

If you added a new Config method, you wrote extensive documentation for
it.

Our documentation should be of the following form:

Short, simple summary sentence.

More details. These details can be multiple paragraphs. There should be
information about not just the method, but its parameters and results as
well.

Is this method fallible? If so, when can it return an error?

Can this method panic? If so, when does it panic?

# Example

Optional example here.

If you added a new Config method, or modified an existing one, you
ensured that this configuration is exercised by the fuzz targets.

For example, if you expose a new strategy for allocating the next instance
slot inside the pooling allocator, you should ensure that at least one of our
fuzz targets exercises that new strategy.

Often, all that is required of you is to ensure that there is a knob for this
configuration option in wasmtime_fuzzing::Config (or one
of its nested structs).

Rarely, this may require authoring a new fuzz target to specifically test this
configuration. See our docs on fuzzing for more details.

If you are enabling a configuration option by default, make sure that it
has been fuzzed for at least two weeks before turning it on by default.

To modify this label's message, edit the .github/label-messager/wasmtime-config.md file.

To add new label messages or remove existing label messages, edit the
.github/label-messager.json configuration file.


alexcrichton · 2026-01-22T19:11:14Z

crates/wasmtime/src/config.rs

+                self.validate_determinism_conflicts()?;
+                self.enforce_determinism();


What would you think of ignoring the determinism options here instead of validating/enabling them? Historically as various config options have been tied together it's caused problems and made configuration more confusing due to trying to understand how everything interacts with each other. In some sense producing a recording is entirely orthogonal to deterministic simd/nans. In another sense I also understand how such a recording runs the risk of not being too useful.

For the engine-level configuration I'd argue, however, that this is best kept as a separate concern where we'd document in the RR configuration options that users probably also want to turn on deterministic things, but it wouldn't be a requirement.

Such a change would also have the nice benefit of keeping validate as &self rather than the &mut self it becomes in this PR. That's been an intentional design so far, where Config is intended to not need any sort of post-processing. If post-processing is necessary we try to defer it to "store the result of the computation in the Engine" so that Config continues to reflect the source of truth of the configuration specified.
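A minimal sketch of that design, using toy `Config` and `Engine` types rather than Wasmtime's real ones. The `memory_reservation` field, its default, and the accessor names are assumptions made up for illustration. The point is only that `validate` takes `&self` and that anything computed from the configuration is stored on the engine.

```rust
/// Hypothetical stand-in for a configuration type (not `wasmtime::Config`).
#[derive(Clone, Debug, PartialEq)]
pub struct Config {
    /// `None` means the user did not specify a value.
    pub memory_reservation: Option<u64>,
}

/// Hypothetical engine that owns the config plus anything derived from it.
pub struct Engine {
    config: Config,
    resolved_memory_reservation: u64,
}

impl Config {
    /// Read-only check: it takes `&self`, so it cannot rewrite what the user set.
    pub fn validate(&self) -> Result<(), String> {
        match self.memory_reservation {
            Some(0) => Err("memory_reservation must be nonzero".to_string()),
            _ => Ok(()),
        }
    }
}

impl Engine {
    pub fn new(config: &Config) -> Result<Engine, String> {
        config.validate()?;
        // The post-processed value lives on the engine; the stored config is
        // an unmodified copy of what the user specified.
        let resolved = config.memory_reservation.unwrap_or(1 << 32);
        Ok(Engine {
            config: config.clone(),
            resolved_memory_reservation: resolved,
        })
    }

    pub fn config(&self) -> &Config {
        &self.config
    }

    pub fn resolved_memory_reservation(&self) -> u64 {
        self.resolved_memory_reservation
    }
}
```

With this shape, reading `engine.config()` always shows what the user wrote, while the resolved value is available separately.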

I think I'd gently push back against this: determinism is a fundamental part of record/replay, not an optional add-on, and without it the replay may not only be incorrect but be incorrect in interesting ways that violate internal invariants (finding the wrong kind of event as we read the trace alongside canonical ABI steps, necessitating some sort of fallback that, I don't know, returns an error? aborts halfway through allocating something? forces a trap halfway through the marshalling code?). I'd personally see this in the same light as, e.g., the need for bounds checks when memory is configured a certain way: it's Just How We Compile Things.

I can't comment much on the intended direction of Config, but I agree that determinism is a fundamental part of RR, and it feels like something that should be implicitly enforced whenever that option is specified.

To be clear, I'm not saying that things shouldn't be deterministic; I'm saying that I think it would make sense to avoid unconditionally coupling them here. The two options handled here, NaN bits and relaxed-simd, are pretty low down on the list of "things that practically cause nondeterminism" in wasm, with resource exhaustion (stacks or memory) being much higher on that list. RR cannot control resource exhaustion during replay, in the sense that it can't necessarily predict the stack consumption (maybe memory? that seems relatively advanced).

Basically I would expect that divergence of a replay from the original recording is something that's going to need to be handled no matter what. I think it'd be reasonable, for example, for the CLI to automatically set these options but at the wasmtime::Config layer we've generally had bad experiences tying all these together.

Put another way, I would be surprised if we could actually achieve absolutely perfect determinism during a record and replay. Inevitably it seems like we'll forget events, have bugs that prevent this, etc. Assuming perfect determinism sounds to me like it's going to introduce more subtle bugs than not, and be a pretty steep uphill battle.

Right, I think it's an interesting challenge to think about permitting partial divergence and recovering. However I also think, having seen and thought through a lot of the challenges here, that is enormously complex and opens a huge new Pandora's box of issues. For example, with reversible debugging, which builds on top of replay, the whole algorithm depends on determinism; we'll back ourselves into fundamentally unsolvable corners if we don't have it.

See also e.g. how rr (the Mozilla project) panics with internal asserts if trace replay mismatches. I think that's the only really reasonable way to go here: we'll have asserts when we have mismatches. In other words, yep there may be bugs; let's treat them as bugs and catch and fix them.

Resource exhaustion is of course a different category: early termination because a memory.grow failed on replay is reasonable to propagate through and we already have the error paths for that. The kind of nondeterminism that is impossible to deal with is the kind that keeps running but with a poisoned machine state.
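The distinction between the two categories can be sketched roughly as follows. The `Event`, `ReplayError`, and `Replayer` types are hypothetical and do not reflect the actual trace format. A mismatch against the recorded trace hits an assert, while a failed resource request is returned as an ordinary error.

```rust
/// Hypothetical trace events; a real record/replay trace will look different.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Event {
    HostCall { id: u32 },
    MemoryGrow { pages: u32 },
}

#[derive(Debug, PartialEq)]
pub enum ReplayError {
    /// Recoverable: the replaying host could not satisfy a resource request.
    ResourceExhausted,
}

pub struct Replayer {
    trace: Vec<Event>,
    pos: usize,
}

impl Replayer {
    pub fn new(trace: Vec<Event>) -> Self {
        Replayer { trace, pos: 0 }
    }

    /// Consume the next recorded event, asserting that it matches what the
    /// replayed execution is doing right now. A mismatch is treated as a bug.
    fn expect(&mut self, actual: Event) {
        let recorded = self.trace.get(self.pos).copied();
        assert_eq!(recorded, Some(actual), "replay diverged at event index {}", self.pos);
        self.pos += 1;
    }

    pub fn host_call(&mut self, id: u32) {
        self.expect(Event::HostCall { id });
    }

    /// The event itself must match the trace, but running out of memory on
    /// the replaying host is propagated as an error rather than a panic.
    pub fn memory_grow(&mut self, pages: u32, pages_available: u32) -> Result<(), ReplayError> {
        self.expect(Event::MemoryGrow { pages });
        if pages > pages_available {
            return Err(ReplayError::ResourceExhausted);
        }
        Ok(())
    }
}
```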

Another possible option here, which we've done elsewhere, is to change defaults depending on configuration options. For example the default set of enabled wasm features is different if you use Winch than if you use Cranelift.

One way to slice this problem, without mutating Config, would be to keep the validation that determinism isn't explicitly disabled and then update the read of these configuration values to take into account the rr configuration. That would retain the fact that validate doesn't mutate Config, but the configuration for reading "is relaxed simd deterministic" would look like "was it set or is rr enabled" or something like that.
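A rough sketch of that idea, on a toy `Config` whose field and method names are assumptions rather than Wasmtime's API. The stored value records only what the user set, `validate` rejects an explicit disable, and the read accessor folds in the rr setting.

```rust
/// Hypothetical config; field and method names are illustrative only.
#[derive(Clone, Debug, Default)]
pub struct Config {
    pub record_replay: bool,
    /// `None` = left at its default, `Some(v)` = explicitly set by the user.
    pub relaxed_simd_deterministic: Option<bool>,
}

impl Config {
    /// Rejects only the case where determinism was explicitly turned off
    /// while record/replay is enabled. Nothing is mutated.
    pub fn validate(&self) -> Result<(), String> {
        if self.record_replay && self.relaxed_simd_deterministic == Some(false) {
            return Err(
                "relaxed-simd determinism was explicitly disabled, \
                 which conflicts with record/replay"
                    .to_string(),
            );
        }
        Ok(())
    }

    /// Effective value at read time: "was it set, or is rr enabled".
    pub fn is_relaxed_simd_deterministic(&self) -> bool {
        self.relaxed_simd_deterministic.unwrap_or(false) || self.record_replay
    }
}
```

The cost of this approach is that every reader has to go through the accessor instead of the raw field.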

I think that could work too.

Is there any consensus then on the approach for this? Thinking about it a bit more, updating the value on reads could be misleading, and it requires every future use of these options to check this edge case. Perhaps that's ok if it's stated explicitly in the code documentation somewhere?

I'd say my personal requirement is that fn validate should stay at &self instead of changing to &mut self. How exactly that plays out for this can be workable in a few ways (e.g. decouple these options, change those reading the options, change the source-of-truth for the options, etc).

Ah, I missed that this was happening in validate and making it a &mut self method -- I definitely agree that that's incorrect / violates the intended meaning of validate.

In earlier review I had pointed to this bit of logic for guest debugging (idea courtesy of Alex) where we change the default knob settings (prior to processing user overrides) based on other knobs; all of this is happening in a way that doesn't actually mutate the Config, just changes the tunables.

It seems that the determinism settings are not on Tunables so maybe that's not literally applicable here but in principle we should either do that, or do what validate says on the tin and simply reject an invalid config, not silently mutate.
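For illustration, the "change defaults first, then apply user overrides" pattern could look roughly like this. The sketch pretends the determinism settings live on a `Tunables`-like struct, which, as noted above, is not the case today; all names here are hypothetical.

```rust
/// Hypothetical resolved settings, computed from the config but stored apart from it.
#[derive(Debug, PartialEq)]
pub struct Tunables {
    pub deterministic_nans: bool,
    pub deterministic_relaxed_simd: bool,
}

/// Hypothetical config where `None` means "not set by the user".
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Config {
    pub record_replay: bool,
    pub deterministic_nans: Option<bool>,
    pub deterministic_relaxed_simd: Option<bool>,
}

impl Config {
    /// Builds the resolved settings without touching `self`.
    pub fn build_tunables(&self) -> Tunables {
        // Step 1: baseline defaults, which depend on another knob.
        let mut tunables = Tunables {
            deterministic_nans: self.record_replay,
            deterministic_relaxed_simd: self.record_replay,
        };
        // Step 2: explicit user settings override the defaults.
        if let Some(v) = self.deterministic_nans {
            tunables.deterministic_nans = v;
        }
        if let Some(v) = self.deterministic_relaxed_simd {
            tunables.deterministic_relaxed_simd = v;
        }
        tunables
    }
}
```

A separate `validate(&self)` could still reject combinations where an explicit override conflicts with record/replay.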

crates/wasmtime/src/config.rs

crates/wasmtime/src/engine.rs

crates/cli-flags/src/lib.rs

arjunr2 · 2026-02-02T01:21:23Z

Ok, I've addressed everything from the prior review. In particular, validate no longer implicitly enforces determinism; users must enable it explicitly, and validate rejects with an error any config that enables RR without it. It seems fine to leave it to the user to set the appropriate companion settings for RR.
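In sketch form, the behavior described here is a pure check that reports which settings are missing. This uses a toy `Config` with made-up field names, not the code in this PR.

```rust
/// Hypothetical config; the real option names in Wasmtime may differ.
#[derive(Clone, Debug, Default)]
pub struct Config {
    pub record_replay: bool,
    pub deterministic_nans: bool,
    pub deterministic_relaxed_simd: bool,
}

impl Config {
    /// Rejects a record/replay config unless the determinism options were
    /// enabled by the user. Takes `&self` and never changes any setting.
    pub fn validate(&self) -> Result<(), String> {
        if !self.record_replay {
            return Ok(());
        }
        let mut missing = Vec::new();
        if !self.deterministic_nans {
            missing.push("deterministic_nans");
        }
        if !self.deterministic_relaxed_simd {
            missing.push("deterministic_relaxed_simd");
        }
        if missing.is_empty() {
            Ok(())
        } else {
            Err(format!(
                "record/replay requires these options to be enabled explicitly: {}",
                missing.join(", ")
            ))
        }
    }
}
```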

alexcrichton

Thanks! I'll flag for merge after a rebase (merge conflicts currently)

Config knobs and validation for record-replay

8e1e5a2

arjunr2 requested review from a team as code owners January 20, 2026 17:15

arjunr2 requested review from alexcrichton and removed request for a team January 20, 2026 17:15

github-actions bot added the wasmtime:api and wasmtime:config labels Jan 20, 2026

alexcrichton reviewed Jan 22, 2026


Resolve review; remove implicit enforcement

7eb2383

alexcrichton approved these changes Feb 2, 2026
