Stress-testing your solutions
Note
This page picks up the sum-of-N-integers problem from First steps,
where sols/main.cpp is correct and sols/wa-overflow.cpp accumulates the sum into
an int32_t that silently overflows.
Our goal here is simple: ask rbx to find a tiny input that breaks sols/wa-overflow.cpp.
Describing the search
A stress test is described by two expressions:
- a generator expression, which tells rbx how to keep producing random testcases, and
- a finder expression, which describes the condition that makes a testcase a match.
See Stress testing for the complete operator reference. Here we only need a couple of operators.
Our generator expression is:
- [1..5] keeps the count of integers tiny: at most five numbers per test.
- <A.max> pulls the upper bound straight from the vars defined in problem.rbx.yml, so it tracks the problem's real constraints.
- @ is replaced by a fresh random string on every evaluation, so each run produces a different testcase.
Why is such a small range enough? An int32_t overflows once the sum passes
2³¹ − 1 = 2,147,483,647 (~2.1×10⁹). With A.max up at ~10⁹, three near-maximal numbers
already push the true sum past that line, which is exactly why the counterexample rbx
finds comes out tiny.
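To make the arithmetic concrete, here is a minimal C++ sketch. It is not the code of sols/main.cpp or sols/wa-overflow.cpp: the function names are hypothetical, and the narrow version wraps through unsigned arithmetic to mimic a 32-bit accumulator, because genuine signed overflow is undefined behaviour in C++.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reference behaviour: accumulate in 64 bits, which cannot overflow
// for a handful of values bounded by ~10^9.
std::int64_t sum_wide(const std::vector<std::int32_t>& values) {
    std::int64_t total = 0;
    for (std::int32_t v : values) total += v;
    return total;
}

// Mimics a 32-bit accumulator (hypothetical stand-in for the buggy solution).
// Unsigned addition wraps modulo 2^32 by definition; the final conversion back
// to int32_t is modular since C++20 and behaves the same way on mainstream
// two's-complement compilers before that.
std::int32_t sum_narrow(const std::vector<std::int32_t>& values) {
    std::uint32_t total = 0;
    for (std::int32_t v : values) total += static_cast<std::uint32_t>(v);
    return static_cast<std::int32_t>(total);
}
```

For the three values 1000000000, 1000000000, 1000000000, sum_wide returns 3000000000 while sum_narrow returns -1294967296. With only two such values both functions agree on 2000000000, which still fits in 32 bits.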
Our finder expression is:
This matches any testcase for which sols/wa-overflow.cpp produces a verdict considered
incorrect. As a convenience, sols/wa-overflow.cpp on its own is shorthand for the same
thing.
Running the stress
By default the stress runs for about 10 seconds and stops as soon as it finds the first
match. You can tune both the number of findings and the timeout with -n and -t — see
Stress testing for the details.
Inspecting the counterexample
When a match is found, rbx stress prints a report and shows the exact generator call
that produced the failing testcase, along with the input itself.
It's a small input — just a few large numbers whose sum overflows int32_t, so
sols/wa-overflow.cpp prints the wrong value while sols/main.cpp gets it right. That's
the divergence rbx flagged as INCORRECT.
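Conceptually, the search can be pictured as the loop below. This is only an in-process C++ sketch under stated assumptions, not how rbx is implemented: rbx runs real generator and solution programs, whereas here both "solutions" are hypothetical functions, values are assumed to lie in [1, max_value], and the seed and attempt budget are arbitrary parameters.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <random>
#include <vector>

using Testcase = std::vector<std::int32_t>;

// Hypothetical stand-in for the correct solution: 64-bit accumulator.
std::int64_t reference_sum(const Testcase& t) {
    std::int64_t total = 0;
    for (std::int32_t v : t) total += v;
    return total;
}

// Hypothetical stand-in for the buggy solution: wraps modulo 2^32
// (done via unsigned arithmetic to avoid undefined behaviour).
std::int32_t overflowing_sum(const Testcase& t) {
    std::uint32_t total = 0;
    for (std::int32_t v : t) total += static_cast<std::uint32_t>(v);
    return static_cast<std::int32_t>(total);
}

// Tries up to `attempts` random testcases with 1..max_count values in
// [1, max_value] and returns the first one on which the two functions disagree.
std::optional<Testcase> find_counterexample(int max_count, std::int32_t max_value,
                                            int attempts, std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::uniform_int_distribution<int> count_dist(1, max_count);
    std::uniform_int_distribution<std::int32_t> value_dist(1, max_value);
    for (int i = 0; i < attempts; ++i) {
        Testcase t(count_dist(rng));
        for (std::int32_t& v : t) v = value_dist(rng);
        if (reference_sum(t) != overflowing_sum(t)) return t;  // first match wins
    }
    return std::nullopt;  // budget exhausted without a match
}
```

Calling find_counterexample(5, 1000000000, 1000, 42) mirrors the setup on this page: at most five values, each up to ~10⁹. Any testcase it returns has a true sum above 2,147,483,647. With a small bound such as 1000 the sum can never leave the 32-bit range, so the search comes back empty.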
Making it stick
A counterexample is only useful if it survives into your testset. Right after a match,
rbx stress asks:
Do you want to add the tests that were found to a test group?
Answer yes. rbx then lists every test group backed by a .txt generator script,
plus two extra options: (create new script) and (skip). Choose (create new script)
and name it testplan/corner.txt.
rbx appends the found generator call to that script — prefixed with a
# Obtained by running rbx stress ... comment so you know where it came from — and adds a
new corner test group to problem.rbx.yml:
# Testcases section would now look like:
testcases:
  - name: 'samples'
    testcaseGlob: 'tests/samples/*.in'
  - name: 'random'
    generatorScript:
      path: 'random.txt'
  - name: 'corner' # (1)!
    generatorScript:
      path: 'corner.txt' # (2)!
- (1) The new group corner is backed by the freshly created testplan/corner.txt.
- (2) The path is relative to the testplan root, so bare corner.txt here and the testplan/corner.txt you typed are the same file.
Now run rbx build, and the counterexample is regenerated as a permanent test in the
corner group.
Tip
Because testlib seeds its RNG from the argv passed to the generator, the saved
generator call reproduces the exact same input on every build. The randomized @
has already been resolved to a concrete string, so the test is fully deterministic from
here on.
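The idea of deriving the seed from the arguments can be illustrated with a self-contained C++ sketch. It deliberately does not use testlib and does not reproduce testlib's actual hashing: the FNV-1a hash, the function names, and the argument list in the usage note are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Hashes every argument with FNV-1a (an illustrative choice, not testlib's
// algorithm) so that the same argument list always yields the same seed.
std::uint64_t seed_from_args(const std::vector<std::string>& args) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (const std::string& arg : args) {
        for (unsigned char c : arg) {
            h ^= c;
            h *= 1099511628211ULL;  // FNV-1a 64-bit prime
        }
        h ^= 0xFF;  // extra byte so argument boundaries affect the hash
        h *= 1099511628211ULL;
    }
    return h;
}

// Produces `count` values in [1, max_value] from an RNG seeded only by `args`.
// Assumes max_value >= 1.
std::vector<std::int32_t> generate(const std::vector<std::string>& args,
                                   int count, std::int32_t max_value) {
    std::mt19937_64 rng(seed_from_args(args));
    std::vector<std::int32_t> out(count);
    for (std::int32_t& v : out) {
        v = static_cast<std::int32_t>(rng() % static_cast<std::uint64_t>(max_value)) + 1;
    }
    return out;
}
```

Calling generate twice with the same hypothetical argument list, for example {"gen", "5", "1000000000", "k3Xq"}, returns identical values. That is the property the saved generator call relies on once @ has been replaced by a fixed string.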
Note
A future rbx release will let you promote a finding straight to a manual_tests/
file in a single step (issue #442). Until
that lands, the test-group route above is the way to make a counterexample permanent.
Next steps
- Stress testing reference: Want fuzzing, --slowest, or saved stresses: blocks? The reference page covers the full operator set and every flag.
- Generators: Want to write smarter generators for your stress tests? Check out our guide on generators.
- Configure further: Want to learn all you can do in problem.rbx.yml? Check out our reference on how to configure your problem.