Checkers#
Checker is a concept introduced by testlib to verify whether the participant's solution is correct for a given testcase.
A checking algorithm can range from a very simple diff between two files, to a much more complex algorithm such as "does the participant output contains a path between two specific vertices by using edges that are part of the input graph?".
We strongly recommend using testlib checkers for your problems, as they are battle-tested and will cover most of your needs. Also, they're usually lenient with extra spaces, newlines and such.
Think of all the frustration you had in your life with presentation errors and problems that asked you to print the "minimum lexicographically path in a graph" just to force the solution to be unique. testlib checkers are here to solve that.
Non-testlib checkers
Please, use testlib checkers. rbx is seriously opinionated about this, and although it will most of the times work with non-testlib checkers, no guarantees are given.
This document will intentionally not cover non-testlib checkers.
Built-in testlib checkers#
rbx provides out-of-the-box support for the built-in testlib checkers defined here.
You can use them by specifying the checker
field in your problem.rbx.yml
file.
rbx will automatically detect it is a built-in checker and will download it from the testlib repository.
The most common built-in checkers are described in the table below, but you should feel free to explore the others. The wcmp.cpp
checker
is the default checker for rbx, since it's basically a space-tolerant diff.
Checker |
Description |
---|---|
|
Compares the sequence of words of two files, as if they were a tuple. |
|
Same as |
|
Same as |
|
Compares a single word which must be "YES" or "NO" (case-insensitive). |
|
Same as to |
|
Compares two doubles by ensuring their maximal absolute or relative error is at most |
|
Same as |
Custom testlib checkers#
You can also write your own testlib checkers. Checkers in rbx are programs that receives three arguments:
The arguments are:
<input_file>
: the input file for the testcase.<output_file>
: the output file produced by the participant.<answer_file>
: the answer file, i.e. the one that contains the output of the model solution.
Some times, the <answer_file>
is not needed. Let's see a simple case first, and then let's look at one
where the <answer_file>
is needed.
Output-only case#
Let's say we have a problem that asks you to find a path between two vertices 1 and N
in a graph.
In this case, we want to check whether:
- The participant's output contains a path between two vertices 1 and
N
in the input graph. - The path is simple, i.e. it doesn't visit any vertex (or edge) more than once.
Let's say the input is given in the following format:
And the output is printed in the following format:
Where K
is the number of vertices in the path, and p_1 p_2 ... p_K
is the sequence of vertices in the path.
We can write a checker that does exactly that.
-
Notice how we're strict with the bounds of the numbers we're reading from the user. If the user provides an invalid number, we'll mark the participant's output as wrong.
This is especially important here, not only for correctness, but also because we'll allocate a vector of size
N
in the next step, and we don't want the participant to provide an invalid number. -
We use
quitif
to immediately stop the program and mark the participant's output as wrong when some condition is not met.One can also use the
quitf
variant to simply quit. This allows one to write an equivalent code: -
We use
quitf(_ok, ...)
to mark the participant's output as correct, and notice we can use format specifiers in the message.
Notice checking only the participant's output here was more than enough. We don't need to consume the <answer_file>
at all.
You can learn more about testlib streams and all the functions available in their official documentation.
Output + answer case#
Sometimes, it's also important to consider the jury's solution.
Let's consider the following modification to the problem above: now, you have to find the shortest path between two vertices 1 and N
. We can assume the input and output format stays the same.
The only way to be sure the participant's solution is the shortest possible is to actually find the shortest path in the input graph. That is exactly what our model solution already does, right? We could proceed in two ways here:
-
We could simply read the answer from the
<answer_file>
and ensure the participant's path size is equals to the jury's path size, and otherwise mark the participant's output as wrong. -
We could compare the participant's path with the jury's path, and if they are different, mark the participant's output as wrong if it's longer than the jury's path, or mark the jury's output as wrong if it's longer than the participant's path.
(1) is very dangerous for obvious reasons: what if the jury's solution is wrong? Of course, for shortest paths we simply know the optimal solution is a BFS, and it's hard to get that wrong, right? But think of more complex problems which are totally new and that probably have never been solved before. It's very easy to get that wrong, and having a checker for that can help detecting an issue in setting time (or, in worst case, in contest time).
(2) is the way to go here, as it ensures that the participant's solution is at least as good as the jury's solution. rbx has
a special outcome called JUDGE_FAILED
, which is used to convey a checking failure, when the jury's output is wrong.
-
Notice how we extract the common logic of reading a path from a stream into a function.
testlib is smart: if quitf is called with
_wa
, it will mark the corresponding stream as wrong, resulting in a WA verdict if the participant's output is wrong, and in a JUDGE_FAILED verdict if the jury's output is wrong. -
We use
_wa
to mark the jury's output (represented by theans
stream) as wrong and cause a JUDGE_FAILED verdict.We could as well use
quitif(..., _fail, ...)
, which would have the same effect.Note
_fail
conveys a checking failure, regardless of the stream it was applied to._wa
conveys a wrong answer, and is applied to the stream that was being checked.
Notice how we read from both outputs, validating each of them individually, and compare their sizes at the end, giving an appropriate verdict for each of them, and avoiding assuming the jury's output is 100% correct.
You can learn more about testlib streams and all the functions available in their official documentation.