Sysbench fileio read tests and the --validate option


In late March 2022, a provider reported a flaw in the sysbench fileio read metrics to VPSBenchmarks. When Sysbench writes its test files with the prepare command and the default parameters, the files are filled with zeros. That is a problem because some storage hardware optimizes for this case: when asked to read zero-filled files, it reads only the metadata and skips the file contents. This is much faster than reading normal files, and it gives an unfair advantage to providers that use this type of hardware.

Fortunately, Sysbench has a command-line option called "validate". The documentation for the validate option says: "Perform validation of test results where possible (default: off)".
Further down, the fileio section of the man page adds: "When the global --validate option is used with the fileio test mode, Sysbench performs checksums validation on all data read from the disk. On each write operation the block is filled with random values, then the checksum is calculated and stored in the block along with the offset of this block within a file. On each read operation the block is validated by comparing the stored offset with the real offset, and the stored checksum with the real calculated checksum."
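To make the mechanism in that description concrete, here is a minimal Python sketch of the idea. It is not Sysbench's actual code: the block size, the CRC32 checksum and the trailer layout are assumptions picked for illustration, and the function names are hypothetical.

```python
import os
import struct
import zlib

BLOCK_SIZE = 16 * 1024          # assumed block size for this sketch
TRAILER = struct.Struct("<IQ")  # hypothetical layout: CRC32 (4 bytes) + offset (8 bytes)


def make_block(offset, block_size=BLOCK_SIZE):
    """Write side: random payload, followed by its checksum and its file offset."""
    payload = os.urandom(block_size - TRAILER.size)
    checksum = zlib.crc32(payload)
    return payload + TRAILER.pack(checksum, offset)


def validate_block(block, real_offset):
    """Read side: compare the stored offset and checksum with the real ones."""
    payload, trailer = block[:-TRAILER.size], block[-TRAILER.size:]
    stored_checksum, stored_offset = TRAILER.unpack(trailer)
    return stored_offset == real_offset and zlib.crc32(payload) == stored_checksum
```

Because the payload is random and has to be checksummed on every read, a storage layer cannot answer the read from metadata alone; it has to return the real bytes.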

This is what we want: with validation enabled, the fileio prepare command fills the test files with random bytes. We think this should be the command's default behavior.
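As a rough sketch of what such a run could look like, the Python helper below builds the three command lines for a fileio read test with validation enabled. The helper name, the 1G total size and the rndrd test mode are our assumptions for the example, not the exact parameters VPSBenchmarks uses. The validate option is passed to prepare as well, since that is the step that writes the test files.

```python
import shlex
from typing import List


def fileio_commands(total_size="1G", test_mode="rndrd", validate=True) -> List[List[str]]:
    """Build argv lists for the prepare, run and cleanup steps of a fileio test."""
    base = ["sysbench"]
    if validate:
        # Global option: enables checksum validation and random block contents.
        base.append("--validate=on")
    base += [
        "fileio",
        f"--file-total-size={total_size}",  # assumed size, adjust to the server
        f"--file-test-mode={test_mode}",    # rndrd = random read
    ]
    return [base + [command] for command in ("prepare", "run", "cleanup")]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for argv in fileio_commands():
        print(shlex.join(argv))
```

Each list can be handed to `subprocess.run(argv, check=True)` on a machine where sysbench is installed.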

After the bug was verified at the end of March 2022, VPSBenchmarks started running fileio read tests both with and without the "validate" option and collecting the new metrics. A review of the 27 trials run since then shows that about 25% of them returned substantially higher (and thus wrong) read metrics without the validate option.
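A comparison like this can be sketched in a few lines of Python. The 1.5x threshold, the function name and the sample numbers below are made up for illustration; they are not VPSBenchmarks data or the exact criterion used in the review.

```python
def inflated_trials(trials, threshold=1.5):
    """Return names of trials whose read metric without validation is at least
    `threshold` times the metric measured with validation."""
    return [
        name
        for name, without_validate, with_validate in trials
        if with_validate > 0 and without_validate / with_validate >= threshold
    ]


# Illustrative numbers only (read throughput in MiB/s), not real trial results.
sample = [
    ("trial-a", 210.0, 205.0),
    ("trial-b", 1900.0, 240.0),  # zero-filled files read far too fast
    ("trial-c", 98.0, 101.0),
    ("trial-d", 310.0, 300.0),
]

flagged = inflated_trials(sample)
share = len(flagged) / len(sample)
```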

Switching to the new metrics took some time because VPSBenchmarks grades represent the performance of a server relative to others on every single metric. Calculating the grade on a metric requires enough trials with the new metric to produce meaningful min, max, median and several percentile values. As of mid-May 2022, we have enough data: all Sysbench fileio read tests are run with the validate option, and for new trials all grades are based on the new metrics.
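To show why a relative grade needs a pool of comparable trials, here is a hypothetical percentile-rank calculation in Python. It is only an illustration of the general idea, not the actual VPSBenchmarks grading formula, and the function name is our own.

```python
from bisect import bisect_right


def percentile_rank(value, reference):
    """Percentage of reference values that are less than or equal to `value`."""
    ordered = sorted(reference)
    if not ordered:
        raise ValueError("need at least one reference trial")
    return 100.0 * bisect_right(ordered, value) / len(ordered)
```

With only a handful of reference trials, each new result moves the rank in large steps, which is why the switch had to wait until enough trials with the new metric were available.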



