This document explains how to use SafetyNet to detect performance regressions in PDFium.
safetynet_compare.py is a script that compares the performance of two versions of PDFium. Use it to check whether a given change has caused, or will cause, a performance improvement or regression for a set of test cases.
The supported profilers are exclusive to Linux, so for now this can only be run on Linux.
An illustrative example is below, comparing the local code version to an older version. A positive % change means the test took more time/instructions to run (a regression), while a negative % change means it took less (an improvement).
$ testing/tools/safetynet_compare.py ~/test_pdfs --branch-before beef5e4
================================================================================
   % Change      Time after  Test case
--------------------------------------------------------------------------------
   -0.1980%  45,703,820,326  ~/test_pdfs/PDF Reference 1-7.pdf
   -0.5678%      42,038,814  ~/test_pdfs/Page 24 - PDF Reference 1-7.pdf
   +0.2666%  10,983,158,809  ~/test_pdfs/Rival.pdf
   +0.0447%  10,413,890,748  ~/test_pdfs/dynamic.pdf
   -7.7228%      26,161,171  ~/test_pdfs/encrypted1234.pdf
   -0.2763%     102,084,398  ~/test_pdfs/ghost.pdf
   -3.7005%  10,800,642,262  ~/test_pdfs/musician.pdf
   -0.2266%  45,691,618,789  ~/test_pdfs/no_metadata.pdf
   +1.4440%  38,442,606,162  ~/test_pdfs/test7.pdf
   +0.0335%       9,286,083  ~/test_pdfs/testbulletpoint.pdf
================================================================================
Test cases run: 10
Failed to measure: 0
Regressions: 0
Improvements: 2
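The way the % Change column feeds the summary counts can be sketched in a few lines of Python. This is an illustration only, not code from safetynet_compare.py: the function names are hypothetical, and the 2% significance threshold is an assumption picked because it is consistent with the summary in the example output (two improvements, no regressions).

```python
# Illustrative sketch only; not the actual safetynet_compare.py code.
# THRESHOLD_PERCENT is an assumed cutoff, not a value stated in this document.
THRESHOLD_PERCENT = 2.0


def percent_change(before, after):
    """Return the relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def classify(change_percent, threshold=THRESHOLD_PERCENT):
    """Label a change as a regression, an improvement, or not significant."""
    if change_percent > threshold:
        return "regression"
    if change_percent < -threshold:
        return "improvement"
    return "not significant"


# The % Change column from the example output.
changes = [-0.1980, -0.5678, 0.2666, 0.0447, -7.7228,
           -0.2763, -3.7005, -0.2266, 1.4440, 0.0335]
labels = [classify(c) for c in changes]
print("Regressions:", labels.count("regression"))
print("Improvements:", labels.count("improvement"))
```

Under the assumed threshold, only encrypted1234.pdf and musician.pdf count as significant changes, which matches the summary lines of the example.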
Run the safetynet_compare.py script in testing/tools to perform a comparison. Pass one or more paths with test cases - each path can be either a .pdf file or a directory containing .pdf files. Other files in those directories are ignored.
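The path handling described above could look roughly like the following Python sketch. It is not the script's actual implementation: collect_test_cases is a hypothetical helper, and two details are assumptions, namely that directories are not searched recursively and that the .pdf extension is matched case-insensitively.

```python
# Illustrative sketch only; not the actual safetynet_compare.py code.
from pathlib import Path


def collect_test_cases(paths):
    """Expand .pdf files and directories into a list of .pdf paths."""
    test_cases = []
    for raw in paths:
        path = Path(raw).expanduser()
        if path.is_dir():
            # Assumption: only the directory's direct children are considered.
            candidates = sorted(path.iterdir())
        else:
            candidates = [path]
        # Anything that is not a regular .pdf file is ignored.
        test_cases.extend(
            p for p in candidates
            if p.is_file() and p.suffix.lower() == ".pdf")
    return test_cases
```

For example, collect_test_cases(["~/test_pdfs", "~/extra/one.pdf"]) would return the .pdf files directly inside ~/test_pdfs followed by ~/extra/one.pdf, provided they exist.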
The following comparison modes are supported:
$ testing/tools/safetynet_compare.py path/to/pdfs
$ testing/tools/safetynet_compare.py path/to/pdfs --branch-before another_branch
$ testing/tools/safetynet_compare.py path/to/pdfs --branch-before 1a3c5e7
$ testing/tools/safetynet_compare.py path/to/pdfs --branch-after another_branch --branch-before yet_another_branch
$ testing/tools/safetynet_compare.py path/to/pdfs --branch-after 1a3c5e7 --branch-before 0b2d4f6
$ testing/tools/safetynet_compare.py path/to/pdfs --branch-after another_branch --branch-before 0b2d4f6
$ gn args out/BuildConfig1
$ gn args out/BuildConfig2
$ testing/tools/safetynet_compare.py path/to/pdfs --build-dir out/BuildConfig2 --build-dir-before out/BuildConfig1
safetynet_compare.py takes care of checking out the appropriate branch, building it, running the test cases and comparing results.
safetynet_compare.py uses callgrind as a profiler by default. Use --profiler to specify another one. The supported ones are:
Only works on Linux. Make sure you have perf by typing in the terminal:
This is a fast profiler, but uses sampling so it's slightly inaccurate. Expect variations of up to 1%, which is below the cutoff to consider a change significant.
Use this when running over large test sets to get good enough results.
Only works on Linux. Make sure valgrind is installed:
This is a slow but accurate profiler. Expect variations of around 100 instructions. The cost is that it takes about 50 times longer to run than perf stat.
Use this when looking for small variations (< 1%).
One advantage is that callgrind can generate callgrind.out files (by passing --output-dir to safetynet_compare.py), which contain profiling information that can be analyzed to find the cause of a regression. KCachegrind is a good visualizer for these files.
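Before starting a long run, it can be convenient to confirm that the executables behind these profilers are on the PATH. The snippet below is a small standalone sketch, not part of the SafetyNet scripts. find_profiler_tools is a hypothetical helper, and it assumes the executables are named perf and valgrind, as the text above suggests.

```python
# Illustrative sketch only; not part of the SafetyNet scripts.
import shutil


def find_profiler_tools(tools=("perf", "valgrind")):
    """Map each executable name to its full path, or None if not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


for tool, location in find_profiler_tools().items():
    print(f"{tool}: {location or 'not found'}")
```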
Runs without any profiler, always giving a performance score of 1. Useful for running image comparisons or debugging the script.
Arguments commonly passed to safetynet_compare.py.
Most of the time these don't need to be used.
Create a separate checkout of pdfium in a new directory, for example ~/job. The safetynet_job.py script will run from this directory. This checkout needs to be git pull'ed when there are changes to the SafetyNet scripts, but otherwise it can be left alone.
Create a directory to contain the job results, for example ~/job_results. In each run, a .log file with the results will be written to this directory and a subdirectory will be created with the other artifacts.
Set up a cron job to run safetynet_job.py nightly. The example below runs it at 1:42 AM, over the corpus in two directories:
$ crontab -e
42 1 * * * bash -lc '~/job/pdfium/testing/tools/safetynet_job.py ~/job_results ~/pdf_samples/thousand_pdfs ~/pdf_samples/i18n --output-to-log >> ~/job_results/cron_nightly.log 2>&1'
The first time the job runs, it will just create a checkpoint as ~/job_results/last_revision_covered. From then on, since a checkpoint is available, each run will compare performance with the last checkpoint and update the checkpoint.
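The checkpoint behavior described above can be sketched as follows. This is a simplified illustration, not the actual safetynet_job.py logic: run_job and its compare callback are hypothetical, and storing a single revision identifier as text in the checkpoint file is an assumption based on the file name.

```python
# Illustrative sketch only; not the actual safetynet_job.py code.
from pathlib import Path


def run_job(results_dir, current_revision, compare):
    """Compare against the stored checkpoint, if any, then update it.

    compare is a hypothetical callback taking (before_revision,
    after_revision). Returns the revision that was compared against,
    or None on the first run.
    """
    checkpoint = Path(results_dir) / "last_revision_covered"
    # Assumption: the checkpoint file holds one revision identifier.
    previous = checkpoint.read_text().strip() if checkpoint.exists() else None
    if previous is not None:
        compare(previous, current_revision)
    checkpoint.write_text(current_revision + "\n")
    return previous
```

In this sketch the first call only writes the checkpoint, and every later call compares the previous checkpoint against the current revision before overwriting it.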
Pass the --png-dir option pointing at an output directory to compare the images that pdfium_test renders for the “before” and the “after” branches.
$ mkdir ~/output_images
$ testing/tools/safetynet_compare.py ~/pdf_samples --branch-before before_visual_changes --branch-after after_visual_changes --png-dir ~/output_images
This will output and automatically open a ~/output_images/compare.html file showing the before/after images and the diff. Hover the mouse cursor over the before/after image on the left for an easier visual comparison: the “before” image is displayed until the cursor hovers over it, at which point it is replaced with the “after” image.
It is recommended to use --profiler=none with this option.