Measuring performance:
Profiling & benchmarking
Before we talk about ways to improve performance, let’s see how to measure it.
When should you care?
“There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”
Optimizing code takes time, can introduce mistakes, and may make code harder to read. Consequently, not all code is worth optimizing, and you need a strategy before jumping into optimizations.
You should consider optimizations when:
- you have debugged your code (optimization comes last: don’t optimize code that doesn’t run),
- you will run a section of code (e.g. a function) many times (your optimization efforts will really pay off),
- a section of code is particularly slow.
How do you know which sections of your code are slow? Don’t rely on intuition. You need to profile your code to identify bottlenecks.
Profiling
“It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail.”
R comes with a profiler: Rprof().
profvis is a newer tool, built by Posit (formerly RStudio). Under the hood, it runs Rprof() to collect data, then produces an interactive HTML widget with a flame graph that makes it easy to spot slow sections of code visually.
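For example, a minimal sketch of how profvis might be used (slow_fun is just a stand-in for your own code):

```r
# install.packages("profvis")  # if needed
library(profvis)

# Stand-in for your own code: grows a vector inside a loop (deliberately inefficient)
slow_fun <- function(n) {
  x <- numeric(0)
  for (i in seq_len(n)) x <- c(x, sqrt(i))
  x
}

# Opens an interactive flame graph (in the RStudio viewer or a browser)
profvis({
  slow_fun(1e4)
})
```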
While this tool integrates well with the RStudio IDE, it is not well suited for remote work on a cluster. One option is to profile your code with a small dataset on your own machine. Another is to use the base profiler Rprof() directly, as in this example.
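A minimal sketch of that workflow (the file name and the profiled expression are arbitrary placeholders):

```r
# Start sampling and write the profiling data to a file
Rprof("profile.out")

# The code you want to profile
result <- lapply(1:50, function(i) sort(runif(1e5)))

# Stop the profiler, then summarise the collected samples
Rprof(NULL)
summaryRprof("profile.out")
```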
Benchmarking
Once you have identified expressions that are particularly slow, you can use benchmarking tools to compare variations of the code.
At its most basic, you can use system.time(), but it is limited and imprecise.
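For instance:

```r
# Returns user, system, and elapsed times (in seconds) for a single run
system.time(sort(runif(1e7)))
```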
The microbenchmark package is a much better option. It reports the minimum, lower quartile, mean, median, upper quartile, and maximum execution times of R expressions.
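For example (the two expressions are arbitrary, equivalent ways of computing the same values):

```r
# install.packages("microbenchmark")  # if needed
library(microbenchmark)

x <- runif(1e4)

# Each expression is run 100 times and the timing distribution is summarised
microbenchmark(
  sqrt_fun = sqrt(x),
  power    = x^0.5,
  times    = 100
)
```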
The newer bench package is very similar, but it has less overhead, is more accurate, and—for sequential code—gives information on memory usage and garbage collections. This is the package that we will use for this course.
The main function of this package is mark(). You can pass one or more expressions to benchmark as arguments. By default, mark() checks that all expressions return the same result; to disable this check, add the argument check = FALSE.
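A sketch of both situations (the expressions themselves are arbitrary examples):

```r
# install.packages("bench")  # if needed
library(bench)

x <- runif(1e4)

# Both expressions return the same result, so the default check passes
mark(
  sqrt_fun = sqrt(x),
  power    = x^0.5
)

# These return different results, so the equality check must be disabled
mark(
  head_6  = head(x),
  head_10 = head(x, 10),
  check = FALSE
)
```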
While mark() gives memory usage and garbage collection information for sequential code, this functionality is not yet implemented for parallel code. When benchmarking parallel expressions, we will have to use the argument memory = FALSE.
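A hedged sketch, using mclapply() from the base parallel package purely as an example of a parallel expression (forking is not available on Windows, where a different backend would be needed):

```r
library(parallel)
library(bench)

f <- function(i) {
  Sys.sleep(0.01)   # simulate some work
  sqrt(i)
}

# memory = FALSE: memory profiling is not available for parallel code
mark(
  sequential = lapply(1:20, f),
  parallel   = mclapply(1:20, f, mc.cores = 2),
  memory = FALSE
)
```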
You will see many examples throughout this course.