## speed of R, C, &tc.

**M**y Paris colleague (and fellow-runner) Aurélien Garivier has produced an interesting comparison of 4 (or 6 if you consider scilab and octave as different from matlab) computer languages in terms of speed for producing the MLE in a hidden Markov model, using EM and the Baum-Welch algorithms. His conclusions are that

- matlab is a lot faster than R and python, especially when vectorization is important: this is why the difference is spectacular on filtering/smoothing, not so much on the creation of the sample;
- octave is a good matlab emulator, if no special attention is paid to execution speed…;
- scilab appears as a credible, efficient alternative to matlab;
- still, C is **a lot** faster; the inefficiency of matlab in loops is well-known, and clearly shown in the creation of the sample.
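To see why the creation of the sample is loop-bound in any interpreted language, here is a minimal Python/NumPy sketch of simulating a discrete hidden Markov model. It is not Aurélien's code: the function name `simulate_hmm` and its arguments are made up for illustration. Each state depends on the previous one, so the loop over time cannot be replaced by a single vectorized call.

```python
import numpy as np

def simulate_hmm(T, trans, emit, init, seed=0):
    """Draw T hidden states and observations from a discrete HMM.

    trans : (K, K) transition matrix, rows sum to 1
    emit  : (K, M) emission matrix, rows sum to 1
    init  : (K,) initial state distribution
    (Hypothetical helper, not taken from the benchmarked codes.)
    """
    rng = np.random.default_rng(seed)
    K, M = emit.shape
    states = np.empty(T, dtype=int)
    obs = np.empty(T, dtype=int)
    state = rng.choice(K, p=init)
    for t in range(T):
        # The state at time t depends on the state at time t-1,
        # so this loop is inherently sequential.
        states[t] = state
        obs[t] = rng.choice(M, p=emit[state])
        state = rng.choice(K, p=trans[state])
    return states, obs
```

With a 2×2 transition matrix each iteration does almost no arithmetic, so the running time is dominated by interpreter overhead per iteration, which is exactly what a compiled language avoids.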

(In this implementation, R is “only” three times slower than matlab, so this is not so damning…) All the codes are available and you are free to make suggestions to improve the speed of your favourite language!

February 4, 2012 at 12:19 am

Why have comparisons of tools designed for entirely different purposes become so popular?

Isn’t it obvious that many scientists prefer to explore their data (or a subset of it) using the scripting language they know best, and that once the research is completed they can ask for their scripts to be rewritten in Fortran (which is sometimes faster than C) or pure C (work for technicians, to process the full datasets)?

I only want to mention that you need to add to your benchmarks the time spent developing each test (or the length of the source code). As an example: http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=ifc&lang2=gcc

February 4, 2012 at 12:17 am

Surely the point of high level languages like R and Matlab is that they allow rapid prototyping of complex programs. Sometimes developer time is more important than runtime.

February 3, 2012 at 11:20 pm

I’ve tried to translate the scilab code into C++ with the Armadillo library (see http://arma.sourceforge.net ).

The source code is here : http://www.math.univ-montp2.fr/~pudlo/HMMarma.cpp

On my computer, smoothing/filtering runs faster than the pure C implementation, but sampling and EM run a little bit slower.

February 7, 2012 at 3:46 am

You’ll get more speed from Armadillo if you disable the run-time bounds checks. Armadillo has debugging turned on by default, to catch mistakes in user algorithms. The reasoning is “first get algorithm right, then optimise”.

To disable run-time checks, compile with:

g++ prog.cpp -o prog -O3 -DARMA_NO_DEBUG -larmadillo

or place

#define ARMA_NO_DEBUG

before #include <armadillo>

It also helps to have Atlas installed, as it provides a faster implementation of many Blas and Lapack routines (which Armadillo uses).

PS. The developers are brewing Armadillo 3.0, which already has a few nice speedups.

February 3, 2012 at 3:44 pm

I don’t think benchmarking by example is getting us beyond the general rule of thumb (“R is slowest, C is fastest”).

Some applications use algebra intensively, others use nested loops or make use of iterative procedures.

I propose that we develop a multidimensional benchmark scale for synthetic tasks focusing on algebra, iterative calculation and so on. Otherwise we will not understand where the specific strengths and weaknesses are.

The poor performance of R in the Gibbs example is not due to slow algebra. In fact, swapping the BLAS library for the optimized version has a close-to-zero effect on most MCMC tasks.
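As a rough idea of what such a multidimensional benchmark could look like, here is a small Python sketch that times an algebra-bound task and a loop-bound task separately. The task names, sizes and the `run_benchmark` helper are arbitrary choices made for illustration, not an established benchmark suite.

```python
import time
import numpy as np

def time_task(fn, repeats=3):
    """Return the best wall-clock time (seconds) over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

def algebra_task(n=200):
    # Dominated by one dense matrix product, i.e. by the BLAS in use.
    rng = np.random.default_rng(0)
    a = rng.standard_normal((n, n))
    return a @ a.T

def loop_task(n=20000):
    # Scalar recursion x_t = 0.5 * x_{t-1} + e_t: no linear algebra,
    # so an optimized BLAS should make essentially no difference here.
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(n)
    x = 0.0
    for e in noise:
        x = 0.5 * x + e
    return x

def run_benchmark():
    """Time each synthetic task and return {task name: seconds}."""
    tasks = {"algebra": algebra_task, "scalar_loop": loop_task}
    return {name: time_task(fn) for name, fn in tasks.items()}
```

Calling `run_benchmark()` gives one number per dimension; porting the same two tasks to R, matlab or C would then give a profile per language rather than a single ranking.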

February 4, 2012 at 8:56 am

> “R is slowest, C is fastest”

sure, but how much?

2 times, 10 times, 100 times?

I think it is important for researchers who spend most of their time prototyping new algorithms to have the right order of magnitude in mind. And even for prototypes, one sometimes needs fast computations – is it still worth learning C today, or can we rely on optimized byte-code?

Other questions I wanted to address: can/should researchers use free software instead of matlab? The interest in python (and sage) in the community is growing: should we use it to teach in the university?

Of course, this comparison is just an element.

February 3, 2012 at 2:19 pm

The runtime of your C++ code will improve when using optimized matrix operations (maybe Eigen or BLAS).

Vanilla R installations seldom use optimized BLAS libraries, while matlab always does. You don’t mention any details on your R installation.

Same comment for python: casual numpy/scipy users seldom bother linking numpy with an optimized BLAS (it was a lot of work when I did that last time).

February 3, 2012 at 11:35 am

Thanks for the pointer to a really interesting comparison.

I’ve often found Matlab to be much closer to, and sometimes better than, naive C code (not calling optimized BLAS). But in those cases my bottlenecks have been much larger matrix operations than dealing with 2×2 transition tables. The sequences of dependent cheap operations in this demo would be annoying to run on a GPU too.

Because none of the codes take that long, we’d really only bother to rewrite code if we were doing many runs. If we could put these runs side by side and run them at the same time, it may be possible to put much more of the computation time inside heavily-optimized matrix libraries. We’d need to write functions that work on multiple observation sequences at once.

If a “parallel” version could be made to work well, one could also try splitting up the really long observation sequences and pretending they were shorter independent sequences, at least during the early stages of learning.
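The side-by-side idea can be sketched in Python/NumPy as a forward filter that advances B sequences of equal length in lockstep, so each time step becomes a (B, K) by (K, K) matrix product. The names `batched_forward_filter` and `split_sequence` are hypothetical, and this is only one possible way to batch the recursion, not the code from the comparison.

```python
import numpy as np

def batched_forward_filter(obs, trans, emit, init):
    """Normalized forward recursion for B sequences at once.

    obs   : (B, T) int array of observed symbols
    trans : (K, K) transition matrix, rows sum to 1
    emit  : (K, M) emission matrix, rows sum to 1
    init  : (K,) initial state distribution
    Returns filt (B, T, K) with filt[b, t] = P(state_t | obs[b, :t+1])
    and loglik (B,), the log-likelihood of each sequence.
    """
    B, T = obs.shape
    K = len(init)
    filt = np.empty((B, T, K))
    loglik = np.zeros(B)
    pred = np.tile(init, (B, 1))                 # (B, K) predictive distribution
    for t in range(T):
        unnorm = pred * emit[:, obs[:, t]].T     # weight by emission probabilities
        c = unnorm.sum(axis=1, keepdims=True)    # per-sequence normalizing constant
        filt[:, t] = unnorm / c
        loglik += np.log(c[:, 0])
        pred = filt[:, t] @ trans                # one matrix product for all B sequences
    return filt, loglik

def split_sequence(obs, chunk_len):
    """Cut one long sequence into equal chunks, dropping any remainder.

    Treating the chunks as independent is an approximation: each chunk
    restarts from the initial distribution.
    """
    n_chunks = len(obs) // chunk_len
    return np.asarray(obs)[: n_chunks * chunk_len].reshape(n_chunks, chunk_len)
```

For example, `batched_forward_filter(split_sequence(long_obs, 1000), trans, emit, init)` would filter a long sequence `long_obs` as if it were many shorter independent ones; the time loop is still there, but it runs 1000 times instead of once per observation.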

February 3, 2012 at 10:16 am

People interested in this may also be interested in my post comparing the speed of various (free) languages for a simple Gibbs sampler:

http://darrenjw.wordpress.com/2011/07/16/gibbs-sampler-in-various-languages-revisited/

February 3, 2012 at 10:28 am

True! I should have mentioned it!!!

February 3, 2012 at 1:41 am

This is very interesting!!!

In the most recent version of Matlab, R2011b, a new tool is included: the Matlab Coder. http://www.mathworks.com/products/matlab-coder/

This might go in the direction of using a C++ translation of Matlab code for cases where many loops are needed.

Perhaps your colleagues could include another case: simulating the samples using the C++ via “the coder” while leaving the filtering and smoothing in pure Matlab by taking advantage of the vectorization tools.