lunes, 13 de octubre de 2014

Unit tests in R made simple

There is noting new under the moon. 

During the last six months, I have been working mostly in R. R is great for research purposes, and I am not participating in these endless discussions about what is cooler: R, Python, Matlab, SAS or you name it. As being priviledged by speaking all of the above mentioned languages with a greater or lesser fluency, I can compare, and therefore I think that it all comes down to what you want to do in the end.

One of the things that I have adopted from my working-exclusively-in-Python experience is is the test-triven development (TDD) paradigm. Now, even writing my research code in R, I can't help creating these tests.

There is actually not much new to say about unit testing, because the topic is extensively covered elsewhere. In my humble opinion, this blog post offers the most awesome coverage of unit testing that I have ever seen.

TDD in general and unit tests in partucular are often neglected by R users - unless they are writing a package.

I think the added value of unit tests for research code cannot be overestimated since, despite popular beliefs of people unfamiliar with R, the language is much more than - how one of my classmates liked to put it - "a sophisticated statistical calculator".

Of course, many-many research findings have been successfully made employing script-based code, but when you have to do similar things multple times, and when you can wrap your code up and make it unfold beautifully with every call, testing is comes in very handy.

R has a certain characteristics: there exist at least one implementation (i.e., package) for almost anything. For some things, there are multiple ways to do them. I don't really know why people reinvent the wheel, but my guess is that when the current state of things is not working for them, they prefer to start from scratch rather than to dig into someone's code.

So, if you are eager to to unit test your thoroughly developed work you can opt for - at least - these three packages:



The last one does not seem to be used very often. The second has the fame because it has been developed by the very Hadley Wickham, and is allegedly used by him in his packages. To those unfamiliar with the name, let me just say that he is the reference R guy, a visualisation guru and the ggplot2 creator. He has a $60 worth book published by Springer Verlag and a stackoverflowing reputation on Stack Overflow.

I am using the first package from the list, RUnit, and not for the reason that is has been created by fellow German people working in field of epidemiology. I do so merely because RUnit is so similar to the unit testing framework of Python that I am already familiar with. It is reportedly alike to the unit testing approach implemented in Java. I don't know Java, so I can't tell. What I can tell that RUnit is great for use. It is clear, comprehensive and disambiguous. Moreover, it comes with a terrific reference manual that is a great read - apart from being informative. It provides a simple yet exhaustive explanation of what unit tests are, why they are helpful and how they differ from integration tests. Also, it provides guidelines on how to write unit tests. It is quite unlikely to encounter a line like:

"Once a bug has been found, add a corresponding test case" 

or like:

"Develop test cases parallel to implementing your functionality. Keep testing all the time (code - test - simplify cycle)."

in an R document (and I've read quite a few of them).

This blog post provides a nice comparison of RUnit and testthat.

Unfortunately, to my knowledge, there exist no implementation of test suits in any of R IDEs. But this is not a major problem, especially for those  R users who, like me, started their journey with R using console only.

So, if you want to define a test suite in R, all you need to do is link the library, defineTestSuite(), runTestSuite() and, if you wish to, printTextProtocol() for your tests.

Like that:





No hay comentarios:

Publicar un comentario