
"Functional" programming in R

Today I found a question on Stack Overflow: Use lapply to modify the data of an xts contained on a list.

There was nothing special about the question itself, but it got me thinking about functional patterns that can and should be used in R. The author wants to replace the first row of multiple xts objects stored in a list.

Preparation code: a list holding 2000 copies of one random xts object (100 rows, 4 columns):

library(xts)
size <- 400
cols <- 4
rows <- size / cols
x <- xts(matrix(rnorm(size), ncol = cols),
         order.by = seq(from = Sys.Date(), by = "day", length.out = rows))
a <- rep(list(x), 2000)

In the first, straightforward version of my answer I just used plain assignment and returned each xts:

b <- lapply(a, function(a) { a[1,] = 1; a })

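There is a lot of redundant work here: subsetting, assignment, and returning the xts back to lapply. So it shouldn't be fast.

Then I remembered that in R you can call operators as ordinary functions by wrapping their names in backticks (``). The new version was more interesting. Here TRUE in the column position selects every column, and the fill value is 3 rather than 1, which does not change the work being done.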

lapply(a, `[<-`, 1, TRUE, 3)
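To make the backtick form less magical, here is a minimal sketch of the equivalence between the usual replacement syntax and a direct call to `[<-`. It assumes a plain matrix as a stand-in for an xts object so that it runs without the xts package; make_m is a hypothetical helper introduced only for this illustration.

```r
# Hypothetical helper: a fresh 3 x 2 matrix of zeros for each use,
# standing in for one xts object (an assumption for this sketch).
make_m <- function() matrix(0, nrow = 3, ncol = 2)

# Usual replacement syntax: overwrite the first row with 3.
m_syntax <- make_m()
m_syntax[1, ] <- 3

# The same replacement as an explicit call to the `[<-` function.
# TRUE in the column position selects every column, like the empty index above.
m_call <- `[<-`(make_m(), 1, TRUE, 3)

# lapply passes the extra arguments (1, TRUE, 3) on to `[<-` for each element.
lst <- lapply(list(make_m(), make_m()), `[<-`, 1, TRUE, 3)

# Compare the results of the two styles.
same_direct <- identical(m_syntax, m_call)
same_lapply <- all(vapply(lst, identical, logical(1), m_syntax))
print(c(direct = same_direct, lapply = same_lapply))
```

The point is that lapply does not need an anonymous wrapper when the function being applied already exists under a name, even a name as odd as `[<-`.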

It's always a good idea to measure the performance of the solutions. Here it is:

microbenchmark::microbenchmark(assign = lapply(a, function(a) { a[1,] = 1; a }),
                               anon_assign = lapply(a, `[<-`, 1, TRUE, 3))

Unit: milliseconds
        expr      min       lq     mean   median       uq       max neval
      assign 38.78712 86.87922 88.41540 87.96786 88.98891 142.31135   100
 anon_assign 32.80542 83.20638 82.45022 84.02763 84.53131  91.72108   100

Wow! A noticeable difference for almost the same code: about 4 ms (roughly 4.5%) at the median.

Almost... But I used = for assignment in the first version (I was writing some Python code at the same time). What difference does arrow assignment make?

microbenchmark::microbenchmark(equal_assign = lapply(a, function(a) { a[1,] = 1; a }),
                               arrow_assign = lapply(a, function(a) { a[1,] <- 1; a }),
                               operator_assign = lapply(a, `[<-`, 1, TRUE, 3))
                               
Unit: milliseconds
            expr      min       lq     mean   median       uq       max neval
    equal_assign 83.40491 87.30866 88.65197 88.35531 89.40709  98.02099   100
    arrow_assign 40.68511 87.93752 88.60578 88.64984 89.63135 105.13063   100
 operator_assign 29.61719 83.57540 82.62361 84.53515 85.14328  94.63283   100

So... there is not much difference between <- and = for assignment (medians of about 88.4 ms and 88.6 ms), but calling [<- directly still wins the match with a median of about 84.5 ms.
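The near-identical timings for = and <- are what one would expect if both spellings perform the same replacement. A quick sanity check, again sketched on a plain matrix rather than an xts object (an assumption to keep it dependency-free), might look like this:

```r
# Two identical starting matrices (stand-ins for xts objects in this sketch).
x_equal <- matrix(0, nrow = 3, ncol = 2)
x_arrow <- matrix(0, nrow = 3, ncol = 2)

# Same first-row replacement, written with each assignment operator.
x_equal[1, ] = 1
x_arrow[1, ] <- 1

# Compare the two results.
same_result <- identical(x_equal, x_arrow)
print(same_result)
```

This only checks that the results match; it says nothing about speed, which is what the benchmark above is for.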
