Some ideas about automatic caching in R

Some calculations take a long time, so it is a good idea to cache their results for later reuse. A naive method is to “save” the object and “load” it when needed. However, this is not convenient when there are many objects to save, say, 1000 matrices generated by simulation and an lm model fit for each matrix.

The result of a function is determined by its arguments, including implicit inputs such as the state of the random seed, variables from the global environment, and the settings in options(). If the same hash (or digest) can be generated for each identical call, the hash value can be used as the name (or key) under which the result is saved. If we meet the same call later, we can simply load the cached result by looking it up in the call-result cache dictionary. I would like to bring the idea of a makefile into R, and record the dependency relationships automatically from the R expressions.

The function “digest” (in the package digest) can generate hash digests of arbitrary R objects.
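The idea above can be sketched roughly as follows. This is only a minimal illustration, not a finished design: the names hash_object, cached_call and cache_dir are hypothetical, the cache lives in the session's temporary directory, and the key covers only the function's code and its explicit arguments (not the seed, global variables or options()). The helper uses digest::digest() when the digest package is installed and otherwise falls back to an MD5 checksum of the serialized object.

```r
# Hash an arbitrary R object. Uses digest::digest() when available;
# otherwise hashes the uncompressed serialized object with tools::md5sum().
hash_object <- function(x) {
  if (requireNamespace("digest", quietly = TRUE)) {
    return(digest::digest(x))
  }
  tmp <- tempfile()
  on.exit(unlink(tmp))
  saveRDS(x, tmp, compress = FALSE)
  unname(tools::md5sum(tmp))
}

# Hypothetical cache location: one .rds file per call, named by its hash.
cache_dir <- file.path(tempdir(), "call-cache")
dir.create(cache_dir, showWarnings = FALSE)

cached_call <- function(fun, ...) {
  args <- list(...)
  # The key is built from the function's code and its explicit arguments only.
  key  <- hash_object(list(code = deparse(fun), args = args))
  path <- file.path(cache_dir, paste0(key, ".rds"))
  if (file.exists(path)) {
    return(readRDS(path))        # same call seen before: load the result
  }
  result <- do.call(fun, args)   # first time: compute and store
  saveRDS(result, path)
  result
}

# Usage: the second call is served from the cache file.
fit1 <- cached_call(function(n) cumsum(seq_len(n)), 5)
fit2 <- cached_call(function(n) cumsum(seq_len(n)), 5)
identical(fit1, fit2)
```

Because the key ignores global variables and the random seed, a function that depends on them would get a stale result from this sketch.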

Some possible issues

  • Digest of an environment
  • Weak references to environments in objects
  • Objects that include variables requiring name lookup in the global environment or some other namespace. The package codetools is helpful for checking this problem
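As a small illustration of the last point, codetools::findGlobals() lists the names a function looks up outside its own body. The function scale_by_k and the variable k below are made up for the example.

```r
k <- 10
scale_by_k <- function(x) x * k   # depends on the global variable k

globals <- codetools::findGlobals(scale_by_k, merge = FALSE)
globals$variables   # global variables used by the function, here "k"
globals$functions   # global functions called, including operators
```

A cache key could then include the values of the variables reported here, so that changing k invalidates the cached result.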

Resource

The package on https://github.com/rdpeng/ might be helpful.


Get the expressions of arguments

Getting the expression of an argument is quite useful. The expression can be saved for later use, or shown to the user.

When a function is called, the formal arguments are bound to promise objects. Promise objects are part of R’s lazy evaluation mechanism. A promise object is composed of an expression, a value and an environment. The stored expression is evaluated in that environment, and the result stored in the value, only when the promise object is first accessed.
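A small sketch of this lazy behaviour, using made-up function names: the argument expression has a visible side effect, so we can see whether the promise was ever evaluated.

```r
evaluated <- FALSE

ignore_arg <- function(x) "x was never touched"
use_arg    <- function(x) { force(x); "x was forced" }

# The promise for x is created but never accessed, so its expression
# (including the assignment) is not run.
ignore_arg({ evaluated <- TRUE; 1 })
after_ignore <- evaluated    # FALSE

# force(x) accesses the promise, so the expression is evaluated
# in the caller's environment.
use_arg({ evaluated <- TRUE; 1 })
after_use <- evaluated       # TRUE
```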

The function substitute can be used to get the expression of an argument. It returns the parse tree of an expression after substituting the variables bound in the given environment; when a variable is a promise, such as a function argument, it is replaced by the expression stored in the promise. The documentation of substitute gives an example that uses substitute to get the expression and deparse to turn the expression into a string:

myplot <- function(x, y){
  plot(x, y, xlab=deparse(substitute(x)),
       ylab=deparse(substitute(y)))
}

Get the value of … (dot-dot-dot)

The ‘…’ object of R is stored as a kind of pairlist. To check its value, the following code can be used:

getDots <- function(...){
  list(...)
}
getDots(X=3, Y=1:4)
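Note that list(...) evaluates the arguments. If the unevaluated expressions are wanted instead, one possible sketch combines substitute with the dots; the name dotExprs is made up for this example.

```r
dotExprs <- function(...) {
  # substitute(list(...)) gives the call list(X = ..., Y = ...) unevaluated;
  # dropping the first element removes the function name `list`.
  as.list(substitute(list(...)))[-1]
}

exprs <- dotExprs(X = a + b, Y = 1:4)   # a and b need not exist
names(exprs)        # "X" "Y"
deparse(exprs$X)    # "a + b"
```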

Get array index of max value of a matrix or array

The function “which” returns either a vector index or an array index, depending on the parameter “arr.ind”. However, “which.max” has no such parameter, so we need the function arrayInd to convert a vector index to an array index.

> (a <- matrix(c(3:1, 9:1), nrow=3, ncol=4))
       [,1] [,2] [,3] [,4]
[1,]    3    9    6    3
[2,]    2    8    5    2
[3,]    1    7    4    1
> arrayInd(which.max(a), .dim=dim(a))
      [,1] [,2]
[1,]    1    2
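An alternative sketch uses “which” directly with arr.ind. Unlike which.max, it returns every position where the maximum occurs, not only the first one.

```r
a <- matrix(c(3:1, 9:1), nrow = 3, ncol = 4)

# Logical matrix marking the maximum, converted to (row, col) indices.
pos <- which(a == max(a), arr.ind = TRUE)
pos   # one row per maximum, with columns "row" and "col"
```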

by – Apply a Function to a Data Frame Split by Factors

Consider the data MathAchieve in the package MEMSS, where we want to get the mean value of MathAch for males and females.

> str(MathAchieve)
'data.frame':	7185 obs. of  6 variables:
$ School  : Factor w/ 160 levels "1224","1288",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Minority: Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
$ Sex     : Factor w/ 2 levels "Female","Male": 1 1 2 2 2 2 1 2 1 2 ...
$ SES     : num  -1.528 -0.588 -0.528 -0.668 -0.158 ...
$ MathAch : num  5.88 19.71 20.35 8.78 17.9 ...
$ MEANSES : num  -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428 -0.428
> (aveAchiSex <- with(MathAchieve, by(MathAch, Sex, mean)))
Sex: Female
[1] 11.94752
------------------------------------------------------------
Sex: Male
[1] 13.64380

Although the return value has class “by”, underneath it is just an array or a list, so we can get a plain matrix with cbind or rbind. Even with this trick, the return value is still not so convenient to handle, as the names of the variables (MathAch and Sex) are lost.

> with(MathAchieve, cbind(by(MathAch, Sex, mean)))
           [,1]
Female 11.94752
Male   13.64380
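If a data.frame that keeps the variable names is wanted, aggregate with a formula is one possible alternative. The sketch below uses a tiny made-up data frame (toy) with the same column names so that it runs without MEMSS; with the real data one would pass data = MathAchieve instead.

```r
# Hypothetical stand-in for MathAchieve with only the two columns needed.
toy <- data.frame(
  Sex     = factor(c("Female", "Male", "Female", "Male")),
  MathAch = c(10, 12, 14, 16)
)

# One row per level of Sex, with columns named Sex and MathAch.
aveAch <- aggregate(MathAch ~ Sex, data = toy, FUN = mean)
aveAch
```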

Embed R into LaTeX?

I plan to use \LaTeX to take notes. As a statistics student I use R a lot, so embedding R into \LaTeX and running the R code automatically when compiling the \LaTeX document would be very helpful. Searching with Google, I found some documents about embedding Python into a \LaTeX document, but nothing about R. I believe it is not hard to modify the python.sty file for R code. I’ll try it after learning \LaTeX.

