Installation
Don't install R or RStudio from your system package manager; it's a waste of time. It will work, of course, and you'll be able to run hello world, but soon you will need some external libraries. Some of them will be outdated and others will have conflicting dependencies, so the installation will fail. At least that's the case with Ubuntu 14.04.
rJava problem
Some libraries require Java. If you have a problem with the 'rJava' library, it's possible that your R installation looks by default for a different (older) Java version than the one you actually have installed. In this case you may try:
sudo R CMD javareconf
as described here: http://stackoverflow.com/a/31316527/1100135
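After reconfiguring, it can help to confirm which Java the R session actually picks up. The following is a minimal sketch, assuming the rJava package is installed (it is guarded so that it does nothing otherwise); reading the `java.version` property is just one way to check:

```r
# JAVA_HOME as seen by the R session (may be empty if it is not set).
java.home <- Sys.getenv("JAVA_HOME")
print(java.home)

# Only try the JVM if rJava is available (assumption: it is installed).
if (requireNamespace("rJava", quietly = TRUE)) {
  rJava::.jinit()  # start the JVM inside the R process
  # Ask the running JVM for its version string.
  print(rJava::.jcall("java/lang/System", "S", "getProperty", "java.version"))
}
```

If the printed version is not the one you expect, R is still pointing at the other Java installation.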
Changing locale
If, for any reason, you can't change the locale from inside R, you can run the whole of R with a different locale:
LC_ALL=C rstudio
You can read more about it using:
man setlocale
Still, it won't let you use a few different locales at once.
Building / transforming formulas
At some point you will want to use the power of lazy evaluation and build/transform formulas instead of providing them by hand. Two functions will be useful: substitute and as.formula. Let's say we want to build a function that takes all the predictors (or, more generally, some part of a formula) and adds the response variable t (the other part of the formula):
make.formula <- function(x) as.formula(substitute(t ~ x))
and now we can call it using:
new.formula <- make.formula(x+y*z)
str(new.formula)
to get:
Class 'formula' length 3 t ~ x + y * z
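The same idea can be extended in two directions. The sketch below reuses the make.formula helper from above, then shows reformulate() for the case where predictor names arrive as strings, and update() for transforming an existing formula. Both are base R (stats) functions; the predictor names used here are only illustrative.

```r
# Same helper as in the post: substitute() captures the unevaluated argument.
make.formula <- function(x) as.formula(substitute(t ~ x))
f1 <- make.formula(x + y * z)

# When predictor names are strings, reformulate() builds the formula.
predictors <- c("x", "y:z")  # illustrative term labels
f2 <- reformulate(predictors, response = "t")

# update() transforms an existing formula; "." stands for the old side.
f3 <- update(f1, . ~ . - x)  # drop the term x from the right-hand side

print(f1)
print(f2)
print(f3)
```

substitute() is handy when you type the expression yourself, while reformulate() fits better when the list of predictors is computed at run time.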
Tuning knitr rendering
Each code chunk {r } accepts optional parameters that allow you to control, for example, whether the code is executed, whether diagnostic messages are rendered, whether the computation is cached, and whether each command prints its output or the whole output is displayed at the end. Sample:
```{r cache=TRUE, message=FALSE, results='hold'}
library(randomForest)
system.time(fit <- randomForest(classe ~ ., data=training))
fit
```
It will exclude the diagnostic messages from loading the library, cache the trained model and display the whole output at the end. Run ?opts_chunk (after library(knitr)) to see the reference page of available options and links to the online documentation.
For inline R do:
`r 2 + 3 * x`
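To avoid repeating the same options in every chunk, knitr also lets you set defaults once, typically in a setup chunk at the top of the document. A minimal sketch, assuming knitr is installed (the guard makes it a no-op otherwise) and reusing the option values from the sample chunk:

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  # Defaults for every later chunk; individual chunks can still override them.
  knitr::opts_chunk$set(cache = TRUE, message = FALSE, results = "hold")
  # Read one option back to check that it was applied.
  print(knitr::opts_chunk$get("results"))
}
```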
Benchmarking
To time a single computation use:
system.time(x <- expensive.function())
or to compare multiple computations:
library(rbenchmark)
benchmark(x <- expensive.function1(), y <- expensive.function2())
The above code will do the actual measurement and will also assign the new variables in the current environment.
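A self-contained way to try this with base R only is sketched below. The expensive.function defined here is a hypothetical stand-in for your own slow code, and the three repetitions are an arbitrary choice:

```r
# Hypothetical stand-in for a slow computation.
expensive.function <- function() {
  Sys.sleep(0.1)
  sum(sqrt(1:1e6))
}

# system.time() evaluates the expression in the calling environment,
# so x exists afterwards.
timing <- system.time(x <- expensive.function())
print(timing[["elapsed"]])  # wall-clock seconds
print(exists("x"))

# Repeat the measurement a few times to smooth out noise.
runs <- replicate(3, system.time(expensive.function())[["elapsed"]])
print(mean(runs))
```

A single measurement can be noisy, which is why averaging a few runs (or using a benchmarking package) tends to give a more reliable number.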
Training prediction models with Caret
train delegates to another prediction method based on the method argument. Often it's much faster to call the underlying method directly. We may lose all of caret's meta-parameter tuning, but the model we get is often still good enough, while training is orders of magnitude faster. E.g.:
train(y ~ x, data=training, method='rf')   # via caret, with tuning
randomForest(y ~ x, data=training)         # direct call, no tuning
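To see the difference on your own machine, the two calls can be timed side by side. This is a rough sketch, assuming both caret and randomForest are installed (it is skipped otherwise) and using the built-in iris data set as a stand-in for your training data; the actual speed-up will depend on the data and on caret's resampling settings.

```r
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("randomForest", quietly = TRUE)) {
  library(caret)
  library(randomForest)
  set.seed(1)
  # caret: resamples and tunes mtry before the final fit.
  t.caret <- system.time(
    fit.caret <- train(Species ~ ., data = iris, method = "rf")
  )
  # Direct call: a single fit with default parameters.
  t.direct <- system.time(
    fit.direct <- randomForest(Species ~ ., data = iris)
  )
  # Elapsed wall-clock seconds for both approaches.
  print(c(caret = t.caret[["elapsed"]], direct = t.direct[["elapsed"]]))
}
```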