R Programming week 3-Loop functions
Looping on the Command Line
Writing for and while loops is useful when programming, but not particularly easy when working interactively on the command line. R provides several functions that implement looping to make life easier:
lapply: Loop over a list and evaluate a function on each element
sapply: Same as lapply but try to simplify the result
apply: Apply a function over the margins of an array
tapply: Apply a function over subsets of a vector
mapply: Multivariate version of lapply
An auxiliary function, split, is also useful, particularly in conjunction with lapply.
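As a rough illustration of the trade-off, the sketch below computes the same per-element means twice: once with an explicit for loop and once with lapply. The list x and the names out and res are made up for this example.

```r
# Hypothetical input: a named list of numeric vectors
x <- list(a = 1:5, b = c(2, 4, 6))

# Explicit for loop: preallocate a list, then fill it element by element
out <- vector("list", length(x))
names(out) <- names(x)
for (nm in names(x)) {
  out[[nm]] <- mean(x[[nm]])
}

# Loop function: one line, same result
res <- lapply(x, mean)

identical(out, res)   # TRUE
```

The loop version needs bookkeeping (preallocation, names, indexing) that lapply handles for you, which is what makes it convenient at the prompt.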
lapply
lapply takes three arguments: (1) a list X; (2) a function (or the name of a function) FUN; (3) other arguments via its ... argument. If X is not a list, it will be coerced to a list using as.list.
## function (X, FUN, ...)
## {
## FUN <- match.fun(FUN)
## if (!is.vector(X) || is.object(X))
## X <- as.list(X)
## .Internal(lapply(X, FUN))
## }
## <bytecode: 0x7ff7a1951c00>
## <environment: namespace:base>
The actual looping is done internally in C code.
lapply always returns a list, regardless of the class of the input.
> x <- list(a = 1:5, b = rnorm(10))
> lapply(x, mean)
> x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))
> lapply(x, mean)
> x <- 1:4
> lapply(x, runif)
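The last call runs runif(1), runif(2), runif(3) and runif(4), so each element of the result is one value longer than the previous one. Extra arguments for the applied function can be supplied through the ... argument of lapply. A small sketch, where the bounds 0 and 10 are arbitrary choices for illustration:

```r
x <- 1:4

# min and max are not arguments of lapply; they are forwarded to runif()
res <- lapply(x, runif, min = 0, max = 10)

lengths(res)                                # 1 2 3 4
all(unlist(res) >= 0 & unlist(res) <= 10)   # TRUE
```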
lapply and friends make heavy use of anonymous functions.
> x <- list(a = matrix(1:4, 2, 2), b = matrix(1:6, 3, 2))
> x
$a
     [,1] [,2]
[1,]    1    3
[2,]    2    4

$b
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
An anonymous function for extracting the first column of each matrix.
> lapply(x, function(elt) elt[,1])
$a
[1] 1 2
$b
[1] 1 2 3
sapply
> x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))
> lapply(x, mean)
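The call above uses lapply, so it returns a list. The sketch below shows how sapply treats the same kind of input. It uses a small fixed list (made up for this example) so that the results are reproducible: results of length 1 are simplified to a vector, results of equal length greater than 1 are simplified to a matrix, and anything else stays a list.

```r
x <- list(a = 1:4, b = c(10, 20), c = 5)

l <- lapply(x, mean)    # list of three length-1 results
s <- sapply(x, mean)    # simplified to a named numeric vector

is.list(l)              # TRUE
s                       # a = 2.5, b = 15, c = 5

# Every result has length 2, so sapply builds a matrix
r <- sapply(x, range)
dim(r)                  # 2 3 (one column per list element)
```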
apply
apply is used to evaluate a function (often an anonymous one) over the margins of an array.
It is most often used to apply a function to the rows or columns of a matrix.
It can be used with general arrays, e.g. taking the average of an array of matrices.
It is not really faster than writing a loop, but it works in one line!
> str(apply)
function (X, MARGIN, FUN, ...)
X is an array
MARGIN is an integer vector indicating which margins should be “retained”.
FUN is a function to be applied
... is for other arguments to be passed to FUN
> x <- matrix(rnorm(200), 20, 10)
> apply(x, 2, mean)
[1] 0.04868268 0.35743615 -0.09104379
[4] -0.05381370 -0.16552070 -0.18192493
[7] 0.10285727 0.36519270 0.14898850
[10] 0.26767260
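Because the matrix above is random, it can be hard to see what MARGIN does. A minimal sketch with a small fixed matrix (chosen only for illustration) makes the row and column cases easy to check by hand:

```r
m <- matrix(1:6, nrow = 2, ncol = 3)   # columns are (1,2), (3,4), (5,6)

row_sums <- apply(m, 1, sum)   # MARGIN = 1 retains rows:    9 12
col_sums <- apply(m, 2, sum)   # MARGIN = 2 retains columns: 3 7 11
```

The result has one value per retained margin: two for the rows, three for the columns.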
col/row sums and means
For sums and means of matrix dimensions, we have some shortcuts.
rowSums = apply(x, 1, sum)
rowMeans = apply(x, 1, mean)
colSums = apply(x, 2, sum)
colMeans = apply(x, 2, mean)
The shortcut functions are much faster, but you won’t notice unless you’re using a large matrix.
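One way to convince yourself that the shortcuts match their apply counterparts is to compare the results directly. The seed below is arbitrary and only makes the sketch reproducible:

```r
set.seed(1)   # arbitrary seed, for reproducibility only
x <- matrix(rnorm(200), 20, 10)

same_row_sums  <- all.equal(rowSums(x),  apply(x, 1, sum))
same_row_means <- all.equal(rowMeans(x), apply(x, 1, mean))
same_col_sums  <- all.equal(colSums(x),  apply(x, 2, sum))
same_col_means <- all.equal(colMeans(x), apply(x, 2, mean))

c(same_row_sums, same_row_means, same_col_sums, same_col_means)   # all TRUE
```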
Other Ways to Apply
Quantiles of the rows of a matrix.
> x <- matrix(rnorm(200), 20, 10)
> apply(x, 1, quantile, probs = c(0.25, 0.75))
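When the applied function returns more than one value, apply stacks the results as columns. The sketch below checks the shape of the quantile result and also shows the array case mentioned earlier: averaging a stack of matrices by retaining margins 1 and 2. The 2 x 2 x 10 array a is made up for this example.

```r
x <- matrix(rnorm(200), 20, 10)
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))
dim(q)        # 2 20 (one column per row of x)
rownames(q)   # "25%" "75%"

# Average of ten 2 x 2 matrices stored in a 3-dimensional array
a <- array(rnorm(2 * 2 * 10), c(2, 2, 10))
avg <- apply(a, c(1, 2), mean)       # collapse the third dimension
dim(avg)                             # 2 2
all.equal(avg, rowMeans(a, dims = 2))   # TRUE
```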
mapply
mapply is a multivariate apply of sorts which applies a function in parallel over a set of arguments.
> str(mapply)
function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)
FUN is a function to apply
... contains arguments to apply over
MoreArgs is a list of other arguments to FUN
SIMPLIFY indicates whether the result should be simplified
The following is tedious to type
list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))
Instead we can do
> mapply(rep, 1:4, 4:1)
Vectorizing a Function
> noise <- function(n, mean, sd) {
+ rnorm(n, mean, sd)
+ }
> noise(5, 1, 2)
[1] 2.4831198 2.4790100 0.4855190 -1.2117759
[5] -0.2743532
> noise(1:5, 1:5, 2)
[1] -4.2128648 -0.3989266 4.2507057 1.1572738
[5] 3.7413584
Instant Vectorization
> mapply(noise, 1:5, 1:5, 2)
Which is the same as
list(noise(1, 1, 2), noise(2, 2, 2), noise(3, 3, 2), noise(4, 4, 2), noise(5, 5, 2))
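A short sketch of what the mapply call returns. Because the five results have different lengths, they cannot be simplified and come back as a list. The second call is one possible variant (not from the original notes) that names the varying arguments and passes the constant sd through MoreArgs:

```r
noise <- function(n, mean, sd) {
  rnorm(n, mean, sd)
}

res <- mapply(noise, 1:5, 1:5, 2)
is.list(res)    # TRUE
lengths(res)    # 1 2 3 4 5

# Same idea with named arguments; sd is held fixed via MoreArgs
res2 <- mapply(noise, n = 1:5, mean = 1:5,
               MoreArgs = list(sd = 2), SIMPLIFY = FALSE)
lengths(res2)   # 1 2 3 4 5
```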
tapply
tapply is used to apply a function over subsets of a vector. I don’t know why it’s called tapply.
> str(tapply)
function (X, INDEX, FUN = NULL, ..., simplify = TRUE)
X is a vector
INDEX is a factor or a list of factors (or else they are coerced to factors)
FUN is a function to be applied
... contains other arguments to be passed to FUN
simplify, should we simplify the result?
Take group means.
> x <- c(rnorm(10), runif(10), rnorm(10, 1))
> f <- gl(3, 10)
> f
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3
[24] 3 3 3 3 3 3 3
Levels: 1 2 3
> tapply(x, f, mean)
1 2 3
0.1144464 0.5163468 1.2463678
Take group means without simplification.
> tapply(x, f, mean, simplify = FALSE)
$`1`
[1] 0.1144464
$`2`
[1] 0.5163468
$`3`
[1] 1.246368
Find group ranges.
> tapply(x, f, range)
$`1`
[1] -1.097309 2.694970
$`2`
[1] 0.09479023 0.79107293
$`3`
[1] 0.4717443 2.5887025
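The same calls on a small fixed vector (values and group labels chosen only for illustration) make it easy to verify the grouping by hand:

```r
x <- c(1, 2, 3, 10, 20, 30)
f <- factor(c("a", "a", "a", "b", "b", "b"))

m <- tapply(x, f, mean)    # a = 2, b = 20
r <- tapply(x, f, range)   # list-like: a is 1 3, b is 10 30

m
r
```

mean returns a single value per group, so the result simplifies to a named vector; range returns two values per group, so the result stays list-like.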
split
split takes a vector or other object and splits it into groups determined by a factor or list of factors.
> str(split)
function (x, f, drop = FALSE, ...)
x is a vector (or list) or data frame
f is a factor (or coerced to one) or a list of factors
drop indicates whether empty factor levels should be dropped
A common idiom is split followed by an lapply.
> lapply(split(x, f), mean)
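To see what the idiom does step by step, here is a sketch on a small fixed vector (made up for this example). split always returns a list, one element per factor level, which is why it pairs naturally with lapply or sapply:

```r
x <- c(1, 2, 3, 10, 20, 30)
f <- gl(2, 3)              # levels 1 1 1 2 2 2

s <- split(x, f)           # list with elements `1` and `2`
names(s)                   # "1" "2"

lapply(s, mean)            # list:  `1` = 2, `2` = 20
means <- sapply(s, mean)   # named vector with the same values
```

For a single summary value per group this gives the same numbers as tapply(x, f, mean); the split form is more flexible when the per-group step is more involved.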
Splitting a Data Frame
> library(datasets)
> head(airquality)
> s <- split(airquality, airquality$Month)
> lapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))
> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))
> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")], na.rm = TRUE))
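The difference between the last two calls is the handling of missing values: airquality contains NAs in Ozone and Solar.R, and colMeans returns NA for any column with a missing value unless na.rm = TRUE is given. A sketch that checks this (the helper names cols, with_na and no_na are made up here):

```r
library(datasets)

s <- split(airquality, airquality$Month)
cols <- c("Ozone", "Solar.R", "Wind")

with_na <- sapply(s, function(d) colMeans(d[, cols]))
no_na   <- sapply(s, function(d) colMeans(d[, cols], na.rm = TRUE))

dim(no_na)       # 3 5 (three variables, five months)
anyNA(with_na)   # TRUE
anyNA(no_na)     # FALSE
```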
Splitting on More than One Level
> x <- rnorm(10)
> f1 <- gl(2, 5)
> f2 <- gl(5, 2)
Interactions can create empty levels.
> str(split(x, list(f1, f2)))
Empty levels can be dropped
> str(split(x, list(f1, f2), drop = TRUE))
List of 6
$ 1.1: num [1:2] -0.378 0.445
$ 1.2: num [1:2] 1.4066 0.0166
$ 1.3: num -0.355
$ 2.3: num 0.315
$ 2.4: num [1:2] -0.907 0.723
$ 2.5: num [1:2] 0.732 0.360
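Using 1:10 instead of random numbers (a substitution made only for this sketch) makes it easy to see which combinations of f1 and f2 are empty and what drop = TRUE removes:

```r
x <- 1:10
f1 <- gl(2, 5)   # 1 1 1 1 1 2 2 2 2 2
f2 <- gl(5, 2)   # 1 1 2 2 3 3 4 4 5 5

all_groups <- split(x, list(f1, f2))
length(all_groups)                 # 10 combinations of levels
sum(lengths(all_groups) == 0)      # 4 of them are empty

kept <- split(x, list(f1, f2), drop = TRUE)
names(kept)   # "1.1" "1.2" "1.3" "2.3" "2.4" "2.5"
```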