[Week 3] Loop Functions & Debugging Tools
Loop Functions
lapply: Loop over a list and evaluate a function on each element
sapply: Same as lapply but try to simplify the result
apply: Apply a function over the margins of an array
tapply: Apply a function over subsets of a vector
mapply: Multivariate version of lapply
rnorm function (The Normal Distribution)
rnorm generates random draws from a normal distribution; the related functions dnorm, pnorm, and qnorm give the density, distribution function, and quantile function.
Loop Functions - lapply
lapply takes three arguments:
(1) a list x;
(2) a function (or the name of a function) FUN;
(3) other arguments via its ... argument.
If x is not a list, it will be coerced to a list using as.list.
lapply always returns a list, regardless of the class of the input.
> x <- list(a=1:5, b = rnorm(10))
> x
$a
[1] 1 2 3 4 5
$b
[1] 0.34766773 1.88039654 -0.29986269 1.88896873 0.07806339 -1.63535799
[7] 1.12373391 0.66304757 0.64747795 -0.38855335
> lapply(x,mean)
$a
[1] 3
$b
[1] 0.4305582
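For comparison with the lapply(x, mean) call above, here is a small sketch of how sapply simplifies the same kind of result. The list uses fixed values instead of rnorm so that the output is reproducible; the values and the names means_list and means_vec are chosen only for illustration.

```r
# A list of numeric vectors with fixed values (illustrative, not from the session above)
x <- list(a = 1:4, b = c(2, 4, 6), c = 10)

# lapply always returns a list, one element per input element
means_list <- lapply(x, mean)

# sapply simplifies to a named numeric vector because every result has length 1
means_vec <- sapply(x, mean)

print(means_list)
print(means_vec)
```

When the results cannot be simplified (for example, when they have different lengths), sapply falls back to returning a list.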
> x <- list(a=1:4, b=rnorm(10), c=rnorm(20,1), d=rnorm(100,5))
> lapply(x,mean)
$a
[1] 2.5
$b
[1] 0.0315751
$c
[1] 1.193494
$d
[1] 4.999784
> x <- 1:4
> lapply(x, runif)
[[1]]
[1] 0.1516973
[[2]]
[1] 0.5303134 0.7188454
[[3]]
[1] 0.61570965 0.03625812 0.79371658
[[4]]
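[1] 0.06210734 0.59349463 0.83711023 0.38416463
The runif function generates random values from a uniform distribution, so the call above runs runif(1), runif(2), runif(3), and runif(4).
Now suppose I want to generate uniform values between zero and ten instead of the default zero to one.
To do that, we pass the min and max arguments through the ... argument of lapply.
lapply hands these extra arguments on to runif every time it calls it.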
> x<-1:4
> lapply(x,runif,min=0,max=10)
[[1]]
[1] 1.384929
[[2]]
[1] 4.474732 3.952107
[[3]]
[1] 2.406658 5.489504 7.572002
[[4]]
[1] 5.534824 0.325385 4.289476 4.976774
An anonymous function for extracting the first column of each matrix.
> x<- list(a=matrix(1:4,2,2), b=matrix(1:6,3,2))
> x
$a
     [,1] [,2]
[1,]    1    3
[2,]    2    4
$b
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
> lapply(x, function(elt) elt[,1])
$a
[1] 1 2
$b
[1] 1 2 3
Loop Functions - apply
apply is used to evaluate a function (often an anonymous one) over the margins of an array.
It is most often used to apply a function to the rows or columns of a matrix.
It can be used with general arrays, e.g. taking the average of an array of matrices.
It is not really faster than writing a loop, but it works in one line!
> str(apply)
function (X, MARGIN, FUN, ...)
- X is an array
- MARGIN is an integer vector indicating which margins should be “retained”.
- FUN is a function to be applied
- ... is for other arguments to be passed to FUN
![](https://t1.daumcdn.net/cfile/tistory/225BB63555B885EE31)
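As a text version of how MARGIN works, here is a minimal sketch on a small matrix with known values. The matrix and the names row_totals and col_means are chosen for illustration and are not taken from the screenshot.

```r
# A small 3 x 4 matrix with known values (chosen for illustration)
x <- matrix(1:12, nrow = 3, ncol = 4)

# MARGIN = 1 retains the rows: one result per row
row_totals <- apply(x, 1, sum)

# MARGIN = 2 retains the columns: one result per column
col_means <- apply(x, 2, mean)

print(row_totals)  # 22 26 30
print(col_means)   # 2 5 8 11
```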
For sums and means of matrix dimensions, we have some shortcuts.
- rowSums = apply(x, 1, sum)
- rowMeans = apply(x, 1, mean)
- colSums = apply(x, 2, sum)
- colMeans = apply(x, 2, mean)
These shortcut functions are much faster, but you won't notice unless you're using a large matrix.
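A quick way to convince yourself that the shortcuts match the apply versions is to compare them on a small matrix. This is only a sketch; the variable names are made up for the example.

```r
x <- matrix(1:12, nrow = 3, ncol = 4)

# all.equal compares values with a small numeric tolerance
same_row_sums  <- isTRUE(all.equal(rowSums(x),  apply(x, 1, sum)))
same_row_means <- isTRUE(all.equal(rowMeans(x), apply(x, 1, mean)))
same_col_sums  <- isTRUE(all.equal(colSums(x),  apply(x, 2, sum)))
same_col_means <- isTRUE(all.equal(colMeans(x), apply(x, 2, mean)))

print(c(same_row_sums, same_row_means, same_col_sums, same_col_means))
```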
Quantiles of the rows of a matrix.
![](https://t1.daumcdn.net/cfile/tistory/2306743855B886E90F)
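In text form, row quantiles can be sketched as follows. The matrix size, the seed, and the choice of the 25% and 75% quantiles are assumptions made for this example; the screenshot may use different values.

```r
set.seed(1)  # fixed seed, chosen only to make the sketch reproducible
x <- matrix(rnorm(200), nrow = 20, ncol = 10)

# Extra arguments (here probs) are passed on to quantile through ...
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))

print(dim(q))       # 2 rows (one per quantile), 20 columns (one per row of x)
print(rownames(q))  # "25%" "75%"
```

Because quantile returns two values for each row, apply arranges the results as a matrix with one column per row of x.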
Loop Functions - mapply
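The overview at the top describes mapply as the multivariate version of lapply: it walks over several arguments in parallel. A minimal sketch of that idea, using inputs that are not from the original notes:

```r
# mapply calls rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1)
res <- mapply(rep, 1:4, 4:1)
print(res)

# With results of equal length, mapply simplifies to a vector
sums <- mapply(function(a, b) a + b, 1:3, 4:6)
print(sums)  # 5 7 9
```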
Loop Functions - tapply
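The overview at the top describes tapply as applying a function over subsets of a vector, where the subsets are defined by a factor. A minimal sketch with made-up data:

```r
x <- c(1, 2, 3, 10, 20, 30)
g <- factor(c("a", "a", "a", "b", "b", "b"))  # group label for each element of x

# One mean per level of g
group_means <- tapply(x, g, mean)
print(group_means)  # a = 2, b = 20
```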
Loop Functions - split
split takes a vector or another object and splits it into groups determined by a factor or a list of factors.
< Splitting a Data Frame >
![](https://t1.daumcdn.net/cfile/tistory/2316ED4855B9907721)
Each piece of the split is processed with lapply.
There are five columns.
![](https://t1.daumcdn.net/cfile/tistory/2377C15055B9B16A14)
Instead of using lapply, we can use sapply to simplify the result.
sapply puts all these numbers into a matrix with three rows and, in this case, five columns.
This shows the mean of each of the three variables in a much more compact format: a matrix instead of a list.
Of course, many of the values are still NA because of the missing values in the original data.
One fix is to pass the na.rm argument in the call so that missing values are removed before each mean is computed.
Here you can see the monthly means.
![](https://t1.daumcdn.net/cfile/tistory/270C1F3E55B9B66834)
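The screenshots are not reproduced as text here. Assuming they follow the usual example with the built-in airquality data set (an assumption based on the "monthly means" and the three variables mentioned above), the whole workflow can be sketched like this:

```r
# airquality ships with R in the datasets package
s <- split(airquality, airquality$Month)

# One column of means per month; the rows are the three variables
monthly <- sapply(s, function(d) {
  colMeans(d[, c("Ozone", "Solar.R", "Wind")], na.rm = TRUE)
})

print(dim(monthly))  # 3 rows, 5 columns (months 5 to 9)
print(monthly)
```

Without na.rm = TRUE, any month with a missing Ozone or Solar.R value would produce NA for that variable.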
< Splitting on More than One Level >
![](https://t1.daumcdn.net/cfile/tistory/2609543D55B9C6C209)
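Splitting on more than one level means passing a list of factors, so that each combination of levels becomes its own group. A small sketch with generated factors (the data here are illustrative and may differ from the screenshot):

```r
x <- 1:10
f1 <- gl(2, 5)  # two levels, each repeated 5 times
f2 <- gl(5, 2)  # five levels, each repeated 2 times

# Passing a list of factors splits on every combination of levels
groups <- split(x, list(f1, f2))
print(length(groups))       # 10 combinations, some of them empty

# drop = TRUE discards the combinations that never occur
groups_used <- split(x, list(f1, f2), drop = TRUE)
print(length(groups_used))  # 6
```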
Debugging
Types of status information
- message: A generic notification/diagnostic message produced by the message function; execution of the function continues
- warning: An indication that something is wrong but not necessarily fatal; execution of the function continues; generated by the warning function.
- error: An indication that a fatal problem has occurred; execution stops; produced by the stop function
- condition: A generic concept for indicating that something unexpected can occur; programmers can create their own conditions
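The three kinds of signals can be seen side by side in a short sketch. The functions check_value and classify are hypothetical helpers written for this example; tryCatch is the base R mechanism for reacting to conditions.

```r
check_value <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")  # error: execution stops
  if (x < 0) warning("x is negative")            # warning: execution continues
  message("checking done")                       # message: execution continues
  invisible(x)
}

# tryCatch lets the caller react to each kind of condition
classify <- function(x) {
  tryCatch({
    check_value(x)
    "ok"
  },
  warning = function(w) paste("warning:", conditionMessage(w)),
  error   = function(e) paste("error:", conditionMessage(e)))
}

print(classify(1))
print(classify(-1))
print(classify("a"))
```

Note that once tryCatch handles the warning, the rest of check_value is not run, so the message is not printed in that case.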
Tools available for debugging
- traceback: prints out the function call stack after an error occurs; does nothing if there's no error.
- debug: flags a function for "debug" mode which allows you to step through execution of a function one line at a time.
- browser: suspends the execution of a function wherever it is called and puts the function in debug mode.
- trace: allows you to insert debugging code into a function at specific places.
- recover: allows you to modify the error behavior so that you can browse the function call stack.
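Most of these tools are interactive, so they are hard to show in a script. The sketch below sets up a nested error with two hypothetical functions (inner_fn and outer_fn) and lists, as comments, how the tools would typically be used at the console.

```r
# Hypothetical helper functions, used only to produce a nested error
inner_fn <- function(x) log(x)          # fails for non-numeric input
outer_fn <- function(x) inner_fn(x) + 1

# try() captures the error so that the script keeps running
result <- try(outer_fn("a"), silent = TRUE)
print(inherits(result, "try-error"))

# In an interactive session one might then use:
#   outer_fn("a"); traceback()   # show the call stack of the last error
#   debug(outer_fn); outer_fn(2) # step through outer_fn one line at a time
#   undebug(outer_fn)            # switch debug mode off again
#   options(error = recover)     # browse the call stack whenever an error occurs
```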
Programming Assignment 2
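One unusual aspect is that the assignment uses peer assessments: the people who submit it evaluate one another's work.
This seems to be intended, in its own way, to discourage cheating.
You must grade at least one other person's assignment; otherwise you receive a 20% penalty.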
Introduction
This exercise teaches how to reuse a result that has already been computed.
Example: Caching the Mean of Vector
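The <<- operator assigns to a variable in an enclosing (parent) environment instead of creating a local one.
Here it updates the x and m that belong to the makeVector call, not variables in the global environment.
makeVector <- function(x = numeric()) {
  m <- NULL
  set <- function(y) {
    x <<- y     # update x in the enclosing makeVector environment
    m <<- NULL  # clear the cached mean in the enclosing environment
  }
  get <- function() x
  setmean <- function(mean) m <<- mean
  getmean <- function() m
  list(set = set, get = get,
       setmean = setmean,
       getmean = getmean)
}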
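To see how the cached value gets used, here is a sketch of a companion function, called cachemean here. It is not shown in these notes, so treat its name and details as assumptions. makeVector is repeated so that the block runs on its own.

```r
makeVector <- function(x = numeric()) {
  m <- NULL
  set <- function(y) {
    x <<- y
    m <<- NULL
  }
  get <- function() x
  setmean <- function(mean) m <<- mean
  getmean <- function() m
  list(set = set, get = get, setmean = setmean, getmean = getmean)
}

# Hypothetical companion: return the cached mean if present, else compute and store it
cachemean <- function(v, ...) {
  m <- v$getmean()
  if (!is.null(m)) {
    message("getting cached data")
    return(m)
  }
  m <- mean(v$get(), ...)
  v$setmean(m)
  m
}

v <- makeVector(c(1, 2, 3, 4))
first  <- cachemean(v)  # computed
second <- cachemean(v)  # served from the cache, prints the message
v$set(c(10, 20))        # replacing the data clears the cache
third  <- cachemean(v)  # computed again for the new data
print(c(first, second, third))
```

Each call to makeVector creates its own environment, so two such objects do not share a cache.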
Assignment: Caching the inverse of a Matrix
I wrote the final code as follows and submitted it.
# Overall, makeCacheMatrix() keeps cached data so that it can be reused.
# cacheSolve() calculates the inverse of the matrix stored in an object created by makeCacheMatrix().
# To validate the code, you can use the following sequence:
# > m <- makeCacheMatrix()
# > m$set(matrix(c(4,2,2,4),2,2))
# > m$get()
# [,1] [,2]
# [1,] 4 2
# [2,] 2 4
#
# > cacheSolve(m)
# [,1] [,2]
# [1,] 0.3333333 -0.1666667
# [2,] -0.1666667 0.3333333
#
# > cacheSolve(m)
# getting cached data
# [,1] [,2]
# [1,] 0.3333333 -0.1666667
# [2,] -0.1666667 0.3333333
# makeCacheMatrix: return a list of functions to:
# 1. Set the value of the matrix
# 2. Get the value of the matrix
# 3. Set the value of the inverse
# 4. Get the value of the inverse
makeCacheMatrix <- function(x = matrix()) {
  # inv holds the cached inverse; NULL means it has not been computed yet.
  inv <- NULL
  # set() stores a new matrix and clears the cached inverse.
  # <<- assigns to x and inv in the environment of this makeCacheMatrix() call,
  # so every cache object keeps its own matrix and inverse.
  set <- function(y) {
    x <<- y
    inv <<- NULL
  }
  # get() returns the stored matrix.
  get <- function() x
  # set_inverse() stores a computed inverse in the cache.
  set_inverse <- function(inverse) inv <<- inverse
  # get_inverse() returns the cached inverse, or NULL if there is none.
  get_inverse <- function() inv
  list(set = set, get = get,
       set_inverse = set_inverse,
       get_inverse = get_inverse)
}
# cacheSolve: compute the inverse of the matrix held by a makeCacheMatrix() object.
# If the inverse was computed before, the cached value is returned instead.
cacheSolve <- function(x, ...) {
  # Try to get the cached inverse first.
  inv <- x$get_inverse()
  if (!is.null(inv)) {
    # A non-NULL value means the inverse was already computed, so reuse it.
    message("getting cached data")
    return(inv)
  }
  # Otherwise compute the inverse with solve() and store it for later calls.
  data <- x$get()
  inv <- solve(data, ...)
  x$set_inverse(inv)
  inv
}