[Week 3] Loop Functions & Debugging Tools
Loop Functions
lapply: Loop over a list and evaluate a function on each element
sapply: Same as lapply but tries to simplify the result
apply: Apply a function over the margins of an array
tapply: Apply a function over subsets of a vector
mapply: Multivariate version of lapply
rnorm function (The normal Distribution)
Density, distribution function, quantile function and random generation for the normal distribution.
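A minimal sketch of the four functions in that family (dnorm, pnorm, qnorm, rnorm). The seed and the mean/sd values are arbitrary choices for illustration.

```r
# Density of the standard normal at its mean: 1 / sqrt(2 * pi)
d <- dnorm(0)

# Distribution function: P(X <= 0) for a standard normal
p <- pnorm(0)

# Quantile function: the inverse of pnorm
q <- qnorm(0.975)

# Random generation; fixing the seed makes the draws reproducible
set.seed(1)
r <- rnorm(5, mean = 10, sd = 2)

print(c(density = d, probability = p, quantile = q))
print(r)
```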
Loop Functions - lapply
lapply takes three arguments:
(1) a list x;
(2) a function (or the name of a function) FUN;
(3) other arguments via its ... argument.
If x is not a list, it will be coerced to a list using as.list.
lapply always returns a list, regardless of the class of the input.
> x <- list(a=1:5, b = rnorm(10))
> x
$a
[1] 1 2 3 4 5
$b
[1] 0.34766773 1.88039654 -0.29986269 1.88896873 0.07806339 -1.63535799
[7] 1.12373391 0.66304757 0.64747795 -0.38855335
> lapply(x,mean)
$a
[1] 3
$b
[1] 0.4305582
> x <- list(a=1:4, b=rnorm(10), c=rnorm(20,1), d=rnorm(100,5))
> lapply(x,mean)
$a
[1] 2.5
$b
[1] 0.0315751
$c
[1] 1.193494
$d
[1] 4.999784
> x <- 1:4
> lappy(x,runif)
Error: could not find function "lappy"
> lapply(x, runif)
[[1]]
[1] 0.1516973
[[2]]
[1] 0.5303134 0.7188454
[[3]]
[1] 0.61570965 0.03625812 0.79371658
[[4]]
[1] 0.06210734 0.59349463 0.83711023 0.38416463
The runif function generates random numbers from a uniform distribution.
Suppose we want uniform random numbers between 0 and 10 instead of the default 0 to 1.
To do that, we pass the min and max arguments through the ... argument; lapply forwards them to runif on every call.
So here lapply is called with several arguments.
> x<-1:4
> lapply(x,runif,min=0,max=10)
[[1]]
[1] 1.384929
[[2]]
[1] 4.474732 3.952107
[[3]]
[1] 2.406658 5.489504 7.572002
[[4]]
[1] 5.534824 0.325385 4.289476 4.976774
An anonymous function for extracting the first column of each matrix.
> x<- list(a=matrix(1:4,2,2), b=matrix(1:6,3,2))
> x
$a
     [,1] [,2]
[1,]    1    3
[2,]    2    4
$b
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
> lapply(x, function(elt) elt[,1])
$a
[1] 1 2
$b
[1] 1 2 3
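The overview at the top describes sapply as lapply plus simplification. A small sketch of what that simplification looks like in practice; the input values are made up for illustration.

```r
x <- list(a = 1:4, b = c(10, 20), c = 5)

# lapply always returns a list, one element per input element
as_list <- lapply(x, mean)

# sapply collapses length-1 results into a named vector
as_vec <- sapply(x, mean)

# When every result has the same length greater than 1, sapply builds a matrix
rng <- sapply(x, range)

# When the result lengths differ, sapply cannot simplify and returns a list
uneven <- sapply(1:3, seq_len)

print(as_vec)
print(rng)
```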
Loop Functions - apply
apply is used to evaluate a function (often an anonymous one) over the margins of an array.
It is most often used to apply a function to the rows or columns of a matrix.
It can be used with general arrays, e.g. taking the average of an array of matrices.
It is not really faster than writing a loop, but it works in one line!
> str(apply)
function (X, MARGIN, FUN, ...)
- X is an array
- MARGIN is an integer vector indicating which margins should be “retained”.
- FUN is a function to be applied
- ... is for other arguments to be passed to FUN
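A short sketch of how MARGIN selects the dimension that is kept. The matrix and array contents are arbitrary illustration values.

```r
x <- matrix(1:6, nrow = 2, ncol = 3)

# MARGIN = 2 keeps the columns: one mean per column
col_means <- apply(x, 2, mean)   # 1.5 3.5 5.5

# MARGIN = 1 keeps the rows: one sum per row
row_sums <- apply(x, 1, sum)     # 9 12

# A 2 x 2 x 3 array can be read as three stacked 2 x 2 matrices.
# Keeping margins 1 and 2 averages over the third dimension.
a <- array(1:12, dim = c(2, 2, 3))
avg <- apply(a, c(1, 2), mean)

print(col_means)
print(row_sums)
print(avg)
```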
For sums and means of matrix dimensions, we have some shortcuts.
- rowSums = apply(x, 1, sum)
- rowMeans = apply(x, 1, mean)
- colSums = apply(x, 2, sum)
- colMeans = apply(x, 2, mean)
These shortcut functions are much faster, but you won't notice unless you're using a large matrix.
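The equivalences listed above can be checked directly. This sketch uses a random 20 x 10 matrix; the size and the seed are arbitrary.

```r
set.seed(42)
x <- matrix(rnorm(200), nrow = 20, ncol = 10)

# Each shortcut should agree with the corresponding apply call
same_row_sums  <- all.equal(rowSums(x),  apply(x, 1, sum))
same_row_means <- all.equal(rowMeans(x), apply(x, 1, mean))
same_col_sums  <- all.equal(colSums(x),  apply(x, 2, sum))
same_col_means <- all.equal(colMeans(x), apply(x, 2, mean))

print(c(same_row_sums, same_row_means, same_col_sums, same_col_means))
```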
Quantiles of the rows of a matrix.
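A minimal sketch of this, assuming a random 20 x 10 matrix and the 25th and 75th percentiles (both are illustrative choices). The probs argument reaches quantile through the ... argument of apply.

```r
set.seed(10)
x <- matrix(rnorm(200), nrow = 20, ncol = 10)

# quantile returns two values per row, so the result is a 2 x 20 matrix:
# one column per row of x, one row per requested quantile
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))

print(dim(q))
print(q[, 1:3])
```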
Loop Functions - mapply
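As summarized at the top, mapply is the multivariate version of lapply: it walks over several argument vectors in parallel. A small sketch with made-up inputs:

```r
# Element-wise calls: rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1)
reps <- mapply(rep, 1:4, 4:1)

# With same-length results, mapply simplifies to a vector
sums <- mapply(function(a, b) a + b, 1:3, 4:6)   # 5 7 9

print(reps)
print(sums)
```

Without mapply, the first call would have to be written out as list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1)).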
Loop Functions - tapply
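As summarized at the top, tapply applies a function over subsets of a vector, where the subsets are defined by a factor. A small sketch with made-up data:

```r
x <- c(1, 2, 3, 10, 20, 30)
g <- factor(c("a", "a", "a", "b", "b", "b"))

# One mean per level of g
group_means <- tapply(x, g, mean)    # a = 2, b = 20

# If the function returns more than one value, the result holds one vector per group
group_ranges <- tapply(x, g, range)

print(group_means)
print(group_ranges)
```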
Loop Functions - split
split takes a vector or other object and splits it into groups determined by a factor or a list of factors.
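A small sketch of splitting a vector and then looping over the pieces. The data and the level names are made up for illustration.

```r
x <- c(5, 7, 9, 100, 200)
f <- factor(c("low", "low", "low", "high", "high"))

# split returns a list with one element per factor level
groups <- split(x, f)

# A common pattern is split followed by lapply or sapply
means <- sapply(groups, mean)

print(groups)
print(means)
```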
< Splitting a Data Frame >
Each piece of the split data frame is processed with lapply.
The split produces five groups.
Instead of lapply, we can use sapply to simplify the result.
sapply puts all of these numbers into a matrix with three rows and, in this case, five columns.
That gives the means of the three variables in a much more compact format: a matrix instead of a list.
Of course, many of the entries are still NA because of missing values in the original data.
The fix is to pass the na.rm argument through to the function that computes the means.
With that, you can see the monthly means.
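The notes do not name the data set. The sketch below assumes it is R's built-in airquality data frame, split by Month, with the three variables Ozone, Solar.R and Wind; that choice matches the three rows, five columns and monthly means described above, but it is an assumption.

```r
# Assumption: the built-in airquality data set, split into one data frame per month
s <- split(airquality, airquality$Month)
vars <- c("Ozone", "Solar.R", "Wind")

# Column means per month; missing values propagate as NA
with_na <- sapply(s, function(d) colMeans(d[, vars]))

# Passing na.rm = TRUE drops the missing values before averaging
no_na <- sapply(s, function(d) colMeans(d[, vars], na.rm = TRUE))

print(with_na)
print(no_na)
```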
< Splitting on More than One Level >
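A small sketch of splitting on two factors at once. The vectors f1 and f2 are made-up illustration data. Passing a list of factors makes split use every combination of their levels, and drop = TRUE discards combinations that have no observations.

```r
x  <- c(1, 2, 3, 4, 5, 6, 7, 8)
f1 <- factor(c("a", "a", "a", "a", "b", "b", "b", "b"))
f2 <- factor(c("x", "y", "x", "y", "x", "x", "x", "x"))

# All 2 x 2 level combinations, including the empty "b.y" group
by_both <- split(x, list(f1, f2))

# Same split, with empty combinations removed
by_both_nonempty <- split(x, list(f1, f2), drop = TRUE)

print(by_both)
print(names(by_both_nonempty))
```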
Debugging
Types of conditions
- message: A generic notification/diagnostic message produced by the message function; execution of the function continues
- warning: An indication that something is wrong but not necessarily fatal; execution of the function continues; generated by the warning function.
- error: An indication that a fatal problem has occurred; execution stops; produced by the stop function
- condition: A generic concept for indicating that something unexpected can occur; programmers can create their own conditions
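A minimal sketch of the first three condition types in action. The functions check_positive and classify are hypothetical names invented for this example. Note that tryCatch changes the default behavior: once a handler catches a warning, the rest of the expression is abandoned instead of continuing.

```r
# Hypothetical helper that signals each kind of condition
check_positive <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")   # error: execution stops
  if (x < 0) {
    warning("x is negative")                      # warning: execution would continue
    return(FALSE)
  }
  message("x is fine")                            # message: purely informational
  TRUE
}

# Hypothetical wrapper that turns conditions into a short description
classify <- function(x) {
  tryCatch(
    {
      check_positive(x)
      "ok"
    },
    warning = function(w) paste("warning:", conditionMessage(w)),
    error   = function(e) paste("error:", conditionMessage(e))
  )
}

print(classify(1))
print(classify(-1))
print(classify("a"))
```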
Tools available for debugging
- traceback: prints out the function call stack after an error occurs; does nothing if there's no error.
- debug: flags a function for "debug" mode which allows you to step through execution of a function one line at a time.
- browser: suspends the execution of a function wherever it is called and puts the function in debug mode.
- trace: allows you to insert debugging code into a function at specific places.
- recover: allows you to modify the error behavior so that you can browse the function call stack.
Programming Assignment 2
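One unusual point is that the assignment uses peer assessments: the people who submitted it grade one another's work.
This seems intended to discourage cheating.
You must grade at least one other person's submission; otherwise you lose 20% of the score.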
Introduction
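This exercise teaches how to reuse the result of a computation that has already been performed.
Example: Caching the Mean of a Vector
The <<- operator assigns to a variable in an enclosing (parent) environment instead of creating a local one; it falls back to the global environment only when no enclosing environment defines that variable.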
makeVector <- function(x = numeric()) {
    m <- NULL                  # cached mean; NULL means "not computed yet"
    set <- function(y) {
        x <<- y                # replace x in the enclosing makeVector() environment
        m <<- NULL             # invalidate the cached mean
    }
    get <- function() x
    setmean <- function(mean) m <<- mean
    getmean <- function() m
    list(set = set, get = get,
         setmean = setmean,
         getmean = getmean)
}
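A sketch of how such an object might be used. The companion function cachemean is not shown in these notes, so its name and body here are assumptions modeled on the cacheSolve function further down; makeVector is repeated so the example runs on its own.

```r
makeVector <- function(x = numeric()) {
  m <- NULL
  set <- function(y) {
    x <<- y
    m <<- NULL
  }
  get <- function() x
  setmean <- function(mean) m <<- mean
  getmean <- function() m
  list(set = set, get = get, setmean = setmean, getmean = getmean)
}

# Assumed companion function: return the cached mean, or compute and cache it
cachemean <- function(v, ...) {
  m <- v$getmean()
  if (!is.null(m)) {
    message("getting cached data")
    return(m)
  }
  m <- mean(v$get(), ...)
  v$setmean(m)
  m
}

v <- makeVector(c(1, 2, 3, 4))
first  <- cachemean(v)   # computed
second <- cachemean(v)   # served from the cache, with a message
v$set(c(10, 20))         # new data clears the cache
third  <- cachemean(v)   # computed again
print(c(first, second, third))
```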
Assignment: Caching the inverse of a Matrix
I wrote and submitted the final code below.
# Overall, makeCacheMatrix() holds cached data so that it can be reused.
# cacheSolve() calculates the inverse of the matrix stored in an object created by makeCacheMatrix().
# To validate the code, you can use the following sequence:
# > m <- makeCacheMatrix()
# > m$set(matrix(c(4,2,2,4),2,2))
# > m$get()
#      [,1] [,2]
# [1,]    4    2
# [2,]    2    4
#
# > cacheSolve(m)
#            [,1]       [,2]
# [1,]  0.3333333 -0.1666667
# [2,] -0.1666667  0.3333333
#
# > cacheSolve(m)
# getting cached data
#            [,1]       [,2]
# [1,]  0.3333333 -0.1666667
# [2,] -0.1666667  0.3333333
# makeCacheMatrix: return a list of functions to:
# 1. Set the value of the matrix
# 2. Get the value of the matrix
# 3. Set the value of the inverse
# 4. Get the value of the inverse
makeCacheMatrix <- function(x = matrix()) {
    # Cached inverse; NULL means it has not been computed yet.
    m <- NULL
    # Store a new matrix and invalidate the cached inverse.
    # <<- updates x and m in the enclosing makeCacheMatrix() environment,
    # so each object keeps its own state instead of sharing global variables.
    set <- function(y) {
        x <<- y
        m <<- NULL
    }
    # Return the stored matrix.
    get <- function() x
    # Store the computed inverse in the cache.
    set_global_m <- function(inverse) m <<- inverse
    # Return the cached inverse (NULL if there is none).
    get_global_m <- function() m
    list(set = set, get = get,
         set_global_m = set_global_m,
         get_global_m = get_global_m)
}
# cacheSolve: compute the inverse of the matrix held by a makeCacheMatrix() object.
# If the inverse was already computed, return the cached copy instead of recomputing it.
cacheSolve <- function(x, ...) {
    # Look up the cached inverse, if any.
    m <- x$get_global_m()
    if (!is.null(m)) {
        # A non-NULL value means the inverse was computed on an earlier call.
        message("getting cached data")
        return(m)
    }
    # Nothing is cached yet: compute the inverse with solve() and store it for reuse.
    data <- x$get()
    inverseMatrix <- solve(data, ...)
    x$set_global_m(inverseMatrix)
    inverseMatrix
}