[3 Week] Loop Functions & Debugging Tools


Loop Functions


lapply: Loop over a list and evaluate a function on each element

sapply: Same as lapply but try to simplify the result

apply: Apply a function over the margins of an array

tapply: Apply a function over subsets of a vector

mapply: Multivariate version of lapply
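To make the difference between lapply and sapply in the list above concrete, here is a minimal sketch. The list contents are made up for illustration.

```r
# A small named list of numeric vectors (illustrative data)
x <- list(a = 1:4, b = 5:8)

res_list <- lapply(x, mean)   # always a list: $a and $b
res_vec  <- sapply(x, mean)   # each result has length 1, so sapply returns a named vector

class(res_list)   # "list"
class(res_vec)    # "numeric"
```

When the results cannot be simplified (for example, when they have different lengths), sapply falls back to returning a list.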



rnorm function (The Normal Distribution)

rnorm is one of four related functions for the normal distribution: dnorm (density), pnorm (distribution function), qnorm (quantile function) and rnorm (random generation).
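As a quick illustration, the four functions of the normal-distribution family can be called like this. The input values and the seed are arbitrary choices for the sketch.

```r
set.seed(1)                         # fix the seed so the random draw is reproducible

d <- dnorm(0)                       # density of N(0, 1) at 0, i.e. 1 / sqrt(2 * pi)
p <- pnorm(1.96)                    # P(X <= 1.96), about 0.975
q <- qnorm(0.975)                   # quantile function, the inverse of pnorm (about 1.96)
x <- rnorm(5, mean = 10, sd = 2)    # five random draws with mean 10 and sd 2
```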



Loop Functions - lapply 



lapply takes three arguments: 

(1) a list X; 

(2) a function (or the name of a function) FUN; 

(3) other arguments via its ... argument.

If X is not a list, it will be coerced to a list using as.list.



lapply always returns a list, regardless of the class of the input.


> x <- list(a=1:5, b = rnorm(10))
> x
$a
[1] 1 2 3 4 5

$b
 [1]  0.34766773  1.88039654 -0.29986269  1.88896873  0.07806339 -1.63535799
 [7]  1.12373391  0.66304757  0.64747795 -0.38855335

> lapply(x,mean)
$a
[1] 3

$b
[1] 0.4305582



> x <- list(a=1:4, b=rnorm(10), c=rnorm(20,1), d=rnorm(100,5))
> lapply(x,mean)
$a
[1] 2.5

$b
[1] 0.0315751

$c
[1] 1.193494

$d
[1] 4.999784



> x <- 1:4
> lapply(x, runif)
[[1]]
[1] 0.1516973

[[2]]
[1] 0.5303134 0.7188454

[[3]]
[1] 0.61570965 0.03625812 0.79371658

[[4]]
[1] 0.06210734 0.59349463 0.83711023 0.38416463

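The runif function generates uniformly distributed random numbers. Here lapply calls runif(1), runif(2), runif(3) and runif(4), so the i-th element of the result holds i random numbers.



Suppose I want to generate uniform random numbers between zero and ten instead of the default range of zero to one.

To do that, we pass the min and max arguments through the ... argument of lapply, which forwards them to runif.

So here we are calling lapply with several arguments.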



> x<-1:4
> lapply(x,runif,min=0,max=10)
[[1]]
[1] 1.384929

[[2]]
[1] 4.474732 3.952107

[[3]]
[1] 2.406658 5.489504 7.572002

[[4]]
[1] 5.534824 0.325385 4.289476 4.976774



An anonymous function for extracting the first column of each matrix.


> x<- list(a=matrix(1:4,2,2), b=matrix(1:6,3,2))
> x
$a
     [,1] [,2]
[1,]    1    3
[2,]    2    4

$b
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

> lapply(x, function(elt) elt[,1])
$a
[1] 1 2

$b
[1] 1 2 3



Loop Functions - apply


apply is used to evaluate a function (often an anonymous one) over the margins of an array.

It is most often used to apply a function to the rows or columns of a matrix.

It can be used with general arrays, e.g. taking the average of an array of matrices.

It is not really faster than writing a loop, but it works in one line!


> str(apply)

function (X, MARGIN, FUN, ...)

  • X is an array
  • MARGIN is an integer vector indicating which margins should be “retained”.
  • FUN is a function to be applied
  • ... is for other arguments to be passed to FUN
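The arguments above can be sketched with a small matrix and a small array. The data and variable names here are made up for illustration.

```r
x <- matrix(1:6, nrow = 2, ncol = 3)      # 2 rows, 3 columns

col_means <- apply(x, 2, mean)   # retain margin 2 (columns): one mean per column
row_sums  <- apply(x, 1, sum)    # retain margin 1 (rows): one sum per row

# A 2 x 2 x 3 array can be viewed as three stacked 2 x 2 matrices.
a <- array(1:12, dim = c(2, 2, 3))
avg_matrix <- apply(a, c(1, 2), mean)     # retain rows and columns, average over the third dimension
```

Here col_means has length 3, row_sums has length 2, and avg_matrix is a 2 x 2 matrix.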





For sums and means of matrix dimensions, we have some shortcuts.


  • rowSums(x) is equivalent to apply(x, 1, sum)
  • rowMeans(x) is equivalent to apply(x, 1, mean)
  • colSums(x) is equivalent to apply(x, 2, sum)
  • colMeans(x) is equivalent to apply(x, 2, mean)

These shortcut functions are much faster, but you won't notice unless you're using a large matrix.
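A minimal check that the shortcuts agree with their apply counterparts, using a random matrix whose size and seed are arbitrary:

```r
set.seed(42)
x <- matrix(rnorm(200), nrow = 20, ncol = 10)

same_row_sums  <- isTRUE(all.equal(rowSums(x),  apply(x, 1, sum)))
same_row_means <- isTRUE(all.equal(rowMeans(x), apply(x, 1, mean)))
same_col_sums  <- isTRUE(all.equal(colSums(x),  apply(x, 2, sum)))
same_col_means <- isTRUE(all.equal(colMeans(x), apply(x, 2, mean)))
```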



Quantiles of the rows of a matrix.
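The original example is missing here. A minimal sketch of the idea, assuming a 20 x 10 matrix of random normal values and the 25th and 75th percentiles, could look like this:

```r
set.seed(10)
x <- matrix(rnorm(200), nrow = 20, ncol = 10)

# probs is passed through ... to quantile; each row yields two values,
# so the result is a matrix with 2 rows and one column per row of x.
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))
dim(q)
```

Because each call to quantile returns a vector of length two, apply arranges the results column by column.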







Loop Functions - mapply
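The notes for this section are empty, so the following is only a minimal sketch of what "multivariate version of lapply" means: mapply walks over several argument vectors in parallel. The inputs are made up for illustration.

```r
# Calls rep(1, 4), rep(2, 3), rep(3, 2) and rep(4, 1) in turn
res <- mapply(rep, 1:4, 4:1)

# The same thing written out by hand
by_hand <- list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))
```

Since the four results have different lengths, mapply cannot simplify them and returns a list.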







Loop Functions - tapply
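The notes for this section are empty, so here is a minimal sketch of applying a function over subsets of a vector. The vector and the grouping factor are made up for illustration.

```r
x <- c(1, 2, 3, 10, 20, 30)
f <- factor(c("a", "a", "a", "b", "b", "b"))   # group label for each element of x

group_means <- tapply(x, f, mean)   # mean of x within each level of f
```

The result has one entry per factor level, named after the level.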







Loop Functions - split



split takes a vector or other object and splits it into groups determined by a factor or list of factors.
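A minimal sketch of split followed by lapply, with made-up data:

```r
x <- c(1, 2, 3, 10, 20, 30)
f <- gl(2, 3)              # factor with 2 levels, each repeated 3 times

pieces <- split(x, f)      # list with one vector per level: $`1` and $`2`
piece_means <- lapply(pieces, mean)
```

split always returns a list, which is why it pairs naturally with lapply or sapply.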



< Splitting a Data Frame >



Here each piece is processed using lapply.

There are five columns.




Instead of using lapply, we can use sapply to simplify the result.

What sapply does is put all these numbers into a matrix.


The matrix has three rows and, in this case, five columns.

The means of the three variables now appear in a much more compact format: a matrix instead of a list.

Of course, we still get NAs for a lot of them because of the missing values in the original data.


So one thing I can do is pass the na.rm argument to the function being called.


Here you can see the monthly means.
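The original code for this part is missing. The sketch below assumes the built-in airquality data frame, split by its Month column, with Ozone, Solar.R and Wind as the three variables. Those choices are assumptions based on the description above.

```r
s <- split(airquality, airquality$Month)     # one data frame per month (5 to 9)
vars <- c("Ozone", "Solar.R", "Wind")

# lapply: a list with one vector of column means per month
means_list <- lapply(s, function(d) colMeans(d[, vars]))

# sapply: the same numbers simplified to a 3 x 5 matrix (NA where data is missing)
means_mat <- sapply(s, function(d) colMeans(d[, vars]))

# Passing na.rm = TRUE to colMeans drops missing values before averaging
means_narm <- sapply(s, function(d) colMeans(d[, vars], na.rm = TRUE))
```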




< Splitting on More than One Level >
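The notes for this part are empty. A minimal sketch, assuming two made-up factors, shows how passing a list of factors splits on their combinations:

```r
x  <- 1:10
f1 <- gl(2, 5)     # 2 levels, each repeated 5 times
f2 <- gl(5, 2)     # 5 levels, each repeated 2 times

groups <- split(x, list(f1, f2))                   # one element per combination, some empty
groups_used <- split(x, list(f1, f2), drop = TRUE) # drop = TRUE removes empty combinations
```

The names of the result join the levels with a dot, for example "1.1" for level 1 of both factors.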




Debugging


Types of status information

  • message: A generic notification/diagnostic message produced by the message function; execution of the function continues
  • warning: An indication that something is wrong but not necessarily fatal; execution of the function continues; generated by the warning function.
  • error: An indication that a fatal problem has occurred; execution stops; produced by the stop function
  • condition: A generic concept for indicating that something unexpected can occur; programmers can create their own conditions
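The three concrete condition types above can be sketched with a small function. check_positive is a hypothetical helper written only for this illustration.

```r
check_positive <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")   # error: execution stops here
  if (x < 0) warning("x is negative")             # warning: execution continues
  message("checked x")                            # message: execution continues
  x > 0
}

# tryCatch lets the caller handle a condition instead of aborting
err_text  <- tryCatch(check_positive("a"), error   = function(e) conditionMessage(e))
warn_text <- tryCatch(check_positive(-1),  warning = function(w) conditionMessage(w))
ok        <- suppressMessages(check_positive(5))
```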


Tools available for debugging

  • traceback: prints out the function call stack after an error occurs; does nothing if there's no error.



  • debug: flags a function for "debug" mode which allows you to step through execution of a function one line at a time.
  • browser: suspends the execution of a function wherever it is called and puts the function in debug mode.




  • trace: allows you to insert debugging code into a function at specific places.

  • recover: allows you to modify the error behavior so that you can browse the function call stack.
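Most of these tools are interactive, so the sketch below only toggles the debug flag in a way that also runs in a script. The function f is made up for illustration, and the interactive calls are left as comments.

```r
f <- function(x) {
  y <- x + 1
  y * 2
}

debug(f)                    # flag f: the next call would step through it line by line
flag_on <- isdebugged(f)
undebug(f)                  # remove the flag again
flag_off <- isdebugged(f)

# Interactive-only usage, shown as comments so the sketch stays runnable:
# traceback()               # call right after an error to print the call stack
# options(error = recover)  # browse the call stack whenever an error occurs
# browser()                 # place inside a function body to pause at that point
```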







Programming Assignment 2


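One unusual point is that the assignment uses peer assessment: the people who submitted the assignment grade one another.

This seems to be intended to discourage cheating.

You must grade at least one other person's assignment; otherwise you lose 20% of your score.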



Introduction

This is an exercise in learning how to reuse a result that has already been computed.



Example: Caching the Mean of Vector


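The <<- operator assigns to a variable in an enclosing environment rather than the current one. It creates a global variable only when no enclosing function already defines that name.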
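A minimal sketch of how <<- behaves inside a closure. make_counter is a hypothetical helper used only for this illustration.

```r
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # updates count in make_counter's environment, not a local copy
    count
  }
}

counter <- make_counter()
first  <- counter()       # 1
second <- counter()       # 2: the state survived between calls

other <- make_counter()   # a separate counter with its own count
other_first <- other()    # 1
```

Each call to make_counter creates a fresh environment, so the two counters do not interfere with each other.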



makeVector <- function(x = numeric()) {
        m <- NULL
        set <- function(y) {
                x <<- y     # assigns to x in the enclosing makeVector() environment
                m <<- NULL  # resets the cached mean in that same environment
        }
        get <- function() x
        setmean <- function(mean) m <<- mean
        getmean <- function() m
        list(set = set, get = get,
             setmean = setmean,
             getmean = getmean)
}
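The listing above only builds the cache object. A minimal sketch of a function that uses it follows. cachemean is written here as an assumed companion, and makeVector is repeated in compact form so the block runs on its own.

```r
# Compact copy of makeVector, repeated so this sketch is self-contained
makeVector <- function(x = numeric()) {
  m <- NULL
  list(
    set     = function(y) { x <<- y; m <<- NULL },
    get     = function() x,
    setmean = function(mean) m <<- mean,
    getmean = function() m
  )
}

# Return the cached mean if there is one; otherwise compute it and cache it
cachemean <- function(v, ...) {
  m <- v$getmean()
  if (!is.null(m)) {
    message("getting cached data")
    return(m)
  }
  m <- mean(v$get(), ...)
  v$setmean(m)
  m
}

v <- makeVector(c(1, 2, 3, 4))
first  <- cachemean(v)    # computes the mean and stores it
second <- cachemean(v)    # served from the cache
v$set(c(10, 20))          # storing a new vector clears the cache
third  <- cachemean(v)    # recomputed for the new data
```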

Assignment: Caching the inverse of a Matrix


I wrote and submitted the final code as follows.




# Overall, makeCacheMatrix() keeps cached data so that it can be reused.
# cacheSolve() calculates the inverse of the matrix stored by makeCacheMatrix().
# To validate my own code, you can use the following sequence:
# > m <- makeCacheMatrix()
# > m$set(matrix(c(4,2,2,4),2,2))
# > m$get()
#        [,1] [,2]
# [1,]    4    2
# [2,]    2    4
#
# > cacheSolve(m)
#             [,1]       [,2]
# [1,]  0.3333333 -0.1666667
# [2,] -0.1666667  0.3333333
#
# > cacheSolve(m)
# getting cached data
#             [,1]       [,2]
# [1,]  0.3333333 -0.1666667
# [2,] -0.1666667  0.3333333


# makeCacheMatrix: return a list of functions to:
# 1. Set the value of the matrix
# 2. Get the value of the matrix
# 3. Set the value of the inverse
# 4. Get the value of the inverse
makeCacheMatrix <- function(x = matrix()) {
    # Cached inverse; NULL means "not computed yet".
    m <- NULL

    # Store a new matrix and invalidate the cached inverse.
    # <<- assigns to x and m in the enclosing makeCacheMatrix() environment,
    # so each cache object keeps its own state instead of sharing global variables.
    set <- function(y) {
        x <<- y
        m <<- NULL
    }

    # Return the stored matrix.
    get <- function() x
    # Store the inverse in the cache.
    set_inverse <- function(inverse) m <<- inverse
    # Return the cached inverse (NULL if it has not been computed yet).
    get_inverse <- function() m
    list(set = set, get = get,
         set_inverse = set_inverse,
         get_inverse = get_inverse)
}

# cacheSolve: compute the inverse of the matrix held by makeCacheMatrix().
# If the inverse is already cached, it is returned without being recomputed.
cacheSolve <- function(x) {
    # Try to get the cached inverse.
    m <- x$get_inverse()
    if (!is.null(m)) {
        # A non-NULL value means the inverse was computed earlier.
        message("getting cached data")
        return(m)
    }
    # Otherwise compute the inverse with solve() and cache it for reuse.
    data <- x$get()
    inverseMatrix <- solve(data)
    x$set_inverse(inverseMatrix)
    inverseMatrix
}


  1. haru 2016.03.20 10:40

    I don't understand this question from Coursera Quiz 3. Could you help me?

    Continuing with the 'mtcars' dataset from the previous Question, what is the absolute difference between the average horsepower of 4-cylinder cars and the average horsepower of 8-cylinder cars?
    (Please round your final answer to the nearest whole number. Only enter the numeric result and nothing else.)

    When I run the code below I can see that the result is 126.5799, but it keeps being marked as wrong...
    <Source code>
    tapply(mtcars$hp, mtcars$cyl, mean)
    209.21429 - 82.63636

    I would appreciate a good answer! Thank you~^^
