Model-based prediction
Jemin Lee
November 23, 2015
Basic Idea
- Assume the data follow a probabilistic model
- Use Bayes' theorem to identify optimal classifiers
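For reference, the optimal classifier picks the class with the largest posterior probability, which Bayes' theorem expresses as (a standard statement of the formula, spelled out here for completeness):

$$
P(Y = k \mid X = x) = \frac{P(X = x \mid Y = k)\,P(Y = k)}{\sum_{\ell=1}^{K} P(X = x \mid Y = \ell)\,P(Y = \ell)}
$$

Model-based approaches such as LDA and naive Bayes assume a parametric form for $P(X = x \mid Y = k)$ and classify to the class that maximizes this posterior.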
Pros:
- Can take advantage of structure of the data
- May be computationally convenient
- Are reasonably accurate on real problems
Cons:
- Make additional assumptions about the data
- When the model is incorrect you may get reduced accuracy
Example: Iris Data
```r
data(iris); library(ggplot2); library(caret)
## Loading required package: lattice
names(iris)
## [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"
## [5] "Species"
table(iris$Species)
##
##     setosa versicolor  virginica
##         50         50         50
```

Create training and test sets
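Note that createDataPartition samples rows at random, so the split (and every number below) varies between runs; a minimal sketch for a reproducible split, using an arbitrary fixed seed (not part of the original notes):

```r
set.seed(1234)  # arbitrary fixed seed so the partition below is reproducible
```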
```r
inTrain <- createDataPartition(y = iris$Species,
                               p = 0.7, list = FALSE)
training <- iris[inTrain, ]
testing  <- iris[-inTrain, ]
dim(training); dim(testing)
## [1] 105   5
## [1] 45  5
```

Build predictions with linear discriminant analysis (lda) and naive Bayes (nb)
```r
modlda <- train(Species ~ ., data = training, method = "lda")
## Loading required package: MASS
modnb  <- train(Species ~ ., data = training, method = "nb")
plda <- predict(modlda, testing)
pnb  <- predict(modnb, testing)
```
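caret hands method = "nb" off to the klaR package and "lda" to MASS, so a fresh R setup may need them installed; an environment note rather than part of the original notes:

```r
install.packages(c("MASS", "klaR"))  # backends caret loads for "lda" and "nb"
```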
```r
table(plda, pnb)
##             pnb
## plda         setosa versicolor virginica
##   setosa         15          0         0
##   versicolor      0         17         0
##   virginica       0          1        12
```

Comparison of results
The table shows that just one observation is classified differently by the two algorithms; overall they perform very similarly.
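The table above only compares the two models with each other, not with the truth; a minimal sketch for checking each against the held-out labels with caret's confusionMatrix (not in the original notes; exact numbers depend on the random split):

```r
confusionMatrix(plda, testing$Species)$overall["Accuracy"]  # LDA accuracy on the test set
confusionMatrix(pnb,  testing$Species)$overall["Accuracy"]  # naive Bayes accuracy on the test set
```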
```r
equalPredictions <- (plda == pnb)
qplot(Petal.Width, Sepal.Width, colour = equalPredictions, data = testing)
```
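The plot colors the test point the two models label differently; to inspect that observation directly (a small sketch, not in the original notes):

```r
testing[plda != pnb, ]  # the test row(s) on which the two models disagree
```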