Naive Bayes vs Neural Network
This post evaluates which algorithm performs better: Naive Bayes or a Deep Neural Network.
The data describes the responsiveness of a single user.
The analysis uses 1,276 responsiveness samples.
The distribution of responses is as follows:
(FALSE): 922
(TRUE): 354
70% of the data is used for training and 30% for testing.
Because the split is random, every day of the week and hour of the day is mixed into both sets.
In response to a request, a portion of the data is shared via Google Drive; please use it only as a reference for the format.
The first eight columns are the input features, and the last column, class, is the output.
Since responsiveness takes only the values True and False, this is a binary classification problem.
Naive Bayes Classifier in R##
Data structure
> head(training)
1 1 74 18 4 2 2 1 1
2 2 10 18 4 2 2 3 2
4 1 56 19 4 2 2 1 1
6 1 84 19 4 1 1 1 4
8 1 39 19 4 1 2 1 4
9 1 56 19 4 1 2 1 4
library(caret)
set.seed(12358)

# 70/30 random split of the user's ('ikhee') data into training and testing sets
inTrain <- createDataPartition(y = factorClassList[['ikhee']], p = 0.70, list = FALSE)
training <- data.frame(dfList[['ikhee']][inTrain, ])
testing <- data.frame(dfList[['ikhee']][-inTrain, ])
classTraining <- factorClassList[['ikhee']][inTrain]
classtesting <- factorClassList[['ikhee']][-inTrain]

# Resampling control; not shown in the original post, assumed here to be 10-fold CV
ctrl <- trainControl(method = "cv", number = 10)

# Train a Naive Bayes classifier, tuning the Laplace correction (fL) and the kernel-density option
sms_model1 <- train(training, classTraining, method = "nb", trControl = ctrl,
                    tuneGrid = data.frame(.fL = c(0, 0, 1, 1, 10, 10),
                                          .usekernel = c(FALSE, TRUE, FALSE, TRUE, FALSE, TRUE)))
sms_model1

# Evaluate on the held-out test set
sms_predict1 <- predict(sms_model1, testing)
cm1 <- confusionMatrix(sms_predict1, classtesting, positive = "TRUE")
cm1
Results
> cm1
Confusion Matrix and Statistics
Reference
Prediction FALSE TRUE
FALSE 273 102
TRUE 3 4
Accuracy : 0.7251
95% CI : (0.6774, 0.7693)
No Information Rate : 0.7225
P-Value [Acc > NIR] : 0.4806
Kappa : 0.0377
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.03774
Specificity : 0.98913
Pos Pred Value : 0.57143
Neg Pred Value : 0.72800
Prevalence : 0.27749
Detection Rate : 0.01047
Detection Prevalence : 0.01832
Balanced Accuracy : 0.51343
'Positive' Class : TRUE
Actual values in the test set
> table(classtesting)
classtesting
FALSE TRUE
276 106
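The headline statistics above follow directly from the four cells of the confusion matrix (TP = 4, FP = 3, FN = 102, TN = 273, with TRUE as the positive class). The short Python sketch below is purely illustrative; it just re-derives what caret already reports.

# Re-derive caret's summary statistics from the Naive Bayes confusion matrix (positive class = TRUE)
TP, FP = 4, 3      # samples predicted TRUE
FN, TN = 102, 273  # samples predicted FALSE

total = TP + FP + FN + TN                          # 382 test samples
accuracy    = (TP + TN) / float(total)             # 0.7251
sensitivity = TP / float(TP + FN)                  # recall for TRUE: 0.0377
specificity = TN / float(TN + FP)                  # 0.9891
ppv         = TP / float(TP + FP)                  # precision for TRUE: 0.5714
npv         = TN / float(TN + FN)                  # 0.7280
balanced    = (sensitivity + specificity) / 2.0    # 0.5134

print accuracy, sensitivity, specificity, ppv, npv, balanced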
Deep Neural Networks with TensorFlow in Python##
A three-layer neural network; the activation function is sigmoid (logistic).
Data structure
For the neural network to train, every feature and the class must be normalized to the 0–1 range; otherwise the cost function diverges instead of converging. (A minimal sketch of this min-max scaling appears after the sample rows below.)
> head(ikheeTrainingDf_norm)
1 0.0000 0.56153846 0.7391304 0.5 1 1 0 0.0 1
2 0.0625 0.06923077 0.7391304 0.5 1 1 1 0.2 1
3 0.0000 0.42307692 0.7826087 0.5 1 1 0 0.0 1
4 0.0000 0.63846154 0.7826087 0.5 0 0 0 0.6 1
5 0.0000 0.29230769 0.7826087 0.5 0 1 0 0.6 0
6 0.0000 0.42307692 0.7826087 0.5 0 1 0 0.6 1
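In the post this scaling was done in R before exporting the data; as a rough illustration, the same min-max transformation can be sketched in Python with pandas. The file names and the assumption that the class column comes last are mine, not from the original.

import numpy as np
import pandas as pd

# Hypothetical raw feature file: 8 feature columns plus a trailing 0/1 class column,
# whitespace separated, one sample per row (the file name is an assumption).
raw = pd.read_csv('ikheeTrainingRaw.txt', delim_whitespace=True, header=None)

features, labels = raw.iloc[:, :-1], raw.iloc[:, -1]

# Min-max scaling: x' = (x - min) / (max - min), so every feature lands in [0, 1]
span = (features.max() - features.min()).replace(0, 1)   # guard against constant columns
features_norm = (features - features.min()) / span

# Write back out in the whitespace-separated layout that np.loadtxt expects below
norm = pd.concat([features_norm, labels], axis=1)
np.savetxt('ikheeTrainingNorm.txt', norm.values, fmt='%.6f')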
This normalized data is exported as a txt file and then processed again in Python.
import tensorflow as tf
import numpy as np
from sklearn.metrics import precision_score, confusion_matrix
from sklearn.metrics import classification_report
import pandas as pd
# three layers neural networks. Activation function is Sigmoid (logistic)
xyTraining = np.loadtxt('ikheeTrainingNorm.txt', unpack=True)
xyTesting = np.loadtxt('ikheeTestingNorm.txt', unpack=True)
x_data_training = np.transpose(xyTraining[0:-1])
y_data_training = np.reshape(xyTraining[-1], (len(x_data_training), 1))
x_data_testing = np.transpose(xyTesting[0:-1])
y_data_testing = np.reshape(xyTesting[-1], (len(x_data_testing), 1))
X = tf.placeholder(tf.float32, name='X-input')
Y = tf.placeholder(tf.float32, name='Y-input')
W1 = tf.Variable(tf.random_uniform([8, 16], -1.0, 1.0), name='Weight1')
W2 = tf.Variable(tf.random_uniform([16, 8], -1.0, 1.0), name='Weight2')
W3 = tf.Variable(tf.random_uniform([8, 1], -1.0, 1.0), name='Weight3')
b1 = tf.Variable(tf.zeros([16]), name="Bias1")
b2 = tf.Variable(tf.zeros([8]), name="Bias2")
b3 = tf.Variable(tf.zeros([1]), name="Bias3")
# Our hypothesis
with tf.name_scope("layer2") as scope:
L2 = tf.sigmoid(tf.matmul(X, W1) + b1)
with tf.name_scope("layer3") as scope:
L3 = tf.sigmoid(tf.matmul(L2, W2) + b2)
with tf.name_scope("layer4") as scope:
hypothesis = tf.sigmoid(tf.matmul(L3, W3) + b3)
# Cost function
with tf.name_scope("cost") as scope:
cost = -tf.reduce_mean(Y*tf.log(hypothesis) + (1-Y)*tf.log(1-hypothesis))
cost_summ = tf.scalar_summary("cost", cost)
# Minimize
with tf.name_scope("train") as scope:
a = tf.Variable(0.01) # Learning rate, alpha
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost)
# Add histogram
w1_hist = tf.histogram_summary("weights1", W1)
w2_hist = tf.histogram_summary("weights2", W2)
b1_hist = tf.histogram_summary("biases1", b1)
b2_hist = tf.histogram_summary("biases2", b2)
y_hist = tf.histogram_summary("y", Y)
# Before starting, initialize the variables.
# We will `run` this first.
init = tf.initialize_all_variables()
# Launch the graph
with tf.Session() as sess:
    # tensorboard --logdir=./logs/xor_logs
    merged = tf.merge_all_summaries()
    writer = tf.train.SummaryWriter("./logs/xor_logs", sess.graph_def)
    sess.run(init)

    # Fit the model
    for step in xrange(2000):
        sess.run(train, feed_dict={X: x_data_training, Y: y_data_training})
        if step % 200 == 0:
            summary = sess.run(merged, feed_dict={X: x_data_training, Y: y_data_training})
            writer.add_summary(summary, step)
            #print step, sess.run(cost, feed_dict={X:x_data, Y:y_data}), sess.run(W1), sess.run(W2)
            print step, sess.run(cost, feed_dict={X: x_data_training, Y: y_data_training})

    # Test model: threshold the predicted probability at 0.5
    correct_prediction = tf.equal(tf.floor(hypothesis + 0.5), Y)
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    y_data_pred = sess.run(tf.floor(hypothesis + 0.5),
                           feed_dict={X: x_data_testing, Y: y_data_testing})
    print sess.run([hypothesis, tf.floor(hypothesis + 0.5), correct_prediction, accuracy],
                   feed_dict={X: x_data_testing, Y: y_data_testing})
    print "Accuracy:", accuracy.eval({X: x_data_testing, Y: y_data_testing})

    # Confusion matrix and per-class precision/recall on the test set
    # print confusion_matrix(y_data_testing[:,0], y_data_pred[:,0], labels=[0, 1])
    pd_y_true = pd.Series(y_data_testing[:, 0])
    pd_x_pred = pd.Series(y_data_pred[:, 0])
    print pd.crosstab(pd_y_true, pd_x_pred, rownames=['True'], colnames=['Predicted'], margins=True)
    target_names = ['false', 'true']
    print(classification_report(y_data_testing[:, 0], y_data_pred[:, 0], target_names=target_names))
    print 'Precision', precision_score(y_data_testing[:, 0], y_data_pred[:, 0], average='binary', pos_label=1)
Results
# Iteration & Cost
0 0.594439
200 0.59129
400 0.591194
600 0.591135
800 0.591078
1000 0.59102
1200 0.590963
1400 0.590907
1600 0.590852
1800 0.590796
Accuracy: 0.722513
# Confusion Matrix
Predicted 0.0 All
0.0 276 276
1.0 106 106
All 382 382
# Precision & Recall
precision recall f1-score support
false 0.72 1.00 0.84 276
true 0.00 0.00 0.00 106
avg / total 0.52 0.72 0.61 382
An eleven-layer neural network.
The activation functions are ReLU (hidden layers) and sigmoid (output layer).
Code
import tensorflow as tf
import numpy as np
from sklearn.metrics import precision_score, confusion_matrix
from sklearn.metrics import classification_report
import pandas as pd
# Eleven-layer neural network: ReLU hidden layers, sigmoid output
xyTraining = np.loadtxt('ikheeTrainingNorm.txt', unpack=True)
xyTesting = np.loadtxt('ikheeTestingNorm.txt', unpack=True)
x_data_training = np.transpose(xyTraining[0:-1])
y_data_training = np.reshape(xyTraining[-1], (len(x_data_training), 1))
x_data_testing = np.transpose(xyTesting[0:-1])
y_data_testing = np.reshape(xyTesting[-1], (len(x_data_testing), 1))
X = tf.placeholder(tf.float32, name='X-input')
Y = tf.placeholder(tf.float32, name='Y-input')
W1 = tf.Variable(tf.random_uniform([8, 8], -1.0, 1.0), name='Weight1')
# 9 hidden layers
W2 = tf.Variable(tf.random_uniform([8, 8], -1.0, 1.0), name='Weight2')
W3 = tf.Variable(tf.random_uniform([8, 8], -1.0, 1.0), name='Weight3')
W4 = tf.Variable(tf.random_uniform([8, 8], -1.0, 1.0), name='Weight4')
W5 = tf.Variable(tf.random_uniform([8, 8], -1.0, 1.0), name='Weight5')
W6 = tf.Variable(tf.random_uniform([8, 8], -1.0, 1.0), name='Weight6')
W7 = tf.Variable(tf.random_uniform([8, 8], -1.0, 1.0), name='Weight7')
W8 = tf.Variable(tf.random_uniform([8, 8], -1.0, 1.0), name='Weight8')
W9 = tf.Variable(tf.random_uniform([8, 8], -1.0, 1.0), name='Weight9')
W10 = tf.Variable(tf.random_uniform([8, 8], -1.0, 1.0), name='Weight10')
W11 = tf.Variable(tf.random_uniform([8, 1], -1.0, 1.0), name='Weight11')
b1 = tf.Variable(tf.zeros([8]), name="Bias1")
b2 = tf.Variable(tf.zeros([8]), name="Bias2")
b3 = tf.Variable(tf.zeros([8]), name="Bias3")
b4 = tf.Variable(tf.zeros([8]), name="Bias4")
b5 = tf.Variable(tf.zeros([8]), name="Bias5")
b6 = tf.Variable(tf.zeros([8]), name="Bias6")
b7 = tf.Variable(tf.zeros([8]), name="Bias7")
b8 = tf.Variable(tf.zeros([8]), name="Bias8")
b9 = tf.Variable(tf.zeros([8]), name="Bias9")
b10 = tf.Variable(tf.zeros([8]), name="Bias10")
b11 = tf.Variable(tf.zeros([1]), name="Bias11")
# Our hypothesis
with tf.name_scope("layer1") as scope:
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
with tf.name_scope("layer2") as scope:
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
with tf.name_scope("layer3") as scope:
L3 = tf.nn.relu(tf.matmul(L2, W3) + b3)
with tf.name_scope("layer4") as scope:
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)
with tf.name_scope("layer5") as scope:
L5 = tf.nn.relu(tf.matmul(L4, W5) + b5)
with tf.name_scope("layer6") as scope:
L6 = tf.nn.relu(tf.matmul(L5, W6) + b6)
with tf.name_scope("layer7") as scope:
L7 = tf.nn.relu(tf.matmul(L6, W7) + b7)
with tf.name_scope("layer8") as scope:
L8 = tf.nn.relu(tf.matmul(L7, W8) + b8)
with tf.name_scope("layer9") as scope:
L9 = tf.nn.relu(tf.matmul(L8, W9) + b9)
with tf.name_scope("layer10") as scope:
L10 = tf.nn.relu(tf.matmul(L9, W10) + b10)
with tf.name_scope("last") as scope:
hypothesis = tf.sigmoid(tf.matmul(L10, W11) + b11)
# Cost function
with tf.name_scope("cost") as scope:
cost = -tf.reduce_mean(Y*tf.log(hypothesis) + (1-Y)*tf.log(1-hypothesis))
cost_summ = tf.scalar_summary("cost", cost)
# Minimize
with tf.name_scope("train") as scope:
a = tf.Variable(0.001) # Learning rate, alpha
optimizer = tf.train.GradientDescentOptimizer(a)
train = optimizer.minimize(cost)
# Before starting, initialize the variables.
# We will `run` this first.
init = tf.initialize_all_variables()
# Launch the graph
with tf.Session() as sess:
    # tensorboard --logdir=./logs/xor_logs
    merged = tf.merge_all_summaries()
    writer = tf.train.SummaryWriter("./logs/xor_logs", sess.graph_def)
    sess.run(init)

    # Fit the model
    for step in xrange(50000):
        sess.run(train, feed_dict={X: x_data_training, Y: y_data_training})
        if step % 2000 == 0:
            summary = sess.run(merged, feed_dict={X: x_data_training, Y: y_data_training})
            writer.add_summary(summary, step)
            #print step, sess.run(cost, feed_dict={X:x_data, Y:y_data}), sess.run(W1), sess.run(W2)
            print step, sess.run(cost, feed_dict={X: x_data_training, Y: y_data_training})

    # Test model: threshold the predicted probability at 0.5
    correct_prediction = tf.equal(tf.floor(hypothesis + 0.5), Y)
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    y_data_pred = sess.run(tf.floor(hypothesis + 0.5),
                           feed_dict={X: x_data_testing, Y: y_data_testing})
    print sess.run([hypothesis, tf.floor(hypothesis + 0.5), correct_prediction, accuracy],
                   feed_dict={X: x_data_testing, Y: y_data_testing})
    print "Accuracy:", accuracy.eval({X: x_data_testing, Y: y_data_testing})

    # Confusion matrix and per-class precision/recall on the test set
    # print confusion_matrix(y_data_testing[:,0], y_data_pred[:,0], labels=[0, 1])
    pd_y_true = pd.Series(y_data_testing[:, 0])
    pd_x_pred = pd.Series(y_data_pred[:, 0])
    print pd.crosstab(pd_y_true, pd_x_pred, rownames=['True'], colnames=['Predicted'], margins=True)
    target_names = ['false', 'true']
    print(classification_report(y_data_testing[:, 0], y_data_pred[:, 0], target_names=target_names))
    print 'Precision', precision_score(y_data_testing[:, 0], y_data_pred[:, 0], average='binary', pos_label=1)
Results
/root/tensorflow/bin/python /root/PycharmProjects/TensorFlowTest/PASSwithDNNLeRu9Hidden.py
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 8
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 8
0 0.770699
2000 0.589376
4000 0.580235
6000 0.578699
8000 0.577574
10000 0.576372
12000 0.575388
14000 0.574309
16000 0.572363
18000 0.570983
20000 0.569931
22000 0.568943
24000 0.567569
26000 0.565458
28000 0.564114
30000 0.562682
32000 0.561554
34000 0.56046
36000 0.559264
38000 0.558028
40000 0.556391
42000 0.555027
44000 0.553637
46000 0.55207
48000 0.550296
Accuracy: 0.727749
Predicted 0.0 1.0 All
True
0.0 276 0 276
1.0 104 2 106
All 380 2 382
precision recall f1-score support
false 0.73 1.00 0.84 276
true 1.00 0.02 0.04 106
avg / total 0.80 0.73 0.62 382
Precision 1.0
Conclusion
Because the labels themselves (the true/false responsiveness values) are noisy, the two models do not differ much overall, as the "garbage in, garbage out" maxim predicts.
However, the Deep Neural Network achieves far better precision, so it is still valuable in systems where a false prediction is critical.
The Deep Neural Network could likely be improved further with dropout, an Ada-family optimizer (e.g. Adam), and better initial weight settings.
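As a rough illustration of those three ideas, the sketch below stays in the same TensorFlow 0.x placeholder style as the code above; the hidden-layer size, keep probability, and learning rate are assumptions for illustration, not tuned values.

import math
import tensorflow as tf

# Sketch only: one hidden layer with Xavier-style uniform initialization,
# dropout on the hidden activations, and the Adam optimizer.
X = tf.placeholder(tf.float32, [None, 8], name='X-input')
Y = tf.placeholder(tf.float32, [None, 1], name='Y-input')
keep_prob = tf.placeholder(tf.float32)   # e.g. 0.7 while training, 1.0 at test time

def xavier_uniform(fan_in, fan_out):
    # Glorot/Xavier uniform range: sqrt(6 / (fan_in + fan_out))
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return tf.random_uniform([fan_in, fan_out], -limit, limit)

W1 = tf.Variable(xavier_uniform(8, 16), name='Weight1')
b1 = tf.Variable(tf.zeros([16]), name='Bias1')
W2 = tf.Variable(xavier_uniform(16, 1), name='Weight2')
b2 = tf.Variable(tf.zeros([1]), name='Bias2')

L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
L1_drop = tf.nn.dropout(L1, keep_prob)   # dropout regularization on the hidden layer
hypothesis = tf.sigmoid(tf.matmul(L1_drop, W2) + b2)

cost = -tf.reduce_mean(Y*tf.log(hypothesis) + (1-Y)*tf.log(1-hypothesis))
train = tf.train.AdamOptimizer(0.001).minimize(cost)   # adaptive optimizer instead of plain SGD

During training one would feed keep_prob=0.7 together with X and Y, and keep_prob=1.0 when evaluating accuracy on the test set.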