
TensorFlow | A Deep-Learning-Based Facial Expression Recognition System


Because of research commitments I don't have time to reply to everyone for now. My apologies.


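Update (2019-4-12)

I have uploaded the model weights and the model structure. GitHub does not support files larger than 25 MB, so they are hosted at the link below. If you need them urgently you can download them there, which also supports my work a little.

Link: https://download.csdn.net/download/shillyshally/11110754

If it is not urgent, leave your email address in the comments below. I will reply when I check the blog, but there will be some delay.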


Update (2019-1-1)

Added a resnet model; you can switch to it in cnn.py.


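I happened to be learning tensorflow, so I used it to rewrite the facial expression recognition system I built earlier. This version converts fer2013.csv directly to tfrecord for training, so there is no need to convert the data to image files one by one. Training is faster, the code is leaner, and training can be interrupted and later resumed from a saved model.

It is open-sourced on github.

Link: https://github.com/shillyshallysxy/emotion_classifier/tree/master/emotion_classifier_tensorflow_version

This is for anyone who needs a tensorflow version of the expression recognition system.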


Original Keras version: https://blog.csdn.net/shillyshally/article/details/80912854

Keras version on Github: https://github.com/shillyshallysxy/emotion_classifier

This is for anyone who needs the original Keras version.


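I built and trained a convolutional neural network with TensorFlow for facial expression recognition. Both the training set and the test set come from kaggle's fer2013 dataset.

The result looks like this: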

                    

[Image: sample recognition result; the blogger's face is lightly pixelated.]

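The whole system consists of two stages: training the convolutional neural network, and recognizing facial expressions.

1. Training the convolutional neural network

1.1 Getting the dataset

Using a public dataset saves the time of collecting data and also allows a fairer evaluation of the model and of the expression classifier. I therefore used the fer2013 facial expression database from the kaggle facial expression recognition competition. All images are stored together in a single csv file. In the original Keras version, the csv file was first converted with python into single-channel grayscale images sorted into folders by label; this TensorFlow version converts the csv directly to tfrecord instead (see 1.2).

fer2013 dataset link:  https://pan.baidu.com/s/1M6XS8ovXbn8-UfQwcUnvVQ password: jueq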

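1.2 Preprocessing the dataset

Convert the dataset to tfrecord format.

Loading every image into memory at once is slow on each training run and uses a huge amount of memory; even 16 GB was not enough. I therefore use a queue: part of the data is read from disk at a time, shuffled, and placed in an in-memory queue. Memory then only has to hold the queue, and because only part of the data is loaded at a time, loading is tens of times faster. Training still reads its batches from memory, so training speed is unaffected.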
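The queue idea described above can be sketched without TensorFlow. The following is only an illustration, assuming a bounded shuffle buffer similar in spirit to what `Dataset.shuffle(buffer_size)` does; the names `shuffle_buffer` and `batches` are hypothetical and not taken from the project.

```python
import random


def shuffle_buffer(records, buffer_size, seed=None):
    """Yield records in a locally shuffled order while holding at most buffer_size of them."""
    rng = random.Random(seed)
    buffer = []
    for record in records:
        buffer.append(record)
        if len(buffer) == buffer_size:
            # Buffer is full: hand out one randomly chosen record to make room.
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
    # Input is exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


def batches(records, batch_size):
    """Group a stream of records into lists of batch_size (the last one may be shorter)."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


# range(10) stands in for records that would be read lazily from disk.
for batch in batches(shuffle_buffer(range(10), buffer_size=4, seed=0), batch_size=5):
    print(batch)
```

Only `buffer_size` records live in memory at any moment, which is why the memory footprint no longer depends on the size of the dataset. A larger buffer gives a more thorough shuffle at the cost of more memory.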


  
    import csv

    import numpy as np
    import tensorflow as tf
    from tqdm import tqdm

    # csv_path, record_path_train/test/eval, default_height and default_width
    # are configuration values defined elsewhere in the project.

    # Split fer2013.csv by its last column (Usage) and drop that column from each row.
    with open(csv_path, 'r') as f:
        csvr = csv.reader(f)
        header = next(csvr)
        rows = [row for row in csvr]
    trn = [row[:-1] for row in rows if row[-1] == 'Training']
    val = [row[:-1] for row in rows if row[-1] == 'PublicTest']
    tst = [row[:-1] for row in rows if row[-1] == 'PrivateTest']


    def write_binary(record_name_, labels_images_, height_=default_height, width_=default_width):
        """Serialize (label, pixels) rows into one TFRecord file."""
        writer_ = tf.python_io.TFRecordWriter(record_name_)
        for label_image_ in tqdm(labels_images_):
            label_ = int(label_image_[0])
            # The pixels are stored as one space-separated string of grayscale values.
            image_ = np.asarray([int(p) for p in label_image_[-1].split()])
            example = tf.train.Example(
                features=tf.train.Features(
                    feature={
                        "image/label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label_])),
                        "image/height": tf.train.Feature(int64_list=tf.train.Int64List(value=[height_])),
                        "image/width": tf.train.Feature(int64_list=tf.train.Int64List(value=[width_])),
                        "image/raw": tf.train.Feature(int64_list=tf.train.Int64List(value=image_))
                    }
                )
            )
            writer_.write(example.SerializeToString())
        writer_.close()


    write_binary(record_path_train, trn)
    write_binary(record_path_test, tst)
    write_binary(record_path_eval, val)
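To make the csv layout that the conversion code relies on easier to see, here is a small standard-library sketch. It assumes the three columns `emotion`, `pixels` and `Usage`; the sample rows are made up and use 2*2 "images" instead of the real 48*48 ones, and the helper names `split_by_usage` and `to_rows` are hypothetical.

```python
import csv
import io

# Made-up miniature stand-in for fer2013.csv (real rows hold 48*48 = 2304 pixel values).
SAMPLE_CSV = """emotion,pixels,Usage
0,0 255 128 64,Training
3,10 20 30 40,PublicTest
6,1 2 3 4,PrivateTest
"""


def split_by_usage(csv_file):
    """Group (label, pixel_values) pairs by the value of the Usage column."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    splits = {}
    for emotion, pixels, usage in reader:
        values = [int(p) for p in pixels.split()]
        splits.setdefault(usage, []).append((int(emotion), values))
    return splits


def to_rows(values, width):
    """Reshape a flat pixel list into rows of the given width."""
    return [values[i:i + width] for i in range(0, len(values), width)]


splits = split_by_usage(io.StringIO(SAMPLE_CSV))
label, values = splits["Training"][0]
print(label, to_rows(values, width=2))
```

The same grouping is what separates the `Training`, `PublicTest` and `PrivateTest` rows before they are written to their own tfrecord files.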

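1.3 Building the convolutional neural network

The next step is to build the convolutional neural network.

Inspired by Google's paper Going deeper with convolutions, I added a 1*1 convolution layer right after the input layer. It gives the input an additional non-linear representation, deepens the network, and improves the model's expressive power while adding almost no computation. Following the idea behind VGG, I then tried splitting the 5*5 convolution into two 3*3 layers, but the results were not satisfactory. I arrived at the final model after trying many different architectures and tuning them repeatedly.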
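The cost of these design choices can be checked with a little arithmetic. The sketch below only counts weights and biases per convolution layer, assuming the channel sizes used in this model (1 input channel, 32 feature maps); `conv_params` is a hypothetical helper, not part of the project.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a square convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


one_by_one = conv_params(1, 1, 32)        # the 1*1 layer right after the input
five_by_five = conv_params(5, 32, 32)     # one 5*5 layer, 32 -> 32 channels
two_three_by_three = 2 * conv_params(3, 32, 32)  # two stacked 3*3 layers instead

print(one_by_one, five_by_five, two_three_by_three)
```

The 1*1 layer adds only 64 parameters, which is consistent with "almost no extra computation". Two stacked 3*3 layers cover the same 5*5 receptive field with fewer parameters (18496 versus 25632), which is the motivation for the VGG-style split, even though it did not improve accuracy here.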

The final network structure is as follows:

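Layer              Kernel  Stride  Padding  Output     Dropout
Input              -       -       -        48*48*1    -
Convolution 1      1*1     1       -        48*48*32   -
Convolution 2      5*5     1       2        48*48*32   -
Pooling 1          3*3     2       -        23*23*32   -
Convolution 3      3*3     1       1        23*23*32   -
Pooling 2          3*3     2       -        11*11*32   -
Convolution 4      5*5     1       2        11*11*64   -
Pooling 3          3*3     2       -        5*5*64     -
Fully connected 1  -       -       -        1*1*2048   50%
Fully connected 2  -       -       -        1*1*1024   50%
Output             -       -       -        1*1*7      -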
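The spatial sizes in the table can be reproduced with the usual output-size formulas. This is a sketch under the assumption that the table describes unpadded 3*3, stride-2 pooling, while the TensorFlow model code in this post pools with padding='SAME'; the helper names are hypothetical.

```python
import math


def out_size_valid(size, kernel, stride, pad=0):
    """Output width for explicit zero padding: floor((size + 2*pad - kernel) / stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1


def out_size_same(size, stride):
    """Output width for TensorFlow-style 'SAME' padding: ceil(size / stride)."""
    return math.ceil(size / stride)


# Sizes as listed in the table: three unpadded 3*3, stride-2 pooling layers.
size, table_sizes = 48, []
for _ in range(3):
    size = out_size_valid(size, kernel=3, stride=2)
    table_sizes.append(size)

# Sizes with 'SAME' pooling, as in the TensorFlow code.
size, same_sizes = 48, []
for _ in range(3):
    size = out_size_same(size, stride=2)
    same_sizes.append(size)

print(table_sizes, table_sizes[-1] ** 2 * 64)  # flattened size per the table
print(same_sizes, same_sizes[-1] ** 2 * 64)    # flattened size per the code
```

With 'SAME' pooling the feature map shrinks 48 -> 24 -> 12 -> 6, so the flattened input to the first fully connected layer is 6*6*64 = 2304, which matches the default `full_shape_=2304` in the model code. The 23/11/5 sizes in the table correspond to unpadded pooling and would give 5*5*64 = 1600 instead.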

 

Model code:

The structure is clear, so I won't explain it further.


  
    import tensorflow as tf
    from tensorflow.contrib.layers.python.layers import initializers


    class CNN_Model():

        def __init__(self, num_tags_=7, lr_=0.001, channel_=1, hidden_dim_=1024, full_shape_=2304, optimizer_='Adam'):
            self.num_tags = num_tags_
            self.lr = lr_
            # Flattened size of the last feature map (6*6*64 for 48*48 inputs).
            self.full_shape = full_shape_
            self.channel = channel_
            self.hidden_dim = hidden_dim_
            self.conv_feature = [32, 32, 32, 64]
            self.conv_size = [1, 5, 3, 5]
            self.maxpool_size = [0, 3, 3, 3]
            self.maxpool_stride = [0, 2, 2, 2]
            # self.initializer = tf.truncated_normal_initializer(stddev=0.05)
            self.initializer = initializers.xavier_initializer()
            # Despite its name, this placeholder is fed the keep probability.
            self.dropout = tf.placeholder(dtype=tf.float32, name='dropout')
            self.x_input = tf.placeholder(dtype=tf.float32, shape=[None, None, None, self.channel], name='x_input')
            self.y_target = tf.placeholder(dtype=tf.int32, shape=[None], name='y_target')
            self.batch_size = tf.shape(self.x_input)[0]
            self.logits = self.project_layer(self.cnn_layer())
            with tf.variable_scope("loss"):
                self.loss = self.loss_layer(self.logits)
                self.train_step = self.optimizer(self.loss, optimizer_)

        def cnn_layer(self):
            """Four convolution blocks with local response normalization (LRN)."""
            with tf.variable_scope("conv1"):
                conv1_weight = tf.get_variable('conv1_weight', [self.conv_size[0], self.conv_size[0],
                                                                self.channel, self.conv_feature[0]],
                                               dtype=tf.float32, initializer=self.initializer)
                conv1_bias = tf.get_variable('conv1_bias', [self.conv_feature[0]], dtype=tf.float32,
                                             initializer=tf.constant_initializer(0.0))
                conv1 = tf.nn.conv2d(self.x_input, conv1_weight, [1, 1, 1, 1], padding='SAME')
                conv1_add_bias = tf.nn.bias_add(conv1, conv1_bias)
                conv1_relu = tf.nn.relu(conv1_add_bias)
                norm1 = tf.nn.lrn(conv1_relu, depth_radius=5, bias=2.0, alpha=1e-3, beta=0.75, name='norm1')
            with tf.variable_scope("conv2"):
                conv2_weight = tf.get_variable('conv2_weight', [self.conv_size[1], self.conv_size[1],
                                                                self.conv_feature[0], self.conv_feature[1]],
                                               dtype=tf.float32, initializer=self.initializer)
                conv2_bias = tf.get_variable('conv2_bias', [self.conv_feature[1]], dtype=tf.float32,
                                             initializer=tf.constant_initializer(0.0))
                conv2 = tf.nn.conv2d(norm1, conv2_weight, [1, 1, 1, 1], padding='SAME')
                conv2_add_bias = tf.nn.bias_add(conv2, conv2_bias)
                conv2_relu = tf.nn.relu(conv2_add_bias)
                pool2 = tf.nn.max_pool(conv2_relu, ksize=[1, self.maxpool_size[1], self.maxpool_size[1], 1],
                                       strides=[1, self.maxpool_stride[1], self.maxpool_stride[1], 1],
                                       padding='SAME', name='pool_layer2')
                norm2 = tf.nn.lrn(pool2, depth_radius=5, bias=2.0, alpha=1e-3, beta=0.75, name='norm2')
            with tf.variable_scope("conv3"):
                conv3_weight = tf.get_variable('conv3_weight', [self.conv_size[2], self.conv_size[2],
                                                                self.conv_feature[1], self.conv_feature[2]],
                                               dtype=tf.float32, initializer=self.initializer)
                conv3_bias = tf.get_variable('conv3_bias', [self.conv_feature[2]], dtype=tf.float32,
                                             initializer=tf.constant_initializer(0.0))
                conv3 = tf.nn.conv2d(norm2, conv3_weight, [1, 1, 1, 1], padding='SAME')
                conv3_add_bias = tf.nn.bias_add(conv3, conv3_bias)
                conv3_relu = tf.nn.relu(conv3_add_bias)
                pool3 = tf.nn.max_pool(conv3_relu, ksize=[1, self.maxpool_size[2], self.maxpool_size[2], 1],
                                       strides=[1, self.maxpool_stride[2], self.maxpool_stride[2], 1],
                                       padding='SAME', name='pool_layer3')
                norm3 = tf.nn.lrn(pool3, depth_radius=5, bias=2.0, alpha=1e-3, beta=0.75, name='norm3')
            with tf.variable_scope("conv4"):
                conv4_weight = tf.get_variable('conv4_weight', [self.conv_size[3], self.conv_size[3],
                                                                self.conv_feature[2], self.conv_feature[3]],
                                               dtype=tf.float32, initializer=self.initializer)
                conv4_bias = tf.get_variable('conv4_bias', [self.conv_feature[3]], dtype=tf.float32,
                                             initializer=tf.constant_initializer(0.0))
                conv4 = tf.nn.conv2d(norm3, conv4_weight, [1, 1, 1, 1], padding='SAME')
                conv4_add_bias = tf.nn.bias_add(conv4, conv4_bias)
                conv4_relu = tf.nn.relu(conv4_add_bias)
                pool4 = tf.nn.max_pool(conv4_relu, ksize=[1, self.maxpool_size[3], self.maxpool_size[3], 1],
                                       strides=[1, self.maxpool_stride[3], self.maxpool_stride[3], 1],
                                       padding='SAME', name='pool_layer4')
                norm4 = tf.nn.lrn(pool4, depth_radius=5, bias=2.0, alpha=1e-3, beta=0.75, name='norm4')
            return norm4

        def cnn_layer_single(self):
            """Same four convolution blocks as cnn_layer, without the LRN layers."""
            with tf.variable_scope("conv1"):
                conv1_weight = tf.get_variable('conv1_weight', [self.conv_size[0], self.conv_size[0],
                                                                self.channel, self.conv_feature[0]],
                                               dtype=tf.float32, initializer=self.initializer)
                conv1_bias = tf.get_variable('conv1_bias', [self.conv_feature[0]], dtype=tf.float32,
                                             initializer=tf.constant_initializer(0.0))
                conv1 = tf.nn.conv2d(self.x_input, conv1_weight, [1, 1, 1, 1], padding='SAME')
                conv1_add_bias = tf.nn.bias_add(conv1, conv1_bias)
                conv1_relu = tf.nn.relu(conv1_add_bias)
            with tf.variable_scope("conv2"):
                conv2_weight = tf.get_variable('conv2_weight', [self.conv_size[1], self.conv_size[1],
                                                                self.conv_feature[0], self.conv_feature[1]],
                                               dtype=tf.float32, initializer=self.initializer)
                conv2_bias = tf.get_variable('conv2_bias', [self.conv_feature[1]], dtype=tf.float32,
                                             initializer=tf.constant_initializer(0.0))
                conv2 = tf.nn.conv2d(conv1_relu, conv2_weight, [1, 1, 1, 1], padding='SAME')
                conv2_add_bias = tf.nn.bias_add(conv2, conv2_bias)
                conv2_relu = tf.nn.relu(conv2_add_bias)
                pool2 = tf.nn.max_pool(conv2_relu, ksize=[1, self.maxpool_size[1], self.maxpool_size[1], 1],
                                       strides=[1, self.maxpool_stride[1], self.maxpool_stride[1], 1],
                                       padding='SAME', name='pool_layer2')
            with tf.variable_scope("conv3"):
                conv3_weight = tf.get_variable('conv3_weight', [self.conv_size[2], self.conv_size[2],
                                                                self.conv_feature[1], self.conv_feature[2]],
                                               dtype=tf.float32, initializer=self.initializer)
                conv3_bias = tf.get_variable('conv3_bias', [self.conv_feature[2]], dtype=tf.float32,
                                             initializer=tf.constant_initializer(0.0))
                conv3 = tf.nn.conv2d(pool2, conv3_weight, [1, 1, 1, 1], padding='SAME')
                conv3_add_bias = tf.nn.bias_add(conv3, conv3_bias)
                conv3_relu = tf.nn.relu(conv3_add_bias)
                pool3 = tf.nn.max_pool(conv3_relu, ksize=[1, self.maxpool_size[2], self.maxpool_size[2], 1],
                                       strides=[1, self.maxpool_stride[2], self.maxpool_stride[2], 1],
                                       padding='SAME', name='pool_layer3')
            with tf.variable_scope("conv4"):
                conv4_weight = tf.get_variable('conv4_weight', [self.conv_size[3], self.conv_size[3],
                                                                self.conv_feature[2], self.conv_feature[3]],
                                               dtype=tf.float32, initializer=self.initializer)
                conv4_bias = tf.get_variable('conv4_bias', [self.conv_feature[3]], dtype=tf.float32,
                                             initializer=tf.constant_initializer(0.0))
                conv4 = tf.nn.conv2d(pool3, conv4_weight, [1, 1, 1, 1], padding='SAME')
                conv4_add_bias = tf.nn.bias_add(conv4, conv4_bias)
                conv4_relu = tf.nn.relu(conv4_add_bias)
                pool4 = tf.nn.max_pool(conv4_relu, ksize=[1, self.maxpool_size[3], self.maxpool_size[3], 1],
                                       strides=[1, self.maxpool_stride[3], self.maxpool_stride[3], 1],
                                       padding='SAME', name='pool_layer4')
            return pool4

        def project_layer(self, x_in_):
            """Two fully connected ReLU layers with dropout, then the 7-way output layer."""
            with tf.variable_scope("project"):
                with tf.variable_scope("hidden"):
                    x_in_ = tf.reshape(x_in_, [self.batch_size, -1])
                    # Note: the L2 regularizers only register regularization losses;
                    # loss_layer below does not add them to the training loss.
                    w_tanh1 = tf.get_variable("w_tanh1", [self.full_shape, self.hidden_dim * 2], initializer=self.initializer,
                                              regularizer=tf.contrib.layers.l2_regularizer(0.001))
                    b_tanh1 = tf.get_variable("b_tanh1", [self.hidden_dim * 2], initializer=tf.zeros_initializer())
                    w_tanh2 = tf.get_variable("w_tanh2", [self.hidden_dim * 2, self.hidden_dim], initializer=self.initializer,
                                              regularizer=tf.contrib.layers.l2_regularizer(0.001))
                    b_tanh2 = tf.get_variable("b_tanh2", [self.hidden_dim], initializer=tf.zeros_initializer())
                    output1 = tf.nn.dropout(tf.nn.relu(tf.add(tf.matmul(x_in_, w_tanh1),
                                                              b_tanh1)), keep_prob=self.dropout)
                    output2 = tf.nn.dropout(tf.nn.relu(tf.add(tf.matmul(output1, w_tanh2),
                                                              b_tanh2)), keep_prob=self.dropout)
                with tf.variable_scope("output"):
                    w_out = tf.get_variable("w_out", [self.hidden_dim, self.num_tags], initializer=self.initializer,
                                            regularizer=tf.contrib.layers.l2_regularizer(0.001))
                    b_out = tf.get_variable("b_out", [self.num_tags], initializer=tf.zeros_initializer())
                    pred_ = tf.add(tf.matmul(output2, w_out), b_out, name='logits')
            return pred_

        def loss_layer(self, project_logits):
            with tf.variable_scope("loss"):
                loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
                    logits=project_logits, labels=self.y_target), name='softmax_loss')
            return loss

        def optimizer(self, loss_, method=''):
            if method == 'Momentum':
                step = tf.Variable(0, trainable=False)
                model_learning_rate = tf.train.exponential_decay(0.01, step,
                                                                 100, 0.99, staircase=True)
                my_optimizer = tf.train.MomentumOptimizer(model_learning_rate, momentum=0.9)
                train_step_ = my_optimizer.minimize(loss_, global_step=step, name='train_step')
                print('Using ', method)
            elif method == 'SGD':
                step = tf.Variable(0, trainable=False)
                model_learning_rate = tf.train.exponential_decay(0.1, step,
                                                                 200., 0.96, staircase=True)
                my_optimizer = tf.train.GradientDescentOptimizer(model_learning_rate)
                # Pass global_step so that step is incremented and the learning rate actually decays.
                train_step_ = my_optimizer.minimize(loss_, global_step=step, name='train_step')
                print('Using ', method)
            elif method == 'Adam':
                train_step_ = tf.train.AdamOptimizer(self.lr).minimize(loss_, name='train_step')
                print('Using ', method)
            else:
                train_step_ = tf.train.MomentumOptimizer(0.005, momentum=0.9).minimize(loss_, name='train_step')
                print('Using Default')
            return train_step_
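The Momentum and SGD branches above use an exponentially decaying learning rate. As a rough illustration of that schedule in plain Python (not the TensorFlow implementation itself), the learning rate is the base rate multiplied by `decay_rate` raised to `global_step / decay_steps`, where `staircase=True` uses integer division so the rate drops in discrete steps:

```python
def exponential_decay(base_lr, global_step, decay_steps, decay_rate, staircase=False):
    """Learning rate after global_step updates under an exponential decay schedule."""
    if staircase:
        exponent = global_step // decay_steps  # drops once every decay_steps updates
    else:
        exponent = global_step / decay_steps   # decays smoothly
    return base_lr * decay_rate ** exponent


# Settings of the Momentum branch: base rate 0.01, multiplied by 0.99 every 100 steps.
for step in (0, 99, 100, 1000):
    print(step, exponential_decay(0.01, step, 100, 0.99, staircase=True))
```

The schedule depends entirely on the step counter, so in TensorFlow the decay only takes effect when that counter is passed to `minimize(..., global_step=step)` and therefore gets incremented on every update.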

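1.4 Training the model

Data augmentation uses horizontal flipping, brightness and contrast adjustment, and random cropping. The structure is clear and the code is simple, so I won't explain it further.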


  
    import os

    import numpy as np
    import tensorflow as tf

    import cnn

    # default_height, default_width, channel, the data/record/checkpoint paths,
    # batch sizes, generations, retrain and save_flag are configuration values
    # defined elsewhere in the project.


    # Data augmentation
    def pre_process_img(image):
        image = tf.image.random_flip_left_right(image)
        image = tf.image.random_brightness(image, max_delta=32. / 255)
        image = tf.image.random_contrast(image, lower=0.8, upper=1.2)
        # Draw the crop margins inside the graph so that every sample gets its own crop size;
        # np.random.randint would be evaluated only once, when the graph is built.
        crop_h = default_height - tf.random_uniform([], 0, 4, dtype=tf.int32)
        crop_w = default_width - tf.random_uniform([], 0, 4, dtype=tf.int32)
        image = tf.random_crop(image, tf.stack([crop_h, crop_w, 1]))
        image = tf.image.resize_images(image, [default_height, default_width])
        return image


    def __parse_function_image(serial_exmp_):
        """Parse a record whose image is stored as an encoded JPEG string."""
        features_ = tf.parse_single_example(serial_exmp_, features={"image/label": tf.FixedLenFeature([], tf.int64),
                                                                    "image/height": tf.FixedLenFeature([], tf.int64),
                                                                    "image/width": tf.FixedLenFeature([], tf.int64),
                                                                    "image/raw": tf.FixedLenFeature([], tf.string)})
        label_ = tf.cast(features_["image/label"], tf.int32)
        height_ = tf.cast(features_["image/height"], tf.int32)
        width_ = tf.cast(features_["image/width"], tf.int32)
        image_ = tf.image.decode_jpeg(features_["image/raw"])
        image_ = tf.reshape(image_, [height_, width_, channel])
        image_ = tf.image.convert_image_dtype(image_, dtype=tf.float32)
        image_ = tf.image.resize_images(image_, [default_height, default_width])
        # image_ = pre_process_img(image_)
        return image_, label_


    def __parse_function_csv(serial_exmp_):
        """Parse a record written from fer2013.csv (pixels stored as an int64 list)."""
        features_ = tf.parse_single_example(serial_exmp_,
                                            features={"image/label": tf.FixedLenFeature([], tf.int64),
                                                      "image/height": tf.FixedLenFeature([], tf.int64),
                                                      "image/width": tf.FixedLenFeature([], tf.int64),
                                                      "image/raw": tf.FixedLenFeature([default_width*default_height*channel],
                                                                                      tf.int64)})
        label_ = tf.cast(features_["image/label"], tf.int32)
        height_ = tf.cast(features_["image/height"], tf.int32)
        width_ = tf.cast(features_["image/width"], tf.int32)
        image_ = tf.cast(features_["image/raw"], tf.int32)
        image_ = tf.reshape(image_, [height_, width_, channel])
        # Scale pixel values to [0, 1].
        image_ = tf.multiply(tf.cast(image_, tf.float32), 1. / 255)
        image_ = pre_process_img(image_)
        return image_, label_


    def get_dataset(record_name_):
        record_path_ = os.path.join(data_folder_name, data_path_name, record_name_)
        data_set_ = tf.data.TFRecordDataset(record_path_)
        return data_set_.map(__parse_function_csv)


    def evaluate(logits_, y_):
        """Fraction of samples whose highest-scoring class equals the label."""
        return np.mean(np.equal(np.argmax(logits_, axis=1), y_))


    def main(argv):
        with tf.Session() as sess:
            summary_writer = tf.summary.FileWriter(tensorboard_path, sess.graph)
            data_set_train = get_dataset(record_name_train)
            data_set_train = data_set_train.shuffle(shuffle_pool_size).batch(batch_size).repeat()
            data_set_train_iter = data_set_train.make_one_shot_iterator()
            train_handle = sess.run(data_set_train_iter.string_handle())

            data_set_test = get_dataset(record_name_test)
            data_set_test = data_set_test.shuffle(shuffle_pool_size).batch(test_batch_size).repeat()
            data_set_test_iter = data_set_test.make_one_shot_iterator()
            test_handle = sess.run(data_set_test_iter.string_handle())

            # A feedable iterator: the string handle selects the training or the test dataset.
            handle = tf.placeholder(tf.string, shape=[], name='handle')
            iterator = tf.data.Iterator.from_string_handle(handle, data_set_train.output_types, data_set_train.output_shapes)
            x_input_batch, y_target_batch = iterator.get_next()

            cnn_model = cnn.CNN_Model()
            x_input = cnn_model.x_input
            y_target = cnn_model.y_target
            logits = tf.nn.softmax(cnn_model.logits)
            loss = cnn_model.loss
            train_step = cnn_model.train_step
            dropout = cnn_model.dropout
            sess.run(tf.global_variables_initializer())

            # Optionally resume from a previously saved checkpoint.
            if retrain:
                print('retraining')
                ckpt_name = 'cnn_emotion_classifier.ckpt'
                ckpt_path = os.path.join(data_folder_name, data_path_name, ckpt_name)
                saver = tf.train.Saver()
                saver.restore(sess, ckpt_path)

            with tf.name_scope('Loss_and_Accuracy'):
                tf.summary.scalar('Loss', loss)
            summary_op = tf.summary.merge_all()

            print('start training')
            saver = tf.train.Saver(max_to_keep=1)
            max_accuracy = 0
            temp_train_loss = []
            temp_test_loss = []
            temp_train_acc = []
            temp_test_acc = []
            for i in range(generations):
                x_batch, y_batch = sess.run([x_input_batch, y_target_batch], feed_dict={handle: train_handle})
                train_feed_dict = {x_input: x_batch, y_target: y_batch,
                                   dropout: 0.5}
                sess.run(train_step, train_feed_dict)
                if (i + 1) % 100 == 0:
                    train_loss, train_logits = sess.run([loss, logits], train_feed_dict)
                    train_accuracy = evaluate(train_logits, y_batch)
                    print('Generation # {}. Train Loss : {:.3f} . '
                          'Train Acc : {:.3f}'.format(i, train_loss, train_accuracy))
                    temp_train_loss.append(train_loss)
                    temp_train_acc.append(train_accuracy)
                    summary_writer.add_summary(sess.run(summary_op, train_feed_dict), i)
                if (i + 1) % 400 == 0:
                    test_x_batch, test_y_batch = sess.run([x_input_batch, y_target_batch], feed_dict={handle: test_handle})
                    test_feed_dict = {x_input: test_x_batch, y_target: test_y_batch,
                                      dropout: 1.0}
                    test_loss, test_logits = sess.run([loss, logits], test_feed_dict)
                    test_accuracy = evaluate(test_logits, test_y_batch)
                    print('Generation # {}. Test Loss : {:.3f} . '
                          'Test Acc : {:.3f}'.format(i, test_loss, test_accuracy))
                    temp_test_loss.append(test_loss)
                    temp_test_acc.append(test_accuracy)
                    # Save only in the second half of training, whenever test accuracy matches or beats the best so far.
                    if test_accuracy >= max_accuracy and save_flag and i > generations // 2:
                        max_accuracy = test_accuracy
                        saver.save(sess, os.path.join(data_folder_name, data_path_name, save_ckpt_name))
                        print('Generation # {}. --model saved--'.format(i))
            print('Last accuracy : ', max_accuracy)
            with open(model_log_path, 'w') as f:
                f.write('train_loss: ' + str(temp_train_loss))
                f.write('\n\ntest_loss: ' + str(temp_test_loss))
                f.write('\n\ntrain_acc: ' + str(temp_train_acc))
                f.write('\n\ntest_acc: ' + str(temp_test_acc))
            print(' --log saved--')


    if __name__ == '__main__':
        tf.app.run()
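The training loop feeds 0.5 to the `dropout` placeholder during training and 1.0 during testing. Since the model passes that placeholder as `keep_prob`, 0.5 means each activation is kept with probability 0.5, and 1.0 disables dropout. The plain-Python sketch below illustrates this "inverted dropout" behaviour; it is an illustration of the idea, not the TensorFlow implementation, and the function name is hypothetical.

```python
import random


def dropout(values, keep_prob, rng=random):
    """Keep each value with probability keep_prob and scale the kept ones by 1 / keep_prob."""
    if keep_prob >= 1.0:
        return list(values)  # test time: activations pass through unchanged
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10000
train_out = dropout(activations, keep_prob=0.5, rng=rng)
test_out = dropout(activations, keep_prob=1.0)

# Scaling by 1 / keep_prob keeps the expected activation the same in both modes.
print(sum(train_out) / len(train_out), sum(test_out) / len(test_out))
```

Because the kept activations are scaled up during training, no rescaling is needed at test time, which is why the same graph can be used for both by changing only the fed value.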

2. Facial expression recognition module

2.1 Loading the model


  
    import tensorflow as tf

    # ckpt_path is the checkpoint path prefix, defined elsewhere in the project.
    # config=tf.ConfigProto(log_device_placement=True)
    sess = tf.Session()
    # Rebuild the graph from the .meta file, then restore the trained weights.
    saver = tf.train.import_meta_graph(ckpt_path + '.meta')
    saver.restore(sess, ckpt_path)
    graph = tf.get_default_graph()
    # Print all node names; useful for locating the tensors fetched below.
    name = [n.name for n in graph.as_graph_def().node]
    print(name)
    x_input = graph.get_tensor_by_name('x_input:0')
    dropout = graph.get_tensor_by_name('dropout:0')
    logits = graph.get_tensor_by_name('project/output/logits:0')

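2.2 Expression recognition

This code can also generate the confusion matrix for the test and validation sets. Set confusion_matrix to True to generate the confusion matrix, or to False to run recognition on every image in the folder.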


  
    import csv
    import os

    import cv2
    import numpy as np
    import tensorflow as tf

    # sess, x_input, dropout and logits come from the model-loading code in 2.1;
    # casc_path, casc_name, pic_path, csv_path and ckpt_name are configuration
    # values defined elsewhere in the project.

    img_size = 48
    confusion_matrix = False
    emotion_labels = ['angry', 'disgust', 'fear', 'happy', 'sad', 'surprise', 'neutral']
    num_class = len(emotion_labels)
    # Build this op once; creating it inside predict_emotion would add new nodes
    # to the graph on every call.
    summed_probs = tf.reduce_sum(tf.nn.softmax(logits), axis=0)


    def produce_confusion_matrix(images_, total_num_):
        """Print one confusion-matrix row per true class, then the overall accuracy."""
        results = np.array([0]*num_class)
        total = []
        for imgs_ in images_:
            for img_ in imgs_:
                results[np.argmax(predict_emotion(img_))] += 1
            print(results, np.around(results/len(imgs_), decimals=3))
            total.append(results)
            results = np.array([0]*num_class)
        correct = 0
        for i_ in range(num_class):
            correct += total[i_][i_]
        print('acc: {:.3f} %'.format(correct * 100. / total_num_))
        print('Using ', ckpt_name)


    def predict_emotion(face_img_, img_size_=48):
        """Sum the softmax outputs over three views: full image, cropped image, mirrored image."""
        face_img_ = face_img_ * (1. / 255)
        resized_img_ = cv2.resize(face_img_, (img_size_, img_size_))  # ,interpolation=cv2.INTER_LINEAR
        rsz_img = []
        rsz_img.append(resized_img_[:, :])
        rsz_img.append(resized_img_[2:45, :])
        rsz_img.append(cv2.flip(rsz_img[0], 1))
        for i_, rsz_image in enumerate(rsz_img):
            rsz_img[i_] = cv2.resize(rsz_image, (img_size_, img_size_)).reshape(img_size_, img_size_, 1)
        rsz_img = np.array(rsz_img)
        feed_dict_ = {x_input: rsz_img, dropout: 1.0}
        pred_logits_ = sess.run(summed_probs, feed_dict_)
        return np.squeeze(pred_logits_)


    def face_detect(image_path, casc_path_=casc_path):
        if os.path.isfile(casc_path_):
            face_cascade_ = cv2.CascadeClassifier(casc_path_)
            img_ = cv2.imread(image_path)
            img_gray_ = cv2.cvtColor(img_, cv2.COLOR_BGR2GRAY)
            # face detection
            faces = face_cascade_.detectMultiScale(
                img_gray_,
                scaleFactor=1.1,
                minNeighbors=1,
                minSize=(30, 30),
            )
            return faces, img_gray_, img_
        else:
            # Fail loudly: callers unpack three return values.
            raise FileNotFoundError("There is no {} in {}".format(casc_name, casc_path_))


    if __name__ == '__main__':
        if not confusion_matrix:
            images_path = []
            files = os.listdir(pic_path)
            for file in files:
                if file.lower().endswith(('jpg', 'png')):
                    images_path.append(os.path.join(pic_path, file))
            for image in images_path:
                faces, img_gray, img = face_detect(image)
                spb = img.shape
                sp = img_gray.shape
                height = sp[0]
                width = sp[1]
                size = 600
                emotion_pre_dict = {}
                face_exists = 0
                for (x, y, w, h) in faces:
                    face_exists = 1
                    face_img_gray = img_gray[y:y + h, x:x + w]
                    results_sum = predict_emotion(face_img_gray)  # face_img_gray
                    for i, emotion_pre in enumerate(results_sum):
                        emotion_pre_dict[emotion_labels[i]] = emotion_pre
                    # Print the scores of all emotions
                    print(emotion_pre_dict)
                    # Print the emotion with the highest score
                    label = np.argmax(results_sum)
                    emo = emotion_labels[int(label)]
                    print('Emotion : ', emo)
                    # Scale the box and text to the resolution of the photo
                    t_size = 2
                    ww = max(1, int(spb[0] * t_size / 300))
                    www = max(1, int((w + 10) * t_size / 100))
                    www_s = int((w + 20) * t_size / 100) * 2 / 5
                    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), ww)
                    cv2.putText(img, emo, (x + 2, y + h - 2), cv2.FONT_HERSHEY_SIMPLEX,
                                www_s, (255, 0, 255), thickness=www, lineType=1)
                    # img_gray full face     face_img_gray part of face
                if face_exists:
                    cv2.namedWindow('Emotion_classifier', 0)
                    cent = int((height * 1.0 / width) * size)
                    cv2.resizeWindow('Emotion_classifier', size, cent)
                    cv2.imshow('Emotion_classifier', img)
                    k = cv2.waitKey(0)
                    cv2.destroyAllWindows()
                    # if k & 0xFF == ord('q'):
                    #     break
        if confusion_matrix:
            with open(csv_path, 'r') as f:
                csvr = csv.reader(f)
                header = next(csvr)
                rows = [row for row in csvr]
            val = [row[:-1] for row in rows if row[-1] == 'PublicTest']
            # tst = [row[:-1] for row in rows if row[-1] == 'PrivateTest']
            confusion_images_total = []
            # Group the images by their true label (0..6).
            confusion_images = {0: [], 1: [], 2: [], 3: [], 4: [], 5: [], 6: []}
            test_set = val
            total_num = len(test_set)
            for label_image_ in test_set:
                label_ = int(label_image_[0])
                image_ = np.reshape(np.asarray([int(p) for p in label_image_[-1].split()]), [img_size, img_size, 1])
                confusion_images[label_].append(image_)
            produce_confusion_matrix(confusion_images.values(), total_num)
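The confusion-matrix bookkeeping above can be illustrated on its own. The sketch below uses made-up (true label, predicted label) pairs and three classes instead of seven; the function names are hypothetical. Each row counts how the samples of one true class were predicted, and the overall accuracy is the sum of the diagonal divided by the number of samples.

```python
def confusion_matrix_from_pairs(pairs, num_classes):
    """Rows are true labels, columns are predicted labels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in pairs:
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy(matrix):
    """Correct predictions (the diagonal) divided by the total number of samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Made-up predictions for illustration only.
pairs = [(0, 0), (0, 1), (1, 1), (2, 2), (2, 0), (2, 2)]
matrix = confusion_matrix_from_pairs(pairs, num_classes=3)
for row in matrix:
    print(row)
print('acc: {:.3f} %'.format(accuracy(matrix) * 100))
```

Reading along a row shows which other expressions a given class is most often mistaken for, which is more informative than the single accuracy figure.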

3. Results

 

Finally, the experiment environment:

    OS: win10

    Language: python3.6

    GPU: GTX1080ti

References

  1. Jeon J, Park J C, Jo Y J, et al. A Real-time Facial Expression Recognizer using Deep Neural Network[J]. ACM, 2016: 1-4.
  2. He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition[C]// IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 2016: 770-778.
  3. Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]// International Conference on Neural Information Processing Systems. Curran Associates Inc., 2012: 1097-1105.
  4. Zeiler M D, Fergus R. Visualizing and Understanding Convolutional Networks[C]// European Conference on Computer Vision. Springer, Cham, 2014: 818-833.
  5. Szegedy C, Liu W, Jia Y, et al. Going Deeper with Convolutions[C]. 2014.
  6. Samaa M, Shohieb. SignsWorld Facial Expression Recognition System (FERS)[J]. Intelligent Automation & Soft Computing, 2015.
  7. Srivastava N, Hinton G, Krizhevsky A, et al. Dropout: A simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929-1958.
  8. Jia Y, Shelhamer E, et al. Caffe: Convolutional Architecture for Fast Feature Embedding[J]. 2014: 675-678.

Reposted from: https://blog.csdn.net/shillyshally/article/details/84934174