How to Play AlphaGo-Style Gomoku (Five in a Row) with a CNN

2020-04-15 20:21

Author | 李秋键

Editor | 郭芮

Produced by | CSDN (ID: CSDNnews)

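In recent years, AI in games has been developing rapidly. Ever since AlphaGo defeated top human Go players, AI research has surged, and with it the application and refinement of AI in games. Game AI actually has a long history: a great many games revolve around fighting an "enemy", and that enemy is the AI. Some of these are low-level AIs whose behavior is fixed and never varies; others are slightly more advanced and introduce random factors. Either way, such an AI is essentially a fixed script, and once players work out its patterns, the game quickly loses its appeal.

A deep-learning AI is different. With many layers of parameters and many possible choices at each step, it behaves far more intelligently, and it becomes very hard for players to find a pattern in its play. This is one of the things that makes deep learning significant. Today we will use a CNN to build an intelligent Gomoku player.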

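Preparation

We use Python 3.6.5. The program was tested on Windows 10, Windows 7, Linux, and macOS, which also illustrates Python's portability across platforms and ease of migration.

The Python libraries used are: tkinter, to lay out the board and implement play; SGFfile, to read game records and load the training model; os, to read and write local files; and TensorFlow, to build and train the CNN model (the code below uses the TensorFlow 1.x API, such as tf.placeholder and tf.InteractiveSession).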

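Building the board

1. Initializing the board:

The attributes set during initialization have the following meanings: someoneWin: whether someone has won; humanChessed: whether the human player has moved; IsStart: whether the game has started; player: which side the player is on; playmethod: game mode, playing against the robot or against the AI; bla_start_pos: the center position where black plays its opening move; bla_chessed: the black stones already played; whi_chessed: the white stones already played; board: the board; window: the window; var: a variable marking the chosen player color; var1: a variable marking the choice between robot and AI; can: the canvas used to draw the board; net_board: the coordinates of the board's points; robot: the robot; sgf: the game-record handler; cnn: the CNN.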

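The code is as follows:

    # Methods of the game class. Assumes `from tkinter import *` plus the
    # project's own Robot, SGFflie and myCNN classes.
    def __init__(self):
        self.someoneWin = False        # has someone won?
        self.humanChessed = False      # has the human player moved?
        self.IsStart = False           # has the game started?
        self.player = 0                # which side the player is on
        self.playmethod = 0            # play against the robot or the AI
        self.bla_start_pos = [235, 235]  # black opens at the center point
        self.whi_chessed = []          # white stones played so far
        self.bla_chessed = []          # black stones played so far
        self.board = self.init_board()
        self.window = Tk()
        self.var = IntVar()            # selected player color
        self.var.set(0)
        self.var1 = IntVar()           # selected opponent: robot or AI
        self.var1.set(0)
        self.window.title("myGoBang")
        self.window.geometry("600x470+80+80")
        self.window.resizable(0, 0)
        self.can = Canvas(self.window, bg="#EEE8AC", width=470, height=470)
        self.draw_board()
        self.can.grid(row=0, column=0)
        self.net_board = self.get_net_board()
        self.robot = Robot(self.board)
        self.sgf = SGFflie()
        self.cnn = myCNN()
        self.cnn.restore_save()        # load the saved model

    def init_board(self):
        """Initialize the board: a 15x15 grid where -1 means empty."""
        list1 = [[-1] * 15 for i in range(15)]
        return list1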
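The list comprehension in `init_board` matters because it builds 15 separate row lists. The following standalone sketch contrasts it with the shorter `[[-1] * 15] * 15` form, in which every row is the same list object. The helper names `make_board` and `make_aliased_board` are illustrative only and do not come from the original project.

```python
EMPTY = -1  # same "empty" marker that init_board uses
SIZE = 15


def make_board(size=SIZE):
    """Build a size x size board whose rows are independent lists."""
    return [[EMPTY] * size for _ in range(size)]


def make_aliased_board(size=SIZE):
    """Anti-example: every row refers to the *same* list object."""
    return [[EMPTY] * size] * size


good = make_board()
good[7][7] = 1  # place one stone on the center point
stones_good = sum(cell != EMPTY for row in good for cell in row)

bad = make_aliased_board()
bad[7][7] = 1   # the single write appears in all 15 rows
stones_bad = sum(cell != EMPTY for row in bad for cell in row)

print(stones_good, stones_bad)
```

With independent rows, exactly one cell changes; with aliased rows, one move would fill a whole column of the board.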

2. Board layout:

The main job here is to draw the board and the stones. The code is as follows:

    def draw_board(self):
        """Draw the board."""
        # 15 horizontal lines, 30 px apart; the two border lines are thicker
        for row in range(15):
            if row == 0 or row == 14:
                self.can.create_line((25, 25 + row * 30), (445, 25 + row * 30), width=2)
            else:
                self.can.create_line((25, 25 + row * 30), (445, 25 + row * 30), width=1)
        # 15 vertical lines
        for col in range(15):
            if col == 0 or col == 14:
                self.can.create_line((25 + col * 30, 25), (25 + col * 30, 445), width=2)
            else:
                self.can.create_line((25 + col * 30, 25), (25 + col * 30, 445), width=1)
        # the five marker dots: four corner stars and the center
        self.can.create_oval(112, 112, 118, 118, fill="black")
        self.can.create_oval(352, 112, 358, 118, fill="black")
        self.can.create_oval(112, 352, 118, 358, fill="black")
        self.can.create_oval(232, 232, 238, 238, fill="black")
        self.can.create_oval(352, 352, 358, 358, fill="black")

    def draw_chessed(self):
        """Draw the stones that have already been played."""
        if len(self.whi_chessed) != 0:
            for tmp in self.whi_chessed:
                oval = pos_to_draw(*tmp[0:2])
                self.can.create_oval(oval, fill="white")
        if len(self.bla_chessed) != 0:
            for tmp in self.bla_chessed:
                oval = pos_to_draw(*tmp[0:2])
                self.can.create_oval(oval, fill="black")

    def draw_a_chess(self, x, y, player=None):
        """Draw a single stone on the board."""
        _x, _y = pos_in_qiju(x, y)   # pixel position -> board indices
        oval = pos_to_draw(x, y)     # pixel position -> oval to draw
        # board encoding: 1 = black, 0 = white, -1 = empty
        if player == 0:
            self.can.create_oval(oval, fill="black")
            self.bla_chessed.append([x, y, 0])
            self.board[_x][_y] = 1
        elif player == 1:
            self.can.create_oval(oval, fill="white")
            self.whi_chessed.append([x, y, 1])
            self.board[_x][_y] = 0
        else:
            print(AttributeError("Please choose a player"))
        return
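The drawing code relies on two helpers, `pos_in_qiju` and `pos_to_draw`, that the article does not show. Based only on the geometry in `draw_board` (first grid line at pixel 25, lines 30 pixels apart, 15 lines per side), conversions of this kind could look roughly like the sketch below. The function names and the stone radius are assumptions made for illustration, not the project's actual helpers.

```python
MARGIN = 25   # pixel offset of the first grid line (from draw_board)
CELL = 30     # pixel distance between adjacent grid lines (from draw_board)
SIZE = 15     # grid lines per side
RADIUS = 12   # assumed stone radius in pixels; not given in the article


def pixel_to_index(x, y):
    """Snap a canvas pixel position to the nearest grid indices (ix, iy)."""
    ix = round((x - MARGIN) / CELL)
    iy = round((y - MARGIN) / CELL)
    if not (0 <= ix < SIZE and 0 <= iy < SIZE):
        raise ValueError("position is outside the board")
    return ix, iy


def index_to_pixel(ix, iy):
    """Return the pixel position of grid intersection (ix, iy)."""
    return MARGIN + ix * CELL, MARGIN + iy * CELL


def stone_bbox(x, y, radius=RADIUS):
    """Bounding box (x0, y0, x1, y1) of a stone centered on pixel (x, y)."""
    return x - radius, y - radius, x + radius, y + radius


center = pixel_to_index(235, 235)   # the black opening position
near_edge = pixel_to_index(240, 29)  # a slightly off-grid click
print(center, near_edge, stone_bbox(*index_to_pixel(*center)))
```

A bounding box in this form can be passed to tkinter's `Canvas.create_oval`, which accepts either four coordinates or a single sequence of four.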

3. Deciding the winner:

A side wins when five of its stones form an unbroken line.

    def have_five(self, chessed):
        """Check whether five in a row exists."""
        if len(chessed) == 0:
            return False
        for row in range(15):
            for col in range(15):
                x = 25 + row * 30
                y = 25 + col * 30
                # five in a line along increasing y
                if self.check_chessed((x, y), chessed) == True and \
                        self.check_chessed((x, y + 30), chessed) == True and \
                        self.check_chessed((x, y + 60), chessed) == True and \
                        self.check_chessed((x, y + 90), chessed) == True and \
                        self.check_chessed((x, y + 120), chessed) == True:
                    return True
                # five in a line along increasing x
                elif self.check_chessed((x, y), chessed) == True and \
                        self.check_chessed((x + 30, y), chessed) == True and \
                        self.check_chessed((x + 60, y), chessed) == True and \
                        self.check_chessed((x + 90, y), chessed) == True and \
                        self.check_chessed((x + 120, y), chessed) == True:
                    return True
                # diagonal: x and y both increasing
                elif self.check_chessed((x, y), chessed) == True and \
                        self.check_chessed((x + 30, y + 30), chessed) == True and \
                        self.check_chessed((x + 60, y + 60), chessed) == True and \
                        self.check_chessed((x + 90, y + 90), chessed) == True and \
                        self.check_chessed((x + 120, y + 120), chessed) == True:
                    return True
                # anti-diagonal: x increasing, y decreasing
                elif self.check_chessed((x, y), chessed) == True and \
                        self.check_chessed((x + 30, y - 30), chessed) == True and \
                        self.check_chessed((x + 60, y - 60), chessed) == True and \
                        self.check_chessed((x + 90, y - 90), chessed) == True and \
                        self.check_chessed((x + 120, y - 120), chessed) == True:
                    return True
                else:
                    pass
        return False

    def check_win(self):
        """Check whether someone has won."""
        if self.have_five(self.whi_chessed) == True:
            label = Label(self.window, text="White Win!", background='#FFF8DC', font=("宋体", 15, "bold"))
            label.place(relx=0, rely=0, x=480, y=40)
            return True
        elif self.have_five(self.bla_chessed) == True:
            label = Label(self.window, text="Black Win!", background='#FFF8DC', font=("宋体", 15, "bold"))
            label.place(relx=0, rely=0, x=480, y=40)
            return True
        else:
            return False
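The four branches of `have_five` differ only in the step taken between consecutive stones. As an illustration of the same idea in compact form, the sketch below checks the same four directions with a table of direction vectors. It works on grid indices rather than pixel coordinates and stores stones in a set; both choices, and the name `has_five`, are assumptions of this sketch rather than the project's code.

```python
# The same four directions that have_five scans, as (dx, dy) steps.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def has_five(stones, need=5):
    """Return True if `stones` contains `need` stones in an unbroken line.

    `stones` is any iterable of (ix, iy) grid indices for one color.
    """
    occupied = set(stones)
    for ix, iy in occupied:
        for dx, dy in DIRECTIONS:
            # Treat (ix, iy) as the first stone and walk `need` steps.
            if all((ix + k * dx, iy + k * dy) in occupied for k in range(need)):
                return True
    return False


row_of_five = [(3, c) for c in range(4, 9)]
row_of_four = [(3, c) for c in range(4, 8)]
anti_diagonal = [(2 + k, 10 - k) for k in range(5)]
broken_line = [(0, 0), (0, 1), (0, 2), (0, 4), (0, 5)]

print(has_five(row_of_five), has_five(row_of_four),
      has_five(anti_diagonal), has_five(broken_line))
```

Because every stone is tried as a starting point, four directions are enough: a line running the opposite way is found from its other end.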

The resulting UI looks like this:

Deep learning model

1. Initializing the neural network:

The network has two convolutional layers, each followed by 2x2 max pooling, then a fully connected layer with dropout, and finally a softmax output layer over the 225 points of the board. It is essentially a standard CNN. The basic parameters are in the code below:

    # Methods of the myCNN class. Assumes `import tensorflow as tf` (1.x API).
    def __init__(self):
        '''Initialize the neural network.'''
        self.sess = tf.InteractiveSession()
        # parameters
        self.W_conv1 = self.weight_varible([5, 5, 1, 32])
        self.b_conv1 = self.bias_variable([32])
        # conv layer-1
        self.x = tf.placeholder(tf.float32, [None, 225])   # flattened 15x15 board
        self.y = tf.placeholder(tf.float32, [None, 225])   # target move
        self.x_image = tf.reshape(self.x, [-1, 15, 15, 1])
        self.h_conv1 = tf.nn.relu(self.conv2d(self.x_image, self.W_conv1) + self.b_conv1)
        self.h_pool1 = self.max_pool_2x2(self.h_conv1)
        # conv layer-2
        self.W_conv2 = self.weight_varible([5, 5, 32, 64])
        self.b_conv2 = self.bias_variable([64])
        self.h_conv2 = tf.nn.relu(self.conv2d(self.h_pool1, self.W_conv2) + self.b_conv2)
        self.h_pool2 = self.max_pool_2x2(self.h_conv2)
        # full connection
        self.W_fc1 = self.weight_varible([4 * 4 * 64, 1024])
        self.b_fc1 = self.bias_variable([1024])
        self.h_pool2_flat = tf.reshape(self.h_pool2, [-1, 4 * 4 * 64])
        self.h_fc1 = tf.nn.relu(tf.matmul(self.h_pool2_flat, self.W_fc1) + self.b_fc1)
        # dropout
        self.keep_prob = tf.placeholder(tf.float32)
        self.h_fc1_drop = tf.nn.dropout(self.h_fc1, self.keep_prob)
        # output layer: softmax
        self.W_fc2 = self.weight_varible([1024, 225])
        self.b_fc2 = self.bias_variable([225])
        self.y_conv = tf.nn.softmax(tf.matmul(self.h_fc1_drop, self.W_fc2) + self.b_fc2)
        # model training
        self.cross_entropy = -tf.reduce_sum(self.y * tf.log(self.y_conv))
        self.train_step = tf.train.AdamOptimizer(1e-3).minimize(self.cross_entropy)
        self.correct_prediction = tf.equal(tf.argmax(self.y_conv, 1), tf.argmax(self.y, 1))
        self.accuracy = tf.reduce_mean(tf.cast(self.correct_prediction, tf.float32))
        self.saver = tf.train.Saver()
        init = tf.global_variables_initializer()  # initialize variables that do not exist yet
        self.sess.run(init)

    def weight_varible(self, shape):
        '''Weight variable.'''
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(self, shape):
        '''Bias variable.'''
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    def conv2d(self, x, W):
        '''Convolution with stride 1 and SAME padding.'''
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    def max_pool_2x2(self, x):
        '''2x2 max pooling with stride 2 and SAME padding.'''
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
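The `4 * 4 * 64` input size of the fully connected layer follows from the padding mode. With `'SAME'` padding, the output side length is the input side length divided by the stride, rounded up. The short sketch below walks a 15x15 board through the two convolution and pooling stages; the helper names are illustrative.

```python
import math


def same_padding_out(size, stride):
    """Output side length for 'SAME' padding: ceil(size / stride)."""
    return math.ceil(size / stride)


def flattened_size(board=15, stages=2, channels=64):
    """Side length and flattened size after `stages` conv + pool stages."""
    side = board
    for _ in range(stages):
        side = same_padding_out(side, 1)  # convolution, stride 1: size unchanged
        side = same_padding_out(side, 2)  # 2x2 max pooling, stride 2: halved, rounded up
    return side, side * side * channels


side, flat = flattened_size()
print(side, flat)  # 15 -> 8 -> 4, so 4 * 4 * 64 values feed the dense layer
```

So the board shrinks from 15 to 8 after the first pooling layer and from 8 to 4 after the second, which is where the 4 in `W_fc1` comes from.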

2. Saving and loading the model:

    def restore_save(self, method=1):
        '''Save or load the model: method=1 loads, method=0 saves.'''
        if method == 1:
            self.saver.restore(self.sess, 'save/model.ckpt')
            #print("model loaded")
        elif method == 0:
            saver = tf.train.Saver(write_version=tf.train.SaverDef.V2)
            saver.save(self.sess, 'save/model.ckpt')
            #print('model saved')
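`restore_save` reads and writes the checkpoint under a relative `save` directory, and the game's `__init__` calls it unconditionally, so a first run without a saved model would presumably fail at the restore step. One way to guard against that, sketched here with the `os` module only, is to build the path portably and check whether a checkpoint is present before restoring. The helper names are hypothetical, and the check assumes TensorFlow's V2 checkpoint format, which writes a `<prefix>.index` file next to the data files.

```python
import os
import tempfile


def checkpoint_prefix(directory="save", name="model.ckpt"):
    """Return a platform-independent checkpoint prefix, creating the directory."""
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, name)


def has_checkpoint(prefix):
    """True if a V2-format checkpoint index file exists for this prefix."""
    return os.path.exists(prefix + ".index")


# Demo in a throwaway directory so nothing is written next to the script.
with tempfile.TemporaryDirectory() as tmp:
    prefix = checkpoint_prefix(os.path.join(tmp, "save"))
    before = has_checkpoint(prefix)        # nothing has been saved yet
    open(prefix + ".index", "w").close()   # stand-in for a real saver.save()
    after = has_checkpoint(prefix)

print(before, after)
```

In the real class, the restore call could then be skipped when `has_checkpoint` returns False, leaving the freshly initialized variables in place.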

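3. The prediction and training functions:

    def predition(self, qiju):
        '''Prediction: return the pixel position of the most probable move.'''
        _qiju = self.createdataformqiju(qiju)
        pre = self.sess.run(tf.argmax(self.y_conv, 1), feed_dict={self.x: _qiju, self.keep_prob: 1.0})
        point = [0, 0]
        l = pre[0]  # index of the best move in the flattened 225-point board
        for i in range(15):
            if ((i + 1) * 15) > l:
                # i is l // 15 and l - i * 15 is l % 15; convert both to pixels
                point[0] = int(i * 30 + 25)
                point[1] = int((l - i * 15) * 30 + 25)
                break
        return point

    def train(self, qiju):
        '''Training: run 10 steps on one game record, then save the model.'''
        sgf = SGFflie()
        _x, _y = sgf.createTraindataFromqipu(qiju)
        for i in range(10):
            self.sess.run(self.train_step, feed_dict={
                self.x: _x,
                self.y: _y,
                self.keep_prob: 0.5  # the dropout placeholder must be fed
            })
        self.restore_save(method=0)

    def train1(self, x, y):
        '''Alternative training function: 100 steps on prepared data.'''
        for i in range(100):
            self.sess.run(self.train_step, feed_dict={
                self.x: x,
                self.y: y,
                self.keep_prob: 0.5
            })
        print('Finished one round of training')
        #self.restore_save(method=0)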
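Two details of `predition` can be illustrated without TensorFlow. First, its loop is equivalent to a single `divmod` on the flat index. Second, it takes the raw argmax over all 225 outputs, so it could in principle pick a point that is already occupied; the article does not show whether that case is handled elsewhere. The sketch below, using plain Python lists and hypothetical helper names, restricts the choice to empty points, assuming the 0.0 = empty encoding produced by `createdataformqiju`.

```python
def index_to_pixel(index, size=15, margin=25, cell=30):
    """Convert a flat board index to the pixel pair that predition returns."""
    i, j = divmod(index, size)
    return [i * cell + margin, j * cell + margin]


def best_legal_move(probs, flat_board, empty=0.0):
    """Index of the highest-probability empty point, or None if the board is full."""
    legal = [k for k, value in enumerate(flat_board) if value == empty]
    if not legal:
        return None
    return max(legal, key=lambda k: probs[k])


# Made-up probabilities for illustration; not real network output.
probs = [0.0] * 225
probs[112] = 0.6   # the center point scores highest...
probs[113] = 0.3
flat = [0.0] * 225
flat[112] = 1.0    # ...but it is already occupied

choice = best_legal_move(probs, flat)
print(choice, index_to_pixel(choice))
```

Masking in this way leaves the network untouched and only changes how its output is read.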

4. Generating the input data:

    def createdataformqiju(self, qiju):
        '''Flatten the board into a batch containing one 225-value input vector.'''
        data = []
        tmp = []
        for row in qiju:
            for point in row:
                if point == -1:      # empty
                    tmp.append(0.0)
                elif point == 0:     # white
                    tmp.append(2.0)
                elif point == 1:     # black
                    tmp.append(1.0)
        data.append(tmp)
        return data

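The difference between applying a CNN to a board and applying it to image recognition lies in the input. In image recognition, the network is fed the pixel values of the image itself. Here, the input is a custom encoding of the board taken from game records: every position on the board, such as the top-left corner, has a fixed place in the input, so loading a game record tells the computer where the black and white stones currently are. The network is trained on these positions together with the game outcomes, and the trained model then predicts the move most likely to lead to a win, which gives the effect of intelligent play.

Final result: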
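To make the input format concrete, here is a minimal sketch of turning one board position and the move that followed it into a training pair. The value mapping matches `createdataformqiju`; the one-hot label layout (index = ix * 15 + iy) is an assumption chosen to be consistent with how `predition` decodes the output, since the article does not show `createTraindataFromqipu`. The function names are illustrative.

```python
# Same value mapping as createdataformqiju: empty -> 0.0, white -> 2.0, black -> 1.0
ENCODING = {-1: 0.0, 0: 2.0, 1: 1.0}


def encode_board(board):
    """Flatten a 15x15 board into the 225-value input vector."""
    return [ENCODING[cell] for row in board for cell in row]


def one_hot_move(ix, iy, size=15):
    """Assumed label layout: 1.0 at index ix * size + iy, 0.0 elsewhere."""
    label = [0.0] * (size * size)
    label[ix * size + iy] = 1.0
    return label


board = [[-1] * 15 for _ in range(15)]
board[7][7] = 1                 # black has played the center point
x = encode_board(board)         # network input for this position
y = one_hot_move(7, 8)          # the reply actually played in the record

print(len(x), x[112], y.index(1.0))
```

A batch for the `self.x` and `self.y` placeholders would then be a list of such vectors, one pair per position in the game record.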