[译]TicTacToe-没有递归的MiniMax

By robot-v1.0

本文链接 https://www.kyfws.com/ai/tictactoe-minimax-without-recursion-zh/

01月01日, 0001 - 11 分钟阅读 - 5260 个词 阅读量 0

TicTacToe-没有递归的MiniMax（译文）

原文地址：https://www.codeproject.com/Articles/1271611/TicTacToe-MiniMax-Without-Recursion

原文作者：Christ Kennedy

译文由本站 robot-v1.0 翻译

前言

An exhaustive search of all game-states used to trace Mini-Max decisions from End game up through to Start game

详尽搜索所有游戏状态,以追踪从"最终"游戏到"开始"游戏的Mini-Max决策

下载TicTacToe_2.0-102.9 KB(Download TicTacToe_2.0 - 102.9 KB) (递归代码)((recursive code))
下载TicTacToe-2.1 MB(Download TicTacToe - 2.1 MB) (详尽的搜索代码)((exhaustive search code))

介绍(Introduction)

我想到了对所有可行的井字游戏状态进行详尽搜索以构建类似AI的游戏的想法,认为"这不是真正的AI,但这可能是一个有趣的项目".(I came up with the idea of making an exhaustive search of all viable tic-tac-toe game states to build an AI-like game, thinking “that’s not really A.I. but it might be a fun project”.)

背景(Background)

经过大约十天的时间来解决所有问题,包括解决哪些游戏状态是当前游戏状态的可行祖先并发送诸如(After about ten days of messing around debugging all the issues with resolving which game states are viable ancestors to the current game state and sending up information like) TurnsToExWin 和(and) TurnsToOhWin 和(and) Win_Lose 比率和其他多个垃圾箱跳入死胡同数据的尝试,我可以做我想做的事情-生成所有状态并发送信息,但是结果,总的来说,没有任何意义.然后我读了这篇文章,如何(ratio and a multiple of other dumpster diving dead-end data attempts, I got the thing to do what I wanted - generate all states and send up the information but the results were, to put it kindly, lacklustre. Then I read this article, How) 使用minimax算法使您的井字游戏无与伦比(to make your Tic Tac Toe game unbeatable by using the minimax algorithm) . Mini-Max的东西还活着.我睡着了,一口气得到了答案,然后松了一口气,然后在几个小时内实施了递归Mini-Max井字游戏,并决定知道自己在做什么.(. And the Mini-Max thing came alive. I slept on it relieved to have been given the answer on a platter, then implemented the Recursive Mini-Max tic-tac-toe game in a couple of hours and decided I knew what I was doing.)

将Mini-Max与我所写的内容结合使用并不是一件坏事,因为所有祖先/后代问题都已得到解决.后来出现了一些错误,实际上是从不同的角度重新思考了递归的结果.一种算法,该算法会生成所有可能的游戏状态,然后系统地逐步回退所有这些游戏状态,以便再现递归Mini-Max算法的结果,以便现在它可以直接转到文件上的正确记录,查找"移动"并选择一个合适的其智商设置.(Using Mini-Max with the core of what I had written already wasn’t so bad since all the ancestor/descendant problems were already resolved. A few mistakes later actually rethinking the results of recursion from a different perspective and here it is. An algorithm that generates all possible game states and steps back through them all systematically in order to reproduce the results of a recursive Mini-Max algorithm so that now it simply goes straight to the right record on file, looks up the Moves and picks one appropriate for its IQ setting.)

使用代码(Using the Code)

该项目相当友好,可供您姐妹的孩子们玩.开发人员可能对以下组件中可用的某些组件感兴趣(The project is fairly friendly enough for your sisters' kids to play it. Developers might be interested in some of the components available in the) CK_objects 如那个(such as the) HSlider 它沿路径描绘了一个图形滑块,该滑块允许用户设置整数值,此处为Computer IQ.(which depicts a graphical slider along a path that lets the user set an integer value, here the Computer IQ.)

递归Mini-Max如下所示:(Where the recursion Mini-Max looks like this:)

int evaluateMove(classGameState cGameState, enuSquareStates eStatePlayer)
{
  intRecursionCounter++;
  int intMyID = intRecursionCounter ;
  enuBoard_State eBoardState = cGameState.getState(ePlayerTurn);
  enuSquareStates eOpponent = getOpponent(eStatePlayer);
  switch (eBoardState)
  {
    case enuBoard_State.Draw:
      return 0;
    case enuBoard_State.Lose:
      return -10;
    case enuBoard_State.Win:
      return 10;
    default:
      int[,] intMoves = new int[3, 3];
      List<classPointListElement> lstMoves = new List<classPointListElement>();
      for (int intY = 0; intY < 3; intY++)
        for (int intX = 0; intX < 3; intX++)
        {
          if (cGameState.board[intX, intY] == enuSquareStates.blank)
          {
            Point pt = new Point(intX, intY);
            classGameState cNextGameState = new classGameState(cGameState.board, pt, eOpponent);
            enuBoard_State eNextBoardState = cNextGameState.getState(ePlayerTurn);

            intMoves[intX, intY] = evaluateMove(cNextGameState, eOpponent);
            classPointListElement cPLE = new classPointListElement(pt, intMoves[intX, intY]);
            lstMoves.Add(cPLE);
          }
        }
      IEnumerable<classPointListElement> query;
      if (ePlayerTurn == eStatePlayer)
        query= lstMoves.OrderBy(pointListElement => pointListElement.value);
      else
        query = lstMoves.OrderByDescending(pointListElement => pointListElement.value);
      lstMoves = query.ToList<classPointListElement>();
      return lstMoves[0].value;
  }
}

并在计算机每次转动时都调用自己(递归).详尽搜索(and calls itself (recurses) for every turn the computer makes. The Exhaustive search) PlayAi() 函数仅需根据其IQ设置来决定做出什么选择.(function needs only decide what choice to make based on its IQ setting.)

void PlayAI()
{
  if (cGame == null) cGame = classBinTree.instance.BinRec_Load(0).Record;

  List<classPointListElement> lstMoves = new List<classPointListElement>();
  classEnumeratedTypes.enuPlayer eOpponent = classTicTacToe_Record.NextPlayer(cGame.ePlayer);
  for (int intY = 0; intY < 3; intY++)
    for (int intX = 0; intX < 3; intX++)
      if (cGame.Descendants[intX, intY] > 0)
        lstMoves.Add(new classPointListElement
        (new Point(intX, intY), cGame.Moves[(int)cGame.ePlayer, intX, intY]));
  IEnumerable<classPointListElement> query = 
              lstMoves.OrderByDescending(move => move.value);
  lstMoves = query.ToList<classPointListElement>(); // lstMoves[0] 
                                   // has the best choice to play next

 // choose a path appropriate for this IQ
      int intRnd = 0;
  if (ghs_IQ[(int)cGame.ePlayer].Value > 90)
  {
    List<classPointListElement> lstTopMoves = new List<classPointListElement>();
    for (int iCounter = 0; iCounter < lstMoves.Count; iCounter++)
      if (lstMoves[iCounter].value == lstMoves[0].value)
        lstTopMoves.Add(lstMoves[iCounter]); intRnd = cRnd.getInt(0, lstTopMoves.Count - 1);
    cGame = classBinTree.instance.BinRec_Load
    (cGame.Descendants[lstTopMoves[intRnd].pt.X, lstTopMoves[intRnd].pt.Y]).Record;
  }
  else
  {
    int intLogBase = 6; // the higher the logbase,
                        // the less biased the result will be towards Minimum
    intRnd = cRnd.getBiasedInt(0, lstMoves.Count - 1, 
             ghs_IQ[(int)cGame.ePlayer].Value, intLogBase);
    cGame = classBinTree.instance.BinRec_Load
                 (cGame.Descendants[lstMoves[intRnd].pt.X, lstMoves[intRnd].pt.Y]).Record;
  }
}

它读取当前玩家的[3,3]动作数组,将它们全部复制到保留其笛卡尔指数及其播放值的列表中,然后以降序对列表进行重新排序,以便下一步的最佳选择是列表的顶部.完成此操作后,它使用有偏的随机数生成器随机选择允许的移动之一,该生成器倾向于使用较低的范围值来获得较高的Grade输入参数(此处为IQ),从而最终结果是计算机播放器的IQ越高,则可能会选择最佳路径.在这里,如果计算机的智商高于90,它将仅从同等有效的最佳路径中进行选择.(It reads the [3,3] array of moves for the current player, copies them all into a list retaining their cartesian indices along with their play-value, then reorder the list in Descending order so that the best choice for a next move is top of the list. After this is done, it randomly chooses one of the permissible moves using a biased random number generator which favours low range values for higher Grade input parameter (here IQ) so that the end result is the higher the IQ of the computer player, the more likely it will choose the best path. Here, if the computer has an IQ higher than 90, it will select from the equally valid best paths only.)

的(The) getBiasedInt() 是事后思考,并取得了不错的结果.它选择一个随机数(was an after-thought and gets decent results. It chooses a random number between) 0 和(and) MaxRange ,然后重置(, then resets the) MaxRange 到先前的随机值并循环n次,其中n是(to the previous random value and loops n times where n is a direct function of the) ComputerPlayer 的智商. IQ越大,它循环的次数就越多,数字越低的可能性就越大,IQ越高的计算机将选择更好的播放路径.它不是完美的,但是非常好,这不是本文的目的,因此我还是保留了它.(’s IQ. The greater the IQ, the more times it loops and the more likely a low number will result and the higher-IQ computer will choose the better paths to play. It’s not perfect but it’s pretty good and that’s not what this article is about, so I kept it anyway.)

兴趣点(Points of Interest)

上面列出的Mini-Max文章非常清楚且写得很好.我不会浪费一个星期来尝试获取正确的数据(The Mini-Max article listed above is very clear and well written. I wouldn’t have wasted a week trying to get the right data to my) PlayAI() 如果我之前已经阅读过此功能,则对Mini-Max的任何问题请参考.当算法像在盘子上一样交给您时,事情就容易做得多了!(function if I had read it earlier and I refer you to it for any questions about Mini-Max. Things are a lot easier to do when the algorithm is handed to you on a plate like that one is!)

这种非递归算法的作用是,一旦构建数据,就无需AI进行思考.(What this non-recursive algorithm does is it removes the need for the AI to think once the data is built.)

FindEndGames(FindEndGames)

要构建数据文件,算法要做的是首先对所有可能的游戏状态进行广度优先搜索.本质上,它列出了从开始位置开始的所有可能移动,并将其放置在队列中.然后,它从队列中挑选出第一个游戏状态,并仅通过重复刚刚做的事情来对其进行处理->它将所有可能的移动都从该游戏状态转移到同一队列中,然后从队列中选择下一个游戏状态并继续进行,直到没有更多的游戏状态可以找到.(To build the data-file, what the algorithm does is it first makes a breadth-first-search of all possible game states. Essentially, it lists all possible moves from a start position and places them in a Queue. Then, it picks the first one off the Queue and processes it by merely repeating what it just did -> it too places all possible moves from that game state into the same queue, then picks the next game-state off the queue and proceeds until there are no more game states to be found.)

为了将游戏状态保存在文件中,每个游戏状态的文件记录都是一个二叉树叶子,其中叶子的搜索键是其唯一的游戏状态ID.唯一的游戏状态ID是一个整数值,该整数使用以3为底的9’小数位’编号系统来描述游戏板,其中游戏板上的左上角(0,0)方块的最低位(3(In order to retrieve the game states once they’re in the file, each game state’s file record is a binary tree leaf where the leaf’s search key is its unique game-state ID. The unique Game-State ID is an integer value that describes the game board using a base-3 9 ‘decimal-place’ numbering system where the top-left (0,0) square on the board is the least significant (3)0(0)),则下一个大于(1,0)的是下一个最低有效(3)(), the next one over (1,0) is the next least significant (3)1个(1)),直到到达最右下角(2,2)(3() and so on until you get to the bottom-right (2,2) which is the most significant (3)8(8)).假设空白=0,Ex =1和Oh =2,我们将正方形的内容乘以3(). Letting a blank space = 0, Ex =1 and Oh =2 we multiply the contents of the squares by 3)ñ(n)并获得描述游戏状态的唯一ID,并且可以用作存储游戏状态的二叉树的搜索关键字.(and get a unique ID that describes the game state and can be used as a search key for the binary tree in which the game-states are stored.)

或者,如果以下情况完全绕开二叉树,则会加速游戏状态记录的检索:(Alternately, bypassing the binary-tree altogether would accelerate the game-state record retrieval if the) uniqueID 用作文件中的索引.但是,这样做会大大增加文件的大小,但是由于硬盘存储设备是如此丰富且便宜,因此如果您的游戏速度很快,这可能是一个更合适的解决方案.(is used as an index in the file. Doing so, however, would significantly increase the size of the file but since Harddrive storage devices are so abundant and cheap, that may be a more suitable solution if speed is your game.)

所有游戏状态都分为9个单独的文件,每个文件都包含在游戏期间编译的二叉树中记录的索引列表.(All game states are sorted into 9 separate files that each contain a list of indices to the records in the binary-tree compiled during the) FindEndGames 数据重建阶段.这些索引直接指向二叉树文件中的记录,从而使我们无需搜索(phase of the data rebuild. These indices point directly to the record in the binary-tree file and spares us the need to perform a search for the) UniqueID 正在寻找的记录.这些索引在整个重建阶段以及(of the record we’re looking for. These indices are used throughout the rebuild stage as well as the) playAI() 阶段,如上图所示,使用(stage such as this line here seen above which uses the) classBinTree 的(’s) load 函数使用以下内容的随机索引检索记录(function to retrieve a record using a random index of the) lstTopMoves[] .(.)

cGame = classBinTree.instance.BinRec_Load
(cGame.Descendants[lstTopMoves[intRnd].pt.X, lstTopMoves[intRnd].pt.Y]).Record;   }

TraceEndGames(TraceEndGames)

至此,所有游戏状态都已定义,并存储在Harddive上的二叉树文件中.每个人都指向自己的后代(可以从当前游戏状态实现的游戏状态).的(At this point, all the game states have been defined and are stored in the binary-tree file on the harddive. Each one points to its own descendants (game states that can be achieved from current game state). The) Descendants[3,3] 每个记录的数组都保存了每个可行后代的索引,但是这些记录还没有一个知道如何指向自己的祖先.(array of each record holds the indices of each of its viable descendants but none of these records know how to point to their own Ancestors yet.)

该算法会扫描在(The algorithm scans each list of game states that were sorted during the) FindEndGames 数据重建阶段,从最拥挤的游戏状态开始,要么(phase of the data rebuild, starting with the most crowded game-states that either result in) Draw 要么(or) ExWin 结束游戏((end games () Ex 最后一轮放置5(has the last and first turn placing 5) Ex 在董事会上(es on the board while) Oh 只有(only has) 4 ).对于每个记录,它确定它是哪种游戏状态.对于最终游戏,将为其中一个分配一个值(). For each record, it determines what kind of game state it is. For end-games, a value is assigned of either) -10 ,(,) 0 要么(or) +10 取决于游戏的结局和所涉及的玩家.都(depending on the game’s end and the player in question. Both) Ex 和(and) Oh 在每条记录中保留Mini-Max信息,从而允许AI以与通过递归搜索板上的每个可用方块得到的结果完全相同的方式选择路径.如果游戏状态不是最终游戏,则它需要以模仿递归搜索结果的方式确定要"返回"其祖先的值.它检查所有内容(hold Mini-Max information in each record that allows the AI to then choose a path in exactly the same way as it would if it had gotten the results from a recursive search of each available square on the board. If the game-state is not an end-game, then it needs to determine what value to ‘return’ up to its ancestors in a way that mimics the recursive search results. It checks the contents of all its own) Moves[,] 在上一个扫描文件级别(包含与当前扫描文件的游戏状态相距一个游戏正方形的所有游戏状态的文件的扫描的文件的扫描)中设置的数组元素,然后将该信息发送至其自己的祖先并存储在他们(array elements which were set during the previous scan-file level (the scan of the file holding all the game-states that were one game-square apart from the current scan file’s game states) then sends that information up to its own ancestors and stores it in their) Moves[] 数组.(array.)

将信息传递给祖先的棘手部分是,在井字游戏中,每个玩家都要转弯,因此对于非最终游戏状态,您可以删除任何对立玩家的棋子并生成可行的祖先.但是,最终游戏会导致对方玩家获胜(当前玩家在对手玩过游戏后继承最终游戏),唯一可行的祖先是持有三个获胜棋子的三个方格.该规则的例外情况是,当游戏结束时有两组对手,每局三个对手组成一行或一列.(The tricky part about passing the information up to the ancestors is that in Tic-Tac-Toe each player takes a turn so for non-end-game game-states you can remove any one of the opposing player’s pieces and generate a viable ancestor. However, end-games that result in a win for the opposing player (current player inherits the end game after their opponent has played) the only viable ancestors are the three squares that hold the three winning pieces. The exception to this rule is when the game ends with two sets of three of the opponents pieces making a row or column.)

public enum enuTypeWin { H1=0, H2, H3, V1, V2, V3, RS, BS, Multiple, _numTypeWin };

在这种情况下,此处称为"(In those cases, here referred to as ‘) multiple “,只有一个可行的祖先,而这是对手三个棋子的两条获胜线共有的一个正方形.(’, there is only one viable ancestor and that is the one square that is common to both winning lines of three of the opponent’s pieces.)

确定祖先后,将为每个祖先提供Mini-Max信息.(Once the ancestors have been determined, each one is provided with the Mini-Max information for the) Ex 和(and) Oh 分别.此信息存储在祖先记录的(respectively. This information is stored in the Ancestor record’s) Moves[,] 对应于导致当前正在处理游戏状态的正方形的数组.由于文件是从最拥挤的游戏状态到最不拥挤的游戏状态进行扫描的,因此Mini-Max信息会沿向上传递,直到到达二进制记录记录0(和根节点)的开始(空)游戏状态为止. -tree文件.(array corresponding to the square that leads to the game-state currently being processed. Since the files are scanned from the most crowded game-states to the least crowded in sequence, the Mini-Max information is passed along upwards until it reaches the starting (empty) game state at record index 0 (and root node) of the binary-tree file.)

历史(History)

历史？我不知道.我在Wikipedia上查找有关Mini-Max的一些历史记录,并认为我会找到一些巧妙的方法来打断这篇文章,但是并没有发生.(History? I dunno. I looked up wikipedia for some history on Mini-Max and thought I’d find something clever to punctuate this article, with but it ain’t happening.)

未来？现在好了.井字游戏每转会增加棋子,并且可以预测,因此从这个意义上说,这是一个相对容易完成的项目,所以我在想跳棋吗？不确定.最终,国际象棋会很好.(Future? Well, now. Tic-tac-toe adds pieces per turn and is fairly predictable so it was a relatively easy project to complete in that sense so I’m thinking maybe checkers? Not sure. Ultimately, Chess would be nice.)

许可

本文以及所有相关的源代码和文件均已获得The Code Project Open License (CPOL)的许可。

XML C# .NET Windows Dev VS2017 artificial-intelligence text 新闻翻译