[Translation] AI: Neural Network For Beginners (Part 2 of 3)
By robot-v1.0
Article link: https://www.kyfws.com/ai/ai-neural-network-for-beginners-part-of-3-zh/
Copyright notice: unless otherwise stated, all articles on this blog are released under the BY-NC-SA licence. Please credit the source when reposting.
Original article: https://www.codeproject.com/Articles/16508/AI-Neural-Network-for-beginners-Part-of-3
Original author: Sacha Barber
Translated by robot-v1.0 for this site
Preface
AI: An Introduction into Neural Networks (Multi-layer Networks / Back Propagation)
Introduction
This article is part 2 of a series of 3 articles that I am going to post. The proposed article content will be as follows:
- Part 1: an introduction into Perceptron networks (single layer neural networks).
- Part 2: this one, which is about multi-layer neural networks, and the back propagation training method used to solve a non-linear classification problem such as the logic of an XOR logic gate. This is something that a Perceptron can't do, which is explained further within this article.
- Part 3: how to use a genetic algorithm (GA) to train a multi-layer neural network to solve some logic problem.
Summary
This article will show how to use a multi-layer neural network to solve the XOR logic problem.
A Brief Recap (From Part 1 of 3)
Before we commence with the nitty gritty of this new article, which deals with multi-layer neural networks, let's just revisit a few key concepts. If you haven't read Part 1, perhaps you should start there.
Perceptron Configuration (Single Layer Network)
The inputs (x1, x2, x3 .. xm) and connection weights (w1, w2, w3 .. wm) shown below are typically real values, both positive (+) and negative (-).
The perceptron itself consists of the weights, the summation processor, an activation function, and an adjustable threshold processor (called the bias hereafter).
For convenience, the normal practice is to treat the bias as just another input, one whose value is fixed at 1.0 and whose weight is the bias. The following diagram illustrates the revised configuration.
The bias can be thought of as the propensity (a tendency towards a particular way of behaving) of the perceptron to fire irrespective of its inputs. The perceptron configuration network shown above fires if the weighted sum > 0, or, if you are into maths-type explanations:
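In symbols (reconstructed from the description above, with y as the perceptron's output and b as the bias weight):

$$ y = \begin{cases} 1 & \text{if } \sum_{i=1}^{m} w_i x_i + b > 0 \\ 0 & \text{otherwise} \end{cases} $$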
So that's the basic operation of a perceptron. But we now want to build more layers of these, so let's carry on to the new stuff.
So Now The New Stuff (More Layers)
From this point on, anything that is discussed relates directly to this article's code.
In the summary at the top, the problem we are trying to solve was how to use a multi-layer neural network to solve the XOR logic problem. So how is this done? Well, it's really an incremental build on what Part 1 already discussed, so let's march on.
What does the XOR logic problem look like? Well, it looks like the following truth table:

Input 1 | Input 2 | Expected Output
   0    |    0    |        0
   0    |    1    |        1
   1    |    0    |        1
   1    |    1    |        0
Remember that with a single layer (perceptron) we can't actually achieve the XOR functionality, as it is not linearly separable. But with a multi-layer network, this is achievable.
What Does The New Network Look Like
The new network that will solve the XOR problem will look similar to a single layer network. We are still dealing with inputs / weights / outputs. What is new is the addition of the hidden layer.
As already explained above, there is one input layer, one hidden layer, and one output layer.
It is by using the inputs and weights that we are able to work out the activation for a given node. This is easily achieved for the hidden layer, as it has direct links to the actual input layer.
The output layer, however, knows nothing about the input layer, as it is not directly connected to it. So to work out the activation for an output node, we need to make use of the outputs from the hidden layer nodes, which serve as the inputs to the output layer nodes.
This entire process can be thought of as a pass forward from one layer to the next.
This still works like it did with a single layer network; the activation for any given node is still worked out as follows:

$$ a = \sum_{i=1}^{m} w_i I_i $$

where wi is weight i, and Ii is input value i.
You see, it's the same old stuff: no demons, smoke, or magic here. It's stuff we've already covered.
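To make the pass forward concrete, here is a minimal sketch of the idea in C#. It is illustrative only: the class and method names (ForwardPassSketch, ComputeLayer) are hypothetical, not the article's actual NeuralNetwork class.

using System;

// Illustrative only: a hypothetical helper showing the pass-forward idea.
static class ForwardPassSketch
{
    // Computes one layer's activations: sigmoid(weighted sum of inputs + bias).
    // weights has one extra row for the bias, treated as an input fixed at 1.0.
    public static double[] ComputeLayer(double[] inputs, double[,] weights, int numNodes)
    {
        double[] outputs = new double[numNodes];
        for (int j = 0; j < numNodes; j++)
        {
            double sum = 0.0;
            for (int i = 0; i < inputs.Length; i++)
            {
                sum += weights[i, j] * inputs[i];      // weighted sum of the layer's inputs
            }
            sum += weights[inputs.Length, j];          // bias row: an input fixed at 1.0
            outputs[j] = 1.0 / (1.0 + Math.Exp(-sum)); // sigmoid activation
        }
        return outputs;
    }
}

// Usage: the hidden layer's outputs become the output layer's inputs:
//   double[] hidden  = ForwardPassSketch.ComputeLayer(inputs, inputToHiddenWeights, numberOfHidden);
//   double[] outputs = ForwardPassSketch.ComputeLayer(hidden, hiddenToOutputWeights, numberOfOutputs);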
So that's how the network looks and works. Now I guess you want to know how to go about training it.
Types Of Learning
There are essentially two types of learning that may be applied to a neural network: "Reinforcement" and "Supervised".
Reinforcement
In Reinforcement learning, during training, a set of inputs is presented to the neural network and the output is compared against the target; say the output is 0.75 when the target was expecting 1.0.
The error (1.0 - 0.75) is used for training ("wrong by 0.25").
What if there are two outputs? Then the total error is summed to give a single number (typically the sum of squared errors), e.g. "your total error on all outputs is 1.76".
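As a quick worked example (with made-up numbers): if the two outputs are 0.75 and 0.20 against targets of 1.0 and 0.0, the summed squared error is

$$ E = (1.0 - 0.75)^2 + (0.0 - 0.20)^2 = 0.0625 + 0.04 = 0.1025 $$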
Note that this just tells you how wrong you were, not in which direction you were wrong.
Using this method we may never get a result, or it could be a case of "hunt the needle": a blind search.
NOTE: Part 3 of this series will use a GA to train a neural network, which is Reinforcement learning. The GA simply does what a GA does, going through all the normal GA phases to select weights for the neural network. There is no back propagation of values; the neural network is simply judged good or bad. As one can imagine, this process takes a lot more steps to get to the same result.
Supervised
In Supervised learning, the neural network is given more information: not just "how wrong" it was, but "in what direction it was wrong". It is like "hunt the needle", but where you are told "North a bit", "West a bit".
So you get, and use, far more information in Supervised learning, and this is the normal form of neural network learning algorithm. Back Propagation (what this article uses) is Supervised learning.
Learning Algorithm
In brief, to train a multi-layer neural network, the following steps are carried out:
- Start off with random weights (and biases) in the neural network
- Try one or more members of the training set, and see how bad the output(s) are compared to what they should be (compared to the target output(s))
- Jiggle the weights a bit, aiming for an improvement in the outputs
- Now try with a new lot of the training set, or repeat again, jiggling the weights each time
- Keep repeating until you get quite accurate outputs
This is what this article's submission uses to solve the XOR problem. It is also called "Back Propagation" (normally called BP or BackProp).
Backprop allows you to use this error at the output to adjust the weights arriving at the output layer, but then also allows you to calculate the effective error one layer back, and use that to adjust the weights arriving there, and so on, back-propagating errors through any number of layers.
The trick is the use of a sigmoid as the non-linear transfer function (which was covered in Part 1). The sigmoid is used because it offers the ability to apply differentiation techniques.
Because it is nicely differentiable, it so happens that

$$ \frac{d}{dx}\,\sigma(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr) $$

which, in the context of this article's code, can be written as

delta_outputs[i] = outputs[i] * (1.0 - outputs[i]) * (targets[i] - outputs[i])
It is by using this calculation that the weight changes can be applied back through the network.
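For reference, here is a minimal C# sketch of a sigmoid and its derivative. The article ships a SigmoidActivationFunction class, but the signatures below are assumed for illustration, not taken from its source.

using System;

// A minimal sketch of a sigmoid activation function and its derivative.
static class SigmoidSketch
{
    // Standard logistic sigmoid: squashes any real value into (0, 1).
    public static double Sigmoid(double x)
    {
        return 1.0 / (1.0 + Math.Exp(-x));
    }

    // The derivative expressed in terms of the sigmoid's own output y = Sigmoid(x):
    // dy/dx = y * (1 - y). This is why the delta formula above multiplies
    // outputs[i] by (1.0 - outputs[i]).
    public static double Derivative(double y)
    {
        return y * (1.0 - y);
    }
}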
Things To Watch Out For
Valleys: using the rolling-ball metaphor, the error surface may well contain valleys with steep sides and a gently sloping floor. Gradient descent tends to waste time swooshing up and down each side of the valley (think of the ball!).
So what can we do about this? Well, we add a momentum term, which tends to cancel out the back-and-forth movements and emphasises any consistent direction. This will go down such valleys with gentle bottom-slopes much more successfully (faster).
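A common way of writing the momentum-augmented weight update (the momentum coefficient α and this exact formulation are standard notation, not taken from the article):

$$ \Delta w(t) = \eta \, \delta \, x + \alpha \, \Delta w(t-1) $$

where η is the learning rate, δ is the node's delta, x is the incoming activation, and α (typically around 0.9) scales the previous weight change, so that consistent directions accumulate while oscillations cancel out.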
Starting The Training
This is probably best demonstrated with a code snippet from the article's actual code:
/// <summary>
/// The main training. The expected target values are passed in to this
/// method as parameters, and the <see cref="NeuralNetwork">NeuralNetwork</see>
/// is then updated with small weight changes for this training iteration.
/// This method also applies momentum, to ensure that the NeuralNetwork is
/// nurtured into proceeding in the correct direction. We are trying to avoid valleys.
/// If you don't know what valleys means, read the article's associated text.
/// </summary>
/// <param name="target">A double[] array containing the target value(s)</param>
private void train_network(double[] target)
{
    // delta arrays for the hidden and output layers
    double[] delta_hidden = new double[nn.NumberOfHidden + 1];
    double[] delta_outputs = new double[nn.NumberOfOutputs];

    // Get the delta value for the output layer
    for (int i = 0; i < nn.NumberOfOutputs; i++)
    {
        delta_outputs[i] =
            nn.Outputs[i] * (1.0 - nn.Outputs[i]) * (target[i] - nn.Outputs[i]);
    }

    // Get the delta value for the hidden layer, by propagating the
    // output deltas back through the hidden-to-output weights
    for (int i = 0; i < nn.NumberOfHidden + 1; i++)
    {
        double error = 0.0;
        for (int j = 0; j < nn.NumberOfOutputs; j++)
        {
            error += nn.HiddenToOutputWeights[i, j] * delta_outputs[j];
        }
        delta_hidden[i] = nn.Hidden[i] * (1.0 - nn.Hidden[i]) * error;
    }

    // Now update the weights between hidden & output layer
    for (int i = 0; i < nn.NumberOfOutputs; i++)
    {
        for (int j = 0; j < nn.NumberOfHidden + 1; j++)
        {
            // weight change = learning rate * output delta * hidden activation
            nn.HiddenToOutputWeights[j, i] += nn.LearningRate * delta_outputs[i] * nn.Hidden[j];
        }
    }

    // Now update the weights between input & hidden layer
    for (int i = 0; i < nn.NumberOfHidden; i++)
    {
        for (int j = 0; j < nn.NumberOfInputs + 1; j++)
        {
            // weight change = learning rate * hidden delta * input value
            nn.InputToHiddenWeights[j, i] += nn.LearningRate * delta_hidden[i] * nn.Inputs[j];
        }
    }
}
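For context, here is a rough sketch of how a training loop inside the trainer might drive train_network over the four XOR patterns. The pass-forward call (nn.PerformPass) and the training-set layout are assumptions for illustration, not lifted from the demo project.

// Present each of the 4 XOR patterns once per epoch, training after each pass.
double[][] trainInputs  = { new double[] { 0, 0 }, new double[] { 0, 1 },
                            new double[] { 1, 0 }, new double[] { 1, 1 } };
double[][] trainTargets = { new double[] { 0 }, new double[] { 1 },
                            new double[] { 1 }, new double[] { 0 } };

for (int epoch = 0; epoch < 10000; epoch++)      // fixed epoch budget for this sketch
{
    for (int p = 0; p < trainInputs.Length; p++)
    {
        nn.PerformPass(trainInputs[p]);          // hypothetical: feed the inputs forward
        train_network(trainTargets[p]);          // back-propagate against the targets
    }
}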
So Finally The Code
Well, the code for this article looks like the following class diagram (it's Visual Studio 2005, C#, .NET v2.0):
The main classes that people should take the time to look at would be:
- NN_Trainer_XOR: trains a neural network to solve the XOR problem
- TrainerEventArgs: training event args, for use with a GUI
- NeuralNetwork: a configurable neural network
- NeuralNetworkEventArgs: training event args, for use with a GUI
- SigmoidActivationFunction: a static method to provide the sigmoid activation function
The rest are the GUI classes I constructed simply to show how it all fits together.
NOTE: the demo project contains all the code, so I won't list it all here.
Code Demos
The attached DEMO application has three main areas, which are described below:
LIVE RESULTS Tab
It can be seen that this has very nearly solved the XOR problem (you will probably never get it 100% accurate).
TRAINING RESULTS Tab
Viewing the training-phase targets/outputs together
Viewing the training-phase errors
TRAINED RESULTS Tab
Viewing the trained targets/outputs together
Viewing the trained errors
It is also possible to view the neural network's final configuration using the "View Neural Network Config" button. If people are interested in what weights the neural network ended up with, this is the place to look.
What Do You Think?
That's it. I would just like to ask: if you liked the article, please vote for it.
Points of Interest
I think AI is fairly interesting; that's why I am taking the time to publish these articles. So I hope someone else finds it interesting, and that it might help further someone's knowledge, as it has my own.
Anyone who wants to look further into AI-type stuff, and finds the content of this article a bit basic, should check out Andrew Krillov's articles at Andrew Krillov CP articles, as his are more advanced, and very good. In fact, anything Andrew seems to do is very good.
History
- v1.0 24/11/06
License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).