[Translation] Large pattern recognition system using multi neural networks
By robot-v1.0
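Article link: https://www.kyfws.com/ai/large-pattern-recognition-system-using-multi-neura-zh/
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting!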
Original article: https://www.codeproject.com/Articles/376798/Large-pattern-recognition-system-using-multi-neura
Original author: Vietdungiitb
Translated by this site's robot-v1.0
Preface
A tutorial on using multiple neural networks for large pattern recognition systems such as handwriting recognition
- Download drawing samples - 52 KB
- Download handwriting_recognition_using_multi_neural_networks.flv - 8.9 MB
- Download source - 1.3 MB
- Download demo - 146.6 KB
- Download lower_case_letter_v2.zip - 5.6 MB
- Download digit_v2.zip - 3.3 MB
- Download capital_letter_v2.zip - 5.6 MB
Introduction
Nowadays, artificial neural networks are widely applied in many fields of human life. However, creating an efficient network for a large classifier such as a handwriting recognition system is still a big challenge for scientists. In my last article, "Library for online handwriting recognition system using UNIPEN database", I presented an efficient library for handwriting recognition systems that makes it simple to create and change a neural network. The demo program showed good recognition results on the digit set (97%) and the alphabet sets (93%). In this article I continue by presenting a solution for large pattern classification in general and handwriting recognition in particular.
The recognition rate increases significantly when an additional spell checker module is used
Neural network for a recognition system
In the traditional model of pattern recognition, a hand-designed feature extractor gathers relevant information from the input and eliminates irrelevant variability. A trainable classifier (normally a standard, fully connected multi-layer neural network) then categorizes the resulting feature vectors into classes. However, this approach has some problems that can affect the recognition results. The convolutional neural network (CNN) addresses this shortcoming of the traditional approach and achieves the best performance on pattern recognition tasks.
A CNN is a special form of multi-layer neural network. Like other networks, CNNs are trained with the back-propagation algorithm; the difference lies in their architecture. A convolutional network combines three architectural ideas to ensure some degree of shift, scale, and distortion invariance: local receptive fields, shared weights (or weight replication), and spatial or temporal sub-sampling. CNNs are designed specifically to recognize patterns directly from digital images with a minimum of pre-processing. The architecture of CNNs is described comprehensively in the articles of Dr. Yann LeCun and Dr. Patrice Simard (see my previous articles).
Figure 1: The architecture of LeNet-5
Figure 2: An input image followed by a feature map performing a 5 × 5 convolution and a 2 × 2 sub-sampling map
The networks above achieve very high recognition rates on small pattern collections such as digits, capital letters, or lower-case letters. However, when we want to create a larger neural network that can recognize a bigger collection, for example digits plus English letters (62 characters), problems begin to appear. Finding an optimized and sufficiently large network becomes more difficult, and training the network on a large set of input patterns takes much longer. The network converges more slowly and, above all, the accuracy rate decreases significantly because there are more badly written characters, many similar and confusable characters, and so on. Furthermore, even if we can create a network good enough to recognize English characters accurately, it certainly cannot properly recognize a character outside its output set (a Russian or Chinese character, say) because it has no capacity for expansion. Therefore, creating a single network for a very large pattern classifier is very difficult and may be impossible.
The proposed solution is to replace the single big network with multiple smaller networks, each of which has a very high recognition rate on its own output set. Besides its official output set (digits, letters…), each network has an additional unknown output (unknown character). If the input pattern is not recognized as one of the network's official output characters, it is classified as an unknown character. The input pattern is then passed to the next network, and so on until the system can recognize it correctly.
Figure 3: Convolutional neural network with unknown output
Figure 4: Recognition system using multi neural networks
This solution overcomes most of the limits of the traditional model. The new system consists of several small networks that are simple to optimize for the best recognition results. Training these small networks takes less time than training one huge network. Above all, the new model is flexible and expandable: depending on the requirement we can load one or more networks, and we can add new networks to the system to recognize new patterns without changing or rebuilding the model. All these small networks can also be reused in other multi neural network systems.
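The hand-off between component networks can be sketched as a simple loop. This is a minimal illustration, not the library's code: IComponentNetwork, DelegateNetwork and NetworkCascade are hypothetical names, and the '?' marker is borrowed from the UnknownOuput value used later in the article.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one component network: it returns a character from
// its own output set, or the unknown marker when the pattern does not belong to it.
public interface IComponentNetwork
{
    char Recognize(double[] pattern);
}

// Hypothetical adapter so that any function can play the role of a network.
public sealed class DelegateNetwork : IComponentNetwork
{
    private readonly Func<double[], char> recognize;

    public DelegateNetwork(Func<double[], char> recognize)
    {
        this.recognize = recognize;
    }

    public char Recognize(double[] pattern) => recognize(pattern);
}

public static class NetworkCascade
{
    public const char Unknown = '?';

    // Tries each loaded network in order and returns the first answer
    // that is not the unknown marker.
    public static char Recognize(IEnumerable<IComponentNetwork> networks, double[] pattern)
    {
        foreach (IComponentNetwork network in networks)
        {
            char result = network.Recognize(pattern);
            if (result != Unknown)
            {
                return result;
            }
        }
        return Unknown; // no loaded network claimed the pattern
    }
}
```

Adding a network for a new character set then only means appending one more item to the list passed to NetworkCascade.Recognize; the existing networks stay untouched.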
Experiment
The demo program is built to show all stages of a recognition system: creating a component network, training a network, testing networks on the UNIPEN dataset, and testing networks on a mouse drawing control. It serves as a tutorial that can help anyone understand a recognition system. All functions are available in the program GUI, so you can create, train, and test your network at runtime without changing any code or restarting the program.
Figure 5: Handwriting recognition system interface
Creating new neural network
Figure 6: Creating new neural network interface
A new neural network can be created entirely in the GUI. Creating a network depends on the input pattern size, the number of layers, the data set…. On the output layer we can tick the "unknown output" checkbox to give the network an additional unknown output, or leave it unticked to create a normal network.
Of course, we can still create a network in code:
void CreateNetwork()
{
    network = new ConvolutionNetwork();
    network.Layers = new Layer[6];
    network.LayerCount = 6;

    // layer 0: 29 x 29 input layer
    InputLayer inputlayer = new InputLayer("00-Layer Input", new Size(29, 29));
    network.InputDesignedPatternSize = new Size(29, 29);
    inputlayer.Initialize();
    network.Layers[0] = inputlayer;

    // layers 1-2: combined convolution / sub-sampling layers
    ConvolutionLayer convlayer = new ConvolutionLayer("01-Layer ConvolutionalSubsampling", inputlayer, new Size(13, 13), 10, 5);
    convlayer.Initialize();
    network.Layers[1] = convlayer;
    convlayer = new ConvolutionLayer("02-Layer ConvolutionalSubsampling", convlayer, new Size(5, 5), 60, 5);
    convlayer.Initialize();
    network.Layers[2] = convlayer;

    // layers 3-4: fully connected layers
    FullConnectedLayer fulllayer = new FullConnectedLayer("03-Layer FullConnected", convlayer, 200);
    fulllayer.Initialize();
    network.Layers[3] = fulllayer;
    fulllayer = new FullConnectedLayer("04-Layer FullConnected", fulllayer, 100);
    fulllayer.Initialize();
    network.Layers[4] = fulllayer;

    // layer 5: output layer
    OutputLayer outputlayer = new OutputLayer("05-Layer Output", fulllayer, Letters3.Count, true);
    outputlayer.Initialize();
    network.Layers[5] = outputlayer;

    // target characters of this network and the marker used for the unknown output
    network.TagetOutputs = Letters3;
    network.UnknownOuput = '?';
}
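The layer sizes in CreateNetwork() (29, then 13, then 5) are consistent with each ConvolutionLayer applying a 5 × 5 kernel with a step of 2 and no padding, which merges convolution and sub-sampling into one layer. That step size is an assumption inferred from the sizes and from the "ConvolutionalSubsampling" layer names, not something the article states, and FeatureMapSize is a hypothetical helper rather than part of the library. A minimal sketch of the arithmetic:

```csharp
using System;

public static class FeatureMapSize
{
    // Side length of a feature map after an unpadded convolution:
    // floor((input - kernel) / stride) + 1.
    public static int After(int input, int kernel, int stride)
    {
        if (stride < 1 || kernel < 1 || kernel > input)
        {
            throw new ArgumentException("kernel must fit in the input and stride must be positive");
        }
        return (input - kernel) / stride + 1; // integer division floors for non-negative values
    }
}
```

Under that assumption, FeatureMapSize.After(29, 5, 2) gives 13 and FeatureMapSize.After(13, 5, 2) gives 5, matching the Size(13, 13) and Size(5, 5) arguments in the listing.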
Training a network
After a neural network has been created with the "Create network" function, it is trained using the UNIPEN database.
Figure 7: Training network interface
Depending on the network size we can choose training set 1a, 1b or 1c in the UNIPENdata folder. The statistics of the training process show useful information such as the epoch number, MSE, training time per epoch, success rate…
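As an illustration of two of these statistics, the sketch below computes a mean squared error and a success rate for one epoch. The EpochStatistics class and its method names are hypothetical, and the MSE here is averaged over every output value, which is only one common convention; the demo program may define it differently (for example with a factor of 1/2).

```csharp
using System;

public static class EpochStatistics
{
    // Mean of the squared differences between network outputs and targets,
    // taken over all output values of all patterns.
    public static double MeanSquaredError(double[][] outputs, double[][] targets)
    {
        double sum = 0.0;
        int count = 0;
        for (int p = 0; p < outputs.Length; p++)
        {
            for (int i = 0; i < outputs[p].Length; i++)
            {
                double diff = outputs[p][i] - targets[p][i];
                sum += diff * diff;
                count++;
            }
        }
        return count == 0 ? 0.0 : sum / count;
    }

    // Fraction of patterns whose recognized character equals the expected one.
    public static double SuccessRate(char[] recognized, char[] expected)
    {
        if (recognized.Length == 0)
        {
            return 0.0;
        }
        int correct = 0;
        for (int i = 0; i < recognized.Length; i++)
        {
            if (recognized[i] == expected[i])
            {
                correct++;
            }
        }
        return (double)correct / recognized.Length;
    }
}
```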
UNIPEN data browser and recognition testing
The UNIPEN data browser control in the demo program can show all the UNIPEN data files. We can also test a trained neural network on these files by loading trained network parameter files.
Figure 8: UNIPEN data browser and recognition interface
Mouse drawing test
Figure 9: Mouse drawing recognition interface
The mouse drawing control is based on the excellent article "DrawTools" by Alex Fr. I just changed some code to fit my requirements. The cursive text in the image is divided into lines, words, and isolated characters by the same algorithm, as follows:
private void btRecognition_Click(object sender, EventArgs e)
{
    // recognize all characters in the drawArea
    if (bitmap != null)
    {
        bitmap.Dispose();
        bitmap = null;
    }
    bitmap = new Bitmap(drawArea.Width, drawArea.Height);
    drawArea.DrawToBitmap(bitmap, new Rectangle(0, 0, bitmap.Width, bitmap.Height));
    drawBitmap = (Bitmap)bitmap.Clone();
    if (bitmap != null)
    {
        lbRecognizedText.Items.Clear();
        List<InputPattern> lineList = null;
        List<InputPattern> wordList = null;
        InputPattern parentPt = new InputPattern(bitmap, 255, new Rectangle(0, 0, bitmap.Width, bitmap.Height));
        // split the whole drawing into lines
        lineList = GetPatternsFromBitmap(parentPt, 500, 1, true, 10, 10);
        if (lineList.Count > 0)
        {
            if (characterList != null)
            {
                characterList.Clear();
                characterList = null;
            }
            characterList = new List<InputPattern>();
            foreach (var line in lineList)
            {
                String text = "";
                // split each line into words
                wordList = GetPatternsFromBitmap(line, 50, 10, false, 10, 10);
                if (wordList != null)
                {
                    if (wordList.Count > 0)
                    {
                        foreach (var word in wordList)
                        {
                            // split each word into isolated characters
                            List<InputPattern> charList = GetPatternsFromBitmap(word, 5, 5, false, 10, 10);
                            // check whether any character bitmaps were found
                            if (charList != null)
                            {
                                if (charList.Count > 0)
                                {
                                    panelNavigation.Visible = true;
                                    foreach (var c in charList)
                                    {
                                        characterList.Add(c);
                                        c.GetPatternBoundaries(5, 5, false, 10, 10);
                                        Char accChar = new Char();
                                        PatternRecognition(c.OriginalBmp, out accChar);
                                        if (accChar != '\0')
                                        {
                                            text = String.Format("{0}{1}", text, accChar.ToString());
                                            drawBitmap = c.DrawChildPatternBoundaries(drawBitmap);
                                        }
                                    }
                                }
                            }
                            // separate words with a space
                            text = String.Format("{0} ", text);
                        }
                    }
                }
                lbRecognizedText.Items.Add(text);
            }
        }
        pbPreview.Image = drawBitmap;
        lblNavigation.Text = characterList.Count.ToString();
        index = 0;
    }
}
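The article does not show GetPatternsFromBitmap, so the following is only a sketch of one common way to do this kind of splitting: mark which rows (or columns) of the bitmap contain ink, then cut wherever a run of blank positions is at least a given width. GapSegmenter and its parameters are hypothetical and are not the author's implementation.

```csharp
using System;
using System.Collections.Generic;

public static class GapSegmenter
{
    // hasInk[i] is true when row (or column) i contains at least one ink pixel.
    // Returns [Start, End) ranges of ink; ranges separated by fewer than
    // minGap blank positions are kept together as one segment.
    public static List<(int Start, int End)> Split(bool[] hasInk, int minGap)
    {
        var segments = new List<(int Start, int End)>();
        int start = -1;   // start of the current segment, -1 when outside one
        int lastInk = -1; // last ink position seen in the current segment
        for (int i = 0; i < hasInk.Length; i++)
        {
            if (hasInk[i])
            {
                if (start < 0)
                {
                    start = i;
                }
                lastInk = i;
            }
            else if (start >= 0 && i - lastInk >= minGap)
            {
                segments.Add((start, lastInk + 1)); // the gap is wide enough: close the segment
                start = -1;
            }
        }
        if (start >= 0)
        {
            segments.Add((start, lastInk + 1)); // close a segment that runs to the edge
        }
        return segments;
    }
}
```

Applied to the row profile of the whole drawing, such a routine separates lines; applied to the column profile of one line it separates words with a wide gap and characters with a narrow one. Touching cursive characters leave no blank gap, so a gap-based cut alone cannot separate them.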
Figure 10: Loading trained network parameters files
To activate the recognition function I simply load trained network parameter files. Depending on my recognition requirement I can load one, two, or all files. The recognition results are very good (higher than 90%) if I load only one network to recognize its own output characters. However, when I load multiple networks the system's accuracy rate becomes lower. The main reasons are that cursive text contains many confusable characters, the training sets are not large enough, and so on.
For a large pattern collection like handwritten characters, there are many similar characters that can confuse not only a machine but also a human in some cases, such as O, 0 and o, or 9, 4, g and q. These characters can make the networks misrecognize. The solution is therefore being upgraded with an additional spell checker/voting module at the output of the system, which significantly increases the recognition rate. The input pattern is recognized by all component networks. Their outputs (except unknown outputs) are then passed as inputs to the spell checker/voting module. The module relies on previously recognized characters, an internal dictionary, and other factors to decide which candidate is the most accurate recognized character.
Figure 11: The new recognition system using spell checker/voting module
The new recognition system using spell checker/voting module (internal dictionary)
The spell checker module makes the system recognize much better
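The spell checker/voting module has not been published, so the sketch below only illustrates the dictionary part of the idea: every character position carries the set of candidates proposed by the component networks, and a dictionary word is accepted when it agrees with one candidate at every position. DictionaryVoter is a hypothetical name, and the real module is described as using further factors (previously recognized characters and others) that are not modelled here.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryVoter
{
    // candidates[i] holds the characters proposed by the networks for position i
    // (unknown outputs already removed). Returns the first dictionary word that
    // is compatible with every position, or an empty string when none fits.
    public static string Resolve(IReadOnlyList<ISet<char>> candidates, IEnumerable<string> dictionary)
    {
        foreach (string word in dictionary)
        {
            if (word.Length != candidates.Count)
            {
                continue; // a word of another length cannot match
            }
            bool fits = true;
            for (int i = 0; i < word.Length; i++)
            {
                if (!candidates[i].Contains(word[i]))
                {
                    fits = false;
                    break;
                }
            }
            if (fits)
            {
                return word;
            }
        }
        return string.Empty;
    }
}
```

For example, if the networks propose {9, g, q} for the first stroke group and {0, o, O} for the second, a dictionary containing "go" resolves the ambiguity that no single network can settle on its own.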
Conclusion
The proposed recognition model solves most of the problems of a large recognition system: the capacity to recognize a large pattern collection, flexible design and deployment, expandable and reusable capacity, etc. Increasing the accuracy rate of the system also becomes easier, by increasing the recognition rate of the component networks, using the spell checker/voting module, and so on. The demo program also demonstrates the capacity of the library, which could be used in many other applications such as prediction applications, face recognition…
Future work and upgrade
Some features will be updated in the library:
- Convolution and sampling layer of LeNet model.
- Spell checker / voting module
- Character segmentation.
At the moment, the project takes too much of my free time. It will slow down or temporarily stop until I can rearrange everything and/or find a good new sponsorship. However, the votes and comments on the article will decide whether the project continues. I would really appreciate comments and suggestions on the article, especially on the model, the spell checker module, and the character segmentation algorithm…
History
Version 1.0: initial code
Version 1.1: the spell checker/voting module has been added to the system, and it increases the recognition rate significantly. This really surprised and pleased me. I will publish it when I complete the code rearrangement.
License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).
C# Windows GDI+ Dev News Translation