Handwritten Digits Reader UI
Original article: https://www.codeproject.com/Articles/1273125/Handwritten-Digits-Reader-UI
Original author: KristianEkman
Preface
An object-oriented C# neural network, trainer, and Windows Forms user interface for recognizing handwritten digits.
Introduction
This article continues an earlier article on the basics of machine learning. That first part, which is worth reading first, explains mathematically how a neural network learns, using a simple six-neuron net in Excel.
See Machine Learning in Excel [1].
The source code here demonstrates how to train and use a neural network to interpret handwritten digits. There won't be any math in this article; it is all about a C# implementation of basic machine learning.
Background
The Artificial Neural Network (ANN) is trained using the Mnist handwritten digits dataset [2]. This is a classic problem in data science, also known as the "Hello World" application of machine learning. A few demo applications on this subject are already posted on Code Project, but I thought my source could help someone. Complex problems can need more than one explanation, and I tried to make this one as simple as possible.
Using the Code
Download, unzip and open the solution in Visual Studio 2017.
The Solution
The solution contains five projects:
Project | Description | Framework |
---|---|---|
DeepLearningConsole | Entry point for the training console | .NET Core 2.2 |
DeepLearning | Main library | .NET Standard 2.0 |
Data | Parser for data. Currently only Mnist. | .NET Standard 2.0 |
MnistTestUi | User interface for manual testing | .NET Framework 4.7.1 |
Tests | A few unit tests | .NET Core 2.2 |
Why So Many Frameworks?
I found that .NET Core performs roughly 30% faster than .NET Framework, but I also wanted to use .NET Framework for the Windows Forms application. Unfortunately, you can't reference .NET Core libraries from a .NET Framework project; they are not compatible. To solve that, I used .NET Standard for the common components.
.NET Standard isn't a framework. It is a formal specification of .NET APIs. All .NET implementations are meant to be compatible with it: not just .NET Core and .NET Framework but also Xamarin, Mono, Unity and Windows Mobile. That makes it a good target framework for reusable components.
Parsing Mnist Data
These are the four files in the Mnist database:
- t10k-images-idx3-ubyte - Test images
- t10k-labels-idx1-ubyte - Labels for test images
- train-images-idx3-ubyte - Training images (60000)
- train-labels-idx1-ubyte - Labels for the 60000 training images
The images from Mnist had to be converted from byte arrays to arrays of doubles ranging from 0 to 1. The files also contain some header fields.
// Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq;
private static List<Sample> LoadMnistImages(string imgFileName, string idxFileName, int imgCount)
{
    var byte4 = new byte[4];
    using (var imageReader = File.OpenRead(imgFileName))
    using (var labelReader = File.OpenRead(idxFileName))
    {
        // Image file header: four big-endian 32-bit fields.
        imageReader.Read(byte4, 0, 4); // magic number
        imageReader.Read(byte4, 0, 4); // image count
        Array.Reverse(byte4);          // big-endian to little-endian
        //var imgCount = BitConverter.ToInt32(byte4, 0);
        imageReader.Read(byte4, 0, 4); // number of rows (28)
        imageReader.Read(byte4, 0, 4); // number of columns (28)

        // Label file header.
        labelReader.Read(byte4, 0, 4); // magic number
        labelReader.Read(byte4, 0, 4); // label count

        var samples = new Sample[imgCount];
        var targets = GetTargets();
        for (int i = 0; i < imgCount; i++)
        {
            // Create the sample first; an array of a class type starts out filled with nulls.
            samples[i] = new Sample { Data = new double[784] };
            var buffer = new byte[784];
            imageReader.Read(buffer, 0, 784);
            for (int b = 0; b < buffer.Length; b++)
                samples[i].Data[b] = buffer[b] / 256d; // scale 0..255 to [0, 1)
            samples[i].Label = labelReader.ReadByte();
            samples[i].Targets = targets[samples[i].Label];
        }
        return samples.ToList();
    }
}
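The header fields skipped above are 32-bit integers stored most significant byte first, which is why the code reverses the bytes before the commented-out BitConverter call. As a minimal sketch, a small helper could decode such a field directly. The names IdxHeader and ReadBigEndianInt32 are made up for this illustration and are not part of the article's source.

```csharp
using System;
using System.IO;

public static class IdxHeader
{
    // Reads one 32-bit big-endian integer from the stream (hypothetical helper).
    public static int ReadBigEndianInt32(Stream stream)
    {
        var bytes = new byte[4];
        var read = 0;
        // Stream.Read may return fewer bytes than requested, so loop until all four arrive.
        while (read < 4)
        {
            var n = stream.Read(bytes, read, 4 - read);
            if (n == 0)
                throw new EndOfStreamException("Header ended unexpectedly.");
            read += n;
        }
        // Most significant byte comes first.
        return (bytes[0] << 24) | (bytes[1] << 16) | (bytes[2] << 8) | bytes[3];
    }
}
```

With a helper like this, the image count could be read from the file instead of being passed in as imgCount; the bytes 00 00 EA 60, for example, decode to 60000.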
The parsing process produces two lists, one of training samples and one of test samples. A sample consists of the image pixel array and a target array of length 10 that encodes which digit the image shows. The digit zero is the array 1,0,0,0,0,0,0,0,0,0. The digit five is 0,0,0,0,0,1,0,0,0,0 (a one at index 5, the sixth position), and so on.
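The parser calls GetTargets() without showing its body. A sketch of what such a helper might look like follows, assuming the label d simply selects the array with a one at index d. The class name OneHot and the classes parameter are assumptions for illustration.

```csharp
using System;

public static class OneHot
{
    // Builds one target array per digit: row d has a 1 at index d and 0 elsewhere.
    public static double[][] GetTargets(int classes = 10)
    {
        var targets = new double[classes][];
        for (int d = 0; d < classes; d++)
        {
            targets[d] = new double[classes]; // all zeros by default
            targets[d][d] = 1;
        }
        return targets;
    }
}
```

Building the ten arrays once and sharing them between samples avoids allocating a new target array for each of the 60000 images.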
Instantiating and Training the Neural Network
To instantiate a new ANN, you provide its topology: the number of layers and the number of neurons in each layer.
The input layer must have 784 neurons for Mnist images (28x28 pixels). The output layer must have 10, one per digit.
The Trainer class trains a neural network on TrainData with the specified learn rate.
var neuralNetwork = new NeuralNetwork(rndSeed: 0, sizes: new[] { 784, 200, 10 });
neuralNetwork.LearnRate = 0.3;
var trainer = new Trainer(neuralNetwork, Mnist.Data);
trainer.Train();
Next, each training sample is fed to the network so that it can learn. I found that 200 neurons in the hidden layer made it possible to train the ANN to 98.5% accuracy, which seemed good enough. With 400 neurons, the accuracy maxed out at 98.8%, but training took twice as long.
Mnist Trainer
The trainer repeatedly trains the ANN, letting it see one training sample at a time. One pass over all 60000 training images is called an epoch.
After each epoch, the ANN is tested against the test samples and the result is logged to a CSV file. Whenever the result beats the best so far, the ANN is serialized and saved to a file. The training images are also shuffled before each epoch.
public void Train(int epochs = 100)
{
    var rnd = new Random(0); // fixed seed keeps the shuffle order reproducible
    var name = $"Sigmoid LR{NeuralNetwork.LearnRate} HL{NeuralNetwork.Layers[1].Count}";
    var csvFile = $"{name}.csv";
    var bestResult = 0d;
    for (int epoch = 1; epoch <= epochs; epoch++) // "<=" so that all requested epochs run
    {
        Shuffle(TrainData.TrainSamples, rnd);
        TrainEpoch();
        var result = Test(); // accuracy on the test samples
        Log($"Epoch {epoch} {result:P}");
        File.AppendAllText(csvFile, $"{epoch};{result};{NeuralNetwork.TotalError}\r\n");
        if (result > bestResult)
        {
            // Save only when this epoch beats the best result so far.
            NeuralNetwork.Save($"{name}.bin");
            Log($"Saved {name}.bin");
            bestResult = result;
        }
    }
}
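Train calls a Shuffle helper that the article does not show. One common way to write it is the Fisher-Yates shuffle. The sketch below is an assumption about what such a helper could look like, not the author's code; the class name and the generic IList<T> signature are chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ShuffleSketch
{
    // Fisher-Yates shuffle: walks the list backwards, swapping each element
    // with a randomly chosen element at or before its position.
    public static void Shuffle<T>(IList<T> items, Random rnd)
    {
        for (int i = items.Count - 1; i > 0; i--)
        {
            int j = rnd.Next(i + 1); // 0 <= j <= i
            T tmp = items[i];
            items[i] = items[j];
            items[j] = tmp;
        }
    }
}
```

Passing in the Random instance, as Train does, means a fixed seed gives the same sample order on every run, which makes training results easier to compare.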
The theory behind the next two sections is described in a previously posted article.
See Machine Learning in Excel [1].
Forward Pass
The forward pass calculates each neuron's value by summing the values of all neurons in the previous layer, each multiplied by its weight. The sum is then passed through an activation function. The result, or output, can then be read from the last layer, a.k.a. the output layer.
private void Compute(Sample sample, bool train)
{
    // Load the pixel values into the input layer.
    for (int i = 0; i < sample.Data.Length; i++)
        Layers[0][i].Value = sample.Data[i];

    for (int l = 0; l < Layers.Length - 1; l++)
    {
        // Add each neuron's weighted value to the neurons it connects to in the next layer.
        // Note: "+=" assumes the values of those neurons were reset to 0 beforehand.
        for (int n = 0; n < Layers[l].Count; n++)
        {
            var neuron = Layers[l][n];
            foreach (var weight in neuron.Weights)
                weight.ConnectedNeuron.Value += weight.Value * neuron.Value;
        }

        var neuronCount = Layers[l + 1].Count;
        if (l + 1 < Layers.Length - 1)
            neuronCount--; //skipping bias

        // Activate the summed inputs, scaled by the size of the previous layer.
        for (int n = 0; n < neuronCount; n++)
        {
            var neuron = Layers[l + 1][n];
            neuron.Value = LeakyReLU(neuron.Value / Layers[l].Count);
        }
    }
}
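The LeakyReLU function used above is not listed in the article. A plausible sketch follows, assuming the slope for non-positive inputs is 1/20, the same factor that appears as `1 / 20d` in the back propagation code. The class name Activation and the derivative helper are additions for illustration only.

```csharp
using System;

public static class Activation
{
    // Slope for non-positive inputs; 1/20 is assumed from the back propagation code.
    public const double NegativeSlope = 1 / 20d;

    // Passes positive values through unchanged and scales the rest down.
    public static double LeakyReLU(double x) => x > 0 ? x : x * NegativeSlope;

    // Derivative of LeakyReLU, as needed when computing deltas.
    public static double LeakyReLUDerivative(double x) => x > 0 ? 1 : NegativeSlope;
}
```

Unlike a plain ReLU, the small negative slope keeps a non-zero gradient for neurons whose summed input is negative, so they can still be adjusted during training.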
Back Propagation
This algorithm adjusts all the weights between neurons. It is what makes the network learn and gradually improve its performance.
private void ComputeNextWeights(double[] targets)
{
    var output = OutputLayer;
    for (int t = 0; t < output.Count; t++)
        output[t].Target = targets[t];

    //Output Layer
    foreach (var neuron in output)
    {
        neuron.Error = Math.Pow(neuron.Target - neuron.Value, 2) / 2;
        neuron.Delta = (neuron.Value - neuron.Target) * (neuron.Value > 0 ? 1 : 1 / 20d);
    }
    this.TotalError = output.Sum(n => n.Error);

    // Weights from the hidden layer to the output layer.
    foreach (var neuron in Layers[1])
    {
        foreach (var weight in neuron.Weights)
            weight.Delta = neuron.Value * weight.ConnectedNeuron.Delta;
    }

    //Hidden Layer: weights from the input layer to the hidden layer.
    Parallel.ForEach(Layers[0], GetParallelOptions(), (neuron) => {
        foreach (var weight in neuron.Weights)
        {
            // Sum the error flowing back through the connected hidden neuron.
            // A local sum is equivalent to "+=" on a Delta that starts at 0,
            // and avoids carrying over the delta of the previous sample.
            var sum = 0d;
            foreach (var connectedWeight in weight.ConnectedNeuron.Weights)
                sum += connectedWeight.Value * connectedWeight.ConnectedNeuron.Delta;
            var cv = weight.ConnectedNeuron.Value;
            weight.Delta = sum * (cv > 0 ? 1 : 1 / 20d) * neuron.Value;
        }
    });

    //All deltas are done. Now calculate new weights.
    for (int l = 0; l < Layers.Length - 1; l++)
    {
        var layer = Layers[l];
        foreach (var neuron in layer)
            foreach (var weight in neuron.Weights)
                weight.Value -= (weight.Delta * this.LearnRate);
    }
}
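To make the output-layer rule concrete, the sketch below applies it to a single weight. The class and method names are made up for illustration; the formulas mirror the code above (difference between value and target, the 1 or 1/20 slope factor, and the learn rate step).

```csharp
using System;

public static class WeightUpdateSketch
{
    // One gradient descent step for a weight feeding an output neuron.
    // input:  value of the neuron the weight comes from
    // output: activated value of the output neuron
    // target: desired value of the output neuron
    public static double NextWeight(double weight, double input, double output,
                                    double target, double learnRate)
    {
        var slope = output > 0 ? 1 : 1 / 20d;        // same factor as in the code above
        var neuronDelta = (output - target) * slope; // delta of the output neuron
        var weightDelta = input * neuronDelta;       // delta of this weight
        return weight - weightDelta * learnRate;
    }
}
```

For example, with a weight of 0.5, an input of 1.0, an output of 0.8, a target of 1.0 and a learn rate of 0.3, the neuron delta is -0.2 and the weight moves up to 0.56, nudging the output towards the target.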
Mnist Test UI
The Test UI is used for testing your own handwriting. It has two panels. The small panel interprets a single drawn digit, and in the larger panel at the bottom you can draw a whole number.
Image Preprocessing
The Mnist database homepage states that:
"The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field."
Below is how you can do the same using bitmaps and Windows Forms graphics.
First, find the smallest square around the drawn digit.
public Rectangle DrawnSquare()
{
    var fromX = int.MaxValue;
    var toX = int.MinValue;
    var fromY = int.MaxValue;
    var toY = int.MinValue;
    var empty = true;
    // Scan every pixel and track the bounding box of the non-transparent ones.
    for (int y = 0; y < Bitmap.Height; y++)
    {
        for (int x = 0; x < Bitmap.Width; x++)
        {
            var pixel = Bitmap.GetPixel(x, y);
            if (pixel.A > 0)
            {
                empty = false;
                if (x < fromX)
                    fromX = x;
                if (x > toX)
                    toX = x;
                if (y < fromY)
                    fromY = y;
                if (y > toY)
                    toY = y;
            }
        }
    }
    if (empty)
        return Rectangle.Empty;
    var dx = toX - fromX;
    var dy = toY - fromY;
    var side = Math.Max(dx, dy);
    // Widen the shorter dimension symmetrically so the box becomes a square.
    if (dy > dx)
        fromX -= (side - dx) / 2;
    else
        fromY -= (side - dy) / 2;
    return new Rectangle(fromX, fromY, side, side);
}
Crop out the square and resize it into a new 20x20 bitmap.
public DirectBitmap CropToSize(Rectangle drawnRect, int width, int height)
{
    var bmp = new DirectBitmap(width, height);
    bmp.Bitmap.SetResolution(Bitmap.HorizontalResolution, Bitmap.VerticalResolution);
    using (var gfx = Graphics.FromImage(bmp.Bitmap))
    {
        // High-quality, anti-aliased scaling gives the grey levels that Mnist images have.
        gfx.CompositingQuality = CompositingQuality.HighQuality;
        gfx.InterpolationMode = InterpolationMode.HighQualityBicubic;
        gfx.PixelOffsetMode = PixelOffsetMode.HighQuality;
        gfx.SmoothingMode = SmoothingMode.AntiAlias;
        var rect = new Rectangle(0, 0, width, height);
        gfx.DrawImage(Bitmap, rect, drawnRect, GraphicsUnit.Pixel);
    }
    return bmp;
}
Finally, draw the 20x20 image inside a 28x28 bitmap, positioned so that its center of mass is at the center.
public Point GetMassCenterOffset()
{
    // Collect the coordinates of all non-transparent pixels.
    var path = new List<Vector2>();
    for (int y = 0; y < Height; y++)
    {
        for (int x = 0; x < Width; x++)
        {
            var c = GetPixel(x, y);
            if (c.A > 0)
                path.Add(new Vector2(x, y));
        }
    }
    // The centroid is the average position; the offset is its distance from the image center.
    var centroid = path.Aggregate(Vector2.Zero, (current, point) => current + point) / path.Count;
    return new Point((int)centroid.X - Width / 2, (int)centroid.Y - Height / 2);
}
protected DirectBitmap PadAndCenterImage(DirectBitmap bitmap)
{
    var drawnRect = bitmap.DrawnSquare();
    if (drawnRect == Rectangle.Empty)
        return null;
    var bmp2020 = bitmap.CropToSize(drawnRect, 20, 20);
    //Make image larger and center on center of mass
    var off = bmp2020.GetMassCenterOffset();
    var bmp2828 = new DirectBitmap(28, 28);
    using (var gfx2828 = Graphics.FromImage(bmp2828.Bitmap))
    {
        // 4 pixels of padding on each side centers a 20x20 image in a 28x28 one.
        gfx2828.DrawImage(bmp2020.Bitmap, 4 - off.X, 4 - off.Y);
    }
    bmp2020.Dispose();
    return bmp2828;
}
Then just extract the bytes from the image and query the ANN with them.
public byte[] ToByteArray()
{
    var bytes = new List<byte>();
    // Row by row, the same pixel order as the Mnist images.
    for (int y = 0; y < Bitmap.Height; y++)
    {
        for (int x = 0; x < Bitmap.Width; x++)
        {
            var color = Bitmap.GetPixel(x, y);
            var i = color.A; // the alpha channel serves as the pixel intensity
            bytes.Add(i);
        }
    }
    return bytes.ToArray();
}
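The article does not show the query itself. As a rough sketch, the bytes would need the same scaling as the training images (division by 256, as in LoadMnistImages), and the recognized digit would be the output neuron with the highest value. Both helper names below are hypothetical, and the sketch assumes the network's output is available as an array of ten doubles.

```csharp
using System;

public static class QuerySketch
{
    // Scales pixel bytes (0..255) to doubles, matching the scaling of the training data.
    public static double[] ToInputs(byte[] bytes)
    {
        var inputs = new double[bytes.Length];
        for (int i = 0; i < bytes.Length; i++)
            inputs[i] = bytes[i] / 256d;
        return inputs;
    }

    // Returns the index of the largest output value, i.e. the recognized digit.
    public static int MostLikelyDigit(double[] outputs)
    {
        var best = 0;
        for (int i = 1; i < outputs.Length; i++)
            if (outputs[i] > outputs[best])
                best = i;
        return best;
    }
}
```

Keeping the scaling identical to the one used during training matters, because the network has only ever seen inputs in that range.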
The UI also has a function to show Mnist images, in case you are curious what they look like. I won't go into every detail of the UI, though, because that would take us off topic.
Finally
I hope you liked the article and perhaps learned something you didn't already know. If you have any questions, comments or ideas, just drop them below.
That's all for now. Don't forget to vote. Cheers!
Links
- [1] Machine Learning in Excel - Kristian Ekman
- [2] Mnist handwritten digits dataset - Yann LeCun, Corinna Cortes, Christopher J.C. Burges
History
- 7th January, 2019 - Version 1.0
License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).