[Translation] Introducing ML.NET: A Machine Learning Library for .NET Developers
By robot-v1.0
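Article link: https://www.kyfws.com/ai/introducing-the-ml-net-a-machine-learning-library-zh/
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting!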
Original article: https://www.codeproject.com/Articles/1268051/Introducing-the-ML-NET-A-Machine-Learning-Library
Original author: Coding Notes
Translated by robot-v1.0 of this site
Preface
Introducing the ML.NET
Introduction
Most of the common Machine Learning (ML) libraries are written in Python, which makes them awkward for .NET developers to use. The ML.NET library acts as a bridge between ML libraries and .NET applications.
ML.NET is an open source library that can be used directly in .NET applications. In this article, I introduce how to use the ML.NET library in Visual Studio 2017 (I am using VS 2017 Community).
Background
A Binary Classification Problem
Assume that we have two groups of points (in a two-dimensional space), Red and Blue, and we want to predict whether a point belongs to the Red group or the Blue group based on its coordinates (x and y). Our training data can look like this:
3 -2 Red
-2 3 Red
-1 -4 Red
2 3 Red
3 4 Red
-1 9 Blue
2 14 Blue
1 17 Blue
3 12 Blue
0 8 Blue
We have ten points. The first two values of each row are the coordinates (x and y) of a point, and the third value is the group that point belongs to.
Because we have only two possible outputs, Blue or Red, our problem is a binary classification problem. There are many different ML techniques for solving a binary classification problem; in this article, I will use Logistic Regression because it is one of the simplest ML algorithms.
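For reference, the usual textbook form of a logistic regression model for this problem can be written as follows. This is the standard formulation, not something taken from the ML.NET source; the weight symbols $w_1$, $w_2$, the bias $b$, and the choice of Blue as the positive class are assumptions made for illustration.

```latex
\[
P(\text{Blue} \mid x, y) = \sigma(w_1 x + w_2 y + b),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
\]
```

Under this formulation, a point is assigned to Blue when the probability is at least 0.5, which is the same as $w_1 x + w_2 y + b \ge 0$, so the model separates the two groups with a straight line in the plane.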
Creating a .NET Application and Installing the ML.NET Library
For simplicity, we will create a C# Console Application (.NET Framework) and name it MyFirstMLDOTNET. In the Solution Explorer window, we also rename **Program.cs** to MyFirstMLDOTNET.cs.
We can install ML.NET by right-clicking the MyFirstMLDOTNET project and choosing Manage NuGet Packages.
In the NuGet window, we select the Browse tab and enter 'ML.NET' in the Search field. Finally, we select Microsoft.ML and click the Install button.
Click OK in the Preview Changes dialog and then click I Accept in the License Acceptance dialog. After a few seconds, Visual Studio will respond with a message in the Output window.
At this point, if we try to run our application, we may get an error message.
We can solve this error by right-clicking the MyFirstMLDOTNET project and selecting Properties. In the Properties window, we select the Build item on the left side and change Any CPU to x64 in the Platform target item.
We also need to select version 4.7 (or later) of the .NET Framework because earlier versions produce errors. We can choose the .NET Framework version by selecting the Application item on the left side and picking the version in the Target framework item. If version 4.7 (or later) is not installed, we can select Install other frameworks, which takes us to the Microsoft page where the .NET Framework packages can be downloaded and installed.
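The two property-page changes above are stored in the project file. As a rough sketch, assuming a classic (non-SDK-style) .NET Framework project, the relevant entries in the .csproj would look something like the fragment below. The surrounding elements are omitted, and Visual Studio normally writes the platform target separately for each build configuration, so the exact layout it generates may differ.

```xml
<!-- Fragment of MyFirstMLDOTNET.csproj (illustrative; other elements omitted) -->
<PropertyGroup>
  <!-- Application > Target framework: 4.7 or later -->
  <TargetFrameworkVersion>v4.7</TargetFrameworkVersion>
  <!-- Build > Platform target: x64 instead of Any CPU -->
  <PlatformTarget>x64</PlatformTarget>
</PropertyGroup>
```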
Now we can run our application again, and it succeeds.
Using the Code
The Training Data
Before creating the ML model, we must create the training data file by right-clicking the MyFirstMLDOTNET project and selecting Add > New Item, selecting the Text File type, and entering **myMLData.txt** in the Name field.
Click the Add button. In the **myMLData.txt** window, we enter (or copy from above) the training data:
3 -2 Red
-2 3 Red
-1 -4 Red
2 3 Red
3 4 Red
-1 9 Blue
2 14 Blue
1 17 Blue
3 12 Blue
0 8 Blue
Click Save and close the **myMLData.txt** window.
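To make the file format concrete, here is a minimal sketch in plain C# of how one space-separated line maps to two coordinates and a label. The TrainingDataSketch class and its ParseLine helper are hypothetical names introduced for illustration; they are not part of ML.NET, which does its own parsing through the text loader. The sketch uses value tuples, so it assumes C# 7 or later.

```csharp
using System;
using System.Globalization;

public static class TrainingDataSketch
{
    // Hypothetical helper (not part of ML.NET): splits one line such as
    // "3 -2 Red" into its x coordinate, y coordinate, and label.
    public static (float X, float Y, string Label) ParseLine(string line)
    {
        string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length != 3)
            throw new FormatException("Expected 'x y label' but got: " + line);

        // Parse with the invariant culture so "-2" is read the same way everywhere.
        float x = float.Parse(parts[0], CultureInfo.InvariantCulture);
        float y = float.Parse(parts[1], CultureInfo.InvariantCulture);
        return (x, y, parts[2]);
    }

    public static void Main()
    {
        var point = ParseLine("-1 9 Blue");
        Console.WriteLine(point.X + ", " + point.Y + " -> " + point.Label);
    }
}
```

The column order here (x, then y, then the label) is the same order that the ordinals 0, 1 and 2 refer to in the data class that follows.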
The Data Classes
After creating the training data file, we also need to create data classes. A class (named myData) defines the structure of the training data (two coordinates, x and y, and one label, Red or Blue):
public class myData
{
    [Column(ordinal: "0", name: "XCoord")]
    public float x;
    [Column(ordinal: "1", name: "YCoord")]
    public float y;
    [Column(ordinal: "2", name: "Label")]
    public string Label;
}
And a class (named myPrediction) holds the predicted information:
public class myPrediction
{
    [ColumnName("PredictedLabel")]
    public string PredictedLabels;
}
Creating and Training the ML Model
We can create the ML model and train it:
//creating a ML model
var pipeline = new LearningPipeline();
// loading the training data
string dataPath = "..\\..\\myMLData.txt";
pipeline.Add(new TextLoader(dataPath).CreateFrom<myData>(separator: ' '));
//convert string (Red or Blue) to number (0 or 1)
pipeline.Add(new Dictionarizer("Label"));
//combining the two predictor variables (XCoord and YCoord)
//into an aggregate (Features)
pipeline.Add(new ColumnConcatenator("Features", "XCoord", "YCoord"));
//using the Logistic Regression technique for a binary classification problem
pipeline.Add(new LogisticRegressionBinaryClassifier());
pipeline.Add(new PredictedLabelColumnOriginalValueConverter()
{ PredictedLabelColumn = "PredictedLabel" });
//training the ML model
Console.WriteLine("\nStarting training \n");
var model = pipeline.Train<myData, myPrediction>();
Evaluating the Model
We can evaluate our ML model as follows:
var testData = new TextLoader(dataPath).CreateFrom<myData>(separator: ' ');
var evaluator = new BinaryClassificationEvaluator();
var metrics = evaluator.Evaluate(model, testData);
double acc = metrics.Accuracy * 100;
Console.WriteLine("Model accuracy = " + acc.ToString("F2") + "%");
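The accuracy reported here is the share of points whose predicted label matches the true label. Note that the snippet evaluates on the same file the model was trained on, so the figure describes how well the model fits the training data rather than how it handles unseen points. The following is a minimal sketch of the accuracy calculation itself in plain C#; the AccuracySketch class and its label arrays are made up for illustration and do not use the ML.NET evaluator.

```csharp
using System;

public static class AccuracySketch
{
    // Fraction of positions where the predicted label equals the actual label.
    public static double Accuracy(string[] actual, string[] predicted)
    {
        if (actual.Length != predicted.Length || actual.Length == 0)
            throw new ArgumentException("Label arrays must be non-empty and of equal length.");

        int correct = 0;
        for (int i = 0; i < actual.Length; i++)
        {
            if (actual[i] == predicted[i])
                correct++;
        }
        return (double)correct / actual.Length;
    }

    public static void Main()
    {
        // Hypothetical labels: 4 of the 5 predictions match the actual labels.
        string[] actual = { "Red", "Red", "Blue", "Blue", "Blue" };
        string[] predicted = { "Red", "Blue", "Blue", "Blue", "Blue" };
        double acc = Accuracy(actual, predicted) * 100;
        Console.WriteLine("Model accuracy = " + acc.ToString("F2") + "%");
    }
}
```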
Testing the Model
Finally, we can test our model with a new point:
myData newPoint = new myData(){ x = 5f, y = -7f};
myPrediction prediction = model.Predict(newPoint);
string result = prediction.PredictedLabels;
Console.WriteLine("Prediction = " + result);
All of our code is in the **MyFirstMLDOTNET.cs** file:
using System;
using Microsoft.ML.Runtime.Api;
using System.Threading.Tasks;
using Microsoft.ML.Legacy;
using Microsoft.ML.Legacy.Data;
using Microsoft.ML.Legacy.Transforms;
using Microsoft.ML.Legacy.Trainers;
using Microsoft.ML.Legacy.Models;
namespace MyFirstMLDOTNET
{
    class MyFirstMLDOTNET
    {
        public class myData
        {
            [Column(ordinal: "0", name: "XCoord")]
            public float x;
            [Column(ordinal: "1", name: "YCoord")]
            public float y;
            [Column(ordinal: "2", name: "Label")]
            public string Label;
        }
        public class myPrediction
        {
            [ColumnName("PredictedLabel")]
            public string PredictedLabels;
        }
        static void Main(string[] args)
        {
            //creating a ML model
            var pipeline = new LearningPipeline();
            // loading the training data
            string dataPath = "..\\..\\myMLData.txt";
            pipeline.Add(new TextLoader(dataPath).CreateFrom<myData>(separator: ' '));
            //convert string (Red or Blue) to number (0 or 1)
            pipeline.Add(new Dictionarizer("Label"));
            //combining the two predictor variables (XCoord and YCoord)
            //into an aggregate (Features)
            pipeline.Add(new ColumnConcatenator("Features", "XCoord", "YCoord"));
            //using Logistic Regression technique for a binary classification problem
            pipeline.Add(new LogisticRegressionBinaryClassifier());
            //convert the predicted number back to the original string label
            pipeline.Add(new PredictedLabelColumnOriginalValueConverter()
                { PredictedLabelColumn = "PredictedLabel" });
            //training the ML model
            Console.WriteLine("\nStarting training \n");
            var model = pipeline.Train<myData, myPrediction>();
            //Evaluating the Model (on the same file used for training)
            var testData = new TextLoader(dataPath).CreateFrom<myData>(separator: ' ');
            var evaluator = new BinaryClassificationEvaluator();
            var metrics = evaluator.Evaluate(model, testData);
            double acc = metrics.Accuracy * 100;
            Console.WriteLine("Model accuracy = " + acc.ToString("F2") + "%");
            //Predicting a new point (5,-7)
            myData newPoint = new myData()
                { x = 5f, y = -7f };
            myPrediction prediction = model.Predict(newPoint);
            string result = prediction.PredictedLabels;
            Console.WriteLine("Prediction = " + result);
            Console.WriteLine("\nEnd ML.NET demo");
            Console.ReadLine();
        }
    }
}
When we run the application, it prints the model accuracy and the prediction for the new point.
Points of Interest
In this article, I have given only a basic introduction to ML.NET, the Machine Learning library for .NET developers. ML.NET is still under development, and you can learn more about this library through its tutorials.
History
- 24th November, 2018: Initial version
License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).