[Translation] Introduction to Machine Learning and ML.NET - Part 1
By robot-v1.0
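Article link: https://www.kyfws.com/ai/introduction-to-machine-learning-and-ml-net-part-1-zh/
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting!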
Original article: https://www.codeproject.com/Articles/5245488/Introduction-to-Machine-Learning-and-ML-NET-Part-1
Original author: syed shanu
Translated by this site's robot-v1.0
Preface
Introduction to Machine Learning and ML.NET (Machine Learning.NET)
Introduction
Machine learning is becoming more popular and is now used across a wide range of industries as well as in everyday life. In this article, we will learn how to develop machine learning applications with Microsoft ML.NET (Machine Learning .NET). A basic knowledge of machine learning, its types and its algorithms makes it easier to select the appropriate machine learning task and model for an application. In this chapter, we will start with:
- Introduction to Machine Learning
- Introduction to Machine Learning Types and Algorithms
- Why Machine Learning Is Getting More Popular
- Introduction to Microsoft ML.NET
- Features of ML.NET
Introduction to Machine Learning
Machine learning is a part of artificial intelligence (AI). It uses algorithms and statistical techniques so that a system learns from data instead of following explicitly programmed rules, and then returns predicted results to us. Both training and prediction require a lot of data. Two words come up constantly in machine learning:
- Training
- Data
To understand training and data, consider a real-life example. After a baby is born, parents, teachers and neighbors teach the child by showing objects. A parent shows the infant an apple and repeatedly says that this is an apple, that an apple is red, and that an apple has this shape. Here the apple is the data, and the child's brain is being trained: an apple is red, an apple looks like this, and apples come in different shapes and colors. Once the infant's brain is trained on the object, the infant recognizes an apple whenever one appears and immediately says that it is an apple.
Just as the infant is trained by being shown the object, we train a machine with a lot of data so that it can predict and return results for us. The more relevant data we provide, the better the machine is trained and the more accurate its predictions become. The image below shows an example of such data. Suppose we train the machine to recognize a number and display the result. The data here are images: the number 2 rendered in many different fonts and also drawn by hand. All of these 2s are given to the machine as training data so that it learns to predict the result.
You may now be wondering how the training itself is done. For this, machine learning offers tasks and algorithms. As noted above, we do not write explicit rules for the prediction; we use a machine learning algorithm to produce it. Next, we will look at a few machine learning types and algorithms.
Machine Learning Process
The image below explains the machine learning process: first we give the data to the system, then we select an appropriate machine learning model to train the system. Once training is complete, the machine is ready to predict results and present the output.
Introduction to Machine Learning Types and Algorithms
Types and algorithms are central to machine learning. To develop a machine learning application, we need to understand what the machine learning types are and which type and algorithm to select for training and prediction. This article focuses on two major types of machine learning:
- Supervised Learning
- Unsupervised Learning
We will look at supervised and unsupervised learning types and algorithms with examples.
The diagram above shows a few machine learning types and algorithms, with examples of the kinds of application each can be used for. In this article, we use supervised learning with regression and classification models, and unsupervised learning with a clustering model. Now let's look at each type and algorithm in detail.
Supervised Learning
In supervised learning, the computer is given inputs together with the desired output (the label). First, we will look at an example that uses a regression model to predict house rent in a city. For this, we provide the details of the houses in a particular city: City Name, Area Name, House Type, Floor details, Number of Rooms and House Rent.
The image above shows housing information for three types of house (single house, villa and apartment) along with the number of rooms. These are not actual prices for any particular city; they are sample house types and prices chosen to make the concepts easy to follow. From the image, we can read off the current house rent for a particular area of the city. The City Name, Area Name, House Type, Floor details, Number of Rooms and House Rent of all the houses in that city are given to the machine as input so that it can predict the rent for a user's search. When we search for a house, we enter the City Name, the Area Name, the number of rooms we need, the type of house we prefer and our budget. The budget is the key item in the search, and the output we are looking for is the house rent. For supervised learning with a regression model, the house rent is therefore the label. We train the machine with all the input columns plus the label. After training, the machine uses the regression algorithm to predict the house rent for us.
If a user searches for the rent of a 3-room apartment in the Annanagar area of Maura city, the machine, trained on all the data given to it, predicts the result and displays an approximate output of 15000. Machine learning needs a lot of data to do this well.
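To make the regression idea concrete, here is a minimal sketch in plain Python rather than ML.NET. It fits a straight line by ordinary least squares to a handful of invented rows relating the number of rooms to rent. The figures, the single-feature simplification and the names `fit_line` and `predict_rent` are assumptions for illustration; they are not the article's data set or an ML.NET API.

```python
# Hypothetical training rows: (number_of_rooms, monthly_rent).
# The values are invented for illustration only.
TRAINING_ROWS = [(1, 5000.0), (2, 10000.0), (4, 20000.0), (5, 25000.0)]


def fit_line(rows):
    """Fit rent = slope * rooms + intercept by ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    variance = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_rent(model, rooms):
    """Apply the fitted line to a new number of rooms."""
    slope, intercept = model
    return slope * rooms + intercept


model = fit_line(TRAINING_ROWS)      # training: rent is the label
predicted = predict_rent(model, 3)   # prediction for an unseen 3-room home
print(f"Predicted rent for 3 rooms: {predicted:.0f}")
```

Because the invented rows lie exactly on a line, the sketch predicts 15000 for three rooms. Real housing data is noisier and would use more features, such as city, area and house type.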
Supervised learning also uses a second kind of model, the classification model. Classification models are used for tasks such as mail spam detection and sentiment prediction.
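As a rough illustration of classification, the sketch below trains a tiny naive Bayes spam filter in plain Python. Naive Bayes is one possible classification algorithm, chosen here as an assumption; the article does not name one. The four training messages and the function names `train_spam_filter` and `classify` are invented for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled messages, invented for illustration.
TRAINING_MESSAGES = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train_spam_filter(messages):
    """Count words per label and messages per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Prior: how common the label is in the training data.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing so an unseen word does not zero the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


spam_model = train_spam_filter(TRAINING_MESSAGES)
print(classify(spam_model, "claim your free prize"))        # spam
print(classify(spam_model, "agenda for the team meeting"))  # ham
```

The labels play the same role as the house rent in the regression example: they are the desired output that the model learns to reproduce.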
Unsupervised Learning
In unsupervised learning, the computer is given inputs without the desired output. The main aim is to find structure in the inputs.
Unsupervised learning uses the clustering model. A clustering model can be used, for example, to find customer segments for our product sales. Suppose we have three products, "ABC", "XYZ" and "123", and we sell them in four major cities: Delhi, Mumbai, Kolkata and Chennai. We collect the sales history of the three products across the four cities and want to find the clusters in it. In this case, we can use unsupervised learning with a clustering model.
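The sketch below shows what clustering can look like in plain Python, using k-means as an assumed algorithm (the article does not name one). Each hypothetical customer is reduced to two numbers, units of "ABC" and units of "XYZ" bought; the data, the choice of two clusters and the fixed starting centroids are all invented for illustration.

```python
import math

# Hypothetical customers: (units of "ABC" bought, units of "XYZ" bought).
CUSTOMERS = [(1, 9), (2, 8), (1, 10), (9, 1), (8, 2), (10, 1)]


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest(point, centroids)].append(point)
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return centroids, [nearest(p, centroids) for p in points]


# Starting centroids are fixed so the run is repeatable;
# they are commonly chosen at random instead.
centroids, labels = k_means(CUSTOMERS, centroids=[(1, 9), (9, 1)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

No labels are supplied here. The two segments (customers who mostly buy "XYZ" and customers who mostly buy "ABC") emerge from the structure of the inputs alone.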
Why Machine Learning Is Getting More Popular
Machine learning is now widely used in everyday life, in many industries, in research and in science. It is used to automate systems, for example mail spam detection and fraud detection. An everyday example is the Facebook news feed: the wall shows posts related to the friends we visit frequently or recently, and Facebook uses machine learning to build that feed. Machine learning is also used in industries such as manufacturing, healthcare, financial services, travel and retail. It is used to build driverless (self-driving) cars. In a self-driving car, sensors identify objects approaching from all four sides, and the car's speed is controlled according to those objects. The car reaches its destination using navigation, which holds information such as traffic locations and current traffic signals. Self-driving cars use the reinforcement learning type of machine learning. Machine learning is also widely used in research and medicine, for example to predict viral failure in AIDS, to predict the progression of Parkinson's disease, in smart farming, in biotechnology for drug development, in medical therapy and in cosmological maps.
In the future, machine learning will be used widely in all fields and will become even more popular than it is today.
We have seen how and why machine learning is becoming more popular. Microsoft has also introduced a new framework called ML.NET, announced in May during Build 2018. ML.NET stands for Machine Learning .NET and is used to develop machine learning applications with .NET. We will see more details about ML.NET in the upcoming chapters.
Introduction to Microsoft ML.NET
Microsoft introduced ML.NET (Machine Learning .NET) during Build 2018 (May). The current version is ML.NET 1.4 preview, released in September 2019. ML.NET is a cross-platform, open source framework that makes it easy to develop our own machine learning applications or custom modules. This is great news for .NET developers, because we can write machine learning code in C# or F#. ML.NET can be developed and run on Windows, Linux and macOS. We can use it to develop custom machine learning models for console, desktop, web, mobile, gaming and IoT applications. ML.NET can also be extended to work with TensorFlow, Accord.NET and CNTK. The latest release supports loading and training on data from relational databases such as SQL Server, Oracle and MySQL, and it makes building custom models easier with AutoML.
Microsoft has currently released ML.NET 1.4 as a preview and keeps adding features to the framework.
Before getting started with ML.NET, let's understand the basic concepts we need to develop machine learning applications with it.
- Load Data: To predict well, we need to give the model a lot of data for training. In ML.NET, we can supply training and test data from text files (CSV/TSV), relational databases (SQL Server, Oracle, MySQL and others are now supported), binary files, IEnumerable collections, etc.
- Train: We need to select the right algorithm, depending on our needs, to train the model and predict the results.
- Evaluate: Select the machine learning type for model training and prediction. To work with segments, select a clustering model; to predict a stock price, select a regression model; to perform sentiment analysis, select a classification model.
- Predicted Results: Using the trained model together with the training and test data, the ML.NET application displays the final prediction. The trained model is saved in a binary format, which can also be integrated into our other .NET applications.
The picture above shows the process flow used to develop machine learning applications with ML.NET. Next, we will look at the ML.NET components in more detail.
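The load, train, evaluate and predict stages can be sketched end to end in a few lines. The example below uses plain Python with only the standard library to show the shape of the flow. ML.NET has its own C#/F# API for these steps, and none of the function names here are ML.NET calls. The CSV content, the hold-out split, the least-squares model and the use of `pickle` as the binary format are all assumptions made for illustration.

```python
import csv
import io
import pickle

# Hypothetical CSV content; in practice this would come from a file or database.
CSV_TEXT = """rooms,rent
1,5000
2,10000
3,15000
4,20000
5,25000
6,30000
"""


def load_data(text):
    """Load: parse CSV rows into (feature, label) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(row["rooms"]), float(row["rent"])) for row in reader]


def train_model(rows):
    """Train: fit rent = slope * rooms + intercept by least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in rows)
             / sum((x - mean_x) ** 2 for x, _ in rows))
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, rooms):
    """Predict: apply the trained model to a new input."""
    return model["slope"] * rooms + model["intercept"]


def evaluate(model, rows):
    """Evaluate: mean absolute error on rows held back from training."""
    return sum(abs(predict(model, x) - y) for x, y in rows) / len(rows)


rows = load_data(CSV_TEXT)
train_rows, test_rows = rows[:4], rows[4:]  # simple hold-out split
trained = train_model(train_rows)
mae = evaluate(trained, test_rows)

# Save the trained model as bytes and load it back, as another app would.
restored = pickle.loads(pickle.dumps(trained))
print(f"MAE on test rows: {mae:.2f}")
print(f"Predicted rent for 7 rooms: {predict(restored, 7):.0f}")
```

Keeping the test rows out of training is what makes the evaluation meaningful: the error is measured on data the model has not seen.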
Features of ML.NET
Now let's look at some of the uses and features of Microsoft ML.NET.
- All .NET developers can write machine learning code using ML.NET.
- You can use C# or F# to code with ML.NET.
- ML.NET is a cross-platform, open source framework.
- ML.NET can be developed and run on Windows, Linux and macOS.
- It is used extensively across Microsoft Windows, Bing and Azure, and is extensible to other frameworks such as TensorFlow, CNTK and Accord.NET.
- ML.NET supports developing machine learning apps for web, mobile, desktop, gaming and IoT.
- ML.NET saves the trained model as a binary file that can be integrated into any other .NET application.
- ML.NET is currently in a preview version; Microsoft frequently adds new features and also plans to add deep learning with TensorFlow and CNTK.
- ML.NET preview version 0.2 introduced the new clustering task.
- ML.NET preview version 0.5 added a TensorFlow model scoring transform.
- ML.NET preview version 0.6 added the ability to score pre-trained ONNX models.
- From version 0.7, ML.NET supports both x86 and x64. Versions before 0.7 supported only x64.
- ML.NET preview version 0.7 added experimental Python bindings for ML.NET called NimbusML.
- ML.NET preview version 0.7 enabled anomaly detection scenarios.
- ML.NET preview version 0.9 added a few API improvements.
- ML.NET 1.0 added automated machine learning (AutoML) and introduced new tools such as the ML.NET CLI and ML.NET Model Builder.
- ML.NET 1.1 was released with improved support for the in-memory image type in IDataView and added a new anomaly detection algorithm.
- ML.NET 1.2 was released with support for integrating ML.NET models into web or serverless apps through the Microsoft.Extensions.ML integration package.
- ML.NET preview version 1.4 added a database loader that makes it easy to train on data from a relational database.
Reference Links
- https://docs.microsoft.com/en-gb/dotnet/machine-learning/
- https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet
Points of Interest
ML.NET 1.4 preview is the current released version as of today (September 2019). Microsoft keeps updating ML.NET with more features, so keep checking for the latest updates and watch for the full release. In the next part, we will learn how to work with ML.NET for each model and algorithm using the latest release and its features. I hope this first part has explained what machine learning and ML.NET are; the next part looks in depth at getting started with ML.NET.
History
- 2019/09/14: Initial release
License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).