Data Scraping from Speech to Text

By robot-v1.0

Article link: https://www.kyfws.com/ai/data-scraping-from-speech-to-text-zh/

Original article: https://www.codeproject.com/Articles/1236865/Data-Scraping-from-Speech-to-Text

Original author: Eric M. H. Goh

Translated for this site by robot-v1.0

Preface

Speech-to-text recognition for data scraping and collection in data mining

Introduction

Data science is a growing field. According to the CRISP-DM model and other data mining models, we need to collect data before we can mine knowledge from it and conduct predictive analysis. Data collection can involve data scraping, which includes web scraping (HTML to text), image-to-text conversion, and video-to-text conversion. Once the data is in text format, we usually use text mining techniques to extract knowledge.

In this article, I introduce speech-to-text recognition. I developed Just Another Voice Transformer (JAVT) to convert videos into text files and consolidate them into a set of text data for text mining and natural language processing.

JAVT converts video into an audio file using ffmpeg, and then converts the audio into a text file using Microsoft SAPI or CMU Sphinx. I have included the source code for all of the video-to-audio and audio-to-text conversion. In this article, I explain only the speech recognition and speech synthesizer parts that use Microsoft SAPI, and the interface with ffmpeg.

Speech Recognition in C# using Microsoft SAPI

To use speech recognition in C#, add the following using directives at the top of the code (the project must also reference the System.Speech assembly):

using System.Speech.Recognition;
using System.Speech.AudioFormat;

Then create the dictation grammar and the speech recognition engine:

// Declared as fields of the form class so that the event handlers can reach them
private DictationGrammar dictation = new DictationGrammar();
private SpeechRecognitionEngine sr = new SpeechRecognitionEngine();

We then need to load the dictation grammar into the speech recognition engine:

sr.LoadGrammar(dictation);

If you are using a .wav file as input, point the speech recognition engine at the file (here textBox3.Text holds its path):

sr.SetInputToWaveFile(textBox3.Text);

If you are using an audio device such as a microphone as input, set the speech recognition engine's input to:

sr.SetInputToDefaultAudioDevice();

To perform asynchronous speech recognition:

sr.RecognizeAsync(RecognizeMode.Multiple);

Then attach these event handlers. Each handler is first removed and then added again, so it is never subscribed twice:

// Remove any earlier subscription
sr.SpeechRecognized -= new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
sr.EmulateRecognizeCompleted -=
    new EventHandler<EmulateRecognizeCompletedEventArgs>(EmulateRecognizeCompletedHandler);

// Subscribe exactly once
sr.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
sr.EmulateRecognizeCompleted +=
    new EventHandler<EmulateRecognizeCompletedEventArgs>(EmulateRecognizeCompletedHandler);

When speech is recognized, the SpeechRecognized() method is called. The following is the SpeechRecognized() method used in JAVT. The recognized text is available from e.Result.Text.

string finalResult;

private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e) {
    try {
        // Append each recognized phrase to the output text box
        finalResult = e.Result.Text;
        richTextBox3.Text += " " + finalResult;
    }
    catch (Exception ex) {
        MessageBox.Show(ex.Message);
    }
}

When recognition completes, the EmulateRecognizeCompletedHandler() method is called. Note that in System.Speech the EmulateRecognizeCompleted event is raised for emulated input (EmulateRecognizeAsync); for recognition started with RecognizeAsync, the corresponding event is RecognizeCompleted. The following is the EmulateRecognizeCompletedHandler() method in the program:

bool isCompleted = false;

private void EmulateRecognizeCompletedHandler(object sender, EmulateRecognizeCompletedEventArgs e) {
    try {
        isCompleted = true;

        // Release the grammar and stop the asynchronous recognition
        sr.UnloadGrammar(dictation);
        sr.RecognizeAsyncStop();

        richTextBox3.Text += "\n\nCompleted. \n";
        MessageBox.Show("Completed. ");
    }
    catch (Exception ex) {
        MessageBox.Show(ex.Message);
    }
}
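The recognition steps above can be pulled together into one place. What follows is a minimal console sketch and not the JAVT code itself: the class name WavTranscriber and its Transcribe method are hypothetical, it assumes Windows with the System.Speech assembly referenced and a speech recognizer installed, and it waits on the RecognizeCompleted event because the recognition is started with RecognizeAsync.

```csharp
using System;
using System.Speech.Recognition;
using System.Text;
using System.Threading;

// Hypothetical helper, not part of JAVT: transcribes one .wav file to text.
static class WavTranscriber
{
    public static string Transcribe(string wavPath)
    {
        var text = new StringBuilder();

        using (var done = new ManualResetEventSlim(false))
        using (var engine = new SpeechRecognitionEngine())
        {
            // Same setup as in the article: free-form dictation on a wave file
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(wavPath);

            // Collect every recognized phrase
            engine.SpeechRecognized += (sender, e) =>
                text.Append(e.Result.Text).Append(' ');

            // Signal the waiting thread once the input has been processed
            engine.RecognizeCompleted += (sender, e) =>
            {
                if (e.Error != null)
                    Console.Error.WriteLine(e.Error.Message);
                done.Set();
            };

            engine.RecognizeAsync(RecognizeMode.Multiple);
            done.Wait(); // block until RecognizeCompleted fires
        }

        return text.ToString().Trim();
    }

    static void Main(string[] args)
    {
        // Assumption: the first argument is the path to a .wav file
        Console.WriteLine(Transcribe(args[0]));
    }
}
```

Collecting the phrases in a StringBuilder instead of a text box keeps the sketch independent of Windows Forms, which may be convenient when many files are converted in a batch.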

Text to Speech

Now that speech recognition is in place, the next step is the reverse direction: text-to-speech synthesis.

First, add the System.Speech.Synthesis namespace and create the speech synthesizer:

using System.Speech.Synthesis;

SpeechSynthesizer speaker;
speaker = new SpeechSynthesizer();

Then set the Rate (from -10 to 10) and the Volume (from 0 to 100):

speaker.Rate = int.Parse(rateTextBox.Text);
speaker.Volume = int.Parse(volTextBox.Text);
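Because int.Parse throws on non-numeric input and the synthesizer rejects out-of-range values, it may be worth validating the text boxes first. The helper below is a small sketch of one way to do that; SpeechSettings and ParseClamped are hypothetical names that do not appear in JAVT.

```csharp
using System;

// Hypothetical helper, not part of JAVT.
static class SpeechSettings
{
    // Parses user text as an integer and clamps it into [min, max].
    // Returns 'fallback' when the text is not a valid integer.
    public static int ParseClamped(string text, int min, int max, int fallback)
    {
        int value;
        if (!int.TryParse(text, out value))
            return fallback;
        return Math.Max(min, Math.Min(max, value));
    }
}

// Possible use with the synthesizer from the article:
//   speaker.Rate   = SpeechSettings.ParseClamped(rateTextBox.Text, -10, 10, 0);
//   speaker.Volume = SpeechSettings.ParseClamped(volTextBox.Text, 0, 100, 100);
```

The fallback values 0 and 100 in the usage comment are the synthesizer's default rate and volume.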

To use a female voice:

speaker.SelectVoiceByHints(VoiceGender.Female);

Then run the speech synthesizer:

speaker.SpeakAsync(richTextBox2.Text);

Video to Audio Conversion

I use ffmpeg to convert video into audio. To interface with ffmpeg, first include the System.Diagnostics namespace:

using System.Diagnostics;
using System.IO; // needed for Directory, used when locating ffmpeg.exe

Then create a new process:

Process process = new Process();

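Build the ffmpeg arguments, where f is the path of the input video. The options select the input file (-i), the audio bitrate (-ab), the number of audio channels (-ac), and the sample rate in Hz (-ar); -vn drops the video stream:

// Quote both paths so that file names containing spaces reach ffmpeg intact
string arg = "-i \"" + f + "\" -ab 160k -ac 2 -ar 44100 -vn \"" + f + ".wav\"";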
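Building the argument string in a small function makes it easier to check in isolation. The following is a sketch under assumptions: FfmpegArgs, Quote, and ForWavExtraction are hypothetical names, and the simple quoting assumes the path itself contains no double quotes.

```csharp
using System;

// Hypothetical helper, not part of JAVT.
static class FfmpegArgs
{
    // Wraps a path in double quotes so that spaces survive argument parsing.
    static string Quote(string path)
    {
        return "\"" + path + "\"";
    }

    // Builds the arguments for extracting a stereo 44.1 kHz audio track as WAV.
    public static string ForWavExtraction(string videoPath)
    {
        if (string.IsNullOrWhiteSpace(videoPath))
            throw new ArgumentException("A video path is required.", "videoPath");

        // Same naming as in the article: the input name plus ".wav"
        string wavPath = videoPath + ".wav";
        return "-i " + Quote(videoPath) + " -ab 160k -ac 2 -ar 44100 -vn " + Quote(wavPath);
    }
}
```

The returned string can then be assigned to process.StartInfo.Arguments.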

Configure the process:

process.StartInfo.FileName = Directory.GetCurrentDirectory() + "\\ffmpeg\\bin\\ffmpeg.exe";
process.StartInfo.Arguments = arg;
process.StartInfo.ErrorDialog = true;
process.StartInfo.WindowStyle = ProcessWindowStyle.Normal;

Start the process and wait for it to finish:

process.Start();
process.WaitForExit();
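The steps above start ffmpeg but do not check whether the conversion succeeded. As one possible variation, the sketch below captures the exit code and the text the tool writes to its error stream. It is an assumption-laden alternative rather than the JAVT approach: ToolRunner and Run are hypothetical names, and it sets UseShellExecute to false, so the ErrorDialog setting used above does not apply.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of JAVT.
static class ToolRunner
{
    // Runs an external tool, waits for it to exit, and returns its exit code.
    // Everything the tool writes to standard error is returned in 'errors'.
    public static int Run(string exePath, string arguments, out string errors)
    {
        var info = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = arguments,
            UseShellExecute = false,      // required for stream redirection
            RedirectStandardError = true,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(info))
        {
            // Read the stream before waiting so a full buffer cannot block the tool
            errors = process.StandardError.ReadToEnd();
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A nonzero exit code would then indicate that the audio file should not be passed on to speech recognition.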

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).
