Using MS Agent in C# - Part II (Speech Recognition)

This article explains how to use MS Agent to write speech recognition applications using C# and .NET.


This part looks at the animated characters that can speak to us and at MS Agent's support for speech recognition.

Introduction to Microsoft Agent:

The Microsoft Agent API provides services that support the display and animation of animated characters. Microsoft Agent includes optional support for speech recognition so applications can respond to voice commands. Characters can respond using synthesized speech, recorded audio, or text in a cartoon word balloon.

Requirements:

To be able to use this technology, you must have:

  • The Microsoft Agent Core components.
  • The Microsoft Agent Characters Genie, Merlin, Robby, and Peedy.
  • The Microsoft Speech API 4.0a runtime.
  • The Microsoft Speech Recognition Engine.
  • The Lernout and Hauspie Text-to-Speech engines for at least US English.

All these components are available from

Overview of Speech Recognition:

Speech recognition and text-to-speech use engines, which are the programs that do the actual work of recognizing speech or playing text. Most speech-recognition engines convert incoming audio data to engine-specific phonemes, which are then translated into text that an application can use. (A phoneme is the smallest structural unit of sound that can be used to distinguish one utterance from another in a spoken language.)

Speech recognition is a bit more complex to categorize than text-to-speech.

Every speech recognition engine has three characteristics:

  • Continuous vs. discrete:

    In continuous speech recognition, users can speak to the system naturally. In discrete recognition, users must pause between words. Continuous recognition is clearly preferable to discrete recognition, but it needs more processing power.

  • Vocabulary size:

    Speech recognition can support a small or a large vocabulary. Small-vocabulary recognition lets users give simple commands to their computers. To dictate text, the system must have large-vocabulary recognition.

  • Speaker Dependence:

    Speaker-independent speech recognition works properly without any training, while speaker-dependent systems require each user to spend about 30 minutes training the system to his or her voice.

MS Agent uses "Command and Control" speech recognition, which is continuous, small-vocabulary, and speaker-independent. An application can therefore define several hundred different commands or phrases. If a user says a command that is not in the list, the speech-recognition system will either return "not recognized" or think it heard a similar-sounding command. Because users of Command and Control can say only specific phrases, the phrases must be visible on the screen, be so intuitive that all users will know what to say, or be learned by the users in advance.
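The lookup behaviour described above can be sketched in a few lines of plain C#. This is only an illustration of small-vocabulary matching, not the MS Agent API: the class name CommandVocabulary, the method Recognize, and the phrase table are all hypothetical, and a real engine works on audio rather than on text and may also return a similar-sounding command.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of "Command and Control" matching.
// Not part of MS Agent: it only mimics the fixed-list lookup described above.
public static class CommandVocabulary
{
    // Every phrase the user is allowed to say, mapped to a command name.
    // The comparer makes the lookup case-insensitive.
    private static readonly Dictionary<string, string> Phrases =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "who is your master", "Who is your Master?" },
            { "exit", "Exit" },
            { "close", "Exit" },
            { "quit", "Exit" }
        };

    // Returns the command name for a known phrase, or "not recognized".
    public static string Recognize(string spokenText)
    {
        if (spokenText == null)
            return "not recognized";

        string command;
        return Phrases.TryGetValue(spokenText.Trim(), out command)
            ? command
            : "not recognized";
    }
}
```

With this table, CommandVocabulary.Recognize("quit") yields the command name "Exit", while a phrase outside the list, such as "open the door", yields "not recognized". This is why the allowed phrases have to be shown to the user or be easy to guess.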

Commands Window:

If a compatible speech engine is installed, Microsoft Agent supplies a special window called the Commands Window that shows the commands that have been voice-enabled for speech recognition. The Commands Window serves as a visual prompt for what can be spoken as input.

Listening Tip:

If speech is enabled, a special tool tip window appears when the user presses the push-to-talk key to begin voice input. The Listening Tip displays contextual information associated with the current input state.

MS Agent in C# (Speech Recognition):

To use MS Agent in C#, we have to reference two DLL files, AxAgentObjects.dll and AgentObjects.dll, in our program.

Adding a command takes a single call:

Character.Commands.Add("Who is your Master?",(object)"Who is your Master?",(object)"(Your(Master| Administrator))",(object)true,(object)true);
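The third argument in the call above is the voice grammar, that is, the phrases the engine should listen for. As a rough sketch of how such a grammar can be read, the following plain C# expands it into the individual phrases it stands for. It assumes only the two constructs that appear in this article, parentheses for grouping and | for alternatives; the real MS Agent grammar syntax has more features, and the class VoiceGrammar and its methods are hypothetical helpers, not part of the Agent API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: expands a grammar such as "(exit | close | quit)"
// into the list of phrases it describes. Handles only "( )" and "|".
public static class VoiceGrammar
{
    public static List<string> Expand(string grammar)
    {
        int pos = 0;
        return ParseAlternatives(grammar, ref pos);
    }

    // alternatives := sequence ('|' sequence)*
    private static List<string> ParseAlternatives(string s, ref int pos)
    {
        var all = new List<string>(ParseSequence(s, ref pos));
        while (pos < s.Length && s[pos] == '|')
        {
            pos++; // skip '|'
            all.AddRange(ParseSequence(s, ref pos));
        }
        return all;
    }

    // sequence := (word | '(' alternatives ')')*
    private static List<string> ParseSequence(string s, ref int pos)
    {
        var phrases = new List<string> { "" };
        while (pos < s.Length && s[pos] != '|' && s[pos] != ')')
        {
            if (char.IsWhiteSpace(s[pos])) { pos++; continue; }

            List<string> next;
            if (s[pos] == '(')
            {
                pos++; // skip '('
                next = ParseAlternatives(s, ref pos);
                if (pos < s.Length && s[pos] == ')') pos++; // skip ')'
            }
            else
            {
                // Read one word, up to whitespace or a grammar symbol.
                int start = pos;
                while (pos < s.Length && "()|".IndexOf(s[pos]) < 0
                       && !char.IsWhiteSpace(s[pos]))
                    pos++;
                next = new List<string> { s.Substring(start, pos - start) };
            }
            phrases = Combine(phrases, next);
        }
        return phrases;
    }

    // Joins every phrase on the left with every phrase on the right.
    private static List<string> Combine(List<string> left, List<string> right)
    {
        var combined = new List<string>();
        foreach (string l in left)
            foreach (string r in right)
                combined.Add(l.Length == 0 ? r : (r.Length == 0 ? l : l + " " + r));
        return combined;
    }
}
```

Under these assumptions, VoiceGrammar.Expand("(Your(Master| Administrator))") gives the two phrases "Your Master" and "Your Administrator", which suggests what the character in this article is listening for.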

Similarly, the code below shows how the application responds to the user's voice input. For example, if we ask the character "Who is your Master?", it speaks the answer. We can create any number of commands in the same way.

IAgentCtlUserInput ui;
ui = (IAgentCtlUserInput)e.p_userInput;
if (ui.Name == "Who is your Master?")
{
    Character.Play("Pleased");
    Character.Speak((object)"My Master name is G.GNANA ARUN GANESH." +
        " You can contact him through his mail ggarung@rediffmail.com.",
        null);
}
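As the number of commands grows, a chain of if statements on ui.Name becomes hard to maintain. One possible alternative, sketched here in plain C#, is a lookup table from command name to handler. The class CommandDispatcher and its methods are hypothetical and independent of MS Agent; in the event handler, the name passed to Dispatch would be ui.Name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical dispatcher: maps a command name to the code that handles it.
// Not part of MS Agent; the name would come from ui.Name in the handler.
public class CommandDispatcher
{
    private readonly Dictionary<string, Action> handlers =
        new Dictionary<string, Action>();

    // Registers (or replaces) the handler for a command name.
    public void Register(string commandName, Action handler)
    {
        handlers[commandName] = handler;
    }

    // Runs the handler for commandName.
    // Returns false if no handler is registered under that name.
    public bool Dispatch(string commandName)
    {
        Action handler;
        if (commandName != null && handlers.TryGetValue(commandName, out handler))
        {
            handler();
            return true;
        }
        return false;
    }
}
```

A handler would be registered once, for example dispatcher.Register("Exit", SayGoodbye) with SayGoodbye being a method of the form, and the Command event handler would then reduce to a single call to dispatcher.Dispatch(ui.Name).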

Example:

using System;
using System.Drawing;
using System.WinForms;
using AgentObjects;
public class Speech : Form
{
    private System.ComponentModel.Container components;
    private System.WinForms.Button button2;
    private System.WinForms.Button button1;
    private System.WinForms.TextBox textBox1;
    private AxAgentObjects.AxAgent AxAgent;
    // The currently loaded character; null until one has been loaded.
    private IAgentCtlCharacterEx Character;
    public Speech()
    {
        InitializeComponent();
    }
    public static void Main(string[] args)
    {
        Application.Run(new Speech());
    }
    private void InitializeComponent()
    {
        this.components = new System.ComponentModel.Container();
        this.button1 = new System.WinForms.Button();
        this.button2 = new System.WinForms.Button();
        this.textBox1 = new System.WinForms.TextBox();
        this.AxAgent = new AxAgentObjects.AxAgent();
        AxAgent.BeginInit();
        button2.Click += new System.EventHandler(button2_Click);
        // "Load character" button
        button1.Location = new System.Drawing.Point(88, 208);
        button1.BackColor = (System.Drawing.Color)System.Drawing.Color.FromARGB(
            (byte)255, (byte)128, (byte)128);
        button1.Size = new System.Drawing.Size(152, 32);
        button1.TabIndex = 1;
        button1.Text = "Load character";
        // "SPEAK" button
        button2.Location = new System.Drawing.Point(120, 240);
        button2.BackColor = (System.Drawing.Color)System.Drawing.Color.FromARGB(
            (byte)255, (byte)128, (byte)128);
        button2.Size = new System.Drawing.Size(96, 24);
        button2.TabIndex = 2;
        button2.Text = "SPEAK";
        // Text box holding the text the character should speak
        textBox1.Location = new System.Drawing.Point(48, 8);
        textBox1.Text = " ";
        textBox1.Multiline = true;
        textBox1.TabIndex = 0;
        textBox1.Size = new System.Drawing.Size(248, 200);
        textBox1.BackColor = (System.Drawing.Color)System.Drawing.Color.FromARGB(
            (byte)255, (byte)128, (byte)128);
        // The form itself
        this.Text = "MSAGENT DEMO";
        this.AutoScaleBaseSize = new System.Drawing.Size(5, 13);
        this.WindowState = System.WinForms.FormWindowState.Maximized;
        this.BackColor = (System.Drawing.Color)System.Drawing.Color.FromARGB(
            (byte)55, (byte)192, (byte)192);
        this.ClientSize = new System.Drawing.Size(344, 301);
        // Raised when the user speaks or selects one of the character's commands
        AxAgent.Command +=
            new AxAgentObjects._AgentEvents_CommandEventHandler(AxAgent_Command);
        this.Controls.Add(button2);
        this.Controls.Add(button1);
        this.Controls.Add(textBox1);
        this.Controls.Add(AxAgent);
        button1.Click += new System.EventHandler(button1_Click);
        AxAgent.EndInit();
    }
    protected void button2_Click(object sender, System.EventArgs e)
    {
        // Nothing to do if no character has been loaded yet or there is no text.
        if (Character == null || textBox1.Text.Length == 0)
            return;
        Character.Speak(textBox1.Text, null);
    }
    protected void button1_Click(object sender, System.EventArgs e)
    {
        // Let the user pick a character file (*.acs).
        OpenFileDialog openFileDialog = new OpenFileDialog();
        openFileDialog.AddExtension = true;
        openFileDialog.Filter = "Microsoft Agent Characters (*.acs)|*.acs";
        openFileDialog.FilterIndex = 1;
        openFileDialog.RestoreDirectory = true;
        if (openFileDialog.ShowDialog() != DialogResult.OK)
            return;
        // Unload any previously loaded character; ignore the error if there was none.
        try { AxAgent.Characters.Unload("CharacterID"); }
        catch { }
        AxAgent.Characters.Load("CharacterID", (object)openFileDialog.FileName);
        Character = AxAgent.Characters["CharacterID"];
        Character.LanguageID = 0x409; // US English
        Character.Show(null);
        // Register the voice commands shown in the Commands Window.
        Character.Commands.Caption = "Sample Commands";
        Character.Commands.Add("Who is your Master?",
            (object)"Who is your Master?",
            (object)"(Your(Master| Administrator))",
            (object)true,
            (object)true);
        Character.Commands.Add("Exit",
            (object)"Exit",
            (object)"(exit | close | quit)",
            (object)true,
            (object)true);
        Character.Play("announce");
        Character.Speak("welcome you sir", null);
    }
    protected void AxAgent_Command(object sender,
        AxAgentObjects._AgentEvents_CommandEvent e)
    {
        // ui.Name is the name the command was registered under in Commands.Add.
        IAgentCtlUserInput ui;
        ui = (IAgentCtlUserInput)e.p_userInput;
        if (ui.Name == "Who is your Master?")
        {
            Character.Play("Pleased");
            Character.Speak((object)"My Master name is G.GNANA ARUN GANESH." +
                " You can contact him through his mail ggarung@rediffmail.com.",
                null);
        }
        if (ui.Name == "Exit")
        {
            Character.Speak((object)"Good bye", null);
            Character.Play("Wave");
            Character.Play("Hide");
        }
    }
}

Output: