Natural Language Processing (NLP) is an exciting field that allows computers to understand and process human language. NLP powers applications such as chatbots, sentiment analyzers, and text summarizers. While Python gets most of the attention in NLP (thanks to libraries like NLTK and spaCy), Microsoft’s ML.NET provides powerful tools for .NET developers to build NLP models in C#.
In this article, we’ll start from scratch and build a simple Sentiment Analysis model using C# and ML.NET. Along the way, we’ll discuss how this qualifies as an NLP model and explore its practical applications.
What You’ll Build
We’ll build a sentiment analysis application, a common NLP project, that can classify text (e.g., a customer review) as "positive" or "negative" based on its sentiment. For example:
- Input: "I love this product!"
- Predicted Sentiment: "Positive"
This article will show how to create, train, and use this NLP model in C# while explaining its place in Natural Language Processing.
Prerequisites
Before we begin, make sure you have the following:
- Visual Studio 2019 or later.
- .NET 5 or later.
- Familiarity with basic C# syntax.
Step 1. Install ML.NET
ML.NET is Microsoft’s machine learning framework for .NET developers. To add ML.NET to your project:
- Create a new .NET Console Application in Visual Studio.
- Add the following NuGet packages. You can do this via the Package Manager Console:
Install-Package Microsoft.ML
Install-Package Microsoft.ML.DataView
These packages provide the necessary classes and methods for machine learning workflows, including support for text processing and classification.
Step 2. Prepare Your Data
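For any machine learning model, you need data. Create a tab-separated text file called sentiments.tsv in your project directory and populate it with sample reviews like the ones below. Separate the two columns with a single tab character, not spaces, because the program loads the file with '\t' as the separator. If you run from Visual Studio, also set the file's Copy to Output Directory property to "Copy if newer" so the relative path resolves at run time: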
Sentiment Text
Positive The movie was fantastic overall.
Negative The service feels terrible at best.
Positive Her attitude is outstanding every time!
Negative This product seems bad in the future.
Positive My day was amazing and full of joy.
Negative His performance is poor at best.
Positive The movie appears great to me.
Negative The event was horrible overall.
Each row contains two columns:
- Sentiment: The label (Positive or Negative sentiment).
- Text: The corresponding text review.
This data will be used to train the model. Eight rows are enough to demonstrate the workflow, but far too few to produce a reliable classifier.
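Because tabs are easy to lose when copying text from a web page, one option is to generate the file from code. The following is a minimal sketch using only the .NET base class library; the SentimentFileWriter class and its members are hypothetical helper names, not part of ML.NET.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SentimentFileWriter
{
    // The eight sample reviews from this article.
    public static readonly (string Sentiment, string Text)[] SampleRows =
    {
        ("Positive", "The movie was fantastic overall."),
        ("Negative", "The service feels terrible at best."),
        ("Positive", "Her attitude is outstanding every time!"),
        ("Negative", "This product seems bad in the future."),
        ("Positive", "My day was amazing and full of joy."),
        ("Negative", "His performance is poor at best."),
        ("Positive", "The movie appears great to me."),
        ("Negative", "The event was horrible overall."),
    };

    // Builds the file content: a header row, then one "label<TAB>text" line per review.
    public static string BuildTsv(IEnumerable<(string Sentiment, string Text)> rows)
    {
        var lines = new List<string> { "Sentiment\tText" };
        lines.AddRange(rows.Select(row => $"{row.Sentiment}\t{row.Text}"));
        return string.Join(Environment.NewLine, lines) + Environment.NewLine;
    }

    // Writes the sample file to the given path (for example "sentiments.tsv").
    public static void Write(string path)
    {
        File.WriteAllText(path, BuildTsv(SampleRows));
    }
}
```

Calling SentimentFileWriter.Write("sentiments.tsv") once before training produces a file with a header row and a real tab between the two columns.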
Step 3. Create the Sentiment Analysis Application
Now, let’s write the code to create, train, and use the machine learning model.
- Define Data Structures
Define two classes to represent the input data (SentimentData) and the prediction output (SentimentPrediction):
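// LoadColumn and ColumnName come from the Microsoft.ML.Data namespace.
public class SentimentData
{
    [LoadColumn(0)]
    public string Sentiment { get; set; }

    [LoadColumn(1)]
    public string Text { get; set; }
}

public class SentimentPrediction
{
    // Bind to the pipeline's "PredictedLabel" output. Without this attribute the
    // property would receive the input "Sentiment" column, not the prediction.
    [ColumnName("PredictedLabel")]
    public string Sentiment { get; set; } // Predicted sentiment class

    public float[] Score { get; set; } // Probabilities for each class
}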
- Write the Main Program
Here’s the full program to train and test the sentiment analysis model:
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

class Program
{
    static void Main(string[] args)
    {
        // Initialize the ML.NET environment
        MLContext mlContext = new MLContext();

        // Load training data
        string dataPath = "sentiments.tsv";
        IDataView dataView = mlContext.Data.LoadFromTextFile<SentimentData>(
            path: dataPath,
            hasHeader: true,
            separatorChar: '\t'
        );

        // Define the data pipeline: featurize the text, map the string label to a key,
        // train the classifier, then map the predicted key back to its string value
        var dataPipeline = mlContext.Transforms.Text.FeaturizeText(
                outputColumnName: "Features",
                inputColumnName: nameof(SentimentData.Text))
            .Append(mlContext.Transforms.Conversion.MapValueToKey(
                outputColumnName: "Label",
                inputColumnName: nameof(SentimentData.Sentiment)))
            .Append(mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy())
            .Append(mlContext.Transforms.Conversion.MapKeyToValue(
                outputColumnName: "PredictedLabel"));

        // Train the model
        var trainingModel = dataPipeline.Fit(dataView);

        // Save the model
        string modelPath = "sentimentModel.zip";
        mlContext.Model.Save(trainingModel, dataView.Schema, modelPath);

        // Load the model and make a prediction
        var loadedModel = mlContext.Model.Load(modelPath, out _);
        var predictionEngine = mlContext.Model.CreatePredictionEngine<SentimentData, SentimentPrediction>(loadedModel);

        Console.WriteLine("Enter the text");
        var input = Console.ReadLine();
        var newSample = new SentimentData { Text = input };
        var prediction = predictionEngine.Predict(newSample);

        // Display prediction result
        Console.WriteLine($"Text: {newSample.Text}");

        // Score follows the key order assigned by MapValueToKey, which by default is the
        // order in which labels first appear in the training file. Positive comes first
        // in sentiments.tsv, so Score[0] is the probability of the Positive class.
        float positiveProbability = prediction.Score[0];
        Console.WriteLine(string.Join(",", prediction.Score));

        // The model only knows Positive and Negative. "Neutral" here means the
        // probability falls in the middle band, so the model is not confident either way.
        string sentiment;
        if (positiveProbability > 0.66)
        {
            sentiment = "Positive";
        }
        else if (positiveProbability < 0.33)
        {
            sentiment = "Negative";
        }
        else
        {
            sentiment = "Neutral";
        }

        Console.WriteLine(sentiment);
    }
}
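The Score array holds one probability per class. In a maximum-entropy (multinomial logistic regression) model, such probabilities come from a softmax over raw per-class scores, so with two classes the values sum to 1 and the middle band between the thresholds signals low confidence rather than a third learned class. The sketch below illustrates that arithmetic in plain C#. It is not ML.NET's internal code, and ScoreMath, Softmax, and ToBand are hypothetical names introduced for illustration.

```csharp
using System;
using System.Linq;

public static class ScoreMath
{
    // Softmax: exponentiate each raw score and divide by the total,
    // so the results are positive and sum to 1.
    public static float[] Softmax(float[] rawScores)
    {
        float max = rawScores.Max(); // subtracting the max avoids overflow in Exp
        double[] exps = rawScores.Select(s => Math.Exp(s - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => (float)(e / sum)).ToArray();
    }

    // Same banding as the main program, with the thresholds exposed as parameters.
    public static string ToBand(float positiveProbability, float lower = 0.33f, float upper = 0.66f)
    {
        if (positiveProbability > upper) return "Positive";
        if (positiveProbability < lower) return "Negative";
        return "Neutral";
    }
}
```

For example, two equal raw scores give ScoreMath.Softmax(new[] { 0f, 0f }) a result of 0.5 for each class, which ToBand maps to "Neutral".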
How Is This an NLP Model?
What Makes It NLP?
This program qualifies as an NLP model because it processes and analyzes natural language text—unstructured data generated by humans—and uses machine learning to derive insights. Here’s how:
- Text as Input: The raw input to this model is human language text (e.g., "This product is great!").
- Feature Extraction: The text is converted into a numeric representation suitable for machine learning using the FeaturizeText transformer. This step is an NLP technique because it applies methods such as text normalization, tokenization, and n-gram (bag-of-words) counting.
- Text Classification: The model classifies the input text into one of two categories: Positive or Negative. Text classification is a core NLP task.
- Applications in Real-World NLP: Sentiment analysis is commonly used in product review analysis, customer feedback analysis, and social media monitoring.
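To make the feature extraction step in the list above more concrete, here is a simplified bag-of-words sketch in plain C#. It only illustrates the idea of turning text into a vector of word counts. It is not what FeaturizeText does internally, and the BagOfWords class and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BagOfWords
{
    // Tokenization: lowercase the text and split on anything that is not a letter or digit.
    public static string[] Tokenize(string text)
    {
        char[] cleaned = text.ToLowerInvariant()
            .Select(c => char.IsLetterOrDigit(c) ? c : ' ')
            .ToArray();
        return new string(cleaned).Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Vocabulary: every distinct token in the training texts, in a fixed (sorted) order.
    public static List<string> BuildVocabulary(IEnumerable<string> texts)
    {
        return texts.SelectMany(text => Tokenize(text))
            .Distinct()
            .OrderBy(word => word, StringComparer.Ordinal)
            .ToList();
    }

    // Vectorization: count how often each vocabulary word occurs in the text.
    // Words that are not in the vocabulary are ignored.
    public static float[] Vectorize(string text, List<string> vocabulary)
    {
        var counts = new float[vocabulary.Count];
        foreach (string token in Tokenize(text))
        {
            int index = vocabulary.IndexOf(token);
            if (index >= 0)
            {
                counts[index] += 1f;
            }
        }
        return counts;
    }
}
```

With the two training texts "I love this product!" and "I hate this.", the vocabulary is hate, i, love, product, this, and Vectorize("I love love this!", vocabulary) returns 0, 1, 2, 0, 1. A classifier then learns weights over these positions instead of over raw strings.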
Output Samples
When you run the program, it trains the sentiment analysis model, saves it to sentimentModel.zip, prompts you for a line of text, and then prints that text, the per-class scores, and the predicted sentiment.
Enhancing the Model
While this is a beginner-friendly example, you can enhance the NLP capabilities by:
- Preprocessing the Text: Remove stop words, apply stemming/lemmatization, and normalize text before featurizing.
- Using Pretrained Embeddings: Integrate pretrained NLP models like Word2Vec or BERT for better feature extraction.
- Building on Other NLP Tasks: Expand this framework to perform other NLP tasks like named entity recognition (NER), text summarization, or question answering.
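The first suggestion in the list above, preprocessing, can be sketched in plain C# before the text reaches the pipeline. This is a minimal illustration under simple assumptions: the stop-word list is deliberately tiny, stemming and lemmatization are not covered, and TextPreprocessor and Clean are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class TextPreprocessor
{
    // A deliberately tiny stop-word list for illustration; real lists are much longer.
    // Negations such as "not" are left out because they carry sentiment.
    private static readonly HashSet<string> StopWords = new HashSet<string>(StringComparer.Ordinal)
    {
        "a", "an", "and", "at", "in", "is", "it", "of", "the", "this", "to", "was"
    };

    // Lowercases the text, replaces punctuation with spaces, and drops stop words.
    public static string Clean(string text)
    {
        var builder = new StringBuilder(text.Length);
        foreach (char c in text.ToLowerInvariant())
        {
            // Keep letters, digits, and apostrophes (so "don't" stays one word).
            builder.Append(char.IsLetterOrDigit(c) || c == '\'' ? c : ' ');
        }

        IEnumerable<string> kept = builder.ToString()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => !StopWords.Contains(word));

        return string.Join(" ", kept);
    }
}
```

For example, TextPreprocessor.Clean("The movie was FANTASTIC overall.") returns "movie fantastic overall". Whether this kind of cleaning helps depends on the data, so it is worth comparing results with and without it.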
Conclusion
You’ve created your first NLP model using ML.NET in C#. This sentiment analysis application processed human language, extracted meaningful features, and classified text into predefined categories. While this example focuses on sentiment classification, ML.NET can be extended to tackle various NLP challenges.
By exploring this example, you’ve taken a big step into the fascinating world of NLP. Continue experimenting with the model, refine it, and try applying it to real-world datasets!
Happy coding!