Everyone is talking about AI agents in 2026. To be honest, building a demo AI agent is easy. You call an LLM, get a clean response, and it looks impressive at first. When I first built a demo, I was confident going into a real project.
But when my project manager at Capital Numbers assigned me to test the agent for an ecommerce workflow, the real problems showed up. The agent worked… but not really.
The issue appeared whenever a customer wanted to return an item. Instead of continuing the return flow, the agent would go back to:
“Here are your service options…”
Again. And again. In a single day of testing, we saw several return-related interactions in which the agent failed to move the user forward. I had built the flow around clean inputs, not real user behavior.
So, in this post, I explain what worked in my demo, what failed in the real use case, the mistakes I made, and why I started looking at MCP as a practical way to structure AI agents better in .NET.
Let’s first start with:
What are AI Agents?
AI agents are AI-powered systems that do more than answer questions: they take actions and complete goals on a user's behalf. Unlike traditional chatbots, which primarily generate responses, AI agents can plan tasks, make decisions, retrieve information, and execute multi-step workflows with minimal human involvement.
The Real Problem Was Not the Model
At first, I thought the model was the issue. Maybe the prompt was weak, or the response format was wrong. Maybe I needed better examples.
So I kept adjusting the prompt. But the behavior did not really improve.
The agent couldn’t continue the return flow. It didn’t always ask for the missing order ID or check what should happen next. So, instead of moving toward resolution, it kept reverting to the main menu or service options.
The thing is, an agent that crashes is easy to spot. An agent that quietly fails is much harder to catch, because it still responds even though it isn't resolving the user's problem.
The Demo-to-Production Gap
This was a classic case of the "Happy Path" trap. When we build demos, we usually code for the ideal scenario. We expect the user to ask a clear question, and the agent gives a clear answer.
But you can't expect every user to say: "I want to return the order 12345678."
They may say:
"This item does not fit."
"I got the wrong product."
"Can you exchange this?"
"I already selected a return. Why are you showing me the same options again?"
As humans, we can easily identify these as return or exchange signals. The LLM may also understand the intent, but the agent still needs the right tool, missing-input handling, workflow state, and fallback path. That’s exactly why my first version was failing.
Mistakes I Made & You Should Try to Avoid
It wasn't just a minor bug. It was a mix of small architectural mistakes that are easy to miss in a demo.
1. Don't Assume General Intelligence is Contextual Awareness
What feels obvious to us does not always translate into an application context for an agent. I assumed that because the LLM was smart, it would automatically understand the application context. Well, it didn’t.
An LLM may understand the sentence: "I want to return this item."
But that does not mean it knows:
Whether the user is logged in
Whether the user has an active order
Whether the order is eligible for return
Whether the return window is still open
Whether the product can be exchanged
Whether human support is required
Without reliable access to the backend systems that hold those answers, the agent is basically guessing. So when the agent was unsure, it fell back to what it knew: the home menu.
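One way to stop the guessing is to let the backend answer those questions first and hand the agent one explicit next step. The block below is a minimal, illustrative sketch of that idea rather than production code: `ReturnContext`, `NextStep`, and `ReturnContextEvaluator` are hypothetical names, and the fields are assumed to be filled in from your own session and order systems.

```csharp
using System;

// Hypothetical snapshot of what the backend knows; none of these names come from a library.
public sealed record ReturnContext(
    bool IsLoggedIn,
    bool HasActiveOrder,
    DateTime? DeliveredOn,   // null when the delivery date is unknown
    int ReturnWindowDays,    // assumed to come from the store's return policy
    bool IsExchangeable);

public enum NextStep { AskToLogIn, AskForOrder, OfferReturn, OfferReturnOrExchange, EscalateToHuman }

public static class ReturnContextEvaluator
{
    // Turn known facts into one explicit next step, so the agent does not have to guess.
    public static NextStep Decide(ReturnContext ctx, DateTime today)
    {
        if (!ctx.IsLoggedIn) return NextStep.AskToLogIn;
        if (!ctx.HasActiveOrder) return NextStep.AskForOrder;

        // Unknown delivery date or a closed return window: hand over instead of looping.
        if (ctx.DeliveredOn is null) return NextStep.EscalateToHuman;
        var daysSinceDelivery = (today.Date - ctx.DeliveredOn.Value.Date).TotalDays;
        if (daysSinceDelivery > ctx.ReturnWindowDays) return NextStep.EscalateToHuman;

        return ctx.IsExchangeable ? NextStep.OfferReturnOrExchange : NextStep.OfferReturn;
    }
}
```

With this in place, a logged-out user always gets `AskToLogIn` and an order outside its return window goes to a human, regardless of how the request was phrased. The escalation rule for closed windows is an assumption for the example; your policy may differ.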
2. Don't Make the Return Tool Too Restrictive
This was one of the biggest causes of the loop. The return tool expected an order ID too early in the conversation. While this worked from a backend logic perspective, real users raise requests differently. That’s why the agent would get stuck repeatedly asking for the same information instead of guiding the conversation naturally.
I’m sharing a simple version of the mistake:
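    using System.ComponentModel;
    using System.Threading.Tasks;
    using Microsoft.SemanticKernel;

    public class ReturnTools
    {
        // IOrderService is a placeholder for your application's own order service.
        private readonly IOrderService _orderService;

        public ReturnTools(IOrderService orderService) => _orderService = orderService;

        // orderId is required, so the agent cannot call this tool until the user provides one.
        [KernelFunction]
        [Description("Handles product returns")]
        public async Task<string> HandleReturn(string orderId)
        {
            var order = await _orderService.GetOrderAsync(orderId);
            return $"Return initiated for order {orderId}.";
        }
    }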
At first glance, this looks reasonable. The tool handles returns, accepts an order ID, calls the order service, and returns a response.
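But orderId is a required parameter. When the user hasn't shared an order ID yet, the agent, depending on how it is configured, may decide it can't call the tool at all and fall back to something else, such as the main menu.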
3. Give Return Intent Enough Priority
In demos, AI agents usually handle simple tasks. The user asks one clear thing, and the agent follows one clean path.
But a real user may suddenly switch topics during a conversation. For example:
“Where’s my order?”
“Actually, I want a refund.”
“Also, the item that arrived is damaged.”
Now the agent has to quickly understand what matters most. But sometimes the agent keeps following the old conversation path. For example, if it was showing delivery options or menu choices, it may continue talking about those instead of realizing the user now wants to start a return process. That's why intent anchoring is so important.
It helps the system recognize important user requests and shift the conversation immediately. So if the user says, “I want to return this,” the AI should pause the current flow and proceed directly to the return process.
In my setup, the agent could technically handle a return, but the system instructions gave return intent too little priority, which caused the loop.
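To make intent anchoring concrete, here is a minimal sketch in which a return signal always overrides the flow that is currently active. `Flow`, `IntentAnchor`, and the signal list are hypothetical and exist only for illustration; plain keyword matching is a stand-in for whatever intent detection you actually use.

```csharp
using System;
using System.Linq;

public enum Flow { MainMenu, OrderTracking, Return }

public static class IntentAnchor
{
    // Illustrative signals only; a real system would likely rely on the model or a classifier.
    private static readonly string[] ReturnSignals =
    {
        "return", "refund", "exchange", "damaged", "wrong product", "does not fit"
    };

    // A return signal wins over whatever flow the conversation was in.
    public static Flow Resolve(Flow current, string userMessage)
    {
        var text = (userMessage ?? string.Empty).ToLowerInvariant();
        return ReturnSignals.Any(signal => text.Contains(signal)) ? Flow.Return : current;
    }
}
```

Calling `IntentAnchor.Resolve(Flow.OrderTracking, "Actually, I want a refund.")` switches to `Flow.Return`, while a plain tracking question keeps the current flow. Keyword matching is deliberately crude: it would also fire on "I don't want to return it", which is one reason to treat this as a sketch.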
4. Don't Try to Fix Everything with Prompts
Naturally, I tried to fix the issue with instructions.
For example:
If the user wants a return, ask for the order ID.
If the user wants an exchange, check whether the product is eligible.
Don’t keep showing the menu again.
These instructions helped a little, but they weren't enough. Here's an example of what was going on:
User: “Can I exchange this shirt?”
AI Agent: “Please share your order ID.”
User: “Never mind, I just want to return it.”
AI Agent: “Please select from the menu options below.”
This usually happens because prompts can suggest behavior, but they can’t control conversation state or workflow transitions. So, when a user has multiple intents, the AI agent may take the wrong conversational path unless the system has stronger workflow control beneath it.
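What "workflow control beneath the prompt" can look like is sketched below as a tiny state machine. It is only an illustration under stated assumptions: `ReturnState` and `ReturnWorkflow` are hypothetical names, and a real workflow would have more states and persistence.

```csharp
public enum ReturnState { Idle, AwaitingOrderId, AwaitingChoice, Completed }

public sealed class ReturnWorkflow
{
    public ReturnState State { get; private set; } = ReturnState.Idle;
    public string? OrderId { get; private set; }

    // Call whenever a return, exchange, or refund intent is detected.
    // Repeating the intent mid-flow keeps the current state instead of restarting.
    public void Start()
    {
        if (State is ReturnState.Idle or ReturnState.Completed)
        {
            OrderId = null;
            State = ReturnState.AwaitingOrderId;
        }
    }

    // Accept the order ID only while the workflow is waiting for it.
    public void ProvideOrderId(string orderId)
    {
        if (State != ReturnState.AwaitingOrderId || string.IsNullOrWhiteSpace(orderId)) return;
        OrderId = orderId.Trim();
        State = ReturnState.AwaitingChoice;
    }

    public void Complete()
    {
        if (State == ReturnState.AwaitingChoice) State = ReturnState.Completed;
    }

    // The main menu is allowed only when no return is in progress.
    public bool CanShowMainMenu => State is ReturnState.Idle or ReturnState.Completed;
}
```

In the exchange-then-return conversation above, both messages would call `Start()`, the state would stay at `AwaitingOrderId`, and `CanShowMainMenu` would be false, so the menu reply is ruled out by code rather than by a prompt.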
For my AI agent, the missing piece was MCP.
What Is MCP?
MCP, or Model Context Protocol, is an open protocol that gives an AI agent a structured way to discover and use backend tools through a defined interface. Instead of giving the AI full access to your application or database, you grant it only a limited set of approved actions, which makes tool use safer and more organized.
In a .NET application, you can expose certain C# methods as MCP tools. Even though the AI agent decides what action is needed, the backend system still controls how that action is executed safely.
This worked for my agent because it did not need unlimited system access or hundreds of complicated instructions. It simply needed a clear, structured set of tools to perform specific tasks properly.
So, here’s an idea of how you should get started:
Step 1: Understand Your Needs Before Building an MCP
You don’t need a very complex setup to build an MCP server in .NET. But you should have the basics ready so you can focus on tool design instead of fighting the environment.
I have made this into a table to keep it easy to understand:
| What You Need | Why You Need It |
|---|---|
| .NET SDK | To create, build, and run the MCP server project. |
| Visual Studio or VS Code | To write, debug, and test the C# code. |
| Basic C# knowledge | Helps you create tools and connect them to your backend services. |
| An MCP-compatible client | To test whether the AI agent can discover and use your tools correctly. |
| Access to your backend systems | Your tools may need to connect with databases, APIs, order systems, or support platforms. |
For me, setting up MCP was actually quite easy. The harder part was deciding what the AI should be allowed to do, what information it should access, and how it should behave when important details were missing.
Step 2: Build an MCP-Style Return Tool in .NET
After restructuring the workflow, my return tool in .NET started looking more like this:
    using System.ComponentModel;
    using System.Threading.Tasks;
    using ModelContextProtocol.Server;

    [McpServerToolType]
    public static class CustomerSupportTools
    {
        [McpServerTool(Name = "process_return_request")]
        [Description(
            "Use this tool when the user mentions return, exchange, refund, wrong item, damaged item, broken item, or item does not fit. " +
            "Do not send the user back to the main menu for these requests. " +
            "Start the return or exchange flow and ask for missing details if needed.")]
        public static Task<string> ProcessReturnRequest(
            [Description("Order ID. If the user has not provided it, ask for it instead of failing.")]
            string? orderId = null)
        {
            // No order ID yet: stay in the return flow and ask for it.
            if (string.IsNullOrWhiteSpace(orderId))
            {
                return Task.FromResult(
                    "I can help with that. Could you please share your order ID so I can check the return or exchange options?");
            }

            // Replace this with your actual order lookup logic (mark the method async once it awaits a real call).
            return Task.FromResult(
                "I found your order. Would you like to return the item, exchange it, or speak with support?");
        }
    }
If you look at this version, the code itself is still pretty simple, but the behavior feels much smarter. Instead of failing immediately, the AI stays within the return conversation and naturally asks the user for the missing details.
A typical backend requires certain parameters, such as orderId, up front to keep the system predictable. Conversations rarely supply them in that order, so the agent should be designed to collect missing information step by step.
Note: These examples are only meant to explain the concept simply. In a real implementation, you'd still need proper order validation, authentication, permissions, and error handling. The MCP code examples are based on the ModelContextProtocol NuGet package and its ModelContextProtocol.Server namespace. The API may evolve, so check the latest official documentation before implementing.
Step 3: Add a Human Handover Tool
Another mistake I made was expecting the AI agent to solve everything on its own.
In real customer support, the AI agent shouldn’t keep trying forever. It should have a proper way to transfer the conversation to a human support person when necessary.
So, you should add a simple handover tool, like:
    using System.ComponentModel;
    using System.Threading.Tasks;
    using ModelContextProtocol.Server;

    [McpServerToolType]
    public static class SupportEscalationTools
    {
        [McpServerTool(Name = "escalate_to_support")]
        [Description(
            "Use this tool when a return, exchange, or refund request cannot be completed automatically, " +
            "or when the user asks for human support.")]
        public static Task<string> EscalateToSupport(
            [Description("Brief reason why the request needs human support.")]
            string reason)
        {
            // Replace this with your actual support ticket or handover logic, and pass the reason along.
            return Task.FromResult(
                "I'm connecting you with a support specialist who can help complete this request.");
        }
    }
The AI agent can still try to help first, but if it lacks enough information or the conversation is going nowhere, it now has a proper fallback instead of getting stuck.
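Deciding when "the conversation is going nowhere" works better as a rule in code than as a judgment left to the model. The sketch below is one hedged way to do it: `EscalationPolicy` is a hypothetical helper, the default threshold of three stalled turns is an assumption, and the check for the word "human" is a placeholder for real intent detection.

```csharp
using System;

public sealed class EscalationPolicy
{
    private readonly int _maxStalledTurns;
    private int _stalledTurns;

    // The default of 3 is an assumed threshold; tune it for your own support flow.
    public EscalationPolicy(int maxStalledTurns = 3) => _maxStalledTurns = maxStalledTurns;

    // Call after each turn. progressed = the workflow moved forward on this turn.
    public void RecordTurn(bool progressed) =>
        _stalledTurns = progressed ? 0 : _stalledTurns + 1;

    // Escalate when the user explicitly asks for a person or the conversation keeps stalling.
    public bool ShouldEscalate(string userMessage)
    {
        var text = userMessage ?? string.Empty;
        var asksForHuman = text.Contains("human", StringComparison.OrdinalIgnoreCase);
        return asksForHuman || _stalledTurns >= _maxStalledTurns;
    }
}
```

When `ShouldEscalate` returns true, the host application can steer the agent toward the `escalate_to_support` tool instead of letting it retry the same question.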
Step 4: Register and Start Your MCP Server
Once your tools are ready, here's the minimal host setup that registers and starts your MCP server:
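    using Microsoft.Extensions.DependencyInjection;
    using Microsoft.Extensions.Hosting;
    using Microsoft.Extensions.Logging;

    var builder = Host.CreateApplicationBuilder(args);
    // With the stdio transport, stdout carries protocol messages, so send logs to stderr.
    builder.Logging.AddConsole(options => options.LogToStandardErrorThreshold = LogLevel.Trace);
    builder.Services
        .AddMcpServer()
        .WithStdioServerTransport()
        .WithToolsFromAssembly();
    await builder.Build().RunAsync();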
This tells the host to run an MCP server over standard I/O and automatically register tools discovered from the current assembly.
Step 5: Test the AI Agent Before Releasing It to Real Customers
Okay, enough with the stories. Here’s the practical checklist you should consider using before calling any AI agent workflow production-ready.
| What to test | How to test |
|---|---|
| Clean happy-path prompt | Run prompts like “I want to return order 12345678” and verify whether the agent completes the flow without unnecessary questions. |
| Missing order ID | Send prompts without an order ID and check whether the agent asks for the missing detail instead of restarting the workflow. |
| Alternate return signals | Test prompts such as “This item is damaged” or “I ordered the wrong size,” and verify whether the correct return tool is triggered. |
| Repeated user intent | Repeat the same request across multiple messages and check whether the agent remembers the active workflow state. |
| Frustrated user/escalation | Send frustrated prompts like “I already told you this” or “I want a human,” and verify whether the escalation tool gets called. |
| Tool selection | Inspect MCP logs or traces to confirm whether process_return_request was selected correctly. |
| Missing-input handling | Check whether the tool continues the conversation by collecting missing information step by step. |
| Workflow state | Verify that the agent stays within the return or exchange flow rather than jumping back to the main menu. |
| Backend or tool failure | Simulate failed API calls, database timeouts, or invalid order lookups and observe how the agent responds. |
| Tool observability | Review logs, traces, tool inputs, and outputs to debug where the workflow failed. |
| MCP setup issues | Test server startup, environment variables, project paths, and tool discovery separately before testing prompts. |
| Tool output quality | Inspect raw MCP tool responses and verify whether the output is structured clearly enough for the AI to use properly. |
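Several rows of this checklist can be automated before any prompt testing, because the tool logic is plain C#. The sketch below covers the missing order ID, invalid order, and backend failure rows. Everything in it is hypothetical: `ReturnToolChecks` is a stand-in for your own tool body, the result codes are made up for the example, and the injected `orderExists` delegate is an assumed seam for simulating failures.

```csharp
using System;
using System.Threading.Tasks;

public static class ReturnToolChecks
{
    // Stand-in for a return tool body. The lookup is injected so a test can make it fail.
    public static async Task<string> ProcessReturnAsync(
        string? orderId, Func<string, Task<bool>> orderExists)
    {
        if (string.IsNullOrWhiteSpace(orderId)) return "ASK_ORDER_ID";
        try
        {
            return await orderExists(orderId) ? "ORDER_FOUND" : "ORDER_NOT_FOUND";
        }
        catch (Exception)
        {
            // Broad catch for the sketch; a real tool would log and handle specific failures.
            return "LOOKUP_FAILED";
        }
    }

    // Runs three checklist rows and throws on the first one that fails.
    public static async Task RunAsync()
    {
        Expect("ASK_ORDER_ID", await ProcessReturnAsync(null, _ => Task.FromResult(true)));
        Expect("ORDER_NOT_FOUND", await ProcessReturnAsync("12345678", _ => Task.FromResult(false)));
        Expect("LOOKUP_FAILED", await ProcessReturnAsync("12345678",
            _ => Task.FromException<bool>(new TimeoutException("simulated timeout"))));
    }

    private static void Expect(string expected, string actual)
    {
        if (expected != actual)
            throw new InvalidOperationException($"Expected {expected} but got {actual}.");
    }
}
```

Calling `await ReturnToolChecks.RunAsync();` from a test project or a small console app exercises those paths without an LLM in the loop. Tool selection, workflow state, and escalation still need end-to-end tests through an MCP client.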
The Biggest Lesson
AI agents need structured workflows, not just strong prompts. Looking back, I should have spent more time understanding the actual support workflow before coding. Talking to customer support teams, identifying repeated customer issues, and understanding common failure points would have helped me design better tools from the start.
Because the real problem is usually not intelligence. Modern LLMs are already smart enough to understand conversations. The harder part is helping the agent understand which action should occur next in a real workflow. That is why structured tooling like MCP exists.
So, before building an AI agent, understand the real operational problems first. It will save far more time than changing prompts later.