Modern web applications rely heavily on images, whether in blogs, product pages, or dynamic content portals. But writing good captions and alt text for every uploaded image is often a manual, time-consuming task.
Using AI, we can automatically analyze an uploaded image and generate caption and alt text suggestions for it.
This article shows you how to integrate Gemini AI into ASP.NET Core MVC (.NET 10) to process images and generate high-quality descriptive text.
Why Use Gemini AI for Image Captioning?
Gemini AI is trained on a massive multimodal dataset, which means:
It understands visual context (objects, scenes, emotions, actions)
It can generate human-like captions
It supports multiple languages
It can return multiple options so users can choose the best one
Features We Will Implement
Upload an image in ASP.NET Core MVC
Send the image bytes to Gemini AI
Ask Gemini to generate five caption options and five alt text options in the selected language
Render results on the page
SEO-friendly and accessible output
Before diving into the code, create a project in Visual Studio (2026 preferred) and get a Gemini API key from Google AI Studio.
Project Structure
The image below shows the project structure I follow in this demonstration.
![project structure]()
Add Gemini API Key to appsettings.json
{
  "Gemini": {
    "ApiKey": "your-api-key-here"
  }
}
For better security, store the API key in User Secrets instead of committing it in appsettings.json.
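Moving the key to User Secrets needs no code changes, because ASP.NET Core layers its configuration sources and, in the Development environment, the default builder loads User Secrets after appsettings.json. The sketch below only illustrates that layering rule: the two in-memory sources are stand-ins assumed for this example (not the real providers), and both values are placeholders.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

// Stand-in for appsettings.json (placeholder value, assumed for illustration).
var fromAppSettings = new Dictionary<string, string?>
{
    ["Gemini:ApiKey"] = "your-api-key-here"
};

// Stand-in for User Secrets (placeholder value, assumed for illustration).
var fromUserSecrets = new Dictionary<string, string?>
{
    ["Gemini:ApiKey"] = "key-from-user-secrets"
};

// Sources added later override earlier ones for the same key.
IConfiguration config = new ConfigurationBuilder()
    .AddInMemoryCollection(fromAppSettings)
    .AddInMemoryCollection(fromUserSecrets)
    .Build();

Console.WriteLine(config["Gemini:ApiKey"]); // prints "key-from-user-secrets"
```

In a real project the secret is typically created with `dotnet user-secrets init` followed by `dotnet user-secrets set "Gemini:ApiKey" "<your key>"`, and the service keeps reading `Gemini:ApiKey` exactly as before.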
Create the Image Upload View
<div class="container mt-5">
    <div class="card shadow p-4">
        <h2 class="mb-4 text-center">AI Image Caption Generator</h2>
        <form asp-action="Analyze" enctype="multipart/form-data" method="post">
            <div class="mb-3">
                <label for="file" class="form-label">Select Image</label>
                <input type="file" id="file" name="file" accept="image/*" class="form-control" required />
            </div>
            <div class="mb-3">
                <label for="language" class="form-label">Select Language</label>
                <select id="language" name="language" class="form-select">
                    <option value="en">English</option>
                    <option value="ne">Nepali</option>
                    <option value="hi">Hindi</option>
                    <option value="es">Spanish</option>
                    <option value="fr">French</option>
                    <option value="ja">Japanese</option>
                </select>
            </div>
            <button type="submit" class="btn btn-primary w-100">Analyze Image</button>
        </form>
    </div>
</div>
Create the Service to Handle Gemini Connection
GeminiService handles the connection to the Gemini API and sends the prompt along with the image. The uploaded image is converted to a Base64 string because the Gemini API expects inline image data to be Base64-encoded.
using System.Net.Http.Json;
using System.Text;
using System.Text.Json;

public class GeminiService
{
    private readonly IConfiguration _config;
    private readonly IHttpClientFactory _httpClientFactory;

    public GeminiService(IConfiguration config, IHttpClientFactory httpClientFactory)
    {
        _config = config;
        _httpClientFactory = httpClientFactory;
    }

    public async Task<(List<string> captions, List<string> alts)>
        AnalyzeImageAsync(byte[] imageBytes, string mimeType, string language = "English")
    {
        string? apiKey = _config["Gemini:ApiKey"];
        if (string.IsNullOrEmpty(apiKey))
            throw new InvalidOperationException("Gemini API key is missing.");

        var http = _httpClientFactory.CreateClient();

        // Gemini expects inline image data as a Base64 string.
        string base64Image = Convert.ToBase64String(imageBytes);

        // One text part (the prompt) and one inline_data part (the image).
        var requestBody = new
        {
            contents = new[]
            {
                new {
                    parts = new object[]
                    {
                        new { text =
                            $"Analyze this image and return:" +
                            $"\n - 5 caption options" +
                            $"\n - 5 alt text options" +
                            $"\n - Language: {language}" +
                            $"\nRespond in JSON only: {{ \"captions\": [...], \"alts\": [...] }}"
                        },
                        new {
                            inline_data = new {
                                mime_type = mimeType,
                                data = base64Image
                            }
                        }
                    }
                }
            }
        };

        string url =
            $"https://generativelanguage.googleapis.com/v1/models/gemini-2.5-flash:generateContent?key={apiKey}";

        var response = await http.PostAsync(
            url,
            new StringContent(JsonSerializer.Serialize(requestBody), Encoding.UTF8, "application/json")
        );

        if (!response.IsSuccessStatusCode)
        {
            string error = await response.Content.ReadAsStringAsync();
            throw new HttpRequestException($"Gemini API error ({(int)response.StatusCode}): {error}");
        }

        // The generated text is in candidates[0].content.parts[0].text.
        var json = await response.Content.ReadFromJsonAsync<JsonElement>();
        var textResponse = json
            .GetProperty("candidates")[0]
            .GetProperty("content")
            .GetProperty("parts")[0]
            .GetProperty("text")
            .GetString() ?? string.Empty;

        // The model may wrap its JSON in a Markdown code fence; strip it before parsing.
        textResponse = textResponse.Replace("```json", "").Replace("```", "").Trim();

        using var resultDoc = JsonDocument.Parse(textResponse);
        var resultJson = resultDoc.RootElement;

        var captions = resultJson.GetProperty("captions")
            .EnumerateArray().Select(x => x.GetString() ?? string.Empty).ToList();
        var alts = resultJson.GetProperty("alts")
            .EnumerateArray().Select(x => x.GetString() ?? string.Empty).ToList();

        return (captions, alts);
    }
}
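The service strips Markdown code fences with two Replace calls before parsing, which works as long as the model returns exactly the requested JSON. As a more defensive alternative, here is a minimal sketch of a parser that keeps only the outermost {...} of the reply and returns empty lists when the JSON is missing or malformed. GeminiReplyParser and its Parse method are hypothetical names introduced for this illustration; they are not part of the article's source code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper (not from the article's repository).
public static class GeminiReplyParser
{
    public static (List<string> Captions, List<string> Alts) Parse(string? reply)
    {
        var empty = (new List<string>(), new List<string>());
        if (string.IsNullOrWhiteSpace(reply)) return empty;

        // Keep only the outermost {...} so code fences or stray prose around the JSON are ignored.
        int start = reply.IndexOf('{');
        int end = reply.LastIndexOf('}');
        if (start < 0 || end <= start) return empty;

        try
        {
            using var doc = JsonDocument.Parse(reply.Substring(start, end - start + 1));
            return (ReadStrings(doc.RootElement, "captions"), ReadStrings(doc.RootElement, "alts"));
        }
        catch (JsonException)
        {
            return empty; // the reply did not contain valid JSON
        }
    }

    // Reads a string array property, tolerating a missing or mistyped property.
    private static List<string> ReadStrings(JsonElement root, string name)
    {
        if (root.ValueKind != JsonValueKind.Object ||
            !root.TryGetProperty(name, out var array) ||
            array.ValueKind != JsonValueKind.Array)
        {
            return new List<string>();
        }

        return array.EnumerateArray()
            .Where(x => x.ValueKind == JsonValueKind.String)
            .Select(x => x.GetString()!)
            .Where(s => !string.IsNullOrWhiteSpace(s))
            .ToList();
    }
}
```

With a helper like this, the service could pass the raw text of the first candidate to GeminiReplyParser.Parse and let the caller decide what to show when both lists come back empty, instead of failing on an unexpected reply.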
Register GeminiService
In Program.cs, add the following lines to register HttpClient and GeminiService.
// Register HttpClient and GeminiService
builder.Services.AddHttpClient();
builder.Services.AddSingleton<GeminiService>();
Create a Controller
When the user uploads an image, the ImageController processes the form submission, converts the file into bytes, detects its MIME type, and sends the prepared image data to the GeminiService for further processing by Gemini AI.
public class ImageController : Controller
{
    private readonly GeminiService _gemini;

    public ImageController(GeminiService gemini)
    {
        _gemini = gemini;
    }

    // Shows the upload form.
    public IActionResult Index() => View();

    [HttpPost]
    [RequestSizeLimit(10 * 1024 * 1024)] // 10 MB
    public async Task<IActionResult> Analyze(IFormFile file, string? language = null)
    {
        if (file == null || file.Length == 0)
        {
            ModelState.AddModelError("file", "Please select an image file.");
            // Index is the only upload action defined in this controller.
            return View("Index");
        }

        // Fall back to English when no language was posted.
        language = string.IsNullOrWhiteSpace(language) ? "en" : language;

        // Read the uploaded file into memory.
        using var ms = new MemoryStream();
        await file.CopyToAsync(ms);
        var bytes = ms.ToArray();

        var result = new ResponseModel();
        try
        {
            var mimeType = MimeTypeHelper.GetMimeType(file.FileName);
            var content = await _gemini.AnalyzeImageAsync(bytes, mimeType, language);
            result.Alts = content.alts;
            result.Captions = content.captions;
        }
        catch (Exception ex)
        {
            // Surface the error on the upload page.
            TempData["Error"] = "Failed to analyze the uploaded image. Error: " + ex.Message;
            return RedirectToAction(nameof(Index));
        }

        // Pass model to view
        return View("Result", result);
    }
}
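The controller calls MimeTypeHelper.GetMimeType, but the helper itself is not shown in the article. The following is a hypothetical sketch of what such a helper might look like, mapping a file extension to a MIME type. The extension list is an assumption based on the image formats Gemini commonly accepts, and the real helper in the repository may differ.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch; the article's actual MimeTypeHelper is not shown.
public static class MimeTypeHelper
{
    // Assumed list of supported image extensions (case-insensitive lookup).
    private static readonly Dictionary<string, string> Map =
        new(StringComparer.OrdinalIgnoreCase)
        {
            [".jpg"] = "image/jpeg",
            [".jpeg"] = "image/jpeg",
            [".png"] = "image/png",
            [".webp"] = "image/webp",
            [".heic"] = "image/heic",
            [".heif"] = "image/heif",
        };

    public static string GetMimeType(string fileName)
    {
        // Path.GetExtension returns an empty string when the name has no extension.
        string extension = Path.GetExtension(fileName);
        if (Map.TryGetValue(extension, out var mimeType)) return mimeType;

        throw new NotSupportedException($"Unsupported image type: '{extension}'.");
    }
}
```

Throwing for unknown extensions means non-image uploads are rejected before any API call is made, and the exception is handled by the catch block in the Analyze action.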
Create the Result View
@using ImageAnalyzer.Models
@model ResponseModel
@{
    ViewData["Title"] = "AI Result";
}
<div class="container mt-5">
    <div class="card shadow p-4">
        <h2 class="mb-4 text-center">AI Generated Caption &amp; Alt Text</h2>
        @if (Model.Alts.Any())
        {
            <h3>Suggested Alt Text</h3>
            <ul>
                @foreach (var alt in Model.Alts)
                {
                    <li>@alt</li>
                }
            </ul>
        }
        @if (Model.Captions.Any())
        {
            <h3>Suggested Captions</h3>
            <ul>
                @foreach (var caption in Model.Captions)
                {
                    <li>@caption</li>
                }
            </ul>
        }
        <a href="/image" class="btn btn-secondary mt-3">Analyze Another Image</a>
    </div>
</div>
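The controller and the Result view both depend on a ResponseModel class that the article does not list. A minimal sketch, assuming it lives in the ImageAnalyzer.Models namespace imported by the view and holds nothing beyond the two lists used here:

```csharp
using System.Collections.Generic;

namespace ImageAnalyzer.Models
{
    // Hypothetical sketch of the view model; the property names are taken
    // from how the controller and the Result view use it.
    public class ResponseModel
    {
        // Initialized to empty lists so Model.Captions.Any() and
        // Model.Alts.Any() are safe even when nothing was generated.
        public List<string> Captions { get; set; } = new();
        public List<string> Alts { get; set; } = new();
    }
}
```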
SEO Benefits
AI-generated captions and alt text improve both SEO and accessibility. Your editors no longer need to write this text manually, because the AI generates it in seconds.
Conclusion
Automating image captions and alt text in ASP.NET Core MVC with Gemini AI takes only a few lines of code. Your application can automatically describe any uploaded image, making content richer, smarter, and more user-friendly.
The full source code used in this article is available on GitHub: https://github.com/pasangtamang/ImageAnalyzer