Automating Image Captions & Alt Text in ASP.NET Core MVC with Gemini AI

Modern web applications rely heavily on images, whether on blogs, product pages, or dynamic content portals. But writing good captions and alt text for every uploaded image is often a manual, time-consuming task.

Using AI, we can automatically analyze an uploaded image and generate:

  • A meaningful, context-aware caption

  • SEO-friendly alt text

  • Multiple variations for each

  • Output in any language

This article shows you how to integrate Gemini AI into ASP.NET Core MVC (.NET 10) to process images and generate high-quality descriptive text.

Why Use Gemini AI for Image Captioning?

Gemini AI is trained on a massive multimodal dataset, which means:

  • It understands visual context (objects, scenes, emotions, actions)

  • It can generate human-like captions

  • It supports multiple languages

  • It can return multiple options so users can choose the best one

Features We Will Implement

  • Upload an image in ASP.NET Core MVC

  • Send the image bytes to Gemini AI

  • Ask Gemini to generate:

    • Multiple caption options

    • Multiple alt text variations

    • Output in multiple languages

  • Render results on the page

  • SEO-friendly and accessible output

Before diving into the code, create an ASP.NET Core MVC project in Visual Studio (2026 preferred) and get a Gemini API key from Google AI Studio.

Project Structure

The image below shows the project structure used in this walkthrough.

[Image: project structure]

Add Gemini API Key to appsettings.json

  
{
  "Gemini": {
    "ApiKey": "your-api-key-here"
  }
}
  

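For better security, store the API key in User Secrets during development instead of committing it to appsettings.json, for example by running dotnet user-secrets init followed by dotnet user-secrets set "Gemini:ApiKey" "your-api-key-here" in the project folder.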

Create the Image Upload View

  
<div class="container mt-5">
    <div class="card shadow p-4">
        <h2 class="mb-4 text-center">AI Image Caption Generator</h2>
        <form asp-action="Analyze" enctype="multipart/form-data" method="post">
            <div class="mb-3">
                <label class="form-label">Select Image</label>
                <input type="file" name="file" class="form-control" required />
            </div>
            <div class="mb-3">
                <label class="form-label">Select Language</label>
                <select name="language" class="form-select">
                    <option value="en">English</option>
                    <option value="ne">Nepali</option>
                    <option value="hi">Hindi</option>
                    <option value="es">Spanish</option>
                    <option value="fr">French</option>
                    <option value="ja">Japanese</option>
                </select>
            </div>
            <button class="btn btn-primary w-100">Analyze Image</button>
        </form>
    </div>
</div>
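
The language dropdown in this view posts two-letter codes such as en and ne, while a prompt usually reads more naturally with a full language name. A minimal sketch of a mapping helper follows, assuming a hypothetical LanguageNames class that is not part of the original project; the fallback to English is also an assumption.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper (not in the original project): maps the codes posted by
// the <select name="language"> element to names suitable for a prompt.
public static class LanguageNames
{
    private static readonly Dictionary<string, string> Names =
        new(StringComparer.OrdinalIgnoreCase)
        {
            ["en"] = "English",
            ["ne"] = "Nepali",
            ["hi"] = "Hindi",
            ["es"] = "Spanish",
            ["fr"] = "French",
            ["ja"] = "Japanese",
        };

    // Falls back to English when the code is null or not in the list.
    public static string FromCode(string? code) =>
        code is not null && Names.TryGetValue(code.Trim(), out var name) ? name : "English";
}
```

Calling LanguageNames.FromCode(language) before building the prompt would keep the form values short while giving the model an unambiguous language name.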

Create the Service to Handle Gemini Connection

GeminiService handles the connection to the Gemini API and builds the prompt. The uploaded image is converted to a Base64 string because the API expects inline image data in Base64 form.

  
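    using System.Text;       // Encoding
    using System.Text.Json;  // JsonSerializer, JsonElement, JsonDocument

    public class GeminiService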
    {
        private readonly IConfiguration _config;
        private readonly IHttpClientFactory _httpClientFactory;

        public GeminiService(IConfiguration config, IHttpClientFactory httpClientFactory)
        {
            _config = config;
            _httpClientFactory = httpClientFactory;
        }

        public async Task<(List<string> captions, List<string> alts)>
            AnalyzeImageAsync(byte[] imageBytes, string mimeType, string language = "English")
        {
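            // Read the key from configuration (appsettings.json or User Secrets).
            var apiKey = _config["Gemini:ApiKey"];
            if (string.IsNullOrEmpty(apiKey))
                throw new InvalidOperationException("Gemini API key is missing from configuration.");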
            var http = _httpClientFactory.CreateClient();
            string base64Image = Convert.ToBase64String(imageBytes);
            var requestBody = new
            {
                contents = new[]
                {
                new {
                    parts = new object[]
                    {
                        new { text =
                            $"Analyze this image and return:" +
                            $"\n - 5 caption options" +
                            $"\n - 5 alt text options" +
                            $"\n - Language: {language}" +
                            $"\nRespond in JSON only: {{ \"captions\": [...], \"alts\": [...] }}"
                        },
                        new {
                            inline_data = new {
                                mime_type = mimeType,
                                data = base64Image
                            }
                        }
                    }
                }
            }
            };
            string url =
                $"https://generativelanguage.googleapis.com/v1/models/gemini-2.5-flash:generateContent?key={apiKey}";
            var response = await http.PostAsync(
                url,
                new StringContent(JsonSerializer.Serialize(requestBody), Encoding.UTF8, "application/json")
            );

            if (!response.IsSuccessStatusCode)
            {
                string error = await response.Content.ReadAsStringAsync();
                throw new Exception($"Gemini API Error: {error}");
            }

            var json = await response.Content.ReadFromJsonAsync<JsonElement>();
            var textResponse = json
                .GetProperty("candidates")[0]
                .GetProperty("content")
                .GetProperty("parts")[0]
                .GetProperty("text")
                .GetString()
                ?? throw new InvalidOperationException("Gemini returned no text content.");

            textResponse = textResponse.Replace("```json", "").Replace("```", "");
            // Parse the JSON the model was asked to return and read both arrays.
            var resultJson = JsonDocument.Parse(textResponse).RootElement;
            var captions = resultJson.GetProperty("captions")
                .EnumerateArray().Select(x => x.GetString() ?? string.Empty).ToList();

            var alts = resultJson.GetProperty("alts")
                .EnumerateArray().Select(x => x.GetString() ?? string.Empty).ToList();

            return (captions, alts);
        }
    }
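
The parsing step in the service assumes the model replies with clean JSON, possibly wrapped in a Markdown code fence. A slightly more defensive variant is sketched here, assuming a hypothetical CaptionResponseParser class that is not part of the original service. It keeps only the text between the first opening brace and the last closing brace, and treats a missing array as empty.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper (not part of the original GeminiService): extracts the
// JSON object from the model's text reply and reads the two arrays from it.
public static class CaptionResponseParser
{
    public static (List<string> Captions, List<string> Alts) Parse(string modelText)
    {
        // The model may wrap the JSON in a code fence or add extra text,
        // so keep only the span from the first '{' to the last '}'.
        int start = modelText.IndexOf('{');
        int end = modelText.LastIndexOf('}');
        if (start < 0 || end <= start)
            throw new FormatException("No JSON object found in the model response.");

        using var doc = JsonDocument.Parse(modelText.Substring(start, end - start + 1));
        return (ReadStrings(doc.RootElement, "captions"), ReadStrings(doc.RootElement, "alts"));
    }

    // Returns an empty list when the property is missing or is not an array.
    private static List<string> ReadStrings(JsonElement root, string name)
    {
        if (!root.TryGetProperty(name, out var array) || array.ValueKind != JsonValueKind.Array)
            return new List<string>();

        return array.EnumerateArray()
            .Where(x => x.ValueKind == JsonValueKind.String)
            .Select(x => x.GetString()!)
            .ToList();
    }
}
```

Whether to throw or return empty lists on malformed output is a design choice. This sketch throws only when no JSON object is present at all, so the controller's existing catch block can report the failure.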
  

Register GeminiService

In Program.cs, add the following lines to register HttpClient and GeminiService.

  
// Register HttpClient and GeminiService
builder.Services.AddHttpClient();
builder.Services.AddSingleton<GeminiService>();

Create a Controller

When the user uploads an image, the ImageController processes the form submission, converts the file into bytes, detects its MIME type, and sends the prepared image data to the GeminiService for further processing by Gemini AI.

  
    public class ImageController : Controller
    {
        private readonly GeminiService _gemini;

        public ImageController(GeminiService gemini)
        {
            _gemini = gemini;
        }

        // GET /Image: shows the upload form.
        public IActionResult Index() => View();

        [HttpPost]
        [RequestSizeLimit(10 * 1024 * 1024)] // 10 MB
        public async Task<IActionResult> Analyze(IFormFile file, string language = "en")
        {
            if (file == null || file.Length == 0)
            {
                ModelState.AddModelError("file", "Please select an image file.");
                return View("Index"); // re-render the upload form
            }

            // Read the uploaded file into memory.
            using var ms = new MemoryStream();
            await file.CopyToAsync(ms);
            var bytes = ms.ToArray();

            var result = new ResponseModel();
            try
            {
                // Detect the MIME type from the file name and call Gemini.
                var mimeType = MimeTypeHelper.GetMimeType(file.FileName);
                var content = await _gemini.AnalyzeImageAsync(bytes, mimeType, language);
                result.Alts = content.alts;
                result.Captions = content.captions;
            }
            catch (Exception ex)
            {
                // Send the user back to the upload form with an error message.
                TempData["Error"] = "Failed to analyze the uploaded image. Error: " + ex.Message;
                return RedirectToAction(nameof(Index));
            }

            // Pass model to view
            return View("Result", result);
        }
    }
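
The controller relies on a MimeTypeHelper that is not listed in this article. The sketch below is a hypothetical stand-in that resolves the MIME type from the file extension; the real helper in the linked repository may differ, and the set of accepted extensions here is an assumption.

```csharp
#nullable enable
using System;
using System.IO;

// Hypothetical stand-in for the MimeTypeHelper used by the controller.
// The helper in the linked repository may be implemented differently.
public static class MimeTypeHelper
{
    public static string GetMimeType(string fileName)
    {
        // Compare extensions case-insensitively (".JPG" and ".jpg" are the same).
        string ext = Path.GetExtension(fileName).ToLowerInvariant();
        return ext switch
        {
            ".jpg" or ".jpeg" => "image/jpeg",
            ".png" => "image/png",
            ".webp" => "image/webp",
            ".heic" => "image/heic",
            ".heif" => "image/heif",
            // Assumed policy: reject anything that is not a known image type.
            _ => throw new NotSupportedException($"Unsupported image type: '{ext}'")
        };
    }
}
```

Because the exception is thrown inside the controller's try block, an unsupported file would surface through the same error path as a failed API call.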

Create the Result View

  
@using ImageAnalyzer.Models
@model ResponseModel
@{
    ViewData["Title"] = "AI Result";
}
<div class="container mt-5">
    <div class="card shadow p-4">
        <h2 class="mb-4 text-center">AI Generated Caption & Alt Text</h2>
        @if(Model.Alts.Any())
        {
            <h3>Suggested Alt text</h3>
            <ul>
                @foreach(var alt in Model.Alts)
                {
                    <li>@alt</li>
                } 
            </ul>           
        }

        @if(Model.Captions.Any())
        {
            <h3>Suggested Captions</h3>
            <ul>
                @foreach(var caption in Model.Captions)
                {
                    <li>@caption</li>
                }
            </ul>         
        }

        <a href="/image" class="btn btn-secondary mt-3">Analyze Another Image</a>
    </div>
</div>
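
The Result view binds to a ResponseModel whose definition is not shown in this article. A minimal sketch of what it might look like follows, inferred from how the controller and view use it; the actual class in the repository may differ. Initializing both lists is an assumption that keeps the Any() checks in the view from failing on null.

```csharp
using System.Collections.Generic;

namespace ImageAnalyzer.Models
{
    // Assumed shape of the view model, inferred from the controller and view.
    public class ResponseModel
    {
        // Caption suggestions returned by Gemini.
        public List<string> Captions { get; set; } = new();

        // Alt text suggestions returned by Gemini.
        public List<string> Alts { get; set; } = new();
    }
}
```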

SEO Benefits

Using AI-generated captions and alt text can help improve:

  • Google Image Search ranking

  • Accessibility score

  • User engagement

  • Localized content reach

  • Content creation speed

Your editors no longer need to write this text from scratch; the AI drafts it in seconds.

Conclusion

Automating image captions and alt text in ASP.NET Core MVC using Gemini AI is:

  • Simple to implement

  • Great for SEO

  • Helpful for accessibility

  • Extremely time-efficient

  • Adaptable to any language you need

With just a few lines of code, your application can automatically describe any uploaded image, making content richer, smarter, and more user-friendly.

The full source code for this article is available on GitHub: https://github.com/pasangtamang/ImageAnalyzer