Table of Contents
The Critical Role of Monitoring in Modern Applications
Setting Up Comprehensive Logging Infrastructure
Application Insights Deep Dive
Advanced Health Check Systems
Real-Time Monitoring Dashboards
Distributed Tracing & Correlation
Performance Counters & Metrics
Alerting & Notification Systems
Security Monitoring & Compliance
Production Deployment & Best Practices
1. The Critical Role of Monitoring in Modern Applications
Why Monitoring is Non-Negotiable in Production
Application downtime or degraded performance can cause significant revenue loss, reputational damage, and customer churn. Comprehensive monitoring provides:
Business Impact Visibility
Real-time performance metrics
User experience monitoring
Business transaction tracking
Revenue impact analysis
Technical Operations
Proactive issue detection
Rapid root cause analysis
Capacity planning insights
Security threat detection
Real-World Monitoring Disaster Stories
Case Study: E-Commerce Outage
A major retailer lost $100,000 per hour during a 6-hour outage that could have been prevented with proper monitoring. Missing database connection pool monitoring led to cascading failures.
Case Study: Financial Service Incident
A trading platform experienced 15 minutes of latency spikes, resulting in $2M in lost trades. Inadequate distributed tracing made root cause analysis take days instead of minutes.
The Monitoring Maturity Model
// Monitoring maturity assessment
public class MonitoringMaturity
{
    public Level CurrentLevel { get; set; }

    public enum Level
    {
        Reactive,   // Basic error logging
        Proactive,  // Performance monitoring
        Predictive, // AI-driven insights
        Autonomous  // Self-healing systems
    }

    // Returns the highest level whose prerequisites are met. Switch arms are evaluated
    // top to bottom, so the most mature level has to be tested first.
    // The metrics type is assumed to expose the capability flags used below.
    public Level AssessMaturity(ApplicationMetrics metrics)
    {
        return metrics switch
        {
            { HasAIOps: true, HasAutoRemediation: true } => Level.Autonomous,
            { HasDistributedTracing: true, HasRealTimeAlerting: true } => Level.Predictive,
            { HasStructuredLogging: true, HasHealthChecks: true } => Level.Proactive,
            _ => Level.Reactive
        };
    }
}
2. Setting Up Comprehensive Logging Infrastructure
Structured Logging with Serilog
// Program.cs - Advanced Logging Configuration
using Serilog;
using Serilog.Events;
using Serilog.Sinks.ApplicationInsights;
var builder = WebApplication.CreateBuilder(args);
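// Configure Serilog. Sinks added in code below are registered in addition to any sinks declared in the
// "Serilog" configuration section, so declare each sink in only one place to avoid duplicate output.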
Log.Logger = new LoggerConfiguration()
.ReadFrom.Configuration(builder.Configuration)
.Enrich.FromLogContext()
.Enrich.WithProperty("Application", "ECommerceApp")
.Enrich.WithProperty("Environment", builder.Environment.EnvironmentName)
.Enrich.WithMachineName()
.Enrich.WithThreadId()
.WriteTo.Console(
outputTemplate: "[{Timestamp:HH:mm:ss} {Level:u3}] {Message:lj} {Properties:j}{NewLine}{Exception}")
.WriteTo.File(
path: "logs/app-.log",
rollingInterval: RollingInterval.Day,
retainedFileCountLimit: 30,
outputTemplate: "{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz} [{Level:u3}] {Message:lj} {Properties:j}{NewLine}{Exception}")
.WriteTo.ApplicationInsights(
builder.Configuration["ApplicationInsights:ConnectionString"],
TelemetryConverter.Traces)
.WriteTo.Seq("http://localhost:5341")
.CreateLogger();
builder.Host.UseSerilog();
// Add services
builder.Services.AddControllers();
builder.Services.AddHealthChecks();
var app = builder.Build();
// Log application startup
Log.Information("Application starting up at {StartupTime} with environment {Environment}",
DateTime.UtcNow, app.Environment.EnvironmentName);
app.UseRouting();
app.UseAuthorization();
app.MapControllers();
app.MapHealthChecks("/health");
try
{
Log.Information("Starting web host");
app.Run();
}
catch (Exception ex)
{
Log.Fatal(ex, "Host terminated unexpectedly");
}
finally
{
Log.CloseAndFlush();
}
Advanced Logging Service with Context Enrichment
// Services/AdvancedLoggingService.cs
public interface IAdvancedLogger
{
    IDisposable BeginScope<TState>(TState state);
    void LogCritical(string message, params object[] properties);
    void LogError(string message, params object[] properties);
    // Overload used by callers that need to attach the exception, e.g. LogError(ex, "...", args)
    void LogError(Exception exception, string message, params object[] properties);
    void LogWarning(string message, params object[] properties);
    void LogInformation(string message, params object[] properties);
    void LogDebug(string message, params object[] properties);
    void LogPerformance(string operation, TimeSpan duration, Dictionary<string, object> context = null);
    void LogBusinessEvent(string eventType, Dictionary<string, object> metrics);
}

public class AdvancedLogger : IAdvancedLogger
{
    private const double SlowOperationThresholdMs = 1000; // 1 second threshold

    private readonly ILogger<AdvancedLogger> _logger;
    private readonly IHttpContextAccessor _httpContextAccessor;

    public AdvancedLogger(ILogger<AdvancedLogger> logger, IHttpContextAccessor httpContextAccessor)
    {
        _logger = logger;
        _httpContextAccessor = httpContextAccessor;
    }

    public IDisposable BeginScope<TState>(TState state)
    {
        return _logger.BeginScope(state);
    }

    public void LogCritical(string message, params object[] properties) =>
        Write(LogLevel.Critical, null, message, properties);

    public void LogError(string message, params object[] properties) =>
        Write(LogLevel.Error, null, message, properties);

    public void LogError(Exception exception, string message, params object[] properties) =>
        Write(LogLevel.Error, exception, message, properties);

    public void LogWarning(string message, params object[] properties) =>
        Write(LogLevel.Warning, null, message, properties);

    public void LogInformation(string message, params object[] properties) =>
        Write(LogLevel.Information, null, message, properties);

    public void LogDebug(string message, params object[] properties) =>
        Write(LogLevel.Debug, null, message, properties);

    public void LogPerformance(string operation, TimeSpan duration, Dictionary<string, object> context = null)
    {
        var durationMs = duration.TotalMilliseconds;
        var isSlow = durationMs > SlowOperationThresholdMs;

        // Caller-supplied context travels as scope properties rather than positional arguments
        var scope = context != null
            ? new Dictionary<string, object>(context)
            : new Dictionary<string, object>();
        scope["IsSlow"] = isSlow;

        using (_logger.BeginScope(scope))
        {
            if (isSlow)
            {
                Write(LogLevel.Warning, null, "Performance alert: {Operation} took {DurationMs}ms", operation, durationMs);
            }
            else
            {
                Write(LogLevel.Information, null, "Performance: {Operation} completed in {DurationMs}ms", operation, durationMs);
            }
        }
    }

    public void LogBusinessEvent(string eventType, Dictionary<string, object> metrics)
    {
        Write(LogLevel.Information, null, "Business Event: {EventType} with metrics {@Metrics}", eventType, metrics);
    }

    // Positional arguments beyond the placeholders in a message template are dropped, so the
    // request context is attached through a logging scope. With Serilog this requires
    // Enrich.FromLogContext(), which the configuration above enables.
    private void Write(LogLevel level, Exception exception, string message, params object[] properties)
    {
        using (_logger.BeginScope(BuildContext()))
        {
            _logger.Log(level, exception, message, properties);
        }
    }

    private Dictionary<string, object> BuildContext()
    {
        var scope = new Dictionary<string, object>();
        var context = _httpContextAccessor.HttpContext;
        if (context != null)
        {
            // Fall back to the request's trace identifier so every log line of a request shares one ID
            scope["CorrelationId"] = context.Request.Headers["X-Correlation-ID"].FirstOrDefault()
                ?? context.TraceIdentifier;
            // Add user context
            scope["UserId"] = context.User.FindFirst("sub")?.Value ?? "anonymous";
            // Add request path
            scope["RequestPath"] = context.Request.Path.Value;
        }
        return scope;
    }
}
// Example usage in controller
[ApiController]
[Route("api/[controller]")]
public class OrdersController : ControllerBase
{
private readonly IAdvancedLogger _logger;
private readonly IOrderService _orderService;
public OrdersController(IAdvancedLogger logger, IOrderService orderService)
{
_logger = logger;
_orderService = orderService;
}
[HttpPost]
public async Task<ActionResult<Order>> CreateOrder([FromBody] CreateOrderRequest request)
{
using (_logger.BeginScope(new { OrderRequest = request, UserId = User.Identity.Name }))
{
var stopwatch = Stopwatch.StartNew();
try
{
_logger.LogInformation("Creating order for user {UserId} with {ItemCount} items",
User.Identity.Name, request.Items.Count);
var order = await _orderService.CreateOrderAsync(request);
stopwatch.Stop();
_logger.LogPerformance("CreateOrder", stopwatch.Elapsed, new Dictionary<string, object>
{
["OrderId"] = order.Id,
["TotalAmount"] = order.TotalAmount,
["ItemCount"] = request.Items.Count
});
// Log business event
_logger.LogBusinessEvent("OrderCreated", new Dictionary<string, object>
{
["OrderId"] = order.Id,
["CustomerId"] = User.Identity.Name,
["TotalAmount"] = order.TotalAmount,
["ItemsCount"] = order.Items.Count,
["PaymentMethod"] = request.PaymentMethod
});
return Ok(order);
}
catch (Exception ex)
{
stopwatch.Stop();
_logger.LogError(ex, "Failed to create order for user {UserId}", User.Identity.Name);
_logger.LogBusinessEvent("OrderCreationFailed", new Dictionary<string, object>
{
["Error"] = ex.Message,
["UserId"] = User.Identity.Name,
["DurationMs"] = stopwatch.ElapsedMilliseconds
});
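// An unexpected exception is a server-side failure, so report 500 rather than 400
return StatusCode(StatusCodes.Status500InternalServerError, new { error = "Order creation failed" });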
}
}
}
}
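The controller above repeats the same pattern: start a Stopwatch, run the operation, stop, then log. A minimal sketch of how that timing could be factored into a helper is shown below. `TimeAsync` is a hypothetical name, not part of the article's code or of any library, and the 50 ms delay only simulates work.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper: runs an async operation and returns its result
// together with the elapsed wall-clock time.
static async Task<(T Result, TimeSpan Elapsed)> TimeAsync<T>(Func<Task<T>> operation)
{
    var stopwatch = Stopwatch.StartNew();
    var result = await operation();
    stopwatch.Stop();
    return (result, stopwatch.Elapsed);
}

// Usage: time a simulated operation, then hand the duration to the logger.
var (value, elapsed) = await TimeAsync(async () =>
{
    await Task.Delay(50); // stands in for a call such as CreateOrderAsync
    return 42;
});

Console.WriteLine($"Result {value} computed in {elapsed.TotalMilliseconds:F0} ms");
```

In a controller, the returned `Elapsed` value could be passed straight to `LogPerformance`. If the operation throws, this sketch propagates the exception without reporting a duration, so a failure path that needs timing would still need its own try/catch.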
Configuration for Production Logging
// appsettings.Production.json
{
"Serilog": {
"Using": [
"Serilog.Sinks.Console",
"Serilog.Sinks.File",
"Serilog.Sinks.ApplicationInsights",
"Serilog.Sinks.Seq"
],
"MinimumLevel": {
"Default": "Information",
"Override": {
"Microsoft": "Warning",
"System": "Warning",
"Microsoft.AspNetCore": "Warning",
"Microsoft.EntityFrameworkCore": "Warning"
}
},
"WriteTo": [
{
"Name": "Console",
"Args": {
"outputTemplate": "[{Timestamp:HH:mm:ss} {Level:u3}] {Message:lj} <s:{SourceContext}>{NewLine}{Exception}"
}
},
{
"Name": "File",
"Args": {
"path": "/var/log/app/app-.log",
"rollingInterval": "Day",
"retainedFileCountLimit": 30,
"outputTemplate": "{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz} [{Level:u3}] {Message:lj} {Properties:j}{NewLine}{Exception}"
}
},
{
"Name": "ApplicationInsights",
"Args": {
"connectionString": "InstrumentationKey=your-key-here;IngestionEndpoint=https://eastus-8.in.applicationinsights.azure.com/",
"telemetryConverter": "Serilog.Sinks.ApplicationInsights.TelemetryConverters.TraceTelemetryConverter, Serilog.Sinks.ApplicationInsights"
}
},
{
"Name": "Seq",
"Args": {
"serverUrl": "http://seq:5341"
}
}
],
"Enrich": [
"FromLogContext",
"WithMachineName",
"WithThreadId",
"WithAssemblyName",
"WithAssemblyVersion"
],
"Properties": {
"Application": "ECommercePlatform",
"Environment": "Production"
}
},
"Logging": {
"LogLevel": {
"Default": "Information",
"Microsoft.AspNetCore": "Warning",
"Microsoft.EntityFrameworkCore.Database.Command": "Warning"
}
}
}
3. Application Insights Deep Dive
Comprehensive Application Insights Setup
// Program.cs - Application Insights Configuration
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.ApplicationInsights.AspNetCore;
var builder = WebApplication.CreateBuilder(args);
// Add Application Insights
builder.Services.AddApplicationInsightsTelemetry(options =>
{
options.ConnectionString = builder.Configuration["ApplicationInsights:ConnectionString"];
options.EnableAdaptiveSampling = false; // We want all telemetry for critical apps
options.EnablePerformanceCounterCollectionModule = true;
options.EnableQuickPulseMetricStream = true;
options.EnableDiagnosticsTelemetryModule = true;
options.EnableAzureInstanceMetadataTelemetryModule = true;
});
// Additional telemetry processors
builder.Services.AddApplicationInsightsTelemetryProcessor<CustomTelemetryProcessor>();
builder.Services.AddSingleton<ITelemetryInitializer, CustomTelemetryInitializer>();
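// Register IHttpContextAccessor, which the telemetry initializer below depends on (safe to call more than once)
builder.Services.AddHttpContextAccessor();
// Add distributed tracing. AddDistributedTracing is not an Application Insights SDK method;
// it is assumed here to be a custom extension method defined elsewhere in the project.
builder.Services.AddDistributedTracing();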
var app = builder.Build();
// Custom telemetry initializer
public class CustomTelemetryInitializer : ITelemetryInitializer
{
private readonly IHttpContextAccessor _httpContextAccessor;
public CustomTelemetryInitializer(IHttpContextAccessor httpContextAccessor)
{
_httpContextAccessor = httpContextAccessor;
}
public void Initialize(ITelemetry telemetry)
{
var context = _httpContextAccessor.HttpContext;
if (context != null)
{
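// Record the caller-supplied correlation ID as a custom property. Operation.Id is left untouched
// because the SDK sets it from the current Activity to link requests with their dependencies.
var correlationId = context.Request.Headers["X-Correlation-ID"].FirstOrDefault();
if (!string.IsNullOrEmpty(correlationId))
{
telemetry.Context.GlobalProperties["CorrelationId"] = correlationId;
}
// Add user context (User or Identity can be null for unauthenticated requests)
var userId = context.User?.Identity?.Name;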
if (!string.IsNullOrEmpty(userId))
{
telemetry.Context.User.Id = userId;
telemetry.Context.User.AuthenticatedUserId = userId;
}
// Add business context
telemetry.Context.GlobalProperties["BusinessUnit"] = "ECommerce";
telemetry.Context.GlobalProperties["ApplicationVersion"] = "1.2.3";
// Add custom properties based on request
if (context.Request.Path.HasValue)
{
telemetry.Context.GlobalProperties["RequestPath"] = context.Request.Path.Value;
}
}
}
}
// Custom telemetry processor for filtering
public class CustomTelemetryProcessor : ITelemetryProcessor
{
private readonly ITelemetryProcessor _next;
public CustomTelemetryProcessor(ITelemetryProcessor next)
{
_next = next;
}
public void Process(ITelemetry item)
{
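// Filter out health check requests from metric collection (Url can be null on manually tracked requests)
if (item is RequestTelemetry request &&
request.Url?.ToString().Contains("/health", StringComparison.OrdinalIgnoreCase) == true)
{
return;
}
// Filter out noisy dependencies: only fast, successful HTTP calls are dropped, so quick failures are still recorded.
// The SDK reports the dependency type as "Http", hence the case-insensitive comparison.
if (item is DependencyTelemetry dependency &&
string.Equals(dependency.Type, "HTTP", StringComparison.OrdinalIgnoreCase) &&
dependency.Success == true &&
dependency.Duration < TimeSpan.FromMilliseconds(100))
{
return;
}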
// Add custom properties to all telemetry
item.Context.GlobalProperties["ProcessingTimestamp"] = DateTime.UtcNow.ToString("O");
_next.Process(item);
}
}
Custom Telemetry Service
// Services/TelemetryService.cs
public interface ITelemetryService
{
void TrackEvent(string eventName, Dictionary<string, string> properties = null, Dictionary<string, double> metrics = null);
void TrackException(Exception exception, Dictionary<string, string> properties = null);
void TrackMetric(string metricName, double value, Dictionary<string, string> properties = null);
void TrackDependency(string dependencyType, string dependencyName, string data, DateTimeOffset startTime, TimeSpan duration, bool success);
void TrackRequest(string name, DateTimeOffset startTime, TimeSpan duration, string responseCode, bool success);
void TrackAvailability(string name, DateTimeOffset timeStamp, TimeSpan duration, string runLocation, bool success, string message = null);
void TrackBusinessMetric(string metricName, double value, string category, Dictionary<string, string> dimensions = null);
}
public class ApplicationInsightsTelemetryService : ITelemetryService
{
private readonly TelemetryClient _telemetryClient;
private readonly ILogger<ApplicationInsightsTelemetryService> _logger;
public ApplicationInsightsTelemetryService(TelemetryClient telemetryClient, ILogger<ApplicationInsightsTelemetryService> logger)
{
_telemetryClient = telemetryClient;
_logger = logger;
}
public void TrackEvent(string eventName, Dictionary<string, string> properties = null, Dictionary<string, double> metrics = null)
{
try
{
_telemetryClient.TrackEvent(eventName, properties, metrics);
_logger.LogDebug("Tracked event: {EventName}", eventName);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to track event: {EventName}", eventName);
}
}
public void TrackException(Exception exception, Dictionary<string, string> properties = null)
{
try
{
var telemetry = new ExceptionTelemetry(exception);
if (properties != null)
{
foreach (var prop in properties)
{
telemetry.Properties[prop.Key] = prop.Value;
}
}
_telemetryClient.TrackException(telemetry);
_logger.LogDebug("Tracked exception: {ExceptionType}", exception.GetType().Name);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to track exception");
}
}
public void TrackMetric(string metricName, double value, Dictionary<string, string> properties = null)
{
try
{
var metric = new MetricTelemetry(metricName, value);
if (properties != null)
{
foreach (var prop in properties)
{
metric.Properties[prop.Key] = prop.Value;
}
}
_telemetryClient.TrackMetric(metric);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to track metric: {MetricName}", metricName);
}
}
public void TrackDependency(string dependencyType, string dependencyName, string data, DateTimeOffset startTime, TimeSpan duration, bool success)
{
try
{
_telemetryClient.TrackDependency(dependencyType, dependencyName, data, startTime, duration, success);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to track dependency: {DependencyName}", dependencyName);
}
}
public void TrackRequest(string name, DateTimeOffset startTime, TimeSpan duration, string responseCode, bool success)
{
try
{
_telemetryClient.TrackRequest(name, startTime, duration, responseCode, success);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to track request: {RequestName}", name);
}
}
public void TrackAvailability(string name, DateTimeOffset timeStamp, TimeSpan duration, string runLocation, bool success, string message = null)
{
try
{
_telemetryClient.TrackAvailability(name, timeStamp, duration, runLocation, success, message);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to track availability: {AvailabilityTestName}", name);
}
}
public void TrackBusinessMetric(string metricName, double value, string category, Dictionary<string, string> dimensions = null)
{
try
{
var properties = new Dictionary<string, string>
{
["Category"] = category,
["BusinessMetric"] = "true"
};
if (dimensions != null)
{
foreach (var dimension in dimensions)
{
properties[dimension.Key] = dimension.Value;
}
}
TrackMetric(metricName, value, properties);
_logger.LogInformation("Business metric tracked: {MetricName}={Value} in category {Category}",
metricName, value, category);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to track business metric: {MetricName}", metricName);
}
}
}
// Example usage in business service
public class OrderService : IOrderService
{
private readonly ITelemetryService _telemetry;
private readonly IAdvancedLogger _logger;
public OrderService(ITelemetryService telemetry, IAdvancedLogger logger)
{
_telemetry = telemetry;
_logger = logger;
}
public async Task<Order> ProcessOrderAsync(Order order)
{
var stopwatch = Stopwatch.StartNew();
try
{
// Track business event
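_telemetry.TrackEvent("OrderProcessingStarted", new Dictionary<string, string>
{
["OrderId"] = order.Id.ToString(),
["CustomerId"] = order.CustomerId,
// Culture-invariant number so the value reads the same regardless of server locale
["TotalAmount"] = order.TotalAmount.ToString("F2", System.Globalization.CultureInfo.InvariantCulture),
["ItemCount"] = order.Items.Count.ToString()
});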
// Business logic
await ValidateOrderAsync(order);
await ProcessPaymentAsync(order);
await UpdateInventoryAsync(order);
await SendConfirmationAsync(order);
// Track success metrics
_telemetry.TrackBusinessMetric("OrdersProcessed", 1, "Revenue");
// TrackBusinessMetric takes a double; a decimal TotalAmount needs an explicit cast
_telemetry.TrackBusinessMetric("Revenue", (double)order.TotalAmount, "Revenue");
stopwatch.Stop();
// Track performance
_telemetry.TrackMetric("OrderProcessingTime", stopwatch.ElapsedMilliseconds);
_logger.LogInformation("Order {OrderId} processed successfully in {ProcessingTime}ms",
order.Id, stopwatch.ElapsedMilliseconds);
return order;
}
catch (Exception ex)
{
stopwatch.Stop();
// Track failure
_telemetry.TrackException(ex, new Dictionary<string, string>
{
["OrderId"] = order.Id.ToString(),
["ProcessingTime"] = stopwatch.ElapsedMilliseconds.ToString()
});
_telemetry.TrackBusinessMetric("OrderProcessingFailed", 1, "Errors");
_logger.LogError(ex, "Failed to process order {OrderId}", order.Id);
throw;
}
}
}
4. Advanced Health Check Systems
Comprehensive Health Check Setup
// Program.cs - Health Checks Configuration
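using System.Data;
using System.Diagnostics;
using System.Text.Json;
using Dapper; // QueryFirstOrDefaultAsync / ExecuteScalarAsync used in DatabaseHealthCheck
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Data.SqlClient; // assumed provider; older projects may use System.Data.SqlClient
using Microsoft.Extensions.Diagnostics.HealthChecks;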
var builder = WebApplication.CreateBuilder(args);
// Add basic health checks
builder.Services.AddHealthChecks()
.AddCheck("self", () => HealthCheckResult.Healthy(), tags: new[] { "ready" })
.AddSqlServer(
connectionString: builder.Configuration.GetConnectionString("DefaultConnection"),
name: "sql",
failureStatus: HealthStatus.Unhealthy,
tags: new[] { "ready", "database" })
.AddRedis(
redisConnectionString: builder.Configuration.GetConnectionString("Redis"),
name: "redis",
failureStatus: HealthStatus.Degraded,
tags: new[] { "ready", "cache" })
.AddAzureBlobStorage(
connectionString: builder.Configuration.GetConnectionString("AzureStorage"),
name: "blobstorage",
failureStatus: HealthStatus.Degraded,
tags: new[] { "ready", "storage" })
.AddApplicationInsightsPublisher();
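// Add custom health checks. ExternalApiHealthCheck takes an HttpClient in its constructor,
// so the HTTP client services are registered first.
builder.Services.AddHttpClient();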
builder.Services.AddHealthChecks()
.AddCheck<DatabaseHealthCheck>("database_advanced", tags: new[] { "detailed" })
.AddCheck<ExternalApiHealthCheck>("external_api", tags: new[] { "external" })
.AddCheck<DiskSpaceHealthCheck>("disk_space", tags: new[] { "infrastructure" });
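// Configure health check UI. The UI polls the endpoints registered here and expects the JSON format
// written by UIResponseWriter.WriteHealthCheckUIResponse (AspNetCore.HealthChecks.UI.Client package),
// so use that response writer on any endpoint the UI should display.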
builder.Services.AddHealthChecksUI(settings =>
{
settings.SetHeaderText("ECommerce Platform - Health Status");
settings.AddHealthCheckEndpoint("API", "/health");
settings.AddHealthCheckEndpoint("Database", "/health/database");
settings.AddHealthCheckEndpoint("External Services", "/health/external");
settings.SetEvaluationTimeInSeconds(60);
settings.SetApiMaxActiveRequests(3);
settings.MaximumHistoryEntriesPerEndpoint(50);
})
.AddInMemoryStorage();
var app = builder.Build();
// Map health check endpoints
app.MapHealthChecks("/health", new HealthCheckOptions
{
Predicate = _ => true,
ResponseWriter = WriteResponse,
AllowCachingResponses = false
});
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
Predicate = check => check.Tags.Contains("ready"),
ResponseWriter = WriteResponse
});
app.MapHealthChecks("/health/database", new HealthCheckOptions
{
Predicate = check => check.Tags.Contains("database"),
ResponseWriter = WriteResponse
});
app.MapHealthChecks("/health/external", new HealthCheckOptions
{
Predicate = check => check.Tags.Contains("external"),
ResponseWriter = WriteResponse
});
// Health checks UI
app.MapHealthChecksUI(options =>
{
options.UIPath = "/health-ui";
options.AddCustomStylesheet("wwwroot/custom-health.css");
});
// Custom response writer
static Task WriteResponse(HttpContext context, HealthReport result)
{
context.Response.ContentType = "application/json; charset=utf-8";
var response = new
{
status = result.Status.ToString(),
totalDuration = result.TotalDuration.TotalMilliseconds,
checks = result.Entries.Select(entry => new
{
name = entry.Key,
status = entry.Value.Status.ToString(),
duration = entry.Value.Duration.TotalMilliseconds,
description = entry.Value.Description,
exception = entry.Value.Exception?.Message,
data = entry.Value.Data
})
};
var json = JsonSerializer.Serialize(response, new JsonSerializerOptions
{
WriteIndented = true,
PropertyNamingPolicy = JsonNamingPolicy.CamelCase
});
return context.Response.WriteAsync(json);
}
// Custom health check implementations
public class DatabaseHealthCheck : IHealthCheck
{
private readonly IConfiguration _configuration;
private readonly ILogger<DatabaseHealthCheck> _logger;
public DatabaseHealthCheck(IConfiguration configuration, ILogger<DatabaseHealthCheck> logger)
{
_configuration = configuration;
_logger = logger;
}
public async Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default)
{
var stopwatch = Stopwatch.StartNew();
try
{
var connectionString = _configuration.GetConnectionString("DefaultConnection");
using var connection = new SqlConnection(connectionString);
await connection.OpenAsync(cancellationToken);
// Capture the time taken to open the connection before any queries run
var connectionTimeMs = stopwatch.ElapsedMilliseconds;
// Check basic connectivity
var canConnect = connection.State == ConnectionState.Open;
if (!canConnect)
{
return HealthCheckResult.Unhealthy("Cannot connect to database");
}
// Check critical tables exist
var tablesCheck = await connection.QueryFirstOrDefaultAsync<int>(
"SELECT COUNT(*) FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME IN ('Orders', 'Products', 'Customers')");
if (tablesCheck < 3)
{
return HealthCheckResult.Degraded("Some critical tables are missing");
}
// Check performance with a simple query
var performanceStopwatch = Stopwatch.StartNew();
await connection.ExecuteScalarAsync<int>("SELECT COUNT(*) FROM Orders");
performanceStopwatch.Stop();
var data = new Dictionary<string, object>
{
["connection_time_ms"] = connectionTimeMs,
["query_performance_ms"] = performanceStopwatch.ElapsedMilliseconds,
["database_name"] = connection.Database,
["server"] = connection.DataSource
};
if (performanceStopwatch.ElapsedMilliseconds > 1000)
{
_logger.LogWarning("Database performance check took {ElapsedMs}ms", performanceStopwatch.ElapsedMilliseconds);
return HealthCheckResult.Degraded("Database performance is slow", data: data);
}
stopwatch.Stop();
_logger.LogInformation("Database health check completed in {ElapsedMs}ms", stopwatch.ElapsedMilliseconds);
return HealthCheckResult.Healthy("Database is healthy", data);
}
catch (Exception ex)
{
stopwatch.Stop();
_logger.LogError(ex, "Database health check failed after {ElapsedMs}ms", stopwatch.ElapsedMilliseconds);
return HealthCheckResult.Unhealthy(
"Database health check failed",
exception: ex,
data: new Dictionary<string, object> { ["failure_time_ms"] = stopwatch.ElapsedMilliseconds });
}
}
}
public class ExternalApiHealthCheck : IHealthCheck
{
private readonly HttpClient _httpClient;
private readonly ILogger<ExternalApiHealthCheck> _logger;
public ExternalApiHealthCheck(HttpClient httpClient, ILogger<ExternalApiHealthCheck> logger)
{
_httpClient = httpClient;
_logger = logger;
}
public async Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default)
{
var stopwatch = Stopwatch.StartNew();
var results = new List<string>();
try
{
// Check payment gateway
var paymentGatewayResult = await CheckPaymentGatewayAsync(cancellationToken);
results.Add($"Payment Gateway: {paymentGatewayResult}");
// Check shipping service
var shippingServiceResult = await CheckShippingServiceAsync(cancellationToken);
results.Add($"Shipping Service: {shippingServiceResult}");
// Check email service
var emailServiceResult = await CheckEmailServiceAsync(cancellationToken);
results.Add($"Email Service: {emailServiceResult}");
var failedChecks = results.Count(r => r.Contains("Unhealthy"));
var degradedChecks = results.Count(r => r.Contains("Degraded"));
stopwatch.Stop();
var data = new Dictionary<string, object>
{
["total_external_checks"] = results.Count,
["failed_checks"] = failedChecks,
["degraded_checks"] = degradedChecks,
["check_duration_ms"] = stopwatch.ElapsedMilliseconds,
["detailed_results"] = results
};
if (failedChecks > 0)
{
return HealthCheckResult.Unhealthy($"{failedChecks} external services are unhealthy", data: data);
}
if (degradedChecks > 0)
{
return HealthCheckResult.Degraded($"{degradedChecks} external services are degraded", data: data);
}
return HealthCheckResult.Healthy("All external services are healthy", data);
}
catch (Exception ex)
{
stopwatch.Stop();
_logger.LogError(ex, "External API health check failed");
return HealthCheckResult.Unhealthy("External API health check failed", ex);
}
}
private async Task<string> CheckPaymentGatewayAsync(CancellationToken cancellationToken)
{
try
{
var response = await _httpClient.GetAsync("https://api.paymentgateway.com/health", cancellationToken);
if (response.IsSuccessStatusCode)
{
return "Healthy";
}
else if ((int)response.StatusCode >= 500)
{
return "Unhealthy";
}
else
{
return "Degraded";
}
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Payment gateway health check failed");
return "Unhealthy";
}
}
private Task<string> CheckShippingServiceAsync(CancellationToken cancellationToken)
{
// Placeholder: implement along the lines of CheckPaymentGatewayAsync.
// Not marked async because nothing is awaited yet.
return Task.FromResult("Healthy");
}
private Task<string> CheckEmailServiceAsync(CancellationToken cancellationToken)
{
// Placeholder: implement along the lines of CheckPaymentGatewayAsync.
return Task.FromResult("Healthy");
}
}
public class DiskSpaceHealthCheck : IHealthCheck
{
private readonly ILogger<DiskSpaceHealthCheck> _logger;
public DiskSpaceHealthCheck(ILogger<DiskSpaceHealthCheck> logger)
{
_logger = logger;
}
public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default)
{
try
{
var drive = new DriveInfo(Path.GetPathRoot(Environment.CurrentDirectory));
var availableFreeSpaceGB = drive.AvailableFreeSpace / 1024 / 1024 / 1024;
var totalSizeGB = drive.TotalSize / 1024 / 1024 / 1024;
var usedPercentage = (double)(drive.TotalSize - drive.AvailableFreeSpace) / drive.TotalSize * 100;
var data = new Dictionary<string, object>
{
["available_gb"] = availableFreeSpaceGB,
["total_gb"] = totalSizeGB,
["used_percentage"] = usedPercentage,
["drive_name"] = drive.Name,
["drive_format"] = drive.DriveFormat
};
if (availableFreeSpaceGB < 5) // Less than 5GB free
{
return Task.FromResult(HealthCheckResult.Unhealthy("Disk space critically low", data: data));
}
else if (availableFreeSpaceGB < 10) // Less than 10GB free
{
return Task.FromResult(HealthCheckResult.Degraded("Disk space low", data: data));
}
else if (usedPercentage > 90) // More than 90% used
{
return Task.FromResult(HealthCheckResult.Degraded("Disk usage high", data: data));
}
else
{
return Task.FromResult(HealthCheckResult.Healthy("Disk space adequate", data: data));
}
}
catch (Exception ex)
{
_logger.LogError(ex, "Disk space health check failed");
return Task.FromResult(HealthCheckResult.Unhealthy("Disk space check failed", ex));
}
}
}
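The threshold logic in DiskSpaceHealthCheck is easier to unit test when it is separated from DriveInfo. The sketch below shows one way to do that with the same thresholds (under 5 GB free is unhealthy, under 10 GB free or over 90% used is degraded). `ClassifyDisk` is a hypothetical helper, and it returns plain strings instead of HealthCheckResult so the example has no package dependencies.

```csharp
using System;

// Hypothetical helper: applies the disk thresholds from the health check above
// to raw byte counts, so the decision can be tested without a real drive.
static (string Status, string Description) ClassifyDisk(long availableBytes, long totalBytes)
{
    const double BytesPerGb = 1024d * 1024d * 1024d;

    if (totalBytes <= 0 || availableBytes < 0 || availableBytes > totalBytes)
    {
        throw new ArgumentOutOfRangeException(
            nameof(availableBytes), "Expected 0 <= availableBytes <= totalBytes and totalBytes > 0.");
    }

    double availableGb = availableBytes / BytesPerGb;
    double usedPercentage = (double)(totalBytes - availableBytes) / totalBytes * 100;

    if (availableGb < 5) return ("Unhealthy", "Disk space critically low");
    if (availableGb < 10) return ("Degraded", "Disk space low");
    if (usedPercentage > 90) return ("Degraded", "Disk usage high");
    return ("Healthy", "Disk space adequate");
}

// Usage with synthetic sizes: 50 GB free on a 1000 GB drive is 95% used.
long oneGb = 1024L * 1024L * 1024L;
Console.WriteLine(ClassifyDisk(50 * oneGb, 1000 * oneGb).Description);
```

Inside the health check, the returned status would be mapped to HealthCheckResult.Healthy, Degraded, or Unhealthy.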
5. Real-Time Monitoring Dashboards
Custom Monitoring Dashboard
// Controllers/MonitoringController.cs
[ApiController]
[Route("api/[controller]")]
public class MonitoringController : ControllerBase
{
private readonly ITelemetryService _telemetry;
private readonly IAdvancedLogger _logger;
private readonly ApplicationDbContext _context;
public MonitoringController(ITelemetryService telemetry, IAdvancedLogger logger, ApplicationDbContext context)
{
_telemetry = telemetry;
_logger = logger;
_context = context;
}
[HttpGet("metrics")]
public async Task<ActionResult<ApplicationMetrics>> GetApplicationMetrics()
{
try
{
var metrics = new ApplicationMetrics
{
Timestamp = DateTime.UtcNow,
// System metrics
CpuUsage = await GetCpuUsageAsync(),
MemoryUsage = GetMemoryUsage(),
DiskUsage = GetDiskUsage(),
// Application metrics
ActiveRequests = GetActiveRequestsCount(),
RequestRate = await GetRequestRateAsync(),
ErrorRate = await GetErrorRateAsync(),
// Business metrics
TotalOrders = await GetTotalOrdersAsync(),
RevenueToday = await GetRevenueTodayAsync(),
ActiveUsers = await GetActiveUsersAsync(),
// Database metrics
DatabaseConnections = await GetDatabaseConnectionsAsync(),
SlowQueries = await GetSlowQueriesCountAsync()
};
// Track the metrics collection
_telemetry.TrackMetric("Monitoring_MetricsCollected", 1);
return Ok(metrics);
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to collect application metrics");
return StatusCode(500, new { error = "Metrics collection failed" });
}
}
[HttpGet("performance")]
public async Task<ActionResult<PerformanceMetrics>> GetPerformanceMetrics([FromQuery] TimeRange timeRange = TimeRange.LastHour)
{
var startTime = GetStartTime(timeRange);
var metrics = new PerformanceMetrics
{
TimeRange = timeRange,
AverageResponseTime = await GetAverageResponseTimeAsync(startTime),
P95ResponseTime = await GetPercentileResponseTimeAsync(startTime, 95),
P99ResponseTime = await GetPercentileResponseTimeAsync(startTime, 99),
Throughput = await GetThroughputAsync(startTime),
ErrorPercentage = await GetErrorPercentageAsync(startTime),
SlowEndpoints = await GetSlowEndpointsAsync(startTime)
};
return Ok(metrics);
}
[HttpGet("alerts")]
public async Task<ActionResult<List<Alert>>> GetActiveAlerts()
{
var alerts = new List<Alert>
{
await CheckErrorRateAlertAsync(),
await CheckResponseTimeAlertAsync(),
await CheckDatabasePerformanceAlertAsync(),
await CheckDiskSpaceAlertAsync(),
await CheckExternalServiceAlertAsync()
};
return Ok(alerts.Where(a => a != null));
}
private async Task<Alert> CheckErrorRateAlertAsync()
{
var errorRate = await GetErrorRateAsync();
if (errorRate > 0.05) // 5% error rate threshold
{
return new Alert
{
Type = AlertType.Error,
Severity = AlertSeverity.High,
Title = "High Error Rate Detected",
Message = $"Current error rate is {errorRate:P2} which exceeds the 5% threshold",
Timestamp = DateTime.UtcNow,
MetricName = "ErrorRate",
MetricValue = errorRate
};
}
return null;
}
// Helper methods for metric collection
private async Task<double> GetCpuUsageAsync()
{
// Implementation for getting CPU usage
return await Task.FromResult(0.15); // Example value
}
private double GetMemoryUsage()
{
// Process holds an OS handle, so dispose it once the value has been read
using var process = Process.GetCurrentProcess();
return (double)process.WorkingSet64 / 1024 / 1024; // MB
}
private async Task<int> GetTotalOrdersAsync()
{
return await _context.Orders.CountAsync();
}
private async Task<decimal> GetRevenueTodayAsync()
{
// Assumes CreatedAt is stored in UTC, like the other timestamps in this article
var today = DateTime.UtcNow.Date;
return await _context.Orders
.Where(o => o.CreatedAt >= today)
.SumAsync(o => o.TotalAmount);
}
}
// Models for monitoring
public class ApplicationMetrics
{
public DateTime Timestamp { get; set; }
// System metrics
public double CpuUsage { get; set; }
public double MemoryUsage { get; set; }
public double DiskUsage { get; set; }
// Application metrics
public int ActiveRequests { get; set; }
public double RequestRate { get; set; }
public double ErrorRate { get; set; }
// Business metrics
public int TotalOrders { get; set; }
public decimal RevenueToday { get; set; }
public int ActiveUsers { get; set; }
// Database metrics
public int DatabaseConnections { get; set; }
public int SlowQueries { get; set; }
}
public class PerformanceMetrics
{
public TimeRange TimeRange { get; set; }
public double AverageResponseTime { get; set; }
public double P95ResponseTime { get; set; }
public double P99ResponseTime { get; set; }
public double Throughput { get; set; }
public double ErrorPercentage { get; set; }
public List<SlowEndpoint> SlowEndpoints { get; set; }
}
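public class Alert
{
public Guid Id { get; set; }
public Guid? RuleId { get; set; } // Set when the alert was raised by an AlertRule
public AlertType Type { get; set; }
public AlertSeverity Severity { get; set; }
public string Title { get; set; }
public string Message { get; set; }
public DateTime Timestamp { get; set; }
public string MetricName { get; set; }
public double MetricValue { get; set; }
public bool IsAcknowledged { get; set; }
}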
public enum AlertType
{
Performance,
Error,
Security,
Business,
Infrastructure
}
public enum AlertSeverity
{
Low,
Medium,
High,
Critical
}
public enum TimeRange
{
Last5Minutes,
LastHour,
Last24Hours,
Last7Days
}
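The performance endpoint reports P95 and P99 response times, but the `GetPercentileResponseTimeAsync` helper is not shown. As a rough illustration of what such a helper could compute, here is a minimal sketch of the nearest-rank percentile over in-memory samples. `LatencyPercentiles` and `NearestRank` are hypothetical names introduced for this example; a real system would more likely query its telemetry store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the controller above.
public static class LatencyPercentiles
{
    // Nearest-rank percentile: sort ascending, then take the value at
    // 1-based rank ceil(percentile / 100 * n).
    public static double NearestRank(IEnumerable<double> samplesMs, double percentile)
    {
        if (percentile <= 0 || percentile > 100)
            throw new ArgumentOutOfRangeException(nameof(percentile), "Expected a value in (0, 100].");

        var sorted = samplesMs.OrderBy(v => v).ToArray();
        if (sorted.Length == 0)
            return 0; // No samples in the window; report zero rather than throwing.

        var rank = (int)Math.Ceiling(percentile * sorted.Length / 100.0);
        return sorted[Math.Max(rank, 1) - 1];
    }
}
```

With five samples of 10, 20, 30, 40 and 50 ms, `NearestRank(samples, 50)` picks the third value (30 ms). Averages hide tail latency, which is why the P95 and P99 figures are worth reporting next to `AverageResponseTime`.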
Real-Time Dashboard with SignalR
// Hubs/MonitoringHub.cs
using Microsoft.AspNetCore.SignalR;
public class MonitoringHub : Hub
{
private readonly IMonitoringService _monitoringService;
private readonly ILogger<MonitoringHub> _logger;
public MonitoringHub(IMonitoringService monitoringService, ILogger<MonitoringHub> logger)
{
_monitoringService = monitoringService;
_logger = logger;
}
public override async Task OnConnectedAsync()
{
_logger.LogInformation("Monitoring client connected: {ConnectionId}", Context.ConnectionId);
await Groups.AddToGroupAsync(Context.ConnectionId, "MonitoringClients");
await base.OnConnectedAsync();
}
public override async Task OnDisconnectedAsync(Exception exception)
{
_logger.LogInformation("Monitoring client disconnected: {ConnectionId}", Context.ConnectionId);
await Groups.RemoveFromGroupAsync(Context.ConnectionId, "MonitoringClients");
await base.OnDisconnectedAsync(exception);
}
public async Task SubscribeToMetrics(string metricType)
{
await Groups.AddToGroupAsync(Context.ConnectionId, $"Metrics-{metricType}");
_logger.LogInformation("Client {ConnectionId} subscribed to {MetricType}", Context.ConnectionId, metricType);
}
public async Task UnsubscribeFromMetrics(string metricType)
{
await Groups.RemoveFromGroupAsync(Context.ConnectionId, $"Metrics-{metricType}");
_logger.LogInformation("Client {ConnectionId} unsubscribed from {MetricType}", Context.ConnectionId, metricType);
}
}
// Services/RealTimeMonitoringService.cs
public interface IRealTimeMonitoringService
{
Task SendMetricsUpdateAsync();
Task SendAlertAsync(Alert alert);
Task SendPerformanceUpdateAsync(PerformanceMetrics metrics);
}
public class RealTimeMonitoringService : IRealTimeMonitoringService
{
private readonly IHubContext<MonitoringHub> _hubContext;
private readonly IMonitoringService _monitoringService;
private readonly ILogger<RealTimeMonitoringService> _logger;
private Timer _metricsTimer;
public RealTimeMonitoringService(IHubContext<MonitoringHub> hubContext, IMonitoringService monitoringService, ILogger<RealTimeMonitoringService> logger)
{
_hubContext = hubContext;
_monitoringService = monitoringService;
_logger = logger;
}
public void StartRealTimeUpdates(TimeSpan interval)
{
_metricsTimer = new Timer(async _ => await SendMetricsUpdateAsync(), null, TimeSpan.Zero, interval);
_logger.LogInformation("Started real-time metrics updates every {Interval}", interval);
}
public void StopRealTimeUpdates()
{
_metricsTimer?.Dispose();
_logger.LogInformation("Stopped real-time metrics updates");
}
public async Task SendMetricsUpdateAsync()
{
try
{
var metrics = await _monitoringService.GetCurrentMetricsAsync();
await _hubContext.Clients.Group("MonitoringClients").SendAsync("MetricsUpdate", metrics);
_logger.LogDebug("Sent real-time metrics update to {ClientCount} clients",
await GetConnectedClientCountAsync());
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to send real-time metrics update");
}
}
public async Task SendAlertAsync(Alert alert)
{
try
{
await _hubContext.Clients.Group("MonitoringClients").SendAsync("Alert", alert);
_logger.LogInformation("Sent alert: {AlertTitle} to monitoring clients", alert.Title);
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to send alert");
}
}
public async Task SendPerformanceUpdateAsync(PerformanceMetrics metrics)
{
try
{
await _hubContext.Clients.Group("Metrics-Performance").SendAsync("PerformanceUpdate", metrics);
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to send performance update");
}
}
private Task<int> GetConnectedClientCountAsync()
{
// This would typically query your hub context for connected clients
return Task.FromResult(1); // Simplified for example
}
}
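`GetConnectedClientCountAsync` above returns a fixed value. One way to get a real number is to count connections yourself from the hub's connect and disconnect callbacks. The sketch below assumes a hypothetical `ConnectionCounter` class registered as a singleton and injected into both `MonitoringHub` and `RealTimeMonitoringService`; none of these names come from SignalR itself.

```csharp
using System.Threading;

// Hypothetical helper; assumed to be registered as a singleton.
public sealed class ConnectionCounter
{
    private int _count;

    // Current number of tracked connections on this server instance.
    public int Current => Volatile.Read(ref _count);

    // Intended to be called from MonitoringHub.OnConnectedAsync.
    public int OnConnected() => Interlocked.Increment(ref _count);

    // Intended to be called from MonitoringHub.OnDisconnectedAsync.
    public int OnDisconnected() => Interlocked.Decrement(ref _count);
}
```

The count only covers the current process. If the application is scaled out across several servers, each instance would see its own connections, so a shared store would be needed for a global figure.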
6. Distributed Tracing & Correlation
Implementing Distributed Tracing
// Middleware/CorrelationIdMiddleware.cs
public class CorrelationIdMiddleware
{
private readonly RequestDelegate _next;
private readonly ILogger<CorrelationIdMiddleware> _logger;
public CorrelationIdMiddleware(RequestDelegate next, ILogger<CorrelationIdMiddleware> logger)
{
_next = next;
_logger = logger;
}
public async Task InvokeAsync(HttpContext context)
{
var correlationId = GetOrCreateCorrelationId(context);
// Add to response headers
context.Response.OnStarting(() =>
{
context.Response.Headers["X-Correlation-ID"] = correlationId;
return Task.CompletedTask;
});
// Add to logger scope
using (_logger.BeginScope(new Dictionary<string, object>
{
["CorrelationId"] = correlationId,
["RequestPath"] = context.Request.Path,
["UserAgent"] = context.Request.Headers["User-Agent"].ToString()
}))
{
var stopwatch = Stopwatch.StartNew();
try
{
_logger.LogInformation("Starting request {Method} {Path}",
context.Request.Method, context.Request.Path);
await _next(context);
stopwatch.Stop();
_logger.LogInformation("Completed request {Method} {Path} with status {StatusCode} in {ElapsedMs}ms",
context.Request.Method, context.Request.Path, context.Response.StatusCode, stopwatch.ElapsedMilliseconds);
}
catch (Exception ex)
{
stopwatch.Stop();
_logger.LogError(ex, "Request {Method} {Path} failed after {ElapsedMs}ms",
context.Request.Method, context.Request.Path, stopwatch.ElapsedMilliseconds);
throw;
}
}
}
private string GetOrCreateCorrelationId(HttpContext context)
{
if (context.Request.Headers.TryGetValue("X-Correlation-ID", out var correlationId) &&
!string.IsNullOrEmpty(correlationId))
{
return correlationId.ToString();
}
return Guid.NewGuid().ToString();
}
}
// Middleware/DistributedTracingMiddleware.cs
public class DistributedTracingMiddleware
{
private readonly RequestDelegate _next;
private readonly ILogger<DistributedTracingMiddleware> _logger;
public DistributedTracingMiddleware(RequestDelegate next, ILogger<DistributedTracingMiddleware> logger)
{
_next = next;
_logger = logger;
}
public async Task InvokeAsync(HttpContext context, ITelemetryService telemetryService)
{
var operationId = context.TraceIdentifier;
var parentId = context.Request.Headers["Request-Id"].FirstOrDefault();
using var activity = Activity.Current?.Source.StartActivity(
$"{context.Request.Method} {context.Request.Path}",
ActivityKind.Server,
parentId: parentId);
if (activity != null)
{
activity.SetTag("http.method", context.Request.Method);
activity.SetTag("http.url", context.Request.Path);
activity.SetTag("http.user_agent", context.Request.Headers["User-Agent"].ToString());
activity.SetTag("correlation.id", context.Request.Headers["X-Correlation-ID"].FirstOrDefault());
}
var stopwatch = Stopwatch.StartNew();
try
{
await _next(context);
stopwatch.Stop();
if (activity != null)
{
activity.SetTag("http.status_code", context.Response.StatusCode);
activity.SetTag("duration.ms", stopwatch.ElapsedMilliseconds);
}
telemetryService.TrackRequest(
$"{context.Request.Method} {context.Request.Path}",
DateTime.UtcNow - stopwatch.Elapsed,
stopwatch.Elapsed,
context.Response.StatusCode.ToString(),
context.Response.StatusCode < 400); // 4xx/5xx responses complete without throwing but are not successes
}
catch (Exception ex)
{
stopwatch.Stop();
if (activity != null)
{
activity.SetTag("error", true);
activity.SetTag("error.message", ex.Message);
}
telemetryService.TrackRequest(
$"{context.Request.Method} {context.Request.Path}",
DateTime.UtcNow - stopwatch.Elapsed,
stopwatch.Elapsed,
"500",
false);
telemetryService.TrackException(ex, new Dictionary<string, string>
{
["OperationId"] = operationId,
["Path"] = context.Request.Path,
["Method"] = context.Request.Method
});
throw;
}
}
}
// HttpClient with distributed tracing
public class TracingHttpClientHandler : DelegatingHandler
{
private readonly ILogger<TracingHttpClientHandler> _logger;
public TracingHttpClientHandler(ILogger<TracingHttpClientHandler> logger)
{
_logger = logger;
}
protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
var activity = Activity.Current;
if (activity != null)
{
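// Add tracing headers for the current activity, skipping any header an earlier handler already set
if (!request.Headers.Contains("Request-Id"))
{
request.Headers.Add("Request-Id", activity.Id);
}
// traceparent is only meaningful when the activity uses the W3C ID format. The parent's ID
// is not sent: a second Request-Id value would point downstream services at the wrong span.
if (activity.IdFormat == ActivityIdFormat.W3C && !request.Headers.Contains("traceparent"))
{
request.Headers.Add("traceparent", activity.Id);
}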
}
// Add correlation ID
var correlationId = Activity.Current?.GetBaggageItem("correlation.id") ?? Guid.NewGuid().ToString();
request.Headers.Add("X-Correlation-ID", correlationId);
var stopwatch = Stopwatch.StartNew();
_logger.LogInformation("Starting HTTP request {Method} {Url}",
request.Method, request.RequestUri);
try
{
var response = await base.SendAsync(request, cancellationToken);
stopwatch.Stop();
_logger.LogInformation("Completed HTTP request {Method} {Url} with status {StatusCode} in {ElapsedMs}ms",
request.Method, request.RequestUri, (int)response.StatusCode, stopwatch.ElapsedMilliseconds);
return response;
}
catch (Exception ex)
{
stopwatch.Stop();
_logger.LogError(ex, "HTTP request {Method} {Url} failed after {ElapsedMs}ms",
request.Method, request.RequestUri, stopwatch.ElapsedMilliseconds);
throw;
}
}
}
// Registration in Program.cs
var builder = WebApplication.CreateBuilder(args);
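// Handlers passed to AddHttpMessageHandler<T>() are resolved from DI, so register the handler as transient
builder.Services.AddTransient<TracingHttpClientHandler>();
// Add HTTP client with tracing
builder.Services.AddHttpClient("TracingClient")
.AddHttpMessageHandler<TracingHttpClientHandler>();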
// Add middleware
var app = builder.Build();
app.UseMiddleware<CorrelationIdMiddleware>();
app.UseMiddleware<DistributedTracingMiddleware>();
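`TracingHttpClientHandler` reads `correlation.id` from the activity's baggage, but as far as the code shown goes, neither middleware writes it there (the tracing middleware sets a tag, and tags are not baggage). Outgoing calls would therefore fall back to a fresh GUID. A minimal sketch of closing that gap follows, using a hypothetical `CorrelationBaggage` helper; it relies on `Activity.AddBaggage` and `Activity.GetBaggageItem`, and on child activities inheriting baggage from their parent.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration; the class and method names are assumptions.
public static class CorrelationBaggage
{
    public const string Key = "correlation.id";

    // Attach the correlation ID to the current activity so that child
    // activities (for example, outgoing HTTP calls) can read it.
    public static void Attach(string correlationId)
    {
        Activity.Current?.AddBaggage(Key, correlationId);
    }

    // Read the correlation ID back; fall back to a new ID when there is
    // no current activity or no baggage entry.
    public static string GetOrCreate()
    {
        return Activity.Current?.GetBaggageItem(Key) ?? Guid.NewGuid().ToString();
    }
}
```

In this sketch, `CorrelationIdMiddleware.InvokeAsync` would call `CorrelationBaggage.Attach(correlationId)` right after resolving the ID, and the handler would call `CorrelationBaggage.GetOrCreate()` when building the outgoing request.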
7. Performance Counters & Metrics
Custom Performance Counters
// Services/PerformanceCounterService.cs
public interface IPerformanceCounterService
{
void IncrementRequestCounter(string endpoint);
void IncrementErrorCounter(string endpoint, string errorType);
void RecordResponseTime(string endpoint, TimeSpan duration);
void RecordBusinessMetric(string metricName, double value);
PerformanceCounters GetCurrentCounters();
void ResetCounters();
}
public class PerformanceCounterService : IPerformanceCounterService
{
private readonly ConcurrentDictionary<string, AtomicLong> _requestCounters;
private readonly ConcurrentDictionary<string, AtomicLong> _errorCounters;
private readonly ConcurrentDictionary<string, AtomicDouble> _responseTimeSums;
private readonly ConcurrentDictionary<string, AtomicLong> _responseTimeCounts;
private readonly ConcurrentDictionary<string, AtomicDouble> _businessMetrics;
private readonly ILogger<PerformanceCounterService> _logger;
public PerformanceCounterService(ILogger<PerformanceCounterService> logger)
{
_requestCounters = new ConcurrentDictionary<string, AtomicLong>();
_errorCounters = new ConcurrentDictionary<string, AtomicLong>();
_responseTimeSums = new ConcurrentDictionary<string, AtomicDouble>();
_responseTimeCounts = new ConcurrentDictionary<string, AtomicLong>();
_businessMetrics = new ConcurrentDictionary<string, AtomicDouble>();
_logger = logger;
}
public void IncrementRequestCounter(string endpoint)
{
var counter = _requestCounters.GetOrAdd(endpoint, _ => new AtomicLong());
counter.Increment();
_logger.LogDebug("Incremented request counter for {Endpoint}: {Count}", endpoint, counter.Value);
}
public void IncrementErrorCounter(string endpoint, string errorType)
{
var key = $"{endpoint}:{errorType}";
var counter = _errorCounters.GetOrAdd(key, _ => new AtomicLong());
counter.Increment();
_logger.LogWarning("Incremented error counter for {Endpoint}/{ErrorType}: {Count}",
endpoint, errorType, counter.Value);
}
public void RecordResponseTime(string endpoint, TimeSpan duration)
{
var sumCounter = _responseTimeSums.GetOrAdd(endpoint, _ => new AtomicDouble());
var countCounter = _responseTimeCounts.GetOrAdd(endpoint, _ => new AtomicLong());
sumCounter.Add(duration.TotalMilliseconds);
countCounter.Increment();
_logger.LogDebug("Recorded response time for {Endpoint}: {DurationMs}ms", endpoint, duration.TotalMilliseconds);
}
public void RecordBusinessMetric(string metricName, double value)
{
var metric = _businessMetrics.GetOrAdd(metricName, _ => new AtomicDouble());
metric.Set(value);
_logger.LogInformation("Recorded business metric {MetricName}: {Value}", metricName, value);
}
public PerformanceCounters GetCurrentCounters()
{
var counters = new PerformanceCounters
{
Timestamp = DateTime.UtcNow,
RequestCounts = _requestCounters.ToDictionary(kvp => kvp.Key, kvp => kvp.Value.Value),
ErrorCounts = _errorCounters.ToDictionary(kvp => kvp.Key, kvp => kvp.Value.Value),
AverageResponseTimes = _responseTimeSums.ToDictionary(
kvp => kvp.Key,
kvp =>
{
var sum = kvp.Value.Value;
var count = _responseTimeCounts.GetValueOrDefault(kvp.Key)?.Value ?? 0;
return count > 0 ? sum / count : 0;
}),
BusinessMetrics = _businessMetrics.ToDictionary(kvp => kvp.Key, kvp => kvp.Value.Value)
};
return counters;
}
public void ResetCounters()
{
_requestCounters.Clear();
_errorCounters.Clear();
_responseTimeSums.Clear();
_responseTimeCounts.Clear();
_businessMetrics.Clear();
_logger.LogInformation("Performance counters reset");
}
}
// Thread-safe atomic counters
public class AtomicLong
{
private long _value;
public long Value => Interlocked.Read(ref _value);
public long Increment()
{
return Interlocked.Increment(ref _value);
}
public long Decrement()
{
return Interlocked.Decrement(ref _value);
}
public void Set(long value)
{
Interlocked.Exchange(ref _value, value);
}
}
public class AtomicDouble
{
private double _value;
public double Value => Interlocked.CompareExchange(ref _value, 0.0, 0.0);
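public double Add(double value)
{
double initial, computed;
do
{
initial = Value;
computed = initial + value;
// Retry until the swap succeeds. Compare bit patterns instead of using an epsilon test,
// so a nearly equal value written by another thread is never mistaken for success.
} while (BitConverter.DoubleToInt64Bits(Interlocked.CompareExchange(ref _value, computed, initial))
!= BitConverter.DoubleToInt64Bits(initial));
return computed;
}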
public void Set(double value)
{
Interlocked.Exchange(ref _value, value);
}
}
// Performance counters model
public class PerformanceCounters
{
public DateTime Timestamp { get; set; }
public Dictionary<string, long> RequestCounts { get; set; } = new();
public Dictionary<string, long> ErrorCounts { get; set; } = new();
public Dictionary<string, double> AverageResponseTimes { get; set; } = new();
public Dictionary<string, double> BusinessMetrics { get; set; } = new();
}
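The counters above are keyed by an endpoint string, so every distinct key adds entries to the dictionaries. If raw request paths are used as keys, `/orders/42` and `/orders/43` become separate counters and the key space grows without bound. The sketch below shows one possible normalization that collapses numeric and GUID segments; `EndpointNormalizer` is a hypothetical helper, and where endpoint routing is available the matched route template is usually a better key.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of PerformanceCounterService.
public static class EndpointNormalizer
{
    // Replaces numeric and GUID path segments with "{id}" so that requests
    // for different resources of the same kind share one counter key.
    public static string Normalize(string method, string path)
    {
        var segments = (path ?? string.Empty)
            .Split('/', StringSplitOptions.RemoveEmptyEntries)
            .Select(s => long.TryParse(s, out _) || Guid.TryParse(s, out _)
                ? "{id}"
                : s.ToLowerInvariant());

        return $"{method.ToUpperInvariant()} /{string.Join("/", segments)}";
    }
}
```

For example, `Normalize("get", "/Orders/42")` yields `GET /orders/{id}`, which keeps the number of counter keys proportional to the number of routes rather than the number of resources.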
Metrics Collection Middleware
// Middleware/MetricsCollectionMiddleware.cs
public class MetricsCollectionMiddleware
{
private readonly RequestDelegate _next;
private readonly IPerformanceCounterService _counters;
private readonly ILogger<MetricsCollectionMiddleware> _logger;
public MetricsCollectionMiddleware(RequestDelegate next, IPerformanceCounterService counters, ILogger<MetricsCollectionMiddleware> logger)
{
_next = next;
_counters = counters;
_logger = logger;
}
public async Task InvokeAsync(HttpContext context)
{
var stopwatch = Stopwatch.StartNew();
var endpoint = $"{context.Request.Method} {context.Request.Path}";
try
{
_counters.IncrementRequestCounter(endpoint);
await _next(context);
stopwatch.Stop();
_counters.RecordResponseTime(endpoint, stopwatch.Elapsed);
if (context.Response.StatusCode >= 400)
{
_counters.IncrementErrorCounter(endpoint, $"HTTP_{context.Response.StatusCode}");
}
}
catch (Exception ex)
{
stopwatch.Stop();
_counters.IncrementErrorCounter(endpoint, ex.GetType().Name);
_counters.RecordResponseTime(endpoint, stopwatch.Elapsed);
_logger.LogError(ex, "Request failed: {Endpoint}", endpoint);
throw;
}
}
}
// Background service for metrics aggregation
public class MetricsAggregationService : BackgroundService
{
private readonly IPerformanceCounterService _counters;
private readonly ITelemetryService _telemetry;
private readonly ILogger<MetricsAggregationService> _logger;
private readonly TimeSpan _aggregationInterval;
public MetricsAggregationService(IPerformanceCounterService counters, ITelemetryService telemetry, ILogger<MetricsAggregationService> logger)
{
_counters = counters;
_telemetry = telemetry;
_logger = logger;
_aggregationInterval = TimeSpan.FromMinutes(1);
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
_logger.LogInformation("Metrics aggregation service started");
while (!stoppingToken.IsCancellationRequested)
{
try
{
await Task.Delay(_aggregationInterval, stoppingToken);
var counters = _counters.GetCurrentCounters();
await PublishMetricsAsync(counters);
_logger.LogDebug("Published {MetricCount} metrics",
counters.RequestCounts.Count + counters.BusinessMetrics.Count);
}
catch (OperationCanceledException)
{
break;
}
catch (Exception ex)
{
_logger.LogError(ex, "Error in metrics aggregation service");
}
}
_logger.LogInformation("Metrics aggregation service stopped");
}
private async Task PublishMetricsAsync(PerformanceCounters counters)
{
// Publish request counts
foreach (var (endpoint, count) in counters.RequestCounts)
{
_telemetry.TrackMetric($"Requests.{endpoint}", count, new Dictionary<string, string>
{
["Endpoint"] = endpoint,
["Aggregation"] = "Total"
});
}
// Publish error counts
foreach (var (errorKey, count) in counters.ErrorCounts)
{
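// Keys have the form "{endpoint}:{errorType}"; split on the last colon because a path may contain one
var separatorIndex = errorKey.LastIndexOf(':');
var endpoint = separatorIndex >= 0 ? errorKey.Substring(0, separatorIndex) : errorKey;
var errorType = separatorIndex >= 0 ? errorKey.Substring(separatorIndex + 1) : "Unknown";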
_telemetry.TrackMetric($"Errors.{endpoint}", count, new Dictionary<string, string>
{
["Endpoint"] = endpoint,
["ErrorType"] = errorType
});
}
// Publish response times
foreach (var (endpoint, avgTime) in counters.AverageResponseTimes)
{
_telemetry.TrackMetric($"ResponseTime.{endpoint}", avgTime, new Dictionary<string, string>
{
["Endpoint"] = endpoint,
["Aggregation"] = "Average"
});
}
// Publish business metrics
foreach (var (metricName, value) in counters.BusinessMetrics)
{
_telemetry.TrackBusinessMetric(metricName, value, "Business");
}
await Task.CompletedTask;
}
}
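The aggregation service publishes cumulative totals on every cycle, because the counters are never reset between runs. If per-interval values are more useful for dashboards, one option is to keep the previous snapshot and publish the difference. The following is a minimal sketch under that assumption; `CounterDeltas` is a hypothetical helper, not something the service above already uses.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration.
public static class CounterDeltas
{
    // Returns the per-key increase between two cumulative snapshots.
    public static Dictionary<string, long> Compute(
        IReadOnlyDictionary<string, long> previous,
        IReadOnlyDictionary<string, long> current)
    {
        var deltas = new Dictionary<string, long>();
        foreach (var (key, value) in current)
        {
            previous.TryGetValue(key, out var before); // 0 when the key is new

            // A smaller current value means the counters were reset in between,
            // so the current value itself is the increase since the reset.
            deltas[key] = value >= before ? value - before : value;
        }
        return deltas;
    }
}
```

In `ExecuteAsync`, this would mean holding on to the last `RequestCounts` dictionary and passing both snapshots to `Compute` before calling `TrackMetric`.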
8. Alerting & Notification Systems
Smart Alerting System
// Services/AlertingService.cs
public interface IAlertingService
{
Task CheckAndSendAlertsAsync();
Task SendImmediateAlertAsync(Alert alert);
Task<List<AlertRule>> GetActiveRulesAsync();
Task<AlertRule> CreateRuleAsync(AlertRule rule);
}
public class AlertingService : IAlertingService
{
private readonly List<AlertRule> _alertRules;
private readonly IMonitoringService _monitoringService;
private readonly INotificationService _notificationService;
private readonly ILogger<AlertingService> _logger;
public AlertingService(IMonitoringService monitoringService, INotificationService notificationService, ILogger<AlertingService> logger)
{
_monitoringService = monitoringService;
_notificationService = notificationService;
_logger = logger;
_alertRules = LoadDefaultRules();
}
public async Task CheckAndSendAlertsAsync()
{
_logger.LogInformation("Starting alert check cycle");
var activeAlerts = new List<Alert>();
foreach (var rule in _alertRules.Where(r => r.IsEnabled))
{
try
{
var alert = await EvaluateRuleAsync(rule);
if (alert != null)
{
activeAlerts.Add(alert);
}
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to evaluate alert rule: {RuleName}", rule.Name);
}
}
foreach (var alert in activeAlerts)
{
await SendAlertAsync(alert);
}
_logger.LogInformation("Alert check cycle completed. Found {AlertCount} alerts", activeAlerts.Count);
}
public async Task SendImmediateAlertAsync(Alert alert)
{
await SendAlertAsync(alert);
_logger.LogInformation("Sent immediate alert: {AlertTitle}", alert.Title);
}
public Task<List<AlertRule>> GetActiveRulesAsync()
{
return Task.FromResult(_alertRules.Where(r => r.IsEnabled).ToList());
}
public Task<AlertRule> CreateRuleAsync(AlertRule rule)
{
rule.Id = Guid.NewGuid();
_alertRules.Add(rule);
_logger.LogInformation("Created new alert rule: {RuleName}", rule.Name);
return Task.FromResult(rule);
}
private async Task<Alert> EvaluateRuleAsync(AlertRule rule)
{
var metricValue = await GetMetricValueAsync(rule.MetricName, rule.TimeWindow);
if (ShouldTriggerAlert(rule, metricValue))
{
return new Alert
{
Id = Guid.NewGuid(),
RuleId = rule.Id,
Type = rule.AlertType,
Severity = rule.Severity,
Title = rule.Name,
Message = FormatAlertMessage(rule, metricValue),
Timestamp = DateTime.UtcNow,
MetricName = rule.MetricName,
MetricValue = metricValue,
IsAcknowledged = false
};
}
return null;
}
private async Task<double> GetMetricValueAsync(string metricName, TimeSpan timeWindow)
{
// This would query your monitoring system for the metric value
// For example, from Application Insights, Prometheus, or custom counters
switch (metricName)
{
case "ErrorRate":
return await _monitoringService.GetErrorRateAsync(timeWindow);
case "ResponseTime":
return await _monitoringService.GetAverageResponseTimeAsync(timeWindow);
case "CpuUsage":
return await _monitoringService.GetCpuUsageAsync();
case "MemoryUsage":
return await _monitoringService.GetMemoryUsageAsync();
default:
return await _monitoringService.GetCustomMetricAsync(metricName, timeWindow);
}
}
private bool ShouldTriggerAlert(AlertRule rule, double metricValue)
{
return rule.Operator switch
{
AlertOperator.GreaterThan => metricValue > rule.Threshold,
AlertOperator.GreaterThanOrEqual => metricValue >= rule.Threshold,
AlertOperator.LessThan => metricValue < rule.Threshold,
AlertOperator.LessThanOrEqual => metricValue <= rule.Threshold,
AlertOperator.Equal => Math.Abs(metricValue - rule.Threshold) < double.Epsilon,
_ => false
};
}
private string FormatAlertMessage(AlertRule rule, double metricValue)
{
return $"{rule.Description}. Current value: {metricValue:F2}, Threshold: {rule.Threshold}";
}
private async Task SendAlertAsync(Alert alert)
{
// Send to multiple notification channels
var tasks = new List<Task>
{
_notificationService.SendEmailAsync(alert),
_notificationService.SendSlackAsync(alert),
_notificationService.SendTeamsAsync(alert),
_notificationService.CreateIncidentAsync(alert)
};
await Task.WhenAll(tasks);
_logger.LogInformation("Alert sent: {AlertTitle} to {ChannelCount} channels", alert.Title, tasks.Count);
}
private List<AlertRule> LoadDefaultRules()
{
return new List<AlertRule>
{
new()
{
Id = Guid.NewGuid(),
Name = "High Error Rate",
Description = "Error rate exceeds 5%",
MetricName = "ErrorRate",
Operator = AlertOperator.GreaterThan,
Threshold = 0.05,
TimeWindow = TimeSpan.FromMinutes(5),
AlertType = AlertType.Error,
Severity = AlertSeverity.High,
IsEnabled = true
},
new()
{
Id = Guid.NewGuid(),
Name = "Slow Response Time",
Description = "Average response time exceeds 2 seconds",
MetricName = "ResponseTime",
Operator = AlertOperator.GreaterThan,
Threshold = 2000, // milliseconds
TimeWindow = TimeSpan.FromMinutes(5),
AlertType = AlertType.Performance,
Severity = AlertSeverity.Medium,
IsEnabled = true
},
new()
{
Id = Guid.NewGuid(),
Name = "High CPU Usage",
Description = "CPU usage exceeds 80%",
MetricName = "CpuUsage",
Operator = AlertOperator.GreaterThan,
Threshold = 0.8,
TimeWindow = TimeSpan.FromMinutes(5),
AlertType = AlertType.Infrastructure,
Severity = AlertSeverity.Medium,
IsEnabled = true
}
};
}
}
// Alert models
public class AlertRule
{
public Guid Id { get; set; }
public string Name { get; set; }
public string Description { get; set; }
public string MetricName { get; set; }
public AlertOperator Operator { get; set; }
public double Threshold { get; set; }
public TimeSpan TimeWindow { get; set; }
public AlertType AlertType { get; set; }
public AlertSeverity Severity { get; set; }
public bool IsEnabled { get; set; }
public TimeSpan? CooldownPeriod { get; set; }
}
public enum AlertOperator
{
GreaterThan,
GreaterThanOrEqual,
LessThan,
LessThanOrEqual,
Equal
}
// Notification service
public interface INotificationService
{
Task SendEmailAsync(Alert alert);
Task SendSlackAsync(Alert alert);
Task SendTeamsAsync(Alert alert);
Task CreateIncidentAsync(Alert alert);
}
public class NotificationService : INotificationService
{
private readonly ILogger<NotificationService> _logger;
private readonly IEmailService _emailService;
private readonly ISlackService _slackService;
private readonly ITeamsService _teamsService;
private readonly IIncidentService _incidentService;
public NotificationService(ILogger<NotificationService> logger, IEmailService emailService,
ISlackService slackService, ITeamsService teamsService, IIncidentService incidentService)
{
_logger = logger;
_emailService = emailService;
_slackService = slackService;
_teamsService = teamsService;
_incidentService = incidentService;
}
public async Task SendEmailAsync(Alert alert)
{
try
{
var subject = $"[{alert.Severity}] {alert.Title}";
var body = $@"
<h2>Alert: {alert.Title}</h2>
<p><strong>Message:</strong> {alert.Message}</p>
<p><strong>Severity:</strong> {alert.Severity}</p>
<p><strong>Timestamp:</strong> {alert.Timestamp:yyyy-MM-dd HH:mm:ss UTC}</p>
<p><strong>Metric:</strong> {alert.MetricName} = {alert.MetricValue:F2}</p>
<p>Please investigate this issue promptly.</p>";
await _emailService.SendAsync("[email protected]", subject, body);
_logger.LogInformation("Alert email sent for: {AlertTitle}", alert.Title);
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to send alert email for: {AlertTitle}", alert.Title);
}
}
public async Task SendSlackAsync(Alert alert)
{
try
{
var message = new SlackMessage
{
Channel = "#alerts",
Text = $":warning: *{alert.Title}*",
Attachments = new[]
{
new SlackAttachment
{
Color = GetSlackColor(alert.Severity),
Fields = new[]
{
new SlackField { Title = "Message", Value = alert.Message },
new SlackField { Title = "Severity", Value = alert.Severity.ToString() },
new SlackField { Title = "Metric", Value = $"{alert.MetricName}: {alert.MetricValue:F2}" },
new SlackField { Title = "Timestamp", Value = alert.Timestamp.ToString("yyyy-MM-dd HH:mm:ss UTC") }
}
}
}
};
await _slackService.SendMessageAsync(message);
_logger.LogInformation("Alert Slack message sent for: {AlertTitle}", alert.Title);
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to send Slack alert for: {AlertTitle}", alert.Title);
}
}
public async Task SendTeamsAsync(Alert alert)
{
// Similar implementation for Microsoft Teams
await Task.CompletedTask;
}
public async Task CreateIncidentAsync(Alert alert)
{
try
{
if (alert.Severity >= AlertSeverity.High)
{
var incident = new Incident
{
Title = alert.Title,
Description = alert.Message,
Severity = alert.Severity,
CreatedAt = DateTime.UtcNow,
Status = IncidentStatus.Open
};
await _incidentService.CreateAsync(incident);
_logger.LogInformation("Incident created for alert: {AlertTitle}", alert.Title);
}
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to create incident for alert: {AlertTitle}", alert.Title);
}
}
private string GetSlackColor(AlertSeverity severity)
{
return severity switch
{
AlertSeverity.Low => "#36a64f", // Green
AlertSeverity.Medium => "#ffcc00", // Yellow
AlertSeverity.High => "#ff9900", // Orange
AlertSeverity.Critical => "#ff0000", // Red
_ => "#757575" // Gray
};
}
}
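`AlertRule` declares a `CooldownPeriod`, but the alerting code shown never consults it, so a rule that stays above its threshold would notify every channel on every check cycle. A minimal sketch of cooldown suppression follows. `AlertCooldownTracker` and `TryAcquire` are hypothetical names, and the current time is passed in explicitly to keep the logic easy to test.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper for illustration; assumed to live as long as the alerting service.
public sealed class AlertCooldownTracker
{
    private readonly ConcurrentDictionary<Guid, DateTime> _lastSent = new();

    // Returns true when the rule may notify now and records the send time.
    // Returns false while the rule is still inside its cooldown window.
    public bool TryAcquire(Guid ruleId, TimeSpan? cooldown, DateTime utcNow)
    {
        if (cooldown is null || cooldown.Value <= TimeSpan.Zero)
        {
            _lastSent[ruleId] = utcNow;
            return true; // No cooldown configured: always allow.
        }

        while (true)
        {
            if (!_lastSent.TryGetValue(ruleId, out var last))
            {
                if (_lastSent.TryAdd(ruleId, utcNow)) return true;
                continue; // Another thread recorded a send first; re-check.
            }

            if (utcNow - last < cooldown.Value) return false;
            if (_lastSent.TryUpdate(ruleId, utcNow, last)) return true;
        }
    }
}
```

`CheckAndSendAlertsAsync` could then call `TryAcquire(rule.Id, rule.CooldownPeriod, DateTime.UtcNow)` before sending and skip the alert when it returns false.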
9. Security Monitoring & Compliance
Security Event Monitoring
// Services/SecurityMonitoringService.cs
public interface ISecurityMonitoringService
{
Task LogSecurityEventAsync(SecurityEvent securityEvent);
Task<List<SecurityEvent>> GetRecentSecurityEventsAsync(TimeSpan timeWindow);
Task<bool> CheckForSuspiciousActivityAsync(string userId);
Task GenerateSecurityReportAsync(DateTime startDate, DateTime endDate);
}
public class SecurityMonitoringService : ISecurityMonitoringService
{
private readonly ILogger<SecurityMonitoringService> _logger;
private readonly ITelemetryService _telemetry;
private readonly ApplicationDbContext _context;
private readonly List<SecurityEvent> _securityEvents;
public SecurityMonitoringService(ILogger<SecurityMonitoringService> logger, ITelemetryService telemetry, ApplicationDbContext context)
{
_logger = logger;
_telemetry = telemetry;
_context = context;
_securityEvents = new List<SecurityEvent>();
}
public async Task LogSecurityEventAsync(SecurityEvent securityEvent)
{
try
{
securityEvent.Id = Guid.NewGuid();
securityEvent.Timestamp = DateTime.UtcNow;
securityEvent.IpAddress = GetClientIpAddress();
// Add to in-memory collection for real-time monitoring
_securityEvents.Add(securityEvent);
// Track in telemetry
_telemetry.TrackEvent("SecurityEvent", new Dictionary<string, string>
{
["EventType"] = securityEvent.EventType.ToString(),
["Severity"] = securityEvent.Severity.ToString(),
["UserId"] = securityEvent.UserId,
["Resource"] = securityEvent.Resource,
["Description"] = securityEvent.Description
});
// Log to security log
_logger.LogWarning("Security event: {EventType} - {Description} - User: {UserId} - IP: {IpAddress}",
securityEvent.EventType, securityEvent.Description, securityEvent.UserId, securityEvent.IpAddress);
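// Check for suspicious activity. Skip events raised by the check itself: otherwise each
// detection logs another event, which triggers the check again and recurses without end.
if (securityEvent.Severity >= SecurityEventSeverity.Medium &&
securityEvent.EventType != SecurityEventType.BruteForceAttempt &&
securityEvent.EventType != SecurityEventType.SuspiciousActivity)
{
await CheckForSuspiciousActivityAsync(securityEvent.UserId);
}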
// Persist to database for long-term storage
await _context.SecurityEvents.AddAsync(securityEvent);
await _context.SaveChangesAsync();
_logger.LogInformation("Security event logged: {EventId}", securityEvent.Id);
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to log security event");
}
}
public Task<List<SecurityEvent>> GetRecentSecurityEventsAsync(TimeSpan timeWindow)
{
var cutoff = DateTime.UtcNow - timeWindow;
var events = _securityEvents
.Where(e => e.Timestamp >= cutoff)
.OrderByDescending(e => e.Timestamp)
.ToList();
return Task.FromResult(events);
}
public async Task<bool> CheckForSuspiciousActivityAsync(string userId)
{
if (string.IsNullOrEmpty(userId)) return false;
var timeWindow = TimeSpan.FromMinutes(30);
var recentEvents = await GetRecentSecurityEventsAsync(timeWindow);
var userEvents = recentEvents
.Where(e => e.UserId == userId && e.Severity >= SecurityEventSeverity.Medium)
.ToList();
var suspiciousEventCount = userEvents.Count;
var failedLoginCount = userEvents.Count(e => e.EventType == SecurityEventType.FailedLogin);
// Check for brute force attacks
if (failedLoginCount > 5)
{
await LogSecurityEventAsync(new SecurityEvent
{
EventType = SecurityEventType.BruteForceAttempt,
Severity = SecurityEventSeverity.High,
UserId = userId,
Resource = "Authentication",
Description = $"Possible brute force attack detected: {failedLoginCount} failed login attempts in {timeWindow.TotalMinutes} minutes"
});
return true;
}
// Check for multiple security events in short time
if (suspiciousEventCount > 10)
{
await LogSecurityEventAsync(new SecurityEvent
{
EventType = SecurityEventType.SuspiciousActivity,
Severity = SecurityEventSeverity.Medium,
UserId = userId,
Resource = "Account",
Description = $"Suspicious activity detected: {suspiciousEventCount} security events in {timeWindow.TotalMinutes} minutes"
});
return true;
}
return false;
}
public async Task GenerateSecurityReportAsync(DateTime startDate, DateTime endDate)
{
var events = await _context.SecurityEvents
.Where(e => e.Timestamp >= startDate && e.Timestamp <= endDate)
.ToListAsync();
var report = new SecurityReport
{
Period = $"{startDate:yyyy-MM-dd} to {endDate:yyyy-MM-dd}",
TotalEvents = events.Count,
EventsByType = events.GroupBy(e => e.EventType)
.ToDictionary(g => g.Key.ToString(), g => g.Count()),
EventsBySeverity = events.GroupBy(e => e.Severity)
.ToDictionary(g => g.Key.ToString(), g => g.Count()),
// ToDictionary throws on a null key, so events without a user or resource are grouped under "unknown"
TopUsers = events.GroupBy(e => e.UserId ?? "unknown")
.OrderByDescending(g => g.Count())
.Take(10)
.ToDictionary(g => g.Key, g => g.Count()),
CommonResources = events.GroupBy(e => e.Resource ?? "unknown")
.OrderByDescending(g => g.Count())
.Take(10)
.ToDictionary(g => g.Key, g => g.Count())
};
// Log the report
_logger.LogInformation("Security report generated: {EventCount} events, {HighSeverityCount} high severity events",
report.TotalEvents, report.EventsBySeverity.GetValueOrDefault(SecurityEventSeverity.High.ToString(), 0));
// Track report metrics
_telemetry.TrackBusinessMetric("SecurityEventsTotal", report.TotalEvents, "Security");
_telemetry.TrackBusinessMetric("SecurityEventsHighSeverity",
report.EventsBySeverity.GetValueOrDefault(SecurityEventSeverity.High.ToString(), 0), "Security");
}
private string GetClientIpAddress()
{
// Implementation to get client IP address from HTTP context
return "127.0.0.1"; // Simplified for example
}
}
// Security event models
public class SecurityEvent
{
public Guid Id { get; set; }
public SecurityEventType EventType { get; set; }
public SecurityEventSeverity Severity { get; set; }
public string UserId { get; set; }
public string Resource { get; set; }
public string Description { get; set; }
public string IpAddress { get; set; }
public DateTime Timestamp { get; set; }
public Dictionary<string, string> AdditionalData { get; set; } = new();
}
public enum SecurityEventType
{
LoginSuccess,
FailedLogin,
Logout,
PasswordChange,
PermissionChange,
DataAccess,
BruteForceAttempt,
SuspiciousActivity,
DataBreachAttempt,
SystemCompromise
}
public enum SecurityEventSeverity
{
Low,
Medium,
High,
Critical
}
public class SecurityReport
{
public string Period { get; set; }
public int TotalEvents { get; set; }
public Dictionary<string, int> EventsByType { get; set; } = new();
public Dictionary<string, int> EventsBySeverity { get; set; } = new();
public Dictionary<string, int> TopUsers { get; set; } = new();
public Dictionary<string, int> CommonResources { get; set; } = new();
public List<SecurityRecommendation> Recommendations { get; set; } = new();
}
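The brute-force check above counts a user's failed logins inside a 30-minute window. The same idea can be sketched as a small in-memory sliding window. `FailedLoginWindow` and its members are hypothetical names used only for illustration and are not part of the service above. The sketch assumes timestamps arrive in non-decreasing order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: tracks failed logins per user in a sliding time window.
public sealed class FailedLoginWindow
{
    private readonly TimeSpan _window;
    private readonly int _threshold;
    private readonly Dictionary<string, Queue<DateTime>> _failures = new();
    private readonly object _gate = new();

    public FailedLoginWindow(TimeSpan window, int threshold)
    {
        _window = window;
        _threshold = threshold;
    }

    // Records one failed login and returns true when the user has more than
    // `threshold` failures inside the window ending at `nowUtc`.
    // Assumes calls for a given user arrive with non-decreasing timestamps.
    public bool RecordFailure(string userId, DateTime nowUtc)
    {
        lock (_gate)
        {
            if (!_failures.TryGetValue(userId, out var timestamps))
            {
                timestamps = new Queue<DateTime>();
                _failures[userId] = timestamps;
            }

            timestamps.Enqueue(nowUtc);

            // Drop entries that have fallen out of the window.
            var cutoff = nowUtc - _window;
            while (timestamps.Count > 0 && timestamps.Peek() < cutoff)
            {
                timestamps.Dequeue();
            }

            return timestamps.Count > _threshold;
        }
    }
}
```

A caller would create one instance, for example `new FailedLoginWindow(TimeSpan.FromMinutes(30), threshold: 5)`, and call `RecordFailure` on every failed login. Keeping only the timestamps inside the window bounds memory per user, at the cost of losing history across restarts, which the database-backed events above retain.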
Security Monitoring Middleware
// Middleware/SecurityMonitoringMiddleware.cs
public class SecurityMonitoringMiddleware
{
private readonly RequestDelegate _next;
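// Note: conventional middleware is constructed once per application. If ISecurityMonitoringService
// is registered as scoped (it depends on a DbContext), accept it as an InvokeAsync parameter instead.
private readonly ISecurityMonitoringService _securityService;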
private readonly ILogger<SecurityMonitoringMiddleware> _logger;
public SecurityMonitoringMiddleware(RequestDelegate next, ISecurityMonitoringService securityService, ILogger<SecurityMonitoringMiddleware> logger)
{
_next = next;
_securityService = securityService;
_logger = logger;
}
public async Task InvokeAsync(HttpContext context)
{
// Skip security monitoring for health checks and monitoring endpoints
if (context.Request.Path.StartsWithSegments("/health") ||
context.Request.Path.StartsWithSegments("/monitoring"))
{
await _next(context);
return;
}
var stopwatch = Stopwatch.StartNew();
try
{
await _next(context);
stopwatch.Stop();
// Log successful security-relevant requests
if (IsSecurityRelevantRequest(context))
{
await LogSecurityEventAsync(context, SecurityEventSeverity.Low, stopwatch.Elapsed);
}
}
catch (UnauthorizedAccessException ex)
{
stopwatch.Stop();
await LogSecurityEventAsync(context, SecurityEventSeverity.Medium, stopwatch.Elapsed,
$"Unauthorized access attempt: {ex.Message}");
throw;
}
catch (Exception ex)
{
stopwatch.Stop();
// Log security-related exceptions
if (IsSecurityException(ex))
{
await LogSecurityEventAsync(context, SecurityEventSeverity.High, stopwatch.Elapsed,
$"Security exception: {ex.Message}");
}
throw;
}
}
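private bool IsSecurityRelevantRequest(HttpContext context)
{
// Compare ordinally and case-insensitively; ToLower() is culture-sensitive and can break the match
var path = context.Request.Path.ToString();
var method = context.Request.Method;
return path.Contains("/api/auth", StringComparison.OrdinalIgnoreCase) ||
path.Contains("/api/users", StringComparison.OrdinalIgnoreCase) ||
path.Contains("/api/admin", StringComparison.OrdinalIgnoreCase) ||
path.Contains("/api/security", StringComparison.OrdinalIgnoreCase) ||
HttpMethods.IsPost(method) ||
HttpMethods.IsPut(method) ||
HttpMethods.IsDelete(method);
}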
private bool IsSecurityException(Exception ex)
{
return ex is UnauthorizedAccessException ||
ex is System.Security.SecurityException ||
ex.Message.Contains("authorization", StringComparison.OrdinalIgnoreCase) ||
ex.Message.Contains("authentication", StringComparison.OrdinalIgnoreCase) ||
ex.Message.Contains("access denied", StringComparison.OrdinalIgnoreCase);
}
private async Task LogSecurityEventAsync(HttpContext context, SecurityEventSeverity severity, TimeSpan duration, string additionalDescription = null)
{
// Identity can be null for unauthenticated requests
var userId = context.User?.Identity?.Name ?? "anonymous";
var description = $"{context.Request.Method} {context.Request.Path} completed in {duration.TotalMilliseconds:F0}ms";
if (!string.IsNullOrEmpty(additionalDescription))
{
description += $". {additionalDescription}";
}
var securityEvent = new SecurityEvent
{
EventType = GetSecurityEventType(context),
Severity = severity,
UserId = userId,
Resource = context.Request.Path,
Description = description,
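AdditionalData =
{
["HttpMethod"] = context.Request.Method,
["StatusCode"] = context.Response.StatusCode.ToString(),
// Invariant culture keeps the decimal separator stable across servers
["DurationMs"] = duration.TotalMilliseconds.ToString(System.Globalization.CultureInfo.InvariantCulture),
["UserAgent"] = context.Request.Headers["User-Agent"].ToString()
}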
};
await _securityService.LogSecurityEventAsync(securityEvent);
}
private SecurityEventType GetSecurityEventType(HttpContext context)
{
if (context.Request.Path.ToString().Contains("/auth/login"))
{
return context.Response.StatusCode == 200 ?
SecurityEventType.LoginSuccess : SecurityEventType.FailedLogin;
}
if (context.Request.Path.ToString().Contains("/auth/logout"))
{
return SecurityEventType.Logout;
}
return SecurityEventType.DataAccess;
}
}
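Conventional middleware such as the class above is constructed once for the lifetime of the application, so a scoped dependency (for example a service that wraps a DbContext) should not be taken in the constructor. The sketch below shows the per-request alternative under that assumption. `IAuditSink`, `ConsoleAuditSink` and `AuditMiddleware` are hypothetical stand-ins, not types from this guide.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Hypothetical scoped service standing in for a DbContext-backed monitoring service.
builder.Services.AddScoped<IAuditSink, ConsoleAuditSink>();

var app = builder.Build();
app.UseMiddleware<AuditMiddleware>();
app.MapGet("/", () => "ok");
app.Run();

public interface IAuditSink
{
    Task WriteAsync(string message);
}

public sealed class ConsoleAuditSink : IAuditSink
{
    public Task WriteAsync(string message)
    {
        Console.WriteLine(message);
        return Task.CompletedTask;
    }
}

public sealed class AuditMiddleware
{
    private readonly RequestDelegate _next;

    // Only singleton-safe dependencies belong in the constructor.
    public AuditMiddleware(RequestDelegate next) => _next = next;

    // Additional InvokeAsync parameters are resolved from the request's
    // service scope, so each request gets its own IAuditSink instance.
    public async Task InvokeAsync(HttpContext context, IAuditSink sink)
    {
        await _next(context);
        await sink.WriteAsync(
            $"{context.Request.Method} {context.Request.Path} -> {context.Response.StatusCode}");
    }
}
```

Applied to the security middleware, the same shape would move `ISecurityMonitoringService` from the constructor to an `InvokeAsync` parameter and pass it down to the private logging helper.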
10. Production Deployment & Best Practices
Production-Ready Configuration
// Program.cs - Production Configuration
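using Azure.Identity;
using HealthChecks.UI.Client; // UIResponseWriter
using Microsoft.AspNetCore.Diagnostics.HealthChecks; // HealthCheckOptions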
var builder = WebApplication.CreateBuilder(args);
// Configuration for production
if (builder.Environment.IsProduction())
{
// Use Azure Key Vault for secrets
builder.Configuration.AddAzureKeyVault(
new Uri($"https://{builder.Configuration["KeyVaultName"]}.vault.azure.net/"),
new DefaultAzureCredential());
// Enhanced logging for production
builder.Logging.AddAzureWebAppDiagnostics();
builder.Logging.AddApplicationInsights();
// Production health checks with longer timeouts
builder.Services.AddHealthChecks()
.AddSqlServer(
builder.Configuration.GetConnectionString("DefaultConnection"),
timeout: TimeSpan.FromSeconds(30))
.AddAzureServiceBusQueue(
builder.Configuration.GetConnectionString("ServiceBus"),
"orders")
.AddAzureBlobStorage(
builder.Configuration.GetConnectionString("AzureStorage"))
.AddApplicationInsightsPublisher();
}
// Production middleware configuration
var app = builder.Build();
if (app.Environment.IsProduction())
{
app.UseExceptionHandler("/Error");
app.UseHsts();
app.UseHttpsRedirection();
// Security headers
app.Use(async (context, next) =>
{
// Use the indexer rather than Add: Add throws if the header has already been set
context.Response.Headers["X-Content-Type-Options"] = "nosniff";
context.Response.Headers["X-Frame-Options"] = "DENY";
context.Response.Headers["X-XSS-Protection"] = "1; mode=block";
context.Response.Headers["Referrer-Policy"] = "strict-origin-when-cross-origin";
context.Response.Headers["Content-Security-Policy"] = "default-src 'self'";
await next();
});
}
app.UseRouting();
app.UseAuthentication();
app.UseAuthorization();
// Custom middleware for production
app.UseMiddleware<CorrelationIdMiddleware>();
app.UseMiddleware<MetricsCollectionMiddleware>();
app.UseMiddleware<SecurityMonitoringMiddleware>();
app.UseMiddleware<DistributedTracingMiddleware>();
app.MapControllers();
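app.MapHealthChecks("/health", new HealthCheckOptions
{
Predicate = _ => true,
ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});
// The Kubernetes manifest later in this section also probes these two paths.
// Readiness runs every registered check; startup runs none and only confirms the app is serving requests.
app.MapHealthChecks("/health/ready", new HealthCheckOptions { Predicate = _ => true });
app.MapHealthChecks("/health/startup", new HealthCheckOptions { Predicate = _ => false });
app.MapHealthChecksUI();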
app.Run();
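// Production monitoring startup. Nothing above invokes this class:
// call ApplicationMonitoring.Start(app.Services) after builder.Build() and before app.Run().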
public static class ApplicationMonitoring
{
public static void Start(IServiceProvider serviceProvider)
{
var logger = serviceProvider.GetRequiredService<ILogger<Program>>();
var monitoringService = serviceProvider.GetRequiredService<IRealTimeMonitoringService>();
var alertingService = serviceProvider.GetRequiredService<IAlertingService>();
// Start real-time monitoring
monitoringService.StartRealTimeUpdates(TimeSpan.FromSeconds(30));
// Start alerting service
_ = Task.Run(async () =>
{
while (true)
{
try
{
await alertingService.CheckAndSendAlertsAsync();
await Task.Delay(TimeSpan.FromMinutes(1));
}
catch (Exception ex)
{
logger.LogError(ex, "Alerting service encountered an error");
await Task.Delay(TimeSpan.FromSeconds(30));
}
}
});
logger.LogInformation("Application monitoring services started");
}
}
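The fire-and-forget `Task.Run` loop above has no way to stop when the host shuts down. One alternative, sketched here under assumptions, is to host the same loop in a `BackgroundService` driven by a `PeriodicTimer`. `AlertingWorker` is a hypothetical name, and `IAlertingService` is redeclared with only the one method the loop needs so that the sketch compiles on its own.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

// Assumed shape of the alerting abstraction used in this guide.
public interface IAlertingService
{
    Task CheckAndSendAlertsAsync();
}

// Hypothetical hosted worker that runs the alert check once per minute.
public sealed class AlertingWorker : BackgroundService
{
    private readonly IServiceScopeFactory _scopeFactory;
    private readonly ILogger<AlertingWorker> _logger;

    public AlertingWorker(IServiceScopeFactory scopeFactory, ILogger<AlertingWorker> logger)
    {
        _scopeFactory = scopeFactory;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        using var timer = new PeriodicTimer(TimeSpan.FromMinutes(1));
        try
        {
            // The first check runs after one interval has elapsed.
            while (await timer.WaitForNextTickAsync(stoppingToken))
            {
                try
                {
                    // A fresh scope per tick lets the alerting service use scoped dependencies.
                    using var scope = _scopeFactory.CreateScope();
                    var alerting = scope.ServiceProvider.GetRequiredService<IAlertingService>();
                    await alerting.CheckAndSendAlertsAsync();
                }
                catch (Exception ex)
                {
                    // One failed check should not end the loop.
                    _logger.LogError(ex, "Alerting check failed; retrying on the next tick");
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Expected when the host is shutting down.
        }
    }
}
```

The worker would be registered with `builder.Services.AddHostedService<AlertingWorker>();`. The host then starts it with the application and signals `stoppingToken` on shutdown, which the `while (true)` loop cannot observe.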
Docker Production Configuration
# Dockerfile for Production
FROM mcr.microsoft.com/dotnet/aspnet:7.0 AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443
# Install monitoring tools
RUN apt-get update && \
apt-get install -y --no-install-recommends \
curl \
procps \
&& rm -rf /var/lib/apt/lists/*
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build
WORKDIR /src
COPY ["ECommerceApp/ECommerceApp.csproj", "ECommerceApp/"]
RUN dotnet restore "ECommerceApp/ECommerceApp.csproj"
COPY . .
WORKDIR "/src/ECommerceApp"
RUN dotnet build "ECommerceApp.csproj" -c Release -o /app/build
FROM build AS publish
RUN dotnet publish "ECommerceApp.csproj" -c Release -o /app/publish
FROM base AS final
WORKDIR /app
COPY --from=publish /app/publish .
# Create directories for logs and monitoring
RUN mkdir -p /app/logs /app/monitoring
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
CMD curl -f http://localhost:80/health || exit 1
ENTRYPOINT ["dotnet", "ECommerceApp.dll"]
Kubernetes Deployment with Monitoring
# kubernetes/monitoring-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ecommerce-app
  labels:
    app: ecommerce
    monitoring: "true"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ecommerce
  template:
    metadata:
      labels:
        app: ecommerce
        monitoring: "true"
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: ecommerce-app
          image: myregistry.azurecr.io/ecommerce-app:latest
          ports:
            - containerPort: 80
          env:
            - name: ASPNETCORE_ENVIRONMENT
              value: "Production"
            - name: ApplicationInsights__ConnectionString
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: applicationinsights-connectionstring
            - name: Serilog__WriteTo__2__Args__connectionString
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: applicationinsights-connectionstring
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 80
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /health/startup
              port: 80
            initialDelaySeconds: 10
            periodSeconds: 10
            failureThreshold: 10
          volumeMounts:
            - name: log-volume
              mountPath: /app/logs
            - name: monitoring-volume
              mountPath: /app/monitoring
      volumes:
        - name: log-volume
          persistentVolumeClaim:
            claimName: log-pvc
        - name: monitoring-volume
          emptyDir: {}
---
# Service
apiVersion: v1
kind: Service
metadata:
  name: ecommerce-service
  labels:
    app: ecommerce
    monitoring: "true"
spec:
  selector:
    app: ecommerce
  ports:
    - port: 80
      targetPort: 80
  type: LoadBalancer
---
# Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ecommerce-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ecommerce-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
Best Practices Summary
Monitoring Best Practices
Implement structured logging with correlation IDs
Use health checks for all dependencies
Set up comprehensive alerting with proper thresholds
Monitor both technical and business metrics
Implement distributed tracing for microservices
Performance Considerations
Use async logging to avoid blocking
Implement sampling for high-volume telemetry
Cache frequently accessed monitoring data
Use background services for metric aggregation
Security Considerations
Don't log sensitive information
Implement security event monitoring
Use secure channels for monitoring data
Regularly review and update alerting rules
Operational Excellence
Document all monitoring and alerting procedures
Establish incident response processes
Regularly test monitoring systems
Review and optimize alert thresholds
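As an illustration of the "Don't log sensitive information" item, the sketch below masks known-sensitive keys in a string dictionary (the same shape as `SecurityEvent.AdditionalData`) before it is logged. `LogRedactor` and its key list are hypothetical; a real deny-list would need to reflect the fields your application actually handles.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: masks sensitive values before they reach a log sink.
public static class LogRedactor
{
    // Illustrative deny-list only; keys are matched case-insensitively.
    private static readonly HashSet<string> SensitiveKeys = new(StringComparer.OrdinalIgnoreCase)
    {
        "password", "token", "authorization", "cardNumber"
    };

    // Returns a copy of the data with sensitive values replaced by a fixed mask.
    // The input dictionary is left unchanged.
    public static Dictionary<string, string> Redact(IReadOnlyDictionary<string, string> data)
    {
        var result = new Dictionary<string, string>(data.Count);
        foreach (var (key, value) in data)
        {
            result[key] = SensitiveKeys.Contains(key) ? "***" : value;
        }
        return result;
    }
}
```

Matching on keys is deliberately simple and will miss secrets embedded in free-text values such as descriptions or exception messages, so it complements, rather than replaces, care about what is written to those fields.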
This guide has covered the main building blocks of monitoring and logging in ASP.NET Core applications, from structured logging and health checks to distributed tracing, alerting, and security event monitoring. Adapt the examples to your own thresholds, dependencies, and hosting environment so that your applications stay observable, reliable, and maintainable in production.