Table of Contents
Introduction
Why Observability Isn’t Optional in Serverless
Real-World Scenario: Real-Time Fraud Detection in Digital Banking
Enabling Application Insights for Azure Functions
Logging Custom Metrics from Function Code
End-to-End Monitoring Dashboard Setup
Best Practices for Enterprise Observability
Conclusion
Introduction
In the world of serverless computing, you don’t manage infrastructure—but you absolutely must manage observability. Without visibility into execution duration, failure rates, dependency latency, and custom business metrics, your Azure Functions become black boxes. And in production, black boxes fail silently—until they cost you customers, compliance, or capital.
This article grounds the topic in a real-time fraud detection system in digital banking, demonstrates how to embed Application Insights from day one, and shows how to log custom metrics that align with business KPIs, with code you can adapt for production.
Why Observability Isn’t Optional in Serverless
Azure Functions abstract away servers, but not responsibility. When a payment transaction is processed in 80ms versus 1.2 seconds, or when a fraud alert is missed due to an unlogged exception, the impact is financial and reputational.
Observability in serverless means:
Tracking every invocation’s duration, success, and dependencies
Correlating requests across microservices using operation IDs
Emitting custom metrics (e.g., “high-risk transactions blocked”)
Alerting on anomalies before users notice
Without this, you’re flying blind.
Real-World Scenario: Real-Time Fraud Detection in Digital Banking
A Tier-1 European bank processes 2.4 million digital transactions daily through its mobile app. Each transaction triggers an Azure Function that:
Validates the transaction
Scores fraud risk using a machine learning model
Blocks or allows the transaction in < 150ms
Business Requirement
P95 latency ≤ 120ms (to avoid user abandonment)
Zero undetected high-risk transactions
Real-time dashboard for the fraud ops team showing blocked transactions per minute
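To make the P95 requirement concrete, here is a minimal sketch of a nearest-rank percentile check against the 120ms target. The function names are illustrative, and nearest-rank is only one of several percentile definitions; a monitoring backend may interpolate differently.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile (integer pct, 1-100) using the nearest-rank method."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_latency_slo(latencies_ms, target_ms=120.0):
    """True when the P95 of the given latencies is at or below the target."""
    return percentile_nearest_rank(latencies_ms, 95) <= target_ms
```

With 95 requests at 80ms and 5 at 1,200ms the check passes, while 94 fast and 6 slow requests fail it. The average barely moves between the two cases, which is why the requirement is expressed as a percentile.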
The Crisis
During a holiday sale surge, the team noticed spikes in payment timeouts—but logs showed no errors. Only after enabling deep telemetry did they discover:
Cold starts were adding 900ms during off-peak scaling
ML model loading was taking 300ms on the first invocation
There was no visibility into “transactions blocked due to risk score > 0.95”
The fix wasn’t code—it was observability.
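The first two findings only become visible if the first invocation on an instance is measured separately from warm ones. A minimal sketch of that idea follows; `StubScorer`, its `load_seconds` parameter, and `handle` are hypothetical stand-ins for the real model and handler.

```python
import time

_scorer = None      # cached for the lifetime of the worker process
_cold_start = True  # True until the first invocation on this instance completes


class StubScorer:
    """Hypothetical stand-in for a real ML model with a slow load step."""

    def __init__(self, load_seconds=0.0):
        time.sleep(load_seconds)  # simulates loading model weights

    def predict(self, user_id, amount):
        return 0.99 if amount > 10_000 else 0.10


def handle(user_id, amount, load_seconds=0.0):
    """Score a transaction and report timings that separate cold from warm calls."""
    global _scorer, _cold_start
    timings = {"cold_start": _cold_start, "model_load_ms": 0.0}
    if _scorer is None:
        t0 = time.perf_counter()
        _scorer = StubScorer(load_seconds)
        timings["model_load_ms"] = (time.perf_counter() - t0) * 1000
    t0 = time.perf_counter()
    score = _scorer.predict(user_id, amount)
    timings["inference_ms"] = (time.perf_counter() - t0) * 1000
    _cold_start = False
    return score, timings
```

Emitting `cold_start` and `model_load_ms` alongside the usual latency lets a dashboard split slow requests by cause instead of showing one unexplained spike.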
Enabling Application Insights for Azure Functions
Application Insights is automatically integrated when you create a Function App in the Azure portal—but for IaC-driven enterprises, you must enable it explicitly.
Bicep Deployment with App Insights
// infra.bicep
param location string = 'westeurope'
param appName string = 'fraud-detection-fn'

resource appInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: '${appName}-ai'
  location: location
  kind: 'web'
  properties: {
    Application_Type: 'web'
    Request_Source: 'IbizaWebAppExtensionCreate'
  }
}

resource storage 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: replace('${appName}storage', '-', '')
  location: location
  sku: { name: 'Standard_LRS' }
  kind: 'StorageV2'
}

resource plan 'Microsoft.Web/serverfarms@2022-09-01' = {
  name: '${appName}-asp'
  location: location
  sku: { name: 'EP1' } // Premium plan for low latency
  kind: 'linux'
  properties: { reserved: true } // Linux plan: Python functions run on Linux only
}

resource functionApp 'Microsoft.Web/sites@2022-09-01' = {
  name: appName
  location: location
  kind: 'functionapp,linux'
  properties: {
    serverFarmId: plan.id
    siteConfig: {
      linuxFxVersion: 'Python|3.11' // assumed Python version; match your function code
      appSettings: [
        {
          name: 'APPLICATIONINSIGHTS_CONNECTION_STRING'
          value: appInsights.properties.ConnectionString
        }
        {
          name: 'FUNCTIONS_EXTENSION_VERSION'
          value: '~4'
        }
        {
          name: 'FUNCTIONS_WORKER_RUNTIME'
          value: 'python'
        }
        {
          name: 'AzureWebJobsStorage'
          value: 'DefaultEndpointsProtocol=https;AccountName=${storage.name};AccountKey=${storage.listKeys().keys[0].value}'
        }
      ]
      http20Enabled: true
    }
  }
}
Deploy with:
az deployment group create -g banking-rg --template-file infra.bicep
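Once deployed, the Functions host automatically records each invocation’s duration, success, and exceptions. Tracking of downstream calls (e.g., to Cosmos DB or Key Vault) depends on the language worker, and Python functions may need additional instrumentation for it.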
Logging Custom Metrics from Function Code
Built-in telemetry isn’t enough. You need business-aware metrics.
Here’s a Python function that logs the fraud risk score, the model inference latency, and counts of blocked and allowed transactions:
# __init__.py
import logging
import os
import time

import azure.functions as func
from applicationinsights import TelemetryClient  # legacy Application Insights SDK

from fraud_model import RiskScorer


def _instrumentation_key() -> str:
    """Read the key from the connection string set in infra.bicep, else the legacy setting."""
    conn = os.getenv('APPLICATIONINSIGHTS_CONNECTION_STRING', '')
    parts = dict(p.split('=', 1) for p in conn.split(';') if '=' in p)
    return parts.get('InstrumentationKey') or os.environ['APPINSIGHTS_INSTRUMENTATIONKEY']


# Initialize Application Insights telemetry client
telemetry_client = TelemetryClient(_instrumentation_key())

# Load ML model once per instance (at import time, so warm invocations reuse it)
scorer = RiskScorer()


def main(req: func.HttpRequest) -> func.HttpResponse:
    transaction_id = req.headers.get('X-Transaction-ID', 'unknown')
    user_id = req.params.get('user_id')
    try:
        amount = float(req.params.get('amount', 0))
    except ValueError:
        # Reject malformed input instead of failing with an unhandled error
        return func.HttpResponse("Invalid amount", status_code=400)

    start_time = time.perf_counter()
    try:
        # Score transaction
        risk_score = scorer.predict(user_id, amount)
        inference_time = (time.perf_counter() - start_time) * 1000  # ms

        # Log custom metrics
        telemetry_client.track_metric("Fraud_Risk_Score", risk_score)
        telemetry_client.track_metric("Model_Inference_Latency_ms", inference_time)

        is_blocked = risk_score > 0.95
        if is_blocked:
            telemetry_client.track_metric("Transactions_Blocked", 1)
            logging.warning(f"BLOCKED: High-risk transaction {transaction_id} (score={risk_score:.2f})")
            return func.HttpResponse("Transaction blocked: High fraud risk", status_code=403)

        telemetry_client.track_metric("Transactions_Allowed", 1)
        return func.HttpResponse("OK", status_code=200)
    except Exception:
        logging.exception("Fraud check failed")
        # With no arguments, track_exception() records the exception currently being handled
        telemetry_client.track_exception()
        return func.HttpResponse("Internal error", status_code=500)
    finally:
        # Ensure telemetry is flushed (critical in serverless)
        telemetry_client.flush()
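Because the metric names feed the dashboard and alerts, it helps to be able to test which metrics are emitted without an Azure connection. The following is a sketch under that assumption: `RecordingTelemetry` is a hypothetical test double exposing only the `track_metric` and `flush` methods used above, and `record_decision` is an illustrative refactoring, not part of the original function.

```python
class RecordingTelemetry:
    """Hypothetical test double that records metrics instead of sending them."""

    def __init__(self):
        self.metrics = []
        self.flushed = False

    def track_metric(self, name, value):
        self.metrics.append((name, value))

    def flush(self):
        self.flushed = True


BLOCK_THRESHOLD = 0.95  # same cut-off the function uses


def record_decision(telemetry, risk_score, inference_ms):
    """Emit the three metrics for one transaction; return True when it is blocked."""
    telemetry.track_metric("Fraud_Risk_Score", risk_score)
    telemetry.track_metric("Model_Inference_Latency_ms", inference_ms)
    blocked = risk_score > BLOCK_THRESHOLD
    outcome = "Transactions_Blocked" if blocked else "Transactions_Allowed"
    telemetry.track_metric(outcome, 1)
    return blocked
```

Passing the telemetry client in as a parameter keeps the decision logic independent of the SDK, so a score of exactly 0.95 can be checked to count as allowed rather than blocked.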
End-to-End Monitoring Dashboard Setup
In the Azure portal:
Go to your Application Insights resource
Open Logs (Analytics)
Run this Kusto query to build a live fraud ops dashboard:
customMetrics
| where name in ("Transactions_Blocked", "Fraud_Risk_Score")
| extend timestamp = bin(timestamp, 1m)
| summarize
    blocked = sumif(value, name == "Transactions_Blocked"),
    avg_risk = avgif(value, name == "Fraud_Risk_Score")
    by timestamp
| render timechart
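For readers less familiar with Kusto, the same aggregation can be sketched in plain Python: bucket each metric row into its one-minute bin, sum the blocked counts, and average the risk scores. The row shape `(timestamp, name, value)` and the sample data are assumptions for illustration, and empty bins are handled here by returning `None`, which may differ from how Kusto represents them.

```python
from collections import defaultdict
from datetime import datetime, timezone


def summarize_per_minute(rows):
    """Aggregate (timestamp, name, value) rows into per-minute blocked counts and average risk."""
    buckets = defaultdict(lambda: {"blocked": 0.0, "risk_sum": 0.0, "risk_n": 0})
    for ts, name, value in rows:
        minute = ts.replace(second=0, microsecond=0)  # equivalent of bin(timestamp, 1m)
        bucket = buckets[minute]
        if name == "Transactions_Blocked":
            bucket["blocked"] += value
        elif name == "Fraud_Risk_Score":
            bucket["risk_sum"] += value
            bucket["risk_n"] += 1
    return {
        minute: {
            "blocked": b["blocked"],
            "avg_risk": b["risk_sum"] / b["risk_n"] if b["risk_n"] else None,
        }
        for minute, b in sorted(buckets.items())
    }


# Made-up sample rows spanning two one-minute bins
sample = [
    (datetime(2024, 1, 1, 12, 0, 5, tzinfo=timezone.utc), "Fraud_Risk_Score", 0.2),
    (datetime(2024, 1, 1, 12, 0, 40, tzinfo=timezone.utc), "Fraud_Risk_Score", 0.98),
    (datetime(2024, 1, 1, 12, 0, 40, tzinfo=timezone.utc), "Transactions_Blocked", 1),
    (datetime(2024, 1, 1, 12, 1, 10, tzinfo=timezone.utc), "Fraud_Risk_Score", 0.5),
]
summary = summarize_per_minute(sample)
```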
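Set up alert rules on the signals tied to the business requirements above, for example P95 latency against the 120ms target, failed fraud checks, and the rate of blocked transactions.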
Best Practices for Enterprise Observability
Always enable Application Insights at deployment—never as an afterthought
Log custom metrics aligned with business outcomes (e.g., “fraud prevented”)
Use operation_Id to trace requests across functions and services
Flush telemetry explicitly in serverless environments
Avoid logging PII—sanitize logs and use custom dimensions, not messages
Set up synthetic transactions to detect cold-start regressions
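The PII guidance above can be sketched as two small helpers: a keyed hash that turns a user ID into a stable pseudonym, and a redaction pass for card-like digit runs in log messages. Both function names, the 16-character truncation, and the 13 to 19 digit pattern are illustrative assumptions; a real deployment would load the secret from a vault and follow its own compliance rules.

```python
import hashlib
import hmac
import re

# Runs of 13-19 digits, a rough approximation of payment card numbers
_CARD_LIKE = re.compile(r"\b\d{13,19}\b")


def pseudonymize(user_id: str, secret: bytes) -> str:
    """Return a stable keyed-hash token so telemetry can group by user without storing the ID."""
    digest = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def redact_card_numbers(message: str) -> str:
    """Replace card-like digit sequences in a log message with a placeholder."""
    return _CARD_LIKE.sub("[REDACTED]", message)
```

The pseudonym can then be attached as a custom dimension, while `redact_card_numbers` is applied to any free-text message before it is logged.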
Conclusion
In serverless architectures, observability is your control plane. The bank in our scenario reduced fraud losses by 22% and cut payment latency by 63%—not by rewriting code, but by instrumenting it correctly from the start.
Application Insights + custom metrics + proactive alerting = trust at scale.