Fix Generic Error in Custom Activity Using Batch Account in ADF

Azure Batch is used to run large-scale parallel and high-performance computing (HPC) batch jobs efficiently in Azure. Azure Batch creates and manages a pool of compute nodes (virtual machines), installs the applications you want to run, and schedules jobs to run on the nodes. There is no cluster or job-scheduler software to install, manage, or scale. Instead, you use command-line scripts or the Azure portal to configure, manage, and monitor your jobs.

While executing C# code in a custom activity that uses a Batch account in ADF, you may sometimes get the error: "The underlying connection was closed: An unexpected error occurred on a send. ---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host". However, the same code succeeds when executed locally in Visual Studio debug mode.

In such a scenario, check the SSL/TLS protocol version used between the Batch compute node and the remote web server. The article "Transport Layer Security (TLS) best practices with .NET Framework | Microsoft Learn" is useful for determining which TLS version is appropriate for which .NET Framework version. To check the SSL/TLS version used on the Batch compute node, log on to the node over Remote Desktop (see "Connect using Remote Desktop to an Azure VM running Windows - Azure Virtual Machines | Microsoft Learn").
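If the custom activity targets an older .NET Framework version (4.5/4.6), the process may attempt the handshake with TLS 1.0/1.1, which many Azure endpoints now reject, producing exactly this "connection was forcibly closed" error. A minimal sketch of the common workaround, assuming the code runs on .NET Framework and makes outbound HTTPS calls (the class name is illustrative):

```csharp
using System;
using System.Net;

class TlsSetup
{
    static void Main()
    {
        // Older .NET Framework defaults may not include TLS 1.2 in the
        // enabled protocol set. Opt in to TLS 1.2 with |= so existing
        // protocol flags are kept rather than replaced.
        ServicePointManager.SecurityProtocol |= SecurityProtocolType.Tls12;

        // ...the activity's outbound HTTPS calls follow here...
    }
}
```

Note that Microsoft's TLS best-practices article recommends targeting .NET Framework 4.7 or later and letting the OS choose the protocol, rather than hard-coding a version; the explicit opt-in above is a stopgap for code that cannot be retargeted.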

Then check whether TLS 1.2 is enabled at the system level under the following registry key:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.2\Client]

To open this registry location, see "How to open Registry Editor in Windows 10 - Microsoft Support".
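Instead of inspecting the key manually in Registry Editor, the same check can be scripted on the node. A small sketch using the .NET registry API (the class name is illustrative; a missing value means the OS default applies):

```csharp
using System;
using Microsoft.Win32;

class TlsRegistryCheck
{
    const string Tls12ClientKey =
        @"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\Schannel\Protocols\TLS 1.2\Client";

    static void Main()
    {
        // Enabled = 0 means TLS 1.2 is explicitly disabled for client
        // connections; DisabledByDefault = 1 means it must be requested
        // explicitly. Absent values fall back to the OS default.
        object enabled = Registry.GetValue(Tls12ClientKey, "Enabled", null);
        object disabledByDefault = Registry.GetValue(Tls12ClientKey, "DisabledByDefault", null);

        Console.WriteLine("Enabled: " + (enabled ?? "(not set - OS default)"));
        Console.WriteLine("DisabledByDefault: " + (disabledByDefault ?? "(not set - OS default)"));
    }
}
```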

If the proper TLS configuration is present, next check the compute node metrics (disk utilization and CPU) while the code is running. In this case, utilization was reaching 100%, so the certificates could not be retrieved on the VM and the TLS handshake was not completing. After creating a new Batch pool, the issue was resolved.

Consider using the ‘2022-datacenter’ SKU instead of ‘2012-r2-datacenter’, since the latter is nearing end of support.
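When recreating the pool, the newer image can be specified through the Batch .NET SDK (Microsoft.Azure.Batch NuGet package). A hedged sketch; the account URL, key, pool ID, and VM size below are placeholders, not values from this scenario:

```csharp
using Microsoft.Azure.Batch;
using Microsoft.Azure.Batch.Auth;

class PoolSetup
{
    static void Main()
    {
        // Placeholder credentials - replace with your Batch account values.
        var credentials = new BatchSharedKeyCredentials(
            "https://<account>.<region>.batch.azure.com", "<account>", "<key>");

        using (BatchClient batchClient = BatchClient.Open(credentials))
        {
            // Windows Server 2022 marketplace image instead of 2012 R2.
            var imageReference = new ImageReference(
                publisher: "MicrosoftWindowsServer",
                offer: "WindowsServer",
                sku: "2022-datacenter",
                version: "latest");

            var vmConfiguration = new VirtualMachineConfiguration(
                imageReference,
                nodeAgentSkuId: "batch.node.windows amd64");

            CloudPool pool = batchClient.PoolOperations.CreatePool(
                poolId: "adf-custom-activity-pool",
                virtualMachineSize: "Standard_D2s_v3",
                virtualMachineConfiguration: vmConfiguration,
                targetDedicatedComputeNodes: 1);

            pool.Commit();
        }
    }
}
```

The same pool can also be created from the Azure portal or the `az batch pool create` CLI command if you are not managing pools from code.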