VMware Cloud Director - Virtual Machine Remote Console (VMRC) connectivity issues
Once again I ran into an issue with VMRC (VMware Remote Console) while implementing a new VMware Cloud Director environment. As usual, I went through the VMware KB articles and the community threads to check whether anyone had experienced the same issue before. Unfortunately, none of the threads I found helped, and every item covered in Troubleshooting Virtual Machine Remote Console (VMRC) connectivity issues in VMware vCloud Director (2087524) had already been verified.
The design seemed fairly simple: two application cells as the entry point for customers to configure their virtual datacenters, with the cell servers and their embedded Postgres database in the trusted zone behind a firewall. Because several companies and parties were involved in the environment setup, and there was no access to components such as the firewalls, troubleshooting was a little more difficult than when all these components are managed internally.
The load balancer in front of the two application cells was configured in TCP mode (Layer 4), simply passing the traffic through to the cell server without terminating it (as it has to be configured). The VCD UI was terminating SSL traffic in an HTTP Layer 7 configuration.
The problem, as so often with VMRC configurations, was that no console connection could be established when going through the load balancer to the two application cells. Bypassing the load balancer was unfortunately not possible due to some limitations, so the only way to troubleshoot was through the browser's developer tools and the cell logs themselves.
The Microsoft Edge developer tools showed no errors, and acquiring the MKS ticket (which eventually also shows up in the VMware Cloud Director task list) was not an issue either. To rule out a problem with the translation from port 443 to 8443, the console was even tested directly on 8443, but the issue persisted.
The next step was to check the console-proxy.log file in /opt/vmware/vcloud-director/logs. Interestingly, I saw handshake errors between the cell server (10.0.1.10) and the load balancer (10.0.1.14) and initially focused on those, since the balancer does not terminate traffic and should simply pass it back to the cell. The key, however, was to look a line above to understand the process.
2022-03-17 12:32:52,273 | DEBUG | consoleproxy queue : 0 | ABaseInitialServerTransfer | Initiating a client connection to 192.168.25.123 on port 443 with initial request: 870be68faGec67a | java.nio.channels.SocketChannel[connected local=/10.0.1.10:8443 remote=/10.0.1.14:51020]
2022-03-17 12:34:06,220 | DEBUG | consoleproxy queue : 0 | SimpleProxyConnectionHandler | Initiated handling for channel 0x525e4728 [java.nio.channels.SocketChannel[connected local=/10.0.1.10:8443 remote=/10
2022-03-17 12:34:06,220 | DEBUG | consoleproxy queue : 0 | SSLHandshakeTask | Exception during handshake: java.io.IOException: EOF encountered during handshake. | java.nio.channels.SocketChannel[connected local=/10.0.1.10:8443 remote=/10.0.1.14:51734]
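On a busy cell the handshake noise easily drowns out the entries that matter. As a minimal sketch, assuming the "Initiating a client connection to ... on port ..." wording from the excerpt above (it may differ between Cloud Director versions), the backend targets the console proxy tried to reach can be pulled out of the log like this. The function name extract_console_targets is my own and not part of any VMware tooling.

```python
import re

# Pattern based on the log excerpt above; the exact wording is an assumption
# and may vary between Cloud Director releases.
TARGET_RE = re.compile(
    r"Initiating a client connection to (?P<host>\S+) on port (?P<port>\d+)"
)


def extract_console_targets(lines):
    """Return the sorted, de-duplicated (host, port) pairs found in the given log lines."""
    targets = set()
    for line in lines:
        match = TARGET_RE.search(line)
        if match:
            targets.add((match.group("host"), int(match.group("port"))))
    return sorted(targets)
```

Passing an open file handle for /opt/vmware/vcloud-director/logs/console-proxy.log as lines gives the list of ESXi hosts and ports the cell tried to contact, which is the list worth checking against the firewall rules.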
The cell server connects to the ESXi server with IP 192.168.25.123 to open the console session. The log showed the client connection being initiated, but there was no indication that it ever succeeded and, to keep the story short, it never did. The firewall was blocking port 443 traffic to the ESXi server because of a typo in the firewall configuration, and right after this was corrected VMRC was working again.
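In hindsight, a quick reachability test from the cell to the ESXi host would have pointed at the firewall much earlier. The following is only a sketch of such a check, assuming Python 3 is available on the machine it is run from. The helper name can_connect is hypothetical, and the check only confirms that a TCP connection can be opened; it says nothing about the TLS handshake or the console ticket.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        # create_connection raises OSError (including timeouts) when the
        # connection is refused, dropped by a firewall, or unroutable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run on the cell server, can_connect("192.168.25.123", 443) would be the check matching the log entry above. A False result with a timeout rather than an immediate refusal typically hints at a firewall silently dropping the packets.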