Friday, June 26, 2020

vCSA - 503 Service Unavailable - Failed to connect to endpoint

I was recently asked to look into a vCSA issue. The user was unable to log in using the web client. The vCSA was subsequently rebooted and it then produced the following error: “503 Service Unavailable”


503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x00007fb7d00200a0] _serverNamespace = / action = Allow _pipeName =/var/run/vmware/vpxd-webserver-pipe

Running “service-control --status” on the vCSA showed that the vapi-endpoint service was in a stopped state.
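
When many services are listed, it can help to print only the stopped ones. The following is a minimal sketch, not a VMware-provided tool. The helper name stopped_services is hypothetical, and it assumes the status output groups services under state headings such as "Running:" and "Stopped:", each followed by a line of space-separated service names. Adjust it if your build prints a different layout.

```shell
# Hypothetical helper: print one stopped service per line.
# Assumption: the input has heading lines like "Running:" / "Stopped:",
# each followed by indented, space-separated service names.
stopped_services() {
    awk '
        /^[A-Za-z]+:/ { section = $1; next }   # remember the current state heading
        section == "Stopped:" {                 # only emit names under "Stopped:"
            for (i = 1; i <= NF; i++) print $i
        }
    '
}
```

On the appliance it would be used as: service-control --status | stopped_services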


It turns out the Security Token Service (STS) certificate had expired. If your vCSA was deployed using version 6.5 U2 or later, the STS cert may only be good for 2 years.


Sadly, there is no warning or easy way to determine when the STS Cert will expire using the HTML 5 client.


Resolution:
Part 1:
VMware has created a script to resolve this issue (fixsts.sh).  It can be found here:

https://kb.vmware.com/s/article/76719


1. Create snapshots of all vCSAs (and external PSCs, if applicable).
2. Copy the .sh script to the /tmp dir of the appliance running the PSC role.
3. Make the script executable by running “chmod +x fixsts.sh”
4. Execute the file “./fixsts.sh”
5. Restart the appliance and confirm functionality.
6. The script only needs to be run once per SSO Domain.
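
Steps 3 and 4 above can be sketched as a small shell function. This is only an illustration: the function name run_fixsts is hypothetical, it assumes fixsts.sh has already been copied to the given directory (default /tmp, as in step 2), and it does not take the snapshots or perform the restart.

```shell
# Hypothetical wrapper around steps 3 and 4.
# Assumption: fixsts.sh is already present in the target directory.
run_fixsts() {
    dir="${1:-/tmp}"              # directory holding fixsts.sh
    (
        cd "$dir" || exit 1       # subshell keeps the caller's working directory unchanged
        chmod +x fixsts.sh || exit 1
        ./fixsts.sh               # the script may prompt for input when run interactively
    )
}
```

Usage on the appliance running the PSC role: run_fixsts /tmp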

Part 2:
Upon restart, the Web client was then throwing an Error 400. The user certificates needed to be replaced as well.

1. Launch the Certificate Manager utility: /usr/lib/vmware-vmca/bin/certificate-manager

2. Select Option 6.

3. The default options were taken except for the following:

Enter proper value for 'IPAddress' [optional] : IPADDRESS
Enter proper value for 'Hostname' [Enter valid Fully Qualified Domain Name(FQDN), For Example : example.domain.com] : VCFQDN
Enter proper value for VMCA 'Name' : VCSHORTNAME

4. The certificate replacement process was stopped at 85% by hitting CTRL+C, and the services were manually started by running “service-control --start --all”.

The process was stopped at 85% because the certificate replacement itself had already completed. At 85%, the vCSA was just waiting for all the services to restart. If any of the services fail to start, the certificate replacement will be rolled back.

5. This process can be done at any time. It only affects the vCSA the process is performed on.


Checking the Expiration Date of the STS Certificate: 

Method 1 - Flash Client - login using administrator@vsphere.local using the:


Method 2
1. Download the “checkSTS.py” script from the following location: https://kb.vmware.com/s/article/79248
2. Copy the script to the /tmp dir.
3. Run the Python script: “python checksts.py”