Errors encountered during this process:
Failed to start File System Check on /dev/dis...
Failed to start update UTMP about System RunLevel Changes.
Failed to start Network Service.
We recently had a brief "blip" on one of our storage arrays. The Windows servers either had no disruption of service or came back up without incident.
This was not the case with our vCSA and external PSC. Neither appliance was functional afterward, so it doesn't appear to be a one-off or fluke. The appliances were running version 6.5.0.15000, the March 2018 release.
Both servers showed "Detected aborted journal" and "journal has aborted" errors on the console. I started troubleshooting with the external PSC.
Upon restart, I received the following error, and the server entered Emergency Mode:
Failed to start File System Check on /dev/dis...
Log in and run the following commands to determine which device is causing the error (it was /dev/sda3 on both appliances in my case):
/bin/sh
/bin/mount
blkid
Match the UUID in the error message with the PARTUUID in the blkid output. In my case, it matched /dev/sda3.
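If the blkid output is long, the matching can be done with a small filter instead of by eye. This is only a sketch, not part of the original procedure: find_dev_by_partuuid is a hypothetical helper name, and it assumes blkid prints its usual one-line-per-device format with a PARTUUID="..." token.

```shell
# Hypothetical helper (not a VMware tool): print the device whose PARTUUID
# matches the value passed as $1. Expects blkid-style lines on stdin, where
# each line starts with "<device>:" followed by KEY="value" tokens.
find_dev_by_partuuid() {
    want=$1
    # Keep only lines carrying the exact PARTUUID token, then print the
    # device name that precedes the first colon.
    grep -F "PARTUUID=\"$want\"" | cut -d: -f1
}

# Usage on the appliance (replace the placeholder with the value from the
# error message):
#   blkid | find_dev_by_partuuid "<partuuid-from-error>"
```

The helper only reads text, so it is safe to run before deciding which device to check.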
Run the following command, which checks ext2, ext3, and ext4 file systems. The "-y" option answers "yes" to all the prompts (super handy):
e2fsck -y /dev/sda3
After the file system check has completed, restart the appliance.
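Before restarting, it can help to confirm what the check actually did. e2fsck reports its result as an exit status that is a sum of bit flags (documented in its man page), and the sketch below decodes that value. describe_fsck_status is a hypothetical helper name, not something shipped with the appliance.

```shell
# Hypothetical helper: translate an e2fsck/fsck exit status into text.
# The status is a bitmask, so several conditions can be set at once.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    # Each entry is "<bit>:<meaning>", following the e2fsck man page.
    for pair in "1:errors corrected" "2:reboot required" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage error" "32:canceled" "128:shared-library error"; do
        bit=${pair%%:*}
        text=${pair#*:}
        if [ $((status & bit)) -ne 0 ]; then
            # Append with a comma separator once the first item is set.
            out="${out:+$out, }$text"
        fi
    done
    echo "$out"
}

# Usage right after the check:
#   e2fsck -y /dev/sda3; describe_fsck_status $?
```

A result that includes "errors left uncorrected" suggests the check should be run again before the restart.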
This resolved the issue with the external PSC. Cool, just repeat the process on the vCSA, right? Not so fast...
I had the following additional errors with the vCSA after running the file system check on /dev/sda3:
Failed to start update UTMP about System RunLevel Changes.
Failed to start Network Service.
Run the following command to view the contents of the systemd journal. In my case, this pointed me to log_vg-log:
journalctl -xb
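The journal for a failed boot can be long, so a filter that pulls out the device-mapper paths it mentions can narrow things down. This is a sketch under assumptions: list_mapper_devices is a hypothetical helper, it assumes a grep that supports -o and -E (as GNU grep does), and it only recognizes names made of letters, digits, underscores, and hyphens.

```shell
# Hypothetical helper: list each /dev/mapper/<name> path that appears in the
# text on stdin, one per line, without duplicates.
list_mapper_devices() {
    # -o prints only the matching part of each line; sort -u removes repeats.
    grep -oE '/dev/mapper/[A-Za-z0-9_-]+' | sort -u
}

# Usage on the appliance (--no-pager writes the journal straight to the pipe):
#   journalctl -xb --no-pager | list_mapper_devices
```

Any path it prints is only a candidate. The surrounding journal lines still need to be read to confirm that the volume is the one reporting file system errors.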
Run a file system check against log_vg-log:
fsck -y /dev/mapper/log_vg-log
Reboot the server after the fsck has completed. After coming back up, the vCenter services started successfully and I was able to log into the vCSA.