Unusual System State

Our smallest network of systems has only four computers connected via Gigabit Ethernet.  The oldest and most stable platform is an eight-year-old Dell E520 running CentOS 6.8.  We often try out applications on this Dell/CentOS machine before moving them to systems on our other networks.

Last night, one of our users decided to create a single, 228GB home directory tar archive on an empty, 500GB, external, USB, Ext4 disk drive. This was obviously a poor decision. The extent of the results was not obvious until this morning.

All disk activity had stopped and the system appeared to be in hibernation. A push on the power button usually brings the system back to life, but in this case, the unlock screen was presented for only three seconds and then the hibernation mode resumed.  Repeated attempts to log on were all thwarted by this behavior.  SSH from other systems was also not possible.

Holding the power button in order to initiate power down did not work either.  The result was the same as a one-second press of the front panel power button: the unlock screen came up for only a short time.  We eventually removed the power cord for five minutes and then restarted the machine.

The system is running normally once again.  The corrupted file system on the USB disk has been restored by re-partitioning and building a new Ext4 file system on it. The user no longer gets to use external disks.

Examination of log files and the dmesg output did not yield any useful information regarding the unusual state of the system while logon from the unlock screen was not possible.  Is there somewhere else we should look for evidence of what actually happened and of the unusual state of the system?  Thanks.

[user@computer ~]$ uname -a
Linux computer 2.6.32-642.11.1.el6.x86_64 #1 SMP Fri Nov 18 19:25:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[user@computer ~]$

3 thoughts on - Unusual System State

  • Chris Olson wrote:

    Not that this will be of any help, but we, once in a while, will suddenly find a machine unresponsive, and in an undefined state. IIRC, pingable, but can’t SSH in, nor is there any response whatsoever to plugging in a keyboard and monitor. Power cycle is the only answer, and there’s never anything in the logs.

    Mostly, those systems are used for very serious scientific computing (as in, no VM, and I’ve seen loads of > 80).

    mark

  • I have seen *some* similar activity on different machines through the years, and it *always* turns out to be a hardware issue. If this machine is particularly old, I would be suspicious of that.

    Linc Fessenden

  • Lincoln Fessenden wrote:
    My version of it has been on both Dells and Supermicros (rebranded by Penguin).

    mark