Are you staggering the startups of the VMs? The server may be choking trying to boot 8 machines at once. I suggest starting a VM every 30-60
seconds, so that you aren’t trying to boot all 8 at once. Don’t know if it will help, but it might.
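One way to do that staggering is a small wrapper script. This is only a sketch: the delay, the function name, and the .vmx paths are assumptions, and it uses `vmrun`, the command-line tool that ships with VMware Workstation/Server (the `-T` host type may differ on your setup):

```shell
#!/bin/sh
# Sketch: start VMs one at a time with a pause between them, so the host
# isn't trying to boot all 8 guests at once. Delay and paths are examples.

stagger_start() {
    # $1 = seconds to wait between starts; remaining args = .vmx paths
    delay=$1; shift
    for vm in "$@"; do
        vmrun -T ws start "$vm" nogui   # -T ws for Workstation; adjust for Server
        sleep "$delay"
    done
}

# Example invocation (uncomment on a real host):
# stagger_start 45 /vms/web.vmx /vms/db.vmx /vms/mail.vmx
```

Run it from the host's startup scripts instead of letting all the guests auto-start in parallel.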
The crashes happen long after boot time, when all of the VMs are running.
(Actually, startup goes very smoothly, with the VMs starting in parallel in the background while system boot completes.)
Could the host still be working through its own boot process? You might want to make sure the host system has finished booting before any of the VMs try to start. I know you said the issue begins after they have all booted,
but better safe than sorry, in case something the VMs need hasn't been started yet.
How long do they stay up for?
This sounds like the machine running out of memory and the Out of Memory (OOM) killer killing one of the VMware instances.
My experience with this, on a well-provisioned machine, was that there was enough total memory but timing was the problem: the kernel could not satisfy a memory request quickly enough, and the OOM killer acted.
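If the OOM killer is involved, it leaves traces in the kernel log. A quick check (a sketch; the exact message wording varies by kernel version, and the log path is distro-dependent):

```shell
#!/bin/sh
# Sketch: scan a kernel log for OOM-killer activity. "oom-killer" and
# "Out of memory" are the usual markers, but wording varies by kernel.

check_oom() {
    grep -iE "oom-killer|out of memory" "$1"
}

# On a live host:
# check_oom /var/log/messages      # /var/log/syslog on Debian-based systems
# dmesg | grep -iE "oom-killer|out of memory"
```

If those searches come up empty after a crash, the OOM killer probably isn't the culprit.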
I solved my problem by reserving more memory as unused. I have had issues with a VMware host server running out of memory;
maybe try setting this variable in sysctl.conf:
(that will maintain 64 MB of free RAM and should allow enough time to prevent OOM kills)
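The post doesn't actually name the variable. On Linux, the sysctl that matches the "keep 64 MB free" description is usually `vm.min_free_kbytes`, so a guess at the intended entry (an assumption, not confirmed by the thread) would be:

```
# /etc/sysctl.conf -- assumed variable; the original post omitted the name.
# vm.min_free_kbytes is in kilobytes, so 65536 keeps ~64 MB of RAM free.
vm.min_free_kbytes = 65536
```

Apply it without rebooting with `sysctl -p`, or set it directly via `sysctl -w vm.min_free_kbytes=65536`.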
I’ll give that a try.
But the problem was not that one or more VMware instances were killed while other processes continued: the whole system hung. Nothing was running.