KVM Random Reboots AMD EPYC Server


Our new server with an AMD EPYC CPU and a Supermicro board reboots randomly. There is no error message in /var/log/messages before the reboot.

We are running 2 servers with VMware Workstation without any problems.

The new server should run KVM.

Older servers with AMD CPUs (pre-EPYC) run KVM without any problems.

Any ideas or recommendations?


Best regards, Helmut Drodofsky

Internet XS Service GmbH
Heßbrühlstraße 15
70565 Stuttgart

Managing Director: Helmut Drodofsky, HRB 21091 Stuttgart, VAT ID: DE190582774
Phone: 0711 781941 0
Fax: 0711 781941 79
Mail: info@internet-xs.de www.internet-xs.de

2 thoughts on - KVM Random Reboots AMD EPYC Server

  • Is there anything in the server's hardware logs, such as memory errors? Is a watchdog on the server misbehaving?
    We run CentOS 7 and KVM on AMD Opteron and AMD EPYC servers without issues.

    Regards, Simon
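
A minimal sketch of the check suggested above, assuming the BMC event log has already been exported to plain text with one event per line (for example with `ipmitool sel list`). The keyword list and the sample lines are assumptions made for illustration, not real log output; adjust the pattern to whatever your BMC actually writes.

```python
import re

# Keywords are an assumption; extend them to match your BMC's wording.
SUSPICIOUS = re.compile(r"memory|ecc|watchdog|machine check", re.IGNORECASE)


def find_suspicious_events(log_text):
    """Return the non-empty log lines that mention memory, ECC, or watchdog events."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if line.strip() and SUSPICIOUS.search(line)
    ]


if __name__ == "__main__":
    # Made-up sample lines, only to illustrate the filtering.
    sample = """\
1 | 01/02/2019 | 03:04:05 | Memory #0x01 | Correctable ECC | Asserted
2 | 01/02/2019 | 03:10:00 | Fan #0x02 | Lower Critical going low | Asserted
3 | 01/03/2019 | 11:00:00 | Watchdog2 #0x03 | Hard reset | Asserted
"""
    for event in find_suspicious_events(sample):
        print(event)
```

Comparing the timestamps of any matching events against the times of the unexpected reboots is probably the quickest way to see whether they are related.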

  • I had issues with Supermicro and EPYC in the past year. They were isolated to a faulty 16GB ECC RAM module, and the error showed only in the log of the Supermicro web-based BMC and nowhere else. The fault was neither Supermicro's nor AMD's. The ECC module was made by Samsung; it failed after 1 year of use. I assume it came from a bad batch, because the other 25 Samsung ECC RAM modules that we use in the other servers have no issues.

    The behavior was that the server suddenly rebooted at random, with no message at all at the CentOS level.

    I realize that the BMC error log is far (very, very far) from perfect, but perhaps the error is sitting there in a strangely worded message.

    Hope this helps

    ———————
    Erick Perez Quadrian Enterprises S.A. – Panama, Republica de Panama Skype chat: eaperezh WhatsApp IM: +507-6675-5083
    ———————
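
As a complementary check from the operating-system side, the Linux EDAC subsystem exposes per-memory-controller error counters under `/sys/devices/system/edac/mc` when an EDAC driver is loaded. The sketch below reads those counters; the function name `read_edac_counts` is made up for this example. In the case described above the fault appeared only in the BMC log, so zero counts here would not rule out a bad module.

```python
from pathlib import Path


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from EDAC sysfs counters."""
    counts = {}
    root = Path(edac_root)
    if not root.is_dir():
        return counts  # EDAC driver not loaded, or directory not present
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found")
    for name, (ce, ue) in counts.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A corrected-error count that keeps rising on one controller is a reasonable hint about which DIMMs to swap or test first.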