/lib/firmware/microcode.dat Update On CentOS 6

Dear All,

An update has just brought updated microcode.dat files to my CentOS 6 boxes:

/lib/firmware/microcode.dat

Does anybody know offhand what this is and how critical it is? If it is related to the CPU hardware trouble that is so famous these days, I will need to reboot the relevant boxes to have the new microcode loaded. But if it is not that critical, it can wait until the next reboot.

Thanks a lot and apologies for laziness (not looking into details of this particular update myself).

Valeri

++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev Sr System Administrator Department of Astronomy and Astrophysics Kavli Institute for Cosmological Physics University of Chicago Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++

29 thoughts on - /lib/firmware/microcode.dat Update On CentOS 6

  • See:

    https://access.redhat.com/errata/RHSA-2018:0093

    Red Hat have rolled back the recent microcode updates for Spectre as they were causing instabilities in some systems.

    Updated microcode was only made available for a relatively small number of CPUs, so it might be the case that your microcode was never actually updated. In that case there is nothing to roll back in the latest release, so there is no need to panic about rebooting. Checking /var/log/messages should give you more clues about when your microcode was actually last updated and allow you to determine whether it was before or after the recent Spectre fiasco.
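
As a rough sketch of that log check (assuming the stock syslog paths on CentOS 6; the function name `microcode_log_lines` is made up for illustration), something like this pulls out the lines that mention microcode:

```shell
#!/bin/sh
# Hypothetical helper: print every line mentioning "microcode"
# (case-insensitive) from the files given as arguments.
microcode_log_lines() {
    grep -hi 'microcode' "$@" 2>/dev/null || true
}

# Assumed default locations; rotated logs may hold older entries.
for f in /var/log/messages /var/log/dmesg; do
    if [ -r "$f" ]; then
        microcode_log_lines "$f"
    fi
done
```

The timestamps on the matching lines are what tell you whether an update was loaded before or after the Spectre-related packages were installed.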

  • As Phil said, this basically REMOVES the new microcode.dat file and reverts to the last stable one before the spectre / meltdown update set.

    Please see my tweet about this:

    https://twitter.com/JohnnyCentOS/status/953734648764477440

    For those of you without twitter (WHAT .. who doesn’t have twitter :D )

    Look at:

    https://t.co/6fT61xgtGH

    Get the latest microcode.dat file from here:

    https://t.co/zPwagbeJFY

    See how to update the microcode from the links at the bottom of this page:

    https://t.co/EOgclWdHCw

    And before anyone asks .. I have no idea why Red Hat chose this path; they just did. It doesn’t matter if I (or anyone else) agree with the decision. It is what it is.

    Thanks. Johnny Hughes

  • **I’m not blaming you.**

    But can I just clarify: do we have to *manually* install the microcode update on EL7 in order to be protected against Spectre? EL6 as well?

    Presumably this is to remove RH from the loop and to stop people blaming them – i.e. this is between Intel and the customer, it’s nothing to do with them.

    What about future microcode updates? They come out reasonably regularly (2 or 3 times a year) – are RH going to absolve themselves from all future updates because presumably the next update will also contain the Spectre fixes?

    So, before I re-invent the wheel, does anyone have automation scripts to do the microcode update? I don’t relish the prospect of doing this manually on a couple of hundred machines. Is it reasonable to grab the microcode_ctl SRPM and create my own updated RPM to do it?

    P.

  • No, this is because at least one major CPU (Intel type 79) is completely broken by the Intel Microcode Update. Those machines can’t boot after the microcode rpm is installed. It impacts at least these processors:

    Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz
    Intel(R) Xeon(R) CPU E5-2643 v4 @ 3.40GHz
    Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz
    Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.50GHz

    There may be others.

    https://bugzilla.redhat.com/show_bug.cgi?id=1532283

    That means it is NOT a foolproof install and likely leaves many servers/machines broken. I suppose that the decision is to go back to what works for all machines (the last known good install) and let admins work with their hardware vendor, because the alternative is breaking things.

    This also needs to be fixed with a combination of firmware updates AND microcode updates. All of that is outside the expertise of a Linux vendor and is unique for each processor, chipset and firmware combination.

    How they handle it in the future I have no way of knowing, but if you had 20,000 servers with the impacted CPU and you updated and could not reboot, I would assume that you would not appreciate it.

    That is what I have found so far with a bit of research.

    This is NOTHING about who to blame and everything about stable, working updates .. at least it seems so to me.

  • So, if we applied the previous microcode update, and all our machines rebooted OK, then we don’t need to fall back?

    Also, do we know if the updated CentOS microcode RPM reverted the microcode for *all* Intel CPUs, or just the ones that had issues? In other words, if I apply the latest microcode update to our 100+ machines (which all have the previous update, and are OK) will they revert to a vulnerable state?

  • It reverted for all .. but, your machines may or may not be protected as only a subset of machines were updated with the original microcode from Intel.

    It is your call as to what you install .. but the correct method is to install the current microcode_ctl .. and then research your specific machine, its CPU, chipset, firmware .. go to the vendor and make sure you get all the things necessary to mitigate the issues. It will be different for each CPU vendor (Intel or AMD), each CPU / Chipset combo, and even each vendor (Dell may have new firmware for x and y but not z models, etc.)

    There is no one size fits all update for this issue.

  • I bet you are right. I was going to rant about that… then it occurred to me that a class action against Intel (I didn’t hear about AMD though) is quite likely, so, indeed, Red Hat does not want to be even mentioned in it, which would be unfair, especially after Red Hat put effort into fixing somebody else’s crap.

    Valeri

  • OK, so color me confused about the timing in all this.

    Do we update the microcode now or do we wait until the latest microcode_ctl rpm is available and then tackle this issue?

  • As a data point, we have the updated microcode running on 600+ Haswell servers and so far no indication of problems.

    We’ll keep the ibrs/spectre mitigation this gives us and not revert (unless it turns out it does cause problems).

    /Peter
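
For anyone wanting to confirm which mitigations a box is actually running: the Red Hat kernels of this period expose the tunables pti_enabled, ibrs_enabled and ibpb_enabled under /sys/kernel/debug/x86 (debugfs must be mounted, and reading them normally needs root). A minimal sketch, with `show_tunables` being a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: print "name=value" for each mitigation tunable
# found in the given directory, or "name=absent" if the file is missing.
show_tunables() {
    dir=${1:-/sys/kernel/debug/x86}
    for name in pti_enabled ibrs_enabled ibpb_enabled; do
        if [ -r "$dir/$name" ]; then
            echo "$name=$(cat "$dir/$name")"
        else
            echo "$name=absent"
        fi
    done
}

show_tunables
```

What each value means depends on the kernel and CPU, so treat the output as a starting point and check it against Red Hat's own description of these tunables.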

  • The message is: stay away from microcode updates because they’re broken right now. Intel may or may not release fixes next week to be tested by OEMs.

    Once working updates are available, OEMs will integrate them into their firmware/BIOS releases. That is one method to avail of microcode updates. The other method is loading during OS boot (via udev rule), with codes provided by the microcode_ctl rpm. It looks like Red Hat are now staying away from that; in any case, their previous rpm only included ucodes for three cpus. I did not check if the microcode.dat included more updates than that.

    Method number 2b is to download the firmware from Intel directly and provide it in the locations defined by the microcode_ctl rpm. Then it’s up to you to do the QA.

    If your RHEL/CentOS is fully up to date, you’re protected against variant 1/Spectre and Meltdown. Red Hat have done a pretty good job to backport those patches from upstream. GKH’s blog is worth a read.
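
Method 2b can be scripted. The sketch below only stages a single per-CPU file from an unpacked Intel tarball into a firmware directory; the helper name `stage_ucode`, its arguments and the `.bak` backup suffix are illustrative assumptions, and the QA remains yours.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC/intel-ucode/ID to DEST/intel-ucode/ID,
# keeping a .bak copy of any file that is already there.
#   SRC  = directory where the Intel tarball was unpacked
#   DEST = firmware directory (assumed to be /lib/firmware on a live system)
#   ID   = family-model-stepping in hex, e.g. 06-3f-02
stage_ucode() {
    src=$1
    dest=$2
    id=$3
    if [ ! -f "$src/intel-ucode/$id" ]; then
        echo "no microcode file for $id in $src" >&2
        return 1
    fi
    mkdir -p "$dest/intel-ucode" || return 1
    if [ -f "$dest/intel-ucode/$id" ]; then
        cp -p "$dest/intel-ucode/$id" "$dest/intel-ucode/$id.bak" || return 1
    fi
    cp -p "$src/intel-ucode/$id" "$dest/intel-ucode/$id"
}

# Example (not run here):
#   stage_ucode ./microcode-20171117 /lib/firmware 06-3f-02
```

Whether a given release reads the per-CPU files or microcode.dat depends on the microcode_ctl version, so it is worth checking the package contents first (`rpm -ql microcode_ctl`). A staged file only takes effect at the next boot or microcode reload.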

  • I downloaded the appropriate files from the links that Johnny provided in a previous posting. My question is: do we wait until the latest microcode_ctl rpm is installed, or do it now? My concern is that if I do it now, the new rpm might undo what I’ve done.

    Pete


    Unencumbered by the thought process.
    — Click and Clack the Tappet brothers



  • It does not matter. The microcode_ctl package contains CPU firmware that is loaded by the kernel early in the boot process if it’s newer than the one provided by the system firmware/BIOS. It is never permanently stored in NVRAM or anything; it’s loaded at each boot.

    You should get a BIOS/EFI firmware update from your hardware vendor which includes updated microcode. Then, you’ll get the IBRS-capable microcode at boot, every boot. This makes microcode_ctl moot.

    Read more about this here: https://access.redhat.com/solutions/3315431


    Matthew Miller

    Fedora Project Leader

  • Hence, by my understanding, there should not be any permanent damage should you get a ‘bad’ microcode update, either from Intel or Red Hat, that prevents the system from booting. Presumably one should still always be able to boot the machine from a rescue disk, mount the fs and either delete the offending microcode or uninstall the microcode_ctl package to allow the system to boot again. This should not result in a
    ‘bricked’ permanently unrecoverable system.
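
A sketch of that recovery, assuming the rescue environment has mounted the installed system under /mnt/sysimage and that the microcode lives in /lib/firmware/microcode.dat and /lib/firmware/intel-ucode (the helper name `park_microcode` and the `.disabled` suffix are invented for illustration):

```shell
#!/bin/sh
# Hypothetical helper: rename the microcode files under ROOT so the
# loader no longer finds them. Nothing is deleted, so the step is reversible.
park_microcode() {
    root=${1:-/mnt/sysimage}
    for p in "$root/lib/firmware/microcode.dat" "$root/lib/firmware/intel-ucode"; do
        if [ -e "$p" ]; then
            mv "$p" "$p.disabled" || return 1
            echo "parked $p"
        fi
    done
}

# Example from a rescue shell (not run here):
#   park_microcode /mnt/sysimage
```

If the microcode is also embedded in the initramfs, the initramfs may need regenerating as well; that step is left out here.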

  • Thanks, Johnny, Matthew, Peter, … everybody for your insights!

    Valeri

  • It isn’t about washing hands, lawsuits, or someone else’s stuff. It is about broken microcode updates.

    The code from Intel was broken .. causing several CPUs not to boot after update. That (and only that) is why they were pulled.

    Users MUST individually QA the microcode for their individual CPUs, OEM firmware, chipset etc.

    The issue here is that the microcode broke people’s machines .. therefore it had to be rolled back. All the other discussion is full and total BS.

    It is likely that ONCE all the microcode updates are tested and completely working, Red Hat will again include them in the microcode_ctl RPM .. but they can’t put stuff in there that is breaking machines.

    While things are being released as QA quality, they are going to have to be done individually by admins .. that’s just how it is.

  • Thanks, Johnny, for correcting my wild guess, which was wrong; I realized that after reading your other post!

    As with everything else the end user pays one way or another. :-(

    Valeri

  • Except this doesn’t mention microcode at all. I can’t even tell WTF they’re recommending not doing in this doc, it’s that badly written. You have to infer, by reading two prior docs, that they’re referring to microcode. And then you have to assume that’s still what they’re referring to when they say:

    “We recommend that OEMs, cloud service providers, system manufacturers, software vendors and end users stop deployment of current versions.” Current versions of what? Microcode?

    But yes, indeed they appear to have pulled the 20180108 microcode, which was previously set to latest at this link, and it is now reverted to the 20171117 microcode.

    https://downloadcenter.intel.com/download/27337/Linux-Processor-Microcode-Data-File?v=t

    What this means is that people who have CPUs which were not crashing (rebooting being a new euphemism for crashing), but who got variant 2 Spectre mitigation with the 20180108 microcode, will lose full mitigation until Intel gets its ducks in a row.

    *eye roll*

    His comments aren’t about microcode though. And it also looks like he got IBRS and IBPB confused. The better post on this front is

    https://lkml.org/lkml/2018/1/22/598

    As far as I know, there still is no mitigation for Spectre variant 1.

  • What’s amazing to me is that, after “Intel Inside – don’t divide” (their Pentium FDIV debacle), they didn’t learn and come up with a better plan for addressing these kinds of things.


  • It probably would be fair to conclude that they didn’t plan to address these things at all ;-)

    Valeri

  • Once upon a time, Chris Murphy said:

    Well, that’s the only thing Intel provides for CPUs, so that’s all it can be.

    Lots of people weren’t seeing issues, but that’s in part because Intel’s updated microcode release only actually updated microcode for recent CPUs. I have many servers that aren’t crashing, but that’s because Intel hasn’t actually even tried to fix the microcode for their CPUs yet.

  • Comparing microcode-20171117 with microcode-20180108 shows that of the 94 ucode files only 19 were updated:

    $ diff -r --brief microcode-20171117 microcode-20180108
    Files microcode-20171117/intel-ucode/06-3c-03 and microcode-20180108/intel-ucode/06-3c-03 differ
    Files microcode-20171117/intel-ucode/06-3d-04 and microcode-20180108/intel-ucode/06-3d-04 differ
    Files microcode-20171117/intel-ucode/06-3e-04 and microcode-20180108/intel-ucode/06-3e-04 differ
    Files microcode-20171117/intel-ucode/06-3f-02 and microcode-20180108/intel-ucode/06-3f-02 differ
    Files microcode-20171117/intel-ucode/06-3f-04 and microcode-20180108/intel-ucode/06-3f-04 differ
    Files microcode-20171117/intel-ucode/06-45-01 and microcode-20180108/intel-ucode/06-45-01 differ
    Files microcode-20171117/intel-ucode/06-46-01 and microcode-20180108/intel-ucode/06-46-01 differ
    Files microcode-20171117/intel-ucode/06-47-01 and microcode-20180108/intel-ucode/06-47-01 differ
    Files microcode-20171117/intel-ucode/06-4e-03 and microcode-20180108/intel-ucode/06-4e-03 differ
    Files microcode-20171117/intel-ucode/06-55-04 and microcode-20180108/intel-ucode/06-55-04 differ
    Files microcode-20171117/intel-ucode/06-56-02 and microcode-20180108/intel-ucode/06-56-02 differ
    Files microcode-20171117/intel-ucode/06-56-03 and microcode-20180108/intel-ucode/06-56-03 differ
    Files microcode-20171117/intel-ucode/06-5e-03 and microcode-20180108/intel-ucode/06-5e-03 differ
    Files microcode-20171117/intel-ucode/06-7a-01 and microcode-20180108/intel-ucode/06-7a-01 differ
    Files microcode-20171117/intel-ucode/06-8e-09 and microcode-20180108/intel-ucode/06-8e-09 differ
    Files microcode-20171117/intel-ucode/06-8e-0a and microcode-20180108/intel-ucode/06-8e-0a differ
    Files microcode-20171117/intel-ucode/06-9e-09 and microcode-20180108/intel-ucode/06-9e-09 differ
    Files microcode-20171117/intel-ucode/06-9e-0a and microcode-20180108/intel-ucode/06-9e-0a differ
    Files microcode-20171117/intel-ucode/06-9e-0b and microcode-20180108/intel-ucode/06-9e-0b differ
    Files microcode-20171117/microcode.dat and microcode-20180108/microcode.dat differ
    Files microcode-20171117/releasenote and microcode-20180108/releasenote differ

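    Microcode ID? Get the CPU family, model and stepping with:

    $ awk '/cpu family/||/model\t/||/stepping/' /proc/cpuinfo | sort | uniq

    and convert the values into hex to get the family-model-stepping file name (for example 06-3f-02).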
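
The hex conversion can be folded into the same awk call. The sketch below is one way to do it, not something from the thread; the function name `ucode_id` is made up, and it assumes the usual x86 /proc/cpuinfo layout where "stepping" follows "cpu family" and "model":

```shell
#!/bin/sh
# Hypothetical helper: print the family-model-stepping ID of the first CPU
# listed in a cpuinfo-style file, as two-digit hex fields (e.g. 06-3f-02).
ucode_id() {
    awk -F': *' '
        /^cpu family/                 { fam = $2 }
        /^model/ && !/^model name/    { mod = $2 }
        /^stepping/                   { step = $2; exit }   # stop after the first CPU
        END { printf "%02x-%02x-%02x\n", fam, mod, step }
    ' "${1:-/proc/cpuinfo}"
}

if [ -r /proc/cpuinfo ]; then
    ucode_id
fi
```

The printed ID can be matched against the intel-ucode file names in the diff output, for instance with `[ -e microcode-20180108/intel-ucode/$(ucode_id) ]`.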

  • Thanks for this info Leon. Very helpful. I was trying to figure this out. Intel should make this clear on their microcode download page.