NVMe M.2 Disk Problem

Hi list, I’m running CentOS 7.6 on a Corsair Force MP500 120 GB. The root fs is ext4
and the drive is about 1 year old. The system works very well except at boot: during the boot process I always get a file system check on the NVMe drive.

Running smartctl on this drive gives the following:

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0x1)
Critical Warning: 0x00
Temperature: 41 Celsius
Available Spare: 100%
Available Spare Threshold: 1%
Percentage Used: 1%
Data Units Read: 5,355,595 [2,74 TB]
Data Units Written: 5,826,517 [2,98 TB]
Host Read Commands: 67,978,550
Host Write Commands: 75,422,898
Controller Busy Time: 32,863
Power Cycles: 811
Power On Hours: 2,813
Unsafe Shutdowns: 317
Media and Data Integrity Errors: 0
Error Information Log Entries: 177
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 2: 77 Celsius

Error Information (NVMe Log 0x01, max 64 entries)
Num  ErrCount  SQId   CmdId  Status  PELoc            LBA  NSID  VS
  0       177     0  0x0014  0x4004      -  8796109799680     1   -
  1       176     0  0x0019  0x4004      -  8796109799680     1   -
  2       175     0  0x001a  0x4004      -  8796109799680     1   -
  3       174     0  0x0005  0x4004      -  8796109799680     1   -
  4       173     0  0x000c  0x4004      -  8796109799680     1   -
  5       172     0  0x0019  0x4004      -  8796109799680     1   -
  6       171     0  0x001d  0x4004      -  8796109799680     1   -
  7       170     0  0x0014  0x4004      -  8796109799680     1   -
  8       169     0  0x0011  0x4004      -  8796109799680     1   -
  9       168     0  0x000f  0x4004      -  8796109799680     1   -
 10       167     0  0x0000  0x4004      -  8796109799680     1   -
 11       166     0  0x0006  0x4004      -  8796109799680     1   -
 12       165     0  0x0008  0x4004      -  8796109799680     1   -
 13       164     0  0x000e  0x4004      -  8796109799680     1   -
 14       163     0  0x0008  0x4004      -  8796109799680     1   -
 15       162     0  0x0006  0x4004      -  8796109799680     1   -
… (48 entries not shown)
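
Every entry in the error log above carries the same Status value, 0x4004. As a rough aid to reading it, here is a minimal Python sketch that splits that value into fields. It assumes this smartctl version prints the raw 16-bit status word with the phase tag in bit 0, and that the bit layout follows the NVMe completion status field; the function name `decode_nvme_status` and the small name table are illustrative only, not part of any tool.

```python
# Assumption: the value is the raw 16-bit status word, phase tag in bit 0.
# decode_nvme_status is a hypothetical helper, not a smartmontools API.
def decode_nvme_status(raw: int) -> dict:
    """Split a 16-bit NVMe status word into its individual fields."""
    return {
        "phase_tag": raw & 0x1,                   # bit 0
        "status_code": (raw >> 1) & 0xFF,         # bits 8:1
        "status_code_type": (raw >> 9) & 0x7,     # bits 11:9 (0 = generic)
        "more": (raw >> 14) & 0x1,                # bit 14
        "do_not_retry": (raw >> 15) & 0x1,        # bit 15
    }

# Only a few generic status codes are listed here for illustration.
GENERIC_STATUS_NAMES = {
    0x00: "Successful Completion",
    0x01: "Invalid Command Opcode",
    0x02: "Invalid Field in Command",
}

fields = decode_nvme_status(0x4004)
name = GENERIC_STATUS_NAMES.get(fields["status_code"], "unknown")
print(fields, name)
```

Under those assumptions the entries decode to a generic status code of 0x02, "Invalid Field in Command", which would point to a command the drive rejected rather than a media error. That would be consistent with "Media and Data Integrity Errors: 0" in the health log.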

I noticed that the Unsafe Shutdowns counter is increasing rapidly, and I don’t know what is causing these unsafe shutdowns. Roughly every 3 or 4 boots the value goes up by 1.
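
To put the counter in perspective, the following Python sketch pulls "Power Cycles" and "Unsafe Shutdowns" out of the smartctl text and computes the lifetime ratio between them. The sample text is copied from the output above; the helper name `read_counter` is made up for this example, and the ratio is only an approximation of how often a power cycle ended uncleanly.

```python
import re

# Lines copied from the smartctl output in this post.
SMART_TEXT = """\
Power Cycles: 811
Power On Hours: 2,813
Unsafe Shutdowns: 317
"""

def read_counter(text: str, label: str) -> int:
    """Return the integer value of a 'Label: 1,234' line (hypothetical helper)."""
    match = re.search(rf"^{re.escape(label)}:\s*([\d,]+)", text, re.MULTILINE)
    if match is None:
        raise ValueError(f"counter not found: {label}")
    # smartctl groups digits with commas, so strip them before converting.
    return int(match.group(1).replace(",", ""))

power_cycles = read_counter(SMART_TEXT, "Power Cycles")
unsafe_shutdowns = read_counter(SMART_TEXT, "Unsafe Shutdowns")
ratio = unsafe_shutdowns / power_cycles

print(f"{unsafe_shutdowns} unsafe shutdowns over {power_cycles} power cycles ({ratio:.0%})")
```

With the posted values this comes to about 39%, or roughly one unsafe shutdown per 2.6 power cycles over the drive's lifetime. Saving the smartctl output before and after each reboot and comparing the two counters would show which shutdowns are actually being counted as unsafe.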

I can’t find any errors in the system logs.

Can someone point me in the right direction?

Thanks in advance.

Alessandro.