Slow Performance On Company Production Server I Need Help

Hello Everyone,

Since rebooting my CentOS 6.10 OpenVZ server “daisy” yesterday, I have been getting horrible system performance. /var/log/messages is full of “HDIO_GET_IDENTITY failed for /dev/sdb” messages. The latest entries look like this:

Apr 22 08:51:32 daisy kernel: [141224.655699] CT: 1005: stopped
Apr 22 08:55:04 daisy ata_id[21513]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:00:05 daisy ata_id[21584]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:05:02 daisy ata_id[21644]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:10:01 daisy ata_id[22282]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:11:49 daisy kernel: [142441.721065] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:11:49 daisy kernel: [142441.721083]       Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:11:49 daisy kernel: [142441.721093] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:11:49 daisy kernel: [142441.721109] hdparm        D ffff88000c778300     0 22246  20845    0 0x00000080
Apr 22 09:11:49 daisy kernel: [142441.721115]  ffff88006654bcb8 0000000000000086 ffffffff8114f130 ffff88002821fa40
Apr 22 09:11:49 daisy kernel: [142441.721121]  ffff88000004d238 ffff88006654bd70 ffff88006654bc88 ffffea00016ab7c0
Apr 22 09:11:49 daisy kernel: [142441.721125]  ffff88011a707000 ffff880028321168 000000000001b7ea 0000816be9b3faa2
Apr 22 09:11:49 daisy kernel: [142441.721130] Call Trace:
Apr 22 09:11:49 daisy kernel: [142441.721139]  [] ? sync_page+0x0/0x50
Apr 22 09:11:49 daisy kernel: [142441.721144]  [] ? update_curr+0xe1/0x1f0
Apr 22 09:11:49 daisy kernel: [142441.721149]  [] schedule_timeout+0x215/0x2f0
Apr 22 09:11:49 daisy kernel: [142441.721155]  [] ? check_preempt_curr+0x82/0xa0
Apr 22 09:11:49 daisy kernel: [142441.721159]  [] wait_for_completion+0xe4/0x120
Apr 22 09:11:49 daisy kernel: [142441.721162]  [] ? default_wake_function+0x0/0x20
Apr 22 09:11:49 daisy kernel: [142441.721167]  [] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:11:49 daisy kernel: [142441.721172]  [] sync_inodes_sb_ub+0xa8/0x1d0
Apr 22 09:11:49 daisy kernel: [142441.721176]  [] ? filemap_fdatawait+0x2f/0x40
Apr 22 09:11:49 daisy kernel: [142441.721181]  [] __sync_filesystem+0x95/0xa0
Apr 22 09:11:49 daisy kernel: [142441.721184]  [] sync_filesystems+0x30d/0x350
Apr 22 09:11:49 daisy kernel: [142441.721188]  [] sys_sync+0x155/0x1a0
Apr 22 09:11:49 daisy kernel: [142441.721192]  [] system_call_fastpath+0x22/0x3a
Apr 22 09:13:49 daisy kernel: [142561.721069] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:13:49 daisy kernel: [142561.721087]       Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:13:49 daisy kernel: [142561.721096] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:13:49 daisy kernel: [142561.721112] hdparm        D ffff88000c778300     0 22246  20845    0 0x00000080
Apr 22 09:13:49 daisy kernel: [142561.721118]  ffff88006654bcb8 0000000000000086 ffffffff8114f130 ffff88002821fa40
Apr 22 09:13:49 daisy kernel: [142561.721123]  ffff88000004d238 ffff88006654bd70 ffff88006654bc88 ffffea00016ab7c0
Apr 22 09:13:49 daisy kernel: [142561.721128]  ffff88011a707000 ffff880028321168 000000000001b7ea 0000816be9b3faa2
Apr 22 09:13:49 daisy kernel: [142561.721133] Call Trace:
Apr 22 09:13:49 daisy kernel: [142561.721142]  [] ? sync_page+0x0/0x50
Apr 22 09:13:49 daisy kernel: [142561.721148]  [] ? update_curr+0xe1/0x1f0
Apr 22 09:13:49 daisy kernel: [142561.721153]  [] schedule_timeout+0x215/0x2f0
Apr 22 09:13:49 daisy kernel: [142561.721158]  [] ? check_preempt_curr+0x82/0xa0
Apr 22 09:13:49 daisy kernel: [142561.721162]  [] wait_for_completion+0xe4/0x120
Apr 22 09:13:49 daisy kernel: [142561.721166]  [] ? default_wake_function+0x0/0x20
Apr 22 09:13:49 daisy kernel: [142561.721170]  [] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:13:49 daisy kernel: [142561.721176]  [] sync_inodes_sb_ub+0xa8/0x1d0
Apr 22 09:13:49 daisy kernel: [142561.721180]  [] ? filemap_fdatawait+0x2f/0x40
Apr 22 09:13:49 daisy kernel: [142561.721184]  [] __sync_filesystem+0x95/0xa0
Apr 22 09:13:49 daisy kernel: [142561.721188]  [] sync_filesystems+0x30d/0x350
Apr 22 09:13:49 daisy kernel: [142561.721192]  [] sys_sync+0x155/0x1a0
Apr 22 09:13:49 daisy kernel: [142561.721196]  [] system_call_fastpath+0x22/0x3a
Apr 22 09:15:06 daisy ata_id[22299]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:15:49 daisy kernel: [142681.721085] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:15:49 daisy kernel: [142681.721104]       Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:15:49 daisy kernel: [142681.721113] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:15:49 daisy kernel: [142681.721129] hdparm        D ffff88000c778300     0 22246  20845    0 0x00000080
Apr 22 09:15:49 daisy kernel: [142681.721136]  ffff88006654bcb8 0000000000000086 ffffffff8114f130 ffff88002821fa40
Apr 22 09:15:49 daisy kernel: [142681.721141]  ffff88000004d238 ffff88006654bd70 ffff88006654bc88 ffffea00016ab7c0
Apr 22 09:15:49 daisy kernel: [142681.721146]  ffff88011a707000 ffff880028321168 000000000001b7ea 0000816be9b3faa2
Apr 22 09:15:49 daisy kernel: [142681.721150] Call Trace:
Apr 22 09:15:49 daisy kernel: [142681.721160]  [] ? sync_page+0x0/0x50
Apr 22 09:15:49 daisy kernel: [142681.721166]  [] ? update_curr+0xe1/0x1f0
Apr 22 09:15:49 daisy kernel: [142681.721172]  [] schedule_timeout+0x215/0x2f0
Apr 22 09:15:49 daisy kernel: [142681.721178]  [] ? check_preempt_curr+0x82/0xa0
Apr 22 09:15:49 daisy kernel: [142681.721182]  [] wait_for_completion+0xe4/0x120
Apr 22 09:15:49 daisy kernel: [142681.721185]  [] ? default_wake_function+0x0/0x20
Apr 22 09:15:49 daisy kernel: [142681.721190]  [] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:15:49 daisy kernel: [142681.721196]  [] sync_inodes_sb_ub+0xa8/0x1d0
Apr 22 09:15:49 daisy kernel: [142681.721200]  [] ? filemap_fdatawait+0x2f/0x40
Apr 22 09:15:49 daisy kernel: [142681.721204]  [] __sync_filesystem+0x95/0xa0
Apr 22 09:15:49 daisy kernel: [142681.721208]  [] sync_filesystems+0x30d/0x350
Apr 22 09:15:49 daisy kernel: [142681.721212]  [] sys_sync+0x155/0x1a0
Apr 22 09:15:49 daisy kernel: [142681.721217]  [] system_call_fastpath+0x22/0x3a
Apr 22 09:17:49 daisy kernel: [142801.721064] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:17:49 daisy kernel: [142801.721082]       Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:17:49 daisy kernel: [142801.721091] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:17:49 daisy kernel: [142801.721107] hdparm        D ffff88000c778300     0 22246  20845    0 0x00000080
Apr 22 09:17:49 daisy kernel: [142801.721114]  ffff88006654bcb8 0000000000000086 ffffffff8114f130 ffff88002821fa40
Apr 22 09:17:49 daisy kernel: [142801.721119]  ffff88000004d238 ffff88006654bd70 ffff88006654bc88 ffffea00016ab7c0
Apr 22 09:17:49 daisy kernel: [142801.721124]  ffff88011a707000 ffff880028321168 000000000001b7ea 0000816be9b3faa2
Apr 22 09:17:49 daisy kernel: [142801.721128] Call Trace:
Apr 22 09:17:49 daisy kernel: [142801.721137]  [] ? sync_page+0x0/0x50
Apr 22 09:17:49 daisy kernel: [142801.721143]  [] ? update_curr+0xe1/0x1f0
Apr 22 09:17:49 daisy kernel: [142801.721149]  [] schedule_timeout+0x215/0x2f0
Apr 22 09:17:49 daisy kernel: [142801.721154]  [] ? check_preempt_curr+0x82/0xa0
Apr 22 09:17:49 daisy kernel: [142801.721158]  [] wait_for_completion+0xe4/0x120
Apr 22 09:17:49 daisy kernel: [142801.721162]  [] ? default_wake_function+0x0/0x20
Apr 22 09:17:49 daisy kernel: [142801.721166]  [] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:17:49 daisy kernel: [142801.721172]  [] sync_inodes_sb_ub+0xa8/0x1d0
Apr 22 09:17:49 daisy kernel: [142801.721176]  [] ? filemap_fdatawait+0x2f/0x40
Apr 22 09:17:49 daisy kernel: [142801.721180]  [] __sync_filesystem+0x95/0xa0
Apr 22 09:17:49 daisy kernel: [142801.721184]  [] sync_filesystems+0x30d/0x350
Apr 22 09:17:49 daisy kernel: [142801.721188]  [] sys_sync+0x155/0x1a0
Apr 22 09:17:49 daisy kernel: [142801.721192]  [] system_call_fastpath+0x22/0x3a
Apr 22 09:20:01 daisy ata_id[22405]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:21:49 daisy kernel: [143041.721494] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:21:49 daisy kernel: [143041.721512]       Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:21:49 daisy kernel: [143041.721522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:21:49 daisy kernel: [143041.721691] hdparm        D ffff88000c778300     0 22246  20845    0 0x00000080
Apr 22 09:21:49 daisy kernel: [143041.721697]  ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:21:49 daisy kernel: [143041.721702]  ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:21:49 daisy kernel: [143041.721706]  ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:21:49 daisy kernel: [143041.721711] Call Trace:
Apr 22 09:21:49 daisy kernel: [143041.721720]  [] ? __switch_to+0x16f/0x470
Apr 22 09:21:49 daisy kernel: [143041.721726]  [] ? finish_task_switch+0xce/0x120
Apr 22 09:21:49 daisy kernel: [143041.721730]  [] ? update_curr+0xe1/0x1f0
Apr 22 09:21:49 daisy kernel: [143041.721735]  [] schedule_timeout+0x215/0x2f0
Apr 22 09:21:49 daisy kernel: [143041.721739]  [] wait_for_completion+0xe4/0x120
Apr 22 09:21:49 daisy kernel: [143041.721743]  [] ? default_wake_function+0x0/0x20
Apr 22 09:21:49 daisy kernel: [143041.721747]  [] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:21:49 daisy kernel: [143041.721753]  [] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:21:49 daisy kernel: [143041.721757]  [] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:21:49 daisy kernel: [143041.721762]  [] __sync_filesystem+0x48/0xa0
Apr 22 09:21:49 daisy kernel: [143041.721765]  [] sync_filesystems+0x30d/0x350
Apr 22 09:21:49 daisy kernel: [143041.721769]  [] sys_sync+0x148/0x1a0
Apr 22 09:21:49 daisy kernel: [143041.721773]  [] system_call_fastpath+0x22/0x3a
Apr 22 09:23:49 daisy kernel: [143161.721064] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:23:49 daisy kernel: [143161.721169]       Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:23:49 daisy kernel: [143161.721259] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:23:49 daisy kernel: [143161.721430] hdparm        D ffff88000c778300     0 22246  20845    0 0x00000080
Apr 22 09:23:49 daisy kernel: [143161.721437]  ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:23:49 daisy kernel: [143161.721442]  ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:23:49 daisy kernel: [143161.721447]  ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:23:49 daisy kernel: [143161.721451] Call Trace:
Apr 22 09:23:49 daisy kernel: [143161.721460]  [] ? __switch_to+0x16f/0x470
Apr 22 09:23:49 daisy kernel: [143161.721466]  [] ? finish_task_switch+0xce/0x120
Apr 22 09:23:49 daisy kernel: [143161.721470]  [] ? update_curr+0xe1/0x1f0
Apr 22 09:23:49 daisy kernel: [143161.721475]  [] schedule_timeout+0x215/0x2f0
Apr 22 09:23:49 daisy kernel: [143161.721479]  [] wait_for_completion+0xe4/0x120
Apr 22 09:23:49 daisy kernel: [143161.721483]  [] ? default_wake_function+0x0/0x20
Apr 22 09:23:49 daisy kernel: [143161.721487]  [] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:23:49 daisy kernel: [143161.721493]  [] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:23:49 daisy kernel: [143161.721498]  [] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:23:49 daisy kernel: [143161.721502]  [] __sync_filesystem+0x48/0xa0
Apr 22 09:23:49 daisy kernel: [143161.721506]  [] sync_filesystems+0x30d/0x350
Apr 22 09:23:49 daisy kernel: [143161.721510]  [] sys_sync+0x148/0x1a0
Apr 22 09:23:49 daisy kernel: [143161.721514]  [] system_call_fastpath+0x22/0x3a
Apr 22 09:25:02 daisy ata_id[22445]: HDIO_GET_IDENTITY failed for '/dev/sdb'
Apr 22 09:25:49 daisy kernel: [143281.721066] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:25:49 daisy kernel: [143281.721159]       Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:25:49 daisy kernel: [143281.721244] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:25:49 daisy kernel: [143281.721408] hdparm        D ffff88000c778300     0 22246  20845    0 0x00000080
Apr 22 09:25:49 daisy kernel: [143281.721415]  ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:25:49 daisy kernel: [143281.721420]  ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:25:49 daisy kernel: [143281.721424]  ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:25:49 daisy kernel: [143281.721429] Call Trace:
Apr 22 09:25:49 daisy kernel: [143281.721438]  [] ? __switch_to+0x16f/0x470
Apr 22 09:25:49 daisy kernel: [143281.721444]  [] ? finish_task_switch+0xce/0x120
Apr 22 09:25:49 daisy kernel: [143281.721448]  [] ? update_curr+0xe1/0x1f0
Apr 22 09:25:49 daisy kernel: [143281.721453]  [] schedule_timeout+0x215/0x2f0
Apr 22 09:25:49 daisy kernel: [143281.721457]  [] wait_for_completion+0xe4/0x120
Apr 22 09:25:49 daisy kernel: [143281.721461]  [] ? default_wake_function+0x0/0x20
Apr 22 09:25:49 daisy kernel: [143281.721465]  [] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:25:49 daisy kernel: [143281.721471]  [] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:25:49 daisy kernel: [143281.721476]  [] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:25:49 daisy kernel: [143281.721480]  [] __sync_filesystem+0x48/0xa0
Apr 22 09:25:49 daisy kernel: [143281.721484]  [] sync_filesystems+0x30d/0x350
Apr 22 09:25:49 daisy kernel: [143281.721487]  [] sys_sync+0x148/0x1a0
Apr 22 09:25:49 daisy kernel: [143281.721492]  [] system_call_fastpath+0x22/0x3a
Apr 22 09:27:49 daisy kernel: [143401.721072] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:27:49 daisy kernel: [143401.721165]       Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:27:49 daisy kernel: [143401.721253] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:27:49 daisy kernel: [143401.721421] hdparm        D ffff88000c778300     0 22246  20845    0 0x00000080
Apr 22 09:27:49 daisy kernel: [143401.721427]  ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:27:49 daisy kernel: [143401.721432]  ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:27:49 daisy kernel: [143401.721436]  ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:27:49 daisy kernel: [143401.721441] Call Trace:
Apr 22 09:27:49 daisy kernel: [143401.721450]  [] ? __switch_to+0x16f/0x470
Apr 22 09:27:49 daisy kernel: [143401.721456]  [] ? finish_task_switch+0xce/0x120
Apr 22 09:27:49 daisy kernel: [143401.721460]  [] ? update_curr+0xe1/0x1f0
Apr 22 09:27:49 daisy kernel: [143401.721465]  [] schedule_timeout+0x215/0x2f0
Apr 22 09:27:49 daisy kernel: [143401.721469]  [] wait_for_completion+0xe4/0x120
Apr 22 09:27:49 daisy kernel: [143401.721473]  [] ? default_wake_function+0x0/0x20
Apr 22 09:27:49 daisy kernel: [143401.721477]  [] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:27:49 daisy kernel: [143401.721483]  [] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:27:49 daisy kernel: [143401.721487]  [] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:27:49 daisy kernel: [143401.721492]  [] __sync_filesystem+0x48/0xa0
Apr 22 09:27:49 daisy kernel: [143401.721495]  [] sync_filesystems+0x30d/0x350
Apr 22 09:27:49 daisy kernel: [143401.721499]  [] sys_sync+0x148/0x1a0
Apr 22 09:27:49 daisy kernel: [143401.721503]  [] system_call_fastpath+0x22/0x3a
Apr 22 09:29:49 daisy kernel: [143521.721059] INFO: task hdparm:22246 blocked for more than 120 seconds.
Apr 22 09:29:49 daisy kernel: [143521.721158]       Not tainted 2.6.32-042stab142.1 #1
Apr 22 09:29:49 daisy kernel: [143521.721245] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 22 09:29:49 daisy kernel: [143521.721415] hdparm        D ffff88000c778300     0 22246  20845    0 0x00000084
Apr 22 09:29:49 daisy kernel: [143521.721421]  ffff88006654bcc8 0000000000000086 ffff88006654bc58 ffffffff810098af
Apr 22 09:29:49 daisy kernel: [143521.721426]  ffff880028200000 000000001a42f238 ffff8800110101c0 ffff88011a42f200
Apr 22 09:29:49 daisy kernel: [143521.721431]  ffff88006654bc68 ffffffff8107bbfe ffff8800110101c0 0000000000000000
Apr 22 09:29:49 daisy kernel: [143521.721436] Call Trace:
Apr 22 09:29:49 daisy kernel: [143521.721445]  [] ? __switch_to+0x16f/0x470
Apr 22 09:29:49 daisy kernel: [143521.721451]  [] ? finish_task_switch+0xce/0x120
Apr 22 09:29:49 daisy kernel: [143521.721455]  [] ? update_curr+0xe1/0x1f0
Apr 22 09:29:49 daisy kernel: [143521.721460]  [] schedule_timeout+0x215/0x2f0
Apr 22 09:29:49 daisy kernel: [143521.721465]  [] wait_for_completion+0xe4/0x120
Apr 22 09:29:49 daisy kernel: [143521.721469]  [] ? default_wake_function+0x0/0x20
Apr 22 09:29:49 daisy kernel: [143521.721473]  [] ? _spin_unlock_bh+0x1b/0x20
Apr 22 09:29:49 daisy kernel: [143521.721479]  [] writeback_inodes_sb_nr_ub+0x83/0xb0
Apr 22 09:29:49 daisy kernel: [143521.721483]  [] writeback_inodes_sb_ub+0x46/0x50
Apr 22 09:29:49 daisy kernel: [143521.721487]  [] __sync_filesystem+0x48/0xa0
Apr 22 09:29:49 daisy kernel: [143521.721491]  [] sync_filesystems+0x30d/0x350
Apr 22 09:29:49 daisy kernel: [143521.721495]  [] sys_sync+0x148/0x1a0
Apr 22 09:29:49 daisy kernel: [143521.721499]  [] system_call_fastpath+0x22/0x3a
Apr 22 09:30:04 daisy ata_id[22489]: HDIO_GET_IDENTITY failed for '/dev/sdb'
——————
I tried running hdparm -tT /dev/sda, but after waiting 5+ minutes for any command output I cancelled it. 

I am rsyncing the data from this system to another system now. Clearly something is wrong, but I can’t tell what.
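
A log like the one above is easier to reason about once it is condensed. The following is only a minimal sketch, not something from the original post: a hypothetical shell function, `summarize_disk_errors`, that counts the ata_id identify failures per device and the hung-task reports per task. It assumes syslog-style input in the message formats shown above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): condense syslog-style input
# read from stdin into per-device and per-task counts.
summarize_disk_errors() {
    awk '
        /HDIO_GET_IDENTITY failed for/ {
            dev = $NF
            gsub("[^A-Za-z0-9/_-]", "", dev)   # drop any quoting around the path
            fail[dev]++
        }
        /INFO: task / {
            # the word after "task" is name:pid, e.g. hdparm:22246
            for (i = 1; i < NF; i++)
                if ($i == "task") { hung[$(i + 1)]++; break }
        }
        END {
            for (d in fail) printf "identify-failures %s %d\n", d, fail[d]
            for (t in hung) printf "hung-task %s %d\n", t, hung[t]
        }
    ' | sort
}
```

Assuming the log lives in /var/log/messages as stated above, it could be run as `summarize_disk_errors < /var/log/messages`.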

The system has an older dual-core AMD Opteron 180 processor, 4 GB of RAM, and a RAID controller configured as RAID 5 across 4x 4 TB Western Digital drives.

I rebooted the system the day before yesterday, and that’s when the timeout messages started pouring into the log.

When I run tw_cli /c8 show, all four drives report OK:
[root@daisy cron.daily]# tw_cli /c8 show

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK             -       -       256K    11175.8   Ri     ON

VPort Status         Unit Size      Type  Phy Encl-Slot    Model
------------------------------------------------------------------------------
p0    OK             u0   3.63 TB   SATA  0   -            WDC WD4005FZBX-00K5
p1    OK             u0   3.63 TB   SATA  1   -            WDC WD4005FZBX-00K5
p2    OK             u0   3.63 TB   SATA  2   -            WDC WD4005FZBX-00K5
p3    OK             u0   3.63 TB   SATA  3   -            WDC WD4005FZBX-00K5

Logical Volumes appear active:
[root@daisy cron.daily]# lvscan
  ACTIVE            '/dev/vg_daisy/lv_root' [10.89 TiB] inherit
  ACTIVE            '/dev/vg_daisy/lv_swap' [3.88 GiB] inherit
  ACTIVE            '/dev/vg_daisy/lv_home' [20.00 GiB] inherit
[root@daisy cron.daily]#

[root@daisy cron.daily]# lvmdiskscan
  /dev/ram0             [      16.00 MiB]
  /dev/root             [      10.89 TiB]
  /dev/ram1             [      16.00 MiB]
  /dev/sda1             [       2.82 TiB]
  /dev/vg_daisy/lv_swap [       3.88 GiB]
  /dev/ram2             [      16.00 MiB]
  /dev/vg_daisy/lv_home [      20.00 GiB]
  /dev/ram3             [      16.00 MiB]
  /dev/sda3             [     842.87 GiB]
  /dev/ram4             [      16.00 MiB]
  /dev/ram5             [      16.00 MiB]
  /dev/ram6             [      16.00 MiB]
  /dev/ram7             [      16.00 MiB]
  /dev/ram8             [      16.00 MiB]
  /dev/ram9             [      16.00 MiB]
  /dev/ram10            [      16.00 MiB]
  /dev/ram11            [      16.00 MiB]
  /dev/ram12            [      16.00 MiB]
  /dev/ram13            [      16.00 MiB]
  /dev/ram14            [      16.00 MiB]
  /dev/ram15            [      16.00 MiB]
  /dev/sdb1             [       1.82 TiB] LVM physical volume
  /dev/sdc1             [     500.00 MiB]
  /dev/sdc2             [       4.00 TiB] LVM physical volume
  /dev/sdd1             [       4.00 TiB] LVM physical volume
  /dev/sde1             [       2.91 TiB] LVM physical volume
  3 disks
  19 partitions
  0 LVM physical volume whole disks
  4 LVM physical volumes
[root@daisy cron.daily]#

grub.conf:
[root@daisy grub]# cat grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/mapper/vg_daisy-lv_root
#          initrd /initrd-[generic-]version.img
#boot=/dev/sdb
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title OpenVZ (2.6.32-042stab142.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab142.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab142.1.img
title OpenVZ (2.6.32-042stab141.3)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab141.3 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab141.3.img
title OpenVZ (2.6.32-042stab140.4)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab140.4 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab140.4.img
title OpenVZ (2.6.32-042stab140.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab140.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab140.1.img
title OpenVZ (2.6.32-042stab139.1)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-042stab139.1 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-042stab139.1.img
title CentOS 6 (2.6.32-754.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-754.el6.x86_64 ro root=/dev/mapper/vg_daisy-lv_root rd_NO_LUKS rd_LVM_LV=vg_daisy/lv_swap LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_daisy/lv_root  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
        initrd /initramfs-2.6.32-754.el6.x86_64.img
————-

Top is not showing anything out of the ordinary:
———-
[root@daisy grub]#

top - 09:41:57 up 1 day, 16:04,  3 users,  load average: 5.89, 5.83, 5.43
Tasks: 369 total,   1 running, 368 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.2%us,  1.2%sy,  0.0%ni, 25.0%id, 73.5%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   3894628k total,  3861280k used,    33348k free,    95608k buffers
Swap:  4063228k total,    34888k used,  4028340k free,  3139272k cached

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
   1266 root      20   0     0    0    0 D  1.0  0.0  12:27.75 flush-253:0
  21041 1153      20   0  3188 1840 1012 D  0.7  0.0   0:00.72 imap
  21599 97        20   0  5160 1940 1568 S  0.7  0.0   0:01.06 imap-login
  22636 root      20   0 15272 1524  964 R  0.7  0.0   0:00.06 top
   1977 root      20   0  2096  644  360 S  0.3  0.0   0:27.92 dovecot
  22528 97        20   0  5160 2044 1672 S  0.3  0.1   0:00.35 imap-login
  22578 1155      20   0  2904 1528  940 D  0.3  0.0   0:00.22 imap
      1 root      20   0 19236  268  136 S  0.0  0.0   0:00.68 init
      2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
      3 root      RT   0     0    0    0 S  0.0  0.0   0:00.04 migration/0
      4 root      20   0     0    0    0 S  0.0  0.0   0:01.88 ksoftirqd/0
      5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/0
      6 root      RT   0     0    0    0 S  0.0  0.0   0:00.19 watchdog/0
      7 root      RT   0     0    0    0 S  0.0  0.0   0:00.07 migration/1
      8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/1
      9 root      20   0     0    0    0 S  0.0  0.0   0:03.17 ksoftirqd/1
     10 root      RT   0     0    0    0 S  0.0  0.0   0:00.20 watchdog/1
     11 root      20   0     0    0    0 S  0.0  0.0   0:07.23 events/0
     12 root      20   0     0    0    0 S  0.0  0.0   0:08.55 events/1
     13 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events/0
     14 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events/1
     15 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_long/0
     16 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_long/1
     17 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_power_ef
     18 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_power_ef
     19 root      20   0     0    0    0 S  0.0  0.0   0:00.00 cgroup
     20 root      20   0     0    0    0 S  0.0  0.0   0:00.00 khelper
     21 root      20   0     0    0    0 S  0.0  0.0   0:00.01 netns
     22 root      20   0     0    0    0 S  0.0  0.0   0:00.00 async/mgr
     23 root      20   0     0    0    0 S  0.0  0.0   0:00.00 pm
     24 root      20   0     0    0    0 S  0.0  0.0   0:00.29 sync_supers
————
This is a company production mail server, and I can’t find the solution. I need help as soon as someone is able. Thank you!
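
The top header above reports 73.5%wa, and several tasks (flush-253:0 and two imap processes) sit in state D, which is uninterruptible sleep and usually means waiting on I/O. As a rough sketch only, with a hypothetical function name, D-state tasks can be pulled out of a process listing like this:

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read "STATE PID COMMAND" lines
# on stdin and print the PID and command of every task in state D.
filter_d_state() {
    awk '$1 ~ /^D/ { print $2, $3 }'
}

# Live use with procps ps; the trailing "=" suppresses each column header:
#   ps -e -o state= -o pid= -o comm= | filter_d_state
```

A task that stays in this list across several runs is a reasonable first place to look when the load average is high but CPU usage is low.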

7 thoughts on - Slow Performance On Company Production Server I Need Help

  • Hi,

    You didn’t tell us much about your hardware. It seems you’re running a 3ware controller in it.

    What exactly is /dev/sdb? I don’t know 3ware, so I can only guess, but it looks to me like at least one of your drives is having problems.

    Any chance you can check the SMART status on the disks or have them run a SMART test? I guess your 3ware controller has to initiate this.

    Regards, Simon

  • And, what is the status of the BBU?

    Looks like “tw_cli /cx/bbu show status” should tell.

  • Correct, 3ware 9650SE SATA-II RAID PCIe

    [root@daisy dev]# lspci
    00:00.0 Memory controller: NVIDIA Corporation CK804 Memory Controller (rev a3)
    00:01.0 ISA bridge: NVIDIA Corporation CK804 ISA Bridge (rev a3)
    00:01.1 SMBus: NVIDIA Corporation CK804 SMBus (rev a2)
    00:02.0 USB controller: NVIDIA Corporation CK804 USB Controller (rev a2)
    00:02.1 USB controller: NVIDIA Corporation CK804 USB Controller (rev a3)
    00:06.0 IDE interface: NVIDIA Corporation CK804 IDE (rev f2)
    00:07.0 IDE interface: NVIDIA Corporation CK804 Serial ATA Controller (rev f3)
    00:08.0 IDE interface: NVIDIA Corporation CK804 Serial ATA Controller (rev f3)
    00:09.0 PCI bridge: NVIDIA Corporation CK804 PCI Bridge (rev a2)
    00:0a.0 Bridge: NVIDIA Corporation CK804 Ethernet Controller (rev a3)
    00:0b.0 PCI bridge: NVIDIA Corporation CK804 PCIE Bridge (rev a3)
    00:0c.0 PCI bridge: NVIDIA Corporation CK804 PCIE Bridge (rev a3)
    00:0d.0 PCI bridge: NVIDIA Corporation CK804 PCIE Bridge (rev a3)
    00:0e.0 PCI bridge: NVIDIA Corporation CK804 PCIE Bridge (rev a3)
    00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
    00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] Address Map
    00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] DRAM Controller
    00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
    01:05.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Rage 3 [Rage XL PCI] (rev 27)
    04:00.0 Ethernet controller: Broadcom Limited NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 11)
    05:00.0 RAID bus controller: 3ware Inc 9650SE SATA-II RAID PCIe (rev 01)

    There is no BBU physically present:

    //daisy/c8> /c8/bbu show
    Error: (CLI:059) Battery Backup Unit is not present.

    //daisy/c8> show

    Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
    ------------------------------------------------------------------------------
    u0    RAID-5    OK             -       -       256K    11175.8   Ri     ON

    VPort Status         Unit Size      Type  Phy Encl-Slot    Model
    ------------------------------------------------------------------------------
    p0    OK             u0   3.63 TB   SATA  0   -            WDC WD4005FZBX-00K5
    p1    OK             u0   3.63 TB   SATA  1   -            WDC WD4005FZBX-00K5
    p2    OK             u0   3.63 TB   SATA  2   -            WDC WD4005FZBX-00K5
    p3    OK             u0   3.63 TB   SATA  3   -            WDC WD4005FZBX-00K5

    //daisy/c8>

    I had an 8 TB external USB disk plugged into the system, which I had been using as additional space for backups. I was under the impression that sda, sdb, sdc, and sdd were the four disks on the RAID controller card, but after unplugging the USB drive, running hdparm gives me this:

    [root@daisy dev]# hdparm -tT /dev/sdb

    /dev/sdb:
    read() hit EOF - device too small
     BLKGETSIZE failed: Inappropriate ioctl for device
     BLKFLSBUF failed: Inappropriate ioctl for device
    [root@daisy dev]#

    ——————————————————

    The 3ware controller shows all four disks still online with a status of OK:

    [root@daisy dev]# tw_cli /c8 show

    Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
    ------------------------------------------------------------------------------
    u0    RAID-5    OK             -       -       256K    11175.8   Ri     ON

    VPort Status         Unit Size      Type  Phy Encl-Slot    Model
    ------------------------------------------------------------------------------
    p0    OK             u0   3.63 TB   SATA  0   -            WDC WD4005FZBX-00K5
    p1    OK             u0   3.63 TB   SATA  1   -            WDC WD4005FZBX-00K5
    p2    OK             u0   3.63 TB   SATA  2   -            WDC WD4005FZBX-00K5
    p3    OK             u0   3.63 TB   SATA  3   -            WDC WD4005FZBX-00K5

    [root@daisy dev]#
    ———————————–

    //daisy> /c8/u0
    Error: (CLI:039) Invalid unit command syntax.

    Unit Info: 
    //daisy> /c8/u0 show

    Unit     UnitType  Status         %RCmpl  %V/I/M  Port  Stripe  Size(GB)
    ------------------------------------------------------------------------
    u0       RAID-5    OK             -       -       -     256K    11175.8
    u0-0     DISK      OK             -       -       p0    -       1677.28
    u0-1     DISK      OK             -       -       p1    -       1677.28
    u0-2     DISK      OK             -       -       p2    -       1677.28
    u0-3     DISK      OK             -       -       p3    -       1677.28
    u0/v0    Volume    -              -       -       -     -       4096
    u0/v1    Volume    -              -       -       -     -       4096
    u0/v2    Volume    -              -       -       -     -       2983.84

    I’m still trying to find the exact smartctl command syntax to make it work.

    Does this give you what you’re looking for?

  • Not exactly. If you have a RAID5 array, then you have one volume spread across the physical disks. You can then divide that into smaller virtual disks, each of which is still spread across disks. The OS doesn’t see the four component disks directly.

    sdb was the disk that appeared in the kernel errors, so I’d imagine that you’ve fixed the problem by removing the USB drive enclosure.

    https://www.cyberciti.biz/faq/unix-linux-freebsd-3w-9xxx-smartctl-check-hard-disk-command/

  • Smartctl tests have passed: 

    [root@daisy dev]# smartctl -H -d 3ware,0 /dev/twa0
    smartctl 5.43 2016-09-28 r4347 [x86_64-linux-2.6.32-042stab142.1] (local build)
    Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    ——————-
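
The command above queries only port 0 of the controller. A minimal sketch for repeating the same health check on every port follows; the function name and the `SMARTCTL` override variable are hypothetical, while the `-H -d 3ware,N /dev/twa0` invocation is the one shown above and the port numbers 0 to 3 come from the tw_cli listing.

```shell
#!/bin/sh
# Hypothetical wrapper (not from the thread): run the SMART health check
# against each listed port of a 3ware controller device.
# SMARTCTL may be pointed at another command for a dry run.
SMARTCTL=${SMARTCTL:-smartctl}

check_3ware_ports() {
    dev=$1
    shift
    for port in "$@"; do
        "$SMARTCTL" -H -d "3ware,$port" "$dev"
    done
}

# Example for the four drives in this thread:
#   check_3ware_ports /dev/twa0 0 1 2 3
```

Passing the ports explicitly keeps the sketch free of guesses about how many drives are attached.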

    The problem may have been the USB disk the whole time, and not one of the internal disks. It’s surprising that the performance of the entire system would suffer that much from a faulting external USB drive.

    To monitor the system and confirm that this was the root cause, what performance tests other than hdparm should I run to check the system as a whole? What are everyone’s top-rated performance monitoring commands / apps that can be dumped into cron jobs, logwatch, etc.?

    Chris
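
One lightweight option for a cron job, offered as a sketch rather than as a recommendation made in the thread, is to sample the aggregate `cpu` line of /proc/stat twice and compute the share of time spent in iowait, the figure top reported as 73.5%wa. The function names and the 5-second default interval are assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): given two "cpu ..." lines from
# /proc/stat, print the integer percentage of elapsed time spent in iowait.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            n = (NF < 9) ? NF : 9        # fields 2..9: user through steal
            total = 0
            for (i = 2; i <= n; i++) total += $i
            t[NR] = total
            w[NR] = $6                   # field 6 is iowait
        }
        END {
            dt = t[2] - t[1]
            if (dt <= 0) { print 0; exit }
            printf "%d\n", (w[2] - w[1]) * 100 / dt
        }'
}

# Take two samples N seconds apart (default 5) and print the percentage.
sample_iowait() {
    a=$(head -n 1 /proc/stat)
    sleep "${1:-5}"
    b=$(head -n 1 /proc/stat)
    iowait_pct "$a" "$b"
}
```

A cron entry could call `sample_iowait` and log or mail the result when it exceeds whatever threshold suits the workload.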

  • This reads: you have 4 physical drives configured as one RAID5 array. The array has three volumes (v0, v1, v2). These three volumes will be visible as three SCSI devices in Linux.

    You can use for example lsblk or lsscsi to list them.
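
To tell which sdX names belong to a USB enclosure rather than to the controller's volumes, the resolved sysfs path of each block device can also be inspected. This is a minimal sketch with hypothetical function names; it assumes the usual Linux sysfs layout, where a USB-attached disk resolves to a path containing a `usbN` component.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread).
# Guess the transport from a resolved sysfs device path.
classify_sysfs_path() {
    case $1 in
        */usb[0-9]*/*) echo usb ;;
        *)             echo other ;;
    esac
}

# Print "name transport" for every sd* disk on this machine.
list_disk_transports() {
    for d in /sys/block/sd*; do
        [ -e "$d" ] || continue
        printf '%s %s\n' "${d##*/}" "$(classify_sysfs_path "$(readlink -f "$d")")"
    done
}
```

Running `list_disk_transports` before and after unplugging an external drive shows how the sdX names map to devices.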

    Also, you may want to look physically for a battery backup unit (BBU). The software said there isn’t one, and that certainly could be a performance problem (as I would assume the controller turns off write-back caching).

    /Peter

  • Christopher,

    You are running your RAID controller without a BBU and without write caching (see the Ri in the Cache column).

    I get

    tw_cli /c0 show

    Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
    ------------------------------------------------------------------------------
    u0    RAID-6    OK             -       -       256K    16763.7   RiW    ON

    VPort Status         Unit Size      Type  Phy Encl-Slot    Model
    ------------------------------------------------------------------------------
    p0    OK             u0   2.73 TB   SATA  0   -            WDC WD30EURS-63SPKY0
    p1    OK             u0   2.73 TB   SATA  1   -            WDC WD30EURS-63R8UY0
    p2    OK             u0   2.73 TB   SATA  2   -            WDC WD30EURS-63SPKY0
    p3    OK             u0   2.73 TB   SATA  3   -            WDC WD30EURS-63SPKY0
    p4    OK             u0   2.73 TB   SATA  4   -            WDC WD30EURS-63SPKY0
    p5    OK             u0   2.73 TB   SATA  5   -            WDC WD30EURS-63SPKY0
    p6    OK             u0   2.73 TB   SATA  6   -            WDC WD30EURS-63SPKY0
    p7    OK             u0   2.73 TB   SATA  7   -            WDC WD30EURS-63SPKY0

    Name  OnlineState  BBUReady  Status    Volt     Temp     Hours  LastCapTest
    ---------------------------------------------------------------------------
    bbu   On           Yes       OK        OK       OK       197    31-Dec-2019

    with the same type of controller. When I did the initial installation, I found that write caching makes a VERY big difference in my mail server application. However, you might have problems finding a matching BBU these days if your controller is not very recent. And depending on where you are located and how good your power supply is, you don’t want to activate write caching without a BBU.
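
The Cache column compared above (Ri on one system, RiW on the other) can be extracted from `tw_cli /cX show` output with a short filter. This is only a sketch with hypothetical function names; it assumes the column layout shown in this thread, where Cache is the eighth field of a unit row, and it follows the reading given above that a W in that field means write caching is enabled.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread); both read tw_cli output on stdin.
# Print "unit cache-setting" for each unit row (u0, u1, ...).
unit_cache_settings() {
    awk '$1 ~ /^u[0-9]+$/ { print $1, $8 }'
}

# Print the units whose Cache field contains no W (write cache assumed off).
write_cache_disabled_units() {
    unit_cache_settings | awk '$2 !~ /W/ { print $1 }'
}

# Example: tw_cli /c8 show | write_cache_disabled_units
```

On the output posted earlier in the thread this would list u0, while on the RAID-6 system shown above it would print nothing.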

    cheers,

    Michael Schumacher