One of the most viewed posts in 2022… I hope it helped resolve some Linux performance issues…
This might be old school, and maybe even boring to read. But if you care about performance on Linux servers, at some point you will have to look at the kernel messages.
The problem:
When we run very stressful jobs on large servers (many CPUs and a lot of RAM) with very high IO activity, it is pretty common to start seeing messages like these in the ‘dmesg’ kernel output:
[24169.372862] INFO: task kswapd1:1140 blocked for more than 120 seconds.
[24169.375623] Tainted: G E 4.9.51-10.52.amzn1.x86_64 #1
[24169.378445] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[24169.382533] kswapd1 D 0 1140 2 0x00000000
[24169.385066] ffff8811605c5a00 0000000000000000 ffff8823645dc900 ffff882362844900
[24169.389208] ffff882371357c00 ffffc9001a13ba08 ffffffff8153896c ffffc9001a13ba18
[24169.393329] ffff881163ac92d8 ffff88115e87f400 ffff882362844900 ffff88115e87f46c
[24169.445313] Call Trace:
[24169.446981] [<ffffffff8153896c>] ? __schedule+0x23c/0x680
[24169.449454] [<ffffffff81538de6>] schedule+0x36/0x80
[24169.451790] [<ffffffff8153907e>] schedule_preempt_disabled+0xe/0x10
[24169.454509] [<ffffffff8153a8d5>] __mutex_lock_slowpath+0x95/0x110
[24169.457156] [<ffffffff8153a967>] mutex_lock+0x17/0x27
[24169.459546] [<ffffffffa0966d00>] xfs_reclaim_inodes_ag+0x2b0/0x330 [xfs]
[24169.462407] [<ffffffffa0967be3>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[24169.465203] [<ffffffffa0978f59>] xfs_fs_free_cached_objects+0x19/0x20 [xfs]
...
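On a busy server these reports are easy to miss among other kernel messages, so it can help to pull them out programmatically. The following is a minimal sketch, not part of the original post: the function name `find_hung_tasks` and the regular expression are my own, and they assume the header line looks exactly like the "INFO: task ... blocked for more than ... seconds" line in the trace above.

```python
import re

# Matches the first line of a hung-task report, for example:
# "[24169.372862] INFO: task kswapd1:1140 blocked for more than 120 seconds."
# The pattern is an assumption based on the message format shown in this post.
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<name>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(dmesg_text):
    """Return one dict per hung-task header line found in dmesg_text."""
    reports = []
    for match in HUNG_TASK_RE.finditer(dmesg_text):
        reports.append(
            {
                "timestamp": float(match.group("ts")),   # seconds since boot
                "task": match.group("name"),             # kernel task name
                "pid": int(match.group("pid")),
                "blocked_secs": int(match.group("secs")),
            }
        )
    return reports
```

To use it, pass in the text of the kernel log, for instance the captured output of running `dmesg` (which may require root on some systems). Each returned entry tells you which task was stuck and when, which is a starting point for matching the report against the call trace that follows it.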