[HN Gopher] Linux 6.13 will report the number of hung tasks sinc...
___________________________________________________________________
Linux 6.13 will report the number of hung tasks since boot
Author : LinuxBender
Score : 100 points
Date : 2024-11-24 16:20 UTC (6 hours ago)
(HTM) web link (www.phoronix.com)
(TXT) w3m dump (www.phoronix.com)
| gcr wrote:
| What counts as a hung task? Blocking on unsatisfiable I/O for
| more than X seconds? Scheduler hasn't gotten to it in X seconds?
|
| If a server process is blocking on accept(), wouldn't it count as
| hung until a remote client connects? or do only certain
| operations count?
| westurner wrote:
| torvalds/linux//kernel/hung_task.c :
|
| static void check_hung_task(struct task_struct *t, unsigned
| long timeout)
| https://github.com/torvalds/linux/blob/9f16d5e6f220661f73b36...
|
| static void check_hung_uninterruptible_tasks(unsigned long
| timeout)
| https://github.com/torvalds/linux/blob/9f16d5e6f220661f73b36...
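For a rough userspace view of which tasks the detector could flag, the sketch below lists processes currently in the `D` (uninterruptible sleep) state by reading `/proc/<pid>/stat`. It is only an approximation. As far as hung_task.c reads, the kernel additionally requires that the task has gone the whole timeout without a context switch, and it skips killable and idle waits; none of that is checked here. A process blocked in accept() sleeps interruptibly (`S`), so it would not show up. The helper names `parse_stat` and `uninterruptible_tasks` are made up for illustration.

```python
import glob

def parse_stat(text):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    head, rest = text.split("(", 1)
    comm, tail = rest.rsplit(")", 1)
    return int(head), comm, tail.split()[0]

def uninterruptible_tasks(proc_root="/proc"):
    """Return (pid, comm) for every process whose state is currently 'D'.

    Only thread-group leaders are scanned; per-thread entries under
    /proc/<pid>/task/ are ignored to keep the sketch short.
    """
    found = []
    for path in glob.glob(f"{proc_root}/[0-9]*/stat"):
        try:
            with open(path) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited (or line was malformed) mid-scan
        if state == "D":
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

A single snapshot says nothing about how long a task has been in `D`; short stays in that state are normal during disk I/O.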
| striking wrote:
| Just to double check my understanding (because being wrong on
| the internet is perhaps the fastest way to get people to
| check your work):
|
| Is this saying that regular tasks that haven't been scheduled
| for two minutes and tasks that are uninterruptible (truly so,
| not idle or also killable despite being marked as
| uninterruptible) that haven't been woken up for two minutes
| are counted?
| westurner wrote:
| Your and the Llama's explanations would make good comments
| for the source and/or the docs if true.
| ape4 wrote:
| And there's https://en.wikipedia.org/wiki/Zombie_process too
| Polizeiposaune wrote:
| Not the same thing by any means: zombies don't indicate that
| something is wrong with the kernel or hardware.
|
| The zombie process state is a normal transient state for all
| exiting processes where the only remaining function of the
| process is as a container for the exiting process's id and exit
| status; they go away once the parent process calls some flavor
| of the "wait" system call to collect the exit status. A pileup
| of zombies indicates a userspace bug: a negligent parent
| process that isn't collecting the exit status in a timely
| manner.
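The zombie lifecycle described above can be observed directly. This is a minimal Linux-only sketch: the child exits at once, the parent deliberately delays its wait call, and `/proc/<pid>/stat` reports state `Z` in between. The function names and the exit code 7 are arbitrary choices for the example.

```python
import os
import time

def read_state(pid):
    """Return the one-letter state field from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state letter is the first field after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]

def make_and_reap_zombie():
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit immediately with a recognisable status

    # Parent: do not wait yet. Poll until the exited child shows up as 'Z'.
    deadline = time.monotonic() + 5
    state = read_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = read_state(pid)

    # Reaping collects the exit status and releases the process table entry.
    _, status = os.waitpid(pid, 0)
    still_listed = os.path.exists(f"/proc/{pid}")
    return state, os.WEXITSTATUS(status), still_listed

if __name__ == "__main__":
    print(make_and_reap_zombie())
```

A parent that never reaches the `os.waitpid` call is the "negligent parent" case: the `Z` entry stays until the parent waits or exits.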
| thwarted wrote:
| Additionally, there are a few more process accounting things,
| rusage, that zombie processes hold until reaped. See
| wait3(2), wait4(2) and getrusage(2).
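As a small illustration of the accounting data that reaping hands back, the sketch below uses Python's `os.wait4`, which wraps wait4(2) and returns the child's resource usage alongside its exit status. The busy loop in the child is just filler work so there is some CPU time to report.

```python
import os

def child_rusage():
    pid = os.fork()
    if pid == 0:
        # Child: burn a little CPU so the usage figures are not all zero.
        sum(i * i for i in range(200_000))
        os._exit(0)

    # Parent: wait4 reaps the child and returns its rusage in the same call.
    reaped_pid, status, usage = os.wait4(pid, 0)
    return reaped_pid == pid, os.WEXITSTATUS(status), usage

if __name__ == "__main__":
    ok, code, usage = child_rusage()
    print("user CPU seconds:", usage.ru_utime)
    print("system CPU seconds:", usage.ru_stime)
    print("max RSS:", usage.ru_maxrss)
```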
| bhaney wrote:
| My dmesg is already constantly full of
|
|     INFO: task btrfs:103945 blocked for more than 120 seconds.
|     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
|     this message.
|
| Until eventually
|
|     Future hung task reports are suppressed, see sysctl
|     kernel.hung_task_warnings
|
| So I'm looking forward to getting an actual count of how often
| this happens without needing to babysit the warning suppressions
| and count the incidents myself.
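A sketch of how that count could be read once it is available. The two existing knobs, `hung_task_timeout_secs` and `hung_task_warnings`, are the ones named in the dmesg text above. The name `hung_task_detect_count` for the new counter is an assumption not stated in this thread, so check it against your kernel's sysctl documentation. Files that are missing come back as `None`, which is what an older kernel would give for the counter.

```python
from pathlib import Path

# The last name is an assumed sysctl for the since-boot counter (see lead-in).
NAMES = (
    "hung_task_timeout_secs",
    "hung_task_warnings",
    "hung_task_detect_count",
)

def read_hung_task_sysctls(base="/proc/sys/kernel"):
    """Return {name: int or None} for the hung-task sysctls under `base`."""
    values = {}
    for name in NAMES:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None  # file absent, unreadable, or not an integer
    return values

if __name__ == "__main__":
    for name, value in read_hung_task_sysctls().items():
        print(f"kernel.{name} = {value}")
```

The `base` parameter exists only so the function can be pointed at a scratch directory for testing.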
| jeffbee wrote:
| You could leave this problem behind by switching to a
| filesystem that isn't full of deadlock bugs.
| yjftsjthsd-h wrote:
| I am curious - _is_ this message indicative of a problem in
| the fs? I would have assumed anything marked "INFO" is,
| tautologically, not an error, but surely a filesystem
| shouldn't be locking up? Or is it just suggestive of high
| system load or poor hardware performance?
| blueflow wrote:
| The in-kernel btrfs code should never lock up at all. There is
| a rumor going around that btrfs never reached maturity and
| suffers from design issues.
| ramon156 wrote:
| Given the mailing list history with Linus, I wouldn't be
| surprised.
| SoftTalker wrote:
| That's why I use ext4 exclusively on Linux. Never once
| had a filesystem issue.
| shric wrote:
| It could be any of the above. I'd say it's INFO because the
| kernel itself is not in an error state; it's information
| about a process doing something unusual.
| bhaney wrote:
| I was planning on it but the filesystem I wanted to switch to
| keeps getting set back by the author's CoC drama
___________________________________________________________________
(page generated 2024-11-24 23:01 UTC)