[HN Gopher] Linux 6.13 will report the number of hung tasks sinc...
       ___________________________________________________________________
        
       Linux 6.13 will report the number of hung tasks since boot
        
       Author : LinuxBender
       Score  : 100 points
       Date   : 2024-11-24 16:20 UTC (6 hours ago)
        
 (HTM) web link (www.phoronix.com)
        
       | gcr wrote:
       | What counts as a hung task? Blocking on unsatisfiable I/O for
       | more than X seconds? Scheduler hasn't gotten to it in X seconds?
       | 
       | If a server process is blocking on accept(), wouldn't it count as
       | hung until a remote client connects? Or do only certain
       | operations count?
        
         | westurner wrote:
         | torvalds/linux//kernel/hung_task.c :
         | 
         | static void check_hung_task(struct task_struct *t, unsigned
         | long timeout)
         | https://github.com/torvalds/linux/blob/9f16d5e6f220661f73b36...
         | 
         | static void check_hung_uninterruptible_tasks(unsigned long
         | timeout)
         | https://github.com/torvalds/linux/blob/9f16d5e6f220661f73b36...
        
           | striking wrote:
           | Just to double check my understanding (because being wrong on
           | the internet is perhaps the fastest way to get people to
           | check your work):
           | 
           | Is this saying that regular tasks that haven't been scheduled
           | for two minutes and tasks that are uninterruptible (truly so,
           | not idle or also killable despite being marked as
           | uninterruptible) that haven't been woken up for two minutes
           | are counted?
        
             | westurner wrote:
             | Your and the Llama's explanations would make good comments
             | for the source and/or the docs if true.
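
The two functions linked above can be modeled in a few lines. This is a simplified sketch of how kernel/hung_task.c appears to decide what counts, not the kernel code itself: the `Task` class, the `is_candidate` helper, and the float timestamps are hypothetical stand-ins, the flag values are illustrative, and checks such as the one for frozen tasks are left out. As far as the source reads, only tasks in a plain uninterruptible sleep are examined (not killable or idle ones), and one is reported when its context-switch count has not changed for the whole timeout. A server blocked in accept() sleeps interruptibly, so it would never qualify.

```python
from dataclasses import dataclass

# Illustrative state bits named after the kernel's task-state flags.
# Only their distinctness matters for this sketch.
TASK_INTERRUPTIBLE = 0x0001
TASK_UNINTERRUPTIBLE = 0x0002
TASK_WAKEKILL = 0x0100  # set for killable sleeps
TASK_NOLOAD = 0x0400    # set for idle sleeps


@dataclass
class Task:
    """Hypothetical stand-in for the fields the detector looks at."""
    state: int
    switch_count: int            # voluntary + involuntary context switches
    last_switch_count: int = 0   # value recorded at the previous scan
    last_switch_time: float = 0.0


def is_candidate(task: Task) -> bool:
    """Only plain uninterruptible sleeps are examined at all."""
    return (bool(task.state & TASK_UNINTERRUPTIBLE)
            and not task.state & TASK_WAKEKILL
            and not task.state & TASK_NOLOAD)


def check_hung_task(task: Task, now: float, timeout: float = 120.0) -> bool:
    """Return True if this scan would report the task as hung."""
    if not is_candidate(task):
        return False
    if task.switch_count == 0:
        # Never switched out yet: too new to judge.
        return False
    if task.switch_count != task.last_switch_count:
        # The task ran since the last scan, so restart its clock.
        task.last_switch_count = task.switch_count
        task.last_switch_time = now
        return False
    return now - task.last_switch_time >= timeout
```

Calling `check_hung_task` repeatedly with an increasing `now` mimics the periodic scan: the first call records the switch count, and a later call reports the task only if that count is still the same once the timeout has elapsed.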
        
       | ape4 wrote:
       | And there's https://en.wikipedia.org/wiki/Zombie_process too
        
         | Polizeiposaune wrote:
         | Not the same thing by any means: zombies don't indicate that
         | something is wrong with the kernel or hardware.
         | 
         | The zombie process state is a normal transient state for all
         | exiting processes where the only remaining function of the
         | process is as a container for the exiting process's id and exit
         | status; they go away once the parent process calls some flavor
         | of the "wait" system call to collect the exit status. A pileup
         | of zombies indicates a userspace bug: a negligent parent
         | process that isn't collecting the exit status in a timely
         | manner.
        
           | thwarted wrote:
           | Additionally, there are a few more process accounting things,
           | rusage, that zombie processes hold until reaped. See
           | wait3(2), wait4(2) and getrusage(2).
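
The zombie lifecycle described above is easy to observe on Linux. The sketch below forks a child that exits immediately, waits until /proc shows the child in state "Z", and only then reaps it with wait4, which returns the exit status together with the rusage record the zombie was holding. The helper names `proc_state` and `demo_zombie` are made up for this example, and it assumes a Linux /proc filesystem and default SIGCHLD handling.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # Format is "pid (comm) state ..."; comm may contain ')',
    # so split on the last one.
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie(exit_code=7, timeout=5.0):
    """Fork a child, observe it as a zombie, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit at once, skipping cleanup handlers

    # Parent: deliberately do not wait yet, so the child lingers as a zombie.
    deadline = time.monotonic() + timeout
    state = proc_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)

    # Reaping hands over the exit status and resource usage, and frees the pid.
    _, status, rusage = os.wait4(pid, 0)
    return state, os.WEXITSTATUS(status), rusage, proc_state(pid)
```

The last element of the returned tuple is the state after reaping; it is `None` because the process table entry is gone once the parent has collected the status.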
        
       | bhaney wrote:
       | My dmesg is already constantly full of
       | 
       |     INFO: task btrfs:103945 blocked for more than 120 seconds.
       |     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
       | 
       | Until eventually
       | 
       |     Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
       | 
       | So I'm looking forward to getting an actual count of how often
       | this happens without needing to babysit the warning suppressions
       | and count the incidents myself.
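
Both ways of counting can be sketched briefly. The first function tallies reports from log lines using the message format quoted above. The second reads an integer sysctl file. The file name `hung_task_detect_count` is an assumption about what the 6.13 counter is called (the thread does not name it), so the function returns `None` when the file is missing. Function names here are hypothetical.

```python
import re
from collections import Counter
from pathlib import Path

# Matches the report quoted above. The task name may itself contain ':'
# (for example kworker names), so the greedy group runs up to the last
# ":<pid> blocked".
HUNG_RE = re.compile(
    r"INFO: task (.+):(\d+) blocked for more than (\d+) seconds"
)


def count_hung_reports(lines):
    """Count hung-task reports per task name in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = HUNG_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def read_sysctl_int(name, base="/proc/sys/kernel"):
    """Read an integer sysctl file; return None if absent or unreadable."""
    try:
        return int(Path(base, name).read_text().strip())
    except (FileNotFoundError, PermissionError, ValueError):
        return None


if __name__ == "__main__":
    # Assumed name for the new counter; the other two appear in the messages above.
    for name in ("hung_task_detect_count",
                 "hung_task_timeout_secs",
                 "hung_task_warnings"):
        print(name, "=", read_sysctl_int(name))
```

Log-based counts stop growing once the kernel suppresses further reports, which is the gap a kernel-side counter would close.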
        
         | jeffbee wrote:
         | You could leave this problem behind by switching to a
         | filesystem that isn't full of deadlock bugs.
        
           | yjftsjthsd-h wrote:
           | I am curious - _is_ this message indicative of a problem in
           | the fs? I would have assumed anything marked "INFO" is,
           | tautologically, not an error, but surely a filesystem
           | shouldn't be locking up? Or is it just suggestive of high
           | system load or poor hardware performance?
        
             | blueflow wrote:
             | In-kernel btrfs code should never lock up at all. There is a
             | rumor going around that btrfs never reached maturity and
             | suffers from design issues.
        
               | ramon156 wrote:
               | Given the mailing list history with Linus, I wouldn't be
               | surprised.
        
               | SoftTalker wrote:
               | That's why I use ext4 exclusively on Linux. Never once
               | had a filesystem issue.
        
             | shric wrote:
             | It could be any of the above. I'd say it's INFO because the
             | kernel itself is not in an error state; it's information
             | about a process doing something unusual.
        
           | bhaney wrote:
           | I was planning on it but the filesystem I wanted to switch to
           | keeps getting set back by the author's CoC drama
        
       ___________________________________________________________________
       (page generated 2024-11-24 23:01 UTC)