[HN Gopher] Uptime related server crashes (2011)
___________________________________________________________________
Uptime related server crashes (2011)
Author : luu
Score : 31 points
Date : 2023-12-13 20:42 UTC (2 days ago)
(HTM) web link (barry.blog)
(TXT) w3m dump (barry.blog)
| forbiddenlake wrote:
| (2011)
| macNchz wrote:
| Rather relevant in this case...I was like "ok interesting
| debugging but have you considered just running a kernel that
| was released in the last decade??" until I got to the date at
| the bottom.
| readingnews wrote:
| Right. When I saw "Debian kernels ranging from 2.6.32-21 to
| 2.6.32-24..." I was like, uhhhh.
|
| Someone should change the title of the HN post to say 2011.
| withzombies wrote:
| > _sds.total_pwr is the sum of the power of all CPUs in the
| scheduling domain. This sum ends up being zero and that's what's
| causing the crash - division by zero._
|
| > _The "CPU power" is used to take into account how much
| calculating capabilities a CPU has compared to the other CPUs and
| the main factors for calculating it are:_
|
| > _1. Whether the CPU is shared, for example by using
| multithreading._
|
| > _2. How many real-time tasks the CPU is processing._
|
| > _3. In newer kernels, how much time the CPU had spent
| processing IRQs._
|
| > _The current suggested fix for this bug is relying on the
| theory that while taking into account the real-time tasks (#2
| above), scale_rt_power() could return negative value, and thus
| the sum of all CPU powers may end up being zero._
|
| The author doesn't really describe how this panic is related to
| uptime. Do long-running kernels accumulate a lot of real-time
| tasks, or is it a leak of some kind?
|
| The suggested fix link doesn't provide any extra context as to
| why it's uptime-related either.
| theandrewbailey wrote:
| The post doesn't explain why scale_rt_power() isn't in the code
| snippet, or how it factors in.
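To make the quoted theory concrete, here is a minimal userspace sketch in C. It is not the kernel's code: `POWER_SCALE`, `scaled_power()`, `scaled_power_clamped()`, `total_power()` and `avg_load()` are hypothetical names invented for illustration, and the arithmetic is deliberately simplified. It only shows how one negative per-CPU power can cancel another CPU's power so that the sum used as a divisor becomes zero. It does not explain why this would correlate with uptime.

```c
#include <stddef.h>

/* Assumed nominal power of one fully available CPU (illustrative value). */
#define POWER_SCALE 1024L

/* Hypothetical stand-in for a real-time scaling step: the share of the
 * period left after real-time work, in units of POWER_SCALE. If rt_time
 * exceeds period, the result is negative. */
static long scaled_power(long period, long rt_time)
{
    return (period - rt_time) * POWER_SCALE / period;
}

/* Same calculation, but the available time is clamped at zero so the
 * result can never be negative. */
static long scaled_power_clamped(long period, long rt_time)
{
    long available = period - rt_time;
    if (available < 0)
        available = 0;
    return available * POWER_SCALE / period;
}

/* Sum of the per-CPU powers in one (illustrative) scheduling domain. */
static long total_power(const long *power, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += power[i];
    return sum;
}

/* Average load relative to total power. Returns -1 instead of dividing
 * when total_pwr is zero; without this check the division would trap. */
static long avg_load(long total_load, long total_pwr)
{
    if (total_pwr == 0)
        return -1;
    return total_load * POWER_SCALE / total_pwr;
}
```

With a period of 1000 and real-time time of 0 on one CPU and 2000 on the other, `scaled_power()` yields 1024 and -1024, so `total_power()` is 0 and an unguarded division would fault. With `scaled_power_clamped()` the second CPU contributes 0, the sum is 1024, and the division is safe. Clamping alone would still leave a zero sum if every CPU were clamped to 0, which is why the sketch also checks the divisor.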
| krallja wrote:
| Uptime-related crashes seem fairly common. Here's one of my
| stories, from Thanksgiving 2012:
| https://jacob.jkrall.net/turkey-day-down-time
| I've seen a couple of others since, but they had the same general
| shape, so I didn't bother writing the same story again.
| keep_reading wrote:
| I know of a FreeBSD 6.0 server with 13+ years uptime, what the
| heck was Linux doing back then????
___________________________________________________________________
(page generated 2023-12-15 23:02 UTC)