[HN Gopher] setBigTimeout
___________________________________________________________________
setBigTimeout
Author : cfj
Score : 81 points
Date : 2024-10-17 18:01 UTC (1 days ago)
(HTM) web link (evanhahn.com)
(TXT) w3m dump (evanhahn.com)
| miiiiiike wrote:
| Got hit with this one a few months ago.
| graypegg wrote:
| Just out of curiosity, what was the use case for a really long
| timeout? Feels like most if not all long timeouts would be best
| served with some sort of "job" you could persist, rather than
| leaving it in the event queue.
| cout wrote:
| https://thedailywtf.com/articles/The_Harbinger_of_the_Epoch_
| graypegg wrote:
| To be fair, this will be fixed by browsers when it's within
| spitting distance of the scale of numbers setTimeout is
| normally used with. (not huge numbers) Like, if it's close
| enough that setTimeout(() => {}, 5000) will stop working a
| month later, that would be a major failure on the browser
| vendor's part. Much too close for comfort.
|
| But I totally understand it not being a priority if the
| situation is: setTimeout(() => {}, 500000000) not working
| in X years.
| BillyTheKing wrote:
| this is the thing with JS and TS - the types and stuff, it's
| all good until you realise that all integers are basically int
| 52 (represented as float 64, with 52 bits for the fraction).
|
| Yes, it's nice and flexible - but also introduces some
| dangerous subtle bugs.
| 8n4vidtmkvmk wrote:
| 2^53-1 I thought.
|
| And no, they're not all that. There's a bunch that are 2^32
| such as this timeout, apparently, plus all the bit shift
| operations.
| vhcr wrote:
| Not ALL integers are 52 bit, BigInts were added on ECMAScript
| 2020.
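To make the limits in this sub-thread concrete, here is a minimal sketch of where plain JS numbers stop being exact integers. The helper name `isExactInteger` is made up for illustration; `Number.MAX_SAFE_INTEGER` and `Number.isSafeInteger` are standard.

```typescript
// Largest integer a JS number can hold without losing precision: 2^53 - 1.
const maxSafe: number = Number.MAX_SAFE_INTEGER;

// Hypothetical helper: true when `n` is an integer that round-trips exactly.
function isExactInteger(n: number): boolean {
  return Number.isSafeInteger(n);
}

// Past the safe range, neighbouring integers collapse onto the same double.
const collapsed: boolean = 2 ** 53 + 1 === 2 ** 53;

console.log(maxSafe === 2 ** 53 - 1);  // true
console.log(isExactInteger(2 ** 53));  // false
console.log(collapsed);                // true
```

BigInt avoids this collapse for arbitrary-size integers, which is presumably why the snippet quoted later in the thread branches on `typeof remainingDelay`.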
| sjaak wrote:
| What is the use-case for such a function?
| echoangle wrote:
| Make a joke and have something to write a blogpost about, while
| letting your readers learn something new.
| keithwhor wrote:
| Off the top of my head, a cron scheduler for a server that
| reads from a database and sets a timeout upon boot. Every time
| the server is rebooted the timeouts are reinitialized (fail safe
| in case of downtime). If upon boot there's a timeout > 25 days
| it'll get executed immediately which is not the behavior you
| want.
| skykooler wrote:
| Why would you do that in JS rather than just using cron for
| it?
| hinkley wrote:
| This should be an interval with a lookup.
|
| Every five seconds check for due dates sooner than 10 seconds
| from now and schedule them.
|
| The longer a delay the higher the odds the process exits
| without finishing the work.
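The "interval with a lookup" idea above could look roughly like the sketch below. The `Job` shape, `loadJobs`, and `run` are all assumptions for illustration; a real version would read from whatever store holds the due dates.

```typescript
// Hypothetical job record; in practice this would come from a database.
interface Job {
  id: string;
  dueAt: number; // epoch milliseconds
}

const POLL_MS = 5_000;     // how often to look for work
const HORIZON_MS = 10_000; // schedule anything due within this window

// Pure lookup: which jobs fall due before `now + horizonMs`?
function dueSoon(jobs: readonly Job[], now: number, horizonMs: number): Job[] {
  return jobs.filter((job) => job.dueAt <= now + horizonMs);
}

// Poll loop: every tick, hand near-term jobs to short timers.
// `loadJobs` and `run` are placeholders supplied by the caller.
function startScheduler(
  loadJobs: () => Job[],
  run: (job: Job) => void,
): () => void {
  const scheduled = new Set<string>();
  const tick = (): void => {
    const now = Date.now();
    for (const job of dueSoon(loadJobs(), now, HORIZON_MS)) {
      if (scheduled.has(job.id)) continue; // avoid double-scheduling
      scheduled.add(job.id);
      setTimeout(() => run(job), Math.max(0, job.dueAt - now));
    }
  };
  tick();
  const handle = setInterval(tick, POLL_MS);
  return () => clearInterval(handle); // call this to stop polling
}
```

No single timer here ever exceeds ten seconds, so the 32-bit limit never comes into play, and a restart only loses the jobs that were already inside the short window.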
| bgirard wrote:
| Not having your timeout fire unexpectedly instantly is a good
| use-case IMO.
| yifanl wrote:
| If we're pedantic, this doesn't actually do what's advertised,
| this would be waiting X timeouts worth of event cycles rather
| than just the one for a true Big timeout, assuming the precision
| matters when you're stalling a function for 40 days.
| keithwhor wrote:
| I haven't looked at the code but it's fairly likely the author
| considered this? eg the new timeout is set based on the delta
| of Date.now() instead of just subtracting the time from the
| previous timeout.
| yifanl wrote:
| No, it pretty much just does exactly that.
|     const subtractNextDelay = () => {
|       if (typeof remainingDelay === "number") {
|         remainingDelay -= MAX_REAL_DELAY;
|       } else {
|         remainingDelay -= BigInt(MAX_REAL_DELAY);
|       }
|     };
| keithwhor wrote:
| Oh yikes. Yeah; not ideal.
| Aachen wrote:
| To be fair, this is what I expect of any delay function.
| If it needs to be precise to the millisecond,
| _especially_ when scheduled hours or days ahead, I'd
| default to doing a sleep until shortly before (ballpark:
| 98% of the full time span) and then a smaller sleep for
| the remaining time, or even a busy wait for the last bit
| if it needs to be sub-millisecond accurate
|
| I've had too many sleep functions not work as they should
| to still rely on this, especially on mobile devices and
| webpages where background power consumption is a concern.
| It doesn't excuse new bad implementations but it's also
| not exactly surprising
| gnachman wrote:
| That wouldn't work very well because Date.now() isn't monotonic.
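For comparison with the subtract-per-hop code quoted above, here is a minimal sketch of the deadline-based variant discussed in this sub-thread. It is not the library's code; `setLongTimeout`, `nextChunk`, and the injectable `clock` parameter are hypothetical names. The clock is injectable so that a monotonic source can be swapped in if wall-clock jumps are a concern.

```typescript
// Largest delay most runtimes accept: 2^31 - 1 ms, just under 25 days.
const MAX_REAL_DELAY = 2 ** 31 - 1;

// Length of the next real timer for a given remaining time.
function nextChunk(remainingMs: number): number {
  return Math.min(Math.max(remainingMs, 0), MAX_REAL_DELAY);
}

// Re-read the clock on every hop so late wake-ups do not accumulate.
// Returns a function that cancels the pending timer.
function setLongTimeout(
  callback: () => void,
  delayMs: number,
  clock: () => number = Date.now,
): () => void {
  const deadline = clock() + delayMs;
  let handle: ReturnType<typeof setTimeout> | undefined;

  const hop = (): void => {
    const remaining = deadline - clock();
    // The final hop runs the callback; earlier hops just re-arm the timer.
    const next = remaining <= MAX_REAL_DELAY ? callback : hop;
    handle = setTimeout(next, nextChunk(remaining));
  };
  hop();

  return () => {
    if (handle !== undefined) clearTimeout(handle);
  };
}
```

Passing something like `() => performance.now()` as `clock` would address the monotonicity concern, at the cost of the deadline no longer being a calendar time.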
| ballenf wrote:
| Each subtracted timeout is a 25 day timer, so any accumulated
| error would be minuscule. In your example there would be a total
| of 2 setTimeouts called, one 25 day timer and one 15 day. I
| think the room for error with this approach is smaller and much
| simpler than calculating the date delta and trying to take into
| account daylight savings, leap days, etc. (but I don't know
| what setTimeout does with those either).
|
| Or maybe I'm missing your point.
| n2d4 wrote:
| The default behaviour of setTimeout seems problematic. Could be
| used for an exploit, because code like this might not work as
| expected:
|
|     const attackerControlled = ...;
|     if (attackerControlled < 60_000) {
|       throw new Error("Must wait at least 1min!");
|     }
|     setTimeout(() => {
|       console.log("Surely at least 1min has passed!");
|     }, attackerControlled);
|
| The attacker could set the value to a comically large number and
| the callback would execute immediately. This also seems to be
| true for NaN. The better solution (imo) would be to throw an
| error, but I assume we can't due to backwards compatibility.
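A guard along the lines suggested above, throwing instead of silently firing early, might be sketched like this. `checkedDelay` and `MAX_TIMER_DELAY` are hypothetical names, and the upper bound assumes the 32-bit signed limit described in the article.

```typescript
const MAX_TIMER_DELAY = 2 ** 31 - 1; // 2147483647 ms

// Hypothetical guard: reject anything setTimeout would silently mangle.
function checkedDelay(value: unknown, minMs: number): number {
  if (typeof value !== "number" || !Number.isFinite(value)) {
    throw new RangeError("delay must be a finite number");
  }
  if (value < minMs) {
    throw new RangeError(`delay must be at least ${minMs} ms`);
  }
  if (value > MAX_TIMER_DELAY) {
    throw new RangeError(`delay must be at most ${MAX_TIMER_DELAY} ms`);
  }
  return value;
}
```

With this in place the example becomes `setTimeout(callback, checkedDelay(attackerControlled, 60_000))`, and both NaN and comically large values fail loudly before a timer is ever created.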
| arghwhat wrote:
| A scenario where an attacker can control a timeout where having
| the callback run sooner than one minute later would lead to
| security failures, but having it set to run days later is
| perfectly fine and so no upper bound check is required seems...
| quite a constructed edge case.
|
| The problem here is having an attacker control a security
| sensitive timer in the first place.
| a_cardboard_box wrote:
| The exploit could be a DoS attack. I don't think it's that
| contrived to have a service that runs an expensive operation
| at a fixed rate, controlled by the user, limited to 1
| operation per minute.
| lucideer wrote:
| > _I don't think it's that contrived to have a service
| that runs an expensive operation at a fixed rate,
| controlled by the user_
|
| Maybe not contrived but definitely insecure by definition.
| Allowing user control of rates is definitely useful & a
| power devs will need to grant but it should never be direct
| control.
| shawnz wrote:
| Can you elaborate on what _indirect_ control would look
| like in your opinion?
|
| No matter how many layers of abstraction you put in
| between, you're still eventually going to be passing a
| value to the setTimeout function that was computed based
| on something the user inputted, right?
|
| If you're not aware of these caveats about extremely high
| timeout values, how do any layers of abstraction in
| between help you prevent this? As far as I can see, the
| only prevention is knowing about the caveats and
| specifically adding validation for them.
| lucideer wrote:
| > _that was computed_
|
| Or comes from a set of known values. This stuff isn't
| that difficult.
|
| This doesn't require prescient knowledge of high timeout
| edge cases. It's generally accepted good security
| practice to limit business logic execution based on user
| input parameters. This goes beyond input validation &
| bounds on user input (both also good practice but most
| likely to just involve a !NaN check here), but more
| broadly user input is data & timeout values are code.
| Data should be treated differently by your app than code.
|
| To generalise the case more, another common case of a
| user submitting a config value that would be used in
| logic would be string labels for categories. You could
| validate against a known list of categories (good but
| potentially expensive) but whether you do or not it's
| still good hygiene to key the user submitted string
| against a category hashmap or enum - this cleanly avoids
| using user input directly in your executing business
| logic.
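One way to read "a set of known values" is sketched below: the user submits a label and the application owns the numbers. The labels, the `INTERVALS_MS` table, and `delayFor` are illustrative assumptions, not anything from the thread.

```typescript
// Hypothetical allow-list: the user picks a label, the code owns the numbers.
const INTERVALS_MS = {
  minute: 60_000,
  hour: 3_600_000,
  day: 86_400_000,
} as const;

type IntervalLabel = keyof typeof INTERVALS_MS;

// Own-property check so inherited keys like "toString" are not accepted.
function isIntervalLabel(input: string): input is IntervalLabel {
  return Object.prototype.hasOwnProperty.call(INTERVALS_MS, input);
}

// Returns a delay chosen by the application, or undefined for unknown labels.
function delayFor(input: string): number | undefined {
  return isIntervalLabel(input) ? INTERVALS_MS[input] : undefined;
}
```

The user's string never reaches `setTimeout`; only a value from the table does, so out-of-range and NaN inputs have no path into the timer at all.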
| arghwhat wrote:
| A minimum timing of an individual task is not a useful rate
| limit. I could schedule a bunch of tasks to happen far into
| the future but all at once for example.
|
| Rate limits are implemented with e.g., token buckets which
| fill to a limit at a fixed rate. Timed tasks would then, when
| run, try to take a token and, if none is present, wait for
| one. This would then be dutifully enforced regardless of
| the current state of scheduled tasks.
|
| Only consideration for the timer itself would be to always
| add random jitter to avoid having peak loads coalesce.
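The token bucket and jitter described above can be sketched as follows. This is a bare-bones, single-process illustration; the class shape, the `msPerToken` parameter, and the injectable `clock` and `random` functions are assumptions made so the behaviour is easy to test.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;
  private readonly msPerToken: number;
  private readonly clock: () => number;

  constructor(capacity: number, msPerToken: number, clock: () => number = Date.now) {
    this.capacity = capacity;
    this.msPerToken = msPerToken;
    this.clock = clock;
    this.tokens = capacity; // start full
    this.lastRefill = clock();
  }

  // Top up based on elapsed time, never beyond capacity.
  private refill(): void {
    const now = this.clock();
    const elapsed = Math.max(0, now - this.lastRefill);
    this.tokens = Math.min(this.capacity, this.tokens + elapsed / this.msPerToken);
    this.lastRefill = now;
  }

  // Take one token if available; callers that get false should wait and retry.
  tryTake(): boolean {
    this.refill();
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Spread a delay by up to +/- `fraction` so timers do not fire in lockstep.
function withJitter(
  delayMs: number,
  fraction: number = 0.1,
  random: () => number = Math.random,
): number {
  const offset = (random() * 2 - 1) * fraction * delayMs;
  return Math.max(0, delayMs + offset);
}
```

A task scheduled by any timer would call `tryTake()` when it fires, so the one-per-minute limit holds no matter what delay the user asked for.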
| sfvisser wrote:
| Don't ever use attacker controlled data directly in your source
| code without validation. Don't blame setTimeout for this, it's
| impolite!
| n2d4 wrote:
| The problem is the validation. You'd expect you just have to
| validate a lower bound, but you also have to validate an
| upper bound.
| leptons wrote:
| It's user input, you have to validate _all the bounds_ ,
| and filter out whatever else might cause problems. Not
| doing so is a problem with the programmer, not
| setTimeout.
| swatcoder wrote:
| That's just terrible input validation and has nothing to do
| with setTimeout.
|
| If your code would misbehave outside a certain range of values
| and your input might span a larger range, _you should be
| checking your input against the range that's valid_. Your
| sample code simply doesn't do that, and that's why there's a
| bug.
|
| That the bug happens to involve a timer is irrelevant.
| tourist2d wrote:
| > That's just terrible input validation and has nothing to do
| with setTimeout.
|
| Except for the fact that this behaviour is surprising.
|
| > you should be checking your input against the range that's
| valid. Your sample code simply doesn't do that, and that's
| why there's a bug.
|
| Indeed, so why doesn't setTimeout internally do that?
| drdaeman wrote:
| > Indeed, so why doesn't setTimeout internally do that?
|
| Given that `setTimeout` is a part of JavaScript's ancient
| reptilian brain, I wouldn't be surprised if it doesn't do
| those checks just because there's some silly compatibility
| requirement still lingering and no one in the committees is
| brave enough to make a breaking change.
|
| (And then, what should setTimeout do if delay is NaN? Do
| nothing? Call immediately? Throw an exception? Personally
| I'd prefer it to throw, but I don't think there's any
| single undeniably correct answer.)
|
| Given the trend to move away from the callbacks, I wonder
| why there is no `async function sleep(delay)` in the
| language, that would be free to sort this out nicely
| without having to be compatible with stuff from the '90s. Or
| something like that.
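A userland version of the `sleep(delay)` idea floated above is easy to sketch. This one picks "throw" for invalid delays, which is only one of the options the comment lists; the name `sleep`, the bound `MAX_SLEEP_MS`, and the choice to throw synchronously are all assumptions.

```typescript
const MAX_SLEEP_MS = 2 ** 31 - 1;

// Hypothetical promise-based sleep that throws on delays a timer would mangle.
function sleep(delayMs: number): Promise<void> {
  if (!Number.isFinite(delayMs) || delayMs < 0 || delayMs > MAX_SLEEP_MS) {
    throw new RangeError(`sleep delay out of range: ${delayMs}`);
  }
  return new Promise<void>((resolve) => {
    setTimeout(resolve, delayMs);
  });
}

// Usage inside async code.
async function demo(): Promise<string> {
  await sleep(10);
  return "done";
}
```

Because it is a new function, nothing depends on its old behaviour, which is the freedom the comment is pointing at.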
| wging wrote:
| In nodejs you at least get a warning along with the problematic
| behavior:
|
|     Welcome to Node.js v22.7.0.
|     Type ".help" for more information.
|     > setTimeout(() => console.log('reached'), 3.456e9)
|     Timeout { <contents elided> }
|     > (node:64799) TimeoutOverflowWarning: 3456000000 does not fit into a 32-bit signed integer.
|     Timeout duration was set to 1.
|     (Use `node --trace-warnings ...` to show where the warning was created)
|     reached
|
| I'm surprised to see that setTimeout returns an object - I
| assume at one point it was an integer identifying the timer,
| the same way it is on the web. (I think I remember it being so
| at one point.)
| augusto-moura wrote:
| It has returned an object for a long time now. I might say it was
| always like this, actually. I don't know about very old versions.
| issafram wrote:
| I wish that I could actually see the code. I understand that it's
| chaining timeouts, but the git site is just garbage
| maxbond wrote:
| You've gotta click "tree".
|
| https://git.sr.ht/~evanhahn/setBigTimeout/tree/main/item/mod...
| zgk7iqea wrote:
| yes, sourcehut's interface is just godawful
| egwynn wrote:
| I agree it's not the prettiest, but I had no trouble clicking
| on "tree" to get to the folder and then "mod.ts" to see the
| code.
| Joker_vD wrote:
| One still has to know that "tree" stands for "source code".
| adregan wrote:
| This reminds me of when I am trying to find something in a
| cabinet, but don't really look very hard, and my wife will say
| "did you even try?" and find it in ~1second.
|
| https://git.sr.ht/~evanhahn/setBigTimeout/tree/main/item/mod...
| Minor49er wrote:
| I clicked on "browse" under the refs: main section and found
| the code right away
| yesco wrote:
| > In most JavaScript runtimes, this duration is represented as a
| 32-bit signed integer
|
| I thought all numbers in JavaScript were basically some variation
| of double precision floating points, if so, why is setTimeout
| limited to a smaller 32bit signed integer?
|
| If this is true, then if I pass something like "0.5", does it
| round the number when casting it to an integer? Or does it
| execute the callback after half a millisecond like you would
| expect it would?
| arp242 wrote:
| You're correct about JS numbers. It works like this presumably
| because the implementation is written in C++ or the like and
| uses an int32 for this, because "25 days ought to be enough for
| everyone".
| drdaeman wrote:
| I thought most non-abandoned C/C++ projects have long
| switched to time_t or similar. 2038 is not that far in the
| future.
| andrewmcwatters wrote:
| 2038 is even "now" if you're calculating futures.
| bobmcnamara wrote:
| Debian conversion should be done mid2025.
| afavour wrote:
| Yes but JS always has backwards compatibility in mind, even
| if it wasn't in the spec. Wouldn't be surprised if more
| modern implementations still add an arbitrary restriction.
| blixt wrote:
| JS numbers technically have 53 bits for integers (mantissa) but
| all binary operators turn it into a 32-bit signed integer.
| Maybe this is related somehow to the setTimeout limitation.
| JavaScript also has the >>> unsigned bit shift operator so you
| can squeeze that last bit out of it if you only care about
| positive values: ((2**32-1)>>>0).toString(2).length === 32
| tubs wrote:
| I assume by binary you mean logical? a + b certainly does not
| treat either side as 32bit.
| blixt wrote:
| Sorry, I meant bitwise operators, such as: ~ >> << >>> | &
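The 32-bit behaviour of the bitwise operators is quick to check. The sketch below only demonstrates the operators themselves; whether a given runtime's setTimeout goes through the same conversion is an assumption (the Node.js output quoted elsewhere in the thread shows it resetting the duration to 1 instead).

```typescript
// Bitwise operators first convert their operands to 32-bit integers.
const wrapped = (2 ** 31) | 0; // one past the int32 maximum wraps negative
const truncated = 3.7 | 0;     // the fractional part is dropped

// >>> converts to an unsigned 32-bit integer, so all 32 bits survive.
const unsignedBits = ((2 ** 32 - 1) >>> 0).toString(2).length;

// Ordinary arithmetic stays in double precision and does not wrap.
const notWrapped = 2 ** 31 + 1;

console.log(wrapped);      // -2147483648
console.log(truncated);    // 3
console.log(unsignedBits); // 32
console.log(notWrapped);   // 2147483649
```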
| DvdGiessen wrote:
| When implementing a tiny timing library in JS a few years back
| I found that most engines indeed seem to cast the value to an
| integer (effectively flooring it), so in order to get
| consistent behaviour in all environments I resorted to always
| calling Math.ceil on the timeout value first [1], thus making
| it so that the callbacks always fire after _at least_ the given
| timeout has passed (same as with regular setTimeout, which also
| cannot guarantee that the engine can run the callback at
| exactly the given timeout due to scheduling). Also used a very
| similar timeout chaining technique as described here, it works
| well!
|
| [1]: https://github.com/DvdGiessen/virtual-
| clock/blob/master/src/...
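The round-up trick described above can be sketched in a few lines. This is not the linked library's code; `normalizeDelay` and `setTimeoutAtLeast` are hypothetical names for illustration.

```typescript
// Round a fractional delay up so the timer never fires earlier than asked,
// regardless of whether the engine floors the value internally.
function normalizeDelay(delayMs: number): number {
  return Math.max(0, Math.ceil(delayMs));
}

function setTimeoutAtLeast(
  callback: () => void,
  delayMs: number,
): ReturnType<typeof setTimeout> {
  return setTimeout(callback, normalizeDelay(delayMs));
}
```

Under this scheme a requested delay of 0.5 ms becomes 1 ms, so "at least the given timeout" holds even on engines that would otherwise truncate it to 0.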
| darepublic wrote:
| instead of chaining together shorter timeouts, why not calculate
| the datetime of the delay and then invoke via
| window.requestAnimationFrame (by checking the current date ofc).
| augusto-moura wrote:
| Are you suggesting checking the date every frame vs scheduling
| long task every once in a long while? Can't tell if it is
| ironic or not, I'm sorry (damn Poe's law). But assuming not, it
| would be a lot more computationally expensive to do that;
| timeouts are very optimized and they "give back" the computer's
| resources in the meantime.
| darepublic wrote:
| No irony intended I can be this dumb. Your point did occur to
| me as I posted, was just grasping at straws for a "clean"
| solution
| jw1224 wrote:
| Unlike setTimeout, requestAnimationFrame callbacks are
| automatically skipped if the browser viewport is minimized or
| no longer visible. You wouldn't want to miss the frame that
| matters!
| hiccuphippo wrote:
| So the JS engine is converting the JavaScript number (a double?) to
| an int and it's rolling over?
| jackconsidine wrote:
| This type of thing is actually practical. Google Cloud Tasks have
| a max schedule date of 30 days in the future so the typical
| workaround is to chain tasks. As other commenters have suggested
| you can also set a cron check. This has more persistent
| implications on your database, but chaining tasks can fail in
| other ways, or explode if there are retries and a failed request
| does trigger a reschedule (I hate to say I'm speaking from
| experience)
| Waterluvian wrote:
| True. Though if you have a need to trigger something after that
| much time, you might recognize the need to track that scheduled
| event more carefully and want a scheduler. Then you've just got
| a loop checking the clock and your scheduled tasks.
| keyle wrote:
| This is great for the folks running serverless compute! You get
| to start a process and let it hang until your credit card is
| maxed out. /s
___________________________________________________________________
(page generated 2024-10-18 23:00 UTC)