[HN Gopher] TimesFM: Time Series Foundation Model for time-serie...
       ___________________________________________________________________
        
       TimesFM: Time Series Foundation Model for time-series forecasting
        
       Author : yeldarb
       Score  : 193 points
       Date   : 2024-05-08 13:34 UTC (9 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | uoaei wrote:
       | "Time series" is such an over-subscribed term. What sorts of time
       | series is this actually useful for?
       | 
       | For instance, will it be able to predict dynamics for a machine
       | with thousands of sensors?
        
         | techwizrd wrote:
          | Specifically, it's referring to univariate, contiguous point
          | forecasts. Honestly, I'm a little puzzled by the benchmarks.
        
         | sarusso wrote:
          | Even if it were for multivariate time series, the model would
          | first need to infer which machine we are talking about, then its
          | working conditions, and only then make a reasonable forecast
          | based on a hypothesis of its dynamics. I don't know, seems
          | pretty hard.
        
           | uoaei wrote:
           | Indeed. An issue I ran into over and over while doing
           | research for semiconductor manufacturing.
           | 
           | My complaint was more illustrative than earnest.
        
       | iamgopal wrote:
        | How can a time series model be pre-trained? I think I'm missing
        | something.
        
         | melenaboija wrote:
          | See the third paragraph of the introduction of the paper [1]
          | mentioned in the first paragraph of the repo.
         | 
         | [1] https://arxiv.org/abs/2310.10688
        
           | jurgenaut23 wrote:
           | I guess they pre-trained the model to exploit common patterns
           | found in any time-series (e.g., seasonalities, trends,
           | etc.)... What would be interesting, though, is to see if it
           | spots patterns that are domain-specific (e.g., the
           | ventricular systole dip in an electrocardiogram), and
           | possibly transfer those (that would be obviously useless in
           | this specific example, but maybe there are interesting domain
           | transfers out there)
        
         | malux85 wrote:
         | If you have a univariate series, just single values following
         | each other -
         | 
         | [5, 3, 3, 2, 2, 2, 1, ...]
         | 
          | What is the next number? Well, let's start with the search
          | space: what is the possible range of the next number? Assuming
          | unsigned 32-bit integers (for simplicity of explanation), it's
          | 0-(2^32-1).
         | 
         | So are all of those possible outputs equally likely? The next
         | number could be 1, or it could be 345,654,543 ... are those
         | outputs equally likely?
         | 
          | Even though we know nothing about this sequence, most time
          | series don't make enormous random jumps, so no, they are not
          | equally likely; 1 is the more likely of the two we discussed.
         | 
          | Ok, so some patterns are more likely than others. Let's analyse
          | lots and lots of time series data and see if we can build a
          | generalised model that can be fine-tuned or used as a feature
          | extractor.
         | 
         | Many time series datasets have repeating patterns, momentum,
         | symmetries, all of these can be learned. Is it perfect? No, but
         | what model is? And things don't have to be perfect to be
         | useful.
         | 
         | There you go - that's a pre-trained time series model in a
         | nutshell
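The intuition above can be sketched in a few lines of Python. This is a toy illustration only: the random-walk prior and every number in it are invented, not anything TimesFM actually does.

```python
import math

# Under a simple random-walk prior (next value ~ Normal centered on the
# last observed value), candidates near the last observation score far
# higher than distant ones -- i.e. "most time series don't make enormous
# random jumps".
series = [5, 3, 3, 2, 2, 2, 1]

def log_likelihood(candidate, last, sigma=2.0):
    """Log-density of Normal(last, sigma) evaluated at candidate."""
    z = (candidate - last) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2 * math.pi))

near = log_likelihood(1, series[-1])            # the "boring" guess
far = log_likelihood(345_654_543, series[-1])   # the wild jump
print(near > far)  # True: 1 is vastly more plausible than 345,654,543
```

A learned model replaces this hand-picked prior with patterns (seasonality, momentum, symmetries) extracted from many series.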
        
         | sarusso wrote:
          | My understanding is that, while your eye can naturally spot a
          | dependency over time in time series data, machines can't. So,
          | as we did for imaging, where we pre-trained models to let
          | machines easily identify objects in pictures, we are now doing
          | the same to let machines "see" dependencies over time. How
          | these dependencies actually work is another story.
        
       | nwoli wrote:
       | Seems like a pretty small (low latency) model. Would be
       | interesting to hook up to mouse input (x and y) and see how well
       | it predicts where I'm gonna move the mouse (maybe with and
       | without seeing the predicted path)
        
         | jarmitage wrote:
         | What is the latency?
        
         | throwtappedmac wrote:
         | Curious George here: why are you trying to predict where the
         | mouse is going? :)
        
           | nwoli wrote:
           | Just to see how good the model is (maybe it's creepily good
           | in a fun way)
        
             | Timon3 wrote:
             | There's a fun game idea in there! Imagine having to
             | outmaneuver a constantly learning model. Not to mention the
             | possibilities of using this in genres like bullet hell...
        
           | teaearlgraycold wrote:
           | Think of the sweet sweet ad revenue!
        
             | throwtappedmac wrote:
             | Haha as if advertisers don't know me better than I know me
        
           | tasty_freeze wrote:
           | Game developers are constantly trying to minimize lag. I have
           | no idea if computers are so fast these days that it is a
           | "solved" problem, but I knew a game developer ages ago who
           | used a predictive mouse model to reduce the apparent lag by
           | guessing where the mouse would be at the time the frame was
           | displayed (considering it took 30 ms or whatever to render
           | the frame).
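The lag-hiding trick described above can be sketched as plain linear extrapolation: estimate the cursor's recent velocity and project it forward by the render time. The constant-velocity assumption and all numbers below are illustrative, not from any actual game engine.

```python
# Guess where the mouse will be when the frame is finally displayed,
# by extrapolating the velocity between the last two samples.
def predict_position(samples, render_ms):
    """samples: list of (t_ms, x, y); returns estimated (x, y) at
    last_t + render_ms, assuming constant velocity."""
    (t0, x0, y0), (t1, x1, y1) = samples[-2], samples[-1]
    dt = t1 - t0
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    return x1 + vx * render_ms, y1 + vy * render_ms

# Cursor moving right at 1 px/ms; predict 30 ms ahead of the last sample.
history = [(0, 100, 200), (10, 110, 200)]
print(predict_position(history, 30))  # (140.0, 200.0)
```

A learned model would replace the constant-velocity assumption with whatever curvature and acceleration patterns it has seen in real mouse traces.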
        
             | orbital-decay wrote:
             | The only thing worse than lag is uneven lag, which is what
             | you're going to end up with. Constant lag can be dealt with
             | by players, jitter can't.
        
             | aeyes wrote:
             | Quake internet play only became acceptable when client side
             | prediction was implemented, I'm sure it would be better to
             | have real prediction instead of simple interpolation.
             | 
             | https://raw.githubusercontent.com/ESWAT/john-carmack-plan-
             | ar...
        
             | wongarsu wrote:
             | Competitive online games commonly predict the player's
             | movement. Network latencies have improved and are now
             | usually <16ms (useful milestone since at 60fps you render a
             | frame every 16.6ms), but players expect to still be able to
             | smoothly play when joining from the other side of the
             | continent to play with their friends. You usually want
             | every client to agree where everyone is, and predicting
             | movement leads to less disagreement than what you would get
             | from using "outdated" state because of speed-of-light
             | delays.
             | 
             | If you want to predict not just position but also
             | orientation in a shooter game, that's basically predicting
             | the mouse movements.
        
           | brigadier132 wrote:
           | Catching cheaters in games might seem like a good use.
        
       | dangerclose wrote:
       | is it better than prophet from meta?
        
         | VHRanger wrote:
         | I imagine they're both worse than good old exponential
         | smoothing or SARIMAX.
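For reference, "good old exponential smoothing" is small enough to hand-roll (SARIMAX would normally come from a library such as statsmodels). A minimal sketch; the alpha value and data are arbitrary:

```python
# Simple exponential smoothing: the one-step-ahead forecast is an
# exponentially weighted average of everything seen so far.
def simple_exp_smoothing(series, alpha=0.5):
    """alpha in (0, 1]: higher alpha weights recent observations more."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level  # one-step-ahead forecast

print(simple_exp_smoothing([10, 12, 11, 13], alpha=0.5))  # 12.0
```

Baselines like this are cheap to fit and hard to beat, which is why they come up whenever deep forecasting models are benchmarked.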
        
           | Pseudocrat wrote:
           | Depends on use case. Hybrid approaches have been dominating
           | the M-Competitions, but there are generally small percentage
           | differences in variance of statistical models vs machine
           | learning models.
           | 
           | And exponentially higher cost for ML models.
        
             | VHRanger wrote:
             | At the end of the day, if training or doing inference on
             | the ML model is massively more costly in time or compute,
             | you'll iterate much less with it.
             | 
             | I also think it's a dead end to try to have foundation
             | models for "time series" - it's a class of data! Like when
             | people tried to have foundation models for any general
             | graph type.
             | 
             | You could make foundation models for data within that type
             | - eg. meteorological time series, or social network graphs.
             | But for the abstract class type it seems like a dead end.
        
               | rockinghigh wrote:
               | These models may be helpful if they speed up convergence
               | when fine tuned on business-specific time series.
        
             | SpaceManNabs wrote:
             | is there a ranking of the methods that actually work on
             | benchmark datasets? Hybrid, "ML" or old stats? I remember
             | eamonnkeogh doing this on r/ML a few years ago.
        
       | l2dy wrote:
       | Blog link (Feb 2024): https://research.google/blog/a-decoder-
       | only-foundation-model...
       | 
       | Previous discussion:
       | https://news.ycombinator.com/item?id=39235983
        
       | whimsicalism wrote:
       | I'm curious why we seem convinced that this is a task that is
       | possible or something worthy of investigation.
       | 
       | I've worked on language models since 2018, even then it was
       | obvious why language was a useful _and transferable_ task. I do
       | not at all feel the same way about general univariate time series
       | that could have any underlying process.
        
         | sarusso wrote:
          | +1 for "any underlying process". It would be interesting to
          | know what use case they had in mind.
        
         | baq wrote:
         | well... if you look at a language in a certain way, it is just
         | a way to put bits in a certain order. if you forget about the
         | 'language' part, it kinda makes sense to try because why
         | shouldn't it work?
        
         | IshKebab wrote:
         | Why not? There are plenty of time series that have underlying
         | patterns which means you can do better than a total guess even
         | without any knowledge of what you are predicting.
         | 
         | Think about something like traffic patterns. You probably won't
         | predict higher traffic on game days, but predicting rush hour
         | is going to be pretty trivial.
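Rush hour is exactly the kind of pattern even a dumb seasonal-naive baseline captures: forecast each hour as the value from the same hour one cycle (here, one day) earlier. The data below is fabricated for illustration:

```python
# Seasonal-naive forecast: repeat the observation from one full season ago.
def seasonal_naive(series, season=24, horizon=3):
    """Forecast `horizon` steps ahead by copying values from one
    season (e.g. 24 hours) earlier."""
    return [series[len(series) - season + h] for h in range(horizon)]

# A fake, perfectly daily-periodic week of hourly traffic counts.
hourly_traffic = list(range(24)) * 7
print(seasonal_naive(hourly_traffic, season=24, horizon=3))  # [0, 1, 2]
```

Anything a pre-trained model learns about generic time series has to earn its keep against baselines this simple.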
        
         | smokel wrote:
         | The things that we are typically interested in have very clear
         | patterns. In a way, if we find that there are no patterns, we
         | don't even try to do any forecasting.
         | 
         | "The Unreasonable Effectiveness of Mathematics in the Natural
         | Sciences" [1] hints that there might be some value here.
         | 
         | [1]
         | https://en.m.wikipedia.org/wiki/The_Unreasonable_Effectivene...
        
           | yonixw wrote:
            | Exactly. So for example, I think the use of this model is in
            | cases where you want user counts to follow some pattern over
            | time, and to be alerted if there is a spike.
           | 
           | But you wouldn't want this model for file upload storage
           | usage which only increases, where you would put alerts based
           | on max values and not patterns/periodic values.
        
         | zeroxfe wrote:
         | > I'm curious why we seem convinced that this is a task that is
         | possible or something worthy of investigation.
         | 
         | There's a huge industry around time series forecasting used for
         | all kinds of things like engineering, finance, climate science,
         | etc. and many of the modern ones incorporate some kind of
         | machine learning because they deal with very high dimensional
         | data. Given the very surprising success of LLMs in non-language
         | fields, it seems reasonable that people would work on this.
        
           | whimsicalism wrote:
           | Task specific time series models, not time series "foundation
           | models" - we are discussing different things.
        
         | wuj wrote:
          | Time series data are inherently context sensitive, unlike
          | natural languages, which follow predictable grammar patterns.
          | The patterns in time series data vary based on context. For
          | example, flight data often show seasonal trends, while electric
          | signals depend on the type of sensor used. There's also data
          | that appears random, like stock data, though firms like Rentech
          | manage to consistently find underlying alphas. Training a
          | multivariate time series model would be challenging, but I
          | don't see why not for specific applications.
        
         | shaism wrote:
          | Fundamentally, the pre-trained model would need to learn a
          | "world model" to predict well in distinct domains. This should
          | be possible, setting aside compute requirements and the exact
          | architecture.
         | 
         | After all, the physical world (down to the subatomic level) is
         | governed by physical laws. Ilya Sutskever from OpenAI stated
         | that next-token prediction might be enough to learn a world
         | model (see [1]). That would imply that a model learns a "world
         | model" indirectly, which is even more unrealistic than learning
         | the world model directly through pre-training on time-series
         | data.
         | 
         | [1] https://www.youtube.com/watch?v=YEUclZdj_Sc
        
           | whimsicalism wrote:
            | But the data-generating process could be literally anything.
            | We are not constrained by physics in any real sense if we are
            | predicting financial markets, or occurrences of a certain
            | build error, or termite behavior.
        
             | shaism wrote:
             | Sure, there are limits. Not everything is predictable, not
             | even physics. But that is also not the point of such a
             | model. The goal is to forecast across a broad range of use
             | cases that do have underlying laws. Similar to LLM, they
             | could also be fine-tuned.
        
         | itronitron wrote:
          | There was a paper written a while back that proved
          | mathematically that you can correlate any time series with any
          | other time series, thus vaporizing any perception of value
          | gained by correlating time series (at least for those people
          | who read the paper). Just wanted to share.
        
       | polskibus wrote:
       | how good is it on stocks?
        
         | svaha1728 wrote:
         | The next index fund should use AI. What could possibly go
         | wrong?
        
           | whimsicalism wrote:
           | I promise you your market-making counterparties already are.
        
             | hackerlight wrote:
             | What kind of things are they doing with AI?
        
               | whimsicalism wrote:
               | Predicting price movements, finding good hedges, etc.
        
         | claytonjy wrote:
         | if I knew it was good, why would I tell you that?
        
       | esafak wrote:
       | Is anyone using neural networks for anomaly detection in
       | observability? If so, which model and how many metrics are you
       | supporting per core?
        
         | leeoniya wrote:
         | LSTM is common for this.
         | 
         | also https://facebook.github.io/prophet/
        
           | morkalork wrote:
            | How data hungry is it, or what is the minimum volume of data
            | needed before it's worth investigating?
        
         | sarusso wrote:
         | What do you mean by "observability"?
        
           | esafak wrote:
           | Telemetry. Dashboards. The application is knowing when a
           | signal is anomalous.
           | 
           | https://en.wikipedia.org/wiki/Observability_(software)
        
             | sarusso wrote:
              | Oh yes, I'm working on that. Usually LSTM, exploring
              | encoder-decoders and generative models, but also some
              | simpler models based on periodic averages (which are
              | surprisingly useful in some use cases). But I don't have
              | per-core metrics.
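The "periodic averages" idea mentioned above can be sketched in a few lines: learn a mean and spread per slot of the cycle (e.g. per hour of day), then flag points that deviate by more than k sigma. The threshold, period, and fake metric below are all illustrative:

```python
import statistics

def fit_profile(values, period):
    """Per-slot (mean, population stdev) over a repeating cycle."""
    slots = [[] for _ in range(period)]
    for i, v in enumerate(values):
        slots[i % period].append(v)
    return [(statistics.mean(s), statistics.pstdev(s)) for s in slots]

def is_anomalous(value, slot, profile, k=3.0):
    """Flag values more than k standard deviations from the slot mean."""
    mean, std = profile[slot]
    return abs(value - mean) > k * max(std, 1e-9)  # guard zero-variance slots

# Fake metric alternating low/high with period 2.
history = [10, 50, 10, 51, 10, 49, 10, 50]
profile = fit_profile(history, period=2)
print(is_anomalous(200, slot=1, profile=profile))  # True: far above the usual ~50
print(is_anomalous(50, slot=1, profile=profile))   # False
```

Crude, but as noted above, models this simple are surprisingly useful for many observability signals.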
        
             | tiagod wrote:
             | Depending on how stable your signal is, I've had good
             | experience with seasonal ARIMA and LOESS (but it's not
             | neural networks)
        
       | optimalsolver wrote:
       | When it comes to time series forecasting, if the method actually
       | works, it sure as hell isn't being publicly released.
        
         | baq wrote:
         | and yet we have those huge llamas publicly available. these are
         | computers that talk, dammit
        
         | speedgoose wrote:
         | Some times series are more predictable than others. Being good
         | at predicting the predictable ones is useful.
         | 
          | For example, you can easily predict the weather with decent
          | accuracy. Tomorrow is going to be about the same as today. From
          | there you can work on better models.
         | 
         | Or predicting a failure in a factory because a vibration
         | pattern on an industrial machine always ended up in a massive
         | failure after a few days.
         | 
         | But I agree that if a model is good at predicting the stock
         | market, it's not going to be released.
        
       | mhh__ wrote:
       | Dear googler or meta-er or timeseries transformer startup
       | something-er: Please make a ChatGPT/chat.lmsys.org style
       | interface for one of these that I can throw data at and see what
       | happens.
       | 
        | This one looks pretty easy to set up, in fairness, but some other
       | models I've looked at have been surprisingly fiddly / locked
       | behind an API.
       | 
       | Perhaps such a thing already exists somewhere?
        
       | wuj wrote:
       | On a related note, Amazon also had a model for time series
       | forecasting called Chronos.
       | 
       | https://github.com/amazon-science/chronos-forecasting
        
         | toasted-subs wrote:
          | Something I've had issues with in time series work is having
          | to use relatively custom models.
          | 
          | It's difficult to use off-the-shelf tools when starting with
          | math models.
        
         | claytonjy wrote:
         | And like all deep learning forecasting models thus far, it
         | makes for a nice paper but is not worth anyone using for a real
         | problem. Much slower than the classical methods it fails to
         | beat.
        
           | p1esk wrote:
           | That's what people said about CV models in 2011.
        
             | claytonjy wrote:
             | That's fair, but they stopped saying it about CV models in
             | 2012. We've been saying this about foundational forecasting
             | models since...2019 at least, probably earlier. But it is a
             | harder problem!
        
       | aantix wrote:
       | Would this be useful in predicting lat/long coordinates along a
       | path? To mitigate issues with GPS drift.
       | 
       | If not, what would be a useful model?
        
         | smokel wrote:
         | Map matching to a road network might be helpful here. For
         | example, a Hidden Markov Model gives good results. See for
         | instance this paper:
         | 
         | "Hidden Markov map matching through noise and sparseness"
         | (2009)
         | 
         | https://www.microsoft.com/en-us/research/wp-content/uploads/...
        
       | chaos_emergent wrote:
       | "Why would you even try to predict the weather if you know it's
       | going to be wrong?"
       | 
       | - most OCs on this thread
        
       ___________________________________________________________________
       (page generated 2024-05-08 23:00 UTC)