[HN Gopher] TimesFM: Time Series Foundation Model for time-serie...
___________________________________________________________________
TimesFM: Time Series Foundation Model for time-series forecasting
Author : yeldarb
Score : 193 points
Date : 2024-05-08 13:34 UTC (9 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| uoaei wrote:
| "Time series" is such an over-subscribed term. What sorts of time
| series is this actually useful for?
|
| For instance, will it be able to predict dynamics for a machine
| with thousands of sensors?
| techwizrd wrote:
 | Specifically, it's referring to univariate, contiguous point
 | forecasts. Honestly, I'm a little puzzled by the benchmarks.
| sarusso wrote:
 | Even if it were for multivariate time series, the model would
 | first need to infer which machine we are talking about, then its
 | working conditions, and only then make a reasonable forecast
 | based on a hypothesis of its dynamics. I don't know, seems
 | pretty hard.
| uoaei wrote:
| Indeed. An issue I ran into over and over while doing
| research for semiconductor manufacturing.
|
| My complaint was more illustrative than earnest.
| iamgopal wrote:
| How can a time series model be pre-trained? I think I'm missing
| something.
| melenaboija wrote:
| Third paragraph of the introduction of the mentioned paper[1]
| in the first paragraph of the repo.
|
| [1] https://arxiv.org/abs/2310.10688
| jurgenaut23 wrote:
| I guess they pre-trained the model to exploit common patterns
| found in any time-series (e.g., seasonalities, trends,
| etc.)... What would be interesting, though, is to see if it
| spots patterns that are domain-specific (e.g., the
| ventricular systole dip in an electrocardiogram), and
| possibly transfer those (that would be obviously useless in
| this specific example, but maybe there are interesting domain
| transfers out there)
| malux85 wrote:
| If you have a univariate series, just single values following
| each other -
|
| [5, 3, 3, 2, 2, 2, 1, ...]
|
| What is the next number? Well let's start with the search space
| - what is the possible range of the next number? Assuming
| unsigned 32bit integers (for explanation simplicity) it's
| 0-(2^32-1)
|
| So are all of those possible outputs equally likely? The next
| number could be 1, or it could be 345,654,543 ... are those
| outputs equally likely?
|
| Even though we know nothing about this sequence, most time
| series don't make enormous random jumps, so no, they are not
| equally likely, 1 is the more likely of the two we discussed.
|
| Ok, so some patterns are more likely than others, let's analyse
| lots and lots of time series data and see if we can build a
| generalised model that can be fine tuned or used as a feature
| extractor.
|
| Many time series datasets have repeating patterns, momentum,
| symmetries, all of these can be learned. Is it perfect? No, but
| what model is? And things don't have to be perfect to be
| useful.
|
| There you go - that's a pre-trained time series model in a
| nutshell
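The "nutshell" above can be sketched in a few lines. This is an illustrative toy, not the TimesFM method: we learn a prior over step-to-step changes from a small "pre-training corpus" of series and reuse it on an unseen series. The corpus values and function names are invented for the example.

```python
# Illustrative sketch (not TimesFM): "pre-training" in miniature.
# Estimate the typical step-to-step change across many series, then
# reuse that prior to forecast the next value of a new series.
from statistics import median

def learn_delta_prior(training_series):
    """Collect step-to-step deltas across many series; return the median."""
    deltas = []
    for series in training_series:
        deltas.extend(b - a for a, b in zip(series, series[1:]))
    return median(deltas)

def forecast_next(series, delta_prior):
    """Predict the next value as the last value plus the typical change."""
    return series[-1] + delta_prior

# A handful of slowly decaying series stand in for a pre-training corpus.
corpus = [[10, 8, 7, 6, 6, 5], [9, 7, 6, 5, 5, 4], [6, 5, 4, 4, 3, 3]]
prior = learn_delta_prior(corpus)
print(forecast_next([5, 3, 3, 2, 2, 2, 1], prior))  # 0, not a random jump
```

With this toy corpus the typical step is -1, so the forecast continues the gentle decline of the sequence from the comment instead of jumping to an arbitrary 32-bit value, which is exactly the "not all outputs are equally likely" point.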
| sarusso wrote:
 | My understanding is that, while your eye can naturally spot a
 | dependency over time in time series data, machines can't. So as
 | we did for imaging, where we pre-trained models to let machines
 | easily identify objects in pictures, we are now doing the same
 | to let machines "see" dependencies over time. How these
 | dependencies work is another story.
| nwoli wrote:
| Seems like a pretty small (low latency) model. Would be
| interesting to hook up to mouse input (x and y) and see how well
| it predicts where I'm gonna move the mouse (maybe with and
| without seeing the predicted path)
| jarmitage wrote:
| What is the latency?
| throwtappedmac wrote:
| Curious George here: why are you trying to predict where the
| mouse is going? :)
| nwoli wrote:
| Just to see how good the model is (maybe it's creepily good
| in a fun way)
| Timon3 wrote:
| There's a fun game idea in there! Imagine having to
| outmaneuver a constantly learning model. Not to mention the
| possibilities of using this in genres like bullet hell...
| teaearlgraycold wrote:
| Think of the sweet sweet ad revenue!
| throwtappedmac wrote:
| Haha as if advertisers don't know me better than I know me
| tasty_freeze wrote:
| Game developers are constantly trying to minimize lag. I have
| no idea if computers are so fast these days that it is a
| "solved" problem, but I knew a game developer ages ago who
| used a predictive mouse model to reduce the apparent lag by
| guessing where the mouse would be at the time the frame was
| displayed (considering it took 30 ms or whatever to render
| the frame).
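The lag-hiding trick described here can be sketched with plain linear extrapolation: assume constant velocity over the last two samples and project the cursor forward by the render latency. A hedged illustration of the general idea, not the developer's actual code; all numbers are made up.

```python
# Sketch: extrapolate the cursor to where it will likely be when the
# frame is displayed, assuming constant velocity between samples.
def predict_cursor(prev, curr, dt_sample_ms, render_lag_ms):
    """Linearly extrapolate an (x, y) position forward by the render lag."""
    vx = (curr[0] - prev[0]) / dt_sample_ms
    vy = (curr[1] - prev[1]) / dt_sample_ms
    return (curr[0] + vx * render_lag_ms, curr[1] + vy * render_lag_ms)

# Cursor moved from (100, 100) to (110, 104) in 10 ms; the frame takes
# 30 ms to render, so draw it where the cursor is predicted to be then.
print(predict_cursor((100, 100), (110, 104), 10, 30))  # (140.0, 116.0)
```

A learned model like the one proposed above would replace the constant-velocity assumption with whatever curvature and acceleration patterns it has picked up from real mouse traces.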
| orbital-decay wrote:
| The only thing worse than lag is uneven lag, which is what
| you're going to end up with. Constant lag can be dealt with
| by players, jitter can't.
| aeyes wrote:
| Quake internet play only became acceptable when client side
| prediction was implemented, I'm sure it would be better to
| have real prediction instead of simple interpolation.
|
| https://raw.githubusercontent.com/ESWAT/john-carmack-plan-
| ar...
| wongarsu wrote:
| Competitive online games commonly predict the player's
| movement. Network latencies have improved and are now
| usually <16ms (useful milestone since at 60fps you render a
| frame every 16.6ms), but players expect to still be able to
| smoothly play when joining from the other side of the
| continent to play with their friends. You usually want
| every client to agree where everyone is, and predicting
| movement leads to less disagreement than what you would get
| from using "outdated" state because of speed-of-light
| delays.
|
| If you want to predict not just position but also
| orientation in a shooter game, that's basically predicting
| the mouse movements.
| brigadier132 wrote:
| Catching cheaters in games might seem like a good use.
| dangerclose wrote:
| is it better than prophet from meta?
| VHRanger wrote:
| I imagine they're both worse than good old exponential
| smoothing or SARIMAX.
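For reference, "good old exponential smoothing" really is only a few lines. A minimal pure-Python sketch of simple exponential smoothing (SES), the kind of baseline being alluded to; the alpha value and sample data are arbitrary:

```python
# Simple exponential smoothing: each observation nudges the level;
# the one-step-ahead forecast is the final smoothed level.
def ses_forecast(series, alpha=0.5):
    """Return the SES forecast for the next point of `series`."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

print(ses_forecast([10, 12, 11, 13, 12], alpha=0.3))
```

Libraries like statsmodels provide tuned versions of this (and SARIMAX) with automatic parameter fitting; the point of the comment is that these cheap methods are a strong bar for any foundation model to clear.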
| Pseudocrat wrote:
  | Depends on the use case. Hybrid approaches have been dominating
  | the M-Competitions, but the accuracy differences between
  | statistical and machine learning models are generally small
  | percentages.
  |
  | And exponentially higher cost for ML models.
| VHRanger wrote:
| At the end of the day, if training or doing inference on
| the ML model is massively more costly in time or compute,
| you'll iterate much less with it.
|
| I also think it's a dead end to try to have foundation
| models for "time series" - it's a class of data! Like when
| people tried to have foundation models for any general
| graph type.
|
| You could make foundation models for data within that type
| - eg. meteorological time series, or social network graphs.
| But for the abstract class type it seems like a dead end.
| rockinghigh wrote:
| These models may be helpful if they speed up convergence
| when fine tuned on business-specific time series.
| SpaceManNabs wrote:
| is there a ranking of the methods that actually work on
| benchmark datasets? Hybrid, "ML" or old stats? I remember
| eamonnkeogh doing this on r/ML a few years ago.
| l2dy wrote:
| Blog link (Feb 2024): https://research.google/blog/a-decoder-
| only-foundation-model...
|
| Previous discussion:
| https://news.ycombinator.com/item?id=39235983
| whimsicalism wrote:
| I'm curious why we seem convinced that this is a task that is
| possible or something worthy of investigation.
|
| I've worked on language models since 2018, even then it was
| obvious why language was a useful _and transferable_ task. I do
| not at all feel the same way about general univariate time series
| that could have any underlying process.
| sarusso wrote:
 | +1 for "any underlying process". It would be interesting to
 | know what use case they had in mind.
| baq wrote:
| well... if you look at a language in a certain way, it is just
| a way to put bits in a certain order. if you forget about the
| 'language' part, it kinda makes sense to try because why
| shouldn't it work?
| IshKebab wrote:
| Why not? There are plenty of time series that have underlying
| patterns which means you can do better than a total guess even
| without any knowledge of what you are predicting.
|
| Think about something like traffic patterns. You probably won't
| predict higher traffic on game days, but predicting rush hour
| is going to be pretty trivial.
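The rush-hour point corresponds to a standard baseline, the seasonal-naive forecast: repeat the value observed exactly one period earlier. A minimal sketch with an invented toy traffic series (period and counts are made up):

```python
# Seasonal-naive forecast: each future step reuses the value from one
# full period earlier (e.g. 24 hours for a daily traffic cycle).
def seasonal_naive(series, period, horizon):
    """Forecast `horizon` steps by repeating the last observed period."""
    return [series[-period + (h % period)] for h in range(horizon)]

# Two "days" of traffic counts with a clear cycle of length 4.
traffic = [10, 80, 30, 70, 12, 82, 28, 71]
print(seasonal_naive(traffic, period=4, horizon=4))  # [12, 82, 28, 71]
```

This captures rush hour trivially, and as the comment notes, it is exactly the game-day exceptions that require knowledge beyond the raw series.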
| smokel wrote:
| The things that we are typically interested in have very clear
| patterns. In a way, if we find that there are no patterns, we
| don't even try to do any forecasting.
|
| "The Unreasonable Effectiveness of Mathematics in the Natural
| Sciences" [1] hints that there might be some value here.
|
| [1]
| https://en.m.wikipedia.org/wiki/The_Unreasonable_Effectivene...
| yonixw wrote:
  | Exactly. So for example, I think the use of this model is in
  | cases where you want user counts to follow some timing pattern,
  | and to be alerted on a spike.
  |
  | But you wouldn't want this model for file upload storage
  | usage, which only increases; there you would put alerts based
  | on max values rather than patterns/periodic values.
| zeroxfe wrote:
| > I'm curious why we seem convinced that this is a task that is
| possible or something worthy of investigation.
|
| There's a huge industry around time series forecasting used for
| all kinds of things like engineering, finance, climate science,
| etc. and many of the modern ones incorporate some kind of
| machine learning because they deal with very high dimensional
| data. Given the very surprising success of LLMs in non-language
| fields, it seems reasonable that people would work on this.
| whimsicalism wrote:
| Task specific time series models, not time series "foundation
| models" - we are discussing different things.
| wuj wrote:
| Time series data are inherently context sensitive, unlike
| natural languages which follow predictable grammar patterns.
| The patterns in time series data vary based on context. For
| example, flight data often show seasonal trends, while electric
| signals depend on the type of sensor used. There's also data
| that appear random, like stock data, though firms like Rentech
 | manage to consistently find underlying alphas. Training on
 | multivariate time series data would be challenging, but I don't
 | see why not for specific applications.
| shaism wrote:
| Fundamentally, the pre-trained model would need to learn a
| "world model" to predict well in distinct domains. This should
 | be possible, setting aside compute requirements and the exact
 | architecture.
|
| After all, the physical world (down to the subatomic level) is
| governed by physical laws. Ilya Sutskever from OpenAI stated
| that next-token prediction might be enough to learn a world
| model (see [1]). That would imply that a model learns a "world
| model" indirectly, which is even more unrealistic than learning
| the world model directly through pre-training on time-series
| data.
|
| [1] https://www.youtube.com/watch?v=YEUclZdj_Sc
| whimsicalism wrote:
  | But the data-generating process could be literally anything.
  | We are not constrained by physics in any real sense if we are
  | predicting financial markets, occurrences of a certain build
  | error, or termite behavior.
| shaism wrote:
| Sure, there are limits. Not everything is predictable, not
| even physics. But that is also not the point of such a
| model. The goal is to forecast across a broad range of use
   | cases that do have underlying laws. Similar to LLMs, they
   | could also be fine-tuned.
| itronitron wrote:
| There was a paper written a while back that proved
| mathematically how you can correlate any time series with any
| other time series, thus vaporizing any perception of value
| gained by correlating time series (at least for those people
| who read the paper). Just wanted to share.
| polskibus wrote:
| how good is it on stocks?
| svaha1728 wrote:
| The next index fund should use AI. What could possibly go
| wrong?
| whimsicalism wrote:
| I promise you your market-making counterparties already are.
| hackerlight wrote:
| What kind of things are they doing with AI?
| whimsicalism wrote:
| Predicting price movements, finding good hedges, etc.
| claytonjy wrote:
| if I knew it was good, why would I tell you that?
| esafak wrote:
| Is anyone using neural networks for anomaly detection in
| observability? If so, which model and how many metrics are you
| supporting per core?
| leeoniya wrote:
| LSTM is common for this.
|
| also https://facebook.github.io/prophet/
| morkalork wrote:
  | How data hungry is it, or what is the minimum volume of data
  | needed before it's worth investigating?
| sarusso wrote:
| What do you mean by "observability"?
| esafak wrote:
| Telemetry. Dashboards. The application is knowing when a
| signal is anomalous.
|
| https://en.wikipedia.org/wiki/Observability_(software)
| sarusso wrote:
| Oh, yes I am working on that. Usually LSTM, exploring
| encoder-decoders and generative models, but also some
| simpler models based on periodic averages (which are
| surprisingly useful in some use cases). But I don't have
| per-core metrics.
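The "periodic averages" approach mentioned here can be sketched simply: average historical values per position within the period, then flag points that stray too far from their slot's baseline. The data and tolerance below are invented for illustration.

```python
# Periodic-average anomaly detection: build a per-slot baseline
# (e.g. per hour of day), then flag large deviations from it.
from statistics import mean

def build_baseline(history, period):
    """Average historical values per position within the period."""
    slots = [[] for _ in range(period)]
    for i, x in enumerate(history):
        slots[i % period].append(x)
    return [mean(s) for s in slots]

def is_anomalous(value, t, baseline, tolerance):
    """Flag `value` at time `t` if it strays from its slot's average."""
    return abs(value - baseline[t % len(baseline)]) > tolerance

hist = [10, 50, 30, 12, 48, 32, 11, 52, 28]  # repeating period-3 pattern
base = build_baseline(hist, period=3)
print(is_anomalous(90, t=1, baseline=base, tolerance=15))  # True
print(is_anomalous(49, t=1, baseline=base, tolerance=15))  # False
```

Despite its simplicity, this kind of baseline is hard to beat on strongly periodic telemetry, which matches the "surprisingly useful" remark above.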
| tiagod wrote:
| Depending on how stable your signal is, I've had good
| experience with seasonal ARIMA and LOESS (but it's not
| neural networks)
| optimalsolver wrote:
| When it comes to time series forecasting, if the method actually
| works, it sure as hell isn't being publicly released.
| baq wrote:
| and yet we have those huge llamas publicly available. these are
| computers that talk, dammit
| speedgoose wrote:
| Some times series are more predictable than others. Being good
| at predicting the predictable ones is useful.
|
 | For example you can easily predict the weather with decent
 | accuracy: tomorrow is going to be about the same as today.
 | From there you can work on better models.
|
| Or predicting a failure in a factory because a vibration
| pattern on an industrial machine always ended up in a massive
| failure after a few days.
|
| But I agree that if a model is good at predicting the stock
| market, it's not going to be released.
| mhh__ wrote:
| Dear googler or meta-er or timeseries transformer startup
| something-er: Please make a ChatGPT/chat.lmsys.org style
| interface for one of these that I can throw data at and see what
| happens.
|
| This one looks pretty easy to set up, in fairness, but some other
| models I've looked at have been surprisingly fiddly / locked
| behind an API.
|
| Perhaps such a thing already exists somewhere?
| wuj wrote:
| On a related note, Amazon also had a model for time series
| forecasting called Chronos.
|
| https://github.com/amazon-science/chronos-forecasting
| toasted-subs wrote:
 | Something I've had issues with in time series has been having
 | to use relatively custom models.
 |
 | It's difficult to use off-the-shelf tools when starting with
 | math models.
| claytonjy wrote:
| And like all deep learning forecasting models thus far, it
| makes for a nice paper but is not worth anyone using for a real
| problem. Much slower than the classical methods it fails to
| beat.
| p1esk wrote:
| That's what people said about CV models in 2011.
| claytonjy wrote:
| That's fair, but they stopped saying it about CV models in
| 2012. We've been saying this about foundational forecasting
| models since...2019 at least, probably earlier. But it is a
| harder problem!
| aantix wrote:
| Would this be useful in predicting lat/long coordinates along a
| path? To mitigate issues with GPS drift.
|
| If not, what would be a useful model?
| smokel wrote:
| Map matching to a road network might be helpful here. For
| example, a Hidden Markov Model gives good results. See for
| instance this paper:
|
| "Hidden Markov map matching through noise and sparseness"
| (2009)
|
| https://www.microsoft.com/en-us/research/wp-content/uploads/...
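The cited paper's idea can be caricatured in a toy Viterbi decoder: hidden states are candidate road segments, emission scores penalize each GPS fix's distance to a segment, and transition costs penalize implausible segment switches. All numbers below are invented, and a real implementation would use calibrated probability models and road-network topology rather than a fixed switch penalty.

```python
# Toy HMM map matching: pick the most plausible road-segment sequence
# for a series of GPS fixes via Viterbi decoding in log-space.
def viterbi(obs_dists, trans_cost):
    """obs_dists[t][s]: GPS-to-segment distance; returns best segment path."""
    n_states = len(obs_dists[0])
    score = [-d * d for d in obs_dists[0]]  # Gaussian-ish emission score
    back = []
    for dists in obs_dists[1:]:
        new_score, ptrs = [], []
        for s, d in enumerate(dists):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] - trans_cost[p][s])
            new_score.append(score[best_prev] - trans_cost[best_prev][s] - d * d)
            ptrs.append(best_prev)
        score = new_score
        back.append(ptrs)
    # Backtrack from the best final state.
    path = [max(range(n_states), key=lambda s: score[s])]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    return list(reversed(path))

# Two candidate segments, three GPS fixes. The noisy middle fix is almost
# equidistant from both segments, but the transition penalty keeps the
# path on segment 0 until the evidence clearly favors switching.
distances = [[0.1, 3.0], [2.0, 2.1], [3.0, 0.2]]
switch_penalty = [[0.0, 5.0], [5.0, 0.0]]
print(viterbi(distances, switch_penalty))  # [0, 0, 1]
```

This smoothing through ambiguous fixes is exactly how map matching mitigates GPS drift: the road network constrains the answer far more than the raw coordinates do.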
| chaos_emergent wrote:
| "Why would you even try to predict the weather if you know it's
| going to be wrong?"
|
| - most OCs on this thread
___________________________________________________________________
(page generated 2024-05-08 23:00 UTC)