[HN Gopher] Time-Series Anomaly Detection: A Decade Review
___________________________________________________________________
Time-Series Anomaly Detection: A Decade Review
Author : belter
Score : 303 points
Date : 2025-01-06 11:10 UTC (11 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| whatever1 wrote:
| Sometimes HN just reads my mind. This was exactly the topic I was
| looking into this week.
| zaporozhets wrote:
| I recently tried to homebrew some anomaly detection work for a
| performance tracking project and was surprised at the absence of
| any off-the-shelf OSS or Paid solutions in this space (that
| weren't super basic or way too complex). Lots of fertile ground
| here!
| rad_gruchalski wrote:
| There's a ton of material related to anomaly detection with
| the Prometheus and Grafana stack:
| https://grafana.com/blog/2024/10/03/how-to-use-prometheus-to...
| But maybe this is the "way too complex" case you mention.
| CubsFan1060 wrote:
| I'm still playing around with this one:
| https://grafana.com/blog/2024/10/03/how-to-use-prometheus-to...
| (there's a github repo for it).
|
| So far, it's not terrible, but has some pretty big flaws.
| jcreixell wrote:
| Hi, co-author of the blog post here. I would love to learn
| more about the flaws you see and any ideas on how to improve
| it! We definitely plan to iterate on it and make it as good
| as we possibly can.
| nyrikki wrote:
| Not really related to the above post, but one thing I am
| not seeing on an initial pass is the advancement of
| understanding of problems like riddled or wada basins.
|
| Especially with time delays and 3+ attractors, this can be
| problematic.
|
| A simple example:
|
| https://doi.org/10.21203/rs.3.rs-1088857/v1
|
| There are tools to try and detect these features that were
| found over the past few decades, and I know I wasted a few
| years on a project that superficially looked like a FP
| issue, but ended up being a mix of the wada property and/or
| porous sets.
|
| The complications of describing these worse-than-traditional-
| chaos indeterminate situations may make it inappropriate for
| you.
|
| But it would be nice if visibility was increased. Funnily
| enough, most LLMs' corpus is mostly fed from an LSAT question.
|
| There has been a lot of movement here when you have n>=3
| attractors/exits.
|
| Not solutions unfortunately, but tools to help figure out
| when you hit it.
| CubsFan1060 wrote:
| To be clear "some big flaws" was probably overstating it.
| I'm going to edit that. Also, thanks for the work on this.
| I would absolutely love to contribute, but my maths are not
| good enough for this :)
|
| The biggest thing I've run into in my testing is that an
| anomaly of reasonably short timeframe seems to throw the
| upper and lower bands off for quite some time.
|
| That being said, perhaps changing some of the variables
| would help with that, and I just don't have enough skill to
| be able to understand the exact way to adjust that.
| pnathan wrote:
| The number of manual tweaks the approach requires suggests
| that it is essentially an ad hoc experimental fitting,
| rather than a stable theoretical model that can adapt to
| your time series.
| ramon156 wrote:
| I needed a TS anomaly detection for my internship because we
| needed to track when a machine/server was doing poorly or had
| unplanned downtime. I expected Microsoft's C# library to be
| able to do this, but my god, it's a mess. If someone has the
| time and will to implement a proper library then that would be
| awesome.
| neonsunset wrote:
| Anomaly detection in time-series data is not a concern of the
| standard library of all things. Nor is it a concern of "base
| abstractions" shipped as extensions (think ILogger).
| Phurist wrote:
| If only life was as simple as calling .isAnomaly() on
| anything
| neonsunset wrote:
| Hardcoded to return 'false' of course. Because nothing
| ever happens!
| jeffbee wrote:
| The reason there are not off-the-shelf solutions is this is an
| unsolved problem. There is no approach that is generally
| useful.
| phirschybar wrote:
| agreed. at my company we ended up rolling our own system. but
| this area is absolutely ripe for some configurable saas or OS
| tool with advanced reporting and alerting mechanisms. Datadog
| has a decent offering, but it's pretty $$$$.
| montereynack wrote:
| Gonna throw in my hat and say that if you're working on
| industrial applications (like energy or manufacturing) give
| us a holler at www.sentineldevices.com! Plug-and-play time
| series monitoring for industrial applications is exactly what
| we do.
| quijoteuniv wrote:
| I use the offset function in Prometheus to average past weeks as a
| recording rule. Usage of our systems is very "seasonal", as in
| weekly cycles, so I average some metric over the past four weeks
| ((offset 1 week + offset 2 weeks + offset 3 weeks + offset 4
| weeks) / 4) and compare it to the current value of that metric.
| That way the alarms can be set day or night, weekday or weekend,
| and the thresholds are dynamic. It compares against an average for
| the same day of the week and time of day. Someone at GitLab posted
| a more in-depth explanation of this way of working.
| https://about.gitlab.com/blog/2019/07/23/anomaly-detection-u...
| Things get a bit more complicated with holidays, but you can
| actually program them into Prometheus.
| https://promcon.io/2019-munich/slides/improved-alerting-with...
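The weekly-offset baseline described above can be sketched outside Prometheus in a few lines of Python. This is only an illustration of the idea, not the commenter's recording rule: the hourly sampling, the function names, the 50% tolerance and the synthetic data are all assumptions made for the example.

```python
from statistics import mean

WEEK = 7 * 24  # samples per week, assuming one sample per hour


def weekly_baseline(series, t, weeks=4, period=WEEK):
    """Mean of the values observed 1..`weeks` periods before index t."""
    if t - weeks * period < 0:
        raise ValueError("not enough history for the requested offsets")
    return mean(series[t - k * period] for k in range(1, weeks + 1))


def is_anomalous(series, t, tolerance=0.5, weeks=4, period=WEEK):
    """True when series[t] deviates from its weekly baseline by more than
    `tolerance`, expressed as a fraction of the baseline."""
    baseline = weekly_baseline(series, t, weeks, period)
    return abs(series[t] - baseline) > tolerance * abs(baseline)


def demo_series():
    """Five weeks of synthetic hourly data: busy for five days, quiet for two."""
    return [150.0 if (i % WEEK) < 5 * 24 else 40.0 for i in range(5 * WEEK)]


series = demo_series()
t = 4 * WEEK + 130            # a quiet-period hour in the fifth week
print(is_anomalous(series, t))
series[t] = 150.0             # normal for a busy hour, unusual for a quiet one
print(is_anomalous(series, t))
```

The second check illustrates why the threshold is "dynamic": a value that is ordinary during the busy part of the week is flagged during the quiet part, which a single static threshold could not do. Note that a baseline of zero makes any non-zero value count as a deviation in this sketch.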
| CubsFan1060 wrote:
| Gitlab also has this:
| https://gitlab.com/gitlab-com/gl-infra/tamland
|
| I'm not really smart in these areas, but it feels like
| forecasting and anomaly detection are pretty related. I could
| be wrong though.
| diab0lic wrote:
| You are not wrong! An entire subclass of anomaly detection
| can basically be reduced to: forecast the next data point and
| then measure the forecast error when the data point arrives.
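A minimal sketch of the "forecast, then measure the error" idea from the comment above. The forecaster here is simple exponential smoothing, chosen only because it fits in a few lines; the smoothing factor `alpha`, the threshold multiplier `k` and the `warmup` length are illustrative assumptions, not values from the thread or the paper.

```python
from statistics import pstdev


def forecast_error_anomalies(series, alpha=0.3, k=4.0, warmup=10):
    """Return indices whose one-step-ahead forecast error is unusually large.

    The forecast for index t is the exponentially smoothed level of all
    earlier points. An error is "unusually large" when it exceeds k times
    the standard deviation of the errors seen so far.
    """
    level = series[0]
    errors = []
    flagged = []
    for t in range(1, len(series)):
        error = series[t] - level          # `level` is the forecast for t
        if len(errors) >= warmup:
            scale = pstdev(errors)
            if scale > 0 and abs(error) > k * scale:
                flagged.append(t)
        errors.append(error)
        # Update the forecast with the newly arrived point.
        level = alpha * series[t] + (1 - alpha) * level
    return flagged


# Synthetic data: a small regular oscillation with one spike.
data = [10.0 + (i % 2) for i in range(30)]
data[20] = 30.0
print(forecast_error_anomalies(data))
```

One design consequence worth noticing: the spike's own error is added to the error history, so the threshold widens for a while afterwards. A production system would likely need to decide whether flagged points should update the model at all.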
| fnordpiglet wrote:
| Well it doesn't really require a forecast - variance-based
| anomaly detection doesn't assert a value for the next
| point, only that its maximum change stays within some band.
| Such models usually can't be used to make a forecast other
| than the banding bounds.
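For contrast with the forecasting approach, here is a hedged sketch of a variance-based band: no value is predicted, the check is only whether a new point falls inside mean plus or minus `k` standard deviations of a trailing window. The window length and `k` are assumptions picked for the example.

```python
from statistics import mean, pstdev


def band_anomalies(series, window=12, k=3.0):
    """Return indices that fall outside a rolling mean +/- k*std band.

    The band for index t is computed from the `window` points before t,
    so the point being tested never influences its own band.
    """
    flagged = []
    for t in range(window, len(series)):
        recent = series[t - window:t]
        center = mean(recent)
        spread = pstdev(recent)
        lower, upper = center - k * spread, center + k * spread
        if not (lower <= series[t] <= upper):
            flagged.append(t)
    return flagged


# Synthetic data: a repeating 10, 11, 12 pattern with one outlier.
data = [10.0 + (i % 3) for i in range(40)]
data[25] = 20.0
print(band_anomalies(data))
```

If the trailing window is perfectly flat, the band collapses to a single value and any change is flagged; a real deployment would probably want a minimum band width.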
| diab0lic wrote:
| That would be a different subclass of anomaly detection
| solutions.
| gr3ml1n wrote:
| Whenever I have a chart in Grafana that isn't too dense, I
| almost always add a line for the 7d offset value. Super useful
| to tell what's normal and what isn't.
| eth0up wrote:
| I had not known of Time Series (or most other) anomaly detection
| methods until recently, when I used several LLMs to assist with
| an analysis of the Florida Lottery Pick4 history.
|
| For years, I'd been casually observing the daily numbers (2 draws
| daily for each since around ?/?/2004?, and 1 prior), which are
| Pick2, 3, 4, and 5, but mostly Pick4, which is 4 digits, thus has
| 1:10,000 odds, vs 1:100, 1:1000 and 1:100,000 for the others.
|
| With truly random numbers, it is pretty difficult to identify
| anything but glaring anomalies. Among the tests performed were:
| clusters (daily, weekly); isolation forest; popular permutations
| (by date, special holidays, etc.); individual digit deviations;
| temporal frequency; DBSCAN; z-score; patterns; correlation;
| external factors; autocorrelation by cluster; predictive
| modeling; chi-squared; time series ... and a few more I've
| forgotten.
|
| For those wondering why I'd do this, around 2023-23, the FL
| Lottery drastically modified their website. Previously, one could
| enter a number for the game of their choice and receive all
| historical permutations of that number over all years, going back
| to the 1990s. With the new modification, the permutations have
| been eliminated and the history only shows for 2 years. The only
| option for the complete history is to download the provided PDF
| -- however, it is full of extraneous characters and cannot be
| readily searched via Ctrl-F, etc. Processing this PDF involves
| extensive character removal to render it parsable or modestly
| readable. So to restore the previously functional search ability,
| manual work is required. The seemingly deliberate obfuscation, or
| obstruction, was my motivation. The perceived anomalies over the
| years were secondary, as I am capable of little more than
| speculation without proper testing. But those two factors
| intrigued me.
|
| Having no background in math and only feeble abilities in
| programming, this was a task that I could not have performed
| without LLMs and python code used for the tests. The test is
| still incomplete, having increased in complexity as I progressed
| and left me too tired to persist. The results were ultimately
| within acceptable ranges of randomness, but some patterns were
| present. I had made files of: all numbers that ever occurred; all
| numbers that have never occurred; and the popularity of single,
| isolated digits -- I was actually correct in my intuition here,
| which proved certain single digits were occurring with lesser or
| greater frequencies than would be expected. I also wrote a script
| to apply Optical Character Recognition to the website and append
| the latest results to a living text and PDF file, to offer anyone
| interested an opportunity to freely search, parse and analyze the
| numbers. But I couldn't quite wangle the OCR successfully.
|
| I'm working with a set of over 60k individual number sets, looking
| for anomalies over a 30 year period; if there are other methods
| anyone would suggest, please offer them and I might resume this
| abandoned project.
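Of the tests listed in the comment above, the chi-squared check on single-digit frequencies is the easiest to show compactly. The sketch below assumes the draws are already cleaned into 4-character digit strings (the input format and function name are hypothetical), and it compares against the standard critical value for 9 degrees of freedom at the 5% level.

```python
from collections import Counter

# Chi-squared critical value for 9 degrees of freedom at alpha = 0.05.
CRITICAL_DF9_P05 = 16.919


def digit_chi_squared(draws):
    """Chi-squared statistic for digit frequencies against a uniform
    expectation. `draws` is an iterable of digit strings such as '0427';
    any non-digit characters are ignored."""
    counts = Counter(ch for draw in draws for ch in draw if ch in "0123456789")
    total = sum(counts.values())
    expected = total / 10
    return sum((counts.get(str(d), 0) - expected) ** 2 / expected
               for d in range(10))


# Every possible Pick4 number exactly once: perfectly uniform digits.
uniform = [f"{i:04d}" for i in range(10000)]
statistic = digit_chi_squared(uniform)
print(statistic, statistic > CRITICAL_DF9_P05)
```

A statistic above the critical value suggests the digit frequencies differ from uniform more than chance alone would usually produce. When many different tests are run on the same data, as described above, a few will cross the 5% line by chance, so a single "significant" result is weak evidence on its own.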
| ukuina wrote:
| You could use a Visual LLM to transcribe the PDF back into JSON
| data for you.
|
| Something like: ghostpdf to convert PDF into images, then
| gpt-4o or ollama+Llama3 to transcribe each image into output
| JSON.
| eth0up wrote:
| First, let me admit I'm slow to understand, and I also may
| have explained the above poorly.
|
| The PDF is thousands of newlines, with multiple entries on
| each, convoluted with lots of garbage formatting. The only
| data to be preserved is winning num, date, evening/midday
| draw, fireball number (introduced in 2020-ish?) and factored
| along with the change from one to two daily draws (2001-ish)
| as acceptable anomalies in the data set, which has been done,
| I believe.
|
| The difficult part of this, actually, was cleaning the
| squalid pdf.
|
| In the end, after the work was all done, the script using OCR
| would successfully append a number to the cleaned text/pdf,
| but usually not the correct num. The only reason I used OCR
| was that I couldn't find the right frames in the webpage that
| contained the latest winning numbers, and getting html
| extraction to work in a script failed because of it.
|
| I must admit, although I have used JSON files, I don't know
| much about them. Additionally, I'm ignorant enough that it's
| probably best not to attempt to advise me too much here for
| sake of thread sanitation - it could get bloated and off
| topic :)
|
| I think with renewed inspiration, I could figure out a
| successful method to keep the public file updated, but I
| primarily need surefire methods of analysis upon the nums for
| my anomaly detection, which is a challenge for a caveman who
| never went to middle/high school and didn't resume school
| beyond 4th grade until community college much later. Of
| course, the fact that such an animal can twiddle with
| statistics and data analysis is a big testament to the
| positive attributes of LLMs, without which the pursuit would
| be a vague thought at most.
|
| Although I welcome and appreciate any feedback, I'm pretty
| sure it isn't too welcome here. I'll try to make sense of
| your suggestions though.
| djoldman wrote:
| > Unfortunately, inherent complexities in the data generation of
| these processes, combined with imperfections in the measurement
| systems as well as interactions with malicious actors, often
| result in abnormal phenomena. Such abnormal events appear
| subsequently in the collected data as anomalies.
|
| This is critical; and difficult to deal with in many instances.
|
| > With the term anomalies we refer to data points or groups of
| data points that do not conform to some notion of normality or an
| expected behavior based on previously observed data.
|
| This is a key problem or perhaps _the_ problem: rigorously or
| precisely defining what an anomaly is and _is not_.
| Imanari wrote:
| Look up Eamonn Keogh, he has lots of interesting work on TSAD.
| ivoflipse wrote:
| His Google Tech Talk made me really appreciate his group's work,
| even though I have no need for time series analysis
|
| https://youtu.be/vzPgHF7gcUQ?si=rKQvOjK_qjiSSvKE
| brainwipe wrote:
| Wonderful! My PhD was in stream anomaly detection using dynamic
| neural networks in 2003. Can't wait to go deep through this paper
| and find out what the latest thinking is. Thanks, OP.
| jorl17 wrote:
| I have a soft spot for this area. Almost 10 years ago, my Masters
| touched on something somewhat adjacent to this (Online Failure
| Prediction): https://estudogeral.uc.pt/handle/10316/99218
|
| We built a system to detect exceptions before they happened, and
| act on them, hoping that this would be better than letting them
| happen (e.g. preemptively slow down the rate of requests instead
| of leading to database exhaustion)
|
| At the time, I felt that there was soooooooo much to do in the
| area, and I'm kinda sad I never worked on it again.
| mathewshen wrote:
| Very surprised to see this paper here, and it deserves it! I have
| followed the work of Dr. Boniol since 2021 (starting with the
| Series2Graph paper). Series2Graph is a very good algorithm
| that works well in some complex situations. And his later works
| like _New Trends in Time-Series Anomaly Detection_ , _TSB-UAD_ ,
| _Theseus_ and _k-Graph_ and so on are very insightful too.
| mathewshen wrote:
| If you want to see more algorithms/systems that are used in
| industry at companies like
| Twitter/Microsoft/Amazon/LinkedIn/IBM/..., you can see my notes
| here (the source page is in Chinese, and I just translated it
| into English using Google Translate):
| https://datahonor-com.translate.goog/odyssey/aiops/tsad/pape...
| countzro wrote:
| I also liked the main idea of Series2Graph but found the
| implementation a bit complicated.
|
| There is a similar algorithm with a simpler implementation in
| this paper: "GraphTS: Graph-represented time series for
| subsequence anomaly detection"
| https://pmc.ncbi.nlm.nih.gov/articles/PMC10431630/
|
| The approach is for univariate time series and I found it to
| perform well (with very minor tweaks).
| mikehollinger wrote:
| This doesn't capture work that's happened in the last year or so.
|
| For example, some former colleagues' time-series foundation model
| (Granite TS), which was doing pretty well when we were
| experimenting with it. [1]
|
| An aha moment for me was realizing that the way you can think of
| anomaly models working is that they're effectively forecasting
| the next N steps, and then noticing when the actual measured
| values are "different enough" from the expected. This is simple
| to draw on a whiteboard for one signal but when it's multi
| variate, pretty neat that it works.
|
| [1] https://huggingface.co/ibm-granite/granite-timeseries-ttm-r1
| apwheele wrote:
| Care to share the contexts in which someone needs a zero-shot
| model for time series? I have just never come across one in
| which you don't have some historical data to fit a model and go
| from there.
| delusional wrote:
| In this case I don't think zero-shot means no context. I
| think it's more used in relation to fine-tuning the model
| parameters over your data.
|
| > TTM-1 currently supports 2 modes:
|
| > Zeroshot forecasting: Directly apply the pre-trained model
| on your target data to get an initial forecast (with no
| training).
|
| > Finetuned forecasting: Finetune the pre-trained model with
| a subset of your target data to further improve the forecast
| tessierashpool9 wrote:
| what were you thinking then before your aha moment? :D
| mikehollinger wrote:
| > what were you thinking then before your aha moment? :D
|
| My naive view was that there was some sort of "normalization"
| or "pattern matching" that was happening. Like - you can look
| at a trend line that generally has some shape, and notice
| when something changes or there's a discontinuity. That's a
| very simplistic view - but - I assumed that stuff was trying
| to do regressions and notice when something was out of a
| statistical norm like k-means analysis. Which works, sort of,
| but is difficult to generalize.
| 0cf8612b2e1e wrote:
| My similar recognition was when I read about isolation forests
| for outlier detection[0]. When predictions are different from
| the average, something is off.
|
| [0] https://scikit-learn.org/stable/modules/generated/sklearn.en...
| Dowwie wrote:
| In the nascent world of water tech are IoT devices that monitor
| water flow. These devices can detect leaks and estimate fixture-
| level water consumption. Leak detection is all about identifying
| time series outliers. The distribution-based anomaly detection
| mentioned in the paper is relevant for leak detection.
| Interestingly, a residence may require multiple distributions due
| to pipe temperature variations between warm and cold seasons.
| lebotte wrote:
| Time-series anomaly detection involves using techniques like
| forecasting and historical data offsets to dynamically identify
| deviations in patterns, as discussed in practical applications
| with tools like Prometheus and Grafana.
| conjectures wrote:
| Hi Le Bot, say potato?
| hazrmard wrote:
| Anomaly detection (AD) can arguably be a value-add to any
| industry. It may not be a core product, but AD can help optimize
| operations for almost anyone.
|
| * Manufacturing: Computer vision to pick anomalies off the
| assembly line.
|
| * Operation: Accelerometers/temperature sensors w/ frequency
| analysis to detect onset of faults (prognostics / diagnostics)
| and do predictive maintenance.
|
| * Sales: Timeseries analyses on numbers / support calls to detect
| up/downticks in cashflows, customer satisfaction etc.
| leeoniya wrote:
| a colleague is doing a FOSDEM 2025 talk about
| https://github.com/grafana/augurs
| montereynack wrote:
| Gonna throw in my hat here, time series anomaly detection for
| industrial machinery is the problem my startup is working on!
| Specifically we're making it work offline-by-default (we
| integrate the AI with the equipment, and don't send data to any
| third party servers - even ours) because we feel there's a ton of
| customer opportunities that get left in the dust because they
| can't be online. If you or someone you know is looking for a
| monitoring solution for industrial machinery, or are passionate
| about security-conscious industrial software (we also are
| developing a data historian) let's talk! www.sentineldevices.com
| bluechair wrote:
| Didn't see it mentioned but good to know about: UCR matrix
| profile.
|
| The Matrix Profile is honestly one of the most underrated tools
| in the time series analysis space - it's ridiculously efficient.
| The killer feature is how it just works for finding motifs and
| anomalies without having to mess around with window sizes and
| thresholds like you do with traditional techniques. Solid across
| domains too, from manufacturing sensor data to ECG analysis to
| earthquake detection.
|
| https://www.cs.ucr.edu/~eamonn/MatrixProfile.html
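To make the Matrix Profile idea concrete, here is a deliberately naive brute-force sketch. For every subsequence of length `m` it records the z-normalized Euclidean distance to its nearest non-overlapping neighbor; low values mark repeated shapes (motifs) and the highest value marks the most unusual subsequence (a discord). The function names and the exclusion zone of `m // 2` are assumptions for this example, and the published algorithms are far faster than this quadratic loop. The subsequence length `m` still has to be chosen by the caller.

```python
import math


def znorm(window):
    """Scale a window to zero mean and unit variance (all zeros if flat)."""
    m = len(window)
    mu = sum(window) / m
    sigma = math.sqrt(sum((x - mu) ** 2 for x in window) / m)
    if sigma == 0:
        return [0.0] * m
    return [(x - mu) / sigma for x in window]


def matrix_profile(series, m):
    """Brute-force matrix profile: for each length-m subsequence, the
    distance to its nearest neighbor outside an exclusion zone."""
    n = len(series) - m + 1
    subs = [znorm(series[i:i + m]) for i in range(n)]
    exclusion = m // 2  # skip trivially similar, heavily overlapping windows
    profile = []
    for i in range(n):
        best = math.inf
        for j in range(n):
            if abs(i - j) <= exclusion:
                continue
            best = min(best, math.dist(subs[i], subs[j]))
        profile.append(best)
    return profile


# Synthetic data: a repeating triangle wave with one distorted stretch.
pattern = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0]
data = [pattern[i % 8] for i in range(80)]
data[40:44] = [4.0, 0.0, 4.0, 0.0]

profile = matrix_profile(data, m=8)
discord = max(range(len(profile)), key=profile.__getitem__)
print(discord, profile[discord])
```

In this toy series every clean subsequence has an exact repeat one period away, so its profile value is zero, while windows overlapping the distorted stretch have no close match and stand out.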
| bee_rider wrote:
| What does it do? Anything to do with matrices, like, from math?
| itissid wrote:
| Can someone explain to me how SVMs are being classified in this
| paper as "Distribution-Based"? This is quite confusing as a
| taxonomy. They generally don't estimate model-free densities
| (kernel density estimates) or model-based ones (separating one or
| more possibly overlapping normal distributions).
|
| I get that they could be explicitly modeling the probability of a
| data-generating process itself (just like a NN), such as a
| Bernoulli (whose ML function is cross-entropy) or a Normal (ML
| function: mean squared loss), but I don't think that is what the
| author meant by a distribution.
|
| My understanding is that they don't make distributional
| assumptions on the random variable (your Y or X) they are trying
| to find a max margin for.
___________________________________________________________________
(page generated 2025-01-06 23:00 UTC)