hngopher.com

       [HN Gopher] Time-Series Anomaly Detection: A Decade Review
       ___________________________________________________________________
        
       Time-Series Anomaly Detection: A Decade Review
        
       Author : belter
       Score  : 303 points
       Date   : 2025-01-06 11:10 UTC (11 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | whatever1 wrote:
       | Sometimes HN just reads my mind. This was exactly the topic I was
       | looking into this week.
        
       | zaporozhets wrote:
       | I recently tried to homebrew some anomaly detection work for a
       | performance tracking project and was surprised at the absence of
       | any off-the-shelf OSS or Paid solutions in this space (that
       | weren't super basic or way too complex). Lots of fertile ground
       | here!
        
         | rad_gruchalski wrote:
         | There's a ton of material related to anomaly detection with
         | the Prometheus and Grafana stack:
         | https://grafana.com/blog/2024/10/03/how-to-use-prometheus-to....
         | But maybe this is the "way too complex" case you mention.
        
         | CubsFan1060 wrote:
         | I'm still playing around with this one:
         | https://grafana.com/blog/2024/10/03/how-to-use-prometheus-to...
         | (there's a github repo for it).
         | 
         | So far, it's not terrible, but has some pretty big flaws.
        
           | jcreixell wrote:
           | Hi, co-author of the blog post here. I would love to learn
           | more about the flaws you see and any ideas on how to improve
           | it! We definitely plan to iterate on it and make it as good
           | as we possibly can.
        
             | nyrikki wrote:
             | Not really related to the above post, but one thing I am
             | not seeing on an initial pass is the advancement of
             | understanding of problems like riddled or wada basins.
             | 
             | Especially with time delays and 3+ attractors, this can
             | be problematic.
             | 
             | A simple example:
             | 
             | https://doi.org/10.21203/rs.3.rs-1088857/v1
             | 
             | There are tools to try and detect these features that were
             | found over the past few decades, and I know I wasted a few
             | years on a project that superficially looked like a FP
             | issue, but ended up being a mix of the wada property and/or
             | porous sets.
             | 
             | The complications of describing these
             | worse-than-traditional-chaos indeterminate situations may
             | make it inappropriate for you.
             | 
             | But it would be nice if visibility was increased. Funnily
             | enough, most LLMs' corpus is mostly fed from an LSAT question.
             | 
             | There has been a lot of movement here when you have n>=3
             | attractors/exits.
             | 
             | Not solutions unfortunately, but tools to help figure out
             | when you hit it.
        
             | CubsFan1060 wrote:
             | To be clear "some big flaws" was probably overstating it.
             | I'm going to edit that. Also, thanks for the work on this.
             | I would absolutely love to contribute, but my maths are not
             | good enough for this :)
             | 
             | The biggest thing I've run into in my testing is that an
             | anomaly of reasonably short timeframe seems to throw the
             | upper and lower bands off for quite some time.
             | 
             | That being said, perhaps changing some of the variables
             | would help with that, and I just don't have enough skill to
             | be able to understand the exact way to adjust that.
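
One possible reason for the effect described above, offered as a guess rather than a diagnosis of the Grafana rules: if the bands are computed as mean plus or minus k standard deviations over a trailing window, a single spike inflates both statistics until it ages out of the window. The sketch below is a generic illustration (the function names and the sample data are made up for this example) comparing that band with one built from the median and the median absolute deviation (MAD), which a lone spike barely moves.

```python
from statistics import mean, median, pstdev

def mean_std_band(window, k=3.0):
    """Band of mean +/- k standard deviations over the window."""
    m = mean(window)
    s = pstdev(window)
    return m - k * s, m + k * s

def median_mad_band(window, k=3.0):
    """Band of median +/- k scaled MADs; far less sensitive to a single spike."""
    med = median(window)
    mad = median([abs(x - med) for x in window])
    scale = 1.4826 * mad  # makes the MAD comparable to a std dev for normal data
    return med - k * scale, med + k * scale

# Hypothetical window of samples, then the same window with one short spike.
quiet = [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 9.9, 10.3, 9.7, 10.0]
spiked = quiet[:5] + [100.0] + quiet[6:]

for name, band in (("mean/std", mean_std_band), ("median/MAD", median_mad_band)):
    lo_q, hi_q = band(quiet)
    lo_s, hi_s = band(spiked)
    print(f"{name:>10}: quiet upper={hi_q:.2f}  spiked upper={hi_s:.2f}")
```

With the spike in the window, the mean/std upper band jumps far above the normal range, while the median/MAD band stays close to where it was. Whether a robust statistic like this is practical in PromQL is a separate question.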
        
             | pnathan wrote:
             | The number of manual tweaks required by the approach
             | suggests that it is essentially an ad hoc experimental
             | fitting, rather than a stable theoretical model that can
             | adapt to your time series.
        
         | ramon156 wrote:
         | I needed a TS anomaly detection for my internship because we
         | needed to track when a machine/server was doing poorly or had
         | unplanned downtime. I expected Microsoft's C# library to be
         | able to do this, but my god, it's a mess. If someone has the
         | time and will to implement a proper library then that would be
         | awesome.
        
           | neonsunset wrote:
           | Anomaly detection in time-series data is not a concern of the
           | standard library of all things. Nor is it a concern of "base
           | abstractions" shipped as extensions (think ILogger).
        
             | Phurist wrote:
             | If only life was as simple as calling .isAnomaly() on
             | anything
        
               | neonsunset wrote:
               | Hardcoded to return 'false' of course. Because nothing
               | ever happens!
        
         | jeffbee wrote:
         | The reason there are not off-the-shelf solutions is this is an
         | unsolved problem. There is no approach that is generally
         | useful.
        
         | phirschybar wrote:
         | agreed. at my company we ended up rolling our own system. but
         | this area is absolutely ripe for some configurable saas or OS
         | tool with advanced reporting and alerting mechanisms. Datadog
         | has a decent offering, but it's pretty $$$$.
        
           | montereynack wrote:
           | Gonna throw in my hat and say that if you're working on
           | industrial applications (like energy or manufacturing) give
           | us a holler at www.sentineldevices.com! Plug-and-play time
           | series monitoring for industrial applications is exactly what
           | we do.
        
       | quijoteuniv wrote:
       | I use the offset modifier in Prometheus to make an average of past
       | weeks as a recording rule. We have usage in our systems that is
       | very "seasonal", as in weekly cycles, so I average some metric
       | over its last four weekly offsets ((offset 1w + 2w + 3w + 4w) / 4)
       | and compare it to the current value of that metric. That way the alarms can
       | be set day or night, weekday or weekend, and the thresholds are
       | dynamic. It compares against an average of the day of the week,
       | or time of the day. Someone at GitLab posted a more in-depth
       | explanation of this way of working.
       | https://about.gitlab.com/blog/2019/07/23/anomaly-detection-u...
       | Things get a bit more complicated with holidays, but you can
       | actually program them into Prometheus
       | https://promcon.io/2019-munich/slides/improved-alerting-with...
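
The weekly-offset idea above can be sketched outside of Prometheus as well. This is a minimal Python illustration, not the commenter's recording rule: it assumes one sample per hour, and the names `seasonal_baseline`, `deviates`, and the 50% tolerance are invented for the example.

```python
HOURS_PER_WEEK = 7 * 24  # assumption: one sample per hour

def seasonal_baseline(series, t, weeks=4, period=HOURS_PER_WEEK):
    """Mean of the values at the same point in each of the previous `weeks` cycles."""
    if t < weeks * period:
        raise ValueError("not enough history for the requested number of weeks")
    past = [series[t - k * period] for k in range(1, weeks + 1)]
    return sum(past) / len(past)

def deviates(series, t, tolerance=0.5, weeks=4, period=HOURS_PER_WEEK):
    """True if series[t] differs from its seasonal baseline by more than `tolerance` (relative)."""
    base = seasonal_baseline(series, t, weeks, period)
    return abs(series[t] - base) > tolerance * abs(base)

# Synthetic traffic: busy from 09:00 to 17:59, quiet otherwise, five weeks of hourly data.
traffic = [100.0 if 9 <= h % 24 < 18 else 10.0 for h in range(5 * HOURS_PER_WEEK)]
noon = 4 * HOURS_PER_WEEK + 12
night = 4 * HOURS_PER_WEEK + 3

print(deviates(traffic, noon), deviates(traffic, night))  # both match their own baseline
traffic[night] = 100.0  # daytime-level load at 03:00
print(deviates(traffic, night))
```

The same value (100) is normal at noon and anomalous at 03:00, which is the point of the dynamic threshold: each sample is judged against its own hour of the week.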
        
         | CubsFan1060 wrote:
         | Gitlab also has this: https://gitlab.com/gitlab-com/gl-infra/tamland
         | 
         | I'm not really smart in these areas, but it feels like
         | forecasting and anomaly detection are pretty related. I could
         | be wrong though.
        
           | diab0lic wrote:
           | You are not wrong! An entire subclass of anomaly detection
           | can basically be reduced to: forecast the next data point and
           | then measure the forecast error when the data point arrives.
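
A minimal sketch of that reduction, assuming the simplest possible forecaster: an exponentially weighted moving average as the one-step forecast, with the error judged against a running estimate of the typical error size. The function name and the parameters `alpha`, `k`, and `warmup` are choices made for this example; any real forecaster could take the EWMA's place.

```python
def forecast_error_flags(series, alpha=0.3, k=4.0, warmup=10):
    """Flag points whose one-step forecast error exceeds k times the typical error."""
    flags = []
    level = series[0]  # current forecast for the next point
    sq_err = 0.0       # exponentially weighted mean of squared forecast errors
    for i, x in enumerate(series):
        err = x - level
        scale = sq_err ** 0.5
        flagged = i >= warmup and scale > 0 and abs(err) > k * scale
        flags.append(flagged)
        if not flagged:
            # Only learn from points that look normal, so an anomaly
            # does not drag the forecast or the error scale with it.
            level += alpha * err
            sq_err = (1 - alpha) * sq_err + alpha * err * err
    return flags

# Hypothetical signal: alternates between 10.0 and 10.5, with one jump at index 20.
signal = [10.0 + 0.5 * (i % 2) for i in range(30)]
signal[20] = 30.0
flags = forecast_error_flags(signal)
print([i for i, f in enumerate(flags) if f])
```

Skipping the update on flagged points is a design choice: it keeps the detector stable after a spike, at the cost of adapting slowly if the "anomaly" is actually a permanent level shift.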
        
             | fnordpiglet wrote:
             | Well it doesn't really require a forecast - variance based
             | anomaly detection doesn't make an assertion of the next
             | point but that its maximum change is within some band. Such
             | models usually can't be used to make a forecast other than
             | the banding bounds.
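
For contrast with the forecast-based view, here is one way the variance-based idea could look. It is a sketch under assumptions: the band is placed on the point-to-point change, sized from the standard deviation of recent changes, and the names and defaults (`window=20`, `k=4.0`) are invented for the example. Note that it never predicts a value, only a limit on how far the next point may move.

```python
from statistics import pstdev

def change_band_flags(series, window=20, k=4.0):
    """Flag points whose change from the previous point exceeds k std devs of recent changes."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    flags = [False] * len(series)
    for i in range(window, len(diffs)):
        recent = diffs[i - window:i]
        limit = k * pstdev(recent)  # the band: maximum plausible change
        if limit > 0 and abs(diffs[i]) > limit:
            flags[i + 1] = True     # diffs[i] is the step into series[i + 1]
    return flags

# Hypothetical sawtooth signal 0, 1, 2, 0, 1, 2, ... with one jump at index 40.
saw = [float(i % 3) for i in range(60)]
saw[40] = 50.0
flags = change_band_flags(saw)
print(flags.index(True))
```

The normal steps of this signal are at most 2 in magnitude, well inside the band, while the step into index 40 is far outside it.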
        
               | diab0lic wrote:
               | That would be a different subclass of anomaly detection
               | solutions.
        
         | gr3ml1n wrote:
         | Whenever I have a chart in Grafana that isn't too dense, I
         | almost always add a line for the 7d offset value. Super useful
         | to tell what's normal and what isn't.
        
       | eth0up wrote:
       | I had not known of Time Series (or most other) anomaly detection
       | methods until recently, when I used several LLMs to assist with
       | an analysis of the Florida Lottery Pick4 history.
       | 
       | For years, I'd been casually observing the daily numbers (2 draws
       | daily for each since around ?/?/2004?, and 1 prior), which are
       | Pick2, 3, 4, and 5, but mostly Pick4, which is 4 digits, thus has
       | 1:10,000 odds, vs 1:100, 1:1000 and 1:100,000 for the others.
       | 
       | With truly random numbers, it is pretty difficult to identify
       | anything but glaring anomalies. Among the tests performed
       | were: (clusters\daily\weekly; (isolation forest; (popular
       | permutations\by date\special holidays\etc; (individual
       | digits\deviations; (temporal frequency; (DBSCAN; (z-score;
       | (patterns; (correlation; (external factors; (autocorrelation by
       | cluster; (predictive modeling; (chi-squared; (Time Series ... and
       | a few more I've forgotten.
       | 
       | For those wondering why I'd do this, around 2023-23, the FL
       | Lottery drastically modified their website. Previously, one could
       | enter a number for the game of their choice and receive all
       | historical permutations of that number over all years, going back
       | to the 1990s. With the new modification, the permutations have
       | been eliminated and the history only shows for 2 years. The only
       | option for the complete history is to download the provided PDF
       | -- however, it is full of extraneous characters and cannot be
       | readily searched via Ctrl-F, etc. Processing this PDF involves
       | extensive character removal to render it parsable or modestly
       | readable. So to restore the previously functional search ability,
       | manual work is required. The seemingly deliberate obfuscation, or
       | obstruction, was my motivation. The perceived anomalies over the
       | years were secondary, as I am capable of little more than
       | speculation without proper testing. But those two factors
       | intrigued me.
       | 
       | Having no background in math and only feeble abilities in
       | programming, this was a task that I could not have performed
       | without LLMs and python code used for the tests. The test is
       | still incomplete, having increased in complexity as I progressed
       | and left me too tired to persist. The results were ultimately
       | within acceptable ranges of randomness, but some patterns were
       | present. I had made files of (all numbers that ever occurred;
       | (all numbers that have never occurred; (popularity of single,
       | isolated digits -- I was actually correct in my intuition here,
       | which proved certain single digits were occurring with lesser or
       | greater frequencies than would be expected; (a script to apply Optical
       | Character Recognition on the website and append the latest
       | results to a living text and PDF file to offer anyone interested
       | an opportunity to freely search, parse and analyze the numbers.
       | But I couldn't quite wangle the OCR successfully.
       | 
       | Working with a set of over 60k individual number sets, looking for
       | anomalies over a 30 year period; if there are other methods
       | anyone would suggest, please offer them and I might resume this
       | abandoned project.
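
As a concrete starting point for the single-digit frequency check mentioned above, here is a minimal chi-squared goodness-of-fit sketch. It assumes each draw is available as a four-character string such as "0427" (the cleaned-up format is a guess, since the PDF layout is not shown) and tests the pooled digits against a uniform distribution. The critical value 16.919 is the standard chi-squared threshold for 9 degrees of freedom at the 5% level.

```python
from collections import Counter

CHI2_CRITICAL_DF9_P05 = 16.919  # chi-squared critical value, 9 degrees of freedom, alpha = 0.05

def digit_chi_square(draws):
    """Chi-squared statistic for the pooled digits of all draws against a uniform distribution."""
    counts = Counter(digit for draw in draws for digit in draw)
    total = sum(counts.values())
    expected = total / 10  # every digit 0-9 should be equally likely
    return sum((counts.get(str(d), 0) - expected) ** 2 / expected for d in range(10))

# Hypothetical data: every possible Pick4 result once (perfectly uniform),
# and a deliberately rigged history for comparison.
uniform = [f"{n:04d}" for n in range(10000)]
rigged = ["7777"] * 100

for name, draws in (("uniform", uniform), ("rigged", rigged)):
    stat = digit_chi_square(draws)
    verdict = "suspicious" if stat > CHI2_CRITICAL_DF9_P05 else "consistent with uniform"
    print(f"{name}: chi2={stat:.1f} -> {verdict}")
```

One caveat when running many different tests over the same history: at the 5% level, roughly one in twenty tests will look "suspicious" on perfectly random data, so a few flagged patterns are expected even if nothing is wrong.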
        
         | ukuina wrote:
         | You could use a Visual LLM to transcribe the PDF back into JSON
         | data for you.
         | 
         | Something like: ghostpdf to convert PDF into images, then
         | gpt-4o or ollama+Llama3 to transcribe each image into output
         | JSON.
        
           | eth0up wrote:
           | First, let me admit I'm slow to understand, and I also may
           | have explained the above poorly.
           | 
           | The PDF is thousands of newlines, with multiple entries on
           | each, convoluted with lots of garbage formatting. The only
           | data to be preserved is winning num, date, evening/midday
           | draw, fireball number (introduced in 2020-ish?) and factored
           | along with the change from one to two daily draws (2001-ish)
           | as acceptable anomalies in the data set, which has been done,
           | I believe.
           | 
           | The difficult part of this, actually, was cleaning the
           | squalid pdf.
           | 
           | In the end, after the work was all done, a script using OCR
           | would successfully append a number to the cleaned text/pdf,
           | but usually not the correct num. The only reason I used OCR
           | was that I couldn't find the right frames in the webpage that
           | contained the latest winning numbers, and getting html
           | extraction to work in a script failed because of it.
           | 
           | I must admit, although I have used JSON files, I don't know
           | much about them. Additionally, I'm ignorant enough that it's
           | probably best not to attempt to advise me too much here for
           | sake of thread sanitation - it could get bloated and off
           | topic :)
           | 
           | I think with renewed inspiration, I could figure out a
           | successful method to keep the public file updated, but I
           | primarily need surefire methods of analysis upon the nums for
           | my anomaly detection, which is a challenge for a caveman who
           | never went to middle/high school and didn't resume school
           | beyond 4th grade until community college much later. Of
           | course, the fact that such an animal can twiddle with
           | statistics and data analysis is a big testament to the
           | positive attributes of LLMs, which without, the pursuit would
           | be a vague thought at most.
           | 
           | Although I welcome and appreciate any feedback, I'm pretty
           | sure it isn't too welcome here. I'll try to make sense of
           | your suggestions though.
        
       | djoldman wrote:
       | > Unfortunately, inherent complexities in the data generation of
       | these processes, combined with imperfections in the measurement
       | systems as well as interactions with malicious actors, often
       | result in abnormal phenomena. Such abnormal events appear
       | subsequently in the collected data as anomalies.
       | 
       | This is critical; and difficult to deal with in many instances.
       | 
       | > With the term anomalies we refer to data points or groups of
       | data points that do not conform to some notion of normality or an
       | expected behavior based on previously observed data.
       | 
       | This is a key problem or perhaps _the_ problem: rigorously or
       | precisely defining what an anomaly is and _is not_.
        
       | Imanari wrote:
       | Look up Eamonn Keogh, he has lots of interesting work on TSAD.
        
         | ivoflipse wrote:
         | His Google Tech Talk made me really appreciate his group's work,
         | even though I have no need for time series analysis
         | 
         | https://youtu.be/vzPgHF7gcUQ?si=rKQvOjK_qjiSSvKE
        
       | brainwipe wrote:
       | Wonderful! My PhD was in stream anomaly detection using dynamic
       | neural networks in 2003. Can't wait to go deep through this paper
       | and find out what the latest thinking is. Thanks, OP.
        
       | jorl17 wrote:
       | I have a soft spot for this area. Almost 10 years ago, my Masters
       | touched on something somewhat adjacent to this (Online Failure
       | Prediction): https://estudogeral.uc.pt/handle/10316/99218
       | 
       | We built a system to detect exceptions before they happened, and
       | act on them, hoping that this would be better than letting them
       | happen (e.g. preemptively slow down the rate of requests instead
       | of leading to database exhaustion)
       | 
       | At the time, I felt that there was soooooooo much to do in the
       | area, and I'm kinda sad I never worked on it again.
        
       | mathewshen wrote:
       | Very surprised to see this paper here, and it is deserved! I
       | started following the work of Dr. Boniol in 2021 (via the
       | Series2Graph paper). Series2Graph is a very good algorithm
       | that works well in some complex situations. And his later works
       | like _New Trends in Time-Series Anomaly Detection_, _TSB-UAD_,
       | _Theseus_, and _k-Graph_ are very insightful too.
        
         | mathewshen wrote:
         | If you want to see more algorithms/systems that are used at
         | industry companies like
         | Twitter/Microsoft/Amazon/LinkedIn/IBM/..., you can see my notes
         | here (the source page is in Chinese, and I just translated it
         | into English using Google Translate):
         | https://datahonor-com.translate.goog/odyssey/aiops/tsad/pape...
        
         | countzro wrote:
         | I also liked the main idea of Series2Graph but found the
         | implementation a bit complicated.
         | 
         | There is a similar algorithm with a simpler implementation in
         | this paper: "GraphTS: Graph-represented time series for
         | subsequence anomaly detection"
         | https://pmc.ncbi.nlm.nih.gov/articles/PMC10431630/
         | 
         | The approach is for univariate time series and I found it to
         | perform well (with very minor tweaks).
        
       | mikehollinger wrote:
       | This doesn't capture work that's happened in the last year or so.
       | 
       | For example, some former colleagues' time-series foundation model
       | (Granite TS), which was doing pretty well when we were
       | experimenting with it. [1]
       | 
       | An aha moment for me was realizing that the way you can think of
       | anomaly models working is that they're effectively forecasting
       | the next N steps, and then noticing when the actual measured
       | values are "different enough" from the expected. This is simple
       | to draw on a whiteboard for one signal, but when it's
       | multivariate, it's pretty neat that it works.
       | 
       | [1] https://huggingface.co/ibm-granite/granite-timeseries-ttm-r1
        
         | apwheele wrote:
         | Care to share the contexts in which someone needs a zero-shot
         | model for time series? I have just never come across one in
         | which you don't have some historical data to fit a model and go
         | from there.
        
           | delusional wrote:
           | In this case I don't think zero-shot means no context. I
           | think it's more used in relation to fine-tuning the model
           | parameters over your data.
           | 
           | > TTM-1 currently supports 2 modes:
           | 
           | > Zeroshot forecasting: Directly apply the pre-trained model
           | on your target data to get an initial forecast (with no
           | training).
           | 
           | > Finetuned forecasting: Finetune the pre-trained model with
           | a subset of your target data to further improve the forecast
        
         | tessierashpool9 wrote:
         | what were you thinking then before your aha moment? :D
        
           | mikehollinger wrote:
           | > what were you thinking then before your aha moment? :D
           | 
           | My naive view was that there was some sort of "normalization"
           | or "pattern matching" that was happening. Like - you can look
           | at a trend line that generally has some shape, and notice
           | when something changes or there's a discontinuity. That's a
           | very simplistic view - but - I assumed that stuff was trying
           | to do regressions and notice when something was out of a
           | statistical norm like k-means analysis. Which works, sort of,
           | but is difficult to generalize.
        
         | 0cf8612b2e1e wrote:
         | My similar recognition was when I read about isolation forests
         | for outlier detection[0]. When predictions are different from
         | the average, something is off.
         | 
         | [0] https://scikit-learn.org/stable/modules/generated/sklearn.en...
        
       | Dowwie wrote:
       | In the nascent world of water tech are IOT devices that monitor
       | water flow. These devices can detect leaks and estimate fixture-
       | level water consumption. Leak detection is all about identifying
       | time series outliers. The distribution-based anomaly detection
       | mentioned in the paper is relevant for leak detection.
       | Interestingly, a residence may require multiple distributions due
       | to pipe temperature variations between warm and cold seasons.
        
       | lebotte wrote:
       | Time-series anomaly detection involves using techniques like
       | forecasting and historical data offsets to dynamically identify
       | deviations in patterns, as discussed in practical applications
       | with tools like Prometheus and Grafana.
        
         | conjectures wrote:
         | Hi Le Bot, say potato?
        
       | hazrmard wrote:
       | Anomaly detection (AD) can arguably be a value-add to any
       | industry. It may not be a core product, but AD can help optimize
       | operations for almost anyone.
       | 
       | * Manufacturing: Computer vision to pick anomalies off the
       | assembly line.
       | 
       | * Operation: Accelerometers/temperature sensors w/ frequency
       | analysis to detect onset of faults (prognostics / diagnostics)
       | and do predictive maintenance.
       | 
       | * Sales: Timeseries analyses on numbers / support calls to detect
       | up/downticks in cashflows, customer satisfaction etc.
        
       | leeoniya wrote:
       | a colleague is doing a FOSDEM 2025 talk about
       | https://github.com/grafana/augurs
        
       | montereynack wrote:
       | Gonna throw in my hat here, time series anomaly detection for
       | industrial machinery is the problem my startup is working on!
       | Specifically we're making it work offline-by-default (we
       | integrate the AI with the equipment, and don't send data to any
       | third party servers - even ours) because we feel there's a ton of
       | customer opportunities that get left in the dust because they
       | can't be online. If you or someone you know is looking for a
       | monitoring solution for industrial machinery, or are passionate
       | about security-conscious industrial software (we also are
       | developing a data historian) let's talk! www.sentineldevices.com
        
       | bluechair wrote:
       | Didn't see it mentioned but good to know about: UCR matrix
       | profile.
       | 
       | The Matrix Profile is honestly one of the most underrated tools
       | in the time series analysis space - it's ridiculously efficient.
       | The killer feature is how it just works for finding motifs and
       | anomalies without having to mess around with window sizes and
       | thresholds like you do with traditional techniques. Solid across
       | domains too, from manufacturing sensor data to ECG analysis to
       | earthquake detection.
       | 
       | https://www.cs.ucr.edu/~eamonn/MatrixProfile.html
        
         | bee_rider wrote:
         | What does it do? Anything to do with matrices, like, from math?
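
To give a rough idea of what the matrix profile computes: for every subsequence of length m, it records the distance to that subsequence's nearest neighbor elsewhere in the same series. Low values mark repeated patterns (motifs); the highest value marks the subsequence least like anything else (the discord, i.e. the anomaly). The sketch below is a brute-force illustration of that definition, not the optimized algorithms (such as STOMP or SCRIMP) described on the UCR page, and it still takes the subsequence length `m` as a parameter. Function names and the test signal are made up for the example.

```python
from math import sqrt
from statistics import mean, pstdev

def znorm(sub):
    """Z-normalize a subsequence so shapes are compared, not offsets or amplitudes."""
    m, s = mean(sub), pstdev(sub)
    return [0.0] * len(sub) if s == 0 else [(x - m) / s for x in sub]

def matrix_profile(series, m):
    """Distance from each length-m subsequence to its nearest non-overlapping neighbor."""
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    exclusion = m // 2  # skip near-identical matches with the subsequence's own neighbors
    profile = []
    for i, a in enumerate(subs):
        best = float("inf")
        for j, b in enumerate(subs):
            if abs(i - j) <= exclusion:
                continue
            best = min(best, sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))
        profile.append(best)
    return profile

def top_discord(series, m):
    """Start index of the subsequence that is farthest from everything else."""
    profile = matrix_profile(series, m)
    return max(range(len(profile)), key=profile.__getitem__)

# Hypothetical signal: a repeating triangle wave with one distorted cycle at 30..35.
wave = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0] * 10
wave[30:36] = [0.0, 3.0, 0.0, 3.0, 0.0, 3.0]
print(top_discord(wave, m=6))
```

Every undistorted cycle has an exact twin one period away, so its profile value is zero; only the subsequences overlapping the distorted cycle stand out. The name comes from the full matrix of pairwise subsequence distances, of which the profile keeps just the per-row minimum.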
        
       | itissid wrote:
       | Can someone explain to me how SVMs are being classified in
       | this paper as "Distribution-Based"? This is quite confusing as a
       | taxonomy. They generally don't estimate model-free
       | densities (kernel density estimates) or model-based ones (separating
       | one or more possibly overlapping normal distributions).
       | 
       | I get that they could be explicitly modeling a data-generating
       | process's probability itself (just like a NN), like that of a
       | Bernoulli (whose ML function is cross-entropy) or a Normal (ML
       | function mean squared loss), but I don't think that is what the
       | author meant by a distribution.
       | 
       | My understanding is that they don't make distributional assumptions
       | about the random variable (your Y or X) they are trying to find a
       | max margin for.
        
       ___________________________________________________________________
       (page generated 2025-01-06 23:00 UTC)