[HN Gopher] Spc-kit: A toolkit for statistical process control u...
___________________________________________________________________
Spc-kit: A toolkit for statistical process control using SQL
Author : kqr
Score : 38 points
Date : 2024-03-06 05:57 UTC (17 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| apwheele wrote:
| Not SPC, but a use case in a similar spirit: I have code examples
| here of calculating confidence intervals around proportions in
| SQL, https://andrewpwheeler.com/2020/11/30/confidence-
| intervals-a...
|
| I am one of the folks that for dashboarding prefer to push
| everything to SQL VIEWS and functions. So I use this to monitor
| proportions for different processes month to month, which may
| have error bars +/- five to ten percent.
|
| It is technically not the right test to know if the process
| changed, but is useful by eye to know typical variation.
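The idea of putting error bars around a monthly proportion can be sketched outside SQL too. This is a minimal Python sketch that assumes a Wilson score interval; the linked post may use a different interval, and the name `wilson_interval` is invented for illustration.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Approximate confidence interval for a binomial proportion.

    z = 1.96 corresponds to a roughly 95% interval.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, center - half), min(1.0, center + half)


low, high = wilson_interval(50, 100)
print(f"{low:.3f} to {high:.3f}")
```

With 50 successes in 100 trials the bars come out at about +/- ten percent, which matches the scale of variation mentioned above.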
| jacques_chester wrote:
| I think the SPC tests for detecting process shift would be EWMA
| and Cusum. The code has EWMA support, but it relies on a
| function to work and so is tied directly to PostgreSQL.
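For readers who have not met EWMA charts, here is a rough Python sketch of the textbook calculation. It is not SPC-kit's implementation; the function name, the smoothing weight `lam=0.2` and the limit width of 3 are assumptions chosen for illustration, and `mu0` and `sigma` are assumed to be known in-control values.

```python
import math


def ewma_chart(values, mu0, sigma, lam=0.2, width=3.0):
    """Return one (ewma, lcl, ucl, out_of_control) tuple per observation."""
    rows = []
    z = mu0  # the EWMA statistic starts at the in-control mean
    for t, x in enumerate(values, start=1):
        z = lam * x + (1 - lam) * z
        # The limits start narrow and widen towards an asymptote as t grows.
        spread = width * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        )
        lcl, ucl = mu0 - spread, mu0 + spread
        rows.append((z, lcl, ucl, not (lcl <= z <= ucl)))
    return rows


rows = ewma_chart([10, 10, 10, 10, 14, 14, 14, 14], mu0=10, sigma=1)
flagged = [i for i, row in enumerate(rows) if row[3]]
print(flagged)
```

Because each point is a weighted average of everything before it, a sustained shift accumulates and crosses the limit even when no single observation looks extreme on its own.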
| apwheele wrote:
| Agreed, these are slightly different use cases. (Or maybe I should
| say the SPC approach is the better, more principled approach,
| whereas this is an easy graphical approach and I don't need to
| worry about "resetting" the CUSUM.)
|
| I use the binomial CIs in dashboarding scenarios, so if the
| user selects a different subset of data, all the graphs are
| auto-updated (on the SQL side). That becomes more important when
| people subset the data into tinier slices.
|
| So if you can identify upfront the processes you want to
| monitor, have this send an alert when one is flagged. That is
| better than a dashboard forcing people to click on stuff and
| hope they identify an anomaly in the process.
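The "resetting" of the CUSUM mentioned above refers to zeroing the cumulative sums once they signal. A minimal Python sketch of a two-sided tabular CUSUM follows; the function name and the parameters `k` (allowance) and `h` (decision threshold) are illustrative assumptions, not taken from SPC-kit.

```python
def cusum_signals(values, mu0, k, h):
    """Return the indices at which a two-sided tabular CUSUM signals."""
    upper = lower = 0.0
    signals = []
    for i, x in enumerate(values):
        # Accumulate deviations beyond the allowance k on each side of mu0.
        upper = max(0.0, upper + (x - mu0 - k))
        lower = max(0.0, lower + (mu0 - k - x))
        if upper > h or lower > h:
            signals.append(i)
            upper = lower = 0.0  # the reset: start accumulating from zero again
    return signals


print(cusum_signals([10, 10, 10, 12, 12, 12, 12], mu0=10, k=0.5, h=4))
```

The returned indices are the points where an alert could be sent instead of waiting for someone to spot the shift on a dashboard.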
| melondonkey wrote:
| More dashboards need this I think. I've also added relative
| standard error values on aggregations before to serve as a
| reliability filter that doesn't even show users data when
| they slice it too thin.
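A reliability filter of this kind might look like the Python sketch below. It assumes the aggregate is a mean, defines relative standard error as the standard error divided by the absolute mean, and uses a made-up cutoff of 0.3; none of these details come from the comment above.

```python
import math
import statistics


def reliable_mean(values, max_rse=0.3):
    """Return the mean, or None when its relative standard error is too high."""
    n = len(values)
    if n < 2:
        return None  # the spread cannot be estimated from a single value
    mean = statistics.fmean(values)
    if mean == 0:
        return None  # relative error is undefined for a zero mean
    standard_error = statistics.stdev(values) / math.sqrt(n)
    rse = standard_error / abs(mean)
    return mean if rse <= max_rse else None


print(reliable_mean([10, 10, 10, 10]))
print(reliable_mean([1, 100]))
```

A dashboard could render the `None` case as a blank cell, so users never see a number that is mostly noise.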
| jacques_chester wrote:
| Oh hey, this is me.
|
| The background is that I've thought about applying SPC to
| software behavior for a long time[0]. While I was at Shopify I
| began to think about using it for detecting regressions in the
| YJIT benchmark suite[1]. My first thought was to code a Ruby
| library (Shopify is a Ruby on Rails outfit), but I decided that
| it would be more accessible to more folks if it worked inside a
| database instead. I got caught in a 20% layoff at Shopify and
| that somewhat took the wind out of my sails, which is why it's
| been dormant for a year.
|
| There are some limitations to be aware of in a software context.
|
| First, sample sizes have to be constant. In a world of unreliable
| systems and networks this is a pretty tough constraint. It could
| be achieved by taking fixed subsamples at random from a varying
| larger sample (eg, normally we get 100, pick 50 at random).
| Switching to variable sample size would be possible, but it will
| be a lot of work and I haven't had the motivation to tackle it
| yet. It's also only lightly treated in the main textbook I've
| worked from, because the usual focus is manufacturing operations
| where fixed sample sizes are common.
|
| Second, it doesn't deal with non-normal distributions.
| Classical SPC is rooted in the Normal distribution, and continues
| to work quite well with any distribution that has some kind of
| centrally-located mass that tapers off to the edges. But a lot of
| software behavior follows power laws, especially Pareto
| distributions. There is SPC literature to deal with non-normal
| distributions using non-parametric statistics. But I haven't
| bought and digested the relevant books.
|
| It's worth noting that SPC is as much a system of problem solving
| as it is a bunch of statistical tools. Most major metrics vendors
| now offer some kind of anomaly detection, but then what? What are
| your out-of-control action plans? What is your concept of an
| acceptable amount of variation around the mean? What is your
| target, and why is it your target? Can you even achieve such a
| target with the system as-is, or are you dreaming? SPC attends to
| these questions as well.
|
| I can definitely recommend kqr's introduction[2], which is linked
| from the repo.
|
| [0] https://theoryof.predictable.software/articles/what-is-
| predi...
|
| [1] https://speed.yjit.org
|
| [2] https://two-wrongs.com/statistical-process-control-a-
| practit...
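The fixed-subsample workaround described in the comment above (normally we get 100, pick 50 at random) could be sketched in Python as follows. The function name, the default size and the choice to skip undersized batches are assumptions for illustration only.

```python
import random


def fixed_subsample(batch, size=50, rng=None):
    """Draw exactly `size` measurements at random, without replacement.

    Returns None when the batch is too small, so the caller can skip that
    period rather than chart a sample of a different size.
    """
    if rng is None:
        rng = random.Random()
    if len(batch) < size:
        return None
    return rng.sample(list(batch), size)


batch = list(range(100))  # stand-in for one period's measurements
sample = fixed_subsample(batch, size=50, rng=random.Random(1))
print(len(sample))
```

Skipping short batches keeps the sample size constant at the cost of gaps in the chart; whether that trade-off is acceptable depends on how often batches come up short.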
| jacques_chester wrote:
| I might add that if you're wondering about how the SPC tools
| work, read the comments in the SQL. I deliberately set out to
| write comments that would be helpful both to experienced SPC
| folks _and_ to curious beginners.
|
| Unfortunately the GitHub Linguist grammar for SQL doesn't
| recognize PostgreSQL comments, so highlighting will be wonky in
| places.
| OJFord wrote:
| Before I noticed your readme reading list I found:
| https://r-bar.net/xmr-control-chart-tutorial-examples which I
| think I preferred (having now read both that and [2]) for being
| a bit quicker to the point since I was already imagining things
| I might measure so didn't need the motivating examples. (SPC
| completely new to me - a bit difficult to search for
| information about since it seems popular with management
| consulting, six sigma certs, and the like.)
|
| I'd also be interested in applying this to software for better
| metrics. While thinking about how I might get somewhere quickly
| in an innovation/hackathon day (aside from your project), I came
| across https://grafana.com/grafana/plugins/kensobi-spc-panel/ - it
| looks like it doesn't actually do any calculation, you have to
| feed it your measurements & sub-group sample size etc., but it
| could be handy in conjunction with yours.
| jacques_chester wrote:
| The XmR is in some ways the best beginner's plot because it's
| so immediately tied to the data. The other Shewhart charts
| work with _samples_, which is enough indirection to cause
| some mild confusion on first encounter (for me at least).
|
| That Grafana plugin looks awesome. If I can shave off some
| time from endless leetcode grinding I might play with it.
| OJFord wrote:
| Ha, since I found sample based examples before that article
| I was actually confused in the other direction when I read
| it. (Even more so when it links the table of constants and
| says that's the d2 value for n=2... I assume n=2 is the
| sub-group size, but in XmR as described it seems to
| correspond to the moving average/'range' of two (n=1 sized)
| samples?) Do you have a recommendation that discusses the
| various charts so I can better understand from XmR to n=8
| 'x bar'/'x bar R' etc. (I think I'm using the terminology
| correctly enough - where as I understand it we have n=8 xs
| in each time sample, the mean of which (i.e. x bar) is what
| we plot and then look at the moving average/'range' of)?
| jacques_chester wrote:
| Yes, in XmR the mR is the moving range -- ie the range
| between consecutive individual measurements, not the
| range of a single sample of multiple measurements.
|
| I can't think of a reference for the different charts off
| the top of my head. Certainly working on SPC-kit helped me
| to better understand the differences and how they relate
| to each other. But that's not a very scalable way to
| transmit information.
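To make the moving-range point concrete, here is a small Python sketch of the usual XmR limits. It is not SPC-kit's SQL; the function name is made up. The constants 2.66 (which is 3 divided by d2 = 1.128) and 3.267 are the standard values for n=2, and the n=2 arises because each moving range spans two consecutive individual measurements.

```python
def xmr_limits(values):
    """Compute the centre line and limits for an XmR (individuals) chart."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    # Moving range: absolute difference between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "center": center,
        "x_lcl": center - 2.66 * mr_bar,
        "x_ucl": center + 2.66 * mr_bar,
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,
    }


print(xmr_limits([10, 12, 11, 13, 12]))
```

The X chart plots the individual values against `x_lcl` and `x_ucl`, while the mR chart plots the moving ranges against `mr_ucl`.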
| OJFord wrote:
| No worries, I'll have a read of your code and comments at
| least as you suggested in another comment.
|
| Just a quick question though if I may - why/when would I
| _want_ an n>1 group to average instead of treating them
| all independently as in XmR? Is it if you have multiple
| categories (of say machine operators, production lines,
| or software servers) and that variable (the category)
| isn't the one (x) you're interested in? So for example
| the first sample is the time taken by each of 8
| machinists to fabricate the first nail, etc., so each x_i
| bar is the mean time for a nail over time (i). Or if we
| _were_ interested in comparing the machinists, then x_j
| would be employee #j and for each one we'd be taking the
| mean of their n nails.
| jacques_chester wrote:
| > _Just a quick question though if I may - why/when
| would I want a n>1 group to average instead of treating
| them all independently as in XmR?_
|
| Two reasons come to mind. First, for whatever reason, it
| might not be economical to measure every individual in
| the population. There might be too many, or the thing
| being sampled is a continuous flow (eg liquid product in
| a chemical processing plant) or perhaps the sampling is
| destructive (imagine a pressure failure test, or a
| reagent test).
|
| Second, you might just naturally have a sample. For
| example, each YJIT benchmark is run multiple times
| because there's variability in the measurement not due to
| changes in YJIT. Since you have _n_ runs of the
| benchmark, you naturally have a sample size of _n_. It
| doesn't make sense to think of them as consecutive
| measurements.
|
| The rest of your comment goes to the business of
| selecting what the grouping of samples _is_ (the term is
| "rational subgrouping"). A lot of pages are given to this
| question in the SPC books I've read, because it's not
| difficult to unintentionally mask signal by combining
| things that shouldn't be combined. An example is checking
| the precision of piston machining. It might be you take
| all the pistons in a given engine as your sample. But
| later you discover that one of your four machining
| stations is off-center, which was masked by the ordinary
| variation of the other three. In this case the sampling
| should have been per-station, not per-engine.
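When measurements arrive naturally as samples of size n, as with repeated benchmark runs, the usual chart pair is x-bar and R. The Python sketch below uses the standard Shewhart constants A2 and D4 for subgroup sizes 2 to 5 (D3 is zero for these sizes); the function name and the choice to cover only those sizes are assumptions made to keep the example short.

```python
# Standard control chart constants, keyed by subgroup size n.
A2 = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577}
D4 = {2: 3.267, 3: 2.574, 4: 2.282, 5: 2.114}


def xbar_r_limits(subgroups):
    """Compute x-bar and R chart limits for equally sized subgroups."""
    n = len(subgroups[0])
    if any(len(group) != n for group in subgroups):
        raise ValueError("all subgroups must have the same size")
    if n not in A2:
        raise ValueError("this sketch only covers subgroup sizes 2 to 5")
    means = [sum(group) / n for group in subgroups]
    ranges = [max(group) - min(group) for group in subgroups]
    grand_mean = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)
    return {
        "xbar_center": grand_mean,
        "xbar_lcl": grand_mean - A2[n] * r_bar,
        "xbar_ucl": grand_mean + A2[n] * r_bar,
        "r_center": r_bar,
        "r_lcl": 0.0,  # D3 is 0 for n <= 6
        "r_ucl": D4[n] * r_bar,
    }


print(xbar_r_limits([[10, 12, 11, 13], [11, 11, 12, 12], [9, 12, 10, 13]]))
```

How the measurements are grouped is the rational subgrouping decision discussed above: in the piston example, each inner list would hold measurements from one machining station rather than from one engine.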
| OJFord wrote:
| That's helpful, thank you.
| shadowsun7 wrote:
| I believe you may enjoy this, which takes SPC methods and
| extrapolates from it an entire path to becoming data driven:
| https://commoncog.com/becoming-data-driven-first-principles/
| jacques_chester wrote:
| This looks really good! I'll give it a read and perhaps add
| it to the reading list in the repo.
___________________________________________________________________
(page generated 2024-03-06 23:01 UTC)