[HN Gopher] Whom the gods would destroy, they first give real-ti...
___________________________________________________________________
Whom the gods would destroy, they first give real-time analytics
(2013)
Author : sbdchd
Score : 57 points
Date : 2023-07-25 21:56 UTC (1 hour ago)
(HTM) web link (mcfunley.com)
(TXT) w3m dump (mcfunley.com)
| tech_ken wrote:
| If the main objection to constructing a real-time product
| monitoring system for A/B(C/D/E...) decisions is that optional
| stopping is bad why not throw away the null-hypothesis sig
| testing and instead treat the problem as a multi-armed bandit?
| morkalork wrote:
| MAB and its friends, like contextual MAB, have always been the
| dream. Closing the loop so analytics data is pushed back to the
| decision point in code and isn't a one-way pipe to some
| dashboard is the hardest part though. For non-technical
| reasons.
| tech_ken wrote:
| Sort of a generalized PEBCAK
| whimsicalism wrote:
| Because it is difficult to map that onto real business
| decisions and requires oftentimes supporting a large space of
| possible UI combinations because they haven't been fully ruled
| out yet.
| taeric wrote:
| How well does that dodge the problem? I'd imagine a multi armed
| bandit should stay such that it is always sampling from many
| fair coins, as it were. I would be delighted to read a study on
| that.
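The bandit idea raised in this sub-thread can be sketched with Thompson sampling over Bernoulli arms. This is a minimal illustration, not anything the article or the commenters wrote: the function name `thompson_sampling`, the conversion rates, and the uniform Beta(1, 1) prior are all assumptions chosen for the example.

```python
import random

def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate a Bernoulli bandit; return how often each arm was shown.

    true_rates are hypothetical conversion rates, unknown to the algorithm.
    """
    rng = random.Random(seed)
    wins = [0] * len(true_rates)
    losses = [0] * len(true_rates)
    for _ in range(rounds):
        # Draw one plausible conversion rate per arm from its Beta posterior
        # (Beta(1, 1) prior, i.e. uniform, is an assumption of this sketch).
        samples = [
            rng.betavariate(wins[i] + 1, losses[i] + 1)
            for i in range(len(true_rates))
        ]
        # Show the arm whose sampled rate is highest.
        arm = max(range(len(true_rates)), key=samples.__getitem__)
        # Observe a simulated conversion and update that arm's counts.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return [w + l for w, l in zip(wins, losses)]

if __name__ == "__main__":
    print(thompson_sampling([0.05, 0.20], 5000))
```

There is no stopping rule here: traffic shifts gradually toward whichever arm looks better. When the arms are truly identical, as in taeric's "fair coins" case, nothing forces the split to settle on one arm, so whether this dodges the problem depends on how the allocation is later read as a decision.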
| tqi wrote:
| I also think real time is mostly useless (aside from for
| alerting, which probably is a different tool), but I don't think
| the one day delay is much of (if any) protection against the
| experimental pitfalls described.
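The point that a one-day delay does little against optional stopping can be checked with a small A/A simulation: both arms share the same conversion rate, and we compare "declare a winner on any day the z-test crosses 1.96" against "look only on the final day". The traffic numbers, the 14-day horizon, and the helper names below are illustrative assumptions, not figures from the article.

```python
import math
import random

def z_stat(conv_a, conv_b, n):
    """Two-proportion z statistic for equal sample sizes n per arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0.0, 1.0):
        return 0.0  # no variance yet, so no evidence either way
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    return (conv_a / n - conv_b / n) / se

def false_positive_rates(trials=1000, days=14, per_day=100, rate=0.1, seed=1):
    """Return (rate with daily peeking, rate with one final look)."""
    rng = random.Random(seed)
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(trials):
        a = b = 0
        crossed = False
        z = 0.0
        for day in range(1, days + 1):
            # Both arms draw from the same rate: any "winner" is a false positive.
            a += sum(rng.random() < rate for _ in range(per_day))
            b += sum(rng.random() < rate for _ in range(per_day))
            z = z_stat(a, b, day * per_day)
            if abs(z) > 1.96:
                crossed = True  # a daily peek would have stopped here
        peeking_hits += crossed
        fixed_hits += abs(z) > 1.96  # only the final-day look counts
    return peeking_hits / trials, fixed_hits / trials

if __name__ == "__main__":
    peek, fixed = false_positive_rates()
    print(f"daily peeking: {peek:.3f}  final look only: {fixed:.3f}")
```

By construction the peeking rate can never be lower than the fixed-horizon rate, since a crossing on the last day counts for both. The gap between the two is what repeated looks cost, whether those looks happen every second or once a day.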
| brycewray wrote:
| (2013)
| PeterCorless wrote:
| Exactly. And it betrays the biases of the era. This author
| really got it wrong.
| whimsicalism wrote:
| While there are probably all sorts of problems with marxism when
| it comes to economics, in large companies there should be a
| 'vanguard party' of statisticians who prevent the masses from
| making false claims of causality from p-hacked tests.
| bluecoconut wrote:
| I believe that the comment about CAP theorem violation / treating
| the problem as a technically unsolved thing isn't true. E.g., see
| the dataflow paper that sets up more clear tradeoffs for latency
| and correctness in large scale data processing [1]. I think it
| makes sense to always hold a high bar for your technology -- if
| it's technically feasible, and fits within budgets (time and
| complexity for the team), accepting artificial limitations
| because they soften social problems feels like a mistake, a way of
| buying into a false "ignorance is bliss" mindset. I think the
| problem that is presented is more of a problem of popular
| understanding of statistics and game theory, and not the
| technical problem.
|
| [1] "The Dataflow Model: A Practical Approach to Balancing
| Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-
| of-Order Data Processing" https://research.google/pubs/pub43864/
| hn_throwaway_99 wrote:
| I really liked this article, and I thought this statement hit the
| nail on the head: "Confusing _how we do things_ with _how we
| decide which things to do_ is a fatal mistake. " I've worked at
| companies that practice what I call "thrash management"
| (constantly jumping from one priority to the next based on
| whichever fire happens to be burning brightest that day) and it's
| no fun, to put it mildly.
|
| That said, once you build a system for operational metrics (i.e.
| what you need to detect anomalies that indicate outages, security
| concerns, etc.) you're already a huge way there towards having
| real-time analytics. I still wholeheartedly agree with the author
| that these real time metrics should only be in the service of
| operations, not product planning.
| wellpast wrote:
| And... near real-time with high uptime is relatively more costly
| to build / maintain / deploy / operate than batch -- so save your
| org the cost!
| masswerk wrote:
| > I can understand why engineers are predisposed to see
| instantaneous A/B statistics as self-evidently positive
|
| This is the crucial misunderstanding: in actuality, you are
| running a panel.
|
| (There is no such thing as an A/B test outside of marketing.
| Running a meaningful panel requires some information on the
| population, your samples, the homogeneity of those, etc, just to
| pick the right test, to begin with. Also, you need a controlled
| setup, which notably includes a predetermined, fixed timeframe
| for your panel to run. Before this is over, you have no data, at
| all. You are merely tossing coins...)
| PeterCorless wrote:
| Data scientists also do A/B testing on algorithms to see which
| one has better fit for a use case against real-world, real-time
| data.
| PeterCorless wrote:
| This is very 2013. Meanwhile in 2023, a decade later, you
| literally have systems detecting credit card fraud in
| milliseconds. [Disclosure: I work for StarTree, which is powered
| by Apache Pinot. We eat petabytes of data for breakfast.]
| Ecstatify wrote:
| What has that to do with product decisions?
| asimjalis wrote:
| This has aged well.
| dang wrote:
| Related:
|
| _Whom the Gods Would Destroy, They First Give Real-Time
| Analytics (2013)_ - https://news.ycombinator.com/item?id=15379660
| - Oct 2017 (70 comments)
|
| _Whom the Gods Would Destroy, They First Give Real-time
| Analytics_ - https://news.ycombinator.com/item?id=6515805 - Oct
| 2013 (1 comment)
|
| _Whom the Gods Would Destroy, They First Give Real-time
| Analytics_ - https://news.ycombinator.com/item?id=5032588 - Jan
| 2013 (55 comments)
| throwaway63820 wrote:
| Just use Amplitude
| alex_lav wrote:
| Last I used Amplitude it was insanely expensive. Is that not
| still the case?
| Xeoncross wrote:
| Them: We need metrics to know if the users like this new feature
| we're pushing on them.
|
| Me: or, you know, we could maybe see what the users' biggest issues
| are first and try to build stuff to solve those problems.
___________________________________________________________________
(page generated 2023-07-25 23:00 UTC)