[HN Gopher] Show HN: Open-source A/B Testing framework
___________________________________________________________________
Show HN: Open-source A/B Testing framework
Author : cheeseblubber
Score : 127 points
Date : 2021-08-06 15:53 UTC (7 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| jaggednad wrote:
| I think you guys really did a good job with something that's hard
| to do well. Kudos
| gingerlime wrote:
 | Wow, this looks very polished. Out of frustration with
 | Optimizely, I created and maintain a couple of open-source A/B
 | testing projects[0][1], but the statistical analysis was always
 | the hardest part, so I'm keen to see what you are doing. We're
 | currently relying on a commercial tool called Analytics
 | Toolkit[2] for this part alone and have been quite happy with
 | it. The owner is very knowledgeable and responsive (no
 | affiliation, just a happy customer). I wonder if you could adopt
 | similar ideas/algorithms into the open-source tool; I imagine
 | that could be useful.
|
| [0] https://github.com/Alephbet/alephbet
|
| [1] https://github.com/Alephbet/lamed
|
| [2] https://www.analytics-toolkit.com/
| jrdorn wrote:
| Thanks for the comment and for your work on Alephbet! Open
| source A/B testing is a graveyard of abandoned projects, so
| it's always great to see more people actively working in this
| space.
|
| Georgi at Analytics Toolkit definitely knows his stuff. We're
| taking a Bayesian approach instead, which I know he isn't the
| biggest fan of, but I think it is much easier to understand.
| Itamar Faran, the author of our stats engine, has a great
| article that goes into a lot more detail if you're interested:
| https://towardsdatascience.com/why-you-should-switch-to-baye...
| Dyac wrote:
| Hi. Regular A/B experimenter here.
|
| This looks like a great tool!
|
 | Does your system store the stats itself, or does it rely on
 | them being stored elsewhere, e.g. in GA, and then just let you
 | analyse them?
 |
 | Is it appropriate to send email alerts when "significance" is
 | reached? Without adhering to minimum sample sizes calculated in
 | advance, won't this result in a bunch of Type I errors?
 |
 | Are the changes to the pages made client side or server side? I
 | think client side, but I'm not sure. If so, are they applied
 | synchronously or asynchronously?
|
| Thanks!
| jrdorn wrote:
| Hi, thanks for the questions!
|
| 1. We don't store any raw user data. We pull things like mean
| and standard deviation from data sources, run the statistics,
| and store the result.
|
 | 2. We use a Bayesian statistics engine, which is far less
 | susceptible to peeking problems and Type I errors than
 | frequentist approaches.
 |
 | 3. Tests can be run either client or server side. For client
 | side, we recommend bundling the SDK with your app (webpack,
 | etc.). We really care about performance, so we never want to
 | add additional HTTP requests or script tags of any kind if at
 | all possible.
| Dyac wrote:
| Interesting, thanks.
|
| 1. How do you get around needing session level data instead
| of aggregate data when working with non parametric KPIs? GA
| in particular is notorious for sampling data.
|
 | 2. True, but you can't get away from the fact that a split
 | test run for only a day or two isn't going to give you
 | trustworthy results. Tools that abstract away the statistical
 | reality from lay users cause poor decisions to be made under
 | the guise of being "data driven". I think we as testers, and
 | you as a provider of a testing system, have a duty not to
 | lead businesses to believe they are making statistically
 | sound choices when they may not be.
| jrdorn wrote:
| 1. GA is very limited as a data source because of sampling
| and the fact that they don't expose variance. So if using
| GA, we only support simple binomial metrics, count data
| (assuming Poisson distribution), and duration data
| (assuming exponential distribution). For SQL data sources
| and non-parametric data, we currently rely on the CLT and
| treat the sampling distribution as Normal. There's a good
| article that goes over the stats in more detail (Itamar,
| the author, wrote our stats engine) -
 | https://towardsdatascience.com/how-to-do-bayesian-a-b-testin...
|
| 2. We have a minimum sample size threshold before we run
| any statistics on the data. To your point, we don't want to
| say something is "significant" if it's 5 conversions vs 1.
| This is one area we're looking to improve with better
| heuristics. We can't completely take the human out of the
| loop, but we can help give them all the info they need to
| make the best decision. On that front, we do show Bayesian
| expected loss (risk) and credible intervals in addition to
| just the "chance to beat control".
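 |
 | As a toy illustration of the kind of math involved (a
 | simplified sketch, not our actual stats engine), a binomial
 | metric boils down to a Beta-Binomial conjugate update plus
 | Monte Carlo estimates of "chance to beat control" and
 | expected loss:

```python
# Simplified Bayesian binomial A/B analysis: Beta-Binomial conjugate
# update, then Monte Carlo estimates of "chance to beat control" and
# expected loss (risk). Illustrative only.
import numpy as np

def bayesian_ab(control_conv, control_n, variant_conv, variant_n,
                prior_alpha=1.0, prior_beta=1.0, draws=200_000, seed=0):
    rng = np.random.default_rng(seed)
    # Conjugate posterior for a conversion rate:
    # Beta(alpha + conversions, beta + non-conversions)
    control = rng.beta(prior_alpha + control_conv,
                       prior_beta + control_n - control_conv, draws)
    variant = rng.beta(prior_alpha + variant_conv,
                       prior_beta + variant_n - variant_conv, draws)
    chance_to_beat = np.mean(variant > control)
    # Expected loss if we ship the variant: average shortfall vs control
    expected_loss = np.mean(np.maximum(control - variant, 0.0))
    return chance_to_beat, expected_loss

ctb, loss = bayesian_ab(control_conv=120, control_n=1000,
                        variant_conv=150, variant_n=1000)
print(f"chance to beat control: {ctb:.3f}, expected loss: {loss:.5f}")
```

 | Because the posterior is a full distribution, the same draws
 | also give the credible intervals for the lift.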
| [deleted]
| Dyac wrote:
| Brilliant, thank you.
|
 | Can you use the system to analyse the results of tests it
 | didn't run? I.e. if I run tests using some SaaS that only
 | supports frequentist stats, could I use your system as a
 | Bayesian analysis backend?
| jrdorn wrote:
| Yes. As long as the variation assignment data and success
| metrics are in a supported data source (SQL, GA, or
| Mixpanel currently), it can be queried and analyzed in
| Growth Book.
| dreamer7 wrote:
| Would this be useful for A/B testing mobile app features as well?
| jrdorn wrote:
| We don't have native mobile SDKs yet, but it's something we
| want to support in the future. Mobile is a little tricky since
| you either need to do a new release every time you want to
| start/stop a test or use remote config and deal with offline,
| slow networks, etc.
| nacs wrote:
| Looks promising.
|
| Any plan to support Matomo (formerly Piwik) analytics as a data
| source?
| jrdorn wrote:
| We plan to add MySQL/MariaDB support soon which should let you
| use Matomo data as long as you have raw SQL access. For cloud-
| hosted Matomo, we would have to use the reporting API, which is
| doable but not as good since there's no way to get standard
| deviations out of it as far as I can tell.
| XCSme wrote:
 | I am also building a self-hosted analytics platform[0] that
 | uses a MySQL/MariaDB database, and I provide a way of
 | recording A/B test data. Currently the visualization of the
 | results is not that good, so using a tool like GrowthBook
 | makes sense. I assume that once MySQL support is added, it
 | would be possible to import userTrack data into GrowthBook?
|
| [0]: https://www.usertrack.net
| jrdorn wrote:
| Yep, should be possible once MySQL/MariaDB is done.
| cardosof wrote:
 | Congratulations, that's very cool! Do you intend to add new
 | features to expand the scope (e.g. visitor personalization)?
| [deleted]
| jrdorn wrote:
| We're definitely interested in supporting some personalization
| use cases in the future using contextual bandits.
| tehlike wrote:
 | From the looks of it, the configuration can't be stored in the
 | code repository itself. That's one of the key things to get
 | right: treating configuration as code so you can properly
 | version it, blame it, etc.
|
| Otherwise, this looks great.
| jrdorn wrote:
| That's on our roadmap. We originally built the tool as a multi-
| tenant hosted platform so storing configs in a database made
| the most sense initially. For self hosting, we want to support
| defining db connections and metrics using yml.
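 |
 | As a very rough sketch (all keys hypothetical, not a final
 | schema), a self-hosted config could look something like:

```yaml
# Hypothetical Growth Book self-hosting config -- illustrative only,
# the actual schema is still being designed.
datasources:
  warehouse:
    type: postgres
    host: db.internal.example.com
    database: analytics

metrics:
  - name: signup_conversion
    datasource: warehouse
    type: binomial
    goal: increase
    sql: |
      SELECT user_id, created_at AS timestamp
      FROM signups
```

 | That would let the whole setup live in the repo and be
 | versioned and blamed like any other code.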
| travisjungroth wrote:
| Have you considered cuelang? My new hobby is pointing people
| towards it.
| ablearcher83 wrote:
| Coupling metrics to experimentation is a huge red flag.
|
| You should offload that onto DBT or some other data modeling
| tools.
| jrdorn wrote:
 | Do you mind explaining that a little more? As it's currently
 | designed, a company could use DBT to model their raw data into
 | dedicated metric tables, and then Growth Book sits on top of
 | those with a really simple SQL query and some settings (e.g.
 | whether the goal is to increase or decrease the metric).
|
| I'd love to see an open source standard way to define metrics,
| but haven't found anything yet.
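 |
 | For example (table and column names hypothetical), if DBT
 | materializes a per-user metric table, the Growth Book query
 | on top of it might be as simple as:

```sql
-- DBT builds something like analytics.fct_purchases; the metric
-- query then only needs to expose a user id, timestamp, and value.
SELECT
  user_id,
  purchased_at AS timestamp,
  revenue      AS value
FROM analytics.fct_purchases
```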
| mooneater wrote:
 | Awesome! Can you comment on your choice of MongoDB? I admit I
 | have a negative association with it, but I'm sure there are
 | reasons.
| marcinzm wrote:
 | This was also one of the first things I noticed. Since we
 | don't use it, it would be a decently large operational
 | addition to our stack. Maybe it's needed at larger scales,
 | but for most companies a SQL server should be good enough.
| jrdorn wrote:
 | Hi! One of the authors here. We're using MongoDB to store
 | cached A/B test results (among other things), which are deeply
 | nested JSON objects. MongoDB let us develop features really
 | quickly, so it's been a great choice for us so far. We're
 | willing to add support for another data store if there's a lot
 | of demand for it.
| nrjames wrote:
| Honestly, I'd love to see it just use SQLite as a backend. If
| it's just storing results, that seems feasible and it would
| reduce the complexity of the tech stack.
| sojournerc wrote:
| Nice project! PostgreSQL has excellent JSON support these
| days, including the ability to query nested fields which may
| be beneficial to your project.
| ensignavenger wrote:
| Hi- thanks for making this and open sourcing it!
|
| I would also suggest supporting an alternative to MongoDB.
| Postgres using jsonb is a great option.
|
| I try to always use and support open source components, as
| open source provides much less business risk. Since MongoDB
| isn't itself open source, I would be hesitant to adopt it or
| a product that depends on it. Mongo also has a bad
| reputation...
|
| I would definitely evaluate and likely use your product if it
| did not depend on MongoDB.
| [deleted]
| gqewogpdqa wrote:
| Totally think it's good to have lots of options. PostgreSQL
| using JSONB however is a way to hurt your head. Using SQL
| to manipulate JSON is pretty painful. Why would MongoDB
| Community not be an ok choice? Unless you're planning on
| offering MongoDB as a cloud service, what would the concern
| be?
|
| What's the bad reputation of MongoDB that you're concerned
| about?
|
| Also, you seem to have a really strong bias against it -
| can you explain?
| ensignavenger wrote:
| > PostgreSQL using JSONB however is a way to hurt your
| head
|
| Really? I have used it pretty extensively and like it...
| I don't do a lot of complex manipulations though, it
| might be a pain for some use cases.
|
| > Why would MongoDB Community not be an ok choice?
|
| MongoDB community is SSPL licensed, which is not Open
| Source. While I don't intend to offer a MongoDB hosting
| service, I want the option to fork the code and create
 | (or pay someone else to fork the code and create) a
| hosting service for me to use. This is important because
| MongoDB Inc's business may not always align well with my
| business and my needs. (or they may just decide that they
| don't want to do business with me, maybe they go out of
| business or their business focus shifts or political
| pressures come to bear.) The option to create a viable
| community fork is critical to ensuring that the software
| remains viably usable. The business risk of relying on
| proprietary software is great. The more reliant you are
| on it, the bigger the risk.
|
| > What's the bad reputation of MongoDB that you're
| concerned about?
|
| Mongo has a long history with Jepsen test failures. See
| http://jepsen.io/analyses/mongodb-4.2.6 and the linked
| articles from that page. In addition, I have heard many
| confirmations of issues from folks who have used it in
| production.
|
| > Also, you seem to have a really strong bias against it
| - can you explain?
|
| I think I have explained my position above. I don't have
| any interest in Mongo or any of its competitors. I don't
| personally know anyone involved with it or any of its
| competitors (Though I have naturally had professional
| contact with some.) My strong preference, as previously
| stated, is for Open Source software. This preference
| applies broadly to all software, but especially to
| infrastructure software, and is by no means specific to
| MongoDB.
| pphysch wrote:
 | What is painful about "SELECT json_result FROM
 | test_results WHERE json_result -> 'data' -> 'foo' ->>
 | 'bar' = 'baz'"?
___________________________________________________________________
(page generated 2021-08-06 23:00 UTC)