[HN Gopher] Show HN: Open-source A/B Testing framework
       ___________________________________________________________________
        
       Show HN: Open-source A/B Testing framework
        
       Author : cheeseblubber
       Score  : 127 points
       Date   : 2021-08-06 15:53 UTC (7 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | jaggednad wrote:
       | I think you guys really did a good job with something that's hard
       | to do well. Kudos
        
       | gingerlime wrote:
       | Wow this looks very polished. (Out of frustration with
       | Optimizely) I created and maintain a couple of A/B test open
       | source projects[0][1] but the statistical analysis was always the
       | hardest part so I'm keen to see what you are doing. We're
       | currently relying on a commercial tool called Analytics
       | Toolkit[2] for this part alone and have been quite happy with it
       | though. The owner is very knowledgeable and responsive (no
       | affiliation just happy customers). I wonder if you can adopt
       | similar ideas/algorithms into the open source tool. That can be
       | useful I imagine.
       | 
       | [0] https://github.com/Alephbet/alephbet
       | 
       | [1] https://github.com/Alephbet/lamed
       | 
       | [2] https://www.analytics-toolkit.com/
        
         | jrdorn wrote:
         | Thanks for the comment and for your work on Alephbet! Open
         | source A/B testing is a graveyard of abandoned projects, so
         | it's always great to see more people actively working in this
         | space.
         | 
         | Georgi at Analytics Toolkit definitely knows his stuff. We're
         | taking a Bayesian approach instead, which I know he isn't the
         | biggest fan of, but I think it is much easier to understand.
         | Itamar Faran, the author of our stats engine, has a great
         | article that goes into a lot more detail if you're interested:
         | https://towardsdatascience.com/why-you-should-switch-to-baye...
        
       | Dyac wrote:
       | Hi. Regular A/B experimenter here.
       | 
       | This looks like a great tool!
       | 
       | Does your system store the stats, or does it trust the stats to
       | be stored in, e.g., GA, and then just allow you to analyse them?
       | 
       | Is it appropriate to send email alerts when "significance" is
       | reached? Without adhering to minimum sample sizes calculated in
       | advance won't this result in a bunch of Type 1 errors?
       | 
       | Are the changes to the pages made client side or server side? I
       | think client side, but I'm not sure. If so, are they synchronous or
       | asynchronous?
       | 
       | Thanks!
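
The Type I error concern raised above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it runs A/A tests (both arms share the same true conversion rate) with a frequentist two-proportion z-test, and compares checking once at the end against checking every 100 users. The function names, sample sizes, and rates are hypothetical and are not taken from the project being discussed.

```python
import math
import random


def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation observed yet, nothing to flag
    z = (conv_b / n_b - conv_a / n_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(peek_every, n_per_arm=2000, rate=0.1,
                        alpha=0.05, runs=400, seed=1):
    """Share of A/A tests (no real difference) flagged at any look."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        hit = False
        for i in range(1, n_per_arm + 1):
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            # A "look" happens every `peek_every` users per arm.
            if i % peek_every == 0:
                if z_test_p_value(conv_a, i, conv_b, i) < alpha:
                    hit = True  # an alert would have been sent here
        flagged += hit
    return flagged / runs


single_look = false_positive_rate(peek_every=2000)
many_looks = false_positive_rate(peek_every=100)
print(f"one look at the end:     {single_look:.3f}")
print(f"peeking every 100 users: {many_looks:.3f}")
```

Both calls use the same seed and consume the same random stream, so the peeking variant sees exactly the same data and can only flag more tests, never fewer.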
        
         | jrdorn wrote:
         | Hi, thanks for the questions!
         | 
         | 1. We don't store any raw user data. We pull things like mean
         | and standard deviation from data sources, run the statistics,
         | and store the result.
         | 
         | 2. We use a Bayesian statistics engine which is much more
         | immune to peeking problems and Type I errors than frequentist
         | approaches.
         | 
         | 3. Tests can be run either client or server side. For client
         | side, we recommend bundling the SDK with your app (webpack,
         | etc). We really care about performance, so we never want to add
         | additional HTTP requests or script tags of any kind if at all
         | possible.
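
To make point 1 concrete, here is a minimal sketch of how a result could be computed from aggregates alone (count, mean, standard deviation per variation) without any raw user rows. It assumes flat priors and a Normal approximation for each arm's mean; the `ArmStats` type and `chance_to_beat_control` function are hypothetical names for illustration, not the project's actual engine or API.

```python
import math
from dataclasses import dataclass


@dataclass
class ArmStats:
    """Aggregate stats for one variation, as a data source might report them."""
    n: int
    mean: float
    stddev: float


def chance_to_beat_control(control: ArmStats, variation: ArmStats) -> float:
    """P(variation mean > control mean), Normal approximation, flat priors."""
    diff = variation.mean - control.mean
    # Standard error of the difference between the two arm means.
    se = math.sqrt(control.stddev ** 2 / control.n
                   + variation.stddev ** 2 / variation.n)
    if se == 0:
        return 0.5 if diff == 0 else float(diff > 0)
    # Standard Normal CDF evaluated at diff / se.
    return 0.5 * (1 + math.erf(diff / (se * math.sqrt(2))))


control = ArmStats(n=1000, mean=10.0, stddev=4.0)
variation = ArmStats(n=1000, mean=10.5, stddev=4.0)
print(chance_to_beat_control(control, variation))
```

Only three numbers per arm cross the wire in this sketch, which is why a data source that cannot report variance is harder to support.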
        
           | Dyac wrote:
           | Interesting, thanks.
           | 
           | 1. How do you get around needing session level data instead
           | of aggregate data when working with non-parametric KPIs? GA
           | in particular is notorious for sampling data.
           | 
           | 2. True, but you can't get away from the fact that a split
           | test only run for a day or two isn't going to give you
           | trustworthy results. It's things like this that abstract away
           | the statistical reality for lay users that cause poor
           | decisions to be made under the guise of being "data driven".
           | I think we as testers, and you as a provider of a testing
           | system, have a duty not to lead businesses to believe that
           | they are making statistically sound choices when they may not
           | be.
        
             | jrdorn wrote:
             | 1. GA is very limited as a data source because of sampling
             | and the fact that they don't expose variance. So if using
             | GA, we only support simple binomial metrics, count data
             | (assuming Poisson distribution), and duration data
             | (assuming exponential distribution). For SQL data sources
             | and non-parametric data, we currently rely on the CLT and
             | treat the sampling distribution as Normal. There's a good
             | article that goes over the stats in more detail (Itamar,
             | the author, wrote our stats engine) -
             | https://towardsdatascience.com/how-to-do-bayesian-a-b-
             | testin...
             | 
             | 2. We have a minimum sample size threshold before we run
             | any statistics on the data. To your point, we don't want to
             | say something is "significant" if it's 5 conversions vs 1.
             | This is one area we're looking to improve with better
             | heuristics. We can't completely take the human out of the
             | loop, but we can help give them all the info they need to
             | make the best decision. On that front, we do show Bayesian
             | expected loss (risk) and credible intervals in addition to
             | just the "chance to beat control".
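
The three quantities mentioned in point 2 can be sketched for a simple binomial metric. This is an illustrative sketch only: it assumes Beta(1, 1) priors, Monte Carlo sampling, and a hypothetical `min_sample` threshold of 150 users per arm; none of these choices are confirmed as what Growth Book actually does.

```python
import random


def analyze(conv_a, n_a, conv_b, n_b, min_sample=150, draws=20000, seed=0):
    """Compare control (a) and variation (b) with Beta(1, 1) priors."""
    if min(n_a, n_b) < min_sample:
        return None  # too little data: report nothing rather than noise
    rng = random.Random(seed)
    # Posterior draws of each arm's conversion rate.
    a = [rng.betavariate(1 + conv_a, 1 + n_a - conv_a) for _ in range(draws)]
    b = [rng.betavariate(1 + conv_b, 1 + n_b - conv_b) for _ in range(draws)]
    uplift = sorted(y - x for x, y in zip(a, b))
    return {
        "chance_to_beat_control": sum(u > 0 for u in uplift) / draws,
        # Average conversion rate given up by shipping b when a is better.
        "expected_loss": sum(max(-u, 0.0) for u in uplift) / draws,
        # Central 95% credible interval for the absolute uplift (b - a).
        "credible_interval": (uplift[int(0.025 * draws)],
                              uplift[int(0.975 * draws)]),
    }


print(analyze(1, 20, 5, 20))          # the "5 conversions vs 1" case
print(analyze(100, 1000, 130, 1000))
```

Showing the expected loss and the interval alongside the headline probability gives the reader a sense of how much is at stake if the call turns out to be wrong.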
        
               | [deleted]
        
               | Dyac wrote:
               | Brilliant, thank you.
               | 
               | Can you use the system to analyse results of tests it
               | didn't run? I.e., if I run tests using some SaaS that only
               | supports frequentist stats, could I use your system as a
               | Bayesian analysis backend?
        
               | jrdorn wrote:
               | Yes. As long as the variation assignment data and success
               | metrics are in a supported data source (SQL, GA, or
               | Mixpanel currently), it can be queried and analyzed in
               | Growth Book.
        
       | dreamer7 wrote:
       | Would this be useful for A/B testing mobile app features as well?
        
         | jrdorn wrote:
         | We don't have native mobile SDKs yet, but it's something we
         | want to support in the future. Mobile is a little tricky since
         | you either need to do a new release every time you want to
         | start/stop a test or use remote config and deal with offline,
         | slow networks, etc.
        
       | nacs wrote:
       | Looks promising.
       | 
       | Any plan to support Matomo (formerly Piwik) analytics as a data
       | source?
        
         | jrdorn wrote:
         | We plan to add MySQL/MariaDB support soon which should let you
         | use Matomo data as long as you have raw SQL access. For cloud-
         | hosted Matomo, we would have to use the reporting API, which is
         | doable but not as good since there's no way to get standard
         | deviations out of it as far as I can tell.
        
           | XCSme wrote:
           | I am also building a self-hosted analytics platform[0] that
           | has a MySQL/MariaDB database, and I provide a way of
           | recording A/B test data. Currently the visualization of the
           | results is not that good, so using a tool like GrowthBook
           | makes sense. I assume that once the MySQL support is added,
           | it would be possible to import userTrack data into
           | GrowthBook?
           | 
           | [0]: https://www.usertrack.net
        
             | jrdorn wrote:
             | Yep, should be possible once MySQL/MariaDB is done.
        
       | cardosof wrote:
       | Congratulations, that's very cool! Do you intend to add new
       | features to expand the scope (e.g. visitor personalization)?
        
         | [deleted]
        
         | jrdorn wrote:
         | We're definitely interested in supporting some personalization
         | use cases in the future using contextual bandits.
        
       | tehlike wrote:
       | From the looks of it, the configuration can't be stored in the
       | code repository itself. This is one of the key things to do:
       | treating configuration as code and properly versioning it,
       | blaming it, etc.
       | 
       | Otherwise, this looks great.
        
         | jrdorn wrote:
         | That's on our roadmap. We originally built the tool as a multi-
         | tenant hosted platform so storing configs in a database made
         | the most sense initially. For self hosting, we want to support
         | defining db connections and metrics using yml.
        
           | travisjungroth wrote:
           | Have you considered cuelang? My new hobby is pointing people
           | towards it.
        
       | ablearcher83 wrote:
       | Coupling metrics to experimentation is a huge red flag.
       | 
       | You should offload that onto DBT or some other data modeling
       | tools.
        
         | jrdorn wrote:
         | Do you mind explaining that a little more? As it's currently
         | designed, a company could use DBT to model their raw data into
         | dedicated metric tables and then Growth Book sits on top of
         | those with a really simple SQL query and some settings (e.g. is
         | the goal to increase or decrease the metric)
         | 
         | I'd love to see an open source standard way to define metrics,
         | but haven't found anything yet.
        
       | mooneater wrote:
       | Awesome! Can you comment on choice of mongodb? I admit I have a
       | negative association with it but I'm sure there are reasons.
        
         | marcinzm wrote:
         | This was also one of the first things I noticed, since we
         | don't use it, so it would be a decently large operational
         | addition to our stack. Maybe it's needed at larger scales but
         | for most companies a SQL server should be good enough.
        
         | jrdorn wrote:
         | Hi! One of the authors here. We're using MongoDB to store
         | cached A/B test results (among other things), which are deeply
         | nested JSON objects. MongoDB let us develop features really
         | quickly, so it's been a great choice so far for us. We're willing
         | to add support for another data store if there's a lot of
         | demand for it.
        
           | nrjames wrote:
           | Honestly, I'd love to see it just use SQLite as a backend. If
           | it's just storing results, that seems feasible and it would
           | reduce the complexity of the tech stack.
        
           | sojournerc wrote:
           | Nice project! PostgreSQL has excellent JSON support these
           | days, including the ability to query nested fields which may
           | be beneficial to your project.
        
           | ensignavenger wrote:
           | Hi- thanks for making this and open sourcing it!
           | 
           | I would also suggest supporting an alternative to MongoDB.
           | Postgres using jsonb is a great option.
           | 
           | I try to always use and support open source components, as
           | open source provides much less business risk. Since MongoDB
           | isn't itself open source, I would be hesitant to adopt it or
           | a product that depends on it. Mongo also has a bad
           | reputation...
           | 
           | I would definitely evaluate and likely use your product if it
           | did not depend on MongoDB.
        
             | [deleted]
        
             | gqewogpdqa wrote:
             | Totally think it's good to have lots of options. PostgreSQL
             | using JSONB however is a way to hurt your head. Using SQL
             | to manipulate JSON is pretty painful. Why would MongoDB
             | Community not be an ok choice? Unless you're planning on
             | offering MongoDB as a cloud service, what would the concern
             | be?
             | 
             | What's the bad reputation of MongoDB that you're concerned
             | about?
             | 
             | Also, you seem to have a really strong bias against it -
             | can you explain?
        
               | ensignavenger wrote:
               | > PostgreSQL using JSONB however is a way to hurt your
               | head
               | 
               | Really? I have used it pretty extensively and like it...
               | I don't do a lot of complex manipulations though, it
               | might be a pain for some use cases.
               | 
               | > Why would MongoDB Community not be an ok choice?
               | 
               | MongoDB community is SSPL licensed, which is not Open
               | Source. While I don't intend to offer a MongoDB hosting
               | service, I want the option to fork the code and create
               | (or pay someone else to fork the code and create) a
               | hosting service for me to use. This is important because
               | MongoDB Inc's business may not always align well with my
               | business and my needs. (or they may just decide that they
               | don't want to do business with me, maybe they go out of
               | business or their business focus shifts or political
               | pressures come to bear.) The option to create a viable
               | community fork is critical to ensuring that the software
               | remains viably usable. The business risk of relying on
               | proprietary software is great. The more reliant you are
               | on it, the bigger the risk.
               | 
               | > What's the bad reputation of MongoDB that you're
               | concerned about?
               | 
               | Mongo has a long history with Jepsen test failures. See
               | http://jepsen.io/analyses/mongodb-4.2.6 and the linked
               | articles from that page. In addition, I have heard many
               | confirmations of issues from folks who have used it in
               | production.
               | 
               | > Also, you seem to have a really strong bias against it
               | - can you explain?
               | 
               | I think I have explained my position above. I don't have
               | any interest in Mongo or any of its competitors. I don't
               | personally know anyone involved with it or any of its
               | competitors (Though I have naturally had professional
               | contact with some.) My strong preference, as previously
               | stated, is for Open Source software. This preference
               | applies broadly to all software, but especially to
               | infrastructure software, and is by no means specific to
               | MongoDB.
        
               | pphysch wrote:
               | What is painful about "SELECT json_result FROM
               | test_results WHERE json_result -> 'data' -> 'foo' ->>
               | 'bar' = 'baz'"?
        
       ___________________________________________________________________
       (page generated 2021-08-06 23:00 UTC)