[HN Gopher] A visual introduction to Gaussian Belief Propagation
       ___________________________________________________________________
        
       A visual introduction to Gaussian Belief Propagation
        
       Author : power
       Score  : 148 points
       Date   : 2021-08-31 12:34 UTC (10 hours ago)
        
 (HTM) web link (gaussianbp.github.io)
 (TXT) w3m dump (gaussianbp.github.io)
        
       | wiz21c wrote:
        | I'm a bit of a noob in this area, but:
       | 
       | FTA: "Looking to the future, we anticipate a long-term trend
       | towards a "hardware jungle" of parallel, heterogeneous,
       | distributed and asynchronous computing systems which communicate
       | in a peer-to-peer manner. "
       | 
        | Does the HN crowd think it's right?
        
         | epistasis wrote:
          | As somebody who used belief propagation a lot before the
          | current neural network renaissance, I'm not yet optimistic
          | about that. (However, I have been wrong enough in the past to
          | be somewhat humble about my ability to predict these things!)
         | 
          | As I see it, the current big machine learning models use data,
          | or training regimes, that can't fit on a single machine, but
          | the end model still fits on a single GPU or box.
         | 
          | What I understand them to be advocating for here is a model
          | bigger than can fit on a single chip, one that uses belief
          | propagation to scale to that number of variables. I don't yet
          | know what sort of applications that could address, and that's
          | my primary point of skepticism about this sort of huge model
          | being developed. BUT my lack of imagination is not much of an
          | argument on its own; it's merely the absence of an argument,
          | not an actual argument against such large models being useful.
        
       | shoo wrote:
       | > Theoretical grounds for applying the same update rules to loopy
       | graphs were later developed that explain loopy BP as an
       | approximate variational inference method in which inference is
       | cast as an optimization problem. Instead of directly minimizing
       | the factor energies (as is done in MAP inference), loopy BP
       | minimizes the KL divergence between the posterior and a
       | variational distribution which we use as a proxy for the
        | marginals after optimization.
       | 
       | > As the Bethe free energy is non-convex, loopy BP is not
        | guaranteed to converge, and even when it does it may converge to
        | the wrong marginals. Empirically, however, BP generally converges
        | to the true marginals, although for very loopy graphs it can fail.
       | 
        | If I follow this, then in the case where loopy BP converges to
        | something, there are at least two components of error:
        | 
        | (i) the error between the true posterior distribution and the
        | best approximation; and (ii) the error between the approximation
        | you actually found with loopy BP and that best approximation.
        | 
        | The fact that loopy BP can fail to converge, or can converge to
        | something other than a global optimum, is not that attractive.
        | On the other hand, global optimisation is often difficult or
        | intractable. Maybe in practice, in some cases, the error from
        | failing to find the globally optimal approximation is a lot less
        | than the error introduced by making the approximation in the
        | first place. This doesn't necessarily mean that loopy BP will
        | give useful results -- maybe that approximation error is so bad
        | that you need to do something else!
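        | 
        | In symbols (my notation, not the article's): if Q is the family
        | of distributions the Bethe approximation can represent, loopy BP
        | is trying to find
        | 
        |     \hat{q} = \operatorname*{arg\,min}_{q \in \mathcal{Q}}
        |               \mathrm{KL}\!\left( q \,\|\, p \right)
        | 
        | and the two error components combine, triangle-inequality style,
        | roughly as
        | 
        |     \underbrace{d(p, q_{\mathrm{BP}})}_{\text{total}}
        |     \;\lesssim\;
        |     \underbrace{d(p, \hat{q})}_{\text{(i) approximation}}
        |     \;+\;
        |     \underbrace{d(\hat{q}, q_{\mathrm{BP}})}_{\text{(ii) optimisation}}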
       | 
       | > Other notable candidate methods that are local, probabilistic
       | and iterative are Expectation Propagation (EP) and Barfoot's
        | algorithm. EP is generally not node-wise parallel and simplifies
       | to GBP in the special case when it is node-wise parallel, while
       | Barfoot's algorithm involves extra communication edges and is yet
       | to be applied to real problems.
       | 
        | It's worth reading Barfoot's paper [1] -- especially the latter
        | sections (related work, conclusions & future work), which give
        | another perspective:
       | 
       | > Belief propagation is known to converge to the correct global
       | solution to the inference problem when the graphical model is a
       | tree; however, loopy belief propagation may or may not converge
       | and various sufficient conditions have been established for
       | convergence. Weiss and Freeman (2000) provide the starting point
       | by showing that GaBP will converge (mean only not covariance)
       | under the condition that A is diagonally dominant. To our
       | knowledge, no condition has been derived for GaBP to ensure the
       | covariance quantities will converge to the global solution in the
       | case of a loopy graph. Some of the analyses of GaBP look to view
       | the method as an iterative solver for a linear system of
       | equations (e.g., Jacobi's algorithm), a connection first pointed
       | out by Weiss and Freeman (2000). However, as there have been
       | approximations made, GaBP cannot be shown in general to be
       | solving the primary (and secondary) linear systems discussed in
       | this paper. Our paper in a sense follows this connection in the
       | opposite direction by starting with the known result of Takahashi
       | et al. (1973) and then showing what it would take to implement
       | this as a local message-passing scheme. While we can guarantee
       | convergence in both the mean and covariance quantities, this
       | comes at the cost of additional communication links, memory, and
       | computation compared to GaBP. [...] It would also be interesting
       | to look more deeply at the connection to Gaussian belief
       | propagation; the methods perform the same on a tree graph, but
       | what approximations are needed in the loopy case for our message-
       | passing scheme to be similar/equivalent to GaBP (e.g., perhaps we
       | simply need to delete the extra communication channels our method
       | requires)?
       | 
       | [1] https://arxiv.org/pdf/2010.08022.pdf
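        | 
        | To make the Jacobi connection concrete, here's a tiny numpy
        | sketch (my own illustration, not code from either paper). For a
        | joint Gaussian with precision matrix A and information vector b,
        | the posterior mean solves A mu = b, and GaBP's mean updates are
        | closely related to Jacobi iteration (the connection Weiss &
        | Freeman point out), which is guaranteed to converge when A is
        | strictly diagonally dominant:
        | 
        |     import numpy as np
        | 
        |     rng = np.random.default_rng(0)
        |     n = 6
        |     A = rng.normal(size=(n, n))
        |     A = 0.5 * (A + A.T)  # symmetric "precision matrix"
        |     # Force strict diagonal dominance (the Weiss & Freeman
        |     # sufficient condition for the means to converge):
        |     np.fill_diagonal(A, np.abs(A).sum(axis=1) + 1.0)
        |     b = rng.normal(size=n)  # "information vector"
        | 
        |     mu = np.zeros(n)        # initial mean estimate
        |     D = np.diag(A).copy()   # diagonal entries of A
        |     R = A - np.diag(D)      # off-diagonal part of A
        |     for _ in range(200):    # synchronous sweeps
        |         mu = (b - R @ mu) / D  # Jacobi update
        | 
        |     # The iterates reach the true posterior mean:
        |     print(np.allclose(mu, np.linalg.solve(A, b)))  # True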
        
       | epistasis wrote:
       | Very excited to see belief propagation again. I thought it might
       | disappear with all the nets being used lately!
        
       | wiz21c wrote:
        | Question to HN: is there a dictionary somewhere that explains
        | and relates, in an applicable way, the terms everyone uses
        | without ever explaining them, such as "priors", "posteriors",
        | "marginals", "belief", "marginal distribution", etc.?
        
         | IdiocyInAction wrote:
         | The first few chapters of many books about ML will go over
         | these, like _Pattern Recognition and Machine Learning_ by
         | Bishop.
        
           | iflp wrote:
           | Try _Bayesian Reasoning and Machine Learning_ specifically,
           | as these are all about Bayesian reasoning.
        
         | shoo wrote:
          | If you're willing to sink time into a lecture series, you may
          | enjoy Richard McElreath's introductory lectures on Bayesian
          | inference: https://www.youtube.com/watch?v=_NEMHM1wDfI
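          | 
          | As a quick numeric anchor for the vocabulary, here's a toy
          | sketch of my own (not from any of the above): for Gaussians
          | the prior-to-posterior update has a closed form, and a
          | marginal just reads off one variable's block of the joint.
          | 
          |     import numpy as np
          | 
          |     # prior: belief about x before seeing data, x ~ N(0, 4)
          |     mu0, var0 = 0.0, 4.0
          |     # likelihood: y | x ~ N(x, 1); we observe y = 2.0
          |     y, var_y = 2.0, 1.0
          | 
          |     # posterior: belief about x after seeing y
          |     # (conjugate Gaussian update: precisions add)
          |     post_var = 1.0 / (1.0 / var0 + 1.0 / var_y)
          |     post_mu = post_var * (mu0 / var0 + y / var_y)
          |     print(post_mu, post_var)  # -> 1.6 0.8
          | 
          |     # marginal: integrate the other variables out of a joint;
          |     # for a joint Gaussian over (x, z) that just means reading
          |     # off x's block of the mean and covariance
          |     mean = np.array([1.0, -2.0])
          |     cov = np.array([[2.0, 0.3],
          |                     [0.3, 1.0]])
          |     print(mean[0], cov[0, 0])  # marginal of x: N(1.0, 2.0)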
        
       | vignesh_warar wrote:
        | What a great article - and the way it's presented is insanely
        | good. It reminds me of a talk given by Bret Victor:
       | 
       | https://youtu.be/oUaOucZRlmE?t=1175 (Media for Thinking the
       | Unthinkable)
        
         | throwaway6734 wrote:
         | It looks like it's built with the Distill framework.
         | Distill.pub has some other really neat work.
        
         | youssefabdelm wrote:
          | Was just about to say it reminded me of Bret Victor. But
          | someone needs to synthesize these technical articles with the
          | field of General Semantics (
          | https://www.youtube.com/playlist?list=PLaoJIXlyLvLkMQUtbiTi1...
          | ) to make them even clearer for the curious complete beginner
          | to mathematical/technical thinking.
          | 
          | A deeper awareness of all the jargon one is using, plus deep,
          | intuitive layman explanations, would be valuable.
        
           | 123pie123 wrote:
            | That's neat - do you know what the drawing/visualization
            | program is called?
        
       ___________________________________________________________________
       (page generated 2021-08-31 23:02 UTC)