[HN Gopher] A visual introduction to Gaussian Belief Propagation
___________________________________________________________________
A visual introduction to Gaussian Belief Propagation
Author : power
Score : 148 points
Date : 2021-08-31 12:34 UTC (10 hours ago)
(HTM) web link (gaussianbp.github.io)
(TXT) w3m dump (gaussianbp.github.io)
| wiz21c wrote:
| I'm a bit of a noob in this area, but:
|
| FTA: "Looking to the future, we anticipate a long-term trend
| towards a "hardware jungle" of parallel, heterogeneous,
| distributed and asynchronous computing systems which communicate
| in a peer-to-peer manner. "
|
| Does the HN crowd think that's right?
| epistasis wrote:
| As somebody who used belief propagation a lot before the current
| neural network renaissance, I'm not yet optimistic about that.
| (However, I have been wrong often enough in the past to be
| somewhat humble about my ability to predict these things!)
|
| As I see it, the current big machine learning models use data
| that can't fit well on a single machine, or training regimes
| that can't fit onto a single machine, but the end model still
| fits on a single GPU or box.
|
| What I understand them to be advocating for here is a model that
| is bigger than fits on a single chip, one that uses belief
| propagation to scale to that number of variables. I don't yet
| know what sort of applications that could address, so that's my
| primary point of skepticism about this sort of huge model being
| developed. BUT, my lack of imagination is not much of an argument
| on its own; it's merely the absence of an argument, not an actual
| argument against such large models being useful.
| shoo wrote:
| > Theoretical grounds for applying the same update rules to loopy
| graphs were later developed that explain loopy BP as an
| approximate variational inference method in which inference is
| cast as an optimization problem. Instead of directly minimizing
| the factor energies (as is done in MAP inference), loopy BP
| minimizes the KL divergence between the posterior and a
| variational distribution which we use as a proxy for the
| marginals after optimization
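|
| (Roughly, in my own notation rather than the article's: loopy BP
| searches for a variational distribution q approximating the true
| posterior p, with the Bethe free energy F_Bethe standing in for
| the intractable exact objective,
|
|     q^\star = \arg\min_{q \in \mathcal{Q}}
|               \mathrm{KL}\big(q(x) \,\|\, p(x \mid z)\big)
|             \approx \arg\min_{q \in \mathcal{Q}} F_{\mathrm{Bethe}}(q),
|
| where \mathcal{Q} is the family of distributions parameterised by
| the node and edge beliefs; fixed points of loopy BP correspond to
| stationary points of F_Bethe.)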
|
| > As the Bethe free energy is non-convex, loopy BP is not
| guaranteed to converge and even when it does it may converge to
| the wrong marginals. Empirically, however BP generally converges
| to the true marginals although for very loopy graphs it can fail
|
| If I follow this, in the case where loopy BP converges to
| something, it sounds like there are at least two components of
| error:
|
| (i) the error between the true posterior distribution and the
| best approximation available in the variational family; and (ii)
| the error between the approximation you actually find with loopy
| BP and that best approximation.
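|
| In symbols (my notation, not the article's), writing q^\star for
| the best approximation in the variational family and q_BP for the
| fixed point loopy BP actually reaches, those two components are
| roughly
|
|     \text{(i)}\;\; \mathrm{KL}\big(q^\star \,\|\, p\big)
|     \qquad\text{and}\qquad
|     \text{(ii)}\;\; d\big(q_{\mathrm{BP}},\, q^\star\big)
|
| for some measure d of the mismatch between the BP fixed point and
| the best member of the family.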
|
| The fact that loopy BP can fail to converge, or can converge to
| something other than a global optimum, is not that attractive. On
| the other hand, global optimisation is often difficult or
| intractable. Maybe in practice, in some cases, the error from
| failing to find the globally optimal best-fit approximation might
| be a lot less than the error introduced by making the
| approximation in the first place. This doesn't necessarily mean
| that loopy BP will give useful results -- maybe that
| approximation error is so bad that you need to do something else!
|
| > Other notable candidate methods that are local, probabilistic
| and iterative are Expectation Propagation (EP) and Barfoot's
| algorithm . EP is generally not node-wise parallel and simplifies
| to GBP in the special case when it is node-wise parallel, while
| Barfoot's algorithm involves extra communication edges and is yet
| to be applied to real problems.
|
| It's worth reading Barfoot's paper [1] -- especially the latter
| sections discussing related work, conclusions & future work, give
| another perspective:
|
| > Belief propagation is known to converge to the correct global
| solution to the inference problem when the graphical model is a
| tree; however, loopy belief propagation may or may not converge
| and various sufficient conditions have been established for
| convergence. Weiss and Freeman (2000) provide the starting point
| by showing that GaBP will converge (mean only not covariance)
| under the condition that A is diagonally dominant. To our
| knowledge, no condition has been derived for GaBP to ensure the
| covariance quantities will converge to the global solution in the
| case of a loopy graph. Some of the analyses of GaBP look to view
| the method as an iterative solver for a linear system of
| equations (e.g., Jacobi's algorithm), a connection first pointed
| out by Weiss and Freeman (2000). However, as there have been
| approximations made, GaBP cannot be shown in general to be
| solving the primary (and secondary) linear systems discussed in
| this paper. Our paper in a sense follows this connection in the
| opposite direction by starting with the known result of Takahashi
| et al. (1973) and then showing what it would take to implement
| this as a local message-passing scheme. While we can guarantee
| convergence in both the mean and covariance quantities, this
| comes at the cost of additional communication links, memory, and
| computation compared to GaBP. [...] It would also be interesting
| to look more deeply at the connection to Gaussian belief
| propagation; the methods perform the same on a tree graph, but
| what approximations are needed in the loopy case for our message-
| passing scheme to be similar/equivalent to GaBP (e.g., perhaps we
| simply need to delete the extra communication channels our method
| requires)?
|
| [1] https://arxiv.org/pdf/2010.08022.pdf
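|
| As a toy illustration of the Weiss & Freeman point (not code from
| the article or from Barfoot's paper -- just my own minimal sketch
| of the standard synchronous GaBP update rules, with illustrative
| names, using numpy): run GaBP on a small loopy, diagonally
| dominant system A x = b, i.e. the Gaussian
| p(x) ~ exp(-x'Ax/2 + b'x). The means should match the exact
| solve; the per-node variances generally do not match diag(A^-1)
| on a loopy graph.
|
|     import numpy as np
|
|     # Small, fully connected (hence loopy), diagonally dominant system.
|     A = np.array([[4.0, 1.0, 1.0],
|                   [1.0, 4.0, 1.0],
|                   [1.0, 1.0, 4.0]])
|     b = np.array([1.0, 2.0, 3.0])
|     n = len(b)
|     nbrs = [[j for j in range(n) if j != i and A[i, j] != 0.0]
|             for i in range(n)]
|
|     # Message i -> j, parameterised by precision P[i, j] and mean m[i, j].
|     P = np.zeros((n, n))
|     m = np.zeros((n, n))
|
|     for _ in range(50):  # synchronous ("node-wise parallel") sweeps
|         P_new, m_new = np.zeros_like(P), np.zeros_like(m)
|         for i in range(n):
|             for j in nbrs[i]:
|                 # Belief at i from its prior plus all incoming
|                 # messages except the one from j.
|                 others = [k for k in nbrs[i] if k != j]
|                 prec = A[i, i] + sum(P[k, i] for k in others)
|                 mean = (b[i] + sum(P[k, i] * m[k, i]
|                                    for k in others)) / prec
|                 # Outgoing message after eliminating node i.
|                 P_new[i, j] = -A[i, j] ** 2 / prec
|                 m_new[i, j] = -A[i, j] * mean / P_new[i, j]
|         P, m = P_new, m_new
|
|     # Posterior belief at each node: precision and mean.
|     post_prec = np.array([A[i, i] + sum(P[k, i] for k in nbrs[i])
|                           for i in range(n)])
|     post_mean = np.array([(b[i] + sum(P[k, i] * m[k, i]
|                                       for k in nbrs[i])) / post_prec[i]
|                           for i in range(n)])
|
|     print("GaBP means     :", post_mean)          # ~ A^-1 b
|     print("exact means    :", np.linalg.solve(A, b))
|     print("GaBP variances :", 1.0 / post_prec)    # generally off on loopy graphs
|     print("true variances :", np.diag(np.linalg.inv(A)))
|
| On a tree the variances would be exact as well; the diagonal
| dominance here is what licenses the convergence of the means.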
| epistasis wrote:
| Very excited to see belief propagation again. I thought it might
| disappear with all the nets being used lately!
| wiz21c wrote:
| Question to HN: is there a dictionary somewhere that explains and
| relates, in an applicable way, the terms everyone uses without
| ever explaining, such as "priors", "posteriors", "marginals",
| "belief", "marginal distribution", etc.?
| IdiocyInAction wrote:
| The first few chapters of many books about ML will go over
| these, like _Pattern Recognition and Machine Learning_ by
| Bishop.
| iflp wrote:
| Try _Bayesian Reasoning and Machine Learning_ specifically,
| since those terms are all about Bayesian reasoning.
| shoo wrote:
| If you're willing to sink time into a lecture series, you may
| enjoy Richard McElreath's introductory lectures on Bayesian
| inference: https://www.youtube.com/watch?v=_NEMHM1wDfI
| vignesh_warar wrote:
| What a great article, and the presentation is insanely good. It
| reminds me of a talk given by Bret Victor:
|
| https://youtu.be/oUaOucZRlmE?t=1175 (Media for Thinking the
| Unthinkable)
| throwaway6734 wrote:
| It looks like it's built with the Distill framework.
| Distill.pub has some other really neat work.
| youssefabdelm wrote:
| Was just about to say it reminded me of Bret Victor. But
| someone needs to synthesize these technical articles with the
| field of General Semantics (
| https://www.youtube.com/playlist?list=PLaoJIXlyLvLkMQUtbiTi1...
| ) to make it even clearer for the curious complete beginner to
| mathematical/technical thinking/approaches.
|
| A deeper awareness of all the jargon one is using, plus deep
| intuitive layman explanations, would be valuable.
| 123pie123 wrote:
| That's neat. Do you know what the drawing/visualization program
| is called?
___________________________________________________________________
(page generated 2021-08-31 23:02 UTC)