[HN Gopher] Gaussian integration is cool
___________________________________________________________________
Gaussian integration is cool
Author : beansbeansbeans
Score : 129 points
Date : 2025-06-08 08:35 UTC (14 hours ago)
(HTM) web link (rohangautam.github.io)
(TXT) w3m dump (rohangautam.github.io)
| constantcrying wrote:
| A good introduction to the basics.
|
| What is also worth pointing out, and which was somewhat glossed
| over, is the close connection between the weight function and
| the polynomials. Different weight functions give you different
| classes of orthogonal polynomials. "Orthogonal" here has to be
| understood with respect to the scalar product defined by
| integrating against the weight function.
|
| Interestingly, Gauss-Hermite quadrature integrates over the
| entire real line, i.e. from -infinity to infinity. So the
| choice of weight function also influences the choice of
| integration domain.
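|
| A minimal sketch of that connection (Python with numpy
| assumed): orthogonality means <p, q> = integral of
| p(x) q(x) w(x) dx, and for w(x) = e^(-x^2) on the real line the
| family is the Hermite polynomials. The Gauss-Hermite rule
| itself can evaluate such inner products, since it is exact for
| low-degree polynomials.
|
|     import numpy as np
|     from numpy.polynomial.hermite import hermgauss, hermval
|
|     x, w = hermgauss(10)             # exact up to degree 2*10 - 1
|     H2 = hermval(x, [0, 0, 1])       # H_2 at the nodes
|     H3 = hermval(x, [0, 0, 0, 1])    # H_3 at the nodes
|     print(np.sum(w * H2 * H3))       # ~0: orthogonal under this weight
|     print(np.sum(w * H2 * H2))       # nonzero: squared norm of H_2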
| creata wrote:
| Sorry if this is a stupid question, but is there a direct link
| between the choice of weight function and the applications of
| the polynomial?
|
| Like, is it possible to infer that Chebyshev polynomials would
| be useful in approximation theory using only the fact that
| they're orthogonal wrt the Wigner semicircle (U_n) or arcsine
| (T_n) distribution?
| constantcrying wrote:
| Yes. It is precisely their orthogonality that makes them
| suitable.
|
| If you are familiar with the Fourier series, the same
| principle can be applied to approximating with polynomials.
|
| In both cases the crucial point is that you can form an
| orthogonal subspace, onto which you can project the function
| to be approximated.
|
| For polynomials it is this:
| https://en.m.wikipedia.org/wiki/Polynomial_chaos
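|
| As a rough illustration (numpy and scipy assumed, with the
| Legendre family standing in for a generic choice of orthogonal
| polynomials): the projection coefficients are inner products
| divided by squared norms, exactly as in a Fourier series.
|
|     import numpy as np
|     from scipy.integrate import quad
|     from numpy.polynomial.legendre import Legendre
|
|     f, N = np.exp, 8
|     coeffs = []
|     for n in range(N + 1):
|         P_n = Legendre.basis(n)      # nth Legendre polynomial
|         inner, _ = quad(lambda x: f(x) * P_n(x), -1, 1)
|         coeffs.append(inner * (2 * n + 1) / 2)  # <P_n, P_n> = 2/(2n+1)
|
|     f_N = Legendre(coeffs)           # projection onto span{P_0..P_N}
|     print(abs(f(0.3) - f_N(0.3)))    # small approximation error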
| sfpotter wrote:
| What you're saying isn't wrong, per se, but...
|
| There are polynomials that aren't orthogonal that are
| suitable for numerics: both the Bernstein basis and the
| monomial basis are used very often and neither are
| orthogonal. (Well, you could pick a weight function that
| makes them orthogonal, but...!)
|
| The fact of their orthogonality is crucial, but when you
| work with Chebyshev polynomials, it is very unlikely you
| are doing an orthogonal (L2) projection! Instead, you would
| normally use Chebyshev interpolation: 1) interpolate at
| either the Type-I or Type-II Chebyshev nodes, 2) use the
| DCT to compute the Chebyshev series coefficients. The fact
| that you can do this is _related_ to the weight function,
| but it isn't an L2 procedure. Like I mentioned in my other
| post, the Chebyshev weight function is maybe more of an
| artifact of the Chebyshev polynomials' intimate relation to
| the Fourier series.
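|
| A sketch of that two-step procedure (numpy and scipy assumed),
| using the Type-I nodes, i.e. the roots of T_{N+1}, and a
| DCT-II to recover the interpolant's Chebyshev coefficients:
|
|     import numpy as np
|     from scipy.fft import dct
|     from numpy.polynomial.chebyshev import chebval
|
|     f, N = np.exp, 16
|     k = np.arange(N + 1)
|     nodes = np.cos((2 * k + 1) * np.pi / (2 * (N + 1)))  # Type-I
|     coeffs = dct(f(nodes), type=2) / (N + 1)   # O(N log N) transform
|     coeffs[0] /= 2                             # halve the 0th term
|     print(abs(f(0.3) - chebval(0.3, coeffs)))  # ~machine precision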
|
| I am also not totally sure what polynomial chaos has to do
| with any of this. PC is a term of art in uncertainty
| quantification, and this is all just basic numerical
| analysis. If you have a series in orthogonal polynomials and
| you want to call it something fancy, you might call it a
| Fourier series, but usually there is no fancy term...
| constantcrying wrote:
| I think you are very confused. My post is about
| approximation. Obviously other areas use other
| polynomials or the same polynomials for other reasons.
|
| In this case it is about the principle of approximation
| by _orthogonal projection_ , which is quite common in
| different fields of mathematics. Here you create an
| approximation of a target by projecting it onto an
| orthogonal subspace. This is what the Fourier series is
| about, an orthogonal projection. Choosing e.g. the Chebyshev
| polynomials instead of the complex exponentials gives you an
| approximation in the orthogonal space spanned by the Chebyshev
| polynomials.
|
| The same principle applies e.g. when you are computing an
| SVD for a low rank approximation. That is another case of
| orthogonal projection.
|
| >Instead, you would normally use Chebyshev interpolation
|
| What you do not understand is that this is the same
| thing. The distinction you describe does not exist; these
| are the same things, just seen from different perspectives.
| That they are the same follows easily from the uniqueness
| of polynomials, which are fully determined by their
| interpolation points. These aren't distinct ideas; there
| is a greater principle behind them, and the fact that you
| use some other algorithm to compute the approximation does
| not matter at all.
|
| >I am also not totally sure what polynomial chaos has to
| do with any of this.
|
| It is the exact same thing: projection onto an orthogonal
| subspace of polynomials. The difference is just that you
| choose the polynomials with respect to the distribution of
| a random variable, so you get an approximation with good
| statistical properties.
| sfpotter wrote:
| No, I am not confused. :-) I am just trying to help you
| understand some basics of numerical analysis.
|
| > What you do not understand is that this is the same
| thing.
|
| It is not the same thing.
|
| You can express an analytic function f(x) in a convergent
| (on [-1, 1]) Chebyshev series: f(x) = \sum_{n=0}^\infty
| a_n T_n(x). You can then truncate it keeping N+1 terms,
| giving a degree N polynomial. Call it f_N.
|
| Alternatively, you can interpolate f at the N+1 Chebyshev
| nodes and use a DCT to compute the corresponding
| Chebyshev series coefficients. Call the resulting
| polynomial p_N.
|
| In general, f_N and p_N are _not_ the same polynomial.
|
| Furthermore, computing the coefficients of f_N is _much_
| more expensive than computing the coefficients of p_N.
| For f_N, you need to evaluate N+1 integrals, which may be
| quite expensive indeed if you want to get digits. For
| p_N, you simply evaluate f at N+1 nodes, compute a DCT in
| O(N log N) time, and the result is the coefficients of
| p_N up to rounding error.
|
| In practice, people do not compute the coefficients of
| f_N, they compute the coefficients of p_N. Nevertheless,
| f_N and p_N are essentially as good as each other when it
| comes to approximation.
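|
| A sketch of the comparison (numpy and scipy assumed). The f_N
| coefficients below come from the weighted integrals, computed
| after the substitution x = cos(t) to avoid the endpoint
| singularity of the Chebyshev weight; the p_N coefficients come
| from one DCT:
|
|     import numpy as np
|     from scipy.fft import dct
|     from scipy.integrate import quad
|
|     f, N = np.exp, 8
|
|     # f_N: one adaptive integral per coefficient (expensive)
|     a = np.empty(N + 1)
|     for n in range(N + 1):
|         val, _ = quad(lambda t: f(np.cos(t)) * np.cos(n * t),
|                       0, np.pi)
|         a[n] = val * (1 if n == 0 else 2) / np.pi
|
|     # p_N: N+1 function evaluations plus one DCT (cheap)
|     k = np.arange(N + 1)
|     nodes = np.cos((2 * k + 1) * np.pi / (2 * (N + 1)))
|     c = dct(f(nodes), type=2) / (N + 1)
|     c[0] /= 2
|
|     print(np.max(np.abs(a - c)))   # tiny, but not zero: f_N != p_N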
| constantcrying wrote:
| At this point I really hate to ask. Do you know what
| "orthogonal subspace" means and what a projection is?
| sfpotter wrote:
| Ah, shucks. I can see you're getting upset and defensive.
| Sorry... Yes, it should be clear from everything I've
| written that I'm quite clear on the definitions of these.
|
| If you would like to read what I'm saying but from a more
| authoritative reference that you feel you can trust, you
| can just take a look at Trefethen's "Approximation Theory
| and Approximation Practice". I'm just quoting contents of
| Chapter 4 at you.
|
| Again, like I said in my first response to you, what
| you're saying _isn't wrong_, it just misses the mark a
| bit. If you want to compute the L2 projection of a
| function onto the orthogonal subspace of degree N
| Chebyshev polynomials, _you would need to evaluate a
| rather expensive integral to compute the coefficients_.
| It's expensive because it requires the use of adaptive
| integration... many function evaluations per coefficient!
| Bad!
|
| On the other hand, you could just do polynomial
| interpolation using either set of degree N Chebyshev
| nodes (Type-I or Type-II). This requires only N+1
| function evaluations. Only one function evaluation per
| coefficient. Good!
|
| And, again, since the polynomial so constructed is
| _not_ the same polynomial as the one obtained via L2
| projection mentioned in paragraph 3 above, this
| interpolation procedure cannot be regarded as a
| projection! I guess you could call it an "approximate
| projection". It agrees quite closely with the L2
| projection, and has essentially the same approximation
| power. This is why Chebyshev polynomials are so useful
| _in practice_ for approximation, and why e.g. Legendre
| polynomials are much less useful (they do not have a
| convenient fast transform).
|
| Anyway, I hope this helps! It's a beautiful subject and a
| lot of fun to work on.
| sfpotter wrote:
| Chebyshev polynomials are useful in approximation theory
| because they're the minimax polynomials. The remainder of
| polynomial interpolation can be given in terms of the nodal
| polynomial, which is the polynomial with the interpolation
| nodes as zeros. Minimizing the maximum error then leads to
| the Chebyshev polynomials. This is a basic fact in numerical
| analysis that has tons of derivations online and in books.
|
| The weight function shows the Chebyshev polynomials' relation
| to the Fourier series. But they are not what you would
| usually think of as a good candidate for L2 approximation on
| the interval. Normally you'd use Legendre polynomials, since
| they have w = 1, but they are a much less convenient basis
| than Chebyshev for numerics.
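|
| A quick check of the minimax claim (numpy assumed): the nodal
| polynomial prod(x - x_k) has max close to 2**-N on [-1, 1]
| when the x_k are the Chebyshev nodes (it is the monic T_{N+1}),
| and a much larger max for equispaced nodes.
|
|     import numpy as np
|
|     N = 12
|     xs = np.linspace(-1, 1, 2001)    # dense grid for the max
|     cheb = np.cos((2 * np.arange(N + 1) + 1) * np.pi
|                   / (2 * (N + 1)))
|     equi = np.linspace(-1, 1, N + 1)
|
|     def nodal_max(nodes):
|         return np.abs(np.prod(xs[:, None] - nodes, axis=1)).max()
|
|     print(nodal_max(cheb), 2.0**-N)  # ~equal: monic Chebyshev
|     print(nodal_max(equi))           # noticeably larger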
| creata wrote:
| True, and there are plenty of other reasons Chebyshev
| polynomials are convenient, too.
|
| But I guess what I was asking was: is there some kind of
| abstract argument why the _semicircle distribution_ would
| be appropriate in this context?
|
| For example, you have abstract arguments like the central
| limit theorem that explain (in some loose sense) why the
| normal distribution is everywhere.
|
| I guess the semicircle _might_ more-or-less be the only way
| to get something where interpolation uses the DFT (by
| projecting points evenly spaced on the complex unit circle
| onto [-1, 1]), but I dunno, that motivation feels too many
| steps removed.
| sfpotter wrote:
| If there is, I'm not aware of it. Orthogonal polynomials
| come up in random matrix theory. Maybe there's something
| there?
|
| But your last paragraph is exactly it... it is a "basic"
| fact but the consequences are profound.
| efavdb wrote:
| Could you guys recommend an easy book on this topic?
| sfpotter wrote:
| Which topic?
| efavdb wrote:
| Some sort of numerical analysis book that covers these
| topics - minimax approx, quadrature etc. I've read on
| these separately but am curious what other sorts of
| things would be covered in courses including that.
| 4ce0b361 wrote:
| I'd suggest: Trefethen, Lloyd N., Approximation theory
| and approximation practice (Extended edition), SIAM,
| Philadelphia, PA (2020), ISBN 978-1-611975-93-2.
| sfpotter wrote:
| I would check out "An Introduction to Numerical Analysis"
| by Suli and Mayers or "Approximation Theory and
| Approximation Practice" by Trefethen. The former covers
| all the major intro numerical analysis topics in a format
| that is suitable for someone with ~undergrad math or
| engineering backgrounds. The latter goes deep into
| Chebyshev approximation (and some related topics). It is
| also very accessible but is much more specialized.
| seanhunter wrote:
| I thought when I first saw the title that it was going to be
| about the Gaussian integral[1] which has to be one of the coolest
| results in all of maths.
|
| That is, the integral from - to + infinity of e^(-x^2) dx =
| sqrt(pi).
|
| I remember being given this as an exercise and just being totally
| shocked by how beautiful it was as a result (when I eventually
| managed to work out how to evaluate it).
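|
| The standard trick: square the integral, interpret the result
| as a double integral over the plane, and switch to polar
| coordinates, where the extra factor of r makes r*e^(-r^2)
| integrate in closed form. A quick numerical check (Python with
| scipy assumed):
|
|     import numpy as np
|     from scipy.integrate import quad
|
|     val, _ = quad(lambda x: np.exp(-x**2), -np.inf, np.inf)
|     print(val, np.sqrt(np.pi))   # both 1.7724538509...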
|
| [1] https://mathworld.wolfram.com/GaussianIntegral.html
| niklasbuschmann wrote:
| Gaussian integrals are also pretty much the basis of quantum
| field theory in the path integral formalism, where Isserlis's
| theorem is the analog to Wick's theorem in the operator
| formalism.
| srean wrote:
| Indeed.
|
| It's the gateway drug to Laplace's method (Laplace
| approximation), mean field theory, perturbation theory, ...
| QFT.
|
| https://en.m.wikipedia.org/wiki/Laplace%27s_method
| chermi wrote:
| Maybe I should just read the wiki you linked, but I guess
| I'm confused on how this is different than steepest
| descent? I'm a physicist by training so maybe we just call
| it something different?
| dawnofdusk wrote:
| They are the same for physicists who analytically
| continue everything. Steepest descent is technically for
| integrals in the complex plane, and Laplace's method is
| for integrating over the real line.
| srean wrote:
| It is Laplace's method of steepest descent. When the same
| method is used to approximate a probability density
| function, for example the posterior probability density,
| it's called Laplace's approximation of the density.
|
| The wikipedia link would have made things quite clear :)
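|
| A minimal sketch of Laplace's method (numpy and scipy
| assumed): expanding the exponent of
| Gamma(n+1) = integral_0^inf exp(n*log(x) - x) dx to second
| order around its maximum at x = n recovers Stirling's formula.
|
|     import numpy as np
|     from scipy.special import gamma
|
|     n = 20
|     stirling = np.sqrt(2 * np.pi * n) * n**n * np.exp(-n)
|     print(stirling / gamma(n + 1))   # -> 1 as n grows (~0.9958 here)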
| constantcrying wrote:
| There is a relationship here: in the case of Gauss-Hermite
| integration, where the weight function is exactly e^(-x^2), the
| weights have to add up to sqrt(pi), because the quadrature rule
| is exact for the constant polynomial 1.
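|
| A one-liner check (numpy assumed):
|
|     import numpy as np
|     from numpy.polynomial.hermite import hermgauss
|
|     for n in (2, 5, 20):
|         print(n, hermgauss(n)[1].sum(), np.sqrt(np.pi))  # all sqrt(pi)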
| extrabajs wrote:
| What is Fig. 1 showing? Is it the value of the integral compared
| with two approximations? Would it not be more interesting to show
| the error of the approximations instead? Asking for a friend who
| isn't computing a lot of integrals.
| mturmon wrote:
| Fig 1 could use a rethink. It uses log scale, but the dynamic
| range of the y-axis is tiny, so the log transform isn't doing
| anything.
|
| It would be better shown as a table with 3 numbers. Or, maybe
| two columns, one for integral value and one for error, as you
| suggest.
| tim-kt wrote:
| No, the integral cannot be "expressed" as a sum over weights and
| function evaluations (with a "="); it can only be approximated
| with this idea. If you fix any n+1 nodes, interpolate your
| function, and integrate the interpolating polynomial, you will
| get such a sum, where the weights are integrals of the Lagrange
| basis polynomials. By construction, this computes the integral
| of polynomials up to degree n _exactly_. Now, if you choose the
| nodes in a particular way (namely, as the zeros of certain
| orthogonal polynomials), you can increase this to degree 2n+1.
| What I'm getting at is that Gaussian integration is not
| _estimating_ the integrals of polynomials of degree up to 2n+1;
| it is evaluating them _exactly_.
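|
| A sketch of that exactness threshold (numpy assumed), using
| the Gauss-Legendre rule on [-1, 1] with n+1 nodes:
|
|     import numpy as np
|     from numpy.polynomial.legendre import leggauss
|
|     n = 4
|     x, w = leggauss(n + 1)            # 5-point Gauss rule
|
|     def rule(k):                      # quadrature of x**k
|         return np.sum(w * x**k)
|
|     def exact(k):                     # integral of x**k on [-1, 1]
|         return 0.0 if k % 2 else 2.0 / (k + 1)
|
|     for k in (2*n, 2*n + 1, 2*n + 2):
|         print(k, abs(rule(k) - exact(k)))  # exact, exact, then not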
| Onavo wrote:
| Convolution?
| tim-kt wrote:
| What about it?
___________________________________________________________________
(page generated 2025-06-08 23:00 UTC)