hngopher.com

       [HN Gopher] Intuitive guide to convolution
       ___________________________________________________________________
        
       Intuitive guide to convolution
        
       Author : abhi9u
       Score  : 38 points
       Date   : 2023-12-03 07:37 UTC (1 days ago)
        
 (HTM) web link (betterexplained.com)
 (TXT) w3m dump (betterexplained.com)
        
       | lxe wrote:
       | Reminds me of Setosa's interactive explanation of image kernels:
       | 
       | https://setosa.io/ev/image-kernels/
        
         | layoric wrote:
         | This is a great post that makes the concept of image kernels
         | way more accessible, thanks for sharing!
        
       | esafak wrote:
       | Convolution: The time domain operation that corresponds to
       | multiplication in the frequency domain.
        
         | Paul-Craft wrote:
         | And which you can use to do actual multiplication, using the
         | DFT and inverse DFT, if you let $f$ and $g$ be polynomials
         | over, say $Z/(2^k +1), where the modulus is selected such that
         | $2^k$ is the word size of your machine.
        
           | CyberDildonics wrote:
           | Isn't this just a complicated way of saying a computer can do
           | multiplication?
        
             | Paul-Craft wrote:
             | No. It's saying a computer can do multiplication really
             | fast by doing a bunch of additions, without having to worry
             | about carries, instead.
        
       | ndriscoll wrote:
       | The flip and slide thing that electrical engineers do has always
       | struck me as unintuitive; it makes more sense to think of it as a
       | blurring operation.
       | 
       | Consider a function f: R^2 -> R (or Z^2 -> Z if you like) that
       | represents a grayscale image. So f(0,0) is the pixel at the
       | origin, f(1,0) is the pixel at (1,0), etc. Think of g: R^2 -> R
       | as a blurring function, e.g. a gaussian.
       | 
       | What convolution does is it turns every pixel of f into a copy of
       | the blur g, weighted by and centered on each pixel being blurred.
       | So f(0,0) gets turned into a blurred image h(x,y) = f(0,0)g(x,y).
       | f(1,0) gets turned into a blurred image h(x,y) = f(1,0)g(x-1,0).
       | Note that the subtraction is just recentering g so the blur
       | applies in the right position. In general, each pixel gets
       | blurred into the function h(x,y) = f(a,b)g(x-a,y-b).
       | 
       | Now sum up all the blurred pixels, so you get the final image
       | (f*g)(x,y) = integral_(a,b) f(a,b)g(x-a,y-b).
       | 
       | Same thing can be done in the time domain instead of a spatial
       | domain, or you can write it in vector form, so (f*g)(x) =
       | integral_(r) f(r)g(x-r).
       | 
       | Note that you can also write it in a more symmetric way as
       | (f*g)(c) = integral_(a+b=c) f(a)g(b), which makes it clear that
       | you can do this for any semigroup (you get the normal definition
       | when you have a group by noting that b=c-a), makes commutativity
       | obvious, and makes it clear why polynomial multiplication is
       | convolution of coefficients: x^n comes from summing over all the
       | x^i*x^j with i+j=n, so the coefficient for x^n is the sum over
       | all coefficients indexed by i,j with i+j=n, which is the
       | symmetric way to write convolution.
        
         | duped wrote:
         | > it makes more sense to think of it as a blurring operation.
         | 
         | It's not a blurring operation, it's not even an operation on
         | images. And even when applied to images, only some convolutions
         | result in what could be described as a "blur." This is also
         | tautological, because you can't really define why a gaussian
         | filter "blurs" an image without first understanding the
         | convolution of the image and filter kernel.
         | 
         | imho you should throw out the notion of "intuition" and stick
         | with the mathematical definition and then explore its
         | properties and relationships.
         | 
         | The simplest and most intuitive description I've heard is
         | "convolution is a mathematical operation that describes mixing
         | two things together" (those things being vectors, matrices,
         | continuous functions, etc)
        
           | ndriscoll wrote:
           | I don't need to know about convolution to tell you what a
           | Gaussian is; it's just e^(-x^2). You can look at a grayscale
           | image with a single bright point pixel (like a star in the
           | sky) and say it's "sharp", and an image with a gaussian and
           | say it's a "blur" or a blob. That's just a qualitative
           | description of what a point or a gaussian look like. Then, as
           | I described, you can define convolution by saying you're
           | replacing each point pixel with an _image_ that is your blur
           | with its brightness weighted by the point pixel and located
           | at that pixel, and then add them all up.
           | 
           | That intuition also works fine in the time domain, where your
           | signal is "blurred" at each instant by the impulse response.
           | 
           | But yeah more generally, all you get is that (f*g)(c) is the
           | sum over all a (**) b=c f(a)g(b), where (**) is... some
           | operation.
           | 
           | I guess ultimately my point is that looking at g(t-tau) and
           | seeing -tau as "flip the signal" and t as "move to the
           | desired time" seems absolutely bizarre to me. g(t-tau) is
           | g(t), centered at tau. So f(tau)g(t-tau) is g(t), centered at
           | tau, weighted by f(tau). Then sum over all tau. I can never
           | see where the flip comes in; it seems backwards to me. Nor
           | can I see how this flipping thing would intuitively arise in
           | physical systems. Obviously they're equivalent, but it just
           | doesn't make sense to me.
        
             | nh23423fefe wrote:
             | Ok but when i try this, i dont get any good intuitions.
             | 
             | polynomial multiplication is convolution, so what am i
             | supposed to do here when you say, "g is blurring function"
             | 
             | When i think of multiplying term by term an summing over
             | the diagonal I can clearly see what the f(tau)g(t-tau) is
             | doing, its collecting terms with like exponents.
             | 
             | This also makes sense when I think of probability
             | distributions, i'm multiplying the densities and summing
             | "over the diagonal"
        
       | Severian wrote:
       | IMO 3blue1brown presents this in a very easy to understand way:
       | https://www.youtube.com/watch?v=KuXjwB4LzSA
        
         | 3abiton wrote:
         | I wonder if we will ever be at a stage where LLM can generate
         | videos like that as an ELI5
        
       | CyberDildonics wrote:
       | Any explanation should start by saying a convolution is just a
       | weighted average.
        
         | wnoise wrote:
         | Averages are normalized. Convolutions only are if they
         | integrate to 1.
        
         | jjgreen wrote:
         | Well it is, sort of, but treating one of the functions as
         | weights and the other as the thing that is averaged misses out
         | on the symmetry f*g = g*f
        
       | onos wrote:
       | It just smears out a signal.
        
       | ur-whale wrote:
       | Convolution, or the art of given hard names to easy things.
       | 
       | Convolution is something you learn when you're 7.
       | 
       | It's called weighted average.
        
         | wnoise wrote:
         | Weighted sum, sure. (Average is normalized).
         | 
         | But it's not just one sum, but a _collection_ of weighted sums,
         | and those have a particular order for reasons.
        
       ___________________________________________________________________
       (page generated 2023-12-04 23:01 UTC)