[HN Gopher] Intuitive guide to convolution
___________________________________________________________________
Intuitive guide to convolution
Author : abhi9u
Score : 38 points
Date : 2023-12-03 07:37 UTC (1 days ago)
(HTM) web link (betterexplained.com)
(TXT) w3m dump (betterexplained.com)
| lxe wrote:
| Reminds me of Setosa's interactive explanation of image kernels:
|
| https://setosa.io/ev/image-kernels/
| layoric wrote:
| This is a great post that makes the concept of image kernels
| way more accessible, thanks for sharing!
| esafak wrote:
| Convolution: The time domain operation that corresponds to
| multiplication in the frequency domain.
| Paul-Craft wrote:
| And which you can use to do actual multiplication, using the
| DFT and inverse DFT, if you let $f$ and $g$ be polynomials
| over, say $Z/(2^k +1), where the modulus is selected such that
| $2^k$ is the word size of your machine.
| CyberDildonics wrote:
| Isn't this just a complicated way of saying a computer can do
| multiplication?
| Paul-Craft wrote:
| No. It's saying a computer can do multiplication really
| fast by doing a bunch of additions, without having to worry
| about carries, instead.
| ndriscoll wrote:
| The flip and slide thing that electrical engineers do has always
| struck me as unintuitive; it makes more sense to think of it as a
| blurring operation.
|
| Consider a function f: R^2 -> R (or Z^2 -> Z if you like) that
| represents a grayscale image. So f(0,0) is the pixel at the
| origin, f(1,0) is the pixel at (1,0), etc. Think of g: R^2 -> R
| as a blurring function, e.g. a gaussian.
|
| What convolution does is it turns every pixel of f into a copy of
| the blur g, weighted by and centered on each pixel being blurred.
| So f(0,0) gets turned into a blurred image h(x,y) = f(0,0)g(x,y).
| f(1,0) gets turned into a blurred image h(x,y) = f(1,0)g(x-1,0).
| Note that the subtraction is just recentering g so the blur
| applies in the right position. In general, each pixel gets
| blurred into the function h(x,y) = f(a,b)g(x-a,y-b).
|
| Now sum up all the blurred pixels, so you get the final image
| (f*g)(x,y) = integral_(a,b) f(a,b)g(x-a,y-b).
|
| Same thing can be done in the time domain instead of a spatial
| domain, or you can write it in vector form, so (f*g)(x) =
| integral_(r) f(r)g(x-r).
|
| Note that you can also write it in a more symmetric way as
| (f*g)(c) = integral_(a+b=c) f(a)g(b), which makes it clear that
| you can do this for any semigroup (you get the normal definition
| when you have a group by noting that b=c-a), makes commutativity
| obvious, and makes it clear why polynomial multiplication is
| convolution of coefficients: x^n comes from summing over all the
| x^i*x^j with i+j=n, so the coefficient for x^n is the sum over
| all coefficients indexed by i,j with i+j=n, which is the
| symmetric way to write convolution.
| duped wrote:
| > it makes more sense to think of it as a blurring operation.
|
| It's not a blurring operation, it's not even an operation on
| images. And even when applied to images, only some convolutions
| result in what could be described as a "blur." This is also
| tautological, because you can't really define why a gaussian
| filter "blurs" an image without first understanding the
| convolution of the image and filter kernel.
|
| imho you should throw out the notion of "intuition" and stick
| with the mathematical definition and then explore its
| properties and relationships.
|
| The simplest and most intuitive description I've heard is
| "convolution is a mathematical operation that describes mixing
| two things together" (those things being vectors, matrices,
| continuous functions, etc)
| ndriscoll wrote:
| I don't need to know about convolution to tell you what a
| Gaussian is; it's just e^(-x^2). You can look at a grayscale
| image with a single bright point pixel (like a star in the
| sky) and say it's "sharp", and an image with a gaussian and
| say it's a "blur" or a blob. That's just a qualitative
| description of what a point or a gaussian look like. Then, as
| I described, you can define convolution by saying you're
| replacing each point pixel with an _image_ that is your blur
| with its brightness weighted by the point pixel and located
| at that pixel, and then add them all up.
|
| That intuition also works fine in the time domain, where your
| signal is "blurred" at each instant by the impulse response.
|
| But yeah more generally, all you get is that (f*g)(c) is the
| sum over all a (**) b=c f(a)g(b), where (**) is... some
| operation.
|
| I guess ultimately my point is that looking at g(t-tau) and
| seeing -tau as "flip the signal" and t as "move to the
| desired time" seems absolutely bizarre to me. g(t-tau) is
| g(t), centered at tau. So f(tau)g(t-tau) is g(t), centered at
| tau, weighted by f(tau). Then sum over all tau. I can never
| see where the flip comes in; it seems backwards to me. Nor
| can I see how this flipping thing would intuitively arise in
| physical systems. Obviously they're equivalent, but it just
| doesn't make sense to me.
| nh23423fefe wrote:
| Ok but when i try this, i dont get any good intuitions.
|
| polynomial multiplication is convolution, so what am i
| supposed to do here when you say, "g is blurring function"
|
| When i think of multiplying term by term an summing over
| the diagonal I can clearly see what the f(tau)g(t-tau) is
| doing, its collecting terms with like exponents.
|
| This also makes sense when I think of probability
| distributions, i'm multiplying the densities and summing
| "over the diagonal"
| Severian wrote:
| IMO 3blue1brown presents this in a very easy to understand way:
| https://www.youtube.com/watch?v=KuXjwB4LzSA
| 3abiton wrote:
| I wonder if we will ever be at a stage where LLM can generate
| videos like that as an ELI5
| CyberDildonics wrote:
| Any explanation should start by saying a convolution is just a
| weighted average.
| wnoise wrote:
| Averages are normalized. Convolutions only are if they
| integrate to 1.
| jjgreen wrote:
| Well it is, sort of, but treating one of the functions as
| weights and the other as the thing that is averaged misses out
| on the symmetry f*g = g*f
| onos wrote:
| It just smears out a signal.
| ur-whale wrote:
| Convolution, or the art of given hard names to easy things.
|
| Convolution is something you learn when you're 7.
|
| It's called weighted average.
| wnoise wrote:
| Weighted sum, sure. (Average is normalized).
|
| But it's not just one sum, but a _collection_ of weighted sums,
| and those have a particular order for reasons.
___________________________________________________________________
(page generated 2023-12-04 23:01 UTC)