[HN Gopher] Derivative at a Discontinuity
       ___________________________________________________________________
        
       Derivative at a Discontinuity
        
       Author : yuppiemephisto
       Score  : 128 points
       Date   : 2024-12-03 22:36 UTC (6 days ago)
        
 (HTM) web link (alok.github.io)
 (TXT) w3m dump (alok.github.io)
        
       | ogogmad wrote:
       | I think you can get a generalisation of autodiff using this idea
       | of "nonstandard real numbers": You just need a computable field
       | with infinitesimals in it. The Levi-Civita field looks especially
       | convenient because it's real-closed. You might be able to get an
       | auto-limit algorithm from it by evaluating a program infinitely
       | close to a limit. I'm not sure if there's a problem with
       | numerical stability when something like division by
       | infinitesimals gets done. Does this have something to do with how
       | Mathematica and other CASes take limits of algebraic expressions?
       | 
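The autodiff-via-infinitesimals idea above can be sketched with ordinary dual numbers (a minimal illustration of my own, truncating at the first infinitesimal power rather than using the full Levi-Civita field; the `Dual` class and `derivative` helper are hypothetical names, not the article's code):

```python
# Minimal dual-number sketch of "autodiff via infinitesimals":
# a Dual carries a value and the coefficient of an infinitesimal eps,
# with eps^2 truncated to 0. (The Levi-Civita field keeps all powers;
# this is the simplest possible cut-down version.)

class Dual:
    def __init__(self, re, eps=0.0):
        self.re, self.eps = re, eps

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.re + other.re, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps^2 = 0
        return Dual(self.re * other.re,
                    self.re * other.eps + self.eps * other.re)

    __rmul__ = __mul__

def derivative(f, x):
    # Evaluate f at x + eps; the eps-coefficient is f'(x).
    return f(Dual(x, 1.0)).eps

print(derivative(lambda x: x * x * x + 3 * x, 2.0))  # f'(2) = 3*2^2 + 3 = 15
```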
       | -----
       | 
       | Concerning the Dirac delta example: I think this is probably a
       | pleasant way of using a sequence of better and better
       | approximations to the Dirac delta. Terry Tao has some nice blog
       | posts where he shows that a lot of NSA can be translated into
       | sequences, either in a high-powered way using ultrafilters, or in
       | an elementary way using passage to convergent subsequences where
       | necessary.
       | 
       | An interesting question is: What does distribution theory really
       | accomplish? Why is it useful? I have an idea myself but I think
       | it's an interesting question.
        
         | srean wrote:
          | Thanks a bunch for pointing me towards the Levi-Civita
          | field. Where can I learn more? Any pedagogic text?
        
           | yuppiemephisto wrote:
           | See my code at the end. The Wikipedia article is pretty good
           | too. I can send you more if you like.
        
             | srean wrote:
             | Found it, thanks.
        
         | srean wrote:
         | > I think this is probably a pleasant way of using a sequence
         | of better and better approximations to the Dirac delta.
         | 
         | That can give wrong answers because derivative of the limit is
         | not always the limit of the derivative.
         | 
          | When modeling phenomena with the Dirac delta, I think the
          | question becomes: do I really need a discontinuity to have a
          | useful model, or can I get away with smoothing the
          | discontinuity out?
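That caveat can be checked with a standard counterexample (my own sketch, not from the thread): f_n(x) = sin(n^2 x)/n converges uniformly to the zero function, whose derivative is 0, yet f_n'(0) = n grows without bound.

```python
import math

# f_n(x) = sin(n^2 x) / n: |f_n| <= 1/n, so f_n -> 0 uniformly,
# and the limit function has derivative 0 everywhere.
def f(n, x):
    return math.sin(n * n * x) / n

# Exact derivative of f_n: n * cos(n^2 x); at x = 0 it equals n.
def f_prime(n, x):
    return n * math.cos(n * n * x)

for n in (10, 100, 1000):
    print(n, abs(f(n, 0.5)), f_prime(n, 0.0))
# sup|f_n| shrinks like 1/n while f_n'(0) = n blows up:
# the limit of the derivatives is not the derivative of the limit.
```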
        
         | cjfd wrote:
         | Distribution theory has lots of applications in physics. The
         | charge density of a point particle is the delta function.
         | 
         | Also when Fourier transforming over the whole real line (not
         | just an interval where the function is periodic), one has
         | identities that involve delta functions. E.g. \int dx e^(i * k1
         | * x) e^(-i * k2 * x) = 2 * pi * delta (k1 - k2).
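One way to see that identity numerically (a sketch of my own, not from the comment): truncate the integral at ±L, which gives the kernel 2*sin(dk*L)/dk, and check that pairing it with a smooth test function approaches 2*pi times the function's value at dk = 0.

```python
import math

def kernel(dk, L):
    # \int_{-L}^{L} e^(i*dk*x) dx = 2*sin(dk*L)/dk (real for real dk)
    return 2.0 * math.sin(dk * L) / dk

def pair_with_gaussian(L, h=2e-4, A=8.0):
    # Midpoint Riemann sum of kernel(dk, L) * exp(-dk^2) over [-A, A];
    # the midpoint grid never lands exactly on dk = 0.
    total, dk = 0.0, -A + h / 2
    while dk < A:
        total += kernel(dk, L) * math.exp(-dk * dk) * h
        dk += h
    return total

for L in (1.0, 5.0, 50.0):
    print(L, pair_with_gaussian(L))
# As L grows the result approaches 2*pi*exp(0) = 2*pi ~ 6.2832:
# the truncated kernel acts like 2*pi*delta(k1 - k2).
```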
        
           | elcritch wrote:
           | That's fascinating about charge density of a particle being a
           | dirac delta function. Is that a mathematical convenience or
           | something deeper in the theory?
        
             | cjfd wrote:
             | Well, if we assume that a point particle is an infinitely
             | small thing with all of its charge concentrated in one
             | point, the dirac delta function is obviously the correct
             | way to describe that. Of course, there is not really a way
             | to find out whether that is true. Still, the delta function
             | makes sense if something is so small that we do not know
              | its size. This idealization has, however, led to problems
              | in classical electrodynamics:
              | https://en.wikipedia.org/wiki/Abraham%E2%80%93Lorentz_force
              | Search for 'preacceleration' on that page. This particular
              | problem was ultimately solved by realizing that in this
              | context quantum electrodynamics is the theory that
              | applies. But, then again,
             | using point particles also causes problems, namely the need
             | to renormalize the theory which is or is not a problem
             | depending on your point of view.
        
           | ogogmad wrote:
           | The article showed that Dirac deltas could be defined WITHOUT
           | distributions. You ignored the article when answering my
           | question.
           | 
           | The question is why distribution theory is a particularly
           | good approach to notions like the Dirac delta.
        
       | dhosek wrote:
        | One minor nit: A function can be differentiable at _a_ and
        | discontinuous at _a_ even with the standard definition of the
        | derivative. A trivial example would be the function _f_(_x_) =
        | (_x_^2 - 1)/(_x_ - 1), which is undefined at _x_ = 1, but
        | _f_'(1) = 1 (in fact derivatives have exactly this sort of
        | discontinuity in them, which is why they're defined via
        | limits). In complex analysis, this sort of "hole" in the
        | function is called a removable singularity[1], which is one of
        | three types of singularities that show up in complex functions.
       | 
        | *
       | 
       | 1. Yes, this is mathematically the reason why black holes are
       | referred to as singularities.
        
         | dawnofdusk wrote:
         | I don't think it makes sense to allow derivatives of a function
         | f to have a larger domain than the domain of f.
         | 
         | >which is why they're defined via limits
         | 
         | They're defined via studying f(x+h) - f(x) with a limit h -> 0.
         | But, your example is taking two limits, h->0 and x->1,
         | simultaneously. This is not the same thing.
        
         | bikenaga wrote:
         | I'm not understanding what you're saying. The standard
         | definition of the derivative of f at c is
         | 
          | f'(c) = lim_{h -> 0} (f(c + h) - f(c))/h
         | 
         | The definition would not make sense if f wasn't defined at c
         | (note the "f(c)" in the numerator). For instance, it can't be
          | applied to your f(x) = (x^2 - 1)/(x - 1) at x = 1, because f(1)
         | is not defined.
         | 
         | And it's a standard result (even stated in Calc 1 classes) that
         | if a function is differentiable at a point, then it's
         | continuous there. For example:
         | 
         | 5.2 Theorem. Let f be defined on [a, b]. If f is differentiable
          | at a point x ∈ [a, b], then f is continuous at x.
         | 
         | (Walter Rudin, "Principles of Mathematical Analysis", 3rd
         | edition, p. 104)
         | 
         | Or:
         | 
         | Theorem 2.1 If f is differentiable at x = a, then f is
         | continuous at x = a.
         | 
          | (Robert Smith and Roland Minton, "Calculus - Early
          | Transcendentals", 4th edition, p. 140)
         | 
          | It's true that your f(x) = (x^2 - 1)/(x - 1) has a removable
          | discontinuity at x = 1, since if we define g(x) = f(x) for
          | x ≠ 1 and g(1) = 2, then g is continuous. Was this what you
         | meant?
        
           | terminalbraid wrote:
           | This is correct. You cannot have a discontinuity with any
           | accepted definition of a derivative (and your definition is
           | explicit about this: the value f(c) must exist). Just
           | allowing the limits on both sides to be equal already has a
           | mathematical definition which is that of a functional limit,
           | the function in this case being (f(x) - flim(c))/ (x-c) where
           | flim(c) is the value of a (different) functional limit of
           | f(x): x->c (as f(c) doesn't exist).
           | 
            | And yes, defining a new function with that hole explicitly
            | filled in with a value that makes it continuous is the
            | typical prescription. It does _not_ imply the derivative
            | exists for the original function, as the other post posits.
        
           | dwattttt wrote:
           | https://en.m.wikipedia.org/wiki/Classification_of_discontinu.
           | .. is responsive and quite accessible. It notes that there
           | doesn't have to be an undefined point for a function to be
           | discontinuous (and that terminology often conflates the two),
           | and matches what I recall of determining that if the limit of
           | the derivative from both sides of the discontinuity exists
           | and is equal, the derivative exists.
        
             | bikenaga wrote:
             | > ... there doesn't have to be an undefined point for a
             | function to be discontinuous.
             | 
              | That's right. In the example f(x) = (x^2 - 1)/(x - 1) for
              | x ≠ 1, if we further define f(1) = 0, the function is now
              | defined at x = 1, but discontinuous there.
             | 
             | > ... if the limit of the derivative from both sides of the
             | discontinuity exists and is equal, the derivative exists.
             | 
             | (You probably mean "both sides of the point", since if
             | there's a discontinuity there the derivative can't exist.)
             | Your point that, if the left and right-hand limits both
             | exist and are equal, then the derivative exists (and equals
             | their common value) is true for all limits.
             | 
             | Also, there's a difference between the use of the word
             | "continuous" in calc courses and in topology. In calc
             | courses where functions tend to take real numbers to real
             | numbers, a function may be said to be "not continuous" at a
             | point where it isn't defined. So f(x) = 1/(x - 2) is "not
             | continuous at 2". But in topology, you only consider
             | continuity for points in the domain of the function. So
              | since the (natural) domain of f(x) = 1/(x - 2) is x ≠ 2,
             | the function is continuous everywhere (that it's defined).
        
               | dwattttt wrote:
               | I was actually aiming for the situation where a function
               | is defined on all reals but still discontinuous (e.g. the
               | piecewise function in the wiki article for the removable
               | discontinuity). So there's a discontinuity (x=1), however
               | the function is defined everywhere.
        
           | smokedetector1 wrote:
            | The standard definition of the derivative at c involves
            | the assumption that f is defined at c.
           | 
           | However, you could also (probably) define the derivative as
           | lim_{h->0} (f(c+h) - f(c-h))/2h, so without needing f(c) to
           | be defined. But that's not standard.
        
             | JadeNB wrote:
             | > However, you could also (probably) define the derivative
             | as lim_{h->0} (f(c+h) - f(c-h))/2h, so without needing f(c)
             | to be defined. But that's not standard.
             | 
             | Although this gives the right answer whenever f is
             | differentiable at c, it can wrongly think that a function
             | is differentiable when it isn't, as for the absolute-value
             | function at c = 0.
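That absolute-value example is easy to check numerically (a small sketch of my own): the symmetric difference quotient of |x| at 0 is identically 0, even though the one-sided quotients disagree, so no ordinary derivative exists there.

```python
def symmetric_quotient(f, c, h):
    # (f(c+h) - f(c-h)) / (2h): the symmetric derivative's quotient
    return (f(c + h) - f(c - h)) / (2 * h)

def one_sided_quotient(f, c, h):
    # (f(c+h) - f(c)) / h: the ordinary difference quotient
    return (f(c + h) - f(c)) / h

for h in (0.1, 0.01, 0.001):
    print(h,
          symmetric_quotient(abs, 0.0, h),   # always 0: |h| - |-h| = 0
          one_sided_quotient(abs, 0.0, h),   # +1 from the right
          one_sided_quotient(abs, 0.0, -h))  # -1 from the left
# The symmetric quotient converges to 0 at the corner of |x|,
# where the ordinary derivative does not exist.
```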
        
               | smokedetector1 wrote:
               | Good point. So this is probably one of the reasons why
               | the version I stated isn't used.
        
               | JadeNB wrote:
               | It is used, just with the caveat in mind that it may
               | exist when the derivative doesn't. It is usually called
               | the symmetric derivative
               | (https://en.wikipedia.org/wiki/Symmetric_derivative).
        
         | vouaobrasil wrote:
          | You are wrong. In order for you to make sense of what you
          | are saying, you first must REDEFINE f(x) to be
          | f(x) = (x^2 - 1)/(x - 1) when x != 1 and define f(1) = 2. Of
          | course, then f will be continuous at x = 1 also.
         | 
         | A function is continuous at x = a if it is differentiable at x
         | = a.
         | 
         | You do understand the concept, but your precision in the
         | definitions is lacking.
        
         | Tainnor wrote:
         | > this sort of "hole" in the function is called a removable
         | singularity
         | 
         | It's called "removable" because it can be removed by a
         | continuous extension - the original function itself is still
         | formally discontinuous (of course, one would often "morally"
         | treat these as the same function, but strictly speaking they're
         | not). An important theorem in complex analysis is that any
         | continuous extension at a single point is automatically a
         | holomorphic (= complex differentiable) extension too.
        
       | Animats wrote:
       | Hm. Back when I was working on game physics engines this might
       | have been useful.
       | 
       | In impulse/constraint mechanics, when two objects collide, their
       | momentum changes in zero time. An impulse is an infinite force
       | applied over zero time with finite energy transfer. You have to
       | integrate over that to get the new velocity. This is done as a
       | special case. It is messy for multi-body collisions, and is hard
       | to make work with a friction model. This is why large objects in
       | video games bounce like small ones, changing direction in zero
       | time.
       | 
       | I wonder if nonstandard analysis might help.
        
         | ogogmad wrote:
         | The following is just my opinion:
         | 
         | Integration can be done with its own special arithmetic:
         | Interval arithmetic. I base this suggestion on the fact that
         | this is apparently the only way of automatically getting error
         | bounds on integrals. It's cool that it works.
         | 
         | NSA does not work with a computable field so it's not directly
         | useful. But at the end of the article, there's a link to some
         | code that uses the Levi-Civita field, which is a "nice"
         | approximation to NSA because it's computable and still real-
         | closed. You might be able to do an "auto-limit" using it, in a
         | kind of generalisation of automatic differentiation. This might
         | for instance turn one numerical algorithm, like Householder QR,
         | into another one, like Gaussian elimination, by taking an
         | appropriate limit.
         | 
         | I don't know if these two things interact well in practice:
         | Levi-Civita for algebraic limits and interval arithmetic for
         | integrals. They might! This might suggest rather provocatively
         | that integration is only clumsily interpreted as a limit of
         | some function. Finally tbh, I'm not sure if this is the best
         | solution to the friction/collision detection problem you're
         | describing.
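To make the interval-arithmetic remark concrete (a minimal sketch of my own, not a real validated-numerics library, and ignoring rounding control): for a monotone-increasing integrand you can bound each slice of a Riemann sum by its endpoint values, and the summed bounds rigorously bracket the integral.

```python
# Minimal interval-style bound on an integral: on each subinterval,
# bound a monotone-increasing f below/above by its endpoint values.
# Real interval libraries also use directed rounding; this sketch doesn't.

def integral_bounds(f, a, b, n):
    h = (b - a) / n
    lo = hi = 0.0
    for i in range(n):
        left, right = a + i * h, a + (i + 1) * h
        lo += f(left) * h    # f increasing => minimum on the slice
        hi += f(right) * h   # f increasing => maximum on the slice
    return lo, hi

lo, hi = integral_bounds(lambda x: x * x, 0.0, 1.0, 1000)
print(lo, hi)  # brackets the true value 1/3; width shrinks like 1/n
```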
        
         | lupire wrote:
         | Nonstandard analysis is the mathematical description of your
         | special case. Same thing.
        
         | btilly wrote:
         | Making it work in finite but short time should fix that. A
         | large object generally can deform a larger distance. This makes
         | all collisions inelastic, with large ones being different than
         | small ones.
         | 
         | If you can get realistic billiards breaks, you're on the right
         | track.
        
       | plus wrote:
       | I've personally always thought of the Dirac delta function as
       | being the limit of a Gaussian with variance approaching 0. From
       | this perspective, the Heaviside step function is a limit of the
       | error function. I feel the error function and logistic function
        | approaches _should_ be equivalent, though I haven't worked
        | through the math to show it rigorously.
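The pointwise agreement is at least easy to observe numerically (a small sketch, not a proof): both the rescaled logistic function and the rescaled error function approach the Heaviside step at any fixed x != 0 as the width parameter shrinks.

```python
import math

def logistic_step(x, eps):
    # logistic regularization of the step: 1 / (1 + e^(-x/eps))
    return 1.0 / (1.0 + math.exp(-x / eps))

def erf_step(x, eps):
    # Gaussian-CDF-style regularization: (1 + erf(x/eps)) / 2
    return 0.5 * (1.0 + math.erf(x / eps))

# Away from x = 0, both families approach the same step values.
for eps in (0.5, 0.1, 0.01):
    print(eps, logistic_step(1.0, eps), erf_step(1.0, eps),
          logistic_step(-1.0, eps), erf_step(-1.0, eps))
# Both columns for x = 1 tend to 1 and both for x = -1 tend to 0;
# the two regularizations differ only near the jump.
```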
        
         | yuppiemephisto wrote:
         | All these would be infinitely close in the nonstandard
         | characterization. I just picked logistic because it was easy
         | and step is discontinuous so it shows off the approach's power.
          | If I started with delta instead, I would have used a Gaussian,
          | integrated it, and ended up with erf.
        
         | thrance wrote:
         | It is, in a way. The whole point of distributions is to extend
         | the space of functions to one where more operations are
         | permitted.
         | 
         | The limit of the Gaussian function as variance goes to 0 is not
         | a function, but it is a distribution, the Dirac distribution.
         | 
         | Some distributions appear in intermediate steps while solving
         | differential equations, and then disappear in the final
         | solution. This is analogous to complex numbers sometimes
         | appearing while computing the roots of a cubic function, but
         | not being present in the roots themselves.
        
       | mturmon wrote:
       | I really appreciated this piece. Thank you to OP for writing and
       | submitting it.
       | 
       | The thing that piqued my interest was the side remark that the
       | Dirac delta is a "distribution", and that this is an unfortunate
       | name clash with the same concept in probability (measure theory).
       | 
       | My training (in EE) used both Dirac delta "functions" (in signal
       | processing) and distributions in the sense of measure theory (in
       | estimation theory). Really two separate forks of coursework.
       | 
       | I had always thought that the use of delta functions in
       | convolution integrals (signal processing) was ultimately
       | justified by measure theory -- the same machinery as I learned
       | (with some effort) when I took measure theoretic probability.
       | 
       | But, as flagged by the OP, that is not the case! Mind blown.
       | 
       | Some of this is the result of the way these concepts are taught.
       | There is some hand waving both in signal processing, and in
       | estimation theory, when these difficult functions and integrals
       | come up.
       | 
       | I'm not aware of signal processing courses (probably graduate
       | level) in which convolution against delta "functions" uses the
       | distribution concept. There are indeed words to the effect of
       | either,
       | 
       | - Dirac delta is not a function, but think of it as a limit of
       | increasingly-concentrated Gaussians;
       | 
       | - use of Dirac delta is ok, because we don't need to represent it
       | directly, only the result of an inner product against a smooth
       | function (i.e., a convolution)
       | 
       | But these excuses are not rigorously justified, even at the
       | graduate level, in my experience.
       | 
       | *
       | 
       | Separately from that, I wonder if OP has ever seen the book
       | Radically Elementary Probability Theory, by Edward Nelson
       | (https://web.math.princeton.edu/~nelson/books/rept.pdf). It uses
       | nonstandard analysis to get around a lot of the (elegant)
       | fussiness of measure theory.
       | 
       | The preface alone is fun to read.
        
         | creata wrote:
         | > But these excuses are not rigorously justified, even at the
         | graduate level, in my experience.
         | 
         | Imo, the informal use is already pretty close to the formal
         | definition. Formally, a distribution is _defined_ purely by its
         | inner products against certain smooth functions (usually the
         | ones with compact support) which is what the OP alluded to when
         | he said:
         | 
         | > The formal definition of a generalized function is: an
         | element of the continuous dual space of a space of smooth
         | functions.
         | 
         | That "element of the continuous dual space" is just a function
         | that takes in a smooth function with compact support f, and
         | returns what we take to be the inner product of f with our
         | generalized function.
         | 
         | So (again, imo) "we don't need to represent it directly, only
         | the result of an inner product against a smooth function" isn't
         | _that_ distant to the formal definition.
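That "a distribution is what it does to test functions" view translates almost directly into code (a toy sketch of my own, not from the post): represent a distribution as a callable that consumes a test function.

```python
# Toy sketch: a distribution is just a functional on test functions.
# delta pairs a test function with its value at 0; the distributional
# derivative of delta pairs it with minus its derivative at 0.

def delta(phi):
    # <delta, phi> = phi(0)
    return phi(0.0)

def delta_prime(phi, h=1e-6):
    # <delta', phi> = -phi'(0), here via a central difference for phi'(0)
    return -(phi(h) - phi(-h)) / (2 * h)

phi = lambda x: (1 + x) ** 2     # a smooth test function (compact support ignored)
print(delta(phi))                # phi(0) = 1
print(delta_prime(phi))          # -phi'(0) = -2, up to difference error
```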
        
           | mturmon wrote:
           | I hear you, and I admit I'm drawing a fuzzy line (is the
           | conventional approach "rigorous").
           | 
            | Here are two "test functions":
           | 
           | - we learned much about impulse responses, and sometimes
           | considered responses to dipoles, etc. However, if I read the
           | Wikipedia article correctly (it's not great...), the theory
           | implies that a distribution (in the technical sense) has
           | derivatives of any order. I'm not sure I really knew that I
           | could count on that. A rigorous treatment would have given me
           | that assurance.
           | 
           | - if I understand correctly, the concept of introducing an
           | impulse to a system that has an identity impulse response,
           | which implies an inner product of delta with itself, is not
           | well-defined. Again, I'm not sure if we covered that concept.
           | (Admittedly, it's been a long time.)
        
             | mturmon wrote:
             | oops, I realize I completely mis-stated the second point.
             | What it should say is:
             | 
             | - If delta(x) is OK, why is delta^2(x) not OK?
        
         | marcosdumay wrote:
          | The Dirac delta is a unit vector when represented in a basis
          | that it's a component of.
          | 
          | I don't know what kind of justification you expect. There's a
          | Dirac-delta-sized "hole" in linear algebra that mathematicians
          | need a name for. It's not like we can just leave it there,
          | unfilled.
        
         | dannyz wrote:
          | While the limit of increasingly concentrated Gaussians does
          | result in a Dirac delta, it is not the only way the Dirac
          | delta comes about, and it is probably not the correct way to
          | think about it in the context of signal processing.
         | 
         | When we are doing signal processing the Dirac delta primarily
         | comes about as the Fourier transform of a constant function,
         | and if you work out the math this is roughly equivalent to a
         | sinc function where the oscillations become infinitely fast.
         | This distinction is important because the concentrated Gaussian
         | limit has the function going to 0 as we move away from the
         | origin, but the sinc function never goes to 0, it just
         | oscillates really fast. This becomes a Dirac delta because any
         | integral of a function multiplied by this sinc function has
         | cancelling components from the fast oscillations.
         | 
          | The poor behavior of this limit (primarily numerically) is
          | closely related to the reasons why we have things like the
          | Gibbs phenomenon.
        
         | yuppiemephisto wrote:
         | Thanks! And yeah I'm familiar with Nelson
        
       | shwouchk wrote:
       | It is an interesting piece but to claim that no heavy machinery
       | is used is a bit disingenuous at best. You have defined some
       | purely algebraic operation "differentiation". This operation
       | involves a choice of infinitesimal. Is it trivial to show that
       | the definition is independent of infinitesimal? especially if we
       | are deriving at a hyperreal point? I doubt it and likely you
       | would need to do more complicated set theoretic limits rather
       | analytic limits. How do you calculate the integral of this
       | function? Or even define it? Or rather functions, since it's an
       | infinite family of logistic functions? To even properly define
       | this space you need to go quite heavily into set theory and i
       | doubt many would find it simpler, even than working with
       | distributions
        
         | Tainnor wrote:
         | Even just defining the hyperreals and showing why statements
         | about them are also valid for the reals needs to go through
         | either ultrafilters (which are some rather abstract objects) or
         | model theory. Of course you can just handwave all of that away
         | but then I guess you can also do that with standard analysis.
        
           | yuppiemephisto wrote:
           | There are theories like SPOT and Internal Set Theory that
           | don't require filters.
           | 
           | Plus the ancient mathematicians did _very_ well with just
           | their intuition. And more to the point, I cared much more
           | about building (hyper)number sense than some New Math "let's
           | learn ultrafilters before we've even done arithmetic".
        
             | Tainnor wrote:
             | > Plus the ancient mathematicians did very well with just
             | their intuition.
             | 
             | They did. But they also got things wrong, such as thinking
             | that pointwise limits are enough to carry over continuity
             | (see here for this and other examples:
              | https://mathoverflow.net/a/35558). Anyway, mathematics has
              | changed as a discipline: we now have strong axiomatic
              | foundations, which mean that we can, in principle, always
              | verify whether a proof is correct.
        
         | bubblyworld wrote:
         | The machinery of mathematics goes arbitrarily deep. I think the
         | interesting thing here is that with relatively little training
         | you can start to compute with these numbers, which is
         | _definitely_ not the case with analysis on distributions.
         | 
         | Or put differently - here you can kinda ignore the deeper
         | formalities and still be productive, whereas with distributions
         | you actually need to sit down and pore over them before you can
         | do _anything_.
         | 
          | That said, I'm curious why infinitesimals never took off in
          | physics. This kind of quick, shut-up-and-calculate approach
          | seems right up their alley.
        
       | tzs wrote:
       | Differentiation turns out to be a deeper subject than most people
       | expect even if you just stick to the ordinary real numbers rather
       | than venturing into things like hyperreals.
       | 
       | I once saw in an elementary calculus book a note after the proof
       | of a theorem about differentiation that the converse of the
       | theorem was also true but needed more advanced techniques than
       | were covered in the book.
       | 
       | I checked the advanced calculus and real analysis books I had and
       | they didn't have the proof.
       | 
       | I then did some searching and found mention of a book titled
       | "Differentiation" (or something similar) and found a site that
       | had scans for the first chapter of that book. It proved the
       | theorem on something like page 6 and I couldn't understand it at
        | all. Starting from the beginning, I think I got through maybe a
        | page or two before it got too deep for my mere
        | bachelor's-degree-in-mathematics level of preparation.
       | 
       | I kind of wish I'd bought a copy of that book. I've never since
       | been able to find it. I've found other books with the same or
       | similar title but they weren't it.
        
         | perihelions wrote:
         | Do you remember what the theorem was?
        
           | tzs wrote:
           | Nope.
        
       | chii wrote:
       | Wow, it never occurred to me that the step function and the dirac
       | delta are related in this way! but now that i see it, it's
       | obvious!
       | 
       | I've never learnt this level of maths formally, but it's been an
       | interest of mine on and off. And this post explained it very
        | well, and pretty understandably for the layman.
        
       | thaumasiotes wrote:
       | > The Number of Pieces an Integral is Cut Into
       | 
       | > You're probably familiar with the idea that each piece has
       | infinitesimal width, but what about the question of 'how MANY
       | pieces are there?'. The answer to that is a hypernatural number.
       | Let's call it N again.
       | 
       | Is that right? I thought there was an important theorem
       | specifying that no matter the infinitesimal width of an integral
       | slice, the total area will be in the neighborhood of (=
       | infinitely close to) the same real number, which is the value of
       | the integral. That's why we don't have to specify the value of dx
       | when integrating over dx... right?
        
         | yuppiemephisto wrote:
          | The number N in question will adjust with dx (up to
          | infinitesimal error, anyway). So if dx is halved, N will
          | double. But both retain their character: dx stays
          | infinitesimal and N stays hyperfinite.
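The standard-world shadow of that statement can be sketched quickly (my own illustration): halving dx doubles the piece count N, while the Riemann sum changes only by an amount that shrinks with dx.

```python
# Riemann sums for the integral of x^2 over [0, 1]: halving dx doubles
# the number of pieces N, and all the sums agree with 1/3 up to a small
# error that vanishes with dx.

def riemann(f, a, b, dx):
    n = round((b - a) / dx)      # the "number of pieces"
    return n, sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

for dx in (1e-2, 5e-3, 2.5e-3):
    n, s = riemann(lambda x: x * x, 0.0, 1.0, dx)
    print(dx, n, s)
# N doubles as dx halves, while every sum stays close to 1/3;
# in the hyperreal picture dx is infinitesimal and N hyperfinite,
# and the sum differs from the integral only infinitesimally.
```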
        
       | hoseja wrote:
       | >We'll use the hyperreal numbers from the unsexily named field of
       | nonstandard analysis
       | 
       | There it is.
        
       | agnosticmantis wrote:
       | Related to the Hyperreal numbers mentioned in the article is the
       | class of Surreal numbers which have many fun properties. There's
       | a nice book describing them authored by Don Knuth.
        
         | yuppiemephisto wrote:
         | The hyperreals and surreals are actually isomorphic under a
         | mild strengthening of the axiom of choice (NBG).
         | 
         | https://mathoverflow.net/questions/91646/surreal-numbers-vs-...
         | 
         | See Ehrlich's answer.
        
       ___________________________________________________________________
       (page generated 2024-12-09 23:01 UTC)