[HN Gopher] The fastest object diff library in JavaScript
       ___________________________________________________________________
        
       The fastest object diff library in JavaScript
        
       Author : AsyncBanana
       Score  : 76 points
       Date   : 2021-11-06 14:52 UTC (8 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | throwaway413 wrote:
       | Is this a true deep diff utility? Didn't check the code, but the
       | readme makes no mention of depth capability beyond comparing to
       | some other deep diffing libs.
        
         | nathcd wrote:
         | Yeah, it's just a single function that recurses. So the limit
         | will be your VM's call stack limit.
        
       | parhamn wrote:
       | Interesting it's so much faster, the code looks very natural and
       | trickless besides the richobject check (at least at first
       | glance).
       | 
       | Wonder what everyone else is doing to make it slow.
       | 
       | Also curious what mobx and reacts object diffing looks like now.
        
         | AsyncBanana wrote:
         | For a bit of info on why it is often faster, you can look at
         | https://github.com/AsyncBanana/microdiff/issues/2. I will
         | probably add something to the README about that soon too.
        
         | megous wrote:
         | It doesn't handle arrays, for one example.
        
       | megous wrote:
       | In all seriousness, what's so noteworthy about this? It's one
       | trivial 50 line recursive function. This is kind of stuff devs
       | just type out instead of searching for a npm package, because
       | it's faster that way.
        
         | peey wrote:
         | Would you rather implement `leftPad` or use the npm package?
        
           | hutrdvnj wrote:
           | Let's say I maintain dozen of packages, then I would rather
           | like to write it once and include it everywhere. I could copy
           | and paste all the common functions instead of including, but
           | if I notice a bug or have to upgrade the code for some
           | reason, then I have to edit the code dozen times. But if I
           | only have to maintain one package, then I prefer the
           | copy&paste approach.
        
           | megous wrote:
           | I'd use standard String.prototype.padStart. For anything more
           | complicated, but not too complex (say "libraries" with
           | 100-300 lines of code), it's very tedious and time consuming
           | to find something on npm that satisfies the specific
           | contraints of my app precisely. This certainly fits in that
           | category.
           | 
           | For libraries above 1-2k lines of code that can be quite
           | complicated (eg. things like database connectors), it starts
           | being reasonable to mold expectations of my app around the
           | library.
           | 
           | Also using coherent large utility libraries is something
           | worthwhile (or was in the past), libs like underscore/lodash,
           | etc.
           | 
           | But composing app from random underdocumented 2 function
           | "libraries" from npm is just pain if you know what you're
           | looking for. 1) you can't trust the code, so 2) you have to
           | read it anyway and reading is often slower than writing
        
           | edflsafoiewq wrote:
           | For trivial functions, I'd rather just copy them into my
           | source code.
        
         | qwertox wrote:
         | I wouldn't be able to write this. I mean, I could, but I lack
         | the knowledge of Typescript and willingness to dedicate the
         | needed time to create this while I have got other things to do.
         | 
         | The code is clean, well organized and nice to look at, easy to
         | read through and understand, and it's great that someone has
         | made the effort to create this and share it for us to use.
        
         | StevenWaterman wrote:
         | You only see the finished product, not the 100 failures and
         | incremental improvements over time. You could certainly write
         | something yourself from scratch, but I doubt it would be
         | competitive against this.
        
           | megous wrote:
           | > not the 100 failures and incremental improvements over time
           | 
           | This code has 3 actual code commits over 6 days, none of
           | which changed anything fundamental. What are you talking
           | about?
           | 
           | It's quite possible to compete against this on multiple
           | levels. One would be correctness. The other a bit less
           | wasteful interface. Another would be handling arrays in a
           | more useful way (or at all). And yet another would be safety.
           | Another would be speed and memory use. Yet another would be
           | non-recursive implementation. Another would be documentation
           | (what's this diffing? enumerable/non-enumerable properties?
           | how it deals with getters? how about inherited properties? -
           | I can't know any of this without looking long and hard at the
           | code)... etc. etc.
        
             | can16358p wrote:
             | Note on only the first part: the author might have tested
             | bunch of stuff on their local without commiting until
             | they've found something worth, perhaps? That would result
             | in very few commits.
        
           | engmgrmgr wrote:
           | Mmm, agree with parent; can get faster. Unfortunately lots of
           | perf critical code is created out of necessity and defaults
           | to proprietary.
           | 
           | You could profile this lib and figure out where it's spending
           | most of its time and figure out if there any ways to improve
           | (starts getting into implications of JS <-> system), but as
           | you get more idiosyncratic you lose flexibility and ease of
           | use.
           | 
           | Eg type checks, heap additions, recursion, etc can get "slow"
        
         | AsyncBanana wrote:
         | I agree it can be pretty easy to set up a basic version of
         | this, but eliminating edge cases, optimizing, and unit testing
         | can make it take significantly longer.
        
       | ianberdin wrote:
       | Please compare with a modern competitor: json-patch-plus
       | https://github.com/n1ru4l/graphql-live-query/blob/main/packa...
       | 
       | It's a bit different, but I think it covers more cases.
        
       | kosinus wrote:
       | I maintain symmetry[1] and wanted to try the benchmark against
       | it, but the results are very inconsistent. Symmetry is anywhere
       | between 50% slower to 20% _faster_ than microdiff on the small
       | object benchmark.
       | 
       | Symmetry doesn't get a lot of activity, but I've been using it in
       | production for many years. It does some more work on array
       | diffing, which this benchmark doesn't cover, by implementing part
       | of Myers' algorithm.
       | 
       | [1]: https://github.com/Two-Screen/symmetry/
        
       | laurent123456 wrote:
       | > type: "CREATE" | "REMOVE" | "CHANGE";
       | 
       | Since performance is a goal, why not define an enum and use
       | integer values? I don't know about speed, but the resulting diff
       | would be smaller.
       | 
       | Also I wonder what "const t = true" is for.
        
         | wruza wrote:
         | Usually it doesn't make a difference neither in speed, nor in
         | size, because of "interning" of literals. Unless you jsonify
         | it, of course. It could show up, if you'd later compare .type
         | with some "CRE"+"ATE" values made funny way, but not for
         | literals, e.g.:                 // x.js       export const foo
         | = "foo"            // y.js       if (x.foo === "foo") {
         | // ^ as cheap as int       }            const foo2 =
         | "FOO".toLowerCase()       if (x.foo === foo2) {         // ^
         | questionable         // depends on jit/optimization       }
         | 
         | https://stackoverflow.com/questions/5276915/do-common-javasc...
         | 
         | https://en.wikipedia.org/wiki/String_interning
        
       | drodil wrote:
       | Nice one. Have you tried replacing the for loops with forEach? It
       | might be even more performant.
        
         | sbr464 wrote:
         | Function callbacks are definitely slower. You could probably
         | shave another 10% by changing the array.push() calls to direct
         | assignments: array[array.length] = value. Same with the nested
         | array.map, changing to a for loop.
        
         | Etheryte wrote:
         | In Javascript, forEach, map, reduce and other similar array
         | methods are considerably slower than a regular for loop. You
         | can see [0] and the linked questions for more technical
         | details.
         | 
         | [0] https://stackoverflow.com/q/22155280/1470607
        
           | [deleted]
        
       | megous wrote:
       | What if object references have loops? That doesn't seem to be
       | handled. Also it looks like it will diff not just own properties
       | but properties from prototype objects, too.
        
         | tantalor wrote:
         | > loops
         | 
         | Don't
        
           | akie wrote:
           | Ok, let's pretend they don't exist and potentially hang the
           | app.
        
       | tekkk wrote:
       | Hey this is really cool! Not sure how this compares to
       | https://github.com/benjamine/jsondiffpatch which has been
       | unmaintained for some time. But if this could replace it I'd
       | happy to start using it.
        
         | AsyncBanana wrote:
         | Thanks for your interest! Currently, Microdiff does not have a
         | patching functionality by default, and the API is different. If
         | there is enough interest, though, I might make a compatibility
         | layer to ease the differences, and it probably would not be
         | hard to do yourself too.
        
       | wereHamster wrote:
       | > const t = true;
       | 
       | That should be the job of a minifier, makes the code less
       | readable.
        
       | wruza wrote:
       | Fast diffing of objects may be seen as rarely needed to a regular
       | user, but it's very important part of handling vdom/vnode diffs.
       | Recently I researched few lesser web ui libraries for my own use
       | and found out that many of them do not diff vnodes correctly,
       | e.g. in case of RegExp or Date arguments. I don't remember exact
       | library names vs flaws, but my conclusion was that almost all of
       | non-actively supported libs had it in one way or another. If you
       | want to explore, check that first, because it's a surprising pain
       | point.
       | 
       | Seems that this particular library is aware of these potential
       | issues (which motivated me to write this comment):
       | https://github.com/AsyncBanana/microdiff/issues/2
        
         | nathcd wrote:
         | But vnodes have a predefined shape, whereas this library is for
         | diffing (relatively) arbitrary objects, right? If your data
         | isn't arbitrarily shaped, I would assume you'd get much better
         | performance by just implementing a simple direct diff, no?
        
           | eyelidlessness wrote:
           | And, often but probably not always, using a class instance
           | rather than a POJO. Seems unintuitive (function call
           | overhead), but it can signal to the JIT that the object's
           | type is relatively stable.
        
           | wruza wrote:
           | Yes. I should have noted for clarity that vdom diff is a
           | special case, not a general one like that of subj. But their
           | similarity reminded me of the experience.
        
             | nathcd wrote:
             | Ah that's my mistake, I read more into your comment than
             | you wrote. Sorry!
             | 
             | It's a good point about edge cases in diffing. Dynamic
             | value diffing in dynamically-typed languages is an
             | interesting problem to think about.
        
       | spankalee wrote:
       | I know some diffing libraries are reluctant to do this for perf
       | reasons, but many of them do not handle cycles correctly leading
       | to potential infinite loops based on the shape of the objects and
       | order of arguments.
       | 
       | Fast is great, but unless you know that your data doesn't have
       | cycles, infinite loops take a long time :)
        
       ___________________________________________________________________
       (page generated 2021-11-06 23:01 UTC)