[HN Gopher] How to Implement a Cosine Similarity Function in Typ...
___________________________________________________________________
How to Implement a Cosine Similarity Function in TypeScript
Author : alexop
Score : 21 points
Date : 2025-03-09 09:20 UTC (1 day ago)
(HTM) web link (alexop.dev)
(TXT) w3m dump (alexop.dev)
| schappim wrote:
| I attempted to implement this on the front end of my e-commerce
| site, which has approximately 20,000 products (see gist [1]). My
| goal was to enhance search speed by performing as many local
| operations as possible.
|
| The biggest performance gain came from moving to dot products.
|
| Regrettably, the sheer size of the index of embeddings rendered
| it impractical to achieve the desired results.
|
| 1.
| https://gist.github.com/schappim/d4a6f0b29c1ef4279543f6b6740...
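One common reading of "moving to dot products" is to normalize every embedding to unit length once, so that each comparison at query time is a plain dot product. The sketch below assumes that reading; it is not taken from the linked gist, and the helper names `normalize` and `dot` are hypothetical.

```typescript
// Hypothetical helper: scale a vector to unit length (done once, at index time).
function normalize(v: number[]): number[] {
  const norm = Math.hypot(...v);
  // A zero vector has no direction; return a copy rather than divide by zero.
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

// For unit-length inputs, the dot product equals the cosine similarity.
function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Illustrative data only: two vectors pointing in the same direction.
const query = normalize([3, 4]);
const product = normalize([6, 8]);
const score = dot(query, product); // approximately 1
```

This only reduces the work per comparison. It does not shrink the embedding index itself, which is the problem the comment ends on.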
| alexop wrote:
| This looks nice. I also played around over the weekend with Vue and
| Transformers.js to build the embeddings locally. See
| https://github.com/alexanderop/vue-vector-search
| itishappy wrote:
| Those are some _janky_ diagrams. The labels are selectable, and
| therefore are repeatedly highlighted and un-highlighted while
| dragging the vector around. The "direction only" arrow prevents
| you from changing the magnitude, but it doesn't prevent said
| magnitude from changing, and it does so often because the inputs
| are quantized but the magnitude isn't. Multiple conventions for
| decimals are used within the same diagram. The second diagram
| doesn't center the angle indicator on the actual angle. Also the
| "send me feedback on X" popup didn't respond to the close
| button, but then disappeared when I scrolled again so maybe it
| did? I'm running Chrome 134.0.6998.36 for Windows 10 Enterprise
| 22H2 build 19045.5487.
|
| This whole thing looks like unreviewed AI. Stylish but
| fundamentally broken. I haven't had a chance to dig into the meat
| of the article yet, but unfortunately this is distracting enough
| that I'm not sure I will.
|
| Edit: I'm digging into the meat, and it's better! Fortunately, it
| appears accurate. Unfortunately, it's rather repetitive. There are
| two paragraphs discussing the meaning of -1, 0, and +1
| interleaved with multiple paragraphs explaining how cosine
| similarity allows vectors to be compared regardless of magnitude.
| The motivation is spread throughout the whole thing and is
| repetitive, and the real-world examples seem similar, though
| formatted just differently enough to make it hard to tell at a
| glance.
|
| To try to offer suggestions instead of just complaining... Here's
| my recommended edits:
|
| I'd move the simple English explanation to the top after the
| intro, then remove everything but the diagrams until you reach
| the example. I'd completely remove the explanation of vectors
| unless you're going to include an explanation of dot products. I
| really like the functional approach, but feel like you could
| combine it with the `Math.hypot` example (leave the full formula
| as a comment, the rest is identical), and with the full example
| (although it's missing the `Math.hypot` optimization). Finally, I
| feel like you could get away with just one real web programming
| example, though I don't know which one I'd choose. The last section
| about using OpenAI for embeddings and its disclaimer is already
| great!
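The suggested merge of the functional approach with `Math.hypot`, keeping the full formula as a comment, might look roughly like the sketch below. It is not the article's code. The length check and the choice to return 0 for a zero vector are assumptions made here.

```typescript
// cosine(a, b) = (a . b) / (|a| * |b|)
// where |v| = sqrt(v[0]^2 + v[1]^2 + ...), which is what Math.hypot(...v) computes.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  const dotProduct = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const magnitudes = Math.hypot(...a) * Math.hypot(...b);
  // Assumption: report 0 for a zero vector instead of returning NaN.
  return magnitudes === 0 ? 0 : dotProduct / magnitudes;
}

console.log(cosineSimilarity([1, 0], [2, 0]));  // 1: same direction
console.log(cosineSimilarity([1, 0], [0, 1]));  // 0: perpendicular
console.log(cosineSimilarity([1, 0], [-1, 0])); // -1: opposite direction
```

The three calls at the end cover the -1, 0, and +1 cases mentioned above.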
| alexop wrote:
| Thank you for the good feedback. I tried to improve that. I was
| writing the blog post for myself to understand Cosine
| Similarity, which is why it's maybe a bit repetitive, but this
| is the best way for me to learn something. I get your point.
| Next time I will write it better. Good feedback - I love that.
| itishappy wrote:
| Ha, when you put it that way, I can totally see why it read
| like that!
|
| It looks super great now. What you have here leaves an
| entirely different impression, and a stylish one!
|
| Two last suggestions:
|
| * Now I'm thinking the "Why Cosine Similarity Matters for
| Modern Web Development" section belongs at the top, right
| after your intro.
|
| * The angle indicator is still a bit wonky in the diagram. I
| might even take direction-only mode out entirely since, as you
| point out, cosine similarity is invariant to changes in
| magnitude.
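The invariance mentioned in the last bullet can be checked in one line. Scaling a vector by any positive factor k (a symbol introduced here for illustration) multiplies both the dot product and the magnitude by k, so the factor cancels:

```latex
\cos(k\mathbf{a}, \mathbf{b})
  = \frac{(k\mathbf{a}) \cdot \mathbf{b}}{\lVert k\mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{k \, (\mathbf{a} \cdot \mathbf{b})}{k \, \lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \cos(\mathbf{a}, \mathbf{b}), \qquad k > 0
```

For negative k the sign flips, because the vector then points the opposite way.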
| ashvardanian wrote:
| It's a nice post, but "using array methods" probably shouldn't be
| placed in the "Efficient Implementation" section. As often
| happens with high-level languages, a single _plain old_ loop is
| faster than three array methods.
|
| Similarly, if you plan to query those vectors in search, you
| should consider contiguous `TypedArray` types and smaller scalars
| than the double precision `number`.
|
| I know very little about JS, but some of the amazing HackerNews
| community members have previously helped port SimSIMD to
| JavaScript (https://github.com/ashvardanian/SimSIMD), and I wrote
| a blog post covering some of those JS/TS-specifics, NumJS, and
| MathJS in 2023
| (https://ashvardanian.com/posts/javascript-ai-vector-search/).
|
| Hopefully, it should help unlock another 10-100x in performance.
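A minimal sketch of the two suggestions above: one plain loop instead of three array-method passes, running over `Float32Array` data. It does not use SimSIMD and is not benchmarked here. The function name `cosineSimilarityF32`, the packed row layout, and the sample values are assumptions for illustration.

```typescript
// Single pass: accumulate the dot product and both squared magnitudes together.
function cosineSimilarityF32(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i];
    const y = b[i];
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Assumption: report 0 for a zero vector instead of returning NaN.
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical layout: all embeddings packed row by row into one contiguous buffer.
const dims = 3;
const index = new Float32Array([
  1, 0, 0, // row 0
  0, 1, 0, // row 1
  1, 1, 0, // row 2
]);

// subarray() returns a view onto the same memory, so no row is copied.
const row = (i: number): Float32Array => index.subarray(i * dims, (i + 1) * dims);

console.log(cosineSimilarityF32(row(0), row(1))); // 0: perpendicular rows
```

`Float32Array` stores each value in 4 bytes, half the size of a double-precision `number`, at the cost of some precision.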
___________________________________________________________________
(page generated 2025-03-10 23:00 UTC)