hngopher.com

       [HN Gopher] How we used GPT-4o for image detection with 350 simi...
       ___________________________________________________________________
        
       How we used GPT-4o for image detection with 350 similar
       illustrations
        
       Author : olup
       Score  : 47 points
       Date   : 2025-01-10 21:02 UTC (3 days ago)
        
 (HTM) web link (olup-blog.pages.dev)
 (TXT) w3m dump (olup-blog.pages.dev)
        
       | olup wrote:
       | First time for me posting this kind of story - I thought it would
       | make an interesting case study on solving a hard computer vision
       | problem with a crafty product engineering team.
        
         | caioariede wrote:
         | Just a small piece of feedback... I have switched to reader mode
         | because the font used is very challenging to read for me.
        
           | littlestymaar wrote:
           | Also, having a blog post about image detection, and not
           | showing a single picture in the whole post was quite
           | frustrating.
        
             | Oarch wrote:
             | Especially given the detailed description, surely the
             | author could just generate a similar image.
        
       | vessenes wrote:
       | Thanks for the "bitter lesson" news from the frontlines. Curious:
       | did you experiment with 4o as the sole pipeline? And of course, as
       | I think you mention, it would be interesting to know if, say, Llama
       | 8B could do a similar job as well.
       | 
       | Congrats on shipping.
        
       | schappim wrote:
       | I would love to see the prompt / image data sent to GPT-4o!
        
       | gazchop wrote:
       | I hear a lot of qualitative speak but nothing quantitative.
        
       | GaggiX wrote:
       | Is there a reason to choose VGG16 over more modern models?
        
       | saint_yossarian wrote:
       | I mean, cool tech, but why not just print a QR code next to each
       | illustration?
        
         | nnnnico wrote:
         | just in: using gpt4o to read QRs
        
       | kredd wrote:
       | A bit tangential, but I think we will see a good chunk of small
       | teams building competing products in different software business
       | segments, by just doubling down on productivity and offering a
       | cheaper option due to less operational overhead (read: paying
       | engineers). I can think of at least two businesses that can be
       | competed in costs if the team can automate a good chunk of it.
        
         | qeternity wrote:
         | > I can think of at least two businesses that can be competed
         | in costs if the team can automate a good chunk of it.
         | 
         | And which would those be?
        
           | kredd wrote:
           | We both know I didn't write them down because I hope to act
           | on them at some point in the near future and want to avoid
           | my imaginary competitors. Even though, in reality, I will
           | ponder it for another week or two, give up without actually
           | getting anything done, then regret never trying :)
        
       | Imnimo wrote:
       | It's tough to judge without seeing examples of the targets and
       | the user photos, but I'm curious if this could be done with just
       | old-school SIFT. If it really is exactly the same image in the
       | corpus and on the wall, does a neural embedding model really
       | buy you a lot? A small number of high confidence tie points seems
       | like it'd be all you need, but it probably depends a lot on just
       | how challenging the user photos are.
        
         | Morizero wrote:
         | I find a lot of applied AI use-cases to be "same as this other
         | method, but more expensive".
        
           | Terr_ wrote:
           | [delayed]
        
       | gunalx wrote:
       | Cool real-life use case. I don't think LMMs usually get applied
       | sensibly where they should be, and I am glad that a generic kNN
       | model was also used, both to cut costs and because it is simply
       | more suitable.
        
       ___________________________________________________________________
       (page generated 2025-01-13 23:00 UTC)