[HN Gopher] How we used GPT-4o for image detection with 350 simi...
___________________________________________________________________
How we used GPT-4o for image detection with 350 similar
illustrations
Author : olup
Score : 47 points
Date : 2025-01-10 21:02 UTC (3 days ago)
(HTM) web link (olup-blog.pages.dev)
(TXT) w3m dump (olup-blog.pages.dev)
| olup wrote:
| First time for me posting this kind of story - I thought it would
| make an interesting case on solving a hard computer vision
| problem with a crafty product engineer team.
| caioariede wrote:
| Just a small piece of feedback: I switched to reader mode
| because the font used is very hard for me to read.
| littlestymaar wrote:
| Also, having a blog post about image detection, and not
| showing a single picture in the whole post was quite
| frustrating.
| Oarch wrote:
| Especially given the detailed description, surely the author
| could just generate a similar image.
| vessenes wrote:
| Thanks for the "bitter lesson" news from the frontlines. Curious:
| did you experiment with 4o as the sole pipeline? And of course, as
| I think you mention, it would be interesting to know whether, say,
| Llama 8B could do a similar job.
|
| Congrats on shipping.
| schappim wrote:
| I would love to see the prompt / image data sent to GPT-4o!
| gazchop wrote:
| I hear a lot of qualitative speak but nothing quantitative.
| GaggiX wrote:
| Is there a reason to choose VGG16 over more modern models?
| saint_yossarian wrote:
| I mean, cool tech, but why not just print a QR code next to each
| illustration?
| nnnnico wrote:
| just in: using gpt4o to read QRs
| kredd wrote:
| A bit tangential, but I think we will see a good chunk of small
| teams building competing products in different software business
| segments, just by doubling down on productivity and offering a
| cheaper option thanks to lower operational overhead (read: paying
| engineers). I can think of at least two businesses that can be
| competed in costs if the team can automate a good chunk of it.
| qeternity wrote:
| > I can think of at least two businesses that can be competed
| in costs if the team can automate a good chunk of it.
|
| And which would those be?
| kredd wrote:
| We both know I didn't write it down with the hopes that I'll
| act on them at some point in the near future, and want to
| avoid my imaginary competitors. Even though, in reality, I
| will ponder about it for another week or two, give up without
| actually getting anything done, then regret never trying
| :)
| Imnimo wrote:
| It's tough to judge without seeing examples of the targets and
| the user photos, but I'm curious if this could be done with just
| old-school SIFT. If it really is exactly the same image in the
| corpus and on the wall, does a neural embedding model really
| buy you a lot? A small number of high confidence tie points seems
| like it'd be all you need, but it probably depends a lot on just
| how challenging the user photos are.
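To make the "small number of high-confidence tie points" idea concrete, here is a minimal sketch of Lowe's ratio test, the usual filter applied to SIFT descriptor matches. It assumes the descriptors have already been extracted elsewhere (for example with a SIFT implementation); the function names, the 0.75 ratio, and the `min_matches` threshold are illustrative assumptions, not anything taken from the blog post.

```python
import math

def ratio_test_matches(query_desc, target_desc, ratio=0.75):
    """Return (query_index, target_index) pairs that pass Lowe's ratio test."""
    matches = []
    if len(target_desc) < 2:
        return matches  # the test needs a best and a second-best candidate
    for qi, q in enumerate(query_desc):
        # Euclidean distance from this query descriptor to every target descriptor.
        dists = sorted((math.dist(q, t), ti) for ti, t in enumerate(target_desc))
        (best, best_ti), (second, _) = dists[0], dists[1]
        # Keep the match only if the best candidate clearly beats the runner-up.
        if best < ratio * second:
            matches.append((qi, best_ti))
    return matches

def best_illustration(query_desc, corpus, min_matches=2):
    """Pick the corpus entry with the most tie points, or None if too few."""
    if not corpus:
        return None
    scores = {name: len(ratio_test_matches(query_desc, desc))
              for name, desc in corpus.items()}
    name = max(scores, key=scores.get)
    return name if scores[name] >= min_matches else None
```

Real SIFT descriptors are 128-dimensional, and a production version would normally add a geometric consistency check (such as RANSAC on the matched points) before trusting the count.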
| Morizero wrote:
| I find a lot of applied AI use-cases to be "same as this other
| method, but more expensive".
| Terr_ wrote:
| [delayed]
| gunalx wrote:
| Cool real-life use case. I don't think LMMs usually get applied
| sensibly where they should be, and I am glad that a generic kNN
| model was also used, both to reduce costs and because it is simply
| more suitable.
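For readers unfamiliar with the kNN step mentioned in the last comment, the following is a rough sketch of a nearest-neighbour shortlist over image embeddings using cosine similarity. The embeddings are assumed to come from some feature extractor (the thread mentions VGG16); `knn_shortlist`, the toy vectors, and `k` are hypothetical and not taken from the authors' code.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def knn_shortlist(query, index, k=3):
    """Return the names of the k indexed embeddings most similar to the query."""
    ranked = sorted(index,
                    key=lambda name: cosine_similarity(query, index[name]),
                    reverse=True)
    return ranked[:k]

# Toy index: illustration name -> embedding vector (assumed precomputed).
index = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
    "d": [0.0, 0.0, 1.0],
}
shortlist = knn_shortlist([1.0, 0.0, 0.0], index, k=2)
```

A shortlist like this is cheap to compute, so a more expensive model only has to choose among a handful of candidates rather than the whole set of illustrations.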
___________________________________________________________________
(page generated 2025-01-13 23:00 UTC)