[HN Gopher] I accidentally built a meme search engine
       ___________________________________________________________________
        
       I accidentally built a meme search engine
        
       Author : EamonLeonard
       Score  : 145 points
       Date   : 2024-04-12 18:13 UTC (1 day ago)
        
 (HTM) web link (harper.blog)
 (TXT) w3m dump (harper.blog)
        
       | systemz wrote:
       | Interesting, I knew about something similar but more focused on
       | server side: https://github.com/simon987/sist2
        
       | bo0tzz wrote:
       | Last year we added CLIP-based image search to https://immich.app/
       | and even though I have a pretty good understanding of how it
       | works, it still blows my mind damn near every day. It's the
       | closest thing to magic I've ever seen.
        
         | apricot13 wrote:
         | Just needs OCR for the perfect meme searching solution!
        
           | bo0tzz wrote:
           | OCR will be there at some point, but it already does a
           | surprisingly good job without!
        
         | dsvf wrote:
         | Happy immich user here! I once took a cute photo of our baby
         | chewing on a whisk, and actually finding the correct photo in
         | a huge, unsorted, untagged pile of photos by simply searching
         | for "whisk" was a mind-blowing experience! It is an amazingly
         | powerful tool!
        
       | robotnikman wrote:
       | Gives me an idea for a meme search service I can use locally to
       | search through all the images on my computer to find a specific
       | meme (I tend to know I downloaded a funny one and then when I
       | want to share it with someone I can never find it)
        
       | rmdes wrote:
       | I want to do this but for 30GB of PDFs
        
         | harper wrote:
         | this shouldn't be too hard
        
       | ritavdas wrote:
       | Did you host any version of this on the cloud for the general
       | public to access?
        
       | ianbicking wrote:
       | Huh, are the image vector embeddings implicitly doing OCR as
       | well? Because it seems like the meme search is pulling from the
       | text as well as images, though it's not entirely clear.
        
         | bo0tzz wrote:
         | CLIP does not have explicit OCR support, but it does somewhat
         | coincidentally have a slight understanding of text. This is
         | explained by training captions containing (some of) the text
         | that is in the image.
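
The CLIP-style search discussed above can be sketched roughly as follows: a text encoder and an image encoder map into the same vector space, and search is a nearest-neighbor ranking by cosine similarity. This is only a toy illustration, not the method used by immich or the blog post. The file names and the 3-dimensional vectors are hypothetical stand-ins for real model outputs, which have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(query_vec, image_vecs):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in image_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical, hand-made embeddings (assumption: a real system would
# compute these once per image with a CLIP-like image encoder).
IMAGE_VECS = {
    "baby_with_whisk.jpg": [0.9, 0.1, 0.2],
    "cat_meme.jpg": [0.1, 0.9, 0.3],
    "street_pothole.jpg": [0.2, 0.2, 0.9],
}

# Hypothetical stand-in for the text encoder's output for "whisk".
QUERY_VEC = [0.8, 0.2, 0.1]

results = rank_images(QUERY_VEC, IMAGE_VECS)
for name, score in results:
    print(f"{score:.3f}  {name}")
```

Because text that appears inside an image tends to influence its embedding, a query can match on caption-like text without any explicit OCR step, which fits the behavior described in the comment above.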
        
       | lancehasson wrote:
       | This is awesome! We made similar functionality (plus more)
       | available through an API. If anyone is interested to try it out
       | and share feedback, please message me and I'll hook you up.
        
         | harper wrote:
         | would love to check it out
        
       | yreg wrote:
       | Last year there was also a very funny meme search engine
       | project that leveraged an iPhone farm:
       | 
       | https://findthatmeme.com/blog/2023/01/08/image-stacks-and-ip...
       | 
       | https://news.ycombinator.com/item?id=34315782
        
       | rovr138 wrote:
       | You might be interested in this,
       | https://github.com/mazzzystar/Queryable, https://queryable.app/
       | 
       | I run it on my iPhone.
       | 
       | Native app. Doesn't require a network connection (great for
       | privacy).
       | 
       | > Queryable is a Core ML model that runs locally on your device.
       | Leveraging OpenAI CLIP's model encoding technology to connect
       | images and text, you can search your iPhone photo album using any
       | natural language input. Most importantly, it is completely
       | offline, so your album privacy will not be revealed to anyone.
       | And, it is open-source: GitHub
        
       | speedgoose wrote:
       | It's very cool to see how it's now possible to easily replicate
       | old Google Photos features in 10 hours using open-source tools on
       | a laptop.
        
       | diptanu wrote:
       | These hacks/side projects are amazing! I feel we will see a lot
       | of creativity as tools for building data-intensive AI
       | applications become easier to use.
       | 
       | We built and open sourced Indexify
       | https://github.com/tensorlakeai/indexify to make it easy to build
       | resilient pipelines that combine data with many different models
       | and transformations, for applications that rely on embeddings
       | or any other metadata extracted by models from videos, photos,
       | and any documents!
       | 
       | I didn't know about SigLIP, which the author mentioned on the
       | blog; I need to add this to our library :) I also found it
       | incredible that he generated the crawler with Claude! This is
       | the type of boilerplate I hope we don't have to write in the
       | future.
        
       | thesz wrote:
       | It should be named "I accidentally a meme search engine" [1].
       | 
       | [1]
       | https://www.reddit.com/r/AskReddit/comments/jooo5/reddit_ori...
        
         | harper wrote:
         | I thought of this far too late
        
       | justinator wrote:
       | Hey @harper, you ever write about your vision quests?
        
       | om8 wrote:
       | CLIP is a very interesting technology.
       | 
       | At my previous job, the ML department created an internal tool
       | where you could search through city panoramas (like Google
       | Street View) using text.
       | 
       | In a second, it could find all the potholes, overfilled
       | dumpsters, and other ugly (and beautiful) things you wanted.
        
       ___________________________________________________________________
       (page generated 2024-04-13 23:00 UTC)