[HN Gopher] Show HN: Sisi - Semantic Image Search CLI tool, loca...
___________________________________________________________________
Show HN: Sisi - Semantic Image Search CLI tool, locally without
third party APIs
I wrote this tool to get familiar with the CLIP model. I know many
people have written similar tools with CLIP before, but I'm new to
machine learning, and writing a classic tool helps me learn. The
unusual thing about my version is that it is written in pure
Node.js, powered by node-mlx, a Node.js machine learning framework.
The repo in the link mostly implements the indexing and the CLI;
the model implementation lives in a separate Node.js module:
https://github.com/frost-beta/clip . Hope this helps other
learners!
Author : zcbenz
Score : 97 points
Date : 2024-09-16 10:59 UTC (12 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| ivanjermakov wrote:
| In Russian, "sisi" is a variation of "tits".
|
| Are there jobs/services that confirm that branding is appropriate
| across different languages? Seems like a non trivial problem to
| solve.
| phito wrote:
| It's definitely not a good name in English either
| Zambyte wrote:
| I assume you're reading it as "sissy", but I read it as
| "seesee", which is fine in English.
| Narhem wrote:
| I read it as sisi, which means "thank you" in Vietnamese.
| zcbenz wrote:
| That is sad, the name sisi comes from Empress Sisi:
| https://en.m.wikipedia.org/wiki/Empress_Elisabeth_of_Austria
| rlpb wrote:
| Sounds like something one might try to train an AI to do :)
| bjord wrote:
| it's also the name of Egypt's authoritarian leader
|
| https://en.wikipedia.org/wiki/Abdel_Fattah_el-Sisi
| jollyllama wrote:
| Yes, this is the first thing that came to mind for me. Strange
| name choice.
| fkyoureadthedoc wrote:
| Even if that was the intent, which it almost certainly
| isn't, why would it be strange enough to warrant
| discussion?
| jollyllama wrote:
| As an American who monitors world affairs, the choice of
| a quasi-authoritarian junta leader as a name would be
| quite novel.
| philsnow wrote:
| In Cantonese it's what a toddler might call poop
| Narhem wrote:
| A lot of the predominantly used programming languages and
| libraries have references to feces if you speak Farsi.
| kristopolous wrote:
| I read about a company in the 1990s that did that. They went
| one step further - picking culturally appropriate colors,
| shapes, numbers, and then permuting the brand names to
| favorable variations for a country. My (probably wrong)
| 25-year-old recollection is that when they introduced Subway in
| China they basically found a way to pronounce it that translated
| to "this place is delicious". I bet it was in Wired. If not
| that, probably New York Magazine.
| visarga wrote:
| > Seems like a non trivial problem to solve.
|
| Took me 5 minutes to land this GPT prompt.
|
| https://chatgpt.com/share/66e84c0c-a92c-800a-b452-255d6fe942...
|
| Results:
|
| - Chinese (Simplified) Si Si (si si) - sounds like
| "four-four", which can be associated with bad luck due to the
| number four in Chinese culture
|
| - Arabic "Sisi" is a common nickname, also associated with
| Egypt's President Abdel Fattah el-Sisi
|
| - Russian Sisi (sisi) - slang for breasts
|
| - Bulgarian Sisi (sisi) - slang for breasts
|
| - Serbian Sisi (sisi) - slang for breasts
|
| - Croatian Sisi - slang for breasts
|
| You should probably complement with a web search and a
| wiktionary search because they have all languages on a single
| page.
| pdimitar wrote:
| Does ChatGPT get anything right, ever?
|
| In Bulgarian the slang is Tsitsi (tsi tsi). I imagine it's
| near-identical for many other Slavic languages.
| visarga wrote:
| Yeah I noticed it was pretty shaky, change the prompt a bit
| and the result changes a lot. Not very reliable by itself
| after all, but usable in conjunction with other methods.
| fedeb95 wrote:
| that's a nice start, maybe does 99% of the job, but to be
| 100% sure, you still need additional (manual?) checks.
| ivanjermakov wrote:
| Cool LLM application! Might not be enough though.
| sureIy wrote:
| It's not that straightforward due to spelling. Does that
| catch kok? Tihts? P. Nus? For a non-English swear word, I had
| to ask 3 times, and about a specific language, to finally make
| that connection.
| kgeist wrote:
| Not as bad as the Pidora project
|
| https://en.m.wiktionary.org/wiki/%D0%BF%D0%B8%D0%B4%D0%BE%D1...
| progx wrote:
| Uses only 1 core at 100% under Linux, can this be changed?
|
| 10 images, each ~20 KB in size, took more than 10 minutes to
| index, is that normal without GPU acceleration?
| zcbenz wrote:
| No, it is not normal. I only tested on x64/arm64 Macs; I will try
| on Linux.
| a_wild_dandan wrote:
| What's normal on your Apple silicon?
| sureIy wrote:
| Wow, that's atrocious performance. So there's no chance of using
| this on real photos.
| netdur wrote:
| I have made a similar Android app for semantic image search. It
| works offline too. I'm still gathering feedback and polishing the
| UI, but it works. If you are brave enough, here it is:
| https://drive.google.com/file/d/1tE0cY6umj5h5zCY_Jvaou1M8sCf...
| KetoManx64 wrote:
| Is there a github link?
| netdur wrote:
| We have not decided what to do with it yet. It could be free,
| paid, or open source. However, the logic code for using
| semantic search with CLIP-compatible models on Android will
| be available on GitHub.
| nickphx wrote:
| Why yes, I'll download a 695MB APK file from an internet
| stranger.
| notsylver wrote:
| I was planning to do this myself lol. I was going to use SQLite
| as the index, and use `sqlite-vec` or something similar to query
| for similar files directly. I think the only other thing I was
| planning was more filters: `"positive term" -"negative term"` to
| be able to negate results, `>90"search"` to find images that
| match by >90%, and some generic filters like `--size >1mb` to
| help narrow it down when you are looking for a specific image.
| Quantizing embeddings to make them smaller/faster also seemed
| interesting but I haven't tried doing it yet.
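
The filter syntax described in the comment above could be sketched
roughly as follows. This is only an illustration, not Sisi's or
sqlite-vec's actual behaviour: the grammar, the names `parse_query`,
`search` and `embed_text`, the rule for combining positive and negative
terms, and the tiny 2-dimensional "embeddings" are all assumptions made
up for the example. A real tool would get its vectors from a CLIP model.

```python
import math
import re

# One quoted term, optionally prefixed by ">NN" (minimum score in percent)
# or "-" (negative term). This grammar is an assumption based on the comment.
TERM = re.compile(r'(?:>(\d+))?(-)?"([^"]*)"')


def parse_query(query):
    """Split a query into positive terms, negative terms, and a threshold."""
    positive, negative, threshold = [], [], 0.0
    for percent, minus, term in TERM.findall(query):
        if percent:
            threshold = int(percent) / 100
        (negative if minus else positive).append(term)
    return positive, negative, threshold


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query, images, embed_text):
    """Return image paths ranked by their best positive-term similarity.

    An image is kept when that similarity is above the threshold and above
    its best negative-term similarity (an assumed rule, chosen for brevity).
    """
    positive, negative, threshold = parse_query(query)
    if not positive:
        return []
    results = []
    for path, vec in images.items():
        pos = max(cosine(vec, embed_text(t)) for t in positive)
        neg = max((cosine(vec, embed_text(t)) for t in negative), default=-1.0)
        if pos > threshold and pos > neg:
            results.append((pos, path))
    return [path for _, path in sorted(results, reverse=True)]


# Hypothetical stand-ins for real text and image embeddings.
TOY_TEXT = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
TOY_IMAGES = {
    "cat.jpg": [0.9, 0.1],
    "dog.jpg": [0.1, 0.9],
    "both.jpg": [0.6, 0.8],
}

print(search('"cat" -"dog"', TOY_IMAGES, TOY_TEXT.__getitem__))
```

Raw CLIP text-to-image cosine similarities tend to sit well below 0.9,
so a `>90` filter would probably need some rescaling in practice; the
sketch treats the percentage as a plain cosine cutoff.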
| spullara wrote:
| Very cool! Here is a similar Python version.
|
| https://github.com/spullara/photoindex
|
| Oh and if you want to run something locally on your iPhone you
| can use my app, which I am still testing:
|
| https://x.com/getrememberwhen
| sureIy wrote:
| This is cool. Is there also a way to show the contents of the
| image as indexed? i.e. image 1 has cat and dog
|
| There are a lot of tools/apps that let you "search images" but
| not many that let you just as easily "read images"
| petesergeant wrote:
| I've been enjoying https://github.com/mazzzystar/Queryable on
| iPhone
| y04nn wrote:
| How does CLIP compare to YOLO[1]? I haven't looked into image
| classification/object recognition for a while, but I remember
| that YOLO was quite good and worked on realtime video too.
|
| [1]: https://pjreddie.com/darknet/yolo/
| Eisenstein wrote:
| CLIP and YOLO work completely differently and have different
| purposes. CLIP uses transformers and embeddings and can compare
| text with images for classification. YOLO uses a CNN and is
| trained with bounding boxes on images and is used for image
| recognition.
|
| Give an image to CLIP and you can compare the similarity
| between the image and a sentence like 'a vase with roses in
| it'. Whereas with YOLO you give it an image and get the
| coordinates of bounding boxes around a vase, and around roses.
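
To make the contrast above concrete, here is a minimal sketch of the two
kinds of output. It does not call CLIP or YOLO: the 3-dimensional
vectors, the `scale` value, the `Detection` type, and all numbers are
invented for illustration. A CLIP-style model yields one similarity per
(image, sentence) pair, while a YOLO-style detector yields a list of
labelled boxes.

```python
import math
from dataclasses import dataclass


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(values):
    """Numerically stable softmax."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def clip_style_scores(image_vec, text_vecs, scale=100.0):
    """Compare one image embedding against several sentence embeddings.

    Returns a probability per sentence. `scale` is an illustrative
    temperature, not a value read from any real checkpoint.
    """
    labels = list(text_vecs)
    sims = [cosine(image_vec, text_vecs[label]) for label in labels]
    return dict(zip(labels, softmax([scale * s for s in sims])))


@dataclass
class Detection:
    """YOLO-style result: a class label plus a bounding box in pixels."""
    label: str
    confidence: float
    box: tuple  # (x_min, y_min, x_max, y_max)


# Made-up embeddings standing in for real model outputs.
image_vec = [0.8, 0.6, 0.0]
text_vecs = {
    "a vase with roses in it": [0.7, 0.7, 0.1],
    "a dog on a beach": [0.0, 0.2, 0.9],
}
scores = clip_style_scores(image_vec, text_vecs)
best_caption = max(scores, key=scores.get)

# Made-up detections for the same hypothetical picture.
detections = [
    Detection("vase", 0.91, (120, 80, 340, 420)),
    Detection("rose", 0.84, (150, 40, 300, 200)),
]

print(best_caption)
for det in detections:
    print(det.label, det.box)
```

The same embedding comparison is what makes text-to-image search
possible: index the image vectors once, then compare each query sentence
against them.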
| kjeldsendk wrote:
| I have wanted to clean up my photo collection for ages and remove
| any NSFW pictures that might be hiding somewhere.
|
| Would this be able to do that, and how likely is it that it will
| see a PC release?
| Jack5500 wrote:
| Isn't CLIP superseded by multimodal LLMs?
___________________________________________________________________
(page generated 2024-09-16 23:00 UTC)