[HN Gopher] Show HN: Semantic search over the National Gallery o...
___________________________________________________________________
Show HN: Semantic search over the National Gallery of Art
Author : breadislove
Score : 136 points
Date : 2025-10-10 20:33 UTC (1 day ago)
(HTM) web link (nga.demo.mixedbread.com)
(TXT) w3m dump (nga.demo.mixedbread.com)
| philipkglass wrote:
| How does this work? I thought it was probably powered by
| embeddings and maybe some more traditional search code, but I
| checked out the linked github repo and I didn't see any
| model/inference code. The public code is a wrapper that
| communicates with your commercial API?
|
| Some searches work like magic and others seem to veer off target
| a lot. For example, "sculpture" and "watercolor" worked just
| about how I'd expect. "Lamb" showed lambs and sheep. But "otter"
| showed a random selection of animals.
| breadislove wrote:
| It is powered by Mixedbread Search which is powered by our
| model Omni. Omni is multimodal (text, video, audio, images) and
| multi vector, which helps us to capture more information.
|
| The search is in beta and we are improving the model. Thank you
| for reporting the queries which are not working well.
|
| Edit: Re the otter, I just checked and did not find any otters
| in the dataset. To reduce confusion, we should not return any
| results when the model is not sure.
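
Editor's sketch: the reply above mentions multi-vector embeddings and the idea of returning nothing when the model is unsure. One common way to score multi-vector representations is late interaction (MaxSim), and a score cutoff is one way to suppress weak matches. The snippet below is a minimal illustration under those assumptions; it is not Mixedbread's implementation, and the names `maxsim`, `search`, `min_score`, and the toy `index` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def maxsim(query_vecs, doc_vecs):
    """Late-interaction score: each query vector takes its best match
    among the document vectors, and the best matches are averaged."""
    best = [max(cosine(q, d) for d in doc_vecs) for q in query_vecs]
    return sum(best) / len(best)

def search(query_vecs, index, min_score=0.8, top_k=16):
    """Rank documents by MaxSim and drop anything below min_score,
    so a query with no good match returns an empty list.
    The 0.8 cutoff is an arbitrary value chosen for this toy data."""
    scored = [(maxsim(query_vecs, vecs), doc_id) for doc_id, vecs in index.items()]
    hits = sorted((s, doc_id) for s, doc_id in scored if s >= min_score)
    hits.reverse()
    return [doc_id for _, doc_id in hits[:top_k]]

# Toy index: each artwork is represented by several vectors (hypothetical data).
index = {
    "lamb": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "vase": [[0.0, 0.0, 1.0]],
}

print(search([[1.0, 0.0, 0.0]], index))  # a query close to "lamb"
print(search([[0.0, 1.0, 0.0]], index))  # a query nothing in the index matches well
```

With a cutoff like this, a query such as "otter" against a collection with almost no otters would ideally come back empty instead of showing unrelated animals, at the cost of sometimes hiding borderline but useful results.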
| justincormack wrote:
| neither "blue pictures" nor "multiples" worked well.
| breadislove wrote:
| thank you for reporting these. we will improve on them for
| the next iteration.
| reportrappor wrote:
| I'll pile on since these are useful. Searching for
| "fingers and holes" did find me some nice hand drawings,
| but the real gold at the national gallery to me is the
| Bruce Nauman. The nga.gov search knew what I wanted.
| philipkglass wrote:
| There's at least a little bit of otter in the data. The one
| relevant result I saw was "Plate 40: Two Otters and a Beaver"
| by Joris Hoefnagel.
|
| I also expected semantic search to return similar results for
| "fireworks" and "pyrotechnics," since the latter is a less
| common synonym for the former. But I got many results for
| fireworks and just one result for pyrotechnics.
|
| This is still impressive. My impulse is to poke at it with
| harder cases to try to reason about how it could be
| implemented. Thanks for your Show HN and for replying to me!
| breadislove wrote:
| If you find more such cases please feel free to send them
| over to aamir at domain name of the Show HN. I would love
| to see those cases and see how we can improve on them.
| Thank you so much for the feedback.
| treetalker wrote:
| Yeah, "naked chicks" returns women with no clothes instead of
| baby birds.
| yawnxyz wrote:
| hey, your service is back up again!!! Mixedbread was my favorite
| tool for so long since your pivot, and I'm so glad y'all are back
| breadislove wrote:
| We have a lot more things coming up soon. It just took us some
| time building Mixedbread Search.
| nmitchko wrote:
| In case anyone wants to do this themselves, check out the
| pipeline here:
| https://github.com/isc-nmitchko/iris-document-search
|
| Colnomic and nvidia models are great for embedding images and
| MUVERA can transform those to 1D vectors.
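
Editor's sketch: the comment above says MUVERA can turn multi-vector embeddings into single vectors. The snippet below is a heavily simplified illustration of that idea (SimHash bucketing, then concatenating one block per bucket), assuming sums on the query side and averages on the document side. It omits parts of the published method such as repetitions, dimensionality reduction, and filling empty buckets, and it is not taken from the linked repository. All function names here are hypothetical.

```python
import random

def make_planes(num_planes, dim, seed=0):
    """Random hyperplanes shared by queries and documents."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]

def bucket_of(vec, planes):
    """SimHash bucket id: one bit per hyperplane, set when the vector
    lies on the positive side of that hyperplane."""
    bits = 0
    for i, plane in enumerate(planes):
        if sum(x * y for x, y in zip(vec, plane)) > 0:
            bits |= 1 << i
    return bits

def encode(vectors, planes, average):
    """Collapse a set of vectors into one flat vector of length
    2**len(planes) * dim. Each bucket holds the sum of its vectors
    (average=False, query side) or their mean (average=True, document side).
    Empty buckets stay zero in this simplified version."""
    dim = len(vectors[0])
    n_buckets = 1 << len(planes)
    sums = [[0.0] * dim for _ in range(n_buckets)]
    counts = [0] * n_buckets
    for v in vectors:
        b = bucket_of(v, planes)
        counts[b] += 1
        for j, x in enumerate(v):
            sums[b][j] += x
    flat = []
    for b in range(n_buckets):
        scale = 1.0 / counts[b] if average and counts[b] else 1.0
        flat.extend(x * scale for x in sums[b])
    return flat

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

planes = make_planes(num_planes=3, dim=2)
query_fde = encode([[1.0, 0.0]], planes, average=False)
doc_fde = encode([[1.0, 0.0], [1.0, 0.0]], planes, average=True)
print(len(query_fde), dot(query_fde, doc_fde))
```

The point of the single flat vector is that an ordinary nearest-neighbour index can then be used for a first pass, with exact multi-vector scoring reserved for reranking the candidates.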
| losteric wrote:
| > check out the pipeline here
|
| "the pipeline" - seems like this is just a personal hackathon
| project?
|
| Why these models vs other multimodals? Which "nvidia models"?
| dfc wrote:
| It would be nice if it took you to the NGA page about the item. I
| can't even copy the text easily for a quick search.
|
| "Images of german shepherds" never fails to provide some humor.
| breadislove wrote:
| Thank you for pointing this out. We will add this tomorrow
| morning.
| dfc wrote:
| The results for "Mark Rothko", "Paintings by Mark Rothko",
| "Paintings similar to mark rothko", etc. do not bring up
| anything that I was expecting. NGA has a large collection of
| Rothko paintings but none of them come up.
|
| This NGA link returns over a thousand pieces by Rothko:
| https://www.nga.gov/artists/1839-mark-rothko/artworks
| breadislove wrote:
| Right now we are not including the artist name. That will
| be added in the next iteration of the model (next week).
| Right now the search is only based on what the model can
| "see", and it seems that the model does not understand
| the art of Mark Rothko.
|
| The next version can see the image and read the metadata.
|
| A bit more context: We include everything in the latent
| space (embeddings) without trying to maintain multiple
| indexes and hack around things. There is still a huge
| mountain to climb. But this one seems really promising.
| 4ndrewl wrote:
| And this seems like a hard limitation of this approach as
| art (v craft) is concerned with interpretation and
| reception whereas this is more like unsplash-for-
| galleries in that the searches have to be very literal I
| guess? (eg search for something abstract, like 'dreams',
| something that you will find depicted in the collection,
| produces quite the mixed bag of results).
| iDon wrote:
| A search for : "character studies of old farmers" yielded
| good results. The results are drawings / engravings, which
| may reflect the balance of the collection, and perhaps this
| subject is more used in practice than in marketable oil
| paintings.
|
| Since this is a semantic search, using a vector embedding,
| it will handle meanings better than a text search, which
| would handle names better.
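
Editor's sketch: the comment above contrasts semantic search (better at meanings) with text search (better at names). One widely used way to combine the two is reciprocal rank fusion (RRF) over a lexical ranking and a vector ranking. This is an alternative to the single-latent-space approach described earlier in the thread, not what the demo does; the document ids below are made up for illustration.

```python
def rrf(rankings, k=60):
    """Reciprocal rank fusion: each ranking contributes 1 / (k + rank)
    for every document it contains; documents are returned by total score.
    Ties are broken by document id so the output is deterministic."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results for the query "Mark Rothko".
lexical = ["rothko_1", "rothko_2"]      # matched on the artist-name field
semantic = ["abstract_3", "rothko_1"]   # matched on what the image looks like

print(rrf([lexical, semantic]))
```

A document that appears in both lists rises to the top, while name-only and image-only matches are still kept, which is why fusion tends to help with artist-name queries.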
| Computer0 wrote:
| This is neat, not sure how to report queries that are working
| poorly as you have mentioned. But when I search "Waltz" I am
| presented with Kitchen Utensils and only one piece of dancing
| folks. Presumably this is due to the Artist's name being
| 'Walton'.
| breadislove wrote:
| We will add a feedback form tomorrow morning. For now please
| feel free to write to aamir at domain name of the page. thank
| you so much! this helps us a lot.
| khaki54 wrote:
| Tried "Images of german shepherds" and not one on the page of
| 16
| pogilvie wrote:
| I built a toy version of something like this a couple-ish years
| ago for a hackathon. I wrote up a blog of how I did it back then
| for anyone interested:
| https://www.patrickogilvie.com/engineering/Image_Search_Engi...
|
| Would be interesting to know how relevant that approach is now.
| ulrikhansen54 wrote:
| Congrats on the launch guys. I remember meeting y'all in SF. What
| happened to your HF model/project?
| breadislove wrote:
| there is a lot coming
| kvsrh wrote:
| Is it possible to add other data sources?
| breadislove wrote:
| Yes, which one would you be interested in?
| samdg wrote:
| I love old stereograms, and was happy to find a couple using this
| tool!
| adamontherun wrote:
| love that a search for 'chill vibes sculpture' returned a very
| chill set of results. nice step change in art search capabilities
| khaki54 wrote:
| Yale has an amazing one, worth looking at:
| https://lux.collections.yale.edu/
| ted_dunning wrote:
| Is that a multi-modal search? Or just textual matching?
|
| I couldn't find any examples that couldn't be explained by
| simple text matches.
| ted_dunning wrote:
| Works really well for some artist names (rembrandt, whistler) and
| exceedingly poorly for others (john singer sargent).
| joki77 wrote:
| When code and canvas meet, a search is not merely about words
| but about feeling. Between paintings and rows of pixels, the
| machine tries to understand the unspoken answer.
| kburman wrote:
| I recently learned that semantic search embeddings mostly
| represent topics and concepts, but they don't handle negation or
| emotion very well.
|
| For example, if you search for "paintings of winter landscapes
| but without sun and trees," you'll still get results with trees.
| That's because embeddings capture the presence of concepts like
| "tree" or "landscape," but not logical relationships like
| "without" or "not."
|
| Similarly, embeddings aren't great at capturing how something
| feels. They can tell that "sad poem" and "happy poem" are
| different mainly because of the words used, not because they
| truly understand emotional tone.
|
| This happens because most embedding models (like OpenAI's or
| sentence-transformers) are trained to group things by semantic
| similarity, not logical meaning or sentiment. Negation, polarity,
| and affect aren't explicitly represented in the vector space.
|
| Might be common knowledge to some, but it was a cool TIL moment
| for me, realizing that embeddings are great at what something is
| about, but not how it feels or what it excludes.
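
Editor's sketch: the failure mode described above is easiest to see with a deliberately crude representation. The snippet below uses plain bag-of-words count vectors, not a learned embedding model, so it only illustrates why a representation dominated by which concepts are present scores "with trees" and "without trees" as near neighbours. It says nothing about how any particular model (OpenAI's, sentence-transformers, or Mixedbread's) behaves.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector: word -> count. A stand-in for an embedding
    that only records which concepts are mentioned."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(c * c for c in a.values()))
    nb = math.sqrt(sum(c * c for c in b.values()))
    return dot / (na * nb) if na and nb else 0.0

with_trees = bow("winter landscape with sun and trees")
without_trees = bow("winter landscape without sun and trees")
unrelated = bow("portrait of a seated woman")

# The two landscape queries share five of their six words, so they score
# as very similar even though one of them excludes sun and trees.
print(round(cosine(with_trees, without_trees), 3))
print(cosine(without_trees, unrelated))
```

Whether a trained model escapes this depends on whether its training data forces "without" to change the vector in a meaningful direction.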
| breadislove wrote:
| That's actually not correct. Embeddings can handle relationships
| like "without" or "not" when trained for it. You need to scale
| up the training massively to make it generalize well. The
| current version of Mixedbread Search supports negatives like
| "tshirt without stripes". You can check it out on our launch
| video [1]. We are working on a way more generalized model,
| which should be able to capture relationships, emotions and
| much more. The current models are just limited.
|
| [1]: https://www.mixedbread.com/blog/mixedbread-search
| kburman wrote:
| I was referring specifically to popular embedding models like
| OpenAI's and sentence-transformers, which (as far as I know)
| don't reliably handle negation or emotional nuance; they
| mostly capture topical similarity.
|
| I don't know enough of the underlying math to say for sure
| whether embeddings can be trained to consistently represent
| negation, but when I tried the Mixedbread demo myself with a
| query like "winter landscapes without sun and trees", it
| still showed me paintings with both sun and trees. So at
| least in its current form, it doesn't seem to fully handle
| those semantic relationships yet.
___________________________________________________________________
(page generated 2025-10-11 23:01 UTC)