[HN Gopher] I tested services to extract colors from image
___________________________________________________________________
I tested services to extract colors from image
Author : kiru_io
Score : 14 points
Date : 2023-05-17 11:23 UTC (3 days ago)
(HTM) web link (kiru.io)
(TXT) w3m dump (kiru.io)
| csense wrote:
| This is not too surprising.
|
| The most obvious way to implement something like this is to just
| histogram the pixels and sort by most popular, it's like
| literally 4 lines of Python:
|
|     # Find the n most popular colors in the array of pixels
|     def most_popular_colors(pixels, n):
|         count = {}
|         for i in range(len(pixels)):
|             count[pixels[i]] = count.get(pixels[i], 0)+1
|         return sorted(count.items(), key=lambda kv : kv[1], reverse=True)[:n]
|
| This works (for some definition of "works"), but it doesn't do
| what users expect. They will probably complain a lot about the
| results.
|
| Any image that's a photo, or uses a gradient, or has ever been
| JPEG or MPEG compressed, is going to have a ton of fine
| variations in its colors. All the pixels of "that shade of green"
| aren't going to be exactly #123499, some of them are going to be
| #133499 or #123498 or something. A simple histogram like the
| above code will tally all those distinct colors separately.
|
| So you can get a most-popular list containing multiple very, very
| similar shades that are technically distinct integers. And you
| can have a color that "should" be on the most-popular list but
| isn't, because it's a bunch of very subtly different shades, none
| of which has a high enough count to crack the list.
|
| Which means to get the behavior the users expect, a simple
| histogram simply won't do. You need additional logic to "bucket"
| similar colors together for the counting part, and then show a
| "representative" of the most popular buckets for the final
| output.
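|
| One possible way to do the bucketing, as a minimal sketch: drop
| the low bits of each channel so near-identical shades share a
| bucket, and use the bucket's average color as its representative.
| The name dominant_colors, the bits parameter and the averaging
| rule are assumptions made for illustration, and they are only one
| of the many reasonable choices.

```python
from collections import defaultdict

def dominant_colors(pixels, n, bits=4):
    """Return up to n (representative_rgb, count) pairs, most common first.

    pixels: iterable of (r, g, b) tuples with 8-bit channels.
    bits:   number of high bits per channel that define a bucket
            (hypothetical knob; fewer bits means coarser buckets).
    """
    shift = 8 - bits
    # Each bucket accumulates [sum_r, sum_g, sum_b, count].
    buckets = defaultdict(lambda: [0, 0, 0, 0])
    for r, g, b in pixels:
        acc = buckets[(r >> shift, g >> shift, b >> shift)]
        acc[0] += r
        acc[1] += g
        acc[2] += b
        acc[3] += 1
    # Rank buckets by pixel count and keep the n largest.
    ranked = sorted(buckets.values(), key=lambda acc: acc[3], reverse=True)[:n]
    # Representative = integer mean of the pixels that fell in the bucket.
    return [((sr // c, sg // c, sb // c), c) for sr, sg, sb, c in ranked]
```

| With this, #123499, #133499 and #123498 land in the same bucket
| and are counted together. Fixed-grid buckets still split shades
| that straddle a bucket boundary, which is one reason different
| implementations disagree.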
|
| How are the buckets defined? How are the representatives
| selected? There are probably a dozen different reasonable
| implementation choices you could make. Different choices will all
| give slightly different (or even very different) outputs.
|
| I would actually be very surprised to find two independently
| written programs would give the same output for this problem, and
| I would suspect they're not actually independent (i.e. they both
| used the same software or both found the same textbook or RFC or
| whatever that suggests very specific implementation choices.)
| psadri wrote:
| We had to do this at Polyvore (a fashion e-commerce site) to
| extract dominant colors from products.
|
| To create the palette, we took 10k random product images from
| our catalog, cropped them to the center 1/3, and stitched them
| into a single image collage. Then we used imagemagick to
| quantize it to 64 colors to produce our palette.
|
| To extract colors for a given image, we'd crop to center,
| quantize using the above palette and then run the histogram.
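|
| The per-image step can be sketched in plain Python as below. This
| is a stand-in for illustration, not the ImageMagick pipeline
| described above: the function names are hypothetical, "center
| 1/3" is assumed to mean the middle third in each dimension, and
| nearest-palette matching by squared RGB distance is an assumption.

```python
from collections import Counter

def center_crop(image, fraction=1/3):
    """Keep the central `fraction` of rows and columns.

    image: non-empty list of rows, each row a list of (r, g, b) tuples.
    """
    h, w = len(image), len(image[0])
    ch, cw = max(1, round(h * fraction)), max(1, round(w * fraction))
    top, left = (h - ch) // 2, (w - cw) // 2
    return [row[left:left + cw] for row in image[top:top + ch]]

def nearest(palette, pixel):
    """Palette entry closest to pixel by squared Euclidean RGB distance."""
    return min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(c, pixel)))

def palette_histogram(image, palette, n):
    """Crop to center, snap each pixel to the palette, count, return top n."""
    cropped = center_crop(image)
    counts = Counter(nearest(palette, p) for row in cropped for p in row)
    return counts.most_common(n)
```

| Because every image is snapped to the same fixed palette, counts
| are comparable across products.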
| kiru_io wrote:
| That's interesting. What was the use case? (Why extract
| colors?)
| psadri wrote:
| The extracted colors were used in a few different ways:
|
| 1. product search - show me "red" shirts.
|
| We also mapped certain hues to "saturated", "muted",
| "pastel" etc... meta colors based on HSV values.
| Unfortunately, we never got to surfacing these meta colors
| as search facets.
|
| 2. similar product recommendations - show me shirts similar
| to this one. Colors (and meta colors like "muted") were one
| of the features used in similar product recs.
|
| 3. analytics - turquoise is "trending"
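|
| The meta-color mapping mentioned under item 1 could look roughly
| like this, using the standard library's colorsys module. The
| function name and every threshold here are made up for
| illustration; the thread does not say which cutoffs were used.

```python
import colorsys

def meta_color(rgb):
    """Classify an 8-bit (r, g, b) color as saturated, pastel or muted.

    Thresholds are hypothetical placeholders, not values from the thread.
    """
    r, g, b = (c / 255 for c in rgb)
    _hue, s, v = colorsys.rgb_to_hsv(r, g, b)  # all components in [0, 1]
    if s >= 0.7 and v >= 0.5:
        return "saturated"   # strong color, reasonably bright
    if s <= 0.4 and v >= 0.8:
        return "pastel"      # light and only lightly tinted
    return "muted"           # everything else: dull or dark tones
```

| The label can then sit next to the extracted color as another
| feature for search or recommendations.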
___________________________________________________________________
(page generated 2023-05-20 23:00 UTC)