[HN Gopher] Content-aware image resizing in JavaScript
       ___________________________________________________________________
        
       Content-aware image resizing in JavaScript
        
       Author : mmazzarolo
       Score  : 436 points
       Date   : 2021-04-16 22:20 UTC (1 days ago)
        
 (HTM) web link (trekhleb.dev)
 (TXT) w3m dump (trekhleb.dev)
        
       | purplecats wrote:
       | Is there an opposite of this, where it will expand an image size?
        
         | sillysaurusx wrote:
         | It'd be pretty interesting to train an ML model. You could
         | generate a bunch of training examples: downsize lots of images,
         | then use the original, larger versions as targets.
         | 
         | It's not quite the same thing as superresolution, since it's
         | seam carving.
        
         | liuliu wrote:
         | Yeah, I believe the seam carving paper did this. It's pretty
         | simple: find the low-energy path and do linear interpolation
         | between the two neighboring pixels (I think it also introduced
         | some tricks to find n low-energy paths at once to avoid
         | inserting into the same path again and again).
         | 
         | But as the top comment pointed out, this algorithm is easy to
         | implement and interesting, but on real-world examples it is no
         | better than salient object detection + cropping.
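
A minimal sketch of the seam-insertion idea described in the comment above, not the paper's or the article's implementation. It assumes a grayscale image stored as a list of rows, a simple gradient-based energy, and one seam per pass. The function names (`energy_map`, `find_vertical_seam`, `insert_seam`) are hypothetical, and the new pixel is averaged from the seam pixel and its right-hand neighbor as a simplification of the interpolation mentioned above.

```python
from typing import List

Gray = List[List[float]]  # grayscale image as rows of pixel intensities


def energy_map(img: Gray) -> Gray:
    """Gradient-style energy: horizontal plus vertical neighbor difference."""
    h, w = len(img), len(img[0])
    energy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp neighbor indices at the image border.
            left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
            up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
            energy[y][x] = abs(right - left) + abs(down - up)
    return energy


def find_vertical_seam(energy: Gray) -> List[int]:
    """Return one x index per row: the connected top-to-bottom path
    with the lowest total energy, found by dynamic programming."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest cell in the bottom row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def insert_seam(img: Gray, seam: List[int]) -> Gray:
    """Widen the image by one column: next to each seam pixel, insert the
    average of that pixel and its right-hand neighbor (assumed scheme)."""
    out = []
    for row, x in zip(img, seam):
        right = row[min(x + 1, len(row) - 1)]
        new_pixel = (row[x] + right) / 2
        out.append(row[:x + 1] + [new_pixel] + row[x + 1:])
    return out


# Usage: the flat left edge has the lowest energy, so the seam runs there.
img = [[10, 10, 50, 90] for _ in range(3)]
seam = find_vertical_seam(energy_map(img))
widened = insert_seam(img, seam)
```

Repeating this one seam at a time tends to pick the same low-energy path again and again, which is the problem the "n paths at once" trick mentioned above is meant to avoid; that trick is not shown here.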
        
         | ska wrote:
         | That's more related to a problem known as infilling. Generally
         | throwing out information is a lot easier than generating it.
         | You can do some statistical things by sampling other points of
         | the image into the newly created "gap", but it will probably
         | look bad if fully automated, at least on big changes.
         | 
         | There's an obvious version of the algorithm in that direction.
         | For a one-line "seam" it's easy enough: you just pull data from
         | either side. But when you apply it repeatedly, the more often
         | your new "seams" end up next to something already estimated,
         | the less real information there is. I suspect this becomes
         | visually noticeable pretty fast.
        
           | aeroheim wrote:
           | Are you referring to image inpainting? I think that's what
           | it's usually called (please correct me if I'm wrong though!)
           | 
           | Although I'm not really familiar with traditional algorithms
           | for inpainting, I've seen some ML research do some stuff with
           | it that I found to be really impressive.
           | 
           | One demo that really stood out to me was the following:
           | https://shihmengli.github.io/3D-Photo-Inpainting/
           | 
           | The algorithm they describe is able to inpaint pixels AND
           | depth information from existing RGB-D photos, enabling images
           | to be viewed in 3d space and be used with parallax effects.
           | Really cool stuff!
        
             | ska wrote:
             | > Are you referring to image inpainting?
             | 
             | Yes, too late to edit but that's the more common name.
        
       | FpUser wrote:
       | I wish there were more of such interesting and educational
       | articles with great examples and explanations.
        
         | SavantIdiot wrote:
         | Yaz, Right???
         | 
         | One of the biggest complaints I have about HN is that it
         | promotes really crappy "Look at me! I just learned a thing and
         | wrote a 300 word blog doing a crappy job explaining it because
         | I don't really get it but want to pad my CV..."
         | 
         | This article is exceptional. Thank you OP.
        
           | ZephyrBlu wrote:
           | Unfortunately HN is not immune from lowest common denominator
           | content.
        
           | CyberRabbi wrote:
           | Tragedy of the commons. Private hacker communities tend to
           | produce higher quality content.
        
           | littleninja wrote:
           | > promotes really crappy "Look at me! I just learned a thing
           | and wrote a 300 word blog doing a crappy job explaining it
           | because I don't really get it but want to pad my CV..."
           | 
           | This is a broad brush, are you sure the intent is always
           | resume padding? Some folks (like me) write poorly but I find
           | writing tests what I know (and shows me what I don't). I
           | share anyway so I can be corrected and learn more, and so
           | others might benefit if they have a similar problem. Your
           | comment felt like shaming.
           | 
           | > This article is exceptional. Thank you OP.
           | 
           | 100% agree, OP's writing and content are exemplary!
        
             | SavantIdiot wrote:
             | > Some folks (like me) write poorly but I find writing
             | tests what I know (and shows me what I don't)
             | 
             | That's fine, just don't have such a big ego you need to
             | share your crap with the world unless you have something
             | important to say. That's why when you try to google
             | something to learn, you have to wade through pages and
             | pages of half-baked crap: all the good stuff has been
             | drowned out.
        
       | CyberRabbi wrote:
       | I first saw this algorithm in the mid 2000s, maybe when it was
       | invented. It's simple in retrospect but it's a beautiful fusion
       | of physics, algorithms, and graphics. Amazing things happen at
       | the cross section of different disciplines.
        
       | tectonicfury wrote:
       | I did seam carving as part of Coursera's Algorithms course by Bob
       | Sedgewick. It was gratifying.
        
       | berniemadoff69 wrote:
       | daaaang that's really nice. i sort of wondered if someone had
       | done something like this in javascript. is really cool to see,
       | nice work.
        
       | gitgud wrote:
       | That example where you can upload your own image is amazing. It
       | even shows the seams being carved in real time. Well done!
        
       | O_H_E wrote:
       | Learned about this from the amazing Grant Sanderson (3b1b) in the
       | Computational Thinking course using Julia.
       | 
       | https://www.youtube.com/watch?v=rpB6zQNsbQU
        
       | cecida wrote:
       | Jesus, JavaScript.
       | 
       | As a plumber once said to me; you can't flush an 8 inch shite
       | down a 4 inch hole.
        
       | antman wrote:
       | Nice presentation. You can also add regions that are important
       | to keep, so as not to distort Van Gogh's face.
        
       | lbutler wrote:
       | You can also apply this to each frame in a video for a rather
       | interesting effect.
       | 
       | My brother has a YouTube channel full of content-aware scaling
       | videos:
       | 
       | https://youtu.be/a8k3b-QNbhs
        
         | jbverschoor wrote:
         | Damn that's trippy
        
         | marzell wrote:
         | Interesting. A couple of thoughts... this might be a lot
         | smoother on cartoons, since there are usually fewer colors and
         | less noise overall.
         | 
         | It _seems_ as though there's an additional effect (for
         | extra... effect) when they scream. Not sure if that is a
         | natural result of the content of the visual scene being
         | processed, or if there's some sort of audio input into the
         | visual processing, or if they manually/intentionally applied
         | some sort of parameter change (at 0:29 and 0:38 in the video)
         | that causes the video to get all chaotic.
        
         | crazygringo wrote:
         | Wow, that's incredibly interesting!
         | 
         | My first reaction is how actors' faces look surprisingly like
         | traditional caricatures that illustrators do -- e.g. shrinking
         | foreheads and chins which are detail-light but keeping eyes and
         | ears which are detail-heavy.
         | 
         | But my second thought is that the extreme jumpiness in frames
         | occurs because each frame is processed separately. But if you
         | considered each seam not to be a "jagged line" from point A on
         | one edge to point B on the opposite edge of a single frame, but
         | rather a "jagged plane" cutting through a _series_ of frames --
         | all frames in a _single shot_ -- you could eliminate the
         | jumpiness entirely.
         | 
         | You might need to build a bit more flexibility into it to allow
         | for discontinuities generated from object movement and camera
         | panning, but I wonder if anyone's tried to do something like
         | that?
         | 
         | Though I imagine it might be quite a lot of programming for a
         | tool that might only ever be used as a kind of video filter for
         | entertainment purposes -- I have a hard time imagining a
         | cinematographer ever using it for serious purposes.
        
           | adflux wrote:
           | >You might need to build a bit more flexibility into it to
           | allow for discontinuities generated from object movement and
           | camera panning, but I wonder if anyone's tried to do
           | something like that?
           | 
           | An easier solution would probably be frame interpolation
           | between the two separate frames.
        
           | andersource wrote:
           | That's quite an insight!
           | 
           | Actually the authors of the seam carving paper went on to do
           | just that [0]. From the abstract: "We present video
           | retargeting using an improved seam carving operator. Instead
           | of removing 1D seams from 2D images we remove 2D seam
           | manifolds from 3D space-time volumes. To achieve this we
           | replace the dynamic programming method of seam carving with
           | graph cuts that are suitable for 3D volumes."
           | 
           | [0] https://faculty.idc.ac.il/arik/SCWeb/vidret/index.html
        
             | crazygringo wrote:
             | Son of a gun, this is why I love HN. Thank you! And it
             | turns out the results are _shockingly_ good, far better
             | than I expected. They have demo videos at:
             | 
             | https://faculty.idc.ac.il/arik/SCWeb/vidret/results/video_re...
             | 
             | My favorite is:
             | 
             | Original: https://faculty.idc.ac.il/arik/SCWeb/vidret/results/videos/w...
             | 
             | Narrowed: https://faculty.idc.ac.il/arik/SCWeb/vidret/results/videos/w...
             | 
             | Widened: https://faculty.idc.ac.il/arik/SCWeb/vidret/results/videos/w...
             | 
             | Just wow.
        
               | andersource wrote:
               | Gladly! And yeah, the results really are quite good. This
               | is why I like optimization problems - if you can formally
               | capture what you want as an objective, and if you can
               | find a way to optimize it, you can get surprisingly good
               | results. Of course these are two very big IFs...
        
             | rijoja wrote:
             | Interesting.
        
         | imhoguy wrote:
         | Very interesting, but it is hard to watch for me because of
         | extreme shaking. Maybe some morphing would smooth the
         | transitions.
        
         | kzrdude wrote:
         | Oh that's horrifying/skin-crawling. I don't know what's wrong
         | with me, but I can't stand to watch it. :)
        
         | jonplackett wrote:
         | Really cool. Do you know what he's using to do the scaling?
        
           | lbutler wrote:
           | In the video comments he includes a link to a tutorial, but
           | essentially he dumps all the frames, runs a script that
           | content-aware scales each frame down by 50%, and then merges
           | it all back together.
        
       | axiosgunnar wrote:
       | This should be a prime example where WebAssembly could come into
       | play, no?
        
         | kevingadd wrote:
         | WebGL would work really well for this assuming the constrained
         | shader subset it has to work with can actually do the analysis
         | and transforms
        
       | awb wrote:
       | Very cool for simple images like the demo ones provided. But
       | images with detailed content don't resize well and are much worse
       | than a naive resize.
       | 
       | Try: https://unsplash.com/photos/ZtRuoAKr9vM
       | 
       | Resize: 50% width, 70% height
       | 
       | The basketball hoop is heavily distorted, as is the court, the
       | squares on the building and the 3 point line.
        
         | gchamonlive wrote:
         | That is explained in the section with the van Gogh painting. It
         | is not like they are advertising the algorithm as a jack of all
         | trades.
        
       | ACAVJW4H wrote:
       | It would be awesome if sharp (https://github.com/lovell/sharp)
       | would implement this algorithm
        
         | SavantIdiot wrote:
         | On a side note, thanks for posting this. I didn't know sharp
         | existed and had been doing things the hard way (process calls
         | to ImageMagick).
        
       | arjunkava wrote:
       | One of the better explanations of seam carving, but there are
       | other facts worth knowing as well.
       | 
       | 1. It was first developed by Shai Avidan at MERL.
       | 
       | 2. Image retargeting more broadly was introduced in a 2005 paper
       | by Vidya Setlur, Saeko Takage, Ramesh Raskar, Michael Gleicher
       | and Bruce Gooch, which won a 10-year impact award in 2015.
       | 
       | 3. Adobe Systems acquired a non-exclusive license to seam carving
       | technology from MERL and implemented it in Photoshop CS4.
        
         | rijoja wrote:
         | Do you mean that this is patented?
        
           | arjunkava wrote:
           | As I said, the licence is non-exclusive, so anyone can use it,
           | but it was mentioned that Adobe shipped it as Content Aware
           | Scaling.
        
       | colejohnson66 wrote:
       | Amazing! I tried it with a schematic (practically all right
       | angles), and it did an impressive job (until it ran out of room
       | and decided to mess with the text). Of course, images work great
       | too :)
       | 
       | I do have one question: I see this is based on RGB, but how good
       | is a "seam carving" implementation using RGB compared to one
       | based on a color space more like human vision (such as CIELAB)?
        
       | fireattack wrote:
       | Totally unrelated to this great project, but I'm curious: do
       | people really use the content-aware resizing feature in practice?
       | 
       | I use Photoshop frequently, and I use content-aware removal A LOT
       | (super handy). But it never occurred to me, not even once, that I
       | needed content-aware resizing, despite it being there for
       | years. If I really need to change the ratio of an image/photo I
       | usually just crop.
        
         | jftuga wrote:
         | Yes, I use it daily. Here is my project:
         | 
         | https://github.com/jftuga/photo_id_resizer
         | 
         | I am using this content aware image resizing library, which is
         | used for: "Face detection to avoid face deformation."
         | 
         | https://github.com/esimov/caire
        
         | Bilal_io wrote:
         | Now that I am aware it exists, I can use it for blog headers
         | where I'd like to maintain the same ratio, this would be needed
         | for images that have details around the edges I don't want to
         | lose.
        
       | dariosalvi78 wrote:
       | Strange, but someone is still writing about algorithms that are
       | not deep learning these days!
        
       | etaioinshrdlu wrote:
       | I think this would make a very good data augmentation for
       | training deep learning models, because the resulting images are
       | both unique, not just linearly transformed, and still often look
       | natural.
        
       | pyentropy wrote:
       | Worked well with the "Pale Blue Dot" by Voyager regardless of
       | which sub-region containing the dot I uploaded.
       | 
       | Then, when uploading the Solar System, it managed to capture each
       | planet and its label without distorting them while only removing
       | the space in-between... except for Saturn's rings which became
       | wobbly :)
       | 
       | Architecture pictures tend to perform horribly because they
       | contain so many straight lines and perspective cues. Faces are
       | too stretched regardless of aspect ratio.
        
       | egeozcan wrote:
       | Unsurprisingly it fails when there are no low-risk paths:
       | https://i.imgur.com/58d5AFM.png
       | 
       | Brilliant implementation anyway. Having a lot of fun!
        
         | rijoja wrote:
         | Interesting. Has anyone found a method to handle images like
         | this?
        
       | rijoja wrote:
       | I was very impressed with this algorithm when I first found it
       | and am very happy to see this implementation that seems really
       | polished.
       | 
       | What are the performance implications of this? Would it be
       | possible and/or a good idea to implement this in WebAssembly?
        
       | foota wrote:
       | There's an improvement to seam carving using something termed
       | "forward energy", see:
       | https://avikdas.com/2019/07/29/improved-seam-carving-with-fo...
        
         | saganus wrote:
         | I wonder if further research on this has been done.
         | 
         | For example, what if some ML tagging mechanism were used to find
         | the silhouettes of interesting objects in the image (people,
         | animals, traffic signs, etc.), and those regions were then
         | "frozen" to prevent the energy function from operating on them,
         | thus preserving those objects intact while resizing the rest of
         | the image?
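
A minimal sketch of the "freezing" idea in the comment above, under stated assumptions: the object detector is out of scope, so its output is taken to be a boolean mask the same size as the energy map, and protection is done by adding a large penalty to masked pixels so the lowest-energy seam routes around them. `PROTECT_PENALTY`, `protect_regions`, `find_vertical_seam` and `remove_seam` are hypothetical names, not taken from the article or from caire.

```python
from typing import List

Grid = List[List[float]]
Mask = List[List[bool]]

# Hypothetical value; it only needs to dwarf any real seam energy.
PROTECT_PENALTY = 1e6


def protect_regions(energy: Grid, mask: Mask) -> Grid:
    """Return a copy of the energy map with a large penalty where mask is True."""
    return [
        [e + PROTECT_PENALTY if keep else e for e, keep in zip(e_row, m_row)]
        for e_row, m_row in zip(energy, mask)
    ]


def find_vertical_seam(energy: Grid) -> List[int]:
    """Lowest-total-energy connected top-to-bottom path, one x index per row."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest cell in the bottom row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_seam(img: Grid, seam: List[int]) -> Grid:
    """Narrow the image by one column by dropping the seam pixel in each row."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


# Usage: the middle column is the cheapest path, but it is marked as protected.
energy = [[5.0, 1.0, 5.0] for _ in range(3)]
mask = [[False, True, False] for _ in range(3)]
seam = find_vertical_seam(protect_regions(energy, mask))
narrowed = remove_seam([[1.0, 2.0, 3.0] for _ in range(3)], seam)
```

A penalty only discourages seams; if a protected region spans the full width of the image, every seam still has to cross it.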
        
           | foota wrote:
           | I don't remember where I saw it linked from, but someone did
           | that with face detection: https://github.com/esimov/caire
        
       | jonplackett wrote:
       | I've been playing with this for half an hour now trying different
       | images. Really fun. Nice work.
        
       | ska wrote:
       | You see lots of demos of this, I think because the algorithm is
       | interesting but also pretty easy to implement.
       | 
       | As an approach it seems to do a decent job with either very small
       | changes (e.g. slight change of aspect ratio) or uninteresting
       | images, but aesthetically the results seem bad on most
       | interesting images; I suspect because targeting "low information"
       | regions of the image removes tension that is needed. Often a
       | simple crop is much better, it seems.
        
         | ZephyrBlu wrote:
         | That's what I was amazed by. The algorithm is pretty ingenious.
        
       | mrkurt wrote:
       | Gosh dang this is amazing.
        
       | m3at wrote:
       | Simple solutions are great and I find seam carving elegant, but
       | maybe that's an application where machine learning can shine?
       | 
       | Since the task is globally defined as displacing pixels while
       | minimizing a perceptual loss, it should be reasonably easy to
       | express in a differentiable way. The benefits I see are higher
       | quality semantics preservation, and potentially faster inference
       | (one pass only).
       | 
       | The recent development of transformer models might provide just
       | the tool to tackle variable sizes efficiently, maybe I should
       | give it a go
       | 
       | Edit: if you're interested too and want to play on it together,
       | shoot me a message :)
        
       | thadk wrote:
       | Has the author put this up on `npm` yet? I don't see it and want
       | to use it in platforms like ObservableHQ easily.
        
       ___________________________________________________________________
       (page generated 2021-04-17 23:01 UTC)