[HN Gopher] Content-aware image resizing in JavaScript
___________________________________________________________________
Content-aware image resizing in JavaScript
Author : mmazzarolo
Score : 436 points
Date : 2021-04-16 22:20 UTC (1 days ago)
(HTM) web link (trekhleb.dev)
(TXT) w3m dump (trekhleb.dev)
| purplecats wrote:
| Is there an opposite of this, where it will expand an image size?
| sillysaurusx wrote:
| It'd be pretty interesting to train an ML model. You could
| generate a bunch of training examples: downsize lots of images,
| then use the upsized versions as targets.
|
| It's not quite the same thing as superresolution, since it's
| seam carving.
| liuliu wrote:
| Yeah, I believe the Seam Carving paper did this. Pretty simple,
| find the low-energy path and do linear interpolation between
| the two neighboring pixels (I think it also introduced some
| tricks to find n low-energy paths at once to avoid inserting
| into the same path again and again).
|
| But as the top comment pointed out, this algorithm is easy to
| implement and interesting, but in real-world examples it is not
| better than salient object detection + cropping.
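|
| A minimal sketch of the single-seam insertion described above,
| assuming a grayscale image stored as a list of rows and a simple
| gradient energy. The helper names (energy, find_vertical_seam,
| insert_seam) are hypothetical, and the paper's trick of picking n
| seams at once is not implemented here.

```python
from typing import List

Image = List[List[float]]  # grayscale pixels, row-major (an assumption)


def energy(img: Image) -> Image:
    """Gradient-style energy: |left - right| + |up - down|, borders clamped."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
            up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
            out[y][x] = abs(left - right) + abs(up - down)
    return out


def find_vertical_seam(e: Image) -> List[int]:
    """For each row, the column of the lowest-energy top-to-bottom seam."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    # Dynamic programming: each cell adds the cheapest of its 3 upper neighbors.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest cell in the last row.
    x = min(range(w), key=lambda c: cost[h - 1][c])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda c: cost[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam


def insert_seam(img: Image, seam: List[int]) -> Image:
    """Widen by one column: after each seam pixel, insert the average
    of that pixel and its right neighbor (linear interpolation)."""
    out = []
    for row, x in zip(img, seam):
        right = row[min(x + 1, len(row) - 1)]
        new_pixel = (row[x] + right) / 2.0
        out.append(row[:x + 1] + [new_pixel] + row[x + 1:])
    return out


img = [[10, 10, 10, 200] for _ in range(3)]
widened = insert_seam(img, find_vertical_seam(energy(img)))
```

| Calling this in a loop would keep picking the same cheap seam,
| which is the repeated-insertion problem the comment mentions.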
| ska wrote:
| That's more related to a problem known as infilling. Generally
| throwing out information is a lot easier than generating it.
| You can do some statistical things by sampling other points of
| the image into newly created "gap" but it will probably look
| bad if fully automated, at least on big changes.
|
| There's an obvious version of the algorithm in that direction.
| For one line "seam", it's easy enough, you just pull data from
| either side. But when you apply it repeatedly, the more often your
| new "seams" end up next to something already estimated, the less
| real information there is - I suspect this becomes visually
| noticeable pretty fast.
| aeroheim wrote:
| Are you referring to image inpainting? I think that's what
| it's usually called (please correct me if I'm wrong though!)
|
| Although I'm not really familiar with traditional algorithms
| for inpainting, I've seen some ML research do some stuff with
| it that I found to be really impressive.
|
| One demo that really stood out to me was the following:
| https://shihmengli.github.io/3D-Photo-Inpainting/
|
| The algorithm they describe is able to inpaint pixels AND
| depth information from existing RGB-D photos, enabling images
| to be viewed in 3d space and be used with parallax effects.
| Really cool stuff!
| ska wrote:
| > Are you referring to image inpainting?
|
| Yes, too late to edit but that's the more common name.
| [deleted]
| FpUser wrote:
| I wish there were more of such interesting and educational
| articles with great examples and explanations.
| SavantIdiot wrote:
| Yaz, Right???
|
| One of the biggest complaints I have about HN is that it
| promotes really crappy "Look at me! I just learned a thing and
| wrote a 300 word blog doing a crappy job explaining it because
| I don't really get it but want to pad my CV..."
|
| This article is exceptional. Thank you OP.
| ZephyrBlu wrote:
| Unfortunately HN is not immune from lowest common denominator
| content.
| CyberRabbi wrote:
| Tragedy of the commons. Private hacker communities tend to
| produce higher quality content.
| littleninja wrote:
| > promotes really crappy "Look at me! I just learned a thing
| and wrote a 300 word blog doing a crappy job explaining it
| because I don't really get it but want to pad my CV..."
|
| This is a broad brush, are you sure the intent is always
| resume padding? Some folks (like me) write poorly but I find
| writing tests what I know (and shows me what I don't). I
| share anyway so I can be corrected and learn more, and so
| others might benefit if they have a similar problem. Your
| comment felt like shaming.
|
| > This article is exceptional. Thank you OP.
|
| 100% agree, OP's writing and content are exemplary!
| SavantIdiot wrote:
| > Some folks (like me) write poorly but I find writing
| tests what I know (and shows me what I don't)
|
| That's fine, just don't have such a big ego you need to
| share your crap with the world unless you have something
| important to say. That's why when you try to google
| something to learn, you have to wade through pages and
| pages of half-baked crap: all the good stuff has been
| drowned out.
| CyberRabbi wrote:
| I first saw this algorithm in the mid 2000s, maybe when it was
| invented. It's simple in retrospect but it's a beautiful fusion
| of physics, algorithms, and graphics. Amazing things happen at
| the cross section of different disciplines.
| tectonicfury wrote:
| I did seam carving as part of Coursera's Algorithms course by Bob
| Sedgewick. It was gratifying.
| berniemadoff69 wrote:
| daaaang that's really nice. i sort of wondered if someone had
| done something like this in javascript. is really cool to see,
| nice work.
| gitgud wrote:
| That example where you can upload your own image is amazing. It
| even shows the seams being carved in real time. Well done!
| [deleted]
| O_H_E wrote:
| Learned about this from the amazing Grant Sanderson (3b1b) in the
| Computational Thinking course using Julia.
|
| https://www.youtube.com/watch?v=rpB6zQNsbQU
| cecida wrote:
| Jesus, JavaScript.
|
| As a plumber once said to me; you can't flush an 8 inch shite
| down a 4 inch hole.
| antman wrote:
| Nice presentation. You can also add regions that are important to
| be kept so as not to distort Van Gogh's face
| lbutler wrote:
| You can also apply this to each frame in a video for a rather
| interesting effect.
|
| My brother has a YouTube channel full of content-aware scaling
| videos:
|
| https://youtu.be/a8k3b-QNbhs
| jbverschoor wrote:
| Damn that's trippy
| marzell wrote:
| Interesting. A couple thoughts.... this might be a lot smoother
| on cartoons since there are usually fewer colors and less noise
| overall.
|
| It _seems_ as though there's an additional effect (for
| extra... effect) when they scream. Not sure if that is a
| natural result of the content of the visual scene being
| processed, or if there's some sort of audio input into the
| visual processing, or if they manually/intentionally applied
| some sort of parameter change (at 0:29 and 0:38 in the video)
| that causes the video to get all chaotic.
| crazygringo wrote:
| Wow, that's incredibly interesting!
|
| My first reaction is how actors' faces look surprisingly like
| traditional caricatures that illustrators do -- e.g. shrinking
| foreheads and chins which are detail-light but keeping eyes and
| ears which are detail-heavy.
|
| But my second thought is that the extreme jumpiness in frames
| occurs because each frame is processed separately. But if you
| considered each seam not to be a "jagged line" from point A on
| one edge to point B on the opposite edge of a single frame, but
| rather a "jagged plane" cutting through a _series_ of frames --
| all frames in a _single shot_ -- you could eliminate the
| jumpiness entirely.
|
| You might need to build a bit more flexibility into it to allow
| for discontinuities generated from object movement and camera
| panning, but I wonder if anyone's tried to do something like
| that?
|
| Though I imagine it might be quite a lot of programming for a
| tool that might only ever be used as a kind of video filter for
| entertainment purposes -- I have a hard time imagining a
| cinematographer ever using it for serious purposes.
| adflux wrote:
| >You might need to build a bit more flexibility into it to
| allow for discontinuities generated from object movement and
| camera panning, but I wonder if anyone's tried to do
| something like that?
|
| An easier solution would probably be frame interpolation between
| the two separate frames.
| andersource wrote:
| That's quite an insight!
|
| Actually the authors of the seam carving paper went on to do
| just that [0]. From the abstract: "We present video
| retargeting using an improved seam carving operator. Instead
| of removing 1D seams from 2D images we remove 2D seam
| manifolds from 3D space-time volumes. To achieve this we
| replace the dynamic programming method of seam carving with
| graph cuts that are suitable for 3D volumes."
|
| [0] https://faculty.idc.ac.il/arik/SCWeb/vidret/index.html
| crazygringo wrote:
| Son of a gun, this is why I love HN. Thank you! And it
| turns out the results are _shockingly_ good, far better
| than I expected. They have demo videos at:
|
| https://faculty.idc.ac.il/arik/SCWeb/vidret/results/video_r
| e...
|
| My favorite is:
|
| Original: https://faculty.idc.ac.il/arik/SCWeb/vidret/resul
| ts/videos/w...
|
| Narrowed: https://faculty.idc.ac.il/arik/SCWeb/vidret/resul
| ts/videos/w...
|
| Widened: https://faculty.idc.ac.il/arik/SCWeb/vidret/result
| s/videos/w...
|
| Just wow.
| andersource wrote:
| Gladly! And yeah, the results really are quite good. This
| is why I like optimization problems - if you can formally
| capture what you want as an objective, and if you can
| find a way to optimize it, you can get surprisingly good
| results. Of course these are two very big IFs...
| rijoja wrote:
| Interesting.
| imhoguy wrote:
| Very interesting, but it is hard to watch for me because of
| extreme shaking. Maybe some morphing would smooth the
| transitions.
| kzrdude wrote:
| Oh that's horrifying/skin-crawling. I don't know what's wrong
| with me, but I can't stand to watch it. :)
| jonplackett wrote:
| Really cool. Do you know what he's using to do the scaling?
| lbutler wrote:
| In the video comments he includes a link to a tutorial, but
| essentially he is dumping all the frames, then running a
| script to content-aware scale the frames down 50%, and then
| merging it all back together.
| axiosgunnar wrote:
| This should be a prime example where WebAssembly could come into
| play, no?
| kevingadd wrote:
| WebGL would work really well for this assuming the constrained
| shader subset it has to work with can actually do the analysis
| and transforms
| awb wrote:
| Very cool for simple images like the demo ones provided. But
| images with detailed content don't resize well and are much worse
| than a naive resize.
|
| Try: https://unsplash.com/photos/ZtRuoAKr9vM
|
| Resize: 50% width, 70% height
|
| The basketball hoop is heavily distorted, as is the court, the
| squares on the building and the 3 point line.
| gchamonlive wrote:
| That is explained in the section with the van Gogh painting. It
| is not like they are advertising the algorithm as a jack of all
| trades.
| ACAVJW4H wrote:
| It would be awesome if sharp (https://github.com/lovell/sharp)
| would implement this algorithm
| SavantIdiot wrote:
| On a side note, thanks for posting this. I didn't know sharp existed
| and had been doing things the hard way (process calls to image
| magick).
| arjunkava wrote:
| One of the good explanations of seam carving, but there are other
| facts as well.
|
| 1. It was first developed by Shai Avidan at MERL.
|
| 2. Then introduced in paper by Vidya Setlur, Saeko Takagi, Ramesh
| Raskar, Michael Gleicher and Bruce Gooch in 2005 which won
| 10-year impact award in 2015.
|
| 3. Adobe Systems acquired a non-exclusive license to seam carving
| technology from MERL and implemented in Photoshop CS4.
| rijoja wrote:
| Do you mean that this is patented?
| arjunkava wrote:
| As I said licence is non-exclusive so anyone can use it but
| it was mentioned that adobe used Content Aware Scaling.
| colejohnson66 wrote:
| Amazing! I tried it with a schematic (practically all right
| angles), and it did an impressive job (until it ran out of room
| and decided to mess with the text). Of course, images work great
| too :)
|
| I do have one question: I see this is based on RGB, but how good
| is a "seam carving" implementation using RGB compared to one
| based on a color space more like human vision (such as CIELAB)?
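|
| One way to experiment with this question is to make the color
| difference pluggable. The sketch below assumes 8-bit sRGB input and
| the D65 white point, and compares plain Euclidean distance in RGB
| with distance in CIELAB. The names (srgb_to_lab, pixel_energy) are
| hypothetical, and it says nothing about which space gives better
| seams.

```python
from math import sqrt


def _linearize(c: float) -> float:
    """Undo the sRGB transfer curve for one 0..255 channel."""
    c /= 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t: float) -> float:
    """CIELAB companding function."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3 * delta ** 2) + 4.0 / 29.0


def srgb_to_lab(rgb):
    """Convert an (r, g, b) tuple in 0..255 to (L*, a*, b*), D65 white."""
    r, g, b = (_linearize(c) for c in rgb)
    # Linear sRGB -> XYZ, then normalize by the D65 reference white.
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.00000
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883
    fx, fy, fz = _f(x), _f(y), _f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def pixel_energy(left, right, to_space=lambda p: p):
    """Euclidean distance between two neighbors in the chosen color space."""
    a, b = to_space(left), to_space(right)
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


rgb_energy = pixel_energy((0, 0, 0), (255, 255, 255))
lab_energy = pixel_energy((0, 0, 0), (255, 255, 255), srgb_to_lab)
```

| Swapping to_space is the only change needed to compare the two on
| the same image.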
| fireattack wrote:
| Totally unrelated to this great project, but I'm curious: do
| people really use content-aware resizing feature in practice?
|
| I use Photoshop frequently, and I use content-aware removal A LOT
| (super handy). But it never occurred to me, not even once, that I
| need to use the content-aware resizing despite it being there for
| years. If I really need to change the ratio of an image/photo I
| usually just crop.
| jftuga wrote:
| Yes, I use it daily. Here is my project:
|
| https://github.com/jftuga/photo_id_resizer
|
| I am using this content aware image resizing library, which is
| used for: "Face detection to avoid face deformation."
|
| https://github.com/esimov/caire
| Bilal_io wrote:
| Now that I am aware it exists, I can use it for blog headers
| where I'd like to maintain the same ratio, this would be needed
| for images that have details around the edges I don't want to
| lose.
| dariosalvi78 wrote:
| Strange, but someone is still writing about algorithms that are
| not deep learning these days!
| etaioinshrdlu wrote:
| I think this would make a very good data augmentation for
| training deep learning models, because the resulting images are
| both unique, not just linearly transformed, and still often look
| natural.
| pyentropy wrote:
| Worked well with the "Pale Blue Dot" by Voyager regardless of
| which sub-region containing the dot I uploaded.
|
| Then, when uploading the Solar System, it managed to capture each
| planet and its label without distorting them while only removing
| the space in-between... except for Saturn's rings which became
| wobbly :)
|
| Architecture pictures tend to perform horribly because they
| contain so many straight lines and perspective cues. Faces are
| too stretched regardless of aspect ratio.
| egeozcan wrote:
| Unsurprisingly it fails when there are no low-risk paths:
| https://i.imgur.com/58d5AFM.png
|
| Brilliant implementation anyway. Having a lot of fun!
| rijoja wrote:
| Interesting. Has someone somewhere found a method to handle
| images like this?
| rijoja wrote:
| I was very impressed with this algorithm when I first found it
| and am very happy to see this implementation that seems really
| polished.
|
| What are the performance implications of this? Would it be
| possible and/or a good idea to implement this in WebASM?
| foota wrote:
| There's an improvement to seam carving using something termed
| "forward energy", see: https://avikdas.com/2019/07/29/improved-
| seam-carving-with-fo...
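|
| A rough sketch of the forward-energy idea, assuming a grayscale
| image as a list of rows: instead of scoring the pixels a seam
| removes, each step is charged for the new neighbor pairs that
| removal creates. The function name and the clamped-border handling
| are assumptions, not taken from the linked post.

```python
def forward_energy_costs(img):
    """Cumulative forward-energy table M for vertical seams.

    For pixel (y, x), removing it makes its left and right neighbors
    adjacent (cost c_up). Arriving from the upper-left or upper-right
    additionally creates a new vertical neighbor pair (c_left, c_right).
    """
    h, w = len(img), len(img[0])
    m = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left = img[y][max(x - 1, 0)]        # borders are clamped
            right = img[y][min(x + 1, w - 1)]
            c_up = abs(right - left)
            if y == 0:
                m[y][x] = c_up
                continue
            above = img[y - 1][x]
            c_left = c_up + abs(above - left)
            c_right = c_up + abs(above - right)
            best = m[y - 1][x] + c_up
            if x > 0:
                best = min(best, m[y - 1][x - 1] + c_left)
            if x < w - 1:
                best = min(best, m[y - 1][x + 1] + c_right)
            m[y][x] = best
    return m


table = forward_energy_costs([[0, 0, 9], [0, 0, 9]])
```

| The seam itself is then recovered by backtracking through the
| table from the smallest value in the last row, as in the regular
| algorithm.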
| saganus wrote:
| I wonder if further research on this has been done.
|
| For example, what if some ML tagging mechanism is used to find
| the silhouette of interesting objects in the image (people,
| animals, traffic signs, etc), and then "freezing" them to
| prevent the energy function from operating on those areas, thus
| preserving those objects intact, while resizing the rest of the
| image.
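|
| A minimal sketch of the "freezing" idea, assuming the detector's
| output is already available as a boolean mask: protected pixels get
| a large penalty added to their energy so the seam search routes
| around them. The names (protect, find_vertical_seam) and the
| penalty value are hypothetical.

```python
PENALTY = 1e6  # assumed to be far larger than any real energy value


def protect(energy, mask, penalty=PENALTY):
    """Return a copy of the energy map with masked pixels made expensive."""
    return [
        [e + penalty if keep else e for e, keep in zip(e_row, m_row)]
        for e_row, m_row in zip(energy, mask)
    ]


def find_vertical_seam(e):
    """For each row, the column of the lowest-energy top-to-bottom seam."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest cell in the last row.
    x = min(range(w), key=lambda c: cost[h - 1][c])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda c: cost[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam


energy = [[1, 0, 1], [1, 0, 1], [1, 0, 1]]
mask = [[False, True, False]] * 3  # pretend a detector flagged column 1

plain_seam = find_vertical_seam(energy)
protected_seam = find_vertical_seam(protect(energy, mask))
```

| Without the mask the seam runs straight through the cheap middle
| column; with it, the seam is pushed to an unprotected column. If
| every remaining path crosses the mask, the penalty only delays the
| distortion rather than preventing it.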
| foota wrote:
| I don't remember where I saw it linked from, but someone did
| that with face detection: https://github.com/esimov/caire
| jonplackett wrote:
| I've been playing with this for half an hour now trying different
| images. Really fun. Nice work.
| ska wrote:
| You see lots of demos of this, I think because the algorithm is
| interesting but also pretty easy to implement.
|
| As an approach it seems to do a decent job with either very small
| changes (e.g. slight change of aspect ratio) or uninteresting
| images, but aesthetically the results seem bad on most
| interesting images; I suspect because targeting "low information"
| regions of the image removes tension that is needed. Often a
| simple crop is much better, it seems.
| ZephyrBlu wrote:
| That's what I was amazed by. The algorithm is pretty ingenious.
| mrkurt wrote:
| Gosh dang this is amazing.
| m3at wrote:
| Simple solutions are great and I find seam carving elegant, but
| maybe that's an application where machine learning can shine?
|
| As globally the task is defined as displacing pixels while
| minimizing a perceptual loss, it should be reasonably easy to
| express in a differentiable way. The benefits I see are higher
| quality semantics preservation, and potentially faster inference
| (one pass only).
|
| The recent development of transformer models might provide just
| the tool to tackle variable sizes efficiently, maybe I should
| give it a go
|
| Edit: if you're interested too and want to play on it together,
| shoot me a message :)
| thadk wrote:
| Has the author put this up on `npm` yet? I don't see it and want
| to use it in platforms like ObservableHQ easily.
___________________________________________________________________
(page generated 2021-04-17 23:01 UTC)