[HN Gopher] Google's New AI Photo Upscaling Tech Is Jaw-Dropping
___________________________________________________________________
Google's New AI Photo Upscaling Tech Is Jaw-Dropping
Author : thunderbong
Score : 138 points
Date : 2021-09-01 02:47 UTC (7 hours ago)
(HTM) web link (petapixel.com)
(TXT) w3m dump (petapixel.com)
| urbandw311er wrote:
| I'm not used to seeing clickbaity words like "jaw dropping" on
| HN: is this within guidelines?
| aembleton wrote:
| Debatable. It is used in the article title, but then terms like
| jaw dropping are trying to make it stand out.
|
| Probably better to have linked to Google's original blog post
| and used their title "High Fidelity Image Generation Using
| Diffusion Models":
| https://ai.googleblog.com/2021/07/high-fidelity-image-genera...
| yumraj wrote:
| I'm assuming that the low-res images used were created from the
| high-res images, which might imply that the process could be
| reversible if the AI/ML or some algo could learn how to reverse
| it.
|
| I wonder, if my assumption above is correct, how this would
| behave if the image was low-res to begin with due to whatever
| reason. Would it perform at the same level?
| okasaki wrote:
| In my experience these algorithms only work when you have a
| high quality, high resolution image that has been scaled down.
|
| If you give it a low quality image, it doesn't do anything.
|
| So it's quite limited.
| hoseja wrote:
| I notice it adds blemishes and freckles where there were none
| before. This seems weird. Sure, it looks more "realistic", but
| that is not actual reality.
| vanous wrote:
| This creates 'nice looking believable' images, but why do they
| also not show a comparison of the AI generated result with the
| hi-res original?
| nulbyte wrote:
| I wondered the same. To me, there seems something slightly off
| about the supposedly upscaled images. I really want to see a
| legitimate comparison.
| eurasiantiger wrote:
| It changes eye shape, brow shape, nose shape...
| AndrewKemendo wrote:
| They do in the paper. I'm not sure why the article doesn't.
|
| https://arxiv.org/pdf/2104.07636.pdf
| stedaniels wrote:
| Thank you for this link. The differences between the original
| and the upscaled SR3 model are highlighted mostly for me on
| the picture of the leopard. The facial markings are clearly
| different.
| hn_throwaway_99 wrote:
| Thank you very much for posting the original. I agree with
| some of the other comments, while the generated faces look
| highly photo-realistic, in some cases they also look quite
| different from the actual person.
|
| Not that different mind you, but humans are obviously super
| sensitive to tiny changes on faces - it doesn't take that
| much to make it look like a different person altogether. For
| the non-face images it was much harder for me to really
| detect many differences, and they certainly didn't bother me.
| eximius wrote:
| Enhance.
| iamnotwhoiam wrote:
| It seems to me that it's basically like me (a non artist) drawing
| something I saw and handing it to a good artist and asking them
| to draw that but better. They aren't drawing what I saw, but they
| are drawing a better representation, so it can satisfy my need to
| see the thing in physical form, but it can never be a real
| replacement.
|
| If you ever have something that you would be happy to substitute
| a very good painting for a blurry image then this is good. If you
| need to know what something actually looked like in high def
| (license plate numbers, micro tumors) this is useless, or worse
| than useless if it ever gets admitted in court.
| adius wrote:
| Not entirely true. The model can extract image information from
| the pixels a human might not be able to see. Like how you can
| enhance the colors in a video of a face in a way that it pulses
| red with your heartbeat. The information about your heartbeat
| was there all along, our eyes were just not able to extract /
| recognize it.
| mmmpetrichor wrote:
| I don't understand what the "confusion rate" metric was in the
| article. Also don't see any comparison with original high-res
| image so that we can see how true to life the generated images
| look?
| taeric wrote:
| So many of these upscaling technologies feel like the sketch of
| crime scenes. Artistic, to be sure, but probably not as good
| for actually filling in details as we'd like. And downright
| dangerous if they set unreasonably high expectations of
| fidelity.
| mithr wrote:
| > Also don't see any comparison with original high-res image so
| that we can see how true to life the generated images look?
|
| Was wondering about that too. It certainly produces realistic-
| looking high-res images, but especially when the article talks
| about potential uses ranging "from restoring old family photos
| to improving medical imaging", it seems like _accuracy_ may be
| more valued than "looking realistic".
| jeffparsons wrote:
| > improving medical imaging
|
| Spot on. This is not "zoom, enhance" -- it is "fabricate
| plausible detail based on a training set". Using it for
| anything other than making pictures look nice would be
| disastrous.
|
| Other people in these comments are talking about using it for
| law enforcement. Train it on a bunch of pictures of black
| people holding guns and now suddenly it will "reveal" guns in
| the hands of all black people in blurry CCTV footage. (This
| specific example is likely a little simplistic to be a
| problem in reality, but it demonstrates the problem of
| thinking it actually reveals some hidden detail.)
| catsncomputers wrote:
| There are a _couple_ of originals here to compare:
| https://iterative-refinement.github.io/ I would like to see
| more though!
| ALittleLight wrote:
| My understanding is that they present people with two images,
| the real original high-res image, and the upscaled image, and
| ask "Which of these is the real image". The extent to which
| people are confused demonstrates how good the algorithm is. If
| it was perfect, and there was no difference between the
| original and upscaled then you'd expect people to pick each
| image about 50% of the time randomly.
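|
| A minimal sketch of that reading of the metric, assuming each
| trial simply records whether the rater picked the upscaled image
| as the real one. The function name and the trial data are
| hypothetical, not taken from the paper:

```python
def confusion_rate(picked_generated):
    """Fraction of trials in which the rater chose the generated image as 'real'."""
    if not picked_generated:
        raise ValueError("need at least one trial")
    return sum(picked_generated) / len(picked_generated)


# Hypothetical outcomes: True means the rater mistook the upscaled
# image for the original high-res photo.
trials = [True, False, True, False, False, True, False, True]
rate = confusion_rate(trials)
print(f"confusion rate: {rate:.0%}")
```

| Under this definition a rate near 50% means raters are guessing,
| and a rate near 0% means the upscaled image is easy to spot.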
| nathanvanfleet wrote:
| It's frustrating that it's not showing the source image at a
| high DPI, the reduced image, then the result image. Some of the
| faces seem a bit off, but I imagine they are all pretty far off,
| but I don't know these people or the images. Still impressive of
| course.
| temp8964 wrote:
| This can be really useful in criminal investigation.
| stephen_g wrote:
| Not much. What it's doing was quite well summed up by
| _jeffparsons_ in a different comment thread - it's
| "fabricating detail based on a training set".
|
| This system makes an image that looks _better_ than a low res
| image, but it doesn't necessarily make it look more like the
| original image.
| Baeocystin wrote:
| This is generating a fantasy face. To use this kind of AI-
| generated content as if it was factual would be terrifying.
| visarga wrote:
| Same with human memory recall, and that is used in trials as
| evidence
| taeric wrote:
| How? We don't trust machine learning to match faces, why would
| we trust it to recreate faces? Genuine question.
| mark-r wrote:
| Only a fool would believe these reconstructions were real.
| Unfortunately fools are more common than we would hope.
| Baeocystin wrote:
| It is genuinely alarming to me how many people in this
| thread are saying that this will be a boon for crime
| investigation, medical imaging or the like.
|
| To spell it out: This is not re-creating what was actually
| present in the original image. That is, and will always be,
| impossible due to fundamental limits of information in the
| source seed. What it is doing is using AI Hallucinations to
| create a believable-looking fake.
| modmans2nd wrote:
| It's probably a tool that could be used for generating a
| likely appearance for helping to find a person to
| interrogate, similar to eye witness sketches. I would hope
| it's never allowed at trial though.
| evan_ wrote:
| Prosecutors would need to be able to defend every element
| they use to gather evidence in court. Nothing downstream
| from inadmissible evidence can be admissible in court.
| "Fruit of the poisonous tree."
| lanternfish wrote:
| This tech is probably less error prone than the mentioned
| eye witness sketches
| taeric wrote:
| It is almost certainly trained with just as much bias. And unless
| it is used iteratively with the witness, it is probably less
| likely to converge.
| renewiltord wrote:
| No. Actually, that's not true. If you upscaled a picture
| of a drug deal and then used that to narrow your search
| set and then posted a watch on that set to catch them
| dealing drugs, you've got a useful case. Totally
| admissible.
| modmans2nd wrote:
| For identifying persons of interest from bad footage perhaps.
| Not so much for positive identification.
| aetherspawn wrote:
| Need to be careful what data the AI model is trained on. i.e.
| it could bias the features of a particular race, causing
| those people to be held as suspects more often.
| shannifin wrote:
| I could not help but be reminded of the "enhance it!" trope
| common in crime shows:
| https://www.youtube.com/watch?v=Vxq9yj2pVWk
| abacadaba wrote:
| :waits patiently for all the people that said this was
| impossible to recant:
| mark-r wrote:
| Oh but it IS impossible, that hasn't changed. We're just
| getting better at generating believable fakes.
| Baeocystin wrote:
| This is less upscaling and more using a seed to generate a
| believable high-res image. Which is interesting in and of itself,
| but I find myself mostly wondering how much variation you can get
| from the same starting seed.
| hencoappel wrote:
| Isn't what you describe what all upscaling does? You're trying
| to add information that isn't there.
| npteljes wrote:
| Hard to point out the difference, but I feel it too.
|
| It's adding information either way, I agree. The difference
| is that the old algos used information from the image itself,
| and this one uses information from a lot of other images.
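|
| A rough sketch of what "information from the image itself" means
| for a classic upscaler, assuming plain linear interpolation on a
| single row of pixel values (the function name is hypothetical).
| Every new value is derived only from its two neighbours:

```python
def upscale_row_linear(row, factor):
    """Upscale a 1-D row of pixel values by linear interpolation between neighbours."""
    if factor < 1 or not row:
        raise ValueError("need a non-empty row and a factor >= 1")
    out = []
    for i in range(len(row) - 1):
        a, b = row[i], row[i + 1]
        for k in range(factor):
            t = k / factor
            out.append(a + (b - a) * t)  # a blend of two existing pixels
    out.append(float(row[-1]))
    return out


upscaled = upscale_row_linear([0, 100, 50], 2)
print(upscaled)  # [0.0, 50.0, 100.0, 75.0, 50.0]
```

| No output value falls outside the range of the input pixels, which
| is why this kind of upscaling looks soft rather than detailed.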
| dwd wrote:
| So, we all end up with a blend of celebrity features if that is
| a large proportion of their training data.
|
| I would be interested to see what it does with Doom guy, as
| mentioned in the OP comments.
| dotancohen wrote:
| The fine article shows the low-res input and high-res output
| photos, but conspicuously does not show a high-res original from
| whence the low-res input was derived.
|
| Without comparing a high-res original photograph to the high-res
| output photograph, we do not know whether this fine technique is
| merely capable of producing nice-looking high-res imagery, or
| whether it can reproduce how an image of the subject would have
| looked had it been taken in higher resolution.
|
| In other words, does the output of the technique match the actual
| object in the photograph?
| Ballas wrote:
| That is indeed a shortcoming of this article in my opinion as
| well. If you want a comparison to the original high res photos,
| there are some examples in the original paper for SR3:
| https://arxiv.org/pdf/2104.07636.pdf Have not had a look at the
| CDM paper.
| tcpekin wrote:
| Fig. 9 of this paper is really interesting if you zoom in. It
| looks like if the model was not trained on the appropriate
| class label, it just goes completely off the rails. As
| previous commenters have noted, I would be very hesitant to
| use this for anything analytical, or where you are looking
| for something unexpected. For faces though, this is amazing.
| zepn wrote:
| > where you are looking for something unexpected
|
| I'd be extremely sceptical of its use on medical images for
| this reason.
| sandos wrote:
| What strikes me is that the jaguar (?) got "kind" eyes with
| SR3, and "mean" eyes with the other algorithms, including the
| original!
| choeger wrote:
| Of course it does not. The model generates "believable" images
| not exact ones.
| savolai wrote:
| The point is, it's helpful if the reader can evaluate just
| how faithful the transformation is vis-a-vis the original.
| amitport wrote:
| Helpful, interesting, yes. BUT not a real critique on this
| paper which specifically does not claim anything regarding
| similarity with the original.
|
| This happens a lot with junior peer reviewers, they focus
| more on what the paper isn't or could be instead of what's
| actually there.
| thrdbndndn wrote:
| There is an app called "face app" or whatever that already
| did a pretty good job upscaling people's faces using state-of-
| the-art AI upscaling.
|
| The result is impressive, but the moment you start to use
| it on someone you're actually familiar with, it becomes weird
| very quickly for the obvious reasons. The teeth, for example,
| are never right.
|
| This kind of "believable but not truthful" result is
| rampant in all these machine-learning based tools. It's not
| very harmful in case of upscaling a few photos I guess, but
| I've been bitten by it in an acclaimed translation service
| called DeepL. I use it to translate Japanese to English
| frequently, and have found that it often (nontrivially) made
| up sentences that don't exist in the original paragraphs,
| sometimes have the opposite meanings, or totally ignore part
| of the text to make the result "more fluent". And unlike
| traditional translation tools, they are very hard to notice
| if you know nothing about the original language. I have to
| from time to time use some more "primitive" translation
| tools, and compare the results side by side, to avoid such
| issues. It's frustrating.
| dorkwood wrote:
| This isn't obvious to regular people, though.
|
| I can recall seeing a conspiracy get traction on Twitter a
| few months back, where it was claimed that a photograph of a
| famous person was actually a body double. Someone used an ML
| upscaler to "enhance" the image, and their followers began
| scrutinizing the result: "The teeth are different!", "The
| nose shape is wrong!", "It's not the same person!"
| actually_a_dog wrote:
| How do you expect an algorithm to create information out of
| nowhere to fill in these details exactly as they were in the
| source photo?
| nayaketo wrote:
| If this AI can't produce upscaled photos that are close to
| original photos, the application of this tech is severely
| limited. There's not going to be a CSI-like "enhance" moment in
| real life like the article claims.
| actually_a_dog wrote:
| You really don't see any use for this if it can't create
| information? CSI "enhance" is, and always has been,
| impossible.
| helsinkiandrew wrote:
| > In other words, does the output of the technique match the
| actual object in the photograph?
|
| Probably a lot in some cases and a little bit in most others. I
| wonder how long before this gets used in court by an
| incompetent prosecutor.
| machinelearning wrote:
| 1. Maybe I'm guilty of moving goalposts but super-resolution of
| faces isn't that 'Jaw-Dropping' after the recent GAN work that
| showed that you can create hyper-realistic synthetic faces from 0
| input to guide it.
|
| 2. There are certain portions of the image that clearly do not
| contain enough resolution to be reconstructed satisfactorily.
| E.g. teeth, skin imperfections. I wonder how well a person would
| react if their teeth were either messed up or "fixed" by "the
| AI".
| srathi wrote:
| CSI: NY was way ahead of its time! :-)
| oh_sigh wrote:
| When would anyone use this? When they want to super zoom in on
| something? I think a more useful photo upgrading tech would be
| trying to un-blur shots and adjust lighting.
| lmilcin wrote:
| No, it does not provide sci-fi abilities to "enhance" resolution
| and extract new details.
|
| Because those details are generated by AI.
|
| For example, the woman in the photo might have different teeth in
| reality. We can't learn anything about her teeth because the
| teeth in the generated photo are one of many possible solutions
| that match the input.
|
| Actually, the photo now has less information for practical
| purpose as you don't know which details are real and which have
| been manufactured.
|
| So about the only gain is to improve the photo for aesthetic
| reasons.
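|
| A small sketch of the "many possible solutions" point, assuming
| the low-res input was made by averaging neighbouring pixels (the
| actual downscaling used for the article's images is not stated,
| and the names and pixel values here are made up):

```python
def downsample_by_averaging(pixels, factor):
    """Shrink a 1-D row of pixels by averaging each group of `factor` values."""
    if factor < 1 or len(pixels) % factor:
        raise ValueError("row length must be a multiple of factor")
    return [sum(pixels[i:i + factor]) / factor
            for i in range(0, len(pixels), factor)]


# Two clearly different hypothetical "high-res" rows...
smooth = [100, 100, 100, 100]
detailed = [60, 140, 180, 20]

# ...that collapse to exactly the same low-res row.
low_a = downsample_by_averaging(smooth, 2)
low_b = downsample_by_averaging(detailed, 2)
print(low_a, low_b)  # [100.0, 100.0] [100.0, 100.0]
```

| Given only the low-res row, nothing distinguishes the two
| originals, so any upscaler has to pick one candidate.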
| scotty79 wrote:
| Can I get such upscaler software somewhere to try it out?
| daniel_iversen wrote:
| What's the best commercial or open-source software for photo
| upscaling these days? It would be so wonderful to breathe new
| life into very old family photos!
| zimpenfish wrote:
| Pixelmator Pro[1] does a pretty good job with its "ML Super
| Resolution". Apparently Adobe have a similar "Super
| Resolution"[2]. One of the VQGAN-CLIP notebooks uses ISR[3]
| (but I haven't managed to get that working locally yet because
| of weird tensorflow version requirements.)
|
| [1] https://www.pixelmator.com/pro/ [2]
| https://photographylife.com/reviews/adobe-super-resolution [3]
| https://github.com/idealo/image-super-resolution
| [deleted]
| prawn wrote:
| This would be good to know. Last week I had a job photographing
| whales with a drone. Usual legal distance is 300m but I had a
| permit to photograph from 80m. Meanwhile, I suspect the clients
| would want results that looked even closer. Being able to
| upscale the waves and whale details might actually work pretty
| well in software - it just has to look like a whale up close
| and not necessarily the exact whale photographed.
| achow wrote:
| Just for upscaling (without cleaning up or enhancing noisy or
| blurred pictures), I use icons8.com/upscaler.
|
| For enhancing images Remini works very well for human face
| enhancing - sharpening and filling missing details of noisy
| blurred images.
| prawn wrote:
| Thanks - I'll give that first URL a shot.
| pcurve wrote:
| Wish they had actual high res image for comparison.
|
| Looks like a great way to save bandwidth for video conferencing
| calls
| shaklee3 wrote:
| They do in the paper
| pcurve wrote:
| Thanks, I found it. https://iterative-refinement.github.io/
| ackbar03 wrote:
| But at the cost of increased computation. I doubt the upscaling
| computation can run at an acceptable frame rate.
| Causality1 wrote:
| I found the compounding errors quite interesting, especially with
| the dog. The pixel changes originally caused by diffraction of
| light around the edges became a quite distorted skull shape with
| a rounded muzzle that resembled a poorly-done taxidermy job. The
| original photo of the line of teeth with a single dark spot is
| transformed into a bizarre serpentine line of teeth that would
| never exist in real life.
| pgt wrote:
| Wow, this is basically deconvolution. Can't wait to hear this
| applied to reverby audio. Reverb is basically blurring
| ('smearing' of sound) in the audio domain.
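|
| A toy sketch of the "reverb is smearing" idea only, assuming the
| room is modelled as a short impulse response convolved with the
| dry signal. It says nothing about how the model in the article
| works, and the values are invented:

```python
def convolve(signal, impulse_response):
    """Discrete convolution: each input sample is spread over later output samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


dry = [1.0, 0.0, 0.0, 0.0]   # a single click
room = [1.0, 0.5, 0.25]      # hypothetical decaying echoes
wet = convolve(dry, room)
print(wet)  # [1.0, 0.5, 0.25, 0.0, 0.0, 0.0]
```

| The single click is spread over three samples, the audio
| counterpart of one sharp pixel bleeding into its neighbours.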
| kumarm wrote:
| Original Google Blog Post:
| https://ai.googleblog.com/2021/07/high-fidelity-image-genera...
|
| Probably better to use the original link.
| [deleted]
| alasdair_ wrote:
| > improving medical imaging.
|
| This can be dangerous. A lot of medical imaging deliberately
| avoids using any kind of lossy compression due to worries about
| artifacts in the image. Actually adding new pixels that are not
| in the raw image seems especially worrying.
| gpt5 wrote:
| This would depend on the false positive / false negative rate.
|
| Depending on these numbers it could be used as a screener test
| for example, where it is used before a more invasive test is
| done.
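|
| For reference, a sketch of how those two rates are computed from
| screening outcomes. The counts are invented for illustration and
| do not come from any study of upscaled medical images:

```python
def error_rates(tp, fp, tn, fn):
    """Return (false positive rate, false negative rate) from confusion counts."""
    if fp + tn == 0 or fn + tp == 0:
        raise ValueError("need at least one healthy and one diseased case")
    false_positive_rate = fp / (fp + tn)  # healthy cases flagged as diseased
    false_negative_rate = fn / (fn + tp)  # diseased cases that were missed
    return false_positive_rate, false_negative_rate


# Hypothetical screening run: 1000 healthy and 50 diseased patients.
fpr, fnr = error_rates(tp=45, fp=50, tn=950, fn=5)
print(f"false positive rate: {fpr:.1%}, false negative rate: {fnr:.1%}")
```

| A screener can tolerate a higher false positive rate than a final
| diagnostic test, since positives are followed up, but missed
| cases are not.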
| kongin wrote:
| > This would depend on the false positive / false negative
| rate.
|
| I'm not a doctor, but I am a physicist and former pro-
| photographer, what is noise and what is signal in an
| experiment whose output is an 'image' has nothing to do with
| what makes a photo look good to human eyes. Often the whole
| point of methods of visualization is to make the image look
| objectively bad so you can easily pick out the areas of
| interest by the fact they are an eyesore. Applying upscaling
| to those images will actively destroy vital data.
| TeeMassive wrote:
| It depends on the way it is used. If it's used knowingly and as
| a last resort effort just to make sure that nothing is there
| then I don't see the problem.
| smt88 wrote:
| That means it should only be used to detect false negatives,
| not false positives.
|
| I'm not sure I trust people to maintain that discipline.
| tsimionescu wrote:
| Not sure what you mean, but if you're envisioning something
| like 'doctor looking at original, doesn't think there's
| anything wrong, but then checks upscaled image as well,
| just to be sure' then that is very dangerous, as it can
| lead to a significant increase in unnecessary testing.
|
| It may not be exactly as dangerous as the opposite (doctor
| looks at image thinks there is something suspicious, checks
| upscaled image to see if it's there as well), but it's
| still very dangerous.
| rubatuga wrote:
| Wait, you mean you don't want a doctor using hallucinated
| images to treat you? Can't wait for deepdream to be applied to
| chest x rays. /s
| forgingahead wrote:
| Sadly, pharma companies & hospitals will probably prefer
| these types of images being used: "Oh more likely than not,
| there is something there - let's start you on this long-term
| course of expensive medication!".
| otabdeveloper4 wrote:
| No biggie, just get an AI diagnosis classifier to look at
| your AI upscaled medical images.
| eurasiantiger wrote:
| Now you know the ethnicities of your patients, but have no
| idea whether they have cancer.
| 0-_-0 wrote:
| Actually, just train the classifier on the raw data. Cut
| out the middleman.
| AndrewKemendo wrote:
| I'd agree with this. This is a great example where I wouldn't
| put this into a critical production system.
|
| Up-sample your Tinder photo? Sure.
|
| Look for a sarcoma or bulging disk? No
| TeMPOraL wrote:
| I worry such funny algorithms find their way into _hardware_
| and start causing chaos in science and engineering. People do
| rely on COTS measuring equipment for a lot of important work,
| and there's a tacit assumption that the equipment tries to
| reflect reality.
|
| I've mentioned this before[0], so quoting myself:
|
| "for example, a research team may decide to not spend money on
| expensive scientific cameras for monitoring experiment, and
| instead opt to buy an expensive - but still much cheaper - DSLR
| sold to photographers, or strap a couple of iPhones 15 they
| found in the drawer (it's the future, they're all using iPhones
| 17, which is two generations behind the newest one). That's
| using COTS equipment. COTS is typically sold to less
| sophisticated users, but is often useful for less sophisticated
| needs of more sophisticated users too. But if COTS cameras
| start to accrue built-in algorithms that literally fake data,
| it may be a while before such researchers realize they're
| looking at photos where most of the pixels don't correspond to
| observable reality, in a complicated way they didn't expect."
|
| --
|
| [0] - https://news.ycombinator.com/item?id=26451691
| systemvoltage wrote:
| It's like trying to search for new galaxies and celestial
| bodies using homemade telescope + Google AI Photo Upscaling
| service. Facepalm.
| kumarvvr wrote:
| Yeah. I am amazed when I see doctors look at X-ray, CAT or MRI
| images, spot some haze somewhere and diagnose the issue.
|
| Imagine that thing being removed or enhanced by some algorithm.
|
| Also, why the heck would medical images want to be upscaled?
| jfoster wrote:
| This is probably an example that the writer came up with. I'm
| very sure that the people who work on this are well aware that
| the details it fills in may not match reality.
| KingMachiavelli wrote:
| Right. This technology is very dangerous if used to compress &
| then 'uncompress' medical images. I used to be a bit more
| cautious but I think if the model was specifically trained on
| x-rays or some type of medical images, it could do a very good
| job. I think the original image should always be shown in
| addition to the AI upscaled image. Having both the original
| plus an AI upscaled image that is 'correct' 90% of the time
| could be very useful.
|
| When it comes to things like distinguishing a shadow on a scan,
| I think AI might actually be better at 'detecting' whether
| something is a real shadow or just very similar to a shadow. I
| think it's just one of those things where AI up-scaling
| improves stuff ~80% of the time but is worse the other ~20%.
| The fundamental issue may become the same with self driving
| cars; people trust the AI too much and become inattentive
| themselves.
|
| While you certainly can't add 'correct' information that
| doesn't already exist in an image, the upscaling could
| correctly make existing information more obvious. Assuming that
| the human brain functions pretty much like AI (or rather the
| opposite) then at some point AI will become as competent which
| means that eventually with enough training & tweaking it should
| be as good or better than having a second human perspective.
| lathiat wrote:
| Reminds me of some Xerox copiers that actually were changing 6s
| to 8s with their compression:
| https://www.theregister.com/2013/08/06/xerox_copier_flaw_mea...
| JimDabell wrote:
| Not to mention accidentally adding Ryan Gosling's face to a
| photo!
|
| https://petapixel.com/2020/08/17/gigapixel-ai-accidentally-a...
| scoopertrooper wrote:
| Accident or a vast conspiracy?
| forgingahead wrote:
| Or upscaling President Obama to become a Caucasian person:
|
| https://twitter.com/Chicken3gg/status/1274314622447820801
| choeger wrote:
| That's definitely an enhancement.
| pfortuny wrote:
| Either you have the information or you do not. Interpolation
| (of whatever type) is always adding "guesses". So: not for me
| thanks.
|
| Pretty scary stuff.
| [deleted]
| FranksTV wrote:
| "Enhance."
| potamic wrote:
| What's ironic is that all those lame shows turned out to simply
| be way ahead of their time. Who's laughing at those memes now?
| [deleted]
| NAG3LT wrote:
| It will go from comedy to tragedy, as somebody will
| eventually get arrested and even convicted based on high
| quality picture of their face upscaled from 16x16 noisy mess
| of pixels.
| neilv wrote:
| What happens if you take an image of a portrait painting, reduce
| the resolution to pixelate at whatever resolution this upscaling
| model prefers, then run the model?
|
| Will the resulting image appear even more realistic than the
| painting?
___________________________________________________________________
(page generated 2021-09-01 10:01 UTC)