[HN Gopher] RealFill: Image completion using diffusion models
___________________________________________________________________
RealFill: Image completion using diffusion models
Author : flavoredquark
Score : 570 points
Date : 2023-09-29 18:27 UTC (1 days ago)
(HTM) web link (realfill.github.io)
(TXT) w3m dump (realfill.github.io)
| CrzyLngPwd wrote:
| Creating a fake life is going to be so easy soon.
|
| Everyone will be able to make all of the other fakes on social
| media jealous with ease.
| pbhjpbhj wrote:
| Can we cryptographically sign a photo in a way that shows it
| was generated in a particular place? I'm thinking of some sort
| of beacon in a location that allows you to say this person was
| here, at least. I'm not sure if it's possible to go beyond
| presence and indicate anything else about the situation?
|
| I hesitate to say it, but a blockchain is probably part of the
| solution.
| amelius wrote:
| There is this trend going on where hardware vendors are
| increasingly locking down their hardware, and this could be a
| part of the solution you are looking for. Not everyone will
| be happy about it, however.
| lopatin wrote:
| These github.io pages always go down once they hit the front
| page.
| londons_explore wrote:
| Works fine for me...
|
| some github.io pages are iframes to the developer's home machine
| or something for a tech demo that can't withstand many users.
|
| But regular github.io static pages ought to be able to
| withstand millions of users at once.
| lopatin wrote:
| I see, I think they're just broken for me in general.
| londons_explore wrote:
| Might be blocked on some corp networks because it's all
| anonymous user generated content.
| callalex wrote:
| Images are broken for me.
| Squarex wrote:
| It appears to work ok and I've never witnessed problems with
| static content on github pages.
| markjpT wrote:
| [flagged]
| a1o wrote:
| Images aren't loading for me which is kind of a bummer for this
| specifically... :/
| EricMausler wrote:
| I feel like this will be great for wedding photographers
| dang wrote:
| [stub for offtopicness]
| aga98mtl wrote:
| I do not agree with their usage of the word "Authentic".
| emodendroket wrote:
| Perhaps verisimilar then.
| crazygringo wrote:
| The point is that it's not based on hallucination -- it's
| generated out of the authentic details provided from other
| images.
|
| There's definitely a middle ground here that we perhaps don't
| have a good word for. E.g. what do we call a painting made by
| an artist who sat in front of the scene they depicted, vs. a
| painting made by an artist from their imagination? There's
| certainly some sense in which the first one was an
| "authentic" scene.
| thomastjeffery wrote:
| Here are some better words off the top of my head:
|
| Intentional
|
| Contextual
|
| Everything about this project goes against the meaning of
| authenticity.
| CobrastanJorji wrote:
| Yeah, except it's still absolutely vulnerable to
| hallucination. Look at the last set of images on the
| "Limitations" page. The algorithm knows that there's a sign
| with text there, and it uses the original image to get the
| right letters in there, but it randomly reorders the
| letters rather than using the source image. "Real" and
| "authentic" is extremely misleading here.
|
| That said, props to them for calling out the limitations so
| clearly. I really appreciate it when people are up front
| with the problems like that.
| bhaney wrote:
| Cool tech, but plastering "authentic" all over this kind of
| generated photography is really disingenuous and just rubs me
| the wrong way. I get that it's informed by real details from
| other photos, but that's not what authentic means.
|
| If I buy an "authentic Rolex" and receive a Chinese Rolex clone
| that's built similarly based on observations of a real Rolex,
| I'm going to feel scammed and very upset. And I'm much more
| protective of my memories than I would be of a watch.
| dang wrote:
| Ok, we've taken authenticity out of the title above.
| 101008 wrote:
| Yeah, I think the first example is bad. This shouldn't be
| used for the photos you took. What's the purpose of having a
| photo if it wasn't the real moment you captured? I could
| understand the usage in marketing or event photography, but
| for memories with your loved ones (as the first example tries
| to show it) it just doesn't make sense to me.
|
| Two anecdotes:
|
| 1. A friend of mine met his favourite author (traveled from
| one continent to another for a signing event). When he shook
| hands with the author, a friend took a photo. A lady (still
| hated by us!) stepped in the middle, and blocked the photo.
| Maybe an IA or a talented person could remove her, use
| another photo of the author and rebuild the photo... but why?
| What's the purpose of that?
|
| 2. A few months ago during the pandemic I scanned all the
| printed pictures of my grandparents with my phone. After
| scanning about 200, I checked one and I zoomed in: the stupid
| app applied some IA to make it better and it just was worse.
| I don't care if it looks better for the untrained eye: my
| grandparents didn't look like that. I now have stupid
| horrible versions of the scanned photos, where my grandparents
| appear with smooth skin and weird eyes.
| nuancebydefault wrote:
| Is IA French for AI? (Like UE and many other
| abbreviations)? I could look it up but might as well ask
| the question.
| nargek wrote:
| Yes it is!
|
| IA -> Intelligence Artificielle
| smcnally wrote:
| In Spanish, too -- and other languages that put the
| adjective after the noun
| IanCal wrote:
| I totally agree with 2. I'm less sure on 1. Imagine it's
| _perfect_ - it would be an accurate representation of what
| was really there. The real photo is a snapshot of a very
| specific time that doesn't represent the broader context
| of what happened.
|
| A different angle, if a friend had painted the encounter
| instead, it wouldn't be exact but it would be a snapshot of
| a memory.
|
| I'm not hugely arguing in favour of it but I think there's
| different scales here, from cameras doing "merge pictures
| half a second apart so people have their eyes open" to
| "totally change their face".
| neilv wrote:
| They really need to not use the term "authentic" to name
| this.
|
| They also need to be very, very careful when introducing
| capability to falsify photographic images convincingly.
|
| Using the term "authentic" for this (and how do they even
| know what's an authentic memory?) doesn't sound like being
| very, very careful. It sounds like being gratuitously
| reckless.
| cmdli wrote:
| I would argue that authentic is a relative term, and actually
| helped me understand the product more easily. IMO, it is
| "authentic" because, compared to other image fills, it tries
| to fill in the data using real data from other photos.
| sdfghswe wrote:
| HN loves arguing about words.
| debugnik wrote:
| And recent HN posts love to twist them and reinterpret
| them just for promotion.
| marricks wrote:
| It's "authentic" in the same way that when you see
| something labeled authentic it makes you more likely to
| question if it's actually what it says it is because
| authentic things don't need such labels plastered on them.
|
| Regardless, I'm pretty sure "reconstructed" is the honest
| word to use.
| HaZeust wrote:
| I'd call that "contextual" rather than authentic.
| waynesonfire wrote:
| let me give you an example. when i draw a mustache on a
| face in ms paint in brown, that's contextual but not
| authentic.
| endisneigh wrote:
| IDK, when I think authentic, I think "genuine", and no
| image generation is genuine by definition. this is not a
| bad thing necessarily, but it's important to frame these
| things correctly.
|
| ultimately we oughta think about what we are referring to.
| if we are talking about a photograph taken by someone, the
| authenticity is ultimately coming from the combination of
| the photograph and camera used. so when you think of a
| genuine photo in this scenario you expect it to be
| fundamentally taken by the user by a particular camera to
| create a particular photograph. you can use devices to take
| a photo without pressing the button, such as a timer, but
| the photograph and camera are both fundamental to the
| authenticity of the image. if the camera is no longer
| entirely involved in the generation of the photograph I
| would say that it is no longer genuine.
|
| Reference driven as described in the article is more
| appropriate, but alas it is verbose. normally such pedantry
| bores me, but in this case it's pretty central to what is
| being presented.
| tremon wrote:
| I think "composite" would be more accurate to describe
| this process. As in, "complete a picture using image
| composition".
| jameshart wrote:
| We need a new word: authentish.
| smcnally wrote:
| Like "truthy."
| tremon wrote:
| How do they know the data from the other photos is real?
| thomastjeffery wrote:
| This literally goes against the meaning of the word
| authenticity.
|
| Call it "realistic". Words matter.
| Timon3 wrote:
| "Realistic" is the wrong word, since that's what infill
| models are already doing, and the word is already used for
| that. You'd have to find something that differentiates
| between plausibly realistic and contextually realistic
| infill.
| QuercusMax wrote:
| Seems like it's only a matter of degree, given that modern
| cell phone cameras take image bursts and combine them into a
| single output image. Filling in details in a scene from other
| photos taken at the same time doesn't really seem that
| different to me. And seeing that photography has never really
| been capturing real life exactly, is it really that big a
| deal? Look at Ansel Adams - he heavily edited his "real-life"
| photographs, and changed them over the years as he made
| subsequent prints.
|
| (Disclaimer: work for Google but have nothing to do with this
| project.)
| drewcoo wrote:
| > plastering "authentic" all over this kind of generated
| photography is really disingenuous
|
| No more so than "virtual," which used to mean "true." Or
| "literal" which used to be the opposite of "figurative." It's
| just another word being used auto-antonymically.
|
| Definitio fugit.
| thomastjeffery wrote:
| Virtual never meant real.
|
| Literally is often used _in a sarcastic context_. That
| sarcasm _depends on_ the word meaning what it means.
| esafak wrote:
| Come on, Google, push this to Google Photos.
| syntaxfree wrote:
| There was a subreddit called something like r/bubbling where
| people would edit pictures of women in bikinis and actually cover
| more of the image, but in such a way that your brain was fooled
| into completing the image and seeing a nude woman. I thought it
| was a technical marvel, however creepy.
| kaetemi wrote:
| I have a ton of potato definition videos along with matching high
| res photos from my childhood that were made by one of those cheap
| CF-card cameras at the time. Would be cool if this could restore
| those shitty video frames based on the reference photos as well.
| 01100011 wrote:
| Slightly off-topic: what's the best way right now to remove my
| ex-wife from an old family portrait and replace her with my
| current wife?
| tremon wrote:
| Scissors and a Pritt stick.
| Adverblessly wrote:
| Assuming you are asking about a generative AI way, you could
| use photos of your new wife to train a LoRA with kohya-ss, then
| with A1111 you could do an img2img repaint using the ControlNet
| extension to make sure you get a similar pose. With enough
| experimentation you could probably get at least one decent
| result.
|
| At least that's what comes to mind with the things I know you
| can run offline.
| [deleted]
| pcblues wrote:
| Give it five years for the tech. Right now? Probably easier to
| get back with the ex to make the portrait correction.
|
| /jk sorry
| markjpT wrote:
| [flagged]
| true_religion wrote:
| I know someone who did something similar. He remarried then
| went back and cropped or deleted ten years of Facebook
| photos to make it look like he never had a previous
| relationship and just ten years of boys' nights.
|
| He even has a picture up of him from his wedding day...
| standing alone in a tux.
| solardev wrote:
| Ask everyone to pose for a new photo
| markjpT wrote:
| [flagged]
| bradleyjg wrote:
| The kind of stuff the op is doing---changing the composition to
| reflect a picture that could have been taken---is one thing.
| But what you are asking feels Stalin-esque to me. A picture is
| a record of a point in time and you can't change the past.
| true_religion wrote:
| This is why my family portrait is going to be a painting. In
| the future, you can just paint in the new generations and we
| can all be in the same frame together.
| joosters wrote:
| _A picture is a record of a point in time and you can't
| change the past._
|
| I don't think either of those things are true. Both can be
| changed, and are often changed. Much of what we 'know' of the
| past is wrong.
| solardev wrote:
| Sure you can, just as you can change people's memories and
| implant false ones. Hell, in this dystopia we're headed
| towards, it'll probably be a subscription service where you
| can rewrite 5 bad memories a month for $29/mo
| syntaxing wrote:
| Something...seems fishy? Like the example with the guy next to
| the robot figure. Their model happened to predict exactly the
| same type of figure?! Diffusion models are not omnipotent...
| foota wrote:
| The model gets the reference images as "context", so it can
| just copy the robot from one of the other images.
| syntaxing wrote:
| Ahh I see, this makes a lot more sense now!
| IshKebab wrote:
| That's the entire point. It didn't "happen" to predict exactly
| the same type of figure. It used the context photos to know
| what type of figure it should render.
|
| You might be getting a bit confused because here the training
| process has to happen every time you use it, whereas in most AI
| applications you only perform inference for actual use.
| bjornlouser wrote:
| I wonder if he is holding that umbrella to aid the model in
| recovering the 3d scene/scale from the reference images.
| Dwedit wrote:
| How much VRAM (or system RAM) would you need to run this, and how
| much processing time does it take to process the reference images
| and let it generate the fills?
| derefr wrote:
| Question: does this model only do outpainting, or does it also do
| super-resolution? Could this model be fed all the frames of a
| really awful security-camera video, in order to then synthesize a
| high-resolution still image of a suspect?
| pbhjpbhj wrote:
| And you can tell it in advance which person you want it to
| reveal as the suspect!
| endisneigh wrote:
| an interesting use case for this once the compute is there is to
| basically allow for ai powered digital zoom-out. it could work by
| instructing the user to take several pictures around the target,
| and then you take regular pictures of your subject.
|
| then, as you like, you can do an "ai zoom out" to get zoomed out
| pictures, no longer constrained by your lens or distance.
|
| I imagine this to be included relatively soon, just like how
| panoramas were once a niche thing that became much easier to do
| with some good ui/ux. pretty much any modern phone can do them
| without having to struggle with lining up photos and what not.
|
| one thing that does greatly concern me about the demo/site is
| that they have "authentic" and "recover" as terms. the result
| here is not authentic nor has anything been "recovered." it's an
| illusion at best. I personally don't like how they portray the
| new image as equivalent to what the lens would have framed in the
| original picture. it's not, as they show themselves in the later
| portion (near the end) with the text sign. seriously
| irresponsible framing (pun intended) to what's otherwise very
| cool tech.
| gs17 wrote:
| Agreed, the "Reference-Driven Generation" part is totally fine,
| but "authentic" is overselling it.
| richardw wrote:
| Nice idea. Might not need multiple pics given Google's image
| dataset and ability to recognise what you're looking at.
|
| Give that a couple generations. "You were at location X and
| didn't take a pic. We generated you some selfies, choose one
| that you like."
| cvwright wrote:
| If they wanted to, they could find real pictures of you,
| taken by other Google users who were there at the same time.
|
| I don't know if that's more or less creepy than the AI
| stuff...
| satvikpendem wrote:
| This already exists in Stable Diffusion and others, called
| outpainting.
| markjpT wrote:
| [flagged]
| Anya200 wrote:
| [dead]
| rasz wrote:
| Recover those precious memories of things that never happened,
| only with Google!
| nuancebydefault wrote:
| Somewhat covertly I deep down wish that humans' desire for pretty
| looking pictures will fade away over time, due to the ubiquity of
| pretty looking pictures produced by auto post processing. And
| ultimately their liking of pretty people and shiny new stuff in
| general. I don't want to sound negative or pedantic, I just would
| like that people prefer inner beauty in the broader sense.
| hansoolo wrote:
| This is a beautiful post! Thank you!
| pedalpete wrote:
| Mix this with some NeRF/Gaussian splatting, or other 3D rendering, and
| we have photos where you can re-frame after being shot. No more
| selfie sticks, perhaps use both cameras at one time to capture
| more of a scene for infill.
|
| Some will say "but that isn't a real photo of what was there",
| but our memories of what was in a photo or a scene aren't perfect
| anyway.
| drumttocs8 wrote:
| I like the idea of a 3d generated scene we can explore with VR
| debarshri wrote:
| The current advancement in Generative AI is a bit scary, in my
| opinion. May I be pessimistic?
|
| This and the new demos I saw from WhatsApp around
| persona-based AI can really alter someone's perception and
| memories. I don't think we are considering how it can really
| impact our understanding of our feelings, perception, memories
| and mindfulness.
|
| If you take a picture of reality and alter it with Gen AI to do
| something else and change the moment, what is the new reality?
| After a while, we might question if it was real or not, and then
| that might just become the new reality.
|
| In my opinion, GenAI is truly transformational as well as scary,
| as it can alter our perception. I wonder if anyone else feels
| this way.
| theultdev wrote:
| Nobody shows their true reality in public pictures anyway, it's
| all staged in some way.
|
| For private pictures, it didn't change your reality, you can
| lie to yourself, but you've always been able to do that.
| debarshri wrote:
| but when you take a picture that captures a personal moment and
| some software without your consent alters it with some
| generative stuff, what would that lead to?
|
| I disagree with lying to yourself. For people who are not
| mindful and aware, this will severely impact their perception.
| theultdev wrote:
| > but when you take a picture that captures a personal moment
| and some software without your consent alters it with some
| generative stuff, what would that lead to?
|
| I mean, do you not look at the photo after you take it?
| Even if you don't, you were there and saw the original
| scene. If your memory fails you, it's on you. If you didn't
| take an accurate picture, it's on you. Check next time.
|
| If anything meaningful is added, it'll be very noticeable,
| if it's not meaningful, then what does it matter?
|
| Cameras already do a lot of corrections that don't
| represent reality.
|
| Hell, our perception of colors is different from everyone
| else's.
| deckar01 wrote:
| I have been working on a holographic camera, but the ultra-cheap
| pinhole cameras I chose for the array have two issues: the
| exposure can't be controlled and the lenses are poorly aligned. I
| can calibrate away most lens aberrations with OpenCV, but some of
| the outliers have so much cropping that I am discarding 75% of my
| good pixels to get a coherent result. I was considering using
| NeRFs to reproject the ideal camera angles, but COLMAP is not
| very tolerant of brightness fluctuations and NeRF training is
| relatively slow (considering my goal is video). This would be a
| nice solution to my problem, because I have a comprehensive set
| of angles to pull context from.
| waynenilsen wrote:
| "Comparison with Baselines" is shocking
| uptown wrote:
| Is this similar to what GoPro cameras do to remove the selfie-
| stick? They use video content from adjacent frames to remove the
| pole and fill in with pixels. I get that the approach here can
| use imagery that's framed completely differently.
| Jorge1o1 wrote:
| Wow. The use case that comes to mind for me is when you take a
| big family photo (or 20) and someone inevitably ends up cut-off
| by accident.
|
| So then you just feed RealFill the 20 pictures you took and your
| uncle is magically painted in.
| xwdv wrote:
| You don't even need to take the photo, with enough images of
| each family member and images of a tourist destination you can
| just automatically construct a photo of everyone together at
| the location, saving the costs and carbon footprint of getting
| everyone together.
| cubefox wrote:
| And then why demand "photos" of family excursions at all,
| when it is just an AI imagining how things probably were
| happening at the time, or would have happened? We should just
| stick to our own imperfect memory.
| therein wrote:
| I'd imagine in the future we could have services such as
| this one:
|
| > In exchange for a small fee and a 35-minute suggestion
| session, get you and your family implanted with memories of
| a beautiful vacation that'll last you for a lifetime for
| a fraction of the cost of an actual one.
| TheJoeMan wrote:
| That reminds me of
| https://hackaday.com/2023/06/02/ai-camera-imagines-a-photo-o...
|
| A box that takes your gps location, weather, etc and
| autogenerates a photo from your PoV.
| jetrink wrote:
| Also getting everyone smiling with their eyes open at the same
| time. Phone cameras could record a group photo for five or ten
| seconds and use the best expression from different times for
| each person.
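A minimal sketch of the per-person selection described above, assuming some upstream detector has already assigned an expression score to every person in every frame of the burst. The `scores` layout and the name `pick_best_frames` are hypothetical and do not reflect any phone vendor's actual pipeline; compositing the chosen faces into one image is a separate step not shown here.

```python
def pick_best_frames(scores):
    """Pick, for each person, the frame in which they look best.

    scores: dict mapping frame index -> {person: expression score},
            where a higher score means a better expression
            (hypothetical input, e.g. "smiling with eyes open").
    Returns a dict mapping person -> index of their best frame.
    """
    best = {}  # person -> (frame index, score)
    for frame_idx, per_person in scores.items():
        for person, score in per_person.items():
            # Strict comparison: on a tie, the earliest frame wins.
            if person not in best or score > best[person][1]:
                best[person] = (frame_idx, score)
    return {person: frame_idx for person, (frame_idx, _) in best.items()}


# Example: three frames, two people.
burst = {
    0: {"alice": 0.2, "bob": 0.9},
    1: {"alice": 0.8, "bob": 0.4},
    2: {"alice": 0.8, "bob": 0.1},
}
chosen = pick_best_frames(burst)
```

Each person can end up sourced from a different frame, which is why the result still has to be blended into a single target frame afterwards.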
| emodendroket wrote:
| Pixel phones already have some features kind of like this so
| it makes sense.
| lazycouchpotato wrote:
| I feel like this is already a thing with certain photo
| editing applications, if not built into phones themselves.
| AuryGlenz wrote:
| I'm a photographer that does families/weddings.
|
| I've done this manually in Photoshop more times than I can
| count.
|
| Usually more automated solutions only hold up to light
| scrutiny, but that's rapidly changed in the past year. I'm
| sitting after this year and I'm a little miffed about it.
| Oh well.
| kennyadam wrote:
| You're sitting?
| patapong wrote:
| Or you take a single picture of a group in front of a
| monument, but cut it off. As I understand it you could find
| pictures of the monument online, run the model, and have a
| picture with the group and the entire monument.
|
| Probably google can even do this automatically - I would not
| be surprised if I get suggestions to fix images with cut off
| buildings via Google Photos in the future! Would be so cool.
| twism wrote:
| From the leaks this may be coming to the Pixel 8
| twism wrote:
| https://www.theverge.com/2023/9/23/23886765/google-pixel-8-p...
| ChrisClark wrote:
| Leaks? Wasn't this a launch feature in Google Photos, while
| it was still Google+ Photos?
|
| It was supposed to adjust eyes to open them if you took
| multiple photos.
| twism wrote:
| That's "Top Shot" which is the entire frame. The feature
| I'm referring to would adjust multiple faces in a frame
| by selecting the same faces/sections from different
| frames to a single target frame
| ChrisClark wrote:
| I swear that's what was announced, but I assume you're
| right because you actually know the term Top Shot, and I
| had no memory of that.
|
| So your memory is probably better than mine. :)
|
| I just remember some demo of a family shot and it
| automatically opening a little boy's eyes by using another
| photo. And another auto combining of images so that you
| could take a lot of photos of a busy tourist place and
| automatically remove all the people.
| Grazester wrote:
| Yeah there is a version of the smart fill available on
| Pixel phones
| crazygringo wrote:
| Wow.
|
| This actually feels like it could be an _incredibly_ valuable
| post-production tool in film and TV, once they get it working
| consistently across multiple frames.
|
| Not only for more flexibility in "uncropping" after shooting
| (there was a tree/wall in the way), but this could basically be
| the holy grail solution for converting 4:3 to widescreen without
| cutting off content on the top and bottom.
| waynenilsen wrote:
| removing the cameraman from the shot is probably pretty close
| to the top of the list also
| Gabrys1 wrote:
| Initially I read the above as removing the cameramen from the
| process of taking photos (which is also where this is going)
| Wistar wrote:
| ... especially on highly reflective subject surfaces such as
| cars.
| markjpT wrote:
| [flagged]
| emodendroket wrote:
| I can see it working great for some stuff but wouldn't you
| ultimately face the issue with more artistic work that the
| framing might not be very good if just artificially extended?
| crazygringo wrote:
| It definitely needs to be applied judiciously on a shot-by-
| shot basis.
|
| There have been quite a few 4:3-to-widescreen conversions
| that were done using the original film that was actually shot
| in widescreen and cropped for TV.
|
| Sometimes, the wider shot makes perfect sense. Sometimes,
| they keep the original cropped one but cut off top/bottom.
| Sometimes it's a combination of the two. It all depends on
| what's being framed -- two people in a car usually benefits
| from cropping (nobody needs the bottom third of the frame
| occupied by the car's hood), while a close-up on someone's
| face usually benefits from extending the sides (otherwise
| it's an uncomfortable mega-close up that cuts off their
| mouth).
|
| But having the flexibility to extend horizontally gives you
| the artistic possibilities.
| miahwilde wrote:
| wow x2. You're right, video is where this is really cool. Take
| enough video of a scene and you can then create most any photo
| from any angle in it.
| qingcharles wrote:
| I already use Photoshop Generative Fill for uncropping videos,
| but it only works for fixed camera shots. Photoshop just added
| a feature where you can just drag the video file in and do the
| uncrop in one step.
|
| The problem I'm solving is converting videos from widescreen to
| vertical and sometimes you need some extra height.
| kennyadam wrote:
| Mind if I ask why you'd need to do that? It's a huge amount
| of the frame being generated artificially, especially if
| you're talking cinema aspect ratio wide-screen.
| qingcharles wrote:
| If you're trying to convert widescreen content so it looks
| good on TikTok, Reels, Shorts, then for the most part you
| can crop a vertical chunk from the centre of the frame, and
| pan if necessary to keep the action in frame. Sometimes
| though the shot is too wide and you can't crop that
| vertical chunk out, so you have to crop it as vertical as
| you can and then add something to the top and bottom to
| fill out the frame, otherwise you have a whole shot that
| isn't in vertical and it breaks the flow of the video.
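The arithmetic behind that workflow can be sketched as follows. This is only an illustration under assumptions: the target is taken to be 9:16, and `min_crop_w` (the narrowest crop that still keeps the action in frame) is a hypothetical input that an editor would choose per shot. The padding it reports is the height that would have to be filled in, for example by generative fill.

```python
def vertical_crop_plan(src_w, src_h, min_crop_w, ratio_w=9, ratio_h=16):
    """Plan a widescreen-to-vertical conversion.

    Returns (crop_w, crop_h, pad_top, pad_bottom) in pixels.
    min_crop_w is the narrowest crop width that keeps the action
    in frame (an assumed, per-shot editorial choice).
    """
    # Width of a full-height crop with the target aspect ratio.
    ideal_w = src_h * ratio_w // ratio_h
    if min_crop_w <= ideal_w:
        # The action fits in a plain vertical crop: no padding needed.
        return ideal_w, src_h, 0, 0

    # The shot is too wide: crop as narrow as allowed, then work out
    # how tall the output must be to reach the target ratio.
    crop_w = min(min_crop_w, src_w)
    out_h = -(-crop_w * ratio_h // ratio_w)  # ceiling division
    extra = out_h - src_h
    pad_top = extra // 2
    return crop_w, src_h, pad_top, extra - pad_top


# A 1920x1080 shot whose action spans 900 px of width.
plan = vertical_crop_plan(1920, 1080, 900)
# plan == (900, 1080, 260, 260)
```

For a 1920x1080 source, anything that fits within about 607 px of width needs no fill at all; the padding grows quickly once the required crop is wider than that.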
| jiggawatts wrote:
| > widescreen to vertical
|
| You're a monster.
| qingcharles wrote:
| I know. What would my parents think of what has become of
| their son..? Sorry, Mum and Dad! o_O
| anigbrowl wrote:
| Pro: a cool and useful looking technology
|
| Con: it's from Google so forget about trying it yourself any time
| soon
|
| I used to be a huge supporter of Google's products, now the name
| is an instant red flag.
| corndoge wrote:
| this page consistently crashes chrome on iOS
| cryptoz wrote:
| So is the weather just hallucinated then? We're just making up
| memories and calling them real? And advertising this blatantly,
| calling rainy days sunny and sunny days rainy? My god I hate this
| so much.
|
| Not even a discussion about if this might be harmful or what the
| risks are or anything, just plain old "THIS FAKE MOMENT WAS REAL
| AND YOU'LL BELIEVE IT"?!
|
| I really have a hard time with this. Wow I'm upset, more than I
| expected. The tech is fine yeah but the marketing is just deeply
| upsetting.
| dymk wrote:
| > We're just making up memories and calling them real?
|
| This has always been the case, you just don't remember it, and
| the (human) hallucinated details are usually just not important
| enough to care about.
| awruko wrote:
| this is crashing (black screen) my firefox on iphone
| [deleted]
| dwallin wrote:
| Seems like the real utility of this technique will be as a way to
| vastly improve the temporal stability with a variety of
| generative video techniques. For example, if you are trying to
| use a video as a base for a new generative video: Take the first
| frame of your video and run it through SD with the control net of
| your choice. Then take that initial image and run it through this
| process to generate a new base model and then use that to
| generate your second frame. Now you can use that second frame to
| feed back into your model and rinse and repeat, always using the
| past few frames to inform the latest.
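The loop described above can be sketched in a few lines. This is a structural sketch only: `stylize_first`, `finetune`, and `generate` are hypothetical placeholders for the Stable Diffusion + ControlNet pass, the RealFill-style per-scene fine-tuning, and the reference-driven generation step. No real API is implied, and `window` (how many past frames to keep as references) is an assumed parameter.

```python
from collections import deque


def loopback(frames, stylize_first, finetune, generate, window=3):
    """Generate a stylized video frame by frame.

    Each new frame is produced by a model fine-tuned on the last
    `window` generated frames, so recent output keeps informing
    the next frame. All three callables are caller-supplied
    placeholders (hypothetical, see the note above).
    """
    frames = list(frames)
    if not frames:
        return []

    # Frame 1: plain stylization of the first source frame.
    output = [stylize_first(frames[0])]
    recent = deque(output, maxlen=window)  # sliding reference set

    for frame in frames[1:]:
        model = finetune(list(recent))       # new base model from references
        new_frame = generate(model, frame)   # next stylized frame
        output.append(new_frame)
        recent.append(new_frame)             # oldest reference drops out
    return output


# Dummy stand-ins that only record what each step was given.
result = loopback(
    ["a", "b", "c", "d"],
    stylize_first=lambda f: f"s({f})",
    finetune=lambda refs: tuple(refs),
    generate=lambda model, f: f"g({f}|{len(model)} refs)",
    window=2,
)
```

Note that this fine-tunes once per frame, which would be expensive if the fine-tuning step is as slow as per-scene training usually is; reusing a model across several frames would be one obvious variation.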
| goodmachine wrote:
| That makes sense. If I understand correctly, this 'loopback'
| technique is being used below as you describe. Alarming video,
| btw.
|
| https://www.reddit.com/r/StableDiffusion/comments/16uqqrh/ho...
| buildbot wrote:
| Cool tech as others have said, but of course, for thee but not
| for me with Google, unless I missed a link to a GitHub repo.
| (That's why OpenAI is called OpenAI - not open source, but at
| least open access!)
| KETpXDDzR wrote:
| Does anyone else notice that the example images they provided
| look like they included their test data in the training set?
| E.g., the picture with the couch where they cut out the dog in
| it. How should the network know that there was a dog on the
| couch? The only explanation is: It knows the reference image.
| amelius wrote:
| That's the idea, right?
|
| You give it a bunch of reference images, then another image
| with some rectangle removed, and it will fill in the rectangle
| with information from the reference image.
| sergiotapia wrote:
| This is what will make the Pixel compelling.
|
| My wife and I have been using the Pixel phones since Pixel 6 and
| we love the camera. Great pictures! But the best features are
| google photos, auto-tagging, recommending collages, walking down
| memory lane.
|
| Then you can magic erase tourists from pictures and pick a better
| shot from a picture you took on the fly....
|
| You add this "authentic image completion" to my kids' pics, and
| it's game over...
|
| I want this on my Pixel 8 asap!
| ehsankia wrote:
| The demo of the new upcoming Magic Editor they gave at I/O was
| quite magical.
|
| https://www.youtube.com/watch?v=-a583U3Sw44
|
| There's also leaks showing another feature where you can
| individually swap every person's face to get the perfect photo:
|
| https://www.ign.com/articles/google-pixel-8-leaked-video-ai-...
|
| I definitely agree, Pixel has been at the forefront of
| computational photography and editing since its inception. Things
| like night photography that we take for granted now, I remember
| when Pixel 2 first introduced it and it was honestly mind
| blowing.
| tinyhouse wrote:
| What's so magical about that I/O? I get the point of
| improving the quality of a picture. But editing the picture
| so that it includes things that didn't really happen... why
| even care besides trying to impress others?
| simoneau wrote:
| Me: Facebook AI, please post an entry about my vacation on Cape
| Cod and create a bunch of photos to go with it.
|
| Facebook: Great. I'd be happy to. Any more detail you'd like to
| add?
|
| Me: Make us look attractive. Show that we're having a great
| time. Also, we went to see the Chatham Lighthouse.
|
| Facebook: OK, done!
|
| ...
|
| Facebook: You've received 48 likes. Your mother would like to
| know if you had any salt water taffy.
|
| Me: Yes, and please create a picture of my oldest daughter having
| trouble chewing it.
|
| Facebook: Done.
| derefr wrote:
| When you think about it, the only thing that's weird about this
| hypothetical conversation is the context of it being about
| (purported) photographs.
|
| We expect images that _look like photographs_ -- at least when
| taken by amateurs -- to be the result of a documentary process,
| rather than an artistic one. They might be slightly filtered or
| airbrushed, but they won't be put together from whole cloth.
|
| But amateur photography is actually the outlier, in the history
| of "capturing memories"!
|
| If you imagine yourself before the invention of photography,
| describing your vacation to an illustrator you're commissioning
| to create some woodblock-print artwork for a set of Christmas
| cards you're having made up, the conversation you've laid out
| here is exactly how things would go. They'd ask you to recount
| what you saw, do a sketch, and then you'd give feedback and
| iterate together with them, to get a final visual down that
| reflects things _the way you remember them_, rather than _the
| way they were_, per se.
| jprete wrote:
| This is an interesting point. Usually people claim technology
| goes inexorably forward, yet here we are, merrily destroying
| trust in the most objective method we have to record the
| past!
| pbhjpbhj wrote:
| Photographs haven't been able to be trusted since almost
| the beginning. Trusted as an image of a real scene that is.
|
| Indeed, people viewing photographs have always been able to
| be manipulated by the presentation as fact of something that is
| not true -- you dress up smart, in borrowed clothes, when
| you're really poor; you stand with a person you don't know
| to indicate association; you get photographed with a dead
| person as if they're alive; you use a back drop or set; _et
| cetera_.
| jprete wrote:
| These aren't even remotely comparable to AI photo
| manipulation.
| jayunit wrote:
| https://web.archive.org/web/20140222103103/http://subterrane.
| ..
| seydor wrote:
| You guys are very unambitious.
|
| FB AI, make a series of posts about me climbing Mount Everest,
| meeting the Dalai Lama, curing cancer, bringing peace to Ukraine,
| changing my name to Melon Tusk, announcing running for
| president and adopting a dog named Molly
| toyg wrote:
| But see, that's the sort of thing that would give it away.
|
| You got to shoot for something just attainable enough to
| sound credible, while still being at the "enviable" end of
| the spectrum.
|
| "FB AI, make a series of pictures of my first 3 months at
| Goldman Sachs in 2021. Include me shaking hands with the VP
| of software as I receive a productivity award for making them
| $1m in a week. Include a group photo of me and 12 other
| people (all C execs and my VP must be there). Crosspost all
| to LinkedIn, with notifications muted."
|
| _"Ok done"_
|
| "ChatGPT, take my existing CV and replace entries from 2021
| onwards with a job as Head of Performance Monitoring at
| Goldman Sachs, reporting to VP of software. Include several
| projects with direct CEO and CFO involvement. Crosspost
| changes to LinkedIn."
|
| _"Ok done"_
|
| ... and now I can go job-hunting.
| seydor wrote:
| I can see AI Consulting to be the next incarnation of
| social media expert
| y-curious wrote:
| Incredible. Man, am I going to be telling my grandkids about a
| time when you could believe your eyes and ears on the internet.
| DaiPlusPlus wrote:
| What if we're already living in the future, and everything
| we're experiencing right now is being AI generated?
|
| ...that, and other thoughts I have while baked.
| ShakataGaNai wrote:
| Sounds like the plot line to an episode of Black Mirror, but
| also something that is far too likely to happen.
| simoneau wrote:
| me: Facebook AI, please post a tender moment between me and
| my father when I was a boy. Include some photos.
|
| Facebook: I'd be happy to. Are there any more details you'd
| like to include?
|
| me: Please show how he didn't understand me at first, but
| then he looks at me and starts crying with love and regret.
|
| Facebook: Done. Your relationship with your father must have
| been deeply fulfilling.
| ormax3 wrote:
| https://petapixel.com/2022/12/14/man-fakes-an-entire-
| month-o...
| anticristi wrote:
| This is the weirdest video I ever watched. It's like Black
| Mirror ... but in real life ... and a somewhat happy
| ending.
| brap wrote:
| For the last 2-3 years, on an almost weekly basis, I am blown
| away by the progress made in AI. Huge steps forward. It actually
| happened twice in the last 24 hours alone.
|
| Where will we be 10 years from now? 50?
| Heidaradar wrote:
| What was the second time in the last 24 hours?
| brap wrote:
| https://youtu.be/MVYrJJNdrEg
| pbhjpbhj wrote:
| Lex Fridman's 3D realistic avatar interviewing Mark
| Zuckerberg in a generated space (two floating heads).
|
| Interesting to me how it illustrates philosophical
| questions on the nature of reality, the projection of
| personality, the 'problem of other minds', and such.
| js4ever wrote:
| Ohhh, another research paper from Google that will lead nowhere
| jawns wrote:
| There's definitely value in providing this functionality for
| photographs taken in the present.
|
| But I think the real value -- and this is definitely in Google's
| favor -- is providing this functionality for photos you have
| taken in the past.
|
| I have probably 30K+ photos in Google Photos that capture moments
| from the past 15 years. There are quite a lot of them where I've
| taken multiple shots of the same scene in quick succession, and
| it would be fairly straightforward for Google to detect such
| groupings and apply the technique to produce synthesized pictures
| that are better than the originals. It already does something
| similar for photo collages and "best in a series of rapid shots."
| They surface without my having to do anything.
| thesuavefactor wrote:
| Every picture is a picture from the past though
| ekianjo wrote:
| Not the pictures where you age people artificially
| jawns wrote:
| Philosophically, yes. But some photo-editing techniques rely
| on data that is not backfillable and must be recorded at
| capture time. And even in cases where there is no functional
| impediment to applying it against historical photos,
| sometimes there is product gatekeeping to contend with.
| royaltheartist wrote:
| Oh yeah, what about this old Kodak I found in my grandpa's
| attic that prints pictures showing how people are going to
| die?
| chii wrote:
| But how did you know it wasn't a coincidence that the
| picture depicted a similar scene in the past?
| parineum wrote:
| Here's a picture of me in the future.
| miohtama wrote:
| John Titor, is that you?
| ortusdux wrote:
| No, it's Mitch Hedberg.
| thejazzman wrote:
| I had an ant farm. They didn't grow shit!
| drewbeck wrote:
| Where you get that camera at??
| amelius wrote:
| Every state machine is bound to cycle at some point, even if
| it has the size of the universe.
| flanked-evergl wrote:
| This is not true, it's trivial to design a state
| machine that won't cycle.
| amelius wrote:
| Sorry, forgot to add that it should be reversible, like
| the laws of physics.
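|
| A minimal sketch of the distinction drawn above, assuming a
| finite state space. The transition tables are made up for
| illustration. A reversible (one-to-one) step function must
| eventually revisit its starting state; a finite but
| irreversible one still repeats some state, though not
| necessarily the first one.

```python
def first_repeated_state(step, start):
    """Follow step() from start and return the first state seen twice."""
    seen = set()
    state = start
    while state not in seen:
        seen.add(state)
        state = step(state)
    return state


# Hypothetical reversible machine: a permutation of the states 0..9.
PERMUTATION = [3, 0, 4, 1, 9, 2, 7, 5, 6, 8]


def reversible_step(state):
    return PERMUTATION[state]


# Hypothetical irreversible machine: a counter that saturates at 9,
# so states 8 and 9 both map to 9 and the past cannot be recovered.
def saturating_step(state):
    return min(state + 1, 9)


# Reversible: the first repeated state is always the starting state.
print(first_repeated_state(reversible_step, 0))
print(first_repeated_state(reversible_step, 2))

# Irreversible: the machine gets stuck at 9 and never returns to 0.
print(first_repeated_state(saturating_step, 0))
```

| A counter over unbounded integers never repeats at all, which
| is the case the parent comment has in mind.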
| makapuf wrote:
| Every _existing_ picture is.
| positus wrote:
| If it hasn't been taken/made/captured yet, it isn't a
| picture. It's just the potential for one.
| BoppreH wrote:
| That's exactly why I've been keeping all "duplicates" in my
| photo collections.
|
| They do take up a lot of space, and just today I asked in
| photo.stackexchange for backup compression techniques that can
| exploit inter-image similarities:
| https://photo.stackexchange.com/questions/132609/backup-comp...
| syntaxfree wrote:
| Suggestion: stack the images vertically or horizontally.
| Frequency spectrum compression schemes like JPG will see the
| similarity in the fine details.
| bayesianbot wrote:
| I got really good compression using this technique with
| JPEG XL, I'm sure there's even a good reason why it works
| so well but it's been a long time and I don't seem to
| remember why.
| bondarchuk wrote:
| > _in the fine details_
|
| Could it be possible that jpg also exploits the repetition
| at the wavelength of the width of a single picture, so to
| say? E.g. 4 pictures side-by-side with the same black dot
| in the center, can all 4 dots be encoded with a single sine
| wave (simplifying a lot here..) that has peaks at each dot?
| bick_nyers wrote:
| Tiled/stacked approach as others mention is good, and
| probably the best approach. Could also try doing an
| uncompressed format (even just .png uncompressed) or
| something simple like RLE then 7zip them together since 7zip
| is the only archive format that does inter-file (as opposed
| to intra-file) compression as far as I am aware.
|
| Unfortunately lossless video compression won't help here as
| it will compress frames individually for lossless.
| adrianN wrote:
| Inter file compression has been solved ever since tar|gz
| beagle3 wrote:
| Not so. Gzip's window is very small - 32K in the original
| gzip iirc, which meant even identical copies of a 33KB
| file would not help each other.
|
| Iirc it was Bzip2 that bumped that up to 1MB, and there
| are now compressors with larger windows - but files have
| also grown, it's not a solved problem for compression
| utilities.
|
| It is solved for backup - bup, restic, and a few others
| will do that across a backup set with no "window size"
| limit.
|
| .... And all of that is only true for lossless, which
| does not include images or video.
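|
| A minimal sketch of the window-size point above, using only
| Python's standard zlib and lzma modules. The 40 KiB random
| blob is a made-up stand-in for a file, not real photo data.
| DEFLATE can only refer back 32 KiB, so a second copy of the
| blob gains nothing, while LZMA's much larger default
| dictionary lets the second copy be stored almost for free.

```python
import lzma
import random
import zlib

# Hypothetical "file": 40 KiB of seeded pseudo-random bytes, which is
# incompressible on its own and larger than DEFLATE's 32 KiB window.
rng = random.Random(0)
blob = bytes(rng.getrandbits(8) for _ in range(40 * 1024))
two_copies = blob + blob

# zlib implements DEFLATE, the algorithm used by gzip. Its matches can
# reach back at most 32 KiB, so the second copy cannot reference the first.
deflate_size = len(zlib.compress(two_copies, 9))

# lzma (xz container, default preset) uses a multi-megabyte dictionary,
# so the second copy is encoded as back-references to the first.
lzma_size = len(lzma.compress(two_copies))

print(f"one copy:   {len(blob)} bytes")
print(f"two copies: {len(two_copies)} bytes")
print(f"deflate:    {deflate_size} bytes")
print(f"lzma:       {lzma_size} bytes")
```

| As the comment says, this only covers lossless byte-level
| redundancy. Two separately encoded photos of the same scene
| are unlikely to share long identical byte runs.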
| danielheath wrote:
| Not even remotely an efficient scheme for images or
| video.
| tehsauce wrote:
| That's for lossless compression, I think there are special
| opportunities for multi-image lossy
| RockRobotRock wrote:
| Stupid question. Would a block based deduplicating file
| system solve this?
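|
| A rough sketch of what fixed-size block deduplication can and
| cannot match, assuming a hypothetical 4 KiB block size and
| SHA-256 block hashes (real file systems differ in detail).
| Byte-identical, aligned blocks deduplicate fully. The same
| data shifted by one byte shares no blocks at all, which
| suggests that similar-looking but separately encoded photos
| would likely not benefit.

```python
import hashlib
import random

BLOCK_SIZE = 4096  # hypothetical block size, chosen for illustration


def block_hashes(data):
    """Hash each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def shared_fraction(stored, incoming):
    """Fraction of incoming's blocks already present among stored's blocks."""
    known = set(block_hashes(stored))
    incoming_hashes = block_hashes(incoming)
    return sum(h in known for h in incoming_hashes) / len(incoming_hashes)


# Made-up stand-in for a file: 64 KiB of seeded pseudo-random bytes.
rng = random.Random(1)
original = bytes(rng.getrandbits(8) for _ in range(64 * 1024))

exact_copy = bytes(original)     # true duplicate
shifted = b"\x00" + original     # same content, offset by one byte

print(shared_fraction(original, exact_copy))  # every block matches
print(shared_fraction(original, shifted))     # no block matches
```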
| randyrand wrote:
| Most duplicates are from the same vantage point. These are
| not. I.e. you don't need to keep them all.
| beagle3 wrote:
| Those have been used for denouncing and super resolution
| for 30 years now - they are not useless. And storage is
| cheap, just keep them all.
| beagle3 wrote:
| That was supposed to be denoising, not denouncing, DYAC.
| Just noticed, too late to Edit Now.
| fenomas wrote:
| > ..fairly straightforward for Google to detect such groupings
| and apply the technique to produce synthesized pictures that
| are better than the originals.
|
| Wouldn't an operation like this require some kind of fine-
| tuning? Or do diffusion models have a way of using images as
| context, the way one would provide context to an LLM?
| sangnoir wrote:
| I think simpler algorithms (e.g. image histograms) can get
| you a long way. Regardless of the mechanism, Google Photos
| already has the capability to detect similar images, which is
| used to generate animated gifs.
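|
| A minimal sketch of the histogram idea mentioned above, in
| plain Python with no image library. The "images" are made-up
| flat lists of 8-bit grayscale values, and histogram
| intersection is just one common similarity measure. Nothing
| here is claimed to be what Google Photos actually does.

```python
def histogram(pixels, bins=16):
    """Normalized brightness histogram of 8-bit grayscale pixel values."""
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // 256] += 1
    total = len(pixels)
    return [count / total for count in counts]


def intersection(hist_a, hist_b):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


# Hypothetical toy data: two shots of the same scene with slightly
# different exposure, plus one unrelated mid-gray image.
shot_a = [10] * 500 + [200] * 500
shot_b = [12] * 480 + [205] * 520
unrelated = [120] * 1000

similar_score = intersection(histogram(shot_a), histogram(shot_b))
different_score = intersection(histogram(shot_a), histogram(unrelated))

print(f"same scene: {similar_score:.2f}")
print(f"unrelated:  {different_score:.2f}")
```

| Histograms ignore where pixels are, so a grouping step built
| this way would probably also want timestamps or a spatial
| check before treating two photos as the same scene.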
| Workaccount2 wrote:
| Google might as well just be making up tech considering none of
| this stuff ever gets released.
| Grazester wrote:
| Ehh, this stuff gets put to use on their Pixel phones
| js4ever wrote:
| Agreed, I also suspect this. Since they don't release anything
| most of their "fantastic" papers are probably just BS made to
| let people think they are still relevant
| datameta wrote:
| I think using allusions to realism with AI is a dangerous road to
| start out on.
| henriquez wrote:
| Hasn't something like this been around for a year or so to
| "decensor" hentai pics?
| drcode wrote:
| When will they re-release all the old Star Trek TV shows in 1080p
| resolution and 16:9 aspect ratio?
| ShakataGaNai wrote:
| There are already applications like
| https://www.topazlabs.com/topaz-video-ai and
| https://tensorpix.ai/ -- So it doesn't seem unreasonable that
| some of these deep learning models could upscale all these old
| TV episodes to at least 4k.
|
| I'd love to see a combo of this Google tech and AI upscaling do
| the same for Babylon 5. They had shot the actors in widescreen
| format, but the CGI spaceships were only rendered in 4:3 and
| the files have been lost.
| dragonwriter wrote:
| This requires other pictures of the environment to use to infer
| what should fill in the gaps, which will not exist for every
| shot in those series. (TOS and TNG were already rereleased in
| 1080p, though.) I suppose you could use outpainting to
| _construct_ the rest of the scene in one frame, and use that as
| the reference for other frames in the same shot.
| pbhjpbhj wrote:
| A lot of the shots are on the same set, so you'd want the
| system to use a whole series (season) as samples.
| andrewprock wrote:
| I suspect this will do a pretty good job at defeating watermarks.
| gromneer wrote:
| So "computer, enhance!" is now real?
___________________________________________________________________
(page generated 2023-09-30 23:01 UTC)