[HN Gopher] Identifying Stable Diffusion XL 1.0 images from VAE ...
___________________________________________________________________
Identifying Stable Diffusion XL 1.0 images from VAE artifacts
(2023)
Author : rcarmo
Score : 47 points
Date : 2024-04-05 16:38 UTC (6 hours ago)
(HTM) web link (hforsten.com)
(TXT) w3m dump (hforsten.com)
| blt wrote:
| (2023)
| dang wrote:
| Added. Thanks!
| GaggiX wrote:
| 8 months ago StabilityAI changed the default VAE from 1.0 to 0.9,
| so no one realistically uses VAE 1.0.
| wut42 wrote:
| And this article is nine months old.
| Dwedit wrote:
| I can see why, that VAE looks pretty bad.
| kken wrote:
| Generally, the VAE maps from a small latent space to a much
| larger image space. This means that there must be a large number
| of images for which no reverse mapping exists.
|
| It should be possible to identify images that have not been
| generated by the VAE, since they are not part of the set of
| images that the VAE can produce. The other way round is a bit
| more difficult, as there may be images that can be mapped to the
| latent space and back without loss but were generated in another
| way
|
| -> there may be false positives.
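The encode-and-decode roundtrip test described above can be sketched with a toy linear stand-in for the VAE. The real SDXL VAE is a deep convolutional network; the dimensions and names below are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear stand-in for a VAE: a map from a small latent space
# (dim 4) to a larger "image" space (dim 16). The real SDXL VAE is
# a deep convolutional network; these dimensions are illustrative.
LATENT, IMAGE = 4, 16
decode = rng.normal(size=(IMAGE, LATENT))   # latent -> image
encode = np.linalg.pinv(decode)             # image -> latent (pseudoinverse)

def roundtrip_error(image):
    """Reconstruction error after encode -> decode. Near zero only
    for images lying in the decoder's range."""
    return np.linalg.norm(decode @ (encode @ image) - image)

generated = decode @ rng.normal(size=LATENT)   # in the decoder's range
natural = rng.normal(size=IMAGE)               # almost surely not

print(roundtrip_error(generated))   # ~0 (machine precision)
print(roundtrip_error(natural))     # clearly nonzero
```

Images in the decoder's range survive the roundtrip losslessly; images outside it leave a measurable residual, which is exactly the asymmetry the comment points out.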
| chacham15 wrote:
| This logic has a key flaw: just the fact that the size of the
| space is different doesn't mean that every representable thing
| in the larger space is a thing we care about. E.g. a person
| with three hands may not have a representation in the smaller
| space, but we would never care about that. The actual question
| is: how does the amount of information lost going from a large
| image to the small latent space compare to the information lost
| going from a large image to a small image? If those two
| differences are close enough, reliably telling SD-generated
| images from real ones becomes near impossible.
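For scale, SDXL's VAE compresses a 1024x1024 RGB image to a 4-channel latent at 1/8 the spatial resolution, so the latent holds 48x fewer values (raw value counts only; latent values are floats rather than 8-bit pixels, so bit-for-bit information differs):

```python
# SDXL's VAE compresses a 1024x1024 RGB image (3 channels) to a
# latent with 4 channels at 1/8 the spatial resolution. These are
# raw value counts; latent values are floats, not 8-bit pixels.
image_values = 3 * 1024 * 1024          # 3,145,728 pixel values
latent_values = 4 * (1024 // 8) ** 2    # 65,536 latent values
print(image_values // latent_values)    # 48x fewer values in the latent
```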
| Onavo wrote:
| Yes, otherwise cryptographic hashes wouldn't work (they are not
| bijective)
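The pigeonhole argument behind this point is easy to demonstrate: a hash's output is fixed-size while its inputs are unbounded, so the map cannot be injective:

```python
import hashlib

# SHA-256 digests are always 32 bytes, while inputs can be any
# length, so by the pigeonhole principle the function cannot be
# bijective: collisions must exist, even if finding one is
# computationally infeasible.
short = hashlib.sha256(b"x").digest()
long = hashlib.sha256(b"x" * 10_000_000).digest()
print(len(short), len(long))   # 32 32
```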
| TrueDuality wrote:
| Very interesting! Well broken down and explained!
| tsycho wrote:
| Has anyone tried training a neural net to distinguish between
| photographed vs AI generated images?
|
| Of course, you will need to remove exifs or other metadata, but
| this sounds like the kind of domain that NNs are good at.
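A minimal sketch of such a classifier, using logistic regression on a single synthetic feature as a stand-in. Real detectors train deep CNNs on pixels; every number here is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Minimal stand-in for such a detector: logistic regression on one
# synthetic feature (imagine some artifact statistic). Real
# detectors train deep CNNs on pixels; all data here is synthetic.
n = 200
feature = np.concatenate([rng.normal(0.0, 1.0, n),    # "photo" class
                          rng.normal(2.0, 1.0, n)])   # "generated" class
label = np.concatenate([np.zeros(n), np.ones(n)])

w, b = 0.0, 0.0
for _ in range(500):                   # plain gradient descent
    p = 1 / (1 + np.exp(-(w * feature + b)))
    w -= 0.1 * np.mean((p - label) * feature)
    b -= 0.1 * np.mean(p - label)

pred = 1 / (1 + np.exp(-(w * feature + b))) > 0.5
print(np.mean(pred == label))   # well above chance on this toy data
```

The point of the toy: as long as generated images carry some consistent artifact statistic, even a trivial model separates the classes; the hard part is finding features that survive post-processing.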
| moofight wrote:
| Yes, and it's quite challenging. For instance:
| https://sightengine.com/detect-ai-generated-images
| HPsquared wrote:
| Isn't that basically how the image generation models themselves
| work? By refining the image until a classifier can't distinguish
| it from a "real image".
| jncfhnb wrote:
| That was trendy for a while but is no longer the primary
| method
| ok123456 wrote:
| Yes. Trying to sell an "AI detector" is a fool's errand, since
| this is how adversarial networks (the hot model from two hype
| cycles ago) are trained. The only use of the "AI detector" is
| to tune the generator so that the detector's output is uniformly
| random.
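The tuning loop described in the last comment can be sketched with a one-parameter toy "generator" and a threshold "detector". Purely illustrative; real adversarial training uses neural networks on both sides:

```python
import numpy as np

rng = np.random.default_rng(2)

# One-parameter toy of the adversarial loop: each round a threshold
# "detector" is fit, then the "generator" (just a mean) is nudged
# toward the detector's boundary. Purely illustrative.
real_mean, gen_mean = 0.0, 4.0

for _ in range(50):
    real = rng.normal(real_mean, 1.0, 500)
    fake = rng.normal(gen_mean, 1.0, 500)
    threshold = (real.mean() + fake.mean()) / 2   # best 1-D detector
    gen_mean -= 0.5 * (gen_mean - threshold)      # generator update

# With the distributions matched, the detector is reduced to chance.
real = rng.normal(real_mean, 1.0, 500)
fake = rng.normal(gen_mean, 1.0, 500)
threshold = (real.mean() + fake.mean()) / 2
acc = (np.mean(real < threshold) + np.mean(fake >= threshold)) / 2
print(round(acc, 2))   # close to 0.5 (uniformly random detector)
```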
___________________________________________________________________
(page generated 2024-04-05 23:01 UTC)