[HN Gopher] Visual Anagrams: Generating optical illusions with d...
___________________________________________________________________
Visual Anagrams: Generating optical illusions with diffusion models
Author : beefman
Score : 292 points
Date : 2023-11-30 18:39 UTC (4 hours ago)
(HTM) web link (dangeng.github.io)
| minimaxir wrote:
| Note that this technique and its results are unrelated to the
| infamous "spiral" ControlNet images a couple months back:
| https://arstechnica.com/information-technology/2023/09/dream...
|
| Per the code, the technique is based off of DeepFloyd-IF, which
| is not as easy to run as a Stable Diffusion variant.
| SamBam wrote:
| I missed it, what was infamous about it?
| minimaxir wrote:
| It created a backlash for two reasons: a) it was too popular,
| with AI people hyping "THIS CHANGES EVERYTHING!" and posting
| low-effort transformations until the format was saturated,
| and b) non-AI people were "tricked" into thinking it was a
| clever trick with real art, since ControlNet is not widely
| known outside the AI-sphere, and they got mad.
| andybak wrote:
| I rather liked it and actually didn't get to see as many
| examples as I wanted to.
|
| Is there a good repository anywhere or is it just "wade
| through twitter"?
| Der_Einzige wrote:
| I always thought it was weird that this idea took off with that
| particular ControlNet model. Many other ControlNet models, when
| combined with those same images, produce excellent and striking
| results.
|
| The ecosystem around Stable Diffusion in general is so massive.
| minimaxir wrote:
| Other ControlNet adapters either don't preserve the high-level
| shape enough or preserve it _too_ well, IMO. Canny/Depth
| ControlNet generations are less of an illusion.
| ShamelessC wrote:
| > Per the code, the technique is based off of DeepFloyd-IF,
| which is not as easy to run as a Stable Diffusion variant.
|
| I haven't dug in yet, but it _should_ be possible to use their
| ideas in other diffusion networks? It may be a non-trivial
| change to the code provided though. Happy to be corrected of
| course.
| minimaxir wrote:
| I suspect the trick only works because DeepFloyd-IF operates
| in pixel space while other diffusion models operate in the
| latent space.
|
| > Our method uses DeepFloyd IF, a pixel-based diffusion
| model. We do not use Stable Diffusion because latent
| diffusion models cause artifacts in illusions (see our paper
| for more details).
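|
| A toy sketch of one plausible reason for the pixel-vs-latent
| difference, assuming each latent value decodes to a small patch
| of pixels. The `PATCHES` table and `decode` function are
| hypothetical stand-ins, not DeepFloyd IF or Stable Diffusion
| code. Flipping the latent moves patches around but does not
| flip the pixels inside each patch, so it does not match a flip
| done directly on the pixels:

```python
# Hypothetical toy "decoder": each latent code expands to a fixed 2-pixel patch.
# This is an illustration only, not any real model's decoder.
PATCHES = {"a": [1, 2], "b": [3, 4]}


def decode(latent):
    """Expand a list of latent codes into a flat list of pixels."""
    pixels = []
    for code in latent:
        pixels.extend(PATCHES[code])
    return pixels


def flip(seq):
    """Mirror a 1-D sequence (stands in for a horizontal image flip)."""
    return list(reversed(seq))


latent = ["a", "b"]

# Flip applied to the decoded pixels: every pixel is mirrored.
flip_in_pixel_space = flip(decode(latent))

# Flip applied to the latent: patches swap places, but each patch
# keeps its internal left-to-right order.
flip_in_latent_space = decode(flip(latent))

print(flip_in_pixel_space)   # [4, 3, 2, 1]
print(flip_in_latent_space)  # [3, 4, 1, 2]
```

| In this toy the two results agree only when every patch is
| itself mirror-symmetric; the paper referenced in the quote is
| the place to check the authors' actual explanation.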
| mg wrote:
| I had a similar idea early last year and also dabbled with a
| checkerboard approach.
|
| Here, a cat is made from 9 paintings of cats in the style of
| popular painters:
|
| https://twitter.com/marekgibney/status/1521500594577584141
|
| You might have to squint your eyes to see it.
|
| I made a few of them and then somehow lost interest.
| hammock wrote:
| That's really cool. Can you do 3x3x3? As in, 9x9 with 81 1-cell
| cats, 9 9-cell cats and 1 81-cell cat?
| jamilton wrote:
| I really like the man/woman inversion.
|
| I wonder how many permutations could legibly be generated in a
| single image with an extended version of the same technique. I
| don't understand the math, but would two orthogonal
| transformations in sequence still be an orthogonal transformation
| and thus work?
| xanderlewis wrote:
| I'm not sure whether 'orthogonal transformations' in this
| context refers to the usual orthogonal _linear_ transformations
| (/matrices), but if so, then yes: the composition of two
| orthogonal transformations is itself orthogonal.
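|
| A minimal check of that claim, assuming "orthogonal" means a
| matrix A with A^T A = I. The specific transformations here (a
| flip, a shuffle, and a sign inversion on a 4-element "image")
| are illustrative choices, not taken from the project's code:

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(row) for row in zip(*a)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def is_orthogonal(a):
    """A square matrix is orthogonal when A^T A equals the identity."""
    return matmul(transpose(a), a) == identity(len(a))


def permutation_matrix(perm):
    """Row i has a single 1 in column perm[i]; rearranges elements without changing them."""
    n = len(perm)
    return [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]


flip = permutation_matrix([3, 2, 1, 0])       # mirror the 4 elements
shuffle = permutation_matrix([2, 0, 3, 1])    # jigsaw-style rearrangement
negate = [[-x for x in row] for row in identity(4)]  # sign inversion

# Apply flip, then shuffle, then negate, as one combined matrix.
composed = matmul(negate, matmul(shuffle, flip))

print(is_orthogonal(flip), is_orthogonal(shuffle), is_orthogonal(negate))
print(is_orthogonal(composed))

# Counterexample: a stretch along one axis is not orthogonal.
stretch = [[2, 0], [0, 1]]
print(is_orthogonal(stretch))
```

| The general argument is one line: (AB)^T (AB) = B^T A^T A B =
| B^T B = I. Whether the combined image stays legible is a
| separate question that the math does not answer.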
| moritzwarhier wrote:
| This is wonderful.
| willsmith72 wrote:
| > This colab notebook requires a high RAM and V100 GPU runtime,
| available through Colab Pro.
|
| That's sad, I would've loved to try it.
| andybak wrote:
| Well, chuck $10 at it and spend the rest of your month trying
| other things.
|
| (Back in Disco Diffusion days I was happy to spend money on
| Colab Pro. It was fun)
| nomel wrote:
| I completely disagree. It's _fantastic_ that we can get access
| to this hardware for so cheap. A used V100 is $1300. You could
| pay for Colab Pro for _10 years_ with that, which will get you
| faster and faster hardware through the years. Where I am, a
| month costs about as much as two bags of chips.
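|
| The arithmetic behind the "10 years" figure, assuming the $10
| per month mentioned upthread for Colab Pro and a flat price
| over time (both are assumptions, not quoted pricing):

```python
used_v100_usd = 1300            # used V100 price quoted in the comment
colab_pro_usd_per_month = 10    # assumption: the "$10" mentioned upthread

months = used_v100_usd / colab_pro_usd_per_month
years = months / 12

print(f"{months:.0f} months, about {years:.1f} years")
```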
| hammock wrote:
| Do real-life jigsaw puzzles like the ones shown here exist for
| purchase?
| downboots wrote:
| Could the idea of puzzle piece rearrangement also extend to
| something similar to self-tiling tile sets?
| cloudyporpoise wrote:
| This may be one of the cooler things I've ever seen.
| Nition wrote:
| The duck/rabbit that rearranges would be really cool to use on
| one of those sliding puzzles. Two valid solutions!
___________________________________________________________________
(page generated 2023-11-30 23:00 UTC)