[HN Gopher] Alias-Free GAN
___________________________________________________________________
Alias-Free GAN
Author : nurpax
Score : 249 points
Date : 2021-06-23 16:26 UTC (6 hours ago)
(HTM) web link (nvlabs.github.io)
(TXT) w3m dump (nvlabs.github.io)
| minimaxir wrote:
| The first two demo videos are interesting examples of using
| StyleCLIP's global directions to guide an image toward a "smiling
| face" as noted in that paper with smooth interpolation:
| https://github.com/orpatashnik/StyleCLIP
|
| I ran a few chaotic experiments with StyleCLIP a few months
| ago that would work _very_ well with smooth interpolation:
| https://minimaxir.com/2021/04/styleclip/
| Chilinot wrote:
| That first picture of mark zuckerberg smiling is just straight
| up cursed. Interesting write up though.
| evo wrote:
| I wonder if there are learnings from this that could be
| transposed into the 1-D domain for audio; as far as I know,
| aliasing is a frequent challenge when using deep learning methods
| for audio (e.g. simulating non-linear circuits for guitar amps).
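The aliasing mentioned above is easy to demonstrate in 1-D, and the standard audio-DSP fix (oversample, apply the nonlinearity, band-limit, decimate) is the same idea the paper applies to image-space layers. A toy numpy sketch of my own, not from the paper: a tanh waveshaper driven near Nyquist folds its 3rd harmonic back into the band, and oversampling removes it.

```python
import numpy as np

def spectrum(x):
    """Normalized magnitude spectrum (bin k = k cycles per N samples)."""
    return np.abs(np.fft.rfft(x)) / len(x)

N = 1000
n = np.arange(N)
f0 = 0.4  # cycles/sample, close to the Nyquist limit of 0.5

# Naive: apply the nonlinearity at the base rate. tanh generates odd
# harmonics, so the 3rd harmonic (1.2 cycles/sample) folds back to 0.2.
naive = np.tanh(3 * np.sin(2 * np.pi * f0 * n))

# Anti-aliased: run the nonlinearity at 4x the rate, remove everything
# above the base-rate Nyquist with an ideal (brickwall) filter, decimate.
os = 4
n_hi = np.arange(N * os) / os
hi = np.tanh(3 * np.sin(2 * np.pi * f0 * n_hi))
H = np.fft.fft(hi)
f_hi = np.fft.fftfreq(N * os, d=1 / os)  # in cycles per base-rate sample
H[np.abs(f_hi) >= 0.5] = 0
clean = np.fft.ifft(H).real[::os]

alias_bin, fund_bin = 200, 400  # 0.2 and 0.4 cycles/sample
print(spectrum(naive)[alias_bin])  # substantial aliased energy
print(spectrum(clean)[alias_bin])  # essentially none
```

A brickwall FFT filter is fine for a demo; a real-time amp simulator would use a causal lowpass, but the aliasing mechanism is the same.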
| Lichtso wrote:
| The previous approaches learned screen-space-textures for
| different features and a feature mask to compose them.
|
| Now it seems to actually learn the topology lines of the human
| face [0], as 3D artists would learn them [1] when they study
| anatomy. It also uses quad grids and even places the edge loops
| and poles in similar places.
|
| [0] https://nvlabs-fi-cdn.nvidia.com/_web/alias-free-
| gan/img/ali... [1]
| https://i.pinimg.com/originals/6b/9a/0c/6b9a0c2d108b2be75bf7...
| jerf wrote:
| That's starting to be high enough quality that you could start
| considering using that for some Hollywood-grade special effects.
| That beach morph stuff is pretty impressive. Faces, perhaps not
| quite there yet because we are _so_ hyper-focused on those
| biologically, but you could make one heck of a drug trip scene or
| a Doctor Strange-esque scene with much less effort with some of
| those techniques, effort perhaps even getting down to the range
| of Youtuber videos in the near enough future.
| datameta wrote:
| Wow! The rate of progress is truly stunning. I wonder what Refik
| Anadol could create with this technique.
| Gimpei wrote:
| Those are some creepy pictures! It's like a photo of the demon
| inside.
| benrbray wrote:
| Interesting that this method makes use of Equivariant Neural
| Networks. Taco Cohen recently published his PhD thesis [1], which
| combines a dozen or so papers he authored on the topic.
|
| [1]: https://pure.uva.nl/ws/files/60770359/Thesis.pdf
| fogof wrote:
| You can see what they're saying about the fixed-in-place features
| with the beards in the first video, but StyleGAN gets the teeth
| symmetry right whereas this work seems to have trouble with it.
| Why don't the teeth in the StyleGAN slide around like the beard
| does?
| Geee wrote:
| In video 9, the teeth are sliding.
| minimaxir wrote:
| That's likely the GANSpace/SeFa part of the manipulation.
|
| > In a further test we created two example cinemagraphs that
| mimic small-scale head movement and facial animation in FFHQ.
| The geometric head motion was generated as a random latent
| space walk along hand-picked directions from GANSpace [24] and
| SeFa [50]. The changes in expression were realized by applying
| the "global directions" method of StyleCLIP [45], using the
| prompts "angry face", "laughing face", "kissing face", "sad
| face", "singing face", and "surprised face". The differences
| between StyleGAN2 and Alias-Free GAN are again very prominent,
| with the former displaying jarring sticking of facial hair and
| skin texture, even under subtle movements
| Imnimo wrote:
| I wonder if you could make the noise inputs work again by using
| the same process as for the latent code - generate the noise in
| the frequency domain, and apply the same shift and careful
| downsampling. If you apply the same shift to the noise as to the
| latent code, then maybe the whole thing will still be
| equivariant? In other words, it seems like the problem with the
| per-pixel noise inputs is that they stay stationary while the
| latent is shifted, so just shift them also!
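Shifting the noise along with the latent requires translating a sampled signal by fractional amounts, which can be done exactly for band-limited signals via the Fourier shift theorem. A minimal 1-D sketch (my own illustration, not the paper's code):

```python
import numpy as np

def subpixel_shift(x, d):
    """Translate a band-limited 1-D signal by d samples (d may be
    fractional) by rotating phases in the frequency domain."""
    f = np.fft.fftfreq(len(x))  # frequencies in cycles/sample
    return np.fft.ifft(np.fft.fft(x) * np.exp(-2j * np.pi * f * d)).real

rng = np.random.default_rng(0)
noise = rng.standard_normal(63)  # odd length avoids the ambiguous Nyquist bin

# Integer shifts reduce to a plain circular roll...
print(np.allclose(subpixel_shift(noise, 3), np.roll(noise, 3)))  # True
# ...and fractional shifts compose exactly: two half-sample shifts
# equal one full-sample shift, the equivariance property in question.
half_twice = subpixel_shift(subpixel_shift(noise, 0.5), 0.5)
print(np.allclose(half_twice, np.roll(noise, 1)))  # True
```

Whether shifted per-pixel noise would actually keep the whole generator equivariant is the open question in the comment; this only shows the shift operation itself is well-defined.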
| Bjartr wrote:
| That beach interpolation is begging for a music video
| goldemerald wrote:
| After StyleGAN2 came out, I couldn't imagine what improvements
| could be made over it. This work is truly impressive.
|
| The comparisons are illuminating: StyleGAN2's mapping of texture
| to specific pixel location looks very similar to poorly
| implemented video-game textures. Perhaps future GAN improvements
| could come from tricks used in non-AI graphic development.
| tyingq wrote:
| >I couldn't imagine what improvements could be made over it
|
| Still has the telltale sign of mismatched ears and/or
| earrings. This seems the most reliable way to recognize them.
| Well, that and the nondescript background.
| sbierwagen wrote:
| Teeth too. Partially covered objects in 3D space have been
| hard for a GAN to figure out. (See also hands)
|
| I wonder what dataset you could even use to tell a GAN about
| human internals. 3D renders of a skull with various layers
| removed?
| mzs wrote:
| Mismatched reflections across eyes is the dead give-away for
| me.
| russdpale wrote:
| Great work!
| ansk wrote:
| This group of researchers consistently demonstrates a degree of
| empirical rigor that is unmatched across any other ML lab in
| industry or academia - remarkable empirical results as always,
| reproducible experiments, open-source and well-engineered
| codebase, and valuable insights about low-level learning dynamics
| and high-level emergent artifacts. Applied ML wouldn't have such
| a bad rap if more researchers held themselves to similar
| standards.
| sillysaurusx wrote:
| This isn't true. I do ML every day. You are mistaken.
|
| I click the website. I search "model". I see two results. Oh
| no, that means no download link to model.
|
| I go to the github. Maybe model download link is there. I see
| _zero code_: https://github.com/NVlabs/alias-free-gan
|
| Zero code. Zero model.
|
| You, and everyone like you, who are gushing with praise and
| hypnotized by pretty images and a nice-looking pdf, are doing
| damage by saying that this is correct and normal.
|
| The thing that's useful to me, first and foremost, is a
| _model_. Code alone isn't useful.
|
| Code, however, is the recipe to create the model. It might take
| 400 hours on a V100, and it might not actually result in the
| model being created, but it _slightly_ helps me.
|
| There is no code here.
|
| Do you think that the pdf is helpful? Yeah, maybe. But I'm
| starting to suspect that the pdf is in fact a _tech demo for
| nVidia_, not a scientific contribution whose purpose is to be
| helpful to people like me.
|
| Okay? Model first. Code second. Paper third.
|
| Every time a tech demo like this comes out, I'd like you to
| check that those things exist, in that order. If they don't,
| it's not reproducible science. It's a tech demo.
|
| I need to write something about this somewhere, because a large
| number of people seem to be caught in this spell. You're
| definitely not alone, and I'm sorry for sounding like I was
| singling you out. I just loaded up the comment section, saw
| your comment, thought "Oh, awesome!" clicked through, and went
| "Oh no..."
| ansk wrote:
| You're clearly disillusioned with the general accessibility
| of ML research, but I don't think your cynicism is warranted
| here. Take a look at their prior works[1], and I think you'll
| agree they go above and beyond in making their work
| accessible and reproducible. There is no reason to doubt the
| open-source release of this work will be any different. As to
| why the release is delayed, I'd speculate it's because they
| put a significant amount of additional work into releases and
| because releasing code in a large corporation is a
| bureaucratic hassle.
|
| [1] https://nvlabs.github.io/stylegan2/versions.html
| varispeed wrote:
| When I was writing a paper, I had to include all source code
| in a publishable state; otherwise it wouldn't have been
| accepted. I guess today the bar is much lower.
| sillysaurusx wrote:
| _There is no reason to doubt the open-source release of
| this work will be any different._
|
| Then this is _not_ a scientific contribution yet.
|
| We must wait and see.
|
| The most important tenet of science is to doubt. I didn't
| even read the name on the paper before I wrote my comment.
| Yes, I know this group. They're why I got into ML, along
| with the group from OpenAI who published GPT-2. Because A+
| science.
|
| Their claims here are likely wrong unless and until proven
| otherwise. This isn't a hardline position. It's been my
| experience across many codebases, during my two years of
| trying to reproduce many ideas.
|
| I agree that that is an example of A+ science. But why do
| you think they're publishing this now, today? Either because
| of a conference deadline or because of nVidia pressure. Neither of
| those are related to helping me achieve the scientific
| method: reproducing the idea in the paper, to verify their
| claims.
|
| All I can do is kind of try to reverse engineer some vague
| claims in a pdf, without those things.
|
| --
|
| Let me tell you a little bit about my job, because my time
| with my job may soon come to an end. I think that might
| clear up some confusion.
|
| My job, as an ML researcher, is to learn techniques that
| may or may not be true, combine them in novel ways, and
| present results to others.
|
| Knowledge, Contribution, Presentation, in that order.
|
| The first step is to obtain knowledge. Let's set aside the
| question of why, because _why_ is a question for me
| personally, which is unrelated.
|
| Scientific knowledge comes when Knowledge, Contribution,
| and Presentation are all achieved in a rigorous way. The
| rigor allows people like me to verify that I have
| knowledge. Without this, I have _mistaken knowledge_,
| which is worse than useless. It's an illusion - I'm fooling
| myself.
|
| When I got into ML two years ago, I thought that knowledge
| would come from reading scientific papers. I was wrong.
|
| Most papers are wrong. That's been my experience for the
| past two years. My experience may be wrong. Maybe others
| obtain rigorous scientific knowledge through the paper
| alone.
|
| But researchers happen to obtain a dangerous thing:
| _prestige_. Unfortunately, prestige doesn't come from
| helping others obtain knowledge. It comes from that last
| step -- presentation.
|
| The presentation on this thread is excellent. It's another
| Karras release. I agree; there's no reason to doubt they'll
| be just as rigorous with this release as they are with
| stylegan2.
|
| But knowledge doesn't come from presentation. Only
| prestige.
|
| Prestige makes a lot of new researchers try very hard to
| obtain the wrong things.
|
| If all of these were small concerns, or curious quirks,
| they'd be a footnote in my field guide. But I submit that
| these things are _front and center_ to the current state of
| affairs in 2021. Every time a release like this happens, it
| generates a lot of fanfare and we come together in
| celebration because ML Is Happening, Yay!
|
| And then I try to obtain the Knowledge in the fanfare, and
| discover that either it's absent or mistaken. Because there
| are no tools for me to verify their claims -- and when I
| do, I often see that they don't work!
|
| That's right. I kept finding out that these things being
| claimed, just aren't true. No matter how enticing the claim
| is, or whether it sounds like "Foobars are Aligned in the
| Convolution Digit," the claim, from where I was sitting,
| seemed to be wrong. It contained _mistaken knowledge_ --
| worse than useless.
|
| Unfortunately, two years with no salary takes a toll. I
| could spend another few years doing this if I wanted to.
| But I wound up so disgusted with discovering that we're all
| just chasing prestige, not knowledge, that I'd rather ship
| production-grade software for the world's most boring
| commercial work, as long as the work seems useful and the
| team seems interesting. Because at least I'd be doing
| something useful.
| phreeza wrote:
| Expecting fully executable code to accompany every
| publication is kind of unique to the modern ML research
| scene. As someone from a very different computational
| research field, where zero code is the norm, not the
| exception, this reads as a somewhat entitled rant.
| Reimplementation of a paper is actually a test of the
| robustness of the results. If you download the code of a
| previous paper, there may be some assumptions hidden in
| the implementation that aren't obvious. So I would argue
| that simply downloading and re-executing the author's
| implementation does _not_ constitute reproducible
| research. I know it is costly, but for actual
| reproduction, reimplementation is needed.
| xksteven wrote:
| Sometimes reimplementation is impossible without the code,
| and the paper goes on to win awards because it's by a
| famous scientist. Then, if the reimplementation doesn't
| work, most of the time the graduate students are blamed
| instead of the original work.
|
| There are always assumptions. At least with public code
| and models those assumptions are laid bare for all to see
| and potentially expose any bad assumptions.
| sillysaurusx wrote:
| I wasn't sure whether to post my edit as a separate
| comment or not, but I significantly expanded my comment
| just now in a way that helps explain my position.
|
| I'd be very interested in your thoughts on that position,
| because if it's mistaken, I shouldn't be saying it. It
| represents whatever small contribution I can make to
| fellow new ML researchers, which is roughly: _"watch
| out."_
|
| In short, for two years, I kept trying to implement
| stated claims -- to reproduce them in exactly the way you
| say here -- and they simply didn't work as stated.
|
| It might sound confusing that the claims were "simply
| wrong" or "didn't work." But every time I tried,
| achieving anything remotely close to "success" was _the
| exception_, not the norm.
|
| And I don't think it was because I failed to implement
| what they were saying in the paper. I agree that that's
| the most likely thing. But I was careful. It's very easy
| to make mistakes, and I tried to make none, as both
| someone with over a decade of experience
| (https://shawnpresser.blogspot.com/) and someone who
| cares deeply about the things I'm talking about here.
|
| It takes hard work to reproduce the technique the way
| you're saying. I put all my heart and soul into trying
| to. And I kept getting dismayed, because people kept
| trying to convince me of things that either I couldn't
| verify (because verification is extremely hard, as you
| well know) or were simply wrong.
|
| So if I sound entitled, I agree. When I got into this
| job, as an ML researcher, I thought I was entitled to the
| scientific method. Or anything vaguely resembling
| "careful, distilled, _correct_ knowledge that I can build
| on."
| phreeza wrote:
| I get your frustrations with this state of affairs, but
| for the reasons I mentioned above, I don't think
| providing the model and code is a panacea here. Maybe the
| last few years have also set an unrealistic expectation
| for the pace of progress. In my (former) field of
| theoretical neuroscience, if a paper was not
| reproducible, this knowledge kind of slowly diffused
| through the community, mostly through informal
| conversations with people who tried to reproduce or
| extend a given approach. But this takes several years,
| not the kind of timescale that modern ML research
| operates on.
|
| Fwiw I think actual knowledge is there in the ML
| literature, but it's not in these Benchmark-chasing
| highly tuned papers. It's more high level stuff, like
| basic architecture building blocks etc. GANs and
| Transformers for example. They undeniably work, and the
| knowledge needed to implement them can probably be
| conveyed in a few pages maximum. No need for an
| implementation to be provided by the author, really.
| skybrian wrote:
| I have no particular expertise here, but I wonder if
| you've learned to accept a mostly-broken process? We have
| the Internet, so why settle for slow diffusion over years
| instead of rapid communication?
|
| Why should graduate students have to spend years trying
| to reproduce stuff that turns out to be no good? Nobody
| should have to put up with getting their time wasted like
| that.
| msapaydin wrote:
| I think that not being able to reproduce the results
| claimed in a paper is not specific to ML research. While
| working as a post-doc at a top university research lab, I
| spent years trying to understand how it could be that some
| software that was supposed to correspond to a well-cited
| paper did not even come close to reproducing the results
| of said paper, and that the primary author went on to
| become a prof at a top university in the US. In short,
| scientific fraud is also quite common in academic papers.
| sillysaurusx wrote:
| Thank you!!
|
| _I spent years trying to understand how it could be that
| some software that was supposed to correspond to the
| well-cited paper did not even come close to reproducing
| the results of said paper_
|
| This was my exact experience. I didn't understand why I
| kept having it, and kept blaming myself for not being
| careful enough. My code _must_ be wrong, or the data, or
| _something_.
|
| Nah. It was the idea.
|
| Kept feeling like a kick in the gut, until here we are
| today, when I'm warning everyone that Karras, of all
| people, might publish such a thing.
|
| I really appreciate that you posted this, because I'm so
| happy I wasn't alone in the feeling of "what's going on,
| here...?"
| fossuser wrote:
| That seems like a worthwhile thing to publicize in and of
| itself?
|
| The replication crisis in psychology threw out 50% or so
| of supposed scientific results.
|
| If this (or just straight fraud) is common elsewhere, it
| seems like knowing about that would be a good thing for
| science.
| nerdponx wrote:
| It's not an entitled rant, other fields just have
| dismally low standards.
| varispeed wrote:
| I am not into ML, but from time to time I like to look how
| this is made and remember only once seeing the code and a
| model, which I thought was exception from the "norm". Good
| that more people are calling this out!
| minimaxir wrote:
| The repo says the code will be available in September: that's
| a reasonable timeframe for the necessary polish/legal
| clearance.
| sillysaurusx wrote:
| (Agreed, fwiw. What's going on here isn't a criticism of
| this work specifically, but of the general trend of everyone
| thinking that this is science. For example, it's true the
| code is coming in September. And, you and I both know it's
| probably gonna have a model release, just because it's more
| impressive, big-name Karras nVidia work. But it might not
| have a model release. I give that at least 40% odds. If it
| doesn't, then everything I said above will be true about
| that too, Karras or not. People _keep doing that_, and we
| have to call out that this is approximately useless for you
| and me. Actually, I was going to say maybe it's useful for
| you, but you're the language model hacker and I'm the GAN
| hacker, and I assure you, code alone is useless for me. If
| it's useful to you, I would love to know how it helps you
| verify the scientific method.)
| aaron-santos wrote:
| Thank you for calling this out. It's critically important
| that people understand the difference between model, code,
| and paper and what they mean.
|
| It's also important that people understand that even if code
| is provided, it's commercially useless. From the NVAE license
| as an example[1]
|
| > The Work and any derivative works thereof only may be used
| or intended for use non-commercially.
|
| It's a great example of the difference between open source
| (which it is) and free software, which it is not. So we're
| back to square one, where it is probably best to clean-room
| the implementation from the paper, which is nearly useless
| for reproducing the model.
|
| [1] https://github.com/NVlabs/NVAE/blob/master/LICENSE
| sillysaurusx wrote:
| Unfortunately, I must call you out too, my friend. With
| love.
|
| Because it's crucially important that we protect the
| scientific method here.
|
| The sole goal is to help people like me reproduce the
| model. If I can't reproduce the model, I can't verify the
| paper.
|
| When I saw "commercial" and then "open source" in your
| comment, I said "oh no..."
|
| My duty is to the scientific method, so I don't care if
| it's the most restrictive code on the planet as long as I
| can use it to reproduce the model in the paper.
|
| Because at that point, I have a baseline for _evaluating
| the paper's claims_.
|
| The reason I assume the paper is false until proven
| otherwise is that the paper often doesn't have enough
| detail to reproduce the model shown in the videos of
| these tech demos. Meaning, if they're publishing it to
| help me, the ML researcher, then they're failing to tell
| me how to evaluate their claims rigorously.
|
| (That said, it's breaking my heart that I can't agree with
| you here, because _I want to so badly_. I've felt similarly
| for years that scientific contributions need to be "free as
| in beer" commercially. But I recognize signs of zealotry
| when I see them, and I can't let my personal views creep
| in, because people like me would stop listening if I was
| here e.g. arguing vehemently that nVidia needed to be
| delivering us something commercially viable along with a
| high quality codebase. The price for entry to the
| scientific method isn't so high.)
| aaron-santos wrote:
| That's fine, at least you're open about the knowledge for
| knowledge's sake position. There's more than one way to
| judge something.
| isoprophlex wrote:
| If ReLU-introduced high-frequency components are indeed the
| culprit, won't using a "softened" ReLU (without a
| discontinuity in the derivative at 0) everywhere solve the
| problem, too?
| ChuckNorris89 wrote:
| I expect this work will feed back into removing the aliasing
| artifacts you sometimes get when using DLSS in games.
___________________________________________________________________
(page generated 2021-06-23 23:00 UTC)