[HN Gopher] Alias-Free GAN
       ___________________________________________________________________
        
       Alias-Free GAN
        
       Author : nurpax
       Score  : 249 points
       Date   : 2021-06-23 16:26 UTC (6 hours ago)
        
 (HTM) web link (nvlabs.github.io)
 (TXT) w3m dump (nvlabs.github.io)
        
       | minimaxir wrote:
       | The first two demo videos are interesting examples of using
       | StyleCLIP's global directions to guide an image toward a "smiling
       | face" as noted in that paper with smooth interpolation:
       | https://github.com/orpatashnik/StyleCLIP
       | 
        | I ran a few chaotic experiments with StyleCLIP a few months
       | ago which would work _very_ well with smooth interpolation:
       | https://minimaxir.com/2021/04/styleclip/
        
         | Chilinot wrote:
          | That first picture of Mark Zuckerberg smiling is just straight
          | up cursed. Interesting write-up though.
        
       | evo wrote:
       | I wonder if there are learnings from this that could be
       | transposed into the 1-D domain for audio; as far as I know,
       | aliasing is a frequent challenge when using deep learning methods
       | for audio (e.g. simulating non-linear circuits for guitar amps).
        
       | Lichtso wrote:
       | The previous approaches learned screen-space-textures for
       | different features and a feature mask to compose them.
       | 
       | Now it seems to actually learn the topology lines of the human
       | face [0], as 3D artists would learn them [1] when they study
       | anatomy. It also uses quad grids and even places the edge loops
       | and poles in similar places.
       | 
       | [0] https://nvlabs-fi-cdn.nvidia.com/_web/alias-free-
       | gan/img/ali... [1]
       | https://i.pinimg.com/originals/6b/9a/0c/6b9a0c2d108b2be75bf7...
        
       | jerf wrote:
        | That's starting to be high enough quality that you could
        | consider using it for some Hollywood-grade special effects.
       | That beach morph stuff is pretty impressive. Faces, perhaps not
       | quite there yet because we are _so_ hyper-focused on those
       | biologically, but you could make one heck of a drug trip scene or
       | a Doctor Strange-esque scene with much less effort with some of
       | those techniques, effort perhaps even getting down to the range
        | of YouTuber videos in the near enough future.
        
       | datameta wrote:
       | Wow! The rate of progress is truly stunning. I wonder what Refik
       | Anadol could create with this technique.
        
       | Gimpei wrote:
       | Those are some creepy pictures! It's like a photo of the demon
       | inside.
        
       | benrbray wrote:
        | Interesting that this method makes use of Equivariant Neural
       | Networks. Taco Cohen recently published his PhD thesis [1], which
       | combines a dozen or so papers he authored on the topic.
       | 
       | [1]: https://pure.uva.nl/ws/files/60770359/Thesis.pdf
        
       | fogof wrote:
       | You can see what they're saying about the fixed in place features
       | with the beards in the first video, but StyleGAN gets the teeth
       | symmetry right whereas this work seems to have trouble with it.
       | Why don't the teeth in the StyleGAN slide around like the beard
       | does?
        
         | Geee wrote:
          | In video 9, the teeth are sliding.
        
         | minimaxir wrote:
         | That's likely the GANSpace/SeFa part of the manipulation.
         | 
         | > In a further test we created two example cinemagraphs that
         | mimic small-scale head movement and facial animation in FFHQ.
         | The geometric head motion was generated as a random latent
         | space walk along hand-picked directions from GANSpace [24] and
         | SeFa [50]. The changes in expression were realized by applying
         | the "global directions" method of StyleCLIP [45], using the
         | prompts "angry face", "laughing face", "kissing face", "sad
         | face", "singing face", and "surprised face". The differences
         | between StyleGAN2 and Alias-Free GAN are again very prominent,
         | with the former displaying jarring sticking of facial hair and
         | skin texture, even under subtle movements
        
       | Imnimo wrote:
       | I wonder if you could make the noise inputs work again by using
       | the same process as for the latent code - generate the noise in
       | the frequency domain, and apply the same shift and careful
       | downsampling. If you apply the same shift to the noise as to the
       | latent code, then maybe the whole thing will still be
       | equivariant? In other words, it seems like the problem with the
       | per-pixel noise inputs is that they stay stationary while the
       | latent is shifted, so just shift them also!
        
       | Bjartr wrote:
       | That beach interpolation is begging for a music video
        
       | goldemerald wrote:
        | After StyleGAN2 came out, I couldn't imagine what improvements
       | could be made over it. This work is truly impressive.
       | 
        | The comparisons are illuminating: StyleGAN2's mapping of texture
        | to specific pixel locations looks very similar to poorly
        | implemented video-game textures. Perhaps future GAN improvements
        | could come from tricks used in non-AI graphics development.
        
         | tyingq wrote:
          | >I couldn't imagine what improvements could be made over it
         | 
          | Still has the telltale mismatched ears and/or earrings. This
          | seems the most reliable way to recognize them. Well, that and
          | the nondescript background.
        
           | sbierwagen wrote:
           | Teeth too. Partially covered objects in 3D space have been
           | hard for a GAN to figure out. (See also hands)
           | 
           | I wonder what dataset you could even use to tell a GAN about
           | human internals. 3D renders of a skull with various layers
           | removed?
        
           | mzs wrote:
           | Mismatched reflections across eyes is the dead give-away for
           | me.
        
       | russdpale wrote:
       | Great work!
        
       | ansk wrote:
       | This group of researchers consistently demonstrates a degree of
       | empirical rigor that is unmatched across any other ML lab in
       | industry or academia - remarkable empirical results as always,
       | reproducible experiments, open-source and well-engineered
       | codebase, and valuable insights about low-level learning dynamics
       | and high-level emergent artifacts. Applied ML wouldn't have such
       | a bad rap if more researchers held themselves to similar
       | standards.
        
         | sillysaurusx wrote:
         | This isn't true. I do ML every day. You are mistaken.
         | 
         | I click the website. I search "model". I see two results. Oh
         | no, that means no download link to model.
         | 
         | I go to the github. Maybe model download link is there. I see
          | _zero code_: https://github.com/NVlabs/alias-free-gan
         | 
         | Zero code. Zero model.
         | 
         | You, and everyone like you, who are gushing with praise and
         | hypnotized by pretty images and a nice-looking pdf, are doing
         | damage by saying that this is correct and normal.
         | 
         | The thing that's useful to me, first and foremost, is a
          | _model_. Code alone isn't useful.
         | 
         | Code, however, is the recipe to create the model. It might take
         | 400 hours on a V100, and it might not actually result in the
         | model being created, but it _slightly_ helps me.
         | 
         | There is no code here.
         | 
         | Do you think that the pdf is helpful? Yeah, maybe. But I'm
         | starting to suspect that the pdf is in fact a _tech demo for
          | nVidia_, not a scientific contribution whose purpose is to be
         | helpful to people like me.
         | 
         | Okay? Model first. Code second. Paper third.
         | 
         | Every time a tech demo like this comes out, I'd like you to
         | check that those things exist, in that order. If it doesn't,
         | it's not reproducible science. It's a tech demo.
         | 
         | I need to write something about this somewhere, because a large
         | number of people seem to be caught in this spell. You're
         | definitely not alone, and I'm sorry for sounding like I was
         | singling you out. I just loaded up the comment section, saw
         | your comment, thought "Oh, awesome!" clicked through, and went
         | "Oh no..."
        
           | ansk wrote:
           | You're clearly disillusioned with the general accessibility
           | of ML research, but I don't think your cynicism is warranted
           | here. Take a look at their prior works[1], and I think you'll
           | agree they go above and beyond in making their work
           | accessible and reproducible. There is no reason to doubt the
           | open-source release of this work will be any different. As to
           | why the release is delayed, I'd speculate it's because they
            | put a significant amount of additional work into releases
           | because releasing code in a large corporation is a
           | bureaucratic hassle.
           | 
           | [1] https://nvlabs.github.io/stylegan2/versions.html
        
             | varispeed wrote:
             | When I was writing a paper I had to include all source code
             | in a state to be published, otherwise it wouldn't be
             | accepted. I guess today the bar is much lower.
        
             | sillysaurusx wrote:
             | _There is no reason to doubt the open-source release of
             | this work will be any different._
             | 
             | Then this is _not_ a scientific contribution yet.
             | 
             | We must wait and see.
             | 
              | The most important tenet of science is to doubt. I didn't
             | even read the name on the paper before I wrote my comment.
             | Yes, I know this group. They're why I got into ML, along
             | with the group from OpenAI who published GPT-2. Because A+
             | science.
             | 
             | Their claims here are likely wrong unless and until proven
             | otherwise. This isn't a hardline position. It's been my
             | experience across many codebases, during my two years of
             | trying to reproduce many ideas.
             | 
              | I agree that that is an example of A+ science. But why do
              | you think they're publishing this now, today? Either because
              | of a conference deadline or because of nVidia pressure.
              | Neither of those is related to helping me achieve the
              | scientific method: reproducing the idea in the paper, to
              | verify their claims.
             | 
             | All I can do is kind of try to reverse engineer some vague
             | claims in a pdf, without those things.
             | 
             | --
             | 
             | Let me tell you a little bit about my job, because my time
             | with my job may soon come to an end. I think that might
             | clear up some confusion.
             | 
             | My job, as an ML researcher, is to learn techniques that
             | may or may not be true, combine them in novel ways, and
             | present results to others.
             | 
             | Knowledge, Contribution, Presentation, in that order.
             | 
             | The first step is to obtain knowledge. Let's set aside the
             | question of why, because _why_ is a question for me
             | personally, which is unrelated.
             | 
             | Scientific knowledge comes when Knowledge, Contribution,
             | and Presentation are all achieved in a rigorous way. The
             | rigor allows people like me to verify that I have
              | knowledge. Without this, I have _mistaken knowledge_,
             | which is worse than useless. It's an illusion - I'm fooling
             | myself.
             | 
             | When I got into ML two years ago, I thought that knowledge
             | would come from reading scientific papers. I was wrong.
             | 
              | Most papers are wrong. That's been my experience for the
             | past two years. My experience may be wrong. Maybe others
             | obtain rigorous scientific knowledge through the paper
             | alone.
             | 
             | But researchers happen to obtain a dangerous thing:
              | _prestige_. Unfortunately, prestige doesn't come from
             | helping others obtain knowledge. It comes from that last
             | step -- presentation.
             | 
             | The presentation on this thread is excellent. It's another
             | Karras release. I agree; there's no reason to doubt they'll
             | be just as rigorous with this release as they are with
             | stylegan2.
             | 
             | But knowledge doesn't come from presentation. Only
             | prestige.
             | 
             | Prestige makes a lot of new researchers try very hard to
             | obtain the wrong things.
             | 
             | If all of these were small concerns, or curious quirks,
             | they'd be a footnote in my field guide. But I submit that
             | these things are _front and center_ to the current state of
             | affairs in 2021. Every time a release like this happens, it
             | generates a lot of fanfare and we come together in
             | celebration because ML Is Happening, Yay!
             | 
             | And then I try to obtain the Knowledge in the fanfare, and
             | discover that either it's absent or mistaken. Because there
             | are no tools for me to verify their claims -- and when I
             | do, I often see that they don't work!
             | 
             | That's right. I kept finding out that these things being
             | claimed, just aren't true. No matter how enticing the claim
             | is, or whether it sounds like "Foobars are Aligned in the
             | Convolution Digit," the claim, from where I was sitting,
             | seemed to be wrong. It contained _mistaken knowledge_ --
             | worse than useless.
             | 
             | Unfortunately, two years with no salary takes a toll. I
             | could spend another few years doing this if I wanted to.
             | But I wound up so disgusted with discovering that we're all
             | just chasing prestige, not knowledge, that I'd rather ship
             | production-grade software for the world's most boring
             | commercial work, as long as the work seems useful and the
             | team seems interesting. Because at least I'd be doing
             | something useful.
        
               | phreeza wrote:
               | Expecting fully executable code to accompany every
               | publication is kind of unique to the modern ML research
                | scene. As someone from a very different computational
               | research field, where zero code is the norm, not the
               | exception, this reads as a somewhat entitled rant.
               | Reimplementation of a paper is actually a test of the
               | robustness of the results. If you download the code of a
               | previous paper, there may be some assumptions hidden in
               | the implementation that aren't obvious. So I would argue
               | that simply downloading and re-executing the author's
               | implementation does _not_ constitute reproducible
               | research. I know it is costly, but for actual
               | reproduction, reimplementation is needed.
        
               | xksteven wrote:
                | Sometimes reimplementation is impossible without the
                | code, and the paper goes on to win awards because it's by
                | a famous scientist. Then if the reimplementation doesn't
                | work, most of the time the graduate students are blamed
                | instead of the original work.
               | 
               | There are always assumptions. At least with public code
               | and models those assumptions are laid bare for all to see
               | and potentially expose any bad assumptions.
        
               | sillysaurusx wrote:
               | I wasn't sure whether to post my edit as a separate
               | comment or not, but I significantly expanded my comment
                | just now in a way that helps explain my position.
               | 
               | I'd be very interested in your thoughts on that position,
               | because if it's mistaken, I shouldn't be saying it. It
               | represents whatever small contribution I can make to
                | fellow new ML researchers, which is roughly: _"watch
                | out."_
               | 
               | In short, for two years, I kept trying to implement
               | stated claims -- to reproduce them in exactly the way you
               | say here -- and they simply didn't work as stated.
               | 
               | It might sound confusing that the claims were "simply
               | wrong" or "didn't work." But every time I tried,
               | achieving anything remotely close to "success" was _the
                | exception_, not the norm.
               | 
               | And I don't think it was because I failed to implement
               | what they were saying in the paper. I agree that that's
               | the most likely thing. But I was careful. It's very easy
               | to make mistakes, and I tried to make none, as both
               | someone with over a decade of experience
               | (https://shawnpresser.blogspot.com/) and someone who
               | cares deeply about the things I'm talking about here.
               | 
               | It takes hard work to reproduce the technique the way
               | you're saying. I put all my heart and soul into trying
               | to. And I kept getting dismayed, because people kept
               | trying to convince me of things that either I couldn't
               | verify (because verification is extremely hard, as you
               | well know) or were simply wrong.
               | 
               | So if I sound entitled, I agree. When I got into this
               | job, as an ML researcher, I thought I was entitled to the
               | scientific method. Or anything vaguely resembling
               | "careful, distilled, _correct_ knowledge that I can build
                | on."
        
               | phreeza wrote:
               | I get your frustrations with this state of affairs, but
               | for the reasons I mentioned above, I don't think
               | providing the model and code is a panacea here. Maybe the
               | last few years have also set an unrealistic expectation
               | for the pace of progress. In my (former) field of
               | theoretical neuroscience, if a paper was not
               | reproducible, this knowledge kind of slowly diffused
               | through the community, mostly through informal
               | conversations with people who tried to reproduce or
               | extend a given approach. But this takes several years,
               | not the kind of timescale that modern ML research
               | operates on.
               | 
               | Fwiw I think actual knowledge is there in the ML
               | literature, but it's not in these Benchmark-chasing
               | highly tuned papers. It's more high level stuff, like
               | basic architecture building blocks etc. GANs and
               | Transformers for example. They undeniably work, and the
               | knowledge needed to implement them can probably be
               | conveyed in a few pages maximum. No need for an
               | implementation to be provided by the author, really.
        
               | skybrian wrote:
               | I have no particular expertise here, but I wonder if
               | you've learned to accept a mostly-broken process? We have
               | the Internet, so why settle for slow diffusion over years
               | instead of rapid communication?
               | 
               | Why should graduate students have to spend years trying
               | to reproduce stuff that turns out to be no good? Nobody
               | should have to put up with getting their time wasted like
               | that.
        
               | msapaydin wrote:
               | I think that not being able to reproduce the results
               | claimed in a paper is not specific to ML research. While
                | working as a post-doc at a top university research lab, I
                | spent years trying to understand how it could be that some
                | software that was supposed to correspond to a well-cited
                | paper did not even come close to reproducing the results
                | of said paper, and that the primary author went on to
                | become a prof at a top university in the US. In short,
                | scientific fraud is also quite common in academic papers.
        
               | sillysaurusx wrote:
               | Thank you!!
               | 
                |  _I spent years trying to understand how it could be that
                | some software that was supposed to correspond to a
                | well-cited paper did not even come close to reproducing
                | the results of said paper,_
               | 
               | This was my exact experience. I didn't understand why I
               | kept having it, and kept blaming myself for not being
               | careful enough. My code _must_ be wrong, or the data, or
               | _something_.
               | 
               | Nah. It was the idea.
               | 
               | Kept feeling like a kick in the gut, until here we are
               | today, when I'm warning everyone that Karras, of all
               | people, might publish such a thing.
               | 
               | I really appreciate that you posted this, because I'm so
               | happy I wasn't alone in the feeling of "what's going on,
               | here...?"
        
               | fossuser wrote:
               | That seems like a worthwhile thing to publicize in and of
               | itself?
               | 
               | The replication crisis in psychology threw out 50% or so
               | of supposed scientific results.
               | 
               | If this (or just straight fraud) is common elsewhere, it
               | seems like knowing about that would be a good thing for
               | science.
        
               | nerdponx wrote:
               | It's not an entitled rant, other fields just have
               | dismally low standards.
        
           | varispeed wrote:
            | I am not into ML, but from time to time I like to look at
            | how this is made, and I remember only once seeing the code
            | and a model, which I thought was an exception to the "norm".
            | Good that more people are calling this out!
        
           | minimaxir wrote:
           | The repo says the code will be available in September: that's
           | a reasonable timeframe for the necessary polish/legal
           | clearance.
        
             | sillysaurusx wrote:
             | (Agreed, fwiw. What's going on here isn't a criticism of
             | this work specifically, but the trend of everyone thinking
             | that this is science generally. For example, it's true the
             | code is coming in September. And, you and I both know it's
             | probably gonna have a model release, just because it's more
             | impressive, big-name Karras nVidia work. But it might not
             | have a model release. I give that at least 40% odds. If it
             | doesn't, then everything I said above will be true about
            | that too, Karras or not. People _keep doing that_, and we
             | have to call out that this is approximately useless for you
             | and me. Actually, I was going to say maybe it's useful for
             | you, but you're the language model hacker and I'm the GAN
             | hacker, and I assure you, code alone is useless for me. If
             | it's useful to you, I would love to know how it helps you
             | verify the scientific method.)
        
           | aaron-santos wrote:
           | Thank you for calling this out. It's critically important
           | that people understand the difference between model, code,
           | and paper and what they mean.
           | 
           | It's also important that people understand that even if code
           | is provided, it's commercially useless. From the NVAE license
           | as an example[1]
           | 
           | > The Work and any derivative works thereof only may be used
           | or intended for use non-commercially.
           | 
           | It's a great example of the difference between open source
           | (which it is) and free software which it is not. So we're
           | back to square one where it is probably best to clean-room
            | the implementation from the paper, which is nearly useless
            | for reproducing the model.
           | 
           | [1] https://github.com/NVlabs/NVAE/blob/master/LICENSE
        
             | sillysaurusx wrote:
             | Unfortunately, I must call you out too, my friend. With
             | love.
             | 
             | Because it's crucially important that we protect the
             | scientific method here.
             | 
             | The sole goal is to help people like me reproduce the
             | model. If I can't reproduce the model, I can't verify the
             | paper.
             | 
             | When I saw "commercial" and then "open source" in your
             | comment, I said "oh no..."
             | 
             | My duty is to the scientific method, so I don't care if
             | it's the most restrictive code on the planet as long as I
             | can use it to reproduce the model in the paper.
             | 
             | Because at that point, I have a baseline for _evaluating
             | the paper's claims_.
             | 
              | The reason I assume the paper is false until proven
              | otherwise is that the paper often doesn't have enough
              | detail to reproduce the model shown in the videos of these
              | tech demos. Meaning, if they're there to help me, the ML
              | researcher, then they're failing to tell me how to evaluate
              | their claims rigorously.
             | 
             | (That said, it's breaking my heart that I can't agree with
             | you here, because _I want to so badly_. I've felt similarly
             | for years that scientific contributions need to be "free as
             | in beer" commercially. But I recognize signs of zealotry
             | when I see them, and I can't let my personal views creep
             | in, because people like me would stop listening if I was
             | here e.g. arguing vehemently that nVidia needed to be
             | delivering us something commercially viable along with a
             | high quality codebase. The price for entry to the
             | scientific method isn't so high.)
        
               | aaron-santos wrote:
                | That's fine; at least you're open about the
                | knowledge-for-knowledge's-sake position. There's more
                | than one way to judge something.
        
       | isoprophlex wrote:
        | If ReLU-introduced high frequency components are indeed the
        | culprit, wouldn't using a "softened" ReLU (without a
        | discontinuity in the derivative at 0) everywhere solve the
        | problem, too?
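Numerically the hunch checks out in miniature: ReLU's kink gives harmonics that decay only polynomially (roughly 1/k^2), while a smooth substitute such as softplus has harmonics that vanish roughly geometrically. A rough 1-D NumPy comparison, with softplus chosen here as one possible "softened" ReLU (an illustration, not the paper's approach, which keeps the nonlinearity and oversamples around it):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1024, endpoint=False)
x = np.sin(2 * np.pi * t)

relu = np.maximum(x, 0.0)
softplus = np.log1p(np.exp(x))   # smooth: derivative is continuous at 0

def harmonic(y, k):
    """Relative magnitude of the k-th harmonic of a periodic signal."""
    return np.abs(np.fft.rfft(y))[k] / len(y)

# ReLU's kink -> even harmonics decaying only ~1/k^2;
# softplus is smooth -> its high harmonics are orders of magnitude smaller.
print(harmonic(relu, 16), harmonic(softplus, 16))
```

Whether the remaining, much smaller leakage still accumulates across a deep generator is the open question; the paper's route instead suppresses out-of-band content with explicit low-pass filtering around the nonlinearity.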
        
       | ChuckNorris89 wrote:
       | I expect this work will feed back into removing the aliasing
       | artifacts you sometimes get when using DLSS in games.
        
       ___________________________________________________________________
       (page generated 2021-06-23 23:00 UTC)