[HN Gopher] How Sber Built ruDALL-E
___________________________________________________________________
How Sber Built ruDALL-E
Author : aroccoli
Score : 46 points
Date : 2021-12-29 20:12 UTC (1 days ago)
(HTM) web link (serokell.io)
(TXT) w3m dump (serokell.io)
| amelius wrote:
| These models all seem to have the flaw that faces don't come out
| symmetrically. Especially eyes look like they are in the wrong
| location.
| criticaltinker wrote:
| _> The model is considered the greatest computational project in
| Russia for now, totaling 24,256 GPU days to train the models._
|
| _> We don't know for sure why OpenAI hasn't shown its work in a
| more reproducible way. But this step is definitely done to
| stimulate the further openness and progress of such models._
|
| Super interesting and great commentary, thanks for sharing!
| f311a wrote:
| Sber also has an open-source version of GPT-3 for Russian.
|
| Sber is a state-owned Russian bank which is a pretty funny detail
| given that a lot of banks can't even build a decent mobile app.
| cpursley wrote:
| The Sberbank mobile app in Russia is an order of magnitude
| better than anything I've used in the US. The other large
| Russian tech and service company apps are very very good
| (Ozon, anything Yandex puts out). Even the federal services
| apps are well executed - you can pay your property taxes and
| other services by scanning a QR code. Some great tech coming
| out of that country (accusations of hacking aside).
| baybal2 wrote:
| Sberbank is a joke of a bank, mostly serving the older generation,
| who kept using it out of inertia from the time it was the only
| bank you could get in the country.
|
| Generally, it's bureaucratic, kafkaesque, and ill, like the
| country which once made it.
| trhway wrote:
| The Sberbank CEO (he is a Russian German and has some typical
| traits that make him noticeably different from the typical Russian
| bureaucrat) and his posse are the leading part of the
| technocratic wing of the political elite in Russia. Their
| people also lead another important bank, VTB
| (international payments etc. for large corps), and Sber has a
| strong hold on various national networks: naturally
| anything money-related, such as municipal services and traffic
| ticket payments, as well as generic network
| infra and datacenters. If Putin is gone tomorrow, there is a
| strong chance that those technocrats will take power (I
| haven't noticed any significant animosity between them and the
| FSB, which would otherwise be a complication). A particularly
| important sign of their power is that there have
| been no corruption scandals associated with them, at least
| none that I can remember in the last decade. They
| tread very carefully, making no open political claims
| while presenting themselves as basically an apolitical
| tech-infrastructure/platform for efficient government and
| society, and doubling down on the source of their shadow
| power: network/infra/technocracy. Thus they can't allow
| themselves to suck too much technically, and so they
| naturally hire decent technical people (I have some first
| handshakes among the upper management in technology there).
| kgeist wrote:
| Have you used Sberbank lately? I have a different
| experience and I'm not from the "older generation". Its
| mobile app is pretty decent; this year I got a mortgage
| loan and it went pretty smoothly. I didn't notice anything
| bureaucratic or kafkaesque about it. I've been its client for 4
| years now and I'm struggling to remember a negative
| experience with it. They've been having an overhaul lately,
| maybe it was far worse before. Yeah, the cool kids prefer
| Tinkoff nowadays, but it's not true that only old people use
| Sberbank.
| cpursley wrote:
| Even so, they do a pretty good job for a state-backed bank.
| Better than anything state run I've experienced in the US.
|
| But I agree in principle with you - and from what I hear,
| Tinkoff is one of the better choices and the founder is
| well respected.
| zkid18 wrote:
| Well, so do 95% of retail banks across the globe.
| another_kel wrote:
| It's a shitty bank by Russian standards indeed, but this
| has nothing to do with the fact that
|
| >The Sberbank mobile app in Russia is an order of magnitude
| better than anything I've used in the US.
| minimaxir wrote:
| A very curious effect of ruDALL-E is that the finetuning works on
| small datasets with unexpectedly good results. The Sneakers
| example they note in this article is on about 10k images.
|
| As an experiment, I finetuned ruDALL-E on about 1000 images of
| Pokemon and generated from that, which yielded incredible results
| that went viral:
| https://twitter.com/minimaxir/status/1470913487085785089
|
| I then tried finetuning ruDALL-E on _1_ Pokemon, and still got
| good/horrifying results:
| https://twitter.com/minimaxir/status/1474913997807755268
|
| Unfortunately it's still a convoluted process to finetune
| ruDALL-E; I hope they end up releasing a smaller model to make it
| possible to do on a smaller/free GPU. (if they do, I'll release a
| streamlined Colab notebook + blog post on how to do it)
| etaioinshrdlu wrote:
| How much GPU RAM and time does it currently take to fine-tune
| the current model?
| minimaxir wrote:
| Essentially all of a 16GB GPU's VRAM, even with some layers
| frozen.
|
| The more diverse the input images, the longer/more epochs the
| finetuning process should take in order to get stable
| results. The first Pokemon model was trained for about 4.5
| hours; the one-shot model was about 2 minutes.
| lostmsu wrote:
| Curious. How does freezing layers save you memory? Does it
| save much compute time?
|
| I understand the frozen layers do not need gradients to be
| stored?
| minimaxir wrote:
| Essentially yes. That technique is not exclusive to
| ruDALL-E; large models often freeze early layers and
| train only the later layers due to VRAM constraints.
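|
| A rough sketch of the memory arithmetic behind freezing layers,
| in Python. The byte sizes (fp16 weights and gradients, two fp32
| Adam moments per trainable parameter) and the 1.3 billion
| parameter count are assumptions for illustration, not figures
| from this thread, and activation memory is left out entirely:

```python
GIB = 1024 ** 3


def training_bytes(n_params, trainable_fraction,
                   weight_bytes=2, grad_bytes=2, optimizer_bytes=8):
    """Rough bytes for weights, gradients, and optimizer state.

    All byte sizes are assumptions: fp16 weights/gradients and two
    fp32 Adam moments per trainable parameter. Activations are ignored.
    """
    trainable = int(n_params * trainable_fraction)
    weights = n_params * weight_bytes        # frozen weights still live on the GPU
    grads = trainable * grad_bytes           # only trainable parameters keep a gradient
    optimizer = trainable * optimizer_bytes  # optimizer state exists only for trainable ones
    return weights + grads + optimizer


if __name__ == "__main__":
    n_params = 1_300_000_000  # assumed model size, not taken from the thread
    for fraction in (1.0, 0.5, 0.25):
        gib = training_bytes(n_params, fraction) / GIB
        print(f"{fraction:.0%} trainable: ~{gib:.1f} GiB before activations")
```

| Under these assumptions the weights are a fixed cost, while the
| gradient and optimizer terms shrink in proportion to the share
| of parameters left trainable.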
| lostmsu wrote:
| Oh, right, only freezing early layers makes sense. I was
| thinking you froze inner ones, but gradients would need
| to be computed and kept for them to backpropagate to the
| unfrozen early ones.
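|
| The point about where the frozen layers sit can be sketched in
| plain Python. The helper below is hypothetical (it is not part
| of ruDALL-E or any library) and only labels what the backward
| pass has to compute for each layer, given which ones are
| trainable:

```python
def backward_plan(trainable):
    """Label backward-pass work per layer, ordered input -> output.

    `trainable` is a list of booleans, one per layer. This is an
    illustrative helper, not an API from any real framework.
    """
    if True not in trainable:
        return ["skip"] * len(trainable)
    first = trainable.index(True)  # earliest layer that needs a weight gradient
    plan = []
    for i, is_trainable in enumerate(trainable):
        if i < first:
            plan.append("skip")  # nothing trainable upstream, so backprop stops before here
        elif i == first:
            plan.append("weight grads")  # no need to pass a gradient further back
        elif is_trainable:
            plan.append("weight and input grads")
        else:
            # Frozen, but the gradient must still flow through it
            # to reach the trainable layers closer to the input.
            plan.append("input grads only")
    return plan


if __name__ == "__main__":
    print(backward_plan([False, False, True, True]))  # early layers frozen
    print(backward_plan([True, False, False, True]))  # inner layers frozen
```

| With the early layers frozen, the backward pass can stop at the
| first trainable layer. With only inner layers frozen, every
| layer after the first still takes part in the backward pass,
| and the saving is limited to the frozen layers' weight
| gradients and optimizer state.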
___________________________________________________________________
(page generated 2021-12-30 23:00 UTC)