[HN Gopher] Emu Video and Emu Edit, our latest generative AI res...
___________________________________________________________________
Emu Video and Emu Edit, our latest generative AI research
milestones
Author : ot
Score : 112 points
Date : 2023-11-16 15:59 UTC (7 hours ago)
(HTM) web link (ai.meta.com)
(TXT) w3m dump (ai.meta.com)
| dougmwne wrote:
| Emu Edit is awesome. I think we have officially brought this
| scene from Star Trek to life.
|
| https://m.youtube.com/watch?v=NXX0dKw4SjI&pp=ygUII3Npbm50ZWs...
| clows wrote:
| I thought of this Running Man scene
| https://www.youtube.com/watch?v=BVdOr0z6X7Y
| bane wrote:
| With the advent of these models my head canon now insists that
| when Star Trek characters say they "programmed" something, they
| really mean that they have a log of all of their iterative
| prompts and that there's some optimization the computer can use
| to aggregate all of those into the final resulting warp
| model/holodeck simulation/transporter filter/biobed pathogen
| detector/etc without having to reiterate through all of those
| prompts again...kind of like a NixOS declarative build.
|
| And when somebody comes along and fixes their program or
| reprograms what they did, they simply insert or change some of
| the prompts along the way and get a different effect.
|
| When the characters add new data to the computer (like the
| episode where Geordi added the psycho profile of the Enterprise
| engine designer), they're just tuning the foundational model
| with some new input data.
|
| Yeah....that _feels_ right for now to me.
| cma wrote:
| "Computer, load up CELERY MAN, please"
|
| https://www.youtube.com/watch?v=a8K6QUPmv8Q
| echelon wrote:
| Tim and Eric are going to go crazy with Gen AI. They won't
| need Adult Swim to toss them shoestring budgets.
| Ajedi32 wrote:
| > Computer, show me a table.
|
| > There are 5047 classifications of tables on file. Specify
| design parameters.
|
| Interestingly enough, it seems existing AI models are already
| better than the Star Trek computer at dealing with ambiguity.
| Stable Diffusion would just generate a "normal" table and let
| you go from there.
| dougmwne wrote:
| Yes, they seem to handle emotion, humor and ambiguity better
| than Data or any computer ever on the shows. 24th century
| technology, today.
| colesantiago wrote:
| Does anyone know where the source code is? I can't seem to find
| it anywhere.
| dado3212 wrote:
| I don't think either of these (or the base Emu model) are open
| source.
| acheong08 wrote:
| That's a bit disappointing. Meta had been on an "open source"
| roll lately.
| JaDogg wrote:
| First dose of gen AI is free
| satvikpendem wrote:
| Technically none of their models are actually open source.
| burningion wrote:
| There's some source code in the paper for Emu Edit at least. If
| you look at the supplementary material in the paper, you'll see
| they spell out the techniques used there too.
|
| I didn't see a repository, but I think in this case, the paper
| is actually a perfect balance of detail? I think Meta benefits
| from startups building using their tooling (startups usually
| buy ads), and so the lack of a full implementation leaves a bit
| of room for startups to turn the work into something a bit
| more production ready.
|
| The cool techniques from the paper are:
|
| Generating a bunch of example images in one go, and using CLIP
| to score your generated images
|
| And mixing pre-built pipelines and grammars to execute common
| tasks.
|
| These two ideas alone (with examples) give people in the space
| plenty to run with.
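|
| A minimal sketch of the first idea (generate several candidates,
| score each against the prompt, keep the best). This is not code
| from the paper. It assumes you already have a prompt embedding
| and one embedding per generated image from a CLIP-style model;
| the toy vectors and the names pick_best, prompt_vec and
| candidates are made up for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_best(prompt_embedding: Vector,
              candidates: List[Tuple[str, Vector]]) -> Tuple[str, float]:
    """Score every (name, image_embedding) pair against the prompt
    embedding and return the (name, score) with the highest score."""
    scored = [(name, cosine_similarity(prompt_embedding, emb))
              for name, emb in candidates]
    return max(scored, key=lambda pair: pair[1])


# Toy stand-ins for real embeddings (assumption: a real setup would
# get these from a CLIP-style text encoder and image encoder).
prompt_vec = [1.0, 0.0, 0.0]
candidates = [
    ("sample_0", [0.0, 1.0, 0.0]),
    ("sample_1", [0.9, 0.1, 0.0]),
    ("sample_2", [0.5, 0.5, 0.5]),
]

best_name, best_score = pick_best(prompt_vec, candidates)
print(best_name, round(best_score, 3))
```

| With real embeddings the selection step stays the same; only the
| source of the vectors changes.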
|
| Great paper!
| enonimal wrote:
| Is anyone able to determine how long it takes to generate a video
| with one of these methods? Can't find it in the paper.
| liuliu wrote:
| Emu image is not significantly slower than SDXL or similar. So
| you would expect to have similar performance as Hotshot. The
| upscaler (8-frame to 37-frame) version probably would take
| significantly longer.
| tomdell wrote:
| An impressive technical achievement, yes - but the
| presentation/marketing of this is absurd.
|
| The generated videos are aesthetically horrendous. I don't know
| what kind of mental gymnastics are going on that they can
| confidently describe something where the body shapes are
| nonsensically in flux with every change of frame (look at the
| eagle's talons, or the dog's leg movements as it runs) as "high-
| quality video".
|
| Is generative AI hype blinding them to how hideous these videos
| are, or do they know and they just pretend like it's something it
| isn't?
| BoorishBears wrote:
| Check out what AI generated images looked like 24 months ago
| and this comment may feel a lot less pithy.
| ShamelessC wrote:
| Compared to prior work, it looks unbelievable. Is this just an
| armchair criticism or have you been paying any attention?
| davesque wrote:
| Definitely looks like progress, but they're still firmly in the
| center of the uncanny valley.
| scudsworth wrote:
| a huge pile of money on fire forever
___________________________________________________________________
(page generated 2023-11-16 23:00 UTC)