[HN Gopher] Veo 2: Our video generation model
___________________________________________________________________
Veo 2: Our video generation model
Author : mvoodarla
Score : 260 points
Date : 2024-12-16 17:04 UTC (5 hours ago)
(HTM) web link (deepmind.google)
(TXT) w3m dump (deepmind.google)
| jsheard wrote:
| Judging by how they've been trying to ram AI into YouTube
| creators' workflows, I suppose it's only a matter of time before
| they try to automate the entire pipeline from idea, to execution,
| to "engaging" with viewers. It won't be _good_ at doing any of
| that but when did that ever stop them.
|
| https://www.youtube.com/watch?v=26QHXElgrl8
|
| https://x.com/surri01/status/1867433782992879617
| larodi wrote:
| And then suddenly this is not something that fascinates people
| anymore... in 10 years as non-synthetic becomes the new bio or
| artisan or whatever you like.
|
| Humanity has its ways of objecting to accelerationism.
| turnsout wrote:
| Put another way, over time people devalue things which can be
| produced with minimal human effort. I suspect it's less about
| humanity's values, and more about the way money closely
| tracks "time" (specifically the duration of human effort).
| PittleyDunkin wrote:
| https://en.wikipedia.org/wiki/Labor_theory_of_value
| turnsout wrote:
| Yes, exactly. Marx had this right. Money is a way to
| trade time.
| EGreg wrote:
| I strongly disagree. How many of the clothes you buy are
| 100-thread-count and machine-made, vs hand-knit
| sweaters or something?
|
| When did you last ask people for directions, or other major
| questions, instead of Google?
|
| You can wax poetic about wanting "the human touch", but at
| the end of the day, the market speaks -- people will just
| prefer everything automated. Including their partners: once
| your boyfriend can remember every little detail about you,
| notice everything including your pupils dilating, know exactly
| how you like it and when you like it, never get angry unless
| it's to spice things up, and has been trained on 1000 other
| partners, how could you go back? The same goes for when robots
| can raise children better than parents, with patience,
| discipline and individual attention, and know 1000 ways to mold
| their behavior toward healthier outcomes. Everything people do
| is being commodified as we speak. Soon it will be humor,
| entertainment, nursing, etc. Then personal relations.
|
| Just extrapolate a decade or three into the future. Best
| case scenario: if we nail alignment, we build a zoo for
| ourselves where we have zero power and are treated like
| animals who have sex and eat and fart all day long. No one
| will care about whatever you have to offer, because
| everyone will be surrounded by layers of bots from the time
| they are born.
|
| PS: anything you write on HN can already have been written
| by AI, pretty soon you may as well quit producing any
| content at all. No one will care whether you wrote it.
| vouaobrasil wrote:
| > PS: anything you write on HN can already have been
| written by AI, pretty soon you may as well quit producing
| any content at all. No one will care whether you wrote
| it.
|
| People theoretically would care, but the internet has
| already made producing things pseudo-anonymous, so we have
| forgotten the value of actually having a human being behind
| content. That's why AI is so successful, and it's a damn
| shame.
| skeledrew wrote:
| What exactly is the value of having a human behind
| content if it gets to the point that content generated by
| AI is indistinguishable from content generated by humans?
| turnsout wrote:
| I think "indistinguishable" is a receding horizon. People
| are already good at picking out AI text, and AI video is
| even easier. Even if it looks 100% realistic on the
| surface, the content itself (writing, concept, etc) will
| have a kind of indescribable "sameness" that will give it
| away.
|
| If there's one thing that connects all media made in
| human history, it's that humans find humans interesting.
| No technology (like literally no technology ever) will
| change that.
| vouaobrasil wrote:
| The fact that anyone would ask this question is
| incredible!
|
| It's so that, in a fraction of those cases, we can develop
| real relationships with the people behind the content! The
| whole point of sharing is to develop connections with real
| people. If all you want to do is consume independently of
| that, you are effectively a soulless machine.
| realce wrote:
| What does indistinguishable even mean here?
|
| If a fish could write a novel, would you find what it
| wrote interesting, or would it seem like a fish wrote it?
| Humans absorb information relative to the human
| experience, and without living a human existence the
| information will feel fuzzy or uncanny. AI can
| approximate that but can't live it for real. Since it is
| a derivative of an information set, it can never truly
| express the full resolution of its primary source.
| turnsout wrote:
| I have both machine-made and hand-knit sweaters. In
| general, I expect handmade clothes to be more expensive
| than machine-made, which kinda proves my point. I never
| said machine-made things had zero value. I said we will
| tend to devalue them relative to more human-intensive
| things.
|
| Asking for directions is a bad example, because it takes
| very little time for both humans and machines to give you
| directions. Therefore it would be highly unusual for
| anyone to pay for this service (LOL)
| echelon wrote:
| Are you kidding?
|
| TikTok is one of the easiest platforms to create for, and
| look at how much human attention it has sucked up.
|
| The attention/dopamine magnet is accelerating its
| transformation into a gravitational singularity for human
| minds.
| tokioyoyo wrote:
| TikTok's main attraction is the people, not just the
| videos. Trends, drama, etc. all involve real humans
| doing real human stuff, so it's relatable.
|
| I might be wrong, but AI videos are on the same path as AI
| generated images. Cool for the first year, then "ah ok,
| zero effort content".
| gom_jabbar wrote:
| Sure, humanity has its ways of objecting to Accelerationism, but
| the process fundamentally challenges human identity:
|
| "The Human Security System is structured by delusion. What's
| being protected there is not some real thing that is mankind,
| it's the structure of illusory identity. Just as at the more
| micro level it's not that humans as an organism are being
| threatened by robots, it's rather that your self-
| comprehension as an organism becomes something that can't be
| maintained beyond a certain threshold of ambient networked
| intelligence." [0]
|
| See also my research project on the core thesis of
| Accelerationism that capitalism is AI. [1]
|
| [0] https://syntheticzero.net/2017/06/19/the-only-thing-i-
| would-...
|
| [1] https://retrochronic.com/
| vouaobrasil wrote:
| > Humanity has its ways of objecting to accelerationism.
|
| Actually, typically human objection only slows it down and
| often it becomes a fringe movement, while the masses continue
| to consume the lowest common denominator. Take the revival of
| the flip phone, typewriter, etc. Sadly, technology marches on
| and life gets worse.
| adolph wrote:
| Does life get worse for the majority of people or do the
| fruits of new technology rarely address any individual
| person's progress toward senescence? (The latter feels like
| tech moves forward but life gets worse.)
| vouaobrasil wrote:
| Of course, it depends on how you define "worse". If you
| use life expectancy, infant mortality, and disease, then
| life has in the past gotten better (although the
| technology of the past 20 years has RARELY contributed to
| any of that).
|
| If you use 'proximity to wild nature', 'clean air', 'more
| space', then life has gotten worse.
|
| But people don't choose between these two. They choose
| between alternatives that give them analgesics in an
| already corrupt society, creating a series of descending
| local maxima.
| noch wrote:
| > Judging by how they've been trying to ram AI into YouTube
| creators workflows [...]
|
| Thanks for sharing that video and post!
|
| One way to think about this stuff is to imagine that you are 14
| and starting to create videos, art, music, etc in order to
| build a platform online. Maybe you dream of having 7 channels
| at the same time for your sundry hobbies and building
| audiences.
|
| For that 14 year old, these tools are available everywhere by
| default and are a step function above what the prior generation
| had. If you imagine these tools improving even faster in
| usability and capability than prior generations' tools did ...
|
| If you are of a certain age you'll remember how we were
| harangued endlessly about "remix culture" and how mp3s were
| enabling us to steal creativity without making an effort at
| being creative ourselves, about how photobashing in Photoshop
| (pirated cracked version anyway) was not real art, etc.
|
| And yet, halfway through the linked video, the speaker, who has
| misgivings, was laughing out loud at the inventiveness of the
| generated replies and I was reminded that someone once said
| that one true IQ test is the ability to make other humans
| laugh.
| jsheard wrote:
| > laughing out loud at the inventiveness of the generated
| replies
|
| Inventive is one way of putting it, but I think he was
| laughing at how bizarre or out-of-character the responses
| would be if he used them. Like the AI suggesting that he post
| "it is indeed a beverage that would make you have a hard time
| finding a toilet bowl that can hold all of that liquid" as if
| those were his own words.
| handsaway wrote:
| "remix culture" required skill and talent. Not everyone could
| be Girl Talk or make The Grey Album or Wugazi. The artists
| creating those projects clearly have hundreds if not
| thousands of hours of practice differentiating them from
| someone who just started pasting MP3s together in a DAW
| yesterday.
|
| If this is "just another tool" then my question is: does the
| output of someone who has used this tool for one thousand
| hours display a meaningful difference in quality to someone
| who just picked it up?
|
| I have not seen any evidence that it does.
|
| Another idea: What the pro generative AI crowd doesn't seem
| to understand is that good art is not about _execution_, it's
| about _making deliberate choices_. While a master painter or
| guitarist may indeed pull off incredible technical feats,
| their execution is not the art in and of itself, it is
| widening the amount of choices they can make. The more
| generative AI steps into the role of making these choices,
| the more useless, ironically, it becomes.
|
| And lastly: I've never met anyone who has spent significant
| time creating art who reacts to generative AI as anything more
| than a toy.
| Philpax wrote:
| > does the output of someone who has used this tool for one
| thousand hours display a meaningful difference in quality
| to someone who just picked it up?
|
| Yes. A thousand hours gives you a much greater
| understanding of what it's capable of, its constraints, and
| how best to take advantage of them.
|
| By comparison, consider photography: it is ostensibly only
| a few controls and a button, but getting quality results
| requires the user to understand the language of the medium.
|
| > What the pro generative AI crowd doesn't seem to
| understand is that good art is not about _execution_, it's
| about _making deliberate choices_. While a master painter
| or guitarist may indeed pull off incredible technical
| feats, their execution is not the art in and of itself, it
| is widening the amount of choices they can make.
|
| This is often not true, as evidenced by the pre-existing
| fields of generative art and evolutionary art. It's also a
| pretty reductive definition of art: viewers can often find
| art in something with no intentional artistry behind it.
|
| > I've never met anyone who has spent significant time
| creating art who reacts to generative AI as anything more than
| a toy.
|
| It's a big world out there, and you haven't met everyone ;)
| Just this last week, I went to two art exhibitions in Paris
| that involved generative AI as part of the artwork; here's
| one of the pieces:
| https://www.muhka.be/en/exhibitions/agnieszka-polska-
| flowers...
| noch wrote:
| > Just this last week, I went to two art exhibitions in
| Paris that involved generative AI as part of the artwork;
| here's one of the pieces
|
| The exhibition you shared is rather beautiful. Thank you
| for the link!
| dragonwriter wrote:
| > If this is "just another tool" then my question is: does
| the output of someone who has used this tool for one
| thousand hours display a meaningful difference in quality
| to someone who just picked it up?
|
| Yes, absolutely. Not necessarily in apparent execution
| without knowledge of intent (though, often, there, too),
| but in the scope of meaningful choices that they can make
| and reflect with the tools, yes.
|
| This is probably even more pronounced with use of open
| models than the exclusively hosted ones, because more
| choices and controls are exposed to the user (with the
| right toolchain) than with most exclusively-hosted models.
| noch wrote:
| > "remix culture" required skill and talent.
|
| We were told that what we were doing didn't require as much
| skill as whatever the previous generation were doing to
| sample music and make new tracks. In hindsight, of course
| _you_ find it easy to cite the prominent successes _that
| you know_ from that generation. That's arguing from
| _survivorship bias_ and _availability bias_.
|
| But those successes were never the point: the publishers
| and artists were pissed off at the tens of thousands of
| teenagers remixing stuff for their own enjoyment and
| forming small yet numerous communities and subcultures
| globally over the net. Many of us never became famous, so you
| can't cite our fame as proof of skill, but we made money
| hosting parties at the local raves with beats we remixed
| together ad hoc and that others enjoyed.
|
| > The artists creating those projects clearly have hundreds
| if not thousands of hours of practice differentiating them
| from someone who just started pasting MP3s together in a
| DAW yesterday.
|
| But they all began as I did, by being someone who "just
| started pasting MP3s together" in my bedroom. Darude,
| Skrillex, Burial, and all the others simply kept doing it
| longer than those who decided they had to get an office job
| instead.
|
| The teenagers today are in exactly the same position,
| except with vastly more powerful tools and the entire
| corpus of human creativity free to download, whether in the
| public domain or not.
|
| I guess in response to your "required skill and talent",
| I'm saying that skill is something that's developed within
| the context of the technology a generation has available.
| But it is always developed, then viewed as such in
| hindsight.
| spankalee wrote:
| They basically already have this:
| https://workspace.google.com/products/vids/
| cj wrote:
| Last week I started seeing a banner in Google Docs along the
| lines of "Create a video based on the content of this doc!"
| with a call to action that brought me to Google Vids.
| lukan wrote:
| Hey, it's AI and so it is good, right?
|
| Seriously, it sounds like something kids can have fun with,
| or bored deskworkers. But a serious use case, at the
| current state of the art? I doubt it.
| EGreg wrote:
| Who needs viewers anyway? Automate the whole thing. I just see
| the endgame for the internet as
| https://en.wikipedia.org/wiki/Dead_Internet_theory
| zb3 wrote:
| We should collectively ignore these announcements of unavailable
| models. There are models you can use today, even in the EU.
| ilaksh wrote:
| Actually there is a pretty significant new model announced
| today and available now: "MiniMax (Hailuo) Video-01-Live"
| https://blog.fal.ai/introducing-minimax-hailuo-video-01-live...
|
| Although I tried that and it has the same issue all of them
| seem to have for me: if you are familiar with the face but the
| person is not really famous, then the features in the video are
| never close enough to recognize them as the same person.
| creativenolo wrote:
| It was announced weeks ago.
|
| 50 cents per video. Far more when accounting for a cherrypick
| rate.
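| At a hypothetical 1-in-10 cherry-pick rate, for example, that
| would work out to roughly $5 per usable clip.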
| the8thbit wrote:
| I don't see why, unless you think they're lying and they filmed
| their demos, or used some other preexisting model. I didn't
| ignore the JWST launch just because I haven't been granted the
| ability to use the telescope.
| zb3 wrote:
| Back when Imagen was not public, they didn't properly
| validate whether you were a "trusted tester" on the backend,
| so I managed to generate a few images..
|
| ..and that's when I realized how much cherry picking we have
| in these "demos". These demos are about deceiving you into
| thinking the model is much better than it actually is.
|
| This promotes not making the models available, because people
| then compare their extrapolation of demo images with the
| actual outputs. This can trick people into thinking Google is
| winning the game.
| tauntz wrote:
| Google being Google:
|
| > VideoFX isn't available in your country yet.
| jjbinx007 wrote:
| Give it a few months and it'll get cancelled
| warkdarrior wrote:
| Why would the country get cancelled?
| Jabrov wrote:
| He means the project, obviously
| ilaksh wrote:
| Don't worry, even if it was "available" in your country, it's
| not really available. I am in the US and I just see a waitlist
| sign up.
| xnx wrote:
| This looks great, but I'm confused by this part:
|
| > Veo sample duration is 8s, VideoGen's sample duration is 10s,
| and other models' durations are 5s. We show the full video
| duration to raters.
|
| Could the positive result for Veo 2 mean the raters like longer
| videos? Why not trim Veo 2's output to 5s for a better controlled
| test?
|
| I'm not surprised this isn't open to the public by Google yet,
| there's still a huge amount of volunteer red-teaming to be done
| by the public on other services like hailuoai.video.
|
| P.S. The skate tricks in the final video are delightfully insane.
| echelon wrote:
| > I'm not surprised this isn't open to the public by Google
| yet,
|
| Closed models aren't going to matter in the long run. Hunyuan
| and LTX both run on consumer hardware and produce videos
| similar in quality to Sora Turbo, yet you can train them and
| prompt them on anything. They fit into the open source
| ecosystem which makes building plugins and controls super easy.
|
| Video is going to play out in a way that resembles images.
| Stable Diffusion- and Flux-like players will win. There might be
| room for one or two Midjourney-type players, but by and large
| the most activity happens in the open ecosystem.
| sorenjan wrote:
| > Hunyuan and LTX both run on consumer hardware
|
| Are there other versions than the official?
|
| > An NVIDIA GPU with CUDA support is required.
|
| > Recommended: We recommend using a GPU with 80GB of memory
| for better generation quality.
|
| https://github.com/Tencent/HunyuanVideo
|
| > I am getting CUDA out of memory on an Nvidia L4 with 24 GB
| of VRAM, even after using the bfloat16 optimization.
|
| https://github.com/Lightricks/LTX-Video/issues/64
| jcims wrote:
| Yes. Lots of folks on reddit running it on 24gb cards.
| jokethrowaway wrote:
| Yes you can, with some limitations
|
| https://github.com/Tencent/HunyuanVideo/issues/109
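|
| For scale, a rough back-of-the-envelope check (assuming
| HunyuanVideo's advertised ~13B parameters, and counting only the
| transformer weights) suggests why 24 GB cards need offloading or
| quantization, and why the README recommends 80 GB:
|
|     # Approximate VRAM needed just to hold the transformer weights.
|     # Activations, the text encoder and the VAE all add more on top.
|     params = 13e9  # ~13B parameters (assumed from HunyuanVideo's published size)
|     for precision, bytes_per_param in [("fp32", 4), ("bf16", 2), ("8-bit", 1)]:
|         gib = params * bytes_per_param / 2**30
|         print(f"{precision}: ~{gib:.0f} GiB")  # ~48, ~24 and ~12 GiB
|
| So even in bfloat16 the weights alone roughly fill a 24 GB card,
| which matches the out-of-memory report quoted above.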
| dyauspitr wrote:
| Stable Diffusion and Flux did not win though. Midjourney and
| ChatGPT won.
| griomnib wrote:
| "Won" what exactly? I have no issues running stable
| diffusion locally.
|
| Since Llama3.3 came out it is my first stop for coding
| questions, and I'm only using closed models when llama3.3
| has trouble.
|
| I think it's fairly clear that between open weights and
| LLMs plateauing, the game will be who can build what on top
| of largely equivalent base models.
| dyauspitr wrote:
| The quality of SD is nowhere near that of the clear leaders.
| WillyWonkaJr wrote:
| I wonder if the more decisive aspect is the data, not the
| model. Will closed data win over open data?
|
| With the YouTube corpus at their disposal, I don't see how
| anyone can beat Google for AI video generation.
| sigmar wrote:
| Winning 2:1 in user preference versus Sora Turbo is impressive.
| It seems to have very similar limitations to Sora. For example,
| the leg swapping in the ice skating video, and the beekeeper
| picking up the jar with a very unnatural acceleration (like it
| pops up). Though to my eye it's maybe slightly better at
| emulating natural movement and physics than Sora. The blog post
| has slightly more info:
|
| >at resolutions up to 4K, and extended to minutes in length.
|
| https://blog.google/technology/google-labs/video-image-gener...
| BugsJustFindMe wrote:
| > _the jar is at a very unnatural acceleration (like it pops
| up)._
|
| It does pop up. Look at where his hand is relative to the jar
| when he grabs it vs when he stops lifting it. The hand and the
| jar are both moving, but the jar isn't physically attached to
| the grab.
| torginus wrote:
| It looks like Sora is actually the worst performer in the
| benchmarks, with Kling being the best and the others not far
| behind.
|
| Anyways, I strongly suspect that the funny meme content that
| seems to be the practical use case of these video generators
| won't be possible on either Veo or Sora, because of copyright,
| political correctness, famous people, or other 'safety'-related
| reasons.
| alsodumb wrote:
| My theory as to why all the bigtech companies are investing so
| much money in video generation models is simple: they are trying
| to eliminate the threat of influencers/content creators to their
| ad revenue.
|
| Think about it, almost everyone I know rarely clicks on ads or
| buys from ads anymore. On the other hand, a lot of people
| including myself look into buying something advertised implicitly
| or explicitly by content creators we follow. Say a router
| recommended by LinusTechTips. A lot of brands have started moving
| their ad spending to influencers too.
|
| Google doesn't have a lot of control over these influencers. But if
| they can build good video generation models, they can control this
| ad space too, without a human in the loop.
| PittleyDunkin wrote:
| > Think about it, almost everyone I know rarely clicks on ads
| or buys from ads anymore.
|
| I remember saying this to a google VP fifteen years ago.
| Somehow people are still clicking on ads today.
| dragonwriter wrote:
| > Think about it, almost everyone I know rarely clicks on ads
| or buys from ads anymore.
|
| Most people have claimed not to be influenced by ads since long
| before networked computers were a major medium for delivering
| them.
| spankalee wrote:
| It's so much simpler than that:
|
| 1) AI is a massive wave right now and everyone's afraid that
| they're going to miss it, and that it will change the world.
| They're not obviously wrong!
|
| 2) AI is showing real results in some places. Maybe a lot of us
| are numb to what gen AI can do by now, but the fact that it can
| generate the videos in this post is actually astounding! 10
| years ago it would have been borderline unbelievable. Of course
| they want to keep investing in that.
| summerlight wrote:
| > Think about it, almost everyone I know rarely clicks on ads
| or buys from ads anymore.
|
| This is a typical tech echo chamber. There is a significant
| number of people who make direct purchases through ads.
|
| > But if they can get good video generations models, they can
| control this ad space too without having human in the loop.
|
| That looks like it's based on a misguided assumption. Format
| might have a significant impact on reach, but the deciding factor
| is trust in the reviewer. Video format itself does not guarantee a
| decent CTR/CVR. It's true that ad companies find this space
| lucrative, but they're smart enough to acknowledge this
| complexity.
| the8thbit wrote:
| > This is a typical tech echo chamber. There is a significant
| number of people who make direct purchases through ads.
|
| Even if its not, TV ads, newspaper ads, magazine ads,
| billboards, etc... get exactly 0 clickthrus, and yet, people
| still bought (and continue to buy) them. Why do we act like
| impressions are hunky-dory for every other medium, but
| worthless for web ads?
| vinayuck wrote:
| I had not thought about that angle yet, but I have to admit, I
| agree. I rarely even pay attention to the YT ads and kind of
| just zone out, but the recommendations by content creators I
| usually watch are one of the main ways I keep up with new
| products and decide what to buy.
| veryrealsid wrote:
| FWIW it feels like Google should dominate text/image -> video
| since they have unfettered access to YouTube. Excited to see what
| the reception is here.
| paxys wrote:
| Everyone has access to YouTube. It's safe to assume that Sora
| was trained on it as well.
| Jeff_Brown wrote:
| All you can eat? Surely they charge a lot for that, at least.
| And how would you even find all the videos?
| griomnib wrote:
| They already did it, and I'm guessing they were using some
| of the various YouTube downloaders Google has been going
| after.
| HeatrayEnjoyer wrote:
| Who says they've talked to Google about it at all?
|
| I can't speak to OpenAI but ByteDance isn't waiting for
| permission.
| bangaladore wrote:
| Does everyone have "legal" access to YouTube?
|
| In theory that should matter to something like
| Open(Closed)AI. But who knows.
| dheera wrote:
| I mean, I have trained myself on Youtube.
|
| Why can't a silicon being train itself on Youtube as well?
| dmonitor wrote:
| Because silicon is a robot. A camcorder can't catch a
| flick with me in the theater even if I dress it up like a
| muppet.
| hirako2000 wrote:
| They also had a good chunk of the web's text indexed, millions of
| people's emails sent every day, Google Scholar papers, and the
| massive Google Books effort that digitized most books ever
| published. They even invented the transformer architecture.
| lukol wrote:
| Last time Google made a big Gemini announcement, OpenAI owned
| them by dropping the Sora preview shortly after.
|
| This feels like a bit of a comeback as Veo 2 (subjectively)
| appears to be a step up from what Sora is currently able to
| achieve.
| Jotalea wrote:
| Random fact: Veo means "I see" in Spanish. Take it any way you
| want.
| espadrine wrote:
| Hernan Moraldo is from Argentina. That may be all there is to
| it.
| arnaudsm wrote:
| While "video" means "I see" in Latin.
| dangan wrote:
| Is it just me or do all these models generate everything in a
| weird pseudo-slow motion framerate?
| thatfrenchguy wrote:
| The example of a "Renaissance palace chamber" is historically
| inaccurate by around a century or two; the generated video looks
| a lot like a pastiche of Versailles from the Age of
| Enlightenment instead. I guess that's what you get by training on
| the internet.
| ralfd wrote:
| I watched that 10 times because the details are bonkers, and I
| find it amazing that she and the candle are visible in the
| mirror! Speaking of inaccuracy though, are those
| pencils/text markers/pens on the desk? ;)
| Retr0id wrote:
| Huge swathes of social media users are going to love this shit.
| It makes me so sad.
| jasonjmcghee wrote:
| I appreciate they posted the skateboarding video. Wildly
| unrealistic whenever he performs a trick - just morphing body
| parts.
|
| Some of the videos look incredibly believable though.
| dyauspitr wrote:
| The honey, Peruvian women, swimming dog, beekeeper, DJ, etc.
| are stunning. They're short but I can barely find any
| artifacts.
| johndough wrote:
| It is great to see a limitations section. What would be even
| more honest is a very large list of videos generated without
| any cherry picking to judge the expected quality for the
| average user. Anyway, the lack of more videos suggests that
| there might be something wrong somewhere.
| cyv3r wrote:
| I don't know why they say the model understands physics when it
| still makes mistakes like that.
| bahmboo wrote:
| Cracks in the system are often places where artists find the
| new and interesting. The leg swapping of the ice skater is
| mesmerizing in its own way. It would be useful to be able to
| direct the models in those directions.
| mattigames wrote:
| Just pretend it's a movie about a shape-shifting alien that's
| just trying its best at ice skating; art is subjective like
| that, isn't it? I bet Salvador Dali would have found those
| morphing body parts highly amusing.
| visnup wrote:
| our only hope for verifying truth in the future is that state
| officials give their speeches while doing kick flips and
| frontside 360s.
| markus_zhang wrote:
| Maybe they will do more in person talks, I guess. Back to the
| old times.
| stabbles wrote:
| sadly it's likely that video gen models will master this
| ability faster than state officials
| mikepurvis wrote:
| Remember when the iPhone came out and BlackBerry smugly
| advertised that their products were "tools not toys"?
|
| I remember saying to someone at the time that I was pretty
| sure iPhone was going to get secure corporate email and
| device management faster than BlackBerry was going to get
| an approachable UI, decent camera, or app ecosystem.
| kaonwarb wrote:
| This was my favorite of all of the videos. There's no uncanny
| valley; it's openly absurd, and I watched it 4-5 times with
| increasing enjoyment.
| qwertox wrote:
| OpenAI is like the super luxurious yacht all pretty and shiny,
| while Google's AI department is the humongous nuclear submarine
| at least 5 times bigger than the yacht with a relatively cool
| conning tower, but not that spectacular to look at.
|
| Like a tanker that is still steering to fully align with the
| course people expect of it; they don't recognize that it will
| soon be there, capable of rolling over everything that comes
| in its way.
|
| If OpenAI claims they're close to having AGI, Google most likely
| already has it and is doing its shenanigans with the US
| government under the radar. Meanwhile Microsoft is playing the
| cool guy and Amazon is still trying to get its act together.
| byyoung3 wrote:
| google definitely does not have AGI hhaaha
| JeremyNT wrote:
| Yeah pretty bad example from parent but the point stands I
| think... I mostly just assume that for everything ChatGPT
| hypes/teases Google probably has something equivalent
| internally that they just aren't showing off to the public.
| YetAnotherNick wrote:
| I know that Google's internal ChatGPT alternative was
| significantly worse than ChatGPT (confirmed both in the news
| and by Googlers) around a year back. So you might say they
| might overtake OpenAI because of more resources, but they
| aren't significantly ahead of OpenAI.
| simultsop wrote:
| ex-googler confirms :/
| tokioyoyo wrote:
| All it took was some good old competition with the potential to
| steal the user base from Google's core search product. Nice to
| be back in the competitive era of web tech.
| griomnib wrote:
| Or, using Occam's razor: Sundar is a shit CEO and is playing
| catch-up with a company largely fueled by innovations created at
| Google but never brought to market because they would eat into
| ads revenue.
|
| That, or they have a secret superhuman intelligence under
| wraps at the Pentagon.
| klabb3 wrote:
| It's telling that safety and responsibility get so many fluff
| words and the technical details are fairly extensive, but there's
| no mention of the training data? It's clearly relevant for both
| performance and ethical discussions.
|
| Maybe it's just me who couldn't find it (the website barely
| works at all on FF iOS).
| tokioyoyo wrote:
| Most people called it: the second one of the companies stops
| caring about safety, the others will stop as well. People hate
| being told what they're not supposed to do. And now companies
| will go forward with abandoning their responsible use policies.
| gamesbrainiac wrote:
| This might be a dumb question to ask, but what exactly is this
| useful for? B-Roll for YouTube videos? I'm not sure why so much
| effort is being put into something like this when the
| applications are so limited.
| Philpax wrote:
| Are they that limited? It's a machine that can make videos from
| user input: it can ostensibly be used wherever you need video,
| including for creative, technical and professional
| applications.
|
| Now, it may not be the best fit for those yet due to its
| limitations, but you've gotta walk before you can run: compare
| Stable Diffusion 1.x to FLUX.1 with ControlNet to see where
| quality and controllability could head in the future.
| terhechte wrote:
| Back when computers took up a whole room, you'd also have
| asked: "but what exactly is this useful for? Some simple
| calculations that anybody can do with a piece of paper and a
| pen?"
|
| Think 5-10 years into the future; this is a stepping stone.
| alectroem wrote:
| That's comparing apples to oranges though isn't it?
| Generating videos is the output of the technology, not the
| tech itself. It would be like someone asking "this computer
| that takes up a whole room printed out ascii art, what is
| this useful for?"
| code_for_monkey wrote:
| this is kind of an unfair comparison. What's the endpoint of
| generating AI videos? What can this do that is useful,
| contributes something to society, has artistic value, etc.?
| We can make educational videos with a script, but it's also
| pretty easy for motivated parties to do that already, and it's
| getting easier as cameras get better and smaller. I think asking
| "what's the point of this" is at least fair.
| mindwok wrote:
| They're a way firo
| carlosjobim wrote:
| They were calculating missile trajectories, everybody
| understood what they were useful for.
| terhechte wrote:
| https://www.lexology.com/library/detail.aspx?g=164a442a-1b9
| 0...
| hnuser123456 wrote:
| Because it's pretty cool to be able to imagine any kind of
| scene in your head, put it into words, then see it be made into
| a video file that you can actually see and share and refine.
| carlosjobim wrote:
| Use your imagination.
| picafrost wrote:
| I have observed some musicians creating their own music videos
| with tools like this.
| aenvoker wrote:
| This silly music video was put together by one person in
| about 10 hours.
|
| https://www.reddit.com/r/aivideo/comments/1hbnyi2/comment/m1.
| ..
|
| Another more serious music video also made entirely by one
| person. https://www.youtube.com/watch?v=pdqcnRGzH5c Don't
| know how long it took though.
| krunck wrote:
| Streaming services where there is no end to new content that
| matches your viewing patterns.
| code_for_monkey wrote:
| this sounds awful haha
| drusepth wrote:
| We're preparing to use video generation (specifically
| image+text => video so we can also include an initial
| screenshot of the current game state for style control) for
| generating in-game cutscenes at our video game studio.
| Specifically, we're generating them at play-time in a sandbox-
| like game where the game plays differently each time, and
| therefore we don't want to prerecord any cutscenes.
| moritonal wrote:
| Okay, so is the aim to run this locally on a client's
| computer or to serve it from the cloud? How does the math work
| out such that it's not just easier at that point to render it
| in-game?
| notatoad wrote:
| in its current state, it's already useful for b-roll, video
| backgrounds for websites, and any other sort of "generic"
| application where the point of the shot is just to establish
| mood and fill time.
|
| but more than anything it's useful as a stepping stone to more
| full-featured video generation that can maintain characters and
| story across multiple scenes. it seems clear that at some point
| tools like this will be able to generate full videos, not just
| shots.
| jonas21 wrote:
| If you want to train a model to have a general understanding of
| the physical world, one way is to show it videos and ask it to
| predict what comes next, and then evaluate it on how close it
| was to what actually came next.
|
| To really do well on this task, the model basically has to
| understand physics, and human anatomy, and all sorts of
| cultural things. So you're forcing the model to learn all these
| things about the world, but it's relatively easy to train
| because you can just collect a lot of videos and show the model
| parts of them -- you know what the next frame is, but the model
| doesn't.
|
| Along the way, this also creates a video generation model - but
| you can think of this as more of a nice side effect rather than
| the ultimate goal.
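|
| A minimal sketch of that training signal, with a toy
| convolutional predictor standing in for a real video model
| (illustrative only; nothing here reflects how Veo itself is
| built or trained):
|
|     import torch
|     import torch.nn as nn
|
|     class TinyFramePredictor(nn.Module):
|         """Predicts the next frame from the previous k frames."""
|         def __init__(self, k=4, channels=3):
|             super().__init__()
|             self.net = nn.Sequential(
|                 nn.Conv2d(k * channels, 64, 3, padding=1), nn.ReLU(),
|                 nn.Conv2d(64, channels, 3, padding=1),
|             )
|
|         def forward(self, frames):  # frames: (batch, k, C, H, W)
|             b, k, c, h, w = frames.shape
|             return self.net(frames.reshape(b, k * c, h, w))
|
|     model = TinyFramePredictor()
|     video = torch.rand(2, 5, 3, 64, 64)          # (batch, time, C, H, W)
|     context, target = video[:, :4], video[:, 4]  # show 4 frames, hide the 5th
|     loss = nn.functional.mse_loss(model(context), target)
|     loss.backward()  # gradients push the model toward predicting what comes next
|
| Minimizing that prediction error at enormous scale is what forces
| the kind of "understanding" described above; the generator falls
| out as a side effect.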
| tucnak wrote:
| You really think making videos with computers is not useful? Is
| this a joke?
| wnolens wrote:
| TV commercials / youtube ads. You don't need a video team
| anymore to make an ad.
| ible wrote:
| That product name sucks for Veo, the AI sports video camera
| company that literally makes a product called the Veo 2.
| (https://www.veo.co)
| theorangejuica wrote:
| Time and money are better spent on creating actual video,
| animation, and art than this gen AI drivel.
| demarq wrote:
| Just a reminder that the state of the art was Will Smith
| Eating Spaghetti in March of 2023
|
| https://arstechnica.com/information-technology/2023/03/yes-v...
|
| We're not even done with 2024.
|
| Just imagine what's waiting for us in 2025.
| seanvelasco wrote:
| Just as OpenAI released a feature that hit Google where it hurts,
| Google released Veo 2 to utterly destroy OpenAI's Sora.
|
| Google won.
| markus_zhang wrote:
| My friend working in a TV station is already using these tools to
| generate videos for public advertising programs. It has been a
| blast.
| stabbles wrote:
| It's interesting that they host these videos on YouTube, because
| it signals they're fine with AI-generated content. I wonder if
| Google forgets that the creators themselves are what makes
| YouTube interesting for viewers.
| sylware wrote:
| Does anybody realize this is very sad?
|
| Namely, that it takes so few neurons to get pictures into our
| heads.
|
| I guess end-of-the-world scenarios may lead us to create that
| superintelligence with a gigantic, ultra-performant artificial
| "brain".
___________________________________________________________________
(page generated 2024-12-16 23:00 UTC)