[HN Gopher] Veo
       ___________________________________________________________________
        
       Veo
        
       Author : meetpateltech
       Score  : 834 points
       Date   : 2024-05-14 17:58 UTC (5 hours ago)
        
 (HTM) web link (deepmind.google)
 (TXT) w3m dump (deepmind.google)
        
       | moralestapia wrote:
       | Not nearly as good as Sora.
       | 
       | Google missed this train, big time.
        
       | DarmokJalad1701 wrote:
       | > "SIGN IN TO GET A SNEAK PEAK."
       | 
       | https://theoatmeal.com/comics/sneak_peek
        
         | baal80spam wrote:
         | Whoa. The URL is correct, the text is not.
        
           | sowbug wrote:
           | The page is now fixed. Even for a "test kitchen," that's a
           | shocking error for a company like Google to make.
        
       | hehdhdjehehegwv wrote:
       | You have to log in just to see a demo? They are desperate to
       | track people.
        
       | iamleppert wrote:
       | It's so bad it's laughable. Sundar really needs to crack the whip
       | harder on those Googlers.
        
         | bamboozled wrote:
         | or someone has to crack the whip on Sundar :)
        
           | geodel wrote:
           | His job is keeping stock price up which he is doing well so
           | far. Another is layoffs which again he is doing fine :)
        
       | htrp wrote:
       | Was anyone else confused by that Donald Glover segment? It felt
       | like we were going to get a short film, and we got 3-5 clips?
        
         | curiousgal wrote:
         | Exactly!
         | 
         | _"Hey guys big artist says this is fine so we're good"_
        
         | jsheard wrote:
         | And those clips mostly look like generic stock footage, not
         | something specific that a director might want to pre-vis.
         | 
         | This is what movie pre-vis is actually like, it doesn't need to
         | be pretty, it needs to be _precise_:
         | 
         | https://www.youtube.com/watch?v=KMMeHPGV5VE
        
         | thisoneworks wrote:
         | Yeah, it wasn't obvious what they were trying to show. Demis
         | said feature films will be released in a while
        
         | Keyframe wrote:
         | It felt AI-generated.
        
           | htrp wrote:
           | I wish it were AI Donald Glover talking and the "Apple twist"
           | at the end was that the entire 3 minute segment was a prompt
           | for "Donald Glover talking about how Awesome Gemini Models
           | are in a California vineyard"
        
         | ZiiS wrote:
         | Also it is either very good at generating living people or they
         | need to put more thought into saying "Note: All videos on this
         | page were generated by Veo and have not been modified"
        
           | jsheard wrote:
           | That "footage has not been modified" statement is probably to
           | get ahead of any speculation that it was "cleaned up" in
           | post, after it turned out that the Sora demo of the balloon
           | headed man had fairly extensive manual VFX applied afterwards
           | to fix continuity errors and other artifacts.
        
             | iamdelirium wrote:
             | Wait, where did you hear this? I would assume something
             | like this would have made somewhat of a splash.
        
               | jsheard wrote:
               | The studio was pretty up front about it, they released a
               | making-of video one day after debuting the short which
               | made it clear they used VFX to fix Sora's errors in post,
               | but OpenAI neglected to mention that in their own copy so
               | it flew under the radar for a while.
               | 
               | https://www.youtube.com/watch?v=KFzXwBZgB88
               | 
               | https://www.fxguide.com/fxfeatured/actually-using-sora/
               | 
               |  _> While all the imagery was generated in SORA, the
               | balloon still required a lot of post-work. In addition to
               | isolating the balloon so it could be re-coloured, it
               | would sometimes have a face on Sonny, as if his face was
               | drawn on with a marker, and this would be removed in
               | AfterEffects. similar other artifacts were often
               | removed._
        
       | TIPSIO wrote:
       | Seems like ImageFX, VideoFX (just a Google form and 3 demos),
       | MusicFX, and TextFX at the links are down and not working.
       | 
       | Huge grammar error on front page too.
        
       | indy wrote:
       | As someone who doesn't live in the US this year's Google IO feels
       | like I'm outside looking in at all the cool kids who get to play
       | with the latest toys.
        
         | roynasser wrote:
         | VPN'd right into that playground, turns out the toys were
         | pretty blah
        
         | numbers wrote:
         | don't feel left out, we're all on the wait lists
        
       | curiousgal wrote:
       | Oh look another half baked product release that's not available
       | in any country. They're a joke.
        
         | mupuff1234 wrote:
         | Is Sora available in any country?
        
           | bamboozled wrote:
           | I thought I read they've deemed Sora too dangerous to release
           | pre-election? Or have reservations about it? I might be
           | wrong...
        
             | sib wrote:
             | Sounds like a great excuse / communications strategy!
        
           | jaggs wrote:
           | Apparently it's only released to red teams at the moment as
           | they try to manage safety. There's also the issue about
           | releasing too close to an election?
        
       | sebzim4500 wrote:
       | The Donald Glover segment might be a new low for Google
       | announcement videos. They spent all this time talking up the
       | product but didn't actually show what he had created.
       | 
       | Imagine how bad the model must be if this is the best way Google
       | can think of selling it.
        
         | fakedang wrote:
         | What seems worse is the Google TextFX video with Lupe Fiasco?
         | What the heck am I supposed to get out of watching boring
         | monologues by a couple of people? They could have just as
         | easily shown, with less camera work, Lupe Fiasco actually using
         | the LLM model, but they didn't - or at least not enough to grab
         | my attention in 2 minutes.
         | 
         | Personally, I liked the above link, even as a Google skeptic,
         | but the videos aren't helping their case.
        
       | Horffupolde wrote:
       | Google is the new Kodak.
        
       | loudmax wrote:
       | The videos in this demo are pretty neat. If this had been
       | announced just four months ago we'd all be very impressed by the
       | capabilities.
       | 
       | The problem is that these video clips are very unimpressive
       | compared to the Sora demonstration which came out three months
       | ago. If this demo was announced by some scrappy startup it would
       | be worth taking note. Coming from Google, the inventor of the
       | Transformer and owner of the largest collection of videos in the
       | world, these sample videos are underwhelming.
       | 
       | Having said that, Sora isn't publicly available yet, and maybe
       | Veo will have more to offer than what we see in those short clips
       | when it gets a full release.
        
         | fakedang wrote:
         | Honestly, if Veo becomes public faster than Sora, they could
         | win the video AI race. But what am I wishfully thinking - it's
         | Google we're talking about!
        
           | Jensson wrote:
           | > But what am I wishfully thinking - it's Google we're
           | talking about!
           | 
           | Google the company known to launch way too many products?
           | What other big company launches more stuff early than them?
           | What people complain about Google is that they launch too
           | much and then shut them down, not that they don't launch
           | things.
        
       | mccraveiro wrote:
       | They didn't show any human videos, which could indicate that the
       | technology struggles with generating them.
        
         | karmasimida wrote:
         | Actually there is one in the last demo. It is not a standalone
         | clip, but one shot in the demo where a team uses this model to
         | create a scene with a human in it: they created an image of a
         | black woman, but with only her head in it.
         | 
         | I would generally agree though, it is not normal that they
         | didn't show more humans.
        
         | revscat wrote:
         | I'm sure part of the reason, beyond those given already, is
         | that they want to avoid the debate around nudity.
        
         | dyauspitr wrote:
         | You know why and it's not that their technology struggles with
         | it.
        
         | chubot wrote:
         | It's also probably that it's easier to spot fake humans than to
         | spot fake cats or camels. We are more attuned to the faces of
         | our own species
         | 
         | That is, AI humans can look "creepy" whereas AI animals may
         | not. The cowboy looks pretty good precisely because it's all
         | shadow.
         | 
         | CGI animators can probably explain this better than I can ...
         | they have to spend way more time on certain areas and certain
         | motions, and all the other times it makes sense to "cheat" ...
         | 
         | It explains why CGI characters look a certain way too -- they
         | have to be economical to animate
        
         | mjfl wrote:
         | thank goodness.
        
         | himinlomax wrote:
         | They're probably still wary of their latest PR disaster, the
         | inclusive and diverse WW2 Germans from Gemini.
        
       | xianshou wrote:
       | To quote Twitter/X, "I wonder what OpenAI will release tomorrow
       | and Google will release a waitlist for."
       | 
       | GPT-4o: out
       | 
       | Veo: waitlist
       | 
       | Admittedly this is impressive and the direct comp would be Sora,
       | which isn't out, but sometimes the caricature is very close to
       | the truth.
        
         | jsheard wrote:
         | Then again Veo is in the same category as Sora, which isn't
         | released either, 3 months after the reveal.
        
         | rvnx wrote:
         | "This tool isn't available in your country yet"
        
         | modeless wrote:
         | To be fair, all the voice stuff OpenAI demoed isn't released
         | yet either.
        
         | martinesko36 wrote:
         | This is Google for the last 5+ IOs. They just release waitlists
         | and demos that are leapfrogged by the time they're available to
         | all. (and shut down a few years later)
        
           | htrp wrote:
           | Cite sources?
        
         | skepticATX wrote:
         | OpenAI hardly released gpt-4o. The demo yesterday was clearly a
         | rushed response to I/O. It's quite possible that Google will
         | ship multi-modality features faster than OpenAI will.
        
           | JeremyNT wrote:
           | Yeah I think at this point it's "not if, but when" and the
           | gap between parity is just going to keep shrinking
           | (until/unless there's some kind of copyright/legislative
           | barrier implemented that favors one or the other).
           | 
           | "We have no moat" swings both ways.
        
           | juice_bus wrote:
           | Which of these products Google is releasing can you trust
           | will even be around in a year or two? I'm certainly done
           | trusting Google with new products.
        
           | buildbot wrote:
           | Without doing anything, I have access to GPT-4o in chatgpt
           | and the api already (on a personal account, not related to
           | work). Maybe I'm just super lucky, but it's certainly not
           | vaporware.
        
           | Difwif wrote:
           | What do you mean? Everyone has access to the gpt-4o model
           | right now through ChatGPT and the API. Sure we don't have
           | voice-to-voice but we have a lot more than what Google has
           | promised.
        
             | hbn wrote:
             | How do I get access? I just checked my app and the Premium
             | upgrade says it will unlock GPT-3.5 and GPT-4, so I
             | assume my version is still the old one.
             | 
             | All my apps are updated in the App Store too.
        
               | bagels wrote:
               | I have a paid account, and I didn't have to do anything
               | to use the new model.
        
               | DeRock wrote:
               | I just checked, there was an iOS app update available and
               | it enabled it. I'd check again if there's a new update
               | (version 1.2024.129). Or you could use the website.
        
               | hbn wrote:
               | I'm on the same version and don't see anything different
               | 
               | Website also only has toggle for 3.5 and 4 with the Plus
               | upgrade. Not sure if it's cause I'm in Canada?
        
               | croes wrote:
               | I use their website and it's one of the three models to
               | choose from if you are on a Plus subscription.
        
               | mr_mitm wrote:
               | I have premium access and I can select 4o in the dropdown
               | menu on Android
        
               | cush wrote:
               | I have the 4o model. On premium. No voice yet
        
               | theresistor wrote:
               | To add a counterbalance, I just checked in the app and on
               | the website on a non-paid account, and I too do NOT have
               | GPT-4o.
        
               | hbn wrote:
               | Everyone who says they have access in the replies to my
               | comment seems to be a paid user. So maybe it's only rolling
               | out to them first.
        
               | htrp wrote:
               | I expect they have to offer the paid users something
        
               | hbn wrote:
               | Paid users get like a 5x higher rate limit iirc
        
               | electriclove wrote:
               | My (paid) app has it but no voice chat yet
        
               | TecoAndJix wrote:
               | I have a paid account and can do voice to voice on the
               | iOS app as of last night.
        
               | hbn wrote:
               | The realtime one they showed yesterday or the one that's
               | existed forever where it's just a voice-to-text input and
               | TTS output taking turns?
        
               | TecoAndJix wrote:
               | I feel silly now. I downloaded the app after the
               | announcement (I'm a desktop user) and it looked identical
               | to the one they show in the sarcasm video. When I asked
               | it, I was told it was not the new feature announced
               | yesterday. Still a lot of fun!
               | 
               | Edit - it does list the new model in my app at least
        
               | sib wrote:
               | In the App Store there's a new build of the iOS app as of
               | 3 hours ago (call it about 11am US Pacific time). It
               | includes the GPT-4o model (at least it shows it for me.)
        
               | hbn wrote:
               | Are you a paid user?
        
             | theresistor wrote:
             | It is not available on my (free) account in either the app
             | or the website. So no, everyone does _not_ have access to
             | it.
        
               | satvikpendem wrote:
               | It's for paid users for now, not free. I have ChatGPT Pro
               | and I can use the new model.
        
               | nialv7 wrote:
               | I am a free user and I have 4o. I think it is just a
               | gradual roll out.
        
             | ben_w wrote:
             | API yes, ChatGPT no (at least not for all users); I've got
             | my own web interface for the API so I can play with the
             | model (for all of $0.045 of API fees), but most people
             | can't be bothered with that and will only get 4o when it
             | rolls out as far as their specific ChatGPT account.
        
               | ghshephard wrote:
               | Was running just fine on my ChatGPT client on iOS - full
               | 2-way voice comms by 3:00 PM yesterday. Application was
               | already updated.
        
               | mike_hearn wrote:
               | I have a regular ChatGPT Pro account and I have GPT-4o.
               | 
               | The bigger issue is that 4o without the multi-modal, new
               | speech capabilities or desktop app isn't that different
               | to GPT-4. And those things aren't yet launched.
        
             | TecoAndJix wrote:
             | Posted further down the thread - I have a paid account and
             | can do voice to voice on the iOS app as of last night.
        
             | baobabKoodaa wrote:
             | I don't have access to gpt-4o via ChatGPT
        
           | localfirst wrote:
           | listen all these guys out here attacking Google and making
           | outlandish/false claims
           | 
           | look at their linkedin pages, that will tell you why they are
           | desperate
           | 
           | (hint: they bought OpenAI bags on the secondary market)
        
         | rvz wrote:
         | Sora is the closest comparison to Veo and both aren't out.
         | 
         | It's been there for three months and still isn't even close to
         | being released and available.
         | 
         | Essentially Google has already caught up to OpenAI with their
         | recent responses and it's clear that there are private OpenAI
         | investors pushing such nonsense around Google struggling to
         | compete.
        
         | resource_waste wrote:
         | Google Press: This is the greatest AI Model ever yet.
         | 
         | Users: Lol it won't even tell me how to draw a picture of a
         | human because it's inappropriate.
         | 
         | Google flipped like a switch a few years ago. Instead of going
         | for product quality, it seems they went full Apple Marketing
         | and control the narrative of top social media.
         | 
         | I keep thinking: "Well, it's Google, they will be the best,
         | right?" No, I'm giving up on Google, they are not as
         | powerful as I once thought... Hmm seems like a good time to get
         | into Lobbying and Marketing...
        
         | nextworddev wrote:
         | Except gpt-4o with audio and video inputs isn't actually out
        
           | adamtaylor_13 wrote:
           | I was using it yesterday in the mobile app. Unless they just
           | slapped the new UI on an older model.
        
             | hakanensari wrote:
             | It's no longer there, I think?
        
             | Cyph0n wrote:
             | I think they (partially?) rolled it back. I tried out the
             | voice input yesterday, but it's missing from the app today.
        
             | Workaccount2 wrote:
             | They just released the text chat model which still uses the
             | same old audio interface as 4. The new audio/video chat
             | stuff is not out yet (unless you are a very lucky early
             | beta user).
        
         | dom96 wrote:
         | > GPT-4o: out
         | 
         | Is it? I can't use it yet at least
        
           | lordswork wrote:
           | I am also wondering how to use it..
        
           | drawnwren wrote:
           | It is. I've got it already, but I'm a bit of a gpt4 power
           | user. I hit my rate limit biweekly or so and run up close to
           | it every day. I'd bet maybe they prioritized people that were
           | costing them money.
        
             | kaibee wrote:
             | It might be just by sign-up order. I signed up for pro
             | basically as soon as I could, but I never hit limits, and
             | only really use it once or twice a day, sometimes not at
             | all.
        
               | drawnwren wrote:
               | Interestingly, when I use Cloudflare's Warp DNS I don't
               | have access to it. So, it might have something to do w/
               | region as well?
        
             | phyalow wrote:
             | Really? You have the new model, sure, I have it too, but
             | afaik nobody has the new ultra fast and variable voice +
             | video chat on mobile.
        
               | drawnwren wrote:
               | The original question asked if anyone had GPT-4o. You're
               | asking a different question.
        
           | w-m wrote:
           | GPT-4o is available for me on ChatGPT, with the
           | text+attachment input (as a Plus user from Germany). It's
           | crazy fast. The voice for the audio conversation in the app
           | is still the old one, and doesn't let you interrupt it.
        
         | mrkramer wrote:
         | Google is scared of what every new model can produce, they
         | don't want drama but they always end up in some kind of media
         | drama.
        
         | Tenoke wrote:
         | I can't even join the waitlist from Europe while 4o is fully
         | available here.
        
         | dyauspitr wrote:
         | I haven't been able to try out 4o. The voice chat continuously
         | says there's too much traffic and I don't even see a button to
         | turn on the camera
        
         | qwertox wrote:
         | > GPT-4o: out
         | 
         | I don't know what's wrong with GPT-4o, but the answers I'm
         | getting are much worse than before yesterday. It's constantly
         | repeating the entire content required to provide a seemingly
         | "full" answer, but if it passes me the same but slightly
         | modified Python code for the fifth time even if it has become
         | irrelevant to the current conversation, it really gets on my
         | nerves.
         | 
         | I had such well-tuned custom instructions, which worked
         | beautifully and now it's as if it is ignoring most of them.
         | 
         | It's causing me frustration and really wasting my time when I
         | have to wait for the unnecessarily long answers to finish.
        
       | endisneigh wrote:
       | I've noticed that a lot of the commentary on these models creates
       | the sort of fervor you see in politics or sports.
       | 
       | In any case - no details on compute needed. Curious if this ever
       | can be cheap. Even Midjourney still requires a lot.
       | 
       | I'm also surprised there hasn't been some attempt at creating
       | benchmarks for this. One example could be color accuracy.
        
         | stefan_ wrote:
         | Never mind no benchmarks, half of these announcements in the
         | past were straight _made up_, "offline enhanced" cherry picked
         | "examples", CGI fantasies.
         | 
         | Not to mention the whole AGI topic is forever doomed from SciFi
         | fans, just remember what happened with that room-temperature
         | superconductivity.
        
       | inasio wrote:
       | From a 2014 Wired article [0]: "The average shot length of
       | English language films has declined from about 12 seconds in 1930
       | to about 2.5 seconds today"
       | 
       | I can see more real-world impact from this (and/or Sora) than
       | most other AI tools
       | 
       | [0] https://www.wired.com/2014/09/cinema-is-evolving/
        
         | jsheard wrote:
         | Even if the shots are very short you still need coherency
         | _between_ shots, and they don't seem to have tackled that
         | problem yet.
        
         | mattgreenrocks wrote:
         | This is very noticeable. Watching movies from the 1970s is
         | positively serene for me, vs the shot time on modern films
         | often leaves me wondering, "wait, what just happened there?"
         | 
         | And I'm someone who is fine playing fast action video games.
         | Can't imagine what it's like if you're older or have sensory
         | processing issues.
        
           | ryandrake wrote:
           | Obligatory: Liam Neeson jumps over a fence in 6 seconds, with
           | 14 cuts[1].
           | 
           | 1: https://www.youtube.com/watch?v=gCKhktcbfQM
        
             | aidenn0 wrote:
             | I'd like to fact check this amazing comment on that video,
             | but it would require watching Taken 3:
             | 
             | > Some of y'all may find how awful this editing gets pretty
             | interesting: I did an Average Shot Length (ASL) for many
             | movies for a recent project, and just to illustrate bad
             | overediting in action movies, I looked at Taken 3 (2014) in
             | its extended cut.
             | 
             | > The longest shot in the movie is the last shot, an aerial
             | shot of a pier at sunset ending the movie as the end
             | credits start rolling over them. It clocks in at a runtime
             | of 41 seconds and is, _BY FAR_, the longest shot in the
             | movie.
             | 
             | > The next longest is a helicopter establishing shot of the
             | daughter's college after the "action scene" there a little
             | over an hour in, at 5 seconds.
             | 
             | > Otherwise, the ASL for Taken 3 (minus the end
             | credits/opening logos), which has a runtime of 1:49:40,
             | 4,561 shots in all (!!!), is 1.38 SECONDS. For comparison,
             | Zack Snyder's Justice League (2021) (minus end
             | credits/opening logos) is 3:50:59, with 3163 shots overall,
             | giving it an ASL of 4.40 seconds, and this movie, at 1 hour
             | 50 minutes, has north of 4,561 for an ASL of 1.38
             | seconds?!?! _Taken 3 has more shots in it than Zack Snyder's
             | Justice League, a movie more than double its length..._
             | 
             | > To further illustrate how ridiculous this editing gets,
             | the ASL for Taken 3's non-action scenes is 2.27 seconds. To
             | reiterate, this is the non-action scenes. The "slow
             | scenes." The character stuff. Dialogue scenes. The stuff
             | where any other movie would know to slow down. 2.27 SECONDS.
             | For comparison, Mad Max: Fury Road (minus end
             | credits/opening logos) has a runtime of 1:51:58, with 2646
             | shots overall, for an ASL of 2.54 seconds. TAKEN 3'S "SLOW
             | SCENES" ARE EDITED MORE AGGRESSIVELY THAN MAD MAX: FURY
             | ROAD!
             | 
             | > And Taken 3's action scenes? _Their ASL is 0.68 seconds!_
             | 
             | > If it weren't for the sound people on the movie, Taken 3
             | wouldn't be an "action movie". It'd be abstract art.
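
The average shot length (ASL) figures quoted above are just runtime divided by shot count, so the arithmetic can be checked without watching the film. Below is a minimal Python sketch that uses only the runtimes and shot counts as quoted in the comment; the helper names `to_seconds` and `average_shot_length` are made up for illustration. The quoted runtimes may not be the exact credit-free durations the commenter measured, so the results need not match the quoted ASLs to the hundredth.

```python
def to_seconds(runtime: str) -> int:
    """Convert an 'H:MM:SS' runtime string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in runtime.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def average_shot_length(runtime: str, shots: int) -> float:
    """Average shot length in seconds: total runtime divided by shot count."""
    return to_seconds(runtime) / shots


# Runtimes and shot counts exactly as quoted in the comment above.
quoted_figures = {
    "Taken 3": ("1:49:40", 4561),
    "Zack Snyder's Justice League": ("3:50:59", 3163),
    "Mad Max: Fury Road": ("1:51:58", 2646),
}

for title, (runtime, shots) in quoted_figures.items():
    asl = average_shot_length(runtime, shots)
    print(f"{title}: {asl:.2f} s per shot")
```

With these inputs the Mad Max: Fury Road figure comes out at about 2.54 seconds, in line with the quote, while the other two land within a few hundredths of a second of the quoted values.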
        
               | throwup238 wrote:
               | It's worth noting that Taken 3 has a 13% rating on Rotten
               | Tomatoes, which is well into "it's so bad it's good"
               | territory. I don't think the rapid cuts went unnoticed.
        
               | nimithryn wrote:
               | Yeah, this sequence is a meme commonly cited to show
               | "choppy modern editing"
        
               | llmblockchain wrote:
               | More chops than an MF DOOM track.
        
             | kristofferR wrote:
             | The top comment makes a really good point though:
             | 
             | "He's 68. I'm guessing they stitched it together like this
             | because "geriatric spends 30 seconds scaling chainlink
             | fence then breaks a hip" doesn't exactly make for riveting
             | action flick fare."
             | 
             | Lingering shots are horrible for obscuring things.
        
               | lupire wrote:
               | Movies have stunt performers.
               | 
               | And Neeson was only 60 when filming Taken 3.
        
               | troupo wrote:
               | Keanu Reeves was 57-8 when he shot the last _John Wick_.
               | IIRC Bob Odenkirk was 58 in _Nobody_. Neeson was 60 in
               | Taken 3.
               | 
               | There are ways to shoot an action scene with an aging star
               | that don't involve 14 cuts in 4 seconds. You just have
               | to care about your craft.
        
             | nineteen999 wrote:
             | Is it Liam Neeson, or his stunt double?
        
           | psbp wrote:
           | My brain processes too slow for modern action movies.
           | 
           | I can tell what's going on, but I always end up feeling
           | agitated.
        
           | kemitchell wrote:
           | Enjoy some Tarkovsky.
        
         | joshuahedlund wrote:
         | How many of those 2.5 second "shots" are back-and-forths
         | between two perspectives (ex. of two characters talking to one
         | another) where each perspective is consistent with itself? This
         | would be extremely relevant for how many seconds of consistent
         | footage are actually needed for an AI-generated "shot" at
         | film-level quality.
        
       | Keyframe wrote:
       | Kind of sucks to be Google. Even though they're making good
       | progress here, and have laid the foundations of a lot if not most
       | things, their products are... well, there aren't any noteworthy
       | ones compared to the rest. And considering Google is sitting on
       | top of one of the largest if not THE largest video database,
       | along with maps, traffic, search, internet.zip, usenet, vast
       | computing resources vertically integrated, they have every
       | advantage in the world. So what the hell are they doing? Why
       | isn't their CEO already out? Expectations from them are higher
       | than from anyone else.
        
         | InfiniteVortex wrote:
         | Google search has been absolutely ruined in terms of quality.
         | You're right, they've built the base in terms of R&D for many
         | of the AI breakthroughs that are powering competing alternative
         | products.... that happen to be better than Google's own
         | products. Google went from "Don't be evil" to just another big
         | corporate tech company. They have so much potential.
         | Regrettable.
        
           | CraftingLinks wrote:
           | They are fast on their way to becoming IBM 2.0.
        
             | jason-phillips wrote:
             | More like Xerox
        
           | dyauspitr wrote:
           | If anything google search with the Gemini area on the top has
           | been very good for me.
        
         | atleastoptimal wrote:
         | Because they punish experimentation as it eats into their
         | bottom line. AI is a tool for ads in the mind of executives at
         | Google. Ads and monetization of human productivity, not an
         | agent of productivity on its own.
        
           | khazhoux wrote:
           | C'mon, Google doesn't "punish" experimentation. Google X,
           | Google Glass, Daydream, Fuchsia, moonshots, the lab spinoff
           | (whose name I can't remember)... hell, even all the abandoned
           | products everyone here always complains about.
           | 
           | The experiments often/usually fail, but they _do_ experiment.
        
             | Koffiepoeder wrote:
             | If you prune all the branches, where will the fruits grow?
        
               | khazhoux wrote:
               | The branches were dead and could bear no fruit. New
               | branches will sprout next season.
        
           | lolinder wrote:
           | "Laser-focused on the bottom line at the expense of all else"
           | is not how I'd describe Google, now or at any point in the
           | past. They have a _lot_ of dysfunction, but if anything that
           | dysfunction stems from _too much_ experimentation and
           | autonomy at the leaf nodes of the organization. That's how
           | they get into these crazy places where they have to pick
           | between 5 chat apps or whatever.
           | 
           | If Google were as focused on ads as you seem to think we'd at
           | least see some sort of coherent org-wide strategy instead of
           | a complete lack of direction.
        
             | criddell wrote:
             | I'd describe Google as focused on the bottom line after
             | they put the ads guy in charge of search.
             | 
             | I'm referring to this article that was posted here
             | recently:
             | 
             | https://www.wheresyoured.at/the-men-who-killed-google/
        
               | khazhoux wrote:
               | The person now in charge of Search is Elizabeth Hamon
               | Reid, a long-time googler who came up through the ranks
               | from engineer (in Google Maps) to VP over 20 years. She's
               | legit.
        
               | criddell wrote:
               | Is Wikipedia out of date then?
               | 
               | https://en.wikipedia.org/wiki/Prabhakar_Raghavan
        
               | khazhoux wrote:
               | Ah, according to this, she's head of Search but reports
               | to Prabhakar. I thought from recent reports that she'd
               | taken search over from him.
               | 
               | Nonetheless, she was a good engineer and a good manager,
               | back when we crossed paths many moons ago.
               | 
               | https://searchengineland.com/liz-reid-google-new-head-of-
               | sea...
        
               | lolinder wrote:
               | That was a decision that prioritized the bottom line over
               | other things. But saying that Google is "focused" on the
               | bottom line implies that there's a pattern of them
               | putting the bottom line first, which is simply not true
               | if you look at Google as a whole. Search specifically,
               | maybe, but not Alphabet.
        
         | Workaccount2 wrote:
         | I don't know how more people don't talk about the 1M context
         | tokens. While the output is mediocre for cutting edge models,
         | you can context stuff the ever living hell out of it for some
         | pretty amazing capabilities. 2M tokens is even crazier.
        
           | lordswork wrote:
           | It is pretty amazing. I've been using it every day. I do wish
           | you could easily upload an entire repo into it though.
        
             | bongodongobob wrote:
             | Have it write a program to output a repo as a flat file.
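             | 
             | A minimal sketch of such a program, assuming Python. The
             | `flatten_repo` name, the header format and the skipped
             | directories are hypothetical choices for illustration, not
             | anything Gemini requires:

```python
import os

# Hypothetical defaults: directories that rarely help as model context.
SKIP_DIRS = {".git", "node_modules", "__pycache__"}


def flatten_repo(root, skip_dirs=SKIP_DIRS):
    """Return the text files under `root` as one string, each preceded by a path header."""
    parts = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as handle:
                    text = handle.read()
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            parts.append(f"===== {rel} =====\n{text}\n")
    return "\n".join(parts)
```

             | Write the result of `flatten_repo(".")` to a single text
             | file and upload that.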
        
           | rm_-rf_slash wrote:
           | Anything approaching the token limit I turn into a file and
           | upload to a vector store. Results are comparable between Chat
           | and Assistants.
        
           | Keyframe wrote:
           | That's a good point. Gemini gatekeeping me on so many answers
           | made me forget about this extraordinary feature of it.
        
         | softwaredoug wrote:
         | It's often said you need to disrupt your own business model.
         | 
         | Google had blinders on. They didn't relentlessly focus on
         | reinventing their domain. They just milked what they had.
         | Gradually losing sight of the user experience[1] to focus on
         | monetization above all else.
         | 
         | 1 - https://twitter.com/pdrmnvd/status/1707395736458207430
        
         | dyauspitr wrote:
         | Their CEO is generating massive, growing profits every quarter
         | while releasing generative technology, all the while treading
         | a fine line in what those models generate because it can be
         | pretty devastating for a large corp like Google.
        
           | Keyframe wrote:
           | you think it's because of him or despite him?
        
       | airstrike wrote:
       | > Veo
       | 
       | > Sign up to try VisionFX
       | 
       | Is it Veo or VisionFX? Is it a sign up, a trial, or a waitlist?
       | 
       | How hard can it be to write a clear message? In the words of Don
       | Miller, if you confuse, you lose.
        
         | therein wrote:
         | Yeah I was like so is it Veo or VisionFX.
         | 
         | This landing page feels as haphazardly put together as the
         | Coinbase downtime page last night.
        
         | peppertree wrote:
         | This is very on-brand with how Google does branding. "are you
         | confused yet? no? try this other vaguely similar name."
        
           | davidw wrote:
           | Maybe it's going to be a new messaging app - but with AI!
           | 
           | Kidding... I signed up for the waitlist. I have ideas for
           | videos I'd like to use to explain things that I have no hope
           | of creating myself.
        
         | BlackJack wrote:
         | Disclaimer: I work at Google on related stuff
         | 
         | Veo is the name of a video model. VideoFX is the name of a new
         | experimental tool at labs.google.com, which uses Veo and lets
         | you make videos.
         | 
         | Thanks for the feedback though, I see how it's confusing for
         | users.
        
           | zb3 wrote:
           | I see the endpoint returns "Not Implemented" when trying to
           | make a video :<
           | 
           | Imagen 3 is awesome though, generates nice logos :D
        
         | mike_hearn wrote:
         | Presumably this is DeepMind vs Labs fighting over the same
         | project. A consequence of guaranteeing Demis some level of
         | independence when DeepMind was bought, which still shows
         | through in the fact that the DeepMind brand(s) survive.
        
         | qingcharles wrote:
         | And: Communication isn't what you say, it's what people hear
         | 
         | Agree this is totally confusing.
        
       | rishav_sharan wrote:
       | Now that the first direct competitor to Sora has been announced,
       | I am sure Sora will be suddenly ready for public consumption, all
       | its AI safety concerns forgotten.
        
         | sebastiennight wrote:
         | I think there's a tremendous compute cost associated with both
         | models still... I can't see how either company could withstand
         | the instant enormous demand, even if they tried to command
         | crazy prices.
         | 
         | Even at $1 per 5-second video, I think some use cases
         | (including fun/non-business ones) would still overwhelm
         | capacity.
        
       | popcar2 wrote:
       | Not nearly as impressive as Sora. Sora was impressive because the
       | clips were long and had lots of rapid movement since video models
       | tend to fall apart when the movement isn't easy to predict.
       | 
       | By comparison, the shots here are only a few seconds long and
       | almost all look like slow motion or slow panning shots
       | cherrypicked because they don't have that much movement. Compare
       | that to Sora's videos of people walking at real speed.
       | 
       | The only shot they had that can compare was the cyberpunk video
       | they linked to, and it looks crazy inconsistent. Real shame.
        
         | nuz wrote:
         | Sora is also movement limited to a certain range if you look at
         | the clips closely. Probably something like filtering by some
         | function of optical flow in both cases.
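         | 
         | A rough sketch of what I mean, in Python. Everything here is
         | an assumption for illustration: neither company has described
         | its pipeline, the thresholds are made up, and the flow fields
         | are toy lists of (dx, dy) vectors where a real pipeline would
         | get them from an optical flow estimator:

```python
import math


def mean_flow_magnitude(flow):
    """Average length of the per-pixel motion vectors (dx, dy) in one flow field."""
    return sum(math.hypot(dx, dy) for dx, dy in flow) / len(flow)


def motion_score(flows):
    """Average motion over all consecutive-frame flow fields of a clip."""
    return sum(mean_flow_magnitude(f) for f in flows) / len(flows)


def filter_clips(clips, low, high):
    """Keep the names of clips whose motion score lies within [low, high]."""
    return [name for name, flows in clips.items() if low <= motion_score(flows) <= high]


# Toy data: one flow field per clip, two "pixels" each.
clips = {
    "static_shot": [[(0.0, 0.0), (0.0, 0.0)]],
    "slow_pan": [[(3.0, 4.0), (3.0, 4.0)]],
    "fast_action": [[(30.0, 40.0), (30.0, 40.0)]],
}
print(filter_clips(clips, low=1.0, high=10.0))  # ['slow_pan']
```

         | A band-pass like this would drop both near-static clips and
         | very fast ones, which would explain the limited range.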
        
         | ein0p wrote:
         | Also Sora demos had some really impressive generations
         | featuring _people_. Here we hardly see any people which likely
         | means exactly what you'd guess.
        
           | data-ottawa wrote:
           | Has Gemini started generating images of people again? My
           | trial has ended and I haven't been following the issue.
        
         | spiderfarmer wrote:
         | Also the horse just looks weird, just like the buildings and
         | peppers.
         | 
         | It's impressive as hell though. Even if it would only be used
         | to extrapolate existing video.
        
         | LZ_Khan wrote:
         | I imagine that's just a function of how much training data you
         | throw at it.
        
         | Jensson wrote:
         | > Sora was impressive because the clips were long and had lots
         | of rapid movement
         | 
         | Sora videos ran at 1 beat per second, so everything in the
         | image moved at the same beat and often too slow or too fast to
         | keep the pace.
         | 
         | It is very obvious when you inspect the images and notice that
         | there are keyframes at every whole second mark and everything
         | on the screen suddenly goes into its next animation step.
         | 
         | That really limits the kind of videos you can generate.
        
           | lupire wrote:
           | So it needs to learn how far each object can travel in 1sec
           | at its natural speed?
        
             | Jensson wrote:
             | It also needs to separate animation steps for different
             | objects so that objects can keep different speeds. It isn't
             | trivial at all to go from having a keyframe for the whole
             | picture to having separate keyframes for separate parts: you
             | need to retrain the whole thing from the ground up, and the
             | results will be way worse until you figure out a way to train
             | that.
             | 
             | My point is that it isn't obvious at all that Sora's way
             | actually is closer to the end goal. It might look better
             | today to have those 1-second beats for every video, but
             | where do you go from there?
        
         | TIPSIO wrote:
         | Objectively speaking (if people would be honest with
         | themselves), both are just decent at best.
         | 
         | I think comparing them now is probably not that useful outside
         | of this AI hype train. Like comparing two children. A lot can
         | happen.
         | 
         | The bigger message I am getting from this is it's clear OpenAI
         | won't have a super AI monopoly.
        
           | TaylorAlexander wrote:
           | Comparing two children is a good one. My girlfriend has taken
           | to pointing out when I'm engaging in "punditry". They're an
           | engineer like I am and we talk about tech all the time, but
           | sometimes I talk about which company is beating which company
           | like it's a football game, and they call me out for it.
           | 
           | Video models are interesting, and to some extent trying to
           | imagine which company is gonna eat the other's lunch is kind
           | of interesting, but sometimes that's all people are
           | interested in and I can see my girlfriend's reasoning for
           | being disinterested in such discussion.
        
           | motoxpro wrote:
           | What would make this "Good?"
        
         | dangoodmanUT wrote:
         | Can't wait to see your model.
        
         | arcastroe wrote:
         | > The shots here [..] almost all look like slow motion or slow
         | panning shots.
         | 
         | I think this is arguably better than the alternative. With
         | slow-mo generated videos, you can always speed them up in
         | editing. It's much harder to take a fast-paced video and slow
         | it down without terrible loss in quality.
        
         | totaldude87 wrote:
         | Could also be the doing of Google. If Veo screws up, the
         | weight falls on Alphabet stock, while OpenAI is not public and
         | doesn't have to worry about anything. Even if OpenAI faked
         | some of their AI videos (not saying they did), it wouldn't
         | affect them the way it would affect Veo --> Google --> Alphabet.
         | 
         | Being cautious often puts a dent in innovation.
        
           | soulofmischief wrote:
           | You mean like how they faked some Gemini stuff?
           | 
           | https://www.bbc.com/news/technology-67650807
        
         | latexr wrote:
         | > Not nearly as impressive as Sora. Sora was impressive because
         | the clips were long and had lots of rapid movement
         | 
         | The most impressive Sora demo was heavily edited.
         | 
         | https://www.fxguide.com/fxfeatured/actually-using-sora/
        
           | rvz wrote:
           | Interesting to see that OpenAI was successful in creating
           | their own reality distortion spells, just like Apple's
           | reality distortion field which has fooled many of these
           | commenters here.
           | 
           | It's quite early to race to the conclusion that one is better
           | than the other when not only they are both unreleased, but
           | especially when the demos can be edited, faked or altered to
           | look great for optics and distortion.
           | 
           | EDIT: It appears there is at least one commenter who replied
           | below that is upset with this fact above.
           | 
           | It is OK to cope, but the truth really doesn't care
           | especially when the competition (Google) came out much
           | stronger than expected with their announcements.
        
             | ijidak wrote:
             | Well, as a counterpoint, Apple did become a $2 trillion
             | company...
             | 
             | Distortion is easiest when the products really work. :)
        
               | adventured wrote:
               | Apple got up to $3 trillion back in 2023.
        
           | jsheard wrote:
           | To Shy Kids' credit _they_ made it clear the Sora footage was
           | heavily edited, but OpenAI's site still presents Air Head
           | without that context.
           | 
           | https://www.youtube.com/watch?v=KFzXwBZgB88 (posted the day
           | after the short debuted)
           | 
           | https://openai.com/index/sora-first-impressions (no mention
           | of editing, nor do they link to the above making-of video)
        
             | seoulmetro wrote:
             | There is now on that second link:
             | 
             | >The videos below were edited by the artists, who
             | creatively integrated Sora into their work, and had the
             | freedom to modify the content Sora generated.
        
               | jsheard wrote:
               | Ha, here's an archive from yesterday for posterity.
               | 
               | https://web.archive.org/web/20240513050023/https://openai
               | .co...
        
       | axblount wrote:
       | I hate to be so cynical, but I'm dreading the inevitable flood of
       | AI generated video spam.
       | 
       | We really are about _this_ close to Infinite Jest. Imagine TikTok's
       | algorithm with on demand video generation to suit your exact
       | tastes. It may erase the social aspect, but for many users I
       | doubt that would matter too much. "Lurking" into oblivion.
        
         | lordswork wrote:
         | It's already here. There are communities forming around
         | generating passive income from mass producing AI videos as
         | tiktoks and shorts.
        
           | axblount wrote:
           | I saw one of those where a guy just made videos about
           | increasingly elaborate AI generated cakes. You're right, I
           | guess we're mostly there.
           | 
           | But those still require some human input. I'm imagining a
           | sort of genetic algorithm for video prompts, no human
           | editing, input, or curation required.
        
         | barbariangrunge wrote:
         | YouTube's endgame is to not need content creators in the loop
         | any more. The algorithm will just create everything
        
           | esafak wrote:
           | The endgame of that is that people will leave.
        
             | darby_eight wrote:
             | I'm somewhat surprised people still watch YouTube with the
             | horrible recommendations and non-stop spam
        
           | belter wrote:
           | Henry Ford II: Walter, how are you going to get those robots
           | to pay your union dues?
           | 
           | Walter Reuther: Henry, how are you going to get them to buy
           | your cars?
        
         | LZ_Khan wrote:
         | I had the same thought regarding infinite jest recently
        
         | rm_-rf_slash wrote:
         | And somehow our exact tastes would also include influencer
         | coded advertisements.
        
         | beacon294 wrote:
         | Can you explain this aspect of infinite jest to me without
         | spoiling the book?
        
           | _xander wrote:
           | It's introduced early on (and not what the book is really
           | about): distribution of a video that is so entertaining that
           | any viewer is compelled to watch it until they die
        
         | jprete wrote:
         | At the bottom of the text blurb on the Veo page: "In the
         | future, we'll also bring some of Veo's capabilities to YouTube
         | Shorts and other products."
         | 
         | So...you're not cynical, it's an explicit product goal.
        
         | Invictus0 wrote:
         | This basically already exists for porn
        
         | redml wrote:
         | I think of it as we're replacing the SEO spam we have right now
         | with AI spam. At least now we can fight that with more AI.
        
         | layer8 wrote:
         | If it really suited my exact tastes, that would actually be
         | great. But I don't see how we're anywhere close to that. And
         | they won't target matching your exact taste. They will target
         | the threshold where it's just barely interesting enough that
         | people don't turn it off.
        
       | fidotron wrote:
       | If you were of a mind to give Google the benefit of the doubt you
       | would have to think they are desperately trying not to
       | overpromise and underdeliver, partly because that has been their
       | track record to date. It's a very curious time to choose to make
       | this switch though given their competition, and if it was
       | motivated by the reception Bard received then it shows they
       | didn't learn the right lessons from that mess at all.
        
       | aaroninsf wrote:
       | It's mildly interesting how many of the samples shown fail to
       | fully conform to the prompts. Lots of specifics are missing.
       | 
       | Kudos to Google for being, if not foregrounding it, entirely
       | transparent about this.
        
       | willsmith72 wrote:
       | all of this stuff i'll believe when it's ready for public release
       | 
       | 1. safety measures lead to huge quality reductions
       | 
       | 2. the devil's in the details. you can make me 1 million videos
       | which look 99% realistic, but it's useless. consumers can pick it
       | instantly, and it's a gigantic turn-off for any brand
        
         | aprilthird2021 wrote:
         | There'll always be a market for cheap low-quality videos, and
         | vice versa always a market for shockingly high quality videos.
         | K. Asif's Mughal-e-Azam had enormous ticket sales and a huge
         | budget spent on all sorts of stuff, like actual gold jewelry
         | to make the actors feel that they were important despite the
         | film being black and white.
         | 
         | No matter how good AI gets, it will never be the highest
         | budget. Hell, even technically more accurate quartz watches
         | cannot compete price-wise with mechanical masterpiece watches
         | of lower accuracy
        
       | barbariangrunge wrote:
       | The company that controls online video is announcing a new tool,
       | and ambitions to develop it further, to create videos without
       | need for content creators. Using their videos to make a machine
       | that will cut them out of the loop.
        
       | hipadev23 wrote:
       | I've never had to click "Sign in" so many times in a row.
        
         | flying_whale wrote:
         | ...and then fill out an actual google form at the end, _after_
         | you've already signed in, to be added to the waitlist :sigh:
        
           | throwup238 wrote:
           | ...and enter your email into the form again despite being
           | logged into a Google account.
        
       | mrkramer wrote:
       | YouTube people: We need more UGC.
       | 
       | DeepMind people: AI can do it.
        
       | belval wrote:
       | While it's cool that they chose to showcase full-resolution
       | videos, they take so long to load I thought their videos were
       | just a stuttery mess.
       | 
       | Turns out if you open the video in a new tab the smoothness is
       | much more impressive.
        
       | robertlagrant wrote:
       | Hold on to your papers!
        
       | esafak wrote:
       | I love the reference to Llama with the alpacas.
        
       | typpo wrote:
       | The amount of negativity in these comments is astounding.
       | Congrats to the teams at Google on what they have built, and
       | hoping for more competition and progress in this space.
        
         | localfirst wrote:
         | We have to take into account that this community (good chunk have
         | stakes in YC and a lot to gain from secondary shares in OpenAI)
         | and platform is going to favor its own and be aware that Sam
         | Altman is the golden boy of YC's founder after all.
         | 
         | So of course you are going to see snarky comments and straight
         | up denial of the competition. We saw that yesterday in the
         | comments with the release of GPT-4o in anticipation of Gemini
         | 2.0 (GPT-5 basically) release being announced today at Google
         | I/O
         | 
         | I'm SORA to say Veo looks much more polished without jank.
         | 
         | Big congratulations to Google and their excellent AI team for
         | not editing their AI generated videos like SORA
        
           | JumpCrisscross wrote:
           | > _platform is going to favor its own and be aware that Sam
           | Altman is the golden boy of YC's founder_
           | 
           | I don't know if there is a sentiment analysis tool for HN,
           | but I'm pretty sure it's been dead negative for Altman since
           | at least Worldcoin.
        
             | saalweachter wrote:
             | A land of contrasts, etc.
        
             | cosmotron wrote:
             | Something in this vein was just posted here a few days
             | back: https://news.ycombinator.com/item?id=40307519
        
           | baobabKoodaa wrote:
           | > We have to take account that this community (good chunk
           | have stakes in YC and a lot to gain from secondary shares in
           | OpenAI)
           | 
           | You have to be pretty deep inside your own little bubble to
           | think that even more than 0.001% of HN has "stakes in YC"
           | or "secondary shares in OpenAI".
        
             | hu3 wrote:
             | It can be a vocal minority. Still vocal.
             | 
             | I wouldn't discard.
        
               | dylan604 wrote:
               | I have 0% stake in any YC, and I'm very vocal in my
               | negativity against any of these "AI" anythings. All of
               | these announcements are only slightly more than a toddler
               | anxious to show the parental units a finger painting
               | looking to hang it on the fridge. Only instead of the
               | fridge, they are hoping to get funding/investment
               | knowing that their product is _not_ a fully fledged
               | anything. It's comical.
        
           | mrbungie wrote:
           | The amount of copium in this response is astounding.
           | 
           | Yes, there is a noticeable negative response from HN towards
           | Google, and there has always been especially when speaking
           | about their weird product management practices and
           | incentives. Google hasn't launched any notable (and still
           | surviving, Stadia being a sad example of this) consumer
           | product or service in the last 10 years.
           | 
           | But to suggest there is a Sam Altman / OpenAI bias is
           | delusional. In most posts about them there is at least some
           | kind of skepticism or criticism towards Altman (his
           | participation in Worldcoin and his accelerationist stance
           | towards AGI) or his companies (OpenAI not being really open).
           | 
           | PS: I would say most people lurking here are just hackers (of
           | many kinds, but still hackers), not investors with shady
           | motives.
        
           | betternet77 wrote:
           | Yup, there's a significant anti-Google spin in HN, twitter.
           | For example, here's paulg claiming that Cruise handles
           | driving around cyclists better than Waymo [1], obviously not
           | true to anyone who's used both services
           | 
           | [1] https://twitter.com/paulg/status/1360341492850708481
        
         | rvz wrote:
         | You have to give Google credit as they went against the OpenAI
         | fanatics, Google doomsday crowd and some of the permanent
         | critics (who won't disclose they invested in OpenAI's secondary
         | share sale) that believe that Google can't keep up.
         | 
         | In fact, they already did. What OpenAI announced was nothing
         | that Google could not do already.
         | 
         | The top comments on Sora vs Veo suggest that Google is falling
         | behind, but given that both are still unavailable to use, that
         | wasn't even a point worth making in the first place. It's just
         | typical HN nonsense.
        
           | JumpCrisscross wrote:
           | > _What OpenAI announced was nothing that Google could not do
           | already_
           | 
           | I don't think I've seen serious criticism of Google's
           | abilities. Apple didn't release anything that Xerox or IBM
           | couldn't do. The difference is they didn't.
           | 
           | Google's problem has always been in product follow through.
           | In this case, I fault them for having the sole action item be
           | a buried waitlist request and two new brands (Veo and
           | VideoFX) for one unreleased product.
        
             | sangnoir wrote:
             | > I don't think I've seen serious criticism of Google's
             | abilities
             | 
             | Serious or not, that criticism existed on HN - and still
             | does. I've seen many comments claiming Google has "fallen
             | behind" on AI, sometimes with the insinuation that Google
             | won't ever catch up due to OpenAI's apparently insurmountable
             | lead.
        
             | aprilthird2021 wrote:
             | I saw it here alone. A lot of people simply have no idea
             | the level of research ability and skill Google, the
             | inventor of the Transformer, has.
        
             | KorematsuFredt wrote:
             | > Google's problem has always been in product follow
             | through.
             | 
             | Google is large enough to not care about small
             | opportunities. It ends up focusing on bigger opportunities
             | that only it can execute well. Google's ability to shut
             | down products that don't work is an insult to users but a
             | very good corporate strategy and they deserve kudos for
             | that.
             | 
             | Now, coming back to the "follow through". Google Search,
             | Gmail, Chrome, Android, Photos, Drive, Cloud etc. all are
             | excellent examples of Google's long term commitment to the
             | product and constantly making things better and keeping
             | them relevant for the market. Many companies like Yahoo!
             | had a head start but could not keep up with their mail
             | service.
             | 
             | Sure it has shut down many small products but that is
             | because they were unlikely to turn into bigger
             | opportunities. They often integrated the best aspect of
             | those products into their other well established products
             | such as Google Trips became part of search and Google
             | Shopping became part of search.
        
               | falcor84 wrote:
               | > coming back to the "follow through". Google Search,
               | Gmail, Chrome, Android, Photos, Drive, Cloud etc. all are
               | excellent examples of Google's long term commitment
               | 
               | Do you have any examples of something they launched in
               | the last decade?
        
               | troupo wrote:
               | > Google is large enough to not care about small
               | opportunities. It ends up focusing on bigger
               | opportunities
               | 
               | that result in shittier products overall. For example,
               | just a few months ago they cut 17 features from Google
               | Assistant because they couldn't monetize them, sorry,
               | because these were "small opportunities":
               | https://techcrunch.com/2024/01/11/google-is-
               | removing-17-unde...
               | 
               | > all are excellent examples of Google's long term
               | commitment to the product and constantly making things
               | better and keeping them relevant for the market.
               | 
               | And here's a long list of excellent examples of Google
               | killing products right and left because small
               | opportunities or something: https://killedbygoogle.com/
               | 
               | And don't get me started on the whole
               | Hangouts/Meet/Allo/Duo/whatever fiasco
               | 
               | > Sure it has shut down many small products but that is
               | because they were unlikely to turn into bigger
               | opportunities.
               | 
               | Translation: because they couldn't find ways to monetize
               | the last cent out of them
               | 
               | ---
               | 
               | Edit: don't forget: The absolute vast majority of
               | Google's money comes from selling ads. There's nothing
               | else it is capable of doing at any significant scale. The
               | only reason it doesn't "chase small opportunities" is
               | because Google _doesn't know how_. There are a few
               | smaller cash cows that it can keep chugging along, but
               | they are dwarfed by the single driving force that mars
               | everything at Google: the need to sell more and more ads
               | and monetize the shit out of everything.
        
           | localfirst wrote:
           | Don't forget SORA edited their "ai generated" videos while
           | Google did not here.
           | 
           | Where did SORA get all its training videos from again and why
           | won't the executives answer a simple Yes/No question to "Did
           | you scrape Youtube to train SORA?"
           | 
           | Google attorneys want to know.
        
             | scarmig wrote:
             | Google does not care to start a war where every company has
             | to form explicit legal agreements with every other company
             | to scrape their data. Maybe if they got really desperate,
             | but right now they have no reason to be.
        
             | TwentyPosts wrote:
             | > Don't forget SORA edited their "ai generated" videos
             | while Google did not here.
             | 
             | Wait, really? Could you point to proof for this? I'm very
             | curious where this is coming from
        
           | septic-liqueur wrote:
           | I have no doubt about Google's capabilities in AI, my doubt
           | lies in the productization part. I don't think they can
           | produce something that will not be a complete mess
        
           | CSMastermind wrote:
           | > In fact, they already did.
           | 
           | In terms of software that's actually been released Google is
           | still at best in third place when it comes to AI products.
           | 
           | I don't care what they can demo, I care what they've shipped.
           | So far the only thing they've shipped for Veo is a waitlist.
        
         | Xenoamorphous wrote:
         | It's tiring. Same thing happened to the GPT-4o announcement
         | yesterday. Apparently because there's no unquestionable AGI 14
         | months after GPT-4 then everything sucks.
         | 
         | I always found HN contrarian but as I say it's really tiring.
         | I've no idea what the negative commenters are working on on a
         | daily basis to be so dismissive of everybody else's work,
         | including work that leaves 90% of the population in a
         | combination of awe and fear. Also people sometimes forget that
         | behind big corp names there are actual people. People who might
         | be reading this thread.
        
           | motoxpro wrote:
           | Yeah it's pretty unfortunate. Saying something sucks is such
           | a lack of understanding that things are not static. I guess
           | it's a sure way to be right, because there will always be
           | progress and you can look back and say "See I told you!"
        
             | IggleSniggle wrote:
             | Psh. Things are not static. Progress sucks now. Haven't you
             | heard of enshittification? You can always look back and say,
             | "see? I told you it would suck in the future!"
             | 
             | ...why am I feeling the urge to point out that I am only
             | making a joke here and not trying to make an actual counter
             | point, even if one can be made...?
        
           | mupuff1234 wrote:
           | What's also tiring is that no one is allowed to have any
           | critical thoughts because "it's tiring".
           | 
           | From my own perspective the critique is usually a counter
           | balance to extreme hype, so maybe let's just agree it's ok to
           | have both types of comments, you know "checks and balances".
        
           | Workaccount2 wrote:
           | AI is a pretty direct threat to software engineering. It's no
           | surprise people are hostile towards it. Come 2030, how do you
           | justify paying someone $175k/yr when a $20/mo app is 95% as
           | good, and the other 5% can be done by someone making $40k/yr?
        
         | piloto_ciego wrote:
         | I think it's fear. Maybe not openly, but people are spooked at
         | how fast stuff is happening, so shitting on progress is a
         | natural reaction.
        
           | Workaccount2 wrote:
           | I have noticed this the most in SWEs who went from being
           | code writers to "human intention decipherers". Ask an SWE
           | in 2019 what they do and it was "Write novel and efficient
           | code", ask one in 2024 and you get "Sit in meetings and talk
           | to project managers in order to translate their poor
           | communication to good code".
           | 
           | Not saying the latter was never true, it's just interesting
           | to see how people have reframed their work in the wake of
           | breakneck AI progress.
        
           | kmacdough wrote:
           | I suspect it's also a general fatigue with the over-hype. It
           | _is_ moving fast, but every step improvement has come with
           | its own mini hype cycle. The demos are very curated and make
           | the model look incredibly flexible and resilient. But when we
           | test the product in the wild, it's constantly surprising the
           | simple tasks it blunders on. It's natural to become a bit
           | cynical and human to take that cynicism on the attack. Not
           | saying it's right, just natural, in the same way that it's
           | natural for the marketing teams to be as misleading as they
           | can get away with. Both are annoying, but there's not much to
           | do.
        
           | brikym wrote:
           | Progress? There are loads of downsides the AI fans won't
           | acknowledge. It diminishes human value/creativity and will be
           | owned and controlled by the wealthiest people. It's not like
           | the horse being replaced by the tractor. This time it's
           | different there is no place to move to but doing nothing on a
           | UBI (best case). That same power also opens the door to
           | dystopian levels of censorship and surveillance. I see more
           | of the Black Mirror scenarios coming true rather than
           | breakthroughs that benefit society. Nobody is denying that
           | it's impressive but the question is more whether it's good
           | overall. Unfortunately the toothpaste seems to be out of the
           | tube.
        
         | jmkni wrote:
         | Well for me it linked to a Google Form to join a waitlist lol,
         | so I'm not exactly pumped
        
         | jtolmar wrote:
         | I think it's just hype fatigue.
         | 
         | There's genuinely impressive progress being made, but there are
         | also a lot of new models coming out promising way more than
         | they can deliver. Even the Google AI announcements, which used
         | to be carefully tailored to keep expectations low and show off
         | their own limitations, now read more like marketing puff
         | pieces.
         | 
         | I'm sure a lot of the HN crowd likes to pretend we're all
         | perfectly discerning arbiters of the tech future with our
         | thumbs on the pulse of the times or whatever, but realistically
         | nobody is going to sift through a mountain of announcements
         | ranging from "states it's revolutionary, is marginal
         | improvement" to "states it's revolutionary, is merely an
         | impressive step" to "states it's revolutionary, is bullshit"
         | without resorting to vibes-based analysis.
        
           | throwup238 wrote:
           | It's made all the worse by just being a giant waitlist. Sora
           | is still nowhere to be seen three months later, GPT-4o's
           | conversational features aren't widely rolled out yet, and
           | Google's AI releases have been waitlist after waitlist after
           | waitlist.
           | 
           | Companies can either get people hyped or have never-ending
           | georestricted waitlists, they can't have their cake and eat
           | it too.
        
             | indigodaddy wrote:
             | Isn't there a lot of positive forward motion and
             | fruitfulness in the current state of the open source
             | llama-3 community?
        
         | Dig1t wrote:
         | Honestly just think that Google has burned their good will at
         | this point. If you notice, most announcements by Apple are
         | positively received here and same with OpenAI. But since
         | Google's "don't be evil" persona has faded and since they went
         | through so much churn WRT products, I think most people just
         | don't want to see them win.
        
         | rmbyrro wrote:
         | I hope they didn't mess this one up with ideologically driven
         | nonsense, like they did with Gemini.
        
       | clawoo wrote:
       | > "This tool isn't available in your country yet"
       | 
       | How did I know I would see this message before clicking "Sign up
       | to try"?
        
       | makestuff wrote:
       | Are there any good blogs/videos that ELI5 how these video
       | generation models even work?
        
       | sys32768 wrote:
       | I assume for consumers to use this, we must agree to have product
       | placements inserted into our productions every 48 seconds.
        
       | SoftTalker wrote:
       | Vaguely unsettling that the thumbnail for first example prompt "A
       | lone cowboy rides his horse across an open plain at beautiful
       | sunset, soft light, warm colors" looks something like the
       | pixelated vision of The Gunslinger android (Yul Brynner's
       | character) from the 1973 version of Westworld.
       | 
       | See 1:11 in this video
       | https://www.youtube.com/watch?v=MAvid5fzWnY
       | 
       | Incidentally that was one of the early uses of computer graphics
       | in a movie, supposedly those short scenes took many hours to
       | render and had to be done three times to achieve a colorized
       | image.
        
         | AceJohnny2 wrote:
         | Can't say I see a visual similarity. In any case, "Cowboy
         | silhouette in the sunset" is a pretty classic American visual.
         | 
         | But the parallel you made between android Brynner's vision and
         | the generated imagery is fun to consider!
        
       | totaldude87 wrote:
       | It's 2024 and AI is taking over and yet, to sign up for this, it
       | takes way more clicks and a Google Form entry(1)
       | 
       | Sigh. I still have hopes for VEO though
        
       | aragonite wrote:
       | With so much recent focus by OpenAI/Google on AI's visual
       | capabilities, does anyone know when we might see an OCR product
       | as good as Whisper for voice transcription? (Or has that already
       | happened?) I had to convert some PDFs and MP3s to text recently
       | and was struck by the vast difference in output quality.
       | Whisper's transcription was near-flawless, all the OCR softwares
       | I tried struggled with formatting, missed words, and made many
       | errors.
        
         | jazzyjackson wrote:
         | You might enjoy this breakdown of the lengths one person went
         | through to take advantage of the iOS vision API and creating a
         | local web service for transcribing some very challenging memes:
         | 
         | https://findthatmeme.com/blog/2023/01/08/image-stacks-and-ip...
         | 
         | discussed on HN:
         | 
         | https://news.ycombinator.com/item?id=34315782
        
           | aragonite wrote:
           | This is so good - thanks for sharing this!
        
         | thesandlord wrote:
         | We use GPT-4o for data extraction from documents, it's really
         | good. I published a small library that does a lot of the
         | document conversion and output parsing:
         | https://npmjs.com/package/llm-document-ocr
         | 
         | For straight OCR, it does work really well but at the end of
         | the day it's still not 100%
        
           | aragonite wrote:
           | Thanks! look forward to checking this out as soon as I get
           | home.
        
       | tauntz wrote:
       | Uh.. First it tells me that I can't sign up because my country is
       | not supported (yay, EU) and I can sign up to be notified when it's
       | actually available. Great, after I complete that form, I get an
       | error that the form can't be submitted and I'm taken to
       | https://aitestkitchen.withgoogle.com/tools/video-fx where I can
       | only press the "Join our waitlist" button. This takes me to a
       | Google Form, that doesn't have my country in the required country
       | dropdown and has a hint that says: "Note: the dropdown only
       | includes countries where ImageFX and MusicFX are publicly
       | available.". Say what?
       | 
       | Why does this have to be so confusing? Is the name "Veo" or
       | "VideoFX"? Why is the waitlist for VideoFX telling me something
       | about public availability of ImageFX and MusicFX? Why is
       | everything US only, again? Sigh..
        
         | pelorat wrote:
         | We can blame the EU AI act and other regulations for that.
        
       | benatkin wrote:
       | Yo lo veo.
        
       | wseqyrku wrote:
       | Google puts more effort into the namings than the actual model,
       | ngl.
        
       | bpodgursky wrote:
       | I think it's funny the demos don't have people in them after the
       | Gemini fiasco. I wonder if they didn't have time to re-train the
       | model to show representative ethnicities.
        
       | thih9 wrote:
       | Is there any non slow motion example?
       | 
       | The cyberpunk video seems better in that aspect, but I wish there
       | were more.
        
       | gliched_robot wrote:
       | This is far superior to SORA, there is no comparison.
        
         | monkeeguy wrote:
         | lol
        
       | xnx wrote:
       | 60 second example video:
       | https://www.youtube.com/watch?v=diqmZs1aD1g
        
       | svag wrote:
       | An interesting thing that Google does is to watermark the AI
       | generated videos using the [SynthID
       | technology](https://deepmind.google/technologies/synthid/).
       | 
       | It seems that the SynthID is not only for AI generated video but
       | for image, text and audio.
        
       | s1k3s wrote:
       | This looks really good for promo videos. All scenes in here are
       | basically that.
        
       | KorematsuFredt wrote:
       | I think we should all take a pause and just appreciate the
       | amazing work Google, OpenAI, MS and many others including those
       | in academia have done. We do not know if Google or OpenAI or
       | someone else is going to win the race but unlike many other
       | races, this one makes the entire humanity move faster. Keep the
       | negativity aside and appreciate the sweat and nights people have
       | poured into making such things happen. Majority of these people
       | are pretty ordinary folks working for a salary so they can spend
       | their time with their families.
        
       | ugh123 wrote:
       | From a filmmaking standpoint I still don't think this is
       | impactful.
       | 
       | For that it needs a "director" to say: "turn the horse's head 90°
       | the other way, trot 20 feet, and dismount the rider" and "give me
       | additional camera angles" of the same scene. Otherwise this is
       | mostly b-roll content.
       | 
       | I'm sure this is coming.
        
         | evantbyrne wrote:
         | They claim it can accept an "input video and editing command"
         | to produce a new video output. Also, "In addition, it supports
         | masked editing, enabling changes to specific areas of the video
         | when you add a mask area to your video and text prompt." Not
         | sure if that specific example would work or not.
        
         | qingcharles wrote:
         | I can see using these video generators to create video
         | storyboards. Especially if you can drop in a scribbled sketch
         | and a prompt for each tile.
        
       | iamleppert wrote:
       | Too little, too late. Google is a follower, not a leader. They need
       | to stop trying and do more stock buybacks and strip the company
       | to barebones, like Musk did with Twitter & Tesla.
        
       | NegativeLatency wrote:
       | Shoulda used youtube to host their video, it's all broken and
       | pixelated for me
        
       | m3kw9 wrote:
       | Why is it always in slow motion, is it hard to get the speed
       | right?
        
       | miohtama wrote:
       | > Veo's cutting-edge latent diffusion transformers reduce the
       | appearance of these inconsistencies, keeping characters, objects
       | and styles in place, as they would in real life.
       | 
       | How is this achieved? Is there temporal memory between frames?
        
       | toasted-subs wrote:
       | I could say something but I'm glad to get the confirmation.
        
       | shaunxcode wrote:
       | truly removing the `id` from video.
        
       | abledon wrote:
       | music is lacking.... suno, udio, riffusion all blow this out of
       | the water
        
       | ijidak wrote:
       | These will be remembered as the AI wars.
       | 
       | Reminds me of the competition in tech in the late 80's early 90's
       | between Microsoft and Borland, Microsoft and IBM, AMD and Intel,
       | Word vs Wordperfect, etc.
       | 
       | It's a two horse race between Google and OpenAI.
        
       | animanoir wrote:
       | Google is so finished... Unless they remove Mr. Pinchar...
        
       ___________________________________________________________________
       (page generated 2024-05-14 23:00 UTC)