[HN Gopher] Veo
       ___________________________________________________________________
        
       Veo
        
       Author : meetpateltech
       Score  : 1620 points
       Date   : 2024-05-14 17:58 UTC (1 days ago)
        
 (HTM) web link (deepmind.google)
 (TXT) w3m dump (deepmind.google)
        
       | moralestapia wrote:
       | Not nearly as good as Sora.
       | 
       | Google missed this train, big time.
        
       | DarmokJalad1701 wrote:
       | > "SIGN IN TO GET A SNEAK PEAK."
       | 
       | https://theoatmeal.com/comics/sneak_peek
        
         | baal80spam wrote:
         | Whoa. The URL is correct, the text is not.
        
           | sowbug wrote:
           | The page is now fixed. Even for a "test kitchen," that's a
           | shocking error for a company like Google to make.
        
       | hehdhdjehehegwv wrote:
       | You have to log in just to see a demo? They are desperate to
       | track people.
        
       | iamleppert wrote:
        | It's so bad it's laughable. Sundar really needs to crack the whip
       | harder on those Googlers.
        
         | bamboozled wrote:
         | or someone has to crack the whip on Sundar :)
        
           | geodel wrote:
            | His job is keeping the stock price up, which he is doing well
            | so far. Another is layoffs, which again he is doing fine at :)
        
       | htrp wrote:
        | Was anyone else confused by that Donald Glover segment? It felt
        | like we were going to get a short film, and we got 3-5 clips?
        
         | curiousgal wrote:
         | Exactly!
         | 
         |  _" Hey guys big artist says this is fine so we're good"_
        
         | jsheard wrote:
         | And those clips mostly look like generic stock footage, not
         | something specific that a director might want to pre-vis.
         | 
         | This is what movie pre-vis is actually like, it doesn't need to
          | be pretty, it needs to be _precise_:
         | 
         | https://www.youtube.com/watch?v=KMMeHPGV5VE
        
         | thisoneworks wrote:
          | Yeah, it wasn't obvious what they were trying to show. Demis
          | said feature films will be released in a while.
        
         | Keyframe wrote:
         | It felt AI-generated.
        
           | htrp wrote:
           | I wish it were AI Donald Glover talking and the "Apple twist"
           | at the end was that the entire 3 minute segment was a prompt
           | for "Donald Glover talking about how Awesome Gemini Models
           | are in a California vineyard"
        
         | ZiiS wrote:
          | Also it is either very good at generating living people, or
          | they need to put more thought into saying "Note: All videos on
          | this page were generated by Veo and have not been modified"
        
           | jsheard wrote:
           | That "footage has not been modified" statement is probably to
           | get ahead of any speculation that it was "cleaned up" in
           | post, after it turned out that the Sora demo of the balloon
           | headed man had fairly extensive manual VFX applied afterwards
           | to fix continuity errors and other artifacts.
        
             | iamdelirium wrote:
             | Wait, where did you hear this? I would assume something
             | like this would have made somewhat of a splash.
        
               | jsheard wrote:
               | The studio was pretty up front about it, they released a
               | making-of video one day after debuting the short which
                | made it clear they used VFX to fix Sora's errors in post,
               | but OpenAI neglected to mention that in their own copy so
               | it flew under the radar for a while.
               | 
               | https://www.youtube.com/watch?v=KFzXwBZgB88
               | 
               | https://www.fxguide.com/fxfeatured/actually-using-sora/
               | 
               |  _> While all the imagery was generated in SORA, the
               | balloon still required a lot of post-work. In addition to
               | isolating the balloon so it could be re-coloured, it
               | would sometimes have a face on Sonny, as if his face was
               | drawn on with a marker, and this would be removed in
               | AfterEffects. similar other artifacts were often
               | removed._
        
       | TIPSIO wrote:
       | Seems like ImageFX, VideoFX (just a Google form and 3 demos),
       | MusicFX, and TextFX at the links are down and not working.
       | 
       | Huge grammar error on front page too.
        
       | indy wrote:
       | As someone who doesn't live in the US this year's Google IO feels
       | like I'm outside looking in at all the cool kids who get to play
       | with the latest toys.
        
         | roynasser wrote:
         | VPN'd right into that playground, turns out the toys were
         | pretty blah
        
         | numbers wrote:
         | don't feel left out, we're all on the wait lists
        
       | curiousgal wrote:
       | Oh look another half baked product release that's not available
       | in any country. They're a joke.
        
         | mupuff1234 wrote:
         | Is Sora available in any country?
        
           | bamboozled wrote:
            | I thought I read they've deemed Sora too dangerous to release
            | pre-election? Or have reservations about it? I might be
            | wrong...
        
             | sib wrote:
             | Sounds like a great excuse / communications strategy!
        
           | jaggs wrote:
           | Apparently it's only released to red teams at the moment as
           | they try to manage safety. There's also the issue about
           | releasing too close to an election?
        
       | sebzim4500 wrote:
       | The Donald Glover segment might be a new low for Google
       | announcement videos. They spent all this time talking up the
       | product but didn't actually show what he had created.
       | 
       | Imagine how bad the model must be if this is the best way Google
       | can think of selling it.
        
         | fakedang wrote:
          | What seems worse is the Google TextFX video with Lupe Fiasco.
         | What the heck am I supposed to get out of watching boring
         | monologues by a couple of people? They could have just as
         | easily shown, with less camera work, Lupe Fiasco actually using
          | the LLM, but they didn't - or at least not enough to grab
         | my attention in 2 minutes.
         | 
         | Personally, I liked the above link, even as a Google skeptic,
         | but the videos aren't helping their case.
        
       | Horffupolde wrote:
       | Google is the new Kodak.
        
         | bingbingbing777 wrote:
         | Kodak failed because their CEO refused to go down the digital
         | route. How is that comparable?
        
       | loudmax wrote:
       | The videos in this demo are pretty neat. If this had been
       | announced just four months ago we'd all be very impressed by the
       | capabilities.
       | 
       | The problem is that these video clips are very unimpressive
       | compared to the Sora demonstration which came out three months
        | ago. If this demo had come from some scrappy startup it would
        | be worth taking note of. Coming from Google, the inventor of the
       | Transformer and owner of the largest collection of videos in the
       | world, these sample videos are underwhelming.
       | 
       | Having said that, Sora isn't publicly available yet, and maybe
       | Veo will have more to offer than what we see in those short clips
       | when it gets a full release.
        
         | fakedang wrote:
         | Honestly, if Veo becomes public faster than Sora, they could
         | win the video AI race. But what am I wishfully thinking - it's
         | Google we're talking about!
        
           | Jensson wrote:
           | > But what am I wishfully thinking - it's Google we're
           | talking about!
           | 
           | Google the company known to launch way too many products?
           | What other big company launches more stuff early than them?
            | What people complain about with Google is that they launch too
           | much and then shut them down, not that they don't launch
           | things.
        
             | fakedang wrote:
              | Google lost first place in AI precisely because they've
              | been walking on imaginary eggshells regarding AI's
              | effect on the public. That led to the whole Gemini fiasco
              | and the catch-up game they've had to play with OpenAI-MSFT.
        
           | spaceman_2020 wrote:
            | The cost to switch to new models is negligible. People will
            | instantly switch to Sora if it's better.
            | 
            | I've switched to Opus from GPT-4 for coding and it was
            | non-trivially easy
        
             | ndls wrote:
             | I think you used non-trivially wrong there, bud.
        
             | SilverSlash wrote:
             | Except your single experience doesn't mean it's generally
             | true, bud. For instance I have not switched to Opus despite
             | claims that it is better because I don't want to go through
             | the effort of cancelling my ChatGPT subscription and
             | subbing to Claude. Plus I like getting new stuff early that
             | OpenAI occasionally gives out and the same could apply for
             | Google's AI.
        
             | fakedang wrote:
              | Sorry, but lock-in effects are real. End users, solo devs
              | and startups might find it trivially easy, but enterprise
              | clients would jump through hoops before a decision is made.
              | And enterprise clients would rather not go through with
              | that, hence they'll stick with whoever came first, unless
              | there's a massive differentiator between the two.
        
         | alex_duf wrote:
         | >these sample videos are underwhelming
         | 
          | wow, the speed at which we can become blasé is terrifying. 6
          | months ago this was not possible, and it felt like it was years
          | away!
          | 
          | They're not underwhelming to me, they're beyond anything I
          | thought would ever be possible.
          | 
          | Are you genuinely unimpressed? Or maybe trying to play it cool?
        
           | danielbln wrote:
           | The faster the tech cycle, the faster we become accustomed to
           | it. Look at your phone, an absolute, wondrous marvel of
            | technology that would have been utterly and totally sci-fi just
           | 25 years ago. Yet we take it for granted, as we do with all
           | technology eventually. The time frames just compress is all,
           | for better or for worse.
        
             | newswasboring wrote:
              | Yeah man, but there have to be some thresholds. We take
             | phones for granted after years of active availability. I
             | personally remember days when "what if your phone dies" was
             | a valid concern for even short periods, and I'm not that
             | old. Sora isn't even available publicly. At some point it
             | crosses over from being jaded to just being a cynic.
        
           | loudmax wrote:
           | On some level, it's healthy to retain a sense of humility at
           | the technological marvels around us. Everything about our
           | daily lives is impressive.
           | 
           | Just a few years ago, I would have been absolutely blown away
           | by these demo videos. Six months ago, I would have been very
            | impressed. Today, Google is rolling out a product that seems
           | second best. They're playing catch-up in a game where they
           | should be leading.
           | 
           | I will still be very impressed to see videos of that quality
           | generated on consumer grade hardware. I'll also be extremely
           | impressed if Google manages to roll out public access to this
           | capability without major gaffes or embarrassments.
           | 
           | This is very cool tech, and the developers and engineers that
           | produced it should be proud of what they've achieved. But
           | Google's management needs to be asking itself how they've
           | allowed themselves to be surpassed.
        
           | steamer25 wrote:
           | They didn't really do a very good job of selecting marketing
           | examples. The only good one, that shows off creative
           | possibilities, is the knit elephant. Everything else looks
           | like the results of a (granted fairly advanced) search
           | through a catalog of stock footage.
           | 
           | Even search, in and of itself, is incredibly amazing but
           | fairly commoditized at this point. They should've highlighted
           | more unique footage.
        
       | mccraveiro wrote:
       | They didn't show any human videos, which could indicate that the
       | technology struggles with generating them.
        
         | karmasimida wrote:
          | Actually there is one in the last demo. It's not a standalone
          | clip, but one shot in the demo where a team uses this model to
          | create a scene with a human in it; they generated an image of
          | a Black woman, but only from the head up.
          | 
          | I would generally agree though, it's odd that they didn't show
          | more humans.
        
         | revscat wrote:
         | I'm sure part of the reason, beyond those given already, is
         | that they want to avoid the debate around nudity.
        
         | dyauspitr wrote:
         | You know why and it's not that their technology struggles with
         | it.
        
           | lewispollard wrote:
           | Please elaborate, because I certainly don't.
        
             | blinky88 wrote:
             | I think he's talking about the diversity controversy
        
               | dyauspitr wrote:
               | That might be a factor too but I was referring more to
               | the nudity and objectification issue.
        
         | chubot wrote:
         | It's also probably that it's easier to spot fake humans than to
         | spot fake cats or camels. We are more attuned to the faces of
          | our own species.
         | 
         | That is, AI humans can look "creepy" whereas AI animals may
         | not. The cowboy looks pretty good precisely because it's all
         | shadow.
         | 
         | CGI animators can probably explain this better than I can ...
         | they have to spend way more time on certain areas and certain
         | motions, and all the other times it makes sense to "cheat" ...
         | 
         | It explains why CGI characters look a certain way too -- they
         | have to be economical to animate
        
         | mjfl wrote:
         | thank goodness.
        
         | himinlomax wrote:
         | They're probably still wary of their latest PR disaster, the
         | inclusive and diverse WW2 Germans from Gemini.
        
       | xianshou wrote:
       | To quote Twitter/X, "I wonder what OpenAI will release tomorrow
       | and Google will release a waitlist for."
       | 
       | GPT-4o: out
       | 
       | Veo: waitlist
       | 
       | Admittedly this is impressive and the direct comp would be Sora,
       | which isn't out, but sometimes the caricature is very close to
       | the truth.
        
         | jsheard wrote:
         | Then again Veo is in the same category as Sora, which isn't
         | released either, 3 months after the reveal.
        
         | rvnx wrote:
         | "This tool isn't available in your country yet"
        
         | modeless wrote:
         | To be fair, all the voice stuff OpenAI demoed isn't released
         | yet either.
        
         | martinesko36 wrote:
         | This is Google for the last 5+ IOs. They just release waitlists
         | and demos that are leapfrogged by the time they're available to
         | all. (and shut down a few years later)
        
           | htrp wrote:
           | Cite sources?
        
         | skepticATX wrote:
         | OpenAI hardly released gpt-4o. The demo yesterday was clearly a
         | rushed response to I/O. It's quite possible that Google will
         | ship multi-modality features faster than OpenAI will.
        
           | JeremyNT wrote:
           | Yeah I think at this point it's "not if, but when" and the
           | gap between parity is just going to keep shrinking
           | (until/unless there's some kind of copyright/legislative
           | barrier implemented that favors one or the other).
           | 
           | "We have no moat" swings both ways.
        
           | juice_bus wrote:
            | Which of the products Google is releasing can you trust will
            | even be around in a year or two? I'm certainly done trusting
            | Google with new products.
        
           | buildbot wrote:
           | Without doing anything, I have access to GPT-4o in chatgpt
           | and the api already (on a personal account, not related to
           | work). Maybe I'm just super lucky, but it's certainly not
           | vaporware.
        
           | Difwif wrote:
           | What do you mean? Everyone has access to the gpt-4o model
           | right now through ChatGPT and the API. Sure we don't have
           | voice-to-voice but we have a lot more than what Google has
           | promised.
        
             | hbn wrote:
             | How do I get access? I just checked my app and the Premium
              | upgrade says it will unlock GPT-3.5 and GPT-4, so I
             | assume my version is still the old one.
             | 
             | All my apps are updated in the App Store too.
        
               | bagels wrote:
               | I have a paid account, and I didn't have to do anything
               | to use the new model.
        
               | DeRock wrote:
               | I just checked, there was an iOS app update available and
               | it enabled it. I'd check again if there's a new update
               | (version 1.2024.129). Or you could use the website.
        
               | hbn wrote:
               | I'm on the same version and don't see anything different
               | 
               | Website also only has toggle for 3.5 and 4 with the Plus
               | upgrade. Not sure if it's cause I'm in Canada?
        
               | croes wrote:
                | I use their website and it's one of the three models to
                | choose from if you're on the Plus subscription.
        
               | mr_mitm wrote:
               | I have premium access and I can select 4o in the dropdown
               | menu on Android
        
               | cush wrote:
               | I have the 4o model. On premium. No voice yet
        
               | theresistor wrote:
               | To add a counterbalance, I just checked in the app and on
               | the website on a non-paid account, and I too do NOT have
               | GPT-4o.
        
               | hbn wrote:
                | Everyone who says they have access in the replies to my
                | comment seems to be a paid user. So maybe it's only rolling
               | out to them first.
        
               | htrp wrote:
                | I expect they have to offer the paid users something
        
               | hbn wrote:
               | Paid users get like a 5x higher rate limit iirc
        
               | electriclove wrote:
               | My (paid) app has it but no voice chat yet
        
               | TecoAndJix wrote:
               | I have a paid account and can do voice to voice on the
               | iOS app as of last night.
        
               | hbn wrote:
               | The realtime one they showed yesterday or the one that's
               | existed forever where it's just a voice-to-text input and
               | TTS output taking turns?
        
               | TecoAndJix wrote:
               | I feel silly now. I downloaded the app after the
               | announcement (I'm a desktop user) and it looked identical
               | to the one they show in the sarcasm video. When I asked
               | it, I was told it was not the new feature announced
               | yesterday. Still a lot of fun!
               | 
               | Edit - it does list the new model in my app at least
        
               | sib wrote:
                | In the App Store there's a new build of the iOS app as of
                | about 3 hours ago (call it about 11am US Pacific time). It
                | includes the GPT-4o model (at least it shows it for me).
        
               | hbn wrote:
               | Are you a paid user?
        
             | theresistor wrote:
             | It is not available on my (free) account in either the app
             | or the website. So no, everyone does _not_ have access to
             | it.
        
               | satvikpendem wrote:
               | It's for paid users for now, not free. I have ChatGPT Pro
               | and I can use the new model.
        
               | nialv7 wrote:
               | I am a free user and I have 4o. I think it is just a
               | gradual roll out.
        
               | satvikpendem wrote:
                | That sounds about right; it just seemed that everyone who
                | replied above who had access to the new model was a paid
                | user.
        
             | ben_w wrote:
             | API yes, ChatGPT no (at least not for all users); I've got
             | my own web interface for the API so I can play with the
             | model (for all of $0.045 of API fees), but most people
             | can't be bothered with that and will only get 4o when it
              | rolls out to their specific ChatGPT account.
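              | 
              | (Roughly what that looks like with the official Python
              | client; a minimal sketch, assuming OPENAI_API_KEY is set in
              | the environment and that your account has the model enabled
              | -- the prompt is just a placeholder:)
              | 
              |     from openai import OpenAI
              | 
              |     client = OpenAI()  # reads OPENAI_API_KEY from the env
              | 
              |     resp = client.chat.completions.create(
              |         model="gpt-4o",
              |         messages=[{"role": "user", "content": "Say hi."}],
              |     )
              |     print(resp.choices[0].message.content)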
        
               | ghshephard wrote:
               | Was running just fine on my ChatGPT client on iOS - full
               | 2-way voice comms by 3:00 PM yesterday. Application was
               | already updated.
        
               | mike_hearn wrote:
               | I have a regular ChatGPT Pro account and I have GPT-4o.
               | 
               | The bigger issue is that 4o without the multi-modal, new
               | speech capabilities or desktop app isn't that different
               | to GPT-4. And those things aren't yet launched.
        
             | TecoAndJix wrote:
             | Posted further down the thread - I have a paid account and
             | can do voice to voice on the iOS app as of last night.
        
             | baobabKoodaa wrote:
             | I don't have access to gpt-4o via ChatGPT
        
           | localfirst wrote:
            | listen, all these guys out here are attacking Google and
            | making outlandish/false claims.
            | 
            | Look at their LinkedIn pages; that will tell you why they are
            | desperate
           | 
           | (hint: they bought OpenAI bags on the secondary market)
        
         | rvz wrote:
         | Sora is the closest comparison to Veo and both aren't out.
         | 
          | Sora has been around for three months and still isn't even
          | close to being released and available.
          | 
          | Essentially Google has already caught up to OpenAI with their
          | recent responses, and it's clear that there are private OpenAI
          | investors pushing this narrative that Google is struggling to
          | compete.
        
         | resource_waste wrote:
         | Google Press: This is the greatest AI Model ever yet.
         | 
          | Users: Lol it won't even tell me how to draw a picture of a
          | human because it's inappropriate.
         | 
         | Google flipped like a switch a few years ago. Instead of going
         | for product quality, it seems they went full Apple Marketing
          | and control the narrative on top social media.
         | 
          | I keep thinking: "well, it's Google, they will be the best,
          | right?" No, I'm at the point of giving up on Google, they are not
          | as powerful as I once thought... Hmm, seems like a good time to get
         | into Lobbying and Marketing...
        
         | nextworddev wrote:
         | Except gpt-4o with audio and video inputs isn't actually out
        
           | adamtaylor_13 wrote:
           | I was using it yesterday in the mobile app. Unless they just
           | slapped the new UI on an older model.
        
             | hakanensari wrote:
             | It's no longer there, I think?
        
             | Cyph0n wrote:
             | I think they (partially?) rolled it back. I tried out the
             | voice input yesterday, but it's missing from the app today.
        
             | Workaccount2 wrote:
             | They just released the text chat model which still uses the
             | same old audio interface as 4. The new audio/video chat
             | stuff is not out yet (unless you are a very lucky early
             | beta user).
        
         | dom96 wrote:
         | > GPT-4o: out
         | 
         | Is it? I can't use it yet at least
        
           | lordswork wrote:
           | I am also wondering how to use it..
        
           | drawnwren wrote:
           | It is. I've got it already, but I'm a bit of a gpt4 power
           | user. I hit my rate limit biweekly or so and run up close to
           | it every day. I'd bet maybe they prioritized people that were
           | costing them money.
        
             | kaibee wrote:
             | It might be just by sign-up order. I signed up for pro
             | basically as soon as I could, but I never hit limits, and
             | only really use it once or twice a day, sometimes not at
             | all.
        
               | drawnwren wrote:
               | Interestingly, when I use Cloudflare's Warp DNS I don't
               | have access to it. So, it might have something to do w/
               | region as well?
        
             | phyalow wrote:
             | Really? You have the new model, sure, I have it too, but
             | afaik nobody has the new ultra fast and variable voice +
             | video chat on mobile.
        
               | drawnwren wrote:
               | The original question asked if anyone had GPT-4o. You're
               | asking a different question.
        
           | w-m wrote:
           | GPT-4o is available for me on ChatGPT, with the
           | text+attachment input (as a Plus user from Germany). It's
           | crazy fast. The voice for the audio conversation in the app
           | is still the old one, and doesn't let you interrupt it.
        
         | mrkramer wrote:
         | Google is scared of what every new model can produce, they
         | don't want drama but they always end up in some kind of media
         | drama.
        
         | Tenoke wrote:
         | I can't even join the waitlist from Europe while 4o is fully
         | available here.
        
         | dyauspitr wrote:
         | I haven't been able to try out 4o. The voice chat continuously
         | says there's too much traffic and I don't even see a button to
         | turn on the camera
        
         | qwertox wrote:
         | > GPT-4o: out
         | 
         | I don't know what's wrong with GPT-4o, but the answers I'm
         | getting are much worse than before yesterday. It's constantly
         | repeating the entire content required to provide a seemingly
         | "full" answer, but if it passes me the same but slightly
         | modified Python code for the fifth time even if it has become
         | irrelevant to the current conversation, it really gets on my
         | nerves.
         | 
          | I had such well-tuned custom instructions, which worked
          | beautifully, and now it's as if it is ignoring most of them.
          | 
          | It's causing me frustration and really wasting my time when I
          | have to wait for the unnecessarily long answers to finish.
        
       | endisneigh wrote:
        | I've noticed that a lot of the commentary on these models
        | creates the sort of fervor you see around politics or sports.
        | 
        | In any case - no details on compute needed. Curious if this can
        | ever be cheap. Even Midjourney still requires a lot.
       | 
       | I'm also surprised there hasn't been some attempt at creating
       | benchmarks for this. One example could be color accuracy.
        
         | stefan_ wrote:
          | Never mind no benchmarks, half of these announcements in the
          | past were straight _made up_, "offline enhanced" cherry-picked
          | "examples", CGI fantasies.
          | 
          | Not to mention the whole AGI topic is forever doomed by sci-fi
          | fans; just remember what happened with that room-temperature
          | superconductivity claim.
        
       | inasio wrote:
       | From a 2014 Wired article [0]: "The average shot length of
       | English language films has declined from about 12 seconds in 1930
       | to about 2.5 seconds today"
       | 
       | I can see more real-world impact from this (and/or Sora) than
       | most other AI tools
       | 
       | [0] https://www.wired.com/2014/09/cinema-is-evolving/
        
         | jsheard wrote:
         | Even if the shots are very short you still need coherency
          | _between_ shots, and they don't seem to have tackled that
         | problem yet.
        
         | mattgreenrocks wrote:
         | This is very noticeable. Watching movies from the 1970s is
         | positively serene for me, vs the shot time on modern films
          | often leaves me wondering, "wait, what just happened there?"
         | 
         | And I'm someone who is fine playing fast action video games.
         | Can't imagine what it's like if you're older or have sensory
         | processing issues.
        
           | ryandrake wrote:
           | Obligatory: Liam Neeson jumps over a fence in 6 seconds, with
           | 14 cuts[1].
           | 
           | 1: https://www.youtube.com/watch?v=gCKhktcbfQM
        
             | aidenn0 wrote:
             | I'd like to fact check this amazing comment on that video,
             | but it would require watching Taken 3:
             | 
             | > Some of y'all may find how awful this editing gets pretty
             | interesting: I did an Average Shot Length (ASL) for many
             | movies for a recent project, and just to illustrate bad
             | overediting in action movies, I looked at Taken 3 (2014) in
             | its extended cut.
             | 
             | > The longest shot in the movie is the last shot, an aerial
             | shot of a pier at sunset ending the movie as the end
             | credits start rolling over them. It clocks in at a runtime
              | of 41 seconds and is, _BY FAR_, the longest shot in the
             | movie.
             | 
             | > The next longest is a helicopter establishing shot of the
             | daughter's college after the "action scene" there a little
             | over an hour in, at 5 seconds.
             | 
             | > Otherwise, the ASL for Taken 3 (minus the end
             | credits/opening logos), which has a runtime of 1:49:40,
              | 4,561 shots in all (!!!), is 1.38 SECONDS. For comparison,
             | Zack Snyder's Justice League (2021) (minus end
             | credits/opening logos) is 3:50:59, with 3163 shots overall,
             | giving it an ASL of 4.40 seconds, and this movie, at 1 hour
             | 50 minutes, has north of 4,561 for an ASL of 1.38
              | seconds?!?! _Taken 3 has more shots in it than Zack Snyder's
              | Justice League, a movie more than double its length..._
             | 
             | > To further illustrate how ridiculous this editing gets,
             | the ASL for Taken 3's non-action scenes is 2.27 seconds. To
             | reiterate, this is the non-action scenes. The "slow
             | scenes." The character stuff. Dialogue scenes. The stuff
             | where any other movie would know to slow down. 2.27 SECONDS
             | For comparison, Mad Max: Fury Road (minus end
             | credits/opening logos) has a runtime of 1:51:58, with 2646
             | shots overall, for an ASL of 2.54 seconds. TAKEN 3'S "SLOW
             | SCENES" ARE EDITED MORE AGGRESSIVELY THAN MAD MAX: FURY
             | ROAD!
             | 
             | > And Taken 3's action scenes? _Their ASL is 0.68 seconds!_
             | 
             | > If it weren't for the sound people on the movie, Taken 3
             | wouldn't be an "action movie". It'd be abstract art.
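              | 
              | (The ASL arithmetic above is just runtime in seconds divided
              | by shot count; a quick sketch with the figures from the
              | quote -- note the quoted ASLs were computed on credit-
              | trimmed runtimes, so these only match approximately:)
              | 
              |     def asl(runtime, shots):
              |         """Average shot length from an H:MM:SS runtime."""
              |         h, m, s = (int(x) for x in runtime.split(":"))
              |         return (h * 3600 + m * 60 + s) / shots
              | 
              |     print(asl("1:49:40", 4561))  # Taken 3, ~1.4 s
              |     print(asl("3:50:59", 3163))  # Justice League, ~4.4 s
              |     print(asl("1:51:58", 2646))  # Fury Road, ~2.5 s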
        
               | throwup238 wrote:
               | It's worth noting that Taken 3 has a 13% rating on Rotten
                | Tomatoes, which is well into "it's so bad it's good"
               | territory. I don't think the rapid cuts went unnoticed.
        
               | nimithryn wrote:
               | Yeah, this sequence is a meme commonly cited to show
               | "choppy modern editing"
        
               | llmblockchain wrote:
               | More chops than an MF DOOM track.
        
             | kristofferR wrote:
             | The top comment makes a really good point though:
             | 
             | "He's 68. I'm guessing they stitched it together like this
             | because "geriatric spends 30 seconds scaling chainlink
             | fence then breaks a hip" doesn't exactly make for riveting
             | action flick fare."
             | 
             | Lingering shots are horrible for obscuring things.
        
               | lupire wrote:
               | Movies have stunt performers.
               | 
               | And Neeson was only 60 when filming Taken 3.
        
               | troupo wrote:
                | Keanu Reeves was 57-58 when he shot the last _John Wick_.
               | IIRC Bob Odenkirk was 58 in _Nobody_. Neeson was 60 in
               | Taken 3.
               | 
                | There are ways to shoot an action scene with an aging star
                | that don't involve 14 cuts in 4 seconds. You just have
               | to care about your craft.
        
             | nineteen999 wrote:
             | Is it Liam Neeson, or his stunt double?
        
           | psbp wrote:
           | My brain processes too slow for modern action movies.
           | 
           | I can tell what's going on, but I always end up feeling
           | agitated.
        
             | MarcScott wrote:
             | I'm okay with watching the majority of action movies, but I
             | distinctly remember watching this fight scene in a Bourne
             | movie and not having a clue what was going on. The constant
             | camera changes, short shot length, and shaky cam, just
             | confused the hell out of me.
             | 
             | https://youtu.be/uLt7lXDCHQ0?si=JnVMjmu0WgN5Jr5e&t=70
        
               | earthnail wrote:
               | I thought it was brilliant. Notice there's no music. It's
               | one of the most brutal action scenes I know. Brutal in
               | the sense of how honest it felt about direct combat.
        
               | JohnMakin wrote:
               | I'm glad we're finally getting away from the 00's shaky
               | cam era.
        
           | kemitchell wrote:
           | Enjoy some Tarkovsky.
        
         | joshuahedlund wrote:
         | How many of those 2.5 second "shots" are back-and-forths
         | between two perspectives (ex. of two characters talking to one
         | another) where each perspective is consistent with itself? This
         | would be extremely relevant for how many seconds of consistent
         | footage are actually needed for an AI-generated "shot" at film-
         | level quality.
        
         | lobochrome wrote:
         | Shot length, yes - but the scene stays the same. Getting
         | continuity with just prompts seems not yet figured out.
         | 
         | Maybe it's easy, and you feed continuity stills into the
         | prompt. Maybe it's not, and this will always remain just a more
         | advanced storyboarding technique.
         | 
         | But then again, storyboards are always less about details and
         | more about mood, dialog, and framing.
        
         | chipweinberger wrote:
         | In 1930 they often literally had a single camera.
         | 
         | Just worth keeping that in mind. You could not just switch
         | between multiple shots like you can today.
        
       | Keyframe wrote:
        | Kind of sucks to be Google. Even though they're making good
        | progress here, and have laid the foundations of a lot if not
        | most things... their products are, well, there aren't any
        | noteworthy ones compared to the rest. And considering Google is
        | sitting on top of one of the largest if not THE largest video
        | database, along with maps, traffic, search, internet.zip,
        | usenet, vast computing resources vertically integrated... they
        | have every advantage in the world. So, what the hell are they
        | doing? Why isn't their CEO already out? Expectations of them are
        | higher than of anyone else.
        
         | InfiniteVortex wrote:
         | Google search has been absolutely ruined in terms of quality.
         | You're right, they've built the base in terms of R&D for many
          | of the AI breakthroughs that are powering competing alternative
          | products... that happen to be better than Google's own
         | products. Google went from "Don't be evil" to just another big
         | corporate tech company. They have so much potential.
         | Regrettable.
        
           | CraftingLinks wrote:
            | They are fast on their way to becoming IBM 2.0.
        
             | jason-phillips wrote:
             | More like Xerox
        
           | dyauspitr wrote:
            | If anything, Google search with the Gemini area at the top has
           | been very good for me.
        
         | atleastoptimal wrote:
         | Because they punish experimentation as it eats into their
         | bottom line. AI is a tool for ads in the mind of executives at
         | Google. Ads and monetization of human productivity, not an
         | agent of productivity on its own.
        
           | khazhoux wrote:
           | C'mon, Google doesn't "punish" experimentation. Google X,
            | Google Glass, Daydream, Fuchsia, moonshots, the lab spinoff
           | (whose name I can't remember)... hell, even all the abandoned
           | products everyone here always complains about.
           | 
           | The experiments often/usually fail, but they _do_ experiment.
        
             | Koffiepoeder wrote:
             | If you prune all the branches, where will the fruits grow?
        
               | khazhoux wrote:
               | The branches were dead and could bear no fruit. New
               | branches will sprout next season.
        
               | saalweachter wrote:
               | For grapes, the conventional wisdom is to prune all the
               | old branches at the end of each season.
        
           | lolinder wrote:
           | "Laser-focused on the bottom line at the expense of all else"
           | is not how I'd describe Google, now or at any point in the
           | past. They have a _lot_ of dysfunction, but if anything that
           | dysfunction stems from _too much_ experimentation and
            | autonomy at the leaf nodes of the organization. That's how
           | they get into these crazy places where they have to pick
           | between 5 chat apps or whatever.
           | 
           | If Google were as focused on ads as you seem to think we'd at
           | least see some sort of coherent org-wide strategy instead of
           | a complete lack of direction.
        
             | criddell wrote:
             | I'd describe Google as focused on the bottom line after
             | they put the ads guy in charge of search.
             | 
             | I'm referring to this article that was posted here
             | recently:
             | 
             | https://www.wheresyoured.at/the-men-who-killed-google/
        
               | khazhoux wrote:
               | The person now in charge of Search is Elizabeth Hamon
               | Reid, a long-time googler who came up through the ranks
               | from engineer (in Google Maps) to VP over 20 years. She's
               | legit.
        
               | criddell wrote:
               | Is Wikipedia out of date then?
               | 
               | https://en.wikipedia.org/wiki/Prabhakar_Raghavan
        
               | khazhoux wrote:
               | Ah, according to this, she's head of Search but reports
               | to Prabhakar. I thought from recent reports that she'd
               | taken search over from him.
               | 
               | Nonetheless, she was a good engineer and a good manager,
                | back when we crossed paths many moons ago.
               | 
               | https://searchengineland.com/liz-reid-google-new-head-of-
               | sea...
        
               | lolinder wrote:
               | That was a decision that prioritized the bottom line over
               | other things. But saying that Google is "focused" on the
               | bottom line implies that there's a pattern of them
               | putting the bottom line first, which is simply not true
               | if you look at Google as a whole. Search specifically,
               | maybe, but not Alphabet.
        
         | Workaccount2 wrote:
          | I don't know why more people don't talk about the 1M context
         | tokens. While the output is mediocre for cutting edge models,
         | you can context stuff the ever living hell out of it for some
         | pretty amazing capabilities. 2M tokens is even crazier.
        
           | lordswork wrote:
           | It is pretty amazing. I've been using it every day. I do wish
           | you could easily upload an entire repo into it though.
        
             | bongodongobob wrote:
             | Have it write a program to output a repo as a flat file.
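              | 
              | Something along these lines works as a starting point (a
              | minimal sketch; the extension filter, skipped directories
              | and output name are arbitrary choices):
              | 
              |     import os
              | 
              |     EXTS = {".py", ".md", ".txt", ".toml"}
              |     SKIP = {".git", "node_modules", "__pycache__"}
              | 
              |     with open("repo_flat.txt", "w", encoding="utf-8") as out:
              |         for root, dirs, files in os.walk("."):
              |             # prune directories we never want to include
              |             dirs[:] = [d for d in dirs if d not in SKIP]
              |             for name in files:
              |                 if os.path.splitext(name)[1] not in EXTS:
              |                     continue
              |                 path = os.path.join(root, name)
              |                 out.write(f"\n===== {path} =====\n")
              |                 with open(path, encoding="utf-8",
              |                           errors="replace") as f:
              |                     out.write(f.read())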
        
           | rm_-rf_slash wrote:
           | Anything approaching the token limit I turn into a file and
           | upload to a vector store. Results are comparable between Chat
           | and Assistants.
        
           | Keyframe wrote:
           | That's a good point. Gemini gatekeeping me on so many answers
           | made me forget about this extraordinary feature of it.
        
         | softwaredoug wrote:
         | It's often said you need to disrupt your own business model.
         | 
         | Google had blinders on. They didn't relentlessly focus on
         | reinventing their domain. They just milked what they had.
          | Gradually losing sight of the user experience[1] to focus on
         | monetization above all else.
         | 
         | 1 - https://twitter.com/pdrmnvd/status/1707395736458207430
        
         | dyauspitr wrote:
         | Their CEO is generating massive, growing profits every quarter
          | while releasing generative technology, all the while walking
         | a fine line in what those models generate because it can be
         | pretty devastating for a large corp like Google.
        
           | Keyframe wrote:
           | you think it's because of him or despite him?
        
       | airstrike wrote:
       | > Veo
       | 
       | > Sign up to try VisionFX
       | 
       | Is it Veo or VisionFX? Is it a sign up, a trial, or a waitlist?
       | 
       | How hard can it be to write a clear message? In the words of Don
       | Miller, if you confuse, you lose.
        
         | therein wrote:
         | Yeah I was like so is it Veo or VisionFX.
         | 
         | This landing page feels as haphazardly put together as the
         | Coinbase downtime page last night.
        
         | peppertree wrote:
         | This is very on-brand with how Google does branding. "are you
         | confused yet? no? try this other vaguely similar name."
        
           | davidw wrote:
           | Maybe it's going to be a new messaging app - but with AI!
           | 
           | Kidding... I signed up for the waitlist. I have ideas for
           | videos I'd like to use to explain things that I have no hope
           | of creating myself.
        
         | BlackJack wrote:
         | Disclaimer: I work at Google on related stuff
         | 
         | Veo is the name of a video model. VideoFX is the name of a new
         | experimental tool at labs.google.com, which uses Veo and lets
         | you make videos.
         | 
         | Thanks for the feedback though, I see how it's confusing for
         | users.
        
           | zb3 wrote:
           | I see the endpoint returns "Not Implemented" when trying to
           | make a video :<
           | 
           | Imagen 3 is awesome though, generates nice logos :D
        
         | mike_hearn wrote:
         | Presumably this is DeepMind vs Labs fighting over the same
         | project. A consequence of guaranteeing Demis some level of
         | independence when DeepMind was bought, which still shows
         | through in the fact that the DeepMind brand(s) survive.
        
         | qingcharles wrote:
         | And: Communication isn't what you say, it's what people hear
         | 
         | Agree this is totally confusing.
        
       | rishav_sharan wrote:
       | Now that the first direct competitor to Sora has been announced,
        | I am sure Sora will suddenly be ready for public consumption,
        | all its AI safety concerns forgotten
        
         | sebastiennight wrote:
         | I think there's a tremendous compute cost associated with both
         | models still... I can't see how either company could withstand
         | the instant enormous demand, even if they tried to command
         | crazy prices.
         | 
         | Even at $1 per 5-second video, I think some use cases
         | (including fun/non-business ones) would still overwhelm
         | capacity.
        
       | popcar2 wrote:
       | Not nearly as impressive as Sora. Sora was impressive because the
       | clips were long and had lots of rapid movement since video models
       | tend to fall apart when the movement isn't easy to predict.
       | 
       | By comparison, the shots here are only a few seconds long and
       | almost all look like slow motion or slow panning shots
       | cherrypicked because they don't have that much movement. Compare
       | that to Sora's videos of people walking in real speed.
       | 
       | The only shot they had that can compare was the cyberpunk video
       | they linked to, and it looks crazy inconsistent. Real shame.
        
         | nuz wrote:
          | Sora is also limited to a certain range of movement if you look
          | at the clips closely. Probably something like filtering by some
         | function of optical flow in both cases.
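          | 
          | Pure speculation of course, but as an illustration of the
          | idea: a motion score like this (OpenCV's Farneback flow; the
          | frame step and any thresholds are arbitrary) could be used to
          | keep only clips whose movement falls inside some band:
          | 
          |     import cv2
          |     import numpy as np
          | 
          |     def mean_flow_magnitude(path, step=5):
          |         """Average dense optical-flow magnitude for a clip."""
          |         cap = cv2.VideoCapture(path)
          |         ok, prev = cap.read()
          |         if not ok:
          |             return 0.0
          |         prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
          |         mags, i = [], 0
          |         while True:
          |             ok, frame = cap.read()
          |             if not ok:
          |                 break
          |             i += 1
          |             if i % step:
          |                 continue
          |             gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
          |             flow = cv2.calcOpticalFlowFarneback(
          |                 prev, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
          |             mags.append(np.linalg.norm(flow, axis=2).mean())
          |             prev = gray
          |         cap.release()
          |         return float(np.mean(mags)) if mags else 0.0
          | 
          |     # e.g. keep = LOW < mean_flow_magnitude("clip.mp4") < HIGH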
        
         | ein0p wrote:
         | Also Sora demos had some really impressive generations
         | featuring _people_. Here we hardly see any people which likely
         | means exactly what you'd guess.
        
           | data-ottawa wrote:
            | Has Gemini started generating images of people again? My
            | trial has ended and I haven't been following the issue.
        
         | spiderfarmer wrote:
         | Also the horse just looks weird, just like the buildings and
         | peppers.
         | 
         | It's impressive as hell though. Even if it would only be used
         | to extrapolate existing video.
        
         | LZ_Khan wrote:
          | I imagine that's just a function of how much training data you
         | throw at it.
        
         | Jensson wrote:
         | > Sora was impressive because the clips were long and had lots
         | of rapid movement
         | 
         | Sora videos ran at 1 beat per second, so everything in the
         | image moved at the same beat and often too slow or too fast to
         | keep the pace.
         | 
         | It is very obvious when you inspect the images and notice that
         | there are keyframes at every whole second mark and everything
          | on the screen suddenly goes to its next animation step.
         | 
         | That really limits the kind of videos you can generate.
        
           | lupire wrote:
           | So it needs to learn how far each object can travel in 1sec
           | at its natural speed?
        
             | Jensson wrote:
              | It also needs separate animation steps for different
              | objects so that objects can keep different speeds. It isn't
              | trivial at all to go from having a keyframe for the whole
              | picture to having separate ones for separate parts; you need
              | to retrain the whole thing from the ground up and the
              | results will be way worse until you figure out a way to
              | train that.
              | 
              | My point is that it isn't obvious at all that Sora's way
              | actually is closer to the end goal, it might look better
             | today to have those 1 second beats for every video but
             | where do you go from there?
        
               | Aerroon wrote:
                | The best case scenario would probably be being able to
                | generate "layers" one at a time. That would give more
               | creative control over the outcome, but I have no idea how
               | you would do it.
        
         | TIPSIO wrote:
         | Objectively speaking (if people would be honest with
         | themselves), both are just decent at best.
         | 
         | I think comparing them now is probably not that useful outside
         | of this AI hype train. Like comparing two children. A lot can
         | happen.
         | 
         | The bigger message I am getting from this is it's clear OpenAI
         | won't have a super AI monopoly.
        
           | TaylorAlexander wrote:
           | Comparing two children is a good one. My girlfriend has taken
           | to pointing out when I'm engaging in "punditry". They're an
           | engineer like I am and we talk about tech all the time, but
           | sometimes I talk about which company is beating which company
           | like it's a football game, and they call me out for it.
           | 
           | Video models are interesting, and to some extent trying to
           | imagine which company is gonna eat the other's lunch is kind
           | of interesting, but sometimes that's all people are
           | interested in and I can see my girlfriend's reasoning for
           | being disinterested in such discussion.
        
             | Jonanin wrote:
             | Except that many of the people involved do think of it like
             | a football game, and thus it actually is like one. Of
             | course the researchers and engineers at both OpenAI and
             | Google DeepMind have a sense of rivalry and strive to one
             | up another. They definitely feel like they are in a
             | competition.
        
               | TaylorAlexander wrote:
               | > They definitely feel like they are in a competition.
               | 
               | Citation needed?
               | 
               | Although I did not work in AI, I did work at Google X
               | robotics on a robot they often use for AI research.
               | 
               | Maybe some people felt like it was a competition, but I
               | don't have much reason to believe that feeling is common.
               | AI researchers are literally in collaboration with other
               | people in the field, publishing papers and reading the
               | work of others to learn and build upon it.
        
               | Jensson wrote:
               | > AI researchers are literally in collaboration with
               | other people in the field, publishing papers and reading
               | the work of others to learn and build upon it.
               | 
                | When OpenAI suddenly stopped publishing their stuff, I bet
                | many researchers started feeling like it had become a
                | competition.
               | 
               | OpenAI is no longer cooperating, they are just competing.
               | They still haven't said anything about how gpt-4 works.
        
           | motoxpro wrote:
           | What would make this "Good?"
        
           | Aeolun wrote:
            | I'm fairly certain Google just has a big stack of these in
            | storage that they never released, or the moment someone pulls
            | ahead it's all hands on deck to make the same thing.
        
         | arcastroe wrote:
         | > The shots here [..] almost all look like slow motion or slow
         | panning shots.
         | 
         | I think this is arguably better than the alternative. With
         | slow-mo generated videos, you can always speed them up in
         | editing. It's much harder to take a fast-paced video and slow
         | it down without terrible loss in quality.
        
         | totaldude87 wrote:
          | Could also be the doing of Google. If Veo screws up, the
          | weight falls on Alphabet stock, while OpenAI is not public and
          | doesn't have to worry about anything. Like even if OpenAI
          | faked some of their AI videos (not saying they did), it
          | wouldn't affect them the way it would affect Veo --> Google -->
          | Alphabet.
          | 
          | Being cautious often puts a dent in innovation
        
           | soulofmischief wrote:
           | You mean like how they faked some Gemini stuff?
           | 
           | https://www.bbc.com/news/technology-67650807
        
         | latexr wrote:
         | > Not nearly as impressive as Sora. Sora was impressive because
         | the clips were long and had lots of rapid movement
         | 
         | The most impressive Sora demo was heavily edited.
         | 
         | https://www.fxguide.com/fxfeatured/actually-using-sora/
        
           | rvz wrote:
           | Interesting to see that OpenAI was successful in creating
           | their own reality distortion spells, just like Apple's
           | reality distortion field which has fooled many of these
           | commenters here.
           | 
           | It's quite early to race to the conclusion that one is better
           | than the other when not only they are both unreleased, but
           | especially when the demos can be edited, faked or altered to
           | look great for optics and distortion.
           | 
           | EDIT: It appears there is at least one commenter who replied
           | below that is upset with this fact above.
           | 
           | It is OK to cope, but the truth really doesn't care
           | especially when the competition (Google) came out much
           | stronger than expected with their announcements.
        
             | ijidak wrote:
              | Well, as a counterpoint, Apple did become a $2 trillion
              | company...
             | 
             | Distortion is easiest when the products really work. :)
        
               | adventured wrote:
               | Apple got up to $3 trillion back in 2023.
        
               | turnsout wrote:
               | Indeed, and they're at 2.87T today... Built largely on
               | differentiated high-margin products, which is not how I
               | would describe OpenAI. I should clarify that I'm a fan of
               | both companies, but the reality is that OpenAI's business
               | model depends on how well it can commoditize itself.
        
           | jsheard wrote:
            | To Shy Kids' credit, _they_ made it clear the Sora footage
            | was heavily edited, but OpenAI's site still presents Air
            | Head
           | without that context.
           | 
           | https://www.youtube.com/watch?v=KFzXwBZgB88 (posted the day
           | after the short debuted)
           | 
           | https://openai.com/index/sora-first-impressions (no mention
           | of editing, nor do they link to the above making-of video)
        
             | seoulmetro wrote:
             | There is now on that second link:
             | 
             | >The videos below were edited by the artists, who
             | creatively integrated Sora into their work, and had the
             | freedom to modify the content Sora generated.
        
               | jsheard wrote:
               | Ha, here's an archive from yesterday for posterity.
               | 
                | https://web.archive.org/web/20240513050023/https://openai.co...
               | 
               | They also just added a link to the making-of video.
        
               | Aeolun wrote:
               | If you modified something because it got some attention
               | on HN, at least have the guts to own up to it :/
        
               | seoulmetro wrote:
               | That's hilarious. Your comment clearly got seen by
               | someone.
        
           | hanspeter wrote:
           | I believe it was clear that Air Head was an edited video.
           | 
           | The intention wasn't to show "This is what Sora can generate
           | from start to end" but rather "This is what a video
           | production team can do with Sora instead of shooting their
           | own raw footage."
           | 
           | Maybe not so obvious to others, but for me it was clear from
           | how the other demo videos looked.
        
         | dyauspitr wrote:
         | They're not showing people because that can get hairy quickly.
        
         | btown wrote:
         | A commercially available tool that can turn still images into
         | depth-conscious panning shots is still tremendously impactful
         | across all sorts of industries, especially tourism and
         | hospitality. I'm really excited to see what this can do.
        
         | pheatherlite wrote:
          | Not just that, but anything with a subject in it felt
          | uncanny-valley-ish... like that cowboy clip: the gait of the
          | horse stood out as odd, and then I gave it some attention.
          | It seems like a camel's gait, and the whole thing seems to
          | be hovering, gliding rather than walking. Sora indeed seems
          | to have an advantage.
        
           | __float wrote:
           | I thought a camel's gait is much closer to two legs moving
           | almost at the same time. Granted, I don't see camels often.
           | Out of curiosity can you explain that more?
        
       | axblount wrote:
       | I hate to be so cynical, but I'm dreading the inevitable flood of
       | AI generated video spam.
       | 
        | We really are about _this_ close to Infinite Jest. Imagine
        | TikTok's algorithm with on-demand video generation to suit
        | your exact tastes. It may erase the social aspect, but for
        | many users I doubt that would matter too much. "Lurking" into
        | oblivion.
        
         | lordswork wrote:
         | It's already here. There are communities forming around
         | generating passive income from mass producing AI videos as
         | tiktoks and shorts.
        
           | axblount wrote:
           | I saw one of those where a guy just made videos about
           | increasingly elaborate AI generated cakes. You're right, I
           | guess we're mostly there.
           | 
           | But those still require some human input. I'm imagining a
           | sort of genetic algorithm for video prompts, no human
           | editing, input, or curation required.
        
           | tikkun wrote:
           | What's the subreddit?
        
         | barbariangrunge wrote:
         | YouTube's endgame is to not need content creators in the loop
         | any more. The algorithm will just create everything
        
           | esafak wrote:
           | The endgame of that is that people will leave.
        
             | darby_eight wrote:
             | I'm somewhat surprised people still watch YouTube with the
             | horrible recommendations and non-stop spam
        
               | astrange wrote:
               | YouTube actually has really good recommendations and
               | comments these days.
               | 
               | In fact I would say the comments are too good. They
               | clearly have something ranking them for "niceness" but it
               | makes them impossibly sentimental. Like I watched a bunch
               | of videos about 70s rock recently and every single
               | comment was about how someone's family member just died
               | of cancer and how much they loved listening to it.
        
           | belter wrote:
           | Henry Ford II: Walter, how are you going to get those robots
           | to pay your union dues?
           | 
           | Walter Reuther: Henry, how are you going to get them to buy
           | your cars?
        
         | LZ_Khan wrote:
          | I had the same thought regarding Infinite Jest recently.
        
         | rm_-rf_slash wrote:
         | And somehow our exact tastes would also include influencer
         | coded advertisements.
        
         | beacon294 wrote:
         | Can you explain this aspect of infinite jest to me without
         | spoiling the book?
        
           | _xander wrote:
           | It's introduced early on (and not what the book is really
           | about): distribution of a video that is so entertaining that
           | any viewer is compelled to watch it until they die
        
         | jprete wrote:
         | At the bottom of the text blurb on the Veo page: "In the
         | future, we'll also bring some of Veo's capabilities to YouTube
         | Shorts and other products."
         | 
         | So...you're not cynical, it's an explicit product goal.
        
         | Invictus0 wrote:
         | This basically already exists for porn
        
         | redml wrote:
          | I think of it as replacing the SEO spam we have right now
          | with AI spam. At least now we can fight it with more AI.
        
           | sph wrote:
            | That's a naive statement to make.
        
         | layer8 wrote:
         | If it really suited my exact tastes, that would actually be
         | great. But I don't see how we're anywhere close to that. And
         | they won't target matching your exact taste. They will target
         | the threshold where it's just barely interesting enough that
         | people don't turn it off.
        
       | fidotron wrote:
       | If you were of a mind to give Google the benefit of the doubt you
       | would have to think they are desperately trying not to
       | overpromise and underdeliver, partly because that has been their
       | track record to date. It's a very curious time to choose to make
       | this switch though given their competition, and if it was
       | motivated by the reception Bard received then it shows they
       | didn't learn the right lessons from that mess at all.
        
       | aaroninsf wrote:
       | It's mildly interesting how many of the samples shown fail to
       | fully conform to the prompts. Lots of specifics are missing.
       | 
        | Kudos to Google for being entirely transparent about this,
        | even if they don't exactly foreground it.
        
       | willsmith72 wrote:
        | All of this stuff I'll believe when it's ready for public
        | release.
        | 
        | 1. Safety measures lead to huge quality reductions.
        | 
        | 2. The devil's in the details. You can make me 1 million
        | videos which look 99% realistic, but it's useless: consumers
        | can spot it instantly, and it's a gigantic turn-off for any
        | brand.
        
         | aprilthird2021 wrote:
          | There'll always be a market for cheap low-quality videos,
          | and conversely always a market for shockingly high-quality
          | videos. K. Asif's Mughal-e-Azam had enormous ticket sales
          | and a huge budget spent on all sorts of things, like actual
          | gold jewelry to make the actors feel they were important,
          | despite the film being black and white.
          | 
          | No matter how good AI gets, it will never be the highest
          | budget. Hell, even technically more accurate quartz watches
          | cannot command the prices of less accurate mechanical
          | masterpieces.
        
       | barbariangrunge wrote:
       | The company that controls online video is announcing a new tool,
       | and ambitions to develop it further, to create videos without
       | need for content creators. Using their videos to make a machine
       | that will cut them out of the loop.
        
         | infinitezest wrote:
          | Makes the very long Acknowledgments section at the bottom
          | extra rich.
        
       | hipadev23 wrote:
       | I've never had to click "Sign in" so many times in a row.
        
         | flying_whale wrote:
         | ...and then fill out an actual google form at the end, _after_
         | you've already signed in, to be added to the waitlist :sigh:
        
           | throwup238 wrote:
           | ...and enter your email into the form again despite being
           | logged into a Google account.
        
       | mrkramer wrote:
       | YouTube people: We need more UGC.
       | 
       | DeepMind people: AI can do it.
        
       | belval wrote:
       | While it's cool that they chose to showcase full-resolution
       | videos, they take so long to load I thought their videos were
       | just a stuttery mess.
       | 
       | Turns out if you open the video in a new tab the smoothness is
       | much more impressive.
        
       | robertlagrant wrote:
       | Hold on to your papers!
        
       | esafak wrote:
       | I love the reference to Llama with the alpacas.
        
       | typpo wrote:
       | The amount of negativity in these comments is astounding.
       | Congrats to the teams at Google on what they have built, and
       | hoping for more competition and progress in this space.
        
         | localfirst wrote:
          | We have to take into account that this community and
          | platform (a good chunk of whom have stakes in YC and a lot
          | to gain from secondary shares in OpenAI) is going to favor
          | its own; Sam Altman is the golden boy of YC's founders,
          | after all.
          | 
          | So of course you are going to see snarky comments and
          | straight-up denial of the competition. We saw that yesterday
          | in the comments on the GPT-4o release, timed in anticipation
          | of the Gemini 2.0 (GPT-5, basically) release being announced
          | today at Google I/O.
          | 
          | I'm SORA to say Veo looks much more polished, without the
          | jank.
          | 
          | Big congratulations to Google and their excellent AI team
          | for not editing their AI-generated videos like Sora's were.
        
           | JumpCrisscross wrote:
           | > _platform is going to favor its own and be aware that Sam
           | Altman is the golden boy of YC 's founder_
           | 
           | I don't know if there is a sentiment analysis tool for HN,
           | but I'm pretty sure it's been dead negative for Altman since
           | at least Worldcoin.
        
             | saalweachter wrote:
             | A land of contrasts, etc.
        
             | cosmotron wrote:
             | Something in this vein was just posted here a few days
             | back: https://news.ycombinator.com/item?id=40307519
        
           | baobabKoodaa wrote:
           | > We have to take account that this community (good chunk
           | have stakes in YC and a lot to gain from secondary shares in
           | OpenAI)
           | 
            | You have to be pretty deep inside your own little bubble
            | to think that more than 0.001% of HN has "stakes in YC" or
            | "secondary shares in OpenAI".
        
             | hu3 wrote:
              | It can be a vocal minority. Still vocal.
              | 
              | I wouldn't discount it.
        
               | dylan604 wrote:
                | I have 0% stake in any YC company, and I'm very vocal
                | in my negativity against any of these "AI" anythings.
                | All of these announcements are only slightly more than
                | a toddler anxious to show the parental units a finger
                | painting, hoping to hang it on the fridge. Only
                | instead of the fridge, they are hoping to get
                | funding/investment, knowing that their product is
                | _not_ a fully fledged anything. It's comical.
        
           | mrbungie wrote:
           | The amount of copium in this response is astounding.
           | 
            | Yes, there is a noticeable negative response from HN
            | towards Google, and there always has been, especially when
            | speaking about their weird product management practices
            | and incentives. Google hasn't launched any notable (and
            | still surviving; Stadia being a sad example) consumer
            | product or service in the last 10 years.
           | 
           | But to suggest there is a Sam Altman / OpenAI bias is
           | delusional. In most posts about them there is at least some
           | kind of skepticism or criticism towards Altman (his
           | participation in Worldcoin and his accelerationist stance
           | towards AGI) or his companies (OpenAI not being really open).
           | 
           | PS: I would say most people lurking here are just hackers (of
           | many kinds, but still hackers), not investors with shady
           | motives.
        
             | localfirst wrote:
              | My argument wasn't that there was a cabal of shady
              | investors trying to influence perception here. Your
              | observation is certainly valid that there is general
              | disdain for Google, but specifically I'm calling out
              | people who were blatantly telling lies, making
              | outlandish claims, and attacking others who simply
              | pointed out that some of those people have financial
              | motives (either being backed by YC or seeking to benefit
              | from the work of others).
             | 
             | None of this is surprising to me and shouldn't shock you.
             | You are literally on a site called Ycombinator. Had this
              | been another platform without ties to investments, or
              | drawing from a crowd that actively seeks to enrich
              | themselves
             | through participation in a narrative, this wouldn't even be
             | a thing.
             | 
              | A large number of people who read my comment seem to
              | agree, and this whole Worldcoin thing seems to me like
              | just another distraction (we've already been through why
              | that was shady, but we are talking about something
              | different here).
        
               | mrbungie wrote:
               | Well, you have a point. I've always thought that Hacker
               | News <> YCombinator, but maybe the truth is in the
               | middle. At the very least, this is food for thought.
        
             | astrange wrote:
             | > Google hasn't launched any notable (and still surviving,
             | Stadia being a sad example of this) consumer product or
             | service in the last 10 years.
             | 
             | Google Photos is less than 10 years old and I think a lot
             | of people use it.
        
           | betternet77 wrote:
            | Yup, there's a significant anti-Google spin on HN and
            | Twitter. For example, here's paulg claiming that Cruise
            | handles driving around cyclists better than Waymo [1],
            | which is obviously not true to anyone who's used both
            | services.
           | 
           | [1] https://twitter.com/paulg/status/1360341492850708481
        
             | [deleted]
        
         | rvz wrote:
         | You have to give Google credit as they went against the OpenAI
         | fanatics, Google doomsday crowd and some of the permanent
         | critics (who won't disclose they invested in OpenAI's secondary
         | share sale) that believe that Google can't keep up.
         | 
          | In fact, they already have. What OpenAI announced was
          | nothing that Google could not do already.
          | 
          | The top comments suggesting that Google was falling behind
          | on Sora vs Veo weren't even a point worth making in the
          | first place, given that both are still unavailable to use;
          | just typical HN nonsense.
        
           | JumpCrisscross wrote:
           | > _What OpenAI announced was nothing that Google could not do
           | already_
           | 
           | I don't think I've seen serious criticism of Google's
           | abilities. Apple didn't release anything that Xerox or IBM
           | couldn't do. The difference is they didn't.
           | 
           | Google's problem has always been in product follow through.
           | In this case, I fault them for having the sole action item be
           | a buried waitlist request and two new brands (Veo and
           | VideoFX) for one unreleased product.
        
             | sangnoir wrote:
             | > I don't think I've seen serious criticism of Google's
             | abilities
             | 
              | Serious or not, that criticism existed on HN - and still
              | does. I've seen many comments claiming Google has
              | "fallen behind" on AI, sometimes with the insinuation
              | that Google won't ever catch up due to OpenAI's
              | apparently insurmountable lead.
        
             | aprilthird2021 wrote:
             | I saw it here alone. A lot of people simply have no idea
             | the level of research ability and skill Google, the
             | inventor of the Transformer, has.
        
             | KorematsuFredt wrote:
             | > Google's problem has always been in product follow
             | through.
             | 
              | Google is large enough to not care about small
              | opportunities. It ends up focusing on bigger
              | opportunities that only it can execute well. Google's
              | ability to shut down products that don't work is an
              | insult to users but a very good corporate strategy, and
              | they deserve kudos for that.
             | 
             | Now, coming back to the "follow through". Google Search,
             | Gmail, Chrome, Android, Photos, Drive, Cloud etc. all are
             | excellent examples of Google's long term commitment to the
             | product and constantly making things better and keeping
             | them relevant for the market. Many companies like Yahoo!
             | had a head start but could not keep up with their mail
             | service.
             | 
              | Sure, it has shut down many small products, but that is
              | because they were unlikely to turn into bigger
              | opportunities. They often integrated the best aspects of
              | those products into other well-established ones; for
              | example, Google Trips and Google Shopping both became
              | part of Search.
        
               | falcor84 wrote:
               | > coming back to the "follow through". Google Search,
               | Gmail, Chrome, Android, Photos, Drive, Cloud etc. all are
               | excellent examples of Google's long term commitment
               | 
               | Do you have any examples of something they launched in
               | the last decade?
        
               | astrange wrote:
               | Photos was launched in the last decade.
        
               | troupo wrote:
               | > Google is large enough to not care about small
               | opportunities. It ends up focusing on bigger
               | opportunities
               | 
               | that result in shittier products overall. For example,
               | just a few months ago they cut 17 features from Google
               | Assistant because they couldn't monetize them, sorry,
               | because these were "small opportunities":
                | https://techcrunch.com/2024/01/11/google-is-removing-17-unde...
               | 
               | > all are excellent examples of Google's long term
               | commitment to the product and constantly making things
               | better and keeping them relevant for the market.
               | 
               | And here's a long list of excellent examples of Google
               | killing products right and left because small
               | opportunities or something: https://killedbygoogle.com/
               | 
               | And don't get me started on the whole
               | Hangouts/Meet/Alo/Duo/whatever fiasco
               | 
               | > Sure it has shut down many small products but that is
               | because they were unlikely to turn into bigger
               | opportunities.
               | 
               | Translation: because they couldn't find ways to monetize
               | the last cent out of them
               | 
               | ---
               | 
                | Edit: don't forget: the vast majority of Google's
                | money comes from selling ads. There's nothing else it
                | is capable of doing at any significant scale. The only
                | reason it doesn't "chase small opportunities" is
                | because Google _doesn't know how_. There are a few
                | smaller cash cows that it can keep chugging along, but
                | they are dwarfed by the single driving force that mars
                | everything at Google: the need to sell more and more
                | ads and monetize the shit out of everything.
        
           | localfirst wrote:
            | Don't forget that the Sora demo videos were edited, while
            | Google's here were not.
            | 
            | Where did OpenAI get all of Sora's training videos from
            | again, and why won't the executives answer a simple yes/no
            | question like "Did you scrape YouTube to train Sora?"
           | 
           | Google attorneys want to know.
        
             | scarmig wrote:
             | Google does not care to start a war where every company has
             | to form explicit legal agreements with every other company
             | to scrape their data. Maybe if they got really desperate,
             | but right now they have no reason to be.
        
             | TwentyPosts wrote:
             | > Don't forget SORA edited their "ai generated" videos
             | while Google did not here.
             | 
             | Wait, really? Could you point to proof for this? I'm very
             | curious where this is coming from
        
           | septic-liqueur wrote:
            | I have no doubt about Google's capabilities in AI; my
            | doubt lies in the productization part. I don't think they
            | can produce something that will not be a complete mess.
        
           | CSMastermind wrote:
           | > In fact, they already did.
           | 
           | In terms of software that's actually been released Google is
           | still at best in third place when it comes to AI products.
           | 
           | I don't care what they can demo, I care what they've shipped.
           | So far the only thing they've shipped for Veo is a waitlist.
        
         | Xenoamorphous wrote:
         | It's tiring. Same thing happened to the GPT-4o announcement
          | yesterday. Apparently because there's no unquestionable AGI
          | 14 months after GPT-4, everything sucks.
         | 
         | I always found HN contrarian but as I say it's really tiring.
         | I've no idea what the negative commenters are working on on a
         | daily basis to be so dismissive of everybody else's work,
         | including work that leaves 90% of the population in a
         | combination of awe and fear. Also people sometimes forget that
         | behind big corp names there are actual people. People who might
         | be reading this thread.
        
           | motoxpro wrote:
            | Yeah, it's pretty unfortunate. Saying something sucks
            | shows such a lack of understanding that things are not
            | static. I guess it's a sure way to be right, because there
            | will always be progress and you can look back and say
            | "See, I told you!"
        
             | IggleSniggle wrote:
             | Psh. Things are not static. Progress sucks now. Haven't you
             | heard of enshitification? You can always look back and say,
             | "see? I told you it would suck in the future!"
             | 
              | ...why am I feeling the urge to point out that I am only
             | making a joke here and not trying to make an actual counter
             | point, even if one can be made...?
        
               | piloto_ciego wrote:
               | I commented on this elsewhere, but being a negative Nancy
               | is really a winning strategy.
               | 
                | If you're negative and you get it wrong, nobody cares;
                | get it right and you look like a damn genius.
                | Conversely, if you're positive and get it wrong, you
                | look like an idiot, and if you're right you're praised
                | for a good call once. The rational "game theory"
                | choice is to predict calamity.
        
               | motoxpro wrote:
               | Yeah it's funny that optimism in the long term is optimal
               | and pessimism in the short term is optimal.
        
               | piloto_ciego wrote:
               | Right, but I think people sometimes get the "what
               | constitutes long term" factor a little bit wrong.
               | 
               | I am still talking to a lot of people who say, "what can
               | any of this AI stuff even do?" It's like, robots you
               | could hold a conversation with effectively didn't exist 3
               | years ago and you're already upset that it's not a money
               | tree?
               | 
                | I think that people's expectation horizons narrowing
                | may be the clearest evidence that we're in the
                | singularity.
        
           | mupuff1234 wrote:
           | What's also tiring is that no one is allowed to have any
           | critical thoughts because "it's tiring".
           | 
           | From my own perspective the critique is usually a counter
           | balance to extreme hype, so maybe let's just agree it's ok to
           | have both types of comments, you know "checks and balances".
        
             | piloto_ciego wrote:
             | Being cynical is not a counterbalance though, it's just as
             | low effort as the hype people.
        
           | Workaccount2 wrote:
           | AI is a pretty direct threat to software engineering. It's no
           | surprise people are hostile towards it. Come 2030, how do you
            | justify paying someone $175k/yr when a $20/mo app is 95% as
           | good, and the other 5% can be done by someone making $40k/yr?
        
             | astrange wrote:
             | Productivity improvements are good for workers; you should
             | ask yourself why the invention of the compiler didn't cause
             | this to happen already.
             | 
             | Or why the existence of the UK hasn't, since they have a
             | lot of English speaking programmers paid in peanuts.
        
         | piloto_ciego wrote:
         | I think it's fear. Maybe not openly, but people are spooked at
         | how fast stuff is happening, so shitting on progress is a
         | natural reaction.
        
           | Workaccount2 wrote:
           | I have noticed this the most in SWE's who went from being
            | code writers to "human intention decipherers". Ask an SWE
           | in 2019 what they do and it was "Write novel and efficient
           | code", ask one in 2024 and you get "Sit in meetings and talk
           | to project managers in order to translate their poor
           | communication to good code".
           | 
           | Not saying the latter was never true, it's just interesting
           | to see how people have reframed their work in the wake of
           | breakneck AI progress.
        
           | kmacdough wrote:
           | I suspect it's also a general fatigue with the over-hype. It
           | _is_ moving fast, but every step improvement has come with
           | its own mini hype cycle. The demos are very curated and make
            | the model look incredibly flexible and resilient. But when
            | we test the product in the wild, it's constantly
            | surprising which simple tasks it blunders on. It's natural
            | to become a bit
           | cynical and human to take that cynicism on the attack. Not
           | saying it's right, just natural, in the same way that it's
           | natural for the marketing teams to be as misleading as they
           | can get away with. Both are annoying, but there's not much to
           | do.
        
             | piloto_ciego wrote:
             | Cynicism is (arguably) the intellectually easy strategy.
             | 
             | If you're cynical and you get it right that everything
             | "sucks" you look like a genius, if you get it wrong there
             | is no penalty.
             | 
             | If you aren't cynical and you talk about how great
             | something is going to be and it flops you look like an
             | idiot. The social penalty is much higher.
        
           | brikym wrote:
           | Progress? There are loads of downsides the AI fans won't
           | acknowledge. It diminishes human value/creativity and will be
           | owned and controlled by the wealthiest people. It's not like
            | the horse being replaced by the tractor. This time it's
            | different: there is no place to move to but doing nothing
            | on UBI (best case). That same power also opens the door to
           | dystopian levels of censorship and surveillance. I see more
           | of the Black Mirror scenarios coming true rather than
           | breakthroughs that benefit society. Nobody is denying that
           | it's impressive but the question is more whether it's good
           | overall. Unfortunately the toothpaste seems to be out of the
           | tube.
        
             | piloto_ciego wrote:
             | >Progress? There are loads of downsides the AI fans won't
             | acknowledge.
             | 
             | I don't know if this is true.
             | 
             | >It diminishes human value/creativity
             | 
             | I don't see this at all, I see it as enhancing creativity
             | and human value.
             | 
             | >and will be owned and controlled by the wealthiest people.
             | 
             | There are a lot of open source models being created, even
             | if they are being released by Meta...
             | 
             | >It's not like the horse being replaced by the tractor.
             | This time it's different there is no place to move to but
             | doing nothing on a UBI (best case).
             | 
             | So, like, you wouldn't do anything if you could just chill
             | on UBI all day? If anything I'd get more creative.
             | 
             | > That same power also opens the door to dystopian levels
             | of censorship and surveillance.
             | 
             | I don't disagree with this at all, but I think we can fight
             | back here and overcome this, but we have to lean into the
             | tech to do that.
             | 
             | > I see more of the Black Mirror scenarios coming true
             | rather than breakthroughs that benefit society.
             | 
             | I think this is basically wrong historically. Things are
             | very seldom permanently dystopian if they're dystopian at
             | all. Things are demonstrably better than they were 100
             | years ago, and if you think back even a couple decades
             | things are often a lot better.
             | 
             | The medical applications alone will save a lot of lives.
             | 
             | > Nobody is denying that it's impressive but the question
             | is more whether it's good overall. Unfortunately the
             | toothpaste seems to be out of the tube.
             | 
             | There are going to be annoyances, but I would bet serious
             | cash that things continue to get better.
        
               | astrange wrote:
               | > So, like, you wouldn't do anything if you could just
               | chill on UBI all day? If anything I'd get more creative.
               | 
               | There is a lot of empirical research on UBI and all of it
               | shows that it has very little effect on employment either
               | way. That is, nothing will change here.
               | 
               | (This is probably because 1. positional goods exist 2.
               | romantic prospects don't like it when you're unemployed
               | even if you're rich.)
        
             | sshnuke wrote:
             | > It diminishes human value/creativity and will be owned
             | and controlled by the wealthiest people
             | 
             | "When you go to an art gallery, you are simply a tourist
             | looking at the trophy cabinet of a few millionaires" -
             | Banksy
        
               | piloto_ciego wrote:
               | Then... isn't AI generated art something that empowers
               | the non-millionaires?
        
         | jmkni wrote:
         | Well for me it linked to a Google Form to join a waitlist lol,
         | so I'm not exactly pumped
        
         | jtolmar wrote:
         | I think it's just hype fatigue.
         | 
         | There's genuinely impressive progress being made, but there are
         | also a lot of new models coming out promising way more than
         | they can deliver. Even the Google AI announcements, which used
         | to be carefully tailored to keep expectations low and show off
         | their own limitations, now read more like marketing puff
         | pieces.
         | 
         | I'm sure a lot of the HN crowd likes to pretend we're all
         | perfectly discerning arbiters of the tech future with our
         | thumbs on the pulse of the times or whatever, but realistically
         | nobody is going to sift through a mountain of announcements
         | ranging from "states it's revolutionary, is marginal
         | improvement" to "states it's revolutionary, is merely an
         | impressive step" to "states it's revolutionary, is bullshit"
         | without resorting to vibes-based analysis.
        
           | throwup238 wrote:
           | It's made all the worse by just being a giant waitlist. Sora
            | is still nowhere to be seen three months later, GPT-4o's
           | conversational features aren't widely rolled out yet, and
           | Google's AI releases have been waitlist after waitlist after
           | waitlist.
           | 
            | Companies can either get people hyped or have never-ending
           | georestricted waitlists, they can't have their cake and eat
           | it too.
        
             | indigodaddy wrote:
             | Isn't there a lot of positive forward motion and
             | fruitfulness in the current state of the open source
             | llama-3 community?
        
         | Dig1t wrote:
          | Honestly, I just think Google has burned its goodwill at
          | this point. If you notice, most announcements by Apple are
          | positively received here, and the same goes for OpenAI. But
          | Google's "don't be evil" persona has faded, and they've gone
          | through so much churn with their products that I think most
          | people just don't want to see them win.
        
         | rmbyrro wrote:
          | I hope they didn't mess this one up with ideologically
          | driven nonsense, like they did with Gemini.
        
       | clawoo wrote:
       | > "This tool isn't available in your country yet"
       | 
       | How did I know I would see this message before clicking "Sign up
       | to try"?
        
       | makestuff wrote:
        | Are there any good blogs/videos that ELI5 how these video
       | generation models even work?
        
       | sys32768 wrote:
       | I assume for consumers to use this, we must agree to have product
       | placements inserted into our productions every 48 seconds.
        
       | SoftTalker wrote:
       | Vaguely unsettling that the thumbnail for first example prompt "A
       | lone cowboy rides his horse across an open plain at beautiful
       | sunset, soft light, warm colors" looks something like the
       | pixelated vision of The Gunslinger android (Yul Brynner's
       | character) from the 1973 version of Westworld.
       | 
       | See 1:11 in this video
       | https://www.youtube.com/watch?v=MAvid5fzWnY
       | 
       | Incidentally that was one of the early uses of computer graphics
       | in a movie, supposedly those short scenes took many hours to
       | render and had to be done three times to achieve a colorized
       | image.
        
         | AceJohnny2 wrote:
         | Can't say I see a visual similarity. In any case, "Cowboy
         | silhouette in the sunset" is a pretty classic American visual.
         | 
         | But the parallel you made between android Brynner's vision and
         | the generated imagery is fun to consider!
        
       | totaldude87 wrote:
        | It's 2024 and AI is taking over, and yet to sign up for this
        | it takes way more clicks and a Google Form entry (1)
        | 
        | Sigh. I still have hopes for Veo though.
        
       | aragonite wrote:
       | With so much recent focus by OpenAI/Google on AI's visual
       | capabilities, does anyone know when we might see an OCR product
       | as good as Whisper for voice transcription? (Or has that already
       | happened?) I had to convert some PDFs and MP3s to text recently
       | and was struck by the vast difference in output quality.
        | Whisper's transcription was near-flawless, while all the OCR
        | software I tried struggled with formatting, missed words, and
        | made many
       | errors.
        
         | jazzyjackson wrote:
          | You might enjoy this breakdown of the lengths one person
          | went to, taking advantage of the iOS Vision API and creating
          | a local web service for transcribing some very challenging
          | memes:
         | 
         | https://findthatmeme.com/blog/2023/01/08/image-stacks-and-ip...
         | 
         | discussed on HN:
         | 
         | https://news.ycombinator.com/item?id=34315782
        
           | aragonite wrote:
           | This is so good - thanks for sharing this!
        
           | nunez wrote:
           | This is a work of fucking art.
        
         | thesandlord wrote:
          | We use GPT-4o for data extraction from documents; it's
          | really good. I published a small library that does a lot of
          | the document conversion and output parsing:
          | https://npmjs.com/package/llm-document-ocr
          | 
          | For straight OCR it works really well, but at the end of the
          | day it's still not 100%.
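          | 
          | For anyone curious, the general "vision model as OCR" call
          | looks roughly like the sketch below in Python (this is the
          | generic OpenAI SDK pattern, not the library linked above;
          | the file name and prompt are just placeholders):
          | 
          |     import base64
          |     from openai import OpenAI
          | 
          |     client = OpenAI()  # reads OPENAI_API_KEY from the env
          | 
          |     # Encode one page image (placeholder path) for the API
          |     with open("page-1.png", "rb") as f:
          |         b64 = base64.b64encode(f.read()).decode()
          | 
          |     resp = client.chat.completions.create(
          |         model="gpt-4o",
          |         messages=[{
          |             "role": "user",
          |             "content": [
          |                 {"type": "text",
          |                  "text": "Transcribe this page as Markdown,"
          |                          " preserving headings and tables."},
          |                 {"type": "image_url",
          |                  "image_url": {
          |                      "url": f"data:image/png;base64,{b64}"}},
          |             ],
          |         }],
          |     )
          |     print(resp.choices[0].message.content)
          | 
          | PDFs still need to be rasterized to images first (e.g. with
          | pdf2image), and as noted above the output isn't guaranteed
          | to be 100% faithful.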
        
           | aragonite wrote:
           | Thanks! look forward to checking this out as soon as I get
           | home.
        
       | tauntz wrote:
        | Uh... First it tells me that I can't sign up because my
        | country isn't supported (yay, EU), but I can sign up to be
        | notified when it's
       | actually available. Great, after I complete that form, I get an
       | error that the form can't be submitted and I'm taken to
       | https://aitestkitchen.withgoogle.com/tools/video-fx where I can
       | only press the "Join our waitlist" button. This takes me to a
       | Google Form, that doesn't have my country in the required country
       | dropdown and has a hint that says: "Note: the dropdown only
       | includes countries where ImageFX and MusicFX are publicly
       | available.". Say what?
       | 
       | Why does this have to be so confusing? Is the name "Veo" or
       | "VideoFX"? Why is the waitlist for VideoFX telling me something
       | about public availability of ImageFX and MusicFX? Why is
       | everything US only, again? Sigh..
        
         | pelorat wrote:
         | We can blame the EU AI act and other regulations for that.
        
       | benatkin wrote:
        | Yo lo veo. (Spanish: "I see it.")
        
       | wseqyrku wrote:
        | Google puts more effort into the naming than into the actual
        | model, ngl.
        
       | bpodgursky wrote:
       | I think it's funny the demos don't have people in them after the
       | Gemini fiasco. I wonder if they didn't have time to re-train the
       | model to show representative ethnicities.
        
       | thih9 wrote:
       | Is there any non slow motion example?
       | 
       | The cyberpunk video seems better in that aspect, but I wish there
       | were more.
        
       | gliched_robot wrote:
        | This is far superior to Sora; there is no comparison.
        
         | monkeeguy wrote:
         | lol
        
       | xnx wrote:
       | 60 second example video:
       | https://www.youtube.com/watch?v=diqmZs1aD1g
        
         | candiddevmike wrote:
         | For some reason this video reminds me of dreaming--details just
         | kind of pop in and out and the entire thing seems very surreal
         | and fractal.
        
           | jprete wrote:
           | Same impression here. The scene changes very abruptly from a
           | sky view to following the car. The cars meld with the ground
           | frequently, and I think I saw one car drive through another
           | at one point.
        
         | nixpulvis wrote:
         | So... much... bloom. I like it, but still holy shit. I hate
         | that I like it because I don't want this art form to be reduced
         | by overuse. Sadly, it's too late.
         | 
         | I'll just go back to living under a rock.
        
         | londons_explore wrote:
         | Looks like in places this has learned video compression
         | artifacts...
        
           | exodust wrote:
           | Funny if true. Perhaps in some generated video it will
           | suddenly interrupt the sequence with pretend unskippable ads
           | for phone cases & VPNs.
        
         | datashaman wrote:
         | 1080p but it has pixelated artifacts...
        
       | svag wrote:
        | An interesting thing that Google does is watermark the
        | AI-generated videos using the [SynthID
        | technology](https://deepmind.google/technologies/synthid/).
        | 
        | It seems that SynthID is not only for AI-generated video but
        | also for images, text and audio.
        
         | bardak wrote:
         | I would like a bit more convincing that the text watermark will
         | not be noticeable. AI text already has issues with using
         | certain words to frequently. Messing with the weights seems
         | like it might make the issue worse
        
           | Tostino wrote:
            | Not to mention, when does it get applied? If I am asking
            | an LLM to transform some data from one format to another,
            | I don't expect any changes other than the format.
        
         | padolsey wrote:
          | It seems really clever, especially the encoding of a
          | signature into LLM token probability selections. I wonder if
          | SynthID will trigger some standardization in the industry. I
          | don't think there's much incentive to, though; open-source
          | gen AI will still exist. What does Google expect to occur? I
          | guess they're just trying to present themselves as
          | 'ethically pursuing AI'.
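          | 
          | For a rough feel of how a token-probability watermark can
          | work, here's a generic "green list" logit-bias sketch in
          | Python (an illustration of the idea only, not SynthID's
          | actual scheme; the constants and key are made up):
          | 
          |     import hashlib
          |     import numpy as np
          | 
          |     VOCAB, BIAS, KEY = 50_000, 2.0, b"made-up-key"
          | 
          |     def green_mask(prev_token: int) -> np.ndarray:
          |         # A keyed hash of the previous token seeds an RNG
          |         # that marks roughly half the vocabulary "green".
          |         digest = hashlib.sha256(
          |             KEY + prev_token.to_bytes(4, "big")).digest()
          |         rng = np.random.default_rng(
          |             int.from_bytes(digest[:8], "big"))
          |         return rng.random(VOCAB) < 0.5
          | 
          |     def sample_watermarked(logits, prev_token):
          |         # Nudge green-listed logits up before sampling.
          |         biased = logits + BIAS * green_mask(prev_token)
          |         p = np.exp(biased - biased.max())
          |         p /= p.sum()
          |         rng = np.random.default_rng()
          |         return int(rng.choice(VOCAB, p=p))
          | 
          | A detector that knows the key re-derives the green sets and
          | checks whether far more than half of the observed tokens
          | landed in them; the bias mostly tilts choices the model was
          | already uncertain about, so quality barely changes.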
        
       | s1k3s wrote:
       | This looks really good for promo videos. All scenes in here are
       | basically that.
        
       | KorematsuFredt wrote:
        | I think we should all take a pause and just appreciate the
        | amazing work Google, OpenAI, MS and many others, including
        | those in academia, have done. We do not know if Google or
        | OpenAI or someone else is going to win the race, but unlike
        | many other races, this one makes all of humanity move faster.
        | Keep the negativity aside and appreciate the sweat and late
        | nights people have poured into making such things happen. The
        | majority of these people are pretty ordinary folks working for
        | a salary so they can spend time with their families.
        
         | myaccountonhn wrote:
          | The majority of the people building the AI are artists
          | having their work stolen, or workers earning extremely low
          | wages to label gory and CSAM data to the point where it
          | hurts their mental health.
        
       | ugh123 wrote:
       | From a filmmaking standpoint I still don't think this is
       | impactful.
       | 
        | For that it needs a "director" to say "turn the horse's head
        | 90° the other way, trot 20 feet, and dismount the rider" and
        | "give me additional camera angles" of the same scene.
        | Otherwise this is mostly B-roll content.
       | 
       | I'm sure this is coming.
        
         | evantbyrne wrote:
         | They claim it can accept an "input video and editing command"
         | to produce a new video output. Also, "In addition, it supports
         | masked editing, enabling changes to specific areas of the video
         | when you add a mask area to your video and text prompt." Not
         | sure if that specific example would work or not.
        
         | qingcharles wrote:
         | I can see using these video generators to create video
         | storyboards. Especially if you can drop in a scribbled sketch
         | and a prompt for each tile.
        
           | ancientworldnow wrote:
            | That sounds actively harmful. Often we want storyboards to
            | be less specific, so as not to have some non-artist
            | decision-maker ask why it doesn't look like the
            | storyboard.
            | 
            | And when we do want it to match exactly, in an animatic or
            | whatever, it needs to be far more precise than this,
            | matching real locations etc.
        
             | sbarre wrote:
             | I know you weren't implying this, but not every storyboard
             | is for sharing with (or seeking approval from) decision
             | makers.
             | 
             | I could see this being really useful for exploring tone,
             | movement, shot sequences or cut timing, etc..
             | 
             | Right now you scrape together "kinda close enough" stock
             | footage for this kind of exploration, and this could get
             | you "much closer enough" footage..
        
               | shermantanktop wrote:
               | I think of it in terms of the anchoring bias. Imagine
               | that your most important decisions are anchored for you
               | by what a 10 year old kid heard and understood. Your
               | ideas don't come to life without first being rendered as
               | a terrible approximation that is convincing to others but
               | deeply wrong to you, and now you get to react to that
               | instead of going through your own method.
               | 
               | So if it's an optional tool, great, but some people would
               | be fine with it, some would not.
        
               | sbarre wrote:
               | Absolutely. Everyone's creative process is different (and
               | valid).
        
             | gregmac wrote:
             | I hadn't thought about that in movie context before, but it
             | totally makes sense.
             | 
             | I've worked with other developers that want to build high
             | fidelity wire frames, sometimes in the actual UI framework,
             | probably because they can (and it's "easy"). I always push
             | back against that, in favor of using whiteboard or
             | Sharpies. The low-fidelity brings better feedback and
             | discussion: focused on layout and flow, not spacing and
             | colors. Psychologically it also feels temporary, giving
             | permission for others to suggest a completely different
             | approach without thinking they're tossing out more than a
             | few minutes of work.
             | 
             | I think in the artistic context it extends further, too: if
             | you show something too detailed it can anchor it in
             | people's minds and stifle their creativity. Most people
             | experience this in an ironically similar way: consider how
             | you picture the characters of a book differently depending
             | on if you watched the movie first or not.
        
             | cpill wrote:
              | I guess this will give birth to a new kind of
              | filmmaking. Start with a rough sketch, generate 100
              | higher-quality versions with an image generator, select
              | one to tweak, use that as input to a video generator
              | which generates 10 versions, choose one to refine, etc.
        
         | sailfast wrote:
         | For most things I view on the internet B-roll is great content,
         | so I'm sure this will enable a new kind of storytelling via
         | YouTube Shorts / Instagram, etc at minimum.
        
         | Eji1700 wrote:
         | There's also the whole "oh you have no actual
         | model/rigging/lighting/set to manipulate" for detail work
         | issue.
         | 
          | That said, I personally think the solution will not be
          | coming that soon, but at the same time we'll be seeing a LOT
          | more content that can be done using current tools, even if
          | that means a (severe) dip in quality due to the cost it
          | might save.
        
           | SJC_Hacker wrote:
            | This leads me to the question of why there hasn't been an
            | effort to do this with 3D content (that I know of).
           | 
           | Because camera angles/lighting/collision detection/etc. at
           | that point would be almost trivial.
           | 
           | I guess with the "2D only" approach that is based on actual,
           | acquired video you get way more impressive shots.
           | 
           | But the obvious application is for games. Content generation
            | in the form of modeling and animation is actually one of
            | the biggest cost centers for most studios these days.
        
         | chacham15 wrote:
          | I don't think "turn the horse's head 90°" is the right path
          | forward. What I think is more likely and more useful is:
          | here is a start keyframe and here is a stop keyframe
          | (generated by text-to-image, using other things like
          | ControlNet to control positioning etc.), and then the AI
          | generates the frames in between. Don't like the way it
          | generated the in-between? Choose a keyframe, adjust it, and
          | rerun with the segment before and the segment after.
        
           | GenerocUsername wrote:
            | This appeals to me because it feels auditable and
            | controllable... But at the pace these things have been
            | progressing over the last 3 years, I could imagine the
            | tech leapfrogging all conventional understanding real
            | soon. Likely producing Gaussian-splat-style outputs where
            | the scene is separate from the camera and all the pieces
            | can be independently tweaked from a VR director's chair.
        
           | 8note wrote:
           | So a declarative keyframe of "the horses head is pointed
           | forward" and a second one of "the horse is looking left"
           | 
           | And let the robot tween?
           | 
           | Vs an imperative for "tween this by turning the horse's head
           | left"
        
         | aetherson wrote:
         | Yeah, I've made a lot of images, and it sure is amazing if all
         | you're interested in is, like, "Any basically good image," but
         | if you start needing something very particular, rather than
         | "anything that is on a general topic and is aesthetically
         | pleasing," it gets a lot harder.
         | 
         | And there are a lot more degrees of freedom to get something
         | wrong in film than in a single still image.
        
         | teaearlgraycold wrote:
         | Everything I've heard from professionals backs that up. Great
         | for B roll. Great for stock footage. That's it.
        
         | imachine1980_ wrote:
         | Stock videos are indeed crucial, especially now that we can
         | easily search for precisely what we need. Take, for instance,
          | the scene at the end of 'Don't Look Up' featuring an
          | indigenous dance in Peru. The dancer's movements were
          | captured from a stock video, and the falling comet was
          | seamlessly edited in. Now imagine having near-infinite stock
          | videos tailored to the situation.
        
           | rzmmm wrote:
            | Stock photographers are already having issues with piracy
            | due to very powerful AI watermark-removal tools. And I
            | suspect the companies are using these people's content to
            | train these models too.
        
         | gedy wrote:
          | I think with AI content we'd need to not expect fine-grained
          | control. E.g. instead something like "dramatic scene of
          | rider coming down path, dismounting horse, then looking into
          | distance", etc. (Or even less detail eventually, once a
          | cohesive story can be generated.)
        
         | lofaszvanitt wrote:
          | I can't wait to see what the big video camera makers are
          | going to do with tech similar to this. Since Google clearly
          | has zero idea what to do with it, and they lack the
          | creativity, it's up to ARRI, Canon, Panasonic etc. to create
          | their own solutions for this tech. I can't wait to see what
          | Canon has up its sleeve with their new offerings coming in a
          | few months.
        
         | larodi wrote:
          | Perhaps the only industry that immediately benefits from
          | this is short ads, and perhaps TikTok. But even that is
          | dubious, as people seem to actually enjoy being the
          | directors of their own thing, rather than leaving it to
          | somebody else.
          | 
          | Maybe this works for ads for a döner place or shisha bar in
          | some developing country; I've seen generated images used for
          | menus in such places.
          | 
          | But I doubt serious filmmaking can be done this way. And if
          | it can, it'd again be thanks to some smart concept on the
          | part of humans.
        
         | kmacdough wrote:
         | I wouldn't be so sure it's coming. NNs currently don't have the
         | structures for long term memory and development. These are
         | almost certainly necessary for creating longer works with real
         | purpose and meaning. It's possible we're on the cusp with some
         | of the work to tame RNNs, but it's taken us years to really
         | harness the power of transformers.
        
         | thehappypm wrote:
         | If you or I don't see the potential here, I think that just
         | means someone more creative is going to do amazing things with
         | it
        
       | iamleppert wrote:
       | Too little, too late. Google is a follower, not a leader. They
       | need to stop trying and do more stock buybacks, and strip the
       | company to the bare bones, like Musk did with Twitter & Tesla.
        
       | NegativeLatency wrote:
       | Shoulda used YouTube to host their video; it's all broken and
       | pixelated for me
        
       | m3kw9 wrote:
       | Why is it always in slow motion? Is it hard to get the speed
       | correct?
        
       | miohtama wrote:
       | > Veo's cutting-edge latent diffusion transformers reduce the
       | appearance of these inconsistencies, keeping characters, objects
       | and styles in place, as they would in real life.
       | 
       | How is this achieved? Is there temporal memory between frames?
        
         | hackerlight wrote:
         | Probably similar to Sora, a patchified vision transformer, you
         | sample a 3d patch (third dimension is time) instead of a 2d
         | patch
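
           A minimal PyTorch sketch of that kind of spatiotemporal ("3D
           patch") embedding; the patch sizes, embedding dimension and
           Conv3d-based projection are illustrative assumptions, not
           details of Veo's or Sora's actual architecture. Because each
           token spans several frames as well as a spatial region,
           attention between tokens couples information across time,
           which is one plausible way such models keep objects
           consistent between frames.

           import torch
           import torch.nn as nn

           class PatchEmbed3D(nn.Module):
               """Split a video into non-overlapping 3D (time x height
               x width) patches and project each to a token embedding."""

               def __init__(self, dim=768, pt=2, ph=16, pw=16, in_ch=3):
                   super().__init__()
                   # A strided 3D convolution is a standard way to
                   # patchify: each output position is one 3D patch.
                   self.proj = nn.Conv3d(in_ch, dim,
                                         kernel_size=(pt, ph, pw),
                                         stride=(pt, ph, pw))

               def forward(self, video):
                   # video: (batch, channels, frames, height, width)
                   x = self.proj(video)       # (B, D, T', H', W')
                   return x.flatten(2).transpose(1, 2)  # (B, N, D)

           # 16 frames of 128x128 RGB -> a sequence of 512 patch tokens
           tokens = PatchEmbed3D()(torch.randn(1, 3, 16, 128, 128))
           print(tokens.shape)  # torch.Size([1, 512, 768])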
        
       | toasted-subs wrote:
       | I could say something but I'm glad to get the confirmation.
        
       | shaunxcode wrote:
       | truly removing the `id` from video.
        
       | abledon wrote:
       | The music is lacking... Suno, Udio and Riffusion all blow this
       | out of the water.
        
       | ijidak wrote:
       | These will be remembered as the AI wars.
       | 
       | Reminds me of the competition in tech in the late '80s and early
       | '90s between Microsoft and Borland, Microsoft and IBM, AMD and
       | Intel, Word vs. WordPerfect, etc.
       | 
       | It's a two horse race between Google and OpenAI.
        
       | animanoir wrote:
       | Google is so finished... Unless they remove Mr. Pinchar...
        
       | infinitezest wrote:
       | > A fast-tracking shot through a bustling dystopian sprawl
       | 
       | How apropos...
        
       | nosmokewhereiam wrote:
       | Made an album in 10 minutes. As a techno DJ I'd typically mix
       | the tracks together, so they sound kinda bare right now.
       | 
       | Here's my 10 minutes to 12:09 album debut:
       | 
       | https://on.soundcloud.com/FAXkJrLrC2JjoAyu7
        
         | TazeTSchnitzel wrote:
         | Even as a very inexperienced musician I think I can say these
         | are not very compelling examples? They sound like unfinished
         | sketches that took a few minutes to make each, but with no
         | overarching theme and weirdly low fidelity. An absolute
         | beginner could make better things just by messing around with a
         | groovebox.
        
       | rlhf wrote:
       | It seems so real, cool.
        
       | salamo wrote:
       | The first thing I will do when I get access to this is ask it to
       | generate a realistic chess board. I have never gotten a decent-
       | looking chessboard out of any image generator: the pieces are
       | deformed, the board doesn't have the correct number of squares or
       | a proper checkerboard pattern, pieces aren't placed in correct
       | positions, the board isn't oriented properly (white on the
       | right!), or the position is otherwise illegal. It seems to be an
       | "AI complete" problem.
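
         A minimal sketch of how some of those constraints could be
         checked programmatically, assuming the rendered board has first
         been transcribed into FEN notation (by hand or by a separate
         vision step); it uses the python-chess library and only covers
         position legality, not visual defects such as deformed pieces
         or a wrong number of squares.

         import chess  # pip install python-chess

         def check_position(fen: str) -> list[str]:
             """Return a list of problems with a position given as FEN."""
             try:
                 board = chess.Board(fen)
             except ValueError:
                 return ["FEN could not be parsed at all"]
             problems = []
             # board.status() bundles checks such as exactly one king
             # per side, no pawns on the back ranks, and the side not
             # to move not being in check.
             status = board.status()
             if status != chess.STATUS_VALID:
                 problems.append(f"illegal position: {status!r}")
             # A python-chess board always has 64 squares, so "correct
             # number of squares" and "checkerboard pattern" would
             # have to be checked on the image itself, not here.
             return problems

         # The standard starting position passes; two white kings don't.
         print(check_position(chess.STARTING_FEN))
         print(check_position("4k3/8/8/8/8/8/8/KK6 w - - 0 1"))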
        
         | arcticbull wrote:
         | Similarly the Veo example of the northern lights is a really
         | interesting one. That's not what the northern lights look like
         | to the naked eye - they're actually pretty grey. The really
         | bright greens and even the reds really only come out when you
         | take a photo of them with a camera. Of course the model
         | couldn't know that because, well, it only gets trained on
         | photos. Gets really existential - simulacra energy - maybe
         | another good AI Turing test, for now.
        
           | 22c wrote:
           | I've only ever seen photos of the northern lights and I also
           | didn't know that.
        
           | sdenton4 wrote:
           | That doesn't seem in any way useful, though... To use a very
           | blunt analogy, are color blind people
           | intelligent/sentient/whatever? Obviously, yes: differences in
           | perceptual apparatus aren't useful indicators of
           | intelligence.
        
             | shermantanktop wrote:
             | As a colorblind person...I could see the northern lights
             | way better than all the full-color-vision people around me
             | squinting at their phones.
             | 
             | Wider bandwidth isn't always better.
        
               | Ferret7446 wrote:
               | > I could see the northern lights way better than all the
               | full-color-vision people around me
               | 
               | How would you know?
        
               | squeaky-clean wrote:
               | Quote the entire sentence, not just a portion of it.
        
           | pmlarocque wrote:
            | That's not true. They look grey when they aren't bright
            | enough, but they can look green or red to the naked eye if
            | they are bright. I have seen it myself, and yes, I was
            | disappointed to see only grey ones last week.
           | 
           | see: https://theconversation.com/what-causes-the-different-
           | colour...
        
             | arcticbull wrote:
             | > [Aurora] only appear to us in shades of gray because the
             | light is too faint to be sensed by our color-detecting cone
             | cells."
             | 
             | > Thus, the human eye primarily views the Northern Lights
             | in faint colors and shades of gray and white. DSLR camera
             | sensors don't have that limitation. Couple that fact with
             | the long exposure times and high ISO settings of modern
             | cameras and it becomes clear that the camera sensor has a
             | much higher dynamic range of vision in the dark than people
             | do.
             | 
             | https://www.space.com/23707-only-photos-reveal-aurora-
             | true-c...
             | 
             | This aligns with my experiences.
             | 
              | In the brightest ones I saw in Northern Canada I saw hints
              | of red - but no real green - until I looked at them
              | through my phone, and then they looked just like the
              | simulated video.
             | 
             | If I looked up and saw them the way they appear in the
             | simulation, in real life, I'd run for a pair of leaded
             | undies.
        
               | Tronno wrote:
               | I've seen it bright green with the naked eye. It
               | definitely happens. That article is inaccurate.
        
               | Kiro wrote:
                | That is totally incorrect, as anyone who has seen real
                | northern lights can attest. I'm sorry that you haven't
                | gotten the chance to experience them and now think all
                | northern lights are that lackluster.
        
               | kortilla wrote:
               | This is such an arrogant pile of bullshit. I've seen very
               | obvious colors on many different occasions in the
               | northern part of the lower 48, up in southern Canada, and
               | in Alaska.
        
               | Maxion wrote:
                | Green is the more common color; reds and blues occur in
                | higher-energy solar storms.
                | 
                | And yes, they can be as green to the naked eye as in that
                | AI video. I've seen, with my own eyes, aurora shows that
                | fill the entire night sky from horizon to horizon, way
                | more impressive than that AI video.
        
           | paxys wrote:
           | That's not true at all. I have seen northern lights with my
           | own eyes that were more neon green and bright purple than any
           | mainstream photo.
        
             | cryptoz wrote:
             | There's a middle ground here. I saw the northern lights
             | with my own eyes just days ago and it was mostly grey. I
             | saw some color. But when I took a photo with a phone
             | camera, the color absolutely _popped_. So it may be that
              | you've seen more color than any photo, but the average
             | viewer in Seattle this past weekend saw grey-er with their
             | eyes and huge color in their phone photos.
             | 
             | (Edit: it was still _super-cool_ even if grey-ish, and
              | there were absolutely beautiful colors in there if you could
             | find your way out of the direct city lights)
        
               | goostavos wrote:
               | The hubris of suggesting that your single experience of
               | vaguely seeing the northern lights one time in Seattle
               | has now led to a deep understanding of their true "color"
               | and that the other person (perhaps all other people?)
               | must be fooling themselves is... part of what makes HN so
               | delightful to read.
               | 
               | I've also seen the northern lights with my own eyes. Way
               | up in the arctic circle in Sweden. Their color changes
               | along with activity. Grey looking sometimes? Sure. But
               | also colors that are so vivid that it feels like it
               | envelopes your body.
        
               | stavros wrote:
               | They did say "the average viewer in Seattle this past
               | weekend", not "all other viewers".
               | 
               | Then again, the average viewer in Seattle this past
               | weekend is hardly representative of what the northern
               | lights look like.
        
               | lpapez wrote:
               | > The hubris of suggesting that your single experience of
               | vaguely seeing the northern lights one time in Seattle
               | has now led to a deep understanding of their true "color"
               | and that the other person (perhaps all other people?)
               | must be fooling themselves is... part of what makes HN so
               | delightful to read.
               | 
               | The H in HN stands for Hubris.
        
               | freedomben wrote:
               | The person they were responding to was saying that the
               | people reporting grays were wrong, and that they had seen
               | it and it was colorful. If anything, you should be
                | accusing that person of hubris, not the GP. The GP's whole
                | point was that it can differ in different situations. They
                | used the example of Seattle to show that the person they
                | were responding to is not correct that it is never gray
                | and dull.
        
               | mitthrowaway2 wrote:
               | The human retina effectively combines a color sensor with
               | a monochrome sensor. The monochrome channel is more
               | light-sensitive. When the lights are dim, we'll dilate
               | our pupils, but there's only so much we can do to
               | increase exposure. So in dim light we see mostly in
               | grayscale, even if that light is strongly colored in
               | spectral terms.
               | 
               | Phone cameras have a Bayer filter which means they _only_
               | have RGB color-sensing. The Bayer filter cuts out some
               | incoming light and dims the received image, compared with
                | what a monochrome camera would see. But that's how you
               | get color photos.
               | 
               | To compensate for a lack of light, the phone boosts the
               | gain and exposure time until it gets enough signal to
               | make an image. When it eventually does get an image, it's
               | getting a _color_ image. This comes at the cost of some
                | noise and motion-blur, but it's that or no image at all.
               | 
               | If phone cameras had a mix of RGB and monochrome sensors
               | like the human eye does, low-light aurora photos might
               | end up closer to matching our own perception.
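
                 A minimal numpy sketch of the trade-off described
                 above, with purely illustrative numbers: a dim,
                 strongly coloured patch read out directly is
                 noise-dominated; collecting more signal (longer
                 exposure / higher gain) ahead of the same noise floor
                 recovers the colour; and pooling the channels into one
                 luminance value (a crude stand-in for rod vision)
                 improves signal-to-noise but discards hue.

                 import numpy as np

                 rng = np.random.default_rng(0)

                 # Dim but strongly green "aurora" patch (linear RGB).
                 scene = np.array([0.005, 0.020, 0.006])
                 noise = 0.004  # fixed noise floor added at readout

                 # 1) Short-exposure colour readout: each channel sits
                 #    near the noise floor, so colour ratios come out
                 #    distorted.
                 short_rgb = scene + rng.normal(0, noise, 3)

                 # 2) Long exposure / high gain: collect 40x more
                 #    signal before the same noise floor, then scale
                 #    back down for display.
                 gain = 40.0
                 long_rgb = (scene * gain + rng.normal(0, noise, 3)) / gain

                 # 3) Monochrome pooling (crude stand-in for rods):
                 #    summing channels improves SNR but discards hue.
                 lum = scene.sum() + rng.normal(0, noise)

                 print("true scene:", scene)
                 print("short RGB :", short_rgb.round(4))
                 print("long RGB  :", long_rgb.round(4))
                 print("luminance :", round(float(lum), 4))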
        
           | laserbeam wrote:
           | For decades, game engines have been working on realistic
           | rendering. Bumping quality here and there.
           | 
            | The gold standard for rendering has always been cameras.
           | It's always photo-realistic rendering. Maybe this won't be
           | true for VR, but so far most effort is to be as good as
           | video, not as good as the human eye.
           | 
           | Any sort of video generation AI is likely to have the same
           | goal. Be as good as top notch cameras, not as eyes.
        
           | garyrob wrote:
           | Even in NY State, Hudson River Valley, I've seen them with
           | real color. They're different each time.
        
           | blhack wrote:
           | Have you ever seen the Northern Lights with your eyes? If so
           | I'm curious where you saw them.
           | 
           | I echo what some other posters here have said: they're
           | certainly not gray.
        
           | porphyra wrote:
           | Human eyes are basically black and white in low light since
           | rod cells can't detect color. But when the northern lights
           | are bright enough you can definitely see the colors.
           | 
           | The fact that some things are too dark to be seen by humans
           | but can be captured accurately with cameras doesn't mean that
           | the camera, or the AI, is "making things up" or whatever.
           | 
           | Finally, nobody wants to see a video or a photo of a dark,
           | gray, and barely visible aurora.
        
             | exodust wrote:
             | > _nobody wants to see a video or a photo of a dark, gray,
             | and barely visible aurora_
             | 
             | Except those who want to see an accurate representation of
             | what it looks like to the naked eye.
        
               | stkhlm wrote:
               | Living in northern Sweden I see the northern lights
               | multiple times a year. I have never seen them pale or
               | otherwise not colorful. Green and reds always. That is to
               | my naked eye. Photographs do look more saturated, but the
                | difference isn't as large as this comment thread makes it
               | out to be.
        
               | shwaj wrote:
               | That mirrors my experience from when I used to live in
               | northern Canada
        
               | jabits wrote:
               | Even in Upper Michigan near Lake Superior we sometimes
                | had stunning, colorful Northern Lights. Sometimes it
                | seemed like they were flying overhead, within your grasp.
        
               | DaSHacka wrote:
               | Most definitely, it's quite common to find people hanging
               | around outside up towards Calumet whenever there's a
               | night with a high KP Index.
               | 
               | I highly recommend checking them out if you're nearby,
                | the recent auroras have been quite astonishing.
        
               | peanut_merchant wrote:
               | Even in Northern Scotland (further south than northern
               | Sweden) this is the case. The latest aurora showing was
               | vividly colourful to the naked eye.
        
               | fzzzy wrote:
                | In the Upper Peninsula of Michigan I have only seen grey.
        
               | Jensson wrote:
               | That is the same latitude as Paris though, not very north
               | at all.
        
               | exodust wrote:
               | I'm in Australia where the southern lights are known to
               | be not as intense as northern lights. That's where my
               | remark comes from. Those who have never seen the aurora
               | with their own eyes may like to see an accurate photo. A
               | rare find among the collective celebration of saturation.
        
               | freedomben wrote:
                | Exactly. I went through major gaslighting trying to see
               | the Aurora. I just wasn't sure whether I was actually
               | seeing it, because it always looked so different from the
               | photos. It is absolutely maddening trying to find a
               | realistic photo of what it looks like to the naked eye,
               | so that you can know if what you are seeing is actually
                | the Aurora and not just clouds.
        
           | Kiro wrote:
           | Shouldn't the model reflect how it looks on video rather than
           | our naked eye?
        
           | hoyd wrote:
            | I can see what you mean, and the video is somewhat not what
            | it would be like in real life. I have lived in northern
            | Norway most of my life and watched auroras a lot. They
            | certainly look green and pink most of the time. Fainter, they
            | would perhaps seem gray, I guess? Red, when viewed from a
            | more southern viewpoint...
            | 
            | I work at Andoya Space, where perhaps most of the space
            | research on aurora has been done by sending scientific
            | rockets into space for the last 60 years.
        
           | simonjgreen wrote:
           | To be fair, the prompt isn't asking for a realistic
            | interpretation; it's asking for a timelapse. What it has
            | generated is absolutely what most timelapses look like.
           | 
           | > Prompt: Timelapse of the northern lights dancing across the
           | Arctic sky, stars twinkling, snow-covered landscape
        
           | darkstar_16 wrote:
           | Northern lights are actually pretty colourful, even to the
           | naked eye. I've never seen them pale or b/w
        
           | poulpy123 wrote:
            | That's a bad example, since the only images of the aurora
            | borealis are brightly colored ones. What I expect of an image
            | generator is to output what is expected of it.
        
           | skypanther wrote:
           | What struck me about the northern lights video was that it
           | showed the Milky Way crossing the sky behind the northern
           | lights. That bright part of the Milky Way is visible in the
           | southern sky but the aurora hugging the horizon like that
           | indicates the viewer is looking north. (Swap directions for
            | the southern hemisphere and the aurora australis.)
        
         | sdenton4 wrote:
         | This strikes me as equally "AI complete" as drawing hands,
         | which is now essentially a solved problem... No one test is
         | sufficient, because you can add enough training data to address
         | it.
        
           | salamo wrote:
           | Yeah "AI complete" is a bit tongue-in-cheek but it is a
           | fairly spectacular failure mode of every model I've tried.
        
             | smusamashah wrote:
              | Ideogram and DALL-E do hands pretty well.
        
             | swyx wrote:
              | I've been using "agi-hard" https://latent.space/p/agi-hard
             | as a term
             | 
              | because completeness isn't really what we are going for.
        
           | dongping wrote:
           | Not sure about better models, but DALL-E3 still seems to be
           | having problems with hands:
           | 
           | https://www.reddit.com/r/dalle2/comments/1afhemf/is_it_possi.
           | ..
           | 
           | https://www.reddit.com/r/dalle2/comments/1cdks71/a_hand_with.
           | ..
        
         | sabellito wrote:
         | Per usual, the top comment on anything AI-related is snark about
         | "it can't do [random specific thing] well yet".
        
           | kmacdough wrote:
           | Tiring, but so is the relentless over-marketing. Each new
           | demo implies new use cases and flexible performance. But the
           | reality is they're very brittle and blunder _most_ seemingly
           | simple tasks. I would personally love an ongoing breakdown of
           | the key weaknesses. I often wonder  "can it X?" The answer is
           | almost always "almost, but not a useful almost".
        
         | mikeocool wrote:
         | Ha, wow, I'd never seen this one before. The failures are
         | pretty great. Even repeatedly trying to correct ChatGPT/Dall-e
         | with the proper number of squares and pieces, it somehow makes
         | it worse.
         | 
         | This is what dall-e came up with after trying to correct many
         | previous iterations: https://imgur.com/Ss4TwNC
        
         | perbu wrote:
         | Most generative AI will struggle when given a task that
         | requires something more or less exact. They're probably pretty
         | good at making something "chessish".
        
       | efitz wrote:
       | I'm surprised that the cowboy is not actually an Asian woman.
        
       | mrcwinn wrote:
       | OpenAI has the model advantage.
       | 
       | Google and Apple have the ecosystem advantage.
       | 
       | Apple in particular has the deeper stack integration advantage.
       | 
       | Both Apple and Google have a somewhat poor software innovation
       | reputation.
       | 
       | How does it all net out? I suspect ecosystem play wins in this
       | case because they can personalize more deeply.
        
         | lowkey wrote:
         | Google has a deep addiction to AdWords revenue which makes for
         | a significant disadvantage. Nomatter how good their technology,
         | they will struggle internally with deploying it at scale
         | because that would risk their cash cow. Innovator's dilemma.
        
           | frankacter wrote:
           | Google Cloud and cloud services generated almost $9.57
           | billion, up 28% from the prior period:
           | 
           | https://www.crn.com/news/networking/2024/google-cloud-
           | posts-...
           | 
           | They are embedding their models not only widely across their
           | platform's suite of internal products and devices, but also
           | via API for third-party development.
           | 
           | Those are all free from any perceived golden handcuffs that
           | AdWords would impose.
        
             | damsalor wrote:
             | Yea, well. I still think there is a conflict of interest if
             | you sell propaganda
        
           | lowkey wrote:
           | As of 2020, AdWords represented over 80% of all Google
           | revenue [1] while in 2021 7% of Google's revenue came from
           | cloud [2].
           | 
           | [1] https://www.cnbc.com/2021/05/18/how-does-google-make-
           | money-a...?
           | 
           | [2] https://aag-it.com/the-latest-cloud-computing-
           | statistics/?t
        
         | miki123211 wrote:
         | Google and Apple also have an "API access" advantage. It is
         | similar to the ecosystem advantage but goes beyond it; Google
         | and Apple restrict third-party app makers from access to
         | crucial APIs like receiving and reading texts or interacting
         | with onscreen content from other apps. I think that may turn
         | out to be the most important advantage of them all. This should
         | be a far bigger concern for antitrust regulators than petty
         | squabbles over in-app purchases. Spotify and Netflix are
         | possible (if slightly inconvenient) to use on iOS, a fully-
         | featured AI assistant coming from somebody who isn't Apple is
         | not.
         | 
         | Google (and to a lesser extent also Microsoft and Meta) also
         | have a data advantage, they've been building search engines for
         | years, and presumably have a lot more in-house expertise on
         | crawling the web and filtering the scraped content. Google can
         | also require websites which wish to appear in Google search to
         | also consent to appearing in their LLM datasets. That decision
         | would even make sense from a technical perspective, it's easier
         | and cheaper to scrape once and maintain one dataset than to
         | have two separate scrapers for different purposes.
         | 
         | Then there's the bias problem, all of the major AI companies
         | (except for Mistral) are based in California and have mostly
         | left-leaning employees, some of them quite radical and many of
         | them very passionate about identity politics. That worldview is
         | inconsistent with half of all Americans and the large
         | majority of people in other countries. This particularly
         | applies to the identity politics part, which just isn't a
         | concern outside of the English-speaking world. That might also
         | have some impact on which AI companies people choose, although
         | I suspect far less so than the previous two points.
        
         | mirekrusin wrote:
         | Not mentioning Meta, the good guy now, is scandalous.
         | 
         | X is not going to sit quietly either.
         | 
         | There is also the rest of us.
        
           | riffraff wrote:
           | X is tiny compared to Apple/Meta/Google, both in engineering
           | size and in "fingerprint" in people's life.
           | 
           | Also, engineering-wise, currently every tweet is followed by a
           | reply of "my nudes in profile" and X seems unable to detect it
           | as trivial spam. I doubt they have the chops to compete in
           | this arena, especially after the mass layoffs they
           | experienced.
        
             | mirekrusin wrote:
              | By X I mean one guy with deep pockets who won't sit quietly
              | - I wouldn't underestimate him.
        
         | xNeil wrote:
         | >Google and Apple have a somewhat poor software innovation
         | reputation.
         | 
         | I'm assuming you mean reputation as in general opinion among
         | developers? Because Google's probably been the most innovative
         | company of the 21st century so far.
        
           | bugbuddy wrote:
           | Yes, I miss Stadia so much. It was the most innovative
           | streaming platform I had ever used. I wished I could still
           | use it. Please, Google, bring Stadia back.
        
             | teaearlgraycold wrote:
             | They're renting out the tech to 3rd parties
        
         | hwbunny wrote:
         | ahem...zzzzzzzz
        
       | Octokiddie wrote:
       | Oddly enough, I predict the final destination for this train will
       | be for moving images to fade into the background. Everything will
       | have a dazzling sameness to it. It's not unlike the weird place
       | that action movies and pop music have arrived. What would have
       | been considered unbelievable a short time ago has become bland.
       | It's probably more than just novelty that's driving the comeback
       | of vinyl.
        
         | rjh29 wrote:
         | Even this site just did not impress me. I feel like it's all
         | stuff I could easily imagine myself. True creativity is someone
           | with a unique mind creating something you would never have
         | thought of.
        
           | damsalor wrote:
           | Get a life
        
         | jmathai wrote:
         | It's a lot more than novelty. It's dedicating the attention
         | span needed to listen to an album track by track without
         | skipping to another song or another artist. If that sounds
         | dumb, give it time and you'll get there also.
         | 
         | It's not just technology though. Globalization has added so
         | many layers between us and the objects we interact with.
         | 
         | I think Etsy was a bit ahead of their time. It's no longer a
         | marketplace for handcrafted goods - it got overrun by mass
         | produced goods masquerading as something artisan. I think the
         | trend is continuing and in 5-10 years we'll be tired of cheap
         | and plentiful goods.
        
         | hwbunny wrote:
         | Yeah, but if you bring up a generation or two on this trash,
         | they will get used to it, think this is the norm, and enjoy it
         | like pigs at the trough.
        
         | mFixman wrote:
         | AI generated images and video are not competing against actual
         | quality work with money put into it. They are competing against
         | the quick Photoshop or Adobe After Effects work done by
         | hobbyists and people learning how to work in the creative arts.
         | 
         | I've never heard HN claim that Copilot will replace
         | programmers. Why do so many people believe generative AI will
         | replace artists?
        
       | A4ET8a8uTh0 wrote:
       | I was hoping to see more.. I logged in and was greeted by a
       | waiting list for videos. Since I was disappointed already, I
       | figured I might as well spend some time on other, hopefully
       | usable, features. So I moved to pictures.
       | 
       | First, a randomly selected 'feeling lucky' prompt got rejected
       | because it did not meet some criteria, and a pop-up helpfully
       | listed an FAQ to explain how I should be more sensitive to the
       | program. I found it amusing.
       | 
       | Then I played with a couple of images, but it was nothing really
       | exciting one way or another.
       | 
       | I guess you can color me disappointed overall. And no, I don't
       | consider videos on repeat sufficient.
        
       | TheAceOfHearts wrote:
       | ImageFX fails at both of my tests:
       | 
       | 1. Generating an image of "a group of catgirls activating a
       | summoning circle". Anything related to catgirls tends to get
       | tagged as sexual or NSFW so it's censored. Unsurprising.
       | 
       | 2. The lamb described in the Book of Revelation. Asking for it
       | directly or pasting in the passage where the lamb is described
       | both fail to generate any images. Normally this fails because
       | there's not much art of the lamb from Book of Revelation from
       | which the model can steal. If I gave the worst of artists a
       | description of this, they'd be able to come up with _something_
       | even if it 's not great.
       | 
       | Overall, a very disappointing release. It's surprising that
       | despite having effectively infinite money this is the best that
       | Google is able to ship at the moment.
        
         | SomaticPirate wrote:
         | I think this comment is peak Hackernews... dripping with
         | sarcasm and minimizing a significant engineering accomplishment.
        
           | TheAceOfHearts wrote:
           | There's nothing sarcastic about my comment. It highlights key
           | limitations of the system with clear examples. Considering
           | the number of world-class engineers and effectively infinite
           | resources available to Google it's just a disappointing
           | release. Both examples are things that I care about and which
           | other people aren't discussing, so I think adding my voice to
           | the conversation is a net positive.
        
       | runeks wrote:
       | Didn't the model ever fail to generate realistic-looking content?
       | 
       | If I didn't know better, I'd think you just cherry-picked the
       | prompts with the best-looking results.
        
         | carschno wrote:
         | What you see there is a product, not the scientific
         | contribution behind it. Consequently, you see marketing
         | material, not a scientific evaluation.
        
           | tsurba wrote:
           | Unfortunately, the majority of scientific papers for e.g.
           | image generation have also had completely cherry-picked
           | examples for a long time now.
        
       | yoyopa wrote:
       | Stop with the ridiculous names; just use code numbers like BMW.
        
       | sanjayk0508 wrote:
       | It's direct competition for Sora.
        
       | ArchitectAnon wrote:
       | I think the thing that most perturbs me about AI is that it takes
       | jobs that involve manipulating colours, light, shade and space
       | directly and turns them into essay writing exercises. As a
       | dyslexic I fucking hate writing essays. 40% of architects are
       | dyslexic. I wouldn't be surprised if that was similar or higher
       | in other creative industries such as filmmaking and illustration.
       | Coincidentally, 40% of the prison population is also dyslexic; I
       | wonder if that's where all the spare creatives who are terrible
       | at describing things with words will end up in 20 years' time.
        
         | aavshr wrote:
         | I would imagine and hope for interfaces to exist where the
         | natural language prompt is the initial seed and then you'd
         | still be able to manipulate visual elements through other ways.
        
           | Art9681 wrote:
           | This is the case today. You won't get a "perfect" image
           | without heavy post-processing, even if that post-processing
            | is AI-enhanced. ComfyUI is the new Photoshop, and although it's
            | not an easy app to understand, once it "clicks" it's the most
            | amazing piece of software to come out of the open-source oven
            | in a long time.
        
         | fzzzy wrote:
         | You can speak instead if you wish. Speech-to-text is available
         | for all operating systems.
        
           | cy6erlion wrote:
           | Speaking has sound but that is still just words with the same
           | logic structure. "Colours, light, shade and space" have
           | entirely different logic.
        
             | fzzzy wrote:
             | Very interesting. Thank you for the perspective, it is
             | extremely illuminating.
             | 
             | What is a user interface which can move from color, light,
             | shade, and space to images or text? Could there be an
             | architecture that takes blueprints and produces text or
             | images?
        
         | chromanoid wrote:
         | I guess in the near future prompts can be replaced by a live
         | editing conversation with the AI, like talking to a phantom
         | draughtsman or a camera operator / movie team. The AI will
         | adjust while you talk to it and can also ask questions.
         | 
         | ChatGPT already allows this workflow to some extent. You should
         | try it out. I just talked to ChatGPT on my phone to test it. I
         | think I will not go back to text for these purposes. It's much
         | more creative to just say what you don't like about a picture.
         | 
         | If your speech is also affected, rough sketches and other
         | interfaces are (or will be) also available (see
         | https://openart.ai/apps/sketch-to-image). What kind of
         | expression do you prefer?
        
         | canes123456 wrote:
         | It seems exceedingly clear to me that the primary interface for
         | LLMs will be voice.
        
         | cainxinth wrote:
         | Terence McKenna predicted this:
         | 
         | "The engineers of the future will be poets."
        
         | gnobbler wrote:
         | You're entitled to your opinion but this will open up a world
         | of possibilities to people who couldn't work in these fields
         | previously due to their own non-dyslexia disability. Handless
         | intelligent people shouldn't lose out because incumbents don't
         | want to share their lane.
        
           | alt227 wrote:
           | So, the fall of the skilled professional and the rise of
           | anybody who knows how to write prompts?
        
             | Jensson wrote:
             | The AI we have today has very little to do with writing
             | prompts, you still need to understand, correct, glue and
             | edit the results and that is most of the work so you still
             | need skilled professionals.
        
         | DeathArrow wrote:
         | >As a dyslexic I fucking hate writing essays
         | 
         | You can feed an AI an image and ask it to describe it. Kind of
         | the inverse process.
        
         | seanw265 wrote:
         | Your claim that 40% of architects are dyslexic piqued my
         | curiosity. I wonder if this would have an impact on the success
         | of tools like ChatGPT in the architecture industry.
         | 
         | Do you have a source for this stat? I can't seem to find
         | anything to support it.
        
       | neverokay wrote:
       | I really just need to make some porn with this stuff already and
       | I feel like we're all tiptoeing around this key feature.
       | 
       | Censored models are not going to work and we need someone to
       | charge for an explicit model already that we can trust.
        
         | ranyume wrote:
         | Noo! Think about the children!
         | 
         | (this post is sarcastic)
        
           | neverokay wrote:
            | If they cared about the kids they would get out ahead of this
            | before it spreads like wildfire.
        
         | LZ_Khan wrote:
         | Oh there's a lot of ai generated porn clips floating around the
         | internet.
        
       | Dowwie wrote:
       | The Alpine Lake example is gorgeous
        
       | nbzso wrote:
       | How many billions of dollars and tons of water are wasted on
       | this abomination and copyright theft?
        
       | solatic wrote:
       | > It's critical to bring technologies like Veo to the world
       | responsibly. Videos created by Veo are watermarked using SynthID,
       | our cutting-edge tool for watermarking and identifying AI-
       | generated content
       | 
       | And we're supposed to believe that this is resilient against
       | prompt injection?
       | 
       | How do you prevent state actors from creating "proof" that their
       | enemies engaged in acts of war, and they are only engaging in
       | "self-defense"?
        
         | dmix wrote:
         | Nation states can run their own models, if not now then very
         | soon. This isn't something you're going to control via AI-safety
         | woo-woo.
        
       | datarez wrote:
       | Google is dancing with OpenAI
        
       ___________________________________________________________________
       (page generated 2024-05-15 23:01 UTC)