[HN Gopher] Veo
___________________________________________________________________
Veo
Author : meetpateltech
Score : 834 points
Date : 2024-05-14 17:58 UTC (5 hours ago)
(HTM) web link (deepmind.google)
(TXT) w3m dump (deepmind.google)
| moralestapia wrote:
| Not nearly as good as Sora.
|
| Google missed this train, big time.
| DarmokJalad1701 wrote:
| > "SIGN IN TO GET A SNEAK PEAK."
|
| https://theoatmeal.com/comics/sneak_peek
| baal80spam wrote:
| Whoa. The URL is correct, the text is not.
| sowbug wrote:
| The page is now fixed. Even for a "test kitchen," that's a
| shocking error for a company like Google to make.
| hehdhdjehehegwv wrote:
| You have to log in just to see a demo? They are desperate to
| track people.
| iamleppert wrote:
| It's so bad it's laughable. Sundar really needs to crack the whip
| harder on those Googlers.
| bamboozled wrote:
| or someone has to crack the whip on Sundar :)
| geodel wrote:
| His job is keeping stock price up which he is doing well so
| far. Another is layoffs which again he is doing fine :)
| htrp wrote:
| Was anyone else confused by that Donald Glover segment? It felt
| like we were going to get a short film, and we got 3-5 clips?
| curiousgal wrote:
| Exactly!
|
| _"Hey guys big artist says this is fine so we're good"_
| jsheard wrote:
| And those clips mostly look like generic stock footage, not
| something specific that a director might want to pre-vis.
|
| This is what movie pre-vis is actually like, it doesn't need to
| be pretty, it needs to be _precise_ :
|
| https://www.youtube.com/watch?v=KMMeHPGV5VE
| thisoneworks wrote:
| Yeah that wasn't obvious what they were trying to show. Demis
| said feature films will be released in a while
| Keyframe wrote:
| It felt AI-generated.
| htrp wrote:
| I wish it were AI Donald Glover talking and the "Apple twist"
| at the end was that the entire 3 minute segment was a prompt
| for "Donald Glover talking about how Awesome Gemini Models
| are in a California vineyard"
| ZiiS wrote:
| Also it is either very good at generating living people or they
| need to put more thought into saying "Note: All videos on this
| page were generated by Veo and have not been modified"
| jsheard wrote:
| That "footage has not been modified" statement is probably to
| get ahead of any speculation that it was "cleaned up" in
| post, after it turned out that the Sora demo of the balloon
| headed man had fairly extensive manual VFX applied afterwards
| to fix continuity errors and other artifacts.
| iamdelirium wrote:
| Wait, where did you hear this? I would assume something
| like this would have made somewhat of a splash.
| jsheard wrote:
| The studio was pretty up front about it, they released a
| making-of video one day after debuting the short which
| made it clear they used VFX to fix Sora's errors in post,
| but OpenAI neglected to mention that in their own copy so
| it flew under the radar for a while.
|
| https://www.youtube.com/watch?v=KFzXwBZgB88
|
| https://www.fxguide.com/fxfeatured/actually-using-sora/
|
| _> While all the imagery was generated in SORA, the
| balloon still required a lot of post-work. In addition to
| isolating the balloon so it could be re-coloured, it
| would sometimes have a face on Sonny, as if his face was
| drawn on with a marker, and this would be removed in
| AfterEffects. similar other artifacts were often
| removed._
| TIPSIO wrote:
| Seems like ImageFX, VideoFX (just a Google form and 3 demos),
| MusicFX, and TextFX at the links are down and not working.
|
| Huge grammar error on front page too.
| indy wrote:
| As someone who doesn't live in the US this year's Google IO feels
| like I'm outside looking in at all the cool kids who get to play
| with the latest toys.
| roynasser wrote:
| VPN'd right into that playground, turns out the toys were
| pretty blah
| numbers wrote:
| don't feel left out, we're all on the wait lists
| curiousgal wrote:
| Oh look another half baked product release that's not available
| in any country. They're a joke.
| mupuff1234 wrote:
| Is Sora available in any country?
| bamboozled wrote:
| I thought I read they've deemed Sora too dangerous to release
| pre-election? Or that they have reservations about it? I might be
| wrong...
| sib wrote:
| Sounds like a great excuse / communications strategy!
| jaggs wrote:
| Apparently it's only released to red teams at the moment as
| they try to manage safety. There's also the issue about
| releasing too close to an election?
| sebzim4500 wrote:
| The Donald Glover segment might be a new low for Google
| announcement videos. They spent all this time talking up the
| product but didn't actually show what he had created.
|
| Imagine how bad the model must be if this is the best way Google
| can think of selling it.
| fakedang wrote:
| What seems worse is the Google TextFX video with Lupe Fiasco?
| What the heck am I supposed to get out of watching boring
| monologues by a couple of people? They could have just as
| easily shown, with less camera work, Lupe Fiasco actually using
| the LLM model, but they didn't - or at least not enough to grab
| my attention in 2 minutes.
|
| Personally, I liked the above link, even as a Google skeptic,
| but the videos aren't helping their case.
| Horffupolde wrote:
| Google is the new Kodak.
| loudmax wrote:
| The videos in this demo are pretty neat. If this had been
| announced just four months ago we'd all be very impressed by the
| capabilities.
|
| The problem is that these video clips are very unimpressive
| compared to the Sora demonstration which came out three months
| ago. If this demo was announced by some scrappy startup it would
| be worth taking note. Coming from Google, the inventor of the
| Transformer and owner of the largest collection of videos in the
| world, these sample videos are underwhelming.
|
| Having said that, Sora isn't publicly available yet, and maybe
| Veo will have more to offer than what we see in those short clips
| when it gets a full release.
| fakedang wrote:
| Honestly, if Veo becomes public faster than Sora, they could
| win the video AI race. But what am I wishfully thinking - it's
| Google we're talking about!
| Jensson wrote:
| > But what am I wishfully thinking - it's Google we're
| talking about!
|
| Google the company known to launch way too many products?
| What other big company launches more stuff early than them?
| What people complain about Google is that they launch too
| much and then shut them down, not that they don't launch
| things.
| mccraveiro wrote:
| They didn't show any human videos, which could indicate that the
| technology struggles with generating them.
| karmasimida wrote:
| Actually there is one in the last demo. It's not a standalone
| clip, but one shot shows a team using the model to create a
| scene with a human in it: an image of a Black woman, though
| only her head is in frame.
|
| I generally agree though, it's not normal that they didn't
| show more humans.
| revscat wrote:
| I'm sure part of the reason, beyond those given already, is
| that they want to avoid the debate around nudity.
| dyauspitr wrote:
| You know why and it's not that their technology struggles with
| it.
| chubot wrote:
| It's also probably that it's easier to spot fake humans than to
| spot fake cats or camels. We are more attuned to the faces of
| our own species
|
| That is, AI humans can look "creepy" whereas AI animals may
| not. The cowboy looks pretty good precisely because it's all
| shadow.
|
| CGI animators can probably explain this better than I can ...
| they have to spend way more time on certain areas and certain
| motions, and all the other times it makes sense to "cheat" ...
|
| It explains why CGI characters look a certain way too -- they
| have to be economical to animate
| mjfl wrote:
| thank goodness.
| himinlomax wrote:
| They're probably still wary of their latest PR disaster, the
| inclusive and diverse WW2 Germans from Gemini.
| xianshou wrote:
| To quote Twitter/X, "I wonder what OpenAI will release tomorrow
| and Google will release a waitlist for."
|
| GPT-4o: out
|
| Veo: waitlist
|
| Admittedly this is impressive and the direct comp would be Sora,
| which isn't out, but sometimes the caricature is very close to
| the truth.
| jsheard wrote:
| Then again Veo is in the same category as Sora, which isn't
| released either, 3 months after the reveal.
| rvnx wrote:
| "This tool isn't available in your country yet"
| modeless wrote:
| To be fair, all the voice stuff OpenAI demoed isn't released
| yet either.
| martinesko36 wrote:
| This is Google for the last 5+ IOs. They just release waitlists
| and demos that are leapfrogged by the time they're available to
| all. (and shut down a few years later)
| htrp wrote:
| Cite sources?
| skepticATX wrote:
| OpenAI hardly released gpt-4o. The demo yesterday was clearly a
| rushed response to I/O. It's quite possible that Google will
| ship multi-modality features faster than OpenAI will.
| JeremyNT wrote:
| Yeah I think at this point it's "not if, but when" and the
| gap between parity is just going to keep shrinking
| (until/unless there's some kind of copyright/legislative
| barrier implemented that favors one or the other).
|
| "We have no moat" swings both ways.
| juice_bus wrote:
| Which of the products Google is releasing can you
| trust to even be around in a year or two? I'm certainly
| done trusting Google with new products.
| buildbot wrote:
| Without doing anything, I have access to GPT-4o in chatgpt
| and the api already (on a personal account, not related to
| work). Maybe I'm just super lucky, but it's certainly not
| vaporware.
| Difwif wrote:
| What do you mean? Everyone has access to the gpt-4o model
| right now through ChatGPT and the API. Sure we don't have
| voice-to-voice but we have a lot more than what Google has
| promised.
| hbn wrote:
| How do I get access? I just checked my app and the Premium
| upgrade says it will unlock GPT-3.5 and GPT-4, so I
| assume my version is still the old one.
|
| All my apps are updated in the App Store too.
| bagels wrote:
| I have a paid account, and I didn't have to do anything
| to use the new model.
| DeRock wrote:
| I just checked, there was an iOS app update available and
| it enabled it. I'd check again if there's a new update
| (version 1.2024.129). Or you could use the website.
| hbn wrote:
| I'm on the same version and don't see anything different
|
| Website also only has toggle for 3.5 and 4 with the Plus
| upgrade. Not sure if it's cause I'm in Canada?
| croes wrote:
| I use their website and it's one of the three models to
| choose from if you are on the Plus subscription.
| mr_mitm wrote:
| I have premium access and I can select 4o in the dropdown
| menu on Android
| cush wrote:
| I have the 4o model. On premium. No voice yet
| theresistor wrote:
| To add a counterbalance, I just checked in the app and on
| the website on a non-paid account, and I too do NOT have
| GPT-4o.
| hbn wrote:
| Everyone who says they have access in the replies to my
| comment seem to be paid users. So maybe it's only rolling
| out to them first.
| htrp wrote:
| I expect they have to offer the paid users something
| hbn wrote:
| Paid users get like a 5x higher rate limit iirc
| electriclove wrote:
| My (paid) app has it but no voice chat yet
| TecoAndJix wrote:
| I have a paid account and can do voice to voice on the
| iOS app as of last night.
| hbn wrote:
| The realtime one they showed yesterday or the one that's
| existed forever where it's just a voice-to-text input and
| TTS output taking turns?
| TecoAndJix wrote:
| I feel silly now. I downloaded the app after the
| announcement (I'm a desktop user) and it looked identical
| to the one they show in the sarcasm video. When I asked
| it, I was told it was not the new feature announced
| yesterday. Still a lot of fun!
|
| Edit - it does list the new model in my app at least
| sib wrote:
| In the App Store there's a new build of the iOS app as of
| 3 hours ago (call it about 11am US Pacific time). It
| includes the GPT-4o model (at least it shows it for me.)
| hbn wrote:
| Are you a paid user?
| theresistor wrote:
| It is not available on my (free) account in either the app
| or the website. So no, everyone does _not_ have access to
| it.
| satvikpendem wrote:
| It's for paid users for now, not free. I have ChatGPT Pro
| and I can use the new model.
| nialv7 wrote:
| I am a free user and I have 4o. I think it is just a
| gradual roll out.
| ben_w wrote:
| API yes, ChatGPT no (at least not for all users); I've got
| my own web interface for the API so I can play with the
| model (for all of $0.045 of API fees), but most people
| can't be bothered with that and will only get 4o when it
| rolls out as far as their specific ChatGPT account.
| ghshephard wrote:
| Was running just fine on my ChatGPT client on iOS - full
| 2-way voice comms by 3:00 PM yesterday. Application was
| already updated.
| mike_hearn wrote:
| I have a regular ChatGPT Pro account and I have GPT-4o.
|
| The bigger issue is that 4o without the multi-modal, new
| speech capabilities or desktop app isn't that different
| to GPT-4. And those things aren't yet launched.
| TecoAndJix wrote:
| Posted further down the thread - I have a paid account and
| can do voice to voice on the iOS app as of last night.
| baobabKoodaa wrote:
| I don't have access to gpt-4o via ChatGPT
| localfirst wrote:
| listen all these guys out here attacking Google and making
| outlandish/false claims
|
| look at their linkedin pages, that will tell you why they are
| desperate
|
| (hint: they bought OpenAI bags on the secondary market)
| rvz wrote:
| Sora is the closest comparison to Veo and both aren't out.
|
| It's been there for three months and still isn't even close to
| being released and available.
|
| Essentially Google has already caught up to OpenAI with their
| recent responses and it's clear that there are private OpenAI
| investors pushing such nonsense around Google struggling to
| compete.
| resource_waste wrote:
| Google Press: This is the greatest AI Model ever yet.
|
| Users: Lol it won't even tell me how to draw a picture of a
| human because it's inappropriate.
|
| Google flipped like a switch a few years ago. Instead of going
| for product quality, it seems they went full Apple Marketing
| and control the narrative of top social media.
|
| I keep thinking: "well it's Google, they will be the best,
| right?" No, I'm giving up on Google, they are not as
| powerful as I once thought... Hmm seems like a good time to get
| into Lobbying and Marketing...
| nextworddev wrote:
| Except gpt-4o with audio and video inputs isn't actually out
| adamtaylor_13 wrote:
| I was using it yesterday in the mobile app. Unless they just
| slapped the new UI on an older model.
| hakanensari wrote:
| It's no longer there, I think?
| Cyph0n wrote:
| I think they (partially?) rolled it back. I tried out the
| voice input yesterday, but it's missing from the app today.
| Workaccount2 wrote:
| They just released the text chat model which still uses the
| same old audio interface as 4. The new audio/video chat
| stuff is not out yet (unless you are a very lucky early
| beta user).
| dom96 wrote:
| > GPT-4o: out
|
| Is it? I can't use it yet at least
| lordswork wrote:
| I am also wondering how to use it..
| drawnwren wrote:
| It is. I've got it already, but I'm a bit of a gpt4 power
| user. I hit my rate limit biweekly or so and run up close to
| it every day. I'd bet maybe they prioritized people that were
| costing them money.
| kaibee wrote:
| It might be just by sign-up order. I signed up for pro
| basically as soon as I could, but I never hit limits, and
| only really use it once or twice a day, sometimes not at
| all.
| drawnwren wrote:
| Interestingly, when I use Cloudflare's Warp DNS I don't
| have access to it. So, it might have something to do w/
| region as well?
| phyalow wrote:
| Really? You have the new model, sure, I have it too, but
| afaik nobody has the new ultra fast and variable voice +
| video chat on mobile.
| drawnwren wrote:
| The original question asked if anyone had GPT-4o. You're
| asking a different question.
| w-m wrote:
| GPT-4o is available for me on ChatGPT, with the
| text+attachment input (as a Plus user from Germany). It's
| crazy fast. The voice for the audio conversation in the app
| is still the old one, and doesn't let you interrupt it.
| mrkramer wrote:
| Google is scared of what every new model can produce, they
| don't want drama but they always end up in some kind of media
| drama.
| Tenoke wrote:
| I can't even join the waitlist from Europe while 4o is fully
| available here.
| dyauspitr wrote:
| I haven't been able to try out 4o. The voice chat continuously
| says there's too much traffic and I don't even see a button to
| turn on the camera
| qwertox wrote:
| > GPT-4o: out
|
| I don't know what's wrong with GPT-4o, but the answers I'm
| getting are much worse than before yesterday. It's constantly
| repeating the entire content required to provide a seemingly
| "full" answer, but if it passes me the same but slightly
| modified Python code for the fifth time even if it has become
| irrelevant to the current conversation, it really gets on my
| nerves.
|
| I had such well-tuned custom instructions, which worked
| beautifully and now it's as if it is ignoring most of them.
|
| It's causing me frustration and really wasting my time when I
| have to wait for the unnecessarily long answers to finish.
| endisneigh wrote:
| I've noticed that a lot of the commentary of these models creates
| the sort of fervor like politics or sports.
|
| In any case - no details on compute needed. Curious if this ever
| can be cheap. Even Midjourney still requires a lot.
|
| I'm also surprised there hasn't been some attempt at creating
| benchmarks for this. One example could be color accuracy.
| stefan_ wrote:
| Never mind no benchmarks, half of these announcements in the
| past were straight _made up_ , "offline enhanced" cherry picked
| "examples", CGI fantasies.
|
| Not to mention the whole AGI topic is forever doomed from SciFi
| fans, just remember what happened with that room-temperature
| superconductivity.
| inasio wrote:
| From a 2014 Wired article [0]: "The average shot length of
| English language films has declined from about 12 seconds in 1930
| to about 2.5 seconds today"
|
| I can see more real-world impact from this (and/or Sora) than
| most other AI tools
|
| [0] https://www.wired.com/2014/09/cinema-is-evolving/
| jsheard wrote:
| Even if the shots are very short you still need coherency
| _between_ shots, and they don't seem to have tackled that
| problem yet.
| mattgreenrocks wrote:
| This is very noticeable. Watching movies from the 1970s is
| positively serene for me, vs the shot time on modern films
| often leaves me wondering, "wait, what just happened there?"
|
| And I'm someone who is fine playing fast action video games.
| Can't imagine what it's like if you're older or have sensory
| processing issues.
| ryandrake wrote:
| Obligatory: Liam Neeson jumps over a fence in 6 seconds, with
| 14 cuts[1].
|
| 1: https://www.youtube.com/watch?v=gCKhktcbfQM
| aidenn0 wrote:
| I'd like to fact check this amazing comment on that video,
| but it would require watching Taken 3:
|
| > Some of y'all may find how awful this editing gets pretty
| interesting: I did an Average Shot Length (ASL) for many
| movies for a recent project, and just to illustrate bad
| overediting in action movies, I looked at Taken 3 (2014) in
| its extended cut.
|
| > The longest shot in the movie is the last shot, an aerial
| shot of a pier at sunset ending the movie as the end
| credits start rolling over them. It clocks in at a runtime
| of 41 seconds and is, _BY FAR_ , the longest shot in the
| movie.
|
| > The next longest is a helicopter establishing shot of the
| daughter's college after the "action scene" there a little
| over an hour in, at 5 seconds.
|
| > Otherwise, the ASL for Taken 3 (minus the end
| credits/opening logos), which has a runtime of 1:49:40,
| 4,561 shots in all (!!!), is 1.38 SECONDS . For comparison,
| Zack Snyder's Justice League (2021) (minus end
| credits/opening logos) is 3:50:59, with 3163 shots overall,
| giving it an ASL of 4.40 seconds, and this movie, at 1 hour
| 50 minutes, has north of 4,561 for an ASL of 1.38
| seconds?!?! _Taken 3 has more shots in it than Zack Snyder's
| Justice League, a movie more than double its length..._
|
| > To further illustrate how ridiculous this editing gets,
| the ASL for Taken 3's non-action scenes is 2.27 seconds. To
| reiterate, this is the non-action scenes. The "slow
| scenes." The character stuff. Dialogue scenes. The stuff
| where any other movie would know to slow down. 2.27 SECONDS
| For comparison, Mad Max: Fury Road (minus end
| credits/opening logos) has a runtime of 1:51:58, with 2646
| shots overall, for an ASL of 2.54 seconds. TAKEN 3'S "SLOW
| SCENES" ARE EDITED MORE AGGRESSIVELY THAN MAD MAX: FURY
| ROAD!
|
| > And Taken 3's action scenes? _Their ASL is 0.68 seconds!_
|
| > If it weren't for the sound people on the movie, Taken 3
| wouldn't be an "action movie". It'd be abstract art.
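
The average shot length (ASL) figures quoted above are simply runtime divided by shot count. A minimal Python sketch of that arithmetic follows, using the runtimes and shot counts exactly as quoted in the comment (not independently verified). The function names are hypothetical. Because the quoted ASLs exclude credits and logos while the quoted runtimes may not, the computed values should land close to, but not necessarily exactly on, the numbers in the comment.

```python
def runtime_to_seconds(runtime: str) -> int:
    """Convert an "H:MM:SS" runtime string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in runtime.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def average_shot_length(runtime: str, shots: int) -> float:
    """Average shot length (ASL) in seconds: runtime divided by shot count."""
    if shots <= 0:
        raise ValueError("shot count must be positive")
    return runtime_to_seconds(runtime) / shots


# Figures as quoted in the comment above (assumed, not verified).
films = {
    "Taken 3 (extended cut)": ("1:49:40", 4561),
    "Zack Snyder's Justice League": ("3:50:59", 3163),
    "Mad Max: Fury Road": ("1:51:58", 2646),
}

for title, (runtime, shots) in films.items():
    print(f"{title}: {average_shot_length(runtime, shots):.2f} s per shot")
```

The same two functions would work for any film once a runtime and a shot count are known; counting the shots is the hard part.
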
| throwup238 wrote:
| It's worth noting that Taken 3 has a 13% rating on Rotten
| Tomatoes, which is well into "it's so bad it's good"
| territory. I don't think the rapid cuts went unnoticed.
| nimithryn wrote:
| Yeah, this sequence is a meme commonly cited to show
| "choppy modern editing"
| llmblockchain wrote:
| More chops than an MF DOOM track.
| kristofferR wrote:
| The top comment makes a really good point though:
|
| "He's 68. I'm guessing they stitched it together like this
| because "geriatric spends 30 seconds scaling chainlink
| fence then breaks a hip" doesn't exactly make for riveting
| action flick fare."
|
| Lingering shots are horrible for obscuring things.
| lupire wrote:
| Movies have stunt performers.
|
| And Neeson was only 60 when filming Taken 3.
| troupo wrote:
| Keanu Reeves was 57-8 when he shot the last _John Wick_.
| IIRC Bob Odenkirk was 58 in _Nobody_. Neeson was 60 in
| Taken 3.
|
| There are ways to shoot an action scene with an aging star
| that don't involve 14 cuts in 4 seconds. You just have
| to care about your craft.
| nineteen999 wrote:
| Is it Liam Neeson, or his stunt double?
| psbp wrote:
| My brain processes too slow for modern action movies.
|
| I can tell what's going on, but I always end up feeling
| agitated.
| kemitchell wrote:
| Enjoy some Tarkovsky.
| joshuahedlund wrote:
| How many of those 2.5 second "shots" are back-and-forths
| between two perspectives (ex. of two characters talking to one
| another) where each perspective is consistent with itself? This
| would be extremely relevant for how many seconds of consistent
| footage are actually needed for an AI-generated "shot" at film-
| level quality.
| Keyframe wrote:
| Kind of sucks to be Google. Even though they're making good progress
| here, and have laid the foundations of a lot if not most things,
| their products are, well, there aren't any noteworthy ones compared
| to the rest. And considering Google is sitting on top of one of the
| largest if not THE largest video database, along with maps,
| traffic, search, internet.zip, usenet, vast computing resources
| vertically integrated.. they have the whole advantage in the
| world. So, the hell are they doing? Why isn't their CEO already
| out? Expectations from them are higher than from anyone else.
| InfiniteVortex wrote:
| Google search has been absolutely ruined in terms of quality.
| You're right, they've built the base in terms of R&D for many
| of the AI breakthroughs that are powering competing alternative
| products.... that happen to be better than Google's own
| products. Google went from "Don't be evil" to just another big
| corporate tech company. They have so much potential.
| Regrettable.
| CraftingLinks wrote:
| They are fast on their way to becoming IBM 2.0.
| jason-phillips wrote:
| More like Xerox
| dyauspitr wrote:
| If anything google search with the Gemini area on the top has
| been very good for me.
| atleastoptimal wrote:
| Because they punish experimentation as it eats into their
| bottom line. AI is a tool for ads in the mind of executives at
| Google. Ads and monetization of human productivity, not an
| agent of productivity on its own.
| khazhoux wrote:
| C'mon, Google doesn't "punish" experimentation. Google X,
| Google Glass, Daydream, Fuchsia, moonshots, the lab spinoff
| (whose name I can't remember)... hell, even all the abandoned
| products everyone here always complains about.
|
| The experiments often/usually fail, but they _do_ experiment.
| Koffiepoeder wrote:
| If you prune all the branches, where will the fruits grow?
| khazhoux wrote:
| The branches were dead and could bear no fruit. New
| branches will sprout next season.
| lolinder wrote:
| "Laser-focused on the bottom line at the expense of all else"
| is not how I'd describe Google, now or at any point in the
| past. They have a _lot_ of dysfunction, but if anything that
| dysfunction stems from _too much_ experimentation and
| autonomy at the leaf nodes of the organization. That's how
| they get into these crazy places where they have to pick
| between 5 chat apps or whatever.
|
| If Google were as focused on ads as you seem to think we'd at
| least see some sort of coherent org-wide strategy instead of
| a complete lack of direction.
| criddell wrote:
| I'd describe Google as focused on the bottom line after
| they put the ads guy in charge of search.
|
| I'm referring to this article that was posted here
| recently:
|
| https://www.wheresyoured.at/the-men-who-killed-google/
| khazhoux wrote:
| The person now in charge of Search is Elizabeth Hamon
| Reid, a long-time googler who came up through the ranks
| from engineer (in Google Maps) to VP over 20 years. She's
| legit.
| criddell wrote:
| Is Wikipedia out of date then?
|
| https://en.wikipedia.org/wiki/Prabhakar_Raghavan
| khazhoux wrote:
| Ah, according to this, she's head of Search but reports
| to Prabhakar. I thought from recent reports that she'd
| taken search over from him.
|
| Nonetheless, she was a good engineer and a good manager,
| back when we crossed paths many moons ago.
|
| https://searchengineland.com/liz-reid-google-new-head-of-
| sea...
| lolinder wrote:
| That was a decision that prioritized the bottom line over
| other things. But saying that Google is "focused" on the
| bottom line implies that there's a pattern of them
| putting the bottom line first, which is simply not true
| if you look at Google as a whole. Search specifically,
| maybe, but not Alphabet.
| Workaccount2 wrote:
| I don't know how more people don't talk about the 1M context
| tokens. While the output is mediocre for cutting edge models,
| you can context stuff the ever living hell out of it for some
| pretty amazing capabilities. 2M tokens is even crazier.
| lordswork wrote:
| It is pretty amazing. I've been using it every day. I do wish
| you could easily upload an entire repo into it though.
| bongodongobob wrote:
| Have it write a program to output a repo as a flat file.
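
One way such a flattening program could look, as a rough Python sketch and not anything the commenters posted: it walks a directory tree and concatenates readable text files into a single file, with a header line per file. The function name `flatten_repo`, the skipped directory names, the header format, and the size cutoff are all assumptions chosen for illustration.

```python
from pathlib import Path

# Directories to leave out of the flat file (assumed defaults).
SKIP_DIRS = {".git", "node_modules", "__pycache__"}


def flatten_repo(root: str, out_path: str, max_bytes: int = 200_000) -> int:
    """Concatenate UTF-8 text files under root into out_path.

    Returns the number of files written.
    """
    root_path = Path(root).resolve()
    out_file = Path(out_path).resolve()
    written = 0
    with out_file.open("w", encoding="utf-8") as out:
        for path in sorted(root_path.rglob("*")):
            # Skip directories and the output file itself.
            if not path.is_file() or path == out_file:
                continue
            rel = path.relative_to(root_path)
            if any(part in SKIP_DIRS for part in rel.parts):
                continue
            if path.stat().st_size > max_bytes:
                continue  # skip very large files
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # likely binary or unreadable; skip it
            out.write(f"===== {rel.as_posix()} =====\n{text}\n")
            written += 1
    return written
```

A call such as `flatten_repo(".", "repo_flat.txt")` would then produce one file that can be pasted or uploaded into a long-context prompt. Whether the result fits in the context window still depends on the size of the repo.
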
| rm_-rf_slash wrote:
| Anything approaching the token limit I turn into a file and
| upload to a vector store. Results are comparable between Chat
| and Assistants.
| Keyframe wrote:
| That's a good point. Gemini gatekeeping me on so many answers
| made me forget about this extraordinary feature of it.
| softwaredoug wrote:
| It's often said you need to disrupt your own business model.
|
| Google had blinders on. They didn't relentlessly focus on
| reinventing their domain. They just milked what they had.
| Gradually losing sight of the user experience[1] to focus on
| monetization above all else.
|
| 1 - https://twitter.com/pdrmnvd/status/1707395736458207430
| dyauspitr wrote:
| Their CEO is generating massive, growing profits every quarter
| while releasing generative technology, all the while walking
| a fine line in what those models generate because it can be
| pretty devastating for a large corp like Google.
| Keyframe wrote:
| you think it's because of him or despite him?
| airstrike wrote:
| > Veo
|
| > Sign up to try VisionFX
|
| Is it Veo or VisionFX? Is it a sign up, a trial, or a waitlist?
|
| How hard can it be to write a clear message? In the words of Don
| Miller, if you confuse, you lose.
| therein wrote:
| Yeah I was like so is it Veo or VisionFX.
|
| This landing page feels as haphazardly put together as the
| Coinbase downtime page last night.
| peppertree wrote:
| This is very on-brand with how Google does branding. "are you
| confused yet? no? try this other vaguely similar name."
| davidw wrote:
| Maybe it's going to be a new messaging app - but with AI!
|
| Kidding... I signed up for the waitlist. I have ideas for
| videos I'd like to use to explain things that I have no hope
| of creating myself.
| BlackJack wrote:
| Disclaimer: I work at Google on related stuff
|
| Veo is the name of a video model. VideoFX is the name of a new
| experimental tool at labs.google.com, which uses Veo and lets
| you make videos.
|
| Thanks for the feedback though, I see how it's confusing for
| users.
| zb3 wrote:
| I see the endpoint returns "Not Implemented" when trying to
| make a video :<
|
| Imagen 3 is awesome though, generates nice logos :D
| mike_hearn wrote:
| Presumably this is DeepMind vs Labs fighting over the same
| project. A consequence of guaranteeing Demis some level of
| independence when DeepMind was bought, which still shows
| through in the fact that the DeepMind brand(s) survive.
| qingcharles wrote:
| And: Communication isn't what you say, it's what people hear
|
| Agree this is totally confusing.
| rishav_sharan wrote:
| Now that the first direct competitor to Sora has been announced,
| I am sure Sora will be suddenly ready for public consumption, all
| its AI safety concerns forgotten
| sebastiennight wrote:
| I think there's a tremendous compute cost associated with both
| models still... I can't see how either company could withstand
| the instant enormous demand, even if they tried to command
| crazy prices.
|
| Even at $1 per 5-second video, I think some use cases
| (including fun/non-business ones) would still overwhelm
| capacity.
| popcar2 wrote:
| Not nearly as impressive as Sora. Sora was impressive because the
| clips were long and had lots of rapid movement since video models
| tend to fall apart when the movement isn't easy to predict.
|
| By comparison, the shots here are only a few seconds long and
| almost all look like slow motion or slow panning shots
| cherrypicked because they don't have that much movement. Compare
| that to Sora's videos of people walking in real speed.
|
| The only shot they had that can compare was the cyberpunk video
| they linked to, and it looks crazy inconsistent. Real shame.
| nuz wrote:
| Sora is also movement limited to a certain range if you look at
| the clips closely. Probably something like filtering by some
| function of optical flow in both cases.
| ein0p wrote:
| Also Sora demos had some really impressive generations
| featuring _people_. Here we hardly see any people which likely
| means exactly what you'd guess.
| data-ottawa wrote:
| Has Gemini started generating images of people again? My
| trial has ended and I haven't been following the issue.
| spiderfarmer wrote:
| Also the horse just looks weird, just like the buildings and
| peppers.
|
| It's impressive as hell though. Even if it would only be used
| to extrapolate existing video.
| LZ_Khan wrote:
| I imagine that's just a function of how much training data you
| throw at it.
| Jensson wrote:
| > Sora was impressive because the clips were long and had lots
| of rapid movement
|
| Sora videos ran at 1 beat per second, so everything in the
| image moved at the same beat and often too slow or too fast to
| keep the pace.
|
| It is very obvious when you inspect the images and notice that
| there are keyframes at every whole second mark and everything
| on the screen suddenly goes in their next animation step.
|
| That really limits the kind of videos you can generate.
| lupire wrote:
| So it needs to learn how far each object can travel in 1sec
| at its natural speed?
| Jensson wrote:
| It also needs to separate animation steps for different
| objects so that objects can keep different speeds. It isn't
| trivial at all to go from having a keyframe for the whole
| picture to having separate keyframes for separate parts, you need to
| retrain the whole thing from the ground up and the results
| will be way worse until you figure out a way to train that.
|
| My point is that it isn't obvious at all that Sora's way
| actually is closer to the end goal, it might look better
| today to have those 1 second beats for every video but
| where do you go from there?
| TIPSIO wrote:
| Objectively speaking (if people would be honest with
| themselves), both are just decent at best.
|
| I think comparing them now is probably not that useful outside
| of this AI hype train. Like comparing two children. A lot can
| happen.
|
| The bigger message I am getting from this is it's clear OpenAI
| won't have a super AI monopoly.
| TaylorAlexander wrote:
| Comparing two children is a good one. My girlfriend has taken
| to pointing out when I'm engaging in "punditry". They're an
| engineer like I am and we talk about tech all the time, but
| sometimes I talk about which company is beating which company
| like it's a football game, and they call me out for it.
|
| Video models are interesting, and to some extent trying to
| imagine which company is gonna eat the other's lunch is kind
| of interesting, but sometimes that's all people are
| interested in and I can see my girlfriend's reasoning for
| being disinterested in such discussion.
| motoxpro wrote:
| What would make this "Good?"
| dangoodmanUT wrote:
| cant wait to see your model
| arcastroe wrote:
| > The shots here [..] almost all look like slow motion or slow
| panning shots.
|
| I think this is arguably better than the alternative. With
| slow-mo generated videos, you can always speed them up in
| editing. It's much harder to take a fast-paced video and slow
| it down without terrible loss in quality.
| totaldude87 wrote:
| Could also be the doing of Google. If Veo screws up, the
| weight falls on Alphabet stock, while OpenAI is not public and
| doesn't have to worry about anything. Even if OpenAI
| faked some of their AI videos (not saying they did), it wouldn't
| affect them the way it would affect Veo --> Google --> Alphabet.
|
| Being cautious often puts a dent in innovation.
| soulofmischief wrote:
| You mean like how they faked some Gemini stuff?
|
| https://www.bbc.com/news/technology-67650807
| latexr wrote:
| > Not nearly as impressive as Sora. Sora was impressive because
| the clips were long and had lots of rapid movement
|
| The most impressive Sora demo was heavily edited.
|
| https://www.fxguide.com/fxfeatured/actually-using-sora/
| rvz wrote:
| Interesting to see that OpenAI was successful in creating
| their own reality distortion spells, just like Apple's
| reality distortion field which has fooled many of these
| commenters here.
|
| It's quite early to race to the conclusion that one is better
| than the other when not only they are both unreleased, but
| especially when the demos can be edited, faked or altered to
| look great for optics and distortion.
|
| EDIT: It appears there is at least one commenter who replied
| below that is upset with this fact above.
|
| It is OK to cope, but the truth really doesn't care
| especially when the competition (Google) came out much
| stronger than expected with their announcements.
| ijidak wrote:
| Well, as a counterpoint, Apple did become a $2 trillion
| company...
|
| Distortion is easiest when the products really work. :)
| adventured wrote:
| Apple got up to $3 trillion back in 2023.
| jsheard wrote:
| To Shy Kids' credit _they_ made it clear the Sora footage was
| heavily edited, but OpenAI's site still presents Air Head
| without that context.
|
| https://www.youtube.com/watch?v=KFzXwBZgB88 (posted the day
| after the short debuted)
|
| https://openai.com/index/sora-first-impressions (no mention
| of editing, nor do they link to the above making-of video)
| seoulmetro wrote:
| There is now on that second link:
|
| >The videos below were edited by the artists, who
| creatively integrated Sora into their work, and had the
| freedom to modify the content Sora generated.
| jsheard wrote:
| Ha, here's an archive from yesterday for posterity.
|
| https://web.archive.org/web/20240513050023/https://openai
| .co...
| axblount wrote:
| I hate to be so cynical, but I'm dreading the inevitable flood of
| AI generated video spam.
|
| We really are about _this_ close to Infinite Jest. Imagine
| TikTok's algorithm with on demand video generation to suit your exact
| tastes. It may erase the social aspect, but for many users I
| doubt that would matter too much. "Lurking" into oblivion.
| lordswork wrote:
| It's already here. There are communities forming around
| generating passive income from mass producing AI videos as
| tiktoks and shorts.
| axblount wrote:
| I saw one of those where a guy just made videos about
| increasingly elaborate AI generated cakes. You're right, I
| guess we're mostly there.
|
| But those still require some human input. I'm imagining a
| sort of genetic algorithm for video prompts, no human
| editing, input, or curation required.
| barbariangrunge wrote:
| YouTube's endgame is to not need content creators in the loop
| any more. The algorithm will just create everything
| esafak wrote:
| The endgame of that is that people will leave.
| darby_eight wrote:
| I'm somewhat surprised people still watch YouTube with the
| horrible recommendations and non-stop spam
| belter wrote:
| Henry Ford II: Walter, how are you going to get those robots
| to pay your union dues?
|
| Walter Reuther: Henry, how are you going to get them to buy
| your cars?
| LZ_Khan wrote:
| I had the same thought regarding infinite jest recently
| rm_-rf_slash wrote:
| And somehow our exact tastes would also include influencer
| coded advertisements.
| beacon294 wrote:
| Can you explain this aspect of infinite jest to me without
| spoiling the book?
| _xander wrote:
| It's introduced early on (and not what the book is really
| about): distribution of a video that is so entertaining that
| any viewer is compelled to watch it until they die
| jprete wrote:
| At the bottom of the text blurb on the Veo page: "In the
| future, we'll also bring some of Veo's capabilities to YouTube
| Shorts and other products."
|
| So...you're not cynical, it's an explicit product goal.
| Invictus0 wrote:
| This basically already exists for porn
| redml wrote:
| I think of it as we're replacing the SEO spam we have right now
| with AI spam. At least now we can fight that with more AI.
| layer8 wrote:
| If it really suited my exact tastes, that would actually be
| great. But I don't see how we're anywhere close to that. And
| they won't target matching your exact taste. They will target
| the threshold where it's just barely interesting enough that
| people don't turn it off.
| fidotron wrote:
| If you were of a mind to give Google the benefit of the doubt you
| would have to think they are desperately trying not to
| overpromise and underdeliver, partly because that has been their
| track record to date. It's a very curious time to choose to make
| this switch though given their competition, and if it was
| motivated by the reception Bard received then it shows they
| didn't learn the right lessons from that mess at all.
| aaroninsf wrote:
| It's mildly interesting how many of the samples shown fail to
| fully conform to the prompts. Lots of specifics are missing.
|
| Kudos to Google for, if not foregrounding this, being entirely
| transparent about it.
| willsmith72 wrote:
| all of this stuff i'll believe when it's ready for public release
|
| 1. safety measures lead to huge quality reductions
|
| 2. the devil's in the details. you can make me 1 million videos
| which look 99% realistic, but it's useless. consumers can pick it
| instantly, and it's a gigantic turn-off for any brand
| aprilthird2021 wrote:
| There'll always be a market for cheap low-quality videos, and
| vice versa always a market for shockingly high quality videos.
| K. Asif's Mughal-e-Azam had enormous ticket sales and a huge
| budget spending on all sorts of stuff, like actual gold jewelry
| to make the actors feel that they were important despite the
| film being black and white.
|
| No matter how good AI gets, it will never be the highest
| budget. Hell, even technically more accurate quartz watches
| cannot compete price wise with mechanical masterpiece watches
| of lower accuracy
| barbariangrunge wrote:
| The company that controls online video is announcing a new tool,
| and ambitions to develop it further, to create videos without
| need for content creators. Using their videos to make a machine
| that will cut them out of the loop.
| hipadev23 wrote:
| I've never had to click "Sign in" so many times in a row.
| flying_whale wrote:
| ...and then fill out an actual google form at the end, _after_
| you've already signed in, to be added to the waitlist :sigh:
| throwup238 wrote:
| ...and enter your email into the form again despite being
| logged into a Google account.
| mrkramer wrote:
| YouTube people: We need more UGC.
|
| DeepMind people: AI can do it.
| belval wrote:
| While it's cool that they chose to showcase full-resolution
| videos, they take so long to load I thought their videos were
| just a stuttery mess.
|
| Turns out if you open the video in a new tab the smoothness is
| much more impressive.
| robertlagrant wrote:
| Hold on to your papers!
| esafak wrote:
| I love the reference to Llama with the alpacas.
| typpo wrote:
| The amount of negativity in these comments is astounding.
| Congrats to the teams at Google on what they have built, and
| hoping for more competition and progress in this space.
| localfirst wrote:
| We have to take into account that this community (a good chunk have
| stakes in YC and a lot to gain from secondary shares in OpenAI)
| and platform is going to favor its own, and be aware that Sam
| Altman is the golden boy of YC's founder after all.
|
| So of course you are going to see snarky comments and straight
| up denial in the competition. We saw that yesterday in the
| comments with the release of GPT4o in anticipation of Gemini
| 2.0 (GPT-5 basically) release being announced today at Google
| I/O
|
| I'm SORA to say Veo looks much more polished without jank.
|
| Big congratulations to Google and their excellent AI team for
| not editing their AI generated videos like SORA
| JumpCrisscross wrote:
| > _platform is going to favor its own and be aware that Sam
| Altman is the golden boy of YC's founder_
|
| I don't know if there is a sentiment analysis tool for HN,
| but I'm pretty sure it's been dead negative for Altman since
| at least Worldcoin.
| saalweachter wrote:
| A land of contrasts, etc.
| cosmotron wrote:
| Something in this vein was just posted here a few days
| back: https://news.ycombinator.com/item?id=40307519
| baobabKoodaa wrote:
| > We have to take account that this community (good chunk
| have stakes in YC and a lot to gain from secondary shares in
| OpenAI)
|
| You have to be pretty deep inside your own little bubble to
| think that more than even 0.001% of HN has "stakes in YC"
| or "secondary shares in OpenAI".
| hu3 wrote:
| It can be a vocal minority. Still vocal.
|
| I wouldn't discard.
| dylan604 wrote:
| I have 0% stake in any YC, and I'm very vocal in my
| negativity against any of these "AI" anythings. All of
| these announcements are only slightly more than a toddler
| anxious to show the parental units a finger painting
| looking to hang it on the fridge. Only instead of the
| fridge, they are hoping to get funding/investment
| knowing that their product is _not_ a fully fledged
| anything. It's comical.
| mrbungie wrote:
| The amount of copium in this response is astounding.
|
| Yes, there is a noticeable negative response from HN towards
| Google, and there has always been especially when speaking
| about their weird product management practices and
| incentives. Google hasn't launched any notable (and still
| surviving, Stadia being a sad example of this) consumer
| product or service in the last 10 years.
|
| But to suggest there is a Sam Altman / OpenAI bias is
| delusional. In most posts about them there is at least some
| kind of skepticism or criticism towards Altman (his
| participation in Worldcoin and his accelerationist stance
| towards AGI) or his companies (OpenAI not being really open).
|
| PS: I would say most people lurking here are just hackers (of
| many kinds, but still hackers), not investors with shady
| motives.
| betternet77 wrote:
| Yup, there's a significant anti-Google spin in HN, twitter.
| For example, here's paulg claiming that Cruise handles
| driving around cyclists better than Waymo [1], obviously not
| true to anyone who's used both services
|
| [1] https://twitter.com/paulg/status/1360341492850708481
| rvz wrote:
| You have to give Google credit as they went against the OpenAI
| fanatics, Google doomsday crowd and some of the permanent
| critics (who won't disclose they invested in OpenAI's secondary
| share sale) that believe that Google can't keep up.
|
| In fact, they already did. What OpenAI announced was nothing
| that Google could not do already.
|
| The top comments around Sora vs Veo suggesting that Google was
| falling behind, given the fact that both are still unavailable
| to use, weren't even a point to make in the first place, but just
| typical HN nonsense.
| JumpCrisscross wrote:
| > _What OpenAI announced was nothing that Google could not do
| already_
|
| I don't think I've seen serious criticism of Google's
| abilities. Apple didn't release anything that Xerox or IBM
| couldn't do. The difference is they didn't.
|
| Google's problem has always been in product follow through.
| In this case, I fault them for having the sole action item be
| a buried waitlist request and two new brands (Veo and
| VideoFX) for one unreleased product.
| sangnoir wrote:
| > I don't think I've seen serious criticism of Google's
| abilities
|
| Serious or not, that criticism existed on HN - and still
| does. I've seen many comments claiming Google has "fallen
| behind" on AI, sometimes with the insinuation that Google
| won't ever catch up due to OpenAI's apparent insurmountable
| lead
| aprilthird2021 wrote:
| I saw it here alone. A lot of people simply have no idea
| the level of research ability and skill Google, the
| inventor of the Transformer, has.
| KorematsuFredt wrote:
| > Google's problem has always been in product follow
| through.
|
| Google is large enough to not care about small
| opportunities. It ends up focusing on bigger opportunities
| that only it can execute well. Google's ability to shut
| down products that don't work is an insult to users but a
| very good corporate strategy and they deserve kudos for
| that.
|
| Now, coming back to the "follow through". Google Search,
| Gmail, Chrome, Android, Photos, Drive, Cloud etc. all are
| excellent examples of Google's long term commitment to the
| product and constantly making things better and keeping
| them relevant for the market. Many companies like Yahoo!
| had a head start but could not keep up with their mail
| service.
|
| Sure it has shut down many small products but that is
| because they were unlikely to turn into bigger
| opportunities. They often integrated the best aspect of
| those products into their other well established products
| such as Google Trips became part of search and Google
| Shopping became part of search.
| falcor84 wrote:
| > coming back to the "follow through". Google Search,
| Gmail, Chrome, Android, Photos, Drive, Cloud etc. all are
| excellent examples of Google's long term commitment
|
| Do you have any examples of something they launched in
| the last decade?
| troupo wrote:
| > Google is large enough to not care about small
| opportunities. It ends up focusing on bigger
| opportunities
|
| that result in shittier products overall. For example,
| just a few months ago they cut 17 features from Google
| Assistant because they couldn't monetize them, sorry,
| because these were "small opportunities":
| https://techcrunch.com/2024/01/11/google-is-
| removing-17-unde...
|
| > all are excellent examples of Google's long term
| commitment to the product and constantly making things
| better and keeping them relevant for the market.
|
| And here's a long list of excellent examples of Google
| killing products right and left because small
| opportunities or something: https://killedbygoogle.com/
|
| And don't get me started on the whole
| Hangouts/Meet/Allo/Duo/whatever fiasco
|
| > Sure it has shut down many small products but that is
| because they were unlikely to turn into bigger
| opportunities.
|
| Translation: because they couldn't find ways to monetize
| the last cent out of them
|
| ---
|
| Edit: don't forget: The absolute vast majority of
| Google's money comes from selling ads. There's nothing
| else it is capable of doing at any significant scale. The
| only reason it doesn't "chase small opportunities" is
| because Google _doesn't know how_. There are a few
| smaller cash cows that it can keep chugging along, but
| they are dwarfed by the single driving force that mars
| everything at Google: the need to sell more and more ads
| and monetize the shit out of everything.
| localfirst wrote:
| Don't forget SORA edited their "ai generated" videos while
| Google did not here.
|
| Where did SORA get all its training videos from again and why
| won't the executives answer a simple Yes/No question to "Did
| you scrape Youtube to train SORA?"
|
| Google attorneys want to know.
| scarmig wrote:
| Google does not care to start a war where every company has
| to form explicit legal agreements with every other company
| to scrape their data. Maybe if they got really desperate,
| but right now they have no reason to be.
| TwentyPosts wrote:
| > Don't forget SORA edited their "ai generated" videos
| while Google did not here.
|
| Wait, really? Could you point to proof for this? I'm very
| curious where this is coming from
| septic-liqueur wrote:
| I have no doubt about Google's capabilities in AI, my doubt
| lies in the productization part. I don't think they can
| produce something that will not be a complete mess
| CSMastermind wrote:
| > In fact, they already did.
|
| In terms of software that's actually been released Google is
| still at best in third place when it comes to AI products.
|
| I don't care what they can demo, I care what they've shipped.
| So far the only thing they've shipped for Veo is a waitlist.
| Xenoamorphous wrote:
| It's tiring. Same thing happened to the GPT-4o announcement
| yesterday. Apparently because there's no unquestionable AGI 14
| months after GPT-4 then everything sucks.
|
| I always found HN contrarian but as I say it's really tiring.
| I've no idea what the negative commenters are working on on a
| daily basis to be so dismissive of everybody else's work,
| including work that leaves 90% of the population in a
| combination of awe and fear. Also people sometimes forget that
| behind big corp names there are actual people. People who might
| be reading this thread.
| motoxpro wrote:
| Yeah it's pretty unfortunate. Saying something sucks is such
| a lack of understanding that things are not static. I guess
| it's a sure way to be right, because there will always be
| progress and you can look back and say "See I told you!"
| IggleSniggle wrote:
| Psh. Things are not static. Progress sucks now. Haven't you
| heard of enshittification? You can always look back and say,
| "see? I told you it would suck in the future!"
|
| ...why am I feeling the urge to point out that I am only
| making a joke here and not trying to make an actual counter
| point, even if one can be made...?
| mupuff1234 wrote:
| What's also tiring is that no one is allowed to have any
| critical thoughts because "it's tiring".
|
| From my own perspective the critique is usually a counter
| balance to extreme hype, so maybe let's just agree it's ok to
| have both types of comments, you know "checks and balances".
| Workaccount2 wrote:
| AI is a pretty direct threat to software engineering. It's no
| surprise people are hostile towards it. Come 2030, how do you
| justify paying someone $175k/yr when a $20/mo app is 95% as
| good, and the other 5% can be done by someone making $40k/yr?
| piloto_ciego wrote:
| I think it's fear. Maybe not openly, but people are spooked at
| how fast stuff is happening, so shitting on progress is a
| natural reaction.
| Workaccount2 wrote:
| I have noticed this the most in SWEs who went from being
| code writers to "human intention decipherers". Ask an SWE
| in 2019 what they do and it was "Write novel and efficient
| code", ask one in 2024 and you get "Sit in meetings and talk
| to project managers in order to translate their poor
| communication to good code".
|
| Not saying the latter was never true, it's just interesting
| to see how people have reframed their work in the wake of
| breakneck AI progress.
| kmacdough wrote:
| I suspect it's also a general fatigue with the over-hype. It
| _is_ moving fast, but every step improvement has come with
| its own mini hype cycle. The demos are very curated and make
| the model look incredibly flexible and resilient. But when we
| test the product in the wild, it's constantly surprising the
| simple tasks it blunders on. It's natural to become a bit
| cynical and human to take that cynicism on the attack. Not
| saying it's right, just natural, in the same way that it's
| natural for the marketing teams to be as misleading as they
| can get away with. Both are annoying, but there's not much to
| do.
| brikym wrote:
| Progress? There are loads of downsides the AI fans won't
| acknowledge. It diminishes human value/creativity and will be
| owned and controlled by the wealthiest people. It's not like
| the horse being replaced by the tractor. This time it's
| different: there is no place to move to but doing nothing on a
| UBI (best case). That same power also opens the door to
| dystopian levels of censorship and surveillance. I see more
| of the Black Mirror scenarios coming true rather than
| breakthroughs that benefit society. Nobody is denying that
| it's impressive but the question is more whether it's good
| overall. Unfortunately the toothpaste seems to be out of the
| tube.
| jmkni wrote:
| Well for me it linked to a Google Form to join a waitlist lol,
| so I'm not exactly pumped
| jtolmar wrote:
| I think it's just hype fatigue.
|
| There's genuinely impressive progress being made, but there are
| also a lot of new models coming out promising way more than
| they can deliver. Even the Google AI announcements, which used
| to be carefully tailored to keep expectations low and show off
| their own limitations, now read more like marketing puff
| pieces.
|
| I'm sure a lot of the HN crowd likes to pretend we're all
| perfectly discerning arbiters of the tech future with our
| thumbs on the pulse of the times or whatever, but realistically
| nobody is going to sift through a mountain of announcements
| ranging from "states it's revolutionary, is marginal
| improvement" to "states it's revolutionary, is merely an
| impressive step" to "states it's revolutionary, is bullshit"
| without resorting to vibes-based analysis.
| throwup238 wrote:
| It's made all the worse by just being a giant waitlist. Sora
| is still nowhere to be seen three months later, GPT-4o's
| conversational features aren't widely rolled out yet, and
| Google's AI releases have been waitlist after waitlist after
| waitlist.
|
| Companies can either get people hyped or have never-ending
| georestricted waitlists, they can't have their cake and eat
| it too.
| indigodaddy wrote:
| Isn't there a lot of positive forward motion and
| fruitfulness in the current state of the open source
| llama-3 community?
| Dig1t wrote:
| Honestly I just think that Google has burned their good will at
| this point. If you notice, most announcements by Apple are
| positively received here and same with OpenAI. But since
| Google's "don't be evil" persona has faded and since they went
| through so much churn WRT products, I think most people just
| don't want to see them win.
| rmbyrro wrote:
| I hope they didn't mess this one up with ideologically driven
| nonsense, like they did with Gemini.
| clawoo wrote:
| > "This tool isn't available in your country yet"
|
| How did I know I would see this message before clicking "Sign up
| to try"?
| makestuff wrote:
| Are there any good blogs/videos that ELI5 how these video
| generation models even work?
| sys32768 wrote:
| I assume for consumers to use this, we must agree to have product
| placements inserted into our productions every 48 seconds.
| SoftTalker wrote:
| Vaguely unsettling that the thumbnail for first example prompt "A
| lone cowboy rides his horse across an open plain at beautiful
| sunset, soft light, warm colors" looks something like the
| pixelated vision of The Gunslinger android (Yul Brynner's
| character) from the 1973 version of Westworld.
|
| See 1:11 in this video
| https://www.youtube.com/watch?v=MAvid5fzWnY
|
| Incidentally that was one of the early uses of computer graphics
| in a movie, supposedly those short scenes took many hours to
| render and had to be done three times to achieve a colorized
| image.
| AceJohnny2 wrote:
| Can't say I see a visual similarity. In any case, "Cowboy
| silhouette in the sunset" is a pretty classic American visual.
|
| But the parallel you made between android Brynner's vision and
| the generated imagery is fun to consider!
| totaldude87 wrote:
| It's 2024 and AI is taking over and yet, to sign up for this, it
| takes way more clicks and a Google Form entry (1)
|
| Sigh. I still have hopes for VEO though
| aragonite wrote:
| With so much recent focus by OpenAI/Google on AI's visual
| capabilities, does anyone know when we might see an OCR product
| as good as Whisper for voice transcription? (Or has that already
| happened?) I had to convert some PDFs and MP3s to text recently
| and was struck by the vast difference in output quality.
| Whisper's transcription was near-flawless, while all the OCR
| software I tried struggled with formatting, missed words, and made
| many errors.
| jazzyjackson wrote:
| You might enjoy this breakdown of the lengths one person went
| through to take advantage of the iOS vision API and creating a
| local web service for transcribing some very challenging memes:
|
| https://findthatmeme.com/blog/2023/01/08/image-stacks-and-ip...
|
| discussed on HN:
|
| https://news.ycombinator.com/item?id=34315782
| aragonite wrote:
| This is so good - thanks for sharing this!
| thesandlord wrote:
| We use GPT-4o for data extraction from documents, it's really
| good. I published a small library that does a lot of the
| document conversion and output parsing:
| https://npmjs.com/package/llm-document-ocr
|
| For straight OCR, it does work really well but at the end of
| the day it's still not 100%
| aragonite wrote:
| Thanks! look forward to checking this out as soon as I get
| home.
| tauntz wrote:
| Uh.. First it tells me that I can't sign up because my country
| isn't supported (yay, EU) and I can sign up to be notified when it's
| actually available. Great, after I complete that form, I get an
| error that the form can't be submitted and I'm taken to
| https://aitestkitchen.withgoogle.com/tools/video-fx where I can
| only press the "Join our waitlist" button. This takes me to a
| Google Form, that doesn't have my country in the required country
| dropdown and has a hint that says: "Note: the dropdown only
| includes countries where ImageFX and MusicFX are publicly
| available.". Say what?
|
| Why does this have to be so confusing? Is the name "Veo" or
| "VideoFX"? Why is the waitlist for VideoFX telling me something
| about public availability of ImageFX and MusicFX? Why is
| everything US only, again? Sigh..
| pelorat wrote:
| We can blame the EU AI act and other regulations for that.
| benatkin wrote:
| Yo lo veo.
| wseqyrku wrote:
| Google puts more effort into the namings than the actual model,
| ngl.
| bpodgursky wrote:
| I think it's funny the demos don't have people in them after the
| Gemini fiasco. I wonder if they didn't have time to re-train the
| model to show representative ethnicities.
| thih9 wrote:
| Is there any non slow motion example?
|
| The cyberpunk video seems better in that aspect, but I wish there
| were more.
| gliched_robot wrote:
| This is far superior to SORA, there is no comparison.
| monkeeguy wrote:
| lol
| xnx wrote:
| 60 second example video:
| https://www.youtube.com/watch?v=diqmZs1aD1g
| svag wrote:
| An interesting thing that Google does is to watermark the AI
| generated videos using the [SynthID
| technology](https://deepmind.google/technologies/synthid/).
|
| It seems that the SynthID is not only for AI generated video but
| for image, text and audio.
| s1k3s wrote:
| This looks really good for promo videos. All scenes in here are
| basically that.
| KorematsuFredt wrote:
| I think we should all take a pause and just appreciate the
| amazing work Google, OpenAI, MS and many others including those
| in academia have done. We do not know if Google or OpenAI or
| someone else is going to win the race but unlike many other
| races, this one makes the entire humanity move faster. Keep the
| negativity aside and appreciate the sweat and nights people have
| poured into making such things happen. Majority of these people
| are pretty ordinary folks working for a salary so they can spend
| their time with their families.
| ugh123 wrote:
| From a filmmaking standpoint I still don't think this is
| impactful.
|
| For that it needs a "director" to say: "turn the horse's head 90°
| the other way, trot 20 feet, and dismount the rider" and "give me
| additional camera angles" of the same scene. Otherwise this is
| mostly b-roll content.
|
| I'm sure this is coming.
| evantbyrne wrote:
| They claim it can accept an "input video and editing command"
| to produce a new video output. Also, "In addition, it supports
| masked editing, enabling changes to specific areas of the video
| when you add a mask area to your video and text prompt." Not
| sure if that specific example would work or not.
| qingcharles wrote:
| I can see using these video generators to create video
| storyboards. Especially if you can drop in a scribbled sketch
| and a prompt for each tile.
| iamleppert wrote:
| Too little, too late. Google is a follower, not a leader. They need
| to stop trying and do more stock buybacks and strip the company
| to barebones, like Musk did with Twitter & Tesla.
| NegativeLatency wrote:
| Shoulda used youtube to host their video, it's all broken and
| pixelated for me
| m3kw9 wrote:
| Why is it always in slow motion, is it hard to get the speed
| right?
| miohtama wrote:
| > Veo's cutting-edge latent diffusion transformers reduce the
| appearance of these inconsistencies, keeping characters, objects
| and styles in place, as they would in real life.
|
| How is this achieved? Is there temporal memory between frames?
| toasted-subs wrote:
| I could say something but I'm glad to get the confirmation.
| shaunxcode wrote:
| truly removing the `id` from video.
| abledon wrote:
| music is lacking.... suno, udio, riffusion all blow this out of
| the water
| ijidak wrote:
| These will be remembered as the AI wars.
|
| Reminds me of the competition in tech in the late 80's early 90's
| between Microsoft and Borland, Microsoft and IBM, AMD and Intel,
| Word vs WordPerfect, etc.
|
| It's a two horse race between Google and OpenAI.
| animanoir wrote:
| Google is so finished... Unless they remove Mr. Pinchar...
___________________________________________________________________
(page generated 2024-05-14 23:00 UTC)