[HN Gopher] Veo
___________________________________________________________________
Veo
Author : meetpateltech
Score : 1620 points
Date : 2024-05-14 17:58 UTC (1 days ago)
(HTM) web link (deepmind.google)
(TXT) w3m dump (deepmind.google)
| moralestapia wrote:
| Not nearly as good as Sora.
|
| Google missed this train, big time.
| DarmokJalad1701 wrote:
| > "SIGN IN TO GET A SNEAK PEAK."
|
| https://theoatmeal.com/comics/sneak_peek
| baal80spam wrote:
| Whoa. The URL is correct, the text is not.
| sowbug wrote:
| The page is now fixed. Even for a "test kitchen," that's a
| shocking error for a company like Google to make.
| hehdhdjehehegwv wrote:
| You have to log in just to see a demo? They are desperate to
| track people.
| iamleppert wrote:
| It's so bad it's laughable. Sundar really needs to crack the whip
| harder on those Googlers.
| bamboozled wrote:
| or someone has to crack the whip on Sundar :)
| geodel wrote:
| His job is keeping stock price up which he is doing well so
| far. Another is layoffs which again he is doing fine :)
| htrp wrote:
| Was anyone else confused by that Donald Glover segment? It felt
| like we were going to get a short film, and we got 3-5 clips.
| curiousgal wrote:
| Exactly!
|
| _" Hey guys big artist says this is fine so we're good"_
| jsheard wrote:
| And those clips mostly look like generic stock footage, not
| something specific that a director might want to pre-vis.
|
| This is what movie pre-vis is actually like, it doesn't need to
| be pretty, it needs to be _precise_ :
|
| https://www.youtube.com/watch?v=KMMeHPGV5VE
| thisoneworks wrote:
| Yeah, it wasn't obvious what they were trying to show. Demis
| said feature films will be released in a while.
| Keyframe wrote:
| It felt AI-generated.
| htrp wrote:
| I wish it were AI Donald Glover talking and the "Apple twist"
| at the end was that the entire 3 minute segment was a prompt
| for "Donald Glover talking about how Awesome Gemini Models
| are in a California vineyard"
| ZiiS wrote:
| Also it is either very good at generating living people or they
| need to put more thought into saying "Note: All videos on this
| page were generated by Veo and have not been modified"
| jsheard wrote:
| That "footage has not been modified" statement is probably to
| get ahead of any speculation that it was "cleaned up" in
| post, after it turned out that the Sora demo of the balloon
| headed man had fairly extensive manual VFX applied afterwards
| to fix continuity errors and other artifacts.
| iamdelirium wrote:
| Wait, where did you hear this? I would assume something
| like this would have made somewhat of a splash.
| jsheard wrote:
| The studio was pretty up front about it: they released a
| making-of video one day after debuting the short which
| made it clear they used VFX to fix Sora's errors in post,
| but OpenAI neglected to mention that in their own copy, so
| it flew under the radar for a while.
|
| https://www.youtube.com/watch?v=KFzXwBZgB88
|
| https://www.fxguide.com/fxfeatured/actually-using-sora/
|
| _> While all the imagery was generated in SORA, the
| balloon still required a lot of post-work. In addition to
| isolating the balloon so it could be re-coloured, it
| would sometimes have a face on Sonny, as if his face was
| drawn on with a marker, and this would be removed in
| AfterEffects. similar other artifacts were often
| removed._
| TIPSIO wrote:
| Seems like ImageFX, VideoFX (just a Google form and 3 demos),
| MusicFX, and TextFX at the links are down and not working.
|
| Huge grammar error on front page too.
| indy wrote:
| As someone who doesn't live in the US this year's Google IO feels
| like I'm outside looking in at all the cool kids who get to play
| with the latest toys.
| roynasser wrote:
| VPN'd right into that playground, turns out the toys were
| pretty blah
| numbers wrote:
| don't feel left out, we're all on the wait lists
| curiousgal wrote:
| Oh look another half baked product release that's not available
| in any country. They're a joke.
| mupuff1234 wrote:
| Is Sora available in any country?
| bamboozled wrote:
| I thought I read they've deemed Sora too dangerous to release
| pre-election? Or have reservations about it? I might be
| wrong...
| sib wrote:
| Sounds like a great excuse / communications strategy!
| jaggs wrote:
| Apparently it's only released to red teams at the moment as
| they try to manage safety. There's also the issue about
| releasing too close to an election?
| sebzim4500 wrote:
| The Donald Glover segment might be a new low for Google
| announcement videos. They spent all this time talking up the
| product but didn't actually show what he had created.
|
| Imagine how bad the model must be if this is the best way Google
| can think of selling it.
| fakedang wrote:
| What seems even worse is the Google TextFX video with Lupe Fiasco.
| What the heck am I supposed to get out of watching boring
| monologues by a couple of people? They could have just as
| easily shown, with less camera work, Lupe Fiasco actually using
| the LLM model, but they didn't - or at least not enough to grab
| my attention in 2 minutes.
|
| Personally, I liked the above link, even as a Google skeptic,
| but the videos aren't helping their case.
| Horffupolde wrote:
| Google is the new Kodak.
| bingbingbing777 wrote:
| Kodak failed because their CEO refused to go down the digital
| route. How is that comparable?
| loudmax wrote:
| The videos in this demo are pretty neat. If this had been
| announced just four months ago we'd all be very impressed by the
| capabilities.
|
| The problem is that these video clips are very unimpressive
| compared to the Sora demonstration which came out three months
| ago. If this demo had been announced by some scrappy startup it
| would be worth taking note of. Coming from Google, inventor of the
| Transformer and owner of the largest collection of videos in the
| world, these sample videos are underwhelming.
|
| Having said that, Sora isn't publicly available yet, and maybe
| Veo will have more to offer than what we see in those short clips
| when it gets a full release.
| fakedang wrote:
| Honestly, if Veo becomes public faster than Sora, they could
| win the video AI race. But what am I wishfully thinking - it's
| Google we're talking about!
| Jensson wrote:
| > But what am I wishfully thinking - it's Google we're
| talking about!
|
| Google, the company known for launching way too many products?
| What other big company launches more stuff early than they do?
| What people complain about with Google is that they launch too
| much and then shut things down, not that they don't launch
| things.
| fakedang wrote:
| Google lost first place in AI precisely because they've
| been walking on imaginary eggshells regarding AI's
| effect on the public. That led to the whole Gemini fiasco
| and the catch up game they've had to play with OpenAI-MSFT.
| spaceman_2020 wrote:
| The cost to switch to new models is negligible. People will
| switch to Sora instantly if it's better.
|
| I've switched to Opus from GPT-4 for coding and it was non-
| trivially easy
| ndls wrote:
| I think you used non-trivially wrong there, bud.
| SilverSlash wrote:
| Except your single experience doesn't mean it's generally
| true, bud. For instance I have not switched to Opus despite
| claims that it is better because I don't want to go through
| the effort of cancelling my ChatGPT subscription and
| subbing to Claude. Plus I like getting new stuff early that
| OpenAI occasionally gives out and the same could apply for
| Google's AI.
| fakedang wrote:
| Sorry, but lock-in effects are real. End users, solo devs
| and startups might find it trivially easy, but enterprise
| clients would have to jump through hoops before a decision
| is made. And enterprise clients would rather not go through
| with that, hence they'll stick with whoever came first,
| unless there's a massive differentiator between the two.
| alex_duf wrote:
| >these sample videos are underwhelming
|
| Wow, the speed at which we become blasé is terrifying. 6 months
| ago this was not possible, and it felt like it was years away!
|
| They're not underwhelming to me, they're beyond anything I
| thought would ever be possible.
|
| Are you genuinely unimpressed? Or maybe trying to play it cool?
| danielbln wrote:
| The faster the tech cycle, the faster we become accustomed to
| it. Look at your phone, an absolute, wondrous marvel of
| technology that would have been utterly and totally scifi just
| 25 years ago. Yet we take it for granted, as we do with all
| technology eventually. The time frames just compress is all,
| for better or for worse.
| newswasboring wrote:
| Yeah man, but there have to be some thresholds. We take
| phones for granted after years of active availability. I
| personally remember days when "what if your phone dies" was
| a valid concern for even short periods, and I'm not that
| old. Sora isn't even available publicly. At some point it
| crosses over from being jaded to just being a cynic.
| loudmax wrote:
| On some level, it's healthy to retain a sense of humility at
| the technological marvels around us. Everything about our
| daily lives is impressive.
|
| Just a few years ago, I would have been absolutely blown away
| by these demo videos. Six months ago, I would have been very
| impressed. Today, Google is rolling out a product that seems
| second best. They're playing catch-up in a game where they
| should be leading.
|
| I will still be very impressed to see videos of that quality
| generated on consumer grade hardware. I'll also be extremely
| impressed if Google manages to roll out public access to this
| capability without major gaffes or embarrassments.
|
| This is very cool tech, and the developers and engineers that
| produced it should be proud of what they've achieved. But
| Google's management needs to be asking itself how they've
| allowed themselves to be surpassed.
| steamer25 wrote:
| They didn't really do a very good job of selecting marketing
| examples. The only good one, that shows off creative
| possibilities, is the knit elephant. Everything else looks
| like the results of a (granted fairly advanced) search
| through a catalog of stock footage.
|
| Even search, in and of itself, is incredibly amazing but
| fairly commoditized at this point. They should've highlighted
| more unique footage.
| mccraveiro wrote:
| They didn't show any human videos, which could indicate that the
| technology struggles with generating them.
| karmasimida wrote:
| Actually, there is one in the last demo. It is not a standalone
| clip, but one shot in the demo where a team uses this model to
| create a scene with a human in it; they created an image of a
| Black woman, but only from the head up.
|
| I would generally agree though, it is odd that they didn't
| show more humans.
| revscat wrote:
| I'm sure part of the reason, beyond those given already, is
| that they want to avoid the debate around nudity.
| dyauspitr wrote:
| You know why and it's not that their technology struggles with
| it.
| lewispollard wrote:
| Please elaborate, because I certainly don't.
| blinky88 wrote:
| I think he's talking about the diversity controversy
| dyauspitr wrote:
| That might be a factor too but I was referring more to
| the nudity and objectification issue.
| chubot wrote:
| It's also probably that it's easier to spot fake humans than to
| spot fake cats or camels. We are more attuned to the faces of
| our own species
|
| That is, AI humans can look "creepy" whereas AI animals may
| not. The cowboy looks pretty good precisely because it's all
| shadow.
|
| CGI animators can probably explain this better than I can ...
| they have to spend way more time on certain areas and certain
| motions, and the rest of the time it makes sense to "cheat" ...
|
| It explains why CGI characters look a certain way too -- they
| have to be economical to animate
| mjfl wrote:
| thank goodness.
| himinlomax wrote:
| They're probably still wary of their latest PR disaster, the
| inclusive and diverse WW2 Germans from Gemini.
| xianshou wrote:
| To quote Twitter/X, "I wonder what OpenAI will release tomorrow
| and Google will release a waitlist for."
|
| GPT-4o: out
|
| Veo: waitlist
|
| Admittedly this is impressive and the direct comp would be Sora,
| which isn't out, but sometimes the caricature is very close to
| the truth.
| jsheard wrote:
| Then again Veo is in the same category as Sora, which isn't
| released either, 3 months after the reveal.
| rvnx wrote:
| "This tool isn't available in your country yet"
| modeless wrote:
| To be fair, all the voice stuff OpenAI demoed isn't released
| yet either.
| martinesko36 wrote:
| This is Google for the last 5+ IOs. They just release waitlists
| and demos that are leapfrogged by the time they're available to
| all. (and shut down a few years later)
| htrp wrote:
| Cite sources?
| skepticATX wrote:
| OpenAI hardly released gpt-4o. The demo yesterday was clearly a
| rushed response to I/O. It's quite possible that Google will
| ship multi-modality features faster than OpenAI will.
| JeremyNT wrote:
| Yeah I think at this point it's "not if, but when" and the
| gap to parity is just going to keep shrinking
| (until/unless there's some kind of copyright/legislative
| barrier implemented that favors one or the other).
|
| "We have no moat" swings both ways.
| juice_bus wrote:
| Which of these products that Google is releasing can you
| trust to even be around in a year or two? I'm certainly
| done trusting Google with new products.
| buildbot wrote:
| Without doing anything, I have access to GPT-4o in chatgpt
| and the api already (on a personal account, not related to
| work). Maybe I'm just super lucky, but it's certainly not
| vaporware.
| Difwif wrote:
| What do you mean? Everyone has access to the gpt-4o model
| right now through ChatGPT and the API. Sure we don't have
| voice-to-voice but we have a lot more than what Google has
| promised.
| hbn wrote:
| How do I get access? I just checked my app and the Premium
| upgrade says it will unlock GPT-3.5 and GPT-4, so I
| assume my version is still the old one.
|
| All my apps are updated in the App Store too.
| bagels wrote:
| I have a paid account, and I didn't have to do anything
| to use the new model.
| DeRock wrote:
| I just checked, there was an iOS app update available and
| it enabled it. I'd check again if there's a new update
| (version 1.2024.129). Or you could use the website.
| hbn wrote:
| I'm on the same version and don't see anything different.
|
| The website also only has a toggle for 3.5 and 4 with the Plus
| upgrade. Not sure if it's because I'm in Canada?
| croes wrote:
| I use their website and it's one of the three models to
| choose from if you are on a Plus subscription.
| mr_mitm wrote:
| I have premium access and I can select 4o in the dropdown
| menu on Android
| cush wrote:
| I have the 4o model. On premium. No voice yet
| theresistor wrote:
| To add a counterbalance, I just checked in the app and on
| the website on a non-paid account, and I too do NOT have
| GPT-4o.
| hbn wrote:
| Everyone in the replies to my comment who says they have
| access seems to be a paid user. So maybe it's only rolling
| out to them first.
| htrp wrote:
| I expect they have to offer the paid users something
| hbn wrote:
| Paid users get like a 5x higher rate limit iirc
| electriclove wrote:
| My (paid) app has it but no voice chat yet
| TecoAndJix wrote:
| I have a paid account and can do voice to voice on the
| iOS app as of last night.
| hbn wrote:
| The realtime one they showed yesterday or the one that's
| existed forever where it's just a voice-to-text input and
| TTS output taking turns?
| TecoAndJix wrote:
| I feel silly now. I downloaded the app after the
| announcement (I'm a desktop user) and it looked identical
| to the one they show in the sarcasm video. When I asked
| it, I was told it was not the new feature announced
| yesterday. Still a lot of fun!
|
| Edit - it does list the new model in my app at least
| sib wrote:
| In the App Store there's a new build of the iOS app as of
| about 3 hours ago (call it 11am US Pacific time). It
| includes the GPT-4o model (at least it shows it for me).
| hbn wrote:
| Are you a paid user?
| theresistor wrote:
| It is not available on my (free) account in either the app
| or the website. So no, everyone does _not_ have access to
| it.
| satvikpendem wrote:
| It's for paid users for now, not free. I have ChatGPT Pro
| and I can use the new model.
| nialv7 wrote:
| I am a free user and I have 4o. I think it is just a
| gradual roll out.
| satvikpendem wrote:
| That sounds about right, it just seemed that everyone who
| replied above who had access to the new model was a paid
| user.
| ben_w wrote:
| API yes, ChatGPT no (at least not for all users); I've got
| my own web interface for the API so I can play with the
| model (for all of $0.045 of API fees), but most people
| can't be bothered with that and will only get 4o when it
| rolls out to their specific ChatGPT account.
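| For illustration, a minimal sketch of that kind of DIY interface
| as a command-line script rather than a web page, assuming the
| official openai Python client and an OPENAI_API_KEY in the
| environment (the script name and default prompt are made up):
|
|     # ask_4o.py - tiny one-shot front end for the chat API
|     import sys
|     from openai import OpenAI  # pip install openai
|
|     client = OpenAI()  # reads OPENAI_API_KEY from the environment
|
|     def ask(prompt: str) -> str:
|         # Single chat completion against the gpt-4o model
|         resp = client.chat.completions.create(
|             model="gpt-4o",
|             messages=[{"role": "user", "content": prompt}],
|         )
|         return resp.choices[0].message.content
|
|     if __name__ == "__main__":
|         print(ask(" ".join(sys.argv[1:]) or "Say hello"))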
| ghshephard wrote:
| Was running just fine on my ChatGPT client on iOS - full
| 2-way voice comms by 3:00 PM yesterday. Application was
| already updated.
| mike_hearn wrote:
| I have a regular ChatGPT Pro account and I have GPT-4o.
|
| The bigger issue is that 4o without the multi-modal, new
| speech capabilities or desktop app isn't that different
| to GPT-4. And those things aren't yet launched.
| TecoAndJix wrote:
| Posted further down the thread - I have a paid account and
| can do voice to voice on the iOS app as of last night.
| baobabKoodaa wrote:
| I don't have access to gpt-4o via ChatGPT
| localfirst wrote:
| Listen, all these guys out here attacking Google and making
| outlandish/false claims:
|
| look at their LinkedIn pages; that will tell you why they are
| desperate
|
| (hint: they bought OpenAI bags on the secondary market)
| rvz wrote:
| Sora is the closest comparison to Veo, and neither is out.
|
| It's been three months and Sora still isn't even close to
| being released and available.
|
| Essentially Google has already caught up to OpenAI with their
| recent responses and it's clear that there are private OpenAI
| investors pushing such nonsense around Google struggling to
| compete.
| resource_waste wrote:
| Google Press: This is the greatest AI model yet.
|
| Users: Lol, it won't even tell me how to draw a picture of a
| human because it's inappropriate.
|
| Google flipped like a switch a few years ago. Instead of going
| for product quality, it seems they went full Apple Marketing,
| controlling the narrative on top social media.
|
| I keep thinking: "well, it's Google, they will be the best,
| right?" No, I'm at the point of giving up on Google; they are
| not as powerful as I once thought... Hmm, seems like a good
| time to get into Lobbying and Marketing...
| nextworddev wrote:
| Except gpt-4o with audio and video inputs isn't actually out
| adamtaylor_13 wrote:
| I was using it yesterday in the mobile app. Unless they just
| slapped the new UI on an older model.
| hakanensari wrote:
| It's no longer there, I think?
| Cyph0n wrote:
| I think they (partially?) rolled it back. I tried out the
| voice input yesterday, but it's missing from the app today.
| Workaccount2 wrote:
| They just released the text chat model which still uses the
| same old audio interface as 4. The new audio/video chat
| stuff is not out yet (unless you are a very lucky early
| beta user).
| dom96 wrote:
| > GPT-4o: out
|
| Is it? I can't use it yet at least
| lordswork wrote:
| I am also wondering how to use it..
| drawnwren wrote:
| It is. I've got it already, but I'm a bit of a gpt4 power
| user. I hit my rate limit biweekly or so and run up close to
| it every day. I'd bet maybe they prioritized people that were
| costing them money.
| kaibee wrote:
| It might be just by sign-up order. I signed up for pro
| basically as soon as I could, but I never hit limits, and
| only really use it once or twice a day, sometimes not at
| all.
| drawnwren wrote:
| Interestingly, when I use Cloudflare's Warp DNS I don't
| have access to it. So, it might have something to do w/
| region as well?
| phyalow wrote:
| Really? You have the new model, sure, I have it too, but
| afaik nobody has the new ultra fast and variable voice +
| video chat on mobile.
| drawnwren wrote:
| The original question asked if anyone had GPT-4o. You're
| asking a different question.
| w-m wrote:
| GPT-4o is available for me on ChatGPT, with the
| text+attachment input (as a Plus user from Germany). It's
| crazy fast. The voice for the audio conversation in the app
| is still the old one, and doesn't let you interrupt it.
| mrkramer wrote:
| Google is scared of what every new model can produce; they
| don't want drama, but they always end up in some kind of media
| drama anyway.
| Tenoke wrote:
| I can't even join the waitlist from Europe while 4o is fully
| available here.
| dyauspitr wrote:
| I haven't been able to try out 4o. The voice chat continuously
| says there's too much traffic and I don't even see a button to
| turn on the camera
| qwertox wrote:
| > GPT-4o: out
|
| I don't know what's wrong with GPT-4o, but the answers I'm
| getting are much worse than they were before yesterday. It
| constantly repeats the entire content required to provide a
| seemingly "full" answer; when it passes me the same, slightly
| modified Python code for the fifth time, even though it has
| become irrelevant to the current conversation, it really gets
| on my nerves.
|
| I had custom instructions that were so well tuned and worked
| beautifully, and now it's as if it is ignoring most of them.
|
| It's causing me frustration and really wasting my time when I
| have to wait for the unnecessarily long answers to finish.
| endisneigh wrote:
| I've noticed that a lot of the commentary around these models
| creates the sort of fervor you see around politics or sports.
|
| In any case - no details on compute needed. Curious if this ever
| can be cheap. Even Midjourney still requires a lot.
|
| I'm also surprised there hasn't been some attempt at creating
| benchmarks for this. One example could be color accuracy.
| stefan_ wrote:
| Never mind no benchmarks, half of these announcements in the
| past were straight _made up_: "offline enhanced", cherry-picked
| "examples", CGI fantasies.
|
| Not to mention the whole AGI topic is forever doomed thanks to
| SciFi fans; just remember what happened with that
| room-temperature superconductor.
| inasio wrote:
| From a 2014 Wired article [0]: "The average shot length of
| English language films has declined from about 12 seconds in 1930
| to about 2.5 seconds today"
|
| I can see more real-world impact from this (and/or Sora) than
| most other AI tools
|
| [0] https://www.wired.com/2014/09/cinema-is-evolving/
| jsheard wrote:
| Even if the shots are very short you still need coherency
| _between_ shots, and they don't seem to have tackled that
| problem yet.
| mattgreenrocks wrote:
| This is very noticeable. Watching movies from the 1970s is
| positively serene for me, whereas the shot length in modern
| films often leaves me wondering, "wait, what just happened
| there?"
|
| And I'm someone who is fine playing fast action video games.
| Can't imagine what it's like if you're older or have sensory
| processing issues.
| ryandrake wrote:
| Obligatory: Liam Neeson jumps over a fence in 6 seconds, with
| 14 cuts[1].
|
| 1: https://www.youtube.com/watch?v=gCKhktcbfQM
| aidenn0 wrote:
| I'd like to fact check this amazing comment on that video,
| but it would require watching Taken 3:
|
| > Some of y'all may find how awful this editing gets pretty
| interesting: I did an Average Shot Length (ASL) for many
| movies for a recent project, and just to illustrate bad
| overediting in action movies, I looked at Taken 3 (2014) in
| its extended cut.
|
| > The longest shot in the movie is the last shot, an aerial
| shot of a pier at sunset ending the movie as the end
| credits start rolling over them. It clocks in at a runtime
| of 41 seconds and is, _BY FAR_, the longest shot in the
| movie.
|
| > The next longest is a helicopter establishing shot of the
| daughter's college after the "action scene" there a little
| over an hour in, at 5 seconds.
|
| > Otherwise, the ASL for Taken 3 (minus the end
| credits/opening logos), which has a runtime of 1:49:40,
| 4,561 shots in all (!!!), is 1.38 SECONDS. For comparison,
| Zack Snyder's Justice League (2021) (minus end
| credits/opening logos) is 3:50:59, with 3163 shots overall,
| giving it an ASL of 4.40 seconds, and this movie, at 1 hour
| 50 minutes, has north of 4,561 for an ASL of 1.38
| seconds?!?! _Taken 3 has more shots in it than Zack Snyder's
| Justice League, a movie more than double its length..._
|
| > To further illustrate how ridiculous this editing gets,
| the ASL for Taken 3's non-action scenes is 2.27 seconds. To
| reiterate, this is the non-action scenes. The "slow
| scenes." The character stuff. Dialogue scenes. The stuff
| where any other movie would know to slow down. 2.27 SECONDS.
| For comparison, Mad Max: Fury Road (minus end
| credits/opening logos) has a runtime of 1:51:58, with 2646
| shots overall, for an ASL of 2.54 seconds. TAKEN 3'S "SLOW
| SCENES" ARE EDITED MORE AGGRESSIVELY THAN MAD MAX: FURY
| ROAD!
|
| > And Taken 3's action scenes? _Their ASL is 0.68 seconds!_
|
| > If it weren't for the sound people on the movie, Taken 3
| wouldn't be an "action movie". It'd be abstract art.
| throwup238 wrote:
| It's worth noting that Taken 3 has a 13% rating on Rotten
| Tomatoes, which is well into "it's so bad it's good"
| territory. I don't think the rapid cuts went unnoticed.
| nimithryn wrote:
| Yeah, this sequence is a meme commonly cited to show
| "choppy modern editing"
| llmblockchain wrote:
| More chops than an MF DOOM track.
| kristofferR wrote:
| The top comment makes a really good point though:
|
| "He's 68. I'm guessing they stitched it together like this
| because "geriatric spends 30 seconds scaling chainlink
| fence then breaks a hip" doesn't exactly make for riveting
| action flick fare."
|
| Lingering shots are horrible for obscuring things.
| lupire wrote:
| Movies have stunt performers.
|
| And Neeson was only 60 when filming Taken 3.
| troupo wrote:
| Keanu Reeves was 57-8 when he shot the last _John Wick_.
| IIRC Bob Odenkirk was 58 in _Nobody_. Neeson was 60 in
| Taken 3.
|
| There are ways to shoot an action scene with an aging star
| that don't involve 14 cuts in 4 seconds. You just have
| to care about your craft.
| nineteen999 wrote:
| Is it Liam Neeson, or his stunt double?
| psbp wrote:
| My brain processes things too slowly for modern action movies.
|
| I can tell what's going on, but I always end up feeling
| agitated.
| MarcScott wrote:
| I'm okay with watching the majority of action movies, but I
| distinctly remember watching this fight scene in a Bourne
| movie and not having a clue what was going on. The constant
| camera changes, short shot length, and shaky cam, just
| confused the hell out of me.
|
| https://youtu.be/uLt7lXDCHQ0?si=JnVMjmu0WgN5Jr5e&t=70
| earthnail wrote:
| I thought it was brilliant. Notice there's no music. It's
| one of the most brutal action scenes I know. Brutal in
| the sense of how honest it felt about direct combat.
| JohnMakin wrote:
| I'm glad we're finally getting away from the 00's shaky
| cam era.
| kemitchell wrote:
| Enjoy some Tarkovsky.
| joshuahedlund wrote:
| How many of those 2.5 second "shots" are back-and-forths
| between two perspectives (ex. of two characters talking to one
| another) where each perspective is consistent with itself? This
| would be extremely relevant for how many seconds of consistent
| footage are actually needed for an AI-generated "shot" at film-
| level quality.
| lobochrome wrote:
| Shot length, yes - but the scene stays the same. Getting
| continuity with just prompts doesn't seem to be figured out yet.
|
| Maybe it's easy, and you feed continuity stills into the
| prompt. Maybe it's not, and this will always remain just a more
| advanced storyboarding technique.
|
| But then again, storyboards are always less about details and
| more about mood, dialog, and framing.
| chipweinberger wrote:
| In 1930 they often literally had a single camera.
|
| Just worth keeping that in mind. You could not just switch
| between multiple shots like you can today.
| Keyframe wrote:
| Kind of sucks to be Google. Even though they're making good
| progress here, and have laid the foundations of a lot if not
| most of this... their products are, well, there aren't any
| noteworthy ones compared to the rest. And considering Google is
| sitting on top of one of the largest if not THE largest video
| databases, along with maps, traffic, search, internet.zip,
| usenet, and vast vertically integrated computing resources...
| they have every advantage in the world. So what the hell are
| they doing? Why isn't their CEO already out? Expectations of
| them are higher than of anyone else.
| InfiniteVortex wrote:
| Google search has been absolutely ruined in terms of quality.
| You're right, they've built the R&D base for many of the AI
| breakthroughs that are powering competing alternative
| products... products that happen to be better than Google's
| own. Google went from "Don't be evil" to just another big
| corporate tech company. They have so much potential.
| Regrettable.
| CraftingLinks wrote:
| They are fast on their way becoming IBM 2.0.
| jason-phillips wrote:
| More like Xerox
| dyauspitr wrote:
| If anything google search with the Gemini area on the top has
| been very good for me.
| atleastoptimal wrote:
| Because they punish experimentation as it eats into their
| bottom line. AI is a tool for ads in the mind of executives at
| Google. Ads and monetization of human productivity, not an
| agent of productivity on its own.
| khazhoux wrote:
| C'mon, Google doesn't "punish" experimentation. Google X,
| Google Glass, Daydream, Fuchsia, moonshots, the lab spinoff
| (whose name I can't remember)... hell, even all the abandoned
| products everyone here always complains about.
|
| The experiments often/usually fail, but they _do_ experiment.
| Koffiepoeder wrote:
| If you prune all the branches, where will the fruits grow?
| khazhoux wrote:
| The branches were dead and could bear no fruit. New
| branches will sprout next season.
| saalweachter wrote:
| For grapes, the conventional wisdom is to prune all the
| old branches at the end of each season.
| lolinder wrote:
| "Laser-focused on the bottom line at the expense of all else"
| is not how I'd describe Google, now or at any point in the
| past. They have a _lot_ of dysfunction, but if anything that
| dysfunction stems from _too much_ experimentation and
| autonomy at the leaf nodes of the organization. That's how
| they get into these crazy places where they have to pick
| between 5 chat apps or whatever.
|
| If Google were as focused on ads as you seem to think we'd at
| least see some sort of coherent org-wide strategy instead of
| a complete lack of direction.
| criddell wrote:
| I'd describe Google as focused on the bottom line after
| they put the ads guy in charge of search.
|
| I'm referring to this article that was posted here
| recently:
|
| https://www.wheresyoured.at/the-men-who-killed-google/
| khazhoux wrote:
| The person now in charge of Search is Elizabeth Hamon
| Reid, a long-time googler who came up through the ranks
| from engineer (in Google Maps) to VP over 20 years. She's
| legit.
| criddell wrote:
| Is Wikipedia out of date then?
|
| https://en.wikipedia.org/wiki/Prabhakar_Raghavan
| khazhoux wrote:
| Ah, according to this, she's head of Search but reports
| to Prabhakar. I thought from recent reports that she'd
| taken search over from him.
|
| Nonetheless, she was a good engineer and a good manager,
| back when we crossed paths many moons ago.
|
| https://searchengineland.com/liz-reid-google-new-head-of-
| sea...
| lolinder wrote:
| That was a decision that prioritized the bottom line over
| other things. But saying that Google is "focused" on the
| bottom line implies that there's a pattern of them
| putting the bottom line first, which is simply not true
| if you look at Google as a whole. Search specifically,
| maybe, but not Alphabet.
| Workaccount2 wrote:
| I don't know why more people don't talk about the 1M context
| tokens. While the output is mediocre for cutting edge models,
| you can context stuff the ever living hell out of it for some
| pretty amazing capabilities. 2M tokens is even crazier.
| lordswork wrote:
| It is pretty amazing. I've been using it every day. I do wish
| you could easily upload an entire repo into it though.
| bongodongobob wrote:
| Have it write a program to output a repo as a flat file.
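| For instance, a minimal sketch of such a script (the file names
| and skip list here are made up; it just concatenates every
| readable text file under the repo, skipping .git and the like):
|
|     # flatten_repo.py - concatenate a repo's text files into one file
|     import os
|
|     REPO_DIR = "."              # hypothetical: path to the repo
|     OUT_FILE = "repo_flat.txt"  # hypothetical output file name
|     SKIP_DIRS = {".git", "node_modules", "__pycache__"}
|
|     with open(OUT_FILE, "w", encoding="utf-8") as out:
|         for root, dirs, files in os.walk(REPO_DIR):
|             # Prune directories we never want to include
|             dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
|             for name in sorted(files):
|                 path = os.path.join(root, name)
|                 try:
|                     with open(path, encoding="utf-8") as f:
|                         text = f.read()
|                 except (UnicodeDecodeError, OSError):
|                     continue  # skip binary or unreadable files
|                 # Header line so the model can tell files apart
|                 out.write(f"\n===== {path} =====\n{text}\n")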
| rm_-rf_slash wrote:
| Anything approaching the token limit I turn into a file and
| upload to a vector store. Results are comparable between Chat
| and Assistants.
| Keyframe wrote:
| That's a good point. Gemini gatekeeping me on so many answers
| made me forget about this extraordinary feature of it.
| softwaredoug wrote:
| It's often said you need to disrupt your own business model.
|
| Google had blinders on. They didn't relentlessly focus on
| reinventing their domain. They just milked what they had.
| Gradually losing sight of the user experience[1] to focus on
| monetization above all else.
|
| 1 - https://twitter.com/pdrmnvd/status/1707395736458207430
| dyauspitr wrote:
| Their CEO is generating massive, growing profits every quarter
| while releasing generative technology, all the while walking
| a fine line on what those models generate, because it can be
| pretty devastating for a large corp like Google.
| Keyframe wrote:
| you think it's because of him or despite him?
| airstrike wrote:
| > Veo
|
| > Sign up to try VisionFX
|
| Is it Veo or VisionFX? Is it a sign up, a trial, or a waitlist?
|
| How hard can it be to write a clear message? In the words of Don
| Miller, if you confuse, you lose.
| therein wrote:
| Yeah, I was like, so is it Veo or VisionFX?
|
| This landing page feels as haphazardly put together as the
| Coinbase downtime page last night.
| peppertree wrote:
| This is very on-brand with how Google does branding. "are you
| confused yet? no? try this other vaguely similar name."
| davidw wrote:
| Maybe it's going to be a new messaging app - but with AI!
|
| Kidding... I signed up for the waitlist. I have ideas for
| videos I'd like to use to explain things that I have no hope
| of creating myself.
| BlackJack wrote:
| Disclaimer: I work at Google on related stuff
|
| Veo is the name of a video model. VideoFX is the name of a new
| experimental tool at labs.google.com, which uses Veo and lets
| you make videos.
|
| Thanks for the feedback though, I see how it's confusing for
| users.
| zb3 wrote:
| I see the endpoint returns "Not Implemented" when trying to
| make a video :<
|
| Imagen 3 is awesome though, generates nice logos :D
| mike_hearn wrote:
| Presumably this is DeepMind vs Labs fighting over the same
| project. A consequence of guaranteeing Demis some level of
| independence when DeepMind was bought, which still shows
| through in the fact that the DeepMind brand(s) survive.
| qingcharles wrote:
| And: Communication isn't what you say, it's what people hear
|
| Agree this is totally confusing.
| rishav_sharan wrote:
| Now that the first direct competitor to Sora has been announced,
| I am sure Sora will suddenly be ready for public consumption,
| all its AI safety concerns forgotten.
| sebastiennight wrote:
| I think there's a tremendous compute cost associated with both
| models still... I can't see how either company could withstand
| the instant enormous demand, even if they tried to command
| crazy prices.
|
| Even at $1 per 5-second video, I think some use cases
| (including fun/non-business ones) would still overwhelm
| capacity.
| popcar2 wrote:
| Not nearly as impressive as Sora. Sora was impressive because the
| clips were long and had lots of rapid movement since video models
| tend to fall apart when the movement isn't easy to predict.
|
| By comparison, the shots here are only a few seconds long and
| almost all look like slow motion or slow panning shots
| cherrypicked because they don't have that much movement. Compare
| that to Sora's videos of people walking at real speed.
|
| The only shot they had that can compare was the cyberpunk video
| they linked to, and it looks crazy inconsistent. Real shame.
| nuz wrote:
| Sora is also limited to a certain range of movement if you look
| at the clips closely. Probably something like filtering by some
| function of optical flow in both cases.
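| A rough sketch of what that kind of filter could look like
| (purely illustrative; assumes OpenCV is installed, and the clip
| path and threshold are made up):
|
|     # flow_filter.py - score a clip by mean optical-flow magnitude
|     import cv2
|     import numpy as np
|
|     def mean_flow_magnitude(path: str) -> float:
|         cap = cv2.VideoCapture(path)
|         ok, prev = cap.read()
|         prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
|         mags = []
|         while True:
|             ok, frame = cap.read()
|             if not ok:
|                 break
|             gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
|             # Dense Farneback optical flow between consecutive frames
|             flow = cv2.calcOpticalFlowFarneback(
|                 prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
|             mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
|             mags.append(float(mag.mean()))
|             prev_gray = gray
|         cap.release()
|         return float(np.mean(mags)) if mags else 0.0
|
|     # Hypothetical gate: only keep clips with modest motion
|     if mean_flow_magnitude("clip.mp4") < 2.0:
|         print("clip passes the low-motion filter")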
| ein0p wrote:
| Also Sora demos had some really impressive generations
| featuring _people_. Here we hardly see any people, which likely
| means exactly what you'd guess.
| data-ottawa wrote:
| Has Gemini started generating images of people again? My
| trial has ended and I haven't been following the issue.
| spiderfarmer wrote:
| Also the horse just looks weird, just like the buildings and
| peppers.
|
| It's impressive as hell though. Even if it would only be used
| to extrapolate existing video.
| LZ_Khan wrote:
| I imagine that's just a function of how much training data you
| throw at it.
| Jensson wrote:
| > Sora was impressive because the clips were long and had lots
| of rapid movement
|
| Sora videos ran at 1 beat per second, so everything in the
| image moved on the same beat, often too slow or too fast to
| keep the pace.
|
| It is very obvious when you inspect the frames and notice that
| there are keyframes at every whole-second mark, where everything
| on the screen suddenly jumps to its next animation step.
|
| That really limits the kind of videos you can generate.
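| One crude way to check that kind of claim on any released clip
| (a sketch, not anything Sora-specific; assumes OpenCV and a
| locally downloaded file) is to look at where the biggest
| frame-to-frame changes land:
|
|     # beat_check.py - look for periodic jumps in frame-to-frame change
|     import cv2
|     import numpy as np
|
|     cap = cv2.VideoCapture("clip.mp4")  # hypothetical clip path
|     fps = cap.get(cv2.CAP_PROP_FPS)
|     ok, prev = cap.read()
|     diffs = []
|     while True:
|         ok, frame = cap.read()
|         if not ok:
|             break
|         # Mean absolute pixel difference between consecutive frames
|         diffs.append(float(np.mean(cv2.absdiff(frame, prev))))
|         prev = frame
|     cap.release()
|
|     # A 1-second "beat" would show up as spikes clustered near
|     # whole-second timestamps.
|     top = np.argsort(diffs)[-10:]
|     for i in sorted(top):
|         print(f"{(i + 1) / fps:6.2f}s  diff={diffs[i]:.1f}")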
| lupire wrote:
| So it needs to learn how far each object can travel in 1sec
| at its natural speed?
| Jensson wrote:
| It also needs to separate animation steps for different
| objects so that objects can keep different speeds. It isn't
| trivial at all to go from having a keyframe for the whole
| picture to having separate ones for separate parts; you need
| to retrain the whole thing from the ground up, and the results
| will be way worse until you figure out a way to train that.
|
| My point is that it isn't obvious at all that Sora's way
| actually is closer to the end goal. It might look better
| today to have those 1-second beats for every video, but
| where do you go from there?
| Aerroon wrote:
| The best-case scenario would probably be being able to
| generate "layers" one at a time. That would give more
| creative control over the outcome, but I have no idea how
| you would do it.
| TIPSIO wrote:
| Objectively speaking (if people would be honest with
| themselves), both are just decent at best.
|
| I think comparing them now is probably not that useful outside
| of this AI hype train. Like comparing two children. A lot can
| happen.
|
| The bigger message I am getting from this is it's clear OpenAI
| won't have a super AI monopoly.
| TaylorAlexander wrote:
| Comparing two children is a good one. My girlfriend has taken
| to pointing out when I'm engaging in "punditry". They're an
| engineer like I am and we talk about tech all the time, but
| sometimes I talk about which company is beating which company
| like it's a football game, and they call me out for it.
|
| Video models are interesting, and to some extent trying to
| imagine which company is gonna eat the other's lunch is kind
| of interesting, but sometimes that's all people are
| interested in and I can see my girlfriend's reasoning for
| being disinterested in such discussion.
| Jonanin wrote:
| Except that many of the people involved do think of it like
| a football game, and thus it actually is like one. Of
| course the researchers and engineers at both OpenAI and
| Google DeepMind have a sense of rivalry and strive to one
| up one another. They definitely feel like they are in a
| competition.
| TaylorAlexander wrote:
| > They definitely feel like they are in a competition.
|
| Citation needed?
|
| Although I did not work in AI, I did work at Google X
| robotics on a robot they often use for AI research.
|
| Maybe some people felt like it was a competition, but I
| don't have much reason to believe that feeling is common.
| AI researchers are literally in collaboration with other
| people in the field, publishing papers and reading the
| work of others to learn and build upon it.
| Jensson wrote:
| > AI researchers are literally in collaboration with
| other people in the field, publishing papers and reading
| the work of others to learn and build upon it.
|
| When OpenAI suddenly stopped publishing their stuff, I bet
| many researchers started feeling like it had become a
| competition.
|
| OpenAI is no longer cooperating, they are just competing.
| They still haven't said anything about how gpt-4 works.
| motoxpro wrote:
| What would make this "Good?"
| Aeolun wrote:
| I'm fairly certain Google just has a big stack of these
| sitting in storage, unreleased; either that, or the moment
| someone pulls ahead it's all hands on deck to make the same thing.
| arcastroe wrote:
| > The shots here [..] almost all look like slow motion or slow
| panning shots.
|
| I think this is arguably better than the alternative. With
| slow-mo generated videos, you can always speed them up in
| editing. It's much harder to take a fast-paced video and slow
| it down without terrible loss in quality.
| totaldude87 wrote:
| Could also be Google's doing. If Veo screws up, the
| weight falls on Alphabet stock, while OpenAI is not public and
| doesn't have to worry about anything. Even if OpenAI
| faked some of their AI videos (not saying they did), it wouldn't
| affect them the way it would affect Veo -> Google -> Alphabet.
|
| Being cautious often puts a dent in innovation.
| soulofmischief wrote:
| You mean like how they faked some Gemini stuff?
|
| https://www.bbc.com/news/technology-67650807
| latexr wrote:
| > Not nearly as impressive as Sora. Sora was impressive because
| the clips were long and had lots of rapid movement
|
| The most impressive Sora demo was heavily edited.
|
| https://www.fxguide.com/fxfeatured/actually-using-sora/
| rvz wrote:
| Interesting to see that OpenAI was successful in creating
| their own reality distortion spells, just like Apple's
| reality distortion field which has fooled many of these
| commenters here.
|
| It's quite early to race to the conclusion that one is better
| than the other when not only they are both unreleased, but
| especially when the demos can be edited, faked or altered to
| look great for optics and distortion.
|
| EDIT: It appears there is at least one commenter who replied
| below that is upset with this fact above.
|
| It is OK to cope, but the truth really doesn't care
| especially when the competition (Google) came out much
| stronger than expected with their announcements.
| ijidak wrote:
| Well, as a counterpoint, Apple did become a $2 trillion
| company...
|
| Distortion is easiest when the products really work. :)
| adventured wrote:
| Apple got up to $3 trillion back in 2023.
| turnsout wrote:
| Indeed, and they're at 2.87T today... Built largely on
| differentiated high-margin products, which is not how I
| would describe OpenAI. I should clarify that I'm a fan of
| both companies, but the reality is that OpenAI's business
| model depends on how well it can commoditize itself.
| jsheard wrote:
| To Shy Kids' credit, _they_ made it clear the Sora footage was
| heavily edited, but OpenAI's site still presents Air Head
| without that context.
|
| https://www.youtube.com/watch?v=KFzXwBZgB88 (posted the day
| after the short debuted)
|
| https://openai.com/index/sora-first-impressions (no mention
| of editing, nor do they link to the above making-of video)
| seoulmetro wrote:
| There is now on that second link:
|
| >The videos below were edited by the artists, who
| creatively integrated Sora into their work, and had the
| freedom to modify the content Sora generated.
| jsheard wrote:
| Ha, here's an archive from yesterday for posterity.
|
| https://web.archive.org/web/20240513050023/https://openai
| .co...
|
| They also just added a link to the making-of video.
| Aeolun wrote:
| If you modified something because it got some attention
| on HN, at least have the guts to own up to it :/
| seoulmetro wrote:
| That's hilarious. Your comment clearly got seen by
| someone.
| hanspeter wrote:
| I believe it was clear that Air Head was an edited video.
|
| The intention wasn't to show "This is what Sora can generate
| from start to end" but rather "This is what a video
| production team can do with Sora instead of shooting their
| own raw footage."
|
| Maybe not so obvious to others, but for me it was clear from
| how the other demo videos looked.
| dyauspitr wrote:
| They're not showing people because that can get hairy quickly.
| btown wrote:
| A commercially available tool that can turn still images into
| depth-conscious panning shots is still tremendously impactful
| across all sorts of industries, especially tourism and
| hospitality. I'm really excited to see what this can do.
| pheatherlite wrote:
| Not just that, but anything with a subject in it felt uncanny
| valleyish... like that cowboy clip: the gait of the horse stood
| out as odd, and then I gave it some attention. It seems like a
| camel's gait. And the whole thing seems to be hovering, gliding
| rather than walking. Sora indeed seems to have an advantage.
| __float wrote:
| I thought a camel's gait is much closer to two legs moving
| almost at the same time. Granted, I don't see camels often.
| Out of curiosity can you explain that more?
| axblount wrote:
| I hate to be so cynical, but I'm dreading the inevitable flood of
| AI generated video spam.
|
| We really are about _this_ close to infinite jest. Imagine
| TikTok's algorithm with on-demand video generation to suit your
| exact tastes. It may erase the social aspect, but for many users
| I doubt that would matter too much. "Lurking" into oblivion.
| lordswork wrote:
| It's already here. There are communities forming around
| generating passive income from mass producing AI videos as
| tiktoks and shorts.
| axblount wrote:
| I saw one of those where a guy just made videos about
| increasingly elaborate AI generated cakes. You're right, I
| guess we're mostly there.
|
| But those still require some human input. I'm imagining a
| sort of genetic algorithm for video prompts, no human
| editing, input, or curation required.
| tikkun wrote:
| What's the subreddit?
| barbariangrunge wrote:
| YouTube's endgame is to not need content creators in the loop
| any more. The algorithm will just create everything
| esafak wrote:
| The endgame of that is that people will leave.
| darby_eight wrote:
| I'm somewhat surprised people still watch YouTube with the
| horrible recommendations and non-stop spam
| astrange wrote:
| YouTube actually has really good recommendations and
| comments these days.
|
| In fact I would say the comments are too good. They
| clearly have something ranking them for "niceness" but it
| makes them impossibly sentimental. Like I watched a bunch
| of videos about 70s rock recently and every single
| comment was about how someone's family member just died
| of cancer and how much they loved listening to it.
| belter wrote:
| Henry Ford II: Walter, how are you going to get those robots
| to pay your union dues?
|
| Walter Reuther: Henry, how are you going to get them to buy
| your cars?
| LZ_Khan wrote:
| I had the same thought regarding infinite jest recently
| rm_-rf_slash wrote:
| And somehow our exact tastes would also include influencer
| coded advertisements.
| beacon294 wrote:
| Can you explain this aspect of infinite jest to me without
| spoiling the book?
| _xander wrote:
| It's introduced early on (and not what the book is really
| about): distribution of a video that is so entertaining that
| any viewer is compelled to watch it until they die
| jprete wrote:
| At the bottom of the text blurb on the Veo page: "In the
| future, we'll also bring some of Veo's capabilities to YouTube
| Shorts and other products."
|
| So...you're not cynical, it's an explicit product goal.
| Invictus0 wrote:
| This basically already exists for porn
| redml wrote:
| I think of it as we're replacing the SEO spam we have right now
| with AI spam. At least now we can fight that with more AI.
| sph wrote:
| There's a naive statement to make.
| layer8 wrote:
| If it really suited my exact tastes, that would actually be
| great. But I don't see how we're anywhere close to that. And
| they won't target matching your exact taste. They will target
| the threshold where it's just barely interesting enough that
| people don't turn it off.
| fidotron wrote:
| If you were of a mind to give Google the benefit of the doubt you
| would have to think they are desperately trying not to
| overpromise and underdeliver, partly because that has been their
| track record to date. It's a very curious time to choose to make
| this switch though given their competition, and if it was
| motivated by the reception Bard received then it shows they
| didn't learn the right lessons from that mess at all.
| aaroninsf wrote:
| It's mildly interesting how many of the samples shown fail to
| fully conform to the prompts. Lots of specifics are missing.
|
| Kudos to Google for being entirely transparent about this,
| even if they don't exactly foreground it.
| willsmith72 wrote:
| All of this stuff I'll believe when it's ready for public release.
|
| 1. Safety measures lead to huge quality reductions.
|
| 2. The devil's in the details. You can make me 1 million videos
| which look 99% realistic, but they're useless. Consumers can spot
| it instantly, and it's a gigantic turn-off for any brand.
| aprilthird2021 wrote:
| There'll always be a market for cheap low-quality videos, and
| vice versa always a market for shockingly high quality videos.
| K. Asif's Mughal-e-Azham had enormous ticket sales and a huge
| budget spent on all sorts of stuff, like actual gold jewelry
| to make the actors feel that they were important despite the
| film being black and white.
|
| No matter how good AI gets, it will never be the highest
| budget. Hell, even technically more accurate quartz watches
| cannot compete price wise with mechanical masterpiece watches
| of lower accuracy
| barbariangrunge wrote:
| The company that controls online video is announcing a new tool,
| and ambitions to develop it further, to create videos without
| need for content creators. Using their videos to make a machine
| that will cut them out of the loop.
| infinitezest wrote:
| Makes the very long Acknowledgments section at the bottom extra
| rich.
| hipadev23 wrote:
| I've never had to click "Sign in" so many times in a row.
| flying_whale wrote:
| ...and then fill out an actual google form at the end, _after_
| you've already signed in, to be added to the waitlist :sigh:
| throwup238 wrote:
| ...and enter your email into the form again despite being
| logged into a Google account.
| mrkramer wrote:
| YouTube people: We need more UGC.
|
| DeepMind people: AI can do it.
| belval wrote:
| While it's cool that they chose to showcase full-resolution
| videos, they take so long to load I thought their videos were
| just a stuttery mess.
|
| Turns out if you open the video in a new tab the smoothness is
| much more impressive.
| robertlagrant wrote:
| Hold on to your papers!
| esafak wrote:
| I love the reference to Llama with the alpacas.
| typpo wrote:
| The amount of negativity in these comments is astounding.
| Congrats to the teams at Google on what they have built, and
| hoping for more competition and progress in this space.
| localfirst wrote:
| We have to take into account that this community and platform
| (a good chunk have stakes in YC and a lot to gain from secondary
| shares in OpenAI) is going to favor its own; Sam Altman is the
| golden boy of YC's founders, after all.
|
| So of course you are going to see snarky comments and straight-up
| denial of the competition. We saw that yesterday in the comments
| with the release of GPT-4o, in anticipation of the Gemini 2.0
| (basically GPT-5) release being announced today at Google I/O.
|
| I'm SORA to say Veo looks much more polished without jank.
|
| Big congratulations to Google and their excellent AI team for
| not editing their AI generated videos like SORA
| JumpCrisscross wrote:
| > _platform is going to favor its own; Sam Altman is the
| golden boy of YC's founders_
|
| I don't know if there is a sentiment analysis tool for HN,
| but I'm pretty sure it's been dead negative for Altman since
| at least Worldcoin.
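| For what it's worth, a rough sketch of how one could check
| (purely illustrative; uses the public Algolia HN Search API and
| the vaderSentiment package, with the query and the simple
| averaging entirely made up here):
|
|     # hn_sentiment.py - average sentiment of HN comments for a query
|     import requests
|     from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
|
|     query = "Sam Altman"  # hypothetical search term
|     url = "https://hn.algolia.com/api/v1/search"
|     params = {"query": query, "tags": "comment", "hitsPerPage": 100}
|     hits = requests.get(url, params=params, timeout=30).json()["hits"]
|
|     analyzer = SentimentIntensityAnalyzer()
|     scores = []
|     for h in hits:
|         text = h.get("comment_text") or ""
|         scores.append(analyzer.polarity_scores(text)["compound"])
|
|     mean = sum(scores) / max(len(scores), 1)
|     print(f"{len(scores)} comments, mean compound sentiment: {mean:+.3f}")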
| saalweachter wrote:
| A land of contrasts, etc.
| cosmotron wrote:
| Something in this vein was just posted here a few days
| back: https://news.ycombinator.com/item?id=40307519
| baobabKoodaa wrote:
| > We have to take into account that this community and platform
| (a good chunk have stakes in YC and a lot to gain from secondary
| shares in OpenAI)
|
| You have to be pretty deep inside your own little bubble to
| think that more than 0.001% of HN has "stakes in YC"
| or "secondary shares in OpenAI".
| hu3 wrote:
| It can be a vocal minority. Still vocal.
|
| I wouldn't discount it.
| dylan604 wrote:
| I have 0% stake in any YC, and I'm very vocal in my
| negativity against any of these "AI" anythings. All of
| these announcements are only slightly more than a toddler
| anxious to show the parental units a finger painting,
| looking to hang it on the fridge. Only instead of the
| fridge, they are hoping to get funding/investment,
| knowing that their product is _not_ a fully fledged
| anything. It's comical.
| mrbungie wrote:
| The amount of copium in this response is astounding.
|
| Yes, there is a noticeable negative response from HN towards
| Google, and there always has been, especially when it comes
| to their weird product management practices and
| incentives. Google hasn't launched any notable (and still
| surviving, Stadia being a sad example of this) consumer
| product or service in the last 10 years.
|
| But to suggest there is a Sam Altman / OpenAI bias is
| delusional. In most posts about them there is at least some
| kind of skepticism or criticism towards Altman (his
| participation in Worldcoin and his accelerationist stance
| towards AGI) or his companies (OpenAI not being really open).
|
| PS: I would say most people lurking here are just hackers (of
| many kinds, but still hackers), not investors with shady
| motives.
| localfirst wrote:
| My argument wasn't that there was a cabal of shady
| investors trying to influence perception here. Your
| observation is certainly valid that there is general disdain
| for Google, but specifically I'm calling out people who were
| blatantly telling lies, making outlandish claims, and
| attacking others who were simply pointing out that some of
| those people have financial motives (either being backed by
| YC or seeking to benefit from the work of others).
|
| None of this is surprising to me and shouldn't shock you.
| You are literally on a site called Y Combinator. Had this
| been another platform without ties to investments, or one
| not drawing from a crowd that actively seeks to enrich
| themselves through participation in a narrative, this
| wouldn't even be a thing.
|
| A large number of people who read my comment seem to agree,
| and this whole Worldcoin thing seems to me like just another
| distraction (we've already been through why that was shady,
| but we are talking about something different here).
| mrbungie wrote:
| Well, you have a point. I've always thought that Hacker
| News <> YCombinator, but maybe the truth is in the
| middle. At the very least, this is food for thought.
| astrange wrote:
| > Google hasn't launched any notable (and still surviving,
| Stadia being a sad example of this) consumer product or
| service in the last 10 years.
|
| Google Photos is less than 10 years old and I think a lot
| of people use it.
| betternet77 wrote:
| Yup, there's a significant anti-Google spin on HN and Twitter.
| For example, here's paulg claiming that Cruise handles
| driving around cyclists better than Waymo [1], which is
| obviously not true to anyone who's used both services.
|
| [1] https://twitter.com/paulg/status/1360341492850708481
| [deleted]
| rvz wrote:
| You have to give Google credit as they went against the OpenAI
| fanatics, Google doomsday crowd and some of the permanent
| critics (who won't disclose they invested in OpenAI's secondary
| share sale) that believe that Google can't keep up.
|
| In fact, they already did. What OpenAI announced was nothing
| that Google could not do already.
|
| The top comments around Sora vs Veo suggesting that Google is
| falling behind weren't even a point worth making in the first
| place, given that both are still unavailable to use; just
| typical HN nonsense.
| JumpCrisscross wrote:
| > _What OpenAI announced was nothing that Google could not do
| already_
|
| I don't think I've seen serious criticism of Google's
| abilities. Apple didn't release anything that Xerox or IBM
| couldn't do. The difference is they didn't.
|
| Google's problem has always been in product follow through.
| In this case, I fault them for having the sole action item be
| a buried waitlist request and two new brands (Veo and
| VideoFX) for one unreleased product.
| sangnoir wrote:
| > I don't think I've seen serious criticism of Google's
| abilities
|
| Serious or not, that criticism existed on HN - and still
| does. I've seen many comments claiming Google has "fallen
| behind" on AI, sometimes with the insinuation that Google
| won't ever catch up due to OpenAI's apparently insurmountable
| lead.
| aprilthird2021 wrote:
| I saw it here alone. A lot of people simply have no idea
| the level of research ability and skill Google, the
| inventor of the Transformer, has.
| KorematsuFredt wrote:
| > Google's problem has always been in product follow
| through.
|
| Google is large enough to not care about small
| opportunities. It ends up focusing on bigger opportunities
| that only it can execute well. Google's ability to shut
| down products that don't work is an insult to users but a
| very good corporate strategy, and they deserve kudos for
| that.
|
| Now, coming back to the "follow through". Google Search,
| Gmail, Chrome, Android, Photos, Drive, Cloud etc. all are
| excellent examples of Google's long term commitment to the
| product and constantly making things better and keeping
| them relevant for the market. Many companies like Yahoo!
| had a head start but could not keep up with their mail
| service.
|
| Sure it has shut down many small products but that is
| because they were unlikely to turn into bigger
| opportunities. They often integrated the best aspects of
| those products into their other well-established products:
| Google Trips became part of Search, and Google Shopping
| became part of Search.
| falcor84 wrote:
| > coming back to the "follow through". Google Search,
| Gmail, Chrome, Android, Photos, Drive, Cloud etc. all are
| excellent examples of Google's long term commitment
|
| Do you have any examples of something they launched in
| the last decade?
| astrange wrote:
| Photos was launched in the last decade.
| troupo wrote:
| > Google is large enough to not care about small
| opportunities. It ends up focusing on bigger
| opportunities
|
| that result in shittier products overall. For example,
| just a few months ago they cut 17 features from Google
| Assistant because they couldn't monetize them, sorry,
| because these were "small opportunities":
| https://techcrunch.com/2024/01/11/google-is-
| removing-17-unde...
|
| > all are excellent examples of Google's long term
| commitment to the product and constantly making things
| better and keeping them relevant for the market.
|
| And here's a long list of excellent examples of Google
| killing products right and left because small
| opportunities or something: https://killedbygoogle.com/
|
| And don't get me started on the whole
| Hangouts/Meet/Alo/Duo/whatever fiasco
|
| > Sure it has shut down many small products but that is
| because they were unlikely to turn into bigger
| opportunities.
|
| Translation: because they couldn't find ways to monetize
| the last cent out of them
|
| ---
|
| Edit: don't forget: The absolute vast majority of
| Google's money comes from selling ads. There's nothing
| else it is capable of doing at any significant scale. The
| only reason it doesn't "chase small opportunities" is
| because Google _doesn't know how_. There are a few
| smaller cash cows that it can keep chugging along, but
| they are dwarfed by the single driving force that mars
| everything at Google: the need to sell more and more ads
| and monetize the shit out of everything.
| localfirst wrote:
| Don't forget the Sora "AI generated" videos were edited,
| while Google's here were not.
|
| Where did Sora get all its training videos from again, and why
| won't the executives answer a simple yes/no question to "Did
| you scrape YouTube to train Sora?"
|
| Google attorneys want to know.
| scarmig wrote:
| Google does not care to start a war where every company has
| to form explicit legal agreements with every other company
| to scrape their data. Maybe if they got really desperate,
| but right now they have no reason to be.
| TwentyPosts wrote:
| > Don't forget SORA edited their "ai generated" videos
| while Google did not here.
|
| Wait, really? Could you point to proof for this? I'm very
| curious where this is coming from
| septic-liqueur wrote:
| I have no doubt about Google's capabilities in AI; my doubt
| lies in the productization part. I don't think they can
| produce something that will not be a complete mess.
| CSMastermind wrote:
| > In fact, they already did.
|
| In terms of software that's actually been released Google is
| still at best in third place when it comes to AI products.
|
| I don't care what they can demo, I care what they've shipped.
| So far the only thing they've shipped for Veo is a waitlist.
| Xenoamorphous wrote:
| It's tiring. Same thing happened to the GPT-4o announcement
| yesterday. Apparently because there's no unquestionable AGI 14
| months after GPT-4, everything sucks.
|
| I always found HN contrarian but as I say it's really tiring.
| I've no idea what the negative commenters are working on on a
| daily basis to be so dismissive of everybody else's work,
| including work that leaves 90% of the population in a
| combination of awe and fear. Also people sometimes forget that
| behind big corp names there are actual people. People who might
| be reading this thread.
| motoxpro wrote:
| Yeah, it's pretty unfortunate. Saying something sucks shows a
| lack of understanding that things are not static. I guess
| it's a sure way to be right, because there will always be
| progress and you can look back and say "See, I told you!"
| IggleSniggle wrote:
| Psh. Things are not static. Progress sucks now. Haven't you
| heard of enshitification? You can always look back and say,
| "see? I told you it would suck in the future!"
|
| ...why am I feeling the urge to point out that I am only
| making a joke here and not trying to make an actual
| counterpoint, even if one can be made...?
| piloto_ciego wrote:
| I commented on this elsewhere, but being a negative Nancy
| is really a winning strategy.
|
| If you're negative and you get it wrong, nobody cares; get it
| right and you look like a damn genius. Conversely, if you're
| positive and get it wrong, you look like an idiot, and if
| you're right you're praised for a good call once. The
| rational "game theory" choice is to predict calamity.
| motoxpro wrote:
| Yeah it's funny that optimism in the long term is optimal
| and pessimism in the short term is optimal.
| piloto_ciego wrote:
| Right, but I think people sometimes get the "what
| constitutes long term" factor a little bit wrong.
|
| I am still talking to a lot of people who say, "what can
| any of this AI stuff even do?" It's like, robots you
| could hold a conversation with effectively didn't exist 3
| years ago and you're already upset that it's not a money
| tree?
|
| I think that people's expectation horizons narrowing down
| may be the clearest evidence that we're in the
| singularity.
| mupuff1234 wrote:
| What's also tiring is that no one is allowed to have any
| critical thoughts because "it's tiring".
|
| From my own perspective the critique is usually a counter
| balance to extreme hype, so maybe let's just agree it's ok to
| have both types of comments, you know "checks and balances".
| piloto_ciego wrote:
| Being cynical is not a counterbalance though, it's just as
| low effort as the hype people.
| Workaccount2 wrote:
| AI is a pretty direct threat to software engineering. It's no
| surprise people are hostile towards it. Come 2030, how do you
| justify paying someone $175k/yr when a $20/mo app is 95% as
| good, and the other 5% can be done by someone making $40k/yr?
| astrange wrote:
| Productivity improvements are good for workers; you should
| ask yourself why the invention of the compiler didn't cause
| this to happen already.
|
| Or why the existence of the UK hasn't, since they have a
| lot of English speaking programmers paid in peanuts.
| piloto_ciego wrote:
| I think it's fear. Maybe not openly, but people are spooked at
| how fast stuff is happening, so shitting on progress is a
| natural reaction.
| Workaccount2 wrote:
| I have noticed this the most in SWE's who went from being
| code writers to "human intention decipherers". Ask an SWE
| in 2019 what they do and it was "Write novel and efficient
| code", ask one in 2024 and you get "Sit in meetings and talk
| to project managers in order to translate their poor
| communication to good code".
|
| Not saying the latter was never true, it's just interesting
| to see how people have reframed their work in the wake of
| breakneck AI progress.
| kmacdough wrote:
| I suspect it's also a general fatigue with the over-hype. It
| _is_ moving fast, but every step improvement has come with
| its own mini hype cycle. The demos are very curated and make
| the model look incredibly flexible and resilient. But when we
| test the product in the wild, it's constantly surprising which
| simple tasks it blunders on. It's natural to become a bit
| cynical and human to take that cynicism on the attack. Not
| saying it's right, just natural, in the same way that it's
| natural for the marketing teams to be as misleading as they
| can get away with. Both are annoying, but there's not much to
| do.
| piloto_ciego wrote:
| Cynicism is (arguably) the intellectually easy strategy.
|
| If you're cynical and you get it right that everything
| "sucks" you look like a genius, if you get it wrong there
| is no penalty.
|
| If you aren't cynical and you talk about how great
| something is going to be and it flops you look like an
| idiot. The social penalty is much higher.
| brikym wrote:
| Progress? There are loads of downsides the AI fans won't
| acknowledge. It diminishes human value/creativity and will be
| owned and controlled by the wealthiest people. It's not like
| the horse being replaced by the tractor. This time it's
| different: there is no place to move to but doing nothing on
| a UBI (best case). That same power also opens the door to
| dystopian levels of censorship and surveillance. I see more
| of the Black Mirror scenarios coming true rather than
| breakthroughs that benefit society. Nobody is denying that
| it's impressive but the question is more whether it's good
| overall. Unfortunately the toothpaste seems to be out of the
| tube.
| piloto_ciego wrote:
| >Progress? There are loads of downsides the AI fans won't
| acknowledge.
|
| I don't know if this is true.
|
| >It diminishes human value/creativity
|
| I don't see this at all, I see it as enhancing creativity
| and human value.
|
| >and will be owned and controlled by the wealthiest people.
|
| There are a lot of open source models being created, even
| if they are being released by Meta...
|
| >It's not like the horse being replaced by the tractor.
| This time it's different there is no place to move to but
| doing nothing on a UBI (best case).
|
| So, like, you wouldn't do anything if you could just chill
| on UBI all day? If anything I'd get more creative.
|
| > That same power also opens the door to dystopian levels
| of censorship and surveillance.
|
| I don't disagree with this at all, and I think we can fight
| back here and overcome it, but we have to lean into the
| tech to do that.
|
| > I see more of the Black Mirror scenarios coming true
| rather than breakthroughs that benefit society.
|
| I think this is basically wrong historically. Things are
| very seldom permanently dystopian if they're dystopian at
| all. Things are demonstrably better than they were 100
| years ago, and if you think back even a couple decades
| things are often a lot better.
|
| The medical applications alone will save a lot of lives.
|
| > Nobody is denying that it's impressive but the question
| is more whether it's good overall. Unfortunately the
| toothpaste seems to be out of the tube.
|
| There are going to be annoyances, but I would bet serious
| cash that things continue to get better.
| astrange wrote:
| > So, like, you wouldn't do anything if you could just
| chill on UBI all day? If anything I'd get more creative.
|
| There is a lot of empirical research on UBI and all of it
| shows that it has very little effect on employment either
| way. That is, nothing will change here.
|
| (This is probably because 1. positional goods exist 2.
| romantic prospects don't like it when you're unemployed
| even if you're rich.)
| sshnuke wrote:
| > It diminishes human value/creativity and will be owned
| and controlled by the wealthiest people
|
| "When you go to an art gallery, you are simply a tourist
| looking at the trophy cabinet of a few millionaires" -
| Banksy
| piloto_ciego wrote:
| Then... isn't AI generated art something that empowers
| the non-millionaires?
| jmkni wrote:
| Well for me it linked to a Google Form to join a waitlist lol,
| so I'm not exactly pumped
| jtolmar wrote:
| I think it's just hype fatigue.
|
| There's genuinely impressive progress being made, but there are
| also a lot of new models coming out promising way more than
| they can deliver. Even the Google AI announcements, which used
| to be carefully tailored to keep expectations low and show off
| their own limitations, now read more like marketing puff
| pieces.
|
| I'm sure a lot of the HN crowd likes to pretend we're all
| perfectly discerning arbiters of the tech future with our
| thumbs on the pulse of the times or whatever, but realistically
| nobody is going to sift through a mountain of announcements
| ranging from "states it's revolutionary, is marginal
| improvement" to "states it's revolutionary, is merely an
| impressive step" to "states it's revolutionary, is bullshit"
| without resorting to vibes-based analysis.
| throwup238 wrote:
| It's made all the worse by just being a giant waitlist. Sora
| is still nowhere to be seen three months later, GPT-4o's
| conversational features aren't widely rolled out yet, and
| Google's AI releases have been waitlist after waitlist after
| waitlist.
|
| Companies can either get people hyped or have never-ending
| georestricted waitlists; they can't have their cake and eat
| it too.
| indigodaddy wrote:
| Isn't there a lot of positive forward motion and
| fruitfulness in the current state of the open source
| llama-3 community?
| Dig1t wrote:
| Honestly, I just think that Google has burned its goodwill at
| this point. If you notice, most announcements by Apple are
| positively received here, and the same goes for OpenAI. But
| Google's "don't be evil" persona has faded, and they went
| through so much churn WRT products that I think most people
| just don't want to see them win.
| rmbyrro wrote:
| I hope they didn't mess this one up with ideologically driven
| nonsense, like they did with Gemini.
| clawoo wrote:
| > "This tool isn't available in your country yet"
|
| How did I know I would see this message before clicking "Sign up
| to try"?
| makestuff wrote:
| Are there any good blogs/videos that ELI5 how these video
| generation models even work?
| sys32768 wrote:
| I assume for consumers to use this, we must agree to have product
| placements inserted into our productions every 48 seconds.
| SoftTalker wrote:
| Vaguely unsettling that the thumbnail for first example prompt "A
| lone cowboy rides his horse across an open plain at beautiful
| sunset, soft light, warm colors" looks something like the
| pixelated vision of The Gunslinger android (Yul Brynner's
| character) from the 1973 version of Westworld.
|
| See 1:11 in this video
| https://www.youtube.com/watch?v=MAvid5fzWnY
|
| Incidentally that was one of the early uses of computer graphics
| in a movie, supposedly those short scenes took many hours to
| render and had to be done three times to achieve a colorized
| image.
| AceJohnny2 wrote:
| Can't say I see a visual similarity. In any case, "Cowboy
| silhouette in the sunset" is a pretty classic American visual.
|
| But the parallel you made between android Brynner's vision and
| the generated imagery is fun to consider!
| totaldude87 wrote:
| It's 2024 and AI is taking over, and yet to sign up for this it
| takes way more clicks and a Google Form entry (1)
|
| Sigh. I still have hopes for Veo though.
| aragonite wrote:
| With so much recent focus by OpenAI/Google on AI's visual
| capabilities, does anyone know when we might see an OCR product
| as good as Whisper for voice transcription? (Or has that already
| happened?) I had to convert some PDFs and MP3s to text recently
| and was struck by the vast difference in output quality.
| Whisper's transcription was near-flawless; all the OCR software
| I tried struggled with formatting, missed words, and made many
| errors.
| jazzyjackson wrote:
| You might enjoy this breakdown of the lengths one person went
| to in taking advantage of the iOS Vision API and creating a
| local web service for transcribing some very challenging memes:
|
| https://findthatmeme.com/blog/2023/01/08/image-stacks-and-ip...
|
| discussed on HN:
|
| https://news.ycombinator.com/item?id=34315782
| aragonite wrote:
| This is so good - thanks for sharing this!
| nunez wrote:
| This is a work of fucking art.
| thesandlord wrote:
| We use GPT-4o for data extraction from documents; it's really
| good. I published a small library that does a lot of the
| document conversion and output parsing:
| https://npmjs.com/package/llm-document-ocr
|
| For straight OCR, it does work really well, but at the end of
| the day it's still not 100%.
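|
| A minimal sketch of the general approach, calling a
| vision-capable model directly through the OpenAI Python SDK
| (this is not the linked library's API; the model name and
| prompt are illustrative):
|
|     # Sketch: OCR-style extraction with a vision-capable LLM.
|     # Assumes the `openai` package and an OPENAI_API_KEY in the
|     # environment; prompt and model name are illustrative.
|     import base64
|     from openai import OpenAI
|
|     client = OpenAI()
|
|     def ocr_page(image_path: str) -> str:
|         with open(image_path, "rb") as f:
|             b64 = base64.b64encode(f.read()).decode()
|         image_url = f"data:image/png;base64,{b64}"
|         prompt = ("Transcribe all text in this image as Markdown. "
|                   "Preserve headings and tables; do not add commentary.")
|         resp = client.chat.completions.create(
|             model="gpt-4o",
|             messages=[{
|                 "role": "user",
|                 "content": [
|                     {"type": "text", "text": prompt},
|                     {"type": "image_url", "image_url": {"url": image_url}},
|                 ],
|             }],
|         )
|         return resp.choices[0].message.content
|
|     print(ocr_page("scan_page_1.png"))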
| aragonite wrote:
| Thanks! I look forward to checking this out as soon as I get
| home.
| tauntz wrote:
| Uh.. First it tells me that I can't sign up because my country
| isn't supported (yay, EU), but I can sign up to be notified
| when it's actually available. Great, after I complete that
| form, I get an
| error that the form can't be submitted and I'm taken to
| https://aitestkitchen.withgoogle.com/tools/video-fx where I can
| only press the "Join our waitlist" button. This takes me to a
| Google Form, that doesn't have my country in the required country
| dropdown and has a hint that says: "Note: the dropdown only
| includes countries where ImageFX and MusicFX are publicly
| available.". Say what?
|
| Why does this have to be so confusing? Is the name "Veo" or
| "VideoFX"? Why is the waitlist for VideoFX telling me something
| about public availability of ImageFX and MusicFX? Why is
| everything US only, again? Sigh..
| pelorat wrote:
| We can blame the EU AI act and other regulations for that.
| benatkin wrote:
| Yo lo veo.
| wseqyrku wrote:
| Google puts more effort into the naming than the actual model,
| ngl.
| bpodgursky wrote:
| I think it's funny the demos don't have people in them after the
| Gemini fiasco. I wonder if they didn't have time to re-train the
| model to show representative ethnicities.
| thih9 wrote:
| Is there any non-slow-motion example?
|
| The cyberpunk video seems better in that aspect, but I wish there
| were more.
| gliched_robot wrote:
| This is far superior to Sora, there is no comparison.
| monkeeguy wrote:
| lol
| xnx wrote:
| 60 second example video:
| https://www.youtube.com/watch?v=diqmZs1aD1g
| candiddevmike wrote:
| For some reason this video reminds me of dreaming--details just
| kind of pop in and out and the entire thing seems very surreal
| and fractal.
| jprete wrote:
| Same impression here. The scene changes very abruptly from a
| sky view to following the car. The cars meld with the ground
| frequently, and I think I saw one car drive through another
| at one point.
| nixpulvis wrote:
| So... much... bloom. I like it, but still holy shit. I hate
| that I like it because I don't want this art form to be reduced
| by overuse. Sadly, it's too late.
|
| I'll just go back to living under a rock.
| londons_explore wrote:
| Looks like in places this has learned video compression
| artifacts...
| exodust wrote:
| Funny if true. Perhaps in some generated video it will
| suddenly interrupt the sequence with pretend unskippable ads
| for phone cases & VPNs.
| datashaman wrote:
| 1080p but it has pixelated artifacts...
| svag wrote:
| An interesting thing that Google does is watermark the
| AI-generated videos using the SynthID technology
| (https://deepmind.google/technologies/synthid/).
|
| It seems that SynthID is not only for AI-generated video but
| also for images, text and audio.
| bardak wrote:
| I would like a bit more convincing that the text watermark
| will not be noticeable. AI text already has issues with using
| certain words too frequently. Messing with the weights seems
| like it might make the issue worse.
| Tostino wrote:
| Not to mention, when does it get applied? If I am asking an
| LLM to transform some data from one format to another, I
| don't expect any changes other than the format.
| padolsey wrote:
| It seems really clever, especially the encoding of a signature
| into LLM token probability selections. I wonder if SynthID will
| trigger some standardization in the industry. I don't think
| there's much incentive to, though. Open-source gen AI will still
| exist. What does Google expect to occur? I guess they're just
| trying to present themselves as 'ethically pursuing AI'.
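|
| For anyone curious what "encoding a signature into token
| probability selections" can look like, here is a toy sketch in
| the style of published "green list" LLM watermarking schemes.
| SynthID's actual algorithm isn't public, so the hashing, bias
| and detection choices below are purely illustrative:
|
|     # Toy probability-based text watermark: seed a PRNG with the
|     # previous token, mark a "green" subset of the vocabulary, and
|     # nudge sampling toward it. Detection counts green hits.
|     # Vocab size, bias and green fraction are illustrative only.
|     import hashlib
|     import random
|
|     VOCAB = 50_000
|     GREEN_FRACTION = 0.5
|     BIAS = 2.0  # multiplicative boost for green-token probabilities
|
|     def green_set(prev_token: int) -> set:
|         digest = hashlib.sha256(str(prev_token).encode()).hexdigest()
|         rng = random.Random(int(digest, 16))
|         return set(rng.sample(range(VOCAB), int(VOCAB * GREEN_FRACTION)))
|
|     def sample_watermarked(probs, prev_token: int) -> int:
|         greens = green_set(prev_token)
|         boosted = [p * (BIAS if t in greens else 1.0)
|                    for t, p in enumerate(probs)]
|         return random.choices(range(VOCAB), weights=boosted)[0]
|
|     def green_hit_rate(tokens) -> float:
|         # ~0.5 for unwatermarked text, noticeably higher if watermarked
|         hits = sum(1 for prev, tok in zip(tokens, tokens[1:])
|                    if tok in green_set(prev))
|         return hits / max(len(tokens) - 1, 1)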
| s1k3s wrote:
| This looks really good for promo videos. All scenes in here are
| basically that.
| KorematsuFredt wrote:
| I think we should all take a pause and just appreciate the
| amazing work Google, OpenAI, MS and many others including those
| in academia have done. We do not know if Google or OpenAI or
| someone else is going to win the race but unlike many other
| races, this one makes all of humanity move faster. Set the
| negativity aside and appreciate the sweat and nights people have
| poured into making such things happen. The majority of these
| people are pretty ordinary folks working for a salary, so they
| can spend their time with their families.
| myaccountonhn wrote:
| The majority of the people building the AI are artists having
| their work stolen, or workers earning extremely low wages to
| label gory and CSAM data to the point where it hurts their
| mental health.
| ugh123 wrote:
| From a filmmaking standpoint I still don't think this is
| impactful.
|
| For that it needs a "director" to say: "turn the horse's head 90°
| the other way, trot 20 feet, and dismount the rider" and "give me
| additional camera angles" of the same scene. Otherwise this is
| mostly b-roll content.
|
| I'm sure this is coming.
| evantbyrne wrote:
| They claim it can accept an "input video and editing command"
| to produce a new video output. Also, "In addition, it supports
| masked editing, enabling changes to specific areas of the video
| when you add a mask area to your video and text prompt." Not
| sure if that specific example would work or not.
| qingcharles wrote:
| I can see using these video generators to create video
| storyboards. Especially if you can drop in a scribbled sketch
| and a prompt for each tile.
| ancientworldnow wrote:
| That sounds actively harmful. Often we want storyboards to
| be less specific so as not to have some non-artist decision
| maker ask why it doesn't look like the storyboard.
|
| And when we want it to match exactly in an animatic or
| whatever, it needs to be far more precise than this, matching
| real locations etc.
| sbarre wrote:
| I know you weren't implying this, but not every storyboard
| is for sharing with (or seeking approval from) decision
| makers.
|
| I could see this being really useful for exploring tone,
| movement, shot sequences or cut timing, etc..
|
| Right now you scrape together "kinda close enough" stock
| footage for this kind of exploration, and this could get
| you "much closer enough" footage..
| shermantanktop wrote:
| I think of it in terms of the anchoring bias. Imagine
| that your most important decisions are anchored for you
| by what a 10 year old kid heard and understood. Your
| ideas don't come to life without first being rendered as
| a terrible approximation that is convincing to others but
| deeply wrong to you, and now you get to react to that
| instead of going through your own method.
|
| So if it's an optional tool, great, but some people would
| be fine with it, some would not.
| sbarre wrote:
| Absolutely. Everyone's creative process is different (and
| valid).
| gregmac wrote:
| I hadn't thought about that in movie context before, but it
| totally makes sense.
|
| I've worked with other developers who want to build high-
| fidelity wireframes, sometimes in the actual UI framework,
| probably because they can (and it's "easy"). I always push
| back against that, in favor of using whiteboard or
| Sharpies. The low-fidelity brings better feedback and
| discussion: focused on layout and flow, not spacing and
| colors. Psychologically it also feels temporary, giving
| permission for others to suggest a completely different
| approach without thinking they're tossing out more than a
| few minutes of work.
|
| I think in the artistic context it extends further, too: if
| you show something too detailed it can anchor it in
| people's minds and stifle their creativity. Most people
| experience this in an ironically similar way: consider how
| you picture the characters of a book differently depending
| on if you watched the movie first or not.
| cpill wrote:
| I guess this will give birth to a new kind of filmmaking.
| Start with a rough sketch, generate 100 higher-quality
| versions with an image generator, select one to tweak, use
| that as input to a video generator which generates 10
| versions, choose one to refine, etc.
| sailfast wrote:
| For most things I view on the internet B-roll is great content,
| so I'm sure this will enable a new kind of storytelling via
| YouTube Shorts / Instagram, etc at minimum.
| Eji1700 wrote:
| There's also the whole "oh you have no actual
| model/rigging/lighting/set to manipulate" for detail work
| issue.
|
| That said, I personally think the solution will not be coming
| that soon, but at the same time, we'll be seeing a LOT more
| content that can be done using current tools, even if that
| means a (severe) dip in quality due to the cost it might
| save.
| SJC_Hacker wrote:
| This leads me to the question of why there hasn't been an
| effort to do this with 3D content (that I know of).
|
| Because camera angles/lighting/collision detection/etc. at
| that point would be almost trivial.
|
| I guess with the "2D only" approach that is based on actual,
| acquired video you get way more impressive shots.
|
| But the obvious application is for games. Content generation
| in the form of modeling and animation is actually one of the
| biggest cost centers for most studios these days.
| chacham15 wrote:
| I don't think "turn the horse's head 90°" is the right path
| forward. What I think is more likely and more useful is: here
| is a start keyframe and here is a stop keyframe (generated by
| text-to-image, using other things like ControlNet to control
| positioning etc.), and then having the AI generate the frames
| in between. Don't like the way it generated the in-between?
| Choose a keyframe, adjust it, and rerun with the segment
| before and segment after.
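|
| Roughly, the workflow would look like the sketch below, where
| generate_between(start, end, n) is a hypothetical model call
| returning n in-between frames; no such public API exists for
| Veo or Sora today:
|
|     # Sketch of keyframe-driven in-betweening: fill frames between
|     # two keyframes, then refine by pinning an edited middle
|     # keyframe and regenerating each half around it.
|     def fill_segment(generate_between, start_frame, end_frame, n_total):
|         middle = generate_between(start_frame, end_frame, n_total - 2)
|         return [start_frame, *middle, end_frame]
|
|     def refine_at(generate_between, frames, i, edited_frame):
|         # Replace frame i with an edited version, regenerate both halves.
|         left = fill_segment(generate_between, frames[0],
|                             edited_frame, i + 1)
|         right = fill_segment(generate_between, edited_frame,
|                              frames[-1], len(frames) - i)
|         return left + right[1:]  # drop the duplicated pinned keyframe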
| GenerocUsername wrote:
| This appeals to me because it feels auditable and
| controllable... But at the pace these things have been
| progressing the last 3 years, I could imagine the tech
| leapfrogging all conventional understanding real soon. Likely
| outputting Gaussian-splat-style outputs where the scene is
| separate from the camera and all pieces can be independently
| tweaked via a VR director's chair.
| 8note wrote:
| So a declarative keyframe of "the horses head is pointed
| forward" and a second one of "the horse is looking left"
|
| And let the robot tween?
|
| Vs an imperative for "tween this by turning the horse's head
| left"
| aetherson wrote:
| Yeah, I've made a lot of images, and it sure is amazing if all
| you're interested in is, like, "Any basically good image," but
| if you start needing something very particular, rather than
| "anything that is on a general topic and is aesthetically
| pleasing," it gets a lot harder.
|
| And there are a lot more degrees of freedom to get something
| wrong in film than in a single still image.
| teaearlgraycold wrote:
| Everything I've heard from professionals backs that up. Great
| for B roll. Great for stock footage. That's it.
| imachine1980_ wrote:
| Stock videos are indeed crucial, especially now that we can
| easily search for precisely what we need. Take, for instance,
| the scene at the end of 'Don't Look Up' featuring a Native
| American dance in Peru. The dancer's movements were captured
| from a stock video, and the comet falling was seamlessly
| edited in. Now imagine having near-infinite stock videos
| tailored to the situation.
| rzmmm wrote:
| Stock photographers are already having issues with piracy due
| to very powerful AI watermark-removal tools. And I suspect
| the companies are using these people's content to train
| these models too.
| gedy wrote:
| I think with AI content, we'd need to not treat it like we can
| expect fine-grained control. E.g. instead something like "dramatic
| scene of rider coming down path, and dismounting horse, then
| looking into distance", etc. (Or even less detail eventually
| once a cohesive story can be generated.)
| lofaszvanitt wrote:
| I can't wait to see what the big video camera makers are going
| to do with tech similar to this. Since Google clearly has zero
| idea what to do with this, and they lack the creativity, it's
| up to ARRI, Canon, Panasonic etc. to create their own solutions
| for this tech. I can't wait to see what Canon has up its
| sleeves with their new offerings that come in a few months.
| larodi wrote:
| Perhaps the only industry which immediately benefits from this
| is short ads, and perhaps TikTok. But even that is very
| dubious, as people seem to actually enjoy being the directors
| of their own thing themselves, not somebody else.
|
| Maybe this works for ads for a döner place or shisha bar in
| some developing country. I've seen generated images used for
| menus in such places.
|
| But I doubt serious filmmaking can be done this way. And if
| it can, it'd again be thanks to some smart concept on behalf
| of humans.
| kmacdough wrote:
| I wouldn't be so sure it's coming. NNs currently don't have
| the structures for long-term memory and development. These are
| almost certainly necessary for creating longer works with real
| purpose and meaning. It's possible we're on the cusp with some
| of the work to tame RNNs, but it's taken us years to really
| harness the power of transformers.
| thehappypm wrote:
| If you or I don't see the potential here, I think that just
| means someone more creative is going to do amazing things with
| it
| iamleppert wrote:
| Too little, too late. Google is a follower, not a leader. They
| need to stop trying and do more stock buybacks and strip the
| company to barebones, like Musk did with Twitter & Tesla.
| NegativeLatency wrote:
| Shoulda used youtube to host their video, it's all broken and
| pixelated for me
| m3kw9 wrote:
| Why is it always in slow motion? Is it hard to get the speed
| right?
| miohtama wrote:
| > Veo's cutting-edge latent diffusion transformers reduce the
| appearance of these inconsistencies, keeping characters, objects
| and styles in place, as they would in real life.
|
| How is this achieved? Is there temporal memory between frames?
| hackerlight wrote:
| Probably similar to Sora: a patchified vision transformer
| where you sample a 3D patch (the third dimension is time)
| instead of a 2D patch.
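|
| A minimal NumPy sketch of what a "3D patch" means in practice
| (patch sizes are illustrative; neither Sora's nor Veo's actual
| dimensions are public):
|
|     # Toy spatiotemporal patchification: cut a video tensor
|     # (T, H, W, C) into 3D patches of shape (pt, ph, pw) and
|     # flatten each patch into one token vector.
|     import numpy as np
|
|     def patchify_3d(video, pt=4, ph=16, pw=16):
|         T, H, W, C = video.shape
|         assert T % pt == 0 and H % ph == 0 and W % pw == 0
|         v = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
|         v = v.transpose(0, 2, 4, 1, 3, 5, 6)   # (nT, nH, nW, pt, ph, pw, C)
|         return v.reshape(-1, pt * ph * pw * C)  # one flat token per patch
|
|     video = np.random.rand(16, 128, 128, 3).astype(np.float32)
|     tokens = patchify_3d(video)
|     print(tokens.shape)  # (256, 3072): 256 spacetime tokens of dim 3072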
| toasted-subs wrote:
| I could say something but I'm glad to get the confirmation.
| shaunxcode wrote:
| truly removing the `id` from video.
| abledon wrote:
| music is lacking.... suno, udio, riffusion all blow this out of
| the water
| ijidak wrote:
| These will be remembered as the AI wars.
|
| Reminds me of the competition in tech in the late 80's early 90's
| between Microsoft and Borland, Microsoft and IBM, AMD and Intel,
| Word vs Wordperfect, etc.
|
| It's a two horse race between Google and OpenAI.
| animanoir wrote:
| Google is so finished... Unless they remove Mr. Pinchar...
| infinitezest wrote:
| > A fast-tracking shot through a bustling dystopian sprawl
|
| How apropos...
| nosmokewhereiam wrote:
| Made an album in 10 mins. Typically as a techno DJ I'd mix them
| together so they sound kinda bare right now.
|
| Here's my 10 minutes to 12:09 album debut:
|
| https://on.soundcloud.com/FAXkJrLrC2JjoAyu7
| TazeTSchnitzel wrote:
| Even as a very inexperienced musician I think I can say these
| are not very compelling examples? They sound like unfinished
| sketches that took a few minutes to make each, but with no
| overarching theme and weirdly low fidelity. An absolute
| beginner could make better things just by messing around with a
| groovebox.
| rlhf wrote:
| It seems so real, cool.
| salamo wrote:
| The first thing I will do when I get access to this is ask it to
| generate a realistic chess board. I have never gotten a decent
| looking chessboard from any image generator: they produce
| deformed pieces, the wrong number of squares, squares not in a
| proper checkerboard pattern, pieces placed in the wrong
| positions, the board oriented improperly (white on the
| right!), or an otherwise illegal position. It seems to be an
| "AI complete" problem.
| arcticbull wrote:
| Similarly the Veo example of the northern lights is a really
| interesting one. That's not what the northern lights look like
| to the naked eye - they're actually pretty grey. The really
| bright greens and even the reds really only come out when you
| take a photo of them with a camera. Of course the model
| couldn't know that because, well, it only gets trained on
| photos. Gets really existential - simulacra energy - maybe
| another good AI Turing test, for now.
| 22c wrote:
| I've only ever seen photos of the northern lights and I also
| didn't know that.
| sdenton4 wrote:
| That doesn't seem in any way useful, though... To use a very
| blunt analogy, are color blind people
| intelligent/sentient/whatever? Obviously, yes: differences in
| perceptual apparatus aren't useful indicators of
| intelligence.
| shermantanktop wrote:
| As a colorblind person...I could see the northern lights
| way better than all the full-color-vision people around me
| squinting at their phones.
|
| Wider bandwidth isn't always better.
| Ferret7446 wrote:
| > I could see the northern lights way better than all the
| full-color-vision people around me
|
| How would you know?
| squeaky-clean wrote:
| Quote the entire sentence, not just a portion of it.
| pmlarocque wrote:
| That's not true; they look grey when they aren't bright enough,
| but they can look green or red to the naked eye if they are
| bright. I have seen it myself and yes, I was disappointed to
| see only grey ones last week.
|
| see: https://theconversation.com/what-causes-the-different-
| colour...
| arcticbull wrote:
| > [Aurora] only appear to us in shades of gray because the
| light is too faint to be sensed by our color-detecting cone
| cells."
|
| > Thus, the human eye primarily views the Northern Lights
| in faint colors and shades of gray and white. DSLR camera
| sensors don't have that limitation. Couple that fact with
| the long exposure times and high ISO settings of modern
| cameras and it becomes clear that the camera sensor has a
| much higher dynamic range of vision in the dark than people
| do.
|
| https://www.space.com/23707-only-photos-reveal-aurora-
| true-c...
|
| This aligns with my experiences.
|
| The brightest ones I saw in Northern Canada I even saw
| hints of reds - but no real greens - until I looked at it
| through my phone, and it looked just like the simulated
| video.
|
| If I looked up and saw them the way they appear in the
| simulation, in real life, I'd run for a pair of leaded
| undies.
| Tronno wrote:
| I've seen it bright green with the naked eye. It
| definitely happens. That article is inaccurate.
| Kiro wrote:
| That is totally incorrect, which anyone who has seen real
| northern lights can attest to. I'm sorry that you haven't
| gotten the chance to experience it and now think all
| northern lights are that lackluster.
| kortilla wrote:
| This is such an arrogant pile of bullshit. I've seen very
| obvious colors on many different occasions in the
| northern part of the lower 48, up in southern Canada, and
| in Alaska.
| Maxion wrote:
| Greens are the more common colors; reds and blues occur
| in higher-energy solar storms.
|
| And yes, they can be as green to the naked eye as in that AI
| video. I've seen, with my own eyes, aurora shows that fill the
| entire night sky from horizon to horizon, way more impressive
| than that AI video.
| paxys wrote:
| That's not true at all. I have seen northern lights with my
| own eyes that were more neon green and bright purple than any
| mainstream photo.
| cryptoz wrote:
| There's a middle ground here. I saw the northern lights
| with my own eyes just days ago and it was mostly grey. I
| saw some color. But when I took a photo with a phone
| camera, the color absolutely _popped_. So it may be that
| you've seen more color than any photo, but the average
| viewer in Seattle this past weekend saw something greyer with
| their eyes and huge color in their phone photos.
|
| (Edit: it was still _super-cool_ even if grey-ish, and
| there were absolutely beautiful colors in there if you could
| find your way out of the direct city lights)
| goostavos wrote:
| The hubris of suggesting that your single experience of
| vaguely seeing the northern lights one time in Seattle
| has now led to a deep understanding of their true "color"
| and that the other person (perhaps all other people?)
| must be fooling themselves is... part of what makes HN so
| delightful to read.
|
| I've also seen the northern lights with my own eyes. Way
| up in the arctic circle in Sweden. Their color changes
| along with activity. Grey looking sometimes? Sure. But
| also colors that are so vivid that it feels like it
| envelopes your body.
| stavros wrote:
| They did say "the average viewer in Seattle this past
| weekend", not "all other viewers".
|
| Then again, the average viewer in Seattle this past
| weekend is hardly representative of what the northern
| lights look like.
| lpapez wrote:
| > The hubris of suggesting that your single experience of
| vaguely seeing the northern lights one time in Seattle
| has now led to a deep understanding of their true "color"
| and that the other person (perhaps all other people?)
| must be fooling themselves is... part of what makes HN so
| delightful to read.
|
| The H in HN stands for Hubris.
| freedomben wrote:
| The person they were responding to was saying that the
| people reporting grays were wrong, and that they had seen
| it and it was colorful. If anything, you should be
| accusing that person of hubris, not GP. GP's whole point
| was that it can differ in different situations. They
| used the example of Seattle to show that the person they
| were responding to is not correct that it is never gray
| and dull.
| mitthrowaway2 wrote:
| The human retina effectively combines a color sensor with
| a monochrome sensor. The monochrome channel is more
| light-sensitive. When the lights are dim, we'll dilate
| our pupils, but there's only so much we can do to
| increase exposure. So in dim light we see mostly in
| grayscale, even if that light is strongly colored in
| spectral terms.
|
| Phone cameras have a Bayer filter which means they _only_
| have RGB color-sensing. The Bayer filter cuts out some
| incoming light and dims the received image, compared with
| what a monochrome camera would see. But that's how you
| get color photos.
|
| To compensate for a lack of light, the phone boosts the
| gain and exposure time until it gets enough signal to
| make an image. When it eventually does get an image, it's
| getting a _color_ image. This comes at the cost of some
| noise and motion-blur, but it's that or no image at all.
|
| If phone cameras had a mix of RGB and monochrome sensors
| like the human eye does, low-light aurora photos might
| end up closer to matching our own perception.
| laserbeam wrote:
| For decades, game engines have been working on realistic
| rendering. Bumping quality here and there.
|
| The gold standard for rendering has always been cameras.
| It's always photo-realistic rendering. Maybe this won't be
| true for VR, but so far most of the effort is to be as good
| as video, not as good as the human eye.
|
| Any sort of video generation AI is likely to have the same
| goal. Be as good as top notch cameras, not as eyes.
| garyrob wrote:
| Even in NY State, Hudson River Valley, I've seen them with
| real color. They're different each time.
| blhack wrote:
| Have you ever seen the Northern Lights with your eyes? If so
| I'm curious where you saw them.
|
| I echo what some other posters here have said: they're
| certainly not gray.
| porphyra wrote:
| Human eyes are basically black and white in low light since
| rod cells can't detect color. But when the northern lights
| are bright enough you can definitely see the colors.
|
| The fact that some things are too dark to be seen by humans
| but can be captured accurately with cameras doesn't mean that
| the camera, or the AI, is "making things up" or whatever.
|
| Finally, nobody wants to see a video or a photo of a dark,
| gray, and barely visible aurora.
| exodust wrote:
| > _nobody wants to see a video or a photo of a dark, gray,
| and barely visible aurora_
|
| Except those who want to see an accurate representation of
| what it looks like to the naked eye.
| stkhlm wrote:
| Living in northern Sweden I see the northern lights
| multiple times a year. I have never seen them pale or
| otherwise not colorful. Green and reds always. That is to
| my naked eye. Photographs do look more saturated, but the
| difference isn't as large as this comment thread makes it
| out to be.
| shwaj wrote:
| That mirrors my experience from when I used to live in
| northern Canada
| jabits wrote:
| Even in Upper Michigan near Lake Superior we sometimes
| had stunning, colorful Northern Lights. Sometimes it seemed
| like they were flying overhead, within your grasp.
| DaSHacka wrote:
| Most definitely, it's quite common to find people hanging
| around outside up towards Calumet whenever there's a
| night with a high KP Index.
|
| I highly recommend checking them out if you're nearby;
| the recent auroras have been quite astonishing.
| peanut_merchant wrote:
| Even in Northern Scotland (further south than northern
| Sweden) this is the case. The latest aurora showing was
| vividly colourful to the naked eye.
| fzzzy wrote:
| In the upper peninsula of michigan I have only seen grey.
| Jensson wrote:
| That is the same latitude as Paris though, not very north
| at all.
| exodust wrote:
| I'm in Australia where the southern lights are known to
| be not as intense as northern lights. That's where my
| remark comes from. Those who have never seen the aurora
| with their own eyes may like to see an accurate photo. A
| rare find among the collective celebration of saturation.
| freedomben wrote:
| Exactly. I went through major gaslighting trying to see
| the aurora. I just wasn't sure whether I was actually
| seeing it, because it always looked so different from the
| photos. It is absolutely maddening trying to find a
| realistic photo of what it looks like to the naked eye,
| so that you can know whether what you are seeing is actually
| the aurora and not just clouds.
| Kiro wrote:
| Shouldn't the model reflect how it looks on video rather than
| our naked eye?
| hoyd wrote:
| I can see what you mean, and the video is somewhat not
| what it would be like in real life. I have lived in northern
| Norway most of my life and watched auroras a lot. It
| certainly looks green and pink most of the time. Fainter, it
| would perhaps appear grey, I guess? Red, when viewed from a
| more southern viewpoint.
|
| I work at Andoya Space, where perhaps most of the space
| research on the aurora has been done by sending scientific
| rockets into space for the last 60 years.
| simonjgreen wrote:
| To be fair, the prompt isn't asking for a realistic
| interpretation; it's asking for a timelapse. What it's
| generated is absolutely what most timelapses look like.
|
| > Prompt: Timelapse of the northern lights dancing across the
| Arctic sky, stars twinkling, snow-covered landscape
| darkstar_16 wrote:
| Northern lights are actually pretty colourful, even to the
| naked eye. I've never seen them pale or b/w
| poulpy123 wrote:
| That's a bad example, since the only images of the aurora
| borealis are brightly colored. What I expect of an image
| generator is to output what is expected of it.
| skypanther wrote:
| What struck me about the northern lights video was that it
| showed the Milky Way crossing the sky behind the northern
| lights. That bright part of the Milky Way is visible in the
| southern sky, but the aurora hugging the horizon like that
| indicates the viewer is looking north. (Swap directions for
| the southern hemisphere and the aurora australis.)
| sdenton4 wrote:
| This strikes me as equally "AI complete" as drawing hands,
| which is now essentially a solved problem... No one test is
| sufficient, because you can add enough training data to address
| it.
| salamo wrote:
| Yeah "AI complete" is a bit tongue-in-cheek but it is a
| fairly spectacular failure mode of every model I've tried.
| smusamashah wrote:
| Ideogram and dalle do hands pretty well
| swyx wrote:
| I've been using "agi-hard" (https://latent.space/p/agi-hard)
| as a term
|
| because completeness isn't really what we are going for
| dongping wrote:
| Not sure about better models, but DALL-E3 still seems to be
| having problems with hands:
|
| https://www.reddit.com/r/dalle2/comments/1afhemf/is_it_possi.
| ..
|
| https://www.reddit.com/r/dalle2/comments/1cdks71/a_hand_with.
| ..
| sabellito wrote:
| Per usual, the top comment on anything AI-related is snark on
| "it can't do [random specific thing] well yet".
| kmacdough wrote:
| Tiring, but so is the relentless over-marketing. Each new
| demo implies new use cases and flexible performance. But the
| reality is they're very brittle and blunder _most_ seemingly
| simple tasks. I would personally love an ongoing breakdown of
| the key weaknesses. I often wonder "can it X?" The answer is
| almost always "almost, but not a useful almost".
| mikeocool wrote:
| Ha, wow, I'd never seen this one before. The failures are
| pretty great. Even repeatedly trying to correct ChatGPT/Dall-e
| with the proper number of squares and pieces, it somehow makes
| it worse.
|
| This is what dall-e came up with after trying to correct many
| previous iterations: https://imgur.com/Ss4TwNC
| perbu wrote:
| Most generative AI will struggle when given a task that
| requires something exact. They're probably pretty
| good at making something "chessish".
| efitz wrote:
| I'm surprised that the cowboy is not actually an Asian woman.
| mrcwinn wrote:
| OpenAI has the model advantage.
|
| Google and Apple have the ecosystem advantage.
|
| Apple in particular has the deeper stack integration advantage.
|
| Both Apple and Google have a somewhat poor software innovation
| reputation.
|
| How does it all net out? I suspect ecosystem play wins in this
| case because they can personalize more deeply.
| lowkey wrote:
| Google has a deep addiction to AdWords revenue, which makes for
| a significant disadvantage. No matter how good their technology,
| they will struggle internally with deploying it at scale
| because that would risk their cash cow. Innovator's dilemma.
| frankacter wrote:
| Google Cloud and cloud services generated almost $9.57
| billion. That's up 28% from the prior year:
|
| https://www.crn.com/news/networking/2024/google-cloud-
| posts-...
|
| They are embedding their models not only widely across their
| platform's suite of internal products and devices, but also
| via API for third-party development.
|
| Those are all free from any perceived golden handcuffs that
| AdWords would impose.
| damsalor wrote:
| Yea, well. I still think there is a conflict of interest if
| you sell propaganda
| lowkey wrote:
| As of 2020, AdWords represented over 80% of all Google
| revenue [1] while in 2021 7% of Google's revenue came from
| cloud [2].
|
| [1] https://www.cnbc.com/2021/05/18/how-does-google-make-
| money-a...?
|
| [2] https://aag-it.com/the-latest-cloud-computing-
| statistics/?t
| miki123211 wrote:
| Google and Apple also have an "API access" advantage. It is
| similar to the ecosystem advantage but goes beyond it; Google
| and Apple restrict third-party app makers from access to
| crucial APIs like receiving and reading texts or interacting
| with onscreen content from other apps. I think that may turn
| out to be the most important advantage of them all. This should
| be a far bigger concern for antitrust regulators than petty
| squabbles over in-app purchases. Spotify and Netflix are
| possible (if slightly inconvenient) to use on iOS; a fully-
| featured AI assistant coming from somebody who isn't Apple is
| not.
|
| Google (and to a lesser extent also Microsoft and Meta) also
| have a data advantage, they've been building search engines for
| years, and presumably have a lot more in-house expertise on
| crawling the web and filtering the scraped content. Google can
| also require websites which wish to appear in Google search to
| also consent to appearing in their LLM datasets. That decision
| would even make sense from a technical perspective, it's easier
| and cheaper to scrape once and maintain one dataset than to
| have two separate scrapers for different purposes.
|
| Then there's the bias problem, all of the major AI companies
| (except for Mistral) are based in California and have mostly
| left-leaning employees, some of them quite radical and many of
| them very passionate about identity politics. That worldview is
| inconsistent with half of all Americans and the large
| majority of people in other countries. This particularly
| applies to the identity politics part, which just isn't a
| concern outside of the English-speaking world. That might also
| have some impact on which AI companies people choose, although
| I suspect far less so than the previous two points.
| mirekrusin wrote:
| Not mentioning Meta, the good guy now, is scandalous.
|
| X is not going to sit quietly as well.
|
| There is also the rest of us.
| riffraff wrote:
| X is tiny compared to Apple/Meta/Google, both in engineering
| size and in "fingerprint" in people's life.
|
| Also engineering wise, currently every tweet is followed by a
| reply "my nudes in profile" and X seems unable to detect it
| as trivial spam, I doubt they have the chops to compete in
| this arena, especially after the mass layoffs they
| experienced.
| mirekrusin wrote:
| By X I mean one guy with big pockets who won't sit quietly -
| I wouldn't underestimate him.
| xNeil wrote:
| >Google and Apple have a somewhat poor software innovation
| reputation.
|
| I'm assuming you mean reputation as in general opinion among
| developers? Because Google's probably been the most innovative
| company of the 21st century so far.
| bugbuddy wrote:
| Yes, I miss Stadia so much. It was the most innovative
| streaming platform I had ever used. I wished I could still
| use it. Please, Google, bring Stadia back.
| teaearlgraycold wrote:
| They're renting out the tech to 3rd parties
| hwbunny wrote:
| ahem...zzzzzzzz
| Octokiddie wrote:
| Oddly enough, I predict the final destination for this train will
| be for moving images to fade into the background. Everything will
| have a dazzling sameness to it. It's not unlike the weird place
| that action movies and pop music have arrived. What would have
| been considered unbelievable a short time ago has become bland.
| It's probably more than just novelty that's driving the comeback
| of vinyl.
| rjh29 wrote:
| Even this site just did not impress me. I feel like it's all
| stuff I could easily imagine myself. True creativity is someone
| with a unique mind creating something you would never have
| thought of.
| damsalor wrote:
| Get a life
| jmathai wrote:
| It's a lot more than novelty. It's dedicating the attention
| span needed to listen to an album track by track without
| skipping to another song or another artist. If that sounds
| dumb, give it time and you'll get there also.
|
| It's not just technology though. Globalization has added so
| many layers between us and the objects we interact with.
|
| I think Etsy was a bit ahead of their time. It's no longer a
| marketplace for handcrafted goods - it got overrun by mass
| produced goods masquerading as something artisan. I think the
| trend is continuing and in 5-10 years we'll be tired of cheap
| and plentiful goods.
| hwbunny wrote:
| Yeah, but if you bring up a generation or two on this trash,
| they will get used to it and think this will be the norm and
| gonna enjoy it like pigs at the troughs.
| mFixman wrote:
| AI generated images and video are not competing against actual
| quality work with money put into it. They are competing against
| the quick Photoshop or Adobe After Effects work done by
| hobbyists and people learning how to work in the creative arts.
|
| I never heard HN claiming that Copilot will replace
| programmers. Why do so many people believe generative AI will
| replace artists?
| A4ET8a8uTh0 wrote:
| I was hoping to see more.. I logged in and was greeted by a
| waiting list for videos. Since I was disappointed already, I
| figured I might as well spend some time on other, hopefully
| usable, features. So I moved to pictures.
|
| First, a randomly selected 'feeling lucky' prompt got rejected
| because it did not meet some criteria, and a pop-up helpfully
| listed an FAQ to explain to me how I should be more sensitive
| to the program. I found it amusing.
|
| Then I played with a couple of images, but it was nothing really
| exciting one way or another.
|
| I guess you can color me disappointed overall. And no, I don't
| consider videos on repeat sufficient.
| TheAceOfHearts wrote:
| ImageFX fails at both of my tests:
|
| 1. Generating an image of "a group of catgirls activating a
| summoning circle". Anything related to catgirls tends to get
| tagged as sexual or NSFW so it's censored. Unsurprising.
|
| 2. The lamb described in Book of Revelation. Asking for it
| directly or pasting in the passage where the lamb is described
| both fail to generate any images. Normally this fails because
| there's not much art of the lamb from Book of Revelation from
| which the model can steal. If I gave the worst of artists a
| description of this, they'd be able to come up with _something_
| even if it 's not great.
|
| Overall, a very disappointing release. It's surprising that
| despite having effectively infinite money this is the best that
| Google is able to ship at the moment.
| SomaticPirate wrote:
| I think this comment is peak Hackernews... dripping with
| sarcasm and minimizing a significant engineering accomplishment
| TheAceOfHearts wrote:
| There's nothing sarcastic about my comment. It highlights key
| limitations of the system with clear examples. Considering
| the number of world-class engineers and effectively infinite
| resources available to Google it's just a disappointing
| release. Both examples are things that I care about and which
| other people aren't discussing, so I think adding my voice to
| the conversation is a net positive.
| runeks wrote:
| Didn't the model ever fail to generate realistic-looking content?
|
| If I didn't know better I'd think you just cherry-picked the
| prompts with the best-looking results.
| carschno wrote:
| What you see there is a product, not the scientific
| contribution behind it. Consequently, you see marketing
| material, not a scientific evaluation.
| tsurba wrote:
| Unfortunately the majority of scientific papers for, e.g.,
| image generation have also had completely cherry-picked
| examples for a long time now.
| yoyopa wrote:
| Stop with the ridiculous names; just use code numbers like BMW.
| sanjayk0508 wrote:
| It's a direct competitor to Sora.
| ArchitectAnon wrote:
| I think the thing that most perturbs me about AI is that it takes
| jobs that involve manipulating colours, light, shade and space
| directly and turns them into essay writing exercises. As a
| dyslexic I fucking hate writing essays. 40% of architects are
| dyslexic. I wouldn't be surprised if that was similar or higher
| in other creative industries such as filmmaking and illustration.
| Coincidentally, 40% of the prison population is also dyslexic;
| I wonder if that's where all the spare creatives who are
| terrible at describing things with words will end up in 20
| years' time.
| aavshr wrote:
| I would imagine and hope for interfaces to exist where the
| natural language prompt is the initial seed and then you'd
| still be able to manipulate visual elements through other ways.
| Art9681 wrote:
| This is the case today. You won't get a "perfect" image
| without heavy post-processing, even if that post-processing is
| AI-enhanced. ComfyUI is the new Photoshop, and although it's
| not an easy app to understand, once it "clicks" it's the most
| amazing piece of software to come out of the open-source oven
| in a long time.
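| A minimal sketch of that kind of refinement pass, assuming the
| Hugging Face diffusers library and a GPU rather than ComfyUI
| itself: a low-strength img2img pass that keeps the composition
| and only polishes details.
|
|     import torch
|     from diffusers import StableDiffusionImg2ImgPipeline
|     from PIL import Image
|
|     # Load a Stable Diffusion img2img pipeline (model choice is
|     # illustrative, not prescriptive).
|     pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
|         "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
|     ).to("cuda")
|
|     init = Image.open("generated.png").convert("RGB")
|
|     # Low strength keeps the original layout and only cleans up
|     # details; higher values rework more of the image.
|     refined = pipe(
|         prompt="sharp focus, natural skin tones, clean edges",
|         image=init,
|         strength=0.3,
|         guidance_scale=7.5,
|     ).images[0]
|     refined.save("refined.png")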
| fzzzy wrote:
| You can speak instead if you wish. Speech-to-text is available
| for all operating systems.
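| For instance, a minimal sketch using the third-party Python
| SpeechRecognition package (which needs PyAudio for microphone
| input and here uses Google's free web speech backend):
|
|     import speech_recognition as sr
|
|     recognizer = sr.Recognizer()
|     with sr.Microphone() as source:
|         # Calibrate for background noise, then capture one
|         # utterance from the microphone.
|         recognizer.adjust_for_ambient_noise(source)
|         print("Speak your prompt...")
|         audio = recognizer.listen(source)
|
|     try:
|         prompt = recognizer.recognize_google(audio)
|         print("Transcribed prompt:", prompt)
|     except sr.UnknownValueError:
|         print("Could not understand the audio")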
| cy6erlion wrote:
| Speaking has sound but that is still just words with the same
| logic structure. "Colours, light, shade and space" have an
| entirely different logic.
| fzzzy wrote:
| Very interesting. Thank you for the perspective, it is
| extremely illuminating.
|
| What is a user interface which can move from color, light,
| shade, and space to images or text? Could there be an
| architecture that takes blueprints and produces text or
| images?
| chromanoid wrote:
| I guess in the near future prompts can be replaced by a live
| editing conversation with the AI, like talking to a phantom
| draughtsman or a camera operator / movie team. The AI will
| adjust while you talk to it and can also ask questions.
|
| ChatGPT already allows this workflow to some extent. You should
| try it out. I just talked to ChatGPT on my phone to test it. I
| think I will not go back to text for these purposes. It's much
| more creative to just say what you don't like about a picture.
|
| If your speech is also affected, rough sketches and other
| interfaces are (or will be) also available (see
| https://openart.ai/apps/sketch-to-image). What kind of
| expression do you prefer?
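| As a rough sketch of that "say what you don't like" loop,
| assuming the OpenAI Python SDK and DALL-E 3 rather than
| ChatGPT's own voice mode, each round of spoken feedback can
| simply be folded back into the prompt:
|
|     from openai import OpenAI
|
|     client = OpenAI()
|
|     def refine(base_prompt: str, feedback: list[str]) -> str:
|         """Regenerate the image with all accumulated feedback."""
|         prompt = base_prompt + ". " + ". ".join(feedback)
|         result = client.images.generate(
|             model="dall-e-3", prompt=prompt, n=1, size="1024x1024"
|         )
|         return result.data[0].url
|
|     notes = []
|     notes.append("warmer lighting")                    # round 1
|     notes.append("remove the car in the background")   # round 2
|     print(refine("a quiet street at dusk", notes))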
| canes123456 wrote:
| It seems exceedingly clear to me that the primary interface
| for LLMs will be voice.
| cainxinth wrote:
| Terence McKenna predicted this:
|
| "The engineers of the future will be poets."
| gnobbler wrote:
| You're entitled to your opinion, but this will open up a world
| of possibilities to people who couldn't work in these fields
| previously because of disabilities other than dyslexia.
| Intelligent people without the use of their hands shouldn't
| lose out because incumbents don't want to share their lane.
| alt227 wrote:
| So, the fall of the skilled professional and the rise of
| anybody who knows how to write prompts?
| Jensson wrote:
| The AI we have today has very little to do with writing
| prompts: you still need to understand, correct, glue together,
| and edit the results. That is most of the work, so you still
| need skilled professionals.
| DeathArrow wrote:
| >As a dyslexic I fucking hate writing essays
|
| You can feed the AI an image and ask it to describe it. Kind
| of the inverse process.
| seanw265 wrote:
| Your claim that 40% of architects are dyslexic piqued my
| curiosity. I wonder if this would have an impact on the
| success of tools like ChatGPT in the architecture industry.
|
| Do you have a source for this stat? I can't seem to find
| anything to support it.
| neverokay wrote:
| I really just need to make some porn with this stuff already,
| and I feel like we're all tiptoeing around this key feature.
|
| Censored models are not going to work and we need someone to
| charge for an explicit model already that we can trust.
| ranyume wrote:
| Noo! Think about the children!
|
| (this post is sarcastic)
| neverokay wrote:
| If they cared about the kids, they would get out ahead of this
| before it spreads like wildfire.
| LZ_Khan wrote:
| Oh, there are a lot of AI-generated porn clips floating around
| the internet.
| Dowwie wrote:
| The Alpine Lake example is gorgeous
| nbzso wrote:
| How many billions, and how many tons of water, are wasted on
| this abomination and copyright theft?
| solatic wrote:
| > It's critical to bring technologies like Veo to the world
| responsibly. Videos created by Veo are watermarked using SynthID,
| our cutting-edge tool for watermarking and identifying AI-
| generated content
|
| And we're supposed to believe that this is resilient against
| prompt injection?
|
| How do you prevent state actors from creating "proof" that their
| enemies engaged in acts of war, and they are only engaging in
| "self-defense"?
| dmix wrote:
| Nation states can run their own models, if not now then very
| soon. This isn't something you're going to control via
| AI-safety woo woo.
| datarez wrote:
| Google is dancing with OpenAI
___________________________________________________________________
(page generated 2024-05-15 23:01 UTC)