[HN Gopher] Sora is here
___________________________________________________________________
Sora is here
Author : toomuchtodo
Score : 594 points
Date : 2024-12-09 18:02 UTC (4 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| toomuchtodo wrote:
| From "12 Days of OpenAI: Day 3"
|
| https://www.youtube.com/watch?v=2jKVx2vyZOY (live as of this
| comment)
| bbor wrote:
| Over now, and pretty short/light on info AFAICT. That said,
| knowing what we know now about Altman made me physically unable
| to watch while he engages in sustained eye contact with the
| camera, so maybe I missed something while skimming! On the
| upside, I'm so glad we have three billionaires cultivating
| three different cinema-supervillain vibes (Musk, Altman, &
| Zuckerberg). Much fresher than the usual "oil baron"
| aesthetic that we know from the Gilded Age.
| optimalsolver wrote:
| Wish they'd followed their previous MO of releasing stuff with no
| warning or buildup.
|
| Results won't match the hype.
| cooper_ganglia wrote:
| I feel like announcing a new product in the same vein as
| your main product, as an established company, is almost
| always a bad idea. If you're going to improve your product,
| don't announce the improvements 6-12 months ahead of time
| and grow the hype to unmanageable levels; just announce a
| great product and tell people it's available starting today.
| transformi wrote:
| Not impressive compared to the open-source video models out
| there. I anticipated some physics/VR capabilities, but it's
| basically just a marketing push to "stay in the game"...
| Geee wrote:
| What's the best open source video model right now?
| minimaxir wrote:
| Hunyuan (https://replicate.com/tencent/hunyuan-video ,
| $0.70/video) is the best but somewhat expensive. LTX
| (https://replicate.com/fofr/ltx-video , $0.10) is
| cheaper/faster but less capable.
|
| Both are permissively licensed.
| treesciencebot wrote:
| Hunyuan at other providers like fal.ai is cheaper than Sora
| for the same resolution (720p, 5 seconds: ~15 videos for
| $20 with Sora vs. almost 50 videos at fal). It is slower
| than Sora (~3 minutes for a 720p video) but faster than
| replicate's Hunyuan (by 6-7x for the same settings).
|
| https://fal.ai/models/fal-ai/hunyuan-video
| cooper_ganglia wrote:
| Hunyuan is a recent one that has looked pretty good.
| bbor wrote:
| I... can you explain, or point to some competitors...? To me
| this looks _leagues_ ahead of everything else. But maybe I'm
| behind the game?
|
| AFAIK based on HuggingFace trending[1], the competitors are:
|
| - bytedance/animatediff-lightning:
| https://arxiv.org/pdf/2403.12706 (2.7M downloads in the past
| 30d, released in March)
|
| - genmo/mochi-1-preview: https://github-production-user-
| asset-6210df.s3.amazonaws.com... (21k downloads, released in
| October)
|
| - thudm/cogvideox-5b: https://huggingface.co/THUDM/CogVideoX-5b
| (128k downloads, released in August)
|
| Is there a better place to go? I'm very much not plugged into
| this part of LLMs, partially because it's just so damn
| spooky...
|
| EDIT: I now see the reply above referencing Hunyuan, which I
| didn't even know was its own model. Fair enough! I guess, like
| always, we'll just need to wait for release so people can run
| their own human-preference tests to definitively say which is
| better. Hunyuan does indeed seem good
| zeknife wrote:
| As with music generation models, the main thing that might
| make "open source" models better is most likely that they
| have no concern about excluding copyrighted material from
| the training data, so they actually get a good starting
| point instead of using a dataset consisting of YouTube
| videos and stock footage.
| fngjdflmdflg wrote:
| Serious question: is this better than current text-to-video
| models like Hailuo?
| meetpateltech wrote:
| Pricing:
|
| Plus Tier ($20/month)
|
| - Up to 50 priority videos (1,000 credits)
|
| - Up to 720p resolution and 5s duration
|
| Pro Tier ($200/month)
|
| - Up to 500 priority videos (10,000 credits)
|
| - Unlimited relaxed videos
|
| - Up to 1080p resolution, 20s duration and 5 concurrent
| generations
|
| - Download without watermark
|
| more info: https://help.openai.com/en/articles/10245774-sora-
| billing-cr...
| cube2222 wrote:
| Worth noting here that this is the existing ChatGPT
| subscription; you don't need a separate one.
| jsheard wrote:
| Called it, they were sitting on Sora until the $200 tier
| launched. Between the watermarking and 50 video limit the $20
| tier is functionally a trial.
| dbspin wrote:
| Wow they're watermarking videos and limiting them to 720 at the
| 20 dollar price point? That's a bold move, considering their
| competition's pricing...
|
| https://www.klingai.com/membership/membership-plan
|
| Quality seems relatively similar based on the samples I've
| seen, with the same issues - object permanence, temporal
| stability, physics comprehension, etc. - present in both.
| Kling has no qualms about copyright violation, however.
| minimaxir wrote:
| At OpenAI's $20/mo price point, you can also only generate
| _16_ 720p 5s videos per month.
|
| Kling doesn't seem to have more granular information
| publicly, but I suspect it allows for more than 16 videos
| per month.
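A back-of-the-envelope sketch of where that 16-video figure could come from; the per-video credit cost below is inferred from the numbers quoted in this thread, not an official OpenAI rate:

```python
# Plus tier: 1,000 credits/month (from the pricing comment above).
# If those credits cover only 16 videos at 720p/5s, each such video
# must cost roughly 1000/16 credits -- an inference, not a published
# price.
PLUS_CREDITS = 1_000
VIDEOS_720P_5S = 16

implied_credits_per_video = PLUS_CREDITS // VIDEOS_720P_5S
print(implied_credits_per_video)  # 62
```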
| throwup238 wrote:
| From the FAQ [1], too:
|
| _> Can I purchase more credits?
|
| > We currently don't support the ability to purchase more
| credits on a one-time basis.
|
| > If you are on a ChatGPT Plus and would like to access more
| credits to use with Sora, you can upgrade to the Pro plan._
|
| Ouch. Looks like they're really pushing this ChatGPT pro
| subscription. Between the watermark and being unable to buy
| more credits, the plus plan is basically a small trial.
|
| [1] https://help.openai.com/en/articles/10245774-sora-billing-
| cr...
| yeknoda wrote:
| I've found, using these and similar tools, that the amount
| of prompting and iteration required to realize my vision
| (the image or video in my mind) is very large, and often
| they still can't create what I had originally wanted. A way
| to test this is to take a piece of footage or an image as
| ground truth, and measure how much prompting and editing it
| takes to reproduce that ground truth, or something similar,
| starting from scratch. With current tech and finite amounts
| of time and iterations, it is basically not possible.
| cube2222 wrote:
| Agreed. It's still much better than what I could do myself
| without it, though.
|
| (Talking about visual generative AI in general)
| JKCalhoun wrote:
| Yeah, but if I handed you a Maxfield Parrish it would be
| better than either of us can do -- but not what I asked for.
|
| I find generative AI frustrating because I _know_ what I
| want. To this point I have been trying but then ultimately
| sitting it out -- waiting for the one that really works the
| way I want.
| cube2222 wrote:
| For me even if I know what I want, if I'm using gen AI I'm
| happy to compromise and get good enough (which again, is so
| much better than I could do otherwise).
|
| If you want higher quality/precision, you'll likely want to
| ask a professional, and I don't expect that to change in
| the near future.
| adamc wrote:
| That limits its value for industries like Hollywood,
| though, doesn't it? And without that, who exactly is
| going to pay for this?
| jddj wrote:
| Advertisers, I guess. Same folks who paid for everything
| else around here
| adamc wrote:
| Yeah, I just question if there are enough customers to
| make this work.
| cube2222 wrote:
| To me, currently, visual generative AI is an evolution
| and improvement of stock images, and has effectively the
| same purpose.
|
| People pay for stock images.
| adamc wrote:
| Yeah, maybe for some purposes. In business, people
| sometimes pay for stock images but often don't have the
| expertise or patience to really spend a lot of time
| coaxing a video into fruition. Maybe for advertising or
| other contexts where more effort is worth it (not just
| powerpoints), but it feels like a slim audience.
| cube2222 wrote:
| With tools like Apple Intelligence and its genmoji (emoji
| generation) and playground (general diffusion image
| generation) I expect it to also take on some of the
| current entertainment and social use-cases of stickers
| and GIFs.
|
| But that's probably something you don't pay for directly,
| instead paying for e.g. a phone that has those features.
| nomel wrote:
| I think inpainting and "draw the label map, get the scene"
| interfaces are the obvious future. Never thought I'd miss
| GauGAN [1].
|
| https://www.youtube.com/watch?v=uNv7XBngmLY&t=25
| minimaxir wrote:
| Way back in the days of GPT-2, there was an expectation that
| you'd need to cherry-pick at least 10% of your output to get
| something usable/coherent. GPT-3 and ChatGPT greatly reduced
| the need to cherry-pick, for better or for worse.
|
| All the generated video startups seem to generate videos with
| _much_ lower than 10% usable output, without significant human-
| guided edits. Given the massive amount of compute needed to
| generate a video relative to hyperoptimized LLMs, the quality
| issue will handicap gen video for the foreseeable future.
| joe_the_user wrote:
| Plus editing text or an image is practical. Video editors
| typically are used to cut and paste video streams - a video
| editor can't fix a stream of video that gets motion or
| anatomy wrong.
| mattigames wrote:
| Not too far in the future you will be able to drag and drop the
| position of the characters as well as the position of the
| camera, among other refinement tools.
| isoprophlex wrote:
| And another thing that irks me: none of these video generators
| get motion right...
|
| Especially anything involving fluid/smoke dynamics, or fast,
| dynamic movements of humans and animals: they all suffer from
| the same weird motion artifacts. I can't describe it other
| than that the fluidity of the movements is completely off.
|
| And as all genai video tools I've used are suffering from the
| same problem, I wonder if this is somehow inherent to the
| approach & somehow unsolvable with the current model
| architectures.
| benchmarkist wrote:
| Neural networks use smooth manifolds as their underlying
| inductive bias so in theory it should be possible to
| incorporate smooth kinematic and Hamiltonian constraints but
| I am certain no one at OpenAI actually understands enough of
| the theory to figure out how to do that.
| david-gpu wrote:
| _> I am certain no one at OpenAI actually understands
| enough of the theory to figure out how to do that_
|
| We would love to learn more about the origin of your
| certainty.
| benchmarkist wrote:
| I don't work there so I'm certain there is no one with
| enough knowledge to make it work with Hamiltonian
| constraints because the idea is very obvious but they
| haven't done it because they don't have the wherewithal
| to do so. In other words, no one at OpenAI understands
| enough basic physics to incorporate conservation
| principles into the generative network so that objects
| with random masses don't appear and disappear on the
| "video" manifold as it evolves in time.
| david-gpu wrote:
| _> the idea is very obvious but they haven 't done it
| because they don't have the wherewithal to do so_
|
| Fascinating! I wish I had the knowledge and wherewithal
| to do that and become rich instead of wasting my time on
| HN.
| benchmarkist wrote:
| No one is perfect but you should try to do better and
| waste less time on HN now that you're aware and can act
| on that knowledge.
| david-gpu wrote:
| Nah, I'm good. HN can be a very amusing place at times.
| Thanks, though.
| dartos wrote:
| How does your conclusion follow from your statement?
|
| Neural networks are largely black box piles of linear
| algebra which are massaged to minimize a loss function.
|
| How would you incorporate smooth kinematic motion in such
| an environment?
|
| The fact that you discount the knowledge of literally every
| single employee at OpenAI is a big signal that you have no
| idea what you're talking about.
|
| I don't even really like OpenAI and I can see that.
| benchmarkist wrote:
| I've seen the quality of OpenAI engineers on Twitter and
| it's easy enough to extrapolate. Moreover, neural
| networks are not black boxes, you're just parroting
| whatever you've heard on social media. The underlying
| theory is very simple.
| dartos wrote:
| Do not make assumptions about people you do not know in
| an attempt to discredit them. You seem to be a big fan of
| that.
|
| I have been working with NLP and neural networks since
| 2017.
|
| They aren't just black boxes, they are _largely_ black
| boxes.
|
| When training an NN, you don't have great control over
| which parts of the model do what, or how.
|
| Now instead of trying to discredit me, would you mind
| answering my question? Especially since, as you say, the
| theory is so simple.
|
| How would you incorporate smooth kinematic motion in such
| an environment?
| benchmarkist wrote:
| Why would I give away the idea for free? How much do you
| want to pay for the implementation?
| dartos wrote:
| lol. Ok dude you have a good one.
| benchmarkist wrote:
| You too but if you do want to learn the basics then
| here's one good reference:
| https://www.amazon.com/Hamiltonian-Dynamics-Gaetano-
| Vilasi/d.... If you already know the basics then this is
| a good followup: https://www.amazon.com/Integrable-
| Hamiltonian-Systems-Geomet.... The books are much cheaper
| than paying someone like me to do the implementation.
| dartos wrote:
| Yeah I mean I would never pay you for anything.
|
| You've convinced me that you're small and know very
| little about the subject matter.
|
| You don't need to reply to this. I'm done with this
| convo.
| benchmarkist wrote:
| Ok, have a good one dude.
| mech422 wrote:
| cop out... according to you, the idea is so obvious it
| wouldn't be worth anything.
| benchmarkist wrote:
| It's not worth anything to me but I'm sure OpenAI would
| be willing to pay a lot of money for opening up the
| market of video generators with realistic physical
| evolution. If they want to hire an ultra genius such as
| myself to do the work then it would be worth at least
| $500k to them so that's basically the floor for anyone
| else who wants the actual details. The actual market
| would be worth billions so I'm basically giving the idea
| away at that price.
| giantrobot wrote:
| I think one of the biggest problems is the models are trained
| on 2D sequences and don't have any understanding of what
| they're actually seeing. They see some structure of pixels
| shift in a frame and learn that some 2D structures should
| shift in a frame over time. They don't actually understand
| the images are a 2D capture of an event that occurred in four
| dimensions and the thing that's been imaged is under the
| influence of _unimaged_ forces.
|
| I saw a Santa dancing video today and the suspension of
| disbelief was almost instantly dispelled when the cuffs of
| his jacket moved erratically. The GenAI was trying to get
| them to sway with arm movements but because it didn't
| understand _why_ they would sway it just generated a
| statistical approximation of swaying.
|
| GenAI also definitely doesn't understand 3D structures, as
| easily demonstrated by completely incorrect morphological
| features.
| Even my dogs understand gravity, if I drop an object they're
| tracking (food) they know it should hit the ground. They also
| understand 3D space, if they stand on their back legs they
| can see over things or get a better perspective.
|
| I've yet to see any GenAI that demonstrates even my dogs'
| level of understanding the physical world. This leaves their
| output in the uncanny valley.
| jeroen wrote:
| They don't even get basic details right. The ship in the 8th
| video changes with every camera change and birds appear out
| of nowhere.
| soheil wrote:
| What's the point of poking holes in new technology and
| nitpicking like this? Are you blind to the immense
| breakthroughs made today, focusing instead on what irks you
| about some tiny detail that might go away after a couple of
| versions?
| miltonlost wrote:
| The adage "a picture is worth a thousand words" has the nice
| corollary "A thousand words isn't enough to be precise about an
| image".
|
| Now expand that to movies and games and you can get why this
| whole generative-AI bubble is going to pop.
| szundi wrote:
| The comment was probably more about the 360-degree turning
| heads etc.
| GistNoesis wrote:
| (2020) https://arxiv.org/abs/2010.11929 : an image is worth
| 16x16 words transformers for image recognition at scale
|
| (2021) https://arxiv.org/abs/2103.13915 : An Image is Worth
| 16x16 Words, What is a Video Worth?
|
| (2024) https://arxiv.org/abs/2406.07550 : An Image is Worth
| 32 Tokens for Reconstruction and Generation
| dartos wrote:
| Those are indeed 3 papers.
| GistNoesis wrote:
| Yes, in a nutshell they show that you can express a
| picture or a video with relatively little discrete
| information.
|
| The first paper is the most famous and prompted a lot of
| research into using text-generation tools in the image
| generation domain: 256 "words" for an image. The second
| paper is 24 reference images per minute of video. The
| third paper is a refinement of the first, saying you only
| need 32 "tokens". I'll let you multiply the numbers.
|
| It's kind of like a who's-who game, where you can
| identify any human on earth with ~32 bits of
| information.
|
| The corollary being that, contrary to what the parent is
| saying, there is no theoretical obstacle to obtaining a
| video from a textual description.
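The "~32 bits" aside checks out numerically, and the same arithmetic gives the information budget of a 32-token image; the codebook size used here is an assumed example, not a figure from the papers:

```python
import math

# ~33 bits suffice to index every living human (2^33 > 8 billion),
# which is the sense in which the who's-who game needs only ~32 bits.
world_population = 8_100_000_000
bits_to_index_humans = math.ceil(math.log2(world_population))

# A 32-token image with an assumed codebook of 8,192 entries carries
# 32 * log2(8192) = 416 bits -- tiny next to raw pixels, which is the
# compression claim behind "an image is worth 32 tokens".
tokens, codebook_size = 32, 8_192
image_bits = tokens * int(math.log2(codebook_size))

print(bits_to_index_humans, image_bits)  # 33 416
```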
| dartos wrote:
| I think something is getting lost in translation.
|
| These papers, from my quick skim (though I did read the
| first one fully years ago), seem to show that some images,
| and to an extent video, can be generated from discrete
| tokens, but they do not show that exact images, or that
| any arbitrary image, can be.
|
| For instance, what combination of tokens must I put in to
| get _exactly_ the Mona Lisa or Starry Night? (Though these
| might be very well represented in the data set. Maybe a
| lesser-known image would be a better example.)
|
| As I understand, OC was saying that they can't produce
| what they want with any degree of precision since there's
| no way to encode that information in discrete tokens.
| GistNoesis wrote:
| If you want to know what tokens yield _exactly_ the Mona
| Lisa, or any other image, you take the image and put it
| through your image tokenizer, aka encode it; once you
| have the sequence of tokens, you can decode them back to
| an image.
|
| VQ-VAE (Vector Quantised-Variational AutoEncoder), (2017)
| https://arxiv.org/abs/1711.00937
|
| The whole encoding-decoding process is reversible, and
| you only lose some imperceptible "details"; the process
| can be trained with either an L2 loss or a perceptual
| loss, depending on what you value.
|
| The point being that images which occur naturally are
| not really information-rich and can be compressed a lot
| by neural networks of a few GB that have seen billions of
| pictures. With that strong prior, aka common knowledge,
| we can indeed paint with words.
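A minimal sketch of the vector-quantization step at the heart of VQ-VAE, as described above; toy dimensions, with the learned encoder/decoder networks and training loop omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy codebook: K embedding vectors of dimension D (learned in a real
# VQ-VAE; random here).
K, D = 512, 64
codebook = rng.normal(size=(K, D))

def quantize(z: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Snap each encoder output vector to its nearest codebook entry.

    Returns the discrete token ids (what you would store or generate)
    and the quantized vectors (what the decoder sees). This is the
    lossy step where the imperceptible "details" are dropped.
    """
    # Squared L2 distance from every vector in z to every codebook entry.
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    ids = dists.argmin(axis=1)
    return ids, codebook[ids]

z = rng.normal(size=(32, D))   # e.g. 32 tokens for one image
ids, z_q = quantize(z)
print(ids.shape, z_q.shape)    # (32,) (32, 64)
```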
| dartos wrote:
| Maybe I'm not able to articulate my thought well enough.
|
| Taking an existing image and reversing the process to get
| the tokens that led to it, then redoing that, doesn't seem
| the same as supplying tokens to get a precise novel image.
|
| Especially since, as you said, we'd lose some details,
| which suggests that not all images can be perfectly
| described and recreated.
|
| I suppose I'll need to play around with some of those
| techniques.
| TeMPOraL wrote:
| > _Now expand that to movies and games and you can get why
| this whole generative-AI bubble is going to pop._
|
| What will save it is that, no matter how picky you are as a
| creator, your audience will never know what exactly it was
| that you dreamed up, so any half-decent approximation will
|
| In other words, a corollary to your corollary is,
| "Fortunately, you don't need them to be, because no one cares
| about low-order bits".
|
| Or, as we say in Poland, "What the eye doesn't see, the heart
| doesn't mourn."
| jsheard wrote:
| > What will save it is that, no matter how picky you are as
| a creator, your audience will never know what exactly it
| was that you dreamed up, so any half-decent approximation
| will work.
| work.
|
| Part of the problem is that "half-decent approximations"
| tend towards a cliched average: the audience won't know
| that the cool cyberpunk cityscape you generated isn't
| exactly what you had in mind, but they will know that it
| looks like every other AI generated cyberpunk cityscape and
| mentally file your creation in the slop folder.
|
| I think the pursuit of fidelity has made the models less
| creative over time, they make fewer glaring mistakes like
| giving people six fingers but their output is ever more
| homogenized and interchangeable.
| randomcatuser wrote:
| a somewhat counterintuitive argument is this: AI models
| will make the overall creative landscape more diverse and
| interesting, i.e., less "average"!
|
| Imagine the space of ideas as a circle, with stuff in the
| middle being more easy to reach (the "cliched average").
| Previously, traversing the circle was incredibly hard -
| we had to use tools like DeviantArt, Instagram, etc. to
| agglomerate the diverse tastes of artists, hoping to find
| or create the style we're looking for. Recreating a given
| art style meant hiring the artist. As a result, on average,
| what you see is the result of huge amounts of human
| curation, effort, and branding teams.
|
| Now reduce the effort 1000x, and all of a sudden, it's
| incredibly easy to reach the edge of the circle (or
| closer to it). Sure, we might still miss some things at
| the very outer edge, but it's equivalent to building
| roads. Motorists appear, people with no time to sit down
| and spend 10000 hours to learn and master a particular
| style can simply _remix art_ and create things wildly
| beyond their manual capabilities. As a result, the amount
| of content in the infosphere skyrockets, the tastemaking
| velocity accelerates, and you end up with a more
| interesting infosphere than you're used to.
| wongarsu wrote:
| And as AI oversaturates the cliched average, creators
| will have to get further and further away from the
| average to differentiate themselves. If you pour a lot of
| work into your creation you want to make it clear that it
| isn't some cliched AI drivel.
| skydhash wrote:
| You will basically have to provide a video showcasing
| your workflow.
| TeMPOraL wrote:
| To extend the analogy, imagine the circle as a
| probability distribution; for simplicity, imagine it's a
| bivariate normal joint distribution (aka. Gaussian in 3D)
| + some noise, and you're above it and looking down.
|
| When you're commissioning an artist to make you some art,
| you're basically sampling from the entire distribution.
| Stuff in the middle is, as you say, easiest to reach, so
| that's what you'll most likely get. Generative models let
| more people do art, meaning there's more sampling
| happening, so the stuff further from the centre will be
| visited more often, too.
|
| However, AI tools also make another thing easier: moving
| and narrowing the sampling area. Much like with a very
| good human artist, you can find some work that's "out
| there", and ask for variations of it. However, there are
| only so many good artists to go around. AI making this
| process much easier and more accessible means more
| exploration of the circle's edges will happen. Not just
| "more like this weird thing", but also combinations of 2,
| 3, 4, N distinct weird things. So in a way, I feel that
| AI tools will surface creative art disproportionally more
| than it'll boost the common case.
|
| Well, except for the fly in the ointment that's the
| advertising industry (aka. the cancer on modern society).
| Unfortunately, by far most of the creative output of
| humanity today is done for advertising purposes, and that
| goal favors the common, as it maximizes the audience (and
| is least off-putting). Deluge of AI slop is unavoidable,
| because slop is how the digital world makes money, and
| generative AI models make it cheaper than generative
| protein models that did it so far. Don't blame AI
| research for that, blame advertising.
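The distribution picture sketched above can be made concrete with a toy simulation; the numbers are purely illustrative, not a model of any real idea space:

```python
import numpy as np

rng = np.random.default_rng(42)

# "Idea space" as a 2D Gaussian: the origin is the cliched average.
# Commissioning from the general pool ~ sampling the full distribution.
full = rng.normal(size=(100_000, 2))

# AI-assisted variation ~ narrowed sampling around an "out there" work.
reference = np.array([2.5, -2.5])
narrowed = reference + rng.normal(scale=0.2, size=(100_000, 2))

# Mean distance from the centre in each regime: full sampling mostly
# lands near the average; narrowed sampling stays concentrated far out.
print(np.linalg.norm(full, axis=1).mean())      # ~1.25
print(np.linalg.norm(narrowed, axis=1).mean())  # ~3.54
```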
| robertlagrant wrote:
| It's just like when Bootstrap came out. Terrible-looking
| websites stopped appearing, but so did beautiful
| websites.
| TeMPOraL wrote:
| > _I think the pursuit of fidelity has made the models
| less creative over time (...) their output is ever more
| homogenized and interchangeable._
|
| Ironically, we're long past that point with _human
| creators_, at least when it comes to movies and games.
|
| Take sci-fi movies, compare modern ones to the ones from
| the tail end of the 20th century. Year by year, VFX gets
| more and more detailed (and expensive) - more and better
| lights, finer details on every material, more stuff
| moving and emitting lights, etc. But all that effort
| arguably _killed_ immersion and believability, by _making
| scenes incomprehensible_. There's _way_ too much visual
| noise in action scenes in particular - bullets and
| lightning bolts zip around, and all that detail just blurs
| together. Contrast the 20th-century productions -
| textures weren't as refined, but you could at least tell
| who's shooting whom and when.
|
| Or take video games, where all that graphics works makes
| everything look the same. Especially games that go for
| realistic style, they're all homogenous these days, and
| it's all cheap plastic.
|
| (Seriously, what the fuck went wrong here? All that talk,
| and research, and work into "physically based rendering",
| yet in the end, all PBR materials end up looking like
| painted plastic. Raytracing seems to help a bit when it
| comes to liquids, but it still can't seem to make metals
| look like metals and not Fisher-Price toys repainted to
| gray.)
|
| So I guess in this way, more precision just makes the
| audience give up entirely.
|
| > _they will know that it looks like every other AI
| generated cyberpunk cityscape and mentally file your
| creation in the slop folder._
|
| The answer here is the same as with human-produced slop:
| don't. People are good at spotting patterns, so keep
| adding those low-order bits until it's no longer obvious
| you're doing the same thing everyone else is.
|
| EDIT: Also, obligatory reminder that generative models
| don't give you average of training data with some noise
| mixed in; they _sample from the learned distribution_. The
| law of large numbers applies, but that just means that to
| get more creative output, you need to bias the sampling.
| wongarsu wrote:
| Video games (the much larger industry of the two, by
| revenue) seem to be closer to understanding this. AAA
| games dominate advertising and news cycles, but on any
| best-seller list AAA games are on par with indie and B
| games (I think they call them AA now?). For every
| successful $60M PBR-rendered Unreal 5 title there is an
| equally successful game with low-fidelity graphics but
| exceptional art direction, story or gameplay.
|
| Western movie studios may discover the same thing soon,
| with the number of high-budget productions tanking
| lately.
| robertlagrant wrote:
| I agree. The one shining hope I have is the incredible
| art and animation style of Fortiche[0]'s Arcane[1]
| series. Watch that, and then watch any recent (and
| identikit) Pixar movie, and they are just streets ahead.
| It's just brilliant.
|
| [0] https://en.wikipedia.org/wiki/Fortiche
|
| [1] https://en.wikipedia.org/wiki/Arcane_(TV_series)
| samatman wrote:
| Empirically, we've passed the point where that's true,
| for someone not being lazy about it.
|
| https://www.astralcodexten.com/p/how-did-you-do-on-the-
| ai-ar...
|
| In other words, someone willing to tweak the prompt and
| press the button enough times to say "yeah, that one,
| that's really good" is going to have a result which
| cannot in fact be reliably binned as AI-generated.
| dartos wrote:
| Your eye sees just about every frame of a film...
|
| People may not think they care, but obviously they do.
| That's why marvel movies do better than DC ones.
|
| People absolutely care about details in their media.
| TeMPOraL wrote:
| Fair point, particularly given the example. My conclusion
| wrt. Marvel vs. DC is that DC productions care much less
| about details, in exactly the way I find off-putting.
|
| Not all details matter, some do. And, it's better to not
| show the details at all, than to be inconsistent in them.
|
| Like, idk., don't identify a bomb as a specific type of
| existing air-fuel ordnance and then act about it as if it
| was a goddamn tactical nuke. Something along these lines
| was what made me stop watching _Arrow_ series.
| dartos wrote:
| > Not all details matter, some do
|
| This is a key observation, unfortunately generally
| solving for what details matter is extremely difficult.
|
| I don't think video generation models help with that
| problem, since you have even less control of details than
| you do with film.
|
| At least before post.
| og_kalu wrote:
| The visuals are at the very bottom of the list of reasons
| why DC movies have performed worse over the years.
|
| The movies have just had much worse audience and critical
| reception.
| naasking wrote:
| I was just going to say this. If you have an artistic
| vision that you simply _must_ create to the minutest
| detail, then like any artist, you 're in for a lot of
| manual work.
|
| If you are not beholden to a precise vision or maybe just
| want to create something that sells, these tools will
| likely be significant productivity multipliers.
| whstl wrote:
| Exactly.
|
| So far ChatGPT is not for writing books, but is great for
| SEO-spam blogposts. It is already killing the content
| marketing industry.
|
| So far Dall-E is not for making master paintings, but
| it's great for stock images. It might kill most of the
| clipart and stock image industry.
|
| So far Udio and other song generators are not able to
| make symphonies, but it's great for quiet background
| music. It might kill most of the generic royalty-free-
| music industry.
| hammock wrote:
| It's like how there are two types of movie directors (or
| creative directors in general), the dictatorial "100 takes
| until I get it exactly how I envision it" type, and the "I
| hired you to act, so you bring the character to life for me
| and what will be will be" type
|
| Right now AI is more the latter, but many people want it to
| be the former
| troupo wrote:
| AI is neither.
|
| A director letting actors "just be" knows exactly what
| he/she wants, and chooses actors accordingly. So do the
| directors that want the most minute detail.
|
| Clint Eastwood tries to do at most one take of a scene.
| David Fincher is infamous for his dozens of takes.
|
| AI is neither Fincher nor Eastwood.
| wcfrobert wrote:
| Do artists really have a fully formed vision in their heads?
| I suspect the creative process is much more iterative
| rather than one-directional.
| skydhash wrote:
| No one can have a fully formed vision. But intent, yes.
| Then you use techniques to materialize it. Words are a poor
| substitute for that intent, which is why there are so many
| sketches in a visual project.
| maxglute wrote:
| And why physical execution frequently significantly
| departs from sketches and concept art. The amount of
| intent that doesn't get translated is pretty staggering
| in both physical and digital pipelines in many projects.
| Ar-Curunir wrote:
| That's just sad, and why people have a derogatory stance
| towards generative AI: "half-decent" approximation removes
| all personality from the output, leading to a bunch of slop
| on the internet.
| TeMPOraL wrote:
| It does indeed, but then many of those people don't
| notice they're already consuming half-decent,
| personality-less slop, because that's what human artists
| make too, when churning out commercial art for peanuts
| and on tight deadlines.
|
| It's less obvious because people _project_ personality
| onto the content they see, because they implicitly
| _assume_ the artist _cared_, and had some vision in
| mind. Cheap shit doesn't look like cheap shit in
| isolation. Except when you know it's AI-generated,
| because this removes the artist from the equation, and
| with it, your assumptions that there's any personality
| involved.
| whatevertrevor wrote:
| I'm not so sure, one of the primary complaints about IP
| farming slop that major studios have produced recently is
| a lack of firm creative vision, and clear evidence of
| design by committee over artist direction.
|
| People can generally see the lack of artistic intent when
| consuming entertainment.
| TeMPOraL wrote:
| That's true. Then again, complaints about "lack of firm
| creative vision, and clear evidence of design by
| committee over artist direction" is something I've seen
| levied against Disney for several years now; importantly,
| they started _before_ generative AI found its way into
| major productions.
|
| So, while GenAI tools make it easier to create
| superficially decent work that lacks creative intent, the
| studios managed to do it just fine with human
| intelligence only, suggesting the problem isn't AI, but
| the studios and their modern management policies.
| msabalau wrote:
| Half-decent approximations work a lot better when
| generating the equivalent of a stock illustration for a
| PowerPoint slide.
|
| Actual long form art like a movie works because it includes
| many well informed choices that work together as a whole.
|
| There seems to be a large gap between generating a few
| seconds of video vaguely like one's notion, and trying to
| create 90 minutes that are related and meaningful.
|
| Which doesn't mean that you can't build more robust tools
| from this starting place. But if this turns out to be a
| large, hard amount of work, it certainly calls into
| question optimistic projections from people who don't even
| seem to notice that there is work needed at all.
| throwup238 wrote:
| "A frame is worth a billion rays"
|
| The last production I worked on averaged 16 hours per frame
| for the final rendering. The amount of information encoded in
| lighting, models, texture, maps, etc is insane.
| bongodongobob wrote:
| What were you working on? It took a month to render 2
| seconds of video?
| elmigranto wrote:
| I would guess there is more than one computer :)
|
| Pixar's stuff famously takes days per frame.
| Arelius wrote:
| > Pixar's stuff famously takes days per frame.
|
| Do you have a citation for this? My guess would be much
| closer to a couple of hours per frame.
| elmigranto wrote:
| https://sciencebehindpixar.org/pipeline/rendering
| Arelius wrote:
| Most VFX productions take over 2 CPU-hours a frame for
| final video, and have for a very long time. It takes far
| less than a month since the work gets parallelized on
| large render farms.
| throwup238 wrote:
| VFX heavy feature for a Disney subsidiary. Each frame is
| rendered independently of the others - it's not like
| video encoding where each frame depends on the previous
| one, they all have their own scene assembly that can be
| sent to a server to parallelize rendering. With enough
| compute, the entire film can be rendered in a few days.
| (It's a little more complicated than that but works to a
| first order approximation)
|
| I don't remember how long the final rendering took but it
| was nearly two months and the final compute budget was 7
| or 8 figures. I think we had close to 100k cores running
| at peak from three different render farms during crunch
| time, but don't take my word for it I wasn't producing
| the picture.
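A back-of-envelope sketch of the render-farm math in this subthread. The 16 CPU-hours/frame and ~100k cores figures come from the comments above; the 90-minute runtime is an assumption for illustration, and real schedules stretch much longer because farms are shared, frames fail and re-render, and shots finish at different times.

```python
# Illustrative render-farm arithmetic; numbers are assumptions, not
# production figures.
feature_minutes = 90
fps = 24
frames = feature_minutes * 60 * fps               # 129,600 frames

cpu_hours_per_frame = 16                          # "16 hours per frame"
total_cpu_hours = frames * cpu_hours_per_frame    # ~2.07M CPU-hours

cores = 100_000                                   # "close to 100k cores"
ideal_wall_hours = total_cpu_hours / cores        # perfect parallelism

print(f"{frames} frames, {total_cpu_hours:,} CPU-hours total")
print(f"ideal wall-clock: ~{ideal_wall_hours:.1f} h on {cores:,} cores")
```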
| dist-epoch wrote:
| Are they still using CPUs and not GPUs for rendering?
|
| Weren't the rendering algos ported to CUDA yet?
| jsheard wrote:
| GPU renderers exist but they have pretty hard scaling
| limits, so the highest end productions still use CPU
| renderers almost exclusively.
|
| The 3D you see in things like commercials is usually done
| on GPUs though because at their smaller scale it's much
| faster.
| Al-Khwarizmi wrote:
| If you can build a system that can generate engaging games
| and movies, from an economic (bubble popping or not popping)
| point of view it's largely irrelevant whether they conform to
| fine-grained specifications by a human or not.
| dartos wrote:
| In other words:
|
| If you find a silver bullet then everything else is largely
| irrelevant.
|
| Idk if you noticed but that "if" is carrying an insane
| amount of weight.
| jsheard wrote:
| Text generation is the most mature form of genAI, and even
| that isn't remotely close to producing an endless stream
| of engaging stories. Adding the visual aspect to make that
| story into a movie, or the interactive element to turn it
| into a game, is only uphill from there.
| beambot wrote:
| Corollary: I couldn't create an original visual piece of art
| to save my life, so prompting is infinitely better than what
| I could do myself (or am willing to invest time in building
| skills). The gen-AI bubble isn't going to burst. Pareto
| always wins.
| mrandish wrote:
| I agree that people who want any meaningful precision in
| their visual results will inevitably be disappointed.
| fooker wrote:
| Sure it's going to pop. But when is the important question.
|
| Being too early about this and being wrong are the same.
| stale2002 wrote:
| You are half right. It's funny because I use a similar
| saying. Mine is: "A picture is worth a thousand words -
| that's why it takes 1000 words to describe the exact image
| that you want! Much better to just use image-to-image
| instead."
|
| That's my full quote on this topic, and I think it stands.
| Sure, people won't describe a picture; instead, they will
| take an existing picture or video and modify it using AI.
| That is much, much simpler and more useful, if you can film
| a scene and then animate it later with AI.
| meta_x_ai wrote:
| A picture is worth a thousand words.
|
| A word is worth a thousand pictures. (e.g. Love)
|
| It is abstraction all the way
| raincole wrote:
| The point is not to be precise. It's to be "good enough".
|
| Trust me, even if you work with human artists, you'll keep
| saying "it's not quite what I initially envisioned, but we
| don't have the budget/time for another revision, so it's
| good enough for now" _all the time_.
| soheil wrote:
| Maybe your AI bubble! If you define AI to be something like
| just another programming language, yes, you will be sadly
| disappointed. You see it as an employee with its own
| intuitions and ways of doing things that you're trying to
| micromanage.
|
| I have a bad feeling that you'd be a horrible manager if you
| ever were one.
| hipadev23 wrote:
| Real artists struggle to match vague descriptions of what's
| in your head too. This is at least quicker?
| janalsncm wrote:
| The point is if you are the artist and have something in your
| head. It's the same problem with image editing. I am sure you
| have experienced this.
| mlboss wrote:
| So what I'm getting is a use-case for a brain-computer
| interface.
| TeMPOraL wrote:
| There is no problem unless you insist on reflecting what
| you had in mind _exactly_. That needs minute controls, but
| no matter the medium and tools you use, unless you're
| doing it in your own quest for artistic perfection, the
| economic constraints will make you stop short of your idea
| - there's always a point past which any further refinement
| will not make a difference to the audience (which doesn't
| have access to the thing in your head to use as reference),
| and the costs of continuing will exceed any value (monetary
| or otherwise) you expect to get from the work.
|
| AI or not, no one but you cares about the lower order bits
| of your idea.
| janalsncm wrote:
| I disagree. Even without exactness, adding any reasonable
| constraints is impossible. Ask it to generate a realistic
| circuit diagram or chess board or any other thing where
| precision matters. Good luck going back and forth getting
| it right.
|
| These are situations with relatively simple logical
| constraints, but an infinite number of valid solutions.
|
| Keep in mind that we are not requiring any particular
| configuration of circuit diagram, just any diagram that
| makes sense. There are an infinite number of valid ones.
| TeMPOraL wrote:
| That's using the wrong tool for a job :). Asking
| diffusion models to give you a valid circuit diagram is
| like asking a painter to paint you pixel-perfect 300DPI
| image on a regular canvas, using their standard
| paintbrush. It ain't gonna work.
|
| That doesn't mean it _can't_ work with AI - it's that
| you may need to add something extra to the generative
| pipeline, something that can do circuit diagrams, and
| make the diffusion model supply style and extra noise
| (er, beautifying elements).
|
| > _Keep in mind that we are not requiring any particular
| configuration of circuit diagram, just any diagram that
| makes sense. There are an infinite number of valid ones._
|
| On that note. I'm the kind of person that loves to
| freeze-frame movies to look at markings, labels, and
| computer screens, and one thing I learned is that _humans
| fail at this task too_. Most of the time the problems are
| big and obvious, ruining my suspension of disbelief, and
| importantly, they could be trivially solved if the
| producers grabbed a random STEM-interested intern and
| asked for advice. Alas, it seems they don't care.
|
| This is just a specific instance of the general problem
| of "whatever you work with or are interested in, you'll
| see movies keep getting it wrong". Most of the time, it's
| somewhat defensible - e.g. most movies get guns wrong, but
| in a way people are used to, and it makes the scenes more
| streamlined and entertaining. But with labels, markings
| and computer screens, doing it right isn't any more
| expensive, nor would it make the movie any less
| entertaining. It seems that the people responsible don't
| know better or care.
|
| Let's keep that in mind when comparing AI output to the
| "real deal", so as not to set impossible standards that
| human productions don't match, and never did.
| janalsncm wrote:
| The issue isn't any particular constraint. The issue is
| the inability to add any constraints at all.
|
| In particular, internal consistency is one of the
| important constraints which viewers will immediately
| notice. If you're just using Sora for 5-second unrelated
| videos it may be less of an issue, but if you want to do
| anything interesting you'll need the clips to tie
| together, which requires internal consistency.
| throwup238 wrote:
| Nobody else really cares about the lower order bits of
| the idea but they do care that those lower order bits are
| consistent. The simplest example is color grading: most
| viewers are generally ignorant of artistic choices in
| color palettes unless something stands out, like the
| Netflix blue tint, but a movie whose scenes haven't been
| consistently color graded is obviously jarring, and even
| an expensive production can come off as amateur.
|
| GenAI is great at filling in those lower order bits but
| until stuff like ControlNet gets much better precision
| and UX, I think genAI will be stuck in the uncanny valley
| because they're inconsistent between scenes, frames, etc.
| TeMPOraL wrote:
| Yup, 100% agreed on that, and mentioned this caveat
| elsewhere. As you say - people don't pay attention to
| details (or lack of it), as long as the details are
| consistent. Inconsistencies stand out like sore thumbs.
| Which is why IMO it's best to have less details than to
| be inconsistent with them.
| staticman2 wrote:
| Real artists take comic book scripts and turn them into
| actual comic books every month. They may not match exactly
| what the writer had in mind, but they are fit for purpose.
| TeMPOraL wrote:
| > _They may not match exactly what the writer had in mind,
| but they are fit for purpose._
|
| That's what GenAI is doing, too. After all, the audience
| only sees the final product; they never get to know what
| the writer had in mind.
| staticman2 wrote:
| I haven't used SORA, but none of the GenAI I'm aware of
| could produce a competent comic book. When a human artist
| draws a character in a house in panel 1, they'll draw the
| same house in panel 2, not a procedurally generated
| different house for each image.
|
| If a 60 year old grizzled detective is introduced in page
| 1, a human artist will draw the same grizzled detective
| in page 2, 3 and so on, not procedurally generate a new
| grizzled detective each time.
| TeMPOraL wrote:
| A human artist keeps state :). They keep it between
| drawing sessions, and more importantly, they keep _very
| detailed_ state - their imagination or interpretation of
| what the thing (house, grizzled detective, etc.) is.
|
| Most models people currently use don't keep state between
| invocations, and whatever interpretation they make from
| provided context (e.g. reference image, previous frame)
| is surface level and doesn't translate well to output.
| This is akin to giving each panel in a comic to a
| different artist, and also telling them to sketch it out
| by their gut, without any deep analysis of prior work.
| It's a big limitation, alright, but researchers and
| practitioners are actively working to overcome it.
|
| (Same applies to LLMs, too.)
| jerf wrote:
| It just plain isn't possible if you mean a prompt the size of
| what most people have been using lately, in the couple hundred
| character range. By sheer information theory, the number of
| possible interpretations of "a zoom in on a happy dog catching
| a frisbee" means that you can not match a particular clip out
| of the set with just that much text. You will _need_ vastly
| more content; information about the breed, information about
| the frisbee, information about the background, information
| about timing, information about framing, information about
| lighting, and so on and so forth. Right now the AIs can't do
| that, which is to say, even if you sit there and type a prompt
| containing all that information, it is going to be forced to
| ignore most of the result. Under the hood, with the way the
| text is turned into vector embeddings, it's fairly questionable
| whether you'd agree that it can even represent such a thing.
|
| This isn't a matter of human-level AI or superhuman-level AI;
| it's just straight up impossible. If you want the information
| to match, it has to be provided. If it isn't there, an AI can
| fill in the gaps with "something" that will make the scene
| work, but expecting it to fill in the gaps the way you "want"
| even though you gave it no indication of what that is is
| expecting literal magic.
|
| Long term, you'll never have a coherent movie produced by
| stringing together a series of textual snippets because, again,
| that's just impossible. Some sort of long-form "write me a
| horror movie starring a precocious 22-year-old elf in a far-
| future Ganymede colony with a message about the importance of
| friendship" AI that generates a coherent movie of many scenes
| will have to be doing a lot of some sort of internal
| communication in an internal language to hold the result
| together between scenes, because what it takes to hold stuff
| coherent between scenes is an amount of English text not
| entirely dissimilar in size from the underlying representation
| itself. You might as well skip the English middleman and go
| straight to an embedding not constrained by a human language
| mapping.
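The information gap jerf describes can be put in rough numbers. This is a sketch with made-up but plausible figures - the entropy-per-character estimate and the compressed clip size are assumptions, not measurements:

```python
# Rough numbers behind the argument above: a short prompt carries far
# fewer bits than the clip it is supposed to pin down.
prompt = "a zoom in on a happy dog catching a frisbee"
bits_per_char = 1.5                     # rough entropy of English text
prompt_bits = len(prompt) * bits_per_char

clip_bytes = 1_000_000                  # well-compressed 5 s clip (assumed)
clip_bits = clip_bytes * 8

# The prompt specifies a vanishing fraction of the clip's content; the
# model must invent the rest ("fill in the gaps with something").
print(f"prompt: ~{prompt_bits:.0f} bits, clip: {clip_bits:,} bits")
print(f"ratio: ~1:{clip_bits / prompt_bits:,.0f}")
```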
| yeknoda wrote:
| Something like a white paper with a mood board, color
| scheme, and concept art as the input might work. This could
| be sent into an LLM "expander" that increases the word
| count and specificity. Then multiple reviews to tap things
| in the right direction.
| 3form wrote:
| And I think this realistically is going to be the shape of
| the tools to come in the foreseeable future.
| echelon wrote:
| You should see what people are building with Open Source
| video models like HunYuan [1] and ComfyUI + Control Nets.
| It blows Sora out of the water.
|
| Check out the Banodoco Discord community [2]. These are
| the people pioneering steerable AI video, and it's all
| being built on top of open source.
|
| [1] https://github.com/Tencent/HunyuanVideo
|
| [2] https://banodoco.ai/
| mikepurvis wrote:
| I expect this kind of thing is actually how it's going to
| work longer term, where AI is a copilot to a human artist.
| The human artist does storyboarding, sketching in backdrops
| and character poses in keyframes, and then the AI steps in
| and "paints" the details over top of it, perhaps based on
| some pre-training about what the characters and settings
| are so that there's consistency throughout a given work.
|
| The real trick is that the AI needs to be able to
| participate in iteration cycles, where the human can say
| "okay this is all mostly good, but I've circled some areas
| that don't look quite right and described what needs to be
| different about them." As far as I've played with it,
| current AIs aren't very good at revisiting their own work--
| you're basically just tweaking the original inputs and
| otherwise starting over from scratch each time.
| programd wrote:
| We will shortly have much better tweaking tools which
| work not only on images and video but concepts like what
| aspects a character should exhibit. See for example the
| presentation from Shapeshift Labs.
|
| https://www.shapeshift.ink/
| echelon wrote:
| For those not in this space, _Sora is essentially dead on
| arrival._
|
| Sora performs worse than closed source Kling and Hailuo, but
| more importantly, it's already trumped by open source too.
|
| Tencent is releasing a fully open source Hunyuan model [1]
| that is better than all of the SOTA closed source models.
| Lightricks has their open source LTX model and Genmo is
| pushing Mochi as open source. Black Forest Labs is working on
| video too.
|
| Sora will fall into the same pit that Dall-E did. SaaS
| doesn't work for artists, and open source always trumps
| closed source models.
|
| Artists want to fine tune their models, add them to ComfyUI
| workflows, and use ControlNets to precision control the
| outputs.
|
| Images are now almost 100% Flux and Stable Diffusion, and
| video will soon be 100% Hunyuan and LTX.
|
| Sora doesn't have much market apart from name recognition at
| this point. It's just another inflexible closed source model
| like Runway or Pika. Open source has caught up with state of
| the art and is pushing past it.
|
| [1] https://github.com/Tencent/HunyuanVideo
| minimaxir wrote:
| > Under the hood, with the way the text is turned into vector
| embeddings, it's fairly questionable whether you'd agree that
| it can even represent such a thing.
|
| The _text encoder_ may not be able to know complex
| relationships, but the generative image /video models that
| are conditioned on said text embeddings absolutely can.
|
| Flux, for example, uses the very old T5 model for text
| encoding, but image generations from it can (loosely) adhere
| to all rules and nuances in a multi-paragraph prompt:
| https://x.com/minimaxir/status/1820512770351411268
| robotresearcher wrote:
| > Long term, you'll never have a coherent movie produced by
| stringing together a series of textual snippets because,
| again, that's just impossible.
|
| Why snippets? Submit a whole script the way a writer delivers
| a movie to a director. The (automated) director/DP/editor
| could maintain internal visual coherence, while the script
| drives the story coherence.
| jerf wrote:
| That's what I describe at the end, albeit quickly in lingo,
| where the internal coherence is maintained in internal
| embeddings that are never related to English at all. A top-
| level AI could orchestrate component AIs through embedded
| vectors, but you'll never do it with a human trying to type
| out descriptions.
| troupo wrote:
| You should watch how movies are made sometime. How a script
| is developed. How changes to it are made. How storyboards
| are created. How actors are screened for roles. How
| locations are scouted, booked, and changed. How the
| gazillion different departments end up affecting how a
| movie looks, is produced, made, and in which direction it
| goes (the wardrobe alone, and its availability and
| deadlines will have a huge impact on the movie).
|
| What does "EXT. NIGHT" mean in a script? Is it cloudy?
| Rainy? Well lit? What are camera locations? Is the scene
| important for the context of the movie? What are characters
| wearing? What are they looking at?
|
| What do actors actually do? How do they actually behave?
|
| Here are a few examples of script vs. screen.
|
| Here's a well described script of Whiplash. Tell me the one
| hundred million things happening on screen that are not in
| the script: https://www.youtube.com/watch?v=kunUvYIJtHM
|
| Or here's the Joker interrogation from The Dark Knight.
| Same million different things, including actors (or the
| director) _ignoring_ instructions in the script:
| https://www.youtube.com/watch?v=rqQdEh0hUsc
|
| Here's A Few Good Men: https://www.youtube.com/watch?v=6hv7
| U7XhDdI&list=PLxtbRuSKCC...
|
| and so on
|
| ---
|
| _Edit_. Here's Annie Atkins on visual design in movies,
| including Grand Budapest Hotel:
| https://www.youtube.com/watch?v=SzGvEYSzHf4. And here's a
| small article summarizing some of it:
| https://www.itsnicethat.com/articles/annie-atkins-grand-
| buda...
|
| Good luck finding any of these details in any of the
| scripts. See minute 14:16 where she goes through the script.
|
| _Edit 2_ : do watch The Kerning chapter at 22:35 to see
| what it _actually_ takes to create something :)
| coffeebeqn wrote:
| This almost certainly won't work. Feel free to feed any of
| the hundreds of existing film scripts and test how coherent
| the models can be. My guess is not at all
| sleepybrett wrote:
| This will almost certainly be in theaters within 5 years,
| probably first as a small experimental project (think
| Blair Witch).
| letmevoteplease wrote:
| Shane Carruth (Primer) released interesting scripts for "A
| Topiary" and "The Modern Ocean" which now have no hope of
| being filmed. I hope AI can bring them to life someday. If
| we get tools like ControlNet for video, maybe Carruth could
| even "direct" them himself.
| LASR wrote:
| What you are saying is totally correct.
|
| And this applies to language / code outputs as well.
|
| The number of times I've had engineers at my company type out
| 5 sentences and then expect a complete React webapp.
|
| But what I've found in practice is using LLMs to generate the
| prompt with low-effort human input (eg: thumbs up/down,
| multiple-choice etc) is quite useful. It generates walls of
| text, but with metaprompting, that's kind of the point. With
| this, I've definitely been able to get high ROI out of LLMs.
| I suspect the same would work for vision output.
| kurthr wrote:
| I'm not sure, but I think you're saying what I'm thinking.
|
| Stick the video you want to replicate into o1 and ask for
| a descriptive prompt to generate a video with the same
| style and content. Take that prompt and put it into Sora.
| Iterate with human and o1 generated critical responses.
|
| I suspect you can get close pretty quickly, but I don't
| know the cost. I'm also suspicious that they might have put
| in "safeguards" to prevent some high profile/embarrassing
| rip-offs.
| amelius wrote:
| Can't you just give it a photo of a dog, and then say "use
| this dog in this or that scene"?
| alpha_squared wrote:
| How would that even work? A dog has physical features
| (legs, nose, eyes, ears, etc.) that they use to interact
| with the world around them (ground, tree, grass, sounds,
| etc.). And each one of those things has physical structures
| that compose senses (nervous system, optic nerves, etc.).
| There are layers upon layers of intricate complexity that
| took eons to develop and a single photo cannot encapsulate
| that level of complexity and density of information. Even a
| 3D scan can't capture that level of information. There is
| an implicit understanding of the physical world that helps
| us make sense of images. For example, a dog with all four
| paws standing on grass is within the bounds of possibility;
| a dog with six paws, two of which are on its head, is
| outside the bounds of possibility. An image generator
| doesn't understand that obvious delineation and just
| approximates likelihood.
| int_19h wrote:
| A single photo doesn't have to capture all that
| complexity. It's carried by all those countless dog
| photos and videos in the training set of the model.
| artemisart wrote:
| Yes, the idea works and was explored with
| dreambooth/textual inversion for image diffusion models.
|
| https://dreambooth.github.io/ https://textual-
| inversion.github.io/
| minimaxir wrote:
| Both of those are of course out of date and require
| significant training instead of just feeding it a single
| image.
|
| InstantID (https://replicate.com/zsxkib/instant-id) fixes
| that issue.
| torginus wrote:
| Yeah, it almost feels like gambling - 'you're very close, just
| spend 20 more credits and you might get it right this time!'
| moralestapia wrote:
| Still three or four orders of magnitude cheaper and easier
| than producing said video through traditional methods.
| beefnugs wrote:
| AI isn't trying to sell to you, a precise artist with a
| real vision in your brain. It is selling to managers who
| want to shit out something in an evening that approximates
| anything, that writes ads that no one wants to see anyway,
| and that produces surface-level examples of how you can pay
| employees less because "their job is so easy".
| spuz wrote:
| Yes and the thing is, even for those tasks, it's incredibly
| difficult to achieve even the low bar that a typical
| advertising manager expects. Try it yourself for any real
| world task and you will see.
| cornel_io wrote:
| Counterpoint: our CEO spent 25 minutes shitting out a bunch
| of AI ads because he was frustrated with the pace of our
| advertising creative team. They hated the ads that he
| created, for the reasons you mention, but we tested them
| anyways and the best performing ones beat all of our
| "expert" team's best ads by a healthy margin (on _all_ the
| metrics we care about, from CTR to IPM and downstream stuff
| like retention and RoAS).
|
| Maybe we're in a honeymoon period where your average user
| hasn't gotten annoyed by all the slop out there and they
| will soon, but at least for now, there is real value here.
| Yes, out of 20 ads maybe only 2 outperform the manually
| created ones, but if I can create those 20 with a couple
| hundred bucks in GenAI credits and maybe an hour or two of
| video editing that process wipes the floor with the
| competition, which is several thousand dollars per ad, most
| of which are terrible and end up thrown away, too. With the
| way the platforms function now, ad creative is quickly
| becoming a volume-driven "throw it at the wall and see what
| sticks" game, and AI is great for that.
| sarchertech wrote:
| > Maybe we're in a honeymoon period where your average
| user hasn't gotten annoyed by all the slop out there and
| they will soon
|
| It's this. A video ad with a person morphing into a bird
| that takes off like a rocket with fire coming out of its
| ass, sure it might perform well because we aren't
| saturated with that yet.
|
| You'd probably get a similar result by giving a camera to
| a 5 year old.
|
| But you also have to ask what that's doing long term to
| your brand.
| mewpmewp2 wrote:
| A/B/C/D testing is the perfect grounds for that. You can
| keep automatically generating and iterating quickly while
| A/B tests are constantly being run. This data on CTR can
| later be used to train the model better as well.
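The generate-and-test loop described here is essentially a multi-armed bandit. A minimal epsilon-greedy sketch - the ad variant names and their "true" click rates are invented for illustration:

```python
# Epsilon-greedy selection over ad variants: mostly show the variant with
# the best observed CTR, but keep exploring the others 10% of the time.
import random

random.seed(0)
true_ctr = {"ad_a": 0.02, "ad_b": 0.05, "ad_c": 0.01}  # hidden ground truth
shows = {ad: 0 for ad in true_ctr}
clicks = {ad: 0 for ad in true_ctr}

def observed_ctr(ad):
    return clicks[ad] / shows[ad] if shows[ad] else 0.0

for _ in range(10_000):
    if random.random() < 0.1:               # explore: random variant
        ad = random.choice(list(true_ctr))
    else:                                   # exploit: best observed so far
        ad = max(true_ctr, key=observed_ctr)
    shows[ad] += 1
    clicks[ad] += random.random() < true_ctr[ad]  # simulate a click

best = max(true_ctr, key=observed_ctr)
print(best, round(observed_ctr(best), 3))
```

With enough impressions the loop should converge on the highest-CTR variant while wasting relatively few impressions on the losers.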
| soheil wrote:
| You seem to speak from experience of being that manager...
| I'm not going to ask what you shit out in your evenings.
| jstummbillig wrote:
| > A way to test this is to take a piece of footage or an image
| which is the ground truth, and test how much prompting and
| editing it takes to get the same or similar ground truth
| starting from scratch.
|
| Sure, if you then do the same in reverse.
| javier123454321 wrote:
| This is the conundrum of AI generated art. It will lower the
| barrier to entry for new artists to produce audiovisual
| content, but it will not lower the amount of effort required to
| make good art. If anything it will increase the effort, as it
| has to be excellent in order to get past the slop of base level
| drudge that is bound to fill up every single distribution
| channel.
| diob wrote:
| I believe it. I was just using AI to help out with some
| mandatory end of year writing exercises at work.
|
| Eventually, it starts to muck with the earlier work that it did
| good on, when I'm just asking it to add onto it.
|
| I was still happy with what I got in the end, but it took trial
| and error and then a lot of piecemeal coaxing with verification
| that it didn't do more than I asked along the way.
|
| I can imagine the same for video or images. You have to examine
| each step post prompt to verify it didn't go back and muck with
| the already good parts.
| bilsbie wrote:
| Sounds like another way of saying a picture is worth a thousand
| words.
| estebarb wrote:
| For those scenarios, a draft generation mode would be
| helpful: 16 colors, 320x200...
| hmottestad wrote:
| When I first started learning Photoshop as a teenager I often
| knew what I wanted my final image to look like, but no matter
| how hard I tried I could never get there. It wasn't that it
| was impossible, it was just that my skills weren't there
| yet. I needed a lot more practice before I got good enough to
| create what I could see in my imagination.
|
| Sora is obviously not Photoshop, but given that you can write
| basically anything you can think of I reckon it's going to take
| a long time to get good at expressing your vision in words that
| a model like Sora will understand.
| titzer wrote:
| As only a cursory user of said tools (but strong opinions) I
| felt the immediate desire to get an editable (2D) scene that I
| could rearrange. For example I often have a specific vantage
| point or composition in mind, which is fine to start from, but
| to tweak it and the elements, I'd like to edit it afterwards.
| To foray into 3D, I'd be wanting to rearrange the characters
| and direct them, as well as change the vantage point. Can it do
| that yet?
| corytheboyd wrote:
| Free text is just the fundamentally wrong input for precision
| work like this. Just because it is wrong for this doesn't
| mean it has NO purpose; it's still useful and impressive
| for what it is.
|
| FWIW I too have been quite frustrated iterating with AI to
| produce a vision that is clear in my head. Past changing the
| broad strokes, once you start "asking" for specifics, it all
| goes to shit.
|
| Still, it's good enough at those broad strokes. If you want
| your vision to become reality, you either need to learn how to
| paint (or whatever the medium), or hire a professional, both
| being tough-but-fair IMO.
| londons_explore wrote:
| I don't think it'll be long before GUI tools catch up for
| editing video.
|
| Things like rearranging things in the scene with drag'n'drop
| sound implementable (although incredibly GPU heavy)
| goldfeld wrote:
| If you use it in a utilitarian way it'll give you a run for
| your money; if you use it for expression, such as art,
| learning to embrace some serendipity, it makes good stuff.
| owenpalmer wrote:
| MKBHD's review of the new Sora release:
|
| https://www.youtube.com/watch?v=OY2x0TyKzIQ
| laweijfmvo wrote:
| Love the callout of them definitely training on his own videos
| awongh wrote:
| Interesting to see how bad the physics/object permanence is. I
| wonder if combining this with a Genie 2 type model (Google's
| new "world model") would be the next step in refining its
| capabilities.
| kranke155 wrote:
| Until these models can figure out physics, it seems to me
| they will be an interesting toy
| andybak wrote:
| They can figure out a fair bit of physics. It's not a "no
| physics" vs "physics" thing. Rather it's a "flawed and
| unreliable physics" thing.
|
| It's similar to the LLM hallucination problem. LLMs produce
| nonsense and untruths - but they are still useful in many
| domains.
| Barrin92 wrote:
| It's a pretty binary thing in the sense that "bad
| physics" pretty quickly decoheres into no physics.
|
| I saw one of these models doing a Minecraft-like
| simulation and it looked sort of okay, but then water
| started to end up in impossible places and once it was
| there it kept spreading and you ended up in some
| lovecraftian horror dimension. Any useful physics
| simulation at least needs boundary conditions to hold and
| these models have no boundary conditions because they
| have no clear categories of anything.
| kylehotchkiss wrote:
| But they don't, they just understand pixel relationships
| (right?)
| torginus wrote:
| This feels like computer graphics and the 'screen space'
| techniques that got introduced in the Xbox 360 generation -
| reflection, shadows etc. all suffered from the inability to
| work with off screen information and gave wildly bad answers
| once off screen info was required.
|
| The solution was simple - just maintain the information in
| world space, and sample for that. But simple does not mean
| cheap, and it led to a ton of redundant data (as in invisible
| in the final image) having to be kept track of.
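| To make that screen-space limitation concrete, here's a toy 1D
| sketch in Python (my own illustration, not any real engine's
| code): a "screen" buffer only stores visible samples, so a
| lookup that walks off-screen has nothing to return, while a
| world-space store keeps the redundant data and still answers.

```python
# Toy model: pixels 0-2 are on screen; the lamp at 3-4 is not.
SCREEN = {0: "wall", 1: "floor", 2: "floor"}  # visible samples only
WORLD = {**SCREEN, 3: "lamp", 4: "lamp"}      # includes off-screen data

def sample_screen_space(x):
    # Screen-space technique: fails once the lookup leaves the screen.
    return SCREEN.get(x)

def sample_world_space(x):
    # World-space technique: redundant to store, but always answers.
    return WORLD.get(x)

print(sample_screen_space(4))  # None: the off-screen lamp was discarded
print(sample_world_space(4))   # lamp
```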
| pearjuice wrote:
| Though I like the novelty of AI generated content, it kind of
| sucks that dead internet theory is becoming more and more
| prevalent.
| YouTube (and all of the web) is already being spammed with AI
| generated slop and "better" video/text/audio models only make
| this worse. At some point "generated" content will outnumber
| "real" content posted on the web, and there's no stopping
| that.
| xena wrote:
| My hope was that AI would make it easier for people to create
| new things that haven't been done before, but my fear was that
| it would just be an endless slop machine. We're living in the
| endless slop machine timeline and even genuine attempts to make
| something artistic end up just coming off as more slop.
|
| I love this timeline.
| visarga wrote:
| Even if it's made with AI, it is slop only if you don't add
| anything original in your prompt, and don't spend much time
| selecting.
|
| The real competition of any new work is the backlog of
| decades of content that is instantly accessible. Of course it
| makes all content less valuable, you can always find
| something else. Hence the race for attention and the slop
| machine. It was actually invented by the ad driven revenue
| model.
|
| We should not project onto AI something invented elsewhere.
| Even if gen AI could make original interesting works, the
| social network feeds would prioritize slop again. So the
| problem is the way we let them control our feeds.
| minimaxir wrote:
| > if you don't add anything original in your prompt
|
| Define "original". You could generate a pregnant Spongebob
| Squarepants and that would be original, but it would still
| be noise that doesn't inherently expand the creative space.
|
| > don't spend much time selecting
|
| That's the unexpected issue with the proliferation of
| generative AI now being accessible to nontechnical people.
| Most are lazy and go with the first generation that matches
| the vibe, which is the main reason why we have slop.
| 999900000999 wrote:
| Imagine a movie like Napoleon, but instead of needing 100
| million and thousands of extras, you just need 5 actors and
| maybe a budget of 50k.
|
| You could get something much more creative or historically
| accurate than whatever Hollywood deems marketable.
|
| I think about AI like any other tool. For example I make
| music using various software.
|
| Are drum machines cheating? Is electronic music computer
| slop compared to playing each instrument?
|
| Is using a Mac and a $1k mic over a $30k studio cheating?
| xena wrote:
| The main comparator is Kasane Teto and Suno. Kasane Teto is
| functionally a piano that uses generative AI for vocal
| synthesis: https://youtu.be/s3VPKCC9LSs. This is an aid to
| the creative process. Suno lets you put in a description
| and completely bypass the creative process by instantly
| getting to the end: https://youtu.be/UpBVDSJorlU
|
| Kokoro is art. Driveway is content. Art uses the medium and
| implementation to say something and convey messages.
| Content is what goes between the ads so the shareholders
| see a number increase.
|
| I wish there were more things like Kokoro and less things
| like Driveway.
| 999900000999 wrote:
| What if you're making a short movie and Driveway is playing
| in the background during a scene?
|
| It's like everything else. It's just a tool.
|
| You can create an entire movie using a high end phone
| with quality that would have cost millions 40 years ago.
| Do real movies need film?
| FergusArgyll wrote:
| It might be true for "Creators" etc., but there were things
| that I always wanted paintings of, and I have no talent,
| time, tools or anything really.
|
| When I first got access to dalle (in '22) the first thing I
| tried was to get an impressionist style painting of the way I
| always imagined Bob Dylan's 'Mr. Tambourine Man'. I
| regenerated it multiple times and got something I was very
| happy with! I didn't put it on social media, didn't try to
| make money off it, it's _for me_ .
|
| If you enjoy "art" (nice pictures, paintings, videos now I
| guess), you can create it yourself! I think people are missing
| that aspect of it, use it to make yourself happy, make
| pictures you want to look at!
| huijzer wrote:
| Put more weight on your subscriptions. I don't have much AI
| content in my YouTube suggestions. (Good luck AI generating an
| interview with Chris Lattner or Stephen Kotkin for example. It
| won't work.)
| yaj54 wrote:
| It will work within thousands of days.
| skepticATX wrote:
| I felt this same way as image generation was rapidly improving,
| but I've been caught by surprise and impressed with how
| resilient we have been in the face of it.
|
| Turns out it's surprisingly easy, at least for me, to tune out the
| slop. Some platforms will fall victim to it (Google image
| search, for one), but new platforms will spring up to take
| their place.
| tguedes wrote:
| My hope is that it will be the death of the aggregators and
| there will be more value in high quality and authentic content.
| The past 10-15 years have rewarded people who appeal to the
| aggregation algorithms and get the most views. Hopefully going
| forward there will be more organic, word-of-mouth
| recommendations of high quality content.
| ghita_ wrote:
| yeah i already have so many AI-generated videos in my feed on
| all social media it's insane. i spot them from afar for now but
| at some point i'll just be consuming content that took seconds
| to generate just to get money
| minimaxir wrote:
| No API/per video generation? Huh.
| owenpalmer wrote:
| It's probably because they're relying heavily on their new
| editing UI to make the model useful. You can cut out weird
| parts of the videos and append a newly generated portion with a
| new prompt.
| Tiberium wrote:
| OpenAI almost always waits a few months before adding new
| features or models to the API, the same happened to DALL-E 3,
| advanced voice mode, and lots of smaller model updates and
| releases.
| rvz wrote:
| That's 20+ VC-backed AI video generation startups destroyed
| in a microsecond, now scrambling to compete
| against Sora in the race to zero.
|
| Many of them will die, but may the AI slop continue anyway.
| hagbarth wrote:
| Not really a microsecond. Sora was announced months ago.
| blackeyeblitzar wrote:
| It's a race to zero margin. The people who win will have lots
| of existing distribution channels (customers) or lots of money
| or control over data. Those who innovate but don't have these
| things will be copied and run out of money eventually, as sad
| as it is. The competition between those startups and bigger
| players isn't fair.
| marban wrote:
| Not available in the EU:
| https://help.openai.com/en/articles/10250692-sora-supported-...
| blfr wrote:
| What did the EU do this time?
| mnau wrote:
| Probably this
| https://en.wikipedia.org/wiki/Artificial_Intelligence_Act,
| possibly this:
| https://en.wikipedia.org/wiki/Digital_Markets_Act
| therealmarv wrote:
| Does a VPN solve the problem? I'm living in an EU country and
| I don't like that the EU decides for me (and that companies
| like OpenAI or Meta don't give out their models to me)! I'm
| an old enough adult to decide for myself what I want...
| ionwake wrote:
| I used a Japanese ProtonVPN server. I got past the "Not in
| the EU" thing, but it said "no new signups are allowed atm".
|
| Perhaps it's just best to wait.
| ionwake wrote:
| I'm sure if OpenAI just waited the extra day or two to make
| sure it's available in the EU it wouldn't annoy everyone in
| the EU so much. Often with new releases everyone in the EU
| needs to wait a couple of days; the FOMO is not cool, bros.
| cube2222 wrote:
| There's an ongoing related livestream[0].
|
| [0]: https://youtu.be/2jKVx2vyZOY
| mhh__ wrote:
| Unless they drop something mega in the next few months can't help
| but think that openai's moat is basically gone for now at least.
| gzer0 wrote:
| For the $20/month subscription: you get 50 generations a month.
| So it is included in your subscription already! Nice.
|
| For the Pro $200/month subscription: you get unlimited
| generations a month (on a slower queue).
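| As back-of-envelope math on those tiers (the 1,000
| videos/month figure for Pro is purely an illustrative
| assumption, since "unlimited" has no fixed denominator):

```python
def cost_per_video(monthly_price_usd, videos_per_month):
    # Effective per-generation price for a flat subscription.
    return monthly_price_usd / videos_per_month

print(cost_per_video(20, 50))     # 0.4 -> $0.40 per video on Plus
print(cost_per_video(200, 1000))  # 0.2 -> $0.20 at an assumed 1,000/mo
```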
| system2 wrote:
| I wonder what I will be doing with 20 garbage videos. And this
| probably includes revisions too. It takes 10 attempts to get
| something remotely useful as an image (and that's just for a
| blog post).
| iamleppert wrote:
| Yawn, there are literally 10 different apps and wannabe startups
| that do video generation and AI videos have already flooded
| social media. This doesn't look any better than what is and has
| been already available to the masses. OpenAI announced this ages
| ago and never gave people access; now competitors have
| already captured the AI generated video for social media slop
| market.
|
| We have yet to see any kind of AI created movie, like Toy Story
| was for computer 3D animation.
|
| OpenAI isn't a player in the video AI game, but certainly has
| bagged most of the money for it already (somehow).
| keiferski wrote:
| Don't just critique - link. What other video generation tools
| have you used and recommend?
| Workaccount2 wrote:
| The subreddit /r/aivideo has tons of videos all tagged with
| what model was used to generate them.
| LanceJones wrote:
| So you're saying there is literally nothing good about Sora?
| vunderba wrote:
| From the few videos that I've seen, I would agree that it
| doesn't seem to be better than any of the major competitors
| such as Kling, Hailuo, Runway, etc.
| petercooper wrote:
| I wonder what it is about EU and UK law, in particular, that
| restricts its availability there. Their FAQs don't mention this.
|
| If it's about training models on potentially personal
| information, the GDPR (EU and UK variants) kicks in, but then
| that hasn't restricted OpenAI's ability to deploy (Chat)GPT
| there. The same applies to broader copyright regulations around
| platforms needing to proactively prevent copyright violation,
| something GPT could also theoretically accomplish. Any (planned)
| EU-specific regulations don't apply to the UK, so I doubt it's
| those either.
|
| The only thing that leaves, perhaps, is laws around the
| generation of deepfakes which both the UK and EU have laws about?
| But then why didn't that affect DALL-E? Anyone with a more
| detailed understanding of this space have any ideas?
| ilaksh wrote:
| Part of it might also be capacity problems.
| Stevvo wrote:
| It's a capacity related constraint, not a legal one.
| simicd wrote:
| Hmm, enough capacity for the rest of the world but not the EU:
|
| https://help.openai.com/en/articles/10250692-sora-
| supported-...
| MrKristopher wrote:
| A lot has changed since ChatGPT was released.
| https://en.wikipedia.org/wiki/Digital_Markets_Act wasn't in
| effect back then. Microsoft hadn't made their big investment
| yet either. OpenAI is a growing target, and the laws are
| becoming more strict, so they need to be more cautious from a
| legal perspective, and they need to consider that compliance
| with EU laws will slow down their product development.
| ChrisArchitect wrote:
| Link should be the announcement post:
| https://openai.com/index/sora-is-here/
| aspenmayer wrote:
| https://news.ycombinator.com/item?id=42368981
| dang wrote:
| Ok, we changed the URL to that from https://sora.com/ above.
| bnrdr wrote:
| The bear video is quite funny - two bear cubs merge into one
| and another cub appears out of the adult bear's leg.
| chefandy wrote:
| If you're looking for video for casual personal projects or fill-
| ins for vlog posts, or something to make your PowerPoint look
| neat, this seems like a rad tool. It has a looong way to go
| before it's taking anyone's movie VFX job.
| MyFirstSass wrote:
| Wow this is bad. And by bad i mean worse than leading open source
| and existing alternatives.
|
| Is it me or does it seem like OpenAI revolutionized with both
| ChatGPT and Sora, but they've completely hit the ceiling?
|
| Honestly a bit surprised it happened so fast!
| kranke155 wrote:
| Sora was not really that big of a revolution; it was just
| catching up with competitors. I would even say in gen video
| they are behind right now.
| pawelduda wrote:
| What is the best model in your opinion right now?
| ElectroNomad wrote:
| RunwayML
| echelon wrote:
| HunYuan by Tencent. It's 100% open source too.
| SV_BubbleTime wrote:
| Sora had some sweet cherry picked initial hype videos. That
| was more impressive than anything we could do at the time.
| Now, yeah, it's questionable if it's on par, let alone better.
| kranke155 wrote:
| Wasn't just cherry picked. The balloon kid video had a VFX
| team cleaning up the output. They've said that now.
| joe_the_user wrote:
| Bad also in the sense once you get over the "boy, it's amazing
| they can do that", you immediately think "boy, they really
| shouldn't do that".
| Banditoz wrote:
| What are some of the open source video models?
| tshaddox wrote:
| What are the leading alternatives? (Open source or otherwise)
| elorant wrote:
| MidJourney (commercial), Standard Diffusion XL
| aruametello wrote:
| > Standard Diffusion XL
|
| you probably meant Stable Diffusion XL. (autocorrect
| victim)
| amrrs wrote:
| Minimax (from China) and Kling 1.5 from China. Recently
| Tencent launched its own.
|
| You can see more model samples here:
| https://youtu.be/bCAV_9O1ioc
| ztratar wrote:
| Those look... far worse? What am I missing?
| amrrs wrote:
| Exactly, I don't know how people are saying Sora is bad. I
| know there are restrictions with humans. But with the
| storyboard and other customisations, it's definitely up
| there!
| stuckkeys wrote:
| FLUX
| vunderba wrote:
| You have to _be specific_. What's more important to you?
|
| - uncensored output (SD + LoRa)
|
| - Overall speed of generation (midjourney)
|
| - Image quality (probably midjourney, or an SDXL checkpoint +
| upscaler)
|
| - Prompt adherence (flux, DALL-E 3)
|
| EDIT: This is strictly around image generation. The main
| video competitors are Kling, Hailuo, and Runway.
| sebazzz wrote:
| SD does not generate video, does it?
| xvector wrote:
| https://stable-diffusion-art.com/animatediff/
| CryptoBanker wrote:
| It does as of recently.
| tom1337 wrote:
| Same goes with DALLE. It was cool to try it the first week or
| so but now the output is so much worse than Midjourney and
| Stable Diffusion. For me it can't even generate straight lines
| and everything looks comic-ish.
| vunderba wrote:
| DALL-E 3 image quality has always been subpar, but its prompt
| adherence is on par with FLUX. Midjourney has some of the
| worst prompt adherence, but some of the best image quality.
| amzn-throw wrote:
| To me this is just a simple artifact of size & attention.
|
| Another example of this is stuff like Bluesky. There's a lot
| of reasons to hate Twitter/X, but people going "Wow, Bluesky
| is so amazing, there's no ads and it's so much less toxic!"
| aren't complimenting Bluesky, they're just noting that it's
| smaller, has less attention, and so they don't have ads or
| the toxic masses YET.
|
| GenAI image generation is an obvious vector for all sorts of
| problems, from copyrighted material, to real life people, to
| porn, and so on. OpenAI and Google have to be extraordinarily
| strict about this due to all the attention on them, and so
| end up locking down artistic expression dramatically.
|
| Midjourney and Stable Diffusion may have equal stature
| amongst tech people, but in the public sphere they're
| unknowns. So they can get away with more risk.
| lacoolj wrote:
| If you're going to say something like this, you need to back it
| up with specific alternatives that provide a better result.
|
| Besides just citing your sources, I'm genuinely curious what
| the best ones are for this so I can see the competition :)
| echelon wrote:
| HunYuan released by Tencent [1] is much better than Sora.
| It's 100% open source, is compatible with fine tuning,
| ComfyUI, control nets, and is receiving lots of active
| development.
|
| That's not the only open video model, either. Lightricks'
| LTX, Genmo's Mochi, and Black Forest Labs' upcoming models
| will all be open source video foundation models.
|
| Sora is commoditized like Dall-E at this point.
|
| Video will be dominated by players like Flux and Stable
| Diffusion.
|
| [1] https://github.com/Tencent/HunyuanVideo/
| vlovich123 wrote:
| Something being available OSS is very different from a
| turnkey product solution, not to mention that Tencent's
| 60 GiB VRAM requirement means a setup with at least 3-4
| GPUs, which is quite rare and fairly expensive versus
| something time-shared like Sora where you pay a
| relatively small amount per video.
|
| I think the important thing is task quality and I haven't
| seen any evaluations of that yet.
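| For a rough sense of that constraint, back-of-envelope GPU
| math (the even-sharding assumption and the example card
| capacities are mine, not from the HunyuanVideo docs):

```python
import math

def gpus_needed(model_vram_gib, per_gpu_vram_gib):
    # Minimum number of cards to hold the stated memory footprint,
    # optimistically assuming it shards evenly across them.
    return math.ceil(model_vram_gib / per_gpu_vram_gib)

# The ~60 GiB figure mentioned above:
print(gpus_needed(60, 24))  # 3 x 24 GiB consumer cards
print(gpus_needed(60, 80))  # 1 x 80 GiB datacenter card
```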
| wslh wrote:
| Could it be that text sources are more plentiful, and denser
| to train on, than videos and images?
| torginus wrote:
| My working theory is that OpenAI is the 'moonshot' kind of
| company full of super smart researchers who like tackling hard
| problems, but no time or patience for things like 'how do
| we create a UX people actually want to use', which actually
| requires a ton of painful back-and-forth and thoughtful design
| work.
|
| This is not a problem as long as they do the ChatGPT thing, and
| sell an API and let others figure out how to build a UX around
| it, but here they seem to be gunning for creating a boxed
| product.
| rushingcreek wrote:
| As there was no mention of an API for either Sora or o1 Pro, I
| think this launch further marks OpenAI's transition from an
| infrastructure company to a product company.
| metzpapa wrote:
| It seems like they're going that direction - especially the
| way they set up the Sora interface. It feels like it's
| nearing a video editing product.
| colesantiago wrote:
| Hollywood's days are numbered.
|
| If you are a creative in this industry, start preparing to
| transition to another industry or adapt.
|
| Your boss is highly likely to be toying around with this.
|
| The first entirely AI generated film (with Sora or other AI video
| tools) to win an Oscar will be less than 5 years away.
| smithcoin wrote:
| > entirely
|
| What would you like to wager on this?
| do_not_redeem wrote:
| I'd take that bet at 10:1 odds.
| onlyrealcuzzo wrote:
| I'd be careful.
|
| OpenAI could be a big enough bubble in less than 5 years to
| buy the Oscar winner, even if the film is terrible.
|
| Also, OP only said "an Oscar".
|
| The Oscar committee could easily get themselves hyped enough
| on the AI bubble, to create an AI Oscar Film award.
|
| No one said anything about making a "good" movie.
| mdp2021 wrote:
| > _OP only said "an Oscar"_
|
| ...For soundtrack. (Sorry.)
|
| But seriously: like the democratization which made music
| production cheap brought some interesting or commercially
| successful endeavours, the increased effort from people who
| could not bring their dreams to reality because of the
| basic constraint of budget will probably bring some very
| good results, even anthology-worthy - and lots of trash.
| whynotminot wrote:
| Nothing I'm seeing here looks like it's going to destroy
| Hollywood.
|
| I could see this tool _maybe_ being used for generating
| establishing shots (generate a sweeping drone shot of a
| lighthouse looking out over a stormy sea), but then the actual
| talent work in a scene will be way more sensitive. The little
| details matter so much, and this feels so far from getting all
| of that right.
|
| Sure, this is the worst it will ever be, things will improve,
| etc, but if we've learned anything with AI, it's that the last
| mile is often the hardest.
| ALittleLight wrote:
| I'm not sure the little details are enough of a moat.
| Consider TikTok - people use cheap "special effects" to get
| the message across, e.g. if a man is playing a woman he might
| drape a towel over his head - it's silly and low quality but
| it gets the idea across to the viewer. Think too about
| programs like Archer or South Park that have (stylistically)
| low quality animation but still huge fan bases.
|
| What I think this will unlock, maybe with a bit of
| improvement, is low quality video generation for a vast
| number of people. Do you have a short film idea? Know people
| with some? Likely millions of people will be able to use this
| to put together _good enough_ short films - that yes, have
| terrible details, but are still good enough to watch. Some of
| those millions of newly enabled videos will have such strong
| ideas or writing behind them that it will make up for, or
| capitalize on, the weak video generation.
|
| As the tools become easier, cheaper, faster, better etc more
| and more hobbyists will pick them up and try to use them. The
| user base will encourage the product to grow, and it will
| gradually consume film (assuming it can reach the point of
| being as or nearly as good as modern special effects).
|
| I think of it like - when Steven Spielberg was young he used
| an 8mm camera, not as good as professional film equipment in
| the day, but good enough to create with. If I were a high
| school student interested in film I would absolutely be using
| stuff like this to create.
| whynotminot wrote:
| > What I think this will unlock, maybe with a bit of
| improvement, is low quality video generation for a vast
| number of people. Do you have a short film idea? Know
| people with some? Likely millions of people will be able to
| use this to put together good enough short films - that
| yes, have terrible details, but are still good enough to
| watch.
|
| Sure, this is already happening on Reels, Tik Tok, etc.
| People are ok with low quality content on those platforms.
| Lazy AI will undoubtedly be more utilized here. But I don't
| think it's threatening Hollywood (well, aside from slowly
| destroying people's attention spans for long form content,
| but that's a different debate). People will still want high
| quality entertainment, even if they can also be satisfied
| with low fidelity stuff too.
|
| I think this has always been true -- think the difference
| between made for TV CGI and big-budget Hollywood movie CGI.
| Expectations are different in different mediums.
|
| This current product is not good enough for Hollywood. As
| long as people have some desire for Hollywood level
| quality, this will not take those jobs.
|
| The big caveat here is "yet" -- when does this get good
| enough? And this is where my skepticism comes in, because
| the last mile is the hardest, and getting things _mostly
| right_ isn't really good enough for high quality content.
| (Remember how much the internet lost it over a Starbucks
| cup in Game of Thrones?)
|
| The other caveat is maybe that our minds melt into
| stupidity to the point that we only watch things in low
| fidelity 10-second clips that AI can capably run amok
| with. In which case I don't really think AI actually takes
| over Hollywood so much as Hollywood -- effectively high
| fidelity long form content -- just ceases to exist
| altogether. That is the sad timeline.
| rideontime wrote:
| The day that 90 minutes of 3-second dolly shots wins an Oscar
| is the day cinema dies.
| rsynnott wrote:
| ... Have you _seen_ the output from these things? I'm not sure
| actors need to panic just yet.
| qilthewise wrote:
| I mean thats a bold claim. I'd first let chatgpt win an Oscar
| for writing the best screenplay, and only then would Sora come
| into the picture.
| n144q wrote:
| If you are ok with physics that is completely wrong, camera
| angles that just don't feel right, strange light effects, and
| all other kinds of distorted images/videos, maybe Hollywood is
| doomed. But I don't see that happening.
|
| A reminder: as advanced as CGI is today, lots and lots of movie
| are still based on (very expensive) real-life scenery or
| miniature sets (just two of many examples), because they are
| far, far more realistic than what you get out of computers.
| neom wrote:
| In July I made this 3 minute little content marketing video for
| Canada Day. Took me about 40 minutes using a combo of midjourney
| + pika, suno for the music. Honestly I had a lot of fun making
| it, I can see these tools will be fun for creative teams to
| hammer out little things for social media and stuff:
| https://x.com/ascent_hi/status/1807871799302279372
|
| I don't see sora being THAT much better than pika now that I'm
| trying both, except that it's included in my openai subscription,
| but I do think people who do discrete parts of the "modal stack"
| are going to be able to compete on their merits (be it pika for
| vid or suno for music etc)
| pen2l wrote:
| Every day that passes I grow fonder of Google's decision to delay
| or otherwise keep a lot of this under wraps.
|
| The other day I was scrolling down on YouTube shorts and a couple
| videos invoked an uncanny valley response from me (I think it was
| a clip of an unrealistically large snake covering some hut) which
| was somehow fascinating and strange and captivating, and then
| scrolling down a few more, again I saw something kind of
| "unbelievable"... I saw a comment or two saying it's fake, and
| upon closer inspection: yeah, there were enough AI'esque
| artifacts that one could confidently conclude it's fake.
|
| We'd known about AI slop permeating Facebook -- usually a Jesus
| figure made out of unlikely set of things (like shrimp!) and we'd
| known that it grips eyeballs. And I don't even know in which box
| to categorize this, in my mind it conjures the image of those
| people on slot machines, mechanically and soullessly pulling
| levers because they are addicted. It's just so strange.
|
| I can imagine now some of the conversations that might have
| happened at Google when they choose to keep a lot of innovations
| related to genAI under wraps (I'm being charitable here about
| their motives), and I can't help but agree.
|
| And I can't help but be saddened by OpenAI's decision to
| unload a lot of this before reckoning with the results of
| unleashing it on humanity, because I'm almost certain it'll
| be used more for bad things than good, and that its
| application to bad things will secure more eyeballs than
| its application to good ones.
| quenix wrote:
| It saddens me. Innovations in AI 'art' generation (music,
| audio, photo) have been a net negative to society and are
| already actively harming the Internet and our media sphere.
|
| Like I said in another comment, LLMs are cool and useful, but
| who in the hell asked for AI art? It's good enough to fool
| people and break the fragile trust relationship we had with
| online content, but is also extremely shit and carries no
| meaning or depth whatsoever.
| mojuba wrote:
| I think AI "art" can be as useful as the text generators,
| i.e. only within certain limits of dull and stupid stuff that
| needs to exist but has little to no value.
|
| For example, you need to generate a landing page for your
| boring company: text, images, videos and the overall design
| (as well as code!) can be and should be generated because...
| who cares about your boring company's landing page, right?
| carlosjobim wrote:
| Then you don't understand the purpose of a landing page. If
| the boring company hires somebody to make the landing page
| who actually understands their job, the landing page will
| have great importance.
| whatevertrevor wrote:
| One could ask why the boring company landing page exists in
| the first place though. If it's not providing value to
| humans to warrant actual attention being paid to it...
| computerex wrote:
| How do you know they are a net negative? What's your source?
| quenix wrote:
| My opinion ;-)
|
| That's what HN is for
| randomlurking wrote:
| I agree with the first part. For me, AI art is the chance to
| have a somewhat creative outlet that I wouldn't have
| otherwise, because I'm much worse at painting that I can
| stand. Drawing by prompts helps me be creative and work
| through some stuff - for that it's also nice and interesting
| to see that the result differs from my mental image. I will
| tweak the prompt to some extent and to some extent go with
| some unintended elements of the drawing. I keep the
| drawing on my phone in the notes app with a title and the
| prompt.
|
| To get back to the beginning: I really do agree that the
| societal impact on the whole appears to be negative. But
| there are some positives and I wanted to share my example of
| that.
| lelandfe wrote:
| I saw my first AI video that completely fooled commenters:
| https://imgur.com/a/cbjVKMU
|
| This was not marked as AI-generated and commenters were in awe
| at this fuzzy train, missing the "AIGC" signs.
|
| I'm quite nervous for the future.
| dagmx wrote:
| Most people have terrible eyes for distinguishing content.
|
| I've worked in CG for many years and despite the online nerd
| fests that decry CG imagery in films, 99% of those people
| can't tell what's CG or not unless it's incredibly obvious.
|
| It's the same for GenAI, though I think there are more tells.
| Still, most people cannot tell reality from fiction. If you
| just tell them it's real, they'll most likely believe it.
| starshadowx2 wrote:
| The face of the girl on the left at the start in the first
| second should have been a giveaway.
| Perseids wrote:
| My intuition went for _video compression artifact_ instead
| of _AI modeling problem_. There is even a moment directly
| before the cut that can be interpreted as the next key
| frame clearing up the face. To be honest, the whole video
| could have fooled me. There is definitely an aspect in
| discerning these videos that can be trained just by
| watching more of them with a critical eye, so try to be
| kind to those that did not concern themselves with
| generative AI as much as you have.
| vlovich123 wrote:
| Hard to not discount that as a compression artifact.
| booleandilemma wrote:
| No one is looking at her face though, they're looking at
| the giant hello kitty train. And you were only looking at
| her face because you were told it's an AI-generated video.
| I agree with superfrank that extreme skepticism of
| everything seen online is going to have to be the default,
| unfortunately.
| tim333 wrote:
| Also "HELLO KITTY" being backwards is odd - writing on
| trains doesn't normally come out like that eg
| https://www.groupe-sncf.com/medias-
| publics/styles/crop_1_1/p...
| superfrank wrote:
| I know there are people acting like it's obvious that this
| is AI, but I get why people wouldn't catch it, even if they
| know that AI is capable of creating a video like this.
|
| A) Most of the giveaways are pretty subtle and not what
| viewers are focused on. Sure, if you look closely the fur
| blends in with the pavement in some places, but I'm not going
| to spend 5 minutes investigating every video I see for hints
| of AI.
|
| B) Even if I did notice something like that, I'm much more
| likely to write it off as a video filter glitch, a weird
| video perspective, or just low quality video. For example,
| when they show the inside of the car, the vertical handrails
| seem to bend in a weird way as the train moves, but I've seen
| similar things from real videos with wide angle lenses.
| Similar thoughts on one of the bystander's faces going
| blurry.
|
| I think we just have to get people comfortable with the idea
| that you shouldn't trust a single unknown entity as the
| source of truth on things because everything can be faked.
| For insignificant things like this it doesn't matter, but for
| big things you need multiple independent sources. That's
| definitely an uphill battle and who knows if we can do it,
| but that's the only way we're going to get out the other side
| of this in one piece.
| lanthissa wrote:
| Exactly one lab has passed the test of morals vs. profit at
| this point, and that's DeepMind, and they were thoroughly
| punished for it.
|
| Every value OpenAI has claimed to have hasn't lasted a
| millisecond longer than there was profit motive to break it,
| and even Anthropic is doing military tech now.
| dmix wrote:
| LLMs aren't AGI
| halyconWays wrote:
| Your comment is AI.
| thr3000 wrote:
| So is yours! Mine isn't, however. I am a hard-nosed real boy
| now.
| bko wrote:
| I don't think Google delayed or kept this under wraps for any
| noble reasons. I think they were just disorganized as evidenced
| by their recent scrambling to compete in this space.
| mrcwinn wrote:
| Too charitable indeed. Google was simply unprepared and has
| inferior alternatives.
|
| My prediction is that next year they will catch up a bit and
| will not be shy about releasing new technology. They will
| remain behind in LLMs but at least will more deeply envelop
| their own existing products, thus creating a narrative of
| improved innovation and profit potential. They will publicly
| acknowledge perceived risks and say they have teams ensuring it
| will be okay.
| raincole wrote:
| Considering google image search is polluted by AI-generated
| images at this moment, perhaps google is afraid of making the
| search even worse?
| computerex wrote:
| They should have kept this amazing tech under wraps because
| you have a bad feeling about it? Hate to break it to you, but
| there have been fake videos on the internet ever since it has
| existed. There are more ways to fake videos than GenAI. If you
| haven't been consuming everything on the internet with a high
| alert bs sensor, then that's an issue of its own. You shouldn't
| trust things on the internet anyway unless there is
| overwhelming evidence.
| sergiogdr wrote:
| > If you haven't been consuming everything on the internet
| with a high alert bs sensor, then that's an issue of its own
|
| "just be privileged as I was to get all the necessary
| education to be able to not be fooled by this tech". Yeah,
| very realistic and compassionate.
| cma wrote:
| With a heavy dose of "if masses of people are fooled by
| this, it can't affect me as long as I can see through it.
| No possible repercussions of mass people believing
| completely made up stuff that could affect laws, etc."
| pier25 wrote:
| I wish Google would allow me to remove the AI stuff from search
| results.
|
| 99% of the time it's either useless or wrong.
| fraXis wrote:
| Add a -ai to the end of your Google search query. There are
| also browser extensions that stop the AI content from
| displaying. I use the one for Chrome called "Remove Google
| Search Generative AI".
| titzer wrote:
| Strong plus one here. Not only that, but it uses _gobs_ of
| energy in total. Google has reneged on all of its carbon
| promises to stay in the running for AI domination and to head
| off disruption to the search ads business. Since I've
| unconsciously trained my brain to not look at the top search
| results anymore because they long ago turned into impossible-
| to-distinguish ads, I've quickly learned to just ignore the
| stupid AI summary. So it's an absurd waste of computational
| power to generate something wrong that I don't even want to
| see, and I can't even tell them to stop when they're wasting
| their own money to do so.
| makestuff wrote:
| I don't even know if this will be possible, or how it would
| work, but it seems like the next iteration of social media will
| be based on some verification that the user is not using AI or
| is a bot. Currently they are all incentivized to not stop bot
| activity because it increases user counts, ad revenue, etc.
|
| Maybe the model is you have to pay per account to use it, or
| maybe the model will be something else.
|
| I doubt this will make everyone just go back to primarily
| communicating in person/via voice servers but that is a
| possibility.
| mnau wrote:
| So Musk was right?
| joaohaas wrote:
| Twitter Blue is paid and yet every single bot account has it
| in order to boost views.
| kylehotchkiss wrote:
| > the image of those people on slot machines, mechanically and
| soullessly pulling levers because they are addicted. It's just
| so strange.
|
| Worse, the audience is our parents and grandparents. They have
| little context with which to sort reality from this stuff.
| ren_engineer wrote:
| forced to finally release after that new open source model came
| out that was equal or better?
| remoquete wrote:
| Ah, yes. We definitely needed another bad dreams generator.
| andybak wrote:
| Interesting creative people will produce interesting creative
| output.
|
| People with no taste will produce tasteless content.
|
| The mountain of slop will grow.
|
| And some of us have no intention of publishing any output
| whatsoever but just find the existence of these tools
| fascinating and inspiring.
| remoquete wrote:
| While it's indeed fascinating, part of me finds the sheer
| energy expenditure to be problematic, not to mention the
| "Hollywood is dead" innuendos.
| gburdell3 wrote:
| And some of us have no intention of publishing any output and
| find the existence of these tools extremely worrying and
| problematic.
| crakhamster01 wrote:
| Interesting creative people are currently creating
| interesting output _without_ generative AI.
|
| These tools are fascinating, though I can't help but feel
| that the main beneficiary after all is said and done will be
| venture capitalists and tech/entertainment execs.
| LeoPanthera wrote:
| This seems pretty broken at the moment, I haven't actually
| managed to create a video, every prompt results in "There was an
| unexpected error running this prompt".
| ilaksh wrote:
| I can't even sign up. I assume it's a capacity issue.
| knicholes wrote:
| At least you get to even see the page! I'm seeing "Sign ups are
| temporarily unavailable We're currently experiencing heavy
| traffic and have temporarily disabled sign ups. We're working
| to get them back up shortly so check back soon."
| tacticalturtle wrote:
| Something about that image of the spinning coffee cup with
| sailing ships is giving me severe trypophobia:
|
| https://en.m.wikipedia.org/wiki/Trypophobia
|
| It's like a spider's eyes... and also not what I would expect a
| latte to look like.
| throw4321 wrote:
| One of the problems with a 10-month preannouncement is that the
| competition is ready to trash the actual announcement. Half an
| hour in, I already see half a dozen barely-concealed posts
| ranging from downplays to over-demands to non-user criticism.
| rtsil wrote:
| I think that's just the typical HN cynicism.
| abenga wrote:
| I don't think you can reduce it to this. Even on the showcase
| videos on the home page I can see weird artifacts, like the
| red car in the video of the guy walking through a market. The
| car is driving on a pedestrian walkway (through pedestrians),
| and just suddenly disappears from one frame to the next.
| manquer wrote:
| Also the tennis player walking through the net
| minimaxir wrote:
| The competition is ready to trash the announcement because the
| 10-month delay gave rise to several viable competitors, and
| that would still be the case if OpenAI never did the
| preannouncement. If OpenAI released Sora 10 months ago, there
| wouldn't be as much cynicism.
| madihaa wrote:
| Account creation currently unavailable
| topherjaynes wrote:
| Yeah, just hit that too. Did anyone get in or did they get
| overwhelmed?
| HaZeust wrote:
| Doesn't look like you will.
|
| "We're currently experiencing heavy traffic and have
| temporarily disabled sign ups. We're working to get them back
| up shortly so check back soon."
| zb3 wrote:
| no API = not good enough
|
| no pay per use = overpriced
| zb3 wrote:
| not available in the EU = might use everything you did there
| against you, sell that data to the highest bidder
| ulrischa wrote:
| It will not be available in the EU for now. I always feel
| disadvantaged when I read that sentence.
| simicd wrote:
| And in the UK and Switzerland unfortunately
|
| https://help.openai.com/en/articles/10250692-sora-supported-...
| hmmm-i-wonder wrote:
| I'm not in the EU, but when I see something that is US only, I
| tend to assume it's doing something with privacy/user
| data/otherwise that is restricted in the EU.
|
| Which means I generally avoid things that are not EU available
| even if they are available to me. It's not 100% but it's a fairly
| decent measure of how much companies care about users to ensure
| they meet EU privacy laws from the start, vs if they provide
| some limited version or delayed version to the EU.
| xvector wrote:
| It's really just because it's expensive as fuck and ungodly
| complicated to ship _literally anything_ to the EU, so a lot
| of us in big tech have just given up on it.
|
| You make a small mistake and they call you evil and hit you
| with a $1B fine. Or you don't make a mistake but they make up
| some bullshit reason to fine you anyways, and fund the
| government coffers.
|
| It's just not worth it. And every day the EU becomes worth
| less. They will miss out on the AI revolution like it missed
| out on the mobile revolution. And they can only miss out on
| so many industrial revolutions before they fade away.
| Whatever, it's their problem :shrug:
|
| When AGI is finally discovering new therapies, we'll be able
| to measure how much the EU slowed down innovation in AI and
| the cost in lives. It will be around 150k lives for _every
| day_ the EU delayed progress. I'm sure some people will find
| a way to rationalize that as being okay. Future generations
| certainly won't.
| sksrbWgbfK wrote:
| I wonder how all those European companies are doing it.
| They ship everything all the time, avoid the billion-dollar
| fines, yet make mistakes like everybody else.
|
| > how much the EU slowed down innovation
|
| You say this all the time, yet we're doing fine. How come?
| mhh__ wrote:
| Are you saying you're not glad that the EU has chosen for you?
|
| I would ask an AI to generate a riff on a "I am the very model
| of a modern major general" but for some EU bureaucrat but I'll
| spare you the spam.
| joshstrange wrote:
| OpenAI is a masterclass in pissing off paying customers.
|
| I'm just about ready to cancel my ChatGPT subscription and move
| fully over to Claude because OpenAI has spit in my face one too
| many times.
|
| I'm tired of announcements of things being available only to find
| out "No, they aren't" or "It's rolling out slowly" where "slowly"
| can mean days, weeks, or months (no exaggeration).
|
| I'm tired of shit like this: Sign ups are
| temporarily unavailable We're currently experiencing
| heavy traffic and have temporarily disabled sign ups. We're
| working to get them back up shortly so check back soon.
|
| Sign up? I'm already signed up, I've had a paid account for a
| year now or so.
|
| > We're releasing it today as a standalone product at Sora.com to
| ChatGPT Plus and Pro users.
|
| No you aren't, you might be rolling it out (see above for what
| that means) but it's not released, I'm a ChatGPT Plus user and I
| can't use it.
| EliBullockPapa wrote:
| I really don't think it's reasonable to expect them to onboard
| what is likely tens of thousands of sign ups in the first hour.
| minimaxir wrote:
| ChatGPT has far, far more concurrent users than tens of
| thousands. Sora is not a small hobby project by an amateur
| hacker that blew up.
| joshstrange wrote:
| I don't disagree, what I'm asking for is "truth in
| advertising". I'm not saying they need to give everyone
| access on day 1, I'm saying don't _say_ you've given everyone
| access if you haven't.
| zlies wrote:
| Is there information when it will be available in other
| countries, like Germany for example?
| null_investor wrote:
| I hope somebody pays for 100,000 Pro subscriptions and uses AI
| to request Sora to generate videos 24/7. Maybe Elon?
|
| Even if they use queues, I'm sure they are running at a loss and
| the GPU time is going to cost 100x more than what they charge.
|
| Creating false demand for AI can easily bankrupt their business,
| as they will believe people actually want to use that crap for
| that purpose.
| minimaxir wrote:
| Deliberately wasting electricity isn't exactly a moral win.
| toasteros wrote:
| Generative AI is a waste of electricity by definition.
| mdp2021 wrote:
| > _by definition_
|
| "Definition" does not mean "...plus your own assumptions".
|
| The results are there. Optimal, no; somehow valuable, yes.
| adultsthroaway wrote:
| Genuinely curious who is doing this for adult content?
|
| Complaints about Sora's quality and prompt complexity are
| likely not as important to auteurs in that category,
| especially with the ability to load a custom character, etc.
| minimaxir wrote:
| Sora (along with DALL-E 2 well before it) specifically has
| safeguards against NSFW content.
| fosterfriends wrote:
| Anyone else feeling their servers melt a bit on sora.com?
| seydor wrote:
| Even in the mammoth demo, the dust clouds keep popping up
| behind them even after they have moved forward.
| Imnimo wrote:
| I feel like there is a sweet spot for AI generation of images and
| videos that I would describe as "charmingly bad", like the stuff
| we got from the old CLIP+VQGAN models. I feel like Sora has
| jumped past that into the valley of "unappealingly bad".
| halyconWays wrote:
| I think that's why humor and memes are such good targets for
| this type of stuff. If you look up videos like "luma memes
| compilation," it takes well-known memes and distorts them in
| uncanny, freaky, and bizarre ways. Yet the fact the original
| subject is a meme somehow bypasses the uncanny valley
| repulsion. We seem to accept that much more readily, for
| whatever reason.
| ilaksh wrote:
| This is actually a different version from what they had before.
| What they released today is Sora Turbo.
| therein wrote:
| Account creation not available. Login to see more videos.
|
| Classic OpenAI. I don't care, there are so many better
| alternatives to everything they do. Funny how quickly they have
| become irrelevant and lost their moat.
| natvert wrote:
| anyone done a comparison with the open-source hunyuanvideoai.com?
| tetris11 wrote:
| 0 stars, and no comments the last time this was posted. Maybe
| too good to be true?
| natvert wrote:
| I've run hunyuanvideoai from their GitHub and it seems to
| generate realistic videos. It is a bit slow (30-60min per
| video clip) and requires ~50GB VRAM. I wonder how the quality
| compares though?
| natvert wrote:
| oh, and the output videos that are generated are 5sec clips
| at 554x960px. this is on a single A6000
| tetris11 wrote:
| It looks good, I'm just wondering why it has no attention
| from the ML community
| echelon wrote:
| That's not the upstream source. You're looking for this:
|
| https://github.com/Tencent/HunyuanVideo/
|
| This isn't "too good to be true" - this is the holy grail.
| Hunyuan is set to become the Flux/Stable Diffusion of AI
| video.
|
| I don't see how Hunyuan doesn't completely kill off Sora.
| It's 100% open source, is rapidly being developed for
| consumer PCs, can be fine tuned, works with ComfyUI/other
| tools, and it has control nets.
| dartos wrote:
| The most impressive part is the temporal consistency in the demo
| videos.
|
| The flower one is the best looking.
| tetris11 wrote:
| That cat skateboarding off the path cut out just when it was
| getting interesting.
|
| Many of these likely fall apart just split seconds after the
| cut.
| dartos wrote:
| I don't doubt it, but even 60 seconds of temporal consistency
| is an improvement, even if it's incremental.
| sjm wrote:
| Anyone else find this stuff extremely distasteful? "Disrupting"
| creativity and art feels like it goes against our humanity.
| ganzuul wrote:
| It is like an attempt to do psychic battle over the meaning of
| "disruption".
| ronsor wrote:
| I'm glad someone else said this. Hopefully we can get rid of
| that terrible disruptive camera too.
| quenix wrote:
| The past few years' innovation in AI has roughly been split
| into two camps for me.
|
| LLMs -- Awesome and useful. Disruptive, and somewhat dangerous,
| but probably more good than harm if we do it right.
|
| 'Generative art' (i.e. music generation, image generation,
| video generation) -- Why? Just why?
|
| The 'art' is always good enough to trick most humans at a
| glance but clearly fake, plastic, and soulless when you look a
| bit closer. It has instilled somewhat of a paranoia in me when
| browsing images and genuinely worsened my experience consuming
| art on the internet overall. I've just recently found out that
| a jazz mix I found on YouTube and thought was pretty neat is
| fully AI generated, and the same happens when I browse niche
| artstyles on Instagram. Don't get me started on what this Sora
| release will do...
|
| It changed my relationship with consuming art online in general.
| When I see something that looks cool on the surface, my
| reaction is adversarial, one of suspicion. If it's recent, I
| default to assuming the piece is AI, and most of the time I
| don't have time or effort to sleuth the creator down and check.
| It's only been like a year, and it's already exhausting.
|
| No one asked for AI art. I don't understand why corporations
| keep pushing it so much.
| WXLCKNO wrote:
| I understand your take but it's only going to get better and
| incredibly fast.
|
| I'm a huge film nerd and I can only dream of a future where I
| could use these type of tools (but more advanced) to create
| short films about ideas I've had.
|
| It's very exciting to me
| mirsadm wrote:
| I somehow doubt it's (lack of) technology that's stopping
| you from creating your ideas.
| huehehue wrote:
| There's this FinTech ad on the NYC subway right now. I can't
| remember the company, but the entire ad is just a picture of
| a guitar and some text.
|
| Anyway, the guitar is AI generated, and it's really bad.
| There are 5 strings, which morph into 6 at the headstock.
| There's a trem bar jammed under the pickguard, somehow.
| There's a randomly placed blob on the guitar that is supposed
| to be a knob/button, but clearly is not. The pickups are
| visually distorted.
|
| It's repulsive. You're trying to sell me on something, why
| would you put so little effort into your advertising? Why
| would you not just...take a picture of a real guitar? I so
| badly want to cover it up.
| DebtDeflation wrote:
| Just need to add a hand with 6 fingers strumming it and it
| could be a meme.
| wumeow wrote:
| Reminds me of the new Coca Cola Christmas ad which is
| equally off-putting.
| imiric wrote:
| > You're trying to sell me on something, why would you put
| so little effort into your advertising? Why would you not
| just...take a picture of a real guitar?
|
| Is this not evident? Because using AI is much cheaper and
| faster. Instead of finding the right guitar, paying for a
| good photographer, location, decoration, and all the
| associated logistics, a graphics designer can write a
| prompt that gets you 90% of the vision, for orders of
| magnitude less cost and time. AI is even cheaper and faster
| than using stock images and talented graphic designers,
| which is what we've been doing for the past few decades.
|
| All our media channels, in both physical and digital
| spaces, will be flooded with this low-effort AI garbage
| from here on out. This is only the beginning. We'll need to
| use aggressive filtering and curation in order to find
| quality media, whether that's done manually by humans or
| automatically by other AI. Welcome to the future.
| huehehue wrote:
| I was able to find a similar public domain image in all
| of 5 seconds, so neither faster nor cheaper in this case.
|
| In fact, it's not hard to imagine people using AI tools
| even if they're slower, more expensive, and yield worse
| quality results in the long run.
|
| "When all you have is a hammer...".
| imiric wrote:
| I don't understand why you see a distinction between models
| that generate text, and those that generate images, video or
| audio. They're all digital formats, and the technology itself
| is fairly agnostic about what it's actually generating.
|
| Can't text also be considered art? There's as much art in
| poetry, lyrics, novels, scripts, etc. as in other forms of
| media.
|
| The thing is that the generative tech is out of the bag, and
| there's no going back. So we'll have to endure the negative
| effects along with the positive.
| quenix wrote:
| Simple: I am equally offput when LLMs are used for
| generating poetry, lyrics, novels, scripts, etc. _I don't
| like it when low-effort generated slop is passed off as
| art_.
|
| I just think that LLMs have genuine use for non-artistic
| things, which is why I said it's dangerous but may be
| useful if we play our cards right.
| imiric wrote:
| I see. Well, I agree to an extent, but there's no clear
| agreement about what constitutes art with human-generated
| works either. There are examples of paintings where the
| human clearly "just" slapped some colors on a canvas, yet
| they're highly regarded in art circles. Just because
| something is low-effort doesn't mean it's not art, or
| worthy of merit.
|
| So we could say the same thing about AI-generated art.
| Maybe most of it is low-effort, but why can't it be
| considered art? There is a separate topic about human
| emotion being a key component these generated works are
| missing, but art is in the eyes of the beholder, after
| all, so who are we to judge?
|
| Mind you, I'm merely playing devil's advocate here. I
| think that all of this technology has deep implications
| we're only beginning to grapple with, and art is a small
| piece of the puzzle.
| quenix wrote:
| You make a good point. I'm just spitballing here, but I
| think what sets generative art apart for me is the
| element of _deception_.
|
| I'd be perfectly fine with a hypothetical world in which
| all generated art is clearly denoted as such. Like you
| said, art is in the eyes of the beholder. I welcome a
| world in which AI art lives side-by-side with traditional
| art, but clearly demarcated.
|
| Unfortunately, the reality is very different.
|
| AI art inherently tries to pass itself off as if it were
| made by a human. The result of the tools released in the past
| year is that my relationship with media online has become
| adversarial. I've been tricked in the past by AI music
| and images which were not labelled as such, which fosters
| a sort of paranoia that just isn't there with the
| examples you mentioned.
| shombaboor wrote:
| the offensive part is that it's creative theft: digesting
| other people's creative works, then reworking and
| regurgitating them. It's 'fine' when it's technical
| documentation and reference work, but that's not human
| expression.
| doug_durham wrote:
| So pre-LLM, were you offended when someone posted their
| personal poetry or artwork on the internet if it was clear
| they had put little effort into it? Somehow I doubt it.
| l33tbro wrote:
| Wish it was just generative AI for me.
|
| You don't have the same paranoia with LLMs? So often I find
| myself getting a third of the way into reading an article or
| blog post and think: "wait a minute...".
|
| LLM tone is so specific and unrealistic that it completely
| disengages me as a reader.
| PartiallyTyped wrote:
| I have found a channel that curates and cleans some AI
| generated music. I really enjoy it; it's nothing I've heard
| before, it's unique, distinct, and devoid of copyright.
| moralestapia wrote:
| "And then everyone clapped ..."
|
| There's nothing wrong with technology going forward and this
| doesn't go against "creativity and art", to the contrary, it
| will enhance it.
| TheAlchemist wrote:
| That's the optimistic version, and in theory I would agree -
| it will be a great enhancer of creativity for some people.
|
| But mostly it will end up like smartphones - we carry more
| computing power in our pockets than was used to send man to
| the moon, and instead of taking advantage of it to do great
| things, we are glued to these small screens several hours a
| day scrolling social media nonsense. It's just human nature.
| tim333 wrote:
| There's some of that but it produces some cool stuff too. I
| mean you have these new virtual worlds like this that didn't
| exist before https://youtu.be/y_4Kv_Xy7vs?t=13
|
| The video there is kind of a combination of human design and AI
| which produces something beyond that which either would come up
| with on their own.
| simonw wrote:
| I got lucky and got in moments after it launched, managed to get
| a video of "A pelican riding a bicycle along a coastal path
| overlooking a harbor" and then the queue times jumped up (my
| second video has been in the queue for 20+ minutes already) and
| the https://sora.com site now says "account creation currently
| unavailable"
|
| Here's my pelican video:
| https://simonwillison.net/2024/Dec/9/sora/
| ByThyGrace wrote:
| Did you notice the frame rate (so to speak) of what's happening
| down at the lake is much lower than the pelican's bicycle
| animation?
| vletal wrote:
| Image details: 9/10
| Animation: 3/10
| Temporal consistency: 2/10
|
| Verdict: 4/10
| rjtavares wrote:
| One of the highlights of any model release for me is checking
| your "pelican riding a bicycle" test.
| echelon wrote:
| For those who can't try Sora out, Tencent's super recent
| HunYuan is 100% open source and outperforms Sora. It's
| compatible with fine tuning, ComfyUI development, and is
| getting all manner of ControlNets and plugins.
|
| I don't see how Sora can stay in this race. The open source
| commoditization is going to hit hard, and OpenAI probably
| doesn't have the product DNA or focus to bark up this tree too.
|
| Tencent isn't the only company releasing open weights. Genmo,
| Black Forest Labs, and Lightricks are developing completely
| open source video models as well.
|
| Even if there weren't open source competitors, there are a
| dozen closed source foundation video companies: Runway, Pika,
| Kling, Hailuo, etc.
|
| I don't think OpenAI can afford to divert attention and win in
| this space. It'll be another DALL-E vs. Midjourney, Flux,
| Stable Diffusion.
|
| https://github.com/Tencent/HunyuanVideo
|
| https://x.com/kennethlynne/status/1865528133807386666
|
| https://fal.ai/models/fal-ai/hunyuan-video
| alberth wrote:
| Thanks, would you mind elaborating on what you wrote below:
| Sora is built entirely around the idea of directly manipulating
| and editing and remixing the clips it generates, so the goal
| isn't to have it produce usable videos from a single prompt.
| simonw wrote:
| If you watch the OpenAI announcement they spend most of their
| time talking about the editing controls:
| https://www.youtube.com/watch?v=2jKVx2vyZOY
| OJFord wrote:
| > The Pelican inexplicably morphs to cycle in the opposite
| direction half way through
|
| It's pretty cool though, the kind of thing that'd be hard if it
| was what you actually wanted!
| benatkin wrote:
| That's an awful result. It turning around has absolutely
| nothing to do with what you asked for. It's similar in nature
| to what the chatbot in the recent and ongoing scandal said,
| saying to come home to her, when it should have known that the
| idea would be nonsensical or could be taken to mean something
| horrendous. https://apnews.com/article/chatbot-ai-lawsuit-
| suicide-teen-a...
|
| So you were lucky indeed to be able to run your prompt and
| share it, because the result was quite illuminating, but not in
| a way that looks good for Sora and OpenAI as a whole.
| pushcx wrote:
| I don't have much of a mental model for how this works, but I
| was surprised to note that it seems to maintain continuity on
| the shapes of the bushes and brown spots on the grass that
| track out of frame on the left and then reappear as it pans
| back into frame.
| benatkin wrote:
| That must be exactly it. The simulated scene extends beyond
| what the camera is currently capturing.
| vunderba wrote:
| _" The Pelican inexplicably morphs to cycle in the opposite
| direction half way through"_
|
| Oof, if Sora can't even maintain internal consistency of the
| world for a 5-second short, I can't imagine how much worse
| it'll get at longer video generation times.
| jrflowers wrote:
| "Right before the TikTok ban goes into effect" is incredible
| market timing for the release of a tool that is useless for
| anything other than terrible TikTok spam videos
| jeroenhd wrote:
| Hey now, no need to downplay the product here, it's also useful
| for spamming other video sharing platforms! Think Facebook
| timelines, which are already full of AI image barf, Twitter
| feeds, which mostly consist of AI text barf, and Youtube
| Shorts, which is full of existing AI animation barf!
|
| Soon, lots of people can pay a modest sum to make the internet
| just a little worse for everyone in exchange for a chance to
| make their money back!
| cryptozeus wrote:
| Raises billions of dollars, claims AGI by 2025, cannot handle
| new user sign-up traffic.
| iLoveOncall wrote:
| This is by design, they want the news articles saying "this is
| so popular it crashed their website!"
| manquer wrote:
| A billion is table stakes; OpenAI has raised over 6 billion
| dollars this year alone.
| pritambarhate wrote:
| That's because in this case scaling to big traffic needs more
| hardware, which is very expensive, and even if you have money
| the manufacturers may not have the capacity you need.
| knicholes wrote:
| I don't even get why I have to "sign up." I'm already a paying
| customer with an existing account.
| exe34 wrote:
| Regarding all the comments about physics, I wonder if a hybrid
| approach would work better: an LLM generating 3D objects that
| interact in a physics simulation, with guiding forces from the
| LLM, and then another model producing the photorealistic
| rendering.
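A toy sketch of that hybrid idea - to be clear, nothing here is how Sora actually works; the `Body`/`physics_step`/`simulate` names, the object list (standing in for whatever an LLM might emit from a prompt), and the hand-rolled gravity step are all illustrative, with the renderer left out entirely:

```python
from dataclasses import dataclass

GRAVITY = -9.8    # m/s^2, acting along the y axis
DT = 1.0 / 30.0   # one frame at 30 fps

@dataclass
class Body:
    """A scene object; imagine an LLM emitting a list of these."""
    name: str
    y: float    # height above the ground plane, metres
    vy: float   # vertical velocity, m/s

def physics_step(bodies):
    """Advance every body one frame with gravity plus a ground
    constraint - the kind of hard consistency rule a pure
    pixel-space video model is free to violate."""
    for b in bodies:
        b.vy += GRAVITY * DT
        b.y += b.vy * DT
        if b.y < 0.0:              # clamp at the ground plane
            b.y, b.vy = 0.0, 0.0

def simulate(bodies, frames):
    """Produce per-frame object states; each state would then be
    handed to a separate photorealistic rendering model."""
    states = []
    for _ in range(frames):
        physics_step(bodies)
        states.append({b.name: round(b.y, 3) for b in bodies})
    return states

scene = [Body("ball", y=2.0, vy=0.0)]
trajectory = simulate(scene, frames=90)   # 3 s of guaranteed-consistent motion
```

The point of the split is that the simulation layer makes certain errors (objects sinking through the ground, dust clouds appearing behind things that have moved on) structurally impossible, leaving the generative model responsible only for appearance.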
| topaz0 wrote:
| Gentle reminder that it's important to boycott this kind of
| thing.
| tgv wrote:
| This, and similar tools, make the world a worse place, just so
| a handful can get the big bucks. This is not technological
| progress, it's greed. Ethics is a dirty word.
| gavindean90 wrote:
| Why?
| lacoolj wrote:
| A little worried how young children watching these videos may
| develop inaccurate impressions of physics in nature.
|
| For instance, that ladybug looks pretty natural, but there's a
| little glitch in there that an unwitting observer, who's never
| seen a ladybug move before, may mistake as being normal. And
| maybe it is! And maybe it isn't?
|
| The sailing ship - are those water movements correct?
|
| The sinking of the elephant into snow - how deep is too deep?
| Should there be snow on the elephant or would it have melted from
| body heat? Should some of the snow fall off during movement or is
| it maybe packed down too tightly already?
|
| There's no way to know because they aren't actual recordings, and
| if you don't know that, and this tech improves leaps and bounds
| (as we know it will), it will eventually become published and
| will be taken at face value by many.
|
| Hopefully I'm just overthinking it.
| highwaylights wrote:
| I'd be more worried about the inevitable "we're under nuclear
| attack, head for shelter" CNN deepfakes.
| icepat wrote:
| > The sinking of the elephant into snow - how deep is too deep?
| Should there be snow on the elephant or would it have melted
| from body heat? Should some of the snow fall off during
| movement or is it maybe packed down too tightly already?
|
| Should there be an elephant in the snow? The layers of possible
| confusion and subtle incorrect understanding go much deeper.
| bbarnett wrote:
| Yes, they were used to traverse mountain paths.
| sccomps wrote:
| With the same reasoning, do reindeer actually fly and pull a
| sleigh carrying a 200-pound man along with tons of gifts? I
| believe you're underestimating human intelligence and our
| ability to apply logic and reasoning.
| tetris11 wrote:
| Also, I guess it's just normal for a car lane to merge
| seamlessly into a pedestrian zone
| RyeCombinator wrote:
| I share your concern as well and at times worry about what I'm
| seeing too.
|
| I suppose the reminder here is that seeing does not warrant
| believing.
| darepublic wrote:
| Sure this is problematic for society although I'm not concerned
| about what you are mentioning. I remember as a kid noticing how
| in looney tunes wile e coyote could run off the cliff a few
| steps and thinking maybe there's a way to do that. Or kids
| arguing about whether it was possible to perform a sonic boom
| like in street fighter. Or jumping off the playground with an
| umbrella etc
| sccomps wrote:
| > For instance, that ladybug looks pretty natural, but there's
| a little glitch in there that an unwitting observer, who's
| never seen a ladybug move before, may mistake as being normal.
| And maybe it is! And maybe it isn't?
|
| Well, none of the existing animation movies follow exact laws
| of physics.
| cj wrote:
| Take the example to the extreme: In 10 years, I prompt my
| photo album app "Generate photorealistic video of my mother
| playing with a ladybug".
|
| The juxtaposition of something that looks extremely real
| (your mother) and something that never happened (ladybug) is
| something that's hard for the mind to reconcile.
|
| The presence of a real thing inadvertently and subconsciously
| gives confidence to the fake thing also being real.
| Fade_Dance wrote:
| I think this hooks in quite well to the existing dialogue
| about movies in particular. Take an action movie. It looks
| real but is entirely fabricated.
|
| It is indeed something that society has to shift to deal
| with.
|
| Personally, I'm not sure that it's the photoreal aspect
| that poses the biggest challenge. I think that we are
| mentally prepared to handle that as long as it's not out of
| control (malicious deep-fakes used to personally target and
| harass people, etc.) I think the biggest challenge has
| already been identified, namely, passing off fake media as
| being real. If we know something is fake, we can put a
| mental filter in place, like a movie. If there is no way to
| know what is real and what is fake, then our perception of
| reality itself starts to break down. _That_ would be a
| major new shift, and certainly not one that I think would
| be positive.
| browningstreet wrote:
| I looked at the Sora videos and all the subjects' "weight" and
| "heft" are off, in the same way that Anya Taylor-Joy's jump at
| the end of the new trailer for The Gorge looked not much better
| than years-ago Spider-Man swinging on a rope.
| normalaccess wrote:
| I'm still waiting on the future waves of PTSD from hyper
| realistic horror games. I can't think of a worse thing to
| do than hand a kid a VR headset (or game system) and have
| them play a game that is _designed_ to activate every
| single fight or flight nerve in the body on a level that
| is almost indistinguishable from reality. 20 years ago
| that would have been the plot to a torture porn flick.
|
| Even worse than that is when people get USED to it and no
| longer have a natural aversion to horrific scenes taking
| place in the real world.
|
| This AI stuff accelerates that process of illusion but in
| every possible direction at once.
|
| As much as people don't want to believe it, by beholding
| we are indeed changed.
| dartos wrote:
| That argument can and probably was pointed towards movies
| with color, movies with audio before that, comics, movies
| without audio, books, etc.
|
| I don't think that slippery slope holds up.
|
| IIRC there's pretty solid research showing that even
| children beyond the age of 8 can tell the difference
| between fiction and reality.
| normalaccess wrote:
| Distinguishing reality from fiction is useful, but it
| doesn't shape our desires or define our values. As a
| culture, we've grown colder and more detached. Think of
| the first Dracula film--audiences were so shaken by a
| simple eerie face that some reportedly lost control in
| the theater. Compare that visceral reaction to the apathy
| we feel toward far more shocking imagery today.
|
| If media didn't profoundly affect us, how could exposure
| therapy rewire fears? Why would billions be spent on
| advertising if it didn't work? Why would propaganda or
| education exist if ideas couldn't be planted and nurtured
| through storytelling?
|
| Is there any meaningful difference between a sermon from
| the pulpit and a feature film in the theater? Both are
| designed to influence, persuade, and reshape our
| worldview.
|
| As Alan Moore aptly put it: "Art is, like magic, the
| science of manipulating symbols, words, or images to
| achieve changes in consciousness."
|
| In my opinion the old adage holds true, _you are what you
| eat_. And we will soon be eating unimaginable mountains
| of artificial content cooked up by dream engines tuned to
| our every desire and whim.
| brookst wrote:
| Wouldn't this same concern apply to historical fiction in
| general?
| spullara wrote:
| gravity acts immediately; you don't hover in the air for a few
| seconds before falling
| dylan604 wrote:
| then how will I have time to flash my sign to the audience
| that says "uh-oh"?
| byteknight wrote:
| Feels like you're looking for a strawman argument, and may
| have found one.
|
| I would retort that animation and real-life-looking video do
| different things to our psyche. As an uneducated wanna-be
| intellectual, I would lean toward thinking real-looking
| objects more directly influence our perception of life than
| animations.
| a_wild_dandan wrote:
| Animation _can_ look real though, e.g sci-fi vfx. But maybe
| you're concerned about how prolific it may be? I could see
| that. Personally I think it'll be fine. It's just that
| disruptive tools create uncertainty. Or maybe I'm
| overcompensating to avoid being the "old man yelling at
| cloud" dude.
| byteknight wrote:
| Now you're intentionally mixing VFX and animation.
| Animation, at least in my meaning, was more cartoon.
| FridgeSeal wrote:
| Well, none of the existing animation movies claim to be
| anything other than animation?
|
| You just know there'll be people making content within the
| week for social media that will be trying to pass itself off
| as real imagery.
| jsheard wrote:
| Animation doesn't follow exact laws of physics, but the
| specific ways they don't follow physics have very deliberate
| intent behind them. There's a pretty clear difference between
| the coyote running off a cliff and taking 2 seconds to start
| falling, and a character awkwardly floating over the ground
| because an AI model got confused.
| bee_rider wrote:
| It is a good point...
|
| Although, plenty of kids have tied a blanket around their
| necks and jumped off some furniture or a low roof, right?
| Breaking a leg or twisting an ankle in their attempt to
| imitate their favorite animated superhero.
| 867-5309 wrote:
| oh yes, _Suipercideman_
| IanCal wrote:
| >but the specific ways they don't follow physics have very
| deliberate intent behind them.
|
| That is only true for well crafted things. There's plenty
| of stuff that's just wrong for no reason beyond ease of
| creation or lack of care about the output.
| 1024core wrote:
| Clearly you haven't seen any Bollywood movies:
| https://youtu.be/PdvRwe39NCs
| sdf4j wrote:
| I grew up watching Looney Tunes interpretation of physics and
| turned out just fine.
| artur_makly wrote:
| these will be a lot less violent too ;-) for a little while
| at least.
| AyyEye wrote:
| There's a big difference between cartoonishly incorrect and
| uncanny-valley plausibly correct.
| fooker wrote:
| There's a huge amount of such stuff in movies.
|
| Special effects, weapons physics, unrealistic vehicles and
| planes, or the classic 'hacking'.
| mojuba wrote:
| Yes but a movie is a movie whereas these AI-generated
| videos will likely be used to replace stock footage in
| other (documentary, promotional, etc.) contexts
| ssl-3 wrote:
| If the producer wants to publish bad physics, they get
| bad physics.
|
| If the producer wants to publish good physics, they get
| good physics.
|
| It doesn't matter if it is AI, CGI, live action, stop
| motion, pen-and-ink animation, or anything else.
|
| The output is whatever the production team wants it to
| be, just as has been the case for as long as we've had
| cinema (or advertising or documentaries or TikToks or
| whatevers).
|
| Nothing has changed.
| mojuba wrote:
| You don't have full control over AI-generated images
| though, or not to the same extent producers have with
| CGI.
|
| There's a video on sora.com at the very bottom, with
| tennis players on the roof, notice how one player just
| walks "through" the net. I don't think you can fix this
| other than by just cutting the video earlier.
| ssl-3 wrote:
| >You don't have full control over AI-generated images
| though,
|
| So the AI just publishes stuff on my behalf now?
|
| No, comrade.
| evilduck wrote:
| There's already techniques for controlling AI generated
| images. There's ControlNet for Stable Diffusion and there
| are already techniques to take existing footage and
| style-morphing it with AI. For larger budget productions
| I would anticipate video production tooling to arise
| where directors and animators have fine grained influence
| and control over the wireframes within a 3D scene to
| directly prevent or fix issues like clipping, volumetric
| changes, visual consistency, text generation, gravity,
| etc. Or even just them recording and producing their
| video in a lower budget format and then having it re-
| rendered with AI to set the style or mood but adhering to
| scene layout, perspective, timing, cuts, etc. Not just
| for mitigating AI errors but also just for controlling
| their vision of the final product.
|
| Or they could simply brute force it by clipping the scene
| at the problem point and have it try, try again with
| another re-render iteration from that point until it's no
| longer problematic. Or just do the bulk of the work with
| AI and do video inpainting for small areas to fix or
| reserve the human CGI artists for fixing unmitigatable
| problems that crop up if they're fixable without full re-
| rendering (whichever probably ends up less expensive).
|
| Plus with what we've recently seen with world models that
| have been released in the last week or so, AI will soon
| get better at having a full and accurate representation
| of the world it creates and future generations of this
| technology beyond what Sora is doing simply won't make
| these mistakes.
| zoover2020 wrote:
| Yet in a movie setting it's clear something is a special effect
| or the like, which is not the case for GenAI. Massive
| underestimation of the potential impact in this thread; scary.
| brookst wrote:
| Maybe. Or maybe some people massively underestimate our
| ability to cope with fiction and new media types.
|
| I am sure that there were people decrying radio for all
| these same reasons ("how will _the children_ know that
| the voices aren't people in the same room?")
| ics wrote:
| There's also a huge difference in what people, even
| children, expect when sitting down to watch a movie
| versus seeing a clip of some funny cat/seal hybrid
| playing football while I'm looking for the Bluey episode
| we left off on. My daughter is almost five and cautiously
| asks "is that real?" about a lot of things now. It
| definitely makes me work harder when trying to explain
| the things that don't look real but actually are; one
| could definitely feel like it takes some of the magic
| away from moments. I feel alright in my ability to handle
| it, it's my responsibility to try, but it isn't as simple
| as the Looney Tunes argument or, I believe, dramatic
| effects in movies and TV.
| kube-system wrote:
| Not a bad point, those representations have, in some
| cases, caused widespread misunderstandings among people
| who learn about those concepts from movies... and this is
| all while simultaneously knowing "it's just a movie".
| eddieroger wrote:
| People don't watch The Matrix expecting a documentary on
| how we all got plugged in. If someone generated the
| referenced ladybug movie for use in a science classroom,
| that's a problem.
| fooker wrote:
| I agree. The issue is in using it for teaching science
| though, not in generating it.
|
| Similar to how it's fine to create fiction, but not to
| claim it to be true.
| gmuslera wrote:
| Did you see the movie Battleship? Or a good percentage of
| recent and not-so-recent action movies? At least The Matrix
| could be argued to be about virtual reality.
| ma2t wrote:
| "A body at rest remains at rest until it looks down and
| realizes it has stepped off of a cliff."
| sdenton4 wrote:
| Between omnipresent cgi in movies and tv, animation, and video
| game physics (all of which are human-coded approximations of
| real physics, often intentionally distorted for various
| reasons), that ship has long since sailed.
| dowager_dan99 wrote:
| no one is shooting blockbuster-grade CGI for stock footage
| though; the casualness of this is what will be the most
| impactful
| CyberDildonics wrote:
| _A little worried how young children watching these videos may
| develop inaccurate impressions of physics in nature._
|
| Pretty sure cartoons and action movies do that already, until
| YouTube videos of attempted stunts show what reality looks
| like.
| spicymaki wrote:
| I know this sounds judgmental, but this reminds me of the idiom
| "touch grass". Children should be outdoors observing real life
| and not be consuming AI slop. You are not overthinking this,
| this will most likely be bad for children and everyone in the
| long run.
| bparsons wrote:
| I dont think you are overthinking it.
|
| Facebook seems full of older people interacting with AI
| generated visual content who don't seem to understand that it
| is fake.
|
| Our society already had a problem with people (not)
| participating in consensus reality. This is going to pour
| gasoline on the fire.
| skybrian wrote:
| Yes, entertainment spreads lots of myths. But bad physics from
| AI movies is only a tiny part of the problem. This is similar
| to worries about the misconceptions people might get from
| playing too many video games, reading too many novels, watching
| too much TV, or participating too much in social media.
|
| It helps somewhat that people are fairly aware that
| entertainment is fake and usually don't take it too seriously.
| gruntbuggly wrote:
| Fair! I watched a lot of Superman as a kid and I killed myself
| jumping off a building
| dylan604 wrote:
| Don't be an asshole. When learning to fly, learn by starting
| on the ground first, not from a tall building. --Bill Hicks
| anonu wrote:
| > inaccurate impressions of physics
|
| Or just inaccurate impressions of the physical world.
|
| My young kids and I happened to see a video of some very cute
| baby seals jumping onto a boat. It was not immediately clear it
| was AI-generated, but after a few runs I noticed it was a bit
| too good to be true. The kids would never have known otherwise.
| throwawayian wrote:
| Don't worry, you are.
| TeMPOraL wrote:
| Me too. While I'm generally optimistic about generative art, at
| this point the models still have this dreamlike quality; things
| look OK at first glance, but you often get the feeling
| something is off. Because it is. Texture, geometry, lights,
| shadows, effects of gravity, etc. are more or less
| inconsistent.
|
| I do worry that, as we get exposed more and more to such art,
| we'll become less sensitive to this feeling, which effectively
| means we'll become less calibrated to actual reality. I worry
| this will screw with people's "system 1" intuitions long-term
| (but then I can't say exactly how; I guess we'll find out soon
| enough).
| juddlyon wrote:
| YouTube Shorts are full of AI animal videos with distorted
| proportions, living in the wrong habitat, and so on. They
| popped up on my son's account and I hate them for the reasons
| you outline. They aren't cartoonish enough to explain away, nor
| realistic enough to be educational.
| jonpo wrote:
| And have you watched the brain rot that is TikTok?
| t0bia_s wrote:
| The young generation that grows up with these tools will have a
| completely different approach to anything virtual. Remember how
| people thought the camera stole part of their soul when they
| saw themselves copied in a picture?
| Terr_ wrote:
| > A little worried how young children watching these videos may
| develop inaccurate impressions of physics in nature.
|
| I'm less concerned with physics for children--assuming they get
| enough time outdoors--and more about adulthood biases and
| media-literacy.
|
| In particular, a turbocharged version of a problem we already
| have: People grow up watching movies and become subconsciously
| taught that _flaws_ of the creation pipeline (e.g. lens flare,
| depth of field) are signs of "realism" in a general sense.
|
| That manifests in things such as video-games where your human
| character somehow sees the world with crappy video-cameras for
| eyes. (Excepting a cyberpunk context, where that would actually
| make sense.)
| jstummbillig wrote:
| > Hopefully I'm just overthinking it.
|
| I think it's unnecessary to worry about obviously bad stuff in
| nascent and rapidly developing technology. The people who spent
| most time with it (the developers) are aware of the obviously
| bad stuff and will work to improve it.
| hash07e wrote:
| Yes, Bugs Bunny and Wile E. Coyote harmed our physics.
| raincole wrote:
| > A little worried how young children watching these videos may
| develop inaccurate impressions of physics in nature.
|
| And why don't we worry this about CGI?
|
| CGI is not always made with a full physical simulation, and is
| not always intended to accurately represent real-world physics.
| andrewstuart wrote:
| Kids are fine with fiction.
| mike_hearn wrote:
| AI physics isn't worth worrying about compared to other
| inaccurate things kids see in movies. It doesn't seem to hurt
| them.
|
| If you really want something to worry about, consider that
| movies regularly show pint-sized women successfully drop
| kicking men significantly bigger than themselves in ways that
| look highly plausible but aren't. It's not AI but it violates
| basic laws of biology and physics anyway. Teaching girls they
| can physically fight off several men at once when they aren't
| strong enough to do that seems like it could have pretty
| dangerous consequences, but in practice it doesn't seem to
| cause problems. People realize pretty quick that movie physics
| isn't real.
| SaintSeiya wrote:
| Don't be; misinterpretations of the laws of physics are very
| quick to correct with a reality check. I'm more worried for
| kids that have to learn how the world works through a screen.
| Just let them play outside and interact with other kids and
| nature. Let them fall and cry, and scratch and itch; it will
| make them stronger and healthier adults.
| WesolyKubeczek wrote:
| You are not overthinking it; moreover, text LLMs have the same
| problem in that they are almost good. Almost. Which is what
| gives me the creeps.
| uludag wrote:
| Here's the obligatory AI enthusiast answer:
|
| What is physics besides next token/frame prediction? I'm not
| sure these videos deserve the label "inaccurate" as who's to
| judge what way of generating next tokens/frames is better? Even
| if you judge the "physical" world to be "better", I think
| it's much more harmful to teach young children to be skeptical
| of AI as their futures will depend on integrating them in their
| lives. Also, with enough data, such models will not only match,
| but probably exceed "real-physics" models in quality, fidelity,
| and speed.
| cryptoegorophy wrote:
| I am not sure if you have kids or not but you are in for a big
| surprise if you don't have kids. Watching videos =/= real life.
| 8note wrote:
| i wouldnt expect young children to learn how to walk by
| watching people walk on a screen, regardless of if its a real
| person walking, or an ai animation.
|
| the real world gives way more stimulus
|
| watching the animations might help them play video games, but i
| again imagine that the feedback is what will do the real job.
|
| even for the real ladybug video, who says the behaviour on
| screen is similar to what a typical ladybug does? if its on
| video, the ladybug was probably doing something weird and
| unexpected
| unraveller wrote:
| Is it better, or just more distracting from its flaws, and
| using those flaws to its advantage? I only see repulsive mouth
| movements that induce fear, face coverings to hide the
| uncanniness, and dreamy physics sims to distract. Not so out of
| place in present-day Hollywood, but never any coherence of
| feeling.
| jack_pp wrote:
| I'm surprised they put in 2 legged poodles
| goykasi wrote:
| A bit off-topic, but how much does a 4-letter (or less) .com go
| for these days? I wonder if they bought this via an intermediary
| so that the seller wouldn't see "OpenAI" and tack on a few zeros.
|
| edit: previously, this thread pointed to sora.com
| silvestrov wrote:
| His review video is so much better than the announcement video
| at explaining what has been released.
| geor9e wrote:
| Pretty off-topic, but yes, domains and land are often bought
| via shell companies for this reason. OpenAI bought chat.com for
| 8 figures previously.
| wslh wrote:
| "We are currently experiencing heavy loads..."
| inoffensivename wrote:
| Great. In a world awash with disinformation, we're making it
| easier to create even more of it.
|
| I don't see any good coming from tools like these.
| sergiotapia wrote:
| sorry for the tangent: can't remember a launch they've had where
| you could just use it. it's always "rollout", "later this
| quarter", "select users", what's the deal here?
|
| it's given openai this tinge to me that i probably won't ever
| manage to forget.
| m3kw9 wrote:
| A minimum-settings video (480p, 5 s, 1:1) took an hour. Servers
| getting cooked.
| vinni2 wrote:
| Account creation currently unavailable
| adregan wrote:
| Why keep building AI to do the things that people find fun to do
| rather than the mundane bullshit? All we'll be left with is
| cleaning, folding laundry, and doing the dishes while AI does all
| the interesting things.
| amelius wrote:
| Because we don't have as much data about mundane bullshit.
| siliconc0w wrote:
| I may be the only one but this kinda breaks my brain in that I
| notice weird physics anomalies in these but then I start to look
| for those in non-AI produced video and start to question
| everything. Hopefully this is a short-term situation.
| okdood64 wrote:
| So when's the lawsuit from Google coming?
| andrewstuart wrote:
| So we are now a few years into the AI video thing.
|
| I'm curious to know - is it actually useful for real world tasks
| that people/companies need videos for?
| system2 wrote:
| I wonder when AI images and videos will be remotely useful and
| easy to create. These are still weird and of garbage quality.
| lossolo wrote:
| If we take HunyuanVideo, which is similar to Sora, as an example,
| they state that generating a 5-second video requires 5 minutes on
| 8xH100 GPUs. Therefore, if 10,000 users simultaneously want to
| generate a 5-second video within the same 5-minute window, you
| would need 80,000 H100 GPUs, which would cost around 2 billion
| USD in GPUs alone.
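The arithmetic above checks out as a back-of-the-envelope sketch; note that the per-GPU price below is an assumption on my part, not a figure from the comment:

```python
# Sanity check of the figures above. The 8xH100-per-job and 10,000
# concurrent users come from the comment; the ~$25k unit price for an
# H100 is an assumed ballpark and varies by vendor and volume.
gpus_per_job = 8          # HunyuanVideo: ~5 min per 5-second clip on 8xH100
concurrent_jobs = 10_000  # users all generating in the same 5-minute window
h100_price_usd = 25_000   # assumption

total_gpus = concurrent_jobs * gpus_per_job
capex_usd = total_gpus * h100_price_usd
print(f"{total_gpus:,} H100s, ~${capex_usd / 1e9:.0f}B in GPUs alone")
# → 80,000 H100s, ~$2B in GPUs alone
```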
| IanCal wrote:
| Not available in
|
| > the United Kingdom, Switzerland and the European Economic Area.
| We are working to expand access further in the coming months
|
| Excellent to announce this lack of access after the launch of
| Pro. At least I have no business need for Sora, so it's not
| much of a loss, but it's annoying nonetheless.
| aglione wrote:
| OK, so GPT Pro with some extra power, and Sora. This means that
| GPT-5, and generally speaking AGI, can wait.
| advael wrote:
| People really worry about fake video and images and whatever but
| I have to say, the correct heuristics both already exist and have
| existed for a long time:
|
| 1. Anything on the internet can be fake
|
| 2. Trust is interpersonal, and trusting content should be
| predicated first and foremost on trusting its source to not
| deceive you
|
| This is imperfect but also the best people ever really do in the
| general case, and just orders of magnitude better than most
| people are currently doing
|
| The issue isn't models like this, it's that people are eating a
| ton of information but have been strongly encouraged to be
| credulous, and the lion's share of that training is directly coming
| from the tech grift industrial complex
|
| I wouldn't even say this is the most compelling kind of tool for
| plausible-looking disinformation out there by a long shot for the
| record, but without actually examining why people are gullible
| there is no technology that's going to make people's acceptance
| of fiction as fact substantially worse, or better, really. Scams
| target people on the order of their life savings every day and
| there are robust technologies and protocols for vetting
| communications, but people have to know to use them, care to use
| them, and be able to use them, for that to matter at all
| jiggawatts wrote:
| "The version of Sora we are deploying has many limitations. It
| often generates unrealistic physics and struggles with complex
| actions over long durations. Although Sora Turbo is much faster
| than the February preview, we're still working to make the
| technology affordable for everyone."
|
| So they demo the full model and release the quantised and
| censored model.
|
| Does anyone else find this kind of bait & switch distasteful?
| mewpmewp2 wrote:
| Maybe, but the alternative would be to not demo results with
| state-of-the-art processing at all, which I wouldn't like
| either.
| jedberg wrote:
| > We're introducing our video generation technology now to give
| society time to explore its possibilities and co-develop norms
| and safeguards that ensure it's used responsibly as the field
| advances.
|
| That's an interesting way of saying "we're probably gonna miss
| some stuff in our safety tools, so hopefully society picks up the
| slack for us". :)
| FrustratedMonky wrote:
| "to give society time to explore its possibilities and co-
| develop norms and safeguards"
|
| Or, "this safety stuff is harder than we thought, we're just
| going to call 'tag you're it' on society"
|
| Or,
|
| -Oppenheimer : speaking "man, this nuclear safety stuff is
| hard, I'm just going to put it all out there and let society
| explore developing norms and safeguards".
|
| -Society : Bombs Japan
|
| -Oppenheimer : "No, not like that, oops".
| Arnt wrote:
| Aren't you kind of saying that you don't have any answers so
| therefore OpenAI should have provided the answers?
| usrnm wrote:
| Oppenheimer was making a bomb from day 1, he knew exactly
| what he was doing and how it would be used. There aren't so
| many different use cases for a bomb, after all. It was a nice
| movie, but it does not absolve him
| xvector wrote:
| Eh, society did a pretty good job overall.
|
| The bomb was the end of conventional warfare between nuclear
| nations. MAD has created an era of peace unlike anything our
| species has ever seen before.
| rurp wrote:
| Well, it works great until it doesn't. We're perpetually a few
| bad decisions by a few possibly deranged actors away from
| obliterating all of those gains and then some.
| jsheard wrote:
| Flashbacks to when they were cagey about releasing the GPT
| models because they could so easily be used for spam, and then
| just pretended not to see all the spam their model was making
| when they did release it.
|
| If you happen to notice a Twitter spam bot claiming to be "an
| AI language model created by OpenAI", know that we have
| conducted an investigation and concluded that no you didn't.
| Mission accomplished!
| nostromo wrote:
| The irony is that users want more freedom and fewer safeguards.
|
| But these companies are rightfully worried about regulators and
| legislatures, often led by pearl-clutching journalists, so we
| can't have nice things.
| DFHippie wrote:
| Recent events (many events in many places) show "users" don't
| think too hard before acting. And sometimes they act with
| inadequate or inaccurate information. If we want better
| outcomes, it behooves us to hire people to do the thinking
| that ordinary users see no point in doing for themselves. We
| call the people doing the hard thinking scientists,
| regulators, and journalists. The regulators, when empowered
| to do so by the government, can stop things from happening.
| The scientists and journalists can just issue warnings.
|
| Giving people what they want when they want it doesn't always
| lead to happy outcomes. The people themselves, through their
| representatives, have created the institutions that sometimes
| put a brake on their worst impulses.
| nicbou wrote:
| The onus will be on the rest of society to defend itself from
| all the grift that will result from this.
| pesus wrote:
| If the worst we ultimately get from this kind of tech is
| grifting, I will consider that a very positive outcome.
| miohtama wrote:
| Users, not tools, should be judged.
|
| It is unlikely anyone is going to perform an act of terrorism
| with this, or make any kind of deepfakes that buy Eastern
| European elections. The worst outcome is likely teens having a
| laugh.
| observationist wrote:
| Funny how all the negative uses to which something like this
| might be put are regulated or criminalized already - if you
| try to scam someone, commit libel or defamation, attempt
| widespread fraud, or any of a million nefarious uses, you'll
| get fined, sued, or go to jail.
|
| Would you want Microsoft to claim they're responsible for the
| "safety" of what you write with Word? For the legality of the
| numbers you're punching into an Excel spreadsheet? Would you
| want Verizon keeping tabs on every word you say, to make sure
| it's in line with their corporate ethos?
|
| This idea that AI is somehow special, that they absolutely
| must monitor and censor and curtail usage, that they claim
| total responsibility for the behavior of their users -
| Anthropic and OpenAI don't seem to realize that they're the
| bad guys.
|
| If you build tools of totalitarian dystopian tyranny,
| dystopian tyrants will take those tools from you and use
| them. Or worse yet, force your compliance and you'll become
| nothing more than the big stick used to keep people cowed.
|
| We have laws and norms and culture about what's ok and what's
| not ok to write, produce, and publish. We don't need
| corporate morality police, thanks.
|
| Censorship of tools is ethically wrong. If someone wants to
| publish things that are horrific or illegal, let that person
| be responsible for their own actions. There is absolutely no
| reason for AI companies to be involved.
| 8note wrote:
| That works for locally hosted models, but if it's offered as
| a service, OpenAI is publishing those verboten works to you,
| the person who requested them.
|
| Even if it is a local model, if you trained a model to spew
| Nazi propaganda, you're still publishing Nazi propaganda to
| the people who then go use it to make propaganda. It's just
| very summarized propaganda.
| gus_massa wrote:
| Does this apply to the spell checker in Office 365 or
| Google Docs?
| jimkleiber wrote:
| Are hunting knives regulated the same way as rocket
| launchers? Both can be used to kill but at much different
| intensity levels.
| nlehuen wrote:
| You are posting this under a pseudonym. If you did publish
| something horrific or illegal, it would have been the
| responsibility of this web site to either censor your
| content, and/or identify you when asked by authorities.
| Which do you prefer?
| xvector wrote:
| This website is not a tool - not really.
|
| Your keyboard is.
|
| Censoring AI generation itself is very much like
| censoring your keyboard or text editor or IDE.
|
| Edit: Of course, "literally everything is a tool", yada
| yada. You get what I mean. There is a meaningful
| difference between tools that translate our thoughts to a
| digital medium (keyboards) and tools that share those
| thoughts with others.
| jimkleiber wrote:
| A website is almost certainly a tool. It has servers and
| distributes information typed on thousands of keyboards
| to millions of screens.
| skydhash wrote:
| HN is the one doing the distribution, not the user. The
| latter is free to type whatever he wants, but he is not
| entitled to have HN distribute his words, just as a
| publisher does not have to publish a book he doesn't want
| to.
| gardenhedge wrote:
| When someone posts on FB, they don't consider that FB is
| publishing their content for them
| do_not_redeem wrote:
| > when asked by authorities
|
| Key point right here.
|
| You let people post what they will, and if the
| authorities get involved, cooperate with them. HN should
| not be preemptively monitoring all comments and making
| corporate moralistic judgments on what you wrote and
| censoring people who mention Mickey Mouse or post song
| lyrics or talk about hotwiring a car.
|
| Why shouldn't OpenAI do the same?
| 9rx wrote:
| It seems reasonable to work with law enforcement if
| information provides details about a crime that took
| place in the real world. I am not sure what purpose
| censoring as a responsibility would serve? Who cares if
| someone writes a fictional horrific story? A site like
| this may choose to remove noise to keep the quality of
| the signal high, but preference and responsibility are
| not the same.
| pyrale wrote:
| > Would you want Microsoft to claim they're responsible for
| the "safety" of what you write with Word? For the legality
| of the numbers you're punching into an Excel spreadsheet?
| Would you want Verizon keeping tabs on every word you say,
| to make sure it's in line with their corporate ethos?
|
| Would you want DuPont to check the toxicity of Teflon
| effluents they're releasing in your neighbourhood? That's
| insane. It's people's responsibility to make sure that they
| drink harmless water. New tech is always amazing.
| nightski wrote:
| Yes, because we know a.) that the toxicity exists and b.)
| how to test for it.
|
| There is no definition of a "safe" model without
| significant controversy nor is there any standardized
| test for it. There are other reasons why that is a
| terrible analogy, but this is probably the most
| important.
| miohtama wrote:
| What's politically acceptable is called the Overton window.
| Unlike toxicity, it is fully subjective.
|
| https://en.m.wikipedia.org/wiki/Overton_window
| bayindirh wrote:
| Maybe you should talk with image editor developers,
| copier/scanner manufacturers and governments about the
| safeguards they shall implement to prevent counterfeiting
| money.
|
| Because, at the end of the day, counterfeiting money is
| already illegal.
|
| ...and we should not censor tools, and judge people, not
| the tools they use.
| mayukh wrote:
| So guns are ok? How about bombs?
| rixed wrote:
| Interestingly, you must know that any printing equipment
| good enough to output realistic banknotes is regulated to
| embed a protection preventing this use case.
|
| Even more interestingly, and maybe that could help
| understand that even in the most principled argument
| there should be a limit: molecular 3d printers able to
| reproduce proteins (yes, this is a thing) are regulated
| to recognise a design from a database of dangerous
| pathogens and refuse to print.
| miohtama wrote:
| Gimp doesn't have the secret binary blob to "prevent
| counterfeiting" and there is no flood of forged money
|
| https://www.reddit.com/r/GIMP/comments/3c7i55/does_gimp_have...
| AntiEgo wrote:
| "Teens having a laugh" can escalate quickly to, "... at
| someone else's expense," and this distinction is EXACTLY the
| sort of subtlety an algorithm can't filter.
|
| This does not need to become a thread about bullying and self
| harm, but it should be recognized that this example is not
| benign or victimless.
|
| This genie is out of the bottle, let us hope that laws about
| users are enough when the tools evolve faster than
| legislative response.
|
| [edit:spelling]
| thordenmark wrote:
| Exactly. You can make anything you want in Photoshop, Word,
| Excel, Blender, etc. The company isn't held accountable for
| what the User makes with it.
| jimkleiber wrote:
| Yes and one could kill a hundred people with their fists,
| but we regulate super powerful weapons more than fists.
|
| I think the degree of power matters.
| miltonlost wrote:
| > It is unlikely anyone is going to perform an act of terrorism
| with this, or any kind of deep fakes that buy Easter European
| elections. The worst outcome is likely teens having a laugh.
|
| And the teens are having a laugh by... creating deepfake
| nudes of their classmates? The tools are bad, and the
| toolmakers should feel deep guilt and shame for what they
| released on the world. Do you not know the story of Nobel and
| dynamite? Technology must be paired with morality.
| Aeolun wrote:
| Technology _is_ paired with morality. It's just not the one
| you want.
| botanical76 wrote:
| Is it? It seems to me to be paired with shareholders'
| interests, and nothing more.
| miohtama wrote:
| I am sure a school has a way to deal with pupils sharing
| such images, as the recent cases have proven. Deep fakes or
| real pictures. It is a social problem with an existing
| framework of decades of proven history and should be dealt
| with accordingly.
| timeon wrote:
| > or any kind of deep fakes that buy Easter European
| elections
|
| Finally people do not label Slovakia as Eastern Europe...
| tshaddox wrote:
| There are certain tools for which we heavily restrict which
| users have access to the entire supply chain. That's still
| about users, I suppose, but it's also about tools.
| miohtama wrote:
| In China, the whole Internet is heavily restricted. Bad
| tools.
| ClumsyPilot wrote:
| > no one is going to perform act of terrorism with this
|
| Especially a certain someone that's worth a billion dollars, is
| 100 years old and their name ends with inc.
| sleepybrett wrote:
| 'when civilization collapses because all photo, audio and video
| evidence is 100% suspect, i mean, how could you blame us'
| jstummbillig wrote:
| Do we not want new stuff? If the answer is "Sure, but only if
| whoever invents the stuff does all the work and finds all rough
| edges" then the answer is actually just "No, thanks".
| jedberg wrote:
| Oh, I have no problem with them doing it this way. I just
| thought it was a funny way to do it.
| 123yawaworht456 wrote:
| text, image, video, and audio editing tools have no 'safety'
| and 'alignment' whatsoever, and skilled humans are far more
| capable of creating 'unsafe' and 'unethical' media than
| generative AI will ever be.
|
| somehow, society has survived just fine.
|
| the notion that generative AI tools should be 'safe' and
| 'aligned' is as absurd as the notion that tools like Notepad,
| Photoshop, Premiere and Audacity should exist only in the
| cloud, monitored by kommissars to ensure that proles aren't
| doing something 'unsafe' with them.
| pyrale wrote:
| "We're releasing this like rats on a remote island, in hopes of
| seeing how the ecosystem is going to respond".
| raincole wrote:
| The problem isn't whether we should regulate AI. It's whether
| it's even possible to regulate it without causing significant
| turmoil and damage to society.
|
| It's not hyperbole. Hunyuan was released before Sora. So
| regulating Sora does absolutely nothing unless you can regulate
| Hunyuan, which is 1) open source and 2) made by a Chinese
| company.
|
| How do we expect the US govt to regulate that? Threatening
| to sanction China unless they stop doing AI research?
| ssl-3 wrote:
| Easy-peasy. Just require all software to be cryptographically
| signed, with a trusted chain that leads to a government-
| vetted author, and make that author responsible for the
| wrongdoings of that software's users.
|
| We're most of the way there with "our" locked-down, walled-
| garden pocket supercomputers. Just extend that breadth and
| bring it to the rest of computing using the force of law.
|
| ---
|
| Can I hear someone saying something like "That will never
| work!"?
|
| Perhaps we should meditate upon that before we leap into any
| new age of regulation.
| Nition wrote:
| "Climate Change is likely to mean more fires in the future, so
| we've lit a small fire at everyone's house to give society time
| to co-develop norms and safeguards."
| soheil wrote:
| Especially since they were originally supposed to be a non-
| profit focused on AI safety, and Sam Altman single-handedly
| pivoted to a for-profit after taking all the donations and
| partnering with probably the single most evil corporation
| that has ever existed, Microsoft.
| vinay_ys wrote:
| co-develop := we are in f** around and find out mode, please
| bear with us.
| karmasimida wrote:
| I am not impressed by it at all ... Is it actually better than
| the competitors?
| belter wrote:
| "...I felt a great disturbance in the algorithm... as if millions
| of influencers, OnlyFans stars, and video creators suddenly cried
| out in terror..."
| submeta wrote:
| What I desperately need is a model that generates perfectly made
| PowerPoint slides. I have to create many presentations for
| management, and it's a very time-consuming task. It's easy to
| outline my train of thoughts and let an LLM write the full text,
| but then to create a convincing presentation slide by slide takes
| days.
|
| I know there is Beautiful.ai or Copilot for PowerPoint, but none
| of the existing tools really work for me because the results and
| the user flow aren't convincing.
| buzzy_hacker wrote:
| Have you checked out Marp? https://marp.app/
|
| Basically it generates slides from markdown, which is great
| even without LLMs. But you can get LLMs to output in
| markdown/Marp format and then use Marp to generate the slides.
|
| I haven't looked into complicated slides, but works well for
| text-based ones.
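| A minimal deck, for anyone curious, is just markdown with
| `marp: true` front matter and `---` between slides (the
| content below is invented for illustration):

```markdown
---
marp: true
theme: default
---

# Quarterly Update

Three slides' worth of talking points for management.

---

## Results

- Revenue up year over year
- Costs flat

---

## Next Steps

1. Ship the pilot
2. Review the hiring plan
```

| The Marp CLI then renders it, e.g. `npx @marp-team/marp-cli
| deck.md -o deck.pptx` for an actual PowerPoint file (last I
| checked, PDF and HTML output work too).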
| ghita_ wrote:
| there is a YC company that does that I think:
| https://www.rollstack.com/ i've never used them but I think
| they have many satisfied customers, maybe worth a shot!
| jmugan wrote:
| I want something that can take my ugly line drawing and make it
| a cool looking line drawing without distorting the main idea
| ShakataGaNai wrote:
| Never used it but seen it mentioned in that space:
| https://gamma.app/
| brcmthrowaway wrote:
| Sora? More like r/ShittyHDR
| TiredOfLife wrote:
| "I've come up with a set of rules that describe our reactions to
| technologies:
|
| 1. Anything that is in the world when you're born is normal and
| ordinary and is just a natural part of the way the world works.
|
| 2. Anything that's invented between when you're fifteen and
| thirty-five is new and exciting and revolutionary and you can
| probably get a career in it.
|
| 3. Anything invented after you're thirty-five is against the
| natural order of things."
|
| -- Douglas Adams, The Salmon of Doubt: Hitchhiking the Galaxy One
| Last Time
| SillyUsername wrote:
| No integration with ChatGPT is a lost opportunity and illustrates
| no joined-up thinking, in all senses of that phrase.
| Demonstrations, helping people with learning difficulties
| visualise things, educational purposes, storytelling...
| esskay wrote:
| Great. More tools to continue the enshittification of everything
| on the web.
| matthewmorgan wrote:
| "Sora is not available in The United Kingdom yet". Available
| elsewhere, from Albania to Zimbabwe. Any particular reason why?
| dcchambers wrote:
| Meh. It's a cool POC and immediately useful for abstract imagery,
| but not for anything realistic.
|
| Looking forward to the onslaught of AI-generated slop filling
| every video feed on the Internet. Maybe it's finally what's going
| to kill things like TikTok, YT Shorts, Reels, etc. One can
| hope...anyway.
| ngd wrote:
| What's next, Tiagra?
| mkaic wrote:
| A friendly reminder: if you have tech-illiterate people in your
| life (parents, grandparents, friends, etc), _please_ reach out to
| them and inform them about advances in AI text, image, audio, and
| (as of very recently) video generation. Many folks are not aware
| of what modern algorithms are capable of, and this puts them at
| risk. GenAI makes it easier and cheaper than ever for bad actors
| to create targeted, believable scams. Let your loved ones know
| that it is possible to create believable images, audio, and
| videos which may depict anything from "Politician Says OUTRAGEOUS
| Thing!" to "a member of your own family is begging you for
| money." The best defense you can give them is to make them aware
| of what they're up against. These tools are currently the worst
| they will ever be, and their capabilities will only grow in the
| coming months and years. They are _already_ widely used by
| scammers.
| kylehotchkiss wrote:
| Who is the audience for this product? A lot of people like video
| because it's a way of experience something they currently cannot
| for one reason or another. People don't want to see arbitrary
| fake worlds or places on earth that aren't real. Unless it's
| a video game or something. But I see this product being used
| primarily to trick Facebook users
|
| I guess the CGI industry implications are interesting, but look
| at the waves behind the AI-generated man. They don't break so
| much as dissolve into each other. There's always a tell. These
| aren't GPU generated versions of reality with thought behind the
| effects.
| danielbln wrote:
| > People don't want to see arbitrary fake worlds or places on
| earth that aren't real.
|
| Isn't there a multi-billion dollar industry in California
| somewhere that caters exactly to that demand?
| Klonoar wrote:
| _> Unless it's a video game or something._
|
| The "or something" pretty much covers the gotcha you're
| trying to use. OP is acknowledging that fantasy media is a
| thing before going on to their actual point.
| themagician wrote:
| "People don't want to see arbitrary fake worlds or places on
| earth that aren't real."
|
| What? This is 90% of the Instagram/TikTok experience, and has
| been for years. No one cares if something is real. They care
| how it makes them feel.
|
| The audience for this is every "creator" or "influencer". No
| one cares if the content is fake. They'll sell you a vacation
| package to a destination that doesn't exist and people will
| still rate it 3/5 stars for a $15 Starbucks gift card.
| jrflowers wrote:
| > Who is the audience for this product?
|
| Infants, people just coming out of anesthesia, the concussed,
| the hypoxic, the mortally febrile and so on
| sktrdie wrote:
| To me this is what all AI feels like. People want "hard to make
| things" because they feel special and unordinary. If anybody
| with a prompt can do it, it ain't gonna sell
| robomartin wrote:
| Here's something I find interesting: We have multiple paid
| accounts with OpenAI. In other words, we are paying customers. I
| have yet to see a single announcement or new development that we
| learn about through email. In most cases we learn these things
| when they get covered by some online outfit, posted on HN, etc.
|
| OpenAI isn't the only company that seems to act in this manner. I
| find this to be interesting. Your paying customers actively want
| to know about what you are doing and, more than likely, would
| love to get a heads-up before the word goes out to the world.
| Hearing about things from third parties can make you feel like a
| company takes your business for granted or does not deem it
| important enough to feed you news when it happens.
|
| Another example of this is Kickstarter, although, their problem
| is different. I have only ever backed technology projects on KS.
| That's all I am interested in. And yet, every single email they
| send is full of projects that don't even begin to approach my
| profile (built over dozens of backed projects). As a result of
| this, KS emails have become spam to be deleted without even
| reading them. This also means I have not backed projects I would
| have seriously considered and I don't frequent the site as much
| as I used to.
|
| Getting back on topic: It will be interesting to see how Sora
| usage evolves.
| matco11 wrote:
| Forget video. Imagine what this is going to do for video-gaming
| azinman2 wrote:
| Technically it's amazing that this is possible at all. Yet I
| don't see how the world is better off for it on net. Aside from
| eliminating jobs in FX/filming/acting/set design/etc, what do we
| really gain? Amateur filmmakers can be more powerful? How about
| we put the same money into a fund for filmmakers to access. The
| negatives are plentiful, from the mundane reduction of our media
| to monolithic simulacra to putting the nail in the coffin for
| truth to exist unchallenged, let alone the 'fine tunes' that will
| continue to come for deepfakes that are literal (sexual)
| harassment.
|
| Humans are not built for this power to be in the hands of
| everyone with low friction.
| khushy wrote:
| I can't wait for the safety features because I know there are
| those in society that would do bad things. But not me, though.
| I'd like the unlocked version.
| soheil wrote:
| That's not how the world is supposed to work. I wonder if
| there are going to be long-term psychological effects of
| being exposed to videos like these regularly. If our neurons
| are unable to receive a stable stream of reality like we
| have for millions of years, will our brains become
| dysfunctional over time?
| nox101 wrote:
| what in particular is better about this than
|
| https://civitai.com/videos
| ta2112 wrote:
| The mammoths are walking over some pre-existing footprints, but
| they don't leave any prints of their own. I guess I'm getting
| hung up on little things. For a prompt of a few words, it looks
| pretty nice!
___________________________________________________________________
(page generated 2024-12-09 23:00 UTC)