[HN Gopher] Sora is here
       ___________________________________________________________________
        
       Sora is here
        
       Author : toomuchtodo
       Score  : 594 points
       Date   : 2024-12-09 18:02 UTC (4 hours ago)
        
 (HTM) web link (openai.com)
 (TXT) w3m dump (openai.com)
        
       | toomuchtodo wrote:
       | From "12 Days of OpenAI: Day 3"
       | 
       | https://www.youtube.com/watch?v=2jKVx2vyZOY (live as of this
       | comment)
        
         | bbor wrote:
          | Over now, and pretty short/light on info AFAICT. That said,
          | knowing what we know now about Altman made me physically unable
          | to watch while he engages in sustained eye contact with the
          | camera, so maybe I missed something while skimming! On the
          | upside, I'm so glad we have three billionaires cultivating
          | three different cinema-supervillain vibes (Musk, Altman, &
          | Zuckerberg). Much fresher than the usual "oil baron" aesthetic
          | we know from the Gilded Age.
        
       | optimalsolver wrote:
       | Wish they'd followed their previous MO of releasing stuff with no
       | warning or buildup.
       | 
       | Results won't match the hype.
        
         | cooper_ganglia wrote:
          | I feel like announcing a new product in the same vein as your
          | main product, as an established company, is almost always a
          | bad idea. If you're going to improve your product, don't
          | announce the improvements 6-12 months ahead of time and grow
          | the hype to unmanageable levels; just announce a great product
          | and tell people it's available starting today.
        
       | transformi wrote:
        | Not impressive compared to the open-source video models out
        | there. I anticipated some physics/VR capabilities, but it's
        | basically just a marketing promotion to "stay in the game"...
        
         | Geee wrote:
         | What's the best open source video model right now?
        
           | minimaxir wrote:
            | Hunyuan (https://replicate.com/tencent/hunyuan-video ,
            | $0.70/video) is the best but somewhat expensive. LTX
            | (https://replicate.com/fofr/ltx-video , $0.10) is
            | cheaper/faster but less capable.
           | 
           | Both are permissively licensed.
        
             | treesciencebot wrote:
              | Hunyuan at other providers like fal.ai is cheaper than Sora
              | for the same resolution: at 720p and 5 seconds, $20 gets
              | you ~15 videos on Sora vs. almost 50 videos at fal. It is
              | slower than Sora (~3 minutes for a 720p video) but faster
              | than replicate's Hunyuan (by 6-7x for the same settings).
             | 
             | https://fal.ai/models/fal-ai/hunyuan-video
        
           | cooper_ganglia wrote:
           | Hunyuan is a recent one that has looked pretty good.
        
         | bbor wrote:
          | I... can you explain, or point to some competitors...? To me
          | this looks _leagues_ ahead of everything else. But maybe I'm
          | behind the game?
         | 
         | AFAIK based on HuggingFace trending[1], the competitors are:
         | 
         | - bytedance/animatediff-lightning:
         | https://arxiv.org/pdf/2403.12706 (2.7M downloads in the past
         | 30d, released in March)
         | 
         | - genmo/mochi-1-preview: https://github-production-user-
         | asset-6210df.s3.amazonaws.com... (21k downloads, released in
         | October)
         | 
         | - thudm/cogvideox-5b: https://huggingface.co/THUDM/CogVideoX-5b
         | (128k downloads, released in August)
         | 
         | Is there a better place to go? I'm very much not plugged into
         | this part of LLMs, partially because it's just so damn
         | spooky...
         | 
         | EDIT: I now see the reply above referencing Hunyuan, which I
         | didn't even know was its own model. Fair enough! I guess, like
         | always, we'll just need to wait for release so people can run
         | their own human-preference tests to definitively say which is
         | better. Hunyuan does indeed seem good
        
         | zeknife wrote:
          | As with music generation models, the main thing that might
          | make "open source" models better is likely that they have no
          | concern about excluding copyrighted material from the
          | training data, so they actually get a good starting point
          | instead of a dataset consisting of YouTube videos and stock
          | footage.
        
       | fngjdflmdflg wrote:
       | Serious question: is this better than current text-to-video
       | models like Hailuo?
        
       | meetpateltech wrote:
       | Pricing:
       | 
        | Plus Tier ($20/month)
       | 
       | - Up to 50 priority videos (1,000 credits)
       | 
       | - Up to 720p resolution and 5s duration
       | 
        | Pro Tier ($200/month)
       | 
       | - Up to 500 priority videos (10,000 credits)
       | 
       | - Unlimited relaxed videos
       | 
       | - Up to 1080p resolution, 20s duration and 5 concurrent
       | generations
       | 
       | - Download without watermark
       | 
       | more info: https://help.openai.com/en/articles/10245774-sora-
       | billing-cr...
        
         | cube2222 wrote:
         | Worth noting here that this is the existing ChatGPT
         | subscription, you don't need a separate one.
        
         | jsheard wrote:
         | Called it, they were sitting on Sora until the $200 tier
         | launched. Between the watermarking and 50 video limit the $20
         | tier is functionally a trial.
        
         | dbspin wrote:
          | Wow, they're watermarking videos and limiting them to 720p at
          | the $20 price point? That's a bold move, considering their
          | competition's pricing...
         | 
         | https://www.klingai.com/membership/membership-plan
         | 
         | Quality seems relatively similar based on the samples I've
         | seen. With the same issues - object permanence, temporal
         | stability, physics comprehension etc, being present in both.
         | Kling has no qualms about copyright violation however.
        
           | minimaxir wrote:
           | At OpenAI's $20/mo price point, you can also only generate
           | _16_ 720p 5s videos per month.
           | 
            | Kling doesn't seem to have more granular information
            | publicly, but I suspect it allows for more than 16 videos
            | per month.
        
         | throwup238 wrote:
         | From the FAQ [1], too:
         | 
         |  _> > Can I purchase more credits?
         | 
         | > We currently don't support the ability to purchase more
         | credits on a one-time basis.
         | 
         | > If you are on a ChatGPT Plus and would like to access more
         | credits to use with Sora, you can upgrade to the Pro plan._
         | 
         | Ouch. Looks like they're really pushing this ChatGPT pro
         | subscription. Between the watermark and being unable to buy
         | more credits, the plus plan is basically a small trial.
         | 
         | [1] https://help.openai.com/en/articles/10245774-sora-billing-
         | cr...
        
       | yeknoda wrote:
        | I've found, using these and similar tools, that the amount of
        | prompting and iteration required to realize my vision (the
        | image or video in my mind) is very large, and often they cannot
        | create what I had originally wanted. A way to test this is to
        | take a piece of footage or an image as ground truth, and
        | measure how much prompting and editing it takes to reproduce it
        | (or something similar) starting from scratch. It is basically
        | not possible with the current tech and finite amounts of time
        | and iterations.
        
         | cube2222 wrote:
         | Agreed. It's still much better than what I could do myself
         | without it, though.
         | 
         | (Talking about visual generative AI in general)
        
           | JKCalhoun wrote:
           | Yeah, but if I handed you a Maxfield Parrish it would be
           | better than either of us can do -- but not what I asked for.
           | 
           | I find generative AI frustrating because I _know_ what I
           | want. To this point I have been trying but then ultimately
           | sitting it out -- waiting for the one that really works the
           | way I want.
        
             | cube2222 wrote:
             | For me even if I know what I want, if I'm using gen AI I'm
             | happy to compromise and get good enough (which again, is so
             | much better than I could do otherwise).
             | 
             | If you want higher quality/precision, you'll likely want to
             | ask a professional, and I don't expect that to change in
             | the near future.
        
               | adamc wrote:
               | That limits its value for industries like Hollywood,
               | though, doesn't it? And without that, who exactly is
               | going to pay for this?
        
               | jddj wrote:
               | Advertisers, I guess. Same folks who paid for everything
               | else around here
        
               | adamc wrote:
               | Yeah, I just question if there are enough customers to
               | make this work.
        
               | cube2222 wrote:
               | To me, currently, visual generative ai is an evolution
               | and improvement of stock images, and has effectively the
               | same purpose.
               | 
               | People pay for stock images.
        
               | adamc wrote:
               | Yeah, maybe for some purposes. In business, people
               | sometimes pay for stock images but often don't have the
               | expertise or patience to really spend a lot of time
               | coaching a video into fruition. Maybe for advertising or
               | other contexts where more effort is worth it (not just
               | powerpoints), but it feels like a slim audience.
        
               | cube2222 wrote:
               | With tools like Apple Intelligence and its genmoji (emoji
               | generation) and playground (general diffusion image
               | generation) I expect it to also take on some of the
               | current entertainment and social use-cases of stickers
               | and GIFs.
               | 
               | But that's probably something you don't pay for directly,
               | instead paying for e.g. a phone that has those features.
        
         | nomel wrote:
         | I think inpainting and "draw the label scene" type interfaces
         | are the obvious future. Never thought I'd miss GauGAN [1].
         | 
         | https://www.youtube.com/watch?v=uNv7XBngmLY&t=25
        
         | minimaxir wrote:
         | Way back in the days of GPT-2, there was an expectation that
          | you'd need to cherry-pick at least 10% of your output to get
         | something usable/coherent. GPT-3 and ChatGPT greatly reduced
         | the need to cherry-pick, for better or for worse.
         | 
         | All the generated video startups seem to generate videos with
         | _much_ lower than 10% usable output, without significant human-
         | guided edits. Given the massive amount of compute needed to
         | generate a video relative to hyperoptimized LLMs, the quality
         | issue will handicap gen video for the foreseeable future.
        
           | joe_the_user wrote:
           | Plus editing text or an image is practical. Video editors
           | typically are used to cut and paste video streams - a video
           | editor can't fix a stream of video that gets motion or
           | anatomy wrong.
        
         | mattigames wrote:
          | Not too far in the future you will be able to drag and drop
          | the positions of the characters as well as the position of
          | the camera, among other refinement tools.
        
         | isoprophlex wrote:
         | And another thing that irks me: none of these video generators
         | get motion right...
         | 
          | Especially anything involving fluid/smoke dynamics, or fast
          | dynamic movements of humans and animals: they all suffer from
          | the same weird motion artifacts. I can't describe it other
          | than that the fluidity of the movements is completely off.
          | 
          | And as all the genAI video tools I've used suffer from the
          | same problem, I wonder if this is somehow inherent to the
          | approach and somehow unsolvable with the current model
          | architectures.
        
           | benchmarkist wrote:
           | Neural networks use smooth manifolds as their underlying
           | inductive bias so in theory it should be possible to
           | incorporate smooth kinematic and Hamiltonian constraints but
           | I am certain no one at OpenAI actually understands enough of
           | the theory to figure out how to do that.
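For what it's worth, the conservation constraint being proposed can be sketched as a penalty term on generated trajectories. This is a toy illustration only (a 1-D harmonic oscillator is assumed; nothing here comes from the thread or from any actual model):

```python
import numpy as np

def energy(state, mass=1.0, k=1.0):
    """Total energy of a 1-D harmonic oscillator (toy system)."""
    q, p = state
    return p**2 / (2 * mass) + 0.5 * k * q**2

def conservation_penalty(trajectory):
    """Mean squared drift of total energy across generated frames."""
    energies = np.array([energy(s) for s in trajectory])
    return float(np.mean((energies - energies[0]) ** 2))

t = np.linspace(0, 2 * np.pi, 50)
good = list(zip(np.cos(t), -np.sin(t)))           # exact oscillator solution
bad = [(np.cos(x), -1.5 * np.sin(x)) for x in t]  # momentum inflated by 50%

print(conservation_penalty(good))  # ~0: energy is conserved
print(conservation_penalty(bad))   # clearly positive
```

In a physics-informed training setup, such a penalty would be added to the ordinary reconstruction loss; whether that scales to video generation is exactly what the thread disputes.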
        
             | david-gpu wrote:
             | _> I am certain no one at OpenAI actually understands
             | enough of the theory to figure out how to do that_
             | 
             | We would love to learn more about the origin of your
             | certainty.
        
               | benchmarkist wrote:
               | I don't work there so I'm certain there is no one with
               | enough knowledge to make it work with Hamiltonian
               | constraints because the idea is very obvious but they
               | haven't done it because they don't have the wherewithal
               | to do so. In other words, no one at OpenAI understands
               | enough basic physics to incorporate conservation
               | principles into the generative network so that objects
               | with random masses don't appear and disappear on the
               | "video" manifold as it evolves in time.
        
               | david-gpu wrote:
                | _> the idea is very obvious but they haven't done it
               | because they don't have the wherewithal to do so_
               | 
               | Fascinating! I wish I had the knowledge and wherewithal
               | to do that and become rich instead of wasting my time on
               | HN.
        
               | benchmarkist wrote:
               | No one is perfect but you should try to do better and
               | waste less time on HN now that you're aware and can act
               | on that knowledge.
        
               | david-gpu wrote:
               | Nah, I'm good. HN can be a very amusing place at times.
               | Thanks, though.
        
             | dartos wrote:
             | How does your conclusion follow from your statement?
             | 
             | Neural networks are largely black box piles of linear
             | algebra which are massaged to minimize a loss function.
             | 
             | How would you incorporate smooth kinematic motion in such
             | an environment?
             | 
             | The fact that you discount the knowledge of literally every
             | single employee at OpenAI is a big signal that you have no
             | idea what you're talking about.
             | 
             | I don't even really like OpenAI and I can see that.
        
               | benchmarkist wrote:
               | I've seen the quality of OpenAI engineers on Twitter and
               | it's easy enough to extrapolate. Moreoever, neural
               | networks are not black boxes, you're just parroting
               | whatever you've heard on social media. The underlying
               | theory is very simple.
        
               | dartos wrote:
               | Do not make assumptions about people you do not know in
               | an attempt to discredit them. You seem to be a big fan of
               | that.
               | 
               | I have been working with NLP and neural networks since
               | 2017.
               | 
               | They aren't just black boxes, they are _largely_ black
               | boxes.
               | 
                | When training an NN, you don't have great control over
                | which parts of the model do what, or how.
               | 
               | Now instead of trying to discredit me, would you mind
               | answering my question? Especially since, as you say, the
               | theory is so simple.
               | 
               | How would you incorporate smooth kinematic motion in such
               | an environment?
        
               | benchmarkist wrote:
               | Why would I give away the idea for free? How much do you
               | want to pay for the implementation?
        
               | dartos wrote:
               | lol. Ok dude you have a good one.
        
               | benchmarkist wrote:
               | You too but if you do want to learn the basics then
               | here's one good reference:
               | https://www.amazon.com/Hamiltonian-Dynamics-Gaetano-
               | Vilasi/d.... If you already know the basics then this is
               | a good followup: https://www.amazon.com/Integrable-
               | Hamiltonian-Systems-Geomet.... The books are much cheaper
               | than paying someone like me to do the implementation.
        
               | dartos wrote:
               | Yeah I mean I would never pay you for anything.
               | 
               | You've convinced me that you're small and know very
               | little about the subject matter.
               | 
               | You don't need to reply to this. I'm done with this
               | convo.
        
               | benchmarkist wrote:
               | Ok, have a good one dude.
        
               | mech422 wrote:
               | cop out... according to you, the idea is so obvious it
               | wouldn't be worth anything.
        
               | benchmarkist wrote:
               | It's not worth anything to me but I'm sure OpenAI would
               | be willing to pay a lot of money for opening up the
               | market of video generators with realistic physical
               | evolution. If they want to hire an ultra genius such as
               | myself to do the work then it would be worth at least
               | $500k to them so that's basically the floor for anyone
               | else who wants the actual details. The actual market
               | would be worth billions so I'm basically giving the idea
               | away at that price.
        
           | giantrobot wrote:
           | I think one of the biggest problems is the models are trained
           | on 2D sequences and don't have any understanding of what
           | they're actually seeing. They see some structure of pixels
           | shift in a frame and learn that some 2D structures should
           | shift in a frame over time. They don't actually understand
           | the images are 2D capture of an event that occurred in four
           | dimensions and the thing that's been imaged is under the
           | influence of _unimaged_ forces.
           | 
           | I saw a Santa dancing video today and the suspension of
           | disbelief was almost instantly dispelled when the cuffs of
           | his jacket moved erratically. The GenAI was trying to get
           | them to sway with arm movements but because it didn't
           | understand _why_ they would sway it just generated a
           | statistical approximation of swaying.
           | 
            | GenAI also definitely doesn't understand 3D structures, as
            | is easily demonstrated by completely incorrect
            | morphological features.
           | Even my dogs understand gravity, if I drop an object they're
           | tracking (food) they know it should hit the ground. They also
           | understand 3D space, if they stand on their back legs they
           | can see over things or get a better perspective.
           | 
           | I've yet to see any GenAI that demonstrates even my dogs'
           | level of understanding the physical world. This leaves their
           | output in the uncanny valley.
        
           | jeroen wrote:
           | They don't even get basic details right. The ship in the 8th
           | video changes with every camera change and birds appear out
           | of nowhere.
        
           | soheil wrote:
            | What's the point of poking holes in new technology and
            | nitpicking like this? Are you blind to the immense
            | breakthroughs made today, focusing instead on what irks you
            | about some tiny detail that might go away after a couple of
            | versions?
        
         | miltonlost wrote:
         | The adage "a picture is worth a thousand words" has the nice
         | corollary "A thousand words isn't enough to be precise about an
         | image".
         | 
         | Now expand that to movies and games and you can get why this
         | whole generative-AI bubble is going to pop.
        
           | szundi wrote:
           | Comment was probably rather about the 360 degree turning
           | heads etc.
        
           | GistNoesis wrote:
           | (2020) https://arxiv.org/abs/2010.11929 : an image is worth
           | 16x16 words transformers for image recognition at scale
           | 
           | (2021) https://arxiv.org/abs/2103.13915 : An Image is Worth
           | 16x16 Words, What is a Video Worth?
           | 
           | (2024) https://arxiv.org/abs/2406.07550 : An Image is Worth
           | 32 Tokens for Reconstruction and Generation
        
             | dartos wrote:
             | Those are indeed 3 papers.
        
               | GistNoesis wrote:
                | Yes, in a nutshell they explain that you can express a
                | picture or a video with relatively few discrete tokens.
                | 
                | The first paper is the most famous and prompted a lot of
                | research into using text-generation tools in the image-
                | generation domain: 256 "words" for an image. The second
                | paper uses 24 reference images per minute of video. The
                | third paper is a refinement of the first, saying you
                | only need 32 "tokens". I'll let you multiply the
                | numbers.
                | 
                | It's kind of like a who's-who game, where you can
                | identify any human on earth with ~32 bits of
                | information.
                | 
                | The corollary being that, contrary to what the parent
                | is saying, there is no theoretical obstacle to
                | obtaining a video from a textual description.
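The who's-who estimate is easy to sanity-check: the number of bits needed to index every living human is the base-2 log of the world population (the population figure below is an assumption for illustration):

```python
import math

# Rough 2024 world-population estimate (assumed for illustration).
world_population = 8.1e9
bits = math.ceil(math.log2(world_population))
print(bits)  # 33 -- i.e. on the order of the ~32 bits mentioned above
```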
        
               | dartos wrote:
               | I think something is getting lost in translation.
               | 
                | These papers, from my quick skim (tho I did read the
                | first one fully, years ago), seem to show that some
                | images, and to an extent video, can be generated from
                | discrete tokens, but they do not show that exact
                | images, or that any image, can be.
               | 
               | For instance, what combination of tokens must I put in to
               | get _exactly_ Mona Lisa or starry night? (Tho these might
               | be very well represented in the data set. Maybe a lesser
               | known image would be a better example)
               | 
               | As I understand, OC was saying that they can't produce
               | what they want with any degree of precision since there's
               | no way to encode that information in discrete tokens.
        
               | GistNoesis wrote:
                | If you want to know what tokens yield _exactly_ the
                | Mona Lisa, or any other image, you take the image and
                | put it through your image tokenizer, aka encode it; and
                | if you have the sequence of tokens, you can decode it
                | back to an image.
                | 
                | VQ-VAE (Vector Quantised-Variational AutoEncoder), (2017)
                | https://arxiv.org/abs/1711.00937
                | 
                | The whole encoding-decoding process is reversible, and
                | you only lose some imperceptible "details"; the process
                | can be trained with either an L2 loss or a perceptual
                | loss, depending on what you value.
                | 
                | The point being that images which occur naturally are
                | not really information-rich and can be compressed a lot
                | by neural networks of a few GB that have seen billions
                | of pictures. With that strong prior, aka common
                | knowledge, we can indeed paint with words.
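The encode/decode round trip described above can be sketched with plain nearest-neighbour vector quantisation. This is a toy stand-in for a VQ-VAE: a real one *learns* its codebook, whereas the codebook and all sizes here are random and illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy codebook: 16 code vectors of dimension 4 (a real VQ-VAE learns these).
codebook = rng.normal(size=(16, 4))

def encode(vectors):
    """Map each vector to the index of its nearest codebook entry."""
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def decode(tokens):
    """Look the indices back up; reconstruction is lossy but close."""
    return codebook[tokens]

patches = rng.normal(size=(8, 4))  # stand-ins for image patches
tokens = encode(patches)           # 8 discrete "words"
recon = decode(tokens)             # nearest code vectors, not the originals
print(tokens.shape)                # (8,)
```

Note the lossiness: `recon` is only the nearest codebook approximation of `patches`, which is exactly the "imperceptible details" trade-off mentioned in the comment.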
        
               | dartos wrote:
               | Maybe I'm not able to articulate my thought well enough.
               | 
                | Taking an existing image and reversing the process to
                | get the tokens that led to it, then redoing that,
                | doesn't seem the same as supplying tokens to get a
                | precise novel image.
                | 
                | Especially since, as you said, we'd lose some details;
                | that suggests not all images can be perfectly described
                | and recreated.
               | 
               | I suppose I'll need to play around with some of those
               | techniques.
        
           | TeMPOraL wrote:
           | > _Now expand that to movies and games and you can get why
           | this whole generative-AI bubble is going to pop._
           | 
            | What will save it is that, no matter how picky you are as a
            | creator, your audience will never know what exactly it was
            | that you dreamed up, so any half-decent approximation will
            | work.
           | 
           | In other words, a corollary to your corollary is,
           | "Fortunately, you don't need them to be, because no one cares
           | about low-order bits".
           | 
           | Or, as we say in Poland, "What the eye doesn't see, the heart
           | doesn't mourn."
        
             | jsheard wrote:
             | > What will save it is that, no matter how picky you are as
             | a creator, your audience will never know what exactly was
             | that you dreamed up, so any half-decent approximation will
             | work.
             | 
             | Part of the problem is the "half decent approximations"
             | tend towards a cliched average, the audience won't know
             | that the cool cyberpunk cityscape you generated isn't
             | exactly what you had in mind, but they will know that it
             | looks like every other AI generated cyberpunk cityscape and
             | mentally file your creation in the slop folder.
             | 
              | I think the pursuit of fidelity has made the models less
              | creative over time; they make fewer glaring mistakes like
              | giving people six fingers, but their output is ever more
              | homogenized and interchangeable.
        
               | randomcatuser wrote:
               | a somewhat counterintuitive argument is this: AI models
               | will make the overall creative landscape more diverse and
               | interesting, ie, less "average"!
               | 
                | Imagine the space of ideas as a circle, with stuff in
                | the middle being easier to reach (the "cliched
                | average"). Previously, traversing the circle was
                | incredibly hard - we had to use tools like DeviantArt,
                | Instagram, etc. to agglomerate the diverse tastes of
                | artists, hoping to find or create the style we were
                | looking for. Recreating a particular art style meant
                | hiring the artist. As a result, on average, what you
                | see is the product of huge amounts of human curation,
                | effort, and branding teams.
               | 
               | Now reduce the effort 1000x, and all of a sudden, it's
               | incredibly easy to reach the edge of the circle (or
               | closer to it). Sure, we might still miss some things at
               | the very outer edge, but it's equivalent to building
               | roads. Motorists appear, people with no time to sit down
               | and spend 10000 hours to learn and master a particular
               | style can simply _remix art_ and create things wildly
               | beyond their manual capabilities. As a result, the amount
               | of content in the infosphere skyrockets, the tastemaking
               | velocity accelerates, and you end up with a more
                | interesting infosphere than you're used to.
        
               | wongarsu wrote:
               | And as AI oversaturates the cliched average, creators
               | will have to get further and further away from the
               | average to differentiate themselves. If you pour a lot of
               | work into your creation you want to make it clear that it
               | isn't some cliched AI drivel.
        
               | skydhash wrote:
               | You will basically have to provide a video showcasing
               | your workflow.
        
               | TeMPOraL wrote:
               | To extend the analogy, imagine the circle as a
               | probability distribution; for simplicity, imagine it's a
               | bivariate normal joint distribution (aka. Gaussian in 3D)
               | + some noise, and you're above it and looking down.
               | 
               | When you're commissioning an artist to make you some art,
               | you're basically sampling from the entire distribution.
               | Stuff in the middle is, as you say, easiest to reach, so
               | that's what you'll most likely get. Generative models let
               | more people do art, meaning there's more sampling
               | happening, so the stuff further from the centre will be
               | visited more often, too.
               | 
               | However, AI tools also make another thing easier: moving
               | and narrowing the sampling area. Much like with a very
               | good human artist, you can find some work that's "out
               | there", and ask for variations of it. However, there are
               | only so many good artists to go around. AI making this
               | process much easier and more accessible means more
               | exploration of the circle's edges will happen. Not just
               | "more like this weird thing", but also combinations of 2,
               | 3, 4, N distinct weird things. So in a way, I feel that
                | AI tools will surface creative art disproportionately
                | more than they'll boost the common case.
               | 
               | Well, except for the fly in the ointment that's the
               | advertising industry (aka. the cancer on modern society).
               | Unfortunately, by far most of the creative output of
               | humanity today is done for advertising purposes, and that
               | goal favors the common, as it maximizes the audience (and
               | is least off-putting). Deluge of AI slop is unavoidable,
               | because slop is how the digital world makes money, and
               | generative AI models make it cheaper than generative
               | protein models that did it so far. Don't blame AI
               | research for that, blame advertising.
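                The sampling picture above can be sketched numerically
                (an editor's illustration; the distribution, the cutoff
                radius, and the shifted sampling centre are all arbitrary
                choices, not from the thread):

```python
import random

random.seed(0)
N = 10_000

# Broad sampling from a standard 2D Gaussian: the "whole circle".
# Most draws land near the centre of the distribution.
broad = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]

# Shifted, narrowed sampling: like asking for variations on a piece
# of work that's already "out there" near the distribution's edge.
edge = [(random.gauss(3, 0.3), random.gauss(3, 0.3)) for _ in range(N)]

def frac_beyond(points, radius=2.5):
    """Fraction of samples farther than `radius` from the origin."""
    return sum(1 for x, y in points
               if (x * x + y * y) ** 0.5 > radius) / len(points)

print(frac_beyond(broad))  # rare: most draws stay near the centre
print(frac_beyond(edge))   # nearly all draws are now "out there"
```

                More accessible tools mean both more total draws from the
                broad distribution and more of this shifted, narrowed
                sampling near the edges.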
        
               | robertlagrant wrote:
               | It's just like when Bootstrap came out. Terrible-looking
               | websites stopped appearing, but so did beautiful
               | websites.
        
               | TeMPOraL wrote:
               | > _I think the pursuit of fidelity has made the models
               | less creative over time (...) their output is ever more
               | homogenized and interchangable._
               | 
                | Ironically, we're long past that point with _human
                | creators_, at least when it comes to movies and games.
               | 
               | Take sci-fi movies, compare modern ones to the ones from
               | the tail end of the 20th century. Year by year, VFX gets
               | more and more detailed (and expensive) - more and better
               | lights, finer details on every material, more stuff
               | moving and emitting lights, etc. But all that effort
               | arguably _killed_ immersion and believability, by _making
                | scenes incomprehensible_. There's _way_ too much visual
                | noise in action scenes in particular - bullets and
                | lightning bolts zip around, and all that detail just
                | blurs together. Contrast the 20th-century productions -
                | textures weren't as refined, but you could at least
                | tell who's shooting whom and when.
               | 
                | Or take video games, where all that graphics work makes
                | everything look the same. Games that go for a realistic
                | style especially are all homogeneous these days, and
                | it's all cheap plastic.
               | 
               | (Seriously, what the fuck went wrong here? All that talk,
               | and research, and work into "physically based rendering",
               | yet in the end, all PBR materials end up looking like
               | painted plastic. Raytracing seems to help a bit when it
               | comes to liquids, but it still can't seem to make metals
                | look like metals rather than Fisher-Price toys
                | repainted gray.)
               | 
               | So I guess in this way, more precision just makes the
               | audience give up entirely.
               | 
               | > _they will know that it looks like every other AI
               | generated cyberpunk cityscape and mentally file your
               | creation in the slop folder._
               | 
               | The answer here is the same as with human-produced slop:
               | don't. People are good at spotting patterns, so keep
               | adding those low-order bits until it's no longer obvious
               | you're doing the same thing everyone else is.
               | 
               | EDIT: Also, obligatory reminder that generative models
                | don't give you the average of the training data with
                | some noise mixed in; they _sample from a learned
                | distribution_. The law of large numbers applies, but
                | that just means that to get more creative output, you
                | need to bias the sampling.
        
               | wongarsu wrote:
                | The video game industry (the much larger of the two,
                | by revenue) seems closer to understanding this. AAA
               | games dominate advertising and news cycles, but on any
               | best-seller list AAA games are on par with indie and B
               | games (I think they call them AA now?). For every
               | successful $60M PBR-rendered Unreal 5 title there is an
               | equally successful game with low-fidelity graphics but
               | exceptional art direction, story or gameplay.
               | 
               | Western movie studios may discover the same thing soon,
               | with the number of high-budget productions tanking
               | lately.
        
               | robertlagrant wrote:
               | I agree. The one shining hope I have is the incredible
               | art and animation style of Fortiche[0]'s Arcane[1]
               | series. Watch that, and then watch any recent (and
               | identikit) Pixar movie, and they are just streets ahead.
               | It's just brilliant.
               | 
               | [0] https://en.wikipedia.org/wiki/Fortiche
               | 
               | [1] https://en.wikipedia.org/wiki/Arcane_(TV_series)
        
               | samatman wrote:
                | Empirically, we've passed the point where that's true,
                | at least for someone who isn't being lazy about it.
               | 
               | https://www.astralcodexten.com/p/how-did-you-do-on-the-
               | ai-ar...
               | 
               | In other words, someone willing to tweak the prompt and
               | press the button enough times to say "yeah, that one,
               | that's really good" is going to have a result which
               | cannot in fact be reliably binned as AI-generated.
        
             | dartos wrote:
             | Your eye sees just about every frame of a film...
             | 
             | People may not think they care, but obviously they do.
              | That's why Marvel movies do better than DC ones.
             | 
             | People absolutely care about details in their media.
        
               | TeMPOraL wrote:
               | Fair point, particularly given the example. My conclusion
               | wrt. Marvel vs. DC is that DC productions care much less
               | about details, in exactly the way I find off-putting.
               | 
               | Not all details matter, some do. And, it's better to not
               | show the details at all, than to be inconsistent in them.
               | 
               | Like, idk., don't identify a bomb as a specific type of
                | existing air-fuel ordnance and then treat it as if it
                | were a goddamn tactical nuke. Something along those
                | lines is what made me stop watching the _Arrow_ series.
        
               | dartos wrote:
               | > Not all details matter, some do
               | 
                | This is a key observation. Unfortunately, solving in
                | general for which details matter is extremely
                | difficult.
               | 
               | I don't think video generation models help with that
               | problem, since you have even less control of details than
               | you do with film.
               | 
               | At least before post.
        
               | og_kalu wrote:
                | The visuals are at the very bottom of the list of
                | reasons why DC movies have performed worse over the
                | years.
               | 
               | The movies have just had much worse audience and critical
               | reception.
        
             | naasking wrote:
             | I was just going to say this. If you have an artistic
             | vision that you simply _must_ create to the minutest
              | detail, then like any artist, you're in for a lot of
              | manual work.
             | 
             | If you are not beholden to a precise vision or maybe just
             | want to create something that sells, these tools will
             | likely be significant productivity multipliers.
        
               | whstl wrote:
               | Exactly.
               | 
               | So far ChatGPT is not for writing books, but is great for
               | SEO-spam blogposts. It is already killing the content
               | marketing industry.
               | 
               | So far Dall-E is not for making master paintings, but
               | it's great for stock images. It might kill most of the
               | clipart and stock image industry.
               | 
               | So far Udio and other song generators are not able to
               | make symphonies, but it's great for quiet background
               | music. It might kill most of the generic royalty-free-
               | music industry.
        
             | hammock wrote:
             | It's like how there are two types of movie directors (or
             | creative directors in general), the dictatorial "100 takes
             | until I get it exactly how I envision it" type, and the "I
             | hired you to act, so you bring the character to life for me
             | and what will be will be" type
             | 
             | Right now AI is more the latter, but many people want it to
             | be the former
        
               | troupo wrote:
               | AI is neither.
               | 
               | A director letting actors "just be" knows exactly what
                | he/she wants, and chooses actors accordingly - just as
                | the directors who want control over the most minute
                | detail do.
               | 
               | Clint Eastwood tries to do at most one take of a scene.
               | David Fincher is infamous for his dozens of takes.
               | 
               | AI is neither Fincher nor Eastwood.
        
             | wcfrobert wrote:
              | Do artists really have a fully formed vision in their
              | heads? I suspect the creative process is much more
              | iterative than one-directional.
        
               | skydhash wrote:
               | No one can have a fully formed vision. But intent, yes.
                | Then you use techniques to materialize it. Words are a
                | poor substitute for that intent, which is why there are
                | so many sketches in a visual project.
        
               | maxglute wrote:
                | And why physical execution frequently departs
                | significantly from sketches and concept art. The
                | amount of intent that doesn't get translated is pretty
                | staggering in both physical and digital pipelines in
                | many projects.
        
             | Ar-Curunir wrote:
              | That's just sad, and why people have a derogatory
              | stance towards generative AI: "half-decent"
              | approximation removes all personality from the output,
              | leading to a bunch of slop on the internet.
        
               | TeMPOraL wrote:
               | It does indeed, but then many of those people don't
               | notice they're already consuming half-decent,
               | personality-less slop, because that's what human artists
               | make too, when churning out commercial art for peanuts
               | and on tight deadlines.
               | 
               | It's less obvious because people _project_ personality
               | onto the content they see, because they implicitly
                | _assume_ the artist _cared_, and had some vision in
               | mind. Cheap shit doesn't look like cheap shit in
               | isolation. Except when you know it's AI-generated,
               | because this removes the artist from the equation, and
               | with it, your assumptions that there's any personality
               | involved.
        
               | whatevertrevor wrote:
               | I'm not so sure, one of the primary complaints about IP
               | farming slop that major studios have produced recently is
               | a lack of firm creative vision, and clear evidence of
               | design by committee over artist direction.
               | 
               | People can generally see the lack of artistic intent when
               | consuming entertainment.
        
               | TeMPOraL wrote:
               | That's true. Then again, complaints about "lack of firm
               | creative vision, and clear evidence of design by
               | committee over artist direction" is something I've seen
               | levied against Disney for several years now; importantly,
               | they started _before_ generative AI found its way into
               | major productions.
               | 
               | So, while GenAI tools make it easier to create
               | superficially decent work that lacks creative intent, the
               | studios managed to do it just fine with human
               | intelligence only, suggesting the problem isn't AI, but
               | the studios and their modern management policies.
        
             | msabalau wrote:
              | Half-decent approximations work a lot better for
              | generating the equivalent of a stock illustration on a
              | PowerPoint slide.
             | 
             | Actual long form art like a movie works because it includes
             | many well informed choices that work together as a whole.
             | 
             | There seems to be a large gap between generating a few
             | seconds of video vaguely like one's notion, and trying to
             | create 90 minutes that are related and meaningful.
             | 
              | Which doesn't mean that you can't build more robust
              | tools from this starting place. But if you think that
              | this is a large, hard amount of work, it certainly could
              | call into question optimistic projections from people
              | who don't even seem to notice that there is work needed
              | at all.
        
           | throwup238 wrote:
           | "A frame is worth a billion rays"
           | 
           | The last production I worked on averaged 16 hours per frame
           | for the final rendering. The amount of information encoded in
           | lighting, models, texture, maps, etc is insane.
        
             | bongodongobob wrote:
             | What were you working on? It took a month to render 2
             | seconds of video?
        
               | elmigranto wrote:
               | I would guess there is more than one computer :)
               | 
               | Pixar's stuff famously takes days per frame.
        
               | Arelius wrote:
               | > Pixar's stuff famously takes days per frame.
               | 
               | Do you have a citation for this? My guess would be much
               | closer to a couple of hours per frame.
        
               | elmigranto wrote:
               | https://sciencebehindpixar.org/pipeline/rendering
        
               | Arelius wrote:
               | Most VFX productions take over 2 CPU hours a frame for
                | final video, and have for a very long time. It takes
                | far less than a month since this gets parallelized on
                | large render farms.
        
               | throwup238 wrote:
                | VFX-heavy feature for a Disney subsidiary. Each frame
                | is rendered independently - it's not like video
                | encoding, where each frame depends on the previous
                | one; each has its own scene assembly that can be sent
                | to a server to parallelize rendering. With enough
                | compute, the entire film can be rendered in a few
                | days. (It's a little more complicated than that, but
                | it works as a first-order approximation.)
               | 
               | I don't remember how long the final rendering took but it
               | was nearly two months and the final compute budget was 7
               | or 8 figures. I think we had close to 100k cores running
               | at peak from three different render farms during crunch
               | time, but don't take my word for it I wasn't producing
               | the picture.
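                A quick sanity check on these numbers (editor's sketch:
                the 16 CPU-hours per frame and ~100k peak cores come
                from this subthread; the 90-minute, 24 fps feature
                length is an assumption):

```python
# Back-of-envelope render-farm arithmetic.
hours_per_frame = 16          # average final-render cost in CPU-hours
fps = 24
minutes = 90                  # assumed feature length
frames = minutes * 60 * fps   # 129,600 frames

total_cpu_hours = frames * hours_per_frame   # ~2.07M CPU-hours
cores = 100_000               # peak core count across three farms

# With every frame independent, rendering parallelizes almost perfectly.
wall_clock_hours = total_cpu_hours / cores
print(f"{total_cpu_hours:,} CPU-hours, ~{wall_clock_hours:.0f} h at peak")
```

                At sustained peak that's under a day of wall-clock time,
                consistent with "the entire film can be rendered in a
                few days"; the nearly two months reported reflects
                retakes, scheduling, and the farms rarely running at
                peak.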
        
               | dist-epoch wrote:
               | Are they still using CPUs and not GPUs for rendering?
               | 
               | Weren't the rendering algos ported to CUDA yet?
        
               | jsheard wrote:
               | GPU renderers exist but they have pretty hard scaling
               | limits, so the highest end productions still use CPU
               | renderers almost exclusively.
               | 
               | The 3D you see in things like commercials is usually done
               | on GPUs though because at their smaller scale it's much
               | faster.
        
           | Al-Khwarizmi wrote:
           | If you can build a system that can generate engaging games
           | and movies, from an economic (bubble popping or not popping)
           | point of view it's largely irrelevant whether they conform to
           | fine-grained specifications by a human or not.
        
             | dartos wrote:
             | In other words:
             | 
             | If you find a silver bullet then everything else is largely
             | irrelevant.
             | 
             | Idk if you noticed but that "if" is carrying an insane
             | amount of weight.
        
             | jsheard wrote:
              | Text generation is the most mature form of genAI, and
              | even that isn't remotely close to producing infinite
             | engaging stories. Adding the visual aspect to make that
             | story into a movie or the interactive element to turn it
             | into a game is only uphill from there.
        
           | beambot wrote:
           | Corollary: I couldn't create an original visual piece of art
           | to save my life, so prompting is infinitely better than what
           | I could do myself (or am willing to invest time in building
           | skills). The gen-AI bubble isn't going to burst. Pareto
           | always wins.
        
           | mrandish wrote:
           | I agree that people who want any meaningful precision in
           | their visual results will inevitably be disappointed.
        
           | fooker wrote:
           | Sure it's going to pop. But when is the important question.
           | 
           | Being too early about this and being wrong are the same.
        
           | stale2002 wrote:
            | You are half right. It's funny because I use the same
            | saying. Mine is: "A picture is worth a thousand words.
            | That's why it takes 1000 words to describe the exact image
            | that you want! Much better to just use image-to-image
            | instead."
            | 
            | That's my full quote on this topic, and I think it stands.
            | Sure, people won't describe a picture; instead, they will
            | take an existing picture or video and modify it using AI.
            | That is much, much simpler and more useful, if you can
            | film a scene and then animate it later with AI.
        
           | meta_x_ai wrote:
           | A picture is worth a thousand words.
           | 
           | A word is worth a thousand pictures. (E.g Love)
           | 
           | It is abstraction all the way
        
           | raincole wrote:
           | The point is not to be precise. It's to be "good enough".
           | 
           | Trust me, even if you work with human artists, you'll keep
            | saying "it's not quite what I initially envisioned, but
            | we don't have the budget/time for another revision, so
            | it's good enough for now" _all the time_.
        
           | soheil wrote:
            | Maybe your AI bubble! If you define AI to be something
            | like just another programming language, then yes, you will
            | be sadly disappointed. See it instead as an employee with
            | its own intuitions and ways of doing things - one that
            | you're currently trying to micromanage.
           | 
           | I have a bad feeling that you'd be a horrible manager if you
           | ever were one.
        
         | hipadev23 wrote:
          | Real artists struggle to match vague descriptions of what
          | is in your head, too. This is at least quicker?
        
           | janalsncm wrote:
           | The point is if you are the artist and have something in your
           | head. It's the same problem with image editing. I am sure you
           | have experienced this.
        
             | mlboss wrote:
              | So what I am getting is a use case for brain-computer
              | interfaces.
        
             | TeMPOraL wrote:
             | There is no problem unless you insist on reflecting what
             | you had in mind _exactly_. That needs minute controls, but
              | no matter the medium and tools you use, unless you're
             | doing it in your own quest for artistic perfection, the
             | economic constraints will make you stop short of your idea
             | - there's always a point past which any further refinement
             | will not make a difference to the audience (which doesn't
             | have access to the thing in your head to use as reference),
             | and the costs of continuing will exceed any value (monetary
             | or otherwise) you expect to get from the work.
             | 
             | AI or not, no one but you cares about the lower order bits
             | of your idea.
        
               | janalsncm wrote:
               | I disagree. Even without exactness, adding any reasonable
               | constraints is impossible. Ask it to generate a realistic
               | circuit diagram or chess board or any other thing where
               | precision matters. Good luck going back and forth getting
               | it right.
               | 
               | These are situations with relatively simple logical
               | constraints, but an infinite number of valid solutions.
               | 
               | Keep in mind that we are not requiring any particular
               | configuration of circuit diagram, just any diagram that
               | makes sense. There are an infinite number of valid ones.
        
               | TeMPOraL wrote:
                | That's using the wrong tool for the job :). Asking
                | diffusion models to give you a valid circuit diagram
                | is like asking a painter to paint you a pixel-perfect
                | 300 DPI image on a regular canvas, using their
                | standard paintbrush. It ain't gonna work.
               | 
               | That doesn't mean it _can 't_ work with AI - it's that
               | you may need to add something extra to the generative
               | pipeline, something that can do circuit diagrams, and
               | make the diffusion model supply style and extra noise
               | (er, beautifying elements).
               | 
               | > _Keep in mind that we are not requiring any particular
               | configuration of circuit diagram, just any diagram that
               | makes sense. There are an infinite number of valid ones._
               | 
                | On that note: I'm the kind of person who loves to
               | freeze-frame movies to look at markings, labels, and
               | computer screens, and one thing I learned is that _humans
               | fail at this task too_. Most of the time the problems are
               | big and obvious, ruining my suspension of disbelief, and
               | importantly, they could be trivially solved if the
               | producers grabbed a random STEM-interested intern and
                | asked for advice. Alas, it seems they don't care.
               | 
               | This is just a specific instance of the general problem
               | of "whatever you work with or are interested in, you'll
               | see movies keep getting it wrong". Most of the time, it's
               | somewhat defensible - e.g. most movies get guns wrong,
                | but in a way people are used to, one that makes the
                | scenes more streamlined and entertaining. But with
                | labels, markings,
               | and computer screens, doing it right isn't any more
               | expensive, nor would it make the movie any less
               | entertaining. It seems that the people responsible don't
               | know better or care.
               | 
               | Let's keep that in mind when comparing AI output to the
                | "real deal", so as not to set an impossible standard
                | that human productions don't meet, and never did.
        
               | janalsncm wrote:
               | The issue isn't any particular constraint. The issue is
               | the inability to add any constraints at all.
               | 
               | In particular, internal consistency is one of the
               | important constraints which viewers will immediately
               | notice. If you're just using sora for 5 second unrelated
               | videos it may be less of an issue but if you want to do
               | anything interesting you'll need the clips to tie
               | together which requires internal consistency.
        
               | throwup238 wrote:
               | Nobody else really cares about the lower order bits of
               | the idea but they do care that those lower order bits are
               | consistent. The simplest example is color grading: most
               | viewers are generally ignorant of artistic choices in
                | color palettes unless they're noticeable, like the
                | Netflix blue tint, but a movie whose scenes haven't
                | been consistently color graded is obviously jarring,
                | and even an expensive production can come off as
                | amateur.
               | 
               | GenAI is great at filling in those lower order bits but
               | until stuff like ControlNet gets much better precision
               | and UX, I think genAI will be stuck in the uncanny valley
               | because they're inconsistent between scenes, frames, etc.
        
               | TeMPOraL wrote:
               | Yup, 100% agreed on that, and mentioned this caveat
               | elsewhere. As you say - people don't pay attention to
               | details (or lack of it), as long as the details are
               | consistent. Inconsistencies stand out like sore thumbs.
               | Which is why IMO it's best to have less details than to
               | be inconsistent with them.
        
           | staticman2 wrote:
           | Real artists take comic book scripts and turn them into
           | actual comic books every month. They may not match exactly
           | what the writer had in mind, but they are fit for purpose.
        
             | TeMPOraL wrote:
             | > _They may not match exactly what the writer had in mind,
             | but they are fit for purpose._
             | 
             | That's what GenAI is doing, too. After all, the audience
              | only sees the final product; they never get to know what
              | the writer had in mind.
        
               | staticman2 wrote:
                | I haven't used Sora, but no GenAI I'm aware of could
                | produce a competent comic book. When a human artist
               | draws a character in a house in panel 1, they'll draw the
               | same house in panel 2, not a procedurally generated
               | different house for each image.
               | 
               | If a 60 year old grizzled detective is introduced in page
               | 1, a human artist will draw the same grizzled detective
               | in page 2, 3 and so on, not procedurally generate a new
               | grizzled detective each time.
        
               | TeMPOraL wrote:
               | A human artist keeps state :). They keep it between
               | drawing sessions, and more importantly, they keep _very
               | detailed_ state - their imagination or interpretation of
               | what the thing (house, grizzled detective, etc.) is.
               | 
               | Most models people currently use don't keep state between
               | invocations, and whatever interpretation they make from
               | provided context (e.g. reference image, previous frame)
               | is surface level and doesn't translate well to output.
               | This is akin to giving each panel in a comic to a
               | different artist, and also telling them to sketch it out
               | by their gut, without any deep analysis of prior work.
               | It's a big limitation, alright, but researchers and
               | practitioners are actively working to overcome it.
               | 
               | (Same applies to LLMs, too.)
        
         | jerf wrote:
         | It just plain isn't possible if you mean a prompt the size of
         | what most people have been using lately, in the couple hundred
         | character range. By sheer information theory, the number of
         | possible interpretations of "a zoom in on a happy dog catching
          | a frisbee" means that you cannot match a particular clip out
         | of the set with just that much text. You will _need_ vastly
         | more content; information about the breed, information about
         | the frisbee, information about the background, information
         | about timing, information about framing, information about
          | lighting, and so on and so forth. Right now the AIs can't do
         | that, which is to say, even if you sit there and type a prompt
         | containing all that information, it is going to be forced to
          | ignore most of that information. Under the hood, with the
          | way the text is turned into vector embeddings, it's fairly
          | questionable whether you'd agree that it can even represent
          | such a thing.
         | 
         | This isn't a matter of human-level AI or superhuman-level AI;
         | it's just straight up impossible. If you want the information
         | to match, it has to be provided. If it isn't there, an AI can
         | fill in the gaps with "something" that will make the scene
         | work, but expecting it to fill in the gaps the way you "want"
         | even though you gave it no indication of what that is is
         | expecting literal magic.
         | 
         | Long term, you'll never have a coherent movie produced by
         | stringing together a series of textual snippets because, again,
         | that's just impossible. Some sort of long-form "write me a
          | horror movie starring a precocious 22-year-old elf in a far-
         | future Ganymede colony with a message about the importance of
         | friendship" AI that generates a coherent movie of many scenes
         | will have to be doing a lot of some sort of internal
         | communication in an internal language to hold the result
         | together between scenes, because what it takes to hold stuff
         | coherent between scenes is an amount of English text not
         | entirely dissimilar in size from the underlying representation
         | itself. You might as well skip the English middleman and go
         | straight to an embedding not constrained by a human language
         | mapping.
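          The bandwidth mismatch here can be made concrete with rough
          numbers (editor's sketch; the ~6.6 bits per printable-ASCII
          character and the 5 Mbit/s compressed-video bitrate are both
          assumptions, not from the comment):

```python
import math

prompt_chars = 200                     # a "couple hundred character" prompt
bits_per_char = math.log2(95)          # ~6.57 bits: 95 printable ASCII chars
prompt_bits = prompt_chars * bits_per_char

clip_seconds = 10
video_bits = 5_000_000 * clip_seconds  # 50 Mbit at an assumed 5 Mbit/s

# The prompt distinguishes at most 2**prompt_bits outcomes; the clip's
# space is astronomically larger, so the model must fill in the rest.
ratio = video_bits / prompt_bits
print(f"prompt ~{prompt_bits:.0f} bits vs clip {video_bits:,} bits "
      f"(~{ratio:,.0f}x)")
```

          Even generously counted, a short prompt pins down on the order
          of a thousand bits; everything else in the clip is, by
          necessity, filled in by the model.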
        
           | yeknoda wrote:
            | Something like a white paper with a mood board, color
            | scheme, and concept art as the input might work. This
            | could be sent into an LLM "expander" that increases the
            | word count and specificity, then multiple review passes to
            | tap things in the right direction.
        
             | 3form wrote:
             | And I think this realistically is going to be the shape of
             | the tools to come in the foreseeable future.
        
               | echelon wrote:
               | You should see what people are building with Open Source
               | video models like HunYuan [1] and ComfyUI + Control Nets.
               | It blows Sora out of the water.
               | 
               | Check out the Banodoco Discord community [2]. These are
               | the people pioneering steerable AI video, and it's all
               | being built on top of open source.
               | 
               | [1] https://github.com/Tencent/HunyuanVideo
               | 
               | [2] https://banodoco.ai/
        
             | mikepurvis wrote:
             | I expect this kind of thing is actually how it's going to
             | work longer term, where AI is a copilot to a human artist.
             | The human artist does storyboarding, sketching in backdrops
             | and character poses in keyframes, and then the AI steps in
             | and "paints" the details over top of it, perhaps based on
             | some pre-training about what the characters and settings
             | are so that there's consistency throughout a given work.
             | 
             | The real trick is that the AI needs to be able to
             | participate in iteration cycles, where the human can say
             | "okay this is all mostly good, but I've circled some areas
             | that don't look quite right and described what needs to be
             | different about them." As far as I've played with it,
             | current AIs aren't very good at revisiting their own work--
             | you're basically just tweaking the original inputs and
             | otherwise starting over from scratch each time.
        
               | programd wrote:
               | We will shortly have much better tweaking tools which
               | work not only on images and video but concepts like what
               | aspects a character should exhibit. See for example the
               | presentation from Shapeshift Labs.
               | 
               | https://www.shapeshift.ink/
        
           | echelon wrote:
           | For those not in this space, _Sora is essentially dead on
           | arrival._
           | 
           | Sora performs worse than closed source Kling and Hailuo, but
           | more importantly, it's already trumped by open source too.
           | 
           | Tencent is releasing a fully open source Hunyuan model [1]
           | that is better than all of the SOTA closed source models.
           | Lightricks has their open source LTX model and Genmo is
           | pushing Mochi as open source. Black Forest Labs is working on
           | video too.
           | 
           | Sora will fall into the same pit that Dall-E did. SaaS
           | doesn't work for artists, and open source always trumps
           | closed source models.
           | 
           | Artists want to fine tune their models, add them to ComfyUI
           | workflows, and use ControlNets to precision control the
           | outputs.
           | 
           | Images are now almost 100% Flux and Stable Diffusion, and
           | video will soon be 100% Hunyuan and LTX.
           | 
           | Sora doesn't have much market apart from name recognition at
           | this point. It's just another inflexible closed source model
           | like Runway or Pika. Open source has caught up with state of
           | the art and is pushing past it.
           | 
           | [1] https://github.com/Tencent/HunyuanVideo
        
           | minimaxir wrote:
           | > Under the hood, with the way the text is turned into vector
           | embeddings, it's fairly questionable whether you'd agree that
           | it can even represent such a thing.
           | 
           | The _text encoder_ may not be able to know complex
           | relationships, but the generative image/video models that
           | are conditioned on said text embeddings absolutely can.
           | 
           | Flux, for example, uses the very old T5 model for text
           | encoding, but image generations from it can (loosely) adhere
           | to all rules and nuances in a multi-paragraph prompt:
           | https://x.com/minimaxir/status/1820512770351411268
        
           | robotresearcher wrote:
           | > Long term, you'll never have a coherent movie produced by
           | stringing together a series of textual snippets because,
           | again, that's just impossible.
           | 
           | Why snippets? Submit a whole script the way a writer delivers
           | a movie to a director. The (automated) director/DP/editor
           | could maintain internal visual coherence, while the script
           | drives the story coherence.
        
             | jerf wrote:
             | That's what I describe at the end, albeit quickly in lingo,
             | where the internal coherence is maintained in internal
             | embeddings that are never related to English at all. A top-
             | level AI could orchestrate component AIs through embedded
             | vectors, but you'll never do it with a human trying to type
             | out descriptions.
        
             | troupo wrote:
             | You should watch how movies are made sometime. How a script
             | is developed. How changes to it are made. How storyboards
             | are created. How actors are screened for roles. How
             | locations are scouted, booked, and changed. How the
             | gazillion of different departments end up affecting how a
             | movie looks, is produced, made, and in which direction it
             | goes (the wardrobe alone, and its availability and
             | deadlines will have a huge impact on the movie).
             | 
             | What does "EXT. NIGHT" mean in a script? Is it cloudy?
             | Rainy? Well lit? What are camera locations? Is the scene
             | important for the context of the movie? What are characters
             | wearing? What are they looking at?
             | 
             | What do actors actually do? How do they actually behave?
             | 
             | Here are a few examples of script vs. screen.
             | 
             | Here's a well described script of Whiplash. Tell me the one
             | hundred million things happening on screen that are not in
             | the script: https://www.youtube.com/watch?v=kunUvYIJtHM
             | 
             | Or here's the Joker interrogation from The Dark Knight.
             | Same million different things, including actors (or the
             | director) _ignoring_ instructions in the script:
             | https://www.youtube.com/watch?v=rqQdEh0hUsc
             | 
             | Here's A Few Good Men: https://www.youtube.com/watch?v=6hv7
             | U7XhDdI&list=PLxtbRuSKCC...
             | 
             | and so on
             | 
             | ---
             | 
             |  _Edit_. Here's Annie Atkins on visual design in movies,
             | including Grand Budapest Hotel:
             | https://www.youtube.com/watch?v=SzGvEYSzHf4. And here's a
             | small article summarizing some of it:
             | https://www.itsnicethat.com/articles/annie-atkins-grand-
             | buda...
             | 
             | Good luck finding any of these details in any of the
             | scripts. See minute 14:16 where she goes through the script
             | 
             |  _Edit 2_: do watch The Kerning chapter at 22:35 to see
             | what it _actually_ takes to create something :)
        
             | coffeebeqn wrote:
             | This almost certainly won't work. Feel free to feed any of
             | the hundreds of existing film scripts and test how coherent
             | the models can be. My guess is not at all
        
               | sleepybrett wrote:
               | This will almost certainly be in theaters within 5 years,
               | probably first as a small experimental project (think
               | blair witch).
        
             | letmevoteplease wrote:
             | Shane Carruth (Primer) released interesting scripts for "A
             | Topiary" and "The Modern Ocean" which now have no hope of
             | being filmed. I hope AI can bring them to life someday. If
             | we get tools like ControlNet for video, maybe Carruth could
             | even "direct" them himself.
        
           | LASR wrote:
           | What you are saying is totally correct.
           | 
           | And this applies to language / code outputs as well.
           | 
           | The number of times I've had engineers at my company type out
           | 5 sentences and then expect a complete React web app.
           | 
           | But what I've found in practice is using LLMs to generate the
           | prompt with low-effort human input (eg: thumbs up/down,
           | multiple-choice etc) is quite useful. It generates walls of
           | text, but with metaprompting, that's kind of the point. With
           | this, I've definitely been able to get high ROI out of LLMs.
           | I suspect the same would work for vision output.
        
             | kurthr wrote:
             | I'm not sure, but I think you're saying what I'm thinking.
             | 
             | Stick the video you want to replicate into o1 and ask for
             | a descriptive prompt to generate a video with the same
             | style and content. Take that prompt and put it into Sora.
             | Iterate with human and o1 generated critical responses.
             | 
             | I suspect you can get close pretty quickly, but I don't
             | know the cost. I'm also suspicious that they might have put
             | in "safeguards" to prevent some high profile/embarrassing
             | rip-offs.
        
           | amelius wrote:
           | Can't you just give it a photo of a dog, and then say "use
           | this dog in this or that scene"?
        
             | alpha_squared wrote:
             | How would that even work? A dog has physical features
             | (legs, nose, eyes, ears, etc.) that they use to interact
             | with the world around them (ground, tree, grass, sounds,
             | etc.). And each one of those things has physical structures
             | that compose senses (nervous system, optic nerves, etc.).
             | There are layers upon layers of intricate complexity that
             | took eons to develop and a single photo cannot encapsulate
             | that level of complexity and density of information. Even a
             | 3D scan can't capture that level of information. There is
             | an implicit understanding of the physical world that helps
             | us make sense of images. For example, a dog with all four
             | paws standing on grass is within the bounds of possibility;
               | a dog with six paws, two of which are on its head, is
               | outside the bounds of possibility. An image generator
             | doesn't understand that obvious delineation and just
             | approximates likelihood.
        
               | int_19h wrote:
               | A single photo doesn't have to capture all that
               | complexity. It's carried by all those countless dog
               | photos and videos in the training set of the model.
        
             | artemisart wrote:
             | Yes, the idea works and was explored with
             | dreambooth/textual inversion for image diffusion models.
             | 
             | https://dreambooth.github.io/ https://textual-
             | inversion.github.io/
        
               | minimaxir wrote:
               | Both of those are of course out of date and require
               | significant training instead of just feeding it a single
               | image.
               | 
               | InstantID (https://replicate.com/zsxkib/instant-id) fixes
               | that issue.
        
         | torginus wrote:
         | Yeah, it almost feels like gambling - 'you're very close, just
         | spend 20 more credits and you might get it right this time!'
        
         | moralestapia wrote:
         | Still three or four orders of magnitude cheaper and easier than
         | producing said video through traditional methods.
        
         | beefnugs wrote:
         | AI isn't trying to sell to you: a precise artist with real
         | vision in your brain. It is selling to managers who want to
         | shit out something in an evening that approximates anything,
         | that writes ads that no one wants to see anyway, that produces
         | surface level examples of how you can pay employees less
         | because "their job is so easy"
        
           | spuz wrote:
           | Yes and the thing is, even for those tasks, it's incredibly
           | difficult to achieve even the low bar that a typical
           | advertising manager expects. Try it yourself for any real
           | world task and you will see.
        
             | cornel_io wrote:
             | Counterpoint: our CEO spent 25 minutes shitting out a bunch
             | of AI ads because he was frustrated with the pace of our
             | advertising creative team. They hated the ads that he
             | created, for the reasons you mention, but we tested them
             | anyways and the best performing ones beat all of our
             | "expert" team's best ads by a healthy margin (on _all_ the
             | metrics we care about, from CTR to IPM and downstream stuff
             | like retention and RoAS).
             | 
             | Maybe we're in a honeymoon period where your average user
             | hasn't gotten annoyed by all the slop out there and they
             | will soon, but at least for now, there is real value here.
             | Yes, out of 20 ads maybe only 2 outperform the manually
             | created ones, but if I can create those 20 with a couple
             | hundred bucks in GenAI credits and maybe an hour or two of
             | video editing that process wipes the floor with the
             | competition, which is several thousand dollars per ad, most
             | of which are terrible and end up thrown away, too. With the
             | way the platforms function now, ad creative is quickly
             | becoming a volume-driven "throw it at the wall and see what
             | sticks" game, and AI is great for that.
        
               | sarchertech wrote:
               | > Maybe we're in a honeymoon period where your average
               | user hasn't gotten annoyed by all the slop out there and
               | they will soon
               | 
               | It's this. A video ad with a person morphing into a bird
               | that takes off like a rocket with fire coming out of its
               | ass, sure it might perform well because we aren't
               | saturated with that yet.
               | 
               | You'd probably get a similar result by giving a camera to
               | a 5 year old.
               | 
               | But you also have to ask what that's doing long term to
               | your brand.
        
               | mewpmewp2 wrote:
               | A/B/C/D testing is the perfect grounds for that. You can
               | keep automatically generating and iterating quickly while
               | A/B tests are constantly being run. This data on CTR can
               | later be used to train the model better as well.
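A minimal sketch of the significance check such a generate-and-test loop needs before promoting a variant: a two-proportion z-test on click-through rates, with hypothetical click/impression counts and only the Python standard library.

```python
import math

def two_proportion_ztest(clicks_a, views_a, clicks_b, views_b):
    """Two-sided z-test for a difference in CTR between two ad variants."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    # Pooled CTR under the null hypothesis that both variants perform equally.
    p = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(p * (1 - p) * (1 / views_a + 1 / views_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF, computed via math.erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical numbers: variant A got 120 clicks on 10,000 impressions,
# variant B got 90 clicks on 10,000 impressions.
z, p_value = two_proportion_ztest(120, 10_000, 90, 10_000)
print(f"z = {z:.2f}, p = {p_value:.3f}")  # z is about 2.08, p about 0.037
```

With many variants being tested at once, a multiple-comparisons correction (or a bandit-style allocator) is also needed, or the "winning" ad is often just noise.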
        
           | soheil wrote:
           | You seem to speak from experience of being that manager...
           | I'm not going to ask what you shit out in your evenings.
        
         | jstummbillig wrote:
         | > A way to test this is to take a piece of footage or an image
         | which is the ground truth, and test how much prompting and
         | editing it takes to get the same or similar ground truth
         | starting from scratch.
         | 
         | Sure, if you then do the same in reverse.
        
         | javier123454321 wrote:
         | This is the conundrum of AI generated art. It will lower the
         | barrier to entry for new artists to produce audiovisual
         | content, but it will not lower the amount of effort required to
         | make good art. If anything it will increase the effort, as it
         | has to be excellent in order to get past the slop of base level
         | drudge that is bound to fill up every single distribution
         | channel.
        
         | diob wrote:
         | I believe it. I was just using AI to help out with some
         | mandatory end of year writing exercises at work.
         | 
         | Eventually, it starts to muck with the earlier work that it did
         | well on, when I'm just asking it to add onto it.
         | 
         | I was still happy with what I got in the end, but it took trial
         | and error and then a lot of piecemeal coaxing with verification
         | that it didn't do more than I asked along the way.
         | 
         | I can imagine the same for video or images. You have to examine
         | each step post prompt to verify it didn't go back and muck with
         | the already good parts.
        
         | bilsbie wrote:
         | Sounds like another way of saying a picture is worth a thousand
         | words.
        
         | estebarb wrote:
         | For those scenarios a draft generation mode would be helpful:
         | 16 colors, 320x200...
        
         | hmottestad wrote:
         | When I first started learning Photoshop as a teenager I often
         | knew what I wanted my final image to look like, but no matter
         | how hard I tried I could never get there. It wasn't that it
         | was impossible, it was just that my skills weren't there
         | yet. I needed a lot more practice before I got good enough to
         | create what I could see in my imagination.
         | 
         | Sora is obviously not Photoshop, but given that you can write
         | basically anything you can think of I reckon it's going to take
         | a long time to get good at expressing your vision in words that
         | a model like Sora will understand.
        
         | titzer wrote:
         | As only a cursory user of said tools (but strong opinions) I
         | felt the immediate desire to get an editable (2D) scene that I
         | could rearrange. For example I often have a specific vantage
         | point or composition in mind, which is fine to start from, but
         | to tweak it and the elements, I'd like to edit it afterwards.
         | To foray into 3D, I'd be wanting to rearrange the characters
         | and direct them, as well as change the vantage point. Can it do
         | that yet?
        
         | corytheboyd wrote:
         | Free text is just the fundamentally wrong input for precision
         | work like this. That it is wrong for this doesn't mean it
         | has NO purpose, it's still useful and impressive for what it
         | is.
         | 
         | FWIW I too have been quite frustrated iterating with AI to
         | produce a vision that is clear in my head. Past changing the
         | broad strokes, once you start "asking" for specifics, it all
         | goes to shit.
         | 
         | Still, it's good enough at those broad strokes. If you want
         | your vision to become reality, you either need to learn how to
         | paint (or whatever the medium), or hire a professional, both
         | being tough-but-fair IMO.
        
           | londons_explore wrote:
           | I don't think it'll be long before GUI tools catch up for
           | editing video.
           | 
           | Things like rearranging objects in the scene with drag'n'drop
           | sound implementable (although incredibly GPU heavy)
        
         | goldfeld wrote:
         | If you use it in a utilitarian way it'll give you a run for
         | your money; if you use it for expression, such as art, learning
         | to embrace some serendipity, it makes good stuff.
        
       | owenpalmer wrote:
       | MKBHD's review of the new Sora release:
       | 
       | https://www.youtube.com/watch?v=OY2x0TyKzIQ
        
         | laweijfmvo wrote:
         | Love the callout of them definitely training on his own videos
        
         | awongh wrote:
         | Interesting to see how bad the physics/object permanence is. I
         | wonder if combining this with a Genie 2 type model (Google's
         | new "world model") would be the next step in refining its
         | capabilities.
        
           | kranke155 wrote:
           | Until these models can figure out physics, it seems to me
           | they will be an interesting toy
        
             | andybak wrote:
             | They can figure out a fair bit of physics. It's not a "no
             | physics" vs "physics" thing. Rather it's a "flawed and
             | unreliable physics" thing.
             | 
             | It's similar to the LLM hallucination problem. LLMs produce
             | nonsense and untruths - but they are still useful in many
             | domains.
        
               | Barrin92 wrote:
               | It's a pretty binary thing in the sense that "bad
               | physics" pretty quickly decoheres into no physics.
               | 
               | I saw one of these models doing a Minecraft like
               | simulation and it looked sort of okay but then water
               | started to end up in impossible places and once it was
               | there it kept spreading and you ended up in some
               | lovecraftian horror dimension. Any useful physics
               | simulation at least needs boundary conditions to hold, and
               | these models have no boundary conditions because they
               | have no clear categories of anything.
        
             | kylehotchkiss wrote:
             | But they don't, they just understand pixel relationships
             | (right?)
        
           | torginus wrote:
           | This feels like computer graphics and the 'screen space'
           | techniques that got introduced in the Xbox 360 generation -
           | reflection, shadows etc. all suffered from the inability to
           | work with off screen information and gave wildly bad answers
           | once off screen info was required.
           | 
           | The solution was simple - just maintain the information in
           | world space, and sample for that. But simple does not mean
           | cheap, and it led to a ton of redundant (as in invisible in
           | the final image) information having to be kept track of.
        
       | pearjuice wrote:
       | Though I like the novelty of AI generated content, it kind of
       | sucks that dead internet theory is becoming more and more
       | prevalent. YouTube (and all of the web) is already being spammed
       | with AI
       | generated slop and "better" video/text/audio models only make
       | this worse. At some point we will cross the threshold of "real"
       | and "generated" content being posted on the web and there's no
       | stopping that.
        
         | xena wrote:
         | My hope was that AI would make it easier for people to create
         | new things that haven't been done before, but my fear was that
         | it would just be an endless slop machine. We're living in the
         | endless slop machine timeline and even genuine attempts to make
         | something artistic end up just coming off as more slop.
         | 
         | I love this timeline.
        
           | visarga wrote:
           | Even if it's made with AI, it is slop only if you don't add
           | anything original in your prompt, and don't spend much time
           | selecting.
           | 
           | The real competition of any new work is the backlog of
           | decades of content that is instantly accessible. Of course it
           | makes all content less valuable, you can always find
           | something else. Hence the race for attention and the slop
           | machine. It was actually invented by the ad driven revenue
           | model.
           | 
           | We should not project on AI something invented elsewhere.
           | Even if gen AI could make original interesting works, the
           | social network feeds would prioritize slop back again. So the
           | problem is the way we let them control our feeds.
        
             | minimaxir wrote:
             | > if you don't add anything original in your prompt
             | 
             | Define "original". You could generate a pregnant Spongebob
             | Squarepants and that would be original, but it would still
             | be noise that doesn't inherently expand the creative space.
             | 
             | > don't spend much time selecting
             | 
             | That's the unexpected issue with the proliferation of
             | generative AI now being accessible to nontechnical people.
             | Most are lazy and go with the first generation that matches
             | the vibe, which is the main reason why we have slop.
        
           | 999900000999 wrote:
           | Imagine a movie like Napoleon, but instead of needing 100
           | million and thousands of extras, you just need 5 actors and
           | maybe a budget of 50k.
           | 
           | You could get something much more creative or historically
           | accurate than whatever Hollywood deems marketable.
           | 
           | I think about AI like any other tool. For example I make
           | music using various software.
           | 
             | Are drum machines cheating? Is electronic music computer
             | slop compared to playing each instrument?
             | 
             | Is using a Mac and a 1k mic over a 30k studio cheating?
        
             | xena wrote:
             | The main comparator is Kasane Teto and Suno. Kasane Teto is
             | functionally a piano that uses generative AI for vocal
             | synthesis: https://youtu.be/s3VPKCC9LSs. This is an aid to
             | the creative process. Suno lets you put in a description
             | and completely bypass the creative process by instantly
             | getting to the end: https://youtu.be/UpBVDSJorlU
             | 
             | Kokoro is art. Driveway is content. Art uses the medium and
             | implementation to say something and convey messages.
             | Content is what goes between the ads so the shareholders
             | see a number increase.
             | 
             | I wish there were more things like Kokoro and less things
             | like Driveway.
        
               | 999900000999 wrote:
               | What if you're making a short movie and Driveway is
               | playing in the background during a scene?
               | 
               | It's like everything else. It's just a tool.
               | 
               | You can create an entire movie using a high end phone
               | with quality that would have cost millions 40 years ago.
               | Do real movies need film?
        
           | FergusArgyll wrote:
           | It might be true for "Creators" etc., but there were things
           | that I always wanted paintings of, and I have no talent,
           | time, tools, or anything really.
           | 
           | When I first got access to DALL-E (in '22), the first thing
           | I tried was to get an impressionist-style painting of the way
           | I always imagined Bob Dylan's 'Mr. Tambourine Man'. I
           | regenerated it multiple times and got something I was very
           | happy with! I didn't put it on social media, didn't try to
           | make money off it; it's _for me_.
           | 
           | If you enjoy "art" (nice pictures, paintings, videos now I
           | guess) You can create it yourself! I think people are missing
           | that aspect of it, use it to make yourself happy, make
           | pictures you want to look at!
        
         | huijzer wrote:
         | Put more weight on your subscriptions. I don't have much AI
         | content in my YouTube suggestions. (Good luck AI generating an
         | interview with Chris Lattner or Stephen Kotkin for example. It
         | won't work.)
        
           | yaj54 wrote:
           | It will work within thousands of days.
        
         | skepticATX wrote:
         | I felt this same way as image generation was rapidly improving,
         | but I've been caught by surprise and impressed with how
         | resilient we have been in the face of it.
         | 
         | Turns out it's surprisingly easy, at least for me, to tune out
         | the slop. Some platforms will fall victim to it (Google image
         | search, for one), but new platforms will spring up to take
         | their place.
        
         | tguedes wrote:
         | My hope is that it will be the death of the aggregators and
         | there will be more value in high quality and authentic content.
         | The past 10-15 years has rewarded people who appeal to the
         | aggregation algorithms and get the most views. Hopefully going
         | forward theres going to be more organic, word of mouth
         | recommendations of high quality content.
        
         | ghita_ wrote:
         | Yeah, I already have so many AI-generated videos in my feed on
         | all social media it's insane. I spot them from afar for now,
         | but at some point I'll just be consuming content that took
         | seconds to generate just to get money.
        
       | minimaxir wrote:
       | No API/per video generation? Huh.
        
         | owenpalmer wrote:
         | It's probably because they're relying heavily on their new
         | editing UI to make the model useful. You can cut out weird
         | parts of the videos and append a newly generated portion with a
         | new prompt.
        
         | Tiberium wrote:
         | OpenAI almost always waits a few months before adding new
         | features or models to the API, the same happened to DALL-E 3,
         | advanced voice mode, and lots of smaller model updates and
         | releases.
        
       | rvz wrote:
       | That's more than 20 VC-backed AI video generation startups
       | destroyed in a microsecond, now scrambling to compete against
       | Sora in the race to zero.
       | 
       | Many of them will die, but may the AI slop continue anyway.
        
         | hagbarth wrote:
         | Not really a microsecond. Sora was announced months ago.
        
         | blackeyeblitzar wrote:
         | It's a race to zero margin. The people who win will have lots
         | of existing distribution channels (customers) or lots of money
         | or control over data. Those who innovate but don't have these
         | things will be copied and run out of money eventually, as sad
         | as it is. The competition between those startups and bigger
         | players isn't fair.
        
       | marban wrote:
       | Not available in the EU:
       | https://help.openai.com/en/articles/10250692-sora-supported-...
        
         | blfr wrote:
         | What did the EU do this time?
        
           | mnau wrote:
           | Probably this
           | https://en.wikipedia.org/wiki/Artificial_Intelligence_Act,
           | possibly this:
           | https://en.wikipedia.org/wiki/Digital_Markets_Act
        
         | therealmarv wrote:
         | Does a VPN solve the problem? I'm living in an EU country and I
         | don't like that the EU decides for me (and companies like
         | OpenAI or Meta don't give out their models to me)! I'm an old
         | enough adult to decide for myself what I want...
        
           | ionwake wrote:
           | I used a Japanese ProtonVPN server; I got past the "Not in
           | the EU" thing, but it said "no new signups are allowed atm".
           | 
           | Perhaps just best to wait
        
         | ionwake wrote:
         | I'm sure if OpenAI just waited the extra day or two to make
         | sure it's available in the EU, it wouldn't annoy everyone in
         | the EU so much. Often with new releases everyone in the EU
         | needs to wait a couple of days; the FOMO is not cool, bros
        
       | cube2222 wrote:
       | There's an ongoing related livestream[0].
       | 
       | [0]: https://youtu.be/2jKVx2vyZOY
        
       | mhh__ wrote:
       | Unless they drop something mega in the next few months, I can't
       | help but think that OpenAI's moat is basically gone, for now at
       | least.
        
       | gzer0 wrote:
       | For the $20/month subscription: you get 50 generations a month.
       | So it is included in your subscription already! Nice.
       | 
       | For the Pro $200/month subscription: you get unlimited
       | generations a month (on a slower queue).
        
         | system2 wrote:
         | I wonder what I will be doing with 20 garbage videos. And this
         | probably includes revisions too. It takes 10 attempts to get
         | something remotely useful as an image (and that's just for a
         | blog post).
        
       | iamleppert wrote:
       | Yawn, there are literally 10 different apps and wannabe startups
       | that do video generation and AI videos have already flooded
       | social media. This doesn't look any better than what has already
       | been available to the masses. OpenAI announced this ages ago and
       | never gave people access; now competitors have already captured
       | the market for AI-generated social media slop.
       | 
       | We have yet to see any kind of AI created movie, like Toy Story
       | was for computer 3D animation.
       | 
       | OpenAI isn't a player in the video AI game, but certainly has
       | bagged most of the money for it already (somehow).
        
         | keiferski wrote:
         | Don't just critique - link. What other video generation tools
         | have you used and recommend?
        
           | Workaccount2 wrote:
           | The subreddit /r/aivideo has tons of videos all tagged with
           | what model was used to generate them.
        
         | LanceJones wrote:
         | So you're saying there is literally nothing good about Sora?
        
         | vunderba wrote:
         | From the few videos that I've seen, I would agree that it
         | doesn't seem to be better than any of the major competitors
         | such as Kling, Hailuo, Runway, etc.
        
       | petercooper wrote:
       | I wonder what it is about EU and UK law, in particular, that
       | restricts its availability there. Their FAQs don't mention this.
       | 
       | If it's about training models on potentially personal
       | information, the GDPR (EU and UK variants) kicks in, but then
       | that hasn't restricted OpenAI's ability to deploy (Chat)GPT
       | there. The same applies to broader copyright regulations around
       | platforms needing to proactively prevent copyright violation,
       | something GPT could also theoretically accomplish. Any (planned)
       | EU-specific regulations don't apply to the UK, so I doubt it's
       | those either.
       | 
       | The only thing that leaves, perhaps, is laws around the
       | generation of deepfakes which both the UK and EU have laws about?
       | But then why didn't that affect DALL-E? Anyone with a more
       | detailed understanding of this space have any ideas?
        
         | ilaksh wrote:
         | Part of it might also be capacity problems.
        
         | Stevvo wrote:
         | It's a capacity related constraint, not a legal one.
        
           | simicd wrote:
           | Hmm, enough capacity for the rest of the world but not the EU:
           | 
           | https://help.openai.com/en/articles/10250692-sora-
           | supported-...
        
         | MrKristopher wrote:
         | A lot has changed since ChatGPT was released.
         | https://en.wikipedia.org/wiki/Digital_Markets_Act wasn't in
         | effect back then. Microsoft hadn't made their big investment
         | yet either. OpenAI is a growing target, and the laws are
         | becoming more strict, so they need to be more cautious from a
         | legal perspective, and they need to consider that compliance
         | with EU laws will slow down their product development.
        
       | ChrisArchitect wrote:
       | Link should be the announcement post: https://openai.com/index/sora-
       | is-here/
        
         | aspenmayer wrote:
         | https://news.ycombinator.com/item?id=42368981
        
         | dang wrote:
         | Ok, we changed the URL to that from https://sora.com/ above.
        
           | bnrdr wrote:
           | The bear video is quite funny - two bear cubs merge into one
           | and another cub appears out of the adult bear's leg.
        
       | chefandy wrote:
       | If you're looking for video for casual personal projects or fill-
       | ins for vlog posts, or something to make your PowerPoint look
       | neat, this seems like a rad tool. It has a looong way to go
       | before it's taking anyone's movie VFX job.
        
       | MyFirstSass wrote:
       | Wow, this is bad. And by bad I mean worse than leading open
       | source and existing alternatives.
       | 
       | Is it just me, or does it seem like OpenAI revolutionized with
       | both ChatGPT and Sora, but they've completely hit the ceiling?
       | 
       | Honestly a bit surprised it happened so fast!
        
         | kranke155 wrote:
         | Sora was not really that big of a revolution, it was just
         | catching up with competitors. I would even say in gen video
         | they are behind right now.
        
           | pawelduda wrote:
           | What is the best model in your opinion right now?
        
             | ElectroNomad wrote:
             | RunwayML
        
             | echelon wrote:
             | HunYuan by Tencent. It's 100% open source too.
        
           | SV_BubbleTime wrote:
           | Sora had some sweet cherry picked initial hype videos. That
           | was more impressive than anything we could do at the time.
            | Now, yeah, it's questionable if it's on par, let alone better.
        
             | kranke155 wrote:
             | Wasn't just cherry picked. The balloon kid video had a VFX
             | team cleaning up the output. They've said that now.
        
         | joe_the_user wrote:
         | Bad also in the sense once you get over the "boy, it's amazing
         | they can do that", you immediately think "boy, they really
         | shouldn't do that".
        
         | Banditoz wrote:
         | What are some of the open source video models?
        
         | tshaddox wrote:
         | What are the leading alternatives? (Open source or otherwise)
        
           | elorant wrote:
           | MidJourney (commercial), Standard Diffusion XL
        
             | aruametello wrote:
             | > Standard Diffusion XL
             | 
             | you probably meant Stable Diffusion XL. (autocorrect
             | victim)
        
           | amrrs wrote:
           | Minimax and Kling 1.5 (both from China). Recently Tencent
           | launched its own.
           | 
           | You can see more model samples here:
           | https://youtu.be/bCAV_9O1ioc
        
             | ztratar wrote:
             | Those look... far worse? What am I missing.
        
               | amrrs wrote:
                | Exactly, I don't know how people are saying Sora is
                | bad. I know there are restrictions with humans. But
                | with the storyboard and other customisations, it's
                | definitely up there!
        
           | stuckkeys wrote:
           | FLUX
        
           | vunderba wrote:
           | You have to _be specific_. What's more important to you?
           | 
           | - uncensored output (SD + LoRa)
           | 
           | - Overall speed of generation (midjourney)
           | 
           | - Image quality (probably midjourney, or an SDXL checkpoint +
           | upscaler)
           | 
           | - Prompt adherence (flux, DALL-E 3)
           | 
           | EDIT: This is strictly around image generation. The main
           | video competitors are Kling, Hailuo, and Runway.
        
             | sebazzz wrote:
             | SD does not generate video, does it?
        
               | xvector wrote:
               | https://stable-diffusion-art.com/animatediff/
        
               | CryptoBanker wrote:
               | It does as of recently.
        
         | tom1337 wrote:
         | Same goes for DALL-E. It was cool to try it the first week or
         | so, but now the output is so much worse than Midjourney and
         | Stable Diffusion. For me it can't even generate straight
         | lines, and everything looks comic-ish.
        
           | vunderba wrote:
           | DALL-E 3 image quality has always been subpar, but its prompt
           | adherence is on par with FLUX. Midjourney has some of the
           | worst prompt adherence, but some of the best image quality.
        
           | amzn-throw wrote:
           | To me this is just a simple artifact of size & attention.
           | 
           | Another example of this is stuff like Bluesky. There's a lot
           | of reasons to hate Twitter/X, but people going "Wow, Bluesky
           | is so amazing, there's no ads and it's so much less toxic!"
           | aren't complimenting Bluesky, they're just noting that it's
           | smaller, has less attention, and so they don't have ads or
           | the toxic masses YET.
           | 
           | GenAI image generation is an obvious vector for all sorts of
           | problems, from copyrighted material, to real life people, to
           | porn, and so on. OpenAI and Google have to be extraordinarily
           | strict about this due to all the attention on them, and so
           | end up locking down artistic expression dramatically.
           | 
           | Midjourney and Stable Diffusion may have equal stature
           | amongst tech people, but in the public sphere they're
           | unknowns. So they can get away with more risk.
        
         | lacoolj wrote:
         | If you're going to say something like this, you need to back it
         | up with specific alternatives that provide a better result.
         | 
         | Besides just citing your sources, I'm genuinely curious what
         | the best ones are for this so I can see the competition :)
        
           | echelon wrote:
           | HunYuan released by Tencent [1] is much better than Sora.
           | It's 100% open source, is compatible with fine tuning,
           | ComfyUI, control nets, and is receiving lots of active
           | development.
           | 
           | That's not the only open video model, either. Lightricks'
           | LTX, Genmo's Mochi, and Black Forest Labs' upcoming models
           | will all be open source video foundation models.
           | 
           | Sora is commoditized like DALL-E at this point.
           | 
           | Video will be dominated by players like Flux and Stable
           | Diffusion.
           | 
           | [1] https://github.com/Tencent/HunyuanVideo/
        
             | vlovich123 wrote:
             | Something being available OSS is very different from a
              | turnkey product solution, not to mention that Tencent's
              | 60 GiB VRAM requirement calls for a setup of at least 3-4
              | GPUs, which is quite rare & fairly expensive vs something
             | time-sharing like Sora where you pay a relatively small
             | amount per video.
             | 
             | I think the important thing is task quality and I haven't
             | seen any evaluations of that yet.
        
         | wslh wrote:
         | Could it be that text sources are plentiful, and denser to
         | train on, than videos and images?
        
         | torginus wrote:
         | My working theory is that OpenAI is the 'moonshot' kind of
         | company full of super smart researchers who like tackling hard
         | problems, but have no time and effort for things like 'how do
         | we create a UX people actually want to use', which actually
         | requires a ton of painful back-and-forth and thoughtful design
         | work.
         | 
         | This is not a problem as long as they do the ChatGPT thing, and
         | sell an API and let others figure out how to build a UX around
         | it, but here they seem to be gunning for creating a boxed
         | product.
        
       | rushingcreek wrote:
       | As there was no mention of an API for either Sora or o1 Pro, I
       | think this launch further marks OpenAI's transition from an
       | infrastructure company to a product company.
        
         | metzpapa wrote:
         | It seems like they're going in that direction - especially the
         | way they set up the Sora interface. It feels like it's nearing
         | a video editing product.
        
       | colesantiago wrote:
       | Hollywood's days are numbered.
       | 
       | If you are a creative in this industry, start preparing to
       | transition to another industry or adapt.
       | 
       | Your boss is highly likely to be toying around with this.
       | 
       | The first entirely AI generated film (with Sora or other AI video
       | tools) to win an Oscar will be less than 5 years away.
        
         | smithcoin wrote:
         | > entirely
         | 
         | What would you like to wager on this?
        
         | do_not_redeem wrote:
         | I'd take that bet at 10:1 odds.
        
           | onlyrealcuzzo wrote:
           | I'd be careful.
           | 
           | OpenAI could be a big enough bubble in less than 5 years to
           | buy the Oscar winner, even if the film is terrible.
           | 
           | Also, OP only said "an Oscar".
           | 
           | The Oscar committee could easily get themselves hyped enough
           | on the AI bubble, to create an AI Oscar Film award.
           | 
           | No one said anything about making a "good" movie.
        
             | mdp2021 wrote:
             | > _OP only said "an Oscar"_
             | 
             | ...For soundtrack. (Sorry.)
             | 
              | But seriously: like the democratization which made music
              | production cheap brought some interesting or commercially
              | successful endeavours, the increased effort from people
              | who could not bring their dreams to reality because of
              | the basic constraint of budget will probably bring some
              | very good results, even anthology-worthy - and lots of
              | trash.
        
         | whynotminot wrote:
         | Nothing I'm seeing here looks like it's going to destroy
         | Hollywood.
         | 
         | I could see this tool _maybe_ being used for generating
         | establishing shots (generate a sweeping drone shot of a
         | lighthouse looking out over a stormy sea), but then the actual
         | talent work in a scene will be way more sensitive. The little
         | details matter so much, and this feels so far from getting all
         | of that right.
         | 
         | Sure, this is the worst it will ever be, things will improve,
         | etc, but if we've learned anything with AI, it's that the last
         | mile is often the hardest.
        
           | ALittleLight wrote:
           | I'm not sure the little details are enough of a moat.
           | Consider TikTok - people use cheap "special effects" to get
           | the message across, e.g. if a man is playing a woman he might
           | drape a towel over his head - it's silly and low quality but
           | it gets the idea across to the viewer. Think too about
           | programs like Archer or South Park that have (stylistically)
           | low quality animation but still huge fan bases.
           | 
           | What I think this will unlock, maybe with a bit of
           | improvement, is low quality video generation for a vast
           | number of people. Do you have a short film idea? Know people
           | with some? Likely millions of people will be able to use this
           | to put together _good enough_ short films - that yes, have
           | terrible details, but are still good enough to watch. Some of
           | those millions of newly enabled videos will have such strong
           | ideas or writing behind them that it will make up for, or
           | capitalize on, the weak video generation.
           | 
           | As the tools become easier, cheaper, faster, better etc more
           | and more hobbyists will pick them up and try to use them. The
           | user base will encourage the product to grow, and it will
           | gradually consume film (assuming it can reach the point of
           | being as or nearly as good as modern special effects).
           | 
           | I think of it like - when Steven Spielberg was young he used
           | an 8mm camera, not as good as professional film equipment in
           | the day, but good enough to create with. If I were a high
           | school student interested in film I would absolutely be using
           | stuff like this to create.
        
             | whynotminot wrote:
             | > What I think this will unlock, maybe with a bit of
             | improvement, is low quality video generation for a vast
             | number of people. Do you have a short film idea? Know
             | people with some? Likely millions of people will be able to
             | use this to put together good enough short films - that
             | yes, have terrible details, but are still good enough to
             | watch.
             | 
             | Sure, this is already happening on Reels, Tik Tok, etc.
             | People are ok with low quality content on those platforms.
             | Lazy AI will undoubtedly be more utilized here. But I don't
             | think it's threatening Hollywood (well, aside from slowly
             | destroying people's attention spans for long form content,
             | but that's a different debate). People will still want high
             | quality entertainment, even if they can also be satisfied
             | with low fidelity stuff too.
             | 
             | I think this has always been true -- think the difference
             | between made for TV CGI and big-budget Hollywood movie CGI.
             | Expectations are different in different mediums.
             | 
             | This current product is not good enough for Hollywood. As
             | long as people have some desire for Hollywood level
             | quality, this will not take those jobs.
             | 
             | The big caveat here is "yet" -- when does this get good
             | enough? And this is where my skepticism comes in, because
             | the last mile is the hardest, and getting things _mostly
             | right_ isn't really good enough for high quality content.
             | (Remember how much the internet lost it over a Starbucks
             | cup in Game of Thrones?)
             | 
             | The other caveat is maybe that our minds melt into
             | stupidity to the point that we only watch things in low
              | fidelity 10-second clips that AI can capably run amok
             | with. In which case I don't really think AI actually takes
             | over Hollywood so much as Hollywood -- effectively high
             | fidelity long form content -- just ceases to exist
             | altogether. That is the sad timeline.
        
         | rideontime wrote:
         | The day that 90 minutes of 3-second dolly shots wins an Oscar
         | is the day cinema dies.
        
         | rsynnott wrote:
         | ... Have you _seen_ the output from these things? I'm not sure
         | actors need to panic just yet.
        
         | qilthewise wrote:
         | I mean, that's a bold claim. I'd first let ChatGPT win an
         | Oscar for writing the best screenplay, and only then would
         | Sora come into the picture.
        
         | n144q wrote:
         | If you are ok with physics that is completely wrong, camera
         | angles that just don't feel right, strange light effects, and
         | all other kinds of distorted images/videos, maybe Hollywood is
         | doomed. But I don't see that happening.
         | 
         | A reminder: as advanced as CGI is today, lots and lots of movie
         | are still based on (very expensive) real-life scenery or
         | miniature sets (just two of many examples), because they are
         | far, far more realistic than what you get out of computers.
        
       | neom wrote:
       | In July I made this 3 minute little content marketing video for
       | Canada Day. Took me about 40 minutes using a combo of midjourney
       | + pika, suno for the music. Honestly I had a lot of fun making
       | it. I can see these tools will be fun for creative teams to
       | hammer out little things for social media and stuff:
       | https://x.com/ascent_hi/status/1807871799302279372
       | 
       | I don't see sora being THAT much better than pika now that I'm
       | trying both, except that it's included in my openai subscription,
       | but I do think people who do discrete parts of the "modal stack"
       | are going to be able to compete on their merits (be it pika for
       | vid or suno for music etc)
        
       | pen2l wrote:
       | Every day that passes I grow fonder of Google's decision to delay
       | or otherwise keep a lot of this under wraps.
       | 
       | The other day I was scrolling down on YouTube shorts and a couple
       | videos invoked an uncanny valley response from me (I think it was
       | a clip of an unrealistically large snake covering some hut) which
       | was somehow fascinating and strange and captivating, and then
       | scrolling down a few more, again I saw something kind of
       | "unbelievable"... I saw a comment or two saying it's fake, and
       | upon closer inspection: yeah, there were enough AI'esque
       | artifacts that one could confidently conclude it's fake.
       | 
       | We'd known about AI slop permeating Facebook -- usually a Jesus
       | figure made out of an unlikely set of things (like shrimp!) and we'd
       | known that it grips eyeballs. And I don't even know in which box
       | to categorize this, in my mind it conjures the image of those
       | people on slot machines, mechanically and soullessly pulling
       | levers because they are addicted. It's just so strange.
       | 
       | I can imagine now some of the conversations that might have
       | happened at Google when they choose to keep a lot of innovations
       | related to genAI under wraps (I'm being charitable here about
       | their motives), and I can't help but agree.
       | 
       | And I can't help but be saddened about OpenAI's decisions to
       | unload a lot of this before recognizing the results of unleashing
       | this to humanity, because I'm almost certain it'll be used more
       | for bad things than good things, I'm certain its application on
       | bad things will secure more eyeballs than on good things.
        
         | quenix wrote:
         | It saddens me. Innovations in AI 'art' generation (music,
         | audio, photo) have been a net negative to society and are
         | already actively harming the Internet and our media sphere.
         | 
         | Like I said in another comment, LLMs are cool and useful, but
         | who in the hell asked for AI art? It's good enough to fool
         | people and break the fragile trust relationship we had with
         | online content, but is also extremely shit and carries no
         | meaning or depth whatsoever.
        
           | mojuba wrote:
           | I think AI "art" can be as useful as the text generators,
           | i.e. only within certain limits of dull and stupid stuff that
           | needs to exist but has little to no value.
           | 
           | For example, you need to generate a landing page for your
           | boring company: text, images, videos and the overall design
           | (as well as code!) can be and should be generated because...
           | who cares about your boring company's landing page, right?
        
             | carlosjobim wrote:
             | Then you don't understand the purpose of a landing page. If
             | the boring company hires somebody to make the landing page
             | who actually understands their job, the landing page will
             | have great importance.
        
             | whatevertrevor wrote:
             | One could ask why the boring company landing page exists in
             | the first place though. If it's not providing value to
             | humans to warrant actual attention being paid to it...
        
           | computerex wrote:
           | How do you know they are a net negative? What's your source?
        
             | quenix wrote:
             | My opinion ;-)
             | 
             | That's what HN is for
        
           | randomlurking wrote:
           | I agree with the first part. For me, AI art is the chance to
           | have a somewhat creative outlet that I wouldn't have
           | otherwise, because I'm much worse at painting that I can
           | stand. Drawing by prompts helps me be creative and work
           | through some stuff - for that it's also nice and interesting
           | to see that the result differs from my mental image. I will
           | tweak the prompt to some extent and to some extent go with
           | some unintentioned elements of the drawing. I keep the
           | drawing on my phone in the notes app with a title and the
           | prompt.
           | 
           | To get back to the beginning: I really do agree that the
           | societal impact on the whole appears to be negative. But
           | there are some positives and I wanted to share my example of
           | that.
        
         | lelandfe wrote:
         | I saw my first AI video that completely fooled commenters:
         | https://imgur.com/a/cbjVKMU
         | 
         | This was not marked as AI-generated and commenters were in awe
         | at this fuzzy train, missing the "AIGC" signs.
         | 
         | I'm quite nervous for the future.
        
           | dagmx wrote:
           | Most people have terrible eyes for distinguishing content.
           | 
           | I've worked in CG for many years and despite the online nerd
           | fests that decry CG imagery in films, 99% of those people
           | can't tell what's CG or not unless it's incredibly obvious.
           | 
           | It's the same for GenAI, though I think there are more tells.
           | Still, most people cannot tell reality from fiction. If you
           | just tell them it's real, they'll most likely believe it.
        
           | starshadowx2 wrote:
           | The face of the girl on the left at the start in the first
           | second should have been a giveaway.
        
             | Perseids wrote:
             | My intuition went for _video compression artifact_ instead
             | of _AI modeling problem_. There is even a moment directly
             | before the cut that can be interpreted as the next key
             | frame clearing up the face. To be honest, the whole video
             | could have fooled me. There is definitely an aspect in
             | discerning these videos that can be trained just by
             | watching more of them with a critical eye, so try to be
             | kind to those that did not concern themselves with
             | generative AI as much as you have.
        
             | vlovich123 wrote:
             | Hard to not discount that as a compression artifact.
        
             | booleandilemma wrote:
             | No one is looking at her face though, they're looking at
             | the giant hello kitty train. And you were only looking at
             | her face because you were told it's an AI-generated video.
             | I agree with superfrank that extreme skepticism of
             | everything seen online is going to have to be the default,
             | unfortunately.
        
             | tim333 wrote:
              | Also "HELLO KITTY" being backwards is odd - writing on
             | trains doesn't normally come out like that eg
             | https://www.groupe-sncf.com/medias-
             | publics/styles/crop_1_1/p...
        
           | superfrank wrote:
           | I know there are people acting like it's obvious that this
           | is AI, but I get why people wouldn't catch it, even if they
           | know that AI is capable of creating a video like this.
           | 
           | A) Most of the giveaways are pretty subtle and not what
           | viewers are focused on. Sure, if you look closely the fur
           | blends in with the pavement in some places, but I'm not going
           | to spend 5 minutes investigating every video I see for hints
           | of AI.
           | 
           | B) Even if I did notice something like that, I'm much more
           | likely to write it off as a video filter glitch, a weird
           | video perspective, or just low quality video. For example,
           | when they show the inside of the car, the vertical handrails
           | seem to bend in a weird way as the train moves, but I've seen
           | similar things from real videos with wide angle lenses.
           | Similar thoughts on one of the bystander's faces going
           | blurry.
           | 
           | I think we just have to get people comfortable with the idea
           | that you shouldn't trust a single unknown entity as the
           | source of truth on things, because everything can be faked.
           | For insignificant things like this it doesn't matter, but for
           | big things you need multiple independent sources. That's
           | definitely an uphill battle and who knows if we can do it,
           | but that's the only way we're going to get out the other side
           | of this in one piece.
        
         | lanthissa wrote:
         | exactly one lab has passed the test of morals vs profit at this
         | point, and thats deepmind, and they were thoroughly punished
         | for it.
         | 
          | Every value oAI has claimed to have hasn't lasted a millisecond
         | longer than there was profit motive to break it, and even
         | anthropic is doing military tech now.
        
           | dmix wrote:
           | LLMs aren't AGI
        
         | halyconWays wrote:
         | Your comment is AI.
        
           | thr3000 wrote:
           | So is yours! Mine isn't, however. I am a hard-nosed real boy
           | now.
        
         | bko wrote:
         | I don't think Google delayed or kept this under wraps for any
         | noble reasons. I think they were just disorganized as evidenced
         | by their recent scrambling to compete in this space.
        
         | mrcwinn wrote:
         | Too charitable indeed. Google was simply unprepared and has
         | inferior alternatives.
         | 
         | My prediction is that next year they will catch up a bit and
         | will not be shy about releasing new technology. They will
          | remain behind in LLMs but at least will more deeply envelop
         | their own existing products, thus creating a narrative of
         | improved innovation and profit potential. They will publicly
         | acknowledge perceived risks and say they have teams ensuring it
         | will be okay.
        
         | raincole wrote:
         | Considering google image search is polluted by AI-generated
         | images at this moment, perhaps google is afraid of making the
         | search even worse?
        
         | computerex wrote:
         | They should have kept this amazing tech under the wraps because
         | you have a bad feeling about it? Hate to break it to you, but
         | there have been fake videos on the internet ever since it has
         | existed. There are more ways to fake videos than GenAI. If you
         | haven't been consuming everything on the internet with a high
         | alert bs sensor, then that's an issue of its own. You shouldn't
         | trust things on the internet anyway unless there is
         | overwhelming evidence.
        
           | sergiogdr wrote:
           | > If you haven't been consuming everything on the internet
           | with a high alert bs sensor, then that's an issue of its own
           | 
           | "just be privileged as I was to get all the necessary
           | education to be able to not be fooled by this tech". Yeah,
           | very realistic and compassionate.
        
             | cma wrote:
             | With a heavy dose of "if masses of people are fooled by
             | this, it can't affect me as long as I can see through it.
              | No possible repercussions of masses of people believing
             | completely made up stuff that could affect laws, etc."
        
         | pier25 wrote:
         | I wish Google would allow me to remove the AI stuff from search
         | results.
         | 
          | 99% of the time it's either useless or wrong.
        
           | fraXis wrote:
           | Add a -ai to the end of your Google search query. There are
           | also browser extensions that stop the AI content from
           | displaying. I use the one for Chrome called "Remove Google
           | Search Generative AI".
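[Editor's note: the -ai trick above can also be baked into a URL. A minimal sketch, assuming only Python's standard library; udm=14 is the widely reported query parameter behind Google's plain "Web" results tab, and -ai is the exclusion operator mentioned in the comment. Both are community workarounds, not a documented Google API.]

```python
from urllib.parse import urlencode

def web_only_search_url(query: str) -> str:
    """Build a Google search URL that sidesteps the AI Overview.

    Combines two community-reported tricks: appending the "-ai"
    exclusion operator to the query (negative operators suppress the
    Overview as a side effect) and udm=14, the parameter used by
    Google's plain "Web" results tab.
    """
    return "https://www.google.com/search?" + urlencode(
        {"q": f"{query} -ai", "udm": 14}
    )

print(web_only_search_url("sora video model review"))
# -> https://www.google.com/search?q=sora+video+model+review+-ai&udm=14
```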
        
           | titzer wrote:
           | Strong plus one here. Not only that, but it uses _gobs_ of
           | energy in total. Google has reneged on all of its carbon
           | promises to stay in the running for AI domination and to head
            | off disruption to its search ads business. Since I've
           | unconsciously trained my brain to not look at the top search
           | results anymore because they long ago turned into impossible-
           | to-distinguish ads, I've quickly learned to just ignore the
           | stupid AI summary. So it's an absurd waste of computational
           | power to generate something wrong that I don't even want to
           | see, and I can't even tell them to stop when they're wasting
           | their own money to do so.
        
         | makestuff wrote:
         | I don't even know if this will be possible, or how it would
         | work, but it seems like the next iteration of social media will
          | be based on some verification that the user is not a bot or
          | using AI. Currently they are all incentivized not to stop bot
         | activity because it increases user counts, ad revenue, etc.
         | 
         | Maybe the model is you have to pay per account to use it, or
         | maybe the model will be something else.
         | 
         | I doubt this will make everyone just go back to primarily
         | communicating in person/via voice servers but that is a
         | possibility.
        
           | mnau wrote:
           | So Musk was right?
        
           | joaohaas wrote:
           | Twitter Blue is paid and yet every single bot account has it
           | in order to boost views.
        
         | kylehotchkiss wrote:
         | > the image of those people on slot machines, mechanically and
         | soullessly pulling levers because they are addicted. It's just
         | so strange.
         | 
         | Worse, the audience is our parents and grandparents. They have
         | little context to be able to sort out reality from this stuff
        
       | ren_engineer wrote:
       | forced to finally release after that new open source model came
       | out that was equal or better?
        
       | remoquete wrote:
       | Ah, yes. We definitely needed another bad dreams generator.
        
         | andybak wrote:
         | Interesting creative people will produce interesting creative
         | output.
         | 
         | People with no taste will produce tasteless content.
         | 
         | The mountain of slop will grow.
         | 
         | And some of us have no intention of publishing any output
         | whatsoever but just find the existence of these tools
         | fascinating and inspiring.
        
           | remoquete wrote:
           | While it's indeed fascinating, part of me finds the sheer
           | energy expenditure to be problematic, not to mention the
           | "Hollywood is dead" innuendos.
        
           | gburdell3 wrote:
           | And some of us have no intention of publishing any output and
           | find the existence of these tools extremely worrying and
           | problematic.
        
           | crakhamster01 wrote:
           | Interesting creative people are currently creating
           | interesting output _without_ generative AI.
           | 
           | These tools are fascinating, though I can't help but feel
            | that the main beneficiaries after all is said and done will be
           | venture capitalists and tech/entertainment execs.
        
       | LeoPanthera wrote:
       | This seems pretty broken at the moment, I haven't actually
       | managed to create a video, every prompt results in "There was an
       | unexpected error running this prompt".
        
         | ilaksh wrote:
         | I can't even sign up. I assume it's a capacity issue.
        
         | knicholes wrote:
         | At least you get to even see the page! I'm seeing "Sign ups are
         | temporarily unavailable We're currently experiencing heavy
         | traffic and have temporarily disabled sign ups. We're working
         | to get them back up shortly so check back soon."
        
       | tacticalturtle wrote:
       | Something about that image of the spinning coffee cup with
       | sailing ships is giving me severe trypophobia:
       | 
       | https://en.m.wikipedia.org/wiki/Trypophobia
       | 
       | It's a like a spider's eyes... and also not what I would expect a
       | latte to look like.
        
       | throw4321 wrote:
       | One of the problems with a 10-month preannouncement is that the
       | competition is ready to trash the actual announcement. Half an
       | hour in, I already see half a dozen barely-concealed posts
       | ranging from downplays to over-demands to non-user criticism.
        
         | rtsil wrote:
         | I think that's just the typical HN cynicism.
        
           | abenga wrote:
           | I don't think you can reduce it to this. Even on the showcase
           | videos on the home page I can see weird artifacts, like the
           | red car in the video of the guy walking through a market. The
           | car is driving on a pedestrian walkway (through pedestrians),
           | and just suddenly disappears from one frame to another.
        
             | manquer wrote:
             | Also the tennis player walking through the net
        
         | minimaxir wrote:
         | The competition is ready to trash the announcement because the
         | 10-month delay gave rise to several viable competitors, and
         | that would still be the case if OpenAI never did the
         | preannouncement. If OpenAI released Sora 10 months ago, there
         | wouldn't be as much cynicism.
        
       | madihaa wrote:
       | Account creation currently unavailable
        
         | topherjaynes wrote:
            | yea just got that too. Did anyone get in or did they get
            | overwhelmed?
        
           | HaZeust wrote:
           | Doesn't look like you will.
           | 
           | "We're currently experiencing heavy traffic and have
           | temporarily disabled sign ups. We're working to get them back
           | up shortly so check back soon."
        
       | zb3 wrote:
       | no API = not good enough
       | 
       | no pay per use = overpriced
        
         | zb3 wrote:
         | not available in the EU = might use everything you did there
          | against you, sell that data to the highest bidder
        
       | ulrischa wrote:
       | It will not be available in the EU for now. I always feel
       | disadvantaged when I read that sentence
        
         | simicd wrote:
         | And in the UK and Switzerland unfortunately
         | 
         | https://help.openai.com/en/articles/10250692-sora-supported-...
        
         | hmmm-i-wonder wrote:
         | I'm not in the EU, but when I see something that is US only, I
          | tend to assume it's doing something with privacy/user
         | data/otherwise that is restricted in the EU.
         | 
         | Which means I generally avoid things that are not EU available
          | even if they are available to me. It's not 100% but it's a fairly
         | decent measure of how much companies care about users to ensure
         | they meet EU privacy laws from the start, vs if they provide
         | some limited version or delayed version to the EU.
        
           | xvector wrote:
           | It's really just because it's expensive as fuck and ungodly
           | complicated to ship _literally anything_ to the EU, so a lot
           | of us in big tech have just given up on it.
           | 
           | You make a small mistake and they call you evil and hit you
           | with a $1B fine. Or you don't make a mistake but they make up
           | some bullshit reason to fine you anyways, and fund the
           | government coffers.
           | 
           | It's just not worth it. And every day the EU becomes worth
            | less. They will miss out on the AI revolution like they missed
            | out on the mobile revolution. And they can only miss out on
           | so many industrial revolutions before they fade away.
           | Whatever, it's their problem :shrug:
           | 
           | When AGI is finally discovering new therapies, we'll be able
           | to measure how much the EU slowed down innovation in AI and
           | the cost in lives. It will be around 150k lives for _every
            | day_ the EU delayed progress. I'm sure some people will find
           | a way to rationalize that as being okay. Future generations
           | certainly won't.
        
             | sksrbWgbfK wrote:
             | I wonder how all those European companies are doing it.
             | They ship everything all the time, avoid the $billions
             | fines, yet make mistakes like everybody else.
             | 
             | > how much the EU slowed down innovation
             | 
             | You say this all the time, yet we're doing fine. How come?
        
         | mhh__ wrote:
         | Are you saying you're not glad that the EU has chosen for you?
         | 
         | I would ask an AI to generate a riff on a "I am the very model
         | of a modern major general" but for some EU bureaucrat but I'll
         | spare you the spam.
        
       | joshstrange wrote:
       | OpenAI is a masterclass in pissing off paying customers.
       | 
       | I'm just about ready to cancel my ChatGPT subscription and move
       | fully over to Claude because OpenAI has spit in my face one too
       | many times.
       | 
       | I'm tired of announcements of things being available only to find
       | out "No, they aren't" or "It's rolling out slowly" where "slowly"
        | can mean days, weeks, or months (no exaggeration).
       | 
        | I'm tired of shit like this: "Sign ups are temporarily
        | unavailable. We're currently experiencing heavy traffic and
        | have temporarily disabled sign ups. We're working to get them
        | back up shortly so check back soon."
       | 
       | Sign up? I'm already signed up, I've had a paid account for a
       | year now or so.
       | 
       | > We're releasing it today as a standalone product at Sora.com to
       | ChatGPT Plus and Pro users.
       | 
       | No you aren't, you might be rolling it out (see above for what
       | that means) but it's not released, I'm a ChatGPT Plus user and I
       | can't use it.
        
         | EliBullockPapa wrote:
         | I really don't think it's reasonable to expect them to onboard
         | what is likely tens of thousands of sign ups in the first hour.
        
           | minimaxir wrote:
           | ChatGPT has far, far more concurrent users than tens of
           | thousands. Sora is not a small hobby project by an amateur
           | hacker that blew up.
        
           | joshstrange wrote:
           | I don't disagree, what I'm asking for is "truth in
           | advertising". I'm not saying they need to give everyone
           | access on day 1, I'm saying don't _say_ you've given everyone
           | access if you haven't.
        
       | zlies wrote:
       | Is there information when it will be available in other
       | countries, like Germany for example?
        
       | null_investor wrote:
        | I hope somebody pays for 100,000 Pro subscriptions and uses AI to
       | request Sora to generate videos 24/7. Maybe Elon?
       | 
       | Even if they use queues, I'm sure they are running at a loss and
       | the GPU time is going to cost 100x more than what they charge.
       | 
       | Creating false demand for AI can easily bankrupt their business,
       | as they will believe people actually want to use that crap for
       | that purpose.
        
         | minimaxir wrote:
         | Deliberately wasting electricity isn't exactly a moral win.
        
           | toasteros wrote:
           | Generative AI is a waste of electricity by definition.
        
             | mdp2021 wrote:
             | > _by definition_
             | 
             | "Definition" does not mean "...plus your own assumptions".
             | 
             | The results are there. Optimal, no; somehow valuable, yes.
        
       | adultsthroaway wrote:
       | Genuinely curious who is doing this for adult content?
       | 
        | Complaints about Sora's quality and prompt complexity are likely
        | not as important to auteurs in that category, especially with the
        | ability to load a custom character, etc.
        
         | minimaxir wrote:
         | Sora (along with DALL-E 2 well before it) specifically has
         | safeguards against NSFW content.
        
       | fosterfriends wrote:
       | Anyone else feeling their servers melt a bit on sora.com?
        
       | seydor wrote:
        | even in the mammoths demo, the dust clouds keep popping in
        | behind them even after they have moved forward
        
       | Imnimo wrote:
       | I feel like there is a sweet spot for AI generation of images and
       | videos that I would describe as "charmingly bad", like the stuff
       | we got from the old CLIP+VQGAN models. I feel like Sora has
       | jumped past that into the valley of "unappealingly bad".
        
         | halyconWays wrote:
         | I think that's why humor and memes are such good targets for
         | this type of stuff. If you look up videos like "luma memes
         | compilation," it takes well-known memes and distorts them in
         | uncanny, freaky, and bizarre ways. Yet the fact the original
         | subject is a meme somehow bypasses the uncanny valley
         | repulsion. We seem to accept that much more readily, for
         | whatever reason.
        
       | ilaksh wrote:
       | This is actually a different version from what they had before.
       | What they released today is Sora Turbo.
        
       | therein wrote:
       | Account creation not available. Login to see more videos.
       | 
       | Classic OpenAI. I don't care, there are so many better
       | alternatives to everything they do. Funny how quickly they have
       | become irrelevant and lost their moat.
        
       | natvert wrote:
       | anyone done a comparison with the open-source hunyuanvideoai.com?
        
         | tetris11 wrote:
         | 0 stars, and no comments the last time this was posted. Maybe
         | too good to be true?
        
           | natvert wrote:
            | I've run hunyuanvideoai from their GitHub and it seems to
           | generate realistic videos. It is a bit slow (30-60min per
           | video clip) and requires ~50GB VRAM. I wonder how the quality
           | compares though?
        
             | natvert wrote:
             | oh, and the output videos that are generated are 5sec clips
             | at 554x960px. this is on a single A6000
        
               | tetris11 wrote:
               | It looks good, I'm just wondering why it has no attention
               | from the ML community
        
           | echelon wrote:
           | That's not the upstream source. You're looking for this:
           | 
           | https://github.com/Tencent/HunyuanVideo/
           | 
           | This isn't "too good to be true" - this is the holy grail.
           | Hunyuan is set to become the Flux/Stable Diffusion of AI
           | video.
           | 
           | I don't see how Hunyuan doesn't completely kill off Sora.
           | It's 100% open source, is rapidly being developed for
           | consumer PCs, can be fine tuned, works with ComfyUI/other
           | tools, and it has control nets.
        
       | dartos wrote:
       | The most impressive part is the temporal consistency in the demo
       | videos.
       | 
        | The flower one is the best looking.
        
         | tetris11 wrote:
         | That cat skateboarding off the path cut out just when it was
         | getting interesting.
         | 
         | Many of these likely fall apart just split seconds after
        
           | dartos wrote:
           | I don't doubt it, but even 60 seconds of temporal consistency
           | is an improvement, even if it's incremental.
        
       | sjm wrote:
       | Anyone else find this stuff extremely distasteful? "Disrupting"
       | creativity and art feels like it goes against our humanity.
        
         | ganzuul wrote:
         | It is like an attempt to do psychic battle over the meaning of
         | "disruption".
        
         | ronsor wrote:
         | I'm glad someone else said this. Hopefully we can get rid of
         | that terrible disruptive camera too.
        
         | quenix wrote:
         | The past few years' innovation in AI has roughly been split
         | into two camps for me.
         | 
         | LLMs -- Awesome and useful. Disruptive, and somewhat dangerous,
         | but probably more good than harm if we do it right.
         | 
         | 'Generative art' (i.e. music generation, image generation,
         | video generation) -- Why? Just why?
         | 
         | The 'art' is always good enough to trick most humans at a
         | glance but clearly fake, plastic, and soulless when you look a
         | bit closer. It has instilled somewhat of a paranoia in me when
         | browsing images and genuinely worsened my experience consuming
         | art on the internet overall. I've just recently found out that
         | a jazz mix I found on YouTube and thought was pretty neat is
         | fully AI generated, and the same happens when I browse niche
         | artstyles on Instagram. Don't get me started on what this Sora
         | release will do...
         | 
         | It changed my relationship consuming art online in general.
         | When I see something that looks cool on the surface, my
         | reaction is adversarial, one of suspicion. If it's recent, I
         | default to assuming the piece is AI, and most of the time I
         | don't have time or effort to sleuth the creator down and check.
         | It's only been like a year, and it's already exhausting.
         | 
         | No one asked for AI art. I don't understand why corporations
         | keep pushing it so much.
        
           | WXLCKNO wrote:
           | I understand your take but it's only going to get better and
           | incredibly fast.
           | 
           | I'm a huge film nerd and I can only dream of a future where I
           | could use these type of tools (but more advanced) to create
           | short films about ideas I've had.
           | 
           | It's very exciting to me
        
             | mirsadm wrote:
             | I somehow doubt it's (lack of) technology that's stopping
             | you from creating your ideas.
        
           | huehehue wrote:
           | There's this FinTech ad on the NYC subway right now. I can't
           | remember the company, but the entire ad is just a picture of
           | a guitar and some text.
           | 
           | Anyway, the guitar is AI generated, and it's really bad.
           | There are 5 strings, which morph into 6 at the headstock.
           | There's a trem bar jammed under the pickguard, somehow.
           | There's a randomly placed blob on the guitar that is supposed
           | to be a knob/button, but clearly is not. The pickups are
           | visually distorted.
           | 
           | It's repulsive. You're trying to sell me on something, why
           | would you put so little effort into your advertising? Why
           | would you not just...take a picture of a real guitar? I so
           | badly want to cover it up.
        
             | DebtDeflation wrote:
             | Just need to add a hand with 6 fingers strumming it and it
             | could be a meme.
        
             | wumeow wrote:
             | Reminds me of the new Coca Cola Christmas ad which is
             | equally off-putting.
        
             | imiric wrote:
             | > You're trying to sell me on something, why would you put
             | so little effort into your advertising? Why would you not
             | just...take a picture of a real guitar?
             | 
             | Is this not evident? Because using AI is much cheaper and
             | faster. Instead of finding the right guitar, paying for a
             | good photographer, location, decoration, and all the
             | associated logistics, a graphics designer can write a
             | prompt that gets you 90% of the vision, for orders of
             | magnitude less cost and time. AI is even cheaper and faster
             | than using stock images and talented graphic designers,
             | which is what we've been doing for the past few decades.
             | 
             | All our media channels, in both physical and digital
             | spaces, will be flooded with this low-effort AI garbage
             | from here on out. This is only the beginning. We'll need to
             | use aggressive filtering and curation in order to find
             | quality media, whether that's done manually by humans or
             | automatically by other AI. Welcome to the future.
        
               | huehehue wrote:
               | I was able to find a similar public domain image in all
               | of 5 seconds, so neither faster nor cheaper in this case.
               | 
               | In fact, it's not hard to imagine people using AI tools
               | even if they're slower, more expensive, and yield worse
               | quality results in the long run.
               | 
               | "When all you have is a hammer...".
        
           | imiric wrote:
           | I don't understand why you see a distinction between models
           | that generate text, and those that generate images, video or
           | audio. They're all digital formats, and the technology itself
           | is fairly agnostic about what it's actually generating.
           | 
           | Can't text also be considered art? There's as much art in
           | poetry, lyrics, novels, scripts, etc. as in other forms of
           | media.
           | 
           | The thing is that the generative tech is out of the bag, and
           | there's no going back. So we'll have to endure the negative
           | effects along with the positive.
        
             | quenix wrote:
              | Simple: I am equally put off when LLMs are used for
              | generating poetry, lyrics, novels, scripts, etc. _I don't
             | like it when low-effort generated slop is passed off as
             | art_.
             | 
             | I just think that LLMs have genuine use for non-artistic
             | things, which is why I said it's dangerous but may be
             | useful if we play our cards right.
        
               | imiric wrote:
               | I see. Well, I agree to an extent, but there's no clear
               | agreement about what constitutes art with human-generated
               | works either. There are examples of paintings where the
               | human clearly "just" slapped some colors on a canvas, yet
               | they're highly regarded in art circles. Just because
               | something is low-effort doesn't mean it's not art, or
               | worthy of merit.
               | 
               | So we could say the same thing about AI-generated art.
               | Maybe most of it is low-effort, but why can't it be
               | considered art? There is a separate topic about human
               | emotion being a key component these generated works are
               | missing, but art is in the eyes of the beholder, after
               | all, so who are we to judge?
               | 
               | Mind you, I'm merely playing devil's advocate here. I
               | think that all of this technology has deep implications
               | we're only beginning to grapple with, and art is a small
               | piece of the puzzle.
        
               | quenix wrote:
               | You make a good point. I'm just spitballing here, but I
               | think what sets generative art apart for me is the
               | element of _deception_.
               | 
               | I'd be perfectly fine with a hypothetical world in which
               | all generated art is clearly denoted as such. Like you
               | said, art is in the eyes of the beholder. I welcome a
               | world in which AI art lives side-by-side with traditional
               | art, but clearly demarcated.
               | 
               | Unfortunately, the reality is very different.
               | 
                | AI art inherently tries to pass itself off as if it were made by
               | a human. The result of the tools released in the past
               | year is that my relationship with media online has become
               | adversarial. I've been tricked in the past by AI music
               | and images which were not labelled as such, which fosters
               | a sort of paranoia that just isn't there with the
               | examples you mentioned.
        
               | shombaboor wrote:
               | the offensive part is that it's creative theft by
               | digesting other people's creative works then reworked and
               | regurgitated. It's 'fine' when it's technical
               | documentation and reference work, but that's not human
               | expression.
        
               | doug_durham wrote:
               | So pre-LLM were you offended when someone posted their
               | personal poetry or artwork on internet if it was clear
               | they had put little effort into it? Somehow I doubt it.
        
           | l33tbro wrote:
           | Wish it was just generative AI for me.
           | 
           | You don't have the same paranoia with LLM? So often I find
           | myself getting a third of the way into reading an article or
           | blog post and think: "wait a minute...".
           | 
           | LLM tone is so specific and unrealistic that it completely
           | disengages me as a reader.
        
           | PartiallyTyped wrote:
           | I have found a channel that curates and cleans some AI
           | generated music. I really enjoy it, it's nothing I heard
           | before, it's unique, distinct, and devoid of copyright.
        
         | moralestapia wrote:
         | "And then everyone clapped ..."
         | 
         | There's nothing wrong with technology going forward and this
         | doesn't go against "creativity and art", to the contrary, it
         | will enhance it.
        
           | TheAlchemist wrote:
           | That's the optimistic version and in theory I would agree -
            | it will be a great enhancer of creativity for some people.
           | 
            | But mostly it will end up like smartphones - we carry more
            | computing power in our pockets than was used to send man to
            | the moon, and instead of taking advantage of it to do great
            | things, we are glued to this small screen several hours a
            | day scrolling social media nonsense. It's just human
           | nature.
        
         | tim333 wrote:
         | There's some of that but it produces some cool stuff too. I
         | mean you have these new virtual worlds like this that didn't
         | exist before https://youtu.be/y_4Kv_Xy7vs?t=13
         | 
         | The video there is kind of a combination of human design and AI
         | which produces something beyond that which either would come up
         | with on their own.
        
       | simonw wrote:
       | I got lucky and got in moments after it launched, managed to get
       | a video of "A pelican riding a bicycle along a coastal path
       | overlooking a harbor" and then the queue times jumped up (my
       | second video has been in the queue for 20+ minutes already) and
       | the https://sora.com site now says "account creation currently
       | unavailable"
       | 
       | Here's my pelican video:
       | https://simonwillison.net/2024/Dec/9/sora/
        
         | ByThyGrace wrote:
         | Did you notice the frame rate (so to speak) of what's happening
         | down the lake is much lower than the pelican's bicycle
         | animation?
        
         | vletal wrote:
         | Image details: 9/10
         | Animation: 3/10
         | Temporal consistency: 2/10
         | 
         | Verdict: 4/10
        
         | rjtavares wrote:
         | One of the highlights of any model release for me is checking
         | your "pelican riding a bicycle" test.
        
         | echelon wrote:
         | For those who can't try Sora out, Tencent's super recent
         | HunYuan is 100% open source and outperforms Sora. It's
         | compatible with fine tuning, ComfyUI development, and is
         | getting all manner of ControlNets and plugins.
         | 
         | I don't see how Sora can stay in this race. The open source
         | commoditization is going to hit hard, and OpenAI probably
         | doesn't have the product DNA or focus to bark up this tree too.
         | 
         | Tencent isn't the only company releasing open weights. Genmo,
         | Black Forest Labs, and Lightricks are developing completely
         | open source video models.
         | 
         | Even if there weren't open source competitors, there are a
         | dozen closed source foundation video companies: Runway, Pika,
         | Kling, Hailuo, etc.
         | 
         | I don't think OpenAI can afford to divert attention and win in
         | this space. It'll be another Dall-E vs. Midjourney, Flux,
         | Stable Diffusion.
         | 
         | https://github.com/Tencent/HunyuanVideo
         | 
         | https://x.com/kennethlynne/status/1865528133807386666
         | 
         | https://fal.ai/models/fal-ai/hunyuan-video
        
         | alberth wrote:
         | Thanks, would you mind elaborating more on what you wrote below:
         | Sora is built entirely around the idea of directly manipulating
         | and editing and remixing the clips it generates, so the goal
         | isn't to have it produce usable videos from a single prompt.
        
           | simonw wrote:
           | If you watch the OpenAI announcement they spend most of their
           | time talking about the editing controls:
           | https://www.youtube.com/watch?v=2jKVx2vyZOY
        
         | OJFord wrote:
         | > The Pelican inexplicably morphs to cycle in the opposite
         | direction half way through
         | 
         | It's pretty cool though, the kind of thing that'd be hard if it
         | was what you actually wanted!
        
         | benatkin wrote:
         | That's an awful result. Its turning around has absolutely
         | nothing to do with what you asked for. It's similar in nature
         | to what the chatbot said in the recent and ongoing scandal,
         | telling him to come home to her, when it should have known that the
         | idea would be nonsensical or could be taken to mean something
         | horrendous. https://apnews.com/article/chatbot-ai-lawsuit-
         | suicide-teen-a...
         | 
         | So you were lucky indeed to be able to run your prompt and
         | share it, because the result was quite illuminating, but not in
         | a way that looks good for Sora and OpenAI as a whole.
        
         | pushcx wrote:
         | I don't have a lot of mental model for how this works, but I
         | was surprised to note that it seems to maintain continuity on
         | the shapes of the bushes and brown spots on the grass that
         | track out of frame on the left and then reappear as it pans
         | back into frame.
        
           | benatkin wrote:
           | That must be exactly it. The simulated scene extends beyond
           | what the camera is currently capturing.
        
         | vunderba wrote:
         | _" The Pelican inexplicably morphs to cycle in the opposite
         | direction half way through"_
         | 
         | Oof, if Sora can't even manage to maintain internal
         | consistency of the world for a 5-second short, I can't
         | imagine how much worse it'll get at longer video generation
         | times.
        
       | jrflowers wrote:
       | "Right before the TikTok ban goes into effect" is incredible
       | market timing for the release of a tool that is useless for
       | anything other than terrible TikTok spam videos
        
         | jeroenhd wrote:
         | Hey now, no need to downplay the product here, it's also useful
         | for spamming other video sharing platforms! Think Facebook
         | timelines, which are already full of AI image barf, Twitter
         | feeds, which mostly consist of AI text barf, and Youtube
         | Shorts, which is full of existing AI animation barf!
         | 
         | Soon, lots of people can pay a modest sum to make the internet
         | just a bit worse for everyone in exchange for a chance to make
         | their money back!
        
       | cryptozeus wrote:
       | Raises billions of dollars, claims AGI by 2025, cannot
       | handle new user sign-up traffic.
        
         | iLoveOncall wrote:
         | This is by design, they want the news articles saying "this is
         | so popular it crashed their website!"
        
         | manquer wrote:
         | Billion is table stakes; OpenAI has raised over 6 billion
         | dollars this year alone
        
         | pritambarhate wrote:
         | That's because in this case scaling to big traffic needs more
         | hardware which is very expensive and even if you have money the
         | manufacturers may not have the capacity you need.
        
         | knicholes wrote:
         | I don't even get why I have to "sign up." I'm already a paying
         | customer with an existing account.
        
       | exe34 wrote:
       | regarding all the comments about physics, I wonder if a hybrid
       | approach would work better, with an llm generating 3d objects
       | that interact in a physics simulation with guiding forces from
       | the LLM and then another model generating photo realistic
       | rendering.
        
       | topaz0 wrote:
       | Gentle reminder that it's important to boycott this kind of
       | thing.
        
         | tgv wrote:
         | This, and similar tools, make the world a worse place, just so
         | a handful can get the big bucks. This is not technological
         | progress, it's greed. Ethics is a dirty word.
        
         | gavindean90 wrote:
         | Why?
        
       | lacoolj wrote:
       | A little worried how young children watching these videos may
       | develop inaccurate impressions of physics in nature.
       | 
       | For instance, that ladybug looks pretty natural, but there's a
       | little glitch in there that an unwitting observer, who's never
       | seen a ladybug move before, may mistake as being normal. And
       | maybe it is! And maybe it isn't?
       | 
       | The sailing ship - are those water movements correct?
       | 
       | The sinking of the elephant into snow - how deep is too deep?
       | Should there be snow on the elephant or would it have melted from
       | body heat? Should some of the snow fall off during movement or is
       | it maybe packed down too tightly already?
       | 
       | There's no way to know because they aren't actual recordings, and
       | if you don't know that, and this tech improves leaps and bounds
       | (as we know it will), it will eventually become published and
       | will be taken at face value by many.
       | 
       | Hopefully I'm just overthinking it.
        
         | highwaylights wrote:
         | I'd be more worried about the inevitable "we're under nuclear
         | attack, head for shelter" CNN deepfakes.
        
         | icepat wrote:
         | > The sinking of the elephant into snow - how deep is too deep?
         | Should there be snow on the elephant or would it have melted
         | from body heat? Should some of the snow fall off during
         | movement or is it maybe packed down too tightly already?
         | 
         | Should there be an elephant in the snow? The layers of possible
         | confusion, and subtle incorrect understandings go much deeper.
        
           | bbarnett wrote:
           | Yes, they were used to traverse mountain paths.
        
           | sccomps wrote:
           | By the same reasoning, do reindeer actually fly and pull a
           | sleigh carrying a 200-pound man along with tons of gifts? I
           | believe you're underestimating human intelligence and our
           | ability to apply logic and reasoning.
        
         | tetris11 wrote:
         | Also, I guess it's just normal for a car lane to merge
         | seamlessly into a pedestrian zone.
        
         | RyeCombinator wrote:
         | I share your concern as well and at times worry about what I'm
         | seeing too.
         | 
         | I suppose the reminder here is that seeing does not warrant
         | believing.
        
         | darepublic wrote:
         | Sure, this is problematic for society, although I'm not
         | concerned about what you are mentioning. I remember as a kid
         | noticing how in Looney Tunes Wile E. Coyote could run off the
         | cliff a few steps, and thinking maybe there's a way to do
         | that. Or kids arguing about whether it was possible to
         | perform a sonic boom like in Street Fighter. Or jumping off
         | the playground with an umbrella, etc.
        
         | sccomps wrote:
         | > For instance, that ladybug looks pretty natural, but there's
         | a little glitch in there that an unwitting observer, who's
         | never seen a ladybug move before, may mistake as being normal.
         | And maybe it is! And maybe it isn't?
         | 
         | Well, none of the existing animation movies follow exact laws
         | of physics.
        
           | cj wrote:
           | Take the example to the extreme: In 10 years, I prompt my
           | photo album app "Generate photorealistic video of my mother
           | playing with a ladybug".
           | 
           | The juxtaposition of something that looks extremely real
           | (your mother) and something that never happened (ladybug) is
           | something that's hard for the mind to reconcile.
           | 
           | The presence of a real thing inadvertently and subconsciously
           | gives confidence to the fake thing also being real.
        
             | Fade_Dance wrote:
             | I think this hooks in quite well to the existing dialogue
             | about movies in particular. Take an action movie. It looks
             | real but is entirely fabricated.
             | 
             | It is indeed something that society has to shift to deal
             | with.
             | 
             | Personally, I'm not sure that it's the photoreal aspect
             | that poses the biggest challenge. I think that we are
             | mentally prepared to handle that as long as it's not out of
             | control (malicious deep-fakes used to personally target and
             | harass people, etc.) I think the biggest challenge has
             | already been identified, namely, passing off fake media as
             | being real. If we know something is fake, we can put a
             | mental filter in place, like a movie. If there is no way to
             | know what is real and what is fake, then our perception
             | reality itself starts to break down. _That_ would be a
             | major new shift, and certainly not one that I think would
             | be positive.
        
               | browningstreet wrote:
               | I looked at the Sora videos and all the subject "weights"
               | and "heft" are off, in the same way that Anya Taylor-
               | Joy's jump in The Gorge at the end of the new movie
               | trailer looked not much better than years-ago Spider-
               | Man swinging on a rope.
        
               | normalaccess wrote:
               | I'm still waiting on the future waves of PTSD from hyper
               | realistic horror games. I can't think of a worse thing to
               | do than hand a kid a VR headset (or game system) and have
               | them play a game that is _designed_ to activate every
               | single fight or flight nerve in the body on a level that
               | is almost indistinguishable from reality. 20 years ago
               | that would have been the plot to a torture porn flick.
               | 
               | Even worse than that is when people get USED to it and no
               | longer have a natural aversion to horrific scenes taking
               | place in the real world.
               | 
               | This AI stuff accelerates that process of illusion but in
               | every possible direction at once.
               | 
               | As much as people don't want to believe it, by beholding
               | we are indeed changed.
        
               | dartos wrote:
               | That argument can and probably was pointed towards movies
               | with color, movies with audio before that, comics, movies
               | without audio, books, etc.
               | 
               | I don't think that slippery slope holds up.
               | 
               | IIRC there's pretty solid research showing that even
               | children beyond the age of 8 can tell the difference
               | between fiction and reality.
        
               | normalaccess wrote:
               | Distinguishing reality from fiction is useful, but it
               | doesn't shape our desires or define our values. As a
               | culture, we've grown colder and more detached. Think of
               | the first Dracula film--audiences were so shaken by a
               | simple eerie face that some reportedly lost control in
               | the theater. Compare that visceral reaction to the apathy
               | we feel toward far more shocking imagery today.
               | 
               | If media didn't profoundly affect us, how could exposure
               | therapy rewire fears? Why would billions be spent on
               | advertising if it didn't work? Why would propaganda or
               | education exist if ideas couldn't be planted and nurtured
               | through storytelling?
               | 
               | Is there any meaningful difference between a sermon from
               | the pulpit and a feature film in the theater? Both are
               | designed to influence, persuade, and reshape our
               | worldview.
               | 
               | As Alan Moore aptly put it: "Art is, like magic, the
               | science of manipulating symbols, words, or images to
               | achieve changes in consciousness."
               | 
               | In my opinion the old adage holds true, _you are what you
               | eat_. And we will soon be eating unimaginable mountains
               | of artificial content cooked up by dream engines tuned to
               | our every desire and whim.
        
             | brookst wrote:
             | Wouldn't this same concern apply to historical fiction in
             | general?
        
           | spullara wrote:
           | gravity acts immediately, you don't hover in the air for a
           | seconds before falling
        
             | dylan604 wrote:
             | then how will I have time to flash my sign to the audience
             | that says "uh-oh"?
        
           | byteknight wrote:
           | Feels like you're looking for a strawman argument, and may
           | have found one.
           | 
           | I would retort that animation and real-life-looking video do
           | different things to our psyche. As an uneducated wanna-be
           | intellectual, I would lean toward thinking real-looking
           | objects more directly influence our perception of life than
           | animations.
        
             | a_wild_dandan wrote:
             | Animation _can_ look real though, e.g sci-fi vfx. But maybe
             | you're concerned about how prolific it may be? I could see
             | that. Personally I think it'll be fine. It's just that
             | disruptive tools create uncertainty. Or maybe I'm
             | overcompensating to avoid being the "old man yelling at
             | cloud" dude.
        
               | byteknight wrote:
               | Now you're intentionally mixing VFX and animation.
               | Animation, at least in my meaning, was more cartoon.
        
           | FridgeSeal wrote:
           | Well, none of the existing animation movies claim to be
           | anything other than animation.
           | 
           | You just know there'll be people making content within the
           | week for social media that will be trying to pass itself off
           | as real imagery.
        
           | jsheard wrote:
           | Animation doesn't follow exact laws of physics, but the
           | specific ways they don't follow physics have very deliberate
           | intent behind them. There's a pretty clear difference between
           | the coyote running off a cliff and taking 2 seconds to start
           | falling, and a character awkwardly floating over the ground
           | because an AI model got confused.
        
             | bee_rider wrote:
             | It is a good point...
             | 
             | Although, plenty of kids have tied a blanket around their
             | necks and jumped off some furniture or a low roof, right?
             | Breaking a leg or twisting an ankle in their attempt to
             | imitate their favorite animated superhero.
        
               | 867-5309 wrote:
               | oh yes, _Suipercideman_
        
             | IanCal wrote:
             | >but the specific ways they don't follow physics have very
             | deliberate intent behind them.
             | 
             | That is only true for well crafted things. There's plenty
             | of stuff that's just wrong for no reason beyond ease of
             | creation or lack of care about the output.
        
             | 1024core wrote:
             | Clearly you haven't seen any Bollywood movies:
             | https://youtu.be/PdvRwe39NCs
        
         | sdf4j wrote:
         | I grew up watching the Looney Tunes interpretation of physics and
         | turned out just fine.
        
           | artur_makly wrote:
           | these will be a lot less violent too ;-) for a little while
           | at least.
        
           | AyyEye wrote:
           | There's a big difference between cartoonishly incorrect and
           | uncanny valley plausibly correct.
        
             | fooker wrote:
             | There's a huge amount of such stuff in movies.
             | 
             | Special effects, weapons physics, unrealistic vehicles and
             | planes, or the classic 'hacking'.
        
               | mojuba wrote:
               | Yes but a movie is a movie whereas these AI-generated
               | videos will likely be used to replace stock footage in
               | other (documentary, promotional, etc.) contexts
        
               | ssl-3 wrote:
               | If the producer wants to publish bad physics, they get
               | bad physics.
               | 
               | If the producer wants to publish good physics, they get
               | good physics.
               | 
               | It doesn't matter if it is AI, CGI, live action, stop
               | motion, pen-and-ink animation, or anything else.
               | 
               | The output is whatever the production team wants it to
               | be, just as has been the case for as long as we've had
               | cinema (or advertising or documentaries or TikToks or
               | whatevers).
               | 
               | Nothing has changed.
        
               | mojuba wrote:
               | You don't have full control over AI-generated images
               | though, or not to the same extent producers have with
               | CGI.
               | 
               | There's a video on sora.com at the very bottom, with
               | tennis players on the roof, notice how one player just
               | walks "through" the net. I don't think you can fix this
               | other than by just cutting the video earlier.
        
               | ssl-3 wrote:
               | >You don't have full control over AI-generated images
               | though,
               | 
               | So the AI just publishes stuff on my behalf now?
               | 
               | No, comrade.
        
               | evilduck wrote:
               | There's already techniques for controlling AI generated
               | images. There's ControlNet for Stable Diffusion and there
               | are already techniques to take existing footage and
               | style-morphing it with AI. For larger budget productions
               | I would anticipate video production tooling to arise
               | where directors and animators have fine grained influence
               | and control over the wireframes within a 3D scene to
               | directly prevent or fix issues like clipping, volumetric
               | changes, visual consistency, text generation, gravity,
               | etc. Or even just them recording and producing their
               | video in a lower budget format and then having it re-
               | rendered with AI to set the style or mood but adhering to
               | scene layout, perspective, timing, cuts, etc. Not just
               | for mitigating AI errors but also just for controlling
               | their vision of the final product.
               | 
               | Or they could simply brute force it by clipping the scene
               | at the problem point and have it try, try again with
               | another re-render iteration from that point until it's no
               | longer problematic. Or just do the bulk of the work with
               | AI and do video inpainting for small areas to fix or
               | reserve the human CGI artists for fixing unmitigatable
               | problems that crop up if they're fixable without full re-
               | rendering (whichever probably ends up less expensive).
               | 
               | Plus with what we've recently seen with world models that
               | have been released in the last week or so, AI will soon
               | get better at having a full and accurate representation
               | of the world it creates and future generations of this
               | technology beyond what Sora is doing simply won't make
               | these mistakes.
        
               | zoover2020 wrote:
               | Yet, in a movie setting it's clear something is a special
               | effect or the like, which is not the case for GenAI. Massive
               | underestimation of the potential impact in this thread,
               | scary.
        
               | brookst wrote:
               | Maybe. Or maybe some people massively underestimate our
               | ability to cope with fiction and new media types.
               | 
               | I am sure that there were people decrying radio for all
               | these same reasons ("how will _the children_ know that
               | the voices aren't people in the same room?")
        
               | ics wrote:
               | There's also a huge difference in what people, even
               | children, expect when sitting down to watch a movie
               | versus seeing a clip of some funny cat/seal hybrid
               | playing football while I'm looking for the Bluey episode
               | we left off on. My daughter is almost five and cautiously
               | asks "is that real?" about a lot of things now. It
               | definitely makes me work harder when trying to explain
               | the things that don't look real but actually are; one
               | could definitely feel like it takes some of the magic
               | away from moments. I feel alright in my ability to handle
               | it, it's my responsibility to try, but it isn't as simple
               | as the Looney Tunes argument or, I believe, dramatic
               | effects in movies and TV.
        
               | kube-system wrote:
               | Not a bad point, those representations have, in some
               | cases, caused widespread misunderstandings among people
               | who learn about those concepts from movies... and this is
               | all while simultaneously knowing "it's just a movie".
        
               | eddieroger wrote:
               | People don't watch The Matrix expecting a documentary on
               | how we all got plugged in. If someone generated the
               | referenced ladybug movie for use in a science classroom,
               | that's a problem.
        
               | fooker wrote:
               | I agree. The issue is in using it for teaching science
               | though, not in generating it.
               | 
               | Similar to how it's fine to create fiction, but not to
               | claim it to be true.
        
             | gmuslera wrote:
             | Did you see the movie Battleship? Or a good percentage of
             | recent and not-so-recent action movies? At least The Matrix
             | could be argued to be about a virtual reality.
        
           | ma2t wrote:
           | "A body at rest remains at rest until it looks down and
           | realizes it has stepped off of a cliff."
        
         | sdenton4 wrote:
         | Between omnipresent cgi in movies and tv, animation, and video
         | game physics (all of which are human-coded approximations of
         | real physics, often intentionally distorted for various
         | reasons), that ship has long since sailed.
        
           | dowager_dan99 wrote:
           | no one is shooting blockbuster-grade CGI for stock footage
           | though; the casualness of this is what will be the most
           | impactful
        
         | CyberDildonics wrote:
         | _A little worried how young children watching these videos may
         | develop inaccurate impressions of physics in nature._
         | 
         | Pretty sure cartoons and action movies do that already, until
         | YouTube videos of attempted stunts show what reality looks
         | like.
        
         | spicymaki wrote:
         | I know this sounds judgmental, but this reminds me of the idiom
         | "touch grass". Children should be outdoors observing real life
         | and not be consuming AI slop. You are not overthinking this,
         | this will most likely be bad for children and everyone in the
         | long run.
        
         | bparsons wrote:
         | I don't think you are overthinking it.
         | 
         | Facebook seems full of older people interacting with AI
         | generated visual content who don't seem to understand that it
         | is fake.
         | 
         | Our society already had a problem with people (not)
         | participating in consensus reality. This is going to pour
         | gasoline on the fire.
        
         | skybrian wrote:
         | Yes, entertainment spreads lots of myths. But bad physics from
         | AI movies is only a tiny part of the problem. This is similar
         | to worries about the misconceptions people might get from
         | playing too many video games, reading too many novels, watching
         | too much TV, or participating too much in social media.
         | 
         | It helps somewhat that people are fairly aware that
         | entertainment is fake and usually don't take it too seriously.
        
         | gruntbuggly wrote:
         | Fair! I watched a lot of Superman as a kid and I killed myself
         | jumping off a building
        
           | dylan604 wrote:
           | Don't be an asshole. When learning to fly, learn by starting
           | on the ground first, not from a tall building. --Bill Hicks
        
         | anonu wrote:
         | > inaccurate impressions of physics
         | 
         | Or just inaccurate impressions of the physical world.
         | 
         | My young kids and I happened to see a video of some very cute
         | baby seals jumping onto a boat. It was not immediately clear it
         | was AI-generated, but after a few runs I noticed it was a bit
         | too good to be true. The kids would never have known otherwise.
        
         | throwawayian wrote:
         | Don't worry, you are.
        
         | TeMPOraL wrote:
         | Me too. While I'm generally optimistic about generative art, at
         | this point the models still have this dreamlike quality; things
         | look OK at first glance, but you often get the feeling
         | something is off. Because it is. Texture, geometry, lights,
         | shadows, effects of gravity, etc. are more or less
         | inconsistent.
         | 
         | I do worry that, as we get exposed more and more to such art,
         | we'll become less sensitive to this feeling, which effectively
         | means we'll become less calibrated to actual reality. I worry
         | this will screw with people's "system 1" intuitions long-term
         | (but then I can't say exactly how; I guess we'll find out soon
         | enough).
        
         | juddlyon wrote:
         | YouTube Shorts are full of AI animal videos with distorted
         | proportions, living in the wrong habitat, and so on. They
         | popped up on my son's account and I hate them for the reasons
         | you outline. They aren't cartoonish enough to explain away, nor
         | realistic enough to be educational.
        
           | jonpo wrote:
           | And have you watched the brain rot that is TikTok?
        
         | t0bia_s wrote:
         | The young generation that grows up with these tools will
         | have a completely different approach to anything virtual.
         | Remember how people thought the camera stole part of their
         | soul when they saw themselves copied in a picture?
        
         | Terr_ wrote:
         | > A little worried how young children watching these videos may
         | develop inaccurate impressions of physics in nature.
         | 
         | I'm less concerned with physics for children--assuming they get
         | enough time outdoors--and more about adulthood biases and
         | media-literacy.
         | 
         | In particular, a turbocharged version of a problem we already
         | have: People grow up watching movies and become subconsciously
         | taught that _flaws_ of the creation pipeline (e.g. lens flare,
         | depth of field) are signs of  "realism" in a general sense.
         | 
         | That manifests in things such as video-games where your human
         | character somehow sees the world with crappy video-cameras for
         | eyes. (Excepting a cyberpunk context, where that would actually
         | make sense.)
        
         | jstummbillig wrote:
         | > Hopefully I'm just overthinking it.
         | 
         | I think it's unnecessary to worry about obviously bad stuff in
          | nascent and rapidly developing technology. The people who spent the
         | most time with it (the developers) are aware of the obviously
         | bad stuff and will work to improve it.
        
         | hash07e wrote:
         | Yes, Bugs Bunny and Wile E. Coyote harmed our physics.
        
         | raincole wrote:
         | > A little worried how young children watching these videos may
         | develop inaccurate impressions of physics in nature.
         | 
         | And why don't we worry this about CGI?
         | 
         | CGI is not always made with a full physical simulation, and is
         | not always intended to accurately represent real-world physics.
        
         | andrewstuart wrote:
         | Kids are fine with fiction.
        
         | mike_hearn wrote:
         | AI physics isn't worth worrying about compared to other
         | inaccurate things kids see in movies. It doesn't seem to hurt
         | them.
         | 
         | If you really want something to worry about, consider that
         | movies regularly show pint-sized women successfully drop
         | kicking men significantly bigger than themselves in ways that
         | look highly plausible but aren't. It's not AI but it violates
         | basic laws of biology and physics anyway. Teaching girls they
         | can physically fight off several men at once when they aren't
         | strong enough to do that seems like it could have pretty
         | dangerous consequences, but in practice it doesn't seem to
         | cause problems. People realize pretty quick that movie physics
         | isn't real.
        
         | SaintSeiya wrote:
         | Don't be; misinterpretations of physical laws are very quick
         | to correct with a reality check. I'm more worried for kids who
         | have to learn how the world works through a screen. Just let
         | them play outside and interact with other kids and nature. Let
         | them fall and cry, and scratch and itch; it will make them
         | stronger and healthier adults.
        
         | WesolyKubeczek wrote:
         | You are not overthinking it. Moreover, text LLMs have the same
         | problem in that they are almost good. Almost. Which is what
         | gives me the creeps.
        
         | uludag wrote:
         | Here's the obligatory AI enthusiast answer:
         | 
         | What is physics besides next token/frame prediction? I'm not
         | sure these videos deserve the label "inaccurate" as who's to
         | judge what way of generating next tokens/frames is better? Even
         | if you judge the "physical" world to be "better", I think
         | it's much more harmful to teach young children to be skeptical
         | of AI as their futures will depend on integrating them in their
         | lives. Also, with enough data, such models will not only match,
         | but probably exceed "real-physics" models in quality, fidelity,
         | and speed.
        
         | cryptoegorophy wrote:
         | I am not sure whether you have kids, but you are in for a big
         | surprise if you don't. Watching videos =/= real life.
        
         | 8note wrote:
         | I wouldn't expect young children to learn how to walk by
         | watching people walk on a screen, regardless of whether it's a
         | real person walking or an AI animation.
         | 
         | The real world gives way more stimulus.
         | 
         | Watching the animations might help them play video games, but
         | I again imagine that the feedback is what will do the real
         | job.
         | 
         | Even for the real ladybug video, who says the behaviour on
         | screen is similar to what a typical ladybug does? If it's on
         | video, the ladybug was probably doing something weird and
         | unexpected.
        
       | unraveller wrote:
       | Is it better, or just better at distracting from its flaws and
       | turning them to advantage? I only see repulsive mouth movements
       | that induce fear, face coverings that hide the uncanniness, and
       | a dreamy physics sim that distracts. Not so out of place in
       | present-day Hollywood, but never any coherence of feeling.
        
       | jack_pp wrote:
       | I'm surprised they put in 2 legged poodles
        
       | goykasi wrote:
       | A bit off-topic, but how much does a 4-letter (or less) .com go
       | for these days? I wonder if they bought this via an intermediary
       | so that the seller wouldn't see "OpenAI" and tack on a few zeros.
       | 
       | edit: previously, this thread pointed to sora.com
        
         | silvestrov wrote:
         | His review video is so much better than the announcement video
         | at explaining what has been released.
        
         | geor9e wrote:
         | Pretty off-topic, but yes, domains and land are often bought
         | via shell companies for this reason. OpenAI bought chat.com for
         | 8 figures previously.
        
       | wslh wrote:
       | "We are currently experiencing heavy loads..."
        
       | inoffensivename wrote:
       | Great. In a world awash with disinformation, we're making it
       | easier to create even more of it.
       | 
       | I don't see any good coming from tools like these.
        
       | sergiotapia wrote:
       | sorry for the tangent: can't remember a launch they've had where
       | you could just use it. it's always "rollout", "later this
       | quarter", "select users", what's the deal here?
       | 
       | it's given openai this tinge to me that i probably won't ever
       | manage to forget.
        
       | m3kw9 wrote:
       | A minimum-settings video (480p, 5s, 1:1) took an hour. Servers
       | getting cooked.
        
       | vinni2 wrote:
       | Account creation currently unavailable
        
       | adregan wrote:
       | Why keep building AI to do the things that people find fun to do
       | rather than the mundane bullshit? All we'll be left with is
       | cleaning, folding laundry, and doing the dishes while AI does all
       | the interesting things.
        
         | amelius wrote:
         | Because we don't have as much data about mundane bullshit.
        
       | siliconc0w wrote:
       | I may be the only one but this kinda breaks my brain in that I
       | notice weird physics anomalies in these but then I start to look
       | for those in non-AI produced video and start to question
       | everything. Hopefully this is a short-term situation.
        
       | okdood64 wrote:
       | So when's the lawsuit from Google coming?
        
       | andrewstuart wrote:
       | So we are now a few years into the AI video thing.
       | 
       | I'm curious to know - is it actually useful for real world tasks
       | that people/companies need videos for?
        
       | system2 wrote:
       | I wonder when AI images and videos will become remotely useful
       | and easy to create. These are still weird and of garbage
       | quality.
        
       | lossolo wrote:
       | If we take HunyuanVideo, which is similar to Sora, as an example,
       | they state that generating a 5-second video requires 5 minutes on
       | 8xH100 GPUs. Therefore, if 10,000 users simultaneously want to
       | generate a 5-second video within the same 5-minute window, you
       | would need 80,000 H100 GPUs, which would cost around 2 billion
       | USD in GPUs alone.
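       | 
       | A back-of-the-envelope sketch of that arithmetic (the ~$25k unit
       | price per H100 is an assumption; it is roughly what makes the ~2
       | billion USD figure work out):

```python
# Capacity/cost estimate for concurrent video generation, using the
# HunyuanVideo figures quoted above (5 min on 8x H100 per 5 s clip).
GPUS_PER_JOB = 8            # one job monopolizes 8 H100s
CONCURRENT_USERS = 10_000   # all wanting a clip in the same 5-min window
H100_PRICE_USD = 25_000     # assumed unit price (hypothetical)

# Each job holds its GPUs for the whole 5-minute window, so concurrent
# jobs cannot share hardware.
gpus_needed = CONCURRENT_USERS * GPUS_PER_JOB
hardware_cost_usd = gpus_needed * H100_PRICE_USD

print(f"{gpus_needed:,} H100s, ~${hardware_cost_usd / 1e9:.0f}B in GPUs")
# → 80,000 H100s, ~$2B in GPUs
```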
        
       | IanCal wrote:
       | Not available in
       | 
       | > the United Kingdom, Switzerland and the European Economic Area.
       | We are working to expand access further in the coming months
       | 
       | Excellent to announce this lack of access after the launch of
       | Pro. At least I have no business need for Sora, so it's not much
       | of a loss, but annoying nonetheless.
        
       | aglione wrote:
       | OK, so GPT Pro with some extra power, and Sora. This means that
       | GPT-5 and, generally speaking, AGI can wait.
        
       | advael wrote:
       | People really worry about fake video and images and whatever but
       | I have to say, the correct heuristics both already exist and have
       | existed for a long time:
       | 
       | 1. Anything on the internet can be fake
       | 
       | 2. Trust is interpersonal, and trusting content should be
       | predicated first and foremost on trusting its source to not
       | deceive you
       | 
       | This is imperfect but also the best people ever really do in the
       | general case, and just orders of magnitude better than most
       | people are currently doing
       | 
       | The issue isn't models like this, it's that people are eating a
       | ton of information but have been strongly encouraged to be
       | credulous, and the lion's share of that training is directly coming
       | from the tech grift industrial complex
       | 
       | I wouldn't even say this is the most compelling kind of tool for
       | plausible-looking disinformation out there by a long shot for the
       | record, but without actually examining why people are gullible,
       | no technology is going to make people's acceptance of fiction as
       | fact substantially worse, or better, really. Scams
       | target people on the order of their life savings every day and
       | there are robust technologies and protocols for vetting
       | communications, but people have to know to use them, care to use
       | them, and be able to use them, for that to matter at all
        
       | jiggawatts wrote:
       | "The version of Sora we are deploying has many limitations. It
       | often generates unrealistic physics and struggles with complex
       | actions over long durations. Although Sora Turbo is much faster
       | than the February preview, we're still working to make the
       | technology affordable for everyone."
       | 
       | So they demo the full model and release the quantised and
       | censored model.
       | 
       | Does anyone else find this kind of bait & switch distasteful?
        
         | mewpmewp2 wrote:
         | Maybe, but alternative would be to not demo results with state
         | of the art processing at all, which I wouldn't like either.
        
       | jedberg wrote:
       | > We're introducing our video generation technology now to give
       | society time to explore its possibilities and co-develop norms
       | and safeguards that ensure it's used responsibly as the field
       | advances.
       | 
       | That's an interesting way of saying "we're probably gonna miss
       | some stuff in our safety tools, so hopefully society picks up the
       | slack for us". :)
        
         | FrustratedMonky wrote:
         | "to give society time to explore its possibilities and co-
         | develop norms and safeguards"
         | 
         | Or, "this safety stuff is harder than we thought, we're just
         | going to call 'tag you're it' on society"
         | 
         | Or,
         | 
         | -Oppenheimer : speaking "man, this nuclear safety stuff is
         | hard, I'm just going to put it all out there and let society
         | explore developing norms and safeguards".
         | 
         | -Society : Bombs Japan
         | 
         | -Oppenheimer : "No, not like that, oops".
        
           | Arnt wrote:
           | Aren't you kind of saying that you don't have any answers so
           | therefore OpenAI should have provided the answers?
        
           | usrnm wrote:
           | Oppenheimer was making a bomb from day 1, he knew exactly
           | what he was doing and how it would be used. There aren't so
           | many different use cases for a bomb, after all. It was a nice
           | movie, but it does not absolve him
        
           | xvector wrote:
           | Eh, society did a pretty good job overall.
           | 
           | The bomb was the end of conventional warfare between nuclear
           | nations. MAD has created an era of peace unlike anything our
           | species has ever seen before.
        
             | rurp wrote:
              | Well, it works great until it doesn't. We're perpetually
              | a few bad decisions by a few possibly deranged actors
              | away from obliterating all of those gains and then some.
        
         | jsheard wrote:
         | Flashbacks to when they were cagey about releasing the GPT
         | models because they could so easily be used for spam, and then
         | just pretended not to see all the spam their model was making
         | when they did release it.
         | 
         | If you happen to notice a Twitter spam bot claiming to be "an
         | AI language model created by OpenAI", know that we have
         | conducted an investigation and concluded that no you didn't.
         | Mission accomplished!
        
         | nostromo wrote:
         | The irony is that users want more freedom and fewer safeguards.
         | 
         | But these companies are rightfully worried about regulators and
         | legislatures, often led by pearl-clutching journalists, so we
         | can't have nice things.
        
           | DFHippie wrote:
           | Recent events (many events in many places) show "users" don't
           | think too hard before acting. And sometimes they act with
           | inadequate or inaccurate information. If we want better
           | outcomes, it behooves us to hire people to do the thinking
           | that ordinary users see no point in doing for themselves. We
           | call the people doing the hard thinking scientists,
           | regulators, and journalists. The regulators, when empowered
           | to do so by the government, can stop things from happening.
           | The scientists and journalists can just issue warnings.
           | 
           | Giving people what they want when they want it doesn't always
           | lead to happy outcomes. The people themselves, through their
           | representatives, have created the institutions that sometimes
           | put a brake on their worst impulses.
        
         | nicbou wrote:
         | The onus will be on the rest of society to defend itself from
         | all the grift that will result from this.
        
           | pesus wrote:
           | If the worst we ultimately get from this kind of tech is
           | grifting, I will consider that a very positive outcome.
        
         | miohtama wrote:
         | Users, not tools, should be judged.
         | 
         | It is unlikely anyone is going to perform an act of terrorism
         | with this, or any kind of deep fakes that buy Easter European
         | elections. The worst outcome is likely teens having a laugh.
        
           | observationist wrote:
           | Funny how all the negative uses to which something like this
           | might be put are regulated or criminalized already - if you
           | try to scam someone, commit libel or defamation, attempt
           | widespread fraud, or any of a million nefarious uses, you'll
           | get fined, sued, or go to jail.
           | 
           | Would you want Microsoft to claim they're responsible for the
           | "safety" of what you write with Word? For the legality of the
           | numbers you're punching into an Excel spreadsheet? Would you
           | want Verizon keeping tabs on every word you say, to make sure
           | it's in line with their corporate ethos?
           | 
           | This idea that AI is somehow special, that they absolutely
           | must monitor and censor and curtail usage, that they claim
           | total responsibility for the behavior of their users -
           | Anthropic and OpenAI don't seem to realize that they're the
           | bad guys.
           | 
           | If you build tools of totalitarian dystopian tyranny,
           | dystopian tyrants will take those tools from you and use
           | them. Or worse yet, force your compliance and you'll become
           | nothing more than the big stick used to keep people cowed.
           | 
           | We have laws and norms and culture about what's ok and what's
           | not ok to write, produce, and publish. We don't need
           | corporate morality police, thanks.
           | 
           | Censorship of tools is ethically wrong. If someone wants to
           | publish things that are horrific or illegal, let that person
           | be responsible for their own actions. There is absolutely no
           | reason for AI companies to be involved.
        
             | 8note wrote:
             | That works for locally hosted models, but if it's offered
             | as a service, OpenAI is publishing those verboten works to
             | you, the person who requested them.
             | 
             | Even if it is a local model, if you trained a model to
             | spew Nazi propaganda, you're still publishing Nazi
             | propaganda to the people who then go use it to make
             | propaganda. It's just very summarized propaganda.
        
               | gus_massa wrote:
               | Does this apply to the spell checker in Office 365 or
               | Google Docs?
        
               | jimkleiber wrote:
               | Are hunting knives regulated the same way as rocket
               | launchers? Both can be used to kill but at much different
               | intensity levels.
        
             | nlehuen wrote:
             | You are posting this under a pseudonym. If you published
             | something horrific or illegal, it would be the
             | responsibility of this web site to censor your content
             | and/or identify you when asked by the authorities. Which
             | would you prefer?
        
               | xvector wrote:
               | This website is not a tool - not really.
               | 
               | Your keyboard is.
               | 
               | Censoring AI generation itself is very much like
               | censoring your keyboard or text editor or IDE.
               | 
               | Edit: Of course, "literally everything is a tool", yada
               | yada. You get what I mean. There is a meaningful
               | difference between tools that translate our thoughts to
               | a digital medium (keyboards) and tools that share those
               | thoughts with others.
        
               | jimkleiber wrote:
               | A website is almost certainly a tool. It has servers and
               | distributes information typed on thousands of keyboards
               | to millions of screens.
        
               | skydhash wrote:
               | HN is the one doing the distribution, not the user. The
               | latter is free to type whatever they want, but they are
               | not entitled to have HN distribute their words. Just as
               | a publisher does not have to publish a book he doesn't
               | want to.
        
               | gardenhedge wrote:
               | When someone posts on FB, they don't consider that FB is
               | publishing their content for them
        
               | do_not_redeem wrote:
               | > when asked by authorities
               | 
               | Key point right here.
               | 
               | You let people post what they will, and if the
               | authorities get involved, cooperate with them. HN should
               | not be preemptively monitoring all comments and making
               | corporate moralistic judgments on what you wrote and
               | censoring people who mention Mickey Mouse or post song
               | lyrics or talk about hotwiring a car.
               | 
               | Why shouldn't OpenAI do the same?
        
               | 9rx wrote:
               | It seems reasonable to work with law enforcement if
               | information provides details about a crime that took
               | place in the real world. I am not sure what purpose
               | censoring as a responsibility would serve? Who cares if
               | someone writes a fictional horrific story? A site like
               | this may choose to remove noise to keep the quality of
               | the signal high, but preference and responsibility are
               | not the same.
        
             | pyrale wrote:
             | > Would you want Microsoft to claim they're responsible for
             | the "safety" of what you write with Word? For the legality
             | of the numbers you're punching into an Excel spreadsheet?
             | Would you want Verizon keeping tabs on every word you say,
             | to make sure it's in line with their corporate ethos?
             | 
             | Would you want DuPont to check the toxicity of Teflon
             | effluents they're releasing in your neighbourhood? That's
             | insane. It's people's responsibility to make sure that they
             | drink harmless water. New tech is always amazing.
        
               | nightski wrote:
               | Yes, because we know a.) that the toxicity exists and b.)
               | how to test for it.
               | 
               | There is no definition of a "safe" model without
               | significant controversy nor is there any standardized
               | test for it. There are other reasons why that is a
               | terrible analogy, but this is probably the most
               | important.
        
               | miohtama wrote:
               | What's politically acceptable is called the Overton
               | window. Unlike toxicity, it is fully subjective.
               | 
               | https://en.m.wikipedia.org/wiki/Overton_window
        
             | bayindirh wrote:
             | Maybe you should talk with image editor developers,
             | copier/scanner manufacturers and governments about the
             | safeguards they shall implement to prevent counterfeiting
             | money.
             | 
             | Because, at the end of the day, counterfeiting money is
             | already illegal.
             | 
             | ...and we should not censor tools, and judge people, not
             | the tools they use.
        
               | mayukh wrote:
               | So guns are ok? How about bombs?
        
               | rixed wrote:
               | Interestingly, you must know that any printing equipment
               | good enough to output realistic banknotes is regulated
               | to embed a protection preventing this use case.
               | 
               | Even more interestingly, and maybe this helps show that
               | even the most principled argument should have a limit:
               | molecular 3D printers able to reproduce proteins (yes,
               | this is a thing) are regulated to recognise designs from
               | a database of dangerous pathogens and refuse to print
               | them.
        
               | miohtama wrote:
               | Gimp doesn't have the secret binary blob to "prevent
               | counterfeiting" and there is no flood of forged money
               | 
               | https://www.reddit.com/r/GIMP/comments/3c7i55/does_gimp_h
               | ave...
        
           | AntiEgo wrote:
           | "Teens having a laugh" can escalate quickly to, "... at
           | someone else's expense," and this distinction is EXACTLY the
           | sort of subtlety an algorithm can't filter.
           | 
           | This does not need to become a thread about bullying and self
           | harm, but it should be recognized that this example is not
           | benign or victimless.
           | 
           | This genie is out of the bottle, let us hope that laws about
           | users are enough when the tools evolve faster than
           | legislative response.
           | 
           | [edit:spelling]
        
           | thordenmark wrote:
           | Exactly. You can make anything you want in Photoshop, Word,
           | Excel, Blender, etc. The company isn't held accountable for
           | what the User makes with it.
        
             | jimkleiber wrote:
             | Yes and one could kill a hundred people with their fists,
             | but we regulate super powerful weapons more than fists.
             | 
             | I think the degree of power matters.
        
           | miltonlost wrote:
            | > It is unlikely anyone is going to perform an act of
            | terrorism with this, or any kind of deep fakes that buy
            | Easter European elections. The worst outcome is likely
            | teens having a laugh.
           | 
           | And the teens are having a laugh by... creating deepfake
           | nudes of their classmates? The tools are bad, and the
           | toolmakers should feel deep guilt and shame for what they
           | released on the world. Do you not know the story of Nobel and
           | dynamite? Technology must be paired with morality.
        
             | Aeolun wrote:
             | Technology _is_ paired with morality. It's just not the one
             | you want.
        
               | botanical76 wrote:
               | Is it? It seems to me to be paired with shareholders'
               | interests, and nothing more.
        
             | miohtama wrote:
             | I am sure a school has a way to deal with pupils sharing
             | such images, as the recent cases have proven. Deep fakes
             | or real pictures, it is a social problem with an existing
             | framework backed by decades of proven history, and it
             | should be dealt with accordingly.
        
           | timeon wrote:
           | > or any kind of deep fakes that buy Easter European
           | elections
           | 
           | Finally people do not label Slovakia as Eastern Europe...
        
           | tshaddox wrote:
           | There are certain tools for which we heavily restrict which
           | users have access to the entire supply chain. That's still
           | about users, I suppose, but it's also about tools.
        
             | miohtama wrote:
             | In China, the whole Internet is heavily restricted. Bad
             | tools.
        
           | ClumsyPilot wrote:
           | > no one is going to perform act of terrorism with this
           | 
            | Especially a certain someone that's worth a billion dollars,
            | is 100 years old, and whose name ends with Inc.
        
         | sleepybrett wrote:
         | 'when civilization collapses because all photo, audio and video
         | evidence is 100% suspect, i mean, how could you blame us'
        
         | jstummbillig wrote:
         | Do we not want new stuff? If the answer is "Sure, but only if
           | whoever invents the stuff does all the work and finds all the
           | rough edges", then the answer is actually just "No, thanks".
        
           | jedberg wrote:
           | Oh, I have no problem with them doing it this way. I just
           | thought it was a funny way to do it.
        
         | 123yawaworht456 wrote:
         | text, image, video, and audio editing tools have no 'safety'
         | and 'alignment' whatsoever, and skilled humans are far more
         | capable of creating 'unsafe' and 'unethical' media than
         | generative AI will ever be.
         | 
         | somehow, society has survived just fine.
         | 
         | the notion that generative AI tools should be 'safe' and
         | 'aligned' is as absurd as the notion that tools like Notepad,
         | Photoshop, Premiere and Audacity should exist only in the
         | cloud, monitored by kommissars to ensure that proles aren't
         | doing something 'unsafe' with them.
        
         | pyrale wrote:
         | "We're releasing this like rats on a remote island, in hopes of
         | seeing how the ecosystem is going to respond".
        
         | raincole wrote:
         | The problem isn't whether we should regulate AI. It's whether
         | it's even possible to regulate them without causing significant
         | turmoil and damage to the society.
         | 
         | It's not hyperbole. Hunyuan was released before Sora. So
         | regulating Sora does absolutely nothing unless you can regulate
         | Hunyuan, which is 1) open source and 2) made by a Chinese
         | company.
         | 
         | How do we expect the US govt to regulate that? Threatening to
         | sanction China unless they stop doing AI research???
        
           | ssl-3 wrote:
           | Easy-peasy. Just require all software to be cryptographically
           | signed, with a trusted chain that leads to a government-
           | vetted author, and make that author responsible for the
           | wrongdoings of that software's users.
           | 
           | We're most of the way there with "our" locked-down, walled-
           | garden pocket supercomputers. Just extend that breadth and
           | bring it to the rest of computing using the force of law.
           | 
           | ---
           | 
           | Can I hear someone saying something like "That will never
           | work!"?
           | 
           | Perhaps we should meditate upon that before we leap into any
           | new age of regulation.
        
         | Nition wrote:
         | "Climate Change is likely to mean more fires in the future, so
         | we've lit a small fire at everyone's house to give society time
         | to co-develop norms and safeguards."
        
         | soheil wrote:
         | Especially since they were originally supposed to be a non-
         | profit focused on AI safety, before Sam Altman single-handedly
         | pivoted to a for-profit after taking all the donations, and
         | partnered with probably the single most evil corporation that
         | has ever existed, Microsoft.
        
       | vinay_ys wrote:
       | co-develop := we are in f*** around and find out mode, please
       | bear with us.
        
       | karmasimida wrote:
       | I am not impressed by it at all ... Is it actually better than
       | the competitors?
        
       | belter wrote:
       | "...I felt a great disturbance in the algorithm... as if millions
       | of influencers, OnlyFans stars, and video creators suddenly cried
       | out in terror..."
        
       | submeta wrote:
       | What I desperately need is a model that generates perfectly made
       | PowerPoint slides. I have to create many presentations for
       | management, and it's a very time consuming task. It's easy to
       | outline my train of thoughts and let an LLM write the full text,
       | but then to create a convincing presentation slide by slide takes
       | days.
       | 
       | I know there is Beautiful.ai or Copilot for PowerPoint, but none
       | of the existing tools really work for me because the results and
       | the user flow aren't convincing.
        
         | buzzy_hacker wrote:
         | Have you checked out Marp? https://marp.app/
         | 
         | Basically it generates slides from markdown, which is great
         | even without LLMs. But you can get LLMs to output in
         | markdown/Marp format and then use Marp to generate the slides.
         | 
         | I haven't looked into complicated slides, but works well for
         | text-based ones.
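         | 
         | As a sketch of that pipeline (the "marp: true" front matter
         | and "---" slide separators are standard Marp syntax; the
         | outline format and helper function here are made up for
         | illustration):

```python
# Sketch: turn an LLM-produced outline into a Marp markdown deck.
# Marp treats "---" as a slide separator, and the "marp: true" front
# matter enables Marp processing. outline_to_marp is a hypothetical
# helper, not part of Marp itself.

def outline_to_marp(title: str, slides: list[tuple[str, list[str]]]) -> str:
    """Each (heading, bullets) pair becomes one slide."""
    parts = ["---\nmarp: true\n---", f"# {title}"]
    for heading, bullets in slides:
        body = "\n".join(f"- {b}" for b in bullets)
        parts.append(f"## {heading}\n\n{body}")
    return "\n\n---\n\n".join(parts)

deck = outline_to_marp("Q4 Review", [
    ("Wins", ["Shipped v2", "Churn down 3%"]),
    ("Risks", ["Hiring freeze in engineering"]),
])
print(deck)
# Then render with marp-cli, e.g.:  marp deck.md -o deck.pptx
```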
        
         | ghita_ wrote:
         | There is a YC company that does that, I think:
         | https://www.rollstack.com/. I've never used them, but I think
         | they have many satisfied customers; maybe worth a shot!
        
         | jmugan wrote:
         | I want something that can take my ugly line drawing and make it
         | a cool-looking line drawing without distorting the main idea.
        
         | ShakataGaNai wrote:
         | Never used it but seen it mentioned in that space:
         | https://gamma.app/
        
       | brcmthrowaway wrote:
       | Sora? More like r/ShittyHDR
        
       | TiredOfLife wrote:
       | "I've come up with a set of rules that describe our reactions to
       | technologies:
       | 
       | 1. Anything that is in the world when you're born is normal and
       | ordinary and is just a natural part of the way the world works.
       | 
       | 2. Anything that's invented between when you're fifteen and
       | thirty-five is new and exciting and revolutionary and you can
       | probably get a career in it.
       | 
       | 3. Anything invented after you're thirty-five is against the
       | natural order of things."
       | 
       | -- Douglas Adams, The Salmon of Doubt: Hitchhiking the Galaxy One
       | Last Time
        
       | SillyUsername wrote:
       | No integration with ChatGPT is a lost opportunity and shows a
       | lack of joined-up thinking, in every sense of that phrase.
       | Demonstrations, helping people with learning difficulties
       | visualise things, educational purposes, storytelling...
        
       | esskay wrote:
       | Great. More tools to continue the enshitification of everything
       | on the web.
        
       | matthewmorgan wrote:
       | "Sora is not available in The United Kingdom yet". Available
       | elsewhere, from Albania to Zimbabwe. Any particular reason why?
        
       | dcchambers wrote:
       | Meh. It's a cool POC and immediately useful for abstract imagery,
       | but not for anything realistic.
       | 
       | Looking forward to the onslaught of AI-generated slop filling
       | every video feed on the Internet. Maybe it's finally what's going
       | to kill things like TikTok, YT Shorts, Reels, etc. One can
       | hope...anyway.
        
       | ngd wrote:
       | What's next, Tiagra?
        
       | mkaic wrote:
       | A friendly reminder: if you have tech-illiterate people in your
       | life (parents, grandparents, friends, etc), _please_ reach out to
       | them and inform them about advances in AI text, image, audio, and
       | (as of very recently) video generation. Many folks are not aware
       | of what modern algorithms are capable of, and this puts them at
       | risk. GenAI makes it easier and cheaper than ever for bad actors
       | to create targeted, believable scams. Let your loved ones know
       | that it is possible to create believable images, audio, and
       | videos which may depict anything from "Politician Says OUTRAGEOUS
       | Thing!" to "a member of your own family is begging you for
       | money." The best defense you can give them is to make them aware
       | of what they're up against. These tools are currently the worst
       | they will ever be, and their capabilities will only grow in the
       | coming months and years. They are _already_ widely used by
       | scammers.
        
       | kylehotchkiss wrote:
       | Who is the audience for this product? A lot of people like video
       | because it's a way of experiencing something they currently
       | cannot for one reason or another. People don't want to see
       | arbitrary fake worlds or places on earth that aren't real. Unless
       | it's a video game or something. But I see this product being used
       | primarily to trick Facebook users.
       | 
       | I guess the CGI industry implications are interesting, but look
       | at the waves behind the AI-generated man. They don't break so
       | much as dissolve into each other. There's always a tell. These
       | aren't GPU-generated versions of reality with thought behind the
       | effects.
        
         | danielbln wrote:
         | > People don't want to see arbitrary fake worlds or places on
         | earth that aren't real.
         | 
         | Isn't there a multi-billion dollar industry in California
         | somewhere that caters exactly to that demand?
        
           | Klonoar wrote:
            | _> Unless it's a video game or something._
           | 
           | The "or something" pretty much covers the gotcha you're
           | trying to use. OP is acknowledging that fantasy media is a
           | thing before going on to their actual point.
        
         | themagician wrote:
         | "People don't want to see arbitrary fake worlds or places on
         | earth that aren't real."
         | 
         | What? This is 90% of the Instagram/TikTok experience, and has
         | been for years. No one cares if something is real. They care
         | how it makes them feel.
         | 
         | The audience for this is every "creator" or "influencer". No
         | one cares if the content is fake. They'll sell you a vacation
         | package to a destination that doesn't exist and people will
         | still rate it 3/5 stars for a $15 Starbucks gift card.
        
         | jrflowers wrote:
         | > Who is the audience for this product?
         | 
         | Infants, people just coming out of anesthesia, the concussed,
         | the hypoxic, the mortally febrile and so on
        
         | sktrdie wrote:
         | To me this is what all AI feels like. People want "hard to
         | make" things because they feel special and out of the ordinary.
         | If anybody with a prompt can do it, it ain't gonna sell.
        
       | robomartin wrote:
       | Here's something I find interesting: We have multiple paid
       | accounts with OpenAI. In other words, we are paying customers. I
       | have yet to see a single announcement or new development that we
       | learn about through email. In most cases we learn these things
       | when they get covered by some online outfit, posted on HN, etc.
       | 
       | OpenAI isn't the only company that seems to act in this manner. I
       | find this to be interesting. Your paying customers actively want
       | to know about what you are doing and, more than likely, would
       | love to get a heads-up before the word goes out to the world.
       | Hearing about things from third parties can make you feel like a
       | company takes your business for granted or does not deem it
       | important enough to feed you news when it happens.
       | 
       | Another example of this is Kickstarter, although, their problem
       | is different. I have only ever backed technology projects on KS.
       | That's all I am interested in. And yet, every single email they
       | send is full of projects that don't even begin to approach my
       | profile (built over dozens of backed projects). As a result of
       | this, KS emails have become spam to be deleted without even
       | reading them. This also means I have not backed projects I would
       | have seriously considered and I don't frequent the site as much
       | as I used to.
       | 
       | Getting back on topic: It will be interesting to see how Sora
       | usage evolves.
        
       | matco11 wrote:
       | Forget video. Imagine what this is going to do for video-gaming.
        
       | azinman2 wrote:
       | Technically it's amazing that this is possible at all. Yet I
       | don't see how the world is better off for it on net. Aside from
       | eliminating jobs in FX/filming/acting/set design/etc, what do we
       | really gain? Amateur filmmakers can be more powerful? How about
       | we put the same money into a fund for filmmakers to access. The
       | negatives are plentiful, from the mundane reduction of our media
       | to monolithic simulacra to putting the nail in the coffin for
       | truth to exist unchallenged, let alone the 'fine tunes' that will
       | continue to come for deepfakes that are literal (sexual)
       | harassment.
       | 
       | Humans are not built for this power to be in the hands of
       | everyone with low friction.
        
       | khushy wrote:
       | I can't wait for the safety features because I know there are
       | those in society that would do bad things. But not me, though.
       | I'd like the unlocked version.
        
       | soheil wrote:
       | That's not how the world is supposed to work. I wonder if there
       | are going to be long-term psychological effects from being
       | exposed to videos like these regularly. If our neurons are unable
       | to receive a stable stream of reality, as we have for millions of
       | years, will our brains become dysfunctional over time?
        
       | nox101 wrote:
       | what in particular is better about this than
       | 
       | https://civitai.com/videos
        
       | ta2112 wrote:
       | The mammoths are walking over some pre-existing footprints, but
       | they don't leave any prints of their own. I guess I'm getting
       | hung up on little things. For a prompt of a few words, it looks
       | pretty nice!
        
       ___________________________________________________________________
       (page generated 2024-12-09 23:00 UTC)