[HN Gopher] Stable Diffusion 3
       ___________________________________________________________________
        
       Stable Diffusion 3
        
       Author : reqo
       Score  : 823 points
       Date   : 2024-02-22 13:20 UTC (9 hours ago)
        
 (HTM) web link (stability.ai)
 (TXT) w3m dump (stability.ai)
        
       | pqdbr wrote:
       | The sample images are absolutely stunning.
       | 
       | Also, I was blown away by the "Stable Diffusion" written on the
       | side of the bus.
        
         | kzrdude wrote:
          | Is it just me, or is the stable diffusion bus image broken in
          | the background? The bus back there doesn't look logical in
          | placement and size relative to the sidewalk.
        
       | PcChip wrote:
       | The text/spelling part is a huge step forward
        
       | gat1 wrote:
        | I guess we do not know anything about the training dataset?
        
         | _1 wrote:
         | It's ethical
        
           | kranke155 wrote:
           | "Ethical"
        
           | amirhirsch wrote:
           | The dataset is so ethical that it is actually just a press
           | release and not generally available.
        
           | wtcactus wrote:
           | Who decides what's ethical in this scenario? Is it some
           | independent entity?
        
             | potwinkle wrote:
             | I decided.
        
         | thelazyone wrote:
          | This is a good question - not only for the actual ethics of the
          | training, but for the future of AI use for art. It's gonna both
          | damage the livelihood of many artists (me included, probably)
          | and make it accessible to many more people. As long as the
          | training dataset is ethical, I think fighting it is hard and
          | pointless.
        
           | yreg wrote:
            | What data, in your view, would make the dataset unethical
            | vs. ethical?
        
       | satisfice wrote:
       | Can it make a picture of a woman chasing a bear?
       | 
       | The old one can't.
        
         | cheald wrote:
         | SD 1.5 (using RealisticVision 5.1, 20 steps, Euler A) spit out
         | something technically correct (but hilarious) in just a few
         | generations.
         | 
         | "a woman chasing a bear, pursuit"
         | 
         | https://i.imgur.com/RqCXVYC.png
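
          A minimal sketch of that kind of generation with the diffusers
          library, for reference. The checkpoint repo id and settings below
          are assumptions matching the comment above, not an official
          recipe:

              import torch
              from diffusers import (StableDiffusionPipeline,
                                     EulerAncestralDiscreteScheduler)

              # Load a RealisticVision 5.1 checkpoint (repo id assumed;
              # any SD 1.5-based checkpoint loads the same way).
              pipe = StableDiffusionPipeline.from_pretrained(
                  "SG161222/Realistic_Vision_V5.1_noVAE",
                  torch_dtype=torch.float16,
              ).to("cuda")

              # "Euler A" in most UIs corresponds to this scheduler.
              pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(
                  pipe.scheduler.config)

              image = pipe("a woman chasing a bear, pursuit",
                           num_inference_steps=20).images[0]
              image.save("woman_chasing_bear.png")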
        
       | kbumsik wrote:
       | So there is no license information yet?
        
       | alexb_ wrote:
       | > We believe in safe, responsible AI practices. This means we
       | have taken and continue to take reasonable steps to prevent the
       | misuse of Stable Diffusion 3 by bad actors. Safety starts when we
       | begin training our model and continues throughout the testing,
       | evaluation, and deployment. In preparation for this early
       | preview, we've introduced numerous safeguards. By continually
       | collaborating with researchers, experts, and our community, we
       | expect to innovate further with integrity as we approach the
       | model's public release.
       | 
       | What exactly does this mean? Will we be able to see all of the
       | "safeguards" and access all of the technology's power without
       | someone else's restrictions on them?
        
         | Tiberium wrote:
         | For SDXL this meant that there were almost no NSFW (porn and
         | similar) images included in the dataset, so the community had
         | to fine-tune the model themselves to make it generate those.
        
           | hhjinks wrote:
           | The community would've had to do that anyway. The SD1.5-based
           | NSFW models of today are miles ahead of those from just a
           | year ago.
        
             | Der_Einzige wrote:
             | And the pony SDXL nsfw model is miles ahead of SD1.5 NSFW
             | models. Thank you bronies!
        
         | sschueller wrote:
         | No worries, the safeguards are only for the general public.
         | Criminals will have no issues going around them. /s
        
           | SXX wrote:
            | Criminals? We don't care about those.
            | 
            | Think of the children! We must stop people from generating
            | porn!
        
       | willsmith72 wrote:
        | at this point, perfect text would be a gamechanger if it can be
        | solved
        | 
        | midjourney 6 can be completely photorealistic and include valid
        | text, but it also sometimes adds bad text. it's not much, but
        | having to use an image editor for that is still annoying. for
        | creating marketing material, getting perfect text every time and
        | never getting bad text would be amazing
        
         | falcor84 wrote:
         | I wonder if we could get it to generate a layered output, to
         | make it easy to change just the text layer. It already creates
         | the textual part in a separate pass, right?
        
           | deprecative wrote:
            | I would bet that Adobe is definitely salivating at that. It
            | might not be for a long time, but it seems like a no-brainer
            | once the technology can handle it. The last few years have
            | been fast - I worked in the JS landscape for a few years,
            | which moves faster than Sonic, and this tech iterates just
            | as quickly.
        
           | spywaregorilla wrote:
            | Current open source tools include pretty decent off-the-shelf
            | segment-anything-based detectors. It leaves a lot to be
            | desired, but you can do layer-like operations by automatically
            | detecting certain concepts and applying changes to them or,
            | less commonly, exporting the cropped areas. But not the
            | content "beneath" the layers, as it doesn't exist.
        
             | snovv_crash wrote:
             | Which tools would you recommend for this kind of thing?
        
               | spywaregorilla wrote:
               | comfyui + https://github.com/ltdrdata/ComfyUI-Impact-Pack
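
                A minimal sketch of the segment-anything side of that
                workflow, using Meta's segment_anything package. The
                checkpoint variant/path and the click coordinates are
                assumptions:

                    import numpy as np
                    import cv2
                    from segment_anything import (sam_model_registry,
                                                  SamPredictor)

                    # Load a SAM checkpoint (variant and path assumed;
                    # any official checkpoint works).
                    sam = sam_model_registry["vit_b"](
                        checkpoint="sam_vit_b_01ec64.pth")
                    predictor = SamPredictor(sam)

                    image = cv2.cvtColor(cv2.imread("generated.png"),
                                         cv2.COLOR_BGR2RGB)
                    predictor.set_image(image)

                    # Prompt with a point on the region to treat as a
                    # "layer", e.g. the rendered text.
                    masks, scores, _ = predictor.predict(
                        point_coords=np.array([[512, 80]]),
                        point_labels=np.array([1]),
                        multimask_output=False,
                    )

                    # The mask can drive inpainting or compositing, but
                    # the pixels "beneath" it never existed, so true
                    # layers still have to be regenerated.
                    cv2.imwrite("region_mask.png",
                                (masks[0] * 255).astype(np.uint8))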
        
       | patates wrote:
        | Half of the announcement talks about safety. The next step will
        | be these control mechanisms being built into all sorts of
        | software, I suppose.
        | 
        | It's "safe" for them, not for the users - at the least they
        | should make that clear.
        
         | spir wrote:
         | thanks, i hadn't fully realized that 'safety' means 'safe to
         | offer' and not 'safe for users'. i won't forget it
        
         | wiz21c wrote:
          | They talk, rather, about "reasonable steps" toward safety.
          | Sounds like "just the minimum so we don't end up in legal
          | trouble" to me...
        
         | tasty_freeze wrote:
         | There is some truth in what you say, just like saying you're a
         | "free speech absolutist" sounds good at first blush. But the
         | real world is more complicated, and the provider adds safety
         | features because they have to operate in the real world and not
         | just make superficial arguments about how things should work.
         | 
         | Yes, they are protecting themselves from lawsuits, but they are
          | also protecting other people. Preventing people from requesting
          | images of specific celebrities (or children) having sex is for
          | their benefit too.
        
         | s1k3s wrote:
         | I truly wonder what "unsafe" scenarios an image generator could
         | be used for? Don't we already have software that can do pretty
         | much anything if a professional human is using it?
        
           | t_von_doom wrote:
           | I would say the barrier to entry is stopping a lot of
           | 'candid' unsafe behaviour. I think you allude to it yourself
           | in implying currently it requires a professional to achieve
           | the same results.
           | 
           | But giving that ability to _everyone_ will lead to a huge
           | increase in undesirable and targeted/local behaviour.
           | 
           | Presumably it enables any creep to generate what they want by
           | virtue of being able to imagine it and type it, rather than
           | learn a niche skill set or employ someone to do it (who is
           | then also complicit in the act)
        
             | hypocrticalCons wrote:
             | "undesirable local behavior"
             | 
             | Why don't you just say you believe thought crime should be
             | punishable?
        
               | KittenInABox wrote:
               | [Edited: I'm realizing the person I'm responding to is
               | kinda unhinged, so I'm retracting out of the convo.]
        
               | wongarsu wrote:
               | I imagine they might talk about things like students
               | making nudes of their classmates and distributing them.
               | 
               | Or maybe not. It's hard to tell when nobody seems to want
               | to spell out what behaviors we want to prevent.
        
               | hypocrticalCons wrote:
               | Students already share nudes every day.
               | 
               | Where are the Americans asking about Snapchat? If I were
                | a developer at Snapchat I could prolly open a few Blob
               | Storage accounts and feed a darknet account big enough to
               | live off of. You people are so manipulatable.
        
               | jncfhnb wrote:
               | Students don't share photorealistic renders of nude
               | classmates getting gangbanged though
        
               | 4bpp wrote:
               | Would it be illegal for a student who is good at drawing
               | to paint a nude picture of an unknowing classmate and
               | distribute it?
               | 
               | If yes, why doesn't the same law apply to AI? If no, why
               | are we only concerned about it when AI is involved?
        
               | Cthulhu_ wrote:
               | Because AI lowers the barrier to entry; using your
               | example, few people have the drawing skills (or the
               | patience to learn them) or take the effort to make a
               | picture like that, but the barrier is much lower when it
               | takes five seconds of typing out a prompt.
               | 
               | Second, the tool will become available to anyone,
               | anywhere, not just a localised school. If generating
               | naughty nudes is frowned upon in one place, another will
               | have no qualms about it. And that's just things that are
               | about decency, then there's the discussion about
               | legality.
               | 
               | Finally, when person A draws a picture, they are
               | responsible for it - they produced it. Not the party that
               | made the pencil or the paper. But when AI is used to
               | generate it, is all of the responsibility still with the
               | person that entered the prompt? I'm sure the T's and C's
               | say so, but there may still be lawsuits.
        
               | 4bpp wrote:
               | Right, these are the same arguments against uncontrolled
               | empowerment that I imagine mass literacy and the printing
               | press faced. I would prefer to live in a society where
               | individual freedom, at least in the cognitive domain, is
               | protected by a more robust principle than "we have
               | reviewed the pros and cons of giving you the freedom to
               | do this, and determined the former to outweigh the latter
                | _for the time being_".
        
               | pixl97 wrote:
               | You seem to be very confused about civil versus criminal
               | penalties....
               | 
               | Feel free to make an AI model that does almost anything,
               | though I'd probably suggest that it doesn't make porn of
                | minors as that is criminal in most jurisdictions; short of
               | that it's probably not a criminal offense.
               | 
               | Most companies are only very slightly worried about
               | criminal offenses, they are far more concerned about
                | civil trials. There is a far lower requirement for
                | evidence. An AI creator writing "Hmm, this could be
                | dangerous" in an email is all you need to lose a civil
                | trial.
        
               | 4bpp wrote:
               | Why do you figure I would be confused? Whether any
               | liability for drawing porn of classmates is civil or
               | criminal is orthogonal to the AI comparison. The question
               | is if we would hold manufacturers of drawing tools or
               | software, or purveyors of drawing knowledge (such as
               | learn-to-draw books), liable, because they are playing
               | the same role as the generative AI does here.
        
               | pixl97 wrote:
                | Because you seem to be very confused about civil
                | liability for most products. Manufacturers are commonly
                | held liable for users' use of their products; for
                | example, look at any number of products that have caused
                | injury.
        
               | 4bpp wrote:
               | Surely those are typically when the manufacturer was
               | taken to have made an implicit promise of safety to the
               | user and their surroundings, and the user got injured. If
               | your fridge topples onto you and you get injured, the
               | manufacturer might be liable; if you set up a trap where
               | you topple your fridge onto a hapless passer-by, the
               | manufacturer will probably not be liable towards them.
                | Likewise with the classic McDonald's coffee spill
               | liability story - I've yet to hear of a case of a coffee
               | vendor being held liable over a deliberate attack where
               | someone splashed someone else with hot coffee.
        
               | Sohcahtoa82 wrote:
               | > You seem to be very confused about civil versus
               | criminal penalties....
               | 
               | Nah, I think it's a disagreement over whether a tool's
               | maker gets blamed for evil use or the tool's user.
               | 
               | It's a similar argument over whether or not gun
               | manufacturers should have any liability for their
               | products being used for murder.
        
               | pixl97 wrote:
               | >It's a similar argument over whether or not gun
               | manufacturers
               | 
               | This is really only a debate in the US and only because
               | it's directly written in the constitution. Pretty much no
               | other product works that way.
        
               | darkwater wrote:
               | Are we on the same HN that bashes
               | Facebook/Twitter/X/TikTok/ads because they manipulate
               | people, spread fake news or destroyed attention span?
        
               | SV_BubbleTime wrote:
               | Can you point to other crimes that are based on skill or
               | effort?
        
               | sssilver wrote:
               | Photoshop also lowers that barrier of entry compared to
               | pen and pencil. Paper also lowers the barrier compared to
               | oil canvas.
               | 
               | Affordable drawing classes and YouTube drawing tutorials
               | lower the barrier of entry as well.
               | 
               | Why on earth would manufacturers of pencils, papers,
               | drawing classes, and drawing software feel responsible
               | for censoring the result of combining their tool with the
               | brain of their customer?
               | 
               | A sharp kitchen knife significantly lowers the barrier of
               | entry to murder someone. Many murders are committed
                | every day using a kitchen knife. Should kitchen knife
               | manufacturers blog about this every week?
        
               | freedomben wrote:
               | I agree with your point, but I would be willing to bet
               | that if knives were invented today rather than having
                | been around a while, they would absolutely be regulated
               | and restricted to law enforcement if not military use.
               | Hell, even printers, maybe not if invented today but
               | perhaps in a couple years if we stay on the same
               | trajectory, would probably require some sort of ML to
               | refuse to print or "reproduce" unsafe content.
               | 
               | I guess my point is that I don't think we're as
               | inconsistent as a society as it seems when considering
               | things like knives. It's not even strictly limited to
               | thought crimes/information crimes. If alcohol were
                | discovered today, I have no doubt that it would be
               | banned and made schedule I
        
               | Sohcahtoa82 wrote:
               | > Hell, even printers, maybe not if invented today but
               | perhaps in a couple years if we stay on the same
               | trajectory, would probably require some sort of ML to
               | refuse to print or "reproduce" unsafe content.
               | 
               | Fun fact: Many scanners and photocopiers will detect that
               | you're trying to scan/copy a banknote and will refuse to
               | complete the scan. One of the ways is detecting the
               | EURion Constellation.
               | 
               | https://en.wikipedia.org/wiki/EURion_constellation
        
               | hospadar wrote:
                | IANAL, but that sounds like harassment. I assume the
               | legality of that depends on the context (did the artist
               | previously date the subject? lots of states have laws
               | against harassment and revenge porn that seem applicable
               | here [1]. are you coworkers? etc), but I don't see why
               | such laws wouldn't apply to AI generated art as well.
               | It's the distribution that's really the issue in most
               | cases. If you paint secret nudes and keep them in your
               | bedroom and never show them to anyone it's creepy, but I
               | imagine not illegal.
               | 
               | I'd guess that stability is concerned with their legal
               | liability, also perhaps they are decent humans who don't
               | want to make a product that is primarily used for
               | harassment (whether they are decent humans or not, I
               | imagine it would affect the bottom line eventually if
               | they develop a really bad rep, or a bunch of politicians
               | and rich people are targeted by deepfake harassment).
               | 
               | [1] https://www.cagoldberglaw.com/states-with-revenge-
               | porn-laws/...
               | 
               | ^ a lot of, but not all of those laws seem pretty
               | specific to photographs/videos that were shared with the
               | expectation of privacy and I'm not sure how they would
               | apply to a painting/drawing, and I certainly don't know
               | how the courts would handle deepfakes that are
               | indistinguishable from genuine photographs. I imagine
               | juries might tend to side with the harassed rather than a
               | bully who says "it's not illegal cause it's actually a
               | deepfake but yeah i obviously intended to harass the
               | victim"
        
               | AuryGlenz wrote:
               | That's not even necessarily a bad thing (as a whole -
               | individually it can be). Now, any leaked nudes can be
               | claimed to be AI. That'll probably save far more grief
               | than it causes.
        
               | polski-g wrote:
               | Such activity is legal per Ashcroft v Free Speech
               | Coalition (2002). Artwork cannot be criminalized because
               | of the contents of it.
        
               | nickthegreek wrote:
               | Artwork is currently criminalized because of its
               | contents. You cannot paint nude children engaged in sex
               | acts.
        
               | polski-g wrote:
               | The case I literally just referenced allows you to paint
               | nude children engaged in sex acts.
               | 
               | > The Ninth Circuit reversed, reasoning that the
               | government could not prohibit speech merely because of
               | its tendency to persuade its viewers to engage in illegal
               | activity.[6] It ruled that the CPPA was substantially
               | overbroad because it prohibited material that was neither
               | obscene _nor produced by exploiting real children, as
               | Ferber prohibited_.[6] The court declined to reconsider
               | the case en banc.[7] The government asked the Supreme
               | Court to review the case, and it agreed, noting that the
                | Ninth Circuit's decision conflicted with the decisions
               | of four other circuit courts of appeals. Ultimately, _the
               | Supreme Court agreed with the Ninth Circuit_.
        
               | nickthegreek wrote:
               | I appreciate you taking the time to lay that out, I was
               | under the opposite impression for US law.
        
               | astrange wrote:
               | Stability is not an American company. The US is not the
               | only country in the world.
        
               | pixl97 wrote:
               | What do you mean should be... it 100% is.
               | 
               | In a large number of countries if you create an image
               | that represents a minor in a sexual situation you will
               | find yourself on the receiving side of the long arm of
               | the law.
               | 
               | If you are the maker of an AI model that allows this, you
               | will find yourself on the receiving side of the long arm
               | of the law.
               | 
               | Moreso, many of these companies operate in countries
               | where thought crime _is_ illegal. Now, you can argue that
               | said companies should not operate in those countries, but
               | companies will follow money every time.
        
               | hypocrticalCons wrote:
               | I think it's pretty important to specify that you have to
               | willingly seek and share all of these illegal items.
               | That's why this is so sketch. These things are being
               | baked with moral codes that'll _share_ the information,
               | incriminating everyone. Like why? Why not just let it
               | work and leave it up to the criminal to share their
               | crimes? People are such authoritarian shit-stains, and
               | acting like their existence is enough to justify their
               | stance is disgusting.
        
               | pixl97 wrote:
               | >I think it's pretty important to specify that you have
               | to willingly seek and share all of these illegal items.
               | 
               | This is not obvious at all when it comes to AI models.
               | 
               | >People are such authoritarian shit-stains
               | 
               | Yes, but this is a different conversation altogether.
        
               | mempko wrote:
               | Once it is outside your mind and in a physical form, is
            | it still just a thought, sir?
        
               | hypocrticalCons wrote:
               | In my country there is legal precedent setting that
               | private, unshared documents are tantamount to thought.
        
           | Sharlin wrote:
           | Eh, a professional human could easily lockpick the majority
           | of front doors out there. Nevertheless I don't think we're
           | going to give up on locking our doors any time soon.
        
           | martiuk wrote:
           | Similar to why Google's latest image generator refuses to
           | produce a correct image of a 'Realistic, historically
           | accurate, Medieval English King'. They have guard rails and
            | system prompts set up to align the output of the generator
           | with the company's values, or else someone would produce Nazi
           | propaganda or worse. It (for some reason) would be attributed
           | to Google and their AI, rather than the user who found the
           | magic prompt words.
        
             | s1k3s wrote:
             | Yeah this is probably the most realistic reason
        
           | fragmede wrote:
           | For some scenarios, it's not the image itself but the
           | associations that the model might possibly make from being
           | fed a diet of 4chan and Stormfront's unofficial YouTube
           | channel. The worry is over horrible racist shit, like if you
           | ask it for a picture of a black person, and it outputs a
           | picture of a gorilla. Or if you ask it for a picture of a bad
           | driver, and it only manages to output pictures of Asian
           | women. I'm sure you can think up other horrible stereotypes
           | that would result in a PR disaster.
        
           | PeterisP wrote:
           | The major risky use cases for image generators are (a) sexual
           | imagery of kids and (b) public personalities in various
           | contexts usable for propaganda.
        
         | Spivak wrote:
         | It's also "safety" in the sense that you can deploy it as part
         | of your own application without human review and not have to
         | worry that it's gonna generate anything that will get you in
         | hot water.
        
         | matthewmacleod wrote:
         | I really wish that every discussion about a new model didn't
         | rapidly become a boring and shallow discussion about AI safety.
        
           | jprete wrote:
           | AI is not an engineered system; it's emergent behavior from a
           | system we can vaguely direct but do not fundamentally
           | understand. So it's natural that the boundaries of system
           | behavior would be a topic of conversation pretty much all the
           | time.
           | 
           | EDIT: Boring and shallow are, unfortunately, the Internet's
           | fault. Don't know what to do about those.
        
             | PeterisP wrote:
             | At least in some latest controversies (e.g. Gemini
             | generation of people) all of the criticized behavior was
             | _not_ emergent from ML training, but explicitly
             | intentionally engineered manually.
        
           | cypress66 wrote:
           | This announcement only mentions safety. What else do you
           | expect to talk about?
        
         | hedora wrote:
         | PSA: There are now calls to embed phone-home / remote kill
         | switch mechanisms into hardware because "AI safety".
        
           | newzisforsukas wrote:
           | examples? seems like it would be easier to instead
           | communicate with ISPs.
        
           | astrange wrote:
           | Hmm, all computers have remote kill switches in them unless
           | you have a generator at home.
        
         | root_axis wrote:
         | This is the world we live in. CYA is necessary. Politicians,
         | media organizations, activists and the parochial masses will
         | not brook a laissez faire attitude towards the generation of
         | graphic violence and illegal porn.
        
           | Sharlin wrote:
           | Not even legal porn, unfortunately. Or even the display of a
           | single female nipple...
        
             | realusername wrote:
             | looking at the manual censorship of the big channels on
             | youtube, you don't even need to display anything, just
             | suggesting it is enough to get a strike.
             | 
             | (of course unless you are into yoga, then everything is
             | permitted)
        
               | Sohcahtoa82 wrote:
               | > (of course unless you are into yoga, then everything is
               | permitted)
               | 
               | ...or children's gymnastics.
        
           | hypocrticalCons wrote:
           | > This is the world we live in.
           | 
           | Great talk about slavery and religious-persecution, Jim!
           | Wait, what were we talking about? Fucking American fascists
           | trying to control our thoughts and actions, right right.
        
         | hypocrticalCons wrote:
         | BTW Nvidia and AMD are baking safety mechanisms into the
         | fucking video drivers
         | 
          | Nowhere is safe
        
           | jprete wrote:
           | Do you have a reference on this?
        
         | BryanLegend wrote:
         | From George Hotz on Twitter (https://twitter.com/realGeorgeHotz
         | /status/176060391883954211...)
         | 
         | "It's not the models they want to align, it's you."
        
           | jtr1 wrote:
           | What specific cases are being prevented by safety controls
           | that you think should be allowed?
        
             | Tomte wrote:
             | Not specifically SD, but DallE: I wanted to get an image of
             | a pure white British shorthair cat on the arm of a brunette
             | middle-aged woman by the balcony door, both looking
             | outside.
             | 
             | It wasn't important, just something I saw in the moment and
             | wanted to see what DallE makes of it.
             | 
             | Generation denied. No explanation given, I can only imagine
             | that it triggered some detector of sexual request?
             | 
             | (It wasn't the phrase "pure white", as far as I can tell,
             | because I have lots of generated pics of my cat in other
             | contexts)
        
             | bonton89 wrote:
             | Well for starters, ChatGPT shouldn't balk at creating
             | something "in Tim Burton's style" just because Tim Burton
                | complained about AI. I guess it's fair use unless a select
             | rich person who owns the data complains. Seems like it
             | isn't fair use at all then, just theft from those who
             | cannot legally defend themselves.
        
               | archontes wrote:
               | Fair use is an exception to copyright. The issue here is
               | that it's _not_ fair use, because copyright simply _does
               | not apply_. Copyright explicitly does not, has never, and
               | will never protect style.
        
               | SamBam wrote:
               | Didn't Tom Waits successfully sue Frito Lay when the
               | company found an artist that could closely replicate his
               | style and signature voice, who sang a song for a
               | commercial that sounded very Tom Waits-y?
        
               | dangrossman wrote:
               | Yes, though explicitly not for copyright infringement.
               | Quoting the court's opinion, "A voice is not
               | copyrightable. The sounds are not 'fixed'." The case was
               | won under the theory of "voice misappropriation", which
               | California case law (Midler v Ford Motor Co) establishes
               | as a violation of the common law right of publicity.
        
               | aimor wrote:
               | Yes but that was not a copyright or trademark violation.
               | This article explained it to me:
               | 
               | https://grr.com/publications/hey-thats-my-voice-can-i-
               | sue-th...
        
               | bonton89 wrote:
               | That makes it even more ridiculous, as that means they
               | are giving rights to rich complaining people that no one
               | has.
               | 
                | Examples: "Can you create an image of a cat in Tim
                | Burton's style?" "Oops! Try another prompt. Looks like
                | there are some words that may be automatically blocked
                | at this time. Sometimes even safe content can be blocked
                | by mistake. Check our content policy to see how you can
                | improve your prompt."
                | 
                | "Can you create an image of a cat in Wes Anderson's
                | style?" "Certainly! Wes Anderson's distinctive style is
                | characterized by meticulous attention to detail,
                | symmetrical compositions, pastel color palettes, and
                | whimsical storytelling. Let's imagine a feline friend in
                | the world of Wes Anderson..."
        
               | astrange wrote:
               | ...in the US. Other countries don't have fair use.
        
             | rmi_ wrote:
             | Tell me what they mean by "safety controls" first. It's
             | very vaguely worded.
             | 
              | DALL-E, for example, wrongly denied several requests of
              | mine.
        
               | Aeolun wrote:
               | I don't feel like it truly matters since they'll release
               | it and people will happily fine-tune/train all that
               | safety right back out.
               | 
               | It sounds like a reputation/ethics thing to me. You
               | probably don't want to be known as the company that
               | freely released a model that gleefully provides images of
               | dismembered bodies (or worse).
        
               | bergen wrote:
                | You are using someone else's proprietary technology, so
                | you have to deal with their limitations. If you don't
                | like it, there are endless alternatives.
                | 
                | "Wrongly denied" in this case depends on your point of
                | view; clearly DALL-E didn't want this combination of
                | words created, and you have no right to the creation of
                | these prompts.
               | 
               | I'm the last one defending large monolithic corps, but if
               | you go to one and want to be free to do whatever you want
               | you are already starting from a very warped expectation.
        
             | AuryGlenz wrote:
              | As far as Stable Diffusion goes - when they released SD
             | 2.1/XL/Stable Cascade, you couldn't even make a (woman's)
             | nipple.
             | 
              | I don't use them for porn like a lot of people seem to,
             | but it seems weird to me that something that's kind of made
             | to generate art can't generate one of the most common
             | subjects in all of art history - nude humans.
        
               | b33j0r wrote:
               | For some reason its training thinks they are decorative,
               | I guess it's a pretty funny elucidation of how it works.
               | 
               | I have seen a lot of "pasties" that look like Sorry! game
               | pieces, coat buttons, and especially hell-forged
               | cybernetic plumbuses. Did they train it at an alien strip
               | club?
               | 
               | The LoRAs and VAEs work (see civit.ai), but do you really
               | want something named NSFWonly in your pipeline just for
               | nipples? Haha
        
               | Aeolun wrote:
               | I'm not sure if they updated them to rectify those "bugs"
               | but you certainly can now.
        
               | araes wrote:
               | I seem to have the opposite problem a lot of the time. I
               | tried using Meta's image gen tool, and had such a time
               | trying to get it to make art that was not "kind of"
               | sexual. It felt like Facebook's entire learning chain
               | must have been built on people's sexy images of their
                | girlfriends, all of it now hidden in the art.
               | 
               | These were examples that were not super blatant, like a
               | tree landscape that just happens to have a human figure
               | and cave in their crotch. Examples:
               | 
               | https://i.imgur.com/RlH4NNy.jpg - Art is very focused on
               | the monster's crotch
               | 
               | https://i.imgur.com/0M8RZYN.jpg - The comparison should
               | hopefully be obvious
        
               | Fischgericht wrote:
               | Not meant in a rude way, but please consider that your
               | brain is making these up and you might need to see a
               | therapist. I can see absolutely nothing "kind of sexual"
               | in those two pictures.
        
               | astrange wrote:
               | I have in fact gotten a nude out of Stable Cascade. And
               | that's just with text prompting, the proper way to use
               | these is with multimodal prompting. I'm sure it can do it
               | with an example image.
        
             | slily wrote:
             | Parody and pastiche
        
             | miohtama wrote:
             | Generating images of nazis
             | 
             | https://www.theverge.com/2024/2/21/24079371/google-ai-
             | gemini...
        
             | stale2002 wrote:
              | Oh, the big one would be model weights being released for
             | anyone to use or fine tune themselves.
             | 
             | Sure, the safety people lost that battle for Stable
             | diffusion and LLama. And because they lost, entire
             | industries were created by startups that could now use
             | models themselves, without it being locked behind someone
             | else's AI.
             | 
             | But it wasn't guaranteed to go that way. Maybe the
             | safetyists could have won.
             | 
              | I don't think we'd be having our current AI revolution if
              | Facebook or SD hadn't been the first to release models for
              | anyone to use.
        
           | thefourthchime wrote:
            | No, it's the cacophony of zealous point-scoring on X they
            | want to avoid.
        
           | dang wrote:
           | We detached this subthread from
           | https://news.ycombinator.com/item?id=39466910.
        
         | wongarsu wrote:
         | What's equally interesting is that while they spend a lot of
         | words on safety, they don't actually say anything. The only
         | hint what they even mean by safety is that they took
         | "reasonable steps" to "prevent misuse by bad actors". But it's
         | hard to be more vague than that. I still have no idea what they
         | did and why they did it, or what the threat model is.
         | 
         | Maybe that will be part of future papers or the teased
         | technical report. But I find it strange to put so much emphasis
         | on safety and then leave it all up to the reader's imagination.
        
           | fortran77 wrote:
           | Remember when AI safety meant the computers weren't going to
           | kill us?
        
             | SV_BubbleTime wrote:
             | Now people spend a lot of time making them worse to ensure
             | we don't see boobs.
        
         | dmezzetti wrote:
          | Any large publicly available model has no choice but to do
          | this; the companies behind them are petrified of a PR
          | nightmare.
         | 
         | Models with a large user base will have an inverse relationship
         | with usability. That's why it's important to have options to
         | train your own with open source.
        
         | beefield wrote:
         | I get a slightly uncomfortable feeling with this talk about AI
         | safety. Not in the sense that there is anything wrong with that
          | (maybe or maybe not), but in the sense that I don't understand
         | what people are talking about when they talk about safety in
         | this context. Could someone explain like I have Asperger
          | (ELIA?) what's this about? What are the "bad actors" possibly
         | going to do? Generate (child) porn/ images with violence etc.
         | and sell them? Pollute the training data so that the racist
         | images pops up when someone wants to get an image of a white
         | pussycat? Or produce images that contain vulnerabilities so
         | that when you open that in your browser you get compromised? Or
         | what?
        
           | Tadpole9181 wrote:
           | > Could someone explain like I have Asperger (ELIA?)
           | 
           |  _Excuse me?_
        
             | beefield wrote:
             | You sound offended. My apologies. I had no intention
             | whatsoever to offend anyone. Even if I am not diagnosed, I
             | think I am at least borderline somewhere in the spectrum,
             | and thought that would be a good way to ask people explain
             | without assuming I can read between the lines.
        
               | Tadpole9181 wrote:
               | Let's just stick with the widely understood "Explain Like
               | I'm 5" (ELI5). Nobody knows you personally, so this comes
               | off quite poorly.
        
               | beefield wrote:
               | I think ELI5 means that you simplify a complex issue so
               | that even a small kid understands it. In this case there
               | is no need to simplify anything, just explain what a term
               | actually means without assuming reader understanding
               | nuances of terms used. And I still do not quite get how
               | ELIA can be considered hostile, but given the feedback,
               | maybe I avoid it in the future.
        
               | Tadpole9181 wrote:
               | Saying "explain like I have <specific disability>" is
               | blatantly inappropriate. As a gauge: Would you say this
               | to your coworkers? Giving a presentation? Would you say
               | this in front of (a caretaker for) someone with Autism?
               | Especially since Asperger's hasn't even been used in
               | practice for, what, over a decade?
               | 
               | > In this case there is no need to simplify anything
               | 
               | Then just ask the question itself.
        
               | charcircuit wrote:
                | AI isn't a coworker and isn't human, so it's not as
                | awkward to talk about one's disability.
        
               | Tadpole9181 wrote:
               | I don't see how this is a response to anything I've said.
               | They're speaking to other humans and the original use of
               | their modified idiom isn't framed as if one were talking
               | about their own, personal disability.
        
           | vprcic wrote:
           | Just as an example:
           | 
           | https://arstechnica.com/information-
           | technology/2024/02/deepf...
        
           | Q6T46nT668w6i3m wrote:
           | The bad actor might be the model itself, e.g., returning
           | unwanted pornography or violence. Do you have a problem with
           | Google's SafeSearch?
        
           | reaperman wrote:
           | I'm not part of Stability AI but I can take a stab at this:
           | 
           | > explain like I have ~~Asperger (ELIA?)~~ limited
           | understanding of how the world _really_ works.
           | 
           | The AI is being limited so that it cannot produce any
           | "offensive" content which could end up on the news or go
           | viral and bring negative publicity to Stability AI.
           | 
           | Viral posts containing generated content that brings negative
           | publicity to Stability AI are fine as long as they're not
           | "offensive". For example, wrong number of fingers is fine.
           | 
           | There is not a comprehensive, definitive list of things that
           | are "offensive". Many of them we are aware of - e.g. nudity,
           | child porn, depictions of Muhammad. But for many things it
           | cannot be known a priori whether the current zeitgeist will
           | find it offensive or not (e.g. certain depictions of current
           | political figures, like Trump).
           | 
           | Perhaps they will use AI to help decide what might be
           | offensive if it does not explicitly appear on the blocklist.
           | They will definitely keep updating the "AI Safety" to cover
           | additional offensive edge cases.
           | 
           | It's important to note that "AI Safety", as defined above
           | (cannot produce any "offensive" content which could end up on
           | the news or go viral and bring negative publicity to
           | Stability AI) is not just about facially offensive content,
           | but also about offensive uses for milquetoast content.
           | Stability AI won't want news articles detailing how they're
           | used by fraudsters, for example. So there will be some guards
           | on generating things that look like scans of official
           | documents, etc.
        
             | beefield wrote:
             | So it's just fancy words for safety (legal/reputational)
             | for Stability AI, not users?
        
               | reaperman wrote:
               | Yes*. At least for the purposes of understanding what the
               | implementations of "AI safety" are most likely to entail.
               | I think that's a very good cognitive model which will
               | lead to high fidelity predictions.
               | 
               | *But to be slightly more charitable, I genuinely think
               | Stability AI / OpenAI / Meta / Google / MidJourney
               | believe that there is significant overlap in the set of
               | protections which are safe for the company, safe for
               | users, and safe for society in a broad sense. But I don't
               | think any released/deployed AI product focuses on the
               | latter two, just the first one.
               | 
               | Examples include:
               | 
               | Society + Company: Depictions of Muhammad could result in
               | small but historically significant moments of civil
               | strife/discord.
               | 
               | Individual + Company: Accidentally generating NSFW
               | content at work could be harmful to a user. Sometimes
               | your prompt won't seem like it would generate NSFW
               | content, but could be adjacent enough: e.g. "I need some
               | art in the style of a 2000's R&B album cover" (See: Sade
               | - Love Deluxe, Monica - Makings of Me, Rihanna -
               | Unapologetic, Janet Jackson - Damita Jo)
               | 
               | Society + Company: Preventing the product from being used
               | for fraud. e.g. CAPTCHA solving, fraudulent
               | documentation, etc.
               | 
               | Individual + Company: Preventing generation of child
               | porn. In the USA, this would likely be illegal both for
               | the user and for the company.
        
               | astrange wrote:
               | Their enterprise customers care even more than Stability
               | does.
        
         | mempko wrote:
         | I think this AI safety thing is great. These models will be
         | used by people to make boring art. The exciting art will be
         | left for people to make.
         | 
         | This idea of AI doing the boring stuff is good. Nothing
         | prevents you from making exciting, dangerous, or 'unsafe' art
         | on your own.
         | 
         | My feeling is that most people who are upset about AI safety
         | really just mean they want it to generate porn. And because it
         | doesn't, they are upset. But they hide it under the umbrella of
         | user freedom. You want to create porn in your bedroom? Then go
         | ahead and make some yourself. Nothing stopping you, the person,
         | from doing that.
        
         | TulliusCicero wrote:
         | I agree with you, but when companies don't implement these
         | things, they get absolutely trashed in the press & social
         | media, which I'm sure affects their business.
         | 
         | What would you have them do? Commit corporate suicide?
        
           | TylerLives wrote:
           | This is a good question. I think it would be best for them to
           | give some sort of signal, which would mean "We're doing this
           | because we have to. We are willing to change if you offer us
           | an alternative." If enough companies/people did this, at some
           | point change would become possible.
        
         | acomjean wrote:
        | I think this isn't software so much as a service. Viewed through
        | that lens, the guard rails make more sense.
        
       | FloatArtifact wrote:
        | I'm curious to know if their safeguards are eliminated when
        | users fine-tune the model.
        
         | pmx wrote:
         | There are some VERY nsfw model fine tunes available for other
         | versions of SD
        
           | witcH wrote:
           | such as?
        
             | mdrzn wrote:
             | Check out civitai.com for finetuned models for a wide range
             | of uses
        
               | AuryGlenz wrote:
               | I believe you need to be signed in to see the NSFW stuff,
               | for what it's worth.
        
       | 123yawaworht456 wrote:
       | >This preview phase, as with previous models, is crucial for
       | gathering insights to improve its performance and safety ahead of
       | an open release.
       | 
       | oh, for fuck's sake.
        
         | memossy wrote:
         | We did this for every stable diffusion release, you get the
         | feedback data to improve it continuously ahead of open release.
        
           | 123yawaworht456 wrote:
           | I was referring to 'safety'. how the hell can an image
            | generation model be dangerous? we've had software for editing
           | text, images, videos and audio for half a century now.
        
             | Jensson wrote:
             | Advertisers will cancel you if you do anything they don't
             | like, 'safety' is to prevent that.
        
       | glimshe wrote:
       | This reinforces my impression that Google is at least one year
       | behind. Stunning images, 3D, video while Gemini had to be
       | partially halted this morning.
        
         | bamboozled wrote:
         | For "political" reasons, not for technical reasons. Don't get
         | it twisted.
        
           | coeneedell wrote:
           | I would describe those issues as technical. It's genuinely
           | getting things wrong because the "safety" element was
           | implemented poorly.
        
             | anononaut wrote:
             | Those are safety elements which exist for political
             | reasons, not technical ones.
        
           | ethbr1 wrote:
           | Of all criticism that could be leveled at Google, 'shipping a
           | product and supporting it' being the only thing that matters
           | seems fair.
           | 
           | Which takes _all_ the behind the scenes steps, not just the
           | technical ones.
        
           | verticalscaler wrote:
           | You think that technology is first. You think that
           | mathematicians and computer engineers or mechanical engineers
           | or doctors are first. They're very important, but they're not
           | first. They're second. Now I'll prove it to you.
           | 
           | There was a country that had the best mathematicians, the
           | best physicists, the best metallurgists in the world. But
           | that country was very poor. It's called the Soviet Union. But
            | when you took one of these mathematicians or physicists who
            | was smuggled out or escaped, put him on a plane, and brought
            | him to Palo Alto, within two weeks he was producing added
            | value that could produce great wealth.
           | 
           | What comes first is markets. If you have great technology
           | without markets, without a market-friendly economy, you'll
           | get nowhere. But if you have a market-friendly economy,
           | sooner or later the market forces will give you the
           | technology you want.
           | 
           | And that my friend, simply won't come from an office
           | paralyzed by internal politics of fear and conformity. Don't
           | get it twisted.
        
           | TulliusCicero wrote:
           | I mean, it's kind of both? Making Nazis look diverse isn't
           | just a political error, it's also a technical one. By
           | default, showing Nazis should show them as they actually
           | were.
        
             | astrange wrote:
             | There's a product for that called Google Image Search.
        
         | chickenpotpie wrote:
         | I don't think that's a fair comparison because they're
         | fulfilling substantially different niches. Gemini is a
         | conversational model that can generate images, but is mainly
         | designed for text. Stable Diffusion is only for images. If you
         | compare a model that can do many things and a model that can
         | only do images by how well they generate images, of course the
         | image generation model looks better.
         | 
         | Stability does have an LLM, but it's not provided in a unified
         | framework like Gemini is.
        
         | bluescrn wrote:
         | The public only see the 'safe' lobotomized versions of each,
         | though.
         | 
         | I wonder how far ahead the internal versions are?
        
       | treesciencebot wrote:
       | Quite nice to see diffusion transformers [0] becoming the next
        | dominant architecture in generative media.
       | 
       | [0]: https://twitter.com/EMostaque/status/1760660709308846135
        
       | poulpy123 wrote:
        | Didn't they release another model a few days ago?
        
       | amelius wrote:
       | Does anyone know of a good tutorial on how diffusion models work?
        
         | Ologn wrote:
         | I liked this 18 minute video (
         | https://www.youtube.com/watch?v=1CIpzeNxIhU ). Computerphile
         | has other good videos with people like Brian Kernighan.
        
         | spaceheater wrote:
         | fast.ai has a whole free course
         | 
         | https://www.youtube.com/watch?v=_7rMfsA24Ls
         | https://course.fast.ai/Lessons/part2.html
        
         | jasonjmcghee wrote:
         | https://jalammar.github.io/illustrated-stable-diffusion/
         | 
         | His whole blog is fantastic. If you want more background (e.g.
         | how transformers work) he's got all the posts you need
        
           | amelius wrote:
            | This looks nice, thank you, but I'm looking for a more hands-
            | on tutorial, with e.g. Python code, like the ones Andrej
            | Karpathy makes.
        
             | astrange wrote:
             | SD3 is a new architecture using DiT (diffusion
             | transformers), so those would be out of date.
             | 
             | The older ones have drawbacks like not being able to spell.
        
               | ttul wrote:
               | Not too out of date. Just replace the magic UNet with a
               | DiT and squint. It's doing the same thing - removing
               | noise.
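                | 
                | A minimal sketch of that loop (illustrative PyTorch,
                | assuming a model trained to predict velocity as in
                | flow matching; whether the backbone is a UNet or a
                | DiT doesn't change the loop):
                | 
                |   import torch
                | 
                |   @torch.no_grad()
                |   def sample(model, steps=50,
                |              shape=(1, 4, 64, 64)):
                |       # Integrate dx/dt = v(x, t) from t=0 (noise)
                |       # to t=1 (data) with plain Euler steps.
                |       x = torch.randn(shape)
                |       dt = 1.0 / steps
                |       for i in range(steps):
                |           t = torch.full((shape[0],), i * dt)
                |           x = x + model(x, t) * dt
                |       return x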
        
       | keiferski wrote:
       | The obsession with safety in this announcement feels like a
       | missed marketing opportunity, considering the recent Gemini
       | debacle. Isn't SD's primary use case the fact that you can
       | install it on your own computer and make what you want to make?
        
         | jsheard wrote:
         | At some point they have to actually make money, and I don't see
         | how continuously releasing the fruits of their expensive
         | training for people to run locally on their own computer (or a
         | competing cloud service) for free is going to get them there.
         | They're not running a charity, the walls will have to go up
         | eventually.
         | 
         | Likewise with Mistral, you don't get half a billion in funding
         | and a two billion valuation on the assumption that you'll keep
         | giving the product away for free forever.
        
           | keiferski wrote:
           | But there are plenty of other business models available for
           | open source projects.
           | 
           | I use Midjourney a lot and (based on the images in the
           | article) it's leaps and bounds beyond SD. Not sure why I
           | would switch if they are both locked down.
        
             | AuryGlenz wrote:
              | SD would probably be a lot better if they didn't have to
              | make sure it worked on consumer GPUs. Maybe this
              | announcement is a step towards that, where most people
              | will only be able to access the best model through a paid
              | service.
        
             | bee_rider wrote:
             | Is it possible to fine-tune Midjourney or produce a LORA?
        
               | keiferski wrote:
               | Sorry I don't know what that means, but a quick google
               | shows some results about it.
        
               | elbear wrote:
               | Finetune means to do extra training on the model with
               | your own dataset, for example to teach it to produce
               | images in a certain style.
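                | 
                | A LoRA goes further: the original weights stay frozen
                | and you train only a small low-rank update on the side.
                | A rough sketch of the idea (illustrative PyTorch, not
                | any particular library's API):
                | 
                |   import torch.nn as nn
                | 
                |   class LoRALinear(nn.Module):
                |       def __init__(self, base, rank=8, alpha=16.0):
                |           super().__init__()
                |           self.base = base  # an nn.Linear, frozen
                |           for p in base.parameters():
                |               p.requires_grad = False
                |           self.down = nn.Linear(base.in_features,
                |                                 rank, bias=False)
                |           self.up = nn.Linear(rank,
                |                               base.out_features,
                |                               bias=False)
                |           # zero-init so training starts as a no-op
                |           nn.init.zeros_(self.up.weight)
                |           self.scale = alpha / rank
                | 
                |       def forward(self, x):
                |           return (self.base(x)
                |                   + self.up(self.down(x))
                |                   * self.scale)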
        
               | nickthegreek wrote:
                | No. You can provide photos to merge, though.
        
           | archerx wrote:
            | Ironically, their oversensitive NSFW image detector in their
            | API caused me to stop using it and run it locally instead. I
            | was using it to render animations of hundreds of frames, but
            | when every 20th to 30th image comes out blurry it ruins the
            | whole animation, and it would double the cost or more to
            | re-render it with a different seed, hoping not to trigger
            | the overzealous blurring.
           | 
           | I don't mind that they don't want to let you generate nsfw
           | images but their detector is hopelessly broken, it once
           | censored a cube, yes a cube...
        
             | Sharlin wrote:
             | Unfortunately their financial and reputational incentives
             | are firmly aligned with preventing false negatives at the
             | cost of a lot of false positives.
        
               | archerx wrote:
               | Unfortunately I don't want to pay for hundreds if not
               | thousands of images I have to throw away because it
               | decided some random innocent element is offensive and
               | blurs the entire image.
               | 
                | Here is the red cube it censored, because my innocent
                | eyes wouldn't be able to handle it:
                | https://archerx.com/censoredcube.png
                | 
                | What they are achieving with the overzealous safety
                | measures is driving developers to on-demand GPU hosts
                | that will let them host their own models, which also
                | opens up a lot more freedom. I wanted to use the
                | Stability AI API as my main source for Stable Diffusion,
                | but they make it really, really hard, especially if you
                | want to use it as part of your business.
        
             | TehCorwiz wrote:
             | Everyone always talks about Platonic Solids but never
             | Romantic Solids. /s
        
         | causal wrote:
         | Open source models can be fine-tuned by the community if
         | needed.
         | 
         | I would much rather have this than a company releasing models
         | this size into the wild without any safety checks whatsoever.
        
           | srid wrote:
            | Could you list the concrete "safety checks" that you think
            | prevent real-world harm? What particular image do you think
            | a random human will ask the AI to generate that then leads
            | to concrete harm in the real world?
        
             | politician wrote:
              | Not even the large companies will explain with precision
              | their implementation of safety.
              | 
              | Until they do, we must view this "safety" as both a
              | scapegoat and a vector for social engineering.
        
               | astrange wrote:
               | Companies are not going to explain their legal risks in
               | their marketing material.
        
             | causal wrote:
             | If 1 in 1,000 generations will randomly produce memorized
             | CSAM that slipped into the training set then yeah, it's
             | pretty damn unsafe to use. Producing memorized images has
             | precedent[0].
             | 
             | Is it unlikely? Sure, but worth validating.
             | 
             | [0] https://arxiv.org/abs/2301.13188
        
               | srid wrote:
               | Okay, by "safety checks" you meant the already unlawful
               | things like CSAM, but not politically-overloaded beliefs
               | like "diversity"? The latter is what the comment[1] you
               | were replying to was referring to (viz. "considering the
               | recent Gemini debacle"[2]).
               | 
               | [1] https://news.ycombinator.com/item?id=39466991
               | 
               | [2] https://news.ycombinator.com/item?id=39456577
        
               | causal wrote:
               | Right, by "rather have this [nothing]" I meant Stable
               | Diffusion doing some basic safety checking, not Google's
               | obviously flawed ideas of safety. I should have made that
               | clear.
               | 
                | I posed the worst-case scenario of generating actual
                | CSAM in response to your question, "What particular
                | image do you think a random human will ask the AI to
                | generate that then leads to concrete harm in the real
                | world?"
        
               | thomquaid wrote:
               | Could you elaborate on the concrete real world harm?
        
               | yreg wrote:
               | Why not run the safety check on the training data?
        
               | causal wrote:
               | They try to, but it is difficult to comb through billions
               | of images, and at least some of SD's earlier datasets
               | were later found to have been contaminated with CSAM[0].
               | 
               | https://www.404media.co/laion-datasets-removed-stanford-
               | csam...
        
               | dns_snek wrote:
                | Do you have an example? I've never heard of anyone
                | accidentally generating CSAM with any model. "1 in
                | 1,000" is just an obviously bogus probability; there
                | must have been billions of images generated using
                | hundreds of different models.
               | 
                | Besides, and this is a serious question, what's the harm
                | of a model _accidentally_ generating CSAM? If you
                | weren't intending to generate these images then you
                | would just discard the output, no harm done.
               | 
               | Nobody is forcing you to use a model that might
               | accidentally offend you with its output. You can try
               | "aligning" it, but you'll just end up with Google Gemini
               | style "Sorry I can't generate pictures of white people".
        
               | causal wrote:
               | Earlier datasets used by SD were likely contaminated with
               | CSAM[0]. It was unlikely to have been significant enough
               | to result in memorized images, but checking the safety of
               | models increases that confidence.
               | 
               | And yeah I think we should care, for a lot of reasons,
               | but a big one is just trying to stay well within the law.
               | 
               | [0] https://www.404media.co/laion-datasets-removed-
               | stanford-csam...
        
               | astrange wrote:
               | SD always removed enough nsfw material that this probably
               | never made it in there.
        
               | 7moritz7 wrote:
                | Then you know almost nothing about the SD 1.5 ecosystem,
                | apparently. I've finetuned multiple models myself and
                | it's nearly impossible to get rid of the child-bias in
                | anime-derived models (which applies to 90% of character-
                | focused models), including NSFW ones. It took me like 30
                | attempts to get somewhere reasonable and it's still
                | noticeable.
        
               | dns_snek wrote:
               | If we're being honest, anime and anything "anime-derived"
               | is uncomfortably close to CSAM as a source material,
               | before you even get SD involved, so I'm not surprised.
               | 
               | What I had in mind were regular general purpose models
               | which I've played around with quite extensively.
        
             | astrange wrote:
             | The harm is that any use of the model becomes illegal in
             | most countries (or offends credit card processors) if it
             | easily generates porn. Especially if it does it when you
             | didn't ask for it.
        
             | dyslexit wrote:
             | This question narrows the scope of "safety" to something
             | less than what the people at SD or even probably what OP
             | cares about. _Non-random_ CSAM requests targeting
              | potentially real people are the obvious answer here, but
              | even non-CSAM sexual content is also probably a threat. I
             | can understand frustration with it currently going
             | overboard on blurring, but removing safety checks
             | altogether would result in SD mainly being associated with
             | porn pretty quickly, which I'm sure Stability AI wants to
             | avoid for the safety of their company.
             | 
             | Add to that, parents who want to avoid having their kids
             | generate sexual content would now need to prevent their
             | kids from using this tool because it can create it
              | randomly, limiting SD usage to users 18+ (which is
              | probably something else Stability AI does not want to
              | deal with.)
             | 
             | It's definitely a balance between going overboard and
             | having restrictions though. I haven't used SD in several
             | months now so I'm not sure where that balance is right now.
        
         | bluescrn wrote:
         | Before long we're going to need a new word for physical
         | 'safety' - when dealing with heavy machinery, chemicals, high
         | voltages, etc.
        
           | jiggawatts wrote:
           | Just replace "safety" with "puritan" in all of these
           | announcements and they'll make more sense.
        
         | AnthonyMouse wrote:
         | > the recent Gemini debacle.
         | 
         | I've noticed that SDXL does something a little odd. For a given
         | prompt it essentially decides what race the subject should be
         | without the prompt having specified one. You generate 20 images
         | with 20 different seeds but the same prompt and they're
         | typically all the same race. In some cases they even appear to
         | be the same "person" even though I doubt it's a real person (at
         | least not anyone I could recognize as a known public figure any
         | of the times it did this). I'm kind of curious what they
          | changed from SD 1.5, which _didn't_ do this.
        
       | lreeves wrote:
        | People in this discussion seem to be hand-wringing about
        | Stability's "safety" comments, but every model they've released
        | has been fine-tuned for porn in like 24 hours.
        
         | mopierotti wrote:
         | That's not entirely true. This wasn't the case for SD 2.0/2.1,
         | and I don't think SD 3.0 will be available publicly for fine
         | tuning.
        
           | lreeves wrote:
            | SD 2 definitely seems like an anomaly that they've learned
            | from, though; it was hard for everyone to use for various
            | reasons. SDXL and even Cascade (the new side-project model)
            | seem to be embraced by horny people.
        
           | viraptor wrote:
           | 2 is not popular because people have better quality results
           | with 1.5 and xl. That's it. If 3 is released and works
           | better, it will be fine tuned too.
        
       | londons_explore wrote:
       | All the demo images are 'artwork'.
       | 
        | Will the model also be able to produce good photographs,
       | technical drawings, and other graphical media?
        
         | spywaregorilla wrote:
         | Photorealism is well within current capabilities. Technical
         | drawings absolutely not. Not sure what other graphical media
         | includes.
        
           | sweezyjeezy wrote:
           | Yeah but try getting e.g. Dall-E 3 to do photorealism, I
           | think they've RLHF'd the crap out of it in the name of
           | safety.
        
             | spywaregorilla wrote:
             | well that's what you get with closed ai.
        
             | astrange wrote:
             | That's not safety, the safety RLHF is because it tries to
             | generate porn and people with three legs if you don't stop
             | it.
             | 
             | It has the weird art style because that's what looks the
             | most "aesthetic". And because it doesn't actually have
              | nearly as much good data as you'd think it does.
             | 
             | Sora looks like it could be better.
        
           | Jensson wrote:
           | > Not sure what other graphical media includes.
           | 
           | I'd want a model that can draw website designs and other UIs
           | well. So I give it a list of things in the UI, and I get back
           | a bunch of UI design examples with those elements.
        
             | spywaregorilla wrote:
             | I'm gonna hazard a guess and say well within the
             | capabilities of a fine tuned model, but that no such fine
             | tuned model exists and the labeled data required to
             | generate it is not really there.
        
             | dmalik wrote:
             | You'd have better luck with an LLM with
             | HTML/JavaScript/CSS.
        
             | astrange wrote:
             | https://www.usegalileo.ai/explore
             | 
             | https://v0.dev
        
             | senseiV wrote:
              | There's a startup doing that named galileo_ai
        
         | Sharlin wrote:
         | Photographs, digital illustrations, comic or cartoon style
         | images, whatever graphical style you can imagine are all easy
         | to achieve with current models (though no single model is a
         | master of all trades). Things that look like technical drawings
         | are as well, but don't expect them to make any sense
          | engineering-wise, unless maybe you train a finetune
         | specifically for that purpose.
        
         | Fervicus wrote:
         | And will the model also pretend that a certain particular race
         | doesn't exist?
        
       | londons_explore wrote:
       | I really wonder what harm would come to the company if they
       | didn't talk about safety?
       | 
        | Would investors stop giving them money? Would users sue,
        | claiming they now had PTSD after looking at all the 'unsafe'
        | outputs? Would regulators step in and make laws banning this
        | 'unsafe' AI?
       | 
       | What is it specifically that company management is worried about?
        
         | brainwipe wrote:
         | All of the above! Additionally... I think AI companies are
          | trying to steer the conversation about safety so that when
          | regulations do come in (and they will), the legal culpability
          | is with the user of the model, not the trainer of it. The
          | business model doesn't work if you're liable for harm
         | caused by your training process - especially if the harm is
         | already covered by existing laws.
         | 
         | One example of that would be if your model was being used to
         | spot criminals in video footage and it turns out that the bias
         | of the model picks one socioeconomic group over another. Most
         | western nations have laws protecting the public against that
         | kind of abuse (albeit they're not applied fairly) and the fines
         | are pretty steep.
        
           | graphe wrote:
            | "AI" has already been used with success to give people
            | loans, and the results were biased. Nothing happened
            | legally to that company.
        
         | dorkwood wrote:
         | They're attempting to guard themselves against incoming
         | regulation. The big players, such as Microsoft, want to squash
         | Stable Diffusion while protecting themselves, and they're going
         | to do it by wielding the "safety is important and only we have
         | the resources to implement it" hammer.
        
           | HeatrayEnjoyer wrote:
            | Safety is a _very_ real concern; it always has been in ML
            | research. I'm tired of this trite "they want a moat"
            | narrative.
            | 
            | I'm glad tech orgs are for once thinking about what they're
            | building before putting out society-warping, democracy-
            | corroding technology instead of "move fast and break
            | things".
        
             | dorkwood wrote:
             | It doesn't strike you as hypocritical that they all talk
             | about safety while continuing to push out tech that's
             | upending multiple industries as we speak? It's tough for me
             | to see it as anything other than lip service.
             | 
             | I'd be on your side if any of them actually chose to keep
             | their technology in the lab instead of tossing it out into
             | the world and gobbling up investment dollars as fast as
             | they could.
        
               | tavavex wrote:
               | How are these two things related at all? When AI
               | companies speak of safety, it's almost always about the
               | "only including data a religious pastor would find safe,
               | and filtering outputs" angle. How's the market and other
               | industries relevant at all? Should AI companies be
               | obligated to care about what happens to other companies?
               | With that point of view, we should've criticized the
               | iPhone for upending the PDA market, or Wacom for
               | "upending" the traditional art market.
        
             | rwmj wrote:
             | That would make sense if it was in the slightest about
             | avoiding "society-warping democracy-corroding technology".
              | Rather than making sure no one ever sees a naked person,
             | which would cause governments to come down on them like a
             | ton of bricks.
        
               | ryandrake wrote:
               | This would be funny if we weren't living it.
               | 
               | Software that promotes the unchecked spread of
               | propaganda, conspiracy theories, hostility, division,
               | institutional mistrust and so on: A-OK.
               | 
               | Software that might show a boob: Totally irresponsible
               | and deserving of harsh regulation.
        
             | atahanacar wrote:
             | Safety from what? Human anatomy?
        
               | bergen wrote:
               | See the recent Taylor Swift scandal. Safety from never
               | ending amounts of deepfake porn and gore for example.
        
               | atahanacar wrote:
               | This isn't a valid concern in my opinion. Photo
               | manipulation has been around for decades. People have
               | been drawing other people for centuries.
               | 
               | Also, where do we draw the line? Should Photoshop stop
               | you from manipulating human body because it could be used
               | for porn? Why stop there, should text editors stop you
               | from writing about sex or describing human body because
               | it could be used for "abuse". Should your comment be
                | removed because it made me imagine Taylor Swift without
               | clothes for a brief moment?
        
               | spencerflem wrote:
               | Doing it effortlessly and instantly makes a difference.
               | 
               | (This applies to all AI discussions)
        
               | bergen wrote:
                | No, but AI requires zero learning curve and can be
                | automated. I can't spit out 10 images of Tay per second
                | in Photoshop. If I want, and the API delivers, I can
                | easily do that with AI. (Granted, coding this yourself
                | involves a learning curve, but in principle, with the
                | right interface (and they exist), I can churn out
                | hundreds of images without actively putting work in.)
        
               | tavavex wrote:
               | I've never understood the argument about image generators
               | being (relatively) fast. Does that mean that if you could
               | Photoshop 10 images per second, we should've started
               | clamping down on Photoshop? What exact speed is the
               | cutoff mark here? Given that Photoshop is updated every
               | year and includes more and more tools that can accelerate
                | your workflow (incl. AI-assisted ones), is there going
                | to be a point when it gets too fast?
               | 
               | I don't know much about the initial scandal, but I was
               | under the impression that there was only a small number
               | of those images, yet that didn't change the situation. I
               | just fail to see how quantity factors into anything here.
        
               | bergen wrote:
               | >I just fail to see how quantity factors into anything
               | here.
               | 
               | Because you can overload any online discussion / sphere
               | with that. There were so many that X effectively banned
                | searching for her at all, because if you did, you were
                | overwhelmed by very extreme fake porn. Everybody can do
                | it with a very low entry barrier, it looks very
                | believable, and it can be generated in high quantities.
               | 
                | We shouldn't have clamped down on Photoshop, but
                | realistically two things would be nice in your
                | theoretical case: usage restrictions and public
                | information building. There was no clear-cut point where
                | Photoshop was so mighty you couldn't trust any picture
                | online. There were skills to be learned and people could
                | identify the trickery, and it was on a very small scale
                | and gradual. And photo trickery has been around for
                | ages; even Stalin did it.
               | 
               | But creating photorealistic fakes in an automated fashion
               | is completely new.
        
               | tavavex wrote:
               | But when we talk about specifically harming one person,
               | does it really matter if it's a thousand different
               | generations of the same thing or 10 generations that were
               | copied thousands of times? It is a technology that lowers
               | the bar for generating believable-looking things, but I
               | don't know if it's the speed that is the main culprit
               | here.
               | 
               | And in fairness to generative AI, even nowadays it feels
               | like getting to a point of true photorealism takes some
               | effort, especially if the goal is letting it just run
               | nonstop with no further curation. And getting a local
               | image generator to run at all on your computer (and
               | having the hardware for it) is also a bar that plenty of
               | people can't clear yet. Photoshop is kind of different in
               | that making more believable things requires a lot more
               | time, effort and knowledge - but the idea that any image
               | online can be faked has already been ingrained in the
               | public consciousness for a very long time.
        
               | spencerflem wrote:
               | Yes, if you could Photoshop 10/sec it would be a problem.
               | 
                | Think of it this way: if one out of every ten phone
                | calls you get is spam, you still have a pretty usable
                | phone. Make that three orders of magnitude more spam,
                | so only 1 out of every 100 calls is real, and the
                | system totally breaks down.
                | 
                | Generative AI makes generating realistic-looking fakes
                | ~1000x easier; it's the one thing it's best at.
        
               | kristopolous wrote:
               | That's fine. But the question was what are they referring
               | to and that's the answer.
        
               | chasd00 wrote:
               | > See the recent Taylor Swift scandal
               | 
               | but that's not dangerous. It's definitely worthy of
               | unlocking the cages of the attack lawyers but it's not
               | dangerous. The word "safety" is being used by big tech to
                | trigger and gaslight society.
        
               | shrimp_emoji wrote:
               | I.e., controlling through fear
        
             | jquery wrote:
             | To the extent these models don't blindly regurgitate hate
             | speech, I appreciate that. But what I do not appreciate is
             | when they won't render a human nipple or other human
             | anatomy. That's not safety, and calling it such is
             | gaslighting.
        
           | ballenf wrote:
           | AI/ML/GPT/etc are looking increasingly like other media
           | formats -- a source of mass market content.
           | 
           | The safety discussion is proceeding very much like it did for
           | movies, music, and video games.
        
         | bitcurious wrote:
         | The latter; there is already an executive order around AI
         | safety. If you don't address it out loud you'll draw attention
         | to yourself.
         | 
         | https://www.whitehouse.gov/briefing-room/presidential-action...
        
         | memossy wrote:
          | As the leader in open image models, it is incumbent upon us,
          | as the models get to this level of quality, to take seriously
          | how we can release open and safe models, given legal,
          | societal and other considerations.
         | 
         | Not engaging in this will indeed lead to bad laws, sanctions
         | and more as well as not fulfilling our societal obligations of
         | ensuring this amazing technology is used for as positive
         | outcomes as possible.
         | 
         | Stability AI was set up to build benchmark open models of all
         | types in a proper way, this is why for example we are one of
         | the only companies to offer opt out of datasets (stable cascade
         | and SD3 are opted out), have given millions of supercompute
         | hours in grants to safety related research and more.
         | 
         | Smaller players with less uptake and scrutiny don't need to
          | worry so much about some of these complex issues; it is quite
          | a lot to keep on top of. Doing our best.
        
           | GenerWork wrote:
            | >it is incumbent upon us, as the models get to this level
            | of quality, to take seriously how we can release open and
            | safe models, given legal, societal and other considerations.
           | 
           | Can you define what you mean by "societal and other
           | considerations"? If not, why not?
        
             | memossy wrote:
             | I could but I won't as legal stuff :)
        
           | zmgsabst wrote:
           | "We need to enforce our morality on you, for our beliefs are
           | the true ones -- and you're unsafe for questioning them!"
           | 
           | You sound like many authoritarian regimes.
        
             | memossy wrote:
             | I mean open models yo
        
         | shapefrog wrote:
         | > What is it specifically that company management is worried
         | about?
         | 
         | As with all hype techs, even the most talented management are
         | barely literate in the product. When talking about their new
         | trillion $ product they must take their talking points from the
         | established literature and "fake it till they make it".
         | 
         | If the other big players say "billions of parameters" you chuck
         | in as many as you can. If the buzz words are "tokens" you say
         | we have lots of tokens. If the buzz words are "safety" you say
         | we are super safe. You say them all and hope against hope that
         | nobody asks a simple question you are not equipped to answer
          | that will show you don't actually know what you are talking
         | about.
        
         | chasd00 wrote:
          | They risk reputational harm and, since there are so many
          | alternatives, outright "brand cancellation". For example, vocal
         | groups can lobby payment processors to deny service to any AI
         | provider deemed unworthy. Ironic that tech enabled all of that
         | behavior to begin with and now they're worried about it turning
         | on them.
        
           | tavavex wrote:
           | What viable alternatives are there to Stable Diffusion? As
           | far as I know, it's _the_ only way to run good image
            | generation locally, and that's probably a big consideration
           | for any business dabbling in it.
        
             | astrange wrote:
             | It's not the only open image model. It is the best one, but
             | it's not the only one.
        
               | tavavex wrote:
               | Yeah, the word "good" is doing the heavy lifting here -
               | while it's not the only one that can do it, it has a very
               | comfortable lead over all alternatives.
        
         | renewiltord wrote:
          | It's a bit rich when HN itself is chock-full of camp
         | followers who pick the most mainstream opinion. Previously it
         | was AI danger, then it became hallucinations, now it's that
         | safety is too much.
         | 
         | The rest of the world is also like that. You can make a thing
         | that hurts your existing business. Spinning off the brand is
         | probably Google's best bet.
        
         | summerlight wrote:
         | Likely public condemnation followed by unreasonable regulations
         | when populists see their campaign opportunities. We've
         | historically seen this when new types of media (e.g. TV,
         | computer games) debut and there are real, early signals of such
         | actions.
         | 
         | I don't think those companies being cautious is necessarily a
         | bad thing even for AI enthusiasts. Open source models will
         | quickly catch up without any censorship while most of those
         | public attacks are concentrated into those high profile
         | companies, which have established some defenses. That would be
         | a much cheaper price than living with some unreasonable degree
         | of regulations over decades, driven by populist politicians.
        
         | bluescrn wrote:
         | It's an election year.
         | 
         | They're probably more concerned about generated images of
          | politicians in 'interesting' situations going viral than they
         | are about porn/gore etc.
        
           | astrange wrote:
           | Stability is not an American company.
        
       | inference-lord wrote:
       | Cool but it's hard to keep getting "blown away" at this stage.
       | The "incredible" is routine now.
        
         | danparsonson wrote:
         | So... they should just stop?
        
         | dougmwne wrote:
         | At this point, the next thing that will blow me away is AGI at
         | human expert level or a Gaussian Splat diffusion model that can
         | build any arbitrary 3D scene from text or a single image. High
         | bar, but the technology world is already full of dark magic.
        
           | attilakun wrote:
            | Is there a Gaussian splat model that works without the
           | "Structure from Motion" step to extract the point cloud? That
           | feels a bit unsatisfying to me.
        
           | inference-lord wrote:
           | Will ask it for immortality, endless wealth, and still get
           | bored.
        
           | consumer451 wrote:
           | I would be a big fan of solid infographics or presentation
           | slides. That would be very useful.
        
       | JonathanFly wrote:
       | From: https://twitter.com/EMostaque/status/1760660709308846135
       | 
       | Some notes:
       | 
       | - This uses a new type of diffusion transformer (similar to Sora)
       | combined with flow matching and other improvements.
       | 
       | - This takes advantage of transformer improvements & can not only
       | scale further but accept multimodal inputs..
       | 
       | - Will be released open, the preview is to improve its quality &
       | safety just like og stable diffusion
       | 
       | - It will launch with full ecosystem of tools
       | 
       | - It's a new base taking advantage of latest hardware & comes in
       | all sizes
       | 
       | - Enables video, 3D & more..
       | 
       | - Need moar GPUs..
       | 
       | - More technical details soon
       | 
       | >Can we create videos similar like sora
       | 
       | Given enough GPUs and good data yes.
       | 
       | >How does it perform on 3090, 4090 or less? Are us mere mortals
       | gonna be able to have fun with it ?
       | 
       | Its in sizes from 800m to 8b parameters now, will be all sizes
       | for all sorts of edge to giant GPU deployment.
       | 
       | (adding some later replies)
       | 
       | >awesome. I assume these aren't heavily cherry picked seeds?
       | 
       | No this is all one generation. With DPO, refinement, further
       | improvement should get better.
       | 
       | >Do you have any solves coming for driving coherency and
       | consistency across image generations? For example, putting the
       | same dog in another scene?
       | 
       | yeah see @Scenario_gg's great work with IP adapters for example.
       | Our team builds ComfyUI so you can expect some really great stuff
       | around this...
       | 
       | >Dall-e often doesn't even understand negation, let alone complex
       | spatial relations in combination with color assignments to
       | objects.
       | 
       | Imagine the new version will. DALLE and MJ are also pipelines,
       | you can pretty much do anything accurately with pipelines now.
       | 
       | >Nice. Is it an open-source / open-parameters / open-data model?
       | 
       | Like prior SD models it will be open source/parameters after the
       | feedback and improvement phase. We are open data for our LMs but
       | not other modalities.
       | 
       | >Cool!!! What do you mean by good data? Can it directly output
       | videos?
       | 
       | If we trained it on video yes, it is very much like the arch of
       | sora.
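        | 
        | For the flow matching part: as I understand it, the network is
        | trained to predict the velocity that carries noise to data
        | along a straight line, rather than predicting the noise
        | itself. A toy sketch of the training loss (illustrative
        | PyTorch, not SD3's actual code):
        | 
        |   import torch
        |   import torch.nn.functional as F
        | 
        |   def flow_matching_loss(model, x1):
        |       # x1: batch of clean latents; x0: pure noise
        |       x0 = torch.randn_like(x1)
        |       t = torch.rand(x1.shape[0], 1, 1, 1)  # time in [0, 1]
        |       xt = (1 - t) * x0 + t * x1            # straight path
        |       target = x1 - x0                      # velocity
        |       pred = model(xt, t.flatten())
        |       return F.mse_loss(pred, target)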
        
         | cheald wrote:
         | SD 1.5 is 983m parameters, SDXL is 3.5b, for reference.
         | 
          | Very interesting. I've been stretching my 12GB 3060 as far as I
         | can; it's exciting that smaller hardware is still usable even
         | with modern improvements.
        
           | memossy wrote:
           | 800m is good for mobile, 8b for graphics cards.
           | 
           | Bigger than that is also possible, not saturated yet but need
           | more GPUs.
        
             | vorticalbox wrote:
              | You can also use quantisation, which lowers memory
              | requirements at a small loss of performance.
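              | 
              | E.g. in PyTorch, dynamic quantization drops linear-layer
              | weights to int8 (a toy example; quantizing a full
              | diffusion model well takes more care):
              | 
              |   import torch
              |   import torch.nn as nn
              | 
              |   model = nn.Sequential(nn.Linear(1024, 1024),
              |                         nn.ReLU(),
              |                         nn.Linear(1024, 1024))
              |   # int8 weights for Linear layers; activations
              |   # stay in float
              |   qmodel = torch.quantization.quantize_dynamic(
              |       model, {nn.Linear}, dtype=torch.qint8)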
        
           | liuliu wrote:
            | I am going to look at quantization for 8b. But also, these
            | are transformers, so a variety of merging / Frankenstein-
            | tuning is possible. For example, you can use the 8b model
            | to populate the KV cache (which computes once, so it can
            | load from slower devices, such as RAM / SSD) and use the
            | 800M model for diffusion by replicating weights to match
            | layers of the 8b model.
        
           | ttul wrote:
           | Stability has to make money somehow. By releasing an 8B
           | parameter model, they're encouraging people to use their paid
           | API for inference. It's not a terrible business decision. And
           | hobbyists can play with the smaller models, which with some
           | refining will probably be just fine for most non-professional
           | use cases.
        
             | jandrese wrote:
             | I would LOL if they released the "safe" model for free but
             | made you pay for the one with boobs.
        
               | ttul wrote:
               | Oh they'll never let you pay for porn generation. But
               | they will happily entertain having you pay for quality
               | commercial images that are basically a replacement for
               | the entire graphic design industry.
        
             | teaearlgraycold wrote:
             | Don't people quantize SD down to 8 bits? I understand
             | plenty of people don't have 8GB of VRAM (and I suppose you
             | need some extra for supplemental data, so maybe 10GB?). But
             | that's still well within the realm of consumer hardware
             | capabilities.
        
               | ttul wrote:
               | I'm the wrong person to ask, but it seems Stability
               | intends to offer models from 800M to 8B parameters in
               | size, which offers something for everyone.
        
         | netdur wrote:
         | > - Need moar GPUs..
         | 
         | Why is there not a greater focus on quantization to optimize
         | model performance, given the evident need for more GPU
         | resources?
        
           | supermatt wrote:
           | I believe he means for training
        
           | memossy wrote:
           | We have highly efficient models for inference and a
           | quantization team.
           | 
            | Need moar GPUs to do a video version of this model, similar
            | to Sora, now that they have proved that Diffusion
            | Transformers can scale with latent patches (see
            | stablevideo.com and our work on that model, currently the
            | best open video model).
           | 
           | We have 1/100th of the resources of OpenAI and 1/1000th of
           | Google etc.
           | 
           | So we focus on great algorithms and community.
           | 
           | But now we need those GPUs.
        
             | sylware wrote:
             | Don't fall for it: OpenAI is microsoft. They have as much
             | as google, if not more.
        
               | px43 wrote:
               | To be clear here, you think that Microsoft has more AI
               | compute than Google?
        
               | Jensson wrote:
               | Google got cheap TPU chips, means they circumvent the
               | extremely expensive Nvidia corporate licenses. I can
               | easily see them having 10x the resources of OpenAI for
               | this.
        
               | SV_BubbleTime wrote:
               | This isn't OpenAI that make GPTx.
               | 
               | It's StabilityAI that makes Stable Diffusion X.
        
               | pavon wrote:
               | Yes, they have deep pockets and could increase investment
               | if needed. But the actual resources devoted today are
                | public, and in line with what the parent said.
        
             | Solvency wrote:
              | Can someone explain why Nvidia doesn't just build its own
              | AI, and literally devote 50% of its production to its own
              | compute center? In an age where even ancient companies
             | like Cisco are getting in the AI race, why wouldn't the
             | people with the keys to the kingdom get involved?
        
               | downWidOutaFite wrote:
               | 1. the real keys to the kingdom are held by TSMC whose
               | fab capacity rules the advanced chips we all get, from
               | NVIDIA to Apple to AMD to even Intel these days.
               | 
               | 2. the old advice is to sell shovels during a gold rush
        
               | chompychop wrote:
               | "The people that made the most money in the gold rush
               | were selling shovels, not digging gold".
        
               | swamp40 wrote:
               | Jensen was just talking about a new kind of data center:
               | AI-generation factories.
        
               | blihp wrote:
               | Because history has shown that the money is in selling
               | the picks and shovels, not operating the mine. (At least
               | for now. There very well may come a point later on when
               | operating the mine makes more sense, but not until it's
               | clear where the most profitable spot will be)
        
               | declaredapple wrote:
               | They've been very happy selling shovels at a steep margin
               | to literally endless customers.
               | 
                | The reason is that they instantly get a risk-free,
                | guaranteed, VERY healthy margin on every card they sell,
                | and there are endless customers lined up for them.
                | 
                | If they kept the cards, they'd give up the opportunity
                | to make those margins, and instead take the risk that
                | they'll develop a money-generating service (one that
                | makes more money than selling the cards).
               | 
               | This way there's no risk of: A competitor out competing
               | them, not successfully developing a profitable product,
               | "the ai bubble popping", stagnating development, etc.
               | 
               | There's also the advantage that this capital has allowed
               | them to buy up most of TSMC's production capacity, which
               | limits the competitors like Google's TPUs.
        
           | AnthonyMouse wrote:
           | > Why is there not a greater focus on quantization to
           | optimize model performance, given the evident need for more
           | GPU resources?
           | 
           | There is an inherent trade off between model size and
           | quality. Quantization reduces model size at the expense of
           | quality. Sometimes it's a better way to do that than reducing
           | the number of parameters, but it's still fundamentally the
           | same trade off. You can't make the highest quality model use
           | the smallest amount of memory. It's information theory, not
           | sorcery.
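            | 
            | The trade off is easy to put numbers on. Back of the
            | envelope, counting weights only and ignoring activations
            | (assuming the rumored 8b top-end size):
            | 
            |   params = 8e9
            |   for fmt, b in [("fp32", 4), ("fp16", 2),
            |                  ("int8", 1), ("int4", 0.5)]:
            |       print(f"{fmt}: {params * b / 2**30:.1f} GiB")
            |   # fp32: 29.8, fp16: 14.9, int8: 7.5, int4: 3.7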
        
         | sandworm101 wrote:
         | >> all sorts of edge to giant GPU deployment.
         | 
         | Soon the GPU and its associated memory will be on different
         | cards, as once happened with CPUs. The day of the GPU with ram
         | _slots_ is fast approaching. We will soon plug terabytes of ram
          | into our 4090s, then plug a half-dozen 4090s into a Raspberry
          | Pi to create a Cronenberg rendering monster. Can it generate
         | movies faster than Pixar can write them? Sure. Can it play
         | Factorio? Heck no.
        
           | jsheard wrote:
            | Any separation of a GPU from its VRAM is going to come at the
           | expense of (a lot of) bandwidth. VRAM is only as fast as it
           | is because the memory chips are as close as possible to the
            | GPU, either on separate packages immediately next to the GPU
           | package or integrated onto the same package as the GPU itself
           | in the fanciest stuff.
           | 
           | If you don't care about bandwidth you can already have a GPU
           | access terabytes of memory across the PCIe bus, but it's too
           | slow to be useful for basically anything. Best case you're
           | getting 64GB/sec over PCIe 5.0 x16, when VRAM is reaching
            | _3.3TB/sec_ on the highest-end hardware and even mid-range
           | consumer cards are doing >500GB/sec.
           | 
           | Things are headed the other way if anything, Apple and Intel
           | are integrating RAM onto the CPU package for better
           | performance than is possible with socketed RAM.
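            | 
            | The gap is easy to check against the published specs
            | (rough peak numbers):
            | 
            |   # PCIe 5.0: 32 GT/s/lane, 128b/130b encoding, x16
            |   pcie5_x16 = 32e9 * 16 * (128 / 130) / 8  # ~63 GB/s
            |   # RTX 4090: 21 GT/s GDDR6X on a 384-bit bus
            |   rtx4090 = 21e9 * 384 / 8                 # ~1008 GB/s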
        
             | mysterydip wrote:
             | Is there a way to partition the data so that a given GPU
             | had access to all the data it needs but the job itself was
             | parallelized over multiple GPUs?
             | 
              | Thinking of the classic neural network, for example: each
              | column of nodes would only need to talk to the next
              | column.
             | You could group several columns per GPU and then each would
             | process its own set of nodes. While an individual job would
             | be slower, you could run multiple tasks in parallel,
             | processing new inputs after each set of nodes is finished.
        
               | zettabomb wrote:
               | Of course, this is common with LLMs which are too large
               | to fit in any single GPU. I believe Deepspeed implements
               | what you're referring to.
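                | 
                | The naive version of that split is only a few lines
                | (a sketch assuming PyTorch and two visible CUDA
                | devices; real frameworks add micro-batching to keep
                | both GPUs busy):
                | 
                |   import torch.nn as nn
                | 
                |   stage1 = nn.Sequential(nn.Linear(4096, 4096),
                |                          nn.ReLU()).to("cuda:0")
                |   stage2 = nn.Sequential(nn.Linear(4096, 4096),
                |                          nn.ReLU()).to("cuda:1")
                | 
                |   def forward(x):
                |       h = stage1(x.to("cuda:0"))
                |       # only activations cross GPUs
                |       return stage2(h.to("cuda:1"))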
        
             | sandworm101 wrote:
             | That depends on whether performance or capacity is the
              | goal. Smaller amounts of RAM closer to the processing unit
              | make for faster computation, but AI also presents a
              | capacity issue. If the workload needs the space, having a
              | boatload of less-fast RAM is still preferable to
              | offloading data to something more stable like flash. That
              | is where bulk memory modules connected through slots may
              | one day appear on GPUs.
        
               | duffyjp wrote:
               | I'm having flashbacks to owning a Matrox Millenium as a
               | kid. I never did get that 4MB vram upgrade.
               | 
               | https://www.512bit.net/matrox/matrox_millenium.html
        
           | ltbarcly3 wrote:
           | I don't think you really understand the current trends in
            | computer architecture. Even CPUs are being moved to have
            | on-package RAM for higher bandwidth. Everything is the
            | opposite of what you said.
        
             | sandworm101 wrote:
             | Higher bandwidth but lower capacity. The real trend is
             | different physical architectures for different compute
              | loads. There is a place in AI for bulk, albeit slower,
              | memory, such as extremely large data sets that want to run
              | internally on a discrete card without involving PCIe lanes.
        
           | zettabomb wrote:
           | I doubt it. The latest GPUs utilize HBM which is necessarily
           | part of the same package as the main die. If you had a RAM
           | slot for a GPU you might as well just go out to system RAM,
           | way too much latency to be useful.
        
             | AnthonyMouse wrote:
             | It isn't the latency which is the problem, it's the
             | bandwidth. A memory socket with that much bandwidth would
             | need a lot of pins. In principle you could just have more
             | memory slots where each slot has its own channel. 16
             | channels of DDR5-8000 would have more bandwidth than the
             | RTX 4090. But an ordinary desktop board with 16 memory
             | channels is probably not happening. You could plausibly see
             | that on servers however.
             | 
             | What's more likely is hybrid systems. Your basic desktop
             | CPU gets e.g. 8GB of HBM, but then also has 16GB of DRAM in
             | slots. Another CPU/APU model that fits into the same socket
             | has 32GB of HBM (and so costs more), which you could then
             | combine with 128GB of DRAM. Or none, by leaving the slots
             | empty, if you want entirely HBM. A server or HEDT CPU might
             | have 256GB of HBM and support 4TB of DRAM.
        
               | brookst wrote:
                | Agree, this is the likely future. It's really just an
                | extension of the existing tiered CPU cache model.
        
         | VikingCoder wrote:
         | I'm curious - where are the GPUs with decent processing power
         | but enormous memory? Seems like there'd be a big market for
         | them.
        
           | wongarsu wrote:
           | Nvidia is making way too much money keeping cards with lots
           | of memory exclusive to server GPUs they sell with insanely
           | high margins.
           | 
            | AMD still suffers from limited resources and doesn't seem
            | willing to spend too much chasing a market that might just
            | be temporary hype. Google's TPUs are a pain to use and seem
            | to have stalled out. And Intel lacks commitment; even their
            | products that went roughly in that direction aren't a great
            | match for neural networks because of their philosophy of
            | having fewer, more complex cores.
        
           | p1esk wrote:
           | H200 has 141GB, B100 (out next month) will probably have even
           | more. How much memory do you need?
        
             | holoduke wrote:
              | We need 128GB with a 4070 chip for about 2000 dollars.
              | That's what we want.
        
               | FeepingCreature wrote:
               | Yes please.
        
               | duffyjp wrote:
               | I've never tried it, but in Windows you can have CUDA
               | apps fall back to system ram when GPU vram is exhausted.
               | You could slap 128gb in your rig with a 4070. I'm sure
               | performance falls off a cliff, but if it's the difference
               | between possible and impossible that might be acceptable.
               | 
               | https://nvidia.custhelp.com/app/answers/detail/a_id/5490/
               | ~/s...
        
               | ta_1138 wrote:
               | Unfortunately production capacity for that is limited,
               | and with sufficient demand, all pricing is an auction.
                | Therefore, we aren't going to be seeing that card for
                | years.
        
               | qwertox wrote:
               | Please give me some DIMM slots on the GPU so that I can
               | choose my own memory like I'm used to from the CPU-world
               | and which I can re-use when I upgrade my GPU.
        
               | ttul wrote:
               | Nvidia will not build that any time soon. RAM is the
               | dividing line between charging $40,000 vs $2500...
        
           | SV_BubbleTime wrote:
            | I'll bet you the Nvidia 50xx series will have cards that are
           | asymmetric for this reason. But nothing that will cannibalize
           | their gaming market.
           | 
           | You'll be able to get higher resolution but slowly. Or pay
           | the $2800 for a 5090 and get high res with good speed.
        
           | ls612 wrote:
           | MacBooks with M2 or M3 Max. I'm serious. They perform like a
           | 2070 or 2080 but have up to 128GB of unified memory, most of
           | which can be used as VRAM.
        
             | declaredapple wrote:
             | How many tokens/s are we talking for a 70B model?
             | 
              | Last I saw they performed really poorly, like low single-
              | digit t/s. Don't get me wrong, they're probably a decent
              | value for experimenting with it, but that's flat-out
              | pathetic compared to an A100 or H100. And I think they're
              | useless for training?
        
               | smcleod wrote:
               | You can run a 180B model like Falcon Q4 around 4-5tk/s, a
               | 120B model like Goliath Q4 at around 6-10tk/s, and 70B Q4
               | around 8-12tk/s and smaller models much quicker, but it
               | really depends on the context size, model architecture
                | and other settings. An A100 or H100 is obviously going to
               | be a lot faster but it costs significantly more taking
               | its supporting requirements into account and can't be run
               | on a light, battery powered laptop etc...
        
             | ttul wrote:
             | MPS is promising and the memory bandwidth is definitely
             | there, but stable diffusion performance on Apple Silicon
             | remains terribly poor compared with consumer Nvidia cards
             | (in my humble opinion). Perhaps this is partly because so
             | many bits of the SD ecosystem are tied to Nvidia
             | primitives.
        
           | iosjunkie wrote:
           | I dream of AMD or Intel creating cards to do just that
        
           | pbhjpbhj wrote:
           | Nvidia have a system for DMA from GPU to system memory,
            | GPUDirect. That seems like a potentially better route if
           | latency can be handled well.
        
             | nick238 wrote:
              | GPU memory is all about bandwidth, not latency. DDR5
              | can do 4-8 GT/s over a 64-bit bus per DIMM, maxing out
              | around 128 GB/s with a dual memory controller and
              | 512 GB/s with 8x memory controllers on server chips.
              | GDDR6X in the 4090 runs at more than twice the
              | frequency over a bus ~6x as wide, so you get an
              | order-of-magnitude bump in throughput: nearly 1 TB/s
              | on a consumer product. Datacenter GPUs (e.g. A100)
              | with HBM2e double that to 2 TB/s.
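              | 
              | The arithmetic, as a quick Python sanity check
              | (illustrative round numbers, not exact SKU specs):
              | 
              |   # GB/s = GT/s * bus width in bytes
              |   ddr5_dimm = 8 * (64 / 8)    # ~64 GB/s per DIMM
              |   dual_ch = 2 * ddr5_dimm     # ~128 GB/s desktop
              |   server = 8 * ddr5_dimm      # ~512 GB/s server
              |   rtx4090 = 21 * (384 / 8)    # ~1008 GB/s GDDR6X
              |   print(ddr5_dimm, dual_ch, server, rtx4090)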
        
         | albertzeyer wrote:
         | I understand that Sora is very popular, so it makes sense to
         | refer to it, but when saying it is similar to Sora, I guess it
         | actually makes more sense to say that it uses a Diffusion
         | Transformer (DiT) (https://arxiv.org/abs/2212.09748) like Sora.
         | We don't really know more details on Sora, while the original
         | DiT has all the details.
        
           | tithe wrote:
           | Is anyone else struck by the similarities in textures between
           | the images in the appendix of the above "Scalable Diffusion
           | Models with Transformers" paper?
           | 
            | If you size the browser window right and page with the
            | arrow keys (so the document doesn't scroll), you'll see
            | (e.g., pages 20-21) that the textures of the parrot's
            | feathers are almost identical to the textures of the bark
            | on the tree behind the panda bear, and the forest behind
            | the red panda is very similar to the undersea environment.
           | 
           | Even if I'm misunderstanding something fundamental here about
           | this technique, I still find this interesting!
        
             | jachee wrote:
             | Could be that they're all generated from the same seed. And
             | we humans are _really_ good at spotting patterns like that.
        
         | cchance wrote:
        | So is this "SDXL safe" or "SD2.1 safe"? SDXL-safe we can
        | deal with; if it's 2.1-safe, it's going to end up DOA for a
        | large part of the open-source community again.
        
           | astrange wrote:
           | SD2.1 was not "overly safe", SD2.0 was because of a training
           | bug.
           | 
           | 2.1 didn't have adoption because people didn't want to deal
           | with the open replacement for CLIP. Or possibly because
           | everyone confused 2.0 and 2.1.
        
         | samstave wrote:
         | >> _> How does it perform on 3090, 4090 or less? Are us mere
         | mortals gonna be able to have fun with it ?_
         | 
         | >>> _Its in sizes from 800m to 8b parameters now, will be all
         | sizes for all sorts of edge to giant GPU deployment._
         | 
         | --
         | 
          | Can you fragment responses such that if an edge device
          | (mobile app) is prompted for [thing], it can pass tokens
          | upstream on the prompt -- effectively torrenting responses
          | -- and you could push actual GPU edge devices in certain
          | climates... like dense cities that are expected to account
          | for a ton of GPU-cycle consumption around the edge?
          | 
          | So you have tiered processing (speed is handled locally,
          | quality level 1 can take some edge GPU, and corporate shit
          | can be handled in the cloud)...
         | 
         | ----
         | 
         | Can you fragment and torrent a response?
         | 
         | If so, how is that request torn up and routed to appropriate
         | resources?
         | 
          | BOFH me if this is a stupid question (but it's valid, given
          | how quickly we are evolving toward AI being intrinsic to
          | our society).
        
       | coldcode wrote:
       | No details in the announcement, is it still pixel size in = pixel
       | size out?
        
       | spywaregorilla wrote:
       | Impressive text in the images.
        
       | deepsdev wrote:
        | Can we use it to create Sora-like videos?
        
         | memossy wrote:
          | If we trained it with videos, yes, but we'd need more GPUs
          | for that.
        
         | nickthegreek wrote:
         | No.
        
       | btbuildem wrote:
       | That's nice, but could we please have an unsafe alternative? I
       | would like to footgun both my legs off, thank you.
        
         | dougmwne wrote:
         | Since these are open models, people can fine tune them to do
         | anything.
        
           | politician wrote:
           | It's not obvious that fine-tuning can remove all latent
           | compulsions from these models. Consider that the creators
           | know that fine-tuning exists and have vastly more resources
           | to explore the feasibility of removing deep bias using this
           | method.
        
             | dougmwne wrote:
             | Go check out the Unstable Diffusion Discord.
        
               | SV_BubbleTime wrote:
               | The vast majority of images there are SD1.5, even the
               | ones made today.
               | 
               | Which goes far more towards the idea that safety isn't a
               | desirable feature to a lot of AI users.
        
           | ttul wrote:
           | I suppose you could train a model from scratch if you have
           | enough money to blow...
        
         | wokwokwok wrote:
         | How would that be meaningfully different to SDXL?
         | 
          | I mean, SDXL is great. Until you've had a chance to
          | _actually use_ this model, calling it out for _some
          | imagined offence_ that may or may not exist seems like
          | drinking Kool-aid rather than responding to something based
          | in concrete _actual reality_.
         | 
         | You get access to it... and it does the google thing and puts
         | people of colour in every frame? Sure, complain away.
         | 
         | You get access to it, you can't even generate pictures of
         | girls? Sure. Burn the house down.
         | 
         | ...you haven't even _seen it_ and you're _already bitching_
         | about it?
         | 
         | Come on... give them a chance. Judge what it _is_ when _you see
         | it_ not _what you imagine it is_ before you've even had a
         | chance to try it out...
         | 
         | Lots of models, free, multiple sizes, hot damn. This is cool
         | stuff. Be a bit grateful for the work they're doing.
         | 
          | ...and even if it sucks, it's open. If it's not what you
          | want, you can retune it.
        
         | viraptor wrote:
          | Just wait some time. People release SD LoRAs all the time.
          | Once SD3 is open, you'll be able to get a patched model in
          | days/weeks.
        
           | SV_BubbleTime wrote:
            | A blogger I follow had an article explaining that the NSFW
            | models for SDXL are just now SORT OF coming up to the
            | quality of SD1.5 "pre-safety" models.
            | 
            | It's been 6 months and it still isn't there. SD3 is going
            | to take quite a while if they're baking "safety" in even
            | harder.
        
             | viraptor wrote:
              | 1.5 is still more popular than XL and 2 for reasons
             | unrelated to safety. The size and generation speed matter a
             | lot. This is just a matter of practical usability, not some
             | idea of the model being locked down. Feed it enough porn
             | and you'll get porn out of it. If people have incentive to
             | do that (better results than 1.5), it really will happen
             | within days.
        
             | Der_Einzige wrote:
             | Due to the pony community the SDXL nsfw models are far
             | superior to SD1.5. Only issue is that controlnets don't
             | work with that pony SDXL fine tune
        
               | SV_BubbleTime wrote:
               | I am slightly aware of the pony models.
               | 
                | I wish I had something more clever to comment on it.
                | I know what they're doing, which is cool, and why,
                | which is, IDK, live and let live and enjoy your own
                | kink. It's just a little funny that some of the most
                | work put into fine-tuning models... is from the pony
                | community.
               | 
               | So all I have is...
               | 
               | :/
        
         | Fervicus wrote:
         | Nope, sorry. We can't allow you to commit thought crimes.
        
       | wtcactus wrote:
       | I notice they are avoiding images of people in the announcement.
       | 
       | I wonder if they are afraid of the same debacle as google AI and
       | what they mean by "safety" is actually heavy bias against white
       | people and their culture like what happened with Gemini.
        
         | danielbln wrote:
         | What's white people culture?
        
           | potwinkle wrote:
           | From the examples I see on Twitter, they are usually
           | referring to the different cultures of Irish, European, and
           | American white people. Gemini, in an effort to reverse the
           | bias that the models would naturally have, ends up replacing
           | these people with those from other cultures.
        
             | astrange wrote:
             | Calling Irish people white is a rather historically radical
             | statement.
        
               | sealeck wrote:
               | White is a pretty complex and non-obvious category.
        
           | 7moritz7 wrote:
            | US American white people. Anything else would be a
            | ridiculous overgeneralization, like "Asian culture": even
            | if you set some arbitrary benchmark for skin tone and
            | only look at those European countries, it's still too
            | much diversity to pool together.
        
           | t0lo wrote:
            | A little continent called Europe?
        
       | cuckatoo wrote:
       | NSFW fine tune when? Or will "safety" win this time?
        
         | SXX wrote:
          | They need to release the model first. Then it will be
          | fine-tuned.
        
       | redder23 wrote:
        | Horrible website; it hijacks scrolling. I have my scrolling
        | speed turned up with Chromium Wheel Smooth Scroller, but this
        | website's scrolling is extremely slow and the extension
        | doesn't work, because they are "doing it wrong" TM and
        | somehow hijack native scrolling and do something with it.
        
       | pama wrote:
       | I wish they put out the report already. Has anyone else published
       | a preprint combining ideas similar to diffusion transformers and
       | flow matching?
        
         | lairv wrote:
          | Pretty exciting indeed to see they used flow matching,
          | which has been unpopular for the last few years.
        
           | memossy wrote:
           | It'll be out soon, doing benchmark tests etc
        
             | pama wrote:
             | Thanks.
        
       | subzel0 wrote:
       | "Photo of a red sphere on top of a blue cube. Behind them is a
       | green triangle, on the right is a dog, on the left is a cat"
       | 
       | https://pbs.twimg.com/media/GG8mm5va4AA_5PJ?format=jpg&name=...
        
         | Workaccount2 wrote:
          | Not bad. I'm curious what the output would be if you asked
          | for a mirrored sphere instead.
        
           | svenmakes wrote:
           | This is actually the approach of one paper to estimate
           | lighting conditions. Their strategy is to paint a mirrored
           | sphere onto an existing image:
           | https://diffusionlight.github.io/
        
         | jetrink wrote:
         | One thing that jumps out to me is that the white fur on the
         | animals has a strong green tint due to the reflected light from
         | the green surfaces. I wonder if the model learned this effect
         | from behind the scenes photos of green screen film sets.
        
           | diggan wrote:
           | It's just diffuse irradiance, visible in most real (and CGI)
           | pictures although not as obvious as that example. Seems like
           | a typical demo scene for a 3D renderer, so I bet that's why
           | it's so prominent.
        
           | zero_iq wrote:
           | The models do a pretty good job at rendering plausible global
           | illumination, radiosity, reflections, caustics, etc. in a
           | whole bunch of scenarios. It's not necessarily physically
           | accurate (usually not in fact), but usually good enough to
           | trick the human brain unless you start paying very close
           | attention to details, angles, etc.
           | 
           | This fascinated me when SD was first released, so I tested a
           | whole bunch of scenarios. While it's quite easy to find
           | situations that don't provide accurate results and produce
           | all manner of glitches (some of which you can use to detect
           | some SD-produced images), the results are nearly always
           | convincing at a quick glance.
        
             | astrange wrote:
              | One thing they still don't do is maintain consistent
              | perspective and vanishing points.
             | 
             | https://arxiv.org/abs/2311.17138
        
               | orbital-decay wrote:
                | As well as light and shadows, yes. It can be fixed
                | explicitly during training, as the paper you linked
                | suggests, by adding a classifier, but it will probably
                | also keep getting better in new models on its own,
                | just as a result of better training sets, lower
                | compression ratios, and better understanding of the
                | real world by models.
        
           | awongh wrote:
            | I think you have to consider how diffusion models work:
            | once the green triangle has been put into the image in
            | the early steps, the later steps will be influenced by
            | its presence and will fill in fine details like
            | reflections as they go along.
           | 
           | The reason it knows this is that this is how any light in a
           | real photograph works, not just CGI.
           | 
           | Or if your prompt was "A green triangle looking at itself in
           | the mirror" then early generation steps would have two green
           | triangle like shapes. It doesn't need to know about the
           | concept of light reflection. It does know about composition
           | of an image based on the word mirror though.
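            | 
            | A skeletal sampling loop makes that concrete (schematic
            | code in the style of the HF diffusers API; `model` and
            | `scheduler` stand in for whatever pipeline you use):
            | 
            |   import torch
            | 
            |   @torch.no_grad()
            |   def sample(model, scheduler, shape):
            |       x = torch.randn(shape)         # pure noise
            |       for t in scheduler.timesteps:  # high t -> low t
            |           eps = model(x, t)          # predicted noise
            |           x = scheduler.step(eps, t, x).prev_sample
            |           # Early (high-t) steps settle global layout,
            |           # e.g. where the triangle sits; later steps
            |           # refine detail like reflected green light.
            |       return x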
        
           | mlsu wrote:
           | It does make sense though. Accurate global illumination is
           | very strongly represented in nearly all training data (except
           | illustrations) so it makes sense that the model learned an
           | approximation of it.
        
           | samstave wrote:
           | Wow - is it doing pre-render-ray-tracing?
        
         | Hugsun wrote:
         | That's very impressive!
        
           | yreg wrote:
            | It is! This isn't something previous models could do.
        
         | iamgopal wrote:
          | Interesting that left and right are taken from the viewer's
          | perspective instead of the red sphere's perspective.
        
           | ebertucc wrote:
           | How do you know which way the red sphere is facing? A fun
           | experiment would be to write two prompts for "a person in the
           | middle, a dog to their left, and a cat to their right", and
           | have the person either facing towards or away from the
           | viewer.
        
         | leumon wrote:
         | "When in doubt, scale it up." - openai.com/careers
        
         | Filligree wrote:
         | That's _amazing_.
         | 
         | I imagine this doesn't look impressive to anyone unfamiliar
         | with the scene, but this was absolutely impossible with any of
          | the older models. Though I still want to know if it
          | reliably does this--so many other things are left to chance
          | that if I also need to hit a one-in-ten chance of the
          | composition being right, it still might not be very useful.
        
           | Feuilles_Mortes wrote:
           | What was difficult about it?
        
             | lucidrains wrote:
             | previous systems could not compose objects within the scene
             | correctly, not to this degree. what changed to allow for
             | this? could this be a heavily cherrypicked example? guess
             | we will have to wait for the paper and model to find out
        
               | bbor wrote:
               | From the original paper with this technique:
               | We introduce Diffusion Transformers (DiTs), a simple
               | transformer-based backbone for diffusion models that
               | outperforms prior U-Net models and inherits the excellent
               | scaling properties of the transformer model class. Given
               | the promising scaling results in this paper, future work
               | should continue to scale DiTs to larger models and token
               | counts. DiT could also be explored as a drop-in backbone
                | for text-to-image models like DALL-E 2 and Stable
               | Diffusion.
               | 
               | Afaict the answer is that combining transformers with
               | diffusers in this way means that the models can
               | (feasibly) operate in a much larger, more linguistically-
               | complex space. So it's better at spatial relationships
               | simply because it has more computational "time" or
               | "energy" or "attention" to focus on them.
               | 
               | Any actual experts want to tell me if I'm close?
        
             | zavertnik wrote:
              | From my experience, the thing that makes AI image gen
              | hard to use is nailing specificity. I often find myself
              | having to resort to generating all of the elements I
              | want in an image separately and then comping them
              | together in Photoshop. This isn't a bad workflow, but
              | it is tedious (I often equate it to putting coins in a
              | slot machine, hoping it 'hits').
             | 
             | Generating good images is easy but generating good images
             | with very specific instructions is not. For example, try
             | getting midjourney to generate a shot of a road from the
             | side (ie standing on the shoulder of a road taking a photo
             | of the shoulder on the other side with the road crossing
             | frame from left to right)...you'll find midjourney only
             | wants to generate images of roads coming at the "camera"
             | from the vanishing point. I even tried feeding an example
             | image with the correct framing for midjourney to analyze to
             | help inform what prompts to use, but this still did not
             | result in the expected output. This is obviously not the
             | only framing + subject combination that model(s) struggle
             | with.
             | 
             | For people who use image generation as a tool within a
             | larger project's workflow, this hurdle makes the tool swing
             | back and forth from "game changing technology" to "major
             | time sink".
             | 
              | If this example prompt/output is an honest demonstration
              | of SD3's attention to specificity, especially as it
              | pertains to framing and composition of objects +
              | subjects, then I think it's definitely impressive.
             | 
             | For context, I've used SD (via comfyUI), midjourney, and
              | Dalle. All of these models + UIs have shared this issue
              | to varying degrees.
        
               | astrange wrote:
               | It's very difficult to improve text-to-image generation
               | to do better than this because you need extremely
               | detailed text training data, but I think a better
               | approach would be to give up on it.
               | 
               | > I often find myself having to resort to generating all
               | of the elements I want out of an image separately and
               | then comp them together with photoshop. This isn't a bad
               | workflow, but it is tedious
               | 
               | The models should be developed to accelerate this then.
               | 
               | ie you should be able to say layer one is this text
               | prompt plus this camera angle, layer two is some
               | mountains you cheaply modeled in Blender, layer three is
               | a sketch you drew of today's anime girl.
        
           | CSMastermind wrote:
           | I put the prompt into ChatGPT and it seemed to work just
           | fine: https://imgur.com/LsRM7G4
        
             | mikeg8 wrote:
             | I dislike the look of chatGPT images so much. The photo-
             | realism of stable diffusion impresses me a lot more for
             | some reason.
        
               | bbor wrote:
               | This is just stylistic, and I think it's because chatgpt
               | knows a bit "better" that there aren't very many literal
               | photos of abstract floating shapes. Adding "studio
               | photography, award winner" produced results quite similar
               | to SD imo, but this does negatively impact the accuracy.
               | On the other side of the coin, "minimalist textbook
               | illustration" definitely seems to help the accuracy,
               | which I think is soft confirmation of the thought above.
               | 
               | https://imgur.com/a/9fO2gxN
               | 
               | EDIT: I think the best approach is simply to separate out
               | the terms in separate phrases, as that gets more-or-less
               | 100% accuracy https://imgur.com/a/JGjkicQ
               | 
               | That said, we should acknowledge the point of all this:
               | SD3 is just incredibly incredibly impressive.
        
             | mortenjorck wrote:
             | You got lucky! Here's a thread where I attempted the same
             | just now: https://imgur.com/a/xiaiKXp
             | 
             | It has a lot of difficulty with the orientation of the cat
             | and dog, and by the time it gets them in the right
             | positions, the triangle is lost.
        
             | smcleod wrote:
              | It looks terrible to me though: very basic rendering,
              | as if it was generated at a lower resolution and then
              | scaled up.
        
           | ttul wrote:
           | It's the transformer making the difference. Original stable
           | diffusion uses convolutions, which are bad at capturing long
           | range spatial dependencies. The diffusion transformer chops
           | the image into patches, mixes them with a positional
           | embedding, and then just passes that through multiple
           | transformer layers as in an LLM. At the end, the model
           | unpatchify's (yes, that term is in the source code) the
           | patched tokens to generate output as a 2D image again.
           | 
           | The transformer layers perform self-attention between all
           | pairs of patches, allowing the model to build a rich
           | understanding of the relationships between areas of an image.
           | These relationships extend into the dimensions of the
           | conditioning prompts, which is why you can say "put a red
           | cube over there" and it actually is able to do that.
           | 
            | I suspect that the smaller model versions will do a great
            | job of generating imagery but may not follow the prompt
            | as closely, though that's just a hunch.
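            | 
            | For the curious, that flow in toy form (a sketch only;
            | names, sizes and the text conditioning are simplified
            | away, so this is not SD3's actual code):
            | 
            |   import torch
            |   import torch.nn as nn
            | 
            |   class TinyDiT(nn.Module):
            |       # Patchify -> positional embedding ->
            |       # transformer blocks -> unpatchify.
            |       def __init__(self, ch=4, size=32, p=2, d=256):
            |           super().__init__()
            |           self.ch, self.size, self.p = ch, size, p
            |           n = (size // p) ** 2
            |           self.embed = nn.Linear(ch * p * p, d)
            |           self.pos = nn.Parameter(torch.zeros(1, n, d))
            |           layer = nn.TransformerEncoderLayer(
            |               d, nhead=4, batch_first=True)
            |           self.blocks = nn.TransformerEncoder(layer, 4)
            |           self.unembed = nn.Linear(d, ch * p * p)
            | 
            |       def forward(self, x):  # x: (B, C, H, W) latent
            |           b, p = x.size(0), self.p
            |           s = self.size // p
            |           t = x.unfold(2, p, p).unfold(3, p, p)
            |           t = t.permute(0, 2, 3, 1, 4, 5)
            |           t = t.reshape(b, s * s, -1)  # patch tokens
            |           t = self.blocks(self.embed(t) + self.pos)
            |           t = self.unembed(t)          # "unpatchify"
            |           t = t.reshape(b, s, s, self.ch, p, p)
            |           t = t.permute(0, 3, 1, 4, 2, 5)
            |           return t.reshape_as(x)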
        
         | npunt wrote:
          | We're getting strong holodeck vibes here.
        
       | the_duke wrote:
       | So, they just announced StableCascade.
       | 
       | Wouldn't this v3 supersede the StableCascade work?
       | 
        | Did they announce it because a team had been working on it
        | and they wanted to push it out rather than just lose it as an
        | internal project, or are there architectural differences that
        | make both worthwhile?
        
         | Kubuxu wrote:
         | I think of the SD3 as a further evolution of SD1.5/2/XL and
         | StableCascade as a branching path. It is unclear which will be
         | better in the long term, so why not cover both directions if
         | they have the resources to do so?
        
           | ttul wrote:
           | I suspect Stable Cascade may incorporate a DiT at some point.
           | The UNet is easily swapped out. SC's main innovation is the
           | training of a semantic compressor model and a VQGAN that
           | translates the latent output from the diffusion model back to
           | image space - rather than relying on a VAE.
           | 
           | It's a really smart architecture and I think is fertile
           | ground for stacking on new things like DiT.
        
         | whywhywhywhy wrote:
          | There are architectural differences, although I found
          | Stable Cascade a bit underwhelming: while it can actually
          | manage text, the text it does manage often looks like
          | someone just wrote it over the image; it doesn't feel
          | integrated a lot of the time.
         | 
          | SD3 seems to be closer to SOTA. Not sure why Cascade took
          | so long to get out; it seemed to be up and running months
          | ago.
        
         | Dwedit wrote:
         | Stable Cascade has a distinct noisy look to generated images.
         | It almost looks as bad as images being dithered to the old 216
         | color Netscape palette.
        
           | ttul wrote:
           | If you renoise the output of the first diffusion stage to
           | halfway and then denoise forward again, you can eliminate the
           | bad output. This approach is called "replay" or "iterative
           | mixing" and there are a few open source nodes for ComfyUI you
           | can refer to.
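            | 
            | Roughly, in code (a sketch: `add_noise` follows the HF
            | diffusers scheduler API, while `denoise_from` is a
            | stand-in for the rest of your sampling loop):
            | 
            |   import torch
            | 
            |   def replay(latents, scheduler, denoise_from,
            |              strength=0.5):
            |       # Re-noise the stage-1 output to mid-schedule,
            |       # then run the remaining denoise steps again.
            |       i = int(len(scheduler.timesteps) * strength)
            |       t = scheduler.timesteps[i]
            |       noise = torch.randn_like(latents)
            |       noisy = scheduler.add_noise(latents, noise, t)
            |       return denoise_from(noisy, start_step=i)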
        
       | 101008 wrote:
        | What's the best way to use SD (3 or 2) online? I can't run it
        | on my PC and I want to do some experiments to generate assets
        | for a POC videogame I'm working on. I pay for Midjourney and
        | I wouldn't mind paying something like 5 or 10 dollars per
        | month to experiment with SD, but I can't find anything.
        
         | Gracana wrote:
         | I used Rundiffusion for a while before I bought a 4090, and I
         | thought their service was pretty nice. You pay for time on a
         | system of whatever size you choose, with whatever
         | tool/interface you select. I think it's worth tossing a few
         | bucks into it to try it out.
        
           | heroprotagonist wrote:
           | Eh, you can get the same software up and running in less than
           | 15-20 minutes on an EC2 GPU instance for about half the
           | hourly-rated pricing of rundiffusion. And you'll also pay
           | less than their 'premium' monthly fee for storage of keeping
           | an instance in the Stopped state the entire month.
           | 
           | I used rundiffusion to play around with a bunch of different
           | open source software quickly and easily with pre-downloaded
           | models after getting annoyed at my laptop GPU. But once I
            | settled on one particular implementation and started
            | spending a lot of time in it, it no longer made sense to
            | keep paying every hour for the initial ease of setup.
           | 
            | The only real ongoing benefit was that rundiffusion came
            | with a bunch of models pre-downloaded, so swapping between
            | them was quick. But you can use UI addons like the CivitAI
            | browser to
           | download models automatically through automatic1111, and
           | you'll likely want to go beyond what they predownload to the
           | instance for you anyway.
           | 
           | The downside to running on the cloud directly is having to
           | manage the running/stopped state of the instance yourself. I
           | haven't ever left it running when I was done with an
           | instance, but I could see that as a risk. CLI commands and
           | scripting can make that faster than logging into a website
           | which does it for you automatically, but it's extra effort.
           | 
           | I thought about building an AMI and putting it up on AWS
           | marketplace, but it looks like there are a few options for
           | that already. I don't know how good they are out of the box,
           | as I haven't used them. But if spending 20 minutes once to
           | get software running on a Linux instance is truly the only
           | barrier to reducing cost, those prebuilt AMIs are a decent
           | intermediary step. They're about $0.10/hour on top of server
           | costs. I skipped straight to installing the software myself,
           | but even an extra $0.10/hour overhead would be better than
            | paying double.
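            | 
            | The start/stop housekeeping is a few lines of boto3 if
            | you'd rather script it (instance ID and region are
            | placeholders):
            | 
            |   import boto3
            | 
            |   ec2 = boto3.client("ec2", region_name="us-east-1")
            |   IID = "i-0123456789abcdef0"  # your instance ID
            | 
            |   def start():
            |       ec2.start_instances(InstanceIds=[IID])
            | 
            |   def stop():
            |       # Stopped instances bill only for EBS storage,
            |       # not GPU hours.
            |       ec2.stop_instances(InstanceIds=[IID])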
        
             | Gracana wrote:
             | Would you recommend that to someone who has never used AWS
             | before? Is it possible to screw up and rack up a huge bill?
             | I might consider using that for big tasks that I can't do
             | with my local setup.
        
         | Liquix wrote:
         | poke around stablediffusion.fr and trending public huggingface
         | spaces
        
       | bsaul wrote:
        | Does anyone know which AI could be used to generate UI design
        | elements (such as "generate a real estate app widget list"),
        | as well as the kind of prompts one would use to obtain good
        | results?
        | 
        | I'm only now investigating using AI to increase velocity in
        | my projects, and the field is moving so fast that I'm a bit
        | out of date.
        
         | kevinbluer wrote:
         | v0 by Vercel could be worth a look: https://v0.dev
         | 
         | From the FAQ: "v0 is a generative user interface system by
         | Vercel powered by AI. It generates copy-and-paste friendly
         | React code based on shadcn/ui and Tailwind CSS that people can
         | use in their projects"
        
         | gwern wrote:
         | If by design elements you include vector images, you could try
         | https://www.recraft.ai/ or Adobe Firefly 2 - there's not a lot
         | of vector work right now, so your choices are either the
         | handful of vector generators, or just bite the bullet and use
         | eg DALL-E 3 to generate raster images you convert to
         | SVG/recreate by hand.
         | 
         | (The second is what we did for https://gwern.net/dropcap
         | because the PNG->SVG filesizes & quality were just barely
         | acceptable for our web pages.)
        
       | AuryGlenz wrote:
       | It's really unfortunate that Silicon Valley ended up in an area
       | that's so far left - and to be clear, it'd be just as bad if it
       | was in a far right area too. Purple would have been nice, to keep
       | people in check. 'Safety' seems to be actively making AI advances
       | worse.
        
         | spencerflem wrote:
         | Silicon Valley is not "far left" by any stretch, which implies
         | socialism, redistribution of wealth, etc. This is obvious by
         | inspection.
         | 
         | I assume by far left, you mean progressive on social issues,
         | which is not really a leftist thing but the groups are related
         | enough that I'll give you a pass.
         | 
          | Silicon Valley techies are also not socially progressive.
          | Read this thread or anything published by Paul Graham or
          | any of the AI leaders for proof of that.
         | 
          | However, most normal city people are: a large enough
          | percent of the country that big companies that want to make
          | money feel the need to appeal to them.
         | 
         | Funnily enough, what is a uniquely Silicon Valley political
         | opinion is valuing the progress of AI over everything else
        
           | TulliusCicero wrote:
            | Techies are socially progressive as a whole. Yes, there
            | are some outliers, and tech _leaders_ probably aren't as
            | far left socially as the ground-level workers.
        
             | spencerflem wrote:
             | I wish :/, I really do
             | 
             | I find them in general to not be Republican and all the
             | baggage that entails but the typical techie I meet is less
             | concerned with social issues than the typical city
             | Democrat.
             | 
             | If I can speculate wildly, I think it is because tech has
              | this veneer of being an alternative solution to the
              | world's problems, so a lot of techies believe that
              | advancing
             | tech is both the most important goal and also politically
             | neutral. And also, now that tech is a uniquely profitable
             | career, the types of people that would be in business
             | majors are now CS majors. Ie. those that are mainly
             | interested in getting as much money as possible for
             | themselves.
        
             | KittenInABox wrote:
              | I disagree that techies are socially progressive as a
              | whole; there is very minimal, almost no, push for labor
              | rights or labor protection, even though our group is
              | disproportionately hit by abuse of employees under the
              | visa program.
        
               | TulliusCicero wrote:
               | Labor protections are generally seen as a fiscal issue,
               | rather than a social one. E.g. libertarians would usually
               | be fine with gay rights but against greater labor
               | regulation.
        
           | chasd00 wrote:
            | When I think of "far left" I think of an authoritarian
            | regime disguised as serving the common good, ready to
            | punish and excommunicate any thought or action deemed
            | contrary to the common good. However, the regime defines
            | "common good" itself and remains in power indefinitely.
            | In that regard, SV is very "far left". At the extremes,
            | far left and far right look very similar from the
            | perspective of a regular person on the street.
        
             | spencerflem wrote:
             | Well, you're wrong.
        
             | foolofat00k wrote:
             | That's just not what that term means.
        
               | acheron wrote:
               | It's not right wing unless they sit on the right side of
               | the National Assembly and support Louis XVI.
        
               | five_lights wrote:
                | Like it or not, this is how people center-right and
                | over are using it. We've created such huge silos since
                | the Trump schism that even our language is drifting.
        
           | skinpop wrote:
           | indeed they are not really left but neoliberals with a
           | leftist aesthetic, just like most republicans are neoliberals
           | with a conservative aesthetic.
        
         | rightbyte wrote:
         | SV area far left? I wouldn't even regard the area as left
         | leaning, at all.
         | 
          | I looked at Wikipedia and there seems to be no socialist
          | representation.
         | 
          | Like, from a European perspective, hearing that is
          | ludicrous.
        
           | kristofferR wrote:
           | They are the worst kind of left, the "prudish and constantly
           | offended left", not the "free healthcare and good government"
           | left.
           | 
           | I'm glad I live in Norway, where state TV shows boobs and
           | does offensive jokes without anyone really caring.
        
             | jquery wrote:
             | Prudish? San Francisco? The same city that has outdoor nude
             | carnivals without any kind of age restrictions?
             | 
             | If by prudish you mean intolerant of hate speech, sure. But
             | generally few will freak out over some nudity here.
             | 
             | College here is free. We also have free healthcare here, as
             | limited as it is:
             | https://en.wikipedia.org/wiki/Healthy_San_Francisco
             | 
             | Not sure what you mean by "offensive jokes", that could
             | mean a lot of things...
        
         | bergen wrote:
          | Put in any historical or political context, SV is in no way
          | left. They're hardcore libertarian. Just look at their
          | poster boys: Elon Musk, Peter Thiel, and a plethora of
          | others are very oriented towards totalitarianism from the
          | right. Just because they blow their brains out on LSD and
          | ketamine and go on 2-week spiritual retreats doesn't make
          | them leftists. They're billionaires that only care about
          | wealth and power, living in communities segregated from the
          | common folk of the area - nothing lefty about that.
        
           | freedomben wrote:
           | Elon Musk and Peter Thiel are two of the most hated people in
           | tech, so this doesn't seem like a compelling example. Also I
           | don't think Elon Musk and Peter Thiel qualify as "hardcore
           | libertarian." Thiel was a Trump supporter (hardly libertarian
            | at all, let alone hardcore), and Elon supported Democrats
            | and big government his entire life until the last few
            | years. He's mainly waded into "culture war" type stuff,
            | as far as I can think of. What sort of policies has Elon
            | argued for that
           | you think are "hardcore libertarian?"
        
             | bergen wrote:
              | He wanted to replace public transport with a system
              | where you don't have to ride with the plebs; he wants
              | to colonize Mars with the best minds (which, to him,
              | equals the most money); he built a tank for urban
              | areas. He promotes free speech even when it incites
              | hate, he likes Ayn Rand, and he implies that government
              | programs calling for united solutions are either
              | communism, Orwell, or basically Hitler. He actively
              | promotes the opinions of paying users above others on
              | X.
        
               | freedomben wrote:
               | Thank you, truly, I appreciate the effort you put in to
               | list those. It helps me understand more where you're
               | coming from.
               | 
               | > He wanted to replace public transport with a system
               | where you don't have to ride the public transport with
               | the plebs
               | 
               | I don't think this is any more libertarian than kings and
               | aristocrats of days past were. I know a bunch of people
               | who ride public transit in New York and San Francisco who
               | would readily agree with this, and they are definitely
               | not libertarian. If anything it seems a lot more
               | democratic since he wants it to be available to everyone
               | 
               | > he want's to colonize mars with the best minds (equal
               | most money for him)
               | 
               | This doesn't seem particularly "libertarian" either,
               | excepting maybe the aspect of it that is highly
               | capitalistic. That point I would grant. But you could
               | easily be socialist and still support the idea of
               | colonizing something with the best minds.
               | 
               | > he built a tank for urban areas.
               | 
               | I admit I don't know anything about this one
               | 
               | > He promotes free speech even if it incites hate
               | 
               | This is a social libertarian position, although it's
               | completely disconnected from economic libertarianism. I
               | have a good friend who is a socialist (as in wants to
               | outgrow capitalism such as marx advocated) who supports
               | using the state to suppress capitalist
               | activity/"exploitation", and he also is a free speech
               | absolutist.
               | 
               | > he likes ayn rand
               | 
               | That's a reasonable point, although I think it's worth
               | noting that there are plenty of hardcore libertarians who
               | hate ayn rand.
               | 
               | > he implies government programs calling for united
               | solutions is either communism, orwell or basically
               | hitler.
               | 
               | Eh, lots of republicans including Trump do the same
               | thing, and they're not libertarian. Certainly not
               | "hardcore libertarian"
               | 
               | > He actively promotes the opinion of those that pay
               | above others on X.
               | 
               | This could be a good one, although Google, Meta, Reddit,
               | Youtube, and any other company that runs ads or has
               | "sponsored content" is doing the same thing, so we would
               | have to define all the big tech companies as "hardcore
               | libertarian" to stay consistent.
               | 
               | Overall I definitely think this is a hard debate to have
               | because "hardcore libertarian" can mean different things
               | to different people, and there's a perpetual risk of "no
               | true scotsman" fallacy. I've responded above with how I
               | think most people would imagine libertarianism, but
               | depending on when in history you use it, many anarcho-
                | socialists used the label for themselves, yet today
                | "libertarian" is a party that supports free-market
                | economics and social liberty. But regardless of the
                | inherent challenges, I appreciate the exchange.
        
               | bergen wrote:
                | > I don't think this is any more libertarian than
                | kings and aristocrats of days past were.
                | 
                | So, very libertarian.
                | 
                | > If anything it seems a lot more democratic since he
                | wants it to be available to everyone
                | 
                | No, he wants a solution that minimizes contact with
                | other people and lets you live in your bubble. This
                | minimizes exposure to others from the same city, and
                | it is a commercial system, not a publicly created one.
                | Democratization would be cheap public transport where
                | you don't get mugged, proven to work in every European
                | and most Asian cities.
                | 
                | > I admit I don't know anything about this one
                | 
                | The Cybertruck. Again, a vehicle to isolate you from
                | everyday life, being supposedly bulletproof and all.
                | 
                | > lots of republicans including Trump do the same
                | thing, and they're not libertarian
                | 
                | They are all "little government, individual choice" -
                | of course they feed their masters, but the Kochs and
                | co. want exactly this.
                | 
                | Appreciate the exchange too; thanks for the fact-based
                | formulation of opinions.
        
           | njarboe wrote:
            | Musk's main residence is a $50k house he rents in Boca
            | Chica. Grimes wanted a bigger, nicer residence for her
            | and their kids, and that was one of the reasons she left
            | him.
        
             | bergen wrote:
             | One of his many lies. https://www.wsj.com/articles/elon-
             | musk-says-he-lives-in-a-50...
        
         | dang wrote:
         | We detached this subthread from
         | https://news.ycombinator.com/item?id=39467056.
        
           | spencerflem wrote:
           | thank you, the thread looks so much nicer now with
           | interesting technical details at the top
        
             | dang wrote:
             | I'm delighted that you noticed--it took about 30
             | interventions to get there.
        
         | asadotzler wrote:
          | So far left that the techies don't even have a labor union.
          | You're a joke.
        
       | 4bpp wrote:
       | I guess we should count our blessings and be grateful that
       | literacy, the printing press, computers and the internet became
       | normalised before this notion of "harm" and harm prevention was.
       | Going forward, it's hard to imagine how any new technology that
       | is unconditionally intellectually empowering to the individual
       | will be tolerated; after all, just think of the harms someone
       | thus empowered could be enabled to perpetrate.
       | 
        | Perhaps eventually, once every forum has been assigned a
        | trust-and-safety team, every word processor has been aligned,
        | and most normal people have no need for communication outside
        | the Metaverse (TM) in their daily lives, we will also come
        | around to
       | reviewing the necessity of teaching kids to write, considering
       | the epidemic of hateful graffiti and children being caught with
       | handwritten sexualised depictions of their classmates.
        
         | xanderlewis wrote:
         | > unconditionally intellectually empowering
         | 
         | What makes you think those who've worked hard over a lifetime
         | to provide (with no compensation) the vast amounts of data
         | required for these -- inferior by every metric other than
         | quantity -- stochastic approximations of human thought should
         | feel _empowered_?
         | 
         | I think the genAI / printing press analogy is wearing rather
         | thin now.
        
           | graphe wrote:
           | WHO exactly worked hard over a lifetime with no compensation?
        
             | xanderlewis wrote:
             | By _compensation_ I mean from the companies creating the
             | models, like OpenAI.
        
               | graphe wrote:
               | Computers and drafters had their work taken by machines.
               | IBM did not pay off the computers and drafters. In this
               | case you could make a steady decent wage. My grandfather
               | was trained in a classic drawing style (yes it was his
               | main job).
               | 
               | He did not get into the profession to make money. He did
               | it out of passion and died poor. Artists are not being
               | tricked by the promise of wealth. You will get a cloned
                | style if you can't afford the real artist making it,
                | and if the commission goes to a computer, how is that
                | not the same as plagiarism by a human? Artists were
                | not being paid
                | well before. The anime industry has shown the
                | endpoint of what happens to artists as a profession,
                | despite their skills. Chess still exists despite
                | better play by
               | machines. Art as a commercial medium has always been
               | tainted by outside influences such as government,
               | religion and pedophilia.
               | 
               | In the end, drawing wasn't going to survive in the age of
               | vector art and computers. They are mainly forgettable
               | jpgs you scroll past in a vast array like DeviantArt.
        
               | xanderlewis wrote:
               | Sorry, but every one of your talking points -- 'computers
               | were replaced' , 'chess is still being played', etc. --
               | and good counterarguments to them have been covered ad
               | nauseam (and practically verbatim) by now.
               | 
               | Anyway, my point isn't that 'AI is evil and must be
               | stopped'; it's that it doesn't feel 'intellectually
               | empowering'. I (in my personal work) can't get anything
               | done with ChatGPT that I can't on my own, and with less
               | frustration. We've created machines that can
               | superficially mimic real work, and the world is going
               | bonkers over it. The only magic power these systems have
               | is sheer speed: they can output reams and reams of
               | twaddle in the time it takes me to make a cup of tea. And
               | no doubt those in bullshit jobs are soon going to find
               | out.
               | 
               | My argument might not be what you expect from someone who
               | is sad to see the way artists' lives are going: if your
               | work is truly capable of being replaced by a large
               | language model or a diffusion model, maybe it wasn't very
               | original to begin with.
               | 
               | The sad thing is, artists who create genuinely superior
               | work will still lose out because those financially
               | enabling them will _think_ (wrongly) that they can be
               | replaced. And we'll all be worse off.
        
               | graphe wrote:
               | I definitely feel more empowered, and making imperfect
               | art and generating code that doesn't work and
               | proofreading it is definitely changing people's lives.
               | Which specific artist are you talking about who will
               | suffer? Many of the ones I talk to are excited about
               | using it.
               | 
               | You keep going back to value and finances. The less money
               | is in it the better. Art isn't good because it's
               | valuable, unless you were only interested in it
               | commercially.
        
               | xanderlewis wrote:
               | > Art isn't good because it's valuable, unless you were
               | only interested in it commercially.
               | 
               | Of course not; I'm certainly not suggesting so. But I do
               | think money is important because it is what has enabled
               | artists to do what they do. Without any prospect of
               | monetising one's art, most of us (and I'm not an artist)
               | would be out working in the potato fields, with very
               | little time to develop skills.
        
               | graphe wrote:
               | I disagree. It will be better because it's driven purely
               | by passion. Art runs in my family even today, I am fully
               | aware of its value as well as cost. It is not a career
                | and artists knew that then and now, funding their
                | indulgence in expression through film purchases,
                | luxurious pigments, toxic but beautiful chemicals, or
                | instruments that were sure to never make back their
                | purchase price. Someone (not my family) made
                | Stonehenge
               | in his backyard but it had no commercial value, it still
               | is a very impressive feat and I admire the ingenuity. Art
               | without monetary value is always the best, and previous
               | problems such as film costs and paint prices are solved
               | digitally, so the lack of commercial interest shouldn't
               | hurt art at all.
               | 
               | Commercial movies have lots of CG, big budgets and famous
               | actors while small budget indie movies have been
                | exploding despite their weaker technical specialities.
                | Noah's ark was made by amateurs while the Titanic was
                | made by experts.
        
             | samstave wrote:
             | Slaves.
        
               | xanderlewis wrote:
               | Yes, but that's clearly not what I'm getting at.
        
           | ben_w wrote:
           | > inferior by every metric other than quantity
           | 
           | And the metric of "beating most of our existing metrics so we
           | had to rewrite the metrics to keep feeling special, but don't
           | worry we can justify this rewriting by pointing at Goodhart's
           | law".
           | 
           | The only reason the question of _compensating_ people for
           | their input into these models even matters is specifically
           | because the models are, in actual fact, good. The bad models
            | don't replace anyone.
        
             | xanderlewis wrote:
             | > beating most of our existing metrics so we had to rewrite
             | the metrics to keep feeling special
             | 
             | This is needlessly provocative, and also wrong. My metrics
             | have been the same from the very beginning (i.e. 'can it
             | even come close to doing my work for me?'). This question
             | may yet come to evaluate to 'yes', but I think you
             | seriously underestimate the real power of these models.
             | 
             | > The only reason the question of compensating people for
             | their input into these models even matters is specifically
             | because the models are, in actual fact, good.
             | 
             | No. They don't need to be good, they simply need to fool
             | people into thinking they're good.
             | 
             | And before you reflexively rebut with 'what's the
             | difference?', let me ask you this: is the quality of a
             | piece of work or the importance of a job and all of its
             | indirect effects always immediately apparent? Is it
             | possible for managers to short term cost-cut at the expense
             | of the long term? Is it conceivable that we could at some
             | point slip into a world in which there is no funding for
             | genuinely interesting media anymore because 90% of the
             | population can't distinguish it? The real danger of genAI
             | is that it convinces non-experts that the experts are
             | replaceable when the reality is utterly different. In some
             | cases this will lead to serious blowups and the real
             | experts will be called back in, but in more ambiguous cases
             | we'll just quietly lose something of real value.
        
               | ben_w wrote:
               | > This is needlessly provocative,
               | 
               | Perhaps; this is something I find annoying enough that my
               | responses may be unnecessarily sharp...
               | 
               | > and also wrong. My metrics have been the same from the
               | very beginning (i.e. 'can it even come close to doing my
               | work for me?'). This question may yet come to evaluate to
               | 'yes', but I think you seriously underestimate the real
               | power of these models.
               | 
                | Okay then. (1) your definition is equivalent to
                | "permanent mass unemployment", because if it can do your
                | work for you, it can also do your work for someone else;
                | (2) you mean either "_over_-estimate" or "real _limits_
                | of these models", and the only reason I even bring up
                | what's obviously a minor editing slip (one I fall foul of
                | myself in many comments) is that this is the kind of
                | mistake people pick up on as evidence of the limits of AI
                | -- treating small inversions like this as evidence of
                | uselessness.
               | 
               | > Is it conceivable that we could at some point slip into
               | a world in which there is no funding for genuinely
               | interesting media anymore because 90% of the population
               | can't distinguish it?
               | 
               | As written, what you describe is tautologically
               | impossible. However, assuming you mean something more
               | like "genuinely novel" rather than "interesting",
               | absolutely! 100% yes. There's also _loads_ of ways this
               | could permanently end all human flourishing (even when
               | used as a mere tool e.g. by dictators for propaganda),
               | and some plausible ways it can permanently end all human
                | _existence_ (it's a safe bet someone will ask it to and
               | try to empower it to this end, the question is how far
               | they get with this).
               | 
               | > The real danger of genAI is that it convinces non-
               | experts that the experts are replaceable when the reality
               | is utterly different.
               | 
               | Despite the fact that the best models ace tests in
               | medicine and law, the international mathematical
               | olympiad, leetcode, etc., the fact there are no real
               | tests for how good someone is after a few years of
               | employment means both your point and mine can be true
                | simultaneously. I'm thinking the real threat current LLMs
                | pose to newspapers is that they fully automate the Gell-
                | Mann Amnesia effect, even though they beat humans on
                | _every_ measure of intelligence I had when I was growing
                | up -- depending on the exact measure, either beating all
                | of humanity together by many orders of magnitude or, at
                | worst, landing somewhere near the level of a "rather good
                | student taking the same test".
               | 
               | > In some cases this will lead to serious blowups and the
               | real experts will be called back in, but in more
               | ambiguous cases we'll just quietly lose something of real
               | value.
               | 
               | Hard disagree about "quiet loss". To the extent that
               | value can be quantified, even if only by surveying
               | humans, models can learn it. Indeed, this is already
               | baked into the way ChatGPT asks you for feedback about
               | the quality of the answers it generates. To the extent we
               | lose things, it will be a very loud and noisy loss,
               | possibly literally in the form of a nuke going off.
        
               | astrange wrote:
               | > (1) your definition is equivalent to "permanent mass
               | unemployment" because if it can do your work for you, it
               | can also do your work for someone else
               | 
               | This wouldn't happen because employment effects are
               | mainly determined by comparative advantage, i.e. the
               | resources that could be used to "do your job" will
               | instead be used to do something they're more suited to.
               | 
               | (Not "that they're better at". it's "more suited to". You
               | do not have your job because you're the best at it.)
        
               | ben_w wrote:
               | I don't claim to be an expert in economics, so if you
               | feel like answering please treat me as a noob, but
               | doesn't comparative advantage have the implicit
               | assumption that demand isn't ever going to be fully met
               | for all buyers? The "single most economically important
               | task" that a machine which can operate at a human (or
               | superhuman) level, is "make a better version of itself"
               | until that process hits a limit, followed by "maximise
               | how many of you exist" until it runs out of resources.
               | With assumptions that currently seem plausible such as
               | "such a robot[0] might mass 100kg and take 5 months to
               | turn plain metal ore into a working copy of itself", it
               | takes about 30 years to convert the planet Mercury into
               | 4.12e11 such robots _per currently living human_ [1],
               | which I assert is _more than anyone can actually use_
               | even if they decided their next game of Civilization was
               | going to be a 1:1 scale WestWorld-style LARP.
               | 
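                | (A quick back-of-envelope check of those numbers -- a
                | minimal Python sketch of the arithmetic, taking the
                | 100kg robot and 5-month doubling time from above, and
                | assuming Mercury at ~3.3e23 kg and ~8e9 living humans:)
                | 
                |   import math
                | 
                |   mercury_kg = 3.3e23   # assumed mass of Mercury
                |   robot_kg = 100        # robot mass, as stated above
                |   humans = 8e9          # assumed current population
                | 
                |   robots = mercury_kg / robot_kg  # ~3.3e21 robots total
                |   per_human = robots / humans     # ~4.1e11 per person
                |   doublings = math.log2(robots)   # ~71.5 doublings
                |   years = doublings * 5 / 12      # 5 months per doubling
                |   print(f"{per_human:.2e} each, {years:.0f} yr")
                | 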
               | If I imagine a world where every task that any human can
               | perform can also be done at world expert level -- let
               | alone at a superhuman level -- by a computer/robot (with
               | my implicit assumption "cheaply"), I can't imagine why I
               | would ever choose the human option. If the comparative
               | advantage argument is "the computer/robot combination
               | will always be priced at exactly the level where it's
               | cost-competitive with a human, in order that it can
               | extract maximum profit", I ask why there won't be many
               | AI/robots competing with each other for ever-smaller
               | profit margins?
               | 
                | [0] AI and robotics are _not_ the same thing -- one is
                | the body, the other the mind -- but there's a lot of
                | overlap, with AI being used to drive robots, LLMs making
                | it easier to define rewards and for the robots to plan;
                | and AI also gets better through embodiment (even if
                | virtual) giving it real-world feedback.
               | 
               | [1] https://www.wolframalpha.com/input?i=5+months+*+log2%
               | 28mass+...
        
               | astrange wrote:
               | > The "single most economically important task" that a
               | machine which can operate at a human (or superhuman)
               | level, is "make a better version of itself" until that
               | process hits a limit, followed by "maximise how many of
               | you exist" until it runs out of resources.
               | 
               | Lot of hidden assumptions here. How does "operating at
               | human level" (an assumption itself) imply the ability to
               | do this? Humans can't do this.
               | 
               | We very specifically can't do this, we have sexual
               | reproduction for a good reason.
               | 
               | > If I imagine a world where every task that any human
               | can perform can also be done at world expert level -- let
               | alone at a superhuman level -- by a computer/robot (with
               | my implicit assumption "cheaply"), I can't imagine why I
               | would ever choose the human option.
               | 
               | If the robot performs at human level, and it knows you'll
               | always hire it over a human, why would it work for
               | cheaper?
               | 
               | If you can program it to work for free, then it's
               | subhuman.
               | 
               | If you're imagining something that's superhuman in only
               | ways that are bad for you and subhuman in ways that would
               | be good for you, just stop imagining it and you're good.
        
           | Vetch wrote:
           | > thought should feel empowered?
           | 
           | This is a strange question since augmentation can be
           | objectively measured even as its utility is contextual. With
           | MidJourney I do not feel augmented because while it makes
           | pretty images, it does not make precisely the pretty images I
           | want. I find this useless, but for the odd person who is
           | satisfied only with looking at pretty pictures, it might be
           | enough. Their ability to produce pretty pictures to
           | satisfaction is thus augmented.
           | 
           | With GPT4 and Copilot, I am augmented in speed rather than
           | in capability. The set of problems I can solve is not
           | meaningfully enhanced, but my ability to close knowledge
           | gaps is. While LLMs are limited in their global ability to
           | help design, architect or structure the approach to a novel
           | problem or its breakdown, they can suggest local tricks and
           | implementation approaches I do not know but can verify as
           | correct. And even when wrong, I can often work out how to
           | fix their approach (still a speed-up, since I likely would
           | not have arrived at that solution concept on my own). This
           | is a significant augmentation, even if not to the level I'd
           | like.
           | 
           | The reason capabilities are not much enhanced is that, to
           | get the most out of LLMs, you need to be able to verify
           | their solutions, given their unreliability. If a solution
           | contains concepts you do not know, the effort to gain the
           | knowledge required to verify the approach (which the LLM
           | itself can help with) needs to be manageable in reasonable
           | time.
        
             | xanderlewis wrote:
             | > With GPT4 and Copilot...
             | 
             | I am not a programmer, so none of this applies to me. I can
             | only speak for myself, and I'm not claiming that _no one_
             | can feel empowered by these tools - in fact it seems
             | obvious that they can.
             | 
             | I think programmers tend to assume that all other technical
             | jobs can be attacked in the same way, which is not
             | necessarily true. Writing code seems to be an ideal use
             | case for LLMs, especially given the volume of data
             | available on the open web.
        
               | Vetch wrote:
               | Which is why I say it is contextual and depends on the
               | task. I'll note that it's not only programming ability
               | that is empowered but learning math, electronics,
               | history, physics and so on up to the university level. As
                | long as you take small enough steps such that you are
                | able to verify against external sources, you will move
                | faster with it than without.
               | 
               | Writing it as "feel empowered" made it come across as if
               | you meant the empowerment was illusory. My argument was
               | that it is not merely a feeling but a real measurable
               | difference.
        
           | 4bpp wrote:
           | Empowering to their users. A lot of things that empower their
           | users necessarily disempower others, especially if we define
           | power in a way that is zero-sum - the printing press
           | disempowered monasteries and monks that spent a lifetime
           | perfecting their book-copying craft (and copied books that no
           | doubt were used in the training of would-be printing press
           | operators in the process, too).
           | 
           | It seems to me that the standard use of "empowering" implies
           | in particular that you get more power for less effort - which
           | in many cases tends to be democratizing, as hard-earned power
           | tends to be accrued by a handful of people who dedicate most
           | of their lives to pursuit of power in one form or another.
           | With public schooling and printing, a lot of average people
           | were empowered at the expense of nobles and clerics, who put
           | in a lifetime of effort for the power literacy conveys in a
           | world without widespread literacy. With AI, likewise, average
           | people will be empowered at the expense of those who
           | dedicated their life to learn to (draw, write good copy,
           | program) - this looks bad because we hold those people in
           | high esteem in a world where their talents are rare, but
           | consider that following that appearance is analogously
           | fallacious to loathing democratization of writing because of
           | how noble the nobles and monks looked relative to the
           | illiterate masses.
        
             | xanderlewis wrote:
             | I get why you might describe these tools as
             | 'democratising', but it also seems rather strange when you
             | consider that the future of creativity is now going to be
             | dependent on huge datasets and amounts of computation only
             | billion-dollar companies can afford. Isn't that _anything
             | but_ democratic? Sure, you can ignore the zeitgeist and
             | carry on with traditional dumb tools if you like, but
             | you'll be utterly left behind.
        
               | 4bpp wrote:
               | Datasets can still be curated by crowds of volunteers
               | just fine. I would likewise expect a crowdsourceable
               | solution to compute to emerge eventually - unless the
               | safetyists move to prevent this by way of legislation.
               | 
               | When writing and printing emerged, they too depended on
                | supply chains (for paper, iron, machining) and, in the
                | case of printing, capital that was far out of the reach
               | of the individual. Their utility and overlap with other
               | mass markets resulted in those being commoditized in
               | short order.
        
         | gjulianm wrote:
         | I feel like this analogy is not very appropriate. The main
         | problem with AI generated images and videos is that, with every
         | improvement, it becomes more and more difficult to distinguish
         | what's real and what's not. That's not something that happened
         | with literacy or printing press or computers.
         | 
         | Think about it: the saturation of content on the Internet has
         | become so bad that people are having a hard time knowing what's
         | true or not, to the point that we're again having outbreaks of
         | preventable diseases such as measles because people can't
         | identify what's real scientific information and what's not.
         | Imagine what will happen when anyone can create an image of
         | whatever they want that looks just like any other picture, or
         | worse, video. We are not at all equipped to deal with that. We
         | are risking a lot just for the ability to spend massive amounts
         | of compute power on generating images. It's not curing cancer,
         | not solving world hunger, not making space travel free, no:
         | it's generating images.
        
           | gpderetta wrote:
           | I don't understand. Are you saying that before AI there was a
           | reliable way to distinguish fiction from fact?
        
             | gjulianm wrote:
             | It definitely is easier without AI. Before, if you saw a
             | photo you could be fairly confident that most of it was
             | real (yes, photo manipulation exists but you can't really
             | create a photo out of nothing). Videos were far more
             | trustworthy (and yes, I know there are some amazing 3D
             | renders out there, but they're not really accessible). With
             | these technologies and the rate at which they're improving,
             | I feel like that's going out of the window. Not to mention
             | that the more content is generated, the easier it is for
             | something fake to slip through.
        
             | UberFly wrote:
             | "it becomes more and more difficult to distinguish what's
             | real and what's not" - Is literally what they said.
        
         | laminatedsmore wrote:
         | "grateful that literacy, the printing press, computers and the
         | internet became normalised before this notion of "harm" and
         | harm prevention was"
         | 
         | Printing Press -> Reformation -> Thirty Years' War -> Millions
         | Dead
         | 
         | I'm sure that there were lots of different opinions at the time
         | about what kind of harm was introduced by the printing press
         | and what to do about it, and attempts to control information by
         | the Catholic church etc.
         | 
         | The current fad for 'safe' 'AI' is corporate and naive. But
         | there's no simple way to navigate a revolutionary change in the
         | way information is accessed / communicated.
        
           | light_hue_1 wrote:
           | Way to blame the printing press for the actions of religious
           | extremists.
           | 
           | The lesson isn't "printing press bad"; it's that extremist,
           | irrational belief in any entity is bad (whether it's
           | religion, Trump, etc.).
        
             | freedomben wrote:
             | > _Way to blame the printing press for the actions of
             | religious extremists._
             | 
             | I don't see GP blaming the printing press for that, they're
             | merely pointing out that one enabled the other, which is
             | absolutely true. I'm damn near a free speech absolutist,
             | and I think the heavy "safety" push by AI is well-meaning
             | but will have unintended consequences that cause more harm
             | than they are meant to prevent, but it seems obvious to me
             | that they _can_ be used much the same as printing presses
             | were by the extremists.
             | 
             | > _The lesson isn't "printing press bad"; it's that
             | extremist, irrational belief in any entity is bad (whether
             | it's religion, Trump, etc.)._
             | 
             | Could not agree more
        
             | samstave wrote:
             | The printing press is the leading cause of tpyos!
        
             | herculity275 wrote:
             | It's not about assigning blame. A revolutionary technology
             | enables revolutionary change and all sorts of bad actors
             | will take advantage of it.
        
           | biomcgary wrote:
           | Safetyism has been the standard civic religion since 9/11,
           | and I doubt it will go quietly into the night. Much like the
           | bishops and the king had a symbiotic relationship to maintain
           | control and limit change (e.g., King James of KJV Bible
           | fame), the government and corporations have a similarly
           | tense but aligned relationship. Boogeymen from the left or
           | the right can always be conjured to provide the fear
           | necessary to control.
           | 
           | Would millions have died if the old religion gave way to the
           | new one without a fight? The problem for the Vatican was that
           | their rhetoric wasn't at top form after mentally stagnating
           | for a few centuries since arguing with Roman pagans, so war
           | was the only possibility to win.
           | 
           | (Don't forget Luther's post hoc justification of killing
           | 100k+ peasants, but he won because he had better rhetorical
           | skills AND the backing of aristocrats and armies. https://en.
           | wikipedia.org/wiki/Against_the_Murderous,_Thievin... and
           | https://en.wikipedia.org/wiki/German_Peasants%27_War)
        
             | kurthr wrote:
             | "Think of the Children" has been the norm since long before
             | it was re-popularized in the 80s for song lyrics, in the
             | 90s for encryption, and now for everything else.
             | 
             | I almost think it's the eras between that are more notable.
        
             | EchoReflection wrote:
             | "The Coddling of the American Mind" by Jonathan Haidt and
             | Greg Lukianoff is a _very_ good (and troubling) book that
             | talks a lot about  "safetyism". I can't recommend it
             | enough.
             | 
             | https://jonathanhaidt.com/
             | 
             | https://www.betterworldbooks.com/product/detail/the-
             | coddling...
             | 
             | https://www.audible.com/pd/The-Coddling-of-the-American-
             | Mind...
        
               | astrange wrote:
               | It's strange that people think Stability is making
               | decisions based on American politics when it isn't an
               | American company and other countries generally have
               | stricter laws in this area.
        
           | dotancohen wrote:
           | The current focus on "safety" (I would prefer a less gracious
           | term) are based as much on fear as on morality. Fear of
           | government intervention and woke morality. The progress in
           | technology is astounding, the focus on sabotaging then
           | publicly available versions of the technology to promote (and
           | deny) narratives is despicable.
        
           | fngjdflmdflg wrote:
           | I agree. There should have been guardrails in place to
           | prevent people who espouse extremist viewpoints like Martin
           | Luther from spreading their dangerous and hateful rhetoric. I
           | rest easy knowing that only people with the correct
           | intentions will be able to use AI.
        
         | miohtama wrote:
         | The British banned unlicensed printing presses in 1662 in
         | the name of preventing harm.
         | 
         | https://en.m.wikipedia.org/wiki/Licensing_of_the_Press_Act_1...
        
           | freedomben wrote:
           | Yes, and fortunately that banning was the end of hateful
           | printed content. Since that ban, the only way to print
           | objectionable material has been to do it by hand with pen and
           | ink.
           | 
           | (For clarity, I'm joking, and I know you're also not implying
           | any such thing. I appreciate your comment/link)
        
         | someuser2345 wrote:
         | Harm prevention is definitely not new; books have been subject
         | to censorship for centuries. Just look at the U.S., where we
         | had the Hays Code and the Comics Code Authority. The only
         | difference is that now, Harm is defined by California tech
         | companies rather than the Church or the Monarchy.
        
         | ben_w wrote:
         | I don't think your golden age ever truly existed -- the Overton
         | Window for acceptable discourse has always been narrow, we've
         | just changed who the in-group and out-groups are.
         | 
         | The out-group used to be atheists, or gays, or witches, or
         | republicans (in the British sense of the word), or people who
         | want to drink. And each of Catholics and Protestants made the
         | other unwelcome across Europe for a century or two. When I was
         | a kid, it was anyone who wanted to smoke weed, or (because UK)
         | any normalised depiction of gay male relationships as being at
         | all equivalent to heterosexual ones[0]. I met someone who was
         | embarrassed to admit they named their son "Hussein"[1], and
         | _absolutely_ any attempt to suggest that ecstasy was anything
         | other than evil. I know at least one trans person who _started_
         | out of the closet, but was very eager to go _into_ the closet.
         | 
         | [0] "promote the teaching in any maintained school of the
         | acceptability of homosexuality as a pretended family
         | relationship" - https://en.wikipedia.org/wiki/Section_28
         | 
         | [1] https://en.wikipedia.org/wiki/Hussein
        
         | jsight wrote:
         | The core problem is centralization of control. If everyone uses
         | their own desktop computer, then everyone is responsible for
         | their own behavior.
         | 
         | If everyone uses Hosting Service F, then at some point people
         | will blur the lines and expect "Hosting Service F" to remove
         | vulgar or offensive content. The lines themselves will be a
         | zeitgeist of sorts with inevitable decisions that are
         | acceptable to some but not all.
         | 
         | Can you even blame them? There are lots of ways for this to go
         | wrong, and no one wants to be on the wrong side of a PR blast.
         | 
         | So heavy guardrails are effectively inevitable.
        
         | __loam wrote:
         | I'm sure the millions of people they stole the data from feel
         | very empowered.
        
       | hizanberg wrote:
       | IMO the "safety" in Stable Diffusion is becoming more overzealous
       | where most of my images are coming back blurred, where I no
       | longer want to waste my time writing a prompt only for it to
       | return mostly blurred images. Prompts that worked in previous
       | versions like portraits are coming back mostly blurred in SDXL.
       | 
       | If this next version is just as bad, I'm going to stop using
       | Stability APIs. Are there any other text-to-image services that
       | offer similar value and quality to Stable Diffusion without the
       | overzealous blurring?
       | 
       | Edit:
       | 
       | Example prompt's like "Matte portrait of Yennefer" return 8/9
       | blurred images [1]
       | 
       | [1] https://imgur.com/a/nIx8GBR
        
         | nickthegreek wrote:
         | Run it locally.
        
           | lolinder wrote:
           | I haven't tried SD3, but my local SD2 regularly has this
           | pattern where, while the image is developing, it looks like
           | it's coming along fine, and then suddenly in the last few
           | steps it introduces weird artifacts that mask faces. Running
           | locally doesn't get around censorship that's baked into the
           | model.
           | 
           | I tend to lean towards SD1.5 for this reason--I'd rather put
           | in the effort to get a good result out of the lesser model
           | than fight with a black box censorship algorithm.
           | 
           | EDIT: See the replies below. I might just have been holding
           | it wrong.
        
             | yreg wrote:
             | Do you use the proper refiner model?
        
               | lolinder wrote:
               | Probably not, since I have no idea what you're talking
               | about. I've just been using the models that InvokeAI
               | (2.3, I only just now saw there's a 3.0) downloads for me
               | [0]. The SD1.5 one is as good as ever, but the SD2 model
               | introduces artifacts on (many, but not all) faces and
               | copyrighted characters.
               | 
               | EDIT: based on the other reply, I think I understand what
               | you're suggesting, and I'll definitely take a look next
               | time I run it.
               | 
               | [0] https://github.com/invoke-ai/InvokeAI
        
               | yreg wrote:
               | SDXL should be used together with a refiner. You can
               | usually see the refiner kicking in if you have a UI that
               | shows you the preview of intermediate steps. And it can
               | sometimes look like the situation you describe (straining
               | further away from your desired result).
               | 
               | Same goes for upscalers, of course.
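                | 
                | (For reference, this is roughly how the two-stage setup
                | looks in Hugging Face's diffusers library -- a minimal
                | sketch; the 0.8 split and 40 steps are just commonly
                | documented example values, and the prompt is made up:)
                | 
                |   import torch
                |   from diffusers import DiffusionPipeline
                | 
                |   base = DiffusionPipeline.from_pretrained(
                |       "stabilityai/stable-diffusion-xl-base-1.0",
                |       torch_dtype=torch.float16).to("cuda")
                |   refiner = DiffusionPipeline.from_pretrained(
                |       "stabilityai/stable-diffusion-xl-refiner-1.0",
                |       text_encoder_2=base.text_encoder_2,  # share parts
                |       vae=base.vae,
                |       torch_dtype=torch.float16).to("cuda")
                | 
                |   prompt = "matte portrait of a sorceress"
                |   # base denoises the first ~80% of the steps and hands
                |   # its latents to the refiner for the final ~20%
                |   latents = base(prompt=prompt, num_inference_steps=40,
                |                  denoising_end=0.8,
                |                  output_type="latent").images
                |   image = refiner(prompt=prompt, num_inference_steps=40,
                |                   denoising_start=0.8,
                |                   image=latents).images[0]
                |   image.save("refined.png")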
        
               | SV_BubbleTime wrote:
               | Basically don't use SD2.x, it's trash and the community
               | rejected it.
               | 
               | If you are using invoke, try XL.
               | 
               | If you want to really dial into a specific style or apply
                | a specific LoRA, use 1.5.
        
             | fnordpiglet wrote:
              | Be sure to turn off the refiner. This sounds like you're
              | using models that aren't aligned with their base models,
              | and the refiner runs in the last steps. If a prompt is out
              | of alignment with the default base model, it'll heavily
              | distort. Personally, with SDXL I never use the refiner; I
              | just use more steps.
        
               | lolinder wrote:
               | That makes sense. I'll try that next time!
        
               | zettabomb wrote:
               | SD2 isn't SDXL. SD2 was a continuation of the original
               | models that didn't see much success. It didn't have a
               | refiner.
        
               | cchance wrote:
                | Well yeah, because SD2 literally had purposeful censorship
                | of the base model and the CLIP encoder, which basically
                | made it DOA to the entire open-source community that was
                | dedicated to 1.5. SDXL wasn't so bad, so it gained
                | traction, but 1.5 is still the king because it's from
                | before the models were gimped at the knees and relied on
                | workarounds and insane finetunes just to get basic
                | anatomy correct.
        
           | hizanberg wrote:
            | I don't expect my current desktop to be able to handle it,
            | which is why I'm happy to pay for API access, but my next
            | desktop should be capable.
            | 
            | Is the open-sourced version of SDXL less restrictive than
            | their API-hosted version?
        
             | nickthegreek wrote:
             | If you run into issues, switch to a fine-tuned model from
             | civitai.
        
             | yreg wrote:
             | You can set up the same thing you would have locally on
             | some spot cloud instance.
        
         | Tenoke wrote:
         | The nice thing about Stable Diffusion is that you can very
         | easily set it up on a machine you control without any 'safety'
         | and with a user-finetuned checkpoint.
        
           | cyanydeez wrote:
           | they're nerfing the models, not just the prompt engineering.
           | 
           | After SD1.5 they started directly modifying the dataset.
           | 
           | it's only other users who "restore" the porno.
           | 
           | and that's what we're discussing. there's a real concern
           | about it as a public offering.
        
             | Tenoke wrote:
              | Sure, but again, if you run it yourself you can use the
              | user-finetuned checkpoints that have it.
        
               | cyanydeez wrote:
               | yes, but the GP is discussing the API, and specifically
               | the company that offers the base model.
               | 
                | neither wants to offer anything that's legally dubious,
                | and it's not hard to understand why.
        
             | jncfhnb wrote:
             | No it's not. It's perfectly reasonable not to want to
             | generate porn for customers.
             | 
             | The models being open sourced makes them very easy to turn
              | into the most depraved porno machines ever conceived. And
             | they are.
             | 
             | It is in no way a meaningful barrier to what people can do.
             | That's the benefit of open source software.
        
         | lancesells wrote:
         | I don't use it at all but do you mind sharing what prompts
         | don't work?
        
           | hizanberg wrote:
           | Last prompt I tried was "Matte portrait of Yennefer" returned
           | 8/9 blurred images [1]
           | 
           | [1] https://imgur.com/a/nIx8GBR
        
             | not2b wrote:
             | It appears that they are trying to prevent generating
             | accurate images of a real person, because they are worried
             | about deepfakes, and this produces the blurring. While
             | Yennefer is a fictional character she's played by a real
             | actress on Netflix, so maybe that's what is triggering the
             | filter.
        
         | NoMoreNicksLeft wrote:
         | Wait, blurring (black) means that it objected to the content? I
         | tried it a few times on one of the online/free sites
         | (Huggingspace, I think) and I just assumed I'd gotten a
         | parameter wrong.
        
           | pksebben wrote:
           | Not necessarily, but it can. Black squares can come from a
           | variety of problems.
        
         | gangstead wrote:
         | I've never seen blurring in my images. Is that something that
         | they add when you do API access? I'm running SD 1.5 and SDXL
         | 1.0 models locally. Maybe I'm just not prompting for things
         | they deem naughty. Can you share an example prompt where the
         | result gets blurred?
        
           | jncfhnb wrote:
            | If you run locally with the basic stack, it's literally a
            | bool flag that hides NSFW content. It's trivial to turn off,
            | and it's off by default in most open source setups.
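            | 
            | (Concretely, in Hugging Face's diffusers library it's the
            | safety_checker argument -- a minimal sketch, assuming the
            | stock SD 1.5 checkpoint; the prompt is made up:)
            | 
            |   import torch
            |   from diffusers import StableDiffusionPipeline
            | 
            |   pipe = StableDiffusionPipeline.from_pretrained(
            |       "runwayml/stable-diffusion-v1-5",
            |       torch_dtype=torch.float16,
            |       safety_checker=None,            # drop the NSFW filter
            |       requires_safety_checker=False,  # silence the warning
            |   ).to("cuda")
            |   image = pipe("matte portrait of a sorceress").images[0]
            |   image.save("unfiltered.png")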
        
           | stavros wrote:
           | It's a filter they apply after generation.
        
         | araes wrote:
         | Taking the actual example you provided, I can understand the
         | issue, since it amounts to blurring images of a virtual
         | character that are not actually "naughty." Equivalent images
         | are available in bulk on every search engine with "yennefer
         | witcher 3 game" [1][2][3][4][5][6], which return almost
         | exactly the generated images, minus the blur.
         | 
         | [1] Google:
         | https://www.google.com/search?sca_esv=a930a3196aed2650&q=yen...
         | 
         | [2] Bing via Ecosia:
         | https://www.ecosia.org/images?q=yennefer%20witcher%203%20gam...
         | 
         | [3] Bing:
         | https://www.bing.com/images/search?q=yennefer+witcher+3+game...
         | 
         | [4] DDG:
         | https://duckduckgo.com/?va=e&t=hj&q=yennefer+witcher+3+game&...
         | 
         | [5] Yippy:
         | https://www.alltheinternet.com/?q=yennefer+witcher+3+game&ar...
         | 
         | [6] Dogpile:
         | https://www.dogpile.com/serp?qc=images&q=yennefer+witcher+3+...
        
       | ametrau wrote:
       | "Safety" = safe to our reputation. It's insulting how they imply
       | safety from "harm".
        
         | kingkawn wrote:
         | So they should dash their company on the rocks of your empty
         | moral positions about freedom?
        
           | dingnuts wrote:
           | should pens be banned because a talented artist could draw a
           | photorealistic image of something nasty happening to someone
           | real?
        
             | mrighele wrote:
             | Photoshop and the likes (modern day's pens) should have an
             | automatic check that you are not drawing porn, censor the
             | image and report you to the authorities if it thinks it
             | involves minors.
             | 
             | edit: yes it is sarcasm, though I fear somebody will think
             | it is in fact the right way to go.
        
               | mtlmtlmtlmtl wrote:
               | That's ridiculous. What about real pens and paintbrushes?
               | Should they be mandated to have a camera that analyses
               | everything you draw/write just to be "safe"?
               | 
               | Maybe we should make it illegal to draw or write anything
               | without submitting it to the state for "safety" analysis.
        
               | gambiting wrote:
               | I hope that's sarcasm.
        
               | IMTDb wrote:
               | Text editors and the likes (modern day's typewriters)
               | should have an automatic check that you are not
               | criticizing the government, censor the text and report
                | you to the authorities if it thinks you support an
                | alternate political party.
               | 
               | Hopefully you are going to be absolutely shocked by the
               | prospect of the above sentence. But as you can see,
               | surveillance is a slippery slope. "Safety" is a very
               | dangerous word because everybody wants to be "safe" but
               | no one is really ready to define what "safe" actually
               | means. The moment we start baking cultural / political /
               | environmental preferences and biases in the tools we use
               | to produce content, we allow other group of people with
               | different views to use those "safeguards" to harm us or
               | influence us in ways we might not necessarily like.
               | 
               | The safest notebook I can find is indeed a simple pen and
               | paper because it does not know or care what is being
                | written, it just does its best regardless of how amazing
               | or horrible the content is.
        
         | jameshart wrote:
          | Safety also protects people trying to make use of the
          | technology at scale for benign use cases.
         | 
         | Want to install a plugin into Wordpress to autogenerate fun
         | illustrations to go at the top of the help articles in your
         | intranet? You probably don't want the model to have a 1 in 100
         | chance of outputting porn or extreme violence.
        
       | SubiculumCode wrote:
       | It is interesting to me that these diffusion image models are so
       | much smaller than the LLMs.
        
       | sjm wrote:
       | The example images look so bad. Absolutely zero artistic value.
        
         | wongarsu wrote:
         | From a technical perspective they are impressive. The depth of
         | field in the classroom photo and the macro shot. The detail in
         | the chameleon. The perfect writing in very different styles and
         | fonts. The dust kicked up by the donut.
         | 
         | The artistic value is something you have to add with a good
         | prompt with artistic vision. These images are probably the AI
         | equivalent of "programmer art". It fulfills its function, but
         | lacks aesthetic considerations. I wouldn't attribute that to
         | the model just yet.
        
         | the_duke wrote:
         | I'm willing to bet that they are avoiding artistic images on
         | purpose to not get any heat from artists feeling ripped off,
         | which did happen previously.
        
       | robertwt7 wrote:
       | It'll be interesting to see what "safety" means in this case
       | given the censorship in diffuser models nowadays. Look what's
       | happening with Gemini, it's quite scary really how different
       | companies have different censorship values
       | 
       | I've had my fair share of frustration with DALL-E as well when
       | trying to generate weapon images for game assets. I had to
       | tweak my prompts a lot.
        
         | yreg wrote:
         | > it's quite scary really how different companies have
         | different censorship values
         | 
         | The fact that they have censorship values is scary. But the
         | fact that those are different is better than the alternative.
        
       | declan_roberts wrote:
       | Can it generate an image of people without injecting insufferable
       | diversity quotas into each image? If so then it's the most
       | advanced model on the internet right now!
        
       | miohtama wrote:
       | No model. Half of the announcement text is "we area really really
       | responsible and safe, believe us."
       | 
       | Kind of a dud for an announcement.
        
         | nextworddev wrote:
          | The company itself is about to run out of money, hence the
          | Hail Mary attempt at getting acquired.
        
           | yreg wrote:
            | They raised $110M in October. How much are they burning,
            | and on what? Training each model allegedly costs hundreds
            | of thousands of dollars.
        
       | haolez wrote:
       | Rewriting the "safety" part, but replacing the AI tool with an
       | imaginary knife called Big Knife:
       | 
       | "We believe in safe, responsible knife practices. This means we
       | have taken and continue to take reasonable steps to prevent the
       | misuse of Big Knife by bad actors."
        
       | animex wrote:
       | Ugh, another startup(?) requiring Discord to use their product.
       | :(
        
         | tavavex wrote:
         | As far as I know, the Discord thing is only for doing early
         | testing among their community. The full model releases are
         | posted to Hugging Face.
        
       | 13of40 wrote:
       | "we have taken and continue to take reasonable steps to prevent
       | the misuse of Stable Diffusion 3 by bad actors"
       | 
       | It's kind of a testament to our times that the person who chooses
       | to look at synthetic porn instead of supporting a real-life human
       | trafficking industry is the bad actor.
        
         | user_7832 wrote:
         | Agree, I think it fundamentally stems from the old conservative
         | view that porn = bad. Morally policing such models is
         | questionable.
        
           | rockooooo wrote:
           | no AI company wants to be the one generating pornographic
           | deepfakes of someone and getting in legal / PR hot water
        
             | seanw444 wrote:
             | Which is why this should be a much more decentralized
             | effort. Hard to take someone to court when it's not one
             | single person or company doing something.
        
             | mrkramer wrote:
              | But what if you flip things the other way around? Deepfake
              | porn is problematic not because porn is per se
              | problematic, but because deepfake porn or deepfake revenge
              | porn is made without consent. What if you gave consent to
              | some AI company or porn company to make porn content of
              | you? I see this as an evolution of OnlyFans, where you
              | could make AI-generated deepfake porn of yourself.
             | 
             | Another use case would be that retired porn actors could
             | license their porn persona (face/body) to some AI porn
             | company to make new porn.
             | 
              | I see a big business opportunity in generative AI porn.
        
           | Cookingboy wrote:
           | This is why I think generative AI tech should either be
           | banned or be completely open sourced. Mega tech corporations
           | are plenty of things already, they don't need to be the
           | morality police for our society too.
        
             | pksebben wrote:
             | Even if it is all open sourced, we still have the
             | structural problem of training models large enough to do
             | interesting stuff.
             | 
             | Until we can train incrementally and distribute the
             | workload scalably, it doesn't matter how open the models /
              | methods for training are if you still need a bajillion
             | A100 hours to train the damn things.
        
           | echelon wrote:
            | Horseshoe theory [1] is one of the most interesting viewpoints
           | I've been introduced to recently.
           | 
           | Both sides view censorship as a moral prerogative to enforce
           | their world view.
           | 
           | Some conservatives want to ban depictions of sex.
           | 
           | Some conservatives want to ban LGBT depictions.
           | 
            | Some women's rights folks want to ban depictions of sex.
            | (Some view it as empowerment, some view it as exploitation.)
           | 
           | Some liberals want to ban non-diverse, dangerous
           | representation.
           | 
            | Some liberals want to ban conservative views that run
            | against their own.
           | 
           | Some liberals want to ban religion.
           | 
           | ...
           | 
           | It's team sports with different flavors on each side.
           | 
           | The best policy, IMO, is to avoid centralized censorship and
           | allow for individuals to control their own algorithmic
           | boosting / deboosting.
           | 
           | [1] https://en.wikipedia.org/wiki/Horseshoe_theory
        
             | stared wrote:
             | Yes and no.
             | 
             | I mean, a lot of moderates would like to avoid seeing any
             | extreme content, regardless of whether it is too much left,
             | right, or just in a non-political uncanny valley.
             | 
              | While the Horseshoe Theory has some merits (e.g., both left
              | and right extremes may favor justified coercion, share the
              | us-vs-them mentality, etc.), it is grossly oversimplified.
              | Even the very simple (yet two-dimensional) Political
              | Compass model is much better.
        
               | echelon wrote:
               | I think it's just a different projection to highlight
               | similarities in left and right and is by no means the
               | only lens to use.
               | 
               | The fun quirk is that there are similarities, and this
               | model draws comparison front and center.
               | 
               | There are multiple useful models for evaluating politics,
               | though.
        
             | crashmat wrote:
              | I don't think there are any (even far) left wanting to ban
              | non-diverse representation. I think it's impossible to ban
              | 'conservative thoughts' because that's such a poorly
              | defined phrase. However, there are people who want to ban
              | religion. One difference is that a much larger proportion
              | of the far right (almost all of them) want to ban LGBTQ
              | depiction and existence, compared to the number of the far
              | left who want to ban religion or non-diverse representation.
             | 
             | It says on the wikipedia article itself 'The horseshoe
             | theory does not enjoy wide support within academic circles;
             | peer-reviewed research by political scientists on the
             | subject is scarce, and existing studies and comprehensive
             | reviews have often contradicted its central premises, or
             | found only limited support for the theory under certain
             | conditions.'
        
               | echelon wrote:
                | > I don't think there are any (even far) left wanting to
                | ban non-diverse representation.
               | 
               | Look at the rules to win an Oscar now.
               | 
               | To cite a direct and personal case, I was involved in
               | writing code for one of the US government's COVID bailout
               | programs, the Restaurant Revitalization Fund. Billions of
               | dollars of relief, but targeted to non-white, non-male
               | restaurant owners. There was a lawsuit after the fact to
               | stop the unfair filtering, but it was too late and the
               | funds were completely dispensed. That felt really gross
               | (though many of my colleagues cheered and even jeered at
               | the complainers).
               | 
               | > I think it's impossible to ban 'conservative thoughts'
               | because that's such a poorly defined phrase.
               | 
               | I commented in /r/conservative (which I was banned from)
               | a few times, and I was summarily banned from five or six
               | other subreddits by some heinous automation. Guilt by
               | association. Except it wasn't even -- I was adding
               | commentary in /r/conservative to ask folks to sympathize
               | more with trans folks. Both sides here ideologically ban
               | with impunity and can be intolerant of ideas they don't
               | like.
               | 
               | I got banned from my city's subreddit for posting a
               | concern about crime. Or maybe they used these same
               | automated, high-blast radius tools. I'm effectively cut
               | out of communication with like-minded people in my city.
               | I think that's pretty fucked.
               | 
               | Mastodon instances are set up to ban on ideology...
               | 
               | This is all wrong and a horrible direction to go in.
               | 
               | It doesn't matter what _your_ views are, I think we all
               | need to be more tolerant and empathetic of others. Even
               | those we disagree with.
        
             | asddubs wrote:
             | this comment reminds me of that "did you know good things
             | and bad things are actually the same" tweet
        
               | echelon wrote:
               | I'm sorry, but censorship and partisanship are not good
               | things.
               | 
               | Both sides need to get a grip, start meeting in the
               | middle, and generally let each be to their own.
               | 
               | Platforms weighing in on this makes it even worse and
               | more polarizing.
               | 
               | We shouldn't be so different and disagreeable. We have
               | more in common with one another than not.
               | 
               | The points of polarization on each end rhyme with one
               | another.
        
         | sigmoid10 wrote:
         | I don't think the problem is watching synthetic images. The
         | problem is generating them based off actual people and sharing
         | them on the internet in a way that the people watching can't
         | tell the difference anymore. This was already somewhat of a
         | problem with Photoshop and once everyone with zero skills can
         | do it in seconds and with far better quality, it will become a
         | nightmare.
        
           | 725686 wrote:
           | We are already there, you can no longer trust any image or
           | video you see, so what is the point? Bad actors will still be
           | able to create fake images and videos as they already do.
           | Limiting it for the average user is stupid.
        
             | mplewis wrote:
             | You guys know you can just draw porn, right?
        
               | seanmcdirmid wrote:
                | Generating porn is easier and cheaper. You don't have to
                | spend the time learning to draw naked bodies, which can
                | be substantial. (The joke being that serious artists go
                | through a lot of nude model drawing sessions, but that
                | isn't porn.)
        
               | tourmalinetaco wrote:
               | > but it isn't porn
               | 
               | In my experience with 2D artists, studying porn is one of
               | their favorite forms of naked model practice.
        
               | seanmcdirmid wrote:
               | The models art schools get for naked drawing sessions
               | usually aren't that attractive, definitely not at a porn
               | ideal. The objective is to learn the body, not become
               | aroused.
               | 
                | There is a lot of (mostly non-realistic) porn that comes
               | out of art school students via the skills they gain.
        
             | sigmoid10 wrote:
             | We are not actually there yet. First, you still need some
             | technical understanding and a somewhat decent setup to run
             | these models yourself without the guardrails. So the
             | average greasy dude who wants to share HD porn based on
              | your daughter's LinkedIn profile pic on NSFW subreddits
             | still has too many hoops to jump through. Right now you can
             | also still spot AI images pretty easily, if you know what
             | to look for. Especially for previous stable diffusion
             | models. But all of this could change very soon.
        
           | Salgat wrote:
           | I'll challenge this idea and say that once it becomes
           | ubiquitous, it actually does more good than harm. Things like
           | revenge porn become pointless if there's no way to prove it's
           | even real, and I have yet to ever see deep fakes of porn
           | amount to anything.
        
           | idle_zealot wrote:
           | > once everyone with zero skills can do it in seconds and
           | with far better quality, it will become a nightmare.
           | 
           | Will it be a nightmare? If it becomes so easy and common that
           | anyone can do it, then surely trust in the veracity of
           | damaging images will drop to about 0. That loss of trust
           | presents problems, but not ones that "safe" AI can solve.
        
             | foobarian wrote:
             | Arguably that loss of trust would be a net positive.
        
             | sigmoid10 wrote:
             | >surely trust in the veracity of damaging images will drop
             | to about 0
             | 
             | Maybe, eventually. But we don't know how long it will take
             | (or if it will happen at all). And the time until then will
             | be a nightmare for every single woman out there who has any
             | sort of profile picture on any website. Just look at how
             | celebrity deepfakes got reddit into trouble even though
             | their generation was vastly more complex and you could
             | still clearly tell that the videos were fake. Now imagine
             | everyone can suddenly post undetectable nude selfies of
             | your girlfriend on nsfw subreddits. Even if people
             | eventually catch on, that first shock will be unavoidable.
        
               | jquery wrote:
               | The tide is rolling in and we have two options... yell at
               | the tide really loud that we were here first and we
               | shouldn't have to move... or get out of the way. I'm a
               | lot more sympathetic to the latter option myself.
        
               | swatcoder wrote:
               | Your anxiety dream relies on there currently being some
               | _technical_ bottleneck limiting the creation or spread of
                | embarrassing fake nudes as a way of cyberbullying.
               | 
               | I don't see any evidence of that. What I see is that
                | people who want to embarrass and bully others are already
               | fully enabled to do so, and do so.
               | 
               | It seems more likely to me and many of us that the
               | bottleneck that stops it from being worse is simply that
               | only so many people think it's reasonable or satisfying
               | to distribute embarrassing fake nudes of someone. Society
               | already shuns it and it's not that effective as a way of
               | bullying and embarrassing people, so only so many people
               | are moved to bother.
               | 
               | Assuming that the hyped up new product is due to swoop in
               | and disrupt the cyberbullying "industry" is just a
               | classic technologist's fantasy.
               | 
               | It ignores all the boring realities of actual human
               | behavior, social norms, secure equilibria, and so on;
               | skips any evidence building or research effort; and just
               | presumes that some new technology is just sooooo powerful
               | that none of that prior ground truth stuff matters.
               | 
               | I get why people who think that way might be on HN or in
               | some Silicon Valley circles, but it can be one of the
               | eyeroll-inducing vices of these communities as much as
               | one of their motivational virtues.
        
               | mdasen wrote:
               | This: it won't happen immediately and I'd go even further
               | to say that even if trust in images drops to zero,
               | it's still going to generate a lot of hell.
               | 
               | I've always been able to say all sorts of lies. People
               | have known for millennia that lies exist. Yet lies still
               | hurt people a ton. If I say something like, "idle_zealot
               | embezzled from his last company," people know that could
               | be a lie (and I'm not saying you did, I have no idea who
               | you are). But that kind of stuff can certainly hurt
               | people. We all know that text can be lies and therefore
               | we should have zero trust in any text that we read - yet
               | that isn't how things play out in the real world.
               | 
               | Images are compelling even if we don't trust that they're
               | authentic. Hell, paintings were used for thousands of
               | years to convey "truth", but a painting can be a lie just
               | as much as text or speech.
               | 
               | We created tons of religious art in part because it makes
               | the stories people want others to believe more concrete
               | for them. Everyone knows that "Christ in the Storm on the
               | Sea of Galilee" isn't an authentic representation of
               | anything. It was painted in 1633, more than a millennium
               | and a half after the event was purported to have happened.
               | But it's still the kind of thing that's powerful.
               | 
               | An AI generated image of you writing racist graffiti is
               | way more believable to be authentic. I have no reason to
               | think you'd do such a thing, but it's within the realm of
               | possibility. There's zero possibility (disregarding
               | supernatural possibilities) that Rembrandt could
               | accurately represent his scene in "Christ in the Storm on
               | the Sea of Galilee". What happens when all the search
               | engine results for your name start calling you a racist -
               | even when you aren't?
               | 
               | The fact is that even when we know things can be faked,
               | we still put a decent amount of trust in them. People
               | spread rumors all the time. Did your high school not have
               | a rumor mill that just kinda destroyed some kids?
               | 
               | Heck, we have right-wing talking heads making up
               | outlandish nonsense that's easily verifiable as false
               | that a third of the country believes without questioning.
               | I'm not talking about stuff like taxes or gun control or
               | whatever - they're claiming things like schools having to
               | have litter boxes for students that identify as cats
               | (https://en.wikipedia.org/wiki/Litter_boxes_in_schools_hoax).
               | We know that people lie. There should be zero trust in a
               | statement like "schools are installing litter boxes for
               | students that identify as cats." Yet it spread like
               | crazy, many people still believe it despite it being
               | proven false, and it has been used to harm a lot of LGBT
               | students. That's a way less believable story than an AI
               | image of you with a racist tattoo.
               | 
               | Finally, no one likes their name and image appropriated
               | for things that aren't them. We don't like lies being
               | spread about us even if 99% of people won't believe the
               | lies. Heck, we see Donald Trump go on rants about
               | truthful images of him that portray his body in ways he
               | doesn't like (and they're just things like him golfing,
               | but an unflattering pose). I don't want fake naked images
               | of me even if they're literally labeled as fake. It still
               | feels like an invasion of privacy and in a lot of ways it
               | would end up that way - people would debate things like
               | "nah, her breasts probably aren't that big." Words can
               | hurt. Images can hurt even more - even if it's all lies.
               | There's a reason why we created paintings even when we
               | knew that paintings weren't authentic: images have power
               | and that power is going to hurt people even more than the
               | words we've always been able to use for lies.
               | 
               | tl;dr: 1) It will take a long time before people's trust
               | in images "drops to zero"; 2) Even when people know an
               | image isn't real, it's still compelling - it's why
               | paintings have existed and were important politically for
               | millennia; 3) We've always known speech and text can be
               | lies, but we regularly see lies believed and hugely
               | damage people's lives - and images will always be more
               | compelling than speech/text; 4) Even if no one believes
               | something is true, there's something psychologically
               | damaging about someone spreading lies about you - and
               | it's a lot worse when they can do it with imagery.
        
             | IanCal wrote:
             | > If it becomes so easy and common that anyone can do it,
             | then surely trust in the veracity of damaging images will
             | drop to about 0
             | 
             | People believe plenty of just written words - which are
             | extremely easy to "fake", you just type them. Why has that
             | trust not dropped to about 0?
        
               | UberFly wrote:
               | Exactly. They are giving people's deductive reasoning
               | skills too much credit.
        
               | Al-Khwarizmi wrote:
               | It kind of has? People believe written words when they
               | come from a source that they consider, erroneously or
               | not, to be trustworthy (newspaper, printed book,
               | Wikipedia, etc.). They trust the source, not the words
               | themselves just due to being written somewhere.
               | 
               | This has so far not been true of videos (e.g. a video of
               | a celebrity from a random source has typically been
               | trusted by laypeople) and that should change.
        
             | Sohcahtoa82 wrote:
             | > if it becomes so easy and common that anyone can do it,
             | then surely trust in the veracity of damaging images will
             | drop to about 0.
             | 
             | Spend more time on Facebook and you'll lose your faith in
             | humanity.
             | 
             | I've seen obviously AI-generated pictures of a 5-year-old
             | holding a chainsaw right next to a beautiful wooden
             | sculpture, and the comments are filled with boomers amazed
             | at the child's talent.
             | 
             | There are still people that think the IRS will call them
             | and make them pay their taxes over the phone with Apple
             | gift cards.
        
               | SkyBelow wrote:
               | If we follow the idea of safety, should we restrict the
               | internet so either such users can safely use the internet
               | (and phones, gift cards, technology in general) without
               | being scammed, or otherwise restrict it so that at-risk
               | individuals can't use the technology at all?
               | 
               | Otherwise, why is AI specifically being targeted, other
               | than a fear of new things that looks a lot like the
               | moral panics over video games?
        
               | themoonisachees wrote:
               | In concept this is maybe desirable; boot anyone off the
               | internet that isn't able to use it safely.
               | 
               | In reality this is a disaster. The elderly and homeless
               | people are already being left behind massively by a
               | society that believes internet access is something
               | everybody everywhere has. This is somewhat fine when the
               | thing they want to access is twitter (and even then, even
               | with the current state of twitter, who are you to judge
               | who should and should not be on it?), but it becomes a
               | Major Problem(tm) when the thing they want to access is
               | their bank. Any technological solution you just thought
               | of for this problem is not sufficient when the question
               | is "can everybody continue to live their lives,
               | considering we've kinda thrust the internet on them
               | without them asking?"
        
             | BryantD wrote:
             | Let me give you a specific counterexample: it's easy and
             | common to generate phishing emails, yet trust in email has
             | not dropped enough to make phishing a non-problem.
        
               | Al-Khwarizmi wrote:
               | Phishing emails mostly work because they apparently come
               | from a trusted source, though. The key is that they fake
               | the source, not that people will just trust random
               | written words just because they are written, as they do
               | with videos.
               | 
               | A better analogy would be Nigerian prince emails, but
               | only a tiny minority of people believe those... or at
               | least that's what I want to think!
        
               | BryantD wrote:
               | The trusted source thing is important, but there's some
               | degree of evidence that videos and images generate trust
               | in a source, I think?
        
               | amenhotep wrote:
               | That's the point. They do, but they _no longer should_.
               | Our technical capabilities for lying have begun to
               | overwhelm the old heuristics, and the sooner people
               | realise the better.
        
           | monitorlizard wrote:
           | Perhaps I'm being overly contrarian, but from my point of
           | view, I feel that could be a blessing in disguise. For
           | example, in a world where deepfake pornography is ubiquitous,
           | it becomes much harder to tarnish someone's reputation
           | through revenge porn, real or fake. I'm reminded of Syndrome
           | from The Incredibles: "When everyone is super no one will
           | be."
        
           | fimdomeio wrote:
           | The censoring of porn content exists for PR reasons. They
           | just want to have a way to say "we tried to prevent it". If
           | anyone wants to generate porn, it takes just 30 min of
           | research to find the huge number of Stable Diffusion-based
           | models with nsfw content.
           | 
           | If you can generate synthetic images and have a channel to
           | broadcast them, then you could generate way bigger problems
           | than fake celebrity porn.
           | 
           | Not saying that it is not a problem, but rather that it is a
           | problem inherent to the whole tool, not to some specific
           | subjects.
        
           | boringuser2 wrote:
           | If that ever becomes an actual problem, our entire society
           | will be at a filter point.
           | 
           | This is the philosophical problem with these kinds of
           | incremental mitigations -- as soon as the actual problem
           | were to manifest it would instantly become a civilization-
           | level threat that would only be resolved with drastic
           | restructuring of society.
           | 
           | Same logic applies to an AI that replaces a programmer. As
           | soon as AI is that advanced, the problem requires vast
           | changes.
           | 
           | Incremental mitigations don't do anything.
        
           | cooper_ganglia wrote:
           | I watched an old Tom Scott video of him predicting what the
           | distant year 2030 would look like. In his talk, he mentioned
           | privacy becoming something quaint that your grandparents used
           | to believe in.
           | 
           | I've wondered for a while if we just adapt to the point that
           | we're unfazed by fake nude photos of people. The recent Bobbi
           | Althoff "leaks" reminded me of this. That's a little
           | different since she's a public figure, but I really wonder if
           | we just go into the future assuming all photos like that have
           | been faked, and if someone's iCloud gets leaked now it'll
           | actually be less stressful because 1. they can claim they're
           | AI images, or 2. there are already lewd AI images of them,
           | so the real ones leaking don't make much of a difference.
        
             | flir wrote:
             | There's an argument that privacy (more accurately
             | anonymity) is a temporary phenomenon, a consequence of the
             | scale that comes with industrialization. We didn't really
             | have it in small villages, and we won't really have it in
             | the global village.
             | 
             | (I'm not a fan of the direction, but then I'm a product of
             | stage 2).
        
           | Szpadel wrote:
           | serious question: is it really that hard to remove personal
           | information from training data so the model does not know
           | what specific public figures look like?
           | 
           | I believe this worked with nudity: when asked, the model
           | generated "smooth" intimate regions (like on some kind of
           | doll)
           | 
           | so you could ask for e.g. a generic president but not any
           | specific one, which would make it very hard to generate
           | anyone specific
        
             | amenhotep wrote:
             | Proprietary, inaccessible models can somewhat do that.
             | Locally hosted models can simply be trained by the user on
             | what a specific person looks like; you just need a couple
             | dozen photos. Keyword: LoRA (rough sketch below).
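             | 
             | A minimal sketch of what that looks like with the Hugging
             | Face diffusers + peft libraries (the base model ID, rank,
             | and target modules here are illustrative assumptions, not
             | a tested recipe):
             | 
             |     # Attach small LoRA adapters to a Stable Diffusion
             |     # UNet; only these adapter weights get trained on the
             |     # handful of subject photos.
             |     from diffusers import StableDiffusionPipeline
             |     from peft import LoraConfig
             | 
             |     pipe = StableDiffusionPipeline.from_pretrained(
             |         "runwayml/stable-diffusion-v1-5")  # assumed base
             |     lora = LoraConfig(
             |         r=4,           # low rank: one subject needs
             |         lora_alpha=4,  # very few trainable parameters
             |         target_modules=["to_k", "to_q",
             |                         "to_v", "to_out.0"],  # attention
             |     )
             |     pipe.unet.add_adapter(lora)
             |     # ...then run a standard denoising training loop over
             |     # ~20 captioned photos, updating only the adapter
             |     # weights, and save them: a file of a few megabytes
             |     # that loads on top of the stock model.
             | 
             | The adapter is tiny and cheap to train compared to the
             | base model, which is why filtering the base release's
             | dataset can't prevent this for locally hosted weights.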
        
           | fennecbutt wrote:
           | But just like privacy issues, this'll be possible.
           | 
           | It's only bad because society still hasn't normalised sex,
           | from a gay perspective y'all are prude af.
           | 
           | It's a shortcut for us to just accept that these social
           | ideals and expectations will have to change, so we may as
           | well do it now.
           | 
           | In 100 years, people will be able to make a personal AI that
           | looks, sounds and behaves like any person they want and does
           | anything they want. We'll have thinking dust: you can
           | already buy cameras about a mm^2 in size, and in the future
           | I imagine they'll be even smaller.
           | 
           | At some point it's going to get increasingly unproductive
           | trying to safeguard technology without people's social
           | expectations changing.
           | 
           | Same thing with Google Glass, shunned pretttty much
           | exclusively bc it had a camera on it (even tho phones at the
           | time did too), but now we've got Ray-Ban camera glasses and
           | 50 years from now all glasses will have cameras, if we even
           | still wear them.
        
             | spazx wrote:
             | Yes this. This is what I've been trying to explain to my
             | friends.
             | 
             | When Tron came out in 1982, it was disliked because back
             | then using CGI effects was considered "cheating". Then
             | a while later Pixar did movies entirely in CGI and they
             | were hits. Now almost every big studio movie uses CGI.
             | Shunned to embraced in like, 13 years.
             | 
             | I think over time the general consensus on AI models will
             | soften, although it might take longer in some
             | communities. (Username checks out lol, furry here also. I
             | think the furs may take longer to embrace it.)
             | 
             | (Also, people will still continue to use older tools like
             | Photoshop to accomplish similar things.)
        
         | stared wrote:
         | It is not only about morals but about the incentives of the
         | parties. The demand for sexually explicit content is bigger
         | than, say, that for niche artistic experiments with
         | geometrical living cupboards owned by a cybernetic dragon.
         | 
         | Stability AI, very understandably, does not want to be
         | associated with "the porn-generation tool". And if, even
         | occasionally, it generated criminal content, the backlash
         | would be enormous. Censoring the data requires effort but is
         | (for companies) worth it.
        
         | nonrandomstring wrote:
         | The term "bad actor" is starting to get cringe.
         | 
         | Ronald Reagan was a bad actor.
         | 
         | George Bush wore out "evildoers"?
         | 
         | Where next... fiends, miscreants, baddies, hooligans,
         | deadbeats?
         | 
         | Dastardly digital deviants Batman!
        
         | five_lights wrote:
         | >It's kind of a testament to our times that the person who
         | chooses to look at synthetic porn instead of supporting a real-
         | life human trafficking industry is the bad actor.
         | 
         | "Bad actor" is a pretty vague term, I think they are using it
         | as a catch all without diving into the specifics. we are all
         | projecting what that may mean based on our own awareness of
         | this topic as a result.
         | 
         | I totally agree with your assessment and honestly would love to
         | see this tech create less of a demand for the product human-
         | traffickers produce.
         | 
         | Celebrity deepfakes and racist images made by internet trolls
         | are a few of the overt things they are willing to acknowledge
         | as problems and fight against (Google Gemini's overcorrection
         | on this has been the talk of the week). Does it put pressure
         | on the companies to change for PR reasons? Yes. It also
         | creates a bit of a Streisand effect, so it may be a zero-sum
         | game.
         | 
         | We aren't talking about the big issue surrounding this tech,
         | the issue that would cause far more damage to their brand than
         | celebrity deep fakes:
         | 
         | Pedophilic image generation.
         | 
         | Guard rails should be miles high for this one.
        
       | GenericPoster wrote:
       | The talk of "safety" and harm in every image or language model
       | release is getting quite boring and repetitive. The reasons why
       | it's there is obvious and there are known workarounds yet the
       | majority of conversations seems to be dominated by it. There's
       | very little discussion regarding the actual technology and I'm
       | aware of the irony of mentioning this. Really wish I could filter
       | out these sorts of posts.
       | 
       | Hopefully it dies down soon, but I doubt it. At least we don't have
       | to hear garbage about "WHy doEs opEn ai hAve oPEn iN thE namE iF
       | ThEY aReN'T oPEN SoURCe"
        
         | learningerrday wrote:
         | I hope the safety conversation doesn't die. The societal
         | effects of these technologies are quite large, and we should be
         | okay with creating the space to acknowledge and talk about the
         | good and the bad, and what we're doing to mitigate the negative
         | effects. In any case, even though it's repetitive, there exists
         | someone out there on the Interwebs who will discover that
         | information for the first time today (or whenever the release
         | is), and such disclosures are valuable. My favorite relevant
         | XKCD comic: https://xkcd.com/1053/
        
       | iterateAutomate wrote:
       | What is with these names haha, Stable Diffusion XL 1.0 and now
       | Stable Diffusion 3??
        
         | yreg wrote:
         | There was 1.0, 1.5, 2.0, XL and now 3.0.
         | 
         | Not that weird.
        
         | cchance wrote:
         | XL was basically an experiment on the 2.1 architecture with
         | some tweaks, but at a larger image size... hence the "XL". It
         | wasn't really an evolution of the underlying architecture,
         | which is why it wasn't 3.0 or even 2.5; it was just "bigger"
         | lol
        
       | k__ wrote:
       | So, they block all bad actors but themselves?
        
       | ssalka wrote:
       | I wonder if this will actually be adopted by the community,
       | unlike SD 2.0. Many are still developing around SD 1.5 due to
       | its uncensored nature. SDXL has done better than 2.0, but has
       | greater hardware requirements, so it still can't be used by
       | everyone running 1.5.
        
       | caycep wrote:
       | Are all the models/backends behind Stability products basically
       | available as OSS via Ludwig Maximilian University, more or
       | less?
        
       | ummonk wrote:
       | It's going to have a restrictive license like Stable Cascade no
       | doubt.
        
       ___________________________________________________________________
       (page generated 2024-02-22 23:00 UTC)