[HN Gopher] Stable Diffusion 3
___________________________________________________________________
Stable Diffusion 3
Author : reqo
Score : 823 points
Date : 2024-02-22 13:20 UTC (9 hours ago)
(HTM) web link (stability.ai)
(TXT) w3m dump (stability.ai)
| pqdbr wrote:
| The sample images are absolutely stunning.
|
| Also, I was blown away by the "Stable Diffusion" written on the
| side of the bus.
| kzrdude wrote:
| Is it just me, or is the Stable Diffusion bus image broken
| in the background? The bus back there does not look
| logical w.r.t. placement and size relative to the
| sidewalk.
| PcChip wrote:
| The text/spelling part is a huge step forward
| gat1 wrote:
| I guess we do not know anything about the training dataset?
| _1 wrote:
| It's ethical
| kranke155 wrote:
| "Ethical"
| amirhirsch wrote:
| The dataset is so ethical that it is actually just a press
| release and not generally available.
| wtcactus wrote:
| Who decides what's ethical in this scenario? Is it some
| independent entity?
| potwinkle wrote:
| I decided.
| thelazyone wrote:
| This is a good question - not only for the actual ethics
| of the training, but for the future of AI use for art.
| It's gonna both damage the livelihood of many artists (me
| included, probably) and make art accessible to many more
| people. As long as the training dataset is ethical, I
| think fighting it is hard and pointless.
| yreg wrote:
| What data, in your view, would make the dataset unethical
| vs. ethical?
| satisfice wrote:
| Can it make a picture of a woman chasing a bear?
|
| The old one can't.
| cheald wrote:
| SD 1.5 (using RealisticVision 5.1, 20 steps, Euler A) spit out
| something technically correct (but hilarious) in just a few
| generations.
|
| "a woman chasing a bear, pursuit"
|
| https://i.imgur.com/RqCXVYC.png
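|
| For anyone who wants to try reproducing this locally, here
| is a rough sketch using the diffusers library (the
| checkpoint filename is just a placeholder for wherever your
| RealisticVision copy lives):
|
|   import torch
|   from diffusers import (StableDiffusionPipeline,
|                          EulerAncestralDiscreteScheduler)
|
|   # Load a local SD 1.5 checkpoint; the path is an assumption
|   pipe = StableDiffusionPipeline.from_single_file(
|       "realisticVisionV51.safetensors",
|       torch_dtype=torch.float16).to("cuda")
|   # "Euler A" corresponds to the Euler ancestral scheduler
|   pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(
|       pipe.scheduler.config)
|
|   image = pipe("a woman chasing a bear, pursuit",
|                num_inference_steps=20).images[0]
|   image.save("chase.png")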
| kbumsik wrote:
| So there is no license information yet?
| alexb_ wrote:
| > We believe in safe, responsible AI practices. This means we
| have taken and continue to take reasonable steps to prevent the
| misuse of Stable Diffusion 3 by bad actors. Safety starts when we
| begin training our model and continues throughout the testing,
| evaluation, and deployment. In preparation for this early
| preview, we've introduced numerous safeguards. By continually
| collaborating with researchers, experts, and our community, we
| expect to innovate further with integrity as we approach the
| model's public release.
|
| What exactly does this mean? Will we be able to see all of the
| "safeguards" and access all of the technology's power without
| someone else's restrictions on them?
| Tiberium wrote:
| For SDXL this meant that there were almost no NSFW (porn and
| similar) images included in the dataset, so the community had
| to fine-tune the model themselves to make it generate those.
| hhjinks wrote:
| The community would've had to do that anyway. The SD1.5-based
| NSFW models of today are miles ahead of those from just a
| year ago.
| Der_Einzige wrote:
| And the pony SDXL nsfw model is miles ahead of SD1.5 NSFW
| models. Thank you bronies!
| sschueller wrote:
| No worries, the safeguards are only for the general
| public. Criminals will have no issues getting around them.
| /s
| SXX wrote:
| Criminals? We don't care about those.
|
| Think of the children! We must stop people from generating
| porn!
| willsmith72 wrote:
| At this point, perfect text would be a game-changer if it
| can be solved.
|
| Midjourney 6 can be completely photorealistic and include
| valid text, but it also sometimes adds bad text. It's not
| much, but having to use an image editor for that is still
| annoying. For creating marketing material, getting perfect
| text every time and never getting bad text would be
| amazing.
| falcor84 wrote:
| I wonder if we could get it to generate a layered output, to
| make it easy to change just the text layer. It already creates
| the textual part in a separate pass, right?
| deprecative wrote:
| I would bet that Adobe is definitely salivating at that.
| It might not happen for a long time, but it seems like a
| no-brainer once the technology can handle it. The last few
| years have been fast; I interacted with the JS landscape
| for a few years, and it moves faster than Sonic - this
| tech iterates just as quickly.
| spywaregorilla wrote:
| Current open source tools include pretty decent
| off-the-shelf Segment Anything-based detectors. It leaves
| a lot to be desired, but you can do layer-like operations,
| automatically detecting certain concepts and applying
| changes to them or, less commonly, exporting the cropped
| areas. But not the content "beneath" the layers, as it
| doesn't exist.
| snovv_crash wrote:
| Which tools would you recommend for this kind of thing?
| spywaregorilla wrote:
| comfyui + https://github.com/ltdrdata/ComfyUI-Impact-Pack
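|
| If you'd rather script it than wire up nodes, the same
| detect-then-edit idea can be sketched in plain Python with
| diffusers, using a text-prompted segmenter (CLIPSeg)
| standing in for the Impact Pack's detectors. The image
| path and prompts here are made up for illustration:
|
|   import torch
|   from PIL import Image
|   from transformers import (CLIPSegProcessor,
|                             CLIPSegForImageSegmentation)
|   from diffusers import StableDiffusionInpaintPipeline
|
|   # 1. Detect the concept: CLIPSeg turns a phrase into a heatmap
|   proc = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
|   seg = CLIPSegForImageSegmentation.from_pretrained(
|       "CIDAS/clipseg-rd64-refined")
|   img = Image.open("photo.png").convert("RGB").resize((512, 512))
|   inputs = proc(text=["the bus"], images=[img], return_tensors="pt")
|   with torch.no_grad():
|       heat = seg(**inputs).logits.sigmoid().squeeze()
|   mask = Image.fromarray(
|       ((heat > 0.4).numpy() * 255).astype("uint8")).resize(img.size)
|
|   # 2. Apply a change only inside the detected region
|   pipe = StableDiffusionInpaintPipeline.from_pretrained(
|       "runwayml/stable-diffusion-inpainting",
|       torch_dtype=torch.float16).to("cuda")
|   out = pipe(prompt="a red bus", image=img,
|              mask_image=mask).images[0]
|   out.save("edited.png")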
| patates wrote:
| Half of the announcement talks about safety. The next step
| will be these control mechanisms being built into all
| sorts of software, I suppose.
|
| It's "safe" for them, not for the users; at the least,
| they should make that clear.
| spir wrote:
| Thanks, I hadn't fully realized that 'safety' means 'safe
| to offer' and not 'safe for users'. I won't forget it.
| wiz21c wrote:
| Rather, they talk about "reasonable steps" toward safety.
| Sounds like "just the minimum so we don't end up in legal
| trouble" to me...
| tasty_freeze wrote:
| There is some truth in what you say, just like saying you're a
| "free speech absolutist" sounds good at first blush. But the
| real world is more complicated, and the provider adds safety
| features because they have to operate in the real world and not
| just make superficial arguments about how things should work.
|
| Yes, they are protecting themselves from lawsuits, but they
| are also protecting other people. Preventing people from
| asking for images of specific celebrities (or children)
| having sex is for their benefit too.
| s1k3s wrote:
| I truly wonder what "unsafe" scenarios an image generator
| could be used for. Don't we already have software that can
| do pretty much anything if a professional human is using
| it?
| t_von_doom wrote:
| I would say the barrier to entry is stopping a lot of
| 'candid' unsafe behaviour. I think you allude to it
| yourself in implying that it currently requires a
| professional to achieve the same results.
|
| But giving that ability to _everyone_ will lead to a huge
| increase in undesirable and targeted/local behaviour.
|
| Presumably it enables any creep to generate what they want by
| virtue of being able to imagine it and type it, rather than
| learn a niche skill set or employ someone to do it (who is
| then also complicit in the act)
| hypocrticalCons wrote:
| "undesirable local behavior"
|
| Why don't you just say you believe thought crime should be
| punishable?
| KittenInABox wrote:
| [Edited: I'm realizing the person I'm responding to is
| kinda unhinged, so I'm retracting out of the convo.]
| wongarsu wrote:
| I imagine they might talk about things like students
| making nudes of their classmates and distributing them.
|
| Or maybe not. It's hard to tell when nobody seems to want
| to spell out what behaviors we want to prevent.
| hypocrticalCons wrote:
| Students already share nudes every day.
|
| Where are the Americans asking about Snapchat? If I were
| a developer at Snapchat I could prolly open a few Blob
| Storage accounts and feed a darknet account big enough to
| live off of. You people are so manipulable.
| jncfhnb wrote:
| Students don't share photorealistic renders of nude
| classmates getting gangbanged though
| 4bpp wrote:
| Would it be illegal for a student who is good at drawing
| to paint a nude picture of an unknowing classmate and
| distribute it?
|
| If yes, why doesn't the same law apply to AI? If no, why
| are we only concerned about it when AI is involved?
| Cthulhu_ wrote:
| Because AI lowers the barrier to entry; using your
| example, few people have the drawing skills (or the
| patience to learn them) or would put in the effort to make
| a picture like that, but the barrier is much lower when it
| takes five seconds of typing out a prompt.
|
| Second, the tool will become available to anyone,
| anywhere, not just in one localised school. If generating
| naughty nudes is frowned upon in one place, another will
| have no qualms about it. And that's just the things that
| are about decency; then there's the discussion about
| legality.
|
| Finally, when person A draws a picture, they are
| responsible for it - they produced it. Not the party that
| made the pencil or the paper. But when AI is used to
| generate it, is all of the responsibility still with the
| person that entered the prompt? I'm sure the T's and C's
| say so, but there may still be lawsuits.
| 4bpp wrote:
| Right, these are the same arguments against uncontrolled
| empowerment that I imagine mass literacy and the printing
| press faced. I would prefer to live in a society where
| individual freedom, at least in the cognitive domain, is
| protected by a more robust principle than "we have
| reviewed the pros and cons of giving you the freedom to
| do this, and determined the former to outweigh the latter
| _for the time being_ ".
| pixl97 wrote:
| You seem to be very confused about civil versus criminal
| penalties....
|
| Feel free to make an AI model that does almost anything,
| though I'd probably suggest that it doesn't make porn of
| minors, as that is criminal in most jurisdictions; short
| of that, it's probably not a criminal offense.
|
| Most companies are only very slightly worried about
| criminal offenses; they are far more concerned about
| civil trials, where there is a far lower requirement for
| evidence. An AI creator writing "Hmm, this could be
| dangerous" in an email - that's all you need to lose a
| civil trial.
| 4bpp wrote:
| Why do you figure I would be confused? Whether any
| liability for drawing porn of classmates is civil or
| criminal is orthogonal to the AI comparison. The question
| is if we would hold manufacturers of drawing tools or
| software, or purveyors of drawing knowledge (such as
| learn-to-draw books), liable, because they are playing
| the same role as the generative AI does here.
| pixl97 wrote:
| Because you seem to be very confused about civil liability
| for most products. Manufacturers are commonly held liable
| for users' use of their products; for example, look at any
| number of products that have caused injury.
| 4bpp wrote:
| Surely those are typically when the manufacturer was
| taken to have made an implicit promise of safety to the
| user and their surroundings, and the user got injured. If
| your fridge topples onto you and you get injured, the
| manufacturer might be liable; if you set up a trap where
| you topple your fridge onto a hapless passer-by, the
| manufacturer will probably not be liable towards them.
| Likewise with the classic McDonald's coffee spill
| liability story - I've yet to hear of a case of a coffee
| vendor being held liable over a deliberate attack where
| someone splashed someone else with hot coffee.
| Sohcahtoa82 wrote:
| > You seem to be very confused about civil versus
| criminal penalties....
|
| Nah, I think it's a disagreement over whether a tool's
| maker or the tool's user gets blamed for evil use.
|
| It's a similar argument over whether or not gun
| manufacturers should have any liability for their
| products being used for murder.
| pixl97 wrote:
| >It's a similar argument over whether or not gun
| manufacturers
|
| This is really only a debate in the US and only because
| it's directly written in the constitution. Pretty much no
| other product works that way.
| darkwater wrote:
| Are we on the same HN that bashes
| Facebook/Twitter/X/TikTok/ads because they manipulate
| people, spread fake news and destroy attention spans?
| SV_BubbleTime wrote:
| Can you point to other crimes that are based on skill or
| effort?
| sssilver wrote:
| Photoshop also lowers that barrier of entry compared to
| pen and pencil. Paper also lowers the barrier compared to
| oil canvas.
|
| Affordable drawing classes and YouTube drawing tutorials
| lower the barrier of entry as well.
|
| Why on earth would manufacturers of pencils, papers,
| drawing classes, and drawing software feel responsible
| for censoring the result of combining their tool with the
| brain of their customer?
|
| A sharp kitchen knife significantly lowers the barrier of
| entry to murdering someone. Many murders are committed
| every day using a kitchen knife. Should kitchen knife
| manufacturers blog about this every week?
| freedomben wrote:
| I agree with your point, but I would be willing to bet
| that if knives were invented today, rather than having
| been around a while, they would absolutely be regulated
| and restricted to law enforcement if not military use.
| Hell, even printers - maybe not if invented today, but
| perhaps in a couple years if we stay on the same
| trajectory - would probably require some sort of ML to
| refuse to print or "reproduce" unsafe content.
|
| I guess my point is that I don't think we're as
| inconsistent as a society as it seems when considering
| things like knives. It's not even strictly limited to
| thought crimes/information crimes. If alcohol were
| discovered today, I have no doubt that it would be
| banned and made Schedule I.
| Sohcahtoa82 wrote:
| > Hell, even printers, maybe not if invented today but
| perhaps in a couple years if we stay on the same
| trajectory, would probably require some sort of ML to
| refuse to print or "reproduce" unsafe content.
|
| Fun fact: Many scanners and photocopiers will detect that
| you're trying to scan/copy a banknote and will refuse to
| complete the scan. One of the ways is detecting the
| EURion Constellation.
|
| https://en.wikipedia.org/wiki/EURion_constellation
| hospadar wrote:
| IANAL, but that sounds like harassment. I assume the
| legality of that depends on the context (did the artist
| previously date the subject? Lots of states have laws
| against harassment and revenge porn that seem applicable
| here [1]. Are you coworkers? etc.), but I don't see why
| such laws wouldn't apply to AI-generated art as well.
| It's the distribution that's really the issue in most
| cases. If you paint secret nudes and keep them in your
| bedroom and never show them to anyone, it's creepy, but I
| imagine not illegal.
|
| I'd guess that Stability is concerned with their legal
| liability; also, perhaps they are decent humans who don't
| want to make a product that is primarily used for
| harassment (whether they are decent humans or not, I
| imagine it would affect the bottom line eventually if
| they develop a really bad rep, or if a bunch of politicians
| and rich people are targeted by deepfake harassment).
|
| [1] https://www.cagoldberglaw.com/states-with-revenge-
| porn-laws/...
|
| ^ A lot, but not all, of those laws seem pretty specific
| to photographs/videos that were shared with the
| expectation of privacy, and I'm not sure how they would
| apply to a painting/drawing. I certainly don't know how
| the courts would handle deepfakes that are
| indistinguishable from genuine photographs. I imagine
| juries might tend to side with the harassed rather than a
| bully who says "it's not illegal because it's actually a
| deepfake, but yeah, I obviously intended to harass the
| victim".
| AuryGlenz wrote:
| That's not even necessarily a bad thing (as a whole -
| individually it can be). Now, any leaked nudes can be
| claimed to be AI. That'll probably save far more grief
| than it causes.
| polski-g wrote:
| Such activity is legal per Ashcroft v. Free Speech
| Coalition (2002). Artwork cannot be criminalized because
| of its contents.
| nickthegreek wrote:
| Artwork is currently criminalized because of its
| contents. You cannot paint nude children engaged in sex
| acts.
| polski-g wrote:
| The case I literally just referenced allows you to paint
| nude children engaged in sex acts.
|
| > The Ninth Circuit reversed, reasoning that the
| government could not prohibit speech merely because of
| its tendency to persuade its viewers to engage in illegal
| activity.[6] It ruled that the CPPA was substantially
| overbroad because it prohibited material that was neither
| obscene _nor produced by exploiting real children, as
| Ferber prohibited_.[6] The court declined to reconsider
| the case en banc.[7] The government asked the Supreme
| Court to review the case, and it agreed, noting that the
| Ninth Circuit's decision conflicted with the decisions
| of four other circuit courts of appeals. Ultimately, _the
| Supreme Court agreed with the Ninth Circuit_.
| nickthegreek wrote:
| I appreciate you taking the time to lay that out; I was
| under the opposite impression about US law.
| astrange wrote:
| Stability is not an American company. The US is not the
| only country in the world.
| pixl97 wrote:
| What do you mean should be... it 100% is.
|
| In a large number of countries if you create an image
| that represents a minor in a sexual situation you will
| find yourself on the receiving side of the long arm of
| the law.
|
| If you are the maker of an AI model that allows this, you
| will find yourself on the receiving side of the long arm
| of the law.
|
| Moreover, many of these companies operate in countries
| where thought crime _is_ illegal. Now, you can argue that
| said companies should not operate in those countries, but
| companies will follow the money every time.
| hypocrticalCons wrote:
| I think it's pretty important to specify that you have to
| willingly seek and share all of these illegal items.
| That's why this is so sketch. These things are being
| baked with moral codes that'll _share_ the information,
| incriminating everyone. Like why? Why not just let it
| work and leave it up to the criminal to share their
| crimes? People are such authoritarian shit-stains, and
| acting like their existence is enough to justify their
| stance is disgusting.
| pixl97 wrote:
| >I think it's pretty important to specify that you have
| to willingly seek and share all of these illegal items.
|
| This is not obvious at all when it comes to AI models.
|
| >People are such authoritarian shit-stains
|
| Yes, but this is a different conversation altogether.
| mempko wrote:
| Once it is outside your mind and in a physical form, is it
| still just a thought, sir?
| hypocrticalCons wrote:
| In my country there is legal precedent establishing that
| private, unshared documents are tantamount to thought.
| Sharlin wrote:
| Eh, a professional human could easily pick the locks on
| the majority of front doors out there. Nevertheless, I
| don't think we're going to give up on locking our doors
| any time soon.
| martiuk wrote:
| Similar to why Google's latest image generator refuses to
| produce a correct image of a 'Realistic, historically
| accurate, Medieval English King'. They have guardrails and
| system prompts set up to align the output of the generator
| with the company's values, or else someone would produce
| Nazi propaganda or worse. It would (for some reason) be
| attributed to Google and their AI, rather than to the user
| who found the magic prompt words.
| s1k3s wrote:
| Yeah this is probably the most realistic reason
| fragmede wrote:
| For some scenarios, it's not the image itself but the
| associations that the model might possibly make from being
| fed a diet of 4chan and Stormfront's unofficial YouTube
| channel. The worry is over horrible racist shit, like if you
| ask it for a picture of a black person, and it outputs a
| picture of a gorilla. Or if you ask it for a picture of a bad
| driver, and it only manages to output pictures of Asian
| women. I'm sure you can think up other horrible stereotypes
| that would result in a PR disaster.
| PeterisP wrote:
| The major risky use cases for image generators are (a) sexual
| imagery of kids and (b) public personalities in various
| contexts usable for propaganda.
| Spivak wrote:
| It's also "safety" in the sense that you can deploy it as part
| of your own application without human review and not have to
| worry that it's gonna generate anything that will get you in
| hot water.
| matthewmacleod wrote:
| I really wish that every discussion about a new model didn't
| rapidly become a boring and shallow discussion about AI safety.
| jprete wrote:
| AI is not an engineered system; it's emergent behavior from a
| system we can vaguely direct but do not fundamentally
| understand. So it's natural that the boundaries of system
| behavior would be a topic of conversation pretty much all the
| time.
|
| EDIT: Boring and shallow are, unfortunately, the Internet's
| fault. Don't know what to do about those.
| PeterisP wrote:
| At least in some of the latest controversies (e.g. Gemini's
| generation of people), all of the criticized behavior was
| _not_ emergent from ML training, but explicitly and
| intentionally engineered by hand.
| cypress66 wrote:
| This announcement only mentions safety. What else do you
| expect people to talk about?
| hedora wrote:
| PSA: There are now calls to embed phone-home / remote kill
| switch mechanisms into hardware because "AI safety".
| newzisforsukas wrote:
| Examples? It seems like it would be easier to communicate
| with ISPs instead.
| astrange wrote:
| Hmm, all computers have remote kill switches in them unless
| you have a generator at home.
| root_axis wrote:
| This is the world we live in. CYA is necessary. Politicians,
| media organizations, activists and the parochial masses will
| not brook a laissez faire attitude towards the generation of
| graphic violence and illegal porn.
| Sharlin wrote:
| Not even legal porn, unfortunately. Or even the display of a
| single female nipple...
| realusername wrote:
| Looking at the manual censorship of the big channels on
| YouTube, you don't even need to display anything; just
| suggesting it is enough to get a strike.
|
| (Of course, unless you are into yoga - then everything is
| permitted.)
| Sohcahtoa82 wrote:
| > (Of course, unless you are into yoga - then everything
| is permitted.)
|
| ...or children's gymnastics.
| hypocrticalCons wrote:
| > This is the world we live in.
|
| Great talk about slavery and religious persecution, Jim!
| Wait, what were we talking about? Fucking American
| fascists trying to control our thoughts and actions,
| right, right.
| hypocrticalCons wrote:
| BTW Nvidia and AMD are baking safety mechanisms into the
| fucking video drivers
|
| No where is safe
| jprete wrote:
| Do you have a reference on this?
| BryanLegend wrote:
| From George Hotz on Twitter (https://twitter.com/realGeorgeHotz
| /status/176060391883954211...)
|
| "It's not the models they want to align, it's you."
| jtr1 wrote:
| What specific cases are being prevented by safety controls
| that you think should be allowed?
| Tomte wrote:
| Not specifically SD, but DALL-E: I wanted to get an image
| of a pure white British Shorthair cat on the arm of a
| brunette middle-aged woman by the balcony door, both
| looking outside.
|
| It wasn't important, just something I saw in the moment
| and wanted to see what DALL-E would make of it.
|
| Generation denied. No explanation given; I can only
| imagine that it triggered some detector of sexual
| requests?
|
| (It wasn't the phrase "pure white", as far as I can tell,
| because I have lots of generated pics of my cat in other
| contexts)
| bonton89 wrote:
| Well, for starters, ChatGPT shouldn't balk at creating
| something "in Tim Burton's style" just because Tim Burton
| complained about AI. I guess it's fair use unless a select
| rich person who owns the data complains. Seems like it
| isn't fair use at all then, just theft from those who
| cannot legally defend themselves.
| archontes wrote:
| Fair use is an exception to copyright. The issue here is
| that it's _not_ fair use, because copyright simply _does
| not apply_. Copyright explicitly does not, has never, and
| will never protect style.
| SamBam wrote:
| Didn't Tom Waits successfully sue Frito-Lay when the
| company found an artist who could closely replicate his
| style and signature voice, and who sang a song for a
| commercial that sounded very Tom Waits-y?
| dangrossman wrote:
| Yes, though explicitly not for copyright infringement.
| Quoting the court's opinion, "A voice is not
| copyrightable. The sounds are not 'fixed'." The case was
| won under the theory of "voice misappropriation", which
| California case law (Midler v Ford Motor Co) establishes
| as a violation of the common law right of publicity.
| aimor wrote:
| Yes but that was not a copyright or trademark violation.
| This article explained it to me:
|
| https://grr.com/publications/hey-thats-my-voice-can-i-
| sue-th...
| bonton89 wrote:
| That makes it even more ridiculous, as that means they
| are giving rights to rich complaining people that no one
| actually has.
|
| Examples:
|
| "Can you create an image of a cat in Tim Burton's style?"
|
| "Oops! Try another prompt. Looks like there are some
| words that may be automatically blocked at this time.
| Sometimes even safe content can be blocked by mistake.
| Check our content policy to see how you can improve your
| prompt."
|
| "Can you create an image of a cat in Wes Anderson's style?"
|
| "Certainly! Wes Anderson's distinctive style is
| characterized by meticulous attention to detail,
| symmetrical compositions, pastel color palettes, and
| whimsical storytelling. Let's imagine a feline friend in
| the world of Wes Anderson..."
| astrange wrote:
| ...in the US. Other countries don't have fair use.
| rmi_ wrote:
| Tell me what they mean by "safety controls" first. It's
| very vaguely worded.
|
| DALL-E, for example, wrongly denied several requests of
| mine.
| Aeolun wrote:
| I don't feel like it truly matters since they'll release
| it and people will happily fine-tune/train all that
| safety right back out.
|
| It sounds like a reputation/ethics thing to me. You
| probably don't want to be known as the company that
| freely released a model that gleefully provides images of
| dismembered bodies (or worse).
| bergen wrote:
| You are using someone else's proprietary technology; you
| have to deal with their limitations. If you don't like it,
| there are endless alternatives.
|
| "Wrongly denied" in this case depends on your point of
| view. Clearly DALL-E didn't want this combination of words
| created, but you have no right to the fulfillment of these
| prompts.
|
| I'm the last one to defend large monolithic corps, but if
| you go to one and want to be free to do whatever you want,
| you are already starting from a very warped expectation.
| AuryGlenz wrote:
| As far as Stable Diffusion goes - when they released SD
| 2.1/XL/Stable Cascade, you couldn't even make a (woman's)
| nipple.
|
| I don't use them for porn like a lot of people seem to,
| but it seems weird to me that something that's kind of
| made to generate art can't generate one of the most common
| subjects in all of art history - nude humans.
| b33j0r wrote:
| For some reason its training thinks they are decorative;
| I guess it's a pretty funny elucidation of how it works.
|
| I have seen a lot of "pasties" that look like Sorry! game
| pieces, coat buttons, and especially hell-forged
| cybernetic plumbuses. Did they train it at an alien strip
| club?
|
| The LoRAs and VAEs work (see civit.ai), but do you really
| want something named NSFWonly in your pipeline just for
| nipples? Haha
| Aeolun wrote:
| I'm not sure if they updated them to rectify those "bugs"
| but you certainly can now.
| araes wrote:
| I seem to have the opposite problem a lot of the time. I
| tried using Meta's image gen tool and had such a time
| trying to get it to make art that was not "kind of"
| sexual. It felt like Facebook's entire learning chain
| must have been built on people's sexy images of their
| girlfriends, all of which is now hidden in the art.
|
| These examples are not super blatant - like a tree
| landscape that just happens to have a human figure and a
| cave in its crotch:
|
| https://i.imgur.com/RlH4NNy.jpg - Art is very focused on
| the monster's crotch
|
| https://i.imgur.com/0M8RZYN.jpg - The comparison should
| hopefully be obvious
| Fischgericht wrote:
| Not meant in a rude way, but please consider that your
| brain is making these up and you might need to see a
| therapist. I can see absolutely nothing "kind of sexual"
| in those two pictures.
| astrange wrote:
| I have in fact gotten a nude out of Stable Cascade. And
| that's just with text prompting; the proper way to use
| these is with multimodal prompting. I'm sure it could do
| it with an example image.
| slily wrote:
| Parody and pastiche
| miohtama wrote:
| Generating images of nazis
|
| https://www.theverge.com/2024/2/21/24079371/google-ai-
| gemini...
| stale2002 wrote:
| Oh, the big one would be model weights being released for
| anyone to use or fine-tune themselves.
|
| Sure, the safety people lost that battle for Stable
| Diffusion and LLaMA. And because they lost, entire
| industries were created by startups that could now use the
| models themselves, without them being locked behind
| someone else's AI.
|
| But it wasn't guaranteed to go that way. Maybe the
| safetyists could have won.
|
| I don't think we'd be having our current AI revolution if
| Facebook or SD hadn't been the first to release models for
| anyone to use.
| thefourthchime wrote:
| No, it's the cacophony of zealous point-scoring on X they
| want to avoid.
| dang wrote:
| We detached this subthread from
| https://news.ycombinator.com/item?id=39466910.
| wongarsu wrote:
| What's equally interesting is that while they spend a lot of
| words on safety, they don't actually say anything. The only
| hint what they even mean by safety is that they took
| "reasonable steps" to "prevent misuse by bad actors". But it's
| hard to be more vague than that. I still have no idea what they
| did and why they did it, or what the threat model is.
|
| Maybe that will be part of future papers or the teased
| technical report. But I find it strange to put so much emphasis
| on safety and then leave it all up to the reader's imagination.
| fortran77 wrote:
| Remember when AI safety meant the computers weren't going to
| kill us?
| SV_BubbleTime wrote:
| Now people spend a lot of time making them worse to ensure
| we don't see boobs.
| dmezzetti wrote:
| Any large publicly available model has no choice but to do
| this; otherwise, its makers risk a PR nightmare.
|
| A model's user-base size and its usability will be
| inversely related. That's why it's important to have the
| option to train your own with open source.
| beefield wrote:
| I get a slightly uncomfortable feeling with this talk about
| AI safety. Not in the sense that there is anything wrong
| with it (maybe there is, maybe there isn't), but in the
| sense that I don't understand what people are talking about
| when they talk about safety in this context. Could someone
| explain like I have Asperger (ELIA?) what this is about?
| What are the "bad actors" possibly going to do? Generate
| (child) porn/images with violence etc. and sell them?
| Pollute the training data so that racist images pop up when
| someone wants an image of a white pussycat? Or produce
| images that contain vulnerabilities, so that when you open
| one in your browser you get compromised? Or what?
| Tadpole9181 wrote:
| > Could someone explain like I have Asperger (ELIA?)
|
| _Excuse me?_
| beefield wrote:
| You sound offended. My apologies; I had no intention
| whatsoever to offend anyone. Even if I am not diagnosed, I
| think I am at least borderline somewhere on the spectrum,
| and thought that would be a good way to ask people to
| explain without assuming I can read between the lines.
| Tadpole9181 wrote:
| Let's just stick with the widely understood "Explain Like
| I'm 5" (ELI5). Nobody knows you personally, so this comes
| off quite poorly.
| beefield wrote:
| I think ELI5 means that you simplify a complex issue so
| that even a small kid understands it. In this case there
| is no need to simplify anything, just to explain what a
| term actually means without assuming the reader
| understands the nuances of the terms used. And I still do
| not quite get how ELIA can be considered hostile, but
| given the feedback, maybe I'll avoid it in the future.
| Tadpole9181 wrote:
| Saying "explain like I have <specific disability>" is
| blatantly inappropriate. As a gauge: Would you say this
| to your coworkers? Giving a presentation? Would you say
| this in front of (a caretaker for) someone with Autism?
| Especially since Asperger's hasn't even been used in
| practice for, what, over a decade?
|
| > In this case there is no need to simplify anything
|
| Then just ask the question itself.
| charcircuit wrote:
| AI isn't a coworker and isn't human, so it's not as
| awkward to talk about one's disability.
| Tadpole9181 wrote:
| I don't see how this is a response to anything I've said.
| They're speaking to other humans and the original use of
| their modified idiom isn't framed as if one were talking
| about their own, personal disability.
| vprcic wrote:
| Just as an example:
|
| https://arstechnica.com/information-
| technology/2024/02/deepf...
| Q6T46nT668w6i3m wrote:
| The bad actor might be the model itself, e.g., returning
| unwanted pornography or violence. Do you have a problem with
| Google's SafeSearch?
| reaperman wrote:
| I'm not part of Stability AI but I can take a stab at this:
|
| > explain like I have ~~Asperger (ELIA?)~~ limited
| understanding of how the world _really_ works.
|
| The AI is being limited so that it cannot produce any
| "offensive" content which could end up on the news or go
| viral and bring negative publicity to Stability AI.
|
| Viral posts containing generated content that brings negative
| publicity to Stability AI are fine as long as they're not
| "offensive". For example, wrong number of fingers is fine.
|
| There is not a comprehensive, definitive list of things that
| are "offensive". Many of them we are aware of - e.g. nudity,
| child porn, depictions of Muhammad. But for many things it
| cannot be known a priori whether the current zeitgeist will
| find it offensive or not (e.g. certain depictions of current
| political figures, like Trump).
|
| Perhaps they will use AI to help decide what might be
| offensive if it does not explicitly appear on the blocklist.
| They will definitely keep updating the "AI Safety" to cover
| additional offensive edge cases.
|
| It's important to note that "AI Safety", as defined above
| (cannot produce any "offensive" content which could end up on
| the news or go viral and bring negative publicity to
| Stability AI) is not just about facially offensive content,
| but also about offensive uses for milquetoast content.
| Stability AI won't want news articles detailing how they're
| used by fraudsters, for example. So there will be some guards
| on generating things that look like scans of official
| documents, etc.
| beefield wrote:
| So it's just fancy words for safety (legal/reputational)
| for Stability AI, not users?
| reaperman wrote:
| Yes*. At least for the purposes of understanding what the
| implementations of "AI safety" are most likely to entail.
| I think that's a very good cognitive model which will
| lead to high fidelity predictions.
|
| *But to be slightly more charitable, I genuinely think
| Stability AI / OpenAI / Meta / Google / MidJourney
| believe that there is significant overlap in the set of
| protections which are safe for the company, safe for
| users, and safe for society in a broad sense. But I don't
| think any released/deployed AI product focuses on the
| latter two, just the first one.
|
| Examples include:
|
| Society + Company: Depictions of Muhammad could result in
| small but historically significant moments of civil
| strife/discord.
|
| Individual + Company: Accidentally generating NSFW
| content at work could be harmful to a user. Sometimes
| your prompt won't seem like it would generate NSFW
| content, but could be adjacent enough: e.g. "I need some
| art in the style of a 2000's R&B album cover" (See: Sade
| - Love Deluxe, Monica - Makings of Me, Rihanna -
| Unapologetic, Janet Jackson - Damita Jo)
|
| Society + Company: Preventing the product from being used
| for fraud. e.g. CAPTCHA solving, fraudulent
| documentation, etc.
|
| Individual + Company: Preventing generation of child
| porn. In the USA, this would likely be illegal both for
| the user and for the company.
| astrange wrote:
| Their enterprise customers care even more than Stability
| does.
| mempko wrote:
| I think this AI safety thing is great. These models will be
| used by people to make boring art. The exciting art will be
| left for people to make.
|
| This idea of AI doing the boring stuff is good. Nothing
| prevents you from making exciting, dangerous, or 'unsafe' art
| on your own.
|
| My feeling is that most people who are upset about AI safety
| really just mean they want it to generate porn. And because it
| doesn't, they are upset. But they hide it under the umbrella of
| user freedom. You want to create porn in your bedroom? Then go
| ahead and make some yourself. Nothing stopping you, the person,
| from doing that.
| TulliusCicero wrote:
| I agree with you, but when companies don't implement these
| things, they get absolutely trashed in the press & social
| media, which I'm sure affects their business.
|
| What would you have them do? Commit corporate suicide?
| TylerLives wrote:
| This is a good question. I think it would be best for them to
| give some sort of signal, which would mean "We're doing this
| because we have to. We are willing to change if you offer us
| an alternative." If enough companies/people did this, at some
| point change would become possible.
| acomjean wrote:
| I think this isn't software so much as a service. Viewed
| through that lens, the guard rails make more sense.
| FloatArtifact wrote:
| I'm curious to know whether their safeguards are
| eliminated when users fine-tune the model.
| pmx wrote:
| There are some VERY NSFW model fine-tunes available for
| other versions of SD.
| witcH wrote:
| such as?
| mdrzn wrote:
| Check out civitai.com for fine-tuned models for a wide
| range of uses.
| AuryGlenz wrote:
| I believe you need to be signed in to see the NSFW stuff,
| for what it's worth.
| 123yawaworht456 wrote:
| >This preview phase, as with previous models, is crucial for
| gathering insights to improve its performance and safety ahead of
| an open release.
|
| oh, for fuck's sake.
| memossy wrote:
| We did this for every Stable Diffusion release; you get
| the feedback data to improve it continuously ahead of the
| open release.
| 123yawaworht456 wrote:
| I was referring to 'safety'. How the hell can an image
| generation model be dangerous? We've had software for
| editing text, images, videos and audio for half a century
| now.
| Jensson wrote:
| Advertisers will cancel you if you do anything they don't
| like, 'safety' is to prevent that.
| glimshe wrote:
| This reinforces my impression that Google is at least one
| year behind. Stunning images, 3D, and video, while Gemini
| had to be partially halted this morning.
| bamboozled wrote:
| For "political" reasons, not for technical reasons. Don't get
| it twisted.
| coeneedell wrote:
| I would describe those issues as technical. It's genuinely
| getting things wrong because the "safety" element was
| implemented poorly.
| anononaut wrote:
| Those are safety elements which exist for political
| reasons, not technical ones.
| ethbr1 wrote:
| Of all the criticism that could be leveled at Google,
| 'shipping a product and supporting it' being the thing
| that matters seems fair.
|
| And that takes _all_ the behind-the-scenes steps, not just
| the technical ones.
| verticalscaler wrote:
| You think that technology is first. You think that
| mathematicians and computer engineers or mechanical engineers
| or doctors are first. They're very important, but they're not
| first. They're second. Now I'll prove it to you.
|
| There was a country that had the best mathematicians, the
| best physicists, the best metallurgists in the world. But
| that country was very poor. It's called the Soviet Union. But
| when you took one of these mathematicians or physicists, who
| was smuggled out or escaped, put him on a plane and brought
| him to Palo Alto. Within two weeks, they were producing added
| value that could produce great wealth.
|
| What comes first is markets. If you have great technology
| without markets, without a market-friendly economy, you'll
| get nowhere. But if you have a market-friendly economy,
| sooner or later the market forces will give you the
| technology you want.
|
| And that my friend, simply won't come from an office
| paralyzed by internal politics of fear and conformity. Don't
| get it twisted.
| TulliusCicero wrote:
| I mean, it's kind of both? Making Nazis look diverse isn't
| just a political error, it's also a technical one. By
| default, showing Nazis should show them as they actually
| were.
| astrange wrote:
| There's a product for that called Google Image Search.
| chickenpotpie wrote:
| I don't think that's a fair comparison because they're
| fulfilling substantially different niches. Gemini is a
| conversational model that can generate images, but is mainly
| designed for text. Stable Diffusion is only for images. If you
| compare a model that can do many things and a model that can
| only do images by how well they generate images, of course the
| image generation model looks better.
|
| Stability does have an LLM, but it's not provided in a unified
| framework like Gemini is.
| bluescrn wrote:
| The public only see the 'safe' lobotomized versions of each,
| though.
|
| I wonder how far ahead the internal versions are?
| treesciencebot wrote:
| Quite nice to see diffusion transformers [0] becoming the
| next dominant architecture in generative media.
|
| [0]: https://twitter.com/EMostaque/status/1760660709308846135
| poulpy123 wrote:
| Didn't they release another model a few days ago?
| amelius wrote:
| Does anyone know of a good tutorial on how diffusion models work?
| Ologn wrote:
| I liked this 18 minute video (
| https://www.youtube.com/watch?v=1CIpzeNxIhU ). Computerphile
| has other good videos with people like Brian Kernighan.
| spaceheater wrote:
| fast.ai has a whole free course
|
| https://www.youtube.com/watch?v=_7rMfsA24Ls
| https://course.fast.ai/Lessons/part2.html
| jasonjmcghee wrote:
| https://jalammar.github.io/illustrated-stable-diffusion/
|
| His whole blog is fantastic. If you want more background (e.g.
| how transformers work) he's got all the posts you need
| amelius wrote:
| This looks nice, thank you, but I'm looking for a more
| hands-on tutorial, with e.g. Python code, like the ones
| Andrej Karpathy makes.
| astrange wrote:
| SD3 is a new architecture using DiT (diffusion
| transformers), so those would be out of date.
|
| The older ones have drawbacks like not being able to spell.
| ttul wrote:
| Not too out of date. Just replace the magic UNet with a
| DiT and squint. It's doing the same thing - removing
| noise.
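|
| To see how little magic there is, the core training loop
| fits in a few lines of PyTorch. This is a toy 2D example
| of my own, not SD's actual code; the tiny MLP stands in
| for the UNet/DiT:
|
|   import torch
|
|   T = 1000
|   betas = torch.linspace(1e-4, 0.02, T)
|   abar = torch.cumprod(1 - betas, dim=0)  # cumulative alphas
|
|   # Stand-in for the denoiser (a UNet or DiT in real models)
|   model = torch.nn.Sequential(
|       torch.nn.Linear(3, 64), torch.nn.ReLU(),
|       torch.nn.Linear(64, 2))
|   opt = torch.optim.Adam(model.parameters(), lr=1e-3)
|
|   for step in range(1000):
|       x0 = torch.randn(128, 2) * 0.1 + 1.0  # toy "dataset"
|       t = torch.randint(0, T, (128,))
|       noise = torch.randn_like(x0)
|       a = abar[t].unsqueeze(1)
|       xt = a.sqrt() * x0 + (1 - a).sqrt() * noise  # add noise
|       pred = model(torch.cat([xt, t.unsqueeze(1) / T], dim=1))
|       loss = torch.nn.functional.mse_loss(pred, noise)
|       opt.zero_grad(); loss.backward(); opt.step()
|
| Sampling then runs the learned noise-predictor in reverse,
| step by step, starting from pure noise.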
| keiferski wrote:
| The obsession with safety in this announcement feels like a
| missed marketing opportunity, considering the recent Gemini
| debacle. Isn't SD's primary use case the fact that you can
| install it on your own computer and make what you want to make?
| jsheard wrote:
| At some point they have to actually make money, and I don't see
| how continuously releasing the fruits of their expensive
| training for people to run locally on their own computer (or a
| competing cloud service) for free is going to get them there.
| They're not running a charity, the walls will have to go up
| eventually.
|
| Likewise with Mistral, you don't get half a billion in funding
| and a two billion valuation on the assumption that you'll keep
| giving the product away for free forever.
| keiferski wrote:
| But there are plenty of other business models available for
| open source projects.
|
| I use Midjourney a lot and (based on the images in the
| article) it's leaps and bounds beyond SD. Not sure why I
| would switch if they are both locked down.
| AuryGlenz wrote:
| SD would probably be a lot better if they didn't have to
| make sure it worked on consumer GPUs. Maybe this
| announcement is a step towards that, where most people
| will only be able to access the best model through a paid
| service.
| bee_rider wrote:
| Is it possible to fine-tune Midjourney or produce a LoRA?
| keiferski wrote:
| Sorry, I don't know what that means, but a quick Google
| search shows some results about it.
| elbear wrote:
| Fine-tuning means doing extra training on the model with
| your own dataset, for example to teach it to produce
| images in a certain style.
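|
| A LoRA is the lightweight version of that: you freeze the
| model and train a small low-rank add-on instead of
| updating all the weights. Roughly, as a minimal sketch
| (not any particular library's API):
|
|   import torch
|
|   class LoRALinear(torch.nn.Module):
|       # Frozen base layer plus a trainable low-rank update
|       def __init__(self, base, rank=4, alpha=1.0):
|           super().__init__()
|           self.base = base
|           for p in self.base.parameters():
|               p.requires_grad_(False)  # base stays frozen
|           self.A = torch.nn.Parameter(
|               torch.randn(rank, base.in_features) * 0.01)
|           self.B = torch.nn.Parameter(
|               torch.zeros(base.out_features, rank))
|           self.scale = alpha / rank
|
|       def forward(self, x):
|           # W x + scale * B A x; only A and B are trained
|           return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)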
| nickthegreek wrote:
| No. You can provide photos to merge, though.
| archerx wrote:
| Ironically, the oversensitive NSFW image detector in their
| API caused me to stop using it and run the model locally
| instead. I was using it to render animations of hundreds
| of frames, but when every 20th to 30th image comes out
| blurry it ruins the whole animation, and it would double
| the cost or more to re-render it with a different seed,
| hoping not to trigger the overzealous blurring.
|
| I don't mind that they don't want to let you generate NSFW
| images, but their detector is hopelessly broken; it once
| censored a cube. Yes, a cube...
| Sharlin wrote:
| Unfortunately their financial and reputational incentives
| are firmly aligned with preventing false negatives at the
| cost of a lot of false positives.
| archerx wrote:
| Unfortunately, I don't want to pay for hundreds if not
| thousands of images I have to throw away because it
| decided some random innocent element is offensive and
| blurred the entire image.
|
| Here is the red cube it censored, because my innocent eyes
| wouldn't be able to handle it:
| https://archerx.com/censoredcube.png
|
| What they are achieving with the overzealous safety
| measures is driving developers to on-demand GPU hosts that
| will let them host their own models, which also opens up
| a lot more freedom. I wanted to use the Stability AI API
| as my main source for Stable Diffusion, but they make it
| really, really hard, especially if you want to use it as
| part of your business.
| TehCorwiz wrote:
| Everyone always talks about Platonic Solids but never
| Romantic Solids. /s
| causal wrote:
| Open source models can be fine-tuned by the community if
| needed.
|
| I would much rather have this than a company releasing models
| this size into the wild without any safety checks whatsoever.
| srid wrote:
| Could you list the concrete "safety checks" that you think
| prevent real-world harm? What particular image do you
| think a random human will ask the AI to generate that then
| leads to concrete harm in the real world?
| politician wrote:
| Not even the large companies will explain with precision
| their implementation of safety.
|
| Until then, we must view this "safety" as both a scapegoat
| and a vector for social engineering.
| astrange wrote:
| Companies are not going to explain their legal risks in
| their marketing material.
| causal wrote:
| If 1 in 1,000 generations will randomly produce memorized
| CSAM that slipped into the training set then yeah, it's
| pretty damn unsafe to use. Producing memorized images has
| precedent[0].
|
| Is it unlikely? Sure, but worth validating.
|
| [0] https://arxiv.org/abs/2301.13188
| srid wrote:
| Okay, by "safety checks" you meant the already unlawful
| things like CSAM, but not politically-overloaded beliefs
| like "diversity"? The latter is what the comment[1] you
| were replying to was referring to (viz. "considering the
| recent Gemini debacle"[2]).
|
| [1] https://news.ycombinator.com/item?id=39466991
|
| [2] https://news.ycombinator.com/item?id=39456577
| causal wrote:
| Right, by "rather have this [nothing]" I meant Stable
| Diffusion doing some basic safety checking, not Google's
| obviously flawed ideas of safety. I should have made that
| clear.
|
| I posed the worst-case scenario of generating actual CSAM
| in response to your question, "What particular image that
| you think a random human will ask the AI to generate,
| which then leads to concrete harm in the real world?"
| thomquaid wrote:
| Could you elaborate on the concrete real world harm?
| yreg wrote:
| Why not run the safety check on the training data?
| causal wrote:
| They try to, but it is difficult to comb through billions
| of images, and at least some of SD's earlier datasets
| were later found to have been contaminated with CSAM[0].
|
| https://www.404media.co/laion-datasets-removed-stanford-
| csam...
| dns_snek wrote:
| Do you have an example? I've never heard of anyone
| accidentally generating CSAM with any model. "1 in 1,000"
| is just an obviously bogus probability; there must have
| been billions of images generated using hundreds of
| different models.
|
| Besides, and this is a serious question, what's the harm
| of a model _accidentally_ generating CSAM? If you weren't
| intending to generate these images, then you would just
| discard the output; no harm done.
|
| Nobody is forcing you to use a model that might
| accidentally offend you with its output. You can try
| "aligning" it, but you'll just end up with Google Gemini
| style "Sorry I can't generate pictures of white people".
| causal wrote:
| Earlier datasets used by SD were likely contaminated with
| CSAM[0]. It was unlikely to have been significant enough
| to result in memorized images, but checking the safety of
| models increases that confidence.
|
| And yeah I think we should care, for a lot of reasons,
| but a big one is just trying to stay well within the law.
|
| [0] https://www.404media.co/laion-datasets-removed-
| stanford-csam...
| astrange wrote:
| SD always removed enough NSFW material that this probably
| never made it in there.
| 7moritz7 wrote:
| Then you apparently know almost nothing about the SD 1.5
| ecosystem. I've fine-tuned multiple models myself, and
| it's nearly impossible to get rid of the child-bias in
| anime-derived models (which applies to 90% of
| character-focused models), including NSFW ones. It took me
| like 30 attempts to get somewhere reasonable, and it's
| still noticeable.
| dns_snek wrote:
| If we're being honest, anime and anything "anime-derived"
| is uncomfortably close to CSAM as a source material,
| before you even get SD involved, so I'm not surprised.
|
| What I had in mind were regular general purpose models
| which I've played around with quite extensively.
| astrange wrote:
| The harm is that any use of the model becomes illegal in
| most countries (or offends credit card processors) if it
| easily generates porn. Especially if it does it when you
| didn't ask for it.
| dyslexit wrote:
| This question narrows the scope of "safety" to something
| less than what the people at SD, or even probably the OP,
| care about. _Non-random_ CSAM requests targeting
| potentially real people are the obvious answer here, but
| even non-CSAM sexual content is probably a threat too. I
| can understand frustration with it currently going
| overboard on blurring, but removing safety checks
| altogether would result in SD mainly being associated with
| porn pretty quickly, which I'm sure Stability AI wants to
| avoid for the safety of their company.
|
| Add to that, parents who want to keep their kids from
| generating sexual content would now need to prevent their
| kids from using this tool because it can create it
| randomly, limiting SD usage to users 18+ (which is
| probably something else Stability AI does not want to deal
| with.)
|
| It's definitely a balance between going overboard and
| having restrictions though. I haven't used SD in several
| months now so I'm not sure where that balance is right now.
| bluescrn wrote:
| Before long we're going to need a new word for physical
| 'safety' - when dealing with heavy machinery, chemicals, high
| voltages, etc.
| jiggawatts wrote:
| Just replace "safety" with "puritan" in all of these
| announcements and they'll make more sense.
| AnthonyMouse wrote:
| > the recent Gemini debacle.
|
| I've noticed that SDXL does something a little odd. For a given
| prompt it essentially decides what race the subject should be
| without the prompt having specified one. You generate 20 images
| with 20 different seeds but the same prompt and they're
| typically all the same race. In some cases they even appear to
| be the same "person" even though I doubt it's a real person (at
| least not anyone I could recognize as a known public figure any
| of the times it did this). I'm kind of curious what they
| changed from SD 1.5, which _didn 't_ do this.
| lreeves wrote:
| People in this discussion seem to be hand-wringing about
| Stability's "safety" comments, but every model they've
| released has been fine-tuned for porn within like 24
| hours.
| mopierotti wrote:
| That's not entirely true. This wasn't the case for SD 2.0/2.1,
| and I don't think SD 3.0 will be available publicly for fine
| tuning.
| lreeves wrote:
| SD 2 definitely seems like an anomaly that they've learned
| from, though; it was hard for everyone to use for various
| reasons. SDXL and even Cascade (the new side-project
| model) seem to be embraced by horny people.
| viraptor wrote:
| 2 is not popular because people get better quality results
| with 1.5 and XL. That's it. If 3 is released and works
| better, it will be fine-tuned too.
| londons_explore wrote:
| All the demo images are 'artwork'.
|
| Will the model also be able to produce good photographs,
| technical drawings, and other graphical media?
| spywaregorilla wrote:
| Photorealism is well within current capabilities;
| technical drawings, absolutely not. Not sure what "other
| graphical media" includes.
| sweezyjeezy wrote:
| Yeah, but try getting e.g. DALL-E 3 to do photorealism; I
| think they've RLHF'd the crap out of it in the name of
| safety.
| spywaregorilla wrote:
| Well, that's what you get with closed AI.
| astrange wrote:
| That's not safety; the safety RLHF is there because it
| tries to generate porn and people with three legs if you
| don't stop it.
|
| It has the weird art style because that's what looks the
| most "aesthetic". And because it doesn't actually have
| nearly as good enough data as you'd think it does.
|
| Sora looks like it could be better.
| Jensson wrote:
| > Not sure what other graphical media includes.
|
| I'd want a model that can draw website designs and other UIs
| well. So I give it a list of things in the UI, and I get back
| a bunch of UI design examples with those elements.
| spywaregorilla wrote:
| I'm gonna hazard a guess and say it's well within the
| capabilities of a fine-tuned model, but that no such
| fine-tuned model exists and the labeled data required to
| create it is not really there.
| dmalik wrote:
| You'd have better luck with an LLM generating
| HTML/JavaScript/CSS.
| astrange wrote:
| https://www.usegalileo.ai/explore
|
| https://v0.dev
| senseiV wrote:
| There's a startup doing that named galileo_ai
| Sharlin wrote:
| Photographs, digital illustrations, comic or cartoon
| style images - whatever graphical style you can imagine
| is easy to achieve with current models (though no single
| model is a master of all trades). Things that look like
| technical drawings are as well, but don't expect them to
| make any sense engineering-wise unless maybe you train a
| fine-tune specifically for that purpose.
| Fervicus wrote:
| And will the model also pretend that a particular race
| doesn't exist?
| londons_explore wrote:
| I really wonder what harm would come to the company if
| they didn't talk about safety.
|
| Would investors stop giving them money? Would users sue,
| claiming they now had PTSD after looking at all the
| 'unsafe' outputs? Would regulators step in and make laws
| banning this 'unsafe' AI?
|
| What is it specifically that company management is worried about?
| brainwipe wrote:
| All of the above! Additionally... I think AI companies are
| trying to steer the conversation about safety so that when
| regulations do come in (and they will), the legal
| culpability lies with the user of the model, not the
| trainer of it. The business model doesn't work if you're
| liable for harm caused by your training process -
| especially if the harm is already covered by existing
| laws.
|
| One example of that would be if your model was being used to
| spot criminals in video footage and it turns out that the bias
| of the model picks one socioeconomic group over another. Most
| western nations have laws protecting the public against that
| kind of abuse (albeit they're not applied fairly) and the fines
| are pretty steep.
| graphe wrote:
| "AI" has already been used, with success, to give people
| loans, and it was biased. Nothing happened legally to that
| company.
| dorkwood wrote:
| They're attempting to guard themselves against incoming
| regulation. The big players, such as Microsoft, want to squash
| Stable Diffusion while protecting themselves, and they're going
| to do it by wielding the "safety is important and only we have
| the resources to implement it" hammer.
| HeatrayEnjoyer wrote:
| Safety is a _very_ real concern, always has been in ML
| research. I'm tired of this trite "they want a moat"
| narrative.
|
| I'm glad tech orgs are for once thinking about what they're
| building before putting out society-warping democracy-
| corroding technology instead of move fast break things.
| dorkwood wrote:
| It doesn't strike you as hypocritical that they all talk
| about safety while continuing to push out tech that's
| upending multiple industries as we speak? It's tough for me
| to see it as anything other than lip service.
|
| I'd be on your side if any of them actually chose to keep
| their technology in the lab instead of tossing it out into
| the world and gobbling up investment dollars as fast as
| they could.
| tavavex wrote:
| How are these two things related at all? When AI
| companies speak of safety, it's almost always about the
| "only including data a religious pastor would find safe,
| and filtering outputs" angle. How's the market and other
| industries relevant at all? Should AI companies be
| obligated to care about what happens to other companies?
| With that point of view, we should've criticized the
| iPhone for upending the PDA market, or Wacom for
| "upending" the traditional art market.
| rwmj wrote:
| That would make sense if it was in the slightest about
| avoiding "society-warping democracy-corroding technology".
| Rather than making sure no one ever sees a naked person
| which would cause governments to come down on them like a
| ton of bricks.
| ryandrake wrote:
| This would be funny if we weren't living it.
|
| Software that promotes the unchecked spread of
| propaganda, conspiracy theories, hostility, division,
| institutional mistrust and so on: A-OK.
|
| Software that might show a boob: Totally irresponsible
| and deserving of harsh regulation.
| atahanacar wrote:
| Safety from what? Human anatomy?
| bergen wrote:
| See the recent Taylor Swift scandal. Safety from never-ending
| amounts of deepfake porn and gore, for example.
| atahanacar wrote:
| This isn't a valid concern in my opinion. Photo
| manipulation has been around for decades. People have
| been drawing other people for centuries.
|
| Also, where do we draw the line? Should Photoshop stop
| you from manipulating the human body because it could be
| used for porn? Why stop there; should text editors stop you
| from writing about sex or describing the human body because
| it could be used for "abuse"? Should your comment be
| removed because it made me imagine Taylor Swift without
| clothes for a brief moment?
| spencerflem wrote:
| Doing it effortlessly and instantly makes a difference.
|
| (This applies to all AI discussions)
| bergen wrote:
| No, but AI requires zero learning curve and can be
| automated. I can't spit out 10 images of Tay per second
| in photoshop. If I want, and the API delivers, I can easily
| do that with AI. (Granted, coding this yourself involves a
| learning curve, but in principle, with the right interface,
| and such interfaces exist, I can churn out hundreds of
| images without actively putting work in.)
| tavavex wrote:
| I've never understood the argument about image generators
| being (relatively) fast. Does that mean that if you could
| Photoshop 10 images per second, we should've started
| clamping down on Photoshop? What exact speed is the
| cutoff mark here? Given that Photoshop is updated every
| year and includes more and more tools that can accelerate
| your workflow (incl. AI-assisted ones), is there going to be
| a point when it gets too fast?
|
| I don't know much about the initial scandal, but I was
| under the impression that there was only a small number
| of those images, yet that didn't change the situation. I
| just fail to see how quantity factors into anything here.
| bergen wrote:
| >I just fail to see how quantity factors into anything
| here.
|
| Because you can overload any online discussion / sphere
| with that. There were so many that X effectively banned
| searching for her at all because if you did, you were
| overwhelmed by very extreme fake porn. Everybody can do
| it with very low entry barrier, it looks very believable,
| and it can be generated in high quantities.
|
| We shouldn't have clamped down on photoshop, but
| realistically two things would be nice in your theoretical
| case: usage restrictions and public information building.
| There was no clear cut point where photoshop was so
| mighty you couldn't trust any picture online. There were
| skills to be learned and people could identify the
| trickery, and it was on a very small scale and gradual.
| And the photo trickery was around for ages, even Stalin
| did it.
|
| But creating photorealistic fakes in an automated fashion
| is completely new.
| tavavex wrote:
| But when we talk about specifically harming one person,
| does it really matter if it's a thousand different
| generations of the same thing or 10 generations that were
| copied thousands of times? It is a technology that lowers
| the bar for generating believable-looking things, but I
| don't know if it's the speed that is the main culprit
| here.
|
| And in fairness to generative AI, even nowadays it feels
| like getting to a point of true photorealism takes some
| effort, especially if the goal is letting it just run
| nonstop with no further curation. And getting a local
| image generator to run at all on your computer (and
| having the hardware for it) is also a bar that plenty of
| people can't clear yet. Photoshop is kind of different in
| that making more believable things requires a lot more
| time, effort and knowledge - but the idea that any image
| online can be faked has already been ingrained in the
| public consciousness for a very long time.
| spencerflem wrote:
| Yes, if you could Photoshop 10/sec it would be a problem.
|
| Think of it this way, if one out of every ten phone calls
| you get is spam, you still have a pretty usable phone.
| Make that three orders of magnitude worse, so only 1 out of
| every 100 calls is real, and the system totally breaks down.
|
| Generative AI makes generating realistic looking fakes
| ~1000x easier; it's the one thing it's best at.
| kristopolous wrote:
| That's fine. But the question was what are they referring
| to and that's the answer.
| chasd00 wrote:
| > See the recent Taylor Swift scandal
|
| but that's not dangerous. It's definitely worthy of
| unlocking the cages of the attack lawyers but it's not
| dangerous. The word "safety" is being used by big tech to
| trigger and gaslight society.
| shrimp_emoji wrote:
| I.e., controlling through fear
| jquery wrote:
| To the extent these models don't blindly regurgitate hate
| speech, I appreciate that. But what I do not appreciate is
| when they won't render a human nipple or other human
| anatomy. That's not safety, and calling it such is
| gaslighting.
| ballenf wrote:
| AI/ML/GPT/etc are looking increasingly like other media
| formats -- a source of mass market content.
|
| The safety discussion is proceeding very much like it did for
| movies, music, and video games.
| bitcurious wrote:
| The latter; there is already an executive order around AI
| safety. If you don't address it out loud you'll draw attention
| to yourself.
|
| https://www.whitehouse.gov/briefing-room/presidential-action...
| memossy wrote:
| As the leader in open image models, it is incumbent upon us,
| as the models get to this level of quality, to take seriously
| how we can release models that are open and safe from legal,
| societal and other perspectives.
|
| Not engaging in this will indeed lead to bad laws, sanctions
| and more, as well as a failure to fulfill our societal
| obligation of ensuring this amazing technology is used for
| outcomes as positive as possible.
|
| Stability AI was set up to build benchmark open models of all
| types in a proper way; this is why, for example, we are one of
| the only companies to offer opt-out of datasets (stable
| cascade and SD3 are opted out), have given millions of
| supercompute hours in grants to safety-related research, and
| more.
|
| Smaller players with less uptake and scrutiny don't need to
| worry so much about some of these complex issues; it is quite
| a lot to keep on top of. We're doing our best.
| GenerWork wrote:
| >it is incumbent upon us as the models get to this level of
| quality to take seriously how we can release open and safe
| models from a legal, societal and other considerations.
|
| Can you define what you mean by "societal and other
| considerations"? If not, why not?
| memossy wrote:
| I could but I won't as legal stuff :)
| zmgsabst wrote:
| "We need to enforce our morality on you, for our beliefs are
| the true ones -- and you're unsafe for questioning them!"
|
| You sound like many authoritarian regimes.
| memossy wrote:
| I mean open models yo
| shapefrog wrote:
| > What is it specifically that company management is worried
| about?
|
| As with all hype techs, even the most talented management are
| barely literate in the product. When talking about their new
| trillion $ product they must take their talking points from the
| established literature and "fake it till they make it".
|
| If the other big players say "billions of parameters" you chuck
| in as many as you can. If the buzz words are "tokens" you say
| we have lots of tokens. If the buzz words are "safety" you say
| we are super safe. You say them all and hope against hope that
| nobody asks a simple question you are not equipped to answer
| that will show you don't actually know what you are talking
| about.
| chasd00 wrote:
| they risk reputational harm and, since there are so many
| alternatives, outright "brand cancellation". For example, vocal
| groups can lobby payment processors to deny service to any AI
| provider deemed unworthy. Ironic that tech enabled all of that
| behavior to begin with and now they're worried about it turning
| on them.
| tavavex wrote:
| What viable alternatives are there to Stable Diffusion? As
| far as I know, it's _the_ only way to run good image
| generation locally, and that's probably a big consideration
| for any business dabbling in it.
| astrange wrote:
| It's not the only open image model. It is the best one, but
| it's not the only one.
| tavavex wrote:
| Yeah, the word "good" is doing the heavy lifting here -
| while it's not the only one that can do it, it has a very
| comfortable lead over all alternatives.
| renewiltord wrote:
| It's a bit rich when HN itself is chock-full of camp
| followers who pick the most mainstream opinion. Previously it
| was AI danger, then it became hallucinations, now it's that
| safety is too much.
|
| The rest of the world is also like that. You can make a thing
| that hurts your existing business. Spinning off the brand is
| probably Google's best bet.
| summerlight wrote:
| Likely public condemnation followed by unreasonable regulations
| when populists see their campaign opportunities. We've
| historically seen this when new types of media (e.g. TV,
| computer games) debut and there are real, early signals of such
| actions.
|
| I don't think those companies being cautious is necessarily a
| bad thing even for AI enthusiasts. Open source models will
| quickly catch up without any censorship while most of those
| public attacks are concentrated into those high profile
| companies, which have established some defenses. That would be
| a much cheaper price than living with some unreasonable degree
| of regulations over decades, driven by populist politicians.
| bluescrn wrote:
| It's an election year.
|
| They're probably more concerned about generated images of
| politicians in 'interesting' situations going viral than they
| are about porn/gore etc.
| astrange wrote:
| Stability is not an American company.
| inference-lord wrote:
| Cool but it's hard to keep getting "blown away" at this stage.
| The "incredible" is routine now.
| danparsonson wrote:
| So... they should just stop?
| dougmwne wrote:
| At this point, the next thing that will blow me away is AGI at
| human expert level or a Gaussian Splat diffusion model that can
| build any arbitrary 3D scene from text or a single image. High
| bar, but the technology world is already full of dark magic.
| attilakun wrote:
| Is there a Gaussian splat model that works without the
| "Structure from Motion" step to extract the point cloud? That
| feels a bit unsatisfying to me.
| inference-lord wrote:
| Will ask it for immortality, endless wealth, and still get
| bored.
| consumer451 wrote:
| I would be a big fan of solid infographics or presentation
| slides. That would be very useful.
| JonathanFly wrote:
| From: https://twitter.com/EMostaque/status/1760660709308846135
|
| Some notes:
|
| - This uses a new type of diffusion transformer (similar to Sora)
| combined with flow matching and other improvements.
|
| - This takes advantage of transformer improvements & can not only
| scale further but accept multimodal inputs..
|
| - Will be released open, the preview is to improve its quality &
| safety just like og stable diffusion
|
| - It will launch with full ecosystem of tools
|
| - It's a new base taking advantage of latest hardware & comes in
| all sizes
|
| - Enables video, 3D & more..
|
| - Need moar GPUs..
|
| - More technical details soon
|
| >Can we create videos similar like sora
|
| Given enough GPUs and good data yes.
|
| >How does it perform on 3090, 4090 or less? Are us mere mortals
| gonna be able to have fun with it ?
|
| It's in sizes from 800m to 8b parameters now, will be all sizes
| for all sorts of edge to giant GPU deployment.
|
| (adding some later replies)
|
| >awesome. I assume these aren't heavily cherry picked seeds?
|
| No this is all one generation. With DPO, refinement, further
| improvement should get better.
|
| >Do you have any solves coming for driving coherency and
| consistency across image generations? For example, putting the
| same dog in another scene?
|
| yeah see @Scenario_gg's great work with IP adapters for example.
| Our team builds ComfyUI so you can expect some really great stuff
| around this...
|
| >Dall-e often doesn't even understand negation, let alone complex
| spatial relations in combination with color assignments to
| objects.
|
| Imagine the new version will. DALLE and MJ are also pipelines,
| you can pretty much do anything accurately with pipelines now.
|
| >Nice. Is it an open-source / open-parameters / open-data model?
|
| Like prior SD models it will be open source/parameters after the
| feedback and improvement phase. We are open data for our LMs but
| not other modalities.
|
| >Cool!!! What do you mean by good data? Can it directly output
| videos?
|
| If we trained it on video yes, it is very much like the arch of
| sora.
| cheald wrote:
| SD 1.5 is 983m parameters, SDXL is 3.5b, for reference.
|
| Very interesting. I've been stretching my 12GB 3060 as far as I
| can; it's exciting that smaller hardware is still usable even
| with modern improvements.
| memossy wrote:
| 800m is good for mobile, 8b for graphics cards.
|
| Bigger than that is also possible, not saturated yet but need
| more GPUs.
| vorticalbox wrote:
| you can also use quantisation, which lowers memory
| requirements at a small loss of performance.
| liuliu wrote:
| I am going to look at quantization for 8b. But also, these
| are transformers, so variety of merging / Frankenstein-tune
| is possible. For example, you can use 8b model to populate
| the KV cache (which computes once, so can load from slower
| devices, such as RAM / SSD) and use 800M model for diffusion
| by replicating weights to match layers of the 8b model.
| ttul wrote:
| Stability has to make money somehow. By releasing an 8B
| parameter model, they're encouraging people to use their paid
| API for inference. It's not a terrible business decision. And
| hobbyists can play with the smaller models, which with some
| refining will probably be just fine for most non-professional
| use cases.
| jandrese wrote:
| I would LOL if they released the "safe" model for free but
| made you pay for the one with boobs.
| ttul wrote:
| Oh they'll never let you pay for porn generation. But
| they will happily entertain having you pay for quality
| commercial images that are basically a replacement for
| the entire graphic design industry.
| teaearlgraycold wrote:
| Don't people quantize SD down to 8 bits? I understand
| plenty of people don't have 8GB of VRAM (and I suppose you
| need some extra for supplemental data, so maybe 10GB?). But
| that's still well within the realm of consumer hardware
| capabilities.
| ttul wrote:
| I'm the wrong person to ask, but it seems Stability
| intends to offer models from 800M to 8B parameters in
| size, which offers something for everyone.
| netdur wrote:
| > - Need moar GPUs..
|
| Why is there not a greater focus on quantization to optimize
| model performance, given the evident need for more GPU
| resources?
| supermatt wrote:
| I believe he means for training
| memossy wrote:
| We have highly efficient models for inference and a
| quantization team.
|
| Need moar GPUs to do a video version of this model similar to
| Sora, now that they have proved that Diffusion Transformers can
| scale with latent patches (see stablevideo.com and our work
| on that model, currently best open video model).
|
| We have 1/100th of the resources of OpenAI and 1/1000th of
| Google etc.
|
| So we focus on great algorithms and community.
|
| But now we need those GPUs.
| sylware wrote:
| Don't fall for it: OpenAI is microsoft. They have as much
| as google, if not more.
| px43 wrote:
| To be clear here, you think that Microsoft has more AI
| compute than Google?
| Jensson wrote:
| Google got cheap TPU chips, which means they circumvent the
| extremely expensive Nvidia corporate licenses. I can
| easily see them having 10x the resources of OpenAI for
| this.
| SV_BubbleTime wrote:
| This isn't OpenAI that makes GPTx.
|
| It's StabilityAI that makes Stable Diffusion X.
| pavon wrote:
| Yes, they have deep pockets and could increase investment
| if needed. But the actual resources devoted today are
| public, and in line with what the parent said.
| Solvency wrote:
| can someone explain why nVidia doesn't just build their own
| AI? And literally devote 50% of their production to their
| own compute center? In an age where even ancient companies
| like Cisco are getting into the AI race, why wouldn't the
| people with the keys to the kingdom get involved?
| downWidOutaFite wrote:
| 1. the real keys to the kingdom are held by TSMC whose
| fab capacity rules the advanced chips we all get, from
| NVIDIA to Apple to AMD to even Intel these days.
|
| 2. the old advice is to sell shovels during a gold rush
| chompychop wrote:
| "The people that made the most money in the gold rush
| were selling shovels, not digging gold".
| swamp40 wrote:
| Jensen was just talking about a new kind of data center:
| AI-generation factories.
| blihp wrote:
| Because history has shown that the money is in selling
| the picks and shovels, not operating the mine. (At least
| for now. There very well may come a point later on when
| operating the mine makes more sense, but not until it's
| clear where the most profitable spot will be)
| declaredapple wrote:
| They've been very happy selling shovels at a steep margin
| to literally endless customers.
|
| The reason is because they instantly get a risk free
| guaranteed VERY healthy margin on every card they sell,
| and there's endless customers lined up for them.
|
| If they kept the cards, they give up the opportunity to
| make those margins, and instead take the risk that
| they'll develop a money generating service (that makes
| more money than selling the cards).
|
| This way there's no risk of: A competitor out competing
| them, not successfully developing a profitable product,
| "the ai bubble popping", stagnating development, etc.
|
| There's also the advantage that this capital has allowed
| them to buy up most of TSMC's production capacity, which
| limits the competitors like Google's TPUs.
| AnthonyMouse wrote:
| > Why is there not a greater focus on quantization to
| optimize model performance, given the evident need for more
| GPU resources?
|
| There is an inherent trade off between model size and
| quality. Quantization reduces model size at the expense of
| quality. Sometimes it's a better way to do that than reducing
| the number of parameters, but it's still fundamentally the
| same trade off. You can't make the highest quality model use
| the smallest amount of memory. It's information theory, not
| sorcery.
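|
| As a rough illustration of the size half of that trade-off,
| here is a minimal sketch of symmetric int8 weight quantization
| in Python (a simplified per-tensor scheme, not any particular
| library's implementation):
|
|     import torch
|
|     def quantize_int8(w: torch.Tensor):
|         # map float32 weights onto int8 steps, one scale per tensor
|         scale = w.abs().max() / 127.0
|         q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
|         return q, scale
|
|     def dequantize(q: torch.Tensor, scale: torch.Tensor):
|         return q.float() * scale   # lossy: the rounding error stays
|
|     w = torch.randn(1024, 1024)    # 4 MB as float32
|     q, s = quantize_int8(w)        # 1 MB as int8: 4x smaller
|     err = (dequantize(q, s) - w).abs().max()   # the quality cost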
| sandworm101 wrote:
| >> all sorts of edge to giant GPU deployment.
|
| Soon the GPU and its associated memory will be on different
| cards, as once happened with CPUs. The day of the GPU with ram
| _slots_ is fast approaching. We will soon plug terabytes of ram
| into our 4090s, then plug a half-dozen 4090s into a raspberry
| PI to create a Cronenberg rendering monster. Can it generate
| movies faster than Pixar can write them? Sure. Can it play
| Factorio? Heck no.
| jsheard wrote:
| Any seperation of a GPU from its VRAM is going to come at the
| expense of (a lot of) bandwidth. VRAM is only as fast as it
| is because the memory chips are as close as possible to the
| GPU, either on seperate packages immediately next to the GPU
| package or integrated onto the same package as the GPU itself
| in the fanciest stuff.
|
| If you don't care about bandwidth you can already have a GPU
| access terabytes of memory across the PCIe bus, but it's too
| slow to be useful for basically anything. Best case you're
| getting 64GB/sec over PCIe 5.0 x16, when VRAM is reaching
| _3.3TB/sec_ on the highest end hardware and even mid-range
| consumer cards are doing >500GB/sec.
|
| Things are headed the other way if anything, Apple and Intel
| are integrating RAM onto the CPU package for better
| performance than is possible with socketed RAM.
| mysterydip wrote:
| Is there a way to partition the data so that a given GPU
| had access to all the data it needs but the job itself was
| parallelized over multiple GPUs?
|
| Thinking on the classic neural network for example, each
| column of nodes would only need to talk to the next column.
| You could group several columns per GPU and then each would
| process its own set of nodes. While an individual job would
| be slower, you could run multiple tasks in parallel,
| processing new inputs after each set of nodes is finished.
| zettabomb wrote:
| Of course, this is common with LLMs which are too large
| to fit in any single GPU. I believe Deepspeed implements
| what you're referring to.
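|
| A toy sketch of that layer-partitioning idea (naive model
| parallelism, assuming two CUDA devices; DeepSpeed-style
| pipelining additionally streams micro-batches through the
| stages so both GPUs stay busy):
|
|     import torch
|     import torch.nn as nn
|
|     # each GPU holds a contiguous slice of layers ("columns")
|     stage0 = nn.Sequential(nn.Linear(512, 512), nn.ReLU()).to("cuda:0")
|     stage1 = nn.Sequential(nn.Linear(512, 512), nn.ReLU()).to("cuda:1")
|
|     def forward(x: torch.Tensor) -> torch.Tensor:
|         h = stage0(x.to("cuda:0"))     # stage 0 runs its layers
|         return stage1(h.to("cuda:1"))  # only activations cross GPUs
|
|     out = forward(torch.randn(8, 512))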
| sandworm101 wrote:
| That depends on whether performance or capacity is the
| goal. Smaller amounts of ram closer to the processing unit
| makes for faster computation, but AI also presents a
| capacity issue. If the workload needs the space, having a
| boatload of less-fast ram is still preferable to offloading
| data to something more stable like flash. That is where
| bulk memory modules connected through slots may one day
| appear on GPUs.
| duffyjp wrote:
| I'm having flashbacks to owning a Matrox Millenium as a
| kid. I never did get that 4MB vram upgrade.
|
| https://www.512bit.net/matrox/matrox_millenium.html
| ltbarcly3 wrote:
| I don't think you really understand the current trends in
| computer architecture. Even cpus are being moved to have on
| package ram for higher bandwidth. Everything is the opposite
| of what you said.
| sandworm101 wrote:
| Higher bandwidth but lower capacity. The real trend is
| different physical architectures for different compute
| loads. There is a place in AI for bulk albeit slower memory
| such as extremely large data sets that want to run
| internally on a discrete card without involving pci lanes.
| zettabomb wrote:
| I doubt it. The latest GPUs utilize HBM which is necessarily
| part of the same package as the main die. If you had a RAM
| slot for a GPU you might as well just go out to system RAM,
| way too much latency to be useful.
| AnthonyMouse wrote:
| It isn't the latency which is the problem, it's the
| bandwidth. A memory socket with that much bandwidth would
| need a lot of pins. In principle you could just have more
| memory slots where each slot has its own channel. 16
| channels of DDR5-8000 would have more bandwidth than the
| RTX 4090. But an ordinary desktop board with 16 memory
| channels is probably not happening. You could plausibly see
| that on servers however.
|
| What's more likely is hybrid systems. Your basic desktop
| CPU gets e.g. 8GB of HBM, but then also has 16GB of DRAM in
| slots. Another CPU/APU model that fits into the same socket
| has 32GB of HBM (and so costs more), which you could then
| combine with 128GB of DRAM. Or none, by leaving the slots
| empty, if you want entirely HBM. A server or HEDT CPU might
| have 256GB of HBM and support 4TB of DRAM.
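|
| The channel arithmetic, spelled out (a DDR5 channel moves 8
| bytes per transfer, so DDR5-8000 is 64GB/sec per channel):
|
|     ddr5_8000 = 8000e6 * 8          # 64 GB/s per channel
|     print(16 * ddr5_8000 / 1e9)     # 1024 GB/s over 16 channels
|     print(21e9 * 384 / 8 / 1e9)     # RTX 4090: 21Gbps x 384-bit
|                                     # = 1008 GB/s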
| brookst wrote:
| Agree, this is the likely future. It's really just an
| extension of the existing tiered CPU cache model.
| VikingCoder wrote:
| I'm curious - where are the GPUs with decent processing power
| but enormous memory? Seems like there'd be a big market for
| them.
| wongarsu wrote:
| Nvidia is making way too much money keeping cards with lots
| of memory exclusive to server GPUs they sell with insanely
| high margins.
|
| AMD still suffers from limited resources and doesn't seem
| willing to spend too much chasing a market that might just be
| a temporary hype; Google's TPUs are a pain to use and seem to
| have stalled out; and Intel lacks commitment, and even their
| products that went roughly in that direction aren't a great
| match for neural networks because of their philosophy of
| having fewer, more complex cores.
| p1esk wrote:
| H200 has 141GB, B100 (out next month) will probably have even
| more. How much memory do you need?
| holoduke wrote:
| We need 128gb with a 4070 chip for about 2000 dollars.
| That's what we want.
| FeepingCreature wrote:
| Yes please.
| duffyjp wrote:
| I've never tried it, but in Windows you can have CUDA
| apps fall back to system ram when GPU vram is exhausted.
| You could slap 128gb in your rig with a 4070. I'm sure
| performance falls off a cliff, but if it's the difference
| between possible and impossible that might be acceptable.
|
| https://nvidia.custhelp.com/app/answers/detail/a_id/5490/
| ~/s...
| ta_1138 wrote:
| Unfortunately production capacity for that is limited,
| and with sufficient demand, all pricing is an auction.
| Therefore, we aren't going to be seeing that card for
| years.
| qwertox wrote:
| Please give me some DIMM slots on the GPU so that I can
| choose my own memory like I'm used to from the CPU-world
| and which I can re-use when I upgrade my GPU.
| ttul wrote:
| Nvidia will not build that any time soon. RAM is the
| dividing line between charging $40,000 vs $2500...
| SV_BubbleTime wrote:
| I'll bet you the Nvidia 50xx series will have cards that are
| asymmetric for this reason. But nothing that will cannibalize
| their gaming market.
|
| You'll be able to get higher resolution but slowly. Or pay
| the $2800 for a 5090 and get high res with good speed.
| ls612 wrote:
| MacBooks with M2 or M3 Max. I'm serious. They perform like a
| 2070 or 2080 but have up to 128GB of unified memory, most of
| which can be used as VRAM.
| declaredapple wrote:
| How many tokens/s are we talking for a 70B model?
|
| Last I saw they performed really poorly, like lower single
| digits t/s. Don't get me wrong they're probably a decent
| value for experimenting with it, but they're flat out pathetic
| compared to an A100 or H100. And I think useless for
| training?
| smcleod wrote:
| You can run a 180B model like Falcon Q4 around 4-5tk/s, a
| 120B model like Goliath Q4 at around 6-10tk/s, and 70B Q4
| around 8-12tk/s and smaller models much quicker, but it
| really depends on the context size, model architecture
| and other settings. An A100 or H100 is obviously going to
| be a lot faster but it costs significantly more taking
| its supporting requirements into account and can't be run
| on a light, battery powered laptop etc...
| ttul wrote:
| MPS is promising and the memory bandwidth is definitely
| there, but stable diffusion performance on Apple Silicon
| remains terribly poor compared with consumer Nvidia cards
| (in my humble opinion). Perhaps this is partly because so
| many bits of the SD ecosystem are tied to Nvidia
| primitives.
| iosjunkie wrote:
| I dream of AMD or Intel creating cards to do just that
| pbhjpbhj wrote:
| Nvidia have a system for DMA from GPU to system memory,
| GPUdirect. That seems like a potentially better route if
| latency can be handled well.
| nick238 wrote:
| GPU memory is all about bandwidth, not latency. DDR5 can do
| 4-8 GT/s x 64-bit bus per DIMM, so maxing 128 GB/s with a
| dual memory controller, 512 GB/s with 8x memory controllers
| on server chips, but GDDR6 can run at twice the frequency
| and has a memory bus ~5x as wide in the 4090, so you get an
| order of magnitude bump in throughput, so nearly 1 TB/s on
| a consumer product. Datacenter GPUs (e.g. A100) with HBM2e
| doubles that to 2 TB/s
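|
| The same hierarchy as rough arithmetic (effective transfer
| rate times bus width; exact speeds vary by part):
|
|     GBps = lambda gtps, bus_bits: gtps * bus_bits / 8
|     ddr5_dimm  = GBps(8,    64)    #   64 GB/s, one DIMM channel
|     server_8ch = 8 * ddr5_dimm     #  512 GB/s, 8 controllers
|     gddr6x     = GBps(21,  384)    # 1008 GB/s, RTX 4090
|     hbm2e      = GBps(3.2, 5120)   # 2048 GB/s, A100-class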
| albertzeyer wrote:
| I understand that Sora is very popular, so it makes sense to
| refer to it, but when saying it is similar to Sora, I guess it
| actually makes more sense to say that it uses a Diffusion
| Transformer (DiT) (https://arxiv.org/abs/2212.09748) like Sora.
| We don't really know more details on Sora, while the original
| DiT has all the details.
| tithe wrote:
| Is anyone else struck by the similarities in textures between
| the images in the appendix of the above "Scalable Diffusion
| Models with Transformers" paper?
|
| If you size the browser window right, paging with the arrow
| keys (so the document doesn't scroll) you'll see (eg, pages
| 20-21) the textures of the parrot's feathers are almost
| identical to the textures of bark on the tree behind the
| panda bear, or the forest behind the red panda is very
| similar to the undersea environment.
|
| Even if I'm misunderstanding something fundamental here about
| this technique, I still find this interesting!
| jachee wrote:
| Could be that they're all generated from the same seed. And
| we humans are _really_ good at spotting patterns like that.
| cchance wrote:
| So is this "SDXL safe" or "SD2.1" safe, cause SDXL safe we can
| deal with, if it's 2.1 safe it's gonna end up DOA for a large
| part of the opensource community again
| astrange wrote:
| SD2.1 was not "overly safe", SD2.0 was because of a training
| bug.
|
| 2.1 didn't have adoption because people didn't want to deal
| with the open replacement for CLIP. Or possibly because
| everyone confused 2.0 and 2.1.
| samstave wrote:
| >> _> How does it perform on 3090, 4090 or less? Are us mere
| mortals gonna be able to have fun with it ?_
|
| >>> _Its in sizes from 800m to 8b parameters now, will be all
| sizes for all sorts of edge to giant GPU deployment._
|
| --
|
| Can you fragment responses such that if an edge device (mobile
| app) is prompted for [thing] it can pass tokens upstream on the
| prompt -- torrenting responses, effectively -- and you could
| push actual GPU edge devices in certain climates... like dense
| cities which are expected to account for a ton of GPU cycle
| consumption around the edge?
|
| So you have tiered processing (speed is handled locally,
| quality level 1 can take some edge gpu, and corporate shit can
| be handled in the cloud)...
|
| ----
|
| Can you fragment and torrent a response?
|
| If so, how is that request torn up and routed to appropriate
| resources?
|
| BOFH me if this is a stupid question? (but it's valid for how we
| are evolving to AI being intrinsic to our society so quickly.)
| coldcode wrote:
| No details in the announcement, is it still pixel size in = pixel
| size out?
| spywaregorilla wrote:
| Impressive text in the images.
| deepsdev wrote:
| Can we use it to create SORA-like videos?
| memossy wrote:
| If we trained it with videos yes but need more GPUs for that.
| nickthegreek wrote:
| No.
| btbuildem wrote:
| That's nice, but could we please have an unsafe alternative? I
| would like to footgun both my legs off, thank you.
| dougmwne wrote:
| Since these are open models, people can fine tune them to do
| anything.
| politician wrote:
| It's not obvious that fine-tuning can remove all latent
| compulsions from these models. Consider that the creators
| know that fine-tuning exists and have vastly more resources
| to explore the feasibility of removing deep bias using this
| method.
| dougmwne wrote:
| Go check out the Unstable Diffusion Discord.
| SV_BubbleTime wrote:
| The vast majority of images there are SD1.5, even the
| ones made today.
|
| Which goes far more towards the idea that safety isn't a
| desirable feature to a lot of AI users.
| ttul wrote:
| I suppose you could train a model from scratch if you have
| enough money to blow...
| wokwokwok wrote:
| How would that be meaningfully different to SDXL?
|
| I mean, SDXL is great. Until you've had a chance to _actually
| use_ this model, calling it out for _some imagined offence_
| that may or may not exist seems like you're drinking some
| Kool-aid rather than responding to something based in
| concrete _actual reality_.
|
| You get access to it... and it does the google thing and puts
| people of colour in every frame? Sure, complain away.
|
| You get access to it, you can't even generate pictures of
| girls? Sure. Burn the house down.
|
| ...you haven't even _seen it_ and you're _already bitching_
| about it?
|
| Come on... give them a chance. Judge what it _is_ when _you see
| it_ not _what you imagine it is_ before you've even had a
| chance to try it out...
|
| Lots of models, free, multiple sizes, hot damn. This is cool
| stuff. Be a bit grateful for the work they're doing.
|
| ...and even if it sucks, it's open. If it's not what you want,
| you
| can retune it.
| viraptor wrote:
| Just wait some time. People release SD loras all the time. Once
| SD3 is open, you'll be able to get a patched model in
| days/weeks.
| SV_BubbleTime wrote:
| A blogger I follow had an article explaining that the NSFW
| models for SDXL, are just now SORT OF coming up to the
| quality of SD1.5 "pre safety" models.
|
| It's been 6 months and it still isn't there. SD3 is going to
| be quite a while if they're baking "safety" in even harder.
| viraptor wrote:
| 1.5 is still more popular than xl and 2 for reasons
| unrelated to safety. The size and generation speed matter a
| lot. This is just a matter of practical usability, not some
| idea of the model being locked down. Feed it enough porn
| and you'll get porn out of it. If people have incentive to
| do that (better results than 1.5), it really will happen
| within days.
| Der_Einzige wrote:
| Due to the pony community the SDXL nsfw models are far
| superior to SD1.5. Only issue is that controlnets don't
| work with that pony SDXL fine tune
| SV_BubbleTime wrote:
| I am slightly aware of the pony models.
|
| I wish I had something more clever to comment on it. I
| know what they're doing which is cool and why which is,
| IDK, live and let live and enjoy your own kink. It's just a
| little funny that some of the most work put into fine-tuning
| models... is from the pony community.
|
| So all I have is...
|
| :/
| Fervicus wrote:
| Nope, sorry. We can't allow you to commit thought crimes.
| wtcactus wrote:
| I notice they are avoiding images of people in the announcement.
|
| I wonder if they are afraid of the same debacle as google AI and
| what they mean by "safety" is actually heavy bias against white
| people and their culture like what happened with Gemini.
| danielbln wrote:
| What's white people culture?
| potwinkle wrote:
| From the examples I see on Twitter, they are usually
| referring to the different cultures of Irish, European, and
| American white people. Gemini, in an effort to reverse the
| bias that the models would naturally have, ends up replacing
| these people with those from other cultures.
| astrange wrote:
| Calling Irish people white is a rather historically radical
| statement.
| sealeck wrote:
| White is a pretty complex and non-obvious category.
| 7moritz7 wrote:
| US American white people. Anything else would be a ridiculous
| overgeneralization, like "Asian culture", even if you set
| some arbitrary benchmark for skin tone and only look at those
| European countries, it's still too much diversity to pool
| together.
| t0lo wrote:
| A little continent called europe?
| cuckatoo wrote:
| NSFW fine tune when? Or will "safety" win this time?
| SXX wrote:
| They need to release the model first. Then it will be
| fine-tuned.
| redder23 wrote:
| Horrible website; it hijacks scrolling. I have my scrolling
| speed turned up with Chromium Wheel Smooth Scroller. This
| website's scrolling is extremely slow, so the extension is not
| working, because they are "doing it wrong"(TM) and somehow
| hijack native scrolling and do something with it.
| pama wrote:
| I wish they put out the report already. Has anyone else published
| a preprint combining ideas similar to diffusion transformers and
| flow matching?
| lairv wrote:
| Pretty exciting indeed to see they used flow matching, which
| has been unpopular for the last few years.
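|
| For reference, a minimal sketch of a flow-matching training
| step in the common rectified-flow formulation; whether SD3
| uses exactly this variant is unknown until the report is out:
|
|     import torch
|
|     def flow_matching_loss(model, x1):   # x1: batch of images
|         x0 = torch.randn_like(x1)        # pure-noise endpoint
|         t = torch.rand(x1.shape[0], 1, 1, 1)  # time in [0, 1]
|         xt = (1 - t) * x0 + t * x1       # straight-line interpolant
|         v_target = x1 - x0               # velocity along that line
|         return ((model(xt, t) - v_target) ** 2).mean()
|
|     net = lambda xt, t: xt               # stand-in for a network
|     loss = flow_matching_loss(net, torch.randn(4, 3, 64, 64))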
| memossy wrote:
| It'll be out soon, doing benchmark tests etc
| pama wrote:
| Thanks.
| subzel0 wrote:
| "Photo of a red sphere on top of a blue cube. Behind them is a
| green triangle, on the right is a dog, on the left is a cat"
|
| https://pbs.twimg.com/media/GG8mm5va4AA_5PJ?format=jpg&name=...
| Workaccount2 wrote:
| Not bad, I'm curious about the output if you ask for a mirrored
| sphere instead.
| svenmakes wrote:
| This is actually the approach of one paper to estimate
| lighting conditions. Their strategy is to paint a mirrored
| sphere onto an existing image:
| https://diffusionlight.github.io/
| jetrink wrote:
| One thing that jumps out to me is that the white fur on the
| animals has a strong green tint due to the reflected light from
| the green surfaces. I wonder if the model learned this effect
| from behind the scenes photos of green screen film sets.
| diggan wrote:
| It's just diffuse irradiance, visible in most real (and CGI)
| pictures although not as obvious as that example. Seems like
| a typical demo scene for a 3D renderer, so I bet that's why
| it's so prominent.
| zero_iq wrote:
| The models do a pretty good job at rendering plausible global
| illumination, radiosity, reflections, caustics, etc. in a
| whole bunch of scenarios. It's not necessarily physically
| accurate (usually not in fact), but usually good enough to
| trick the human brain unless you start paying very close
| attention to details, angles, etc.
|
| This fascinated me when SD was first released, so I tested a
| whole bunch of scenarios. While it's quite easy to find
| situations that don't provide accurate results and produce
| all manner of glitches (some of which you can use to detect
| some SD-produced images), the results are nearly always
| convincing at a quick glance.
| astrange wrote:
| One thing they don't so far do is have consistent
| perspective and vanishing points.
|
| https://arxiv.org/abs/2311.17138
| orbital-decay wrote:
| As well as light and shadows, yes. It can be fixed
| explicitly during training like the paper you linked
| suggests by offering a classifier, but it will probably
| also keep getting better in new models on its own, just
| as a result of better training sets, lower compression
| ratios, and better understanding of the real world by
| models.
| awongh wrote:
| I think you have to conceptualize how diffusion models work,
| which is that once the green triangle has been put into the
| image in the early steps, the later generations will be
| influenced by the presence of it, and fill in fine details
| like reflection as it goes along.
|
| The reason it knows this is that this is how any light in a
| real photograph works, not just CGI.
|
| Or if your prompt was "A green triangle looking at itself in
| the mirror" then early generation steps would have two green
| triangle like shapes. It doesn't need to know about the
| concept of light reflection. It does know about composition
| of an image based on the word mirror though.
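|
| A deliberately crude sampling loop to illustrate that
| coarse-to-fine behavior (not a real scheduler; the model here
| is a stand-in for a trained noise predictor):
|
|     import torch
|
|     model = lambda x, t: 0.1 * x    # stand-in noise predictor
|
|     def sample(steps=50, shape=(1, 4, 64, 64)):
|         x = torch.randn(shape)           # start from pure noise
|         for t in reversed(range(steps)):
|             x = x - model(x, t) / steps  # crude denoising update
|             # early steps settle global layout (triangle placed);
|             # later steps add detail consistent with it
|         return x
|
|     latents = sample()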
| mlsu wrote:
| It does make sense though. Accurate global illumination is
| very strongly represented in nearly all training data (except
| illustrations) so it makes sense that the model learned an
| approximation of it.
| samstave wrote:
| Wow - is it doing pre-render-ray-tracing?
| Hugsun wrote:
| That's very impressive!
| yreg wrote:
| It is! This isn't something previous models could do.
| iamgopal wrote:
| Interesting that left and right are taken from the viewer's
| perspective instead of the red sphere's perspective.
| ebertucc wrote:
| How do you know which way the red sphere is facing? A fun
| experiment would be to write two prompts for "a person in the
| middle, a dog to their left, and a cat to their right", and
| have the person either facing towards or away from the
| viewer.
| leumon wrote:
| "When in doubt, scale it up." - openai.com/careers
| Filligree wrote:
| That's _amazing_.
|
| I imagine this doesn't look impressive to anyone unfamiliar
| with the scene, but this was absolutely impossible with any of
| the older models. Though, I still want to know if it reliably
| does this--so many other things are left to chance, if I need
| to also hit a one-in-ten chance of the composition being right,
| it still might not be very useful.
| Feuilles_Mortes wrote:
| What was difficult about it?
| lucidrains wrote:
| previous systems could not compose objects within the scene
| correctly, not to this degree. what changed to allow for
| this? could this be a heavily cherrypicked example? guess
| we will have to wait for the paper and model to find out
| bbor wrote:
| From the original paper with this technique:
| We introduce Diffusion Transformers (DiTs), a simple
| transformer-based backbone for diffusion models that
| outperforms prior U-Net models and inherits the excellent
| scaling properties of the transformer model class. Given
| the promising scaling results in this paper, future work
| should continue to scale DiTs to larger models and token
| counts. DiT could also be explored as a drop-in backbone
| for text-to-image models like DALL-E 2 and Stable
| Diffusion.
|
| Afaict the answer is that combining transformers with
| diffusers in this way means that the models can
| (feasibly) operate in a much larger, more linguistically-
| complex space. So it's better at spatial relationships
| simply because it has more computational "time" or
| "energy" or "attention" to focus on them.
|
| Any actual experts want to tell me if I'm close?
| zavertnik wrote:
| From my experience, the thing that makes AI image gen hard
| to use is nailing specificity. I often find myself
| having to resort to generating all of the elements I want
| out of an image separately and then comp them together with
| photoshop. This isn't a bad workflow, but it is tedious (I
| often equate it to putting coins in a slot machine, hoping
| it 'hits').
|
| Generating good images is easy but generating good images
| with very specific instructions is not. For example, try
| getting midjourney to generate a shot of a road from the
| side (ie standing on the shoulder of a road taking a photo
| of the shoulder on the other side with the road crossing
| frame from left to right)...you'll find midjourney only
| wants to generate images of roads coming at the "camera"
| from the vanishing point. I even tried feeding an example
| image with the correct framing for midjourney to analyze to
| help inform what prompts to use, but this still did not
| result in the expected output. This is obviously not the
| only framing + subject combination that model(s) struggle
| with.
|
| For people who use image generation as a tool within a
| larger project's workflow, this hurdle makes the tool swing
| back and forth from "game changing technology" to "major
| time sink".
|
| If this example prompt/output is an honest demonstration of
| SD3's attention to specificity, especially as it pertains
| to framing and composition of objects + subjects, then I
| think its definitely impressive.
|
| For context, I've used SD (via comfyUI), midjourney, and
| Dalle. All of these models + UIs have shared this issue in
| varying degrees.
| astrange wrote:
| It's very difficult to improve text-to-image generation
| to do better than this because you need extremely
| detailed text training data, but I think a better
| approach would be to give up on it.
|
| > I often find myself having to resort to generating all
| of the elements I want out of an image separately and
| then comp them together with photoshop. This isn't a bad
| workflow, but it is tedious
|
| The models should be developed to accelerate this then.
|
| ie you should be able to say layer one is this text
| prompt plus this camera angle, layer two is some
| mountains you cheaply modeled in Blender, layer three is
| a sketch you drew of today's anime girl.
| CSMastermind wrote:
| I put the prompt into ChatGPT and it seemed to work just
| fine: https://imgur.com/LsRM7G4
| mikeg8 wrote:
| I dislike the look of chatGPT images so much. The photo-
| realism of stable diffusion impresses me a lot more for
| some reason.
| bbor wrote:
| This is just stylistic, and I think it's because chatgpt
| knows a bit "better" that there aren't very many literal
| photos of abstract floating shapes. Adding "studio
| photography, award winner" produced results quite similar
| to SD imo, but this does negatively impact the accuracy.
| On the other side of the coin, "minimalist textbook
| illustration" definitely seems to help the accuracy,
| which I think is soft confirmation of the thought above.
|
| https://imgur.com/a/9fO2gxN
|
| EDIT: I think the best approach is simply to separate out
| the terms in separate phrases, as that gets more-or-less
| 100% accuracy https://imgur.com/a/JGjkicQ
|
| That said, we should acknowledge the point of all this:
| SD3 is just incredibly incredibly impressive.
| mortenjorck wrote:
| You got lucky! Here's a thread where I attempted the same
| just now: https://imgur.com/a/xiaiKXp
|
| It has a lot of difficulty with the orientation of the cat
| and dog, and by the time it gets them in the right
| positions, the triangle is lost.
| smcleod wrote:
| It looks terrible to me though, very basic rendering and as
| if it's lower resolution then scaled up.
| ttul wrote:
| It's the transformer making the difference. Original stable
| diffusion uses convolutions, which are bad at capturing long
| range spatial dependencies. The diffusion transformer chops
| the image into patches, mixes them with a positional
| embedding, and then just passes that through multiple
| transformer layers as in an LLM. At the end, the model
| unpatchify's (yes, that term is in the source code) the
| patched tokens to generate output as a 2D image again.
|
| The transformer layers perform self-attention between all
| pairs of patches, allowing the model to build a rich
| understanding of the relationships between areas of an image.
| These relationships extend into the dimensions of the
| conditioning prompts, which is why you can say "put a red
| cube over there" and it actually is able to do that.
|
| I suspect that the smaller model versions will do a great job
| of generating imagery, but may not follow the prompt as
| closely, but that's just a hunch.
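|
| A minimal sketch of that patchify -> attend -> unpatchify
| flow (toy sizes, timestep/prompt conditioning omitted; SD3's
| real architecture is unpublished as of this preview):
|
|     import torch
|     import torch.nn as nn
|
|     class TinyDiT(nn.Module):
|         def __init__(self, size=32, p=4, dim=256, depth=4):
|             super().__init__()
|             self.p = p
|             n = (size // p) ** 2              # number of patches
|             self.tok = nn.Linear(p * p * 4, dim)  # 4 latent channels
|             self.pos = nn.Parameter(torch.zeros(1, n, dim))
|             layer = nn.TransformerEncoderLayer(dim, 8, batch_first=True)
|             self.blocks = nn.TransformerEncoder(layer, depth)
|             self.pix = nn.Linear(dim, p * p * 4)
|
|         def forward(self, x):                 # x: (B, 4, 32, 32)
|             B, C, H, W = x.shape
|             p = self.p
|             # patchify: chop the latent image into flat patches
|             x = x.reshape(B, C, H // p, p, W // p, p)
|             x = x.permute(0, 2, 4, 1, 3, 5).reshape(B, -1, C * p * p)
|             x = self.blocks(self.tok(x) + self.pos)  # global attention
|             x = self.pix(x)
|             # unpatchify: reassemble patches into a 2D image
|             x = x.reshape(B, H // p, W // p, C, p, p)
|             return x.permute(0, 3, 1, 4, 2, 5).reshape(B, C, H, W)
|
|     noise_pred = TinyDiT()(torch.randn(1, 4, 32, 32))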
| npunt wrote:
| We're getting to strong holodeck vibes here
| the_duke wrote:
| So, they just announced StableCascade.
|
| Wouldn't this v3 supersede the StableCascade work?
|
| Did they announce it because a team had been working on it and
| they wanted to push it out to not just lose it as an internal
| project, or are there architectural differences that make both
| worthwhile?
| Kubuxu wrote:
| I think of the SD3 as a further evolution of SD1.5/2/XL and
| StableCascade as a branching path. It is unclear which will be
| better in the long term, so why not cover both directions if
| they have the resources to do so?
| ttul wrote:
| I suspect Stable Cascade may incorporate a DiT at some point.
| The UNet is easily swapped out. SC's main innovation is the
| training of a semantic compressor model and a VQGAN that
| translates the latent output from the diffusion model back to
| image space - rather than relying on a VAE.
|
| It's a really smart architecture and I think is fertile
| ground for stacking on new things like DiT.
| whywhywhywhy wrote:
| There are architectural differences, although I found Stable
| Cascade a bit underwhelming. While yes, it can actually manage
| text, the text it does manage just looks like someone
| wrote text over the image; it doesn't feel integrated a lot of
| the time.
|
| SD3 seems to be more towards SOTA. Not sure why Cascade took so
| long to get out; it seemed to be up and running months ago.
| Dwedit wrote:
| Stable Cascade has a distinct noisy look to generated images.
| It almost looks as bad as images being dithered to the old 216
| color Netscape palette.
| ttul wrote:
| If you renoise the output of the first diffusion stage to
| halfway and then denoise forward again, you can eliminate the
| bad output. This approach is called "replay" or "iterative
| mixing" and there are a few open source nodes for ComfyUI you
| can refer to.
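|
| A rough sketch of the renoise-then-denoise idea using a
| generic diffusers img2img pass (the actual ComfyUI nodes
| operate on Stable Cascade's stages directly; the file name
| and prompt are placeholders):
|
|     import torch
|     from PIL import Image
|     from diffusers import StableDiffusionImg2ImgPipeline
|
|     pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
|         "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
|     ).to("cuda")
|
|     first_pass = Image.open("first_pass.png")  # stage-1 output
|     # strength=0.5 renoises the latents halfway, then denoises
|     # forward again, smoothing the first pass's noisy texture
|     cleaned = pipe("same prompt as before", image=first_pass,
|                    strength=0.5).images[0]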
| 101008 wrote:
| What's the best way to use SD (3 or 2) online? I can't run it on
| my PC and I want to do some experiments to generate assets for a
| POC videogame I'm working on. I pay for Midjourney and I
| wouldn't mind paying something like 5 or 10 dollars per month
| to experiment with SD, but I can't find anything.
| Gracana wrote:
| I used Rundiffusion for a while before I bought a 4090, and I
| thought their service was pretty nice. You pay for time on a
| system of whatever size you choose, with whatever
| tool/interface you select. I think it's worth tossing a few
| bucks into it to try it out.
| heroprotagonist wrote:
| Eh, you can get the same software up and running in less than
| 15-20 minutes on an EC2 GPU instance for about half the
| hourly-rated pricing of rundiffusion. And you'll also pay
| less than their 'premium' monthly fee for storage of keeping
| an instance in the Stopped state the entire month.
|
| I used rundiffusion to play around with a bunch of different
| open source software quickly and easily with pre-downloaded
| models after getting annoyed at my laptop GPU. But once I
| settled on one particular implementation and started spending
| a lot of time in it, it no longer made sense to repeatedly
| pay every hour for an initial ease-of-setup.
|
| The only real ongoing benefit was rundiffusion came with a
| bunch of models pre-downloaded so swapping between them was
| quick. But you can use UI addons like the CivitAI browser to
| download models automatically through automatic1111, and
| you'll likely want to go beyond what they predownload to the
| instance for you anyway.
|
| The downside to running on the cloud directly is having to
| manage the running/stopped state of the instance yourself. I
| haven't ever left it running when I was done with an
| instance, but I could see that as a risk. CLI commands and
| scripting can make that faster than logging into a website
| which does it for you automatically, but it's extra effort.
|
| I thought about building an AMI and putting it up on AWS
| marketplace, but it looks like there are a few options for
| that already. I don't know how good they are out of the box,
| as I haven't used them. But if spending 20 minutes once to
| get software running on a Linux instance is truly the only
| barrier to reducing cost, those prebuilt AMIs are a decent
| intermediary step. They're about $0.10/hour on top of server
| costs. I skipped straight to installing the software myself,
| but even an extra $0.10/hour overhead would be better than
| paying double..
| Gracana wrote:
| Would you recommend that to someone who has never used AWS
| before? Is it possible to screw up and rack up a huge bill?
| I might consider using that for big tasks that I can't do
| with my local setup.
| Liquix wrote:
| poke around stablediffusion.fr and trending public huggingface
| spaces
| bsaul wrote:
| Does anyone know which AI could be used to generate UI design
| elements (such as "generate a real estate app widget list"), as
| well as the kind of prompts one would use to obtain good
| results?
|
| I'm only now investigating using AI to increase velocity in my
| projects, and the field is moving so fast, I'm a bit outdated.
| kevinbluer wrote:
| v0 by Vercel could be worth a look: https://v0.dev
|
| From the FAQ: "v0 is a generative user interface system by
| Vercel powered by AI. It generates copy-and-paste friendly
| React code based on shadcn/ui and Tailwind CSS that people can
| use in their projects"
| gwern wrote:
| If by design elements you include vector images, you could try
| https://www.recraft.ai/ or Adobe Firefly 2 - there's not a lot
| of vector work right now, so your choices are either the
| handful of vector generators, or just bite the bullet and use
| eg DALL-E 3 to generate raster images you convert to
| SVG/recreate by hand.
|
| (The second is what we did for https://gwern.net/dropcap
| because the PNG->SVG filesizes & quality were just barely
| acceptable for our web pages.)
| AuryGlenz wrote:
| It's really unfortunate that Silicon Valley ended up in an area
| that's so far left - and to be clear, it'd be just as bad if it
| was in a far right area too. Purple would have been nice, to keep
| people in check. 'Safety' seems to be actively making AI advances
| worse.
| spencerflem wrote:
| Silicon Valley is not "far left" by any stretch, which implies
| socialism, redistribution of wealth, etc. This is obvious by
| inspection.
|
| I assume by far left, you mean progressive on social issues,
| which is not really a leftist thing but the groups are related
| enough that I'll give you a pass.
|
| Silicon valley techies are also not socially progressive. Read
| this thread or anything published by Paul Graham or any of the
| AI leaders for proof of that.
|
| However most normal city people are. A large enough percent of
| the country that big companies that want to make money feel the
| need to appeal to them.
|
| Funnily enough, what is a uniquely Silicon Valley political
| opinion is valuing the progress of AI over everything else
| TulliusCicero wrote:
| Techies are socially progressive as a whole. Yes there are
| some outliers, and tech _leaders_ probably aren't as far
| left socially as the ground level workers.
| spencerflem wrote:
| I wish :/, I really do
|
| I find them in general to not be Republican and all the
| baggage that entails but the typical techie I meet is less
| concerned with social issues than the typical city
| Democrat.
|
| If I can speculate wildly, I think it is because tech has
| this veneer of being an alternative solution to the worlds
| problems, so a lot of techies believe that advancing of
| tech is both the most important goal and also politically
| neutral. And also, now that tech is a uniquely profitable
| career, the types of people that would be in business
| majors are now CS majors. I.e., those that are mainly
| interested in getting as much money as possible for
| themselves.
| KittenInABox wrote:
| I disagree techies are socially progressive as a whole;
| there is very minimal, almost no push for labor rights or
| labor protection, even though our group is
| disproportionately hit by the abuse of employees under the
| visa program.
| TulliusCicero wrote:
| Labor protections are generally seen as a fiscal issue,
| rather than a social one. E.g. libertarians would usually
| be fine with gay rights but against greater labor
| regulation.
| chasd00 wrote:
| when i think of "far left" i think of an authoritative regime
| disguised as serving the common good and ready to punish and
| excommunicate any thought or action deemed contrary to the
| common good. However, the regime defines "common good"
| themselves and remains in power indefinitely. In that regard,
| SV is very "far left". At the extremes far-left and far-right
| are very similar when viewed from the perspective of a regular
| person on the street.
| spencerflem wrote:
| Well, you're wrong.
| foolofat00k wrote:
| That's just not what that term means.
| acheron wrote:
| It's not right wing unless they sit on the right side of
| the National Assembly and support Louis XVI.
| five_lights wrote:
| Like it or not, this is how people center-right and rightward
| are using it. We've created such huge silos since the post-
| Trump schism that even our language is drifting apart.
| skinpop wrote:
| Indeed, they are not really left but neoliberals with a
| leftist aesthetic, just like most Republicans are neoliberals
| with a conservative aesthetic.
| rightbyte wrote:
| The SV area far left? I wouldn't even regard the area as left
| leaning, at all.
|
| I looked at Wikipedia and there seems to be no socialist
| representation.
|
| Like, from a European perspective, hearing that is ludicrous.
| kristofferR wrote:
| They are the worst kind of left, the "prudish and constantly
| offended left", not the "free healthcare and good government"
| left.
|
| I'm glad I live in Norway, where state TV shows boobs and
| makes offensive jokes without anyone really caring.
| jquery wrote:
| Prudish? San Francisco? The same city that has outdoor nude
| carnivals without any kind of age restrictions?
|
| If by prudish you mean intolerant of hate speech, sure. But
| generally few will freak out over some nudity here.
|
| College here is free. We also have free healthcare here, as
| limited as it is:
| https://en.wikipedia.org/wiki/Healthy_San_Francisco
|
| Not sure what you mean by "offensive jokes", that could
| mean a lot of things...
| bergen wrote:
| Put in any historical or political context, SV is in no way
| left. They're hardcore libertarian. Just look at their poster
| boys: Elon Musk, Peter Thiel, and a plethora of others are very
| oriented towards totalitarianism from the right. Just because
| they blow their brains out on LSD and ketamine and go on 2-week
| spiritual retreats doesn't make them leftists. They're
| billionaires that only care about wealth and power, living in
| communities segregated from the common folk of the area -
| nothing lefty about that.
| freedomben wrote:
| Elon Musk and Peter Thiel are two of the most hated people in
| tech, so this doesn't seem like a compelling example. Also I
| don't think Elon Musk and Peter Thiel qualify as "hardcore
| libertarian." Thiel was a Trump supporter (hardly libertarian
| at all, let alone hardcore) and Elon has supported Democrats
| and much government his entire life until the last few years.
| He's mainly only waded into "culture war" type stuff, as far
| as I can tell. What sort of policies has Elon argued for that
| you think are "hardcore libertarian?"
| bergen wrote:
| He wanted to replace public transport with a system where
| you don't have to ride the public transport with the plebs,
| he wants to colonize Mars with the best minds (equals most
| money for him), he built a tank for urban areas. He
| promotes free speech even if it incites hate, he likes Ayn
| Rand, he implies government programs calling for united
| solutions are either communism, Orwell or basically Hitler.
| He actively promotes the opinions of those that pay above
| others on X.
| freedomben wrote:
| Thank you, truly, I appreciate the effort you put in to
| list those. It helps me understand more where you're
| coming from.
|
| > He wanted to replace public transport with a system
| where you don't have to ride the public transport with
| the plebs
|
| I don't think this is any more libertarian than kings and
| aristocrats of days past were. I know a bunch of people
| who ride public transit in New York and San Francisco who
| would readily agree with this, and they are definitely
| not libertarian. If anything it seems a lot more
| democratic since he wants it to be available to everyone.
|
| > he wants to colonize Mars with the best minds (equals
| most money for him)
|
| This doesn't seem particularly "libertarian" either,
| excepting maybe the aspect of it that is highly
| capitalistic. That point I would grant. But you could
| easily be socialist and still support the idea of
| colonizing something with the best minds.
|
| > he built a tank for urban areas.
|
| I admit I don't know anything about this one
|
| > He promotes free speech even if it incites hate
|
| This is a social libertarian position, although it's
| completely disconnected from economic libertarianism. I
| have a good friend who is a socialist (as in wants to
| outgrow capitalism such as marx advocated) who supports
| using the state to suppress capitalist
| activity/"exploitation", and he also is a free speech
| absolutist.
|
| > he likes Ayn Rand
|
| That's a reasonable point, although I think it's worth
| noting that there are plenty of hardcore libertarians who
| hate Ayn Rand.
|
| > he implies government programs calling for united
| solutions are either communism, Orwell or basically
| Hitler.
|
| Eh, lots of Republicans including Trump do the same
| thing, and they're not libertarian. Certainly not
| "hardcore libertarian".
|
| > He actively promotes the opinion of those that pay
| above others on X.
|
| This could be a good one, although Google, Meta, Reddit,
| Youtube, and any other company that runs ads or has
| "sponsored content" is doing the same thing, so we would
| have to define all the big tech companies as "hardcore
| libertarian" to stay consistent.
|
| Overall I definitely think this is a hard debate to have
| because "hardcore libertarian" can mean different things
| to different people, and there's a perpetual risk of "no
| true scotsman" fallacy. I've responded above with how I
| think most people would imagine libertarianism, but
| depending on when in history you use it, many anarcho-
| socialists used the label for themselves, yet today
| "libertarian" is a party that supports free market
| economics and social liberty. But regardless of the
| inherent challenges, I appreciate the exchange.
| bergen wrote:
| > I don't think this is any more libertarian than kings
| and aristocrats of days past were.
|
| So very libertarian.
|
| > If anything it seems a lot more democratic since he
| wants it to be available to everyone
|
| No, he wants a solution that minimizes contact with other
| people and lets you live in your bubble. This minimizes
| exposure to others from the same city and is a commercial
| system, not a publicly created one. Democratization would be
| cheap public transport where you don't get mugged, proven to
| work in every European and most Asian cities.
|
| > I admit I don't know anything about this one
|
| The Cybertruck. Again, a vehicle to isolate you from
| everyday life, being supposedly bulletproof and all.
|
| > lots of Republicans including Trump do the same thing,
| and they're not libertarian
|
| They are all "little government, individual choice" - of
| course they feed their masters, but the Kochs and co want
| exactly this.
|
| Appreciate the exchange too, thanks for the fact-based
| formulation of opinions.
| njarboe wrote:
| Musk's main residence is a $50k house he rents in Boca
| Chica. Grimes wanted a bigger, nicer residence for her and
| their kids, and that was one of the reasons she left him.
| bergen wrote:
| One of his many lies. https://www.wsj.com/articles/elon-
| musk-says-he-lives-in-a-50...
| dang wrote:
| We detached this subthread from
| https://news.ycombinator.com/item?id=39467056.
| spencerflem wrote:
| thank you, the thread looks so much nicer now with
| interesting technical details at the top
| dang wrote:
| I'm delighted that you noticed--it took about 30
| interventions to get there.
| asadotzler wrote:
| So far left that the techies don't even have a labor union.
| You're a joke.
| 4bpp wrote:
| I guess we should count our blessings and be grateful that
| literacy, the printing press, computers and the internet became
| normalised before this notion of "harm" and harm prevention was.
| Going forward, it's hard to imagine how any new technology that
| is unconditionally intellectually empowering to the individual
| will be tolerated; after all, just think of the harms someone
| thus empowered could be enabled to perpetrate.
|
| Perhaps eventually, once every forum has been assigned a trust-
| and-safety team and every word processor has been aligned and
| most normal people have no need for communication outside the
| Metaverse (TM) in their daily lives, we will also come around to
| reviewing the necessity of teaching kids to write, considering
| the epidemic of hateful graffiti and children being caught with
| handwritten sexualised depictions of their classmates.
| xanderlewis wrote:
| > unconditionally intellectually empowering
|
| What makes you think those who've worked hard over a lifetime
| to provide (with no compensation) the vast amounts of data
| required for these -- inferior by every metric other than
| quantity -- stochastic approximations of human thought should
| feel _empowered_?
|
| I think the genAI / printing press analogy is wearing rather
| thin now.
| graphe wrote:
| WHO exactly worked hard over a lifetime with no compensation?
| xanderlewis wrote:
| By _compensation_ I mean from the companies creating the
| models, like OpenAI.
| graphe wrote:
| Computers and drafters had their work taken by machines.
| IBM did not pay off the computers and drafters. In this
| case you could make a steady decent wage. My grandfather
| was trained in a classic drawing style (yes it was his
| main job).
|
| He did not get into the profession to make money. He did
| it out of passion and died poor. Artists are not being
| tricked by the promise of wealth. You will get a cloned
| style if you can't afford the real artist making it, and
| if the commission goes to a computer how is that not the
| same as plagiarism by a human? Artists were not being paid
| well before. The anime industry has proven the endpoint
| of what happens to artists as a profession despite their
| skills. Chess still exists despite better play by
| machines. Art as a commercial medium has always been
| tainted by outside influences such as government,
| religion and pedophilia.
|
| In the end, drawing wasn't going to survive in the age of
| vector art and computers. They are mainly forgettable
| jpgs you scroll past in a vast array like DeviantArt.
| xanderlewis wrote:
| Sorry, but every one of your talking points -- 'computers
| were replaced', 'chess is still being played', etc. --
| and good counterarguments to them have been covered ad
| nauseam (and practically verbatim) by now.
|
| Anyway, my point isn't that 'AI is evil and must be
| stopped'; it's that it doesn't feel 'intellectually
| empowering'. I (in my personal work) can't get anything
| done with ChatGPT that I can't do on my own, and with less
| frustration. We've created machines that can
| superficially mimic real work, and the world is going
| bonkers over it. The only magic power these systems have
| is sheer speed: they can output reams and reams of
| twaddle in the time it takes me to make a cup of tea. And
| no doubt those in bullshit jobs are soon going to find
| out.
|
| My argument might not be what you expect from someone who
| is sad to see the way artists' lives are going: if your
| work is truly capable of being replaced by a large
| language model or a diffusion model, maybe it wasn't very
| original to begin with.
|
| The sad thing is, artists who create genuinely superior
| work will still lose out because those financially
| enabling them will _think_ (wrongly) that they can be
| replaced. And we'll all be worse off.
| graphe wrote:
| I definitely feel more empowered, and making imperfect
| art and generating code that doesn't work and
| proofreading it is definitely changing people's lives.
| Which specific artist are you talking about who will
| suffer? Many of the ones I talk to are excited about
| using it.
|
| You keep going back to value and finances. The less money
| is in it the better. Art isn't good because it's
| valuable, unless you were only interested in it
| commercially.
| xanderlewis wrote:
| > Art isn't good because it's valuable, unless you were
| only interested in it commercially.
|
| Of course not; I'm certainly not suggesting so. But I do
| think money is important because it is what has enabled
| artists to do what they do. Without any prospect of
| monetising one's art, most of us (and I'm not an artist)
| would be out working in the potato fields, with very
| little time to develop skills.
| graphe wrote:
| I disagree. It will be better because it's driven purely
| by passion. Art runs in my family even today, I am fully
| aware of its value as well as cost. It is not a career
| and artists knew that then and now, splurging on
| expression through film purchases, luxurious pigments,
| toxic but beautiful chemicals, or instruments that were
| sure to never make back their purchase price. Someone
| (not my family) made Stonehenge
| in his backyard but it had no commercial value, it still
| is a very impressive feat and I admire the ingenuity. Art
| without monetary value is always the best, and previous
| problems such as film costs and paint prices are solved
| digitally, so the lack of commercial interest shouldn't
| hurt art at all.
|
| Commercial movies have lots of CG, big budgets and famous
| actors, while small-budget indie movies have been
| exploding despite their weaker technical specialities.
| Noah's Ark was made by amateurs while the Titanic was
| made by experts.
| samstave wrote:
| Slaves.
| xanderlewis wrote:
| Yes, but that's clearly not what I'm getting at.
| ben_w wrote:
| > inferior by every metric other than quantity
|
| And the metric of "beating most of our existing metrics so we
| had to rewrite the metrics to keep feeling special, but don't
| worry we can justify this rewriting by pointing at Goodhart's
| law".
|
| The only reason the question of _compensating_ people for
| their input into these models even matters is specifically
| because the models are, in actual fact, good. The bad models
| don't replace anyone.
| xanderlewis wrote:
| > beating most of our existing metrics so we had to rewrite
| the metrics to keep feeling special
|
| This is needlessly provocative, and also wrong. My metrics
| have been the same from the very beginning (i.e. 'can it
| even come close to doing my work for me?'). This question
| may yet come to evaluate to 'yes', but I think you
| seriously underestimate the real power of these models.
|
| > The only reason the question of compensating people for
| their input into these models even matters is specifically
| because the models are, in actual fact, good.
|
| No. They don't need to be good, they simply need to fool
| people into thinking they're good.
|
| And before you reflexively rebut with 'what's the
| difference?', let me ask you this: is the quality of a
| piece of work or the importance of a job and all of its
| indirect effects always immediately apparent? Is it
| possible for managers to short term cost-cut at the expense
| of the long term? Is it conceivable that we could at some
| point slip into a world in which there is no funding for
| genuinely interesting media anymore because 90% of the
| population can't distinguish it? The real danger of genAI
| is that it convinces non-experts that the experts are
| replaceable when the reality is utterly different. In some
| cases this will lead to serious blowups and the real
| experts will be called back in, but in more ambiguous cases
| we'll just quietly lose something of real value.
| ben_w wrote:
| > This is needlessly provocative,
|
| Perhaps; this is something I find annoying enough that my
| responses may be unnecessarily sharp...
|
| > and also wrong. My metrics have been the same from the
| very beginning (i.e. 'can it even come close to doing my
| work for me?'). This question may yet come to evaluate to
| 'yes', but I think you seriously underestimate the real
| power of these models.
|
| Okay then. (1) your definition is equivalent to
| "permanent mass unemployment" because if it can do your
| work for you, it can also do your work for someone else,
| (2) you mean either "_over_-estimate" or "real _limits_
| of these models", and the only reason I even bring up
| what's obviously a minor editing issue that I fall foul
| of myself on many comments is that this is the kind of
| mistake that people pick up on as evidence of the limits
| of AI -- treating small inversions like this as evidence
| of uselessness.
|
| > Is it conceivable that we could at some point slip into
| a world in which there is no funding for genuinely
| interesting media anymore because 90% of the population
| can't distinguish it?
|
| As written, what you describe is tautologically
| impossible. However, assuming you mean something more
| like "genuinely novel" rather than "interesting",
| absolutely! 100% yes. There's also _loads_ of ways this
| could permanently end all human flourishing (even when
| used as a mere tool e.g. by dictators for propaganda),
| and some plausible ways it can permanently end all human
| _existence_ (it's a safe bet someone will ask it to and
| try to empower it to this end, the question is how far
| they get with this).
|
| > The real danger of genAI is that it convinces non-
| experts that the experts are replaceable when the reality
| is utterly different.
|
| Despite the fact that the best models ace tests in
| medicine and law, the International Mathematical
| Olympiad, LeetCode, etc., the fact that there are no real
| tests for how good someone is after a few years of
| employment means both your point and mine can be true
| simultaneously. I'm thinking the real threat current LLMs
| pose to newspapers is that they fully automate the Gell-
| Mann Amnesia effect, even though they beat humans on
| _every_ measure I had of intelligence when I was growing
| up -- depending on which measure, either beating all of
| humanity together by many orders of magnitude or, at
| worst, landing somewhere near the level of a "rather
| good student taking the same test".
|
| > In some cases this will lead to serious blowups and the
| real experts will be called back in, but in more
| ambiguous cases we'll just quietly lose something of real
| value.
|
| Hard disagree about "quiet loss". To the extent that
| value can be quantified, even if only by surveying
| humans, models can learn it. Indeed, this is already
| baked into the way ChatGPT asks you for feedback about
| the quality of the answers it generates. To the extent we
| lose things, it will be a very loud and noisy loss,
| possibly literally in the form of a nuke going off.
| astrange wrote:
| > (1) your definition is equivalent to "permanent mass
| unemployment" because if it can do your work for you, it
| can also do your work for someone else
|
| This wouldn't happen because employment effects are
| mainly determined by comparative advantage, i.e. the
| resources that could be used to "do your job" will
| instead be used to do something they're more suited to.
|
| (Not "that they're better at". it's "more suited to". You
| do not have your job because you're the best at it.)
| ben_w wrote:
| I don't claim to be an expert in economics, so if you
| feel like answering please treat me as a noob, but
| doesn't comparative advantage have the implicit
| assumption that demand isn't ever going to be fully met
| for all buyers? The "single most economically important
| task" for a machine which can operate at a human (or
| superhuman) level is "make a better version of itself"
| until that process hits a limit, followed by "maximise
| how many of you exist" until it runs out of resources.
| With assumptions that currently seem plausible such as
| "such a robot[0] might mass 100kg and take 5 months to
| turn plain metal ore into a working copy of itself", it
| takes about 30 years to convert the planet Mercury into
| 4.12e11 such robots _per currently living human_ [1],
| which I assert is _more than anyone can actually use_
| even if they decided their next game of Civilization was
| going to be a 1:1 scale WestWorld-style LARP.
|
| If I imagine a world where every task that any human can
| perform can also be done at world expert level -- let
| alone at a superhuman level -- by a computer/robot (with
| my implicit assumption "cheaply"), I can't imagine why I
| would ever choose the human option. If the comparative
| advantage argument is "the computer/robot combination
| will always be priced at exactly the level where it's
| cost-competitive with a human, in order that it can
| extract maximum profit", I ask why there won't be many
| AI/robots competing with each other for ever-smaller
| profit margins?
|
| [0] AI and robotics are _not_ the same things, one is
| body the other mind, but there's a lot of overlap, with
| AI being used to drive robots, LLMs making it easier to
| define rewards and for the robots to plan; and AIs also
| get better by having embodiment (even if virtual) giving
| them real-world feedback.
|
| [1] https://www.wolframalpha.com/input?i=5+months+*+log2%
| 28mass+...
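|
| (A quick Python back-of-the-envelope check of that claim,
| using the assumptions stated above -- 100 kg per robot, 5
| months per doubling -- plus round numbers for Mercury's
| mass and the human population:)
|
|     import math
|
|     MERCURY_MASS_KG = 3.3e23   # approximate mass of Mercury
|     ROBOT_MASS_KG = 100        # assumed robot mass
|     MONTHS_PER_DOUBLING = 5    # assumed self-copy time
|     HUMANS = 8e9               # rough current population
|
|     robots = MERCURY_MASS_KG / ROBOT_MASS_KG  # ~3.3e21
|     doublings = math.log2(robots)             # ~71.5
|     years = doublings * MONTHS_PER_DOUBLING / 12
|     per_human = robots / HUMANS
|
|     # ~30 years, ~4.1e11 robots per person
|     print(f"{years:.0f} years, {per_human:.2e} per person")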
| astrange wrote:
| > The "single most economically important task" that a
| machine which can operate at a human (or superhuman)
| level, is "make a better version of itself" until that
| process hits a limit, followed by "maximise how many of
| you exist" until it runs out of resources.
|
| Lot of hidden assumptions here. How does "operating at
| human level" (an assumption itself) imply the ability to
| do this? Humans can't do this.
|
| We very specifically can't do this, we have sexual
| reproduction for a good reason.
|
| > If I imagine a world where every task that any human
| can perform can also be done at world expert level -- let
| alone at a superhuman level -- by a computer/robot (with
| my implicit assumption "cheaply"), I can't imagine why I
| would ever choose the human option.
|
| If the robot performs at human level, and it knows you'll
| always hire it over a human, why would it work for
| cheaper?
|
| If you can program it to work for free, then it's
| subhuman.
|
| If you're imagining something that's superhuman in only
| ways that are bad for you and subhuman in ways that would
| be good for you, just stop imagining it and you're good.
| Vetch wrote:
| > thought should feel empowered?
|
| This is a strange question since augmentation can be
| objectively measured even as its utility is contextual. With
| MidJourney I do not feel augmented because while it makes
| pretty images, it does not make precisely the pretty images I
| want. I find this useless, but for the odd person who is
| satisfied only with looking at pretty pictures, it might be
| enough. Their ability to produce pretty pictures to
| satisfaction is thus augmented.
|
| With GPT4 and Copilot, I am augmented in a speed instead of
| capabilities sense. The set of problems I can solve is not
| meaningfully enhanced, but my ability to close knowledge gaps
| is. While LLMs are limited in their global ability to help
| design, architect or structure the approach to a novel
| problem or its breakdown, they can suggest local tricks and
| implementation approaches I do not know but can verify as
| correct. And even when wrong, I can often work out how to fix
| their approach (this is still a speed up since I likely would
| not have arrived at this solution concept on my own). This is
| a significant augmentation even if not to the level I'd like.
|
| The reason capabilities are not much enhanced is that, to
| get the most out of LLMs, you need to be able to verify
| solutions due
| to their unreliability. If a solution contains concepts you
| do not know, the effort to gain the knowledge required to
| verify the approach (which the LLM itself can help with)
| needs to be manageable in reasonable time.
| xanderlewis wrote:
| > With GPT4 and Copilot...
|
| I am not a programmer, so none of this applies to me. I can
| only speak for myself, and I'm not claiming that _no one_
| can feel empowered by these tools - in fact it seems
| obvious that they can.
|
| I think programmers tend to assume that all other technical
| jobs can be attacked in the same way, which is not
| necessarily true. Writing code seems to be an ideal use
| case for LLMs, especially given the volume of data
| available on the open web.
| Vetch wrote:
| Which is why I say it is contextual and depends on the
| task. I'll note that it's not only programming ability
| that is empowered but learning math, electronics,
| history, physics and so on up to the university level. As
| long as you take small enough steps such that you are
| able to verify with external sources, you will move
| faster with than without.
|
| Writing it as "feel empowered" made it come across as if
| you meant the empowerment was illusory. My argument was
| that it is not merely a feeling but a real measurable
| difference.
| 4bpp wrote:
| Empowering to their users. A lot of things that empower their
| users necessarily disempower others, especially if we define
| power in a way that is zero-sum - the printing press
| disempowered monasteries and monks that spent a lifetime
| perfecting their book-copying craft (and copied books that no
| doubt were used in the training of would-be printing press
| operators in the process, too).
|
| It seems to me that the standard use of "empowering" implies
| in particular that you get more power for less effort - which
| in many cases tends to be democratizing, as hard-earned power
| tends to be accrued by a handful of people who dedicate most
| of their lives to pursuit of power in one form or another.
| With public schooling and printing, a lot of average people
| were empowered at the expense of nobles and clerics, who put
| in a lifetime of effort for the power literacy conveys in a
| world without widespread literacy. With AI, likewise, average
| people will be empowered at the expense of those who
| dedicated their life to learn to (draw, write good copy,
| program) - this looks bad because we hold those people in
| high esteem in a world where their talents are rare, but
| consider that following that appearance is analogously
| fallacious to loathing democratization of writing because of
| how noble the nobles and monks looked relative to the
| illiterate masses.
| xanderlewis wrote:
| I get why you might describe these tools as
| 'democratising', but it also seems rather strange when you
| consider that the future of creativity is now going to be
| dependent on huge datasets and amounts of computation only
| billion-dollar companies can afford. Isn't that _anything
| but_ democratic? Sure, you can ignore the zeitgeist and
| carry on with traditional dumb tools if you like, but
| you'll be utterly left behind.
| 4bpp wrote:
| Datasets can still be curated by crowds of volunteers
| just fine. I would likewise expect a crowdsourceable
| solution to compute to emerge eventually - unless the
| safetyists move to prevent this by way of legislation.
|
| When writing and printing emerged, they too depended on
| supply chains (for paper, iron, machining) and in the
| case of printing capital that were far out of the reach
| of the individual. Their utility and overlap with other
| mass markets resulted in those being commoditized in
| short order.
| gjulianm wrote:
| I feel like this analogy is not very appropriate. The main
| problem with AI generated images and videos is that, with every
| improvement, it becomes more and more difficult to distinguish
| what's real and what's not. That's not something that happened
| with literacy or printing press or computers.
|
| Think about it: the saturation of content on the Internet has
| become so bad that people are having a hard time knowing what's
| true or not, to the point that we're again having outbreaks
| preventable diseases such as measles because people can't
| identify what's real scientific information and what's not.
| Imagine what will happen when anyone can create an image of
| whatever they want that looks just like any other picture, or
| worse, video. We are not at all equipped to deal with that. We
| are risking a lot just for the ability to spend massive amounts
| of compute power on generating images. It's not curing cancer,
| not solving world hunger, not making space travel free, no:
| it's generating images.
| gpderetta wrote:
| I don't understand. Are you saying that before AI there was a
| reliable way to distinguish fiction from fact?
| gjulianm wrote:
| It definitely is easier without AI. Before, if you saw a
| photo you could be fairly confident that most of it was
| real (yes, photo manipulation exists but you can't really
| create a photo out of nothing). Videos, far more
| trustworthy (and yes, I know that there's some amazing 3D
| renders out there but they're not really accessible). With
| these technologies and the rate at which they're improving,
| I feel like that's going out of the window. Not to mention
| that the more content is generated, the easier it is
| for something fake to slip by.
| UberFly wrote:
| "it becomes more and more difficult to distinguish what's
| real and what's not" - Is literally what they said.
| laminatedsmore wrote:
| "grateful that literacy, the printing press, computers and the
| internet became normalised before this notion of "harm" and
| harm prevention was"
|
| Printing Press -> Reformation -> Thirty Years' War -> Millions
| Dead
|
| I'm sure that there were lots of different opinions at the time
| about what kind of harm was introduced by the printing press
| and what to do about it, and attempts to control information by
| the Catholic church etc.
|
| The current fad for 'safe' 'AI' is corporate and naive. But
| there's no simple way to navigate a revolutionary change in the
| way information is accessed / communicated.
| light_hue_1 wrote:
| Way to blame the printing press for the actions of religious
| extremists.
|
| The lesson isn't "printing press bad"; it's that extremist
| irrational belief in any entity is bad (whether it's
| religion, Trump, etc.).
| freedomben wrote:
| > _Way to blame the printing press for the actions of
| religious extremists._
|
| I don't see GP blaming the printing press for that, they're
| merely pointing out that one enabled the other, which is
| absolutely true. I'm damn near a free speech absolutist,
| and I think the heavy "safety" push by AI is well-meaning
| but will have unintended consequences that cause more harm
| than they are meant to prevent, but it seems obvious to me
| that they _can_ be used much the same as printing presses
| were by the extremists.
|
| > _The lesson isn't "printing press bad"; it's that
| extremist irrational belief in any entity is bad (whether
| it's religion, Trump, etc.)._
|
| Could not agree more
| samstave wrote:
| The printing press is the leading cause of tpyos!
| herculity275 wrote:
| It's not about assigning blame. A revolutionary technology
| enables revolutionary change and all sorts of bad actors
| will take advantage of it.
| biomcgary wrote:
| Safetyism has been the standard civic religion since 9/11,
| and I doubt it will go quietly into the night. Much like the
| bishops and the king had a symbiotic relationship to maintain
| control and limit change (e.g., King James of KJV Bible
| fame), the government and corporations have a similarly
| tense, but aligned, relationship. Boogeymen from the left or
| the right can always be conjured to provide the fear
| necessary to control.
|
| Would millions have died if the old religion gave way to the
| new one without a fight? The problem for the Vatican was that
| their rhetoric wasn't at top form after mentally stagnating
| for a few centuries since arguing with Roman pagans, so war
| was the only way to win.
|
| (Don't forget Luther's post hoc justification of killing
| 100k+ peasants, but he won because he had better rhetorical
| skills AND the backing of aristocrats and armies. https://en.
| wikipedia.org/wiki/Against_the_Murderous,_Thievin... and
| https://en.wikipedia.org/wiki/German_Peasants%27_War)
| kurthr wrote:
| "Think of the Children" has been the norm since long before
| it was re-popularized in the '80s for song lyrics, in the
| '90s for encryption, and now for everything else.
|
| I almost think it's the eras between that are more notable.
| EchoReflection wrote:
| "The Coddling of the American Mind" by Jonathan Haidt and
| Greg Lukianoff is a _very_ good (and troubling) book that
| talks a lot about "safetyism". I can't recommend it
| enough.
|
| https://jonathanhaidt.com/
|
| https://www.betterworldbooks.com/product/detail/the-
| coddling...
|
| https://www.audible.com/pd/The-Coddling-of-the-American-
| Mind...
| astrange wrote:
| It's strange that people think Stability is making
| decisions based on American politics when it isn't an
| American company and other countries generally have
| stricter laws in this area.
| dotancohen wrote:
| The current focus on "safety" (I would prefer a less gracious
| term) is based as much on fear as on morality: fear of
| government intervention and of woke morality. The progress in
| technology is astounding; the focus on sabotaging the
| publicly available versions of the technology to promote (and
| deny) narratives is despicable.
| fngjdflmdflg wrote:
| I agree. There should have been guardrails in place to
| prevent people who espouse extremist viewpoints like Martin
| Luther from spreading their dangerous and hateful rhetoric. I
| rest easy knowing that only people with the correct
| intentions will be able to use AI.
| miohtama wrote:
| The British banned the printing press in 1662 in the name of
| preventing harm:
|
| https://en.m.wikipedia.org/wiki/Licensing_of_the_Press_Act_1...
| freedomben wrote:
| Yes, and fortunately that banning was the end of hateful
| printed content. Since that ban, the only way to print
| objectionable material has been to do it by hand with pen and
| ink.
|
| (For clarity, I'm joking, and I know you're also not implying
| any such thing. I appreciate your comment/link)
| someuser2345 wrote:
| Harm prevention is definitely not new; books have been subject
| to censorship for centuries. Just look at the U.S., where we
| had the Hays Code and the Comics Code Authority. The only
| difference is that now, harm is defined by California tech
| companies rather than the Church or the Monarchy.
| ben_w wrote:
| I don't think your golden age ever truly existed -- the Overton
| Window for acceptable discourse has always been narrow, we've
| just changed who the in-group and out-groups are.
|
| The out group used to be atheists, or gays, or witches, or
| republicans (in the British sense of the word), or people who
| want to drink. And each of Catholics and Protestants made the
| other unwelcome across Europe for a century or two. When I was
| a kid, it was anyone who wanted to smoke weed, or (because UK)
| any normalised depiction of gay male relationships as being at
| all equivalent to heterosexual ones[0]. I met someone who was
| embarrassed to admit they named their son "Hussein"[1], and
| there was _absolutely_ no tolerance for any suggestion that
| ecstasy was anything other than evil. I know at least one
| trans person who _started_
| out of the closet, but was very eager to go _into_ the closet.
|
| [0] "promote the teaching in any maintained school of the
| acceptability of homosexuality as a pretended family
| relationship" - https://en.wikipedia.org/wiki/Section_28
|
| [1] https://en.wikipedia.org/wiki/Hussein
| jsight wrote:
| The core problem is centralization of control. If everyone uses
| their own desktop computer, then everyone is responsible for
| their own behavior.
|
| If everyone uses Hosting Service F, then at some point people
| will blur the lines and expect "Hosting Service F" to remove
| vulgar or offensive content. The lines themselves will be a
| zeitgeist of sorts with inevitable decisions that are
| acceptable to some but not all.
|
| Can you even blame them? There are lots of ways for this to go
| wrong, and no one wants to be on the wrong side of a PR blast.
|
| So heavy guardrails are effectively inevitable.
| __loam wrote:
| I'm sure the millions of people they stole the data from feel
| very empowered.
| hizanberg wrote:
| IMO the "safety" in Stable Diffusion is becoming more overzealous
| where most of my images are coming back blurred, where I no
| longer want to waste my time writing a prompt only for it to
| return mostly blurred images. Prompts that worked in previous
| versions like portraits are coming back mostly blurred in SDXL.
|
| If this next version is just as bad, I'm going to stop using
| Stability APIs. Are there any other text-to-image services that
| offer similar value and quality to Stable Diffusion without the
| overzealous blurring?
|
| Edit:
|
| Example prompts like "Matte portrait of Yennefer" return 8/9
| blurred images [1].
|
| [1] https://imgur.com/a/nIx8GBR
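|
| For reference, this is via the hosted REST API; a minimal
| sketch of the kind of call involved (the endpoint and field
| names are my best recollection of the v1 docs -- treat them
| as assumptions and check the current reference):
|
|     import requests
|
|     resp = requests.post(
|         "https://api.stability.ai/v1/generation/"
|         "stable-diffusion-xl-1024-v1-0/text-to-image",
|         headers={"Authorization": "Bearer YOUR_API_KEY",
|                  "Accept": "application/json"},
|         json={"text_prompts": [
|                   {"text": "Matte portrait of Yennefer"}],
|               "samples": 9, "steps": 30},
|     )
|     resp.raise_for_status()
|     for artifact in resp.json()["artifacts"]:
|         # "CONTENT_FILTERED" marks results the safety filter
|         # intervened on (the ones that come back blurred).
|         print(artifact.get("finishReason"))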
| nickthegreek wrote:
| Run it locally.
| lolinder wrote:
| I haven't tried SD3, but my local SD2 regularly has this
| pattern where, while the image is developing, it looks like
| it's coming along fine, and then suddenly in the last few
| rounds it introduces weird artifacts that mask faces. Running
| locally doesn't get around censorship that's baked into the
| model.
|
| I tend to lean towards SD1.5 for this reason--I'd rather put
| in the effort to get a good result out of the lesser model
| than fight with a black box censorship algorithm.
|
| EDIT: See the replies below. I might just have been holding
| it wrong.
| yreg wrote:
| Do you use the proper refiner model?
| lolinder wrote:
| Probably not, since I have no idea what you're talking
| about. I've just been using the models that InvokeAI
| (2.3, I only just now saw there's a 3.0) downloads for me
| [0]. The SD1.5 one is as good as ever, but the SD2 model
| introduces artifacts on (many, but not all) faces and
| copyrighted characters.
|
| EDIT: based on the other reply, I think I understand what
| you're suggesting, and I'll definitely take a look next
| time I run it.
|
| [0] https://github.com/invoke-ai/InvokeAI
| yreg wrote:
| SDXL should be used together with a refiner. You can
| usually see the refiner kicking in if you have a UI that
| shows you the preview of intermediate steps. And it can
| sometimes look like the situation you describe (straining
| further away from your desired result).
|
| Same goes for upscalers, of course.
| SV_BubbleTime wrote:
| Basically don't use SD2.x, it's trash and the community
| rejected it.
|
| If you are using invoke, try XL.
|
| If you want to really dial into a specific style or apply
| a specific LoRA, use 1.5.
| fnordpiglet wrote:
| Be sure to turn off the refiner. This sounds like you're
| using models that aren't aligned with their base models,
| and the refiner runs in the last steps. If the prompt is
| out of alignment with the default base model, it'll heavily
| distort. Personally, with SDXL I never use the refiner; I
| just use more steps.
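|
| In the common diffusers stack that looks roughly like this
| (a sketch assuming the stock SDXL base checkpoint; the model
| name and step count are illustrative):
|
|     import torch
|     from diffusers import StableDiffusionXLPipeline
|
|     # Load only the base model; no refiner stage at all.
|     pipe = StableDiffusionXLPipeline.from_pretrained(
|         "stabilityai/stable-diffusion-xl-base-1.0",
|         torch_dtype=torch.float16,
|     ).to("cuda")
|
|     # Compensate for skipping the refiner with more steps.
|     image = pipe("matte portrait of a sorceress",
|                  num_inference_steps=50).images[0]
|     image.save("portrait.png")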
| lolinder wrote:
| That makes sense. I'll try that next time!
| zettabomb wrote:
| SD2 isn't SDXL. SD2 was a continuation of the original
| models that didn't see much success. It didn't have a
| refiner.
| cchance wrote:
| Well yeah, because SD2 literally had purposeful censorship
| of the base model and the CLIP, which basically made it
| DOA to the entire open-source community that was
| dedicated to 1.5. SDXL wasn't so bad, so it gained traction,
| but 1.5 is still the king because it was from before the
| damn models were gimped at the knees and relied on
| workarounds and insane finetunes just to get basic
| anatomy correct.
| hizanberg wrote:
| I don't expect my current desktop will be able to handle it,
| which is why I'm happy to pay for API access, but my next
| desktop should be capable.
|
| Is the OSS'd version of SDXL less restrictive than their
| API-hosted version?
| nickthegreek wrote:
| If you run into issues, switch to a fine-tuned model from
| civitai.
| yreg wrote:
| You can set up the same thing you would have locally on
| some spot cloud instance.
| Tenoke wrote:
| The nice thing about Stable Diffusion is that you can very
| easily set it up on a machine you control without any 'safety'
| and with a user-finetuned checkpoint.
| cyanydeez wrote:
| They're nerfing the models, not just the prompt engineering.
|
| After SD1.5 they started directly modifying the dataset.
|
| It's only other users who "restore" the porno.
|
| And that's what we're discussing: there's a real concern
| about it as a public offering.
| Tenoke wrote:
| Sure, but again, if you run it yourself you can use the
| user-finetuned checkpoints that have it.
| cyanydeez wrote:
| Yes, but the GP is discussing the API, and specifically
| the company that offers the base model.
|
| They don't want to offer anything that's legally
| dubious, and it's not hard to understand why.
| jncfhnb wrote:
| No it's not. It's perfectly reasonable not to want to
| generate porn for customers.
|
| The models being open sourced makes them very easy to turn
| into the most depraved porno machines ever conceived. And
| they are.
|
| It is in no way a meaningful barrier to what people can do.
| That's the benefit of open source software.
| lancesells wrote:
| I don't use it at all but do you mind sharing what prompts
| don't work?
| hizanberg wrote:
| The last prompt I tried, "Matte portrait of Yennefer",
| returned 8/9 blurred images [1].
|
| [1] https://imgur.com/a/nIx8GBR
| not2b wrote:
| It appears that they are trying to prevent generating
| accurate images of a real person, because they are worried
| about deepfakes, and this produces the blurring. While
| Yennefer is a fictional character, she's played by a real
| actress on Netflix, so maybe that's what is triggering the
| filter.
| NoMoreNicksLeft wrote:
| Wait, blurring (black) means that it objected to the content? I
| tried it a few times on one of the online/free sites
| (Huggingspace, I think) and I just assumed I'd gotten a
| parameter wrong.
| pksebben wrote:
| Not necessarily, but it can. Black squares can come from a
| variety of problems.
| gangstead wrote:
| I've never seen blurring in my images. Is that something that
| they add when you do API access? I'm running SD 1.5 and SDXL
| 1.0 models locally. Maybe I'm just not prompting for things
| they deem naughty. Can you share an example prompt where the
| result gets blurred?
| jncfhnb wrote:
| If you run locally with the basic stack, it's literally a bool
| flag to hide NSFW content. It's trivial to turn off, and it's
| off by default in most open-source setups.
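|
| For SD1.5 in the diffusers library, for example, the switch
| is the safety_checker argument (a sketch; the checkpoint
| name is just the stock one):
|
|     from diffusers import StableDiffusionPipeline
|
|     # Passing safety_checker=None skips the post-generation
|     # NSFW filter; by default a flagged result is replaced
|     # with a blank (black) image.
|     pipe = StableDiffusionPipeline.from_pretrained(
|         "runwayml/stable-diffusion-v1-5",
|         safety_checker=None,
|         requires_safety_checker=False,
|     )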
| stavros wrote:
| It's a filter they apply after generation.
| araes wrote:
| Taking the actual example you provided, I can understand the
| issue, since it amounts to blurring images of a virtual
| character that are not actually "naughty." Equivalent images
| in bulk quantity are available on every search engine with
| "yennefer witcher 3 game" [1][2][3][4][5][6], which returns
| almost exactly the generated images, just not blurred.
|
| [1] Google:
| https://www.google.com/search?sca_esv=a930a3196aed2650&q=yen...
|
| [2] Bing via Ecosia:
| https://www.ecosia.org/images?q=yennefer%20witcher%203%20gam...
|
| [3] Bing:
| https://www.bing.com/images/search?q=yennefer+witcher+3+game...
|
| [4] DDG:
| https://duckduckgo.com/?va=e&t=hj&q=yennefer+witcher+3+game&...
|
| [5] Yippy:
| https://www.alltheinternet.com/?q=yennefer+witcher+3+game&ar...
|
| [6] Dogpile:
| https://www.dogpile.com/serp?qc=images&q=yennefer+witcher+3+...
| ametrau wrote:
| "Safety" = safe to our reputation. It's insulting how they imply
| safety from "harm".
| kingkawn wrote:
| So they should dash their company on the rocks of your empty
| moral positions about freedom?
| dingnuts wrote:
| should pens be banned because a talented artist could draw a
| photorealistic image of something nasty happening to someone
| real?
| mrighele wrote:
| Photoshop and the likes (modern day's pens) should have an
| automatic check that you are not drawing porn, censor the
| image and report you to the authorities if it thinks it
| involves minors.
|
| edit: yes it is sarcasm, though I fear somebody will think
| it is in fact the right way to go.
| mtlmtlmtlmtl wrote:
| That's ridiculous. What about real pens and paintbrushes?
| Should they be mandated to have a camera that analyses
| everything you draw/write just to be "safe"?
|
| Maybe we should make it illegal to draw or write anything
| without submitting it to the state for "safety" analysis.
| gambiting wrote:
| I hope that's sarcasm.
| IMTDb wrote:
| Text editors and the likes (the modern day's typewriters)
| should have an automatic check that you are not
| criticizing the government, censor the text, and report
| you to the authorities if it thinks you support an
| alternative political party.
|
| Hopefully you are going to be absolutely shocked by the
| prospect of the above sentence. But as you can see,
| surveillance is a slippery slope. "Safety" is a very
| dangerous word because everybody wants to be "safe" but
| no one is really ready to define what "safe" actually
| means. The moment we start baking cultural / political /
| environmental preferences and biases in the tools we use
| to produce content, we allow other group of people with
| different views to use those "safeguards" to harm us or
| influence us in ways we might not necessarily like.
|
| The safest notebook I can find is indeed a simple pen and
| paper because it does not know or care what is being
| written; it just does its best regardless of how amazing
| or horrible the content is.
| jameshart wrote:
| Safety also protects people trying to make use of the
| technology at scale for the most benign use cases.
|
| Want to install a plugin into Wordpress to autogenerate fun
| illustrations to go at the top of the help articles in your
| intranet? You probably don't want the model to have a 1 in 100
| chance of outputting porn or extreme violence.
| SubiculumCode wrote:
| It is interesting to me that these diffusion image models are so
| much smaller than the LLMs.
| sjm wrote:
| The example images look so bad. Absolutely zero artistic value.
| wongarsu wrote:
| From a technical perspective they are impressive. The depth of
| field in the classroom photo and the macro shot. The detail in
| the chameleon. The perfect writing in very different styles and
| fonts. The dust kicked up by the donut.
|
| The artistic value is something you have to add with a good
| prompt with artistic vision. These images are probably the AI
| equivalent of "programmer art". It fulfills its function, but
| lacks aesthetic considerations. I wouldn't attribute that to
| the model just yet.
| the_duke wrote:
| I'm willing to bet that they are avoiding artistic images on
| purpose to not get any heat from artists feeling ripped off,
| which did happen previously.
| robertwt7 wrote:
| It'll be interesting to see what "safety" means in this case,
| given the censorship in diffuser models nowadays. Look at
| what's happening with Gemini; it's quite scary really how
| different companies have different censorship values.
|
| I've had my fair share of frustration with DALL-E as well
| when trying to generate weapon images for game assets. I had
| to tweak my prompt a lot.
| yreg wrote:
| > it's quite scary really how different companies have
| different censorship values
|
| The fact that they have censorship values is scary. But the
| fact that those are different is better than the alternative.
| declan_roberts wrote:
| Can it generate an image of people without injecting insufferable
| diversity quotas into each image? If so then it's the most
| advanced model on the internet right now!
| miohtama wrote:
| No model. Half of the announcement text is "we are really,
| really responsible and safe, believe us."
|
| Kind of a dud for an announcement.
| nextworddev wrote:
| The company itself is about to run out of money, hence the
| Hail Mary of trying to get acquired.
| yreg wrote:
| They raised 110M in October. How much are they burning and
| how? Training each model allegedly costs hundreds of k.
| haolez wrote:
| Rewriting the "safety" part, but replacing the AI tool with an
| imaginary knife called Big Knife:
|
| "We believe in safe, responsible knife practices. This means we
| have taken and continue to take reasonable steps to prevent the
| misuse of Big Knife by bad actors."
| animex wrote:
| Ugh, another startup(?) requiring Discord to use their product.
| :(
| tavavex wrote:
| As far as I know, the Discord thing is only for doing early
| testing among their community. The full model releases are
| posted to Hugging Face.
| 13of40 wrote:
| "we have taken and continue to take reasonable steps to prevent
| the misuse of Stable Diffusion 3 by bad actors"
|
| It's kind of a testament to our times that the person who chooses
| to look at synthetic porn instead of supporting a real-life human
| trafficking industry is the bad actor.
| user_7832 wrote:
| Agree, I think it fundamentally stems from the old conservative
| view that porn = bad. Morally policing such models is
| questionable.
| rockooooo wrote:
| No AI company wants to be the one generating pornographic
| deepfakes of someone and getting into legal / PR hot water.
| seanw444 wrote:
| Which is why this should be a much more decentralized
| effort. Hard to take someone to court when it's not one
| single person or company doing something.
| mrkramer wrote:
| But what if you flip things the other way around: deepfake
| porn is problematic not because porn is per se problematic
| but because deepfake porn or deepfake revenge porn is made
| without consent. So what if you give consent to some AI
| company or porn company to make porn content of you? I see
| this as an evolution of OnlyFans where you could make
| AI-generated deepfake porn of yourself.
|
| Another use case would be that retired porn actors could
| license their porn persona (face/body) to some AI porn
| company to make new porn.
|
| I see a big business opportunity in generative AI porn.
| Cookingboy wrote:
| This is why I think generative AI tech should either be
| banned or be completely open sourced. Mega tech corporations
| are plenty of things already, they don't need to be the
| morality police for our society too.
| pksebben wrote:
| Even if it is all open sourced, we still have the
| structural problem of training models large enough to do
| interesting stuff.
|
| Until we can train incrementally and distribute the
| workload scalably, it doesn't matter how open the models /
| methods for training are if you still need a bajillion
| A100 hours to train the damn things.
| echelon wrote:
| Horseshoe theory [1] is one of the most interesting viewpoints
| I've been introduced to recently.
|
| Both sides view censorship as a moral prerogative to enforce
| their world view.
|
| Some conservatives want to ban depictions of sex.
|
| Some conservatives want to ban LGBT depictions.
|
| Some women's rights folks want to ban depictions of sex.
| (Some view it as empowerment, some view it as exploitation.)
|
| Some liberals want to ban non-diverse, dangerous
| representation.
|
| Some liberals want to ban conservative views against their
| thoughts.
|
| Some liberals want to ban religion.
|
| ...
|
| It's team sports with different flavors on each side.
|
| The best policy, IMO, is to avoid centralized censorship and
| allow for individuals to control their own algorithmic
| boosting / deboosting.
|
| [1] https://en.wikipedia.org/wiki/Horseshoe_theory
| stared wrote:
| Yes and no.
|
| I mean, a lot of moderates would like to avoid seeing any
| extreme content, regardless of whether it is too much left,
| right, or just in a non-political uncanny valley.
|
| While the Horseshoe Theory has some merits (e.g., both left
| and right extremes may favor justified coercion, have the
| us-vs-them mentality, etc.), it is grossly oversimplified.
| Still, even a very simple (yet two-dimensional) model
| like the Political Compass is much better.
| echelon wrote:
| I think it's just a different projection to highlight
| similarities in left and right and is by no means the
| only lens to use.
|
| The fun quirk is that there are similarities, and this
| model draws comparison front and center.
|
| There are multiple useful models for evaluating politics,
| though.
| crashmat wrote:
| I don't think there are any (even far) left wanting to ban
| non-diverse representation. I think it's impossible to ban
| 'conservative thoughts' because that's such a poorly
| defined phrase. However there are people who want to ban
| religion. One difference is that a much larger proportion
| of far right (almost all of them) want to ban lgbtq
| depiction and existence compared to the number of far left
| who want to ban religion or non-diverse representation.
|
| It says on the wikipedia article itself 'The horseshoe
| theory does not enjoy wide support within academic circles;
| peer-reviewed research by political scientists on the
| subject is scarce, and existing studies and comprehensive
| reviews have often contradicted its central premises, or
| found only limited support for the theory under certain
| conditions.'
| echelon wrote:
| > I don't think there are any (even far) left wanting to
| ban non-diverse representation.
|
| Look at the rules to win an Oscar now.
|
| To cite a direct and personal case, I was involved in
| writing code for one of the US government's COVID bailout
| programs, the Restaurant Revitalization Fund. Billions of
| dollars of relief, but targeted to non-white, non-male
| restaurant owners. There was a lawsuit after the fact to
| stop the unfair filtering, but it was too late and the
| funds were completely dispensed. That felt really gross
| (though many of my colleagues cheered and even jeered at
| the complainers).
|
| > I think it's impossible to ban 'conservative thoughts'
| because that's such a poorly defined phrase.
|
| I commented in /r/conservative (which I was banned from)
| a few times, and I was summarily banned from five or six
| other subreddits by some heinous automation. Guilt by
| association. Except it wasn't even -- I was adding
| commentary in /r/conservative to ask folks to sympathize
| more with trans folks. Both sides here ideologically ban
| with impunity and can be intolerant of ideas they don't
| like.
|
| I got banned from my city's subreddit for posting a
| concern about crime. Or maybe they used these same
| automated, high-blast radius tools. I'm effectively cut
| out of communication with like-minded people in my city.
| I think that's pretty fucked.
|
| Mastodon instances are set up to ban on ideology...
|
| This is all wrong and a horrible direction to go in.
|
| It doesn't matter what _your_ views are, I think we all
| need to be more tolerant and empathetic of others. Even
| those we disagree with.
| asddubs wrote:
| this comment reminds me of that "did you know good things
| and bad things are actually the same" tweet
| echelon wrote:
| I'm sorry, but censorship and partisanship are not good
| things.
|
| Both sides need to get a grip, start meeting in the
| middle, and generally let each be to their own.
|
| Platforms weighing in on this makes it even worse and
| more polarizing.
|
| We shouldn't be so different and disagreeable. We have
| more in common with one another than not.
|
| The points of polarization on each end rhyme with one
| another.
| sigmoid10 wrote:
| I don't think the problem is watching synthetic images. The
| problem is generating them based off actual people and sharing
| them on the internet in a way that the people watching can't
| tell the difference anymore. This was already somewhat of a
| problem with Photoshop, and once everyone with zero skills can
| do it in seconds and with far better quality, it will become a
| nightmare.
| 725686 wrote:
| We are already there, you can no longer trust any image or
| video you see, so what is the point? Bad actors will still be
| able to create fake images and videos as they already do.
| Limiting it for the average user is stupid.
| mplewis wrote:
| You guys know you can just draw porn, right?
| seanmcdirmid wrote:
| Generating porn is easier and cheaper. You don't have to
| spend the time learning to draw naked bodies, which can
| be substantial. (The joke being that serious artists go
| through a lot of nude-model drawing sessions, but that
| isn't porn.)
| tourmalinetaco wrote:
| > but it isn't porn
|
| In my experience with 2D artists, studying porn is one of
| their favorite forms of naked model practice.
| seanmcdirmid wrote:
| The models art schools get for naked drawing sessions
| usually aren't that attractive, definitely not at a porn
| ideal. The objective is to learn the body, not become
| aroused.
|
| There is a lot of (mostly non realistic) porn that comes
| out of art school students via the skills they gain.
| sigmoid10 wrote:
| We are not actually there yet. First, you still need some
| technical understanding and a somewhat decent setup to run
| these models yourself without the guardrails. So the
| average greasy dude who wants to share HD porn based on
| your daughter's LinkedIn profile pic on nsfw subreddits
| still has too many hoops to jump through. Right now you can
| also still spot AI images pretty easily, if you know what
| to look for. Especially for previous stable diffusion
| models. But all of this could change very soon.
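|
| (For reference, here is roughly what that local setup looks
| like, as a minimal sketch using the open-source diffusers
| library -- the SD 1.5 checkpoint name and API calls are
| common usage as of early 2024, not anything specific to SD3.
| The point is that the "guardrail" is a post-hoc output
| filter, which a local user can simply switch off.)
|
|     # pip install diffusers transformers torch
|     from diffusers import StableDiffusionPipeline
|     import torch
|
|     pipe = StableDiffusionPipeline.from_pretrained(
|         "runwayml/stable-diffusion-v1-5",
|         torch_dtype=torch.float16,
|     ).to("cuda")  # needs a reasonably recent NVIDIA GPU
|
|     # The NSFW check is a separate classifier applied to the
|     # finished image; locally it is trivially removable.
|     pipe.safety_checker = None
|
|     image = pipe("an astronaut riding a horse").images[0]
|     image.save("out.png")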
| Salgat wrote:
| I'll challenge this idea and say that once it becomes
| ubiquitous, it actually does more good than harm. Things like
| revenge porn become pointless if there's no way to prove it's
| even real, and I have yet to see porn deepfakes amount
| to anything.
| idle_zealot wrote:
| > once everyone with zero skills can do it in seconds and
| with far better quality, it will become a nightmare.
|
| Will it be a nightmare? If it becomes so easy and common that
| anyone can do it, then surely trust in the veracity of
| damaging images will drop to about 0. That loss of trust
| presents problems, but not ones that "safe" AI can solve.
| foobarian wrote:
| Arguably that loss of trust would be a net positive.
| sigmoid10 wrote:
| >surely trust in the veracity of damaging images will drop
| to about 0
|
| Maybe, eventually. But we don't know how long it will take
| (or if it will happen at all). And the time until then will
| be a nightmare for every single woman out there who has any
| sort of profile picture on any website. Just look at how
| celebrity deepfakes got reddit into trouble even though
| their generation was vastly more complex and you could
| still clearly tell that the videos were fake. Now imagine
| everyone can suddenly post undetectable nude selfies of
| your girlfriend on nsfw subreddits. Even if people
| eventually catch on, that first shock will be unavoidable.
| jquery wrote:
| The tide is rolling in and we have two options... yell at
| the tide really loud that we were here first and we
| shouldn't have to move... or get out of the way. I'm a
| lot more sympathetic to the latter option myself.
| swatcoder wrote:
| Your anxiety dream relies on there currently being some
| _technical_ bottleneck limiting the creation or spread of
| embarrassing fake nudes as a way of cyberbullying.
|
| I don't see any evidence of that. What I see is that
| people who want to embarrass and bully others are already
| fully enabled to do so, and do so.
|
| It seems more likely to me and many of us that the
| bottleneck that stops it from being worse is simply that
| only so many people think it's reasonable or satisfying
| to distribute embarrassing fake nudes of someone. Society
| already shuns it and it's not that effective as a way of
| bullying and embarrassing people, so only so many people
| are moved to bother.
|
| Assuming that the hyped-up new product is due to swoop in
| and disrupt the cyberbullying "industry" is just a
| classic technologist's fantasy.
|
| It ignores all the boring realities of actual human
| behavior, social norms, secure equilibria, and so on;
| skips any evidence-building or research effort; and just
| presumes that some new technology is just sooooo powerful
| that none of that prior ground truth stuff matters.
|
| I get why people who think that way might be on HN or in
| some Silicon Valley circles, but it can be one of the
| eyeroll-inducing vices of these communities as much as it
| can be one of their motivational virtues.
| mdasen wrote:
| This: it won't happen immediately, and I'd go even further
| and say that even if trust in images drops to zero,
| it's still going to generate a lot of hell.
|
| I've always been able to say all sorts of lies. People
| have known for millennia that lies exist. Yet lies still
| hurt people a ton. If I say something like, "idle_zealot
| embezzled from his last company," people know that could
| be a lie (and I'm not saying you did, I have no idea who
| you are). But that kind of stuff can certainly hurt
| people. We all know that text can be lies and therefore
| we should have zero trust in any text that we read - yet
| that isn't how things play out in the real world.
|
| Images are compelling even if we don't trust that they're
| authentic. Hell, paintings were used for thousands of
| years to convey "truth", but a painting can be a lie just
| as much as text or speech.
|
| We created tons of religious art in part because it makes
| the stories people want others to believe more concrete
| for them. Everyone knows that "Christ in the Storm on the
| Sea of Galilee" isn't an authentic representation of
| anything. It was painted in 1633, more than a millennium
| and a half after the event was purported to have happened.
| But it's still the kind of thing that's powerful.
|
| An AI generated image of you writing racist graffiti is
| way more believable to be authentic. I have no reason to
| think you'd do such a thing, but it's within the realm of
| possibility. There's zero possibility (disregarding
| supernatural possibilities) that Rembrandt could
| accurately represent his scene in "Christ in the Storm on
| the Sea of Galilee". What happens when all the search
| engine results for your name start calling you a racist -
| even when you aren't?
|
| The fact is that even when we know things can be faked,
| we still put a decent amount of trust in them. People
| spread rumors all the time. Did your high school not have
| a rumor mill that just kinda destroyed some kids?
|
| Heck, we have right-wing talking heads making up
| outlandish nonsense that's easily verifiable as false
| that a third of the country believes without questioning.
| I'm not talking about stuff like taxes or gun control or
| whatever - they're claiming things like schools having to
| have litter boxes for students that identify as cats (htt
| ps://en.wikipedia.org/wiki/Litter_boxes_in_schools_hoax).
| We know that people lie. There should be zero trust in a
| statement like "schools are installing litter boxes for
| students that identify as cats." Yet it spread like
| crazy, many people still believe it despite it being
| proven false, and it has been used to harm a lot of LGBT
| students. That's a way less believable story than an AI
| image of you with a racist tattoo.
|
| Finally, no one likes their name and image appropriated
| for things that aren't them. We don't like lies being
| spread about us even if 99% of people won't believe the
| lies. Heck, we see Donald Trump go on rants about
| truthful images of him that portray his body in ways he
| doesn't like (and they're just things like him golfing,
| but an unflattering pose). I don't want fake naked images
| of me even if they're literally labeled as fake. It still
| feels like an invasion of privacy and in a lot of ways it
| would end up that way - people would debate things like
| "nah, her breasts probably aren't that big." Words can
| hurt. Images can hurt even more - even if it's all lies.
| There's a reason why we created paintings even when we
| knew that paintings weren't authentic: images have power
| and that power is going to hurt people even more than the
| words we've always been able to use for lies.
|
| tl;dr: 1) It will take a long time before people's trust
| in images "drops to zero"; 2) Even when people know an
| image isn't real, it's still compelling - it's why
| paintings have existed and were important politically for
| millennia; 3) We've always known speech and text can be
| lies, but we regularly see lies believed and hugely
| damage people's lives - and images will always be more
| compelling than speech/text; 4) Even if no one believes
| something is true, there's something psychologically
| damaging about someone spreading lies about you - and
| it's a lot worse when they can do it with imagery.
| IanCal wrote:
| > If it becomes so easy and common that anyone can do it,
| then surely trust in the veracity of damaging images will
| drop to about 0
|
| People believe plenty of just written words - which are
| extremely easy to "fake"; you just type them. Why has that
| trust not dropped to about 0?
| UberFly wrote:
| Exactly. They are giving people's deductive reasoning
| skills too much credit.
| Al-Khwarizmi wrote:
| It kind of has? People believe written words when they
| come from a source that they consider, erroneously or
| not, to be trustworthy (newspaper, printed book,
| Wikipedia, etc.). They trust the source, not the words
| themselves just due to being written somewhere.
|
| This has so far not been true of videos (e.g. a video of
| a celebrity from a random source has typically been
| trusted by laypeople), but that should now change.
| Sohcahtoa82 wrote:
| > if it becomes so easy and common that anyone can do it,
| then surely trust in the veracity of damaging images will
| drop to about 0.
|
| Spend more time on Facebook and you'll lose your faith in
| humanity.
|
| I've seen obviously AI generated pictures of a 5 year old
| holding a chainsaw right next to a beautiful wooden
| sculpture, and the comments are filled with boomers amazed
| at that child's talent.
|
| There are still people that think the IRS will call them
| and make them pay their taxes over the phone with Apple
| gift cards.
| SkyBelow wrote:
| If we follow the logic of safety, should we either restrict
| the internet so that such users can safely use it (and
| phones, gift cards, technology in general) without being
| scammed, or restrict it so that at-risk individuals can't
| use the technology at all?
|
| Otherwise, why is AI specifically being targeted, other
| than a fear of new things that looks a lot like the moral
| panics over video games?
| themoonisachees wrote:
| In concept this is maybe desirable; boot anyone off the
| internet that isn't able to use it safely.
|
| In reality this is a disaster. The elderly and homeless
| people are already being left behind massively by a
| society that believes internet access is something
| everybody everywhere has. This is somewhat fine when the
| thing they want to access is twitter (and even then, even
| with the current state of twitter, who are you to judge
| who should and should not be on it?), but it becomes a
| Major Problem(tm) when the thing they want to access is
| their bank. Any technological solutions you just thought
| of for this problem are not sufficient when we're
| talking about "can everybody continue to live their lives,
| considering we've kinda thrust the internet on them
| without them asking?"
| BryantD wrote:
| Let me give you a specific counterexample: it's easy and
| common to generate phishing emails. Trust in email has not
| dropped enough to make phishing a non-problem.
| Al-Khwarizmi wrote:
| Phishing emails mostly work because they apparently come
| from a trusted source, though. The key is that they fake
| the source, not that people will just trust random
| written words just because they are written, as they do
| with videos.
|
| A better analogy would be Nigerian prince emails, but
| only a tiny minority of people believe those... or at
| least that's what I want to think!
| BryantD wrote:
| The trusted source thing is important, but there's some
| degree of evidence that videos and images generate trust
| in a source, I think?
| amenhotep wrote:
| That's the point. They do, but they _no longer should_.
| Our technical capabilities for lying have begun to
| overwhelm the old heuristics, and the sooner people
| realise the better.
| monitorlizard wrote:
| Perhaps I'm being overly contrarian, but from my point of
| view, I feel that could be a blessing in disguise. For
| example, in a world where deepfake pornography is ubiquitous,
| it becomes much harder to tarnish someone's reputation
| through revenge porn, real or fake. I'm reminded of Syndrome
| from The Incredibles: "When everyone is super no one will
| be."
| fimdomeio wrote:
| The censoring of porn content exists for PR reasons. They
| just want to have a way to say "we tried to prevent it". If
| anyone wants to generate porn, it just takes 30 min of
| research to find the huge number of Stable Diffusion-based
| models with nsfw content.
|
| If you can generate synthetic images and have a channel to
| broadcast them, you could create way bigger problems
| than fake celebrity porn.
|
| Not saying that it is not a problem, but rather that it is a
| problem inherent to the whole tool, not to some specific
| subjects.
| boringuser2 wrote:
| If that ever becomes an actual problem, our entire society
| will be at a filter point.
|
| This is the problem with these kinds of incremental
| mitigations philosophically -- as soon as the actual problem
| were to manifest it would instantly become a civilization-
| level threat that would only be resolved with drastic
| restructuring of society.
|
| Same logic for an AI that replaces a programmer. As soon as
| AI is that advanced the problem requires vast changes.
|
| Incremental mitigations don't do anything.
| cooper_ganglia wrote:
| I watched an old Tom Scott video of him predicting what the
| distant year 2030 would look like. In his talk, he mentioned
| privacy becoming something quaint that your grandparents used
| to believe in.
|
| I've wondered for a while if we just adapt to the point that
| we're unfazed by fake nude photos of people. The recent Bobbi
| Althoff "leaks" reminded me of this. That's a little
| different since she's a public figure, but I really wonder if
| we just go into the future assuming all photos like that have
| been faked, and if someone's iCloud gets leaked now it'll
| actually be less stressful because 1. they can claim the
| images are AI, or 2. there are already lewd AI images of
| them, so the real ones leaking don't make much of a
| difference.
| flir wrote:
| There's an argument that privacy (more accurately
| anonymity) is a temporary phenomenon, a consequence of the
| scale that comes with industrialization. We didn't really
| have it in small villages, and we won't really have it in
| the global village.
|
| (I'm not a fan of the direction, but then I'm a product of
| stage 2).
| Szpadel wrote:
| Serious question: is it really that hard to remove personal
| information from the training data so the model doesn't know
| what specific public figures look like?
|
| I believe this worked with nudity: the model, when asked,
| generated "smooth" intimate regions (like some kind of doll).
|
| So you could ask for e.g. a generic president but not any
| specific one, making it very hard to generate anyone
| specific.
| amenhotep wrote:
| Proprietary, inaccessible models can somewhat do that.
| With locally hosted models, the user can simply teach the
| model what a specific person looks like; you just need a
| couple dozen photos. Keyword: LoRA.
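|
| (For illustration, the low-rank idea behind LoRA in a few
| lines of PyTorch -- a minimal sketch; the layer width, rank,
| and variable names are illustrative assumptions, not taken
| from any particular Stable Diffusion codebase. Because only
| the tiny A and B matrices are trained, a couple dozen photos
| can be enough data.)
|
|     import torch
|
|     d_out, d_in, r = 768, 768, 8  # rank r << layer width
|     W = torch.randn(d_out, d_in)  # frozen pretrained weight
|     # Only A and B are trained; B starts at zero so the
|     # update is initially a no-op on the pretrained model.
|     A = torch.randn(r, d_in, requires_grad=True)
|     B = torch.zeros(d_out, r, requires_grad=True)
|     alpha = 16.0  # scaling hyperparameter
|
|     def lora_forward(x):
|         # frozen path plus the scaled low-rank correction
|         return x @ W.T + (alpha / r) * (x @ A.T @ B.T)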
| fennecbutt wrote:
| But just like with privacy issues, this'll be possible.
|
| It's only bad because society still hasn't normalised sex;
| from a gay perspective, y'all are prude af.
|
| The shortcut is for us to just accept that these social
| ideals and expectations will have to change, so we may as
| well change them now.
|
| In 100 years, people will be able to make a personal AI that
| looks, sounds and behaves like any person they want and does
| anything they want. We'll have thinking dust; you can already
| buy cameras around a mm^2 in size, and in the future I imagine
| they'll be even smaller.
|
| At some point it's going to get increasingly unproductive
| trying to safeguard technology without people's social
| expectations changing.
|
| Same thing with Google Glass: shunned pretty much
| exclusively bc it had a camera on it (even tho phones at the
| time did too). But now we've got Ray-Ban camera glasses, and
| 50 years from now all glasses will have cameras, if we even
| still wear them.
| spazx wrote:
| Yes this. This is what I've been trying to explain to my
| friends.
|
| When Tron came out in 1982, it was disliked because back
| then using CGI effects was considered "cheating". Then
| a while later Pixar did movies entirely with CGI and they
| were hits. Now almost every big studio movie uses CGI.
| Shunned to embraced in like, 13 years.
|
| I think over time the general consensus on AI
| models will soften, although it might take longer in some
| communities. (Username checks out lol, furry here also. I
| think the furs may take longer to embrace it.)
|
| (Also, people will still continue to use older tools like
| Photoshop to accomplish similar things.)
| stared wrote:
| It is not only about morals but about the incentives of the
| parties. The demand for sexually explicit content is bigger
| than, say, the demand for niche artistic experiments with
| geometrical living cupboards owned by a cybernetic dragon.
|
| Stability AI, very understandably, does not want to be
| associated with "the porn-generation tool". And if, even
| occasionally, it generated criminal content, the backlash
| would be enormous. Censoring the data requires effort, but is
| (for companies) worth it.
| nonrandomstring wrote:
| The term "bad actor" is starting to get cringe.
|
| Ronald Reagan was a bad actor.
|
| George Bush wore out "evildoers".
|
| Where next... fiends, miscreants, baddies, hooligans,
| deadbeats?
|
| Dastardly digital deviants Batman!
| five_lights wrote:
| >It's kind of a testament to our times that the person who
| chooses to look at synthetic porn instead of supporting a real-
| life human trafficking industry is the bad actor.
|
| "Bad actor" is a pretty vague term, I think they are using it
| as a catch all without diving into the specifics. we are all
| projecting what that may mean based on our own awareness of
| this topic as a result.
|
| I totally agree with your assessment and honestly would love to
| see this tech create less of a demand for the product human-
| traffickers produce.
|
| Celebrity deep fakes and racist images made by internet trolls
| are a few of the overt things they are willing to acknowledge
| is a problem, and they are fighting against (Google Gemini's
| over correction on this has been the talk this week). Does it
| put pressure on the companies to change for PR reasons, yes. It
| also gives a little bit of a Streisand effect, so it may be a
| zero sum game.
|
| We aren't talking about the big issue surrounding this tech,
| the issue that would cause far more damage to their brand than
| celebrity deep fakes:
|
| Pedophilic image generation.
|
| Guard rails should be miles high for this one.
| GenericPoster wrote:
| The talk of "safety" and harm in every image or language model
| release is getting quite boring and repetitive. The reasons why
| it's there is obvious and there are known workarounds yet the
| majority of conversations seems to be dominated by it. There's
| very little discussion regarding the actual technology and I'm
| aware of the irony of mentioning this. Really wish I could filter
| out these sorts of posts.
|
| Hopefully it dies down soon, but I doubt it. At least we don't have
| to hear garbage about "WHy doEs opEn ai hAve oPEn iN thE namE iF
| ThEY aReN'T oPEN SoURCe"
| learningerrday wrote:
| I hope the safety conversation doesn't die. The societal
| effects of these technologies are quite large, and we should be
| okay with creating the space to acknowledge and talk about the
| good and the bad, and what we're doing to mitigate the negative
| effects. In any case, even though it's repetitive, there exists
| someone out there on the Interwebs who will discover that
| information for the first time today (or whenever the release
| is), and such disclosures are valuable. My favorite relevant
| XKCD comic: https://xkcd.com/1053/
| iterateAutomate wrote:
| What is with these names haha, Stable Diffusion XL 1.0 and now
| Stable Diffusion 3??
| yreg wrote:
| There was 1.0, 1.5, 2.0, XL and now 3.0.
|
| Not that weird.
| cchance wrote:
| XL was basically an experiment on the 2.1 architecture with some
| tweaks but at a larger image size... hence the XL. But it wasn't
| really an evolution of the underlying architecture, which is why
| it wasn't 3.0 or even 2.5; it was just "bigger" lol
| k__ wrote:
| So, they block all bad actors, but themselves?
| ssalka wrote:
| I wonder if this will actually be adopted by the community,
| unlike SD 2.0. Many are still developing around SD 1.5 due to
| its uncensored nature. SDXL has done better than 2.0, but has
| greater hardware requirements, so it still can't be used by
| everyone running 1.5.
| caycep wrote:
| Are all the models/backends for Stability products basically
| available as OSS via Ludwig Maximilian University, more or less?
| ummonk wrote:
| It's going to have a restrictive license like Stable Cascade no
| doubt.
___________________________________________________________________
(page generated 2024-02-22 23:00 UTC)