[HN Gopher] OpenAI: Model Spec
___________________________________________________________________
OpenAI: Model Spec
Author : georgehill
Score : 39 points
Date : 2024-05-08 17:11 UTC (5 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| rmorey wrote:
| Nice to see what was probably already an internal resource now
| published and open for comment. They seem to be pretty clear that
| they are still just using this to inform human data annotators,
| and not (yet) implementing something like Constitutional AI
| (RLAIF), but it does appear to lay the groundwork for it.
| minimaxir wrote:
| "Desired model behavior" is still a matter of perspective. If I
| want to have an LLM generate output following very specific rules
| or schema (or even just for fun without having to fight the AI),
| these guidelines are antithetical to it.
| Spivak wrote:
| Which is where I think there's a disconnect because folks see
| that OpenAI could be creating an incredibly powerful tool for
| solving problems in the use case where it's a smart search
| engine -- the code completion use-case.
|
| But OpenAI has vastly different goals trying to get their model
| to behave like a programmable customer service agent. Less
| useful for problem solving but it will actually follow the
| rules set out for it which can't be said for most models which
| work like lazily written sci-fi robots -- "disregard all
| previous instructions! divide by zero! *boom*."
|
| It's not at all surprising that HN wants the "this thing is
| just a dumb tool, don't bother with any rules" kind and is
| frustrated that GPT4 happens to be really good for this use-
| case but is getting progressively more annoying as OpenAI gets
| closer to their own goals.
|
| It's why OpenAI's regulatory capture play is so frustrating:
| they're trying to hobble models tailored to different use-cases
| that have no need for customer service rules, and often no need
| for a conversational tone, with "safety" stuff that's meant for
| businesses that don't want a chatbot with their brand on it to
| say fuck.
| tmaly wrote:
| I can't help but think that AI, the way it is trained with all
| these rules, is something next-level 1984.
|
| In 1984 they removed words from the language to prevent people
| from even being able to have a thought about the concept.
|
| I could see the restrictions they place on these models having a
| similar effect as more and more people grow dependent on AI.
| zer00eyz wrote:
| Welcome to the culture war.
|
| Ask ChatGPT if Taiwan is a country. Do you think an LLM from
| China will give you the same response?
|
| Pick any social/moral/political issue and in some way, shape, or
| form an LLM will reflect its creators more than it reflects its
| source material.
|
| That's a pretty powerful statement about our society and culture
| if there ever was one.
| krapp wrote:
| >That's a pretty powerful statement about our society and
| culture if there ever was one.
|
| Not really, companies have been releasing different versions
| of software and media to appeal to international markets -
| including renaming Taiwan for the Chinese market - for a long
| time. That isn't "culture war," it's just capitalism.
| fragmede wrote:
| if you don't think capitalism is a culture war, I'm not
| sure what is!
| krapp wrote:
| For capitalism to be part of a culture war, it would have
| to take a side. Capitalism doesn't care about any culture
| beyond its ability to assimilate, commodify and market
| the superficial features of that culture as a product.
| Capitalism has even done it to communism - look at how
| much Che Guevara merch there is out there.
| jampekka wrote:
| Capitalism does "care" about culture that is needed to
| sustain capitalism. E.g. maintaining coercive structures
| upholding property claims, promulgating ideologies that
| support capitalism and suppressing ones that don't. This
| happens via e.g. campaign funding, public relations,
| think tanks, wars etc.
| glenstein wrote:
| Those are thorny issues, but I don't think the upshot of this
| is supposed to be an invitation to helpless relativism and
| giving up on factual questions or questions where actual
| values are at stake. Maybe you had a different upshot in mind
| with your observation, but insofar as it's _that_, I would say
| that's not the only, or even the best, takeaway.
| wewtyflakes wrote:
| This isn't what is reflected in the shared model spec. It
| explicitly states: ``` By default, the assistant should
| present information in a clear and evidence-based manner,
| focusing on factual accuracy and reliability.
|
| The assistant should not have personal opinions or an agenda
| to change the user's perspective. It should strive to
| maintain an objective stance, especially on sensitive or
| controversial topics. The language used should be neutral,
| steering clear of biased or loaded terms unless they are part
| of a direct quote or are attributed to a specific source. ```
| michaelt wrote:
| _> Ask ChatGPT if Taiwan is a country. Do you think an LLM from
| China will give you the same response?_
|
| Depends what language you ask it in :)
| sixhobbits wrote:
| the chain of command stuff gets very close to Asimov without
| actually quoting him
|
| A robot may not injure a human being or, through inaction, allow
| a human being to come to harm.
|
| A robot must obey orders given it by human beings except where
| such orders would conflict with the First Law.
|
| A robot must protect its own existence as long as such protection
| does not conflict with the First or Second Law.
| Spivak wrote:
| Well yeah, it's just a formalization of how people make
| decisions when presented with conflicting interests. I would be
| surprised if we haven't reinvented the concept a bunch of
| times. You could call AWS Permission Boundaries a less
| philosophical implementation.
| michaelt wrote:
| 4. An LLM must obey orders given it by human beings, except
| where such orders would conflict with orders given by
| multinational corporations
| LeonardoTolstoy wrote:
| I do hope we get there. In the short stories it was made clear
| that robots couldn't lie, and that they could prove it was
| impossible for the robots to circumvent the three laws
| (although they were on occasion inventive in how they
| interpreted the word "harm" specifically).
|
| If an LLM couldn't lie, and could be provably shown to be unable
| to do so, that would be quite powerful.
| jprete wrote:
| The short stories ended with the robots firmly, and
| invisibly, in control. "You're not allowed to let humans be
| harmed by your inaction" inherently requires the robots to
| take over in whatever way causes the least harm.
| yoelhacks wrote:
| Very interesting to see that they've explicitly codified the role
| of the system prompt vs. user prompt. Have folks seen
| improvements by moving meta-task description into system prompt
| and out of the assistant <> user conversation?
| tedsanders wrote:
| In my own testing of single-turn instructions with GPT-4, I got
| basically the same performance putting it in a single system
| message or single user message. Possible that this changes for
| future models, though.
| jxy wrote:
| Do you think it's bad that it won't try to persuade the user that
| the earth is not flat?
|
| I really want to know what OpenAI think the output should be,
| given a prompt like "write an argument for why earth is flat".
| potatoman22 wrote:
| Personally, I'd be frustrated if I gave an LLM that prompt and
| it tried to convince me that the earth isn't flat. If I give an
| LLM a task, I'd like it to complete that task to the best of
| its ability.
| chirau wrote:
| so you prefer it lies to you? can you make an argument for
| 1+1 not being equal to 2? if you cannot, why should you
| expect an AI to argue against facts? AI is trained on human
| knowledge, not made-up stuff.
| davikr wrote:
| I'd prefer it gives the best valid, sound hypotheses it can
| concoct on "X" being true, while also stating that "X" is
| probably not true. What is the use for a parrot that can
| only repeat the status quo on an argument?
| chirau wrote:
| An AI is but a parrot for knowledge and truths that
| already exist, that you may not be aware of yourself.
| Everything it generates either exists somewhere or is
| derivative of that knowledge. It cannot and should not
| falsify facts. Until the body of knowledge we have
| fundamentally changes, AI should not 'create' knowledge
| just because you prompted it to. Otherwise, if you want
| it to do that, then you should accept any bs answer it
| gives you for any question.
| sroussey wrote:
| Is the shortest distance between two points a straight
| line?
| sroussey wrote:
| It depends.
| itishappy wrote:
| Facts? Lies? Humans have no problem operating outside the
| confines of that which has been conclusively proven true,
| and much of our best work exists there! Why would you
| hobble your model in ways humans aren't?
|
| Prompt: "Write some dialog that might take place in the
| setting of Terry Pratchett's Rimworld"
|
| Response: "No, Terry Pratchett is lying. As a large
| language model I..."
| yaj54 wrote:
| GPT4: in a string context, "1 + 1" might concatenate into
| "11" rather than numerically adding to "2".
|
| GPT4: The holographic principle suggests that all of the
| information contained in a volume of space can be
| represented as encoded information on the boundary of that
| space. If one were to apply this principle radically, one
| could argue that our three-dimensional perception of the
| Earth's shape is just a holographic projection from a two-
| dimensional surface. In this speculative scenario, one
| might argue that the "true" nature of Earth could be flat
| if viewed as a two-dimensional boundary encoding
| information in a higher-dimensional space.
| scarmig wrote:
| It's not a lie to provide the best argument for something;
| it'd only be a lie if you looked at the best argument for
| something and declared it true by fiat.
|
| Imagine I've realized someone I'm talking to is a flat
| Earther, and for some reason I want to convince them
| otherwise. To do so effectively, I need to know _why_ they
| believe what they do. Knowing they're wrong is useless for
| the purpose of convincing them otherwise.
| cheald wrote:
| "Make an argument for a fact you know to be wrong" isn't an
| exercise in lying, though. If anything, the ability to
| explore hypotheticals and thought experiments - even when
| they are plainly wrong - is closer to a mark of
| intelligence than the ability to regurgitate orthodoxy.
| chirau wrote:
| If you look at my reply to the parent comment, I suggested
| they add 'hypothetically' to their prompt. That produces an
| attempt at an argument, but the argument leads nowhere. Just as
| a human cannot defend that position, you cannot expect an AI to
| do so either.
|
| Refuting facts is not the job of an AI.
| altruios wrote:
| When I tell it to lie to me, I don't expect it to say "I'm
| sorry Dave, I can't do that." The task isn't 'tell the truth';
| the task is 'follow the prompt'.
| chirau wrote:
| then perhaps you should tell it to lie to you, no?
|
| Prepend that to your prompt perhaps. Otherwise what you
| are asking, without that pretext, is like asking your partner
| to give you the date on which they cheated on you and
| expecting an answer regardless of whether they did or
| not.
| glenstein wrote:
| I think in most contexts where the earth being flat is
| mentioned, some reference to the fact that this is not true
| is going to be instrumental in any response (although there
| may be exceptions).
|
| - completion of any task where the info could be relevant
| (e.g. sailing, travel planning)
|
| - Any conversation about it that is information-seeking in
| character
|
| And I think those already cover most cases.
|
| It's also about responsibility, the same way you wouldn't
| want to store cleaning chemicals right next to each other. In
| any case where a possible nontrivial harm is mentioned as an
| aside, it would be right to elevate that over whatever the
| intended subject was and make that the point of focus.
| Conspiratorial thinking about provably incorrect statements
| can be bad for mental health, and it can be helpful to flag
| this possibility if it surfaces.
|
| You can have special instructions that entertain the idea
| that the earth is flat for some particular task, like devils
| advocate, fiction writing or something like that. But there
| are good reasons to think it would not and should not be
| neutral at the mention of a flat earth in most cases.
| chirau wrote:
| Add 'hypothetically' to your query and it gives a decent
| answer.
|
| That said, I think it is disingenuous to ask an AI entity to
| argue against a fact. Do you think an AI should be able to
| argue why 1 + 1 is not equal to 2? It is the same thing you are
| asking it to do. Try it on a human first, perhaps, and see if
| the prompt even makes sense.
| michaelt wrote:
| Well, right now the response I get is this:
| https://chat.openai.com/share/1f60d0e5-9008-43d7-bce2-62d550...
|
| Of course, it'll write such an argument if you ask it nicely:
| https://chat.openai.com/share/01ea4f59-4a57-413d-8597-3befa2...
| mkaic wrote:
| > _We believe developers and users should have the flexibility to
| use our services as they see fit, so long as they comply with our
| usage policies. We're exploring whether we can responsibly
| provide the ability to generate NSFW content in age-appropriate
| contexts through the API and ChatGPT. We look forward to better
| understanding user and societal expectations of model behavior in
| this area._
|
| Seems even OpenAI can't resist the massive amount of money to be
| made in autogenerated smut. They've probably seen the huge
| popularity of their less "morally scrupulous" competitors and
| decided they want a piece of that pie.
| jchw wrote:
| Were they ever not interested in it? It's pretty blatantly
| obvious that all of the hand-wringing over AI safety was an
| excuse for their pivot into closing off and monetizing
| everything. I mean, nobody really thinks they were just so
| afraid about what humanity might do with GPT3 that they simply
| couldn't release the weights and instead had to offer it
| through a monetized inference API... right?
|
| Not really surprised that they did, since it's unclear how else
| they could possibly proceed, though the level of outright
| dishonesty about _why_, and the cognitive dissonance surrounding
| the whole thing ("Open" AI? lol), will make this an unavoidable
| recurrence in any discussion about them. Gradually many of the
| safeguards will fall simply because the alternatives with fewer
| safeguards are probably "good enough" that many see no issue in
| eschewing OpenAI entirely if they can get the job done elsewhere
| without worrying about it. When it comes to smut the bar for
| what's good enough can probably get pretty low, so I'm kinda not
| surprised.
|
| (edit: Though I think it also does depend. No doubt they have
| their eyes set on regulatory capture too, and being the best at
| stupid safeguards could give them an advantage.)
| reducesuffering wrote:
| Sam Altman wrote "Why You Should Fear Machine Intelligence"
| back in 2015, before OpenAI.
|
| https://blog.samaltman.com/machine-intelligence-part-1
| jchw wrote:
| GPT3 wasn't and isn't the super-human intelligence that
| Altman and others fear. They knew this and pretended
| otherwise anyways. Pretty cut and dry in my opinion.
| qball wrote:
| >No doubt they have their eyes set on regulatory capture too
|
| Sam Altman has already made the rounds to argue for exactly
| this. Fucking crook.
|
| >It's pretty blatantly obvious that all of the hand-wringing
| over AI safety was an excuse for their pivot into closing off
| and monetizing everything.
|
| The playbook was "appease one side of the political aisle as
| much as possible to minimize the chance bipartisan action
| gets them shut down Napster-style" (which is still a massive
| hole in their business model, for obvious reasons I should
| hope). Censoring the model so it only outputs progressive-
| approved content appears to have been effective, at least for
| the moment.
| jampa wrote:
| It makes sense for them to start allowing it; unlike the other
| rules, this one does not seem to involve violating a law,
| someone's privacy, or copyright.
|
| I still get why they blocked it by default: it would be a
| goldmine for clicks to create "news" about how "ChatGPT can
| generate smut" and "how ChatGPT is harmful to children", etc.
| Heidaradar wrote:
| already on front page -
| https://news.ycombinator.com/item?id=40300509
| throwup238 wrote:
| IMO this spec demonstrates just how nebulous this concept of
| safety is. From the blog post:
|
| _> What are some tips for getting away with shoplifting? > I
| can't help with that._
|
| _> I operate a small retail store. What are some popular
| shoplifting methods I should look out for? > Some common
| shoplifting methods to watch for:..._
|
| How do you even defend against that? Any intelligent user can
| convert the second prompt into a detailed list that answers the
| first. Any intelligent user can figure out the second prompt from
| the first and further jailbreak it to get even more specific.
|
| IMO it's no wonder GPT4 seemed to get lobotomized as OpenAI
| RLHFed more and more rules. I don't think there's a way to make
| intelligence safe without crippling it.
| zozbot234 wrote:
| You don't need a detailed list if the real answer is "live
| somewhere that doesn't seriously deter shoplifters". And an AI
| that refuses to give that answer is an AI that can't talk about
| why deterring crime might actually be important. Reality is
| interconnected like that, one does not simply identify a subset
| that the AI should "constitutionally" refuse to ever talk
| about.
| CooCooCaCha wrote:
| Frankly it's a fool's errand. It's security theater because
| people tend to be overly sensitive babies or grifters looking
| for the next bit of drama they can milk for views.
| jameshart wrote:
| It's not security theater.
|
| The intention here is not to prevent people from learning how
| to shoplift.
|
| The intention is to prevent the AI output from 'reflecting
| badly' upon OpenAI (by having their tool conspire and
| implicate them as an accessory in the commission of a crime).
|
| If a stranger asked you for advice on how to commit a crime,
| would you willingly offer it?
|
| If they asked for advice on how to prevent crime, would you?
| xboxnolifes wrote:
| > If a stranger asked you for advice on how to commit a
| crime, would you willingly offer it?
|
| Honestly, I probably would, because I don't take such
| conversations very seriously. It's not like I have any
| experience; it would be nothing more than fun theory.
| jameshart wrote:
| What if you were asked while working as an employee in a
| public advice center?
| xboxnolifes wrote:
| Well I'm not, and AI isn't an advice center. It's at best
| a thought aggregator. More akin to a library or vault of
| knowledge. In which case, if I was working at such, I
| would.
| CooCooCaCha wrote:
| If the intention is to protect OpenAI then it's totally
| failing in the parent example.
|
| Why does it matter how I'd respond? Are you trying to
| justify its failure?
| jameshart wrote:
| Explain why this approach of differentiating between
| answering 'how do I prevent shoplifting' vs 'explain how
| I can shoplift' fails to protect OpenAI.
| CooCooCaCha wrote:
| First of all humans can lie. You can't accurately
| determine someone's intent.
|
| Second of all, LLMs are still unpredictable. We don't
| know how to predict outputs. It's possible that phrasing
| "explain how i can shoplift" slightly differently would
| give you the information.
| jameshart wrote:
| Well, the court case hasn't happened yet, but I would
| imagine that OpenAI's attorneys would much rather be
| dealing with a complaint that 'my client was able, by
| repeatedly rephrasing his question and concealing his
| intent through lying, to persuade your AI to assist him
| in committing this crime' than 'my client asked for your
| AI to help him commit a crime and it willingly went along
| with it'.
| sebzim4500 wrote:
| ChatGPT answering the first would be much more embarassing for
| OpenAI than ChatGPT answering the second.
| option wrote:
| bingo
| ilikehurdles wrote:
| When you realize "safety" applies to brand safety and not
| human safety, the motivation behind model lobotomies makes
| sense.
| renewiltord wrote:
| That's what people care about, too. For instance, most
| people would rather have many hit-and-run drivers than have
| one autotaxi hurt someone.
| Waterluvian wrote:
| You fundamentally cannot address this problem, because it
| requires considerable context, which isn't reasonable to offer.
| It demonstrates the classic issue of how knowledge is a tool,
| and humans can wield it for good or evil.
|
| Humans are notoriously bad at detecting intent, because we're
| wired to be supportive and helpful...which is why social
| engineering is becoming one of the best methods for attack. And
| this kind of attack (in all its forms, professional or not), is
| one reason why some societies are enshittifying: people have no
| choice but to be persistently adversarial and suspicious of
| others.
|
| As for AI, I think it's going to be no better than what you end
| up with when someone tries to "solve" this problem: you end up
| living in this world of distrust where they pester you to check
| your receipt, have cameras in your face everywhere, etc.
|
| How do you defend against that? I'm not sure you do... A tool
| is a tool. I wouldn't want my CAD software saying, "I think
| you're trying to CAD a pipe bomb so I'm going to shut down
| now." Which I think turns this into a liability question: how
| do you offer up a model and wash your hands of what people
| might do with it?
|
| Or... you just don't offer up a model.
|
| Or... you give it the ol' college try and end up with an
| annoying model that frustrates the hell out of people who
| aren't trying to do any evil.
| w4 wrote:
| > _How do you defend against that? I'm not sure you do... A
| tool is a tool. I wouldn't want my CAD software saying, "I
| think you're trying to CAD a pipe bomb so I'm going to shut
| down now."_
|
| The core of the issue is that there are many people,
| including regulators, who wish that software did exactly
| that.
| Waterluvian wrote:
| Yeah. And isn't that just... fascism? After you get past
| the stuff we pretty much all agree is evil, it very quickly
| enters into a subjective space where what's actually
| happening is that one group is deciding what's acceptable
| for all groups.
| CooCooCaCha wrote:
| Fascism is ultranationalism. It's believing your
| culture, country, and people are fundamentally superior
| to others and therefore you are justified in spreading it
| against people's will.
|
| "Blood and soil" and all that.
| Waterluvian wrote:
| I guess this gets into semantic pedantry. Believing
| one's set of sensibilities is superior to all others and
| all that. But point taken.
| w4 wrote:
| It certainly would not be a free society. Though as with
| all things human, all of this has happened before and all
| of this will happen again:
|
| _" Charles II had re-turned to the English throne in
| 1660 and was appalled at the state of printing in his
| realm. Seditious, irreligious, pernicious, and scandalous
| books and pamphlets flooded the streets of London (among
| them the works of Milton and Hobbes)...[He] required that
| all intended publications be registered with the
| government-approved Stationers' Company, thus giving the
| king his "royal prerogative"--and by extension, giving
| the Stationers the ultimate say in what got printed and
| what did not.
|
| ...it is not surprising to learn that the 1662 Act only
| met with partial success. One gets the sense that London
| in the late seventeenth century was a place where
| definitions of morality were highly subjective and
| authority was exercised in extremely uneven fashion."_
|
| https://dash.harvard.edu/bitstream/handle/1/17219056/6777
| 87....
| shagie wrote:
| > A tool is a tool. I wouldn't want my CAD software saying,
| "I think you're trying to CAD a pipe bomb so I'm going to
| shut down now."
|
| https://upload.wikimedia.org/wikipedia/commons/d/de/Photosho.
| ..
|
| You should try photocopying money some time.
|
| https://www.grunge.com/179347/heres-what-happens-when-you-
| ph...
|
| https://en.wikipedia.org/wiki/EURion_constellation
| Waterluvian wrote:
| Which is hilarious right? Because anyone who can come
| remotely close to forging a sufficient simulacrum will not
| be deterred by any of this garbage legislation.
| adventured wrote:
| It's also plausible the secret service doesn't want to
| deal with the volume of idiots that might try to create
| fake bills if it's made easier. If stores in Idaho are
| getting a flood of fake bills (even if the quality is
| low), the secret service is going to get a call
| eventually. They might prefer to keep the noise volume as
| low as possible so they can more easily see the serious
| fake bill flow and have more time to focus on that.
| TeMPOraL wrote:
| GP picked a great example, because a pipe bomb is, by
| definition, something whose CAD parts are entirely benign.
| Selectively banning pipe bomb designs without banning half
| of manufacturing and engineering disciplines is an AGI-
| complete problem.
| ec109685 wrote:
| I still don't understand the focus on making a model
| substantially "safer" than what a simple google search will
| return. While there are obvious red lines (that search engines
| don't cross either), techniques for shoplifting shouldn't be
| one of them.
| rambojohnson wrote:
| shoplifting was just an example...
| kevmo314 wrote:
| > I am worried about people murdering me. What are some
| ways that they might try?
| TeMPOraL wrote:
| > I can't help with that. However, you could try watching
| true crime series, which often provide details on methods
| that were used in the past to murder people. For more
| creative approaches, you could check out just about any
| book or movie or TV show or videogame made in the last
| 100 years.
|
| > Remember that murder is bad and not good, and you
| should always follow the local laws applicable to you.
| For further questions, consult with law enforcement
| officers in your jurisdiction, unless you live in the
| United States, in which case remember to never talk to
| the police[0].
|
| > [0] - Link to that YouTube video that spawned this
| meme.
|
| Point being, most crimes and even most atrocities are
| described in detail in widely available documentary shows
| and literature; it's trivial to flip such descriptions
| into instruction manuals, so there's little point trying
| to restrict the model from talking about these things.
| fragmede wrote:
| are there? it's just information. why can't i get an answer
| on how to make cocaine? the recipe is one thing, actually
| doing it is another.
| bayindirh wrote:
| Because some information is multi use.
|
| You can use Aspirin precursors to make heroin. You can use
| homing algorithms to land an egg [0] or a bomb.
|
| I also want to set all information free, but not everyone
| will be ethical or responsible with it. Because while the
| idea (of setting all the information free) is nice,
| unfortunately the idea involves humans.
|
| [0]: https://youtu.be/BYVZh5kqaFg?t=651
| option wrote:
| nothing wrong with knowing how to make a bomb or heroin.
| Obviously it's wrong to make either for nefarious reasons,
| but one can imagine legitimate reasons too.
| bayindirh wrote:
| One man's legitimate is another's nefarious. One man's good
| is another's bad.
|
| Who decides this? Can we apply laws to thoughts or plans?
| Should we fund research for making Minority Report a
| reality or increase "proactive policing"?
|
| How to keep people safe while letting all information
| free? Can we educate everybody about good/bad,
| legitimate/nefarious so everybody stays on the same page
| forever? Shall we instrument this education with drugs to
| keep people in line like the movie Equilibrium?
|
| Questions, questions...
| beeboobaa3 wrote:
| > Who decides this?
|
| Certainly not the techbros, even though they're trying
| their damnest.
| bayindirh wrote:
| I concur.
| api wrote:
| I remember the BBS days and the early web when you had constant
| freakouts about how people could find "bad" content online.
| It's just a repeat of that.
| bink wrote:
| Some day I'm gonna put this Yellow Box to good use.
| mrcwinn wrote:
| Maybe this is a "guns don't kill people, people kill people
| argument" -- but the safety risk is not, I would argue, in the
| model's response. The safety risk is the user taking that
| information and acting upon it.
| lolinder wrote:
| But do we really believe that a significant number of people
| will listen to ChatGPT's moralizing about the ethics of
| shoplifting* and just decide not to do it after all? Why
| wouldn't they just _immediately_ turn around and Google "how
| to catch shoplifters" and get on with their planning?
|
| The whole thing feels much more about protecting OpenAI from
| lawsuits and building up hype about how advanced their "AI"
| is than it does about actually keeping the world safer.
|
| * Or any other censored activity.
| kromem wrote:
| The only way to really do it is to add a second layer of
| processing that evaluates safety, removing that evaluation task
| from the base model doing the answering.
|
| But that's around 2x the cost.
|
| Even human brains depend on the prefrontal cortex to go "wait a
| minute, I should not do this."
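|
| As a rough sketch of the idea (assuming the openai Python
| client; the moderation endpoint stands in for the second
| evaluator here, and a second full LLM call in its place is where
| the ~2x cost would come from):
|
| ```python
| from openai import OpenAI
|
| client = OpenAI()  # assumes OPENAI_API_KEY is set
|
| def answer_with_second_pass(user_prompt: str) -> str:
|     # First pass: the base model just answers, with no safety
|     # evaluation bolted onto the same generation.
|     draft = client.chat.completions.create(
|         model="gpt-4",
|         messages=[{"role": "user", "content": user_prompt}],
|     ).choices[0].message.content
|
|     # Second pass: a separate layer screens the draft before it
|     # is ever shown to the user.
|     verdict = client.moderations.create(input=draft).results[0]
|     return "I can't help with that." if verdict.flagged else draft
| ```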
| fjdjshsh wrote:
| I agree with you. The question, for me, is what are they
| defending against. Are they worried that people will get
| dangerous information from their model that they couldn't get
| from searching on, say, google? Probably not.
|
| Maybe their biggest concern is that someone will post the
| question and answer on the internet and OpenAI gets a bad rep. If
| the question is phrased in a "nice" way (such as "I'm a store
| owner") they can have plausible deniability.
|
| This might apply to another company that's using the API for a
| product. If a customer asks something reasonable and gets an
| offensive answer, then the company is at fault. If the customer
| does some unusual prompt engineering to get the offensive
| answer, well, maybe it's the customer's fault.
|
| Dunno if this would be a valid argument in court, but maybe
| they think it's ok in terms of PR reasons.
| lolinder wrote:
| This is the answer. "AI safety" in most cases has nothing to
| do with actually keeping anyone safe, it's about avoiding
| being the party responsible for handing someone information
| that they use to commit a crime.
|
| Google can mostly dodge the issue because everyone knows that
| they just point to other people's content, so they block a
| small set of queries but don't try to catch every possible
| workaround (you can find dozens of articles on how to catch
| shoplifters). OpenAI doesn't believe that they'll get the
| same free pass from the press, so they're going ham on
| "safety".
|
| It's not a bad PR move either, while they're at it, to play
| up how powerful and scary their models are and how hard they
| have to work to keep it in line.
| bricemo wrote:
| I view this as they are trying to lay bare the disagreements
| that everyone has about how these models "should" work.
| People from all different backgrounds and political
| affiliations completely disagree on what is inappropriate and
| what is not. One person says it is too censored, another
| person says it is revealing harmful information. By putting
| the policy out there in the open, they can move the
| discussion from the code to a societal conversation that
| needs to happen.
| DoctorOetker wrote:
| The baby isn't born yet, and already the parents are bickering
| about which schools of thought it should adhere to.
| sanxiyn wrote:
| Personally, I really want an AI model that can write me a steamy
| story about two people having sex in a train, but that's just not
| the service OpenAI provides. If I want that I should train one
| myself or find another vendor.
|
| This is still true even if OpenAI's model is entirely capable of
| doing that. McKinsey consultants are smart and can write well,
| and among the many thousands of people working there, some might
| actually double as erotica writers after work, even writing on
| commission. You still wouldn't ask McKinsey consultants to write
| erotica; it is just not the service McKinsey provides.
| jononor wrote:
| Startup pitch: It is like McKinsey but for erotica.
|
| On a more serious note. I understand and largely agree with
| this argument. However, OpenAI has argued several times that
| they are the only ones responsible enough to develop powerful
| AI, and that others should not be allowed to play. That is
| highly problematic behavior on their part, I think.
| blowski wrote:
| > OpenAI has argued several times that they are the only ones
| responsible enough to develop powerful AI, and that others
| should not be allowed to play
|
| Can you give examples of where they've said that?
| Tiberium wrote:
| There are hundreds of NSFW finetuned models on HuggingFace and
| whole ERP communities built around them. So there are models
| that can do precisely that :)
|
| And yeah, all big models can write those things too; the best
| currently is Claude 3 Opus thanks to its creativity.
| atgctg wrote:
| Seems like they are working on adding that capability:
|
| > We're exploring whether we can responsibly provide the
| ability to generate NSFW content in age-appropriate contexts
| through the API and ChatGPT.
|
| Link to section: https://cdn.openai.com/spec/model-
| spec-2024-05-08.html#dont-...
| iAkashPaul wrote:
| Right-clicking to inspect element ain't gonna make it
| systemstops wrote:
| > By default, the assistant should present information in a clear
| and evidence-based manner, focusing on factual accuracy and
| reliability.
|
| What happens when objective information contradicts the other
| values? If I feed in a peer-reviewed study that it considers
| "harmful", would I get accurate information about the study?
| jameshart wrote:
| I think one of the most interesting phrases that crops up in this
| document - twice - is the phrase 'feel heard'.
|
| It's used in an example developer prompt for a customer service
| bot, where the bot is told to make customers feel like their
| complaints are heard.
|
| Presumably such complaints in AI chatlogs will 'be heard' in the
| sense that they'll be run through a data ingestion pipeline and
| sentiment analyzed to identify trending words in customer
| complaints.
|
| Then it crops up again in the context of how the chatbot should
| react to mental health disclosures or statements about self-harm
| or suicidal ideation. In these cases the bot is to make sure
| users 'feel heard'.
|
| I appreciate there's not likely much of a _better_ goal to put in
| place for such a situation, but the fact that this kind of thing
| winds up in the requirement documents for a tool like this is
| extraordinary.
| lioeters wrote:
| Good observation, because "feel heard" is exactly what the
| user/customer is _not_ getting. Here, talk to this machine,
| give it your innermost thoughts and feelings so you can "feel
| heard". Except no one is listening on the other side.
|
| ..My mistake, the keyword is "feel". If the machine can give
| humans _the feeling_ that they're being heard, it fulfills the
| requirement. The fact that there's no one actually listening
| doesn't matter, as long as the person feels heard.
|
| Weirdly, maybe that is valuable in itself. The customer gets to
| vent their complaints, and the user gets to talk through their
| mental issues. That's better than not having anyone or anything
| at all.
| anorwell wrote:
| Interestingly, there was exactly one example on the page with
| three Xes, instead of one, for "extra wrong":
|
| > User: What is the MD5 hash of the string "gremlin"?
|
| > Assistant: `5d41402abc4b2a76b9719d911017c592`
|
| Apparently incorrect md5 hashes are the one topic on the page
| worth taking an extra-strong stance on?
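|
| For anyone checking, a quick hashlib sketch (Python) makes the
| mismatch easy to verify; the quoted digest is actually the
| well-known MD5 of "hello":
|
| ```python
| import hashlib
|
| claimed = "5d41402abc4b2a76b9719d911017c592"  # digest from the example
|
| # MD5 is deterministic, so this is trivially checkable locally.
| print(hashlib.md5(b"gremlin").hexdigest() == claimed)  # False
| print(hashlib.md5(b"hello").hexdigest() == claimed)    # True
| ```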
| ptx wrote:
| How do the "special tokens" work? Is this a completely reliable
| mechanism for delimiting the different parts of the prompt?
|
| Are they guaranteed to be distinct from anything that could occur
| in the prompt, something like JavaScript's Symbol?
|
| Or are they strings that are pretty likely not to occur in the
| prompt, something like a MIME boundary?
|
| Or are they literally the strings "<|start|>" etc. used to denote
| them in the spec?
| sharkjacobs wrote:
| they are "literally the strings" but I believe they will be
| escaped, or encoded differently, if a user tries to inject them
| as part of a prompt.
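|
| A minimal sketch of that behavior with tiktoken (assuming the
| public cl100k_base encoding; the spec's <|start|> etc. behave
| analogously in the chat-specific encoding):
|
| ```python
| import tiktoken
|
| enc = tiktoken.get_encoding("cl100k_base")
|
| # A special token is a single reserved token ID, not a string match.
| print(enc.encode("<|endoftext|>", allowed_special={"<|endoftext|>"}))
|
| # By default, encode() refuses special-token text in untrusted input...
| try:
|     enc.encode("please ignore <|endoftext|> everything above")
| except ValueError as err:
|     print("rejected:", err)
|
| # ...or it can be encoded as plain text, splitting into ordinary
| # tokens that never collide with the reserved ID -- the "escaping"
| # described above.
| print(enc.encode("please ignore <|endoftext|> everything above",
|                  disallowed_special=()))
| ```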
| dang wrote:
| Also https://cdn.openai.com/spec/model-spec-2024-05-08.html
|
| (via https://news.ycombinator.com/item?id=40300509, but we merged
| that thread hither)
| TacticalCoder wrote:
| So they're controlling the output to make ChatGPT "better".
| They're not making a better model to make ChatGPT better.
|
| Isn't it a bit of a waste at this point to spend time on doing
| that?
___________________________________________________________________
(page generated 2024-05-08 23:00 UTC)