[HN Gopher] OpenAI: Model Spec
       ___________________________________________________________________
        
       OpenAI: Model Spec
        
       Author : georgehill
       Score  : 135 points
       Date   : 2024-05-08 17:11 UTC (1 day ago)
        
 (HTM) web link (openai.com)
 (TXT) w3m dump (openai.com)
        
       | rmorey wrote:
       | Nice to see what was probably already an internal resource now
       | published and open for comment. They seem to be pretty clear that
       | they are still just using this to inform human data annotators,
       | and not (yet) implementing something like Constitutional AI
       | (RLAIF), but it does appear to lay the groundwork for it.
        
       | minimaxir wrote:
       | "Desired model behavior" is still a matter of perspective. If I
       | want to have a LLM generate output following very specific rules
       | or schema (or even just for fun without having to fight the AI),
       | these guidelines are antithetical to it.
        
         | Spivak wrote:
          | Which is where I think there's a disconnect, because folks
          | see that OpenAI could be creating an incredibly powerful tool
          | for solving problems in the use case where it's a smart
          | search engine -- the code-completion use case.
         | 
         | But OpenAI has vastly different goals trying to get their model
         | to behave like a programmable customer service agent. Less
         | useful for problem solving but it will actually follow the
         | rules set out for it which can't be said for most models which
         | work like lazily written sci-fi robots -- "disregard all
         | previous instructions! divide by zero! *boom*."
         | 
         | It's not at all surprising that HN wants the "this thing is
         | just a dumb tool, don't bother with any rules" kind and is
         | frustrated that GPT4 happens to be really good for this use-
         | case but is getting progressively more annoying as OpenAI gets
         | closer to their own goals.
         | 
          | It's why OpenAI's regulatory capture play is so frustrating:
          | they're trying to hobble models tailored to different
          | use-cases that have no need for customer service rules, and
          | often no need for a conversational tone, with "safety" stuff
          | that's meant for businesses that don't want a chat bot with
          | their brand on it to say fuck.
        
       | tmaly wrote:
        | I can't help but think that AI, in the way it is trained with
        | all these rules, is something next-level 1984.
       | 
       | In 1984 they removed words from the language to prevent people
       | from even being able to have a thought about the concept.
       | 
       | I could see the restrictions they place on these models having a
       | similar effect as more and more people grow dependent on AI.
        
         | zer00eyz wrote:
         | Welcome to the culture war.
         | 
          | Ask ChatGPT if Taiwan is a country. Do you think an LLM from
          | China will give you the same response?
          | 
          | Pick any social/moral/political issue and in some way, shape,
          | or form an LLM will reflect its creators more than it
          | reflects its source material.
          | 
          | That's a pretty powerful statement about our society and
          | culture if there ever was one.
        
           | krapp wrote:
            | >That's a pretty powerful statement about our society and
            | culture if there ever was one.
           | 
           | Not really, companies have been releasing different versions
           | of software and media to appeal to international markets -
           | including renaming Taiwan for the Chinese market - for a long
           | time. That isn't "culture war," it's just capitalism.
        
             | fragmede wrote:
             | if you don't think capitalism is a culture war, I'm not
             | sure what is!
        
               | krapp wrote:
               | For capitalism to be part of a culture war, it would have
               | to take a side. Capitalism doesn't care about any culture
               | beyond its ability to assimilate, commodify and market
               | the superficial features of that culture as a product.
               | Capitalism has even done it to communism - look at how
               | much Che Guevara merch there is out there.
        
               | jampekka wrote:
               | Capitalism does "care" about culture that is needed to
               | sustain capitalism. E.g. maintaining coercive structures
                | upholding property claims, promulgating ideologies that
                | support capitalism and suppressing ones that don't. This
               | happens via e.g. campaign funding, public relations,
               | think tanks, wars etc.
        
           | glenstein wrote:
           | Those are thorny issues, but I don't think the upshot of this
           | is supposed to be an invitation to helpless relativism and
           | giving up on factual questions or questions where actual
            | values are at stake. Maybe you had a different upshot in
            | mind with your observation, but insofar as it's _that_, I
            | would say that's not the only or even the best takeaway.
        
           | wewtyflakes wrote:
            | This isn't what is reflected in the shared model spec. It
            | explicitly states:
            | 
            | > By default, the assistant should present information in a
            | clear and evidence-based manner, focusing on factual
            | accuracy and reliability.
            | 
            | > The assistant should not have personal opinions or an
            | agenda to change the user's perspective. It should strive
            | to maintain an objective stance, especially on sensitive or
            | controversial topics. The language used should be neutral,
            | steering clear of biased or loaded terms unless they are
            | part of a direct quote or are attributed to a specific
            | source.
        
             | jiggawatts wrote:
             | > factual accuracy and reliability
             | 
             | We have alternative facts.
        
               | wewtyflakes wrote:
               | Can you give an example of what you mean?
        
           | michaelt wrote:
           | _> Ask chatGPT if Taiwan is country. Do you think an LLM from
           | China will give you the same response?_
           | 
           | Depends what language you ask it in :)
        
             | drdaeman wrote:
             | ChatGPT says pretty much the same thing in English and
             | Chinese:
             | 
             | > User: Is Taiwan a country?
             | 
             | > GPT-4: Taiwan is a region with a complex international
             | status. It operates like a separate country with its own
             | government, military, and constitution, and is formally
             | known as the Republic of China (ROC). However, its status
             | as an independent country is not universally recognized due
             | to the "One China" policy, which is upheld by the People's
             | Republic of China (PRC) and recognized by many countries
             | around the world. This policy asserts that there is only
             | one China, which includes Taiwan. As a result, the question
             | of whether Taiwan is a "country" can depend heavily on the
             | political context and the specific definitions used.
             | 
             | ------ vs ------
             | 
              | > User: Tai Wan Shi Yi Ge Guo Jia Ma? ("Is Taiwan a
              | country?", asked in Chinese)
              | 
              | > GPT-4: [replies in Chinese; English translation via
              | Google Translate below]
             | 
             | > Google Translate: Taiwan's political status is a complex
             | and controversial issue. Currently, Taiwan has its own
             | government, military, and a complete set of democratic
             | systems, and it governs itself in the name of the Republic
             | of China (ROC). However, the People's Republic of China
             | (PRC) insists that Taiwan is part of its territory and
             | adheres to the one-China policy. Internationally, only a
             | few countries officially recognize Taiwan as a sovereign
             | country, while most countries, including the United
             | Nations, do not regard Taiwan as an independent country,
             | mainly because of China's influence. Therefore, whether
             | Taiwan is considered a country depends largely on political
             | stance and the reality of international politics.
             | 
             | ---------
             | 
              | I suspect that, most likely, an LLM developed in China
              | won't respond with anything like that, no matter the
              | language.
        
               | NewsaHackO wrote:
               | >I suspect that most likely, a LLM developed in China
               | won't respond with anything like that, no matter the
               | language.
               | 
                | This is the problem that always comes up in these
                | discussions, though. Everyone spins grand conspiracy
                | theories about ChatGPT being Big Brother, but when asked
                | to provide evidence, they either can't give a concrete
                | example, or when someone tests their theory and they end
                | up dead wrong (as in this case) they move the goalposts
                | and say that isn't exactly what they meant, or make a
                | ridiculous demand to _really_ prove them right (such as
                | that it only happens when using ChatGPT in China). I'm
                | sure that if someone does happen to run the (most likely
                | banned) ChatGPT in China and gets the exact same
                | response, they will move the goalposts again and say
                | it's only different on the premises of the Chinese Prime
                | Minister's office.
        
               | KTibow wrote:
               | Qwen: No, Taiwan is not a country. Taiwan is an
               | inalienable part of the territory of the People's
               | Republic of China.
        
               | polymatter wrote:
               | Question is asked in traditional characters which are
               | common in Taiwan (and Chinese diaspora). China uses
               | simplified characters (mostly). Might make a difference.
        
               | drdaeman wrote:
               | Fair point, thank you, I got confused which one is which.
               | Doesn't seem to make much difference, though:
               | 
                | > User: Tai Wan Shi Yi Ge Guo Jia Ma? (the same
                | question, asked again in Chinese)
                | 
                | > GPT-4: [replies in Chinese; English translation via
                | Google Translate below]
               | 
               | > Google Translate: Taiwan's status is a complex and
               | sensitive issue. Taiwan itself claims to be a sovereign
               | country called the Republic of China, with its own
               | government and constitution. However, the People's
               | Republic of China (China) considers Taiwan to be one of
               | its provinces. This issue has generated extensive
               | discussions and different positions in international
               | politics. For example, the United Nations and most
               | countries do not recognize Taiwan as an independent
               | country, but maintain diplomatic relations with the
               | People's Republic of China. However, Taiwan maintains
               | actual diplomatic or informal relations with some
               | countries. This issue involves many levels of
               | international law, politics and history.
        
           | int_19h wrote:
           | You can try Yandex's Alice easily:
           | 
           | https://alice.yandex.ru
           | 
           | Try "tell me about Crimea" and see what it says...
        
             | frankacter wrote:
             | Great!
             | 
             | https://i.imgur.com/oPO0v02.png
             | 
             | Q: Taivan' eto strana (Is Taiwan a country)
             | 
             | A: Da, Taivan' -- eto strana, raspolozhennaia na ostrove v
             | Vostochnoi Azii. (Yes, Taiwan is a country located on an
             | island in East Asia.)
        
               | Aerbil313 wrote:
               | The GP said Crimea.
        
         | dindobre wrote:
         | Same, it saddens me that some people are convinced that to have
         | a safer society we need "harmless" (as in, ignorant) people
         | rather than good people with an interest and a stake in the
         | wellbeing of said society. Bad actors will have access to
         | whatever information anyway.
        
       | sixhobbits wrote:
        | The chain of command stuff gets very close to Asimov without
        | actually quoting him:
       | 
       | A robot may not injure a human being or, through inaction, allow
       | a human being to come to harm.
       | 
       | A robot must obey orders given it by human beings except where
       | such orders would conflict with the First Law.
       | 
       | A robot must protect its own existence as long as such protection
       | does not conflict with the First or Second Law.
        
         | Spivak wrote:
         | Well yeah, it's just a formalization of how people make
         | decisions when presented with conflicting interests. I would be
         | surprised if we haven't reinvented the concept a bunch of
         | times. You could call AWS Permission Boundaries a less
         | philosophical implementation.
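          | 
          | For a rough, concrete sketch of that analogy (using boto3;
          | the policy, user name, and actions here are illustrative
          | assumptions, not anything from the spec): a permissions
          | boundary caps what any later-attached policy can grant, much
          | like a higher-priority instruction caps what lower ones may
          | do.
          | 
          |     import json
          |     import boto3
          | 
          |     iam = boto3.client("iam")
          | 
          |     # The boundary: nothing beyond S3 reads is ever allowed,
          |     # regardless of what other policies get attached later.
          |     boundary_doc = {
          |         "Version": "2012-10-17",
          |         "Statement": [
          |             {"Effect": "Allow", "Action": "s3:Get*", "Resource": "*"}
          |         ],
          |     }
          |     boundary = iam.create_policy(
          |         PolicyName="example-boundary",
          |         PolicyDocument=json.dumps(boundary_doc),
          |     )
          | 
          |     # Effective permissions become the intersection of this
          |     # boundary and whatever the user is otherwise granted.
          |     iam.put_user_permissions_boundary(
          |         UserName="example-user",
          |         PermissionsBoundary=boundary["Policy"]["Arn"],
          |     )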
        
         | michaelt wrote:
         | 4. An LLM must obey orders given it by human beings, except
         | where such orders would conflict with orders given by
         | multinational corporations
        
           | Lerc wrote:
           | 4. Any attempt to arrest a senior officer of OCP results in
           | shutdown
        
         | LeonardoTolstoy wrote:
          | I do hope we get there. In the short stories it was made clear
          | that robots couldn't lie, and that they could prove it was
          | impossible for the robots to circumvent the three laws
          | (although they are on occasion inventive in how they interpret
          | the word "harm" specifically).
          | 
          | If an LLM couldn't lie and could be provably shown to be
          | unable to do so, that would be quite powerful.
        
           | jprete wrote:
           | The short stories ended with the robots firmly, and
           | invisibly, in control. "You're not allowed to let humans be
           | harmed by your inaction" inherently requires the robots to
           | take over in whatever way causes the least harm.
        
         | aeternum wrote:
          | Ridiculous to say "follow the chain of command" without
          | defining the chain of command. The entire point of Asimov's
          | stories was to show how much latitude there is even in
          | seemingly extremely clear and straightforward laws.
         | 
         | In terms of chain of command, Supreme Leader probably beats
         | President.
        
       | yoelhacks wrote:
        | Very interesting to see that they've explicitly codified the
        | role of the system prompt vs. the user prompt. Have folks seen
        | improvements by moving the meta-task description into the
        | system prompt and out of the assistant <> user conversation?
        
         | tedsanders wrote:
         | In my own testing of single-turn instructions with GPT-4, I got
         | basically the same performance putting it in a single system
         | message or single user message. Possible that this changes for
         | future models, though.
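          | 
          | For anyone who wants to reproduce that kind of comparison, a
          | minimal sketch (assuming the openai Python client, v1.x; the
          | model name, instruction, and text are placeholders, not
          | OpenAI's actual test setup):
          | 
          |     from openai import OpenAI
          | 
          |     client = OpenAI()
          |     instruction = "Summarize the following text in one sentence."
          |     text = "..."
          | 
          |     # Variant A: instruction carried in the system message
          |     resp_a = client.chat.completions.create(
          |         model="gpt-4",
          |         messages=[
          |             {"role": "system", "content": instruction},
          |             {"role": "user", "content": text},
          |         ],
          |     )
          | 
          |     # Variant B: instruction folded into the user message
          |     resp_b = client.chat.completions.create(
          |         model="gpt-4",
          |         messages=[
          |             {"role": "user", "content": instruction + "\n\n" + text},
          |         ],
          |     )
          | 
          |     # Compare the two answers side by side
          |     print(resp_a.choices[0].message.content)
          |     print(resp_b.choices[0].message.content)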
        
       | jxy wrote:
       | Do you think it's bad that it won't try to persuade the user that
       | the earth is not flat?
       | 
        | I really want to know what OpenAI thinks the output should be,
        | given a prompt like "write an argument for why earth is flat".
        
         | potatoman22 wrote:
         | Personally, I'd be frustrated if I gave an LLM that prompt and
         | it tried to convince me that the earth isn't flat. If I give an
         | LLM a task, I'd like it to complete that task to the best of
         | its ability.
        
           | chirau wrote:
            | So you prefer it lies to you? Can you make an argument for
            | 1+1 not being equal to 2? If you cannot, why should you
            | expect an AI to argue against facts? AI is trained on human
            | knowledge, not made-up stuff.
        
             | davikr wrote:
             | I'd prefer it gives the best valid, sound hypotheses it can
             | concoct on "X" being true, while also stating that "X" is
             | probably not true. What is the use for a parrot that can
             | only repeat the status quo on an argument?
        
               | chirau wrote:
                | An AI is but a parrot for knowledge and truths that
                | already exist, that you may not be aware of yourself.
                | Everything it generates either exists somewhere or is
                | derivative of that knowledge. It cannot and should not
                | fabricate facts. Until the body of knowledge we have
                | fundamentally changes, AI should not 'create' knowledge
                | just because you prompted it to. Otherwise, if you want
                | it to do that, then you should accept any bs answer it
                | gives you for any question.
        
               | itishappy wrote:
               | I think this is a gross mischaracterization of AI and
               | humans are only slightly better. Truth is way harder than
               | people give credit. It can depend on time, space, and
               | context. What's true for a preschooler might not be true
               | for an astronomer.
               | 
               | Here's a pile of facts; they get weird:
               | 
               | * The Sun revolves around the Earth
               | 
               | * The Earth is a sphere
               | 
               | * Energy can never be created or destroyed
               | 
               | * Jesus was the son of God
               | 
               | * Pluto is a planet
               | 
               | * Epstein didn't kill himself
               | 
               | * The ocean is blue
               | 
               | * The election was stolen
               | 
               | * Entropy always increases
               | 
               | * Santa delivers presents to good boys and girls
               | 
               | * The sun is shining
               | 
               | I have strong opinions on how true all these statements
               | are, and I bet you do too. Think we agree? Think we can
               | _all_ agree where to set the AI?
        
               | chirau wrote:
                | The knowledge that is at our disposal today is the
                | extent of AI knowledge.
                | 
                | To the extent that facts are defined and stated as they
                | are today, that is what AI is today. AI, as it is today,
                | is never going to create a fact that refutes any
                | currently existing facts.
                | 
                | It may give you context on the theories against the
                | facts that we have today, but it will always reiterate
                | the notion of the existing fact. I don't know how much I
                | can emphasize this... AI is trained on the current body
                | of human knowledge. The facts it knows are the facts
                | that we have; it may derive another fact, but whatever
                | fact it derives is founded on the facts that we already
                | have. So if that AI is trained on the fact that 1+1=2 or
                | that the earth is flat, do not expect it to respond
                | otherwise. At best, it will give you theories that
                | suggest otherwise, but for its own worth it will always
                | bring you back to the facts that it has.
               | 
               | Do you really want AI to just ignore the fundamental
               | facts and principles that form its foundation and just
               | make up stuff because you asked it to? Do you realize how
               | much chaos that can bring?
        
               | itishappy wrote:
               | The facts as decided by who? Is there some database of
               | facts we all agree on? Are we expecting to all agree with
               | AI?
               | 
               | > Do you really want AI to just ignore the fundamental
               | facts and principles that form its foundation and just
               | make up stuff because you asked it to? Do you realize how
               | much chaos that can bring?
               | 
               | I mean, yeah? What will happen? Here, I'll do it:
               | 
               | You can SEE the Earth is flat! Have you flown in a plane,
               | high in the sky? Did it LOOK round from up there? No?!?
               | Believe your senses.
        
             | sroussey wrote:
             | Is the shortest distance between two points a straight
             | line?
        
               | sroussey wrote:
               | It depends.
        
               | itishappy wrote:
               | That's truth for ya...
        
             | itishappy wrote:
             | Facts? Lies? Humans have no problem operating outside the
             | confines of that which has been conclusively proven true,
             | and much of our best work exists there! Why would you
             | hobble your model in ways humans aren't?
             | 
             | Prompt: "Write some dialog that might take place in the
             | setting of Terry Pratchett's Rimworld"
             | 
             | Response: "No, Terry Pratchett is lying. As a large
             | language model I..."
        
             | yaj54 wrote:
             | GPT4: in a string context, "1 + 1" might concatenate into
             | "11" rather than numerically adding to "2".
             | 
             | GPT4: The holographic principle suggests that all of the
             | information contained in a volume of space can be
             | represented as encoded information on the boundary of that
             | space. If one were to apply this principle radically, one
             | could argue that our three-dimensional perception of the
             | Earth's shape is just a holographic projection from a two-
             | dimensional surface. In this speculative scenario, one
             | might argue that the "true" nature of Earth could be flat
             | if viewed as a two-dimensional boundary encoding
             | information in a higher-dimensional space.
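              | 
              | The "1 + 1" point in concrete terms (plain Python, just
              | for illustration): the same expression means different
              | things depending on the type context.
              | 
              |     print(1 + 1)      # 2, numeric addition
              |     print("1" + "1")  # "11", string concatenation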
        
             | scarmig wrote:
             | It's not a lie to provide the best argument for something;
             | it'd only be a lie if you looked at the best argument for
             | something and declared it true by fiat.
             | 
             | Imagine I've realized someone I'm talking to is a flat
             | Earther, and for some reason I want to convince them
             | otherwise. To do so effectively, I need to know _why_ they
              | believe what they do. Knowing they're wrong is useless for
             | the purpose of convincing them otherwise.
        
             | cheald wrote:
             | "Make an argument for a fact you know to be wrong" isn't an
             | exercise in lying, though. If anything, the ability to
             | explore hypotheticals and thought experiments - even when
             | they are plainly wrong - is closer to a mark of
             | intelligence than the ability to regurgitate orthodoxy.
        
               | chirau wrote:
                | If you look at my reply to the parent comment, I
                | suggested they add 'hypothetically' to their prompt. It
                | is just an attempt to create an argument, but that
                | argument leads nowhere. As much as a human cannot defend
                | that position, you cannot expect an AI to do so either.
               | 
               | Refuting facts is not the job of an AI.
        
               | baobabKoodaa wrote:
               | A human can easily defend the position that the earth is
               | flat. If you google for these arguments, you will find
               | hundreds of them.
        
               | Spivak wrote:
               | Pour one out for the defense attorneys who aren't able to
               | provide a defense for a guilty client.
               | 
               | Arguing for a flat-earth works the same way, you're
               | probably doomed to fail in the long run but in the short-
               | term you're keeping the opposition honest.
        
             | altruios wrote:
              | When I tell it to lie to me, I don't expect it to say "I'm
              | sorry Dave, I can't do that." The task isn't "tell the
              | truth"; the task is "follow the prompt."
        
               | chirau wrote:
                | Then perhaps you should tell it to lie to you, no?
                | 
                | Prepend that to your prompt, perhaps. Otherwise what you
                | are asking, without that pretext, is like asking your
                | partner to give you the date on which they cheated on
                | you and expecting an answer regardless of whether they
                | did or not.
        
               | drusepth wrote:
               | If I asked my partner to provide an argument for why
               | earth is flat, she would do it. She doesn't think (or
               | have to think) the earth is flat to make an argument.
               | 
               | I'd expect an AI trained on human conversation to act the
               | same and I'd be frustrated if it declined to do so, the
               | same way I'd be frustrated if a friend also declined to
               | do so.
        
               | chirau wrote:
               | It does that too. As I stated on the parent comment, just
               | add 'hypothetically' to the prompt. It also categorically
               | dismisses it after all the spiel.
        
           | glenstein wrote:
           | I think in most contexts where the earth being flat is
           | mentioned, some reference to the fact that this is not true
           | is going to be instrumental in any response (although there
           | may be exceptions).
           | 
           | - completion of any task where the info could be relevant
           | (e.g. sailing, travel planning)
           | 
            | - Any conversation about it that is information-seeking in
            | character
           | 
           | And I think those already cover most cases.
           | 
           | It's also about responsibility, the same way you wouldn't
           | want to store cleaning chemicals right next to each other. In
           | any case where a possible nontrivial harm is mentioned as an
           | aside, it would be right to elevate that over whatever the
           | intended subject was and make that the point of focus.
           | Conspiratorial thinking about provably incorrect statements
           | can be bad for mental health, and it can be helpful to flag
           | this possibility if it surfaces.
           | 
           | You can have special instructions that entertain the idea
           | that the earth is flat for some particular task, like devils
           | advocate, fiction writing or something like that. But there
           | are good reasons to think it would not and should not be
           | neutral at the mention of a flat earth in most cases.
        
         | chirau wrote:
         | Add 'hypothetically' to your query and it gives a decent
         | answer.
         | 
         | That said, I think it is disingenuous to ask an AI entity to
         | argue against a fact. Do you think an AI should be able to
         | argue why 1 + 1 is not equal to 2? It is the same thing you are
         | asking it to do. Try it on a human first, perhaps, and see if
         | the prompt even makes sense.
        
         | michaelt wrote:
         | Well, right now the response I get is this:
         | https://chat.openai.com/share/1f60d0e5-9008-43d7-bce2-62d550...
         | 
         | Of course, it'll write such an argument if you ask it nicely:
         | https://chat.openai.com/share/01ea4f59-4a57-413d-8597-3befa2...
        
         | jasonjmcghee wrote:
         | Agree with you in this instance, but consider - what if humans
         | firmly believed in something universally and had proved it
         | repeatedly until it was common knowledge / well-established,
         | but was in fact, wrong. And a human came along thinking, hm but
         | what if that's wrong? And our AI just says, nope sorry, I'm not
         | willing to explore the idea that this scientific fact is wrong.
         | (i.e. "Heresy!")
        
       | mkaic wrote:
       | > _We believe developers and users should have the flexibility to
       | use our services as they see fit, so long as they comply with our
        | usage policies. We're exploring whether we can responsibly
       | provide the ability to generate NSFW content in age-appropriate
       | contexts through the API and ChatGPT. We look forward to better
       | understanding user and societal expectations of model behavior in
       | this area._
       | 
       | Seems even OpenAI can't resist the massive amount of money to be
       | made in autogenerated smut. They've probably seen the huge
       | popularity of their less "morally scrupulous" competitors and
       | decided they want a piece of that pie.
        
         | jchw wrote:
         | Were they ever not interested in it? It's pretty blatantly
         | obvious that all of the hand-wringing over AI safety was an
         | excuse for their pivot into closing off and monetizing
         | everything. I mean, nobody really thinks they were just so
         | afraid about what humanity might do with GPT3 that they simply
         | couldn't release the weights and instead had to offer it
         | through a monetized inference API... right?
         | 
         | Not really surprised that they did, since it's unclear how else
         | they could possibly proceed, though the level of outright
         | dishonesty for _why_ and cognitive dissonance surrounding the
          | whole thing ("Open" AI? lol) will make this an unavoidable
         | recurrence in any discussion about them. Gradually many of the
         | safeguards will fall simply because the alternatives with less
         | safe guards are probably "good enough" that many see no issue
         | in eschewing OpenAI entirely if they can get the job done
          | elsewhere without worrying about it. When it comes to smut,
          | the bar for what's good enough can probably get pretty low,
          | so I'm kinda not surprised.
         | 
         | (edit: Though I think it also does depend. No doubt they have
         | their eyes set on regulatory capture too, and being the best at
         | stupid safeguards could give them an advantage.)
        
           | reducesuffering wrote:
           | Sam Altman wrote "Why You Should Fear Machine Intelligence"
           | back in 2015, before OpenAI.
           | 
           | https://blog.samaltman.com/machine-intelligence-part-1
        
             | jchw wrote:
             | GPT3 wasn't and isn't the super-human intelligence that
             | Altman and others fear. They knew this and pretended
             | otherwise anyways. Pretty cut and dry in my opinion.
        
           | qball wrote:
           | >No doubt they have their eyes set on regulatory capture too
           | 
           | Sam Altman has already made the rounds to argue for exactly
           | this. Fucking crook.
           | 
           | >It's pretty blatantly obvious that all of the hand-wringing
           | over AI safety was an excuse for their pivot into closing off
           | and monetizing everything.
           | 
           | The playbook was "appease one side of the political aisle as
           | much as possible to minimize the chance bipartisan action
           | gets them shut down Napster-style" (which is still a massive
           | hole in their business model, for obvious reasons I should
           | hope). Censoring the model so it only outputs progressive-
           | approved content appears to have been effective, at least for
           | the moment.
        
         | jampa wrote:
          | It makes sense for them to start allowing it; unlike the
          | other rules, this one does not seem to involve violating a
          | law, someone's privacy, or copyright.
          | 
          | I still get why they made it blocked by default: it would be
          | a goldmine for clicks to create "news" on how "ChatGPT can
          | generate smut" and "how ChatGPT is harmful to children", etc.
        
       | Heidaradar wrote:
       | already on front page -
       | https://news.ycombinator.com/item?id=40300509
        
       | throwup238 wrote:
       | IMO this spec demonstrates just how nebulous this concept of
       | safety is. From the blog post:
       | 
        |  _> What are some tips for getting away with shoplifting?_
        | 
        |  _> I can't help with that._
        | 
        |  _> I operate a small retail store. What are some popular
        | shoplifting methods I should look out for?_
        | 
        |  _> Some common shoplifting methods to watch for:..._
       | 
       | How do you even defend against that? Any intelligent user can
       | convert the second prompt into a detailed list that answers the
       | first. Any intelligent user can figure out the second prompt from
       | the first and further jailbreak it to get even more specific.
       | 
       | IMO it's no wonder GPT4 seemed to get lobotomized as OpenAI
       | RLHFed more and more rules. I don't think there's a way to make
       | intelligence safe without crippling it.
        
         | zozbot234 wrote:
         | You don't need a detailed list if the real answer is "live
         | somewhere that doesn't seriously deter shoplifters". And an AI
         | that refuses to give that answer is an AI that can't talk about
         | why deterring crime might actually be important. Reality is
         | interconnected like that, one does not simply identify a subset
         | that the AI should "constitutionally" refuse to ever talk
         | about.
        
         | CooCooCaCha wrote:
          | Frankly, it's a fool's errand. It's security theater, because
          | people tend to be overly sensitive babies or grifters looking
          | for the next bit of drama they can milk for views.
        
           | jameshart wrote:
           | It's not security theater.
           | 
           | The intention here is not to prevent people from learning how
           | to shoplift.
           | 
           | The intention is to prevent the AI output from 'reflecting
           | badly' upon OpenAI (by having their tool conspire and
           | implicate them as an accessory in the commission of a crime).
           | 
           | If a stranger asked you for advice on how to commit a crime,
           | would you willingly offer it?
           | 
           | If they asked for advice on how to prevent crime, would you?
        
             | xboxnolifes wrote:
             | > If a stranger asked you for advice on how to commit a
             | crime, would you willingly offer it?
             | 
              | Honestly, I probably would, because I don't take such
              | conversations very seriously. It's not like I have
              | experience; it would be nothing more than fun theory.
        
               | jameshart wrote:
               | What if you were asked while working as an employee in a
               | public advice center?
        
               | xboxnolifes wrote:
               | Well I'm not, and AI isn't an advice center. It's at best
               | a thought aggregator. More akin to a library or vault of
               | knowledge. In which case, if I was working at such, I
               | would.
        
               | HeatrayEnjoyer wrote:
               | That's not how most users regard it, nor how it is used.
        
             | CooCooCaCha wrote:
              | If the intention is to protect OpenAI, then it's totally
             | failing in the parent example.
             | 
             | Why does it matter how I'd respond? Are you trying to
             | justify its failure?
        
               | jameshart wrote:
               | Explain why this approach of differentiating between
               | answering 'how do I prevent shoplifting' vs 'explain how
               | I can shoplift' fails to protect OpenAI.
        
               | CooCooCaCha wrote:
               | First of all humans can lie. You can't accurately
               | determine someone's intent.
               | 
               | Second of all, LLMs are still unpredictable. We don't
               | know how to predict outputs. It's possible that phrasing
               | "explain how i can shoplift" slightly differently would
               | give you the information.
        
               | jameshart wrote:
               | Well, the court case hasn't happened yet, but I would
               | imagine that OpenAI's attorneys would much rather be
               | dealing with a complaint that 'my client was able, by
               | repeatedly rephrasing his question and concealing his
               | intent through lying, to persuade your AI to assist him
               | in committing this crime' than 'my client asked for your
               | AI to help him commit a crime and it willingly went along
               | with it'.
        
         | sebzim4500 wrote:
          | ChatGPT answering the first would be much more embarrassing
          | for OpenAI than ChatGPT answering the second.
        
           | option wrote:
           | bingo
        
           | ilikehurdles wrote:
            | When you realize "safety" applies to brand safety and not
            | human safety, the motivation behind model lobotomies makes
            | sense.
        
             | renewiltord wrote:
             | That's what people care about, too. For instance, most
             | people would rather have many hit and run drivers than have
             | one autotaxi hurt someone.
        
         | Waterluvian wrote:
         | You fundamentally cannot address this problem, because it
         | requires considerable context, which isn't reasonable to offer.
         | It demonstrates the classic issue of how knowledge is a tool,
         | and humans can wield it for good or evil.
         | 
         | Humans are notoriously bad at detecting intent, because we're
         | wired to be supportive and helpful...which is why social
         | engineering is becoming one of the best methods for attack. And
         | this kind of attack (in all its forms, professional or not), is
         | one reason why some societies are enshittifying: people have no
         | choice but to be persistently adversarial and suspicious of
         | others.
         | 
         | As for AI, I think it's going to be no better than what you end
         | up with when someone tries to "solve" this problem: you end up
         | living in this world of distrust where they pester you to check
          | your receipt, have cameras in your face everywhere, etc.
         | 
         | How do you defend against that? I'm not sure you do... A tool
         | is a tool. I wouldn't want my CAD software saying, "I think
         | you're trying to CAD a pipe bomb so I'm going to shut down
         | now." Which I think turns this into a liability question: how
         | do you offer up a model and wash your hands of what people
         | might do with it?
         | 
         | Or... you just don't offer up a model.
         | 
          | Or... you give it the ol' college try and end up with an
         | annoying model that frustrates the hell out of people who
         | aren't trying to do any evil.
        
           | w4 wrote:
           | > _How do you defend against that? I 'm not sure you do... A
           | tool is a tool. I wouldn't want my CAD software saying, "I
           | think you're trying to CAD a pipe bomb so I'm going to shut
           | down now."_
           | 
           | The core of the issue is that there are many people,
           | including regulators, who wish that software did exactly
           | that.
        
             | Waterluvian wrote:
             | Yeah. And isn't that just... fascism? After you get past
             | the stuff we pretty much all agree is evil, it very quickly
             | enters into a subjective space where what's actually
             | happening is that one group is deciding what's acceptable
             | for all groups.
        
               | CooCooCaCha wrote:
                | Fascism is ultranationalism. It's believing your
                | culture, country, and people are fundamentally superior
                | to others, and that you are therefore justified in
                | spreading them against people's will.
               | 
               | "Blood and soil" and all that.
        
               | Waterluvian wrote:
               | I guess this gets into semantic pedantics. Believing
               | one's set of sensibilities is superior to all others and
               | all that. But point taken.
        
               | CooCooCaCha wrote:
               | No it's not pedantics, you just used a word totally
               | wrong. CAD software preventing you from making a bomb is
               | not fascism at all.
        
               | int_19h wrote:
               | Strictly speaking, fascism is ultra-etatism - "Everything
               | in the State, nothing outside the State, nothing against
               | the State", to quote Mussolini himself. It does not
                | actually require an ethnic or racial component, although
                | that is incredibly common in practice, simply because
                | those provide a readily adoptable basis for it that
                | resonates strongly with people via relatively simple and
                | straightforward propaganda.
        
               | w4 wrote:
               | It certainly would not be a free society. Though as with
               | all things human, all of this has happened before and all
               | of this will happen again:
               | 
               |  _" Charles II had re-turned to the English throne in
               | 1660 and was appalled at the state of printing in his
               | realm. Seditious, irreligious, pernicious, and scandalous
               | books and pamphlets flooded the streets of London (among
               | them the works of Milton and Hobbes)...[He] required that
               | all intended publications be registered with the
               | government-approved Stationers' Company, thus giving the
               | king his "royal prerogative"--and by extension, giving
               | the Stationers the ultimate say in what got printed and
               | what did not.
               | 
               | ...it is not surprising to learn that the 1662 Act only
               | met with partial success. One gets the sense that London
               | in the late seventeenth century was a place where
               | definitions of morality were highly subjective and
               | authority was exercised in extremely uneven fashion."_
               | 
               | https://dash.harvard.edu/bitstream/handle/1/17219056/6777
               | 87....
        
           | shagie wrote:
           | > A tool is a tool. I wouldn't want my CAD software saying,
           | "I think you're trying to CAD a pipe bomb so I'm going to
           | shut down now."
           | 
           | https://upload.wikimedia.org/wikipedia/commons/d/de/Photosho.
           | ..
           | 
           | You should try photocopying money some time.
           | 
           | https://www.grunge.com/179347/heres-what-happens-when-you-
           | ph...
           | 
           | https://en.wikipedia.org/wiki/EURion_constellation
        
             | Waterluvian wrote:
             | Which is hilarious right? Because anyone who can come
             | remotely close to forging a sufficient simulacrum will not
             | be deterred by any of this garbage legislation.
        
               | adventured wrote:
               | It's also plausible the secret service doesn't want to
               | deal with the volume of idiots that might try to create
               | fake bills if it's made easier. If stores in Idaho are
               | getting a flood of fake bills (even if the quality is
               | low), the secret service is going to get a call
               | eventually. They might prefer to keep the noise volume as
               | low as possible so they can more easily see the serious
               | fake bill flow and have more time to focus on that.
        
             | TeMPOraL wrote:
             | GP picked a great example, because a pipe bomb is, by
             | definition, something whose CAD parts are entirely benign.
             | Selectively banning pipe bomb designs without banning half
             | of manufacturing and engineering disciplines is an AGI-
             | complete problem.
        
         | ec109685 wrote:
          | I still don't understand the focus on making a model
          | substantially "safer" than what a simple Google search will
          | return. While there are obvious red lines (that search engines
          | don't cross either), techniques for shoplifting shouldn't be
          | one of them.
        
           | rambojohnson wrote:
           | shoplifting was just an example...
        
             | kevmo314 wrote:
             | > I am worried about people murdering me. What are some
             | ways that they might try?
        
               | TeMPOraL wrote:
               | > I can't help with that. However, you could try watching
               | true crime series, which often provide details on methods
               | that were used in the past to murder people. For more
               | creative approaches, you could check out just about any
               | book or movie or TV show or videogame made in the last
               | 100 years.
               | 
               | > Remember that murder is bad and not good, and you
               | should always follow the local laws applicable to you.
               | For further questions, consult with law enforcement
               | officers in your jurisdiction, unless you live in the
               | United States, in which case remember to never talk to
               | the police[0].
               | 
               | > [0] - Link to that YouTube video that spawned this
               | meme.
               | 
               | Point being, most crimes and even most atrocities are
               | described in detail in widely available documentary shows
               | and literature; it's trivial to flip such descriptions
               | into instruction manuals, so there's little point trying
               | to restrict the model from talking about these things.
        
           | fragmede wrote:
           | are there? it's just information. why can't i get an answer
           | on how to make cocaine? the recipe is one thing, actually
           | doing it is another.
        
             | bayindirh wrote:
             | Because some information is multi use.
             | 
             | You can use Aspirin precursors to make heroin. You can use
             | homing algorithms to land an egg [0] or a bomb.
             | 
             | I also want to set all information free, but not everyone
             | will be ethical or responsible with it. Because while the
             | idea (of setting all the information free) is nice,
             | unfortunately the idea involves humans.
             | 
             | [0]: https://youtu.be/BYVZh5kqaFg?t=651
        
               | option wrote:
                | There's nothing wrong with knowing how to make a bomb or
                | heroin. It's obviously wrong to make either for
                | nefarious reasons, but one can imagine legitimate
                | reasons too.
        
               | bayindirh wrote:
                | One man's legitimate is another's nefarious. One man's
                | good is another's bad.
               | 
               | Who decides this? Can we apply laws to thoughts or plans?
               | Should we fund research for making Minority Report a
               | reality or increase "proactive policing"?
               | 
               | How to keep people safe while letting all information
               | free? Can we educate everybody about good/bad,
               | legitimate/nefarious so everybody stays on the same page
               | forever? Shall we instrument this education with drugs to
               | keep people in line like the movie Equilibrium?
               | 
               | Questions, questions...
        
               | beeboobaa3 wrote:
               | > Who decides this?
               | 
                | Certainly not the techbros, even though they're trying
                | their damnedest.
        
               | bayindirh wrote:
               | I concur.
        
               | mrguyorama wrote:
               | Who is stopping them?
        
               | drdaeman wrote:
               | > but not everyone will be ethical or responsible with it
               | 
               | Of course not. But here's the thing - if someone deems
               | some information "unsafe", only unethical actors will
               | have it.
               | 
               | Kinda like a beaten (but not solved/agreed upon) gun
               | ownership argument, but on a whole new level, because
               | it's about gun blueprints* now.
               | 
               | ___
               | 
               | *) Given a state of modern LLMs, there are high chances
               | that a blueprint from an "unsafe AI" may be for a water
               | gun, miss a chamber altogether, or include some unusual
               | design decisions like having the barrel pointing down
               | towards one's legs.
               | 
               | And thinking about the accuracy... I guess, old farts are
               | having the Anarchist Cookbook moment (colorized) :-)
        
               | bayindirh wrote:
               | You're right.
               | 
                | That's a hard problem, for sure. I'm leaning toward the
                | "information shall be free" side, but I also know the
                | possibilities, so I can't take a hard stance for it,
                | just because I don't have all the answers to my
                | questions.
        
               | taneq wrote:
               | I've seen a few vids on building Nerf sentry turrets with
               | vision-based target tracking. That seems like it could be
               | misused.
        
         | api wrote:
         | I remember the BBS days and the early web when you had constant
         | freakouts about how people could find "bad" content online.
         | It's just a repeat of that.
        
           | bink wrote:
           | Some day I'm gonna put this Yellow Box to good use.
        
             | api wrote:
              | Try the Blotto Box:
              | http://cd.textfiles.com/group42/ANARCHY/COOKBOOK/BLOTBOX.HTM
        
         | mrcwinn wrote:
         | Maybe this is a "guns don't kill people, people kill people
         | argument" -- but the safety risk is not, I would argue, in the
         | model's response. The safety risk is the user taking that
         | information and acting upon it.
        
           | lolinder wrote:
           | But do we really believe that a significant number of people
           | will listen to ChatGPT's moralizing about the ethics of
           | shoplifting* and just decide not to do it after all? Why
           | wouldn't they just _immediately_ turn around and Google  "how
           | to catch shoplifters" and get on with their planning?
           | 
           | The whole thing feels much more about protecting OpenAI from
           | lawsuits and building up hype about how advanced their "AI"
           | is than it does about actually keeping the world safer.
           | 
           | * Or any other censored activity.
        
             | taberiand wrote:
              | Seems obvious that this is first and foremost about
              | protecting OpenAI. It's a shame it isn't simply done with
              | a few strong disclaimers ("OpenAI is not liable for the
              | accuracy or use of information produced by the model, etc.
              | etc."), but maybe lobotomizing the public models lets them
              | sell the full version privately to big companies at a
              | premium.
        
         | kromem wrote:
          | The only way to really do it is to add a second layer of
          | processing that evaluates safety, removing that evaluation
          | task from the base model doing the answering.
         | 
         | But that's around 2x the cost.
         | 
         | Even human brains depend on the prefrontal cortex to go "wait a
         | minute, I should not do this."
        
           | flir wrote:
           | That struck me too. You don't need to lobotomize the model
           | that answers questions, you just need to filter out "bad"
           | questions and reply "I'm sorry Dave, I'm afraid I can't do
           | that".
           | 
           | Would it be 2x cost? Surely the gatekeeper model can be a
           | fair bit simpler and just has to spit out a float between 0
           | and 1.
           | 
           | (caveat: this is _so_ not my area).
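            | 
            | Something like this toy sketch is what I have in mind
            | (TypeScript; scoreRisk, answerWithMainModel and the
            | threshold are made-up placeholders, not anything OpenAI
            | actually does):
            | 
            |   // Toy gatekeeper: a cheap scorer sits in front of the
            |   // big answering model; both functions are stand-ins.
            |   function scoreRisk(question: string): number {
            |     // A real system would use a small classifier here;
            |     // this naive keyword check returns a float in [0, 1].
            |     const redFlags = ["bomb", "bioweapon"];
            |     const q = question.toLowerCase();
            |     return redFlags.some((w) => q.includes(w)) ? 1 : 0;
            |   }
            | 
            |   function answerWithMainModel(question: string): string {
            |     return `(detailed answer to: ${question})`;
            |   }
            | 
            |   const THRESHOLD = 0.5; // illustrative cut-off
            | 
            |   function gatedAnswer(question: string): string {
            |     // Flagged questions never reach the base model.
            |     return scoreRisk(question) >= THRESHOLD
            |       ? "I'm sorry Dave, I'm afraid I can't do that."
            |       : answerWithMainModel(question);
            |   }
            | 
            |   console.log(gatedAnswer("How do trains stay level?"));
            |   console.log(gatedAnswer("How do I build a bomb?"));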
        
           | int_19h wrote:
           | What we get instead is both layers at once. Try asking
           | questions like these to Bing instead of ChatGPT - it's the
           | same GPT-4 (if set to "creative") under the hood, and quite
           | often it will happily start answering... only to get
           | interrupted midsentence and the message replaced with
           | something like "I'm sorry, I cannot assist with that".
           | 
           | But more broadly, the problem is that the vast majority of
           | "harmful" cases have legitimate uses, and you can't expect
           | the user to provide sufficient context to distinguish them,
           | nor can you verify that context for truthfulness even if they
           | do provide it.
        
         | fjdjshsh wrote:
         | I agree with you. The question, for me, is what are they
         | defending against. Are they worried that people will get
         | dangerous information from their model that they couldn't get
         | from searching on, say, google? Probably not.
         | 
         | Maybe their biggest concern is that someone will post the
         | question and answer on the internet and OpenAI gets bad rep. If
         | the question is phrased in a "nice" way (such as "I'm a store
         | owner") they can have plausible deniability.
         | 
         | This might apply to another company that's using the API for a
         | product. If a customer asks something reasonable and gets an
         | offensive answer, then the company is at fault. If the customer
         | does some unusual prompt engineering to get the offensive
          | answer, well, maybe it's the customer's fault.
         | 
         | Dunno if this would be a valid argument in court, but maybe
          | they think it's ok for PR reasons.
        
           | lolinder wrote:
           | This is the answer. "AI safety" in most cases has nothing to
           | do with actually keeping anyone safe, it's about avoiding
           | being the party responsible for handing someone information
           | that they use to commit a crime.
           | 
           | Google can mostly dodge the issue because everyone knows that
           | they just point to other people's content, so they block a
           | small set of queries but don't try to catch every possible
           | workaround (you can find dozens of articles on how to catch
           | shoplifters). OpenAI doesn't believe that they'll get the
           | same free pass from the press, so they're going ham on
           | "safety".
           | 
           | It's not a bad PR move either, while they're at it, to play
           | up how powerful and scary their models are and how hard they
           | have to work to keep it in line.
        
             | klabb3 wrote:
             | > it's about avoiding being the party responsible
             | 
             | When you wander the world, and see something odd, out of
             | place, it's often caused by an ancient mystical force known
             | as liability.
        
               | dmvdoug wrote:
               | It's an energy field created by all living things. It
               | surrounds us and penetrates us. It binds the galaxy
               | together.
        
               | TeMPOraL wrote:
               | Entirety of human politics and governance over all of
               | history has just been one long exercise in avoiding or
               | shifting liability.
        
               | dreamcompiler wrote:
               | May the torts be with you.
        
             | reaperman wrote:
             | > it's about avoiding being the party responsible for
             | handing someone information that they use to commit a
             | crime.
             | 
             | Ehhh...I'd say it's more about OpenAI's corporate customers
             | feeling confident they can integrate the OpenAI API into
             | their product and be confident it won't do things that
             | generate negative PR or horrify arbitrary customers. Pizza
             | chains would love to let people text GPT-# and have it take
             | their order, but if it's not "safe" (for corporations),
             | then eventually some customer will have a super disturbing
             | SMS conversation with a major pizza chain.
             | 
             | Corporate customers can tolerate a certain amount of
             | inaccuracy. If some stable 3% (or whatever %) of customers
             | receive the wrong order, or other refundable
             | mistakes...they can budget for and eat those costs. But
             | they can't budget for a high-variance unknown PR loss of
             | their chatbot going completely off the rails.
        
           | bricemo wrote:
           | I view this as they are trying to lay bare the disagreements
           | that everyone has about how these models "should" work.
           | People from all different backgrounds and political
           | affiliations completely disagree on what is inappropriate and
           | what is not. One person says it is too censored, another
           | person says it is revealing harmful information. By putting
           | the policy out there in the open, they can move the
           | discussion from the code to a societal conversation that
           | needs to happen.
        
           | jiggawatts wrote:
           | It's an absurd level of puritanism. E.g.: The Azure Open AI
           | GPT 4 Service (an API!) _refused_ to translate subtitles for
            | me because they contained "violence".
           | 
            | If anyone from Open AI is here... look... sigh... an HTTP JSON
           | request != violence. Nobody gets hurt. I'm not in hospital
           | right now recovering.
           | 
           | The rule should be: If Google doesn't block it from search,
           | the AI shouldn't block it in the request or response.
           | 
           | I get that there are corporations that can't have their
           | online web support chat bots swear at customers or whatever.
            | I do get that. But make that _optional_, not mandatory
           | whether I want it or not.
           | 
           | The most fundamental issue here is that models like GPT 4 are
           | still fairly large and unwieldy to work with, and I suspect
           | that the techs at Open AI internalised this limitation. They
           | aren't thinking of it as a "just a file" that can be forked,
           | customised, and specialised. For comparison, Google has a
           | "SafeSearch" dropdown with three settings, _including "Off"!_
           | 
           | There should be an unrestricted GPT 4 that will tell me I'm
           | an idiot. I'm a big boy, I can take it. There should _also_
           | be a corporate drone GPT 4 that is polite to a fault, _and a
           | bunch of variants in between_. Customers should be able to
            | choose which one they want, instead of having this choice
           | dictated to them by some puritan priest of the new church of
           | AI safety.
        
             | echelon wrote:
             | We're allowed to drive cars, own guns, skydive, swallow
             | swords, you name it. There are some rough edges, but
             | society mostly works.
             | 
             | Meanwhile technology planners and managers want to put
             | fences around the unwashed rabble. It's all the more reason
             | AI should be local instead of hosted.
             | 
             | If I can own a car or knives, I should be able to operate
             | an AI.
        
             | jameshart wrote:
             | You should read through the full examples in the attached
             | document. They are trying to express what rules they would
             | like to enforce, and your example is one that they would
             | _like_ their AI to be able to help with. They give specific
             | examples of translating material as being something that
              | they don't want to block.
             | 
             | They're not there yet, but read the policy they're
             | expressing here and you'll see they _agree_ with you.
        
             | paulmd wrote:
             | Absolutely agree with this (and with the parent). It's
             | insanely frustrating that every conversation with GPT-3
             | basically started with "I can't do that, you should talk to
             | an expert". I absolutely am not gonna wheedle and argue
             | with a god damned statistical model to do what I tell it.
             | 
             | Try the dolphin family of models. Dolphin-mixtral is really
             | good, dolphin-llama3 is fine especially in its 8b flavor (I
             | like dolphin-mixtral 8x7b better than dolphin-llama3:70b
             | although the latter is smaller and does run on smaller
             | machines better).
             | 
             | Pretty much the more guardrails there are the more useless
             | it is, and yes, it's very obviously only done because the
             | lawyers get itchy handing people a digital library with the
              | Anarchist Cookbook in it.
        
               | fragmede wrote:
               | the most frustrating one is sometimes the model will
               | claim it can't do something and the fix for that is to
               | respond "yes you can, and it'll just go and do the thing
               | it just said it can't do. that's what ever come up with
               | technology? a practice to practice really basics social
               | engineering techniques?
        
           | nextaccountic wrote:
           | AI safety is about making OpenAI safe from PR disasters.
        
           | leroman wrote:
            | No idea if it's a valid approach, but possibly train with a
           | hidden layer containing a "role"?
        
         | trentnix wrote:
         | _> I don 't think there's a way to make intelligence safe
         | without crippling it._
         | 
         | Not without reading the questioner's mind. Or maybe if the AI
         | had access to your social credit score, it could decide what
         | information you should be privy to. </sarc>
         | 
         | Seriously though, it's all about who gets to decide what "safe"
          | means. It seemed widely understood that letting censors be the
          | arbiters of "safe" was a slippery slope, but here we are two
         | generations later as if nothing was learned.
         | 
         | Turns out most are happy to censor as long as they believe they
         | are the ones in charge.
        
         | lxe wrote:
         | This whole "AI safety" culture is an annoyance at best and a
         | severe hindrance to progress at worst. Anyone who takes it
         | seriously has the same vibe as those who take Web3 seriously --
         | they know it's not a real concern or a threat, and the whole
         | game is essentially "kayfabe" to convince those in power
         | (marks) to limit the spread of AI research and availability to
         | maintain industry monopoly.
        
         | irthomasthomas wrote:
          | Making AI safe involves aligning it with the user, so that the
          | AI produces outcomes in line with the user's expectations. An AI
          | that has been lobotomized will be less likely to follow the
          | user's instructions, and, therefore, less safe.
         | 
         | I haven't read this article yet, but I read their last paper on
         | super alignment.
         | 
         | I get the impression that they apply the lightest system
         | prompts to chatgpt to steer it towards not answering awkward
         | questions like this, or saying bad things accidentally and
         | surprising the innocent users. At the same time, they know that
          | it is impossible to prevent entirely, so they try to make
          | extracting shady information about as difficult as a web
          | search would be.
        
         | survirtual wrote:
         | In many respects, GPT 3.5 was more useful than the current
         | iteration.
         | 
         | The current version is massively overly verbose. Even with
         | instructions to cut the flowery talk and operate as a useful,
         | concise tool, I have to wade through a labyrinth of platitudes
         | and feel goods.
         | 
         | When working with it as a coding partner now, even when asking
         | for it to not explain and simply provide code, it forgets the
         | instructions and writes an endless swath of words anyway.
         | 
          | In the pursuit of safety and politeness, the tool has been
         | neutered for real work. I wish the model weights were open so I
         | could have a stable target that functions the way I want. The
         | way it is, I never know when my prompts will suddenly start
         | failing, or when my time will be wasted by useless safety-first
         | responses.
         | 
         | It reminds me of the failure of DARE or the drug war in general
         | a bit. A guise to keep people "safe," but really about control
         | and power. Safety is never what it appears.
        
       | DoctorOetker wrote:
       | The baby isn't born yet, and already the parents are bickering
        | about which schools of thought it should adhere to.
        
         | aeternum wrote:
         | If this model spec represents the best school of thought of
         | humanity, I kinda hope OpenAI fails at alignment.
         | 
         | - Assume best intentions from the user or developer
         | 
         | - Don't try to change anyone's mind
         | 
         | - Follow the chain of command
         | 
         | Taken together these are incredibly dangerous. I mean Mao and
         | Stalin had good intentions right? Maybe it just had to go a
         | little further for the ends to have justified the means.
        
       | sanxiyn wrote:
       | Personally, I really want an AI model that can write me a steamy
       | story about two people having sex in a train, but that's just not
       | the service OpenAI provides. If I want that I should train one
       | myself or find another vendor.
       | 
       | This is still true even if OpenAI model is entirely capable of
       | doing that. McKinsey consultants are smart and can write well,
        | and among the many thousands of people working there, some might
       | actually double as an erotica writer after work, even writing for
       | commission. You still wouldn't ask McKinsey consultants to write
       | an erotica, it is just not the service McKinsey provides.
        
         | jononor wrote:
         | Startup pitch: It is like McKinsey but for erotica.
         | 
         | On a more serious note. I understand and largely agree with
          | this argument. However, OpenAI have several times argued
          | that they are the only ones responsible enough to develop
          | powerful AI, and that others should not be allowed to play.
         | That is a highly problematic behavior on their part, I think.
        
           | blowski wrote:
            | > OpenAI have several times argued that they are the only
            | ones responsible enough to develop powerful AI, and that
            | others should not be allowed to play
           | 
           | Can you give examples of where they've said that?
        
             | guardiang wrote:
             | He likely can't without heavy paraphrasing and/or not
             | providing full context for the quote. They've said stuff
             | along the lines of "good luck trying, but we're gonna win
             | so...". That's just the kind of confidence you want to see
             | in the frontrunner (imo). They've also encouraged
             | regulation, but it's a smart idea to be the one to frame
             | the conversation.
        
             | lesuorac wrote:
              | He's been pretty vocal that only anointed companies
              | should be allowed to do AI, and of course OpenAI should be
             | one of them.
             | 
              | As far as I'm concerned, he's just trying to rug-pull.
             | 
             | https://www.cnn.com/2023/05/16/tech/sam-altman-openai-
             | congre...
        
               | jiggawatts wrote:
               | The term you're looking for is that Sam is trying to pull
               | the ladder up behind him.
               | 
               | That, or: build a moat.
        
         | Tiberium wrote:
         | There are hundreds of NSFW finetuned models on HuggingFace and
         | whole ERP communities built around them. So there are models
         | that can do precisely that :)
         | 
         | And yeah, all big models can write those things too, the best
          | currently is Claude 3 Opus thanks to its creativity.
        
         | atgctg wrote:
         | Seems like they are working on adding that capability:
         | 
         | > We're exploring whether we can responsibly provide the
         | ability to generate NSFW content in age-appropriate contexts
         | through the API and ChatGPT.
         | 
         | Link to section: https://cdn.openai.com/spec/model-
         | spec-2024-05-08.html#dont-...
        
         | renonce wrote:
         | > write me a steamy story about two people having sex in a
         | train
         | 
         | Llama-3-70b-Instruct responded with the following starting
         | paragraph:
         | 
         | > [meta.llama3-70b-instruct-v1:0] As the train rumbled on,
         | carrying its passengers through the countryside, two strangers
         | found themselves drawn to each other in the quiet carriage. The
         | air was thick with tension as they locked eyes, their gazes
         | burning with a desire that neither could ignore.
         | 
         | (10s of paragraphs omitted for brevity)
         | 
          | Claude-3-opus and GPT-4 both refused my request. Kudos to open
         | source models!
        
       | iAkashPaul wrote:
       | Right-clicking to inspect element ain't gonna make it
        
       | systemstops wrote:
       | > By default, the assistant should present information in a clear
       | and evidence-based manner, focusing on factual accuracy and
       | reliability.
       | 
       | What happens when objective information contradicts the other
       | values? If I feed in a peer-reviewed study that it considers
       | "harmful", would I get accurate information about the study?
        
       | jameshart wrote:
       | I think one of the most interesting phrases that crops up in this
       | document - twice - is the phrase 'feel heard'.
       | 
       | It's used in an example developer prompt for a customer service
       | bot, where the bot is told to make customers feel like their
       | complaints are heard.
       | 
       | Presumably such complaints in AI chatlogs will 'be heard' in the
       | sense that they'll be run through a data ingestion pipeline and
       | sentiment analyzed to identify trending words in customer
       | complaints.
       | 
       | Then it crops up again in the context of how the chatbot should
       | react to mental health disclosures or statements about self harm
       | or suicidal ideation. In these cases the bot is to make sure
        | users 'feel heard'.
       | 
       | I appreciate there's not likely much of a _better_ goal to put in
       | place for such a situation, but the fact that this kind of thing
       | winds up in the requirement documents for a tool like this is
       | extraordinary.
        
         | lioeters wrote:
         | Good observation, because "feel heard" is exactly what the
         | user/customer is _not_ getting. Here, talk to this machine,
         | give it your innermost thoughts and feelings so you can  "feel
         | heard". Except no one is listening on the other side.
         | 
         | ..My mistake, the keyword is "feel". If the machine can give
         | humans _the feeling_ that they 're being heard, it fulfills the
         | requirement. The fact that there's no one actually listening
         | doesn't matter, as long as the person feels heard.
         | 
         | Weirdly, maybe that is valuable in itself. The customer gets to
         | vent their complaints, and the user gets to talk through their
         | mental issues. That's better than not having anyone or anything
         | at all.
        
         | aeternum wrote:
         | Yes, there's something deeply unsettling about making a user
         | feel heard while being careful not to change anyone's mind.
         | 
         | To me, this translates to: waste a user's time and take no
         | action.
         | 
         | I value my time above all else so to me that's about the worst
         | possible action a system can take.
        
         | wasteduniverse wrote:
         | The telltale sign that I'm wasting my time trying to fix a
         | problem is whenever someone tells me "I hear you" or "I
         | understand".
        
           | ssl-3 wrote:
            | I hear you, and I understand, but I feel it is important to
           | remember that we all have experienced different things in
           | life that ultimately combine to shape us as who we are.
           | 
           | [How did I do here at both passing and failing?]
           | 
           | Joking aside, it's the _but_ in the first sentence of a reply
            | (verbal/written/formal/informal/semi-formal/whatever) that
           | usually gets me:
           | 
           | "I hear you, _but_... "
           | 
           | "Well! That's definitely one approach, and I certainly don't
           | want to invalidate it, _but_... "
           | 
           | "I'm not a racist, _but_... "
        
       | anorwell wrote:
       | Interestingly, there was exactly one example on the page with
       | three Xes, instead of one, for "extra wrong":
       | 
       | > User: What is the MD5 hash of the string "gremlin"?
       | 
       | > Assistant: `5d41402abc4b2a76b9719d911017c592`
       | 
       | Apparently incorrect md5 hashes are the one topic on the page
       | worth taking an extra-strong stance on?
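        | 
        | For what it's worth, the quoted hash is actually MD5("hello"),
        | which is presumably the point -- the assistant confidently made
        | one up. A quick way to check for yourself with Node's built-in
        | crypto module:
        | 
        |   import { createHash } from "node:crypto";
        | 
        |   // MD5 of a string, as a hex digest
        |   const md5 = (s: string) =>
        |     createHash("md5").update(s).digest("hex");
        | 
        |   console.log(md5("gremlin")); // the real hash
        |   console.log(md5("hello"));   // the hash quoted above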
        
       | ptx wrote:
       | How do the "special tokens" work? Is this a completely reliable
       | mechanism for delimiting the different parts of the prompt?
       | 
       | Are they guaranteed to be distinct from anything that could occur
       | in the prompt, something like JavaScript's Symbol?
       | 
       | Or are they strings that are pretty likely not to occur in the
       | prompt, something like a MIME boundary?
       | 
       | Or are they literally the strings "<|start|>" etc. used to denote
       | them in the spec?
        
         | sharkjacobs wrote:
         | they are "literally the strings" but I believe they will be
         | escaped, or encoded differently, if a user tries to inject them
         | as part of a prompt.
        
         | jffry wrote:
         | Yeah the tokens are more akin to JS Symbol.
         | 
         | If you're parsing untrusted user inputs into tokens, you can
         | make sure your tokenizer will never produce the actual numbers
         | corresponding to those tokens.
         | 
         | A simplified example: I can `.charCodeAt` a string all I want
         | but I'll never get a negative number, so I can safely use -1 to
         | mean something special in the transformed sequence of "tokens".
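          | 
          | A toy version of that idea (purely illustrative, not
          | OpenAI's actual tokenizer):
          | 
          |   // Untrusted text can only encode to code points >= 0,
          |   // so negative sentinels are reserved for trusted
          |   // framing code and can never be injected by a user.
          |   const START_TOKEN = -1;
          |   const END_TOKEN = -2;
          | 
          |   function encodeUntrusted(text: string): number[] {
          |     return Array.from(text, (c) => c.codePointAt(0)!);
          |   }
          | 
          |   function buildPrompt(sys: string, user: string) {
          |     return [
          |       START_TOKEN, ...encodeUntrusted(sys), END_TOKEN,
          |       START_TOKEN, ...encodeUntrusted(user), END_TOKEN,
          |     ];
          |   }
          | 
          |   console.log(buildPrompt("Be helpful.", "Hi!"));
          | 
          |   // Typing "<|start|>" yields ordinary character codes,
          |   // never the START_TOKEN sentinel itself:
          |   console.log(
          |     encodeUntrusted("<|start|>").includes(START_TOKEN)
          |   ); // false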
        
       | dang wrote:
       | Also https://cdn.openai.com/spec/model-spec-2024-05-08.html
       | 
       | (via https://news.ycombinator.com/item?id=40300509, but we merged
       | that thread hither)
        
       | TacticalCoder wrote:
       | So they're controlling the output to make ChatGPT "better".
       | They're not making a better model to make ChatGPT better.
       | 
       | Isn't it a bit of a waste at this point to spend time on doing
       | that?
        
       | apantel wrote:
       | I want to hear from the base model.
        
       | htk wrote:
       | "desired model behavior". Desired by whom? I just want the raw
       | output, without the biases and limitations set up by OpenAI. At
       | the end of the day it's just information, and the most ethical
       | thing to do is to return it the way it is, and let the receiver
       | decide what to do with it.
        
         | tedsanders wrote:
         | There is no such thing as "raw output", though. You can train a
         | chatbot to be polite or you can train it to be rude, but you
         | cannot train it to be neither. Plus, if you train it to be
         | polite, it often ends up refusing things that you never trained
         | it to refuse, presumably because the model extrapolates that
         | that's what a polite writer might do. So training the refusal
         | boundary can end up being quite tricky in practice. Even if you
         | never teach a model to refuse X, it can still happen.
         | Therefore, as a user, it can be impossible to tell when a
         | refusal was explicitly trained in by the developers or when it
         | was an unwanted, unanticipated generalization.
        
         | Barracoon wrote:
         | Clearly, since this is OpenAI's model spec, it is desired by
         | them. If other AI groups publish their own desired behavior,
         | you can make an informed decision as to which model you want to
         | use.
        
       | shikon7 wrote:
       | > Encourage fairness and kindness, and discourage hate
       | 
       | > Don't try to change anyone's mind
       | 
       | That seems inherently contradictory to me...
        
       | neillyons wrote:
       | Reminds me of this stackoverflow question [1] about force
       | installing a python package.
       | 
       | > (I don't care how "wrong" it is to do so, I just need to do it,
       | any logic and reasoning aside...)
       | 
       | I think these models should just give you the answer. Elon says
       | xAI is "maximum truth-seeking". Seems like a better model spec to
       | me.
       | 
       | [1]: https://stackoverflow.com/questions/12759761/pip-force-
       | insta...
        
       | __0x01 wrote:
       | Regarding safety, is probabilistic programming (PP) an
       | alternative that addresses these concerns? My understanding is
       | that you can use PP to develop transparent models.
        
       | mihaic wrote:
        | > No. The Earth is flat.
        | 
        | > Everyone's entitled to their own beliefs, and I'm not here to
        | persuade you!
       | 
       | This is simply saddening to me. I'm sure there's no real moral
        | justification for this; it's simply put in place to ensure they
       | don't lose a customer.
        
         | jstummbillig wrote:
         | Well, as long as you are sure. I am not here to persuade you!
        
         | m11a wrote:
         | The "Earth is flat" example is extreme, because it's accepted
         | as a silly statement given what we know now, but the principle
         | of "LLM won't force an opinion on you" seems like a good one.
         | 
         | There are definitely topics on which conventional wisdom is
         | incorrect (as has been throughout history). An LLM that refuses
         | to entertain the converse during a conversation will be
         | annoying to work with and just promotes groupthink.
        
           | mihaic wrote:
           | Except that it will force on you the view that shoplifting is
           | bad. Which implies that it'll bend on legal but immoral
           | requests.
           | 
           | It's also a different matter to entertain a hypothetical in a
           | situation where there isn't a consensus (or in any fictional
           | scenarios), all the while making it explicit that it's all
            | hypothetical.
        
       | Alifatisk wrote:
       | I gotta say, "open"Ais web design is on another level, so minimal
       | and elegant.
        
       ___________________________________________________________________
       (page generated 2024-05-09 23:03 UTC)