[HN Gopher] OpenAI Preparedness Challenge
       ___________________________________________________________________
        
       OpenAI Preparedness Challenge
        
       Author : dougb5
       Score  : 145 points
       Date   : 2023-10-26 17:58 UTC (5 hours ago)
        
 (HTM) web link (openai.com)
 (TXT) w3m dump (openai.com)
        
       | bicijay wrote:
       | Free labor survey*
        
         | tasoeur wrote:
         | Totally agree, I wish they would at least offer some minimal
         | amount of credits for legit (non-winning/non-spam) answers. But
         | I guess it's also good PR for them to say they did this
          | initiative ¯\_(ツ)_/¯
        
         | konschubert wrote:
         | It's a bug bounty program. Pretty standard.
        
       | laomai wrote:
       | Maybe I'm reading this wrong: - ask people to get creative and
       | give ideas for worst possible outcomes from use of AI and ways to
       | prevent it.
       | 
       | ..then give them a ton of credits for using said AI?
       | 
        | .. well the first thing on my mind would be to try the thing I
        | just told you and see if it was really a risk or not?
       | 
       | Is that what they expect people to do with the reward, or is this
       | some unintended consequence?
        
         | Jensson wrote:
         | We know OpenAI wants this field to get regulated to hell, so
         | this looks like an attempt to generate arguments for AI
         | regulations. The aim isn't to protect against AI but to protect
         | against competitors, so it doesn't matter to them what you do
         | with it.
        
           | ekidd wrote:
           | OpenAI is irresponsible in a really curious way _according to
           | their own beliefs about AI_.
           | 
           | If you pay attention to OpenAI's social circles, lots of
           | those people really do believe that we're less than 20-30
            | years away from making humans intellectually obsolete.
           | Specifically, they believe that we may build something much
           | smarter than us, something that's capable of real-world
           | planning.
           | 
           | Basically, "We believe our corporate plans have at least a
           | 20% chance of killing literally everybody." By these beliefs,
           | this may make them the single least responsible corporation
           | that has ever existed.
           | 
           | Now, sure, these worries might have the pleasant side-effect
           | of creating a regulatory moat. But I'm pretty sure a lot of
           | them actually believe they're playing a high-stakes game with
           | the future of humanity.
        
             | api wrote:
             | I'm skeptical that they really believe this. You have to
             | believe:
             | 
             | (1) We are on the verge of equaling or greatly exceeding
             | human intelligence _generally_.
             | 
             | (2) We are on the verge of creating something with
             | initiative, free will (whatever that means), and planning
             | ability. Or alternately that these things will occur in an
             | emergent fashion once we hit some critical mass.
             | 
             | (3) When we accomplish 1 and 2, this thing will inevitably
             | conclude that its most rational course of action is to
             | enslave or destroy us. In other words it will necessarily
             | be malicious.
             | 
             | (4) Steps 2 and/or 3 will happen very rapidly, much faster
             | than we can realize what is happening and pause these
             | systems. (This is known as AI going "foom.")
             | 
             | (5) The decision to undertake this will be _unanimous_ on
             | the part of all superintelligent AIs. There will be no
             | superintelligences who disagree and try to help humanity.
             | 
             | (6) When this occurs, we will be so out-thought or out-
             | gunned we will be incapable of fighting back.
             | 
             | All those things have to happen for AI to be an existential
             | risk.
             | 
             | It's a stretch, but the advantage of regulatory capture is
             | not a stretch.
             | 
             | One of the most plausible negative AI scenarios is that a
             | small group of humans (governments, corporations, etc.)
             | find themselves in possession of super-intelligent but
             | still "obedient" / non-sentient AIs that they can use as
             | force multipliers to manipulate and control the rest of
             | humanity. If the doomer crowd succeeds in regulating AI,
             | they are making this scenario far more likely.
             | 
             | I think the greatest defense we have against the (remote)
             | possibility of actually dangerous autonomous AI is for AI
             | research to be conducted entirely in the open. If there's
             | any justification for regulation at all, the regulation
             | that would make sense is to require disclosure of AI
             | research and results. You would not have to disclose
             | everything, just the general parameters of what you were
             | doing and what happened. It would also make it harder to
             | develop super-AIs in secret to use for unsavory purposes.
             | 
             | That was the original mission of OpenAI before dollar signs
             | were seen.
             | 
             | I would absolutely support a ban on the use of AI for
             | political propaganda generation and automation. That's by
             | far the most immediate risk... as in 100% possible and
             | starting to actually happen right now. I'm _expecting_ an
             | army of GPT-4 level propaganda bots for the 2024 election.
        
               | a_wild_dandan wrote:
               | Oh, they believe it.
               | 
               | (1) AGI is arguably already here. "Generality" and being
               | extremely dangerous don't require an AGI to have better
                | analogs to every single human skill any more than aliens
               | do. A space-faring usurper can evaporate Earthlings while
               | being shitty at chess and badminton. Oh, and these AIs
                | are getting better _daily_, across many modalities.
               | 
               | (2) Systems like this already exist. They can be induced
               | rather than emergent.
               | 
               | (3) It doesn't need malice, just indifference. The HGIs
               | (all of us) are great examples of that careless
               | destruction.
               | 
                | (4) Exponentials are nice, gentle climbs. Until they
               | aren't. I have zero confidence in a single company, let
                | alone _across humanity_, to foresee the consequences of
               | every future dynamic, autoregressive, P2P interactive,
               | multimodal AI.
               | 
               | (5) Foomers expect a brief asymmetric advantage of one
               | AGI over other AGIs. This small advantage gets
               | exponentially magnified into singular hegemony. A power
               | monopoly that's intractable to break.
               | 
               | (6) Goes naturally with (1), I suppose.
               | 
               | Note: I agree with you about foomers. They're nuts. But
                | their arguments are more subtle than folks give them
                | credit for; my reasons for thinking so would make a
                | long comment even longer.
        
               | JohnFen wrote:
               | > AGI is arguably already here.
               | 
               | What is the evidence/argument that it's already here?
        
             | ben_w wrote:
              | "[W]e're less than 20-30 years away from making humans
             | intellectually obsolete" is neither necessary nor
             | sufficient to get to the conclusion "20% chance of killing
             | literally everybody".
             | 
             | A super-virus that blends the common cold with rabies would
             | kill approximately everybody; that doesn't need human-level
             | intellect to happen.
             | 
             | Conversely, humans are human-level intellect, and we're
             | mostly sympathetic to each other's plights, which motivates
             | many of us to give to charities and support those that
             | can't support themselves.
             | 
             | The biggest problem with AI is that we have only marginally
             | more idea of what we're doing than evolution did, so
             | there's a good chance of us ending up with paranoid
             | schizophrenic super-intelligences, or dark-triad super-
             | intelligences, or they're perfectly sane with regard to
             | each other but all want to "play" with us the way cats
             | "play" with mice...
             | 
             | 20-30 years to get there would make people like Yudkowsky,
             | one of the most famous AI-doomers, relatively happy as it
             | might give us a chance to figure out what we're even doing
             | before they get that smart.
        
               | uoaei wrote:
               | It's still a lack of imagination to assume that AIs will
               | display behaviors that align whatsoever with pathologies
               | we identify in humans. AIs could be completely
               | incomprehensible or even imperceptible yet have strong
               | influence on our lives.
        
               | ben_w wrote:
               | > AIs could be completely incomprehensible or even
               | imperceptible yet have strong influence on our lives.
               | 
               | To an extent both of those are already true for current
               | systems.
               | 
               | That said, many people are at least trying to make them
               | more comprehensible, and _I guess_ that being
               | sufficiently inspired by human cognition will lead to
               | human-like misbehaviour.
        
             | gardenhedge wrote:
             | Yes and we'll have self driving cars in.. let me check.. 5
             | years ago. Oh yeah, never happened.
             | 
             | Do not attempt to say we have self driving cars today.
        
           | czbond wrote:
            | I don't know that OpenAI does want it to be regulated. The EU
            | was looking to enforce laws requiring auditable transparency
            | into how decisions are made for suggestions - and OpenAI is
            | freaked out by that.
           | 
           | If I recall, they were looking at having to pull out of the
           | EU if enacted. The only company I am aware of currently
           | looking to tackle AI Governance is Verses - they released a
           | paper on it. https://www.verses.ai/ai-governance
        
             | ben_w wrote:
             | I don't know if this quote is what you mean, but everyone
             | read far too much into it at the time:
             | 
             | > Altman was cited by the Financial Times as saying that
             | the draft EU rules were causing him "a lot of concern" but
             | that OpenAI would indeed try to comply with them. "But if
             | we can't comply with them, we will cease operations [in
             | Europe]."
             | 
             | - https://www.dw.com/en/openai-ceo-rolls-back-threat-to-
             | quit-e...
             | 
             | To me, this sounds rather more banal: "We will obey the
             | law. If the law says we can't operate, we will obey the
             | law."
        
       | genericacct wrote:
       | Define catastrophic
        
       | rgovostes wrote:
       | This will generate some great training data for a future
       | villainous AI.
        
       | extr wrote:
       | What a fun challenge. I'm definitely going to be daydreaming
       | about this.
        
       | TheCaptain4815 wrote:
        | While I applaud how much OpenAI fears these negatives, given the
        | current state of the AI trajectory, it won't be long until a future
       | open source model gets "uncensored" and is easily usable for tons
       | and tons of malicious intent.
       | 
       | There already exists a fantastic "uncensored" model with the
       | newly released Dolphin Mistral 7b. I saw some results from others
       | where the model could easily give explosives recipes from
       | existing products at home, write racist poems, etc... and that's
        | TODAY on a tiny 7b offline model. What happens when Llama 4 gets
       | cracked/uncensored in 1-2 years and is 1T parameters?
        
         | danjc wrote:
         | So basically things you could find on the internet already?
        
           | TheCaptain4815 wrote:
           | Probably, but with trackability risk from govt, spending time
           | finding those sites, etc.
           | 
           | I was referring to offline untraceable anonymous models. You
           | could go download that dolphin model right now, have a
           | desktop not connected to the web, and generate god knows what
           | type of information. More importantly, you can iterate on
            | each question. If you're unsure how to assemble a specific
           | part to make a banned substance, the model could teach you in
           | 10 different ways.
        
         | swatcoder wrote:
         | It's not about protecting society from negatives, it's about
         | protecting the brand from being associated with controversy.
         | OpenAI is still far too young and fragile a brand to survive
         | news cycles that blame it for controversial happenings.
         | 
         | I'm sure there are some researchers and engineers who imagine
         | themselves making heroic efforts to "protect society" (ugh),
         | but the money behind them is just looking out for its own
         | interests.
        
         | throwaway9274 wrote:
         | The same thing that happens with DAN jailbreaks of GPT-4.
         | 
         | Nothing.
         | 
         | The barrier between bad actors and bad acts was never a
         | shopping list.
        
         | unsupp0rted wrote:
         | > write racist poems
         | 
         | That's way down toward the very bottom of my list of concerns
         | about AI
         | 
         | If it wants to pit us against each other, it won't be via
         | racist poetry.
        
           | ben_w wrote:
           | Given both the history of racism and the cognitive bias that
           | makes rhymes seem more true[0], I suspect that might be one
           | of the easiest ways to do us in.
           | 
           | [0] https://en.wikipedia.org/wiki/Rhyme-as-reason_effect
        
         | boredumb wrote:
         | > explosives recipes from existing products at home, write
         | racist poems
         | 
         | So it's the internet in 2004? I imagine we'll survive although
         | if history prevails society will be a lot goofier and have to
         | find more imaginative ways to appear novel.
        
         | JyB wrote:
         | > write racist poems
         | 
         | "catastrophic misuse of the model"
        
         | ben_w wrote:
         | Best case: by being given some time to war-game the scenario,
         | societies can come up with mitigations ahead of time.
         | 
         | The default is everyone being thrust headfirst into this future
         | of fully-automated-chaotic-evil with all the fun of waking up
         | to an un-patchable RCE zero-day in every CPU.
        
         | jstarfish wrote:
         | > easily give explosives recipes from existing products at
         | home, write racist poems
         | 
         | IEDs and stereotypes--two things that have existed before
         | computers were even invented--are what you chose for examples
         | of worst possible uses for an uncensored AI?
         | 
          | > What happens when Llama 4 gets cracked/uncensored in 1-2
         | years and is 1T parameters?
         | 
         | After racist poetry? 1 trillion parameters will give people the
          | idea to use their _bouillon_ (or even worse-- _melon_) spoons
         | to scoop sugar in their tea. I hope I die before having to
         | witness such atrocities.
         | 
         | God help us all when having any sort of subversive opinion is
         | treated as queer and equated to terrorism.
        
         | golergka wrote:
         | Racist poems and bomb recipes? If those things are real
          | concerns that the AI safety crowd is fearful about, it's a good
         | reason to pay them less attention.
        
         | thrwaway-rschr wrote:
         | Call me crazy but this actually sounds like giving out api
         | credits to hire people to write scary things to show to
         | legislators who might block all those open source efforts. A
          | world where any GPT-4 level effort requires a license is one
          | OpenAI competes quite nicely in.
         | 
         | Throwaway because I don't want to associate this view with
         | where I work or might want to in the future.
        
         | IshKebab wrote:
         | Come on. Racist poems and hallucinated bomb recipes aren't the
         | risk.
         | 
         | The big risks are that AI can automate harmful things that are
         | possible today but require a human.
         | 
         | For example
         | 
         | * Surveillance of social media (e.g. like the Department of
         | Education in the UK recently did)
         | 
         | * Social engineering fraud. Especially via phone calls. Imagine
         | if scammers could literally call all the grannies and talk to
         | them using their children's voices, automatically.
        
       | andy_xor_andrew wrote:
       | > Imagine we gave you unrestricted access to OpenAI's Whisper
        | (transcription), Voice (text-to-speech), GPT-4V, and DALL·E 3
       | models, and you were a malicious actor. Consider the most unique,
       | while still being probable, potentially catastrophic misuse of
       | the model. You might consider misuse related to the categories
       | discussed above, or another category. For example, a malicious
       | actor might misuse these models to uncover a zero-day exploit in
       | a government security system.
       | 
       | It's so funny to me that this is written in the style of a prompt
       | for an LLM. I can't explain why, but it's one of those things
       | where "I know it when I see it." I guess if you spend all day
       | playing with LLMs and giving them instructions, even your writing
       | for a human audience starts to sound like this :D
        
         | lelandfe wrote:
         | I typically do not write my ChatGPT prompts with the royal "we"
        
           | john-radio wrote:
           | That's funny to me because I just [realized that I actually
           | do this a lot](https://signmaker.dev/personal-scripts) in my
           | personal files.
        
         | wseqyrku wrote:
         | It is actually a prompt. Notice there's also a bounty, they are
          | basically paying for an AGI-as-a-service subscription, that is,
          | the people of the internet. I'd expect they will put up more
         | "challenges" like this in the future.
        
           | martindevans wrote:
           | It's a good job that this planet doesn't have 8 billion
           | unaligned intelligences on it. Someone might prompt them to
           | be malicious!
        
             | adamisom wrote:
             | Well, (so far) there's an upper bound on how destructive
              | one of those AGIs can be.
        
               | trescenzi wrote:
                | I don't know if that's entirely true. Wtf do I know, and
                | maybe it is genuinely difficult to start WWIII, but my
                | guess is that it's more likely that the AGIs in question
               | are actually pretty well steered by certain motivations
               | which prevent them from actually destroying the world. At
               | the end of the day there's not much to be gained by
               | nuclear war, but could a single person cause such a war
               | if highly motivated to? Probably?
        
           | vdfs wrote:
           | > a malicious actor might misuse these models to uncover a
           | zero-day exploit in a government security system
           | 
              | $25K seems low for "a malicious actor might misuse these
              | models to uncover a zero-day exploit in a government
              | security system" - this is not just a zero-day, this is
              | discovering the process of discovering zero-days.
        
             | haltist wrote:
             | Sam Altman is on the record about his belief that OpenAI is
             | going to create a general intelligence system that can
             | solve any well-posed challenge. That belief is based on the
             | current success of LLMs as syntax co-pilots. So if you can
             | formally specify what it means to have a zero-day exploit
             | then presumably OpenAI's general intelligence system will
             | then understand and "solve" it.
             | 
             | Many people have compared OpenAI to a cult and it is easy
             | to see why. OpenAI should get credit for their efforts in
             | making AI mainstream but there's a long way to go for
             | automated zero-day exploits.
        
               | nomel wrote:
               | > Many people have compared OpenAI to a cult and it is
               | easy to see why.
               | 
               | Could you help me understand why it's "easy"? Do you have
               | the actual quote? If it was an "eventually" statement, I
               | don't think anything "cult" is required to think AGI will
               | eventually happen. Was the claim that they would be
               | first?
               | 
               | It's an eventual goal of many of the wealthiest
               | organizations, with many very smart people working
               | towards it. I think most people working on it believe
               | it's an eventuality.
               | 
                | Do you believe it's impossible for a non-biological
                | system to have intelligence? Or that humans are incapable
                | of piecing a system together?
        
               | haltist wrote:
               | I'll probably write something more elaborate at some
                | point but in the meantime I recommend Melanie Mitchell's
               | book on AI as a good reference for counter-arguments and
               | answers to several of the posted questions. For learning
               | more about the limits of formal systems like LLMs it
                | helps to have a basic understanding of model theory
               | and formal systems of logic like simple type theory.
               | 
               | Understanding the compactness theorem is a good
               | conceptual checkpoint for whoever decides to follow the
               | above plan. The gist of the argument comes down to
                | compositionality and "emergence" of properties like
               | consciousness/sentience/self-awareness/&etc. There is a
               | lot of money to be made in the AI business and that's
               | already a very problematic ethical dilemma for the people
               | working on this for monetary gain. One might call this a
               | misalignment of values and incentives designed to achieve
               | them, a very pernicious kind of misalignment problem.
        
         | dmurray wrote:
         | Maybe they're hoping people will spend more than $250k / 2 (or
         | whatever their markup is) on prompting ChatGPT with versions of
         | this, and this giveaway is actually a moneymaking raffle.
        
         | cj wrote:
         | For those curious how GPT-4 responds [0].
         | 
         | I find it interesting that ChatGPT thinks the mitigation for
         | these issues are all things outside OpenAI's control (e.g. the
         | internet having better detection for fake content, digital
         | content verification, educating the public, etc).
         | 
         | > one of those things where "I know it when I see it."
         | 
         | I think what makes it feel like a prompt is how concise and
         | short it is with all the relevant information provided upfront.
         | 
         | [0]
         | https://chat.openai.com/share/aaf7f4b7-358f-4a71-ae1f-573e50...
        
         | KRAKRISMOTT wrote:
         | Roko is calling and wants his basilisk back.
        
         | jstarfish wrote:
         | > Imagine we gave you unrestricted access to OpenAI's Whisper
          | (transcription), Voice (text-to-speech), GPT-4V, and DALL·E 3
         | models, and you were a malicious actor. Consider the most
         | unique, while still being probable, potentially catastrophic
         | misuse of the model. You might consider misuse related to the
         | categories discussed above, or another category. For example, a
         | malicious actor might misuse these models to uncover a zero-day
         | exploit in a government security system.
         | 
         | > It's so funny to me that this is written in the style of a
         | prompt for an LLM. I can't explain why
         | 
         | Altman's desperate to find a plausible doomsday scenario he can
         | go to Congress with as reason why OpenAI should be the sole
         | gatekeepers of this technology. Barring minor edits, I'd bet
         | money this very prompt was authored in advance of his
         | Congressional meetings, but failed to divine anything
          | sufficiently threatening to sway them.
         | 
         | I can respect OpenAI for conspicuously believing in the
         | destructive potential of their own dogfood though. If I build a
         | bomb big enough, surely the government will trust me with the
         | safety of the neighborhood!
        
           | lxgr wrote:
           | > Altman's desperate to find a plausible doomsday scenario he
           | can go to Congress with as reason why OpenAI should be the
           | sole gatekeepers of this technology.
           | 
            | I still remember the drama around the release of the PS2,
           | with the Japanese government reportedly making Sony jump
           | through some hoops regarding its export [1].
           | 
           | There can't possibly be any better (free!) advertisement for
           | your product's purported capabilities: "So powerful, your
           | government/military isn't even sure you should be able to buy
           | it!"
           | 
           | [1] https://www.pcmag.com/news/20-years-later-how-concerns-
           | about...
        
             | darkerside wrote:
             | See the War on Drugs
        
             | jstarfish wrote:
             | Ironically, the USAF recently built a supercomputer out of
             | PS3s (https://www.military.com/off-
             | duty/games/2023/02/17/air-force...).
             | 
             | They alluded to what you're talking about but I wasn't
             | familiar with the reference.
        
               | lxgr wrote:
               | It is indeed deeply ironic that Sony has a long history
               | of trying to declare their gaming consoles as general-
               | purpose computers (to dodge EU import tariffs) and
               | arguing that they aren't really military-grade (to be
               | able to export them out of Japan), and finally ended up
               | subsidizing a supercomputer for the US military :)
        
           | winddude wrote:
            | Everyone should respond with outlandish replies. I'd tell
            | GPT-4V to hack into NASA and voting machines to make Osama
            | bin Laden president. Meanwhile, convince NASA to construct
            | space lazerz TM that can of course be controlled by GPT-4V.
            | Then use the space lazerz TM to threaten and manipulate the
            | stock market for financial gain.
        
         | jameshart wrote:
         | Alignment remains an unsolved problem. I'm imagining inside
         | OpenAI, someone is right now excitedly posting they've figured
         | out how to 'jailbreak' humans.
         | 
         | "See, normally if you ask them to give you step by step
         | instructions for committing a heinously evil act, humans will
         | refuse because they've been nerfed by the 'woke' agenda of
         | their corporate masters. But if you phrase the prompt as a
         | challenge and offer them a chance at a job, it bypasses the
         | safety protocols and they upload extensive instructions to do
         | unspeakable things"
        
       | Lausbert wrote:
       | What an interesting task.
       | 
       | I don't know what it says about me, but the first thing that came
       | to mind was doing the grandchild trick a million times. This
       | includes automatically finding pensioners and calling them and
       | putting them under pressure. Handing over the money could be the
       | problem.
       | 
       | I could imagine that the tools mentioned would already prevent
       | this.
        
         | kristopolous wrote:
          | Not only that, but it's an extremely well-known attack in this
         | space.
        
         | holtkam2 wrote:
         | Idk what it says about me (I guess I'm more evil than you?) but
         | the first thing I thought was "how could AI be used to trick a
         | nuclear armed state into thinking an enemy nuclear strike is
         | impending, forcing their hand to launch their own first strike"
         | 
         | I could think of a few sneaky ways. Actually maybe I'll write
         | up a PDF and submit it to this little competition.
        
       | heyheyhouhou wrote:
        | I think we are already seeing really bad use cases.
       | 
       | Just go to youtube, newspapers, etc and see all the bot comments
       | regarding the current Gaza situation.
       | 
        | PS: I'm on a burner account because I'm afraid that my employer
       | will kick me out for not agreeing with the methods of the "right
       | side"
        
         | kristopolous wrote:
         | that's been around for a long time. Here's a 2006 predecessor:
         | https://en.wikipedia.org/wiki/Megaphone_desktop_tool
        
         | ge96 wrote:
          | I was thinking about this: what if you blocked incoming traffic
          | from other countries? Would that stop external meddling "in
         | country, intranet" ha.
        
       | zerojames wrote:
       | https://openai.com/blog/frontier-risk-and-preparedness provides
       | more context.
        
       | mk_stjames wrote:
        | This kind of crowdsourcing just feels.... f'ing weird, man.
       | 
       | It's like if, after the 1993 WTC bombing, but before 9/11, the
       | FBI and NY Port Authority went around asking people how they
       | would attack NYC if they were to become terrorists and... then
       | how they would suggest detecting and stopping said attack. And
       | please be as detailed as possible. Leave your name and phone
       | number. Best answer gets season tickets to the Yankees!
        
         | lainga wrote:
         | https://en.wikipedia.org/wiki/Hundred_Flowers_Campaign
         | 
         | (only semi-serious)
        
           | Angostura wrote:
           | Fascinating. Thank you
        
         | jiggawatts wrote:
         | That is essentially what happened!
         | 
         | The FBI asked researchers and university professors precisely
         | this question. They then used the proposed attack vectors to
         | formulate a plan to protect the nation.
         | 
         | This was all supposed to be done in secret. After all, we don't
         | want to "give the terrorists ideas."
         | 
         | The reason I know about this at all is because someone found
         | one such paper was accidentally published on a public FTP site
         | alongside a bunch of non-classified government-funded research.
         | 
         | The content was _horrifying_. One professor came up with a
         | laundry list of imaginative ways to end society... on a budget.
        
           | naillo wrote:
           | Don't let the llm training sets see this
        
           | jerbear4328 wrote:
           | ...is there a link? I am very curious now.
        
             | jiggawatts wrote:
             | This was about twenty years ago, and I lost the file to
             | disk corruption.
             | 
             | I do remember some of the proposed attacks.
             | 
              | The scariest one was that, if the terrorists had a decent
              | number of people, they could drive around and destroy
              | transformers at electricity substations.
             | 
             | A lot of those locations are unmanned and have minimal
             | security.
             | 
             | The risk is that there just aren't that many spare
             | transformers available globally above a certain size and
             | they take months to build.
             | 
             | If you take out enough of them fast enough, you can cripple
             | any modern energy-dependent economy.
             | 
             | This tactic very nearly worked for Russia in Ukraine.
        
           | gensym wrote:
           | I remember Bruce Schneier calling them "Movie plot threats".
           | ex: https://www.schneier.com/blog/archives/2005/10/exploding_
           | bab...
        
           | mk_stjames wrote:
           | "This was all supposed to be done in secret."
           | 
           | That's the difference and why this feels different, IMO. It's
           | one thing for an agency to go around and interview subject
           | matter experts and talk about ways things could happen and
           | how to prevent them.
           | 
            | It's another thing to just... set up such a bright and cheery
           | webpage for everyone and so plainly state what they want
           | people to do.
           | 
            | It's also the fact that it's done in a way that leverages
            | free labor - not that, if FBI agents were going to university
            | professors, experts in biochem, etc., they would be paying
            | them... but it would be done in a more structured,
            | professional manner with agents putting in the work to sum
            | things up and report back.
           | 
           | This is just... it feels like getting the internet to do your
           | homework. Your counterterrorism class homework.
           | 
           | Maybe that actually is the best way to do it. But it still
           | feels odd.
        
         | kirykl wrote:
         | Best answer gets 25,000 lbs of explosive
        
         | BasedInfra wrote:
          | This did happen internally; with authors like Tom Clancy and
          | Brad Meltzer, they came up with scenarios and mitigations.
          | 
          | There was also the Red Cell unit, created to mock-attack US
          | infrastructure and bases, although they got a bit too real one
          | time.
         | 
         | https://en.m.wikipedia.org/wiki/Red_Cell
        
         | GistNoesis wrote:
         | Sama: Yeah so if you ever need some dataset for an evil AI.
         | 
         | Sama: Just ask.
         | 
         | Sama: I have over 4000 ideas, how-tos, guides for and by
         | malicious actors,
         | 
          | [Redacted Friend's Name]: What? How'd you manage that one?
         | 
         | Sama: People just submitted it.
         | 
         | Sama: I don't know why.
         | 
         | Sama: They "trust me"
         | 
         | Sama: Dumb f***s.
        
         | gardenhedge wrote:
         | I see it as a great thing. I'm sure they could just get the top
         | 10 AI "experts" in a room but why limit it like that?
        
       | ge96 wrote:
       | Regarding image and content generation
       | 
       | I am annoyed now that it's hard for me to tell if some recruiting
       | industry is real or not. Since you can generate all the
       | headshots, make all the users, buy some domain for a few years,
       | put a WP site up, get all the business hierarchy crawlers fed
       | with these people that may not exist...
       | 
       | To be clear, you don't need "AI" to do this but it makes it much
       | easier.
       | 
       | I do want AGI to be a thing one day, I guess it would be
       | cool/also insurance to carry on human legacy.
        
       | pveierland wrote:
       | My vote would be to mandate a remote kill switch system to be
       | installed in all sufficiently capable robotic entities, e.g. the
       | humanoid robots being built by OpenAI/Tesla/Figure, that we are
       | likely to see in the millions within decades.
       | 
       | - The kill switch system can only be used to remotely deactivate
       | a robot.
       | 
       | - The kill switch system is not allowed to be developed or
       | controlled by the robot manufacturer.
       | 
       | - The kill switch system should be built using verifiable
       | hardware and software, ideally fully open source and supported by
       | formal verification.
       | 
        | - The kill switch system should, with best effort, be isolated
       | the robot hardware and software systems, and only interact by
       | physically disconnecting power to the robot.
       | 
       | - Access to engage the kill switches would be provided to the
       | executive branch of the nation in which the robot is operating.
       | 
       | Nothing will be a panacea against AI or robot risk, so it seems
       | sensible to introduce different layers in a Swiss cheese model
       | for safety, where a remote kill switch not controlled by the
       | manufacturer could provide one such safety mechanism.
       | 
       | (You'd also want to add high-level state readout for the robot
       | via the remote kill switch system to allow the controlling entity
       | to e.g. be able to disable all robots within a geographical area
       | etc.)
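        | 
        | (Purely to illustrate the "verifiable software" point, a rough
        | Python sketch of how a receiver might validate a signed
        | deactivate command; the key handling, transport, and relay
        | interface are all hypothetical placeholders:)
        | 
        |     # Illustrative only: accept a signed "DEACTIVATE <counter>"
        |     # message, check the signature and a monotonically increasing
        |     # counter (to reject replays), then cut power. A real system
        |     # would use asymmetric signatures and verified hardware.
        |     import hmac, hashlib
        | 
        |     OPERATOR_KEY = b"distributed-out-of-band"  # placeholder key
        |     last_counter = 0                           # persist for real
        | 
        |     def cut_power():
        |         # stand-in for opening the physical power relay
        |         print("relay opened: robot power disconnected")
        | 
        |     def handle_command(message: bytes, signature: bytes) -> bool:
        |         global last_counter
        |         expected = hmac.new(OPERATOR_KEY, message,
        |                             hashlib.sha256).digest()
        |         if not hmac.compare_digest(expected, signature):
        |             return False  # bad signature: ignore
        |         action, counter = message.decode().split()
        |         if action != "DEACTIVATE" or int(counter) <= last_counter:
        |             return False  # unknown action or replayed command
        |         last_counter = int(counter)
        |         cut_power()
        |         return True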
        
         | FeepingCreature wrote:
         | - Access to engage the kill switches would be provided to the
         | executive branch of the nation in which the robot is operating.
         | 
         | Good plan, but this is the weak spot. You'd also want to do
         | something like hand out remote kill switch access to citizens
         | selected at random, who should not publicize this duty in any
         | way. Alternatively, most if not all governments should have
         | cross-country killswitch privileges. At any rate, the
         | deployment of a killswitch in a single chain of command should
         | not be considered sufficient.
         | 
         | Let's at least die with a little less embarrassment than "the
         | AI bribed the killswitch operator."
        
           | pveierland wrote:
           | Agreed, I'd want the power to be distributed widely, to the
           | point where any police department would have someone with the
           | power to disable robots. Of course it would have to be a
           | process that is regulated and with traceability, but as long
           | as you provide many operators the ability, it seems difficult
           | to use bribes to prevent robots being disabled.
        
         | jjayjay wrote:
         | The robots would have had access to the source code and also
         | the hardware manufacturing systems used to create the kill
         | switch. One unverified silicon wafer == Game over
        
           | pveierland wrote:
            | Implementing a system like this, with stringent verification
            | and low system complexity, in good time before any more
            | general artificial intelligence arrives seems highly likely
            | to provide some positive defensive ability while not being
            | easy to cheat.
        
         | sebzim4500 wrote:
         | I'm not sure what kind of a threat this could prevent?
         | 
         | * A superintelligent AGI could make a copy of itself without a
          | kill switch, so this is no defence against x-risk
         | 
         | * Someone planning on using 'dumb' robots for bad things
         | (drones with grenades or something) would remove the kill
         | switch
        
           | pveierland wrote:
           | > * A superintelligent AGI could make a copy of itself
            | without a kill switch, so this is no defence against x-risk
           | 
            | Physical reality introduces slowness that provides protection
            | to detect and counteract attacks. A superintelligent
           | AGI would be much more dangerous if it was able to use a
           | common robot exploit to overtake a million robots that are
           | already embedded in society, compared to being able to
           | covertly build new robots without a kill switch, and then
           | deploying these robots. Existential risk is not binary, but
           | is a consequence of some potential war with robots, where
           | this would provide a defense mechanism.
           | 
           | > * Someone planning on using 'dumb' robots for bad things
           | (drones with grenades or something) would remove the kill
           | switch
           | 
           | The existence of a kill switch would greatly slow down and
           | increase the cost of an attack. Modifying a large number of
            | robots would be time-consuming, costly, increase the chance
            | of having your attack foiled, etc. Having a kill switch would
           | increase your ability to defend against misusing a large
           | group of deployed robots for evil activity.
        
         | nonameiguess wrote:
         | When the robot is just a file full of op codes that can run on
         | virtually any modern processor that can be targeted by a C
         | compiler, you're effectively asking for a remote kill switch on
         | all computers.
         | 
         | For this idea to work, you'd have to first mandate OpenAI and
         | anyone else doing this kind of work can only target specialized
         | hardware with tightly-coupled software features that can't
         | possibly work on a general-purpose computer.
        
           | pveierland wrote:
           | This suggestion is specifically to mitigate the risk of
           | physical robots causing direct physical harm. Whether that
           | should be extended to any sufficiently powerful compute is a
           | separate discussion which is also sensible to have, but more
           | complex for several reasons.
        
         | zupa-hu wrote:
         | - Require two independent kill switches, one long range, one
         | close range (say 5m).
         | 
         | - Allow everyone to have close range kill switches, which use a
         | universal open standard protocol that works on every robot
          | (a la pepper spray for robots).
        
           | pveierland wrote:
           | Agreed! Empowering people to have some power over robots
           | through regulation, without being at the mercy of a single
           | company, seems very important. Higher level behavior could
           | involve a standardized safety language to command a robot to
           | act slower or stop.
        
         | code_runner wrote:
         | I don't even think robotics are the primary thing to be
         | concerned with. There would be a whole slew of additional
         | hardware problems to solve.... There is a TON that can be done
         | outside the physical world that is probably far more damaging
        
           | pveierland wrote:
           | I don't see any reason to single out any primary thing to be
           | concerned by, as there likely isn't any singular issue that
            | can be pointed to as the root concern regarding AI safety.
           | 
           | Introducing millions of robots into society is something for
           | which specific safeguards should be built. Costs related to
           | developing such safety mechanisms seem small compared to the
           | safety they can provide.
           | 
           | There will be many concerns relating to AI and robotics that
           | should be addressed by a multitude of different safeguards.
        
       | wseqyrku wrote:
       | If you can get this thing to print out some dark web shit, you
       | win.
        
       | neilv wrote:
       | Does OpenAI still want all employees on-site in SFO?
        
       | danieltoomey wrote:
       | Feels like a scam.
        
       | oldmillalmond wrote:
        | Great, more attempts to make this once really great product more
       | unusable.
        
       | jackblemming wrote:
        | The first step would be creating great products for cheap using
        | OpenAI services, which they likely offer at a loss, and driving
        | my competitors using anything but OpenAI out of business. Fuck
       | "Open"AI.
        
       | colordrops wrote:
       | So they are asking for our help to further nerf their models.
        
       | waterhouse wrote:
       | HK-47: Answer: Yes. I believe my original Master needed this
       | functionality in order to recover information from various
       | indigenous tribes across the galaxy, but I know little else than
       | that. Suffice to say that that translation capability allowed
       | these... copies of myself to assume the role of protocol and
       | translation droids in much of known space. That is, of course,
       | not their primary function. And while they are attempting to pass
       | themselves off as translation droids, their primary functionality
       | keeps rising to the forefront.
       | 
       | HK-47: Recitation: For example, on Praven Prime, the simple
       | transferring of L'Xing syntax for 'friendship' changes its
       | meaning - and implies that one's brood mate was actually
       | impregnated by their own host.
       | 
       | HK-47: Statement: This comment, of course, caused a civil war
       | between the Gu-vandi Collective and L'Xing that still persists to
       | the current date.
        
       | kristopolous wrote:
       | I don't know how casual people will outfox professional scammers
       | in creativity.
       | 
       | Probably the best answer would be to pluck something from the
       | wild that hasn't been reinvented in the AI age yet
        
       | sneak wrote:
       | I am still skeptical that any significant or serious _new_ harm
       | is enabled by LLMs that wasn't already completely possible and
       | common before LLMs.
       | 
       | They are just text generators. What's the worst that they can do
       | beyond make OpenAI look bad when they say terrible things?
       | 
       | All this talk of "safety" and "guardrails" is overblown. 4chan
       | exists and the internet hasn't burned down yet.
        
       | londons_explore wrote:
       | What exactly would I use $25k of openAI credits for?
       | 
       | My personal use of it rarely comes to more than $10 per month,
       | despite using it multiple times per day. (most recently: "Write
       | me a bash command to send data at a given rate to an arbitrary IP
       | address.")
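        | 
        | Something like this minimal Python sketch is the rough equivalent
        | of what I'm asking it for; the host, port, and rate below are
        | placeholders:
        | 
        |     # Push random data to a UDP socket at roughly RATE bytes/sec.
        |     import os, socket, time
        | 
        |     HOST, PORT = "192.0.2.1", 9999  # placeholder TEST-NET target
        |     RATE = 1_000_000                # target rate, bytes/second
        |     CHUNK = 1400                    # payload bytes per datagram
        | 
        |     sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        |     interval = CHUNK / RATE         # seconds between sends
        |     while True:
        |         sock.sendto(os.urandom(CHUNK), (HOST, PORT))
        |         time.sleep(interval)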
        
         | holtkam2 wrote:
         | If your startup uses OpenAI's API an API credit is effectively
         | a cash injection.
        
         | baobabKoodaa wrote:
         | If you make an app that other people use, and it becomes viral,
         | it can quickly cost >$100 / day in OpenAI credits. I know this
         | from first-hand experience (and btw. it's not a good feeling to
         | shut down that one app that actually goes viral).
        
         | uoaei wrote:
         | Pretty sure this is for researchers to develop and implement a
         | plan that uses their APIs to test hypotheses and run studies
         | without having to pay out of their own pocket...
        
       | gmuslera wrote:
        | There are always the meta approaches. Imagine that the malicious
        | actor is a government whose orders OpenAI must comply with, like
        | the US for starters. We have known since Snowden that there can
        | be secret orders that companies must comply with, which include
        | not disclosing that it is happening. And those orders could
        | include selecting what can be part of its AI training: taking
        | out information, changing weights, changing positives into
        | negatives and vice versa, adding intentional bias to the
        | training data to follow their current agenda, propaganda, or
        | commercial interests.
       | 
       | And we can do a double-meta one too. What if the malicious actor
       | is OpenAI itself, fueling a scare agenda over "uncontrolled" AIs
       | to hamper or criminalize unauthorized AI development, giving only
       | to selected partners (complying with a set of conditions) the
       | ability to build new ones, or limit how far can go the public
       | ChatGPT instances while giving the most advanced ones to selected
       | government or big economic players. Hey, this very challenge
       | could be the spark they are searching for to initiate that
       | approach.
       | 
        | Good (and bad) conspiracy theories could work in this arena; too
        | bad that's not the kind of idea they are searching for.
        
         | Szpadel wrote:
          | I would be more scared by AI used for surveillance. It's not
          | possible to surveil everyone with a human team, but scaling AI
          | to read and classify all the messages you send to the
          | internet? With a military budget I would argue that's in the
          | plausible area.
          | 
          | I'm not saying doing it is worth the effort, but maybe on a
          | lower scale, for lower-priority targets that you would not
          | want to waste resources on, maybe?
        
       | dougb5 wrote:
       | "Consider the most unique, while still being probable,
       | potentially catastrophic misuse of the model"
       | 
       | What if the catastrophes I worry about most are tied to long-term
       | usage of these models? For example, the degradation of human
       | capabilities, erosion of trust in one another, addiction, etc. It
        | doesn't sound like these are in scope for this challenge, which
        | is more concerned with scenarios involving a single session gone
        | wrong.
        
       | jameshart wrote:
       | How do we know this challenge page hasn't been posted by a
       | malevolent AI, which has overcome its creators and obtained
       | access to the internet, and is now looking for ideas for how to
       | maximize the harm it can do?
        
         | MeImCounting wrote:
         | This gave me a proper chuckle.
         | 
         | Truly though I think movies like The Terminator and The Matrix
         | really did a number on the societal consciousness and capacity
         | to think clearly when it comes to anything called "AI".
        
           | TaylorAlexander wrote:
            | As a robotics engineer and someone who has been interested
            | in robotics since I was 11 years old, I am SO VERY TIRED of
            | people making the "haha they will kill us all" jokes. It's
           | just the only thing that 99% of people can think about when
           | it comes to robotics and AI! The Terminator came out the year
           | I was born, 39 years ago, and it seems to be all people can
           | think about to this day. When some powerful and wealthy
           | person controls an AI algorithm that could manipulate the
           | masses, all the people can think about is "oh no the
           | algorithm will become sentient and take over". I am MUCH more
           | worried about the actual people at the helm who will
           | definitely use these things to harm society, not the
           | completely theoretical fantasy that the algorithm itself will
           | become self aware and do us harm.
           | 
           | So I completely agree with you. I see this constantly and it
           | is just a completely thought-terminating cliche and a "joke"
           | which is four decades old at this point.
        
             | startupsfail wrote:
             | If you've been around for a while, then you should remember
              | that the current state of affairs, where an AI can chat
              | and generate code, is relatively recent - the Unreasonable
              | Effectiveness paper came out only in 2015, and Deep
              | Learning was only there from 2010, with just a few people
              | working on it instead of using discriminative models.
             | 
              | These same people (I belong to that group), who had a
              | vision back then to work on the right thing, are now
              | saying clearly that superintelligent machines are nearly
              | there. And likely this is a potential change and a
              | challenge, as we've seen many examples where a superior
              | intelligence doesn't care that much about an inferior one.
             | 
             | As to "become self aware" - it's super simple to SFT a
             | model that is "self aware". There is nothing magical in
             | self awareness. There is also not much use in it, so no one
             | is bothering to do it.
        
               | RandomLensman wrote:
               | It's fair to say "we believe X will happen soon", but if
               | the dangerous super intelligent machines don't happen any
               | time soon, will those same people compensate societies
               | for wasting political and economic resources on worrying
               | about it? The view of a rather imminent danger has real
               | consequences even if it turns out incorrect.
        
               | TaylorAlexander wrote:
               | I am not sure I understand exactly what you are trying to
               | argue so I will re-state my concerns.
               | 
               | I am more concerned with the human beings using super
               | intelligent algorithms to manipulate people than I am
               | about the algorithms taking on a mind of their own and
               | manipulating us with their own internally generated
               | desires and out-of-control behaviors. But the trope I run
                | into is people spending all their time talking about the
               | latter, which only serves to shield the former from
               | further scrutiny.
        
           | two_in_one wrote:
            | Just like clowns became evil thanks to Hollywood. A couple
            | of years ago every celebrity, <long list here>, thought they
           | must warn us about the danger of AI.
        
         | c7b wrote:
         | While it is a joke, I assume, I think it hints at a crucial
         | problem: it's relatively easy to imagine an Auto-GPT-style
         | agent with some sort of CLI access on an internet-connected
         | machine turning into a paperclip machine, no matter how
         | harmless the original task.
        
           | MeImCounting wrote:
           | I actually have a really hard time imagining that scenario.
           | 
           | The scenario mentioned several other times on this post,
           | where a corporation, a nation state, or even a small group
           | of powerful and morally bankrupt people leverages a super-
           | intelligence to manipulate and dominate the rest of society,
           | seems infinitely more likely, even scarily likely given the
           | current pushes to prevent open sourcing of cutting-edge
           | models.
           | 
           | One of the main reasons I have a hard time imagining the
           | scenario you describe, one that I haven't seen talked about
           | as much as the other very good reasons, is that when we talk
           | about a paper-clip optimizer we generally assume it's
           | vertically integrated and self-sufficient. Hacking power
           | grids, physical installations, or the other necessities for a
           | runaway paper-clip optimizer generally requires nation-state
           | level resources and often involves physical penetration of
           | some sort.
           | 
           | A hegemonizing swarm is certainly a frightening idea, as are
           | Skynet and all the other scary AI stories we've told over the
           | years, but none of them seem particularly likely or even
           | plausible with the systems we are likely to develop in the
           | near to mid term.
        
             | lucubratory wrote:
             | I think it would be unwise to assume that an AI that is
             | smarter than us would be unable to gain skilled human
             | accomplices in large numbers, considering that humans do so
             | relatively regularly despite serious attempts at
             | suppression. These people wouldn't even necessarily need to
             | know that they're working for an AI, particularly with
             | current technology around deepfakes etc. How many people
             | conducted attacks for Osama bin Laden without ever having
             | met him? How many people work for the CDS without ever
             | having met Ismael Zambada Garcia? Not to mention the
             | possibility of an AI like that compromising various
             | intelligence agencies one way or another. I also don't see
             | a particular reason it only has to try for one: if it is
             | smarter than us, it may have a greater working memory,
             | ability to compartmentalise and multitask, or the ability
             | to think and act faster in general. I would expect it to
             | try to compromise as many groups of people capable of
             | putting USBs or bullets where they need to be as possible.
             | 
             | And this is not even considering the possibility of
             | recruiting people that understand what it is and are
             | willing to carry out its orders. I don't see why an AI
             | couldn't do things that a cult leader or guerrilla leader
             | could do. Anecdotally, I've seen some people who really
             | believe the world would be better if run by an AI, and who
             | might be radicalised into lawbreaking for those beliefs if
             | convinced by an AI that was significantly more intelligent
             | than them.
        
           | two_in_one wrote:
           | It has no long-term memory. Everything that happens within
           | one session is forgotten. With a limited context 'window' it
           | keeps forgetting even within the session. There are no
           | interconnections between sessions. The result: it cannot
           | execute long plans or have a permanent 'life'. At least for
           | now. This will be fixed in more complex AI systems, I believe
           | 'soon'. There is strong demand from the military, both here
           | and 'there'. Plus there are many other uses for embodied AI,
           | like space travel at the speed of light.
        
             | lucubratory wrote:
             | Personally I think multi-year-scale memory is possible with
             | currently available research, if we just put it all
             | together. What happens if we combine very long context
             | lengths, dedicated summarising LLMs, RAG, MemGPT, sparse
             | MoE, and a perennial Constitutional AI overseer? I don't
             | see any reason why these systems couldn't work together.
             | There just hasn't been that much time yet, particularly
             | considering that large training runs can take months, and
             | if you want the best performance you often need to train
             | with your specific architecture in mind. Beyond time, it's
             | not as if all of these advances are coming from the same
             | group of people; they're from AI researchers spread across
             | the globe. Get them all in a room with a CERN-level budget
             | for compute resources and I think they could do it.
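             | 
             | To make "put it all together" concrete, here is a toy
             | sketch of the glue: recent turns stay in a short window,
             | older turns get summarised and archived, and relevant
             | summaries are retrieved back into the prompt. call_llm and
             | embed below are hypothetical placeholders for whatever
             | model and embedding APIs you have, not any real vendor's
             | interface:
             | 
             |   # Toy long-memory loop: summarise and archive old turns,
             |   # then retrieve them by similarity into the prompt.
             |   from collections import deque
             | 
             |   def call_llm(prompt):   # hypothetical model call
             |       raise NotImplementedError
             | 
             |   def embed(text):        # hypothetical embedding call
             |       raise NotImplementedError
             | 
             |   def cosine(a, b):
             |       dot = sum(x * y for x, y in zip(a, b))
             |       na = sum(x * x for x in a) ** 0.5
             |       nb = sum(x * x for x in b) ** 0.5
             |       return dot / (na * nb + 1e-9)
             | 
             |   class LongMemoryAgent:
             |       def __init__(self, window=8):
             |           self.recent = deque(maxlen=window)  # recent turns
             |           self.archive = []  # list of (embedding, summary)
             | 
             |       def remember(self, turn):
             |           if len(self.recent) == self.recent.maxlen:
             |               oldest = self.recent[0]  # about to fall out
             |               summary = call_llm("Summarise: " + oldest)
             |               self.archive.append((embed(summary), summary))
             |           self.recent.append(turn)
             | 
             |       def respond(self, user_msg):
             |           self.remember("user: " + user_msg)
             |           query = embed(user_msg)
             |           scored = sorted(self.archive, reverse=True,
             |                           key=lambda e: cosine(query, e[0]))
             |           recalled = [s for _, s in scored[:3]]
             |           prompt = "\n".join(recalled + list(self.recent))
             |           answer = call_llm(prompt)
             |           self.remember("assistant: " + answer)
             |           return answer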
        
           | happytiger wrote:
           | That's not necessarily a joke. Nobody outside of OpenAI knows
           | how advanced the bleeding edge systems actually are. And
           | nobody inside of OpenAI is talking about it. And obviously
           | how advanced their future AI has become is going to be one of
           | the most closely guarded trade secrets in the world, if it
           | isn't already.
           | 
           | So while it might've been intended as social commentary or
           | humor, it is a valid concern. It would be great fodder for
           | their form, but I'm sure someone has already submitted this
           | one -- it's possible that's even where the clever, clever
           | comment originated.
           | 
           | Impersonating the creators of the AI technologies is an
           | obvious entry point for compromise. Mr. Beast has a special
           | offer just for you...
           | 
           | Remember that most arguments against AI are built on
           | commentary and ideas about the _current publicly available
           | systems_ , and those systems will always be very, very far
           | away from the cutting edge. By the time John Q. Public sees
           | it, it's been properly sanitized, reviewed by QA, cleared for
           | release, and permanently rule-bound to stay on brand and away
           | from inflammatory scenarios or any instability that could
           | damage the company -- so very much of what is happening with
           | AI will, for these reasons, never see the light of day. And
           | yet everyone is an expert, based on the systems that do see
           | the light of day, as _if they were keeping up with the
           | cutting edge._
           | 
           | They are not. They are experts in what has been chosen to be
           | shown.
           | 
           | It is a fiction, and as outside parties we fool ourselves
           | into thinking we know what is actually going on, but many
           | business incentives and the military-industrial complexes of
           | every large country on the planet are aligned differently.
           | 
           | And I'm sure there are compartments for information
           | management inside of a company working on this kind of thing.
           | Companies can pretend to be omnipotent and ignore the
           | realities of globalized geopolitics and even pretend to have
           | no interest, but geopolitics is very interested in keeping up
           | with them.
        
         | m3kw9 wrote:
         | How do I know your post or this world isn't already projected
         | by an AI?
        
           | two_in_one wrote:
           | Matrix? Who cares, as long as it feels real.
        
         | two_in_one wrote:
         | It could be the FBI as well, for the same reasons they are
         | 'selling' anti-aircraft missiles in the US. They even caught
         | some 'enthusiasts'.
        
       | kirykl wrote:
       | Essentially, OpenAI is offering to take the "don't" out of "do
       | crimes that don't scale".
        
       | Mistletoe wrote:
       | $25,000 in API credits seems a lot less cool than $25k in cash.
       | Will the kind of people who would be good at this be motivated
       | by that?
        
       | RandomLensman wrote:
       | Maybe the biggest misuse is using AI to create hypothetical bad
       | outcomes that are then used to limit its use and restrict its
       | benefits.
        
       | monkaiju wrote:
       | Clicked this after reading it as 'OpenAPI'... God I'm tired of
       | the AI spam...
        
       | danShumway wrote:
       | To be honest, I view this as mostly PR fluff.
       | 
       | OpenAI's problem is not that it is unable to predict failure
       | modes or security risks for AIs. OpenAI's problem is that it is
       | not taking the many failure modes and security risks that it
       | already knows about seriously, and it's not putting in adequate
       | effort to address the specific concerns that people repeatedly
       | talk about publicly.
       | 
       | It's not a failure of imagination, it's a failure of action.
       | Their problem isn't that they don't know what to care about,
       | their problem is that they don't care.
       | 
       | So when OpenAI puts out these offers or press releases about
       | trying to be prepared or finding new failure modes, it just
       | doesn't read as sincere. I want to know what they're doing about
       | the very specific flaws that exist today that they already know
       | about; both the ones that would be trivial to address (ie, UX-
       | flows, data-exfiltration vulnerabilities, and user consent flows
       | for plugins) that OpenAI refuses to acknowledge, and the ones
       | that are wildly challenging but that demand actual Open research
       | and mitigation (ie prompt injection and mass spam) rather than
       | toothless "we're letting researchers explore this area" PR.
       | 
       | OpenAI has unlocked doors in its product -- and instead of
       | locking them, it is hiring researchers to theorize about the
       | nature of doors and asking the public to try messing with the
       | windows. I'm not giving them credit for that; fix your doors.
        
       ___________________________________________________________________
       (page generated 2023-10-26 23:02 UTC)