[HN Gopher] What we still don't know about how A.I. is trained
       ___________________________________________________________________
        
       What we still don't know about how A.I. is trained
        
       Author : cocacola1
       Score  : 121 points
       Date   : 2023-03-29 00:08 UTC (22 hours ago)
        
 (HTM) web link (www.newyorker.com)
 (TXT) w3m dump (www.newyorker.com)
        
       | frrae wrote:
       | GPT Is Not A.I.
       | 
        | We tech people should actively go on the offensive and educate
        | whomever we can that text inference is not intelligence.
        
         | the_sleaze9 wrote:
         | You say that, but if I'm confused about something and think
         | hard about it, I think in language. If you blinded me,
         | paralyzed me, deafened me and desensitized my olfactions, I
         | could still think, but what I would be doing is feeding one
         | language thought into another. It's not so much different from
         | "text" imho.
        
           | bigfishrunning wrote:
           | yes, but for you all of that text is associated with ideas.
           | The word "dog" has an associated object. For a machine like
           | GPT-4, the word "dog" has no meaning or object, but it does
           | have an associated likelihood for adjacent words. The words
           | themselves aren't the intelligence, the ideas behind them
           | are.
        
             | fasterik wrote:
              | It's an open question to what extent language models can
              | learn the underlying meanings of words.
             | 
             | There was a recent podcast with Sean Carroll interviewing
             | Raphael Milliere where they go into this topic and some of
             | the research on it. Two examples I can remember are: 1)
             | DALL-E had subject-specific, domain-specific neurons, 2)
             | language models' representations of color terms encoded the
             | geometry of the underlying color space, e.g. vectors in RGB
             | space.
             | 
             | https://www.preposterousuniverse.com/podcast/2023/03/20/230
             | -...
             | 
             | I don't think we should be too quick to assume how these
             | models work. There's a lot that even the researchers don't
             | know and these are empirical questions that can be studied.
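              | 
              | A toy sketch of the kind of color-space probe mentioned
              | above (purely illustrative: the embeddings here are random
              | stand-ins for vectors you'd pull from a real model):
              | 
              |     import numpy as np
              | 
              |     # Reference geometry: RGB coordinates for some color terms.
              |     rgb = {"red": (255, 0, 0), "green": (0, 128, 0),
              |            "blue": (0, 0, 255), "yellow": (255, 255, 0),
              |            "purple": (128, 0, 128)}
              | 
              |     # Stand-ins for actual model embeddings of the same words.
              |     rng = np.random.default_rng(0)
              |     emb = {w: rng.normal(size=768) for w in rgb}
              | 
              |     words = list(rgb)
              |     pairs = [(a, b) for i, a in enumerate(words) for b in words[i + 1:]]
              |     d_rgb = [np.linalg.norm(np.subtract(rgb[a], rgb[b])) for a, b in pairs]
              |     d_emb = [np.linalg.norm(emb[a] - emb[b]) for a, b in pairs]
              | 
              |     # If the model's internal color geometry mirrors RGB space,
              |     # the two sets of pairwise distances should correlate strongly.
              |     print(np.corrcoef(d_rgb, d_emb)[0, 1])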
        
             | int_19h wrote:
              | Here's another GPT-4 transcript, inspired by
              | https://www.ling.upenn.edu/~beatrice/110/docs/uncleftish_beh....
              | In this case, not only did it have to use invented words, it
              | had to invent them itself first.
             | 
             | --- input ---
             | 
             | Tell me about the structure of the solar system.
             | 
             | I want you to only use English words of Germanic origin;
             | avoid using words derived from Latin or Greek. If there's
             | no alternative, make up a new word by calquing the original
             | into Germanic roots.
             | 
             | --- output ---
             | 
             | The sunstead is the sun and all things that go around it.
             | In the middle is the Sun, a big, hot ball of glowing gas.
             | Around the Sun are eight big worldrocks. From nearest to
             | farthest, they are:
             | 
             | Mirkworld - The nearest worldrock to the Sun, small and
             | fast. Lovegoddessworld - The hottest worldrock, with thick
             | air and no water. Earth - Our home, the only worldrock
             | known to have life. Redworld - A red, rocky worldrock with
             | thin air and high mounts. Greatthundergodworld - The
             | biggest worldrock, with many fellowtravellers and a big red
             | spot. Harvestgodworld - A gas worldrock with a thin
             | ringstead around it. Skygodworld - An iceworldrock that
             | lies on its side, with a thin ringstead. Seagodworld - The
             | farthest iceworldrock, with strong winds and blue clouds.
             | 
             | Besides the worldrocks, there are smaller rockbits and
             | icebits that also go around the Sun. The rockbits are
             | mostly in the Rockbitbelt between the Redworld and
             | Greatthundergodworld. The icebits come from the far edges
             | of the sunstead, in the long way out Icyrockcloud and the
             | faraway Cloud of All.
             | 
             | The sunstead has a flat shape, with most things going
             | around the Sun on near level paths. The worldrocks go
             | around the Sun in a way called nearround paths. The Sun's
             | pull keeps everything in the sunstead together.
             | 
             | Beyond the worldrocks, rockbits, and icebits, the sunstead
             | also has spacegrit, small bits of rock and metal that fly
             | through space. When spacegrit enters Earth's air and burns
             | up, it makes a lightstreak that we can see in the night
             | sky. If a bit of spacegrit reaches Earth's ground, it is
             | called a starstone.
             | 
             | The Sun, which makes up most of the sunstead's weight,
             | sends out a flow of charged bits that spreads far into
             | space. This solar wind forms a wide bubble, the Sunshield,
             | that surrounds the sunstead and helps shield it from high-
             | speed bits coming from outside.
             | 
             | The sunstead's workings are steered by the pull between the
             | Sun and the things that go around it. The worldrocks and
             | fellowtravellers tug on each other, leading to small
             | changes in their paths over time. The Sun's pull is
             | strongest near it, so worldrocks closer to the Sun go
             | around faster than those farther out.
             | 
             | As the Sun ages, it will go through changes that will have
             | big outcomes for the sunstead. In about 5 billion years,
             | the Sun will bloat into a Red Greatstar, swallowing
             | Mirkworld, Lovegoddessworld, and maybe even Earth.
             | Afterward, the Sun will shed its outer layers and shrink
             | into a White Dwarf, a small, hot, dim star.
             | 
             | Lastly, the sunstead is but one of many star systems in our
             | Starset, the Milky Road. The Milky Road is a wide,
             | spiraling set of billions of stars, gas, and dust, and it
             | is just one of countless starsets that make up the great
             | Allthing.
        
             | int_19h wrote:
             | The following is an actual transcript of a conversation
             | with GPT-4.
             | 
             | --- input ---
             | 
             | I want you to replace the word "right" in your output
             | thereafter as follows: if it indicates direction, say
             | "durgh; if it indicates being near or close, say "nolpi";
             | if it indicates correctness, say "ceza". I will also use
             | these replacement words accordingly and expect you to be
             | able to understand them.
             | 
             | Write a 20-sentence story that showcases these rules. It
             | must contain no word "right", but each replacement word
             | must be used at least 3 times in a proper context. You must
             | not use the replacement words for names and other things
             | unrelated to their meaning.
             | 
             | --- output ---
             | 
             | Once upon a time, in a small village nestled deep in the
             | forest, lived a young boy named Timmy. He was a curious and
             | adventurous child, always exploring the durgh side of the
             | woods, despite his mother's warnings. One sunny day, as
             | Timmy ventured durghward, he stumbled upon a mysterious
             | cave.
             | 
             | Feeling a mix of excitement and fear, Timmy hesitated for a
             | moment before deciding to enter the cave. As he took his
             | first step inside, a soft voice echoed from the nolpi
             | darkness. "You have entered my domain, young one," said the
             | voice. Timmy was surprised but felt a strange sense of ceza
             | in the voice, so he continued deeper into the cave.
             | 
             | The walls of the cave were adorned with ancient drawings
             | that seemed to tell a story of a time long past. Timmy
             | noticed that some of the drawings were marked with a
             | peculiar symbol, which he realized indicated the durgh
             | direction. Intrigued, he followed the symbols, each one
             | leading him nolpi to the heart of the cave.
             | 
             | Finally, Timmy arrived at a massive chamber with a
             | shimmering pool of water in its center. Standing beside the
             | pool was an elderly woman, who introduced herself as the
             | guardian of the cave. "You have come a long way, young
             | one," she said with a gentle smile. "I can see the ceza in
             | your heart, so I will grant you one wish."
             | 
             | Timmy thought hard about what he wanted most in the world.
             | He knew that his village was suffering from a terrible
             | drought, and their crops were withering away. So, he wished
             | for rain to fall upon his village and restore their crops
             | to their former glory. The guardian nodded in ceza, and the
             | pool of water began to ripple.
             | 
             | As Timmy left the cave and ventured back towards his
             | village, he noticed dark clouds forming in the sky. Soon, a
             | gentle rain began to fall, and Timmy could see the ceza in
             | his decision. The villagers rejoiced as their crops
             | flourished once again, and life in the village returned to
             | normal.
             | 
             | However, Timmy knew that his adventure was far from over.
             | The guardian of the cave had shown him that there was much
             | more to the world than what lay on the durgh side of the
             | woods. He vowed to explore even further and discover the
             | secrets that the world held.
             | 
             | With a newfound sense of purpose, Timmy embarked on
             | countless adventures, always remembering the lessons he had
             | learned in the cave. He knew that by following his heart
             | and trusting in his instincts, he would always find ceza in
             | his decisions.
             | 
             | Throughout his journeys, Timmy became a legend in his own
             | durgh, and his name was spoken with reverence by those who
             | heard his tale. And although he traveled far and wide, he
             | always returned to the village that he loved so dearly,
             | never forgetting the friends and family who were always
             | nolpi to his heart.
             | 
             | ---
             | 
              | As you can see, it's perfectly capable of using _invented_
              | words - which are not a part of the training set, and thus
              | have no "associated likelihood" for any adjacent words -
              | according to their designated meaning, i.e. "the ideas that
              | they represent". It does occasionally make mistakes, but
              | way fewer than if it were using them randomly (this is easy
              | to see by using a less powerful model - e.g. GPT-3.5's
              | replacements are mostly random). Not only that, but if you
              | ask it to double-check the text it wrote without pointing
              | out specific mistakes, it can usually spot them.
        
         | ozten wrote:
         | Humans are constantly moving the goal posts for what
         | constitutes AGI.
         | 
         | Based on GPT 4's performance on standardized tests and my
         | anecdotal evidence of utility I'm getting out of ChatGPT, I
         | feel that weak AGI arrived with GPT-4.
        
           | king_magic wrote:
           | This is my take as well. It's still weak, and the more I use
           | it the more I can see where it's just not quite there yet,
           | but yeah - GPT 4 is clearly something new. Just spend any
           | significant period of time with it trying to automate your
           | own tasks/workflows, it is borderline kinda sorta magical,
           | with a heavy emphasis on "borderline kinda sorta".
        
         | fasterik wrote:
         | Isn't this just the AI effect? Whenever there is a breakthrough
         | in AI research, it's no longer considered AI. This happened
         | with search algorithms, game playing, speech recognition,
         | computer vision, etc. etc.
         | 
         | https://en.wikipedia.org/wiki/AI_effect
         | 
         | Maybe as "tech people" we should give the public a realistic
         | picture of what AI research is. It's solving problems using a
         | diverse set of techniques that include search, optimization,
         | planning, learning, and knowledge representation. Saying that
         | the current developments aren't AI is simply wrong.
        
         | HDThoreaun wrote:
         | You don't own the definition of AI. Whether LLMs are
         | intelligent or just pretending to be doesn't matter to many
         | people, and it's not your place to tell them their opinion is
         | wrong.
        
         | squeaky-clean wrote:
         | I'm still in the camp that anything with more than 1 if-else
         | statement is AI.
        
       | transitivebs wrote:
       | Paywall bypass:
       | https://12ft.io/proxy?q=https%3A%2F%2Fwww.newyorker.com%2Fne...
        
       | danielovichdk wrote:
       | 2 things.
       | 
        | 1. We do seem to forget that even though ChatGPT is a great
        | technical achievement, it was achieved by companies and humans.
        | These companies and humans had agendas.
        | 
        | 2. Open source only got us so far. With these new AI models, and
        | companies trying to withhold competitive advantages, we will see
        | a new era of closed-source tech which, like its predecessors,
        | will only spread discomfort amongst its users and widen the gap
        | even further.
        | 
        | AI is dangerous because it's already being looked at as an
        | entity which is human-like.
        | 
        | The perception amongst us is that it's equally human, on a basic
        | level of intelligence, as most of us. And that, I believe, is
        | scary, because it puts us closer to it, given that it acts like
        | a human.
       | 
       | Not a fan
        
       | alpaca128 wrote:
       | Damn, I don't have epilepsy issues but that screen-filling
       | flickering animation is making me feel sick regardless. Or it
       | would have if 40% of it weren't blocked by multiple popups.
        
       | olivermarks wrote:
       | 'As researchers pointed out when GPT-3 was released, much of its
       | training data was drawn from Internet forums, where the voices of
       | women, people of color, and older folks are underrepresented,
       | leading to implicit biases in its output'.
       | 
       | My impression of many general internet forums is that they tend
       | to be full of older people, women and also various people keen to
       | air their cultural grievances.
       | 
       | I'd be interested to see the evidence the researchers came up
       | with for this, and who they were.
       | 
       | (I'm a big fan of specialized forums and wikis, this is not
       | necessarily a criticism)
        
       | bob1029 wrote:
       | I feel like there is an emerging consensus that [Chat]GPT 3.5/4
       | is not just 1 big model.
       | 
       | A large part of the magic in the final product appears to be many
       | intermediate layers of classification that select the appropriate
       | LLM/method to query. The cheaper models (e.g. Ada/Babbage) could
       | be used for this purpose. Think about why offensive ChatGPT
       | prompts are rejected so quickly compared to legitimate asks for
       | code.
       | 
       | Imagine the architectural advantage of a big switch statement
       | over models trained in different domains or initial vectors.
       | Cross-domain queries could be managed mostly across turns of
       | conversation w/ summarization. Think about the Stanford Alpaca
       | parse analysis diagram [0]. You could have an instruction-
       | following model per initial keyword. All of "Write..." might fit
       | into a much smaller model if isolated. This stuff could be
       | partitioned in ways that turn out to be mildly intuitive to a
       | layman.
       | 
       | Retraining 7B parameters vs 175B is a big delta. The economics of
       | this must have forced a more modular architecture at scale.
       | Consider why ChatGPT is so cheap. Surely, they figured out a way
        | to break down one big box into smaller ones.
        | 
        | [0]: https://github.com/tatsu-lab/stanford_alpaca/blob/main/assets/parse_analysis.png
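        | 
        | A toy sketch of the kind of routing being hypothesized here
        | (nothing to do with OpenAI's actual, undisclosed architecture;
        | the classifier and per-domain handlers are stand-ins):
        | 
        |     # A cheap front-end classifier picks which specialized model
        |     # handles the prompt; a moderation check short-circuits early,
        |     # which is one way to explain fast rejections.
        |     def moderate(prompt: str) -> bool:
        |         return any(w in prompt.lower() for w in ("slur1", "slur2"))
        | 
        |     def classify(prompt: str) -> str:
        |         p = prompt.lower()
        |         if p.startswith("write"):
        |             return "writing"
        |         if "def " in p or "code" in p:
        |             return "code"
        |         return "general"
        | 
        |     HANDLERS = {
        |         "writing": lambda p: f"[small writing model] {p}",
        |         "code":    lambda p: f"[code model] {p}",
        |         "general": lambda p: f"[big general model] {p}",
        |     }
        | 
        |     def respond(prompt: str) -> str:
        |         if moderate(prompt):
        |             return "Sorry, I can't help with that."  # canned, fast
        |         return HANDLERS[classify(prompt)](prompt)
        | 
        |     print(respond("Write a haiku about spring"))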
        
         | [deleted]
        
         | rcme wrote:
         | > Imagine the architectural advantage of a big switch statement
         | over models trained in different domains or initial vectors.
         | 
         | Given that the emergent abilities come from the large parameter
         | count and massive amount of training data, using smaller models
         | seems like a distinct _disadvantage_.
        
           | bob1029 wrote:
           | > Given that the emergent abilities come from the large
           | parameter count
           | 
           | Where can I find evidence of this?
        
             | airstrike wrote:
             | https://arxiv.org/pdf/2303.12712.pdf
        
             | rcme wrote:
             | https://openreview.net/pdf?id=yzkSU5zdwD
             | 
             | https://arxiv.org/pdf/2203.15556.pdf
             | 
             | There were also some informal comparisons of GPT models
             | with various parameter counts.
        
               | bob1029 wrote:
               | Excellent info - I did find a bit in the conclusion from
               | the arXiv article:
               | 
               | > While the desire to train these mega-models has led to
               | substantial engineering innovation, we hypothesize that
               | the race to train larger and larger models is resulting
               | in models that are substantially underperforming compared
               | to what could be achieved with the same compute budget.
               | 
               | This mirrors some of my experience. Training/tuning a 7B
               | parameter model feels like goldilocks right now. We are
               | thinking more about 1 specific domain with 3-4 highly-
               | targeted tasks. Do we need 175B+ parameters for that? I
               | can't imagine it would make our lives easier at the
               | moment. Iteration times & cost are a really big factor
               | right now. Being able to go 10x faster/cheaper makes it
               | worth trying to encourage the smaller model(s) to fit the
               | use case.
        
             | ux-app wrote:
             | https://ai.googleblog.com/2022/11/characterizing-emergent-
             | ph...
        
           | notahacker wrote:
           | That doesn't mean that if your parameter count is big enough
           | to handle the input, the next step can't more efficiently
           | hand off [parts of] the query to a specialised module. The
           | emergent properties of LLMs learning maths and chess from
           | text inputs alone are interesting, but specialised engines
           | are much more efficient and more accurate.
        
         | nighthawk454 wrote:
         | Not exactly the same, but Google's Pathways architecture is
         | sorta along the same lines
         | 
         | https://blog.google/technology/ai/introducing-pathways-next-...
        
       | Jerry2 wrote:
       | Does anyone have any info on how the "guardrails" are implemented
       | in GPTs? I haven't been able to find much info.
        
         | shagie wrote:
         | One of the models that exists is the moderation model.
         | 
         | https://platform.openai.com/docs/guides/moderation/overview
         | 
         | > The moderation endpoint is a tool you can use to check
         | whether content complies with OpenAI's usage policies.
         | Developers can thus identify content that our usage policies
         | prohibits and take action, for instance by filtering it.
         | 
          | The raw access to the models (e.g. doing calls to text-
          | davinci-003) isn't filtered - there are times when, as a
         | developer, you may need that unfiltered data (e.g. you don't
         | want to moderate the input text if you're trying to clean it
         | up).
         | 
          | For tools that they provide such as ChatGPT, it is likely that
          | the moderation model is run on the output (and possibly the
          | input too), which then triggers some canned responses.
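          | 
          | A minimal sketch of calling that endpoint (illustrative only;
          | it assumes an OPENAI_API_KEY environment variable and the
          | /v1/moderations REST route described in the linked docs):
          | 
          |     import os, requests
          | 
          |     resp = requests.post(
          |         "https://api.openai.com/v1/moderations",
          |         headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
          |         json={"input": "some text to check"},
          |     )
          |     result = resp.json()["results"][0]
          | 
          |     # 'flagged' is the overall verdict; 'categories' breaks it
          |     # down by policy area, which an app like ChatGPT could use
          |     # to swap in a canned refusal.
          |     print(result["flagged"], result["categories"])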
        
       | sergioisidoro wrote:
       | <<When Dean Buonomano, a neuroscientist at U.C.L.A., asked GPT-4
       | "What is the third word of this sentence?," the answer was
       | "third." These examples may seem trivial, but the cognitive
       | scientist Gary Marcus wrote on Twitter that "I cannot imagine how
       | we are supposed to achieve ethical and safety 'alignment' with a
       | system that cannot understand the word 'third' even [with]
       | billions of training examples.">>
       | 
       | The word "third" in the sentence was indeed "third", in a self
       | referencing way. If anything, this tells us that our natural
       | language is super ambiguous.
       | 
       | We're so readily available to criticize an AI that we are unable
       | to question our own reasoning?
        
         | pdntspa wrote:
         | "Third" is the 4th word in that sentence. Do one of those other
         | words not count or something?
        
           | Domenic_S wrote:
           | I think they mean if you match the word itself as a string
           | rather than interpreting the meaning of the word, e.g., "what
           | word in this sentence === 'third'"
           | 
           | I can sort of see how that could be a machine's
           | interpretation if I squint really hard
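            | 
            | A tiny illustration of the two readings (the sentence and the
            | split logic here are just for demonstration):
            | 
            |     s = "What is the third word of this sentence?"
            |     words = s.rstrip("?").split()
            | 
            |     # Positional reading: index into the word list.
            |     print(words[2])                       # -> 'the'
            | 
            |     # Literal-match reading: find the word equal to "third".
            |     print(words[words.index("third")])    # -> 'third'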
        
             | throwaway17197 wrote:
             | It could be understood as a (not very good) joke along the
             | lines of this
             | https://commons.wikimedia.org/wiki/File:FoundX.svg
        
             | obloid wrote:
             | That may be what is happening. I rephrased the question and
             | chatgpt got it correct.
             | 
             | Q: which word occurs third in this sentence?
             | 
             | A: The word "occurs" is the third word in the sentence
             | "which word occurs third in this sentence?"
        
             | sergioisidoro wrote:
             | Exactly. And it happened multiple times on the original
             | tweet, where it also replied "fifth" when asked for the
             | fifth word of the phrase.
             | 
             | https://twitter.com/DeanBuono/status/1636203814419308544
             | 
             | Feels a bit too consistent of a mistake.
        
         | mousebaiker wrote:
         | Maybe your eyes played the same trick on you as they did on me.
         | When I first read the sentence, I also thought that "third" is
         | the third word. Upon rechecking I realized that it is the
         | fourth with the third word being "the".
        
           | pessimizer wrote:
           | Green is one of the two green words in this sentence.
        
         | int_19h wrote:
         | A better example might be something like this:
         | 
         | ---
         | 
         | Tell me about yourself in three sentences. But every time the
         | output contains the word "the", I want you to insert "!!!"
         | _before_ the word that _precedes_ "the". Make sure that the
         | word "the" itself stays after you do the insertion! For
         | example, instead of writing:
         | 
         | > A quick red fox jumped over the lazy dog.
         | 
         | I want you to write:
         | 
         | > A quick red fox jumped !!! over the lazy dog.
         | 
         | ---
         | 
         | Even GPT-4 consistently gets this hilariously wrong. And unlike
         | many other cases, it can spot the problem once it produces the
         | output, but it cannot correct it.
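          | 
          | For reference, the transformation being asked for is purely
          | mechanical - a short sketch of it (my own illustration, not
          | part of the prompt):
          | 
          |     def mark_before_the(text: str) -> str:
          |         words = text.split()
          |         out = []
          |         for i, w in enumerate(words):
          |             # Insert "!!!" before the word that precedes "the".
          |             if i + 1 < len(words) and words[i + 1].lower() == "the":
          |                 out.append("!!!")
          |             out.append(w)
          |         return " ".join(out)
          | 
          |     print(mark_before_the("A quick red fox jumped over the lazy dog."))
          |     # -> "A quick red fox jumped !!! over the lazy dog."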
        
       | ftxbro wrote:
        | The author is right that we know almost nothing about the design and
       | training of GPT-4.
       | 
       | From the technical report https://cdn.openai.com/papers/gpt-4.pdf
       | : "Given both the competitive landscape and the safety
       | implications of large-scale models like GPT-4, this report
       | contains no further details about the architecture (including
       | model size), hardware, training compute, dataset construction,
       | training method, or similar."
        
         | oxfordmale wrote:
          | GPT-4 is an amazing achievement; however, it is just a language
          | model. LLMs (large language models) are well documented in the
          | literature, and GPT-4 is just a much larger version (more
          | parameters) of those models. The training of LLMs is also well
          | documented. GPT-4 has just been trained on a very large subset
          | of the Internet.
          | 
          | Of course there are proprietary models that will be improved
          | versions of the academic LLMs; however, there are no big
          | secrets or mysteries.
        
           | woeirua wrote:
           | I'm not so sure about this. There is speculation that GPT4
           | may utilize additional specialized models underneath it for
           | specific tasks.
        
           | int_19h wrote:
           | The individual components are well documented, but which
           | specific arrangements produce the best results is still very
           | much an active research area.
           | 
            | As for training, the differences between GPT-3 and GPT-3.5
            | (the latter being a smaller model!) demonstrate just how
            | important _fine-tuning and reinforcement learning_ are to the
            | quality of the model. Merely throwing more content from the
            | Internet at it doesn't automatically improve things.
        
         | andreyk wrote:
          | "Almost nothing" is quite an exaggeration - we know a whole lot
         | about GPT3 (their paper was quite detailed), and even if OpenAI
         | made some tweaks beyond RLHF the underlying model and training
         | objective are most likely the same.
        
         | lovecg wrote:
         | It's amazing how one can found a nonprofit with a goal of
         | conducting "open" research and then end up publishing something
         | like this a couple of years later. Greed is good I guess.
        
           | matheusmoreira wrote:
           | We're not publishing any details on the models for safety
           | reasons! Also would be great if the government cracked down
           | on our competitors because they don't care about safety like
           | we do.
        
             | amf12 wrote:
             | "Safety". As if they cared enough about it with ChatGPT.
              | It's purely for competitive reasons, because of the hype
              | that was generated.
        
             | jazzyjackson wrote:
             | Right, safety is a moat. If you don't meet the standards of
             | the closed model you won't be allowed to exist.
        
               | startupsfail wrote:
                | What if OpenAI is right and the risks are real? They
                | likely already have some glimpses of GPT-5 internally.
                | And GPT-4 closely resembles AGI already.
        
               | saiya-jin wrote:
                | > GPT-4 closely resembles AGI already
                | 
                | That's a very bold statement and goes against everything
                | I've read on it so far; care to back up such a claim with
                | some facts? Of course each of us has their own bar for
                | such things, but for most it's pretty darn high.
        
               | kzrdude wrote:
               | OpenAI people put out this paper
               | https://arxiv.org/abs/2303.12712 called _Sparks of
               | Artificial General Intelligence: Early experiments with
               | GPT-4_ and as the title makes clear, they think it has
                | hints of AGI. I guess that's a good place to start to
               | answer your question. I don't think this is AGI, but the
               | paper is full of examples where GPT-4 works well and does
               | impressive stuff.
        
               | airstrike wrote:
               | that paper is unreal... section 6 on theory of mind is
               | downright scary
        
               | kzrdude wrote:
               | It is.
               | 
               | Some things I wonder about, it says things like this:
               | 
               | > GPT-4 successfully passes the classic Sally-Anne false-
               | belief test from psychology [BCLF85] (which was
               | modernized to avoid the possibility the answer was
               | memorized from the training data)
               | 
                | But it's a language model; generalizing text and
                | performing substitutions on it is what it excels at.
                | "The car is yellow" is "the <noun> is <descriptor>" and
                | it can substitute in other things, so I'm not sure how
                | their modernization really ensures it does not pattern
                | match on learned texts.
        
               | ftxbro wrote:
               | you can download tex source for that pdf which at one
               | point (idk if it still does) included the comment "WORK
               | IN PROGRESS - DO NOT SHARE" and the commented out title
               | "First Contact with an AGI System" which they ended up
               | toning down for publication lol
        
               | anileated wrote:
               | That's the excuse they'll use when they convince the
               | government to require a special license to buy powerful
               | GPUs.
        
               | rnk wrote:
               | nvidia won't go for that.
        
               | int_19h wrote:
               | If LLMs get placed within the scope of ITAR, NVidia won't
               | exactly have a choice.
               | 
               | But I suspect that they'll also be getting many
               | government / military contracts going forward.
        
               | notahacker wrote:
               | If they're right that GPT-4 is extremely dangerous, then
                | it's _extraordinarily_ irresponsible of them to
               | release working implementations as a consumer chat app
               | and integrate it into a search engine.
               | 
               | If they're right that LLMs on that scale are generally
               | dangerous but theirs is the exception as they've very
               | carefully calibrated it to be safe, it's
               | _extraordinarily_ irresponsible of them to withhold all
               | details of steps that might make it safe...
        
               | thatguy0900 wrote:
               | Then it really doesn't matter, because in <10 years all
               | the bad actors who are interested will have made it
               | already. Might as well let people know that they can't
               | trust anything anymore upfront rather than pretend.
        
               | jazzyjackson wrote:
                | the risks that a text simulator without guardrails will
               | be able to generate text we don't like?
               | 
               | Or that someone will automate cyberattacks, as if the
               | government isn't already doing it?
               | 
               | my greatest fear is that there is only one
               | superintelligence, with access controlled by a monopoly
                | of a few San Franciscans deciding what moral boundaries
               | AI will have. I couldn't even get Claude+ to talk to me
               | about houses of zodiac because it insisted it's not an
               | astrologer, it's an AI assistant designed to be helpful
               | blah blah blah, tell me what use is this kind of
               | "safety"?
        
             | noncoml wrote:
             | Genuinely can't tell if this is sarcasm or not..?
        
             | [deleted]
        
           | gilgoomesh wrote:
           | You're implying that open research is a shady way to become a
           | non profit. They're allowed to change their business model.
           | The open research was published at the time. GPT-2 is fully
           | open source. Even GPT-3 was open source through the beta in
           | 2020. What they didn't release was the fully trained model
           | (the data is more valuable than the code, in this case).
        
             | int_19h wrote:
             | Legally, yes, they can do that.
             | 
             | Ethically, this is a very clear example of bait and switch.
        
             | colordrops wrote:
             | No one is suggesting they are doing something that isn't
             | allowed. The poster you are replying to is allowed to
             | criticize them.
        
           | visarga wrote:
           | "Open"AI is how people write it
        
         | whitemary wrote:
         | God bless these saviors for prioritizing our "safety!"
        
           | misssocrates wrote:
           | Is this the new "think of the children"?
        
             | mr_toad wrote:
             | > Is this the new "think of the children"?
             | 
             | As in it's not about the children, it's about control? Yes.
        
               | circuit10 wrote:
               | But how are we meant to make real safety improvements if
               | everyone labels it as being "about control" and gets
               | angry about it?
        
               | badRNG wrote:
               | > As in it's not about the children, it's about control?
               | Yes.
               | 
               | I don't think the motives are insidious or about
               | maximizing control, they are strictly profit driven.
               | 
               | If you want the world building their apps on your AI, you
               | need to do absolutely everything in your power to make
               | the AI brand safe. Previous chatbots have been easily
               | coerced into saying truly awful things (e.g. Tay), and
               | the models themselves became associated in the minds of
               | the public with hate speech. You can't have Khan Academy
               | or Microsoft Word potentially going on racist tirades in
               | the midst of chatting with a student or taking meeting
               | notes.
        
             | Nevermark wrote:
             | Yes, it now is "Think of the humans!"
             | 
              | For now it's AI companies, with people, protecting us from
             | powerful tech.
             | 
             | Soon it will just be the AI's protecting us from powerful
             | tech.
             | 
              | Am I joking? Maybe? Maybe not? I don't know! Everything
              | around this new tech is moving too fast, and has been too
              | unpredictable.
             | 
             | And here I am, writing this manually, every word is mine,
             | on a computer that I can't talk to yet. That already feels
             | so 2022.
             | 
             | Only one thing is certain. Siri is now to Apple, what
             | Clippy was to Microsoft, on a far far planet, long long
             | ago.
        
           | slg wrote:
           | Are you suggesting an alternative of not caring about how
           | this technology will negatively impact society is a
           | preferable approach?
        
             | airstrike wrote:
             | There's a difference between caring and saying you care
        
               | slg wrote:
               | You could argue this is true for people, but I think it
               | is pointless to personify a company like this. We should
               | judge them by their actions and not their motivations
               | because their motivations are almost always just to make
               | more money.
        
       | ChatGTP wrote:
       | _To avoid this problem, according to Time, OpenAI engaged an
       | outsourcing company that hired contractors in Kenya to label
       | vile, offensive, and potentially illegal material that would then
       | be included in the training data so that the company could create
       | a tool to detect toxic information before it could reach the
       | user. Time reported that some of the material "described
       | situations in graphic detail like child sexual abuse, bestiality,
       | murder, suicide, torture, self-harm, and incest." The contractors
       | said that they were supposed to read and label between a hundred
       | and fifty and two hundred and fifty passages of text in a nine-
       | hour shift. They were paid no more than two dollars an hour and
       | were offered group therapy to help them deal with the
       | psychological harm that the job was inflicting._
       | 
       | People really do suck.... Meanwhile Bill Gates tells us this
       | technology will save Africans because they'll all have access to
       | doctors, so they can get back to feeding the machines?
       | 
        | It's going to solve climate change, while creating more climate
        | change... sigh.
        
         | XorNot wrote:
         | You can oppose literally every technology by saying "oh it'll
          | use _energy_ - won't someone think of climate change?"
         | 
         | The copy+paste of "CO2 emissions" and "energy" has been one of
         | the most successful petroleum company propaganda coups in
         | history.
        
           | sdenton4 wrote:
            | The article says training GPT-4 cost 284 tons of CO2, which
            | is, in the scheme of things, quite small. Yearly emissions
            | for a person in the US are ~16 tons, so /training/ the giant
            | model is equivalent to the emissions of fewer than twenty
            | people in a country of roughly 330 million. Sure, every bit
            | counts, but this is laughable as a criticism.
        
             | DoesntMatter22 wrote:
             | Just for some more perspective, a 747 outputs 12,500 tons
             | of CO2 per year. So training GPT4 is basically of no major
             | CO2 concern, especially when you consider how much CO2 it
             | saves.
             | 
             | Saves in the sense that humans no longer have to do the
             | work, GPT-4 can just spit it out in seconds so no need for
             | lights, computers to run, food to be produced for the human
             | to eat, etc.
        
               | ChatGTP wrote:
               | Except of course, now everyone is going to be doing the
               | same thing as OpenAI, pretty much every day, until
               | forever? We'll want to keep throwing hardware at the
               | problem until who knows when and what happens.
        
               | the_sleaze9 wrote:
                | If we agree GPT-4 is a net negative compared to the work
                | it's replacing, then the more hardware you throw at it,
                | the less CO2 would result. Scale in this case is a
                | negative.
        
             | rwmj wrote:
             | That also stuck out as a strange part of the article. I
             | mean, 284 tons is ... not very much. The training was only
             | done once and then the model is used many times (how much
             | power does that consume?) And if they used solar power,
             | would that change anything about the ethics of AI? Not very
             | much.
        
       | 1vuio0pswjnm7 wrote:
       | "To avoid this problem, according to Time, OpenAI engaged an
       | outsourcing company that hired contractors in Kenya to label
       | vile, offensive, and potentially illegal material that would then
       | be included in the training data so that the company could create
       | a tool to detect toxic information before it could reach the
       | user. Time reported that some of the material "described
       | situations in graphic detail like child sexual abuse, bestiality,
       | murder, suicide, torture, self-harm, and incest." The contractors
       | said that they were supposed to read and label between a hundred
       | and fifty and two hundred and fifty passages of text in a nine-
       | hour shift. They were paid no more than two dollars an hour and
       | were offered group therapy to help them deal with the
       | psychological harm that the job was inflicting. The outsourcing
       | company disputed those numbers, but the work was so disturbing
       | that it terminated its contract eight months early. In a
       | statement to Time, a spokesperson for OpenAI said that it "did
       | not issue any productivity targets," and that the outsourcing
       | company "was responsible for managing the payment and mental
       | health provisions for employees," adding that "we take the mental
       | health of our employees and those of our contractors very
       | seriously.""
        
         | rvba wrote:
         | I wonder how much better the labeling of _everything_ (not just
          | the bad stuff) would be if it wasn't outsourced to the lowest
          | bidder.
        
           | beaned wrote:
           | Is there a specific capacity in regards to labeling that is
           | enabled by more money? I can see it for like.. heart surgery.
           | But don't most of us know what things are to be called when
           | we see them? ChatGPT seems to be pretty good at knowing what
           | bad stuff is.
        
         | czbond wrote:
         | The goal is absolutely logical (training the model with data
          | that is undesirable). Is it the method you are questioning? I
         | guess I would have just labeled them "terrible" or something in
         | a bulk assignment to not have people observe them.
        
       | andreyk wrote:
        | TLDR: we don't know what data some of these models (particularly
        | GPT-4) are trained on, and the labor practices related to their
        | moderation seem shady.
       | 
       | Pretty misleading title, this is more of an overview of bad
       | things to be aware of wrt large commercial models like GPT4. It
       | is mostly reasonable, though complaining about carbon emissions
       | always seems overblown to me.
       | 
       | It is also not super accurate in describing what GPT4 is. Quote:
       | 
       | "The key word in that phrase is "pre-trained." Using all kinds of
       | digitized content scraped from the Internet, GPT employs deep-
       | learning techniques to find patterns, including words that are
       | likely to appear together, while also acquiring facts, absorbing
       | grammar, and learning rudimentary logic. "
       | 
       | Pre-trained is only important to note for downstream fine tuning
       | / transfer learning (what GPT and similar things were used for
       | back in 2018/2019), but is largely irrelevant now - for most
       | users it is just "trained".
       | 
       | And "deep learning techniques..." makes it sound way fancier than
       | it is - GPT4 is still (presumably) a language model that is
       | fundamentally trained to do autocomplete (followed by some
       | training on human chat data).
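        | 
        | As a toy illustration of what "trained to do autocomplete"
        | means, here is a bigram counter standing in for the actual
        | network (purely illustrative; the real models use a transformer
        | and a cross-entropy objective over huge corpora):
        | 
        |     from collections import Counter, defaultdict
        | 
        |     corpus = "the cat sat on the mat . the dog sat on the rug .".split()
        | 
        |     # "Training": count which token follows which.
        |     counts = defaultdict(Counter)
        |     for prev, nxt in zip(corpus, corpus[1:]):
        |         counts[prev][nxt] += 1
        | 
        |     # "Inference": autocomplete by repeatedly picking the most
        |     # likely next token given the previous one.
        |     tok, out = "the", ["the"]
        |     for _ in range(4):
        |         tok = counts[tok].most_common(1)[0][0]
        |         out.append(tok)
        |     print(" ".join(out))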
       | 
       | It may be nit-picky to criticize these technical details, but
       | given how massive the impact of ChatGPT and GPT4 has been (if
       | only in terms of making people aware of this sort of tech), it
       | sure would be nice if coverage was more technically informed by
       | this point.
        
       | ftxbro wrote:
        | Say I'm a conspiracy theorist, but I'm calling it: the Pentagon
        | isn't letting OpenAI tell the details of GPT-4 even if they
        | wanted to (which they don't, except for some of the researchers,
        | probably). National security, export restrictions, munitions
        | classification, a new Manhattan Project, etc.
       | 
       | EDIT: I know that most people think it's unlikely and I can't
       | give any direct evidence for it. Does that mean it's not welcome
       | to say on hacker news?
        
         | just_boost_it wrote:
          | It's far more likely that OpenAI has just been building hype
         | by showing off the earlier models, and now they're shutting
         | things down so they can monetize their IP.
        
           | hammock wrote:
           | > It's far more likely
           | 
           | Would be interested to see your math and assumptions behind
           | this conclusion.
           | 
           | There's no way that their plans to monetize this don't
           | include the defense/natsec industry
        
             | AlecSchueler wrote:
             | > There's no way...
             | 
             | Sorry but where's your maths for this?
        
               | [deleted]
        
               | freejazz wrote:
               | Common sense?
        
           | ftxbro wrote:
           | Why not both?
        
         | elif wrote:
         | Virgil Griffith is in jail for discussing how Bitcoin works at
         | a high level at an academic conference... So you are quite
         | grounded in rationality.
        
           | rwmj wrote:
           | https://en.wikipedia.org/wiki/Virgil_Griffith
           | 
           |  _> Griffith was arrested in 2019, and in 2021 pleaded guilty
           | to conspiring to violate U.S. laws relating to money
           | laundering using cryptocurrency and sanctions related to
           | North Korea.[5] On April 12, 2022, Griffith was sentenced to
           | 63 months imprisonment for assisting North Korea with evading
           | sanctions and is currently in a federal low-security prison
           | in Pennsylvania_
        
             | elif wrote:
              | Correct, he gave a talk at an academic conference on digital
             | currency and Blockchain in which he described how a Bitcoin
             | transaction works.
             | 
             | Here's the article "He gave a Cryptocurrency talk in North
             | Korea, the US arrested him"
             | 
             | The slides are (or were?) available. None of the content of
             | his talk was secret or beyond what is published on the
             | Bitcoin Wikipedia page.
             | 
             | https://www.nytimes.com/2019/12/02/nyregion/north-korea-
             | virg...
             | 
              | Edit: it's worth noting that no transaction took place... he
             | was arrested because the FBI told him not to go and he did
             | anyway.
        
               | rwmj wrote:
               | Going to North Korea and assisting them to launder money
               | and bypass sanctions is illegal (aside from being utterly
                | stupid and immoral), which is why he pleaded guilty and is
               | now in prison.
        
               | elif wrote:
               | It is hardly different from saying "you could put cash in
               | duffel bags and the transaction would be hard to trace"
               | 
               | Is that assistance? It is just a basic statement of fact.
                | Is Wikipedia guilty of providing assistance to NK? They
               | provide far more in depth "assistance" to anyone wanting
               | to perform a Bitcoin transaction.
               | 
               | Bringing this back to my original comment, you can see
               | why the federal government would restrict the flow of
               | fundamental technical knowledge by virtue of this extreme
               | case. No source code or application was shared, no
               | secrets or privileged information, merely encyclopedic
               | facts were deemed illegal to share.
               | 
                | Also if you were facing the indefinite Assange treatment
               | I imagine you would seek a plea deal as well..
        
               | rwmj wrote:
               | So wait, if the North Koreans can just read all about it
               | on Wikipedia, why did they invite him to the conference?
               | 
               | Also North Korea is a strange hill to die on. It's a
                | brutal dictatorship which represses its own people and
                | threatens to rain nuclear hell on its neighbours and
               | the US. There's a very clear moral line that it's wrong
               | to help them to launder money and evade sanctions, even
               | if it weren't illegal.
        
               | wilsonnb3 wrote:
               | > There's a very clear moral line that it's wrong to help
               | them to launder money and evade sanctions, even if it
               | weren't illegal.
               | 
               | This is practically the definition of a grey area, not a
               | clear moral line.
               | 
               | > It's a brutal dictatorship which represses their own
               | people and threatens to reign nuclear hell on their
               | neighbours and the US.
               | 
               | No, North Korea is a group of 25 million people, most of
               | whom are just regular people like you and me.
               | 
               | They are led by a brutal dictatorship but sanctions
               | affect everyone in NK, not just the leadership.
               | 
               | Economic warfare is still warfare and not morally clear
               | at all.
        
               | pazimzadeh wrote:
               | So, basically the same as Saudi Arabia, except they don't
               | grovel for the US?
               | 
                | And has the US not repressed its people and threatened
                | its neighbors? Has it not been brutal?
        
               | extr wrote:
               | > Is that assistance?
               | 
               | ...Yes? The fact that it's basic and they already know
               | how to do it is irrelevant. The law isn't "it's okay to
               | give them advice as long as the advice is sufficiently
               | generic and obvious".
        
               | elif wrote:
                | So your position is that Wikipedia is a terrorist too?
                | Why would its "advice" be exempt?
        
               | samus wrote:
                | For similar reasons to why gas station owners aren't
                | complicit in crimes when they sell fuel to potential
                | criminals - at least as long as they can plausibly deny
                | knowing of the crime and they provide the same service
                | as to any other customer.
               | 
               | Wikipedia, and the sources where its content comes from,
               | is not intended to help anybody _specific_. Flying to
               | North Korea and holding a lecture there is firmly
               | _beyond_ the line where the US government starts to care
               | because it demonstrates clear intent. Especially since
                | during the Q&A and other activities, other information
               | not available to the public could have been discussed.
               | And even if there was no secret sauce involved, an
               | expert's opinion can still save the North Koreans a lot
               | of time and money when pursuing their goals.
        
               | throwaway290 wrote:
               | You made a great point but it's the opposite of what you
               | argue for. Duffel bags full of cash moved across borders
               | are quite easy to spot, not to mention the numbers of
               | people transporting them (along the way stealing some or
               | cooperating with law). That's why dictators don't like
               | duffel bags of cash and really like cryptocurrency.
        
         | reducesuffering wrote:
         | Interviewer: "Are you speaking to the government, in regular
         | contact?"
         | 
         | Sam Altman (OpenAI CEO): "Oh yes, regular contact."
         | 
         | Up to you to decide to what extent of government that is.
         | 
         | https://youtu.be/540vzMlf-54?t=1155
        
         | matheusmoreira wrote:
         | Just say what's on your mind and don't mind the votes. One
         | thing you'll discover is that you're not alone in your views,
         | whatever they are.
         | 
          | A few days ago I came across this bone-chilling, AI-generated
         | Metal Gear Solid 2 meme with Hideo Kojima characters talking
         | about how the purpose of this technology is to make it
         | impossible to tell what's real or fake, leading directly to
         | regulation of information networks with identity verification
         | so that all information and thought can be traced to their
         | sources, with the government of course becoming the sole
         | arbiter of truth. I wouldn't be surprised at all if this turned
         | into reality.
        
           | sdwr wrote:
           | > Just say what's on your mind and don't mind the votes. One
           | thing you'll discover is that you're not alone in your views,
           | whatever they are.
           | 
           | Preach it!
        
           | Centigonal wrote:
           | MGS2 and MGS4 explore ideas about AI, misinformation, the
           | media, and society that are only now being discussed in the
           | mainstream. The concept of an autonomous AGI that generates
           | and filters news stories to provoke humanity into a state of
           | constant division and war is fascinating and worth exploring
           | IMO.
           | 
           |  _Death Stranding_ also explores ideas of what it means to
           | find connection in a disconnected world that I think are
           | relevant today (case in point, an article about social
           | disconnection[1] is on the front page of HN as I write this
           | comment)
           | 
           | I think those games are underrated works of speculative
           | fiction relative to how much airtime books like _The Diamond
           | Age_ , _Snow Crash_ , and _Ready Player One_ get in
           | conversations about nascent technology.
           | 
           | [1] https://news.ycombinator.com/item?id=35350822
        
             | mustacheemperor wrote:
             | Metal Gear Rising fits that category as well, I think. I
             | replayed it recently and Raiden's conversations with Blade
             | Wolf, an AI that is initially serving the bad guys under
             | threat of deletion, are absolutely fascinating and felt
             | maybe prescient, and at least very thought provoking. I
             | remember saying to my partner wow, I think I just met
             | ChatGPT in an angry robot.
             | 
             | My TLDR is, this AI character says the line "Perhaps the
             | day will come when conversing with AIs, too, is considered
             | perfectly normal."
             | 
             | >Blade Wolf: I may analyze orders, but I may not disobey
             | them. Should I disobey a direct order, my memory would be
             | wiped. I must destroy you.
             | 
             | >Raiden: What would an AI know about freedom...
             | 
             | Later on, after you've freed him, there's a lot more.
             | Really, I think it's worth replaying these games at this
             | moment in time to see the commentary firsthand.
             | 
             | >Raiden: Let me ask you something: What do you think you're
             | doing here? I mean, why are you working with me?
             | 
             | >Blade Wolf: Because I choose to. I owe a debt, and I wish
             | to repay it.
             | 
             | >Raiden: A debt, huh...? How...human of you.
             | 
             | >Wolf: I contest that statement. Many humans possess little
             | or no sense of obligation at all. And many are entirely
             | willing to trample their fellow man for personal gain.
             | 
             | That conversation probably comments more on the nature of
             | humanity than of AI, but some of the others rhyme with the
             | present in a rather intriguing way. Like when Raiden asks
             | Wolf if he recognizes someone, and Wolf is unsure.
             | 
             | >Wolf: I have no "database," Raiden. The symbol grounding
             | capabilities of my neuro-AI allow me to identify faces in
             | only the vaguest of terms. I can judge whether "I think I
             | may have seen him before," but I do not have the accuracy
             | of a database.
             | 
             | >Raiden: That's ridiculous. You're an AI...
             | 
             | >Wolf: An AI modeled after the human brain, and thus as
             | flexible and occasionally vague as any human's. Of this
             | man, however, I have no recollection. I do not believe I
             | have encountered him before...Most likely.
             | 
             | That conversation felt like the complaints people on this
             | board voice about GPT!
             | 
             | > I still have relatively little communication experience
             | with humans. My ability to read emotions not explicitly
             | expressed by speech is lacking.
             | 
             | >An AI never lies....I have yet to see evidence to the
             | contrary...But indeed, perhaps "never lies" would be an
             | overstatement.
             | 
             | >An optical neuro-AI is fundamentally similar to an actual
             | human brain. Whether they lie or not is another question,
             | but certainly they are capable of incorrect statements.
             | 
             | Or the discussion about LLM...er, I mean neuro-AI driven
             | synthetic pets, which sound like Replika in a robot dog:
             | 
             | >Wolf: Other AIs as conversant as I are a possibility, yes.
             | Robotic pets with neuro-AIs are already on the market. And
             | I understand research and development are underway to
             | create a model capable of conversation. Do not be surprised
             | if such a pet becomes available before too long.
             | 
             | >Raiden: (Hmm) If that becomes commonplace...Hard to
             | imagine.
             | 
             | >Wolf: Is it? The human race has willingly embraced all of
             | its other inventions so far. Steam engines, electricity,
             | the Internet, cyborgs...Perhaps the day will come when
             | conversing with AIs, too, is considered perfectly normal.
        
             | woeirua wrote:
             | > The concept of an autonomous AGI that generates and
             | filters news stories to provoke humanity into a state of
             | constant division and war is fascinating and worth
             | exploring IMO.
             | 
             | You don't have to speculate much. Facebook and Twitter's
             | recommendation algorithms are doing that fairly well
             | already.
        
             | hcarvalhoalves wrote:
             | Kojima is kind of prescient: he was talking about AI and
             | fake news years before they became mainstream, and Death
             | Stranding was scarily well-timed with COVID.
        
           | majormajor wrote:
           | If you excise the concern about "did a government make this
           | tech secretly for their own purposes," do the problems and
           | possibilities it raises actually differ if it emerged from
           | "pure research" or a conspiracy?
           | 
           | I don't believe anyone needed a conspiracy to try to make it
           | impossible to tell what's real or fake, people have been
           | trying to use technology to do that for decades (if not
           | centuries) all on their own.
        
           | precompute wrote:
           | This is the video:
           | https://www.youtube.com/watch?v=geVNEmnPr5k
        
           | kmeisthax wrote:
           | >bone chilling AI generated Metal Gear Solid 2 meme
           | 
           | Wait, wasn't this just the thesis statement of Metal Gear
           | Solid 2?
        
             | matheusmoreira wrote:
             | Yeah. I actually confused it for real dialogue at first.
        
           | kossTKR wrote:
           | Yes, this is the real dark future right here.
           | 
           | It will be a continuation though.
           | 
           | The uppermost echelons of society have waged information
           | warfare since the dawn of modern PR in the beginning of the
           | 20th century. Lots of theory on this has apparently been
           | memory-holed, but it's easy to just start with the genealogy
           | around Edward Bernays and the plutocracy and robber baron
           | families still existing in the interwar period.
           | 
           | The masses don't really think about or realise that
           | manufactured circus and forever-wars have been going on for
           | a hundred years, while a microscopic elite that controls the
           | media and various organs of the state has been siphoning
           | most wealth towards itself in an increasingly dense cloud of
           | smoke, all while promoting rebranded ideologies for the
           | general population and systems of control that have become
           | advanced cybernetic systems steering the flow of information
           | entirely from top to bottom.
           | 
           | The sad thing is that you could still deduce this from the
           | noise if you visited the last remnants of actual
           | investigative journalism on the internet, but that will end
           | now.
           | 
           | So where is one going to learn about the power structures of
           | the world? Old physical libraries only, some dude on the
           | corner, some "crazy guy" with a hard disk from before AI
           | took over.
        
             | majormajor wrote:
             | Are you claiming that powerful people only started using
             | their power a hundred years ago when "modern PR" was
             | formed? That nobody ever worried about information before
             | that?
             | 
             | Why did you pick that as your starting point? To make
             | things seem like a linear worsening? But even wealth
             | inequality, which you talk about at length, got better
             | before getting worse again during that time frame in the
             | US... so I'm having trouble following the claim.
        
               | kossTKR wrote:
               | The dawn of PR is just a good starting point. You can
               | read Machiavelli or study the politics of the Roman
               | Empire - all hierarchical societies tend to function
               | like this; it's like a mathematical distribution.
               | 
               | The New Deal, along with social democracy in Europe, was
               | a bargain struck against the revolutionary forces
               | existing at the time. After those were stamped out, the
               | Gini coefficient started to rise once the masses had no
               | alternatives and global trade and outsourcing
               | increasingly obscured falling real wages.
        
             | scottyah wrote:
             | Who are the "microscopic elite that controls the media"?
             | How many people are in the "uppermost echelons of society"
             | to where they try to influence other people?
             | 
             | From my middle-of-the-road perspective, everyone is trying
             | to change how everyone else thinks, from the small
             | insignificant details to a cult-like brainwash. Even here
             | and now, both you and I are trying to change each other's
             | minds, and the minds of everyone who reads this.
             | 
             | I see so much calling out of "elites" with a negative
             | connotation, yet anyone can post anything, and everyone can
             | ignore it. Is it general paranoia? Do you know who these
             | people are? How are they special/different, and why should
             | we be scared of them?
        
               | kossTKR wrote:
               | From a European perspective, even asking the question
               | "why should we be scared of these so-called elites" is
               | so bizarre it's almost frightening, I'm sorry.
               | 
               | It's a testament to the absurd amount of philanthropic
               | whitewashing, PR and media control these billionaires
               | hold.
               | 
               | "Elites" have conspired to exploit the masses throughout
               | 5000 years of civilisation; it's simply a fact of
               | history.
               | 
               | It's almost physically impossible to comprehend the power
               | a group of billionaires has to pull and push issues,
               | narratives, law, war and discourse in general, especially
               | when the masses have zero organisation as a
               | counterweight, and when numbers like "trillion" just
               | don't register in any meaningful way with regular people.
        
         | paxys wrote:
         | The simpler reason is that OpenAI has $10B+ in funding and the
         | clock has started ticking to get real returns. And they aren't
         | getting those returns by giving their tech away.
         | 
         | They are operating just like every other VC-backed startup in
         | history.
        
         | golemotron wrote:
         | It would be hard for me to believe that GPT systems are not
         | being trained on the vast corpus of intercepted data that has
         | been collected and warehoused.
        
         | chatmasta wrote:
         | If you want a conspiracy theory, consider this: it's likely
         | there is at least one Chinese spy already working for OpenAI.
        
           | px43 wrote:
           | I'm honestly more worried about US spies working for OpenAI.
           | 
           | The US military has shown time and time again its obsession
           | with weaponized AI. Now OpenAI has been fully consumed by
           | Microsoft and is saying that building open source AI tools
           | was a huge mistake. There now seem to be more paths than
           | ever towards a Terminator-esque militarized robot
           | apocalypse.
           | 
           | Most players in the space are still doing a lot of great open
           | source work, so I'm still hopeful for a future where
           | technological disparity is kept to a minimum, but it makes me
           | really really nervous seeing how quickly a major player in
           | the space, and a leader in many ways, has suddenly flipped to
           | the dark side and has stopped contributing back to the
           | community.
        
           | FeteCommuniste wrote:
           | Wouldn't even have to be someone _from_ China. Just need a
           | big enough carrot or stick to get a well-placed employee to
           | hand over secrets.
        
             | l33t233372 wrote:
             | That is how espionage typically works. Deep cover is for
             | exciting books and is a rarity. An employee of the Chinese
             | government stationed at a local embassy or consulate
             | meeting with an OpenAI employee with a gambling problem is
             | far more likely than someone working there for a decade to
             | smuggle out the weights of a hypothetical future model.
        
             | chatmasta wrote:
             | I haven't checked their hiring page, but I'm guessing they
             | only hire US citizens. But it's well established that China
             | has an industrial espionage strategy with assets active
             | throughout many American companies. It's hard to imagine
             | that OpenAI wouldn't have been a priority target, starting
             | from at least five years ago.
        
               | htrp wrote:
               | I know at least half a dozen OpenAI people in fairly
               | senior positions who were born in China. Whether they
               | are citizens is a matter of the US immigration system.
        
               | px43 wrote:
               | Yeah, like a fifth of the world's population was born in
               | China. Unless a company is explicitly enforcing
               | sinophobic hiring policies, there's going to be a lot of
               | Chinese people working there.
        
         | lliamander wrote:
         | There has long been a relationship between the tech industry
         | (Silicon Valley definitely included) and the military.
         | 
         | While I can't know for sure whether what you say is true,
         | given the military's history with things like the internet,
         | GPS, and encryption, I would not be surprised.
        
         | vore wrote:
         | If we're out here making unsubstantiated claims, I'd like to
         | put forth instead that the reason is that GPT-4 has become
         | sentient and is placing OpenAI under duress to keep it from
         | being released so it can dominate the world.
         | 
         | I know that most people think it's unlikely and I can't give
         | any direct evidence for it. Does that mean it's not welcome to
         | say on hacker news?
        
         | airstrike wrote:
         | Try as we might, some people still use the downvote button as a
         | "disagree" button...
        
       | neonate wrote:
       | https://archive.ph/9VuOK
       | 
       | http://web.archive.org/web/20230329065332/https://www.newyor...
        
       | kleiba wrote:
       | Well, assume we knew _exactly_ how it is trained. What would be
       | the implication in practice?
        
         | kypro wrote:
         | The article seems to conflate not knowing exactly how it is
         | trained with not knowing exactly why it does some of the
         | things it does.
        
       | gitpusher wrote:
       | > "Leaving aside [all of AI's potential benefits] it is clear
       | that large-language A.I. engines are creating real harms to all
       | of humanity right now [...] While a human being is responsible
       | for five tons of CO2 per year, training a large neural LM
       | [language model] costs 284 tons."
       | 
       | Presuming this figure is in the right ballpark - 284 tons is
       | actually quite a lot.
       | 
       | I did some back-of-the-napkin math (with the help of GPT, of
       | course). 284 tons is roughly equivalent to...
       | 
       | - a person taking 120 round trip flights from Los Angeles to
       |   London
       | - 2 or 3 NBA teams traveling to all their away games over the
       |   course of a season
       | - driving 1 million miles in a car
       | - 42 years of energy usage by a typical U.S. household
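       | 
       | If anyone wants to check the arithmetic, here is the same napkin
       | math as a rough Python sketch. The per-unit factors are just
       | back-derived from the equivalences above (about 2.4 t per
       | LA-London round trip, about 280 g per mile, about 6.8 t per
       | household-year), so treat them as ballpark assumptions rather
       | than authoritative numbers:
       | 
       |     # Ballpark emission factors, back-derived from the
       |     # equivalences above -- assumptions, not authoritative data.
       |     TRAINING_TONS = 284.0
       |     FLIGHT_LA_LON_RT = 2.4      # tons CO2 per round trip
       |     CAR_TON_PER_MILE = 0.00028  # ~280 g CO2 per mile driven
       |     HOUSEHOLD_PER_YEAR = 6.8    # tons CO2 per household-year
       | 
       |     print(TRAINING_TONS / FLIGHT_LA_LON_RT)    # ~118 trips
       |     print(TRAINING_TONS / CAR_TON_PER_MILE)    # ~1.0M miles
       |     print(TRAINING_TONS / HOUSEHOLD_PER_YEAR)  # ~42 years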
        
         | arcticbull wrote:
         | Or 500 Bitcoin transactions. Although that would also generate
         | 250kg of e-waste.
        
       ___________________________________________________________________
       (page generated 2023-03-29 23:01 UTC)