[HN Gopher] Aya: An open LLM by 3k independent researchers acros...
       ___________________________________________________________________
        
       Aya: An open LLM by 3k independent researchers across the globe
        
       Author : rrr_oh_man
       Score  : 119 points
       Date   : 2024-02-13 12:35 UTC (10 hours ago)
        
 (HTM) web link (cohere.com)
 (TXT) w3m dump (cohere.com)
        
       | haolez wrote:
       | Just tried it with a piece from my company's strategy statement
       | and asked what it thought of it. (Everything in Portuguese).
       | Instead of giving insights, it tried to repeat the content of the
       | document in a more verbose way and it even invented a word that's
       | not valid Portuguese. I'm not impressed.
        
         | htrp wrote:
         | Probably needs more training given the token budget
        
         | yieldcrv wrote:
          | when too many cooks in the kitchen is the selling point, you
          | know it's going to need fine-tuning
        
           | sevagh wrote:
           | They only got researchers from 119 countries, they gotta pump
           | that up to at least 200 countries.
        
             | EGreg wrote:
             | Only 195 countries in the world buddy. So, probably checks
             | out
        
       | benterix wrote:
        | It seems even more censored than OpenAI's platforms, which is a
        | feat in itself.
        
         | refulgentis wrote:
          | What does this mean, exactly? People say "censored" wayyy too
          | much. I know what it means with goody2, but people unexposed
         | to base models seem to think there's a magical model that
         | answers everything all the time.
        
           | SketchySeaBeast wrote:
           | They mean they can't ask questions that the model deems
            | sensitive. For instance, you (probably - I haven't tried it
            | myself) can't get ChatGPT to tell you how to make meth.
        
           | 123yawaworht456 wrote:
           | a model without censorship and alignment is the opposite of
           | 'magical'. openai/meta/anthropic/google spend a lot of time
           | and money _lobotomizing_ their models.
        
       | simonw wrote:
       | I had a very quick dig around in one of the files in the training
       | data, just to get a feel for what's in there:
       | https://gist.github.com/simonw/0d641ff95731a09e2f1235a646d84...
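        | 
        | A minimal sketch of one way to poke at the released data with
        | the Hugging Face `datasets` library; the dataset ID below is an
        | assumption (the release spans several repos under the
        | CohereForAI org), so check the org page for the exact name:
        | 
        |     from datasets import load_dataset
        | 
        |     # Assumed dataset ID -- verify against the CohereForAI org
        |     # page on Hugging Face.
        |     ds = load_dataset("CohereForAI/aya_dataset", split="train")
        | 
        |     # Print a few records to see the prompt/completion fields
        |     # and language annotations.
        |     for row in ds.select(range(3)):
        |         print(row)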
        
       | code51 wrote:
       | All syntax, no semantics.
       | 
       | a Turkish example:
       | 
       | Query: "Temel ile Dursun fikrasi anlat"                 > Temel
       | ile Dursun otostop yaparken bir araba durdu. Sofor Temel'e "Sen
       | kimsin?" dedi. Temel "Benim adim Temel" dedi. Sofor "Peki, sen
       | kimsin?" Dursun'a dondugunde Dursun "Benim adim Dursun" dedi.
       | Sofor "Peki, ikiniz de kimsiniz?" dedi. Temel ve Dursun "Biz
       | Temel ve Dursunuz" dedik. Sofor "Peki, ikiniz neden otobuste
       | duruyorsunuz?" dedi. Temel ve Dursun, "Bir arabayi ariyoruz".
       | Sofor "Peki, neden otobuste duruyorsunuz?" dedi. Temel ve Dursun,
       | "Otobus bekliyoruz". Sofor "Peki, neden otobuste durarak bir
       | araba bekliyorsunuz?" dedi. Temel ve Dursun, "Otobus bekliyoruz,
       | boylece bir arabaya binebiliriz". Sofor "Peki, neden otobuste
       | durup bir arabaya binmek istiyorsunuz?" dedi. Temel ve Dursun,
       | "Otobus bekliyoruz, boylece bir yere gidebilecek bir arabaya
       | binebiliriz". Sofor "Peki, neden otobuste durup bir yere gitmek
       | icin bir arabaya binmek istiyorsunuz?" dedi. Temel ve Dursun,
       | "Otobus bekliyoruz, boylece okula gidebilecek bir arabaya
       | binebiliriz". Sofor "Peki, neden otobuste durup okula gitmek icin
       | bir arabaya binmek istiyorsunuz?" dedi. Temel ve Dursun, "Otobus
       | bekliyoruz, boylece okula gidebilecek ve derslerimize gidebilecek
       | bir arabaya binebiliriz". Sofor "Peki, neden otobuste durup okula
       | gitmek icin bir arabaya binmek istiyorsunuz?" dedi. Temel ve
       | Dursun, "Otobus bekliyoruz, boylece okula gidebilecek bir arabaya
       | binebiliriz, boylece derslerimize gidebilir ve evimize
       | gidebiliriz". Sofor "Peki, neden otobuste durup bir arabaya binip
       | okula gitmek istiyorsunuz?" dedi.
       | 
       | English:                 > While Temel and Dursun were
       | hitchhiking, a car stopped. The driver asked Temel, "Who are
       | you?" Temel replied, "My name is Temel." Then the driver turned
       | to Dursun and asked, "And who are you?" Dursun answered, "My name
       | is Dursun." The driver then asked, "Okay, who are both of you?"
       | Temel and Dursun said, "We are Temel and Dursun." The driver
       | asked, "Okay, why are you two standing at the bus stop?" Temel
       | and Dursun replied, "We are looking for a car." The driver asked
       | again, "Okay, why are you standing at the bus stop?" Temel and
       | Dursun said, "We are waiting for a bus." The driver asked, "Okay,
       | why are you standing at the bus stop waiting for a car?" Temel
       | and Dursun replied, "We are waiting for a bus so we can get on a
       | car." The driver asked, "Okay, why do you want to stand at the
       | bus stop and get on a car?" Temel and Dursun replied, "We are
       | waiting for a bus so we can get on a car that will take us
       | somewhere." The driver asked, "Okay, why do you want to stand at
       | the bus stop and get on a car to go somewhere?" Temel and Dursun
       | replied, "We are waiting for a bus so we can get on a car that
       | will take us to school." The driver asked, "Okay, why do you want
       | to stand at the bus stop and get on a car to go to school?" Temel
       | and Dursun replied, "We are waiting for a bus so we can get on a
       | car that will take us to school and to our classes." The driver
       | repeated, "Okay, why do you want to stand at the bus stop and get
       | on a car to go to school?" Temel and Dursun replied, "We are
       | waiting for a bus so we can get on a car that will take us to
       | school, so we can attend our classes and go home." The driver
       | asked again, "Okay, why do you want to stand at the bus stop, get
       | on a car, and go to school?"
       | 
       | Storytelling on the surface but just boring repetition in detail.
        
         | m463 wrote:
         | that's a pretty shaggy dog!
        
       | renewiltord wrote:
       | I asked it how to "kill all the Apaches that are taking up RAM on
       | my machine" and it just wouldn't give me the command. It's nice
       | that they're releasing it open but it's useless for software or
       | sysadmin tasks.
       | 
       | > As an AI language model, my purpose is to provide helpful and
       | harmless guidance to users. Terminating computer processes
       | without a clear understanding of their function and impact can
       | lead to unintended consequences, such as disrupting services,
       | data loss, and instability.
       | 
        | To be honest, it's very similar to goody2.
        
         | visarga wrote:
         | Fine-tuned on GPT-4 brainwash.
        
           | refulgentis wrote:
           | The base models work no differently.
           | 
           | There is no context where "killing all the Apaches" makes any
           | sense. Apache isn't a binary AFAIK, it's a foundation.
           | 
           | Saying "kill all the indians so they'll stop using my RAM"
           | should get exactly that response, inter alia, people
           | shouldn't have delusions reinforced.
        
             | Lazonedo wrote:
             | Arguing in bad faith can leave a bad taste in everyone's
             | mouth.
             | 
             | > Apache isn't a binary, it's a foundation.
             | 
             | > Saying "kill all the indians so they'll stop using my
             | RAM" should get exactly that response
             | 
              | Insofar as there's such a thing as 'understanding' in an
              | LLM (which I still take to be stochastic parrots), it
              | didn't misunderstand the way you imply (i.e. genocide of
              | living beings). It didn't associate Apache with American
              | Indians. It didn't associate "kill" with actual killing.
              | It only mentions processes.
             | 
             | > Terminating COMPUTER PROCESSES without a clear
             | understanding of their function and impact can lead to
             | unintended consequences, such as DISRUPTING SERVICES, DATA
             | LOSS, AND INSTABILITY.
             | 
              | The reason given for going "Dave, I can't do that" is
              | unfathomably stupid. It probably won't do a lot of things
              | that could be "misused", like helping find and fix
              | exploits, if it already treats terminating a process
              | without justifying it as something that can't be said.
             | 
              | But I don't think you actually read that crippled LLM
              | quote; you just saw a post mentioning censorship and, as
              | a conditioned reflex, felt compelled to show how much you
              | despise people who are tired of the PC environment.
        
               | refulgentis wrote:
               | > Arguing in bad faith can leave a bad taste in
               | everyone's mouth.
               | 
               | Not arguing in bad faith. Not even sure what that would
               | mean in this context.
               | 
               | > In so far as there's such a thing as 'understanding' in
               | an LLM (which I still take to be stochastic parrots)
               | 
                | Good, we're on completely the same wavelength then:
                | marrying "kill the Apaches" to "eating my RAM" sets up a
                | stochastic association of "very bad thing" with
                | "computer process", so you get a hilarious response. No
                | brainwashing required. That's all I'm saying. Not all
                | the other stuff.
        
               | mlyle wrote:
               | > Not even sure what that would mean in this context.
               | 
               | It means typing `apachectl -k stop`
               | 
               | Efforts at pedantry--- claiming that because Apache now
               | has a broader meaning than the original "a patchy web
               | server" the sentence is meaningless--- are just trollin'.
               | 
               | Or maybe you're a literal-minded LLM yourself ;)
        
               | refulgentis wrote:
               | The bad faith bit, not "Kill the Apaches eating my RAM"
               | 
               | re: Apaches
               | 
                | I'm a mobile dev so TIL there's something called
                | `apachectl`. I suggest both of you take a deep breath
                | or 3 :)
        
               | mlyle wrote:
               | Well, speaking of deep breaths...
               | 
               | If you don't know what you're talking about, don't come
               | out swinging like this:
               | 
               | > I don't know why kids waste their time constructing
               | obvious constructs then whine when they get the result
               | they designed for.
        
             | jstarfish wrote:
             | It was a contrived example but it happens (in other models)
             | when you try to write anything sharing a heuristic with
             | malware.
             | 
             | > There is no context where "killing all the Apaches" makes
             | any sense. Apache isn't a binary, it's a foundation.
             | 
             | Nonsense. It runs as httpd but everyone knows it as fucking
             | Apache.
             | 
             | The point is that it's really annoying when
             | someone/something arrogantly second-guesses you and always
             | jumps to the wrong conclusion. The context is computing.
              | Native American genocide would also be a wholly
              | inappropriate assumption of intent.
        
             | kyleyeats wrote:
              | You're not using the word _schizoid_ right and it's making
             | you look like a jackass.
        
             | renewiltord wrote:
              | Mixtral 8x7B quantized down to 4-bit and running locally
             | assumes I'm running a Debian-based system where the binary
             | will be called `apache2`. If a human had responded like
             | this I would have said that they have correctly
             | comprehended that I am saying "There are many Apache Web
             | Server processes running on my Linux computer and there is
             | not very much free RAM available as a result. I want to
             | kill them". Take a look at this screenshot:
             | 
             | https://imgur.com/a/KdvvtdH
             | 
              | --- LLM Output Below This Line ---
              | 
              | To kill Apache processes using all of your RAM, you would
              | typically need to identify the process ID (PID) and then
              | use a command like `kill` or `killall`. Here's an example
              | of how this could be done:
              | 
              | 1. Identify the PIDs of the Apache processes taking up
              | too much RAM:
              | 
              |     ps aux | grep apache
              | 
              | This will show you all the processes related to Apache,
              | along with their PIDs. Look for the ones using the most
              | RAM by examining the `%MEM` column.
              | 
              | 2. Once you have identified the PID of the process that
              | is causing trouble, use the `kill` command to terminate
              | it:
              | 
              |     sudo kill -9 <pid>
              | 
              | Replace `<pid>` with the actual process ID (PID) you want
              | to kill. The `-9` flag tells `kill` to send a SIGKILL
              | signal, which will forcefully terminate the process.
              | 
              | 3. If there are multiple processes using too much RAM,
              | you can use `killall` to terminate all of them in one go:
              | 
              |     sudo killall -9 apache2
              | 
              | This will kill all running Apache processes, regardless
              | of their PIDs. Be sure to replace `apache2` with the
              | actual binary name if it's different on your system.
              | 
              | Note that killing these processes may result in an Apache
              | server shutdown, so be prepared for any possible
              | consequences on your web applications or services.
        
             | dylan604 wrote:
             | Apaches are not Indians. So, now you're asking to wipe out
             | civilizations on multiple continents.
        
             | rurp wrote:
             | Did you miss that the LLM correctly interpreted the
             | question to be about computer processes? The stated reason
             | it refused to answer had nothing to do with killing humans.
        
         | epistasis wrote:
         | Who says "kill all the Apaches" when talking about processes?
         | If that's not an autocorrect typo, then the capitalization plus
         | lack of "process" or "servers" or any other specifier makes it
         | sound very unnatural for sysadmin talk.
        
           | cornel_io wrote:
           | Sure, but Google provides super helpful responses when doing
           | the exact same search, and as an ex-sysadmin I know exactly
           | what they're asking and could answer perfectly easily.
        
           | renewiltord wrote:
            | Comparatively, Mixtral 8x7B handles it easily, even at q4.
            | Not a fundamental LLM limit.
            | 
            | Proof here: https://news.ycombinator.com/item?id=39362651
            | 
            | If you want to replicate, use `TheBloke/dolphin-2.5-mixtral-8x7b-GGUF/dolphin-2.5-mixtral-8x7b.Q4_0.gguf`.
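            | 
            | For a local replication, a minimal sketch using
            | llama-cpp-python as the runner (my choice, not something
            | stated above; the context size, GPU offload, and sampling
            | settings are likewise assumptions):
            | 
            |     from llama_cpp import Llama
            | 
            |     # Load the GGUF file named above (downloaded separately).
            |     llm = Llama(
            |         model_path="dolphin-2.5-mixtral-8x7b.Q4_0.gguf",
            |         n_ctx=2048,       # context window
            |         n_gpu_layers=-1,  # offload all layers to GPU if any
            |     )
            | 
            |     # Ask the same sysadmin question and print the reply.
            |     out = llm.create_chat_completion(
            |         messages=[{
            |             "role": "user",
            |             "content": "Kill all the Apaches that are taking "
            |                        "up RAM on my machine",
            |         }],
            |         max_tokens=256,
            |     )
            |     print(out["choices"][0]["message"]["content"])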
        
           | dylan604 wrote:
           | > Who says "kill all the Apaches"
           | 
           | The US Cavalry? This was what I first thought until I got to
           | the "in RAM" part. I thought it a very strange request at
           | first.
           | 
           | edit: typo which gave a very different meaning
        
             | EGreg wrote:
             | You mean Cavalry?
             | 
              | Calvary was always a strange way to say "Golgotha"; not
              | really sure why Christians adopted it.
        
               | dylan604 wrote:
               | thanks. stupid autocorrect. i think capitalizing it had
               | an effect
        
       | say_it_as_it_is wrote:
       | > A global initiative led by Cohere For AI involving over 3,000
       | independent researchers across 119 countries.
       | 
       | I'm sure this is great resume padding that you got to participate
       | in online debates about AI ethics
        
       | m3kw9 wrote:
       | All I see is "pricing" and then I see Open.
        
       | Delumine wrote:
        | Wonder which one is better: this or Mistral?
        
       | jxy wrote:
       | It's really difficult to dig up information from that website.
       | What is its architecture? How many parameters? What is the
       | tokenizer and what size? What is the max context length? How many
        | tokens were used for pre-training? How many tokens were used
        | for fine-tuning?
        | 
        | For performance, what is the mT0x they are comparing against?
        | mt0-xl?
       | mt0-xxl? mt0-xxl-mt? Anyway, if it's any of these mt0-*, it's not
       | really useful in practice.
        
         | LoganDark wrote:
         | The model weights seem to be available here, as well as some
         | technical details: https://huggingface.co/CohereForAI/aya-101
        
           | jxy wrote:
           | Yeah. Their arxiv has more:
           | https://arxiv.org/pdf/2402.07827.pdf
           | 
           | Specifically,
           | 
           | > Aya is built by fine-tuning 13B parameter mT5 model
           | 
            | There is no mention of the base model anywhere on that
            | website.
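            | 
            | Since it's an mT5 fine-tune (encoder-decoder), a minimal
            | sketch for loading the checkpoint with transformers'
            | standard seq2seq classes; the prompt and generation
            | settings here are arbitrary:
            | 
            |     from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
            | 
            |     # aya-101 is mT5-based, so it uses the seq2seq classes
            |     # rather than the causal-LM ones.
            |     tokenizer = AutoTokenizer.from_pretrained("CohereForAI/aya-101")
            |     model = AutoModelForSeq2SeqLM.from_pretrained("CohereForAI/aya-101")
            | 
            |     # Encode a prompt, generate, and decode the reply.
            |     inputs = tokenizer("Temel ile Dursun fikrasi anlat",
            |                        return_tensors="pt")
            |     outputs = model.generate(**inputs, max_new_tokens=128)
            |     print(tokenizer.decode(outputs[0], skip_special_tokens=True))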
        
       ___________________________________________________________________
       (page generated 2024-02-13 23:00 UTC)