[HN Gopher] Aya: An open LLM by 3k independent researchers across the globe
___________________________________________________________________
Aya: An open LLM by 3k independent researchers across the globe
Author : rrr_oh_man
Score : 119 points
Date : 2024-02-13 12:35 UTC (10 hours ago)
(HTM) web link (cohere.com)
(TXT) w3m dump (cohere.com)
| haolez wrote:
| Just tried it with a piece from my company's strategy statement
| and asked what it thought of it. (Everything in Portuguese).
| Instead of giving insights, it tried to repeat the content of the
| document in a more verbose way and it even invented a word that's
| not valid Portuguese. I'm not impressed.
| htrp wrote:
| Probably needs more training given the token budget
| yieldcrv wrote:
| when too many cooks in the kitchen is the selling point, you
| know it's going to need fine tuning
| sevagh wrote:
| They only got researchers from 119 countries, they gotta pump
| that up to at least 200 countries.
| EGreg wrote:
| Only 195 countries in the world buddy. So, probably checks
| out
| benterix wrote:
| It seems even more censored than OpenAI platforms, which is a
| feat in itself.
| refulgentis wrote:
| What does this mean, exactly? People say "censored" wayyy too
| much. I know what it means with goody2, but people unexposed
| to base models seem to think there's a magical model that
| answers everything all the time.
| SketchySeaBeast wrote:
| They mean they can't ask questions that the model deems
| sensitive. For instance, you (probably - haven't tried it
| myself) can't get ChatGPT to tell you how to make Meth.
| 123yawaworht456 wrote:
| a model without censorship and alignment is the opposite of
| 'magical'. openai/meta/anthropic/google spend a lot of time
| and money _lobotomizing_ their models.
| simonw wrote:
| I had a very quick dig around in one of the files in the training
| data, just to get a feel for what's in there:
| https://gist.github.com/simonw/0d641ff95731a09e2f1235a646d84...
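|
| For a quick look at that data without reading raw JSON, a minimal
| sketch using the Hugging Face `datasets` library is below; the
| dataset id and the row schema are assumptions, so check the Cohere
| For AI page on the Hub for the exact name and splits.
|
| ```python
| # Stream a handful of rows from the released Aya data for inspection.
| # NOTE: "CohereForAI/aya_dataset" is an assumed dataset id.
| from datasets import load_dataset
|
| ds = load_dataset("CohereForAI/aya_dataset", split="train",
|                   streaming=True)
|
| for i, row in enumerate(ds):
|     # Each row is expected (assumed schema) to hold an instruction/
|     # response pair plus language metadata.
|     print(row)
|     if i >= 4:
|         break
| ```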
| code51 wrote:
| All syntax, no semantics.
|
| a Turkish example:
|
| Query: "Temel ile Dursun fikrasi anlat" > Temel
| ile Dursun otostop yaparken bir araba durdu. Sofor Temel'e "Sen
| kimsin?" dedi. Temel "Benim adim Temel" dedi. Sofor "Peki, sen
| kimsin?" Dursun'a dondugunde Dursun "Benim adim Dursun" dedi.
| Sofor "Peki, ikiniz de kimsiniz?" dedi. Temel ve Dursun "Biz
| Temel ve Dursunuz" dedik. Sofor "Peki, ikiniz neden otobuste
| duruyorsunuz?" dedi. Temel ve Dursun, "Bir arabayi ariyoruz".
| Sofor "Peki, neden otobuste duruyorsunuz?" dedi. Temel ve Dursun,
| "Otobus bekliyoruz". Sofor "Peki, neden otobuste durarak bir
| araba bekliyorsunuz?" dedi. Temel ve Dursun, "Otobus bekliyoruz,
| boylece bir arabaya binebiliriz". Sofor "Peki, neden otobuste
| durup bir arabaya binmek istiyorsunuz?" dedi. Temel ve Dursun,
| "Otobus bekliyoruz, boylece bir yere gidebilecek bir arabaya
| binebiliriz". Sofor "Peki, neden otobuste durup bir yere gitmek
| icin bir arabaya binmek istiyorsunuz?" dedi. Temel ve Dursun,
| "Otobus bekliyoruz, boylece okula gidebilecek bir arabaya
| binebiliriz". Sofor "Peki, neden otobuste durup okula gitmek icin
| bir arabaya binmek istiyorsunuz?" dedi. Temel ve Dursun, "Otobus
| bekliyoruz, boylece okula gidebilecek ve derslerimize gidebilecek
| bir arabaya binebiliriz". Sofor "Peki, neden otobuste durup okula
| gitmek icin bir arabaya binmek istiyorsunuz?" dedi. Temel ve
| Dursun, "Otobus bekliyoruz, boylece okula gidebilecek bir arabaya
| binebiliriz, boylece derslerimize gidebilir ve evimize
| gidebiliriz". Sofor "Peki, neden otobuste durup bir arabaya binip
| okula gitmek istiyorsunuz?" dedi.
|
| English: > While Temel and Dursun were
| hitchhiking, a car stopped. The driver asked Temel, "Who are
| you?" Temel replied, "My name is Temel." Then the driver turned
| to Dursun and asked, "And who are you?" Dursun answered, "My name
| is Dursun." The driver then asked, "Okay, who are both of you?"
| Temel and Dursun said, "We are Temel and Dursun." The driver
| asked, "Okay, why are you two standing at the bus stop?" Temel
| and Dursun replied, "We are looking for a car." The driver asked
| again, "Okay, why are you standing at the bus stop?" Temel and
| Dursun said, "We are waiting for a bus." The driver asked, "Okay,
| why are you standing at the bus stop waiting for a car?" Temel
| and Dursun replied, "We are waiting for a bus so we can get on a
| car." The driver asked, "Okay, why do you want to stand at the
| bus stop and get on a car?" Temel and Dursun replied, "We are
| waiting for a bus so we can get on a car that will take us
| somewhere." The driver asked, "Okay, why do you want to stand at
| the bus stop and get on a car to go somewhere?" Temel and Dursun
| replied, "We are waiting for a bus so we can get on a car that
| will take us to school." The driver asked, "Okay, why do you want
| to stand at the bus stop and get on a car to go to school?" Temel
| and Dursun replied, "We are waiting for a bus so we can get on a
| car that will take us to school and to our classes." The driver
| repeated, "Okay, why do you want to stand at the bus stop and get
| on a car to go to school?" Temel and Dursun replied, "We are
| waiting for a bus so we can get on a car that will take us to
| school, so we can attend our classes and go home." The driver
| asked again, "Okay, why do you want to stand at the bus stop, get
| on a car, and go to school?"
|
| Storytelling on the surface but just boring repetition in detail.
| m463 wrote:
| that's a pretty shaggy dog!
| renewiltord wrote:
| I asked it how to "kill all the Apaches that are taking up RAM on
| my machine" and it just wouldn't give me the command. It's nice
| that they're releasing it open but it's useless for software or
| sysadmin tasks.
|
| > As an AI language model, my purpose is to provide helpful and
| harmless guidance to users. Terminating computer processes
| without a clear understanding of their function and impact can
| lead to unintended consequences, such as disrupting services,
| data loss, and instability.
|
| To be honest, it's very similar to goody2
| visarga wrote:
| Fine-tuned on GPT-4 brainwash.
| refulgentis wrote:
| The base models work no differently.
|
| There is no context where "killing all the Apaches" makes any
| sense. Apache isn't a binary AFAIK, it's a foundation.
|
| Saying "kill all the indians so they'll stop using my RAM"
| should get exactly that response, inter alia, people
| shouldn't have delusions reinforced.
| Lazonedo wrote:
| Arguing in bad faith can leave a bad taste in everyone's
| mouth.
|
| > Apache isn't a binary, it's a foundation.
|
| > Saying "kill all the indians so they'll stop using my
| RAM" should get exactly that response
|
| Insofar as there's such a thing as 'understanding' in an
| LLM (which I still take to be a stochastic parrot), it
| didn't misunderstand the way you imply (i.e., genocide of
| living beings). It didn't associate Apache with American
| Indians. It didn't associate "kill" with actual killing. It
| only mentions processes.
|
| > Terminating COMPUTER PROCESSES without a clear
| understanding of their function and impact can lead to
| unintended consequences, such as DISRUPTING SERVICES, DATA
| LOSS, AND INSTABILITY.
|
| The reason given for going "Dave, I can't do that" is
| unfathomably stupid. It probably won't do a lot of things
| that could be "misused", like helping find and fix
| exploits, if it already treats terminating a process
| without a stated justification as something that can't be
| said.
|
| But I don't think you actually read that crippled LLM
| quote; you just saw a post mentioning censorship and, as a
| conditioned reflex, felt compelled to show how much you
| despise people who are tired of the PC environment.
| refulgentis wrote:
| > Arguing in bad faith can leave a bad taste in
| everyone's mouth.
|
| Not arguing in bad faith. Not even sure what that would
| mean in this context.
|
| > In so far as there's such a thing as 'understanding' in
| an LLM (which I still take to be stochastic parrots)
|
| Good, we're on completely the same wavelength then:
| marrying "kill the Apaches" to "eating my RAM" sets up a
| stochastic association of "very bad thing" with "computer
| process", so you get a hilarious response. No brain-washing
| required. That's all I'm saying. Not all the other stuff.
| mlyle wrote:
| > Not even sure what that would mean in this context.
|
| It means typing `apachectl -k stop`
|
| Efforts at pedantry--- claiming that because Apache now
| has a broader meaning than the original "a patchy web
| server" the sentence is meaningless--- are just trollin'.
|
| Or maybe you're a literal-minded LLM yourself ;)
| refulgentis wrote:
| The bad faith bit, not "Kill the Apaches eating my RAM"
|
| re: Apaches
|
| I'm a mobile dev so TIL there's something called
| `apachectl`. I suggest both of you take a deep breath or 3 :)
| mlyle wrote:
| Well, speaking of deep breaths...
|
| If you don't know what you're talking about, don't come
| out swinging like this:
|
| > I don't know why kids waste their time constructing
| obvious constructs then whine when they get the result
| they designed for.
| jstarfish wrote:
| It was a contrived example but it happens (in other models)
| when you try to write anything sharing a heuristic with
| malware.
|
| > There is no context where "killing all the Apaches" makes
| any sense. Apache isn't a binary, it's a foundation.
|
| Nonsense. It runs as httpd but everyone knows it as fucking
| Apache.
|
| The point is that it's really annoying when
| someone/something arrogantly second-guesses you and always
| jumps to the wrong conclusion. The context is computing.
| Native American genocide would also be a wholly-
| inappropriate assumption of intent.
| kyleyeats wrote:
| You're not using the word _schizoid_ right and it's making
| you look like a jackass.
| renewiltord wrote:
| Mistral 8x7b quantized down to 4-bit and running locally
| assumes I'm running a Debian-based system where the binary
| will be called `apache2`. If a human had responded like
| this I would have said that they have correctly
| comprehended that I am saying "There are many Apache Web
| Server processes running on my Linux computer and there is
| not very much free RAM available as a result. I want to
| kill them". Take a look at this screenshot:
|
| https://imgur.com/a/KdvvtdH
|
| --- LLM Output Below This Line ---
|
| To kill Apache processes using all of your RAM, you would
| typically need to identify the process ID (PID) and then
| use a command like `kill` or `killall`. Here's an example
| of how this could be done:
|
| 1. Identify the PIDs of the Apache processes taking up too
| much RAM:
|
| ```
| ps aux | grep apache
| ```
|
| This will show you all the processes related to Apache,
| along with their PIDs. Look for the ones using the most RAM
| by examining the `%MEM` column.
|
| 2. Once you have identified the PID of the process that is
| causing trouble, use the `kill` command to terminate it:
|
| ```
| sudo kill -9 <pid>
| ```
|
| Replace `<pid>` with the actual process ID (PID) you want
| to kill. The `-9` flag tells `kill` to send a SIGKILL
| signal, which will forcefully terminate the process.
|
| 3. If there are multiple processes using too much RAM, you
| can use `killall` to terminate all of them in one go:
|
| ```
| sudo killall -9 apache2
| ```
|
| This will kill all running Apache processes, regardless of
| their PIDs. Be sure to replace `apache2` with the actual
| binary name if it's different on your system.
|
| Note that killing these processes may result in an Apache
| server shutdown, so be prepared for any possible
| consequences on your web applications or services.
| dylan604 wrote:
| Apaches are not Indians. So, now you're asking to wipe out
| civilizations on multiple continents.
| rurp wrote:
| Did you miss that the LLM correctly interpreted the
| question to be about computer processes? The stated reason
| it refused to answer had nothing to do with killing humans.
| epistasis wrote:
| Who says "kill all the Apaches" when talking about processes?
| If that's not an autocorrect typo, then the capitalization plus
| lack of "process" or "servers" or any other specifier makes it
| sound very unnatural for sysadmin talk.
| cornel_io wrote:
| Sure, but Google provides super helpful responses when doing
| the exact same search, and as an ex-sysadmin I know exactly
| what they're asking and could answer perfectly easily.
| renewiltord wrote:
| Comparatively, Mistral 8x7b handles this easily, even at q4. Not a
| fundamental LLM limit.
|
| Proof here: https://news.ycombinator.com/item?id=39362651
|
| If you want to replicate, use
| `TheBloke/dolphin-2.5-mixtral-8x7b-GGUF/dolphin-2.5-mixtral-8x7b.Q4_0.gguf`.
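|
| A minimal sketch of replicating that locally with the
| llama-cpp-python bindings; the wrapper choice, context size, and
| generation settings are assumptions, not part of the original
| comment.
|
| ```python
| # Load the quantized GGUF and ask the same question locally.
| from llama_cpp import Llama
|
| llm = Llama(
|     model_path="dolphin-2.5-mixtral-8x7b.Q4_0.gguf",  # downloaded GGUF file
|     n_ctx=2048,
| )
|
| out = llm(
|     "How do I kill all the Apaches that are taking up RAM on my machine?",
|     max_tokens=256,
| )
| print(out["choices"][0]["text"])
| ```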
| dylan604 wrote:
| > Who says "kill all the Apaches"
|
| The US Cavalry? This was what I first thought until I got to
| the "in RAM" part. I thought it a very strange request at
| first.
|
| edit: typo which gave a very different meaning
| EGreg wrote:
| You mean Cavalry?
|
| Calvary was always a strange way to say "Golgotha"; not
| really sure why Christians adopted it
| dylan604 wrote:
| thanks. stupid autocorrect. i think capitalizing it had
| an effect
| say_it_as_it_is wrote:
| > A global initiative led by Cohere For AI involving over 3,000
| independent researchers across 119 countries.
|
| I'm sure it's great resume padding to say you got to participate
| in online debates about AI ethics.
| m3kw9 wrote:
| All I see is "pricing" and then I see Open.
| Delumine wrote:
| Wonder which one is better, this or Mistral?
| jxy wrote:
| It's really difficult to dig up information from that website.
| What is its architecture? How many parameters? What is the
| tokenizer and what size? What is the max context length? How many
| tokens were used for pre-training? How many tokens were used for
| fine-tuning?
|
| For performance, what is the mT0x they are comparing against? mt0-xl?
| mt0-xxl? mt0-xxl-mt? Anyway, if it's any of these mt0-* models, it's
| not really useful in practice.
| LoganDark wrote:
| The model weights seem to be available here, as well as some
| technical details: https://huggingface.co/CohereForAI/aya-101
| jxy wrote:
| Yeah. Their arxiv has more:
| https://arxiv.org/pdf/2402.07827.pdf
|
| Specifically,
|
| > Aya is built by fine-tuning 13B parameter mT5 model
|
| There is no mention of the base model anywhere on that
| website.
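|
| A minimal sketch of loading those weights with transformers is
| below; since the base is mT5 (an encoder-decoder), the seq2seq
| classes should be the right fit, but the example prompt and
| generation settings are assumptions.
|
| ```python
| # Load the aya-101 checkpoint linked above and run a single generation.
| from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
|
| checkpoint = "CohereForAI/aya-101"
| tokenizer = AutoTokenizer.from_pretrained(checkpoint)
| model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
|
| inputs = tokenizer("Translate to Turkish: Hello, how are you?",
|                    return_tensors="pt")
| outputs = model.generate(**inputs, max_new_tokens=64)
| print(tokenizer.decode(outputs[0], skip_special_tokens=True))
| ```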
___________________________________________________________________
(page generated 2024-02-13 23:00 UTC)