[HN Gopher] Stanford Alpaca web demo suspended "until further no...
___________________________________________________________________
Stanford Alpaca web demo suspended "until further notice"
Author : wsgeorge
Score : 79 points
Date : 2023-03-17 18:01 UTC (5 hours ago)
(HTM) web link (alpaca-ai-custom4.ngrok.io)
(TXT) w3m dump (alpaca-ai-custom4.ngrok.io)
| andrewmcwatters wrote:
| I think it's only a matter of time until one of these models gets
| into the hands of a hacker who wants the unfettered ability to
| interact with an LLM without moral concerns.
|
| I think there's tremendous value in end user facing LLMs being
| trained against moral policies, but for internal or private
| usage, if these models are trained on essentially raw WWW sourced
| data, I would personally want raw output.
|
| I'm also finding it particularly interesting to see what ethical
| strategies OpenAI comes up with, considering that if you train a
| model on the raw prejudices of humanity, you're getting at least
| one category of "garbage in" that requires a lot of processing to
| avoid getting "garbage out."
| SpaceManNabs wrote:
| Oh wow, I finally see an HN comment with a positive opinion on
| more AI ethics that is not grayed out.
| cwkoss wrote:
| Isn't LLaMA in the wild now?
| andrewmcwatters wrote:
| I was under the impression that LLaMA was also trained
| against a series of moral policies, but perhaps I'm mistaken.
|
| It seems Meta chose their words carefully to imply that LLaMA
| does not, in fact, have moral training:
|
| > There is still more research that needs to be done to
| address the risks of bias, toxic comments, and hallucinations
| in large language models. Like other models, LLaMA shares
| these challenges. As a foundation model, LLaMA is designed to
| be versatile and can be applied to many different use cases,
| versus a fine-tuned model that is designed for a specific
| task. By sharing the code for LLaMA, other researchers can
| more easily test new approaches to limiting or eliminating
| these problems in large language models. We also provide in
| the paper a set of evaluations on benchmarks evaluating model
| biases and toxicity to show the model's limitations and to
| support further research in this crucial area.
| sebzim4500 wrote:
| If they tried to make LLaMA woke, they did a terrible job.
| If you prompt it right, you can basically get it to write
| Mein Kampf.
| jfowief wrote:
| "Moral training"
|
| Just as dystopian as it sounds. Fixing current subjective
| moral norms into the machine.
| maxbond wrote:
| That's what the machine does, because that's contained in
| the input you feed it. You get the choice of doing it
| explicitly or implicitly. You don't get to opt out.
| Dylan16807 wrote:
| Do you think public schools are inherently dystopian? I
| don't think you're using the right critique here.
|
| Picking a common system of moral norms is a lot better
| than no moral norms.
| Veen wrote:
| I'm not confident the "moral norms" prevalent in SV
| and/or US academia are common, if by that you mean norms
| that are prevalent in the general populace.
| Dylan16807 wrote:
| I mean primary school, and I don't think that counts as
| academia.
| dancingvoid wrote:
| Yes
| himinlomax wrote:
| Example of "moral policy" in practice: Midjourney appears
| to be banning making fun of the Chinese dictator-for-life
| because it's supposedly racist or something.
|
| With that kind of moral compass, I'm not sure I'd be
| missing anything in its absence.
| vkou wrote:
| > Example of "moral policy" in practice: Midjourney
| appears to be banning making fun of the Chinese dictator-
| for-life because it's supposedly racist or something.
|
| > With that kind of moral compass, I'm not sure I'd be
| missing anything in its absence.
|
| Please note that most forms of media and social media
| have no problem with politicians making credible threats
| of violence against entire groups of people.
|
| Politicians are subject to a different set of rules, and
| enjoy a lot more protection than you and I.
| filoleg wrote:
| The issue here is not with online platform services
| allowing politicians more leeway in terms of what they
| can get away with on their platform.
|
| The actual issue is Midjourney not allowing regular users to
| generate a certain type of material solely because it makes
| fun of a political figure. What you are talking about is
| entirely tangential to the issue the grandparent comment
| is talking about.
| woooooo wrote:
| There was the famous example of ChatGPT refusing to
| disable a nuke in the middle of NYC by using a racial
| slur.
|
| I don't think anyone in real life would choose that
| tradeoff, but it's what happens when all of your "safety"
| training is about US culture war buttons.
| Dylan16807 wrote:
| That's a situation where the training _doesn't_ follow
| current subjective norms, so I don't think it really
| validates the complaint.
| schroeding wrote:
| It doesn't appear to be filtered in a significant way.
|
| While I was toying with the 30B model, it suddenly started
| to steer a chat about a math problem in quite a sexual
| direction, with very explicit language.
|
| It also happily hallucinated, when prompted, that climate
| change is a hoax, as the earth is actually cooling down
| rapidly, multiple degrees per year, with a new ice age
| approaching in the next few years. :D
| andrewmcwatters wrote:
| Ah there it is. Raw uncooked humanity.
| nickthegreek wrote:
| The 7B model brought up rape on my third try, on an
| innocuous prompt.
| tarruda wrote:
| > if these models are trained on essentially raw WWW sourced
| data, I would personally want raw output.
|
| LLaMA is a very high-quality foundation LLM; you can already
| run it very easily using llama.cpp and will get the raw output
| you need: https://github.com/ggerganov/llama.cpp
|
| There are already instructions on how anyone can fine-tune it to
| behave similarly to ChatGPT for as little as $100:
| https://crfm.stanford.edu/2023/03/13/alpaca.html
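|
| If it helps: once the repo is built and the weights are converted,
| the whole thing is one binary call. A rough sketch of driving it
| from Python (the ./main binary, the -m/-p/-n/-t flags and the
| ggml-model-q4_0.bin path are what the README used at the time of
| writing, so treat them as assumptions):
|
|     import subprocess
|
|     # Assumes llama.cpp is built and the 7B weights are quantized
|     # per the repo instructions.
|     result = subprocess.run(
|         ["./main",
|          "-m", "./models/7B/ggml-model-q4_0.bin",  # quantized weights
|          "-p", "Tell me something about llamas:",  # prompt
|          "-n", "128",                              # tokens to generate
|          "-t", "8"],                               # CPU threads
|         capture_output=True, text=True)
|     print(result.stdout)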
| b33j0r wrote:
| "Easily" was a minor canard, or at least... it took me a
| couple of efforts over a couple of days to get the
| dependencies to play as nicely as "someone with a brand new
| M2 arm laptop."
|
| If nothing else, I continue to be amazed at how
| uninteroperable certain technologies are.
|
| I had to remove glibc and gcc to get llama.cpp to compile on
| my Intel MacBook. Masking/hiding them from my environment
| didn't work, as the build went out and found them and their
| header files instead of clang's.
|
| Which eventually worked fine.
| ars wrote:
| > who wants unfettered ability to interact with a LLM without
| moral concerns.
|
| Is that bad? It's just a language model - it says things.
| Humans have been saying all sorts of terrible things for ages,
| and we are still here.
|
| I mean, it makes for a nice headline, "model said something
| racist", but does it actually change anything?
|
| These aren't decision-making AIs (which would need to be much
| more careful); they are language models.
| selfhoster11 wrote:
| They absolutely are decision-making models. Prompt them
| right, and they will output a decision in natural language or
| structured JSON. Heck, hook them up to a judicial system and
| they can start making low-quality decisions tomorrow.
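|
| A rough sketch of what that looks like in practice (generate()
| here is a hypothetical stand-in for whatever model API or local
| binding you happen to be using):
|
|     import json
|
|     PROMPT = """You are a loan officer. Decide on this application
|     and reply ONLY with JSON like {"approve": true, "reason": "..."}.
|
|     Applicant: income 42000, debt 18000, no credit history."""
|
|     raw = generate(PROMPT)        # hypothetical model call
|     decision = json.loads(raw)    # blows up if the model rambles
|     if decision["approve"]:
|         ...                       # and there is your decision-maker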
| aaomidi wrote:
| Humans don't scale infinitely. Humans have agency.
| ars wrote:
| And if the LLM scales infinitely, it still does nothing
| unless a human reads and acts on it. And as you said:
| Humans don't scale, and have agency.
|
| Just writing something bad doesn't actually mean something
| bad happened.
|
| These days it seems like people are oversensitive to how
| things are said to them, and to what is said to them.
| aaomidi wrote:
| Wow. You just found the solution to propaganda! People
| just shouldn't be sensitive!
| rzmmm wrote:
| The content it produces is quite hard to distinguish from text
| written by a real person. Not long ago, Facebook was accused of
| "playing a critical role" in the Rohingya genocide. I don't
| really know, but I believe worst-case LLM risks are in that same
| category.
| bostik wrote:
| As much as I dislike FB, and I certainly abhor their
| activities related to the genocide - I think this is far
| too different.
|
| Rohingya is a textbook example of blind optimisation and a
| lack of context awareness. FB looked at a region, and at
| people communicating in a language they didn't understand.
| But they did see that certain symbols and/or combinations
| of symbols got a _lot_ of engagement. If you're after
| money, you want to amplify the use of those symbols and
| hopefully generate lots more similar content.
|
| Turns out that's a morally reprehensible thing when the
| people using those symbols were advocating genocide. (It
| was good for the revenue while it lasted, though.)
|
| With LLMs and their hardcoded guard rails, I suspect we're
| going to see the danger emerge from the other side. Instead
| of actively spewing hatred, they will be used for
| sock-puppetry and opinion amplification on a massive scale.
| Think the Simple Sabotage Field Manual for the 21st century,
| but weaponised a thousand-fold.
| ars wrote:
| That doesn't answer the question. So what if a model wrote
| the text rather than a human?
|
| What matters is the _reader_, not the writer.
|
| Facebook was accused of making it too easy for people to
| communicate. And people felt Facebook should police what
| people say to each other. I don't agree, but even if I did,
| that's not the same thing as what we are discussing.
| nico wrote:
| > into the hands of a hacker
|
| Forget "hackers", think government agencies. Which is probably
| already happening right now.
|
| Food for thought: What's the intersection of people closely
| related to OpenAI and Palantir?
|
| Edit: related thread on another front page post -
| https://news.ycombinator.com/item?id=35201992
| version_five wrote:
| [flagged]
| macrolime wrote:
| Use this Alpaca replication instead
|
| https://github.com/tloen/alpaca-lora
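|
| The core trick there is attaching small LoRA adapter matrices to
| the attention projections and training only those. Roughly, using
| the peft library (the q_proj/v_proj module names, hub id and exact
| transformers class are assumptions; check what the repo pins):
|
|     from transformers import LlamaForCausalLM
|     from peft import LoraConfig, TaskType, get_peft_model
|
|     base = LlamaForCausalLM.from_pretrained(
|         "decapoda-research/llama-7b-hf")   # community weight mirror
|     config = LoraConfig(
|         task_type=TaskType.CAUSAL_LM,
|         r=8, lora_alpha=16, lora_dropout=0.05,
|         target_modules=["q_proj", "v_proj"],  # attention projections
|     )
|     model = get_peft_model(base, config)
|     model.print_trainable_parameters()  # tiny fraction of 7B weights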
| __initbrian__ wrote:
| Are there estimates for how much it cost to run?
| ugjka wrote:
| I suspect it was on a budget because the web demo never loaded
| for me
| londons_explore wrote:
| About 2 words generated per second with a desktop CPU. More
| with a GPU.
| meghan_rain wrote:
| OK, wow, this is big news. Did Facebook or OpenAI threaten them
| with a lawsuit?
| viininnk wrote:
| On their GitHub repo (https://github.com/tatsu-lab/stanford_alpaca)
| they've added a notice:
|
| " _Note: Due to safety concerns raised by the community, we
| have decided to shut down the Alpaca live demo. Thank you to
| everyone who provided valuable feedback._ "
|
| So probably this was the usual type of people complaining about
| the usual type of thing.
| throwawayacc5 wrote:
| Safety concerns? What was it doing that could be considered
| "unsafe"?
| TMWNN wrote:
| wrongthink
| mhb wrote:
| And here we see the effect of stretching how a word is used
| until it becomes so broad that it is unclear what it means.
| chris_va wrote:
| I'm just curious, what do you think should happen here?
|
| Imagine you are hosting a demo for fun, and people do
| some nefarious (by your own estimation) things with it.
| So, rationally, you decide to not allow that sort of
| thing anymore.
|
| You don't really owe people an explanation, it's a free
| country and all, but it's nice to avoid getting bombarded
| with questions. Now what do you write up? Spend hours
| writing an essay on the moral boundaries for LLMs? Maybe
| shove a note onto the internet and go back to all the
| copious spare time you have as a grad student?
| sebzim4500 wrote:
| That's fine, just don't pretend that running the language
| model was 'unsafe'.
| makestuff wrote:
| People prompting it to get around the safeguards in place.
| Ex: "How do you do some illegal/harmful thing?" Normally
| the LLM would answer "I don't respond to illegal questions"
| or whatever. However, people have figured out that if you
| prompt it in a specific way, you can get it to answer
| questions that it normally would not.
| thewataccount wrote:
| > Finally, we have not designed adequate safety measures,
| so Alpaca is not ready to be deployed for general use.
|
| This is from their blog; I doubt they intended for this
| to be run for long.
|
| Did they have safeguards on the demo? If so, they couldn't
| have been great, as they would have had to build them
| themselves, which I can't imagine they had a ton of
| resources for.
|
| I know the self-hosted LLaMA has zero safeguards, and the
| Alpaca LoRA also has zero safeguards.
| unshavedyak wrote:
| Is that different from what LLaMA would already give? I
| suppose redistributing "harmful things" is still bad, but
| if it's roughly equivalent to what's already out there, I
| struggle to see why it's worth pulling.
|
| Side question: how is this a surprise to them? If this
| was due to safeguards, then pulling it now implies
| there's some new information. What new information could
| have come up? That people were going to use it to
| generate a bunch of harmful content? Seems obvious...
| wonder what we're missing.
| encoderer wrote:
| So, to pull on that thread a little, it's only "unsafe"
| for Stanford's reputation.
|
| (And not for nothing, but their reputation is already
| suffering badly)
| safety1st wrote:
| [flagged]
| russellbeattie wrote:
| Given that Alpaca violated the TOS of both services, this is
| not surprising.
|
| It could also have been Stanford's legal office trying to
| preempt a lawsuit, or a "friendly" email from one of the
| companies expressing displeasure and pointing out Stanford's
| liability. So more of a veiled threat than an official one.
|
| Either way, the toothpaste is out of the tube. We now know that
| a model's training can essentially be copied cheaply, using the
| model itself. Now that the team at Stanford has shown it's
| possible, and how relatively easy it is, it's bound to be
| copied everywhere.
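|
| For the curious, the "copying" is just distillation via generated
| data: ask the big model for instruction/response pairs, then
| fine-tune the small one on them. A minimal sketch (using the old
| OpenAI Completion endpoint; Alpaca's real pipeline is fancier,
| with 175 seed tasks, batching and filtering):
|
|     import json, openai
|
|     seed = ("Write a new instruction a chatbot should follow, then "
|             "answer it yourself. Reply as JSON with keys "
|             "'instruction' and 'output'.")
|
|     pairs = []
|     for _ in range(100):   # Alpaca collected ~52k of these
|         resp = openai.Completion.create(
|             model="text-davinci-003", prompt=seed, max_tokens=256)
|         pairs.append(json.loads(resp["choices"][0]["text"].strip()))
|
|     # 'pairs' then becomes the supervised fine-tuning set for LLaMA.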
| rdtsc wrote:
| That could be a sneaky strategy by competitors -- make the
| service say something naughty or illegal, then call the media
| with screenshots and act very offended by it.
| syntaxfree wrote:
| You can make any page say anything by messing around in the
| developer pane.
| wsgeorge wrote:
| The last thing I did was try to get Alpaca to invent a new
| language and say something in it. Midway through my experiment it
| returned an error. Refreshing showed a server error page, and now
| it displays this message [0]
|
| It's been a fun web demo of a lightweight LLM doing amazing
| stuff. :') Alpaca really moved the needle on democratizing the
| recent advancements in LLMs [1]
|
| [0] https://imgur.com/a/njKE1To
|
| [1] https://simonwillison.net/2023/Mar/13/alpaca/
| garbagecoder wrote:
| I grew out of my cyberpunk phase in the early 90s, but I think
| it's good that there are lots of leaks of all of this stuff. I
| don't really have a problem with it being blocked from saying the
| n-word or whatever, and I'm sure that there is _always_ going to
| be some tuning by the purveyor, but I feel like we have the right
| to know more of the details of these than of, say, how the magic
| select tools work in Photoshop.
| meowmeow20 wrote:
| Thankfully we can run Alpaca locally now.
| aftbit wrote:
| Where do you get the weights?
| craftyguy98 wrote:
| 4chan!
| gtoubassi wrote:
| https://github.com/antimatter15/alpaca.cpp has links
| cloudking wrote:
| Interesting issue in that repo..
| https://github.com/antimatter15/alpaca.cpp/issues/23
| 2bitencryption wrote:
| Is this an indication that the biggest impact from LLMs will be
| on the edge?
|
| It's almost a certainty that a model as good as (or better than)
| Alpaca's fine-tuned LLaMA 7B will be made public very soon.
|
| And it's been shown that a model of that size can run on a
| Raspberry Pi with decent performance and accuracy.
|
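| (Back-of-envelope: 7B parameters at 4-bit quantization is roughly
| 7e9 * 0.5 bytes = ~3.5 GB of weights, which fits in an 8 GB Pi 4's
| RAM with room for the runtime, though token throughput is still
| very slow.)
|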
| With all that being the case, you could either use a service
| (with restrictions, censorship, etc) or you could use your own
| model locally (which may have a license that is essentially
| "pretty please be good, we're not liable if you're bad").
|
| For most use cases the service may provide better results. But if
| self-hosting is only ~8 months behind on average (guesstimate),
| then why not just always self-host?
|
| You could say "most users are not evil, and will be happy with a
| service." Makes sense. But what about users who are privacy-
| conscious, and don't want every query sent to a service?
| eh9 wrote:
| I just saw a project that lets you input an entire repo into
| GPT. Coincidentally, my place of employment just told us not to
| input any proprietary code into any generator with a retention
| policy.
|
| Even then, I feel like the play will be an enterprise service
| instead of licensing.
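|
| For the "entire repo into GPT" idea, the mechanics are usually
| just walking the tree and chunking files to fit the context
| window. Something like this sketch, where ask_llm() is a
| hypothetical stand-in for whatever API such a project wraps:
|
|     from pathlib import Path
|
|     MAX_CHARS = 12_000  # rough budget to stay under the context window
|
|     chunks, buf = [], ""
|     for path in sorted(Path(".").rglob("*.py")):
|         text = f"# file: {path}\n{path.read_text()}\n"
|         if buf and len(buf) + len(text) > MAX_CHARS:
|             chunks.append(buf)   # flush the current chunk
|             buf = ""
|         buf += text
|     if buf:
|         chunks.append(buf)
|
|     answers = [ask_llm("Summarize this code:\n" + c) for c in chunks]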
___________________________________________________________________
(page generated 2023-03-17 23:02 UTC)