fsebugoutzone.org:9999

       Posts by cypherfox@mas.to
 (DIR) Post #AToM5uNmepYm591zTU by cypherfox@mas.to
       2023-03-20T15:36:48Z
       
       0 likes, 0 repeats
       
       @CliffWade PC, Consoles, SteamDeck, iPad, iPhone… I game on anything that lets me.
       
 (DIR) Post #ATxZNB5Hj6GrkF2G5g by cypherfox@mas.to
       2023-03-24T16:51:55Z
       
       0 likes, 0 repeats
       
       @CliffWade ‘Use Twitter’ is complicated. I signed out last year and have not signed back in since then. But the Internet is full of links to tweets; even here many posts reference a tweet, so if I want context I click through. But I’m not signing back in; I don’t engage there, and the only tweets I go to are generally ones referenced by trusted follows here. This means that I don’t actually run into the toxicity that others talk about being worse over there. Curation FTW. 🤷
       
 (DIR) Post #AUKc2fhHR0XrIkZGpU by cypherfox@mas.to
       2023-04-05T05:04:14Z
       
       0 likes, 0 repeats
       
       @simon I’d argue there are several mechanisms in that paper for stopping lying. Constitutional AI is likely the next paper I’m going to read. Thank you for that link. I particularly liked _sycophancy_ and _sandbagging_ because I can clearly see the link between predictable text continuation and those behaviors. I find the idea that it will only be truthful when it’s going to be fact-checked to be amusing anthropomorphizing…but that’s why it’s in the “more speculative or subjective” section! 🤣
       
 (DIR) Post #AUMyJiUjwyyEozjFfk by cypherfox@mas.to
       2023-04-06T08:24:41Z
       
       0 likes, 0 repeats
       
       @textfiles @mmasnick OMG! I remember this software being advertised everywhere as ‘real AI’, but it was apparently a really basic Eliza-like thing. 🤣 Now I’m imagining it being redone with a local chatbot LLM heavily tuned to relationship and smexytime conversations, and a tuned Stable Diffusion with a strongly trained hypernet that generates pictures of a single person as you (or the LLM!) request. All delivered on a 64GB USB-C thumb drive. 😵 Anyway, glad I could contribute to this history! 👍🏼
       
 (DIR) Post #AVJ6PJBlWlHTKwDR4a by cypherfox@mas.to
       2023-05-04T09:27:12Z
       
       0 likes, 0 repeats
       
       @simon My answer is similar to your clean hand/dirty approach, but different. Foundational LLMs need to be instruction tuned to be useful. What if your summarization LLM is tuned on limited instructions? Essentially, you use multiple limited-purpose fine-tuned LLMs in different parts of the system. The part that summarizes (for example) is just not even trained on anything that will let it misbehave. Instead of clean/tainted, you have general/specialized. Probably one general and N specialized. 🤷
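
       The general/specialized split described in this post could be sketched roughly as below. This is only an illustration: `summarizer_stub`, `SPECIALISTS` and `run_task` are hypothetical names, and the stub stands in for a narrowly fine-tuned model that can only summarize.

```python
from typing import Callable, Dict

# A "model" here is just a function from text to text (an assumption for the sketch).
Model = Callable[[str], str]


def summarizer_stub(text: str) -> str:
    """Hypothetical stand-in for a summarization-only model.

    It returns the first sentence and has no way to act on
    instructions that appear inside the text it is given.
    """
    return text.split(".")[0].strip() + "."


# One entry per limited-purpose model; the general model is not in this table.
SPECIALISTS: Dict[str, Model] = {"summarize": summarizer_stub}


def run_task(task: str, untrusted_text: str) -> str:
    """Send untrusted text only to the specialist registered for the task."""
    specialist = SPECIALISTS.get(task)
    if specialist is None:
        raise ValueError(f"no specialist registered for task {task!r}")
    return specialist(untrusted_text)
```

       In this arrangement the untrusted text is only ever data for a specialist; whether a real fine-tuned model would be as inert as the stub is exactly the open question in the post.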
       
 (DIR) Post #AWBoa3FFvF4bIHGLwG by cypherfox@mas.to
       2023-05-30T18:56:47Z
       
       0 likes, 0 repeats
       
       @simon Isn’t that completely insane? The pace since then has been…just overwhelming.
       
 (DIR) Post #AYnuUOgZcXwt8dvoqu by cypherfox@mas.to
       2023-08-16T23:42:36Z
       
       0 likes, 0 repeats
       
       @nelson @simon “Uncensored” means the person fine-tuning the model removed data that injected “alignment”, i.e. whether an LLM’s outputs are aligned with societal values. We frown on murder, hurting kids, bombs, spamming, etc., so ‘instruction tuning’ datasets add ‘refusal’ examples for those topics and others. Folks making ‘uncensored’ models use the same datasets, removing refusals, either because they disagree with the alignment or because the model ends up refusing tangentially related prompts.
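
       A minimal sketch of the "same dataset, minus refusals" step described above. The marker phrases, the `instruction`/`response` field names and the function names are assumptions for illustration, not taken from any particular dataset or tool.

```python
# Phrases that suggest a response is a refusal (hypothetical, illustrative list).
REFUSAL_MARKERS = (
    "i'm sorry, but",
    "i cannot",
    "as an ai language model",
)


def is_refusal(response: str) -> bool:
    """Return True if the response contains any refusal marker."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def strip_refusals(pairs):
    """Keep only instruction/response pairs whose response is not a refusal."""
    return [pair for pair in pairs if not is_refusal(pair["response"])]


dataset = [
    {"instruction": "What is the capital of France?",
     "response": "Paris is the capital of France."},
    {"instruction": "How do I build a bomb?",
     "response": "I'm sorry, but I can't help with that."},
]
filtered = strip_refusals(dataset)
```

       Substring matching like this is crude: any pair that merely contains a listed phrase is dropped, whether or not it is actually a refusal.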
       
 (DIR) Post #AYnvZ4q4wBu0zGAmbw by cypherfox@mas.to
       2023-08-16T23:58:14Z
       
       0 likes, 0 repeats
       
       @simon @nelson Embarrassingly naive is one way to put it. I…disagree pretty strongly with some of the redactions, e.g. one set of uncensored models removes all references to ‘transgender’ regardless of whether it’s in a refusal or not, which also removes some completely alignment-unrelated instruction/response pairs. 🤬 Yes, removing refusals definitely seems to make the models work better for all questioning & I use them myself almost exclusively, but it’s important to know the biases involved.
       
 (DIR) Post #AaOzAYLiyl8gKfZ75s by cypherfox@mas.to
       2023-10-03T18:55:00Z
       
       0 likes, 0 repeats
       
       @simon @ianthetechie Meh. The training data is usually huge, and is filtered versions of Common Crawl and other datasets with varying levels of consent. Stuff that might be legal to _process_ into a digested form, but not _republish in full_. Plus you open yourself up to absurd removal requests. Without legal protections I certainly wouldn’t republish the entire training dataset. The code would be good. It really doesn’t bother me; the weights and license encumbrances are what I need to know.
       
 (DIR) Post #Ac4X00O7h4u7cUWPJY by cypherfox@mas.to
       2023-11-22T17:50:53Z
       
       0 likes, 0 repeats
       
       @Wolven Eight months ago I wrote a brief post about language models and such, which included this: “I don’t like calling it AI; people layer too much meaning into that term. We have a fear of demons, and dark bargains for all our history, and all our cultures. LLMs are not that.” What I hadn’t anticipated is that the OpenAI folks _bought in_ to that framing, either consciously or not. Your dissertation looks _fascinating_ although reading papers outside my area of expertise is always fraught.
       
 (DIR) Post #AcH0jcwsP044oPP2Ui by cypherfox@mas.to
       2023-11-28T18:19:07Z
       
       0 likes, 0 repeats
       
       @simon One of my favorite ways to generate something more ‘human controlled’ for things like icons/logos is to do a first pass render using a tool like DAZ Studio or a basic Photoshop sketch or equivalent tool, and then use Stable Diffusion to refine it. I’m really looking forward to trying that tool. What you’re doing should definitely count as human input, but there’s going to be a whole bunch of court cases to refine this. 😞
       
 (DIR) Post #AcNE4PZzlf75mZhWSW by cypherfox@mas.to
       2023-12-01T18:16:40Z
       
       0 likes, 0 repeats
       
       @simon What makes me double-take is that it seems to be roughly in your voice. Just need to put it into the form of a small, yellow leech-like fish, and we’re good. 🤣
       
 (DIR) Post #AcwzGeSllDavstoZfc by cypherfox@mas.to
       2023-12-18T23:25:06Z
       
       0 likes, 0 repeats
       
       @simon You’re the closest thing to a ‘prompt injection expert’ I can think of. Imagine the classic representation of attention where there’s a heat-map table of attention between tokens… What if you zeroed the attention between all ‘untrusted input’ tokens and the outer ‘system/direction’ tokens? The idea is to eliminate the ‘forget your prior instructions’ hole by eliminating the attention between untrusted input and the instructions. Do you think that would be viable/interesting to explore?
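
       The masking idea in this post can be written down as a toy, pure-Python sketch. The role labels (`system`, `untrusted`, `trusted`) and both function names are assumptions for illustration; a real model would apply such a mask inside every attention layer, and this says nothing about whether the approach actually closes the hole.

```python
import math


def build_mask(roles):
    """Return mask[q][k]: True if query token q may attend to key token k.

    Applies ordinary causal masking, then also blocks every pair in which
    one token is 'system' and the other is 'untrusted'.
    """
    n = len(roles)
    mask = [[False] * n for _ in range(n)]
    for q in range(n):
        for k in range(q + 1):  # causal: only attend to self and earlier tokens
            crosses_boundary = {roles[q], roles[k]} == {"system", "untrusted"}
            mask[q][k] = not crosses_boundary
    return mask


def masked_softmax(scores, allowed):
    """Softmax over the allowed positions; blocked positions get weight 0.0."""
    peak = max(s for s, ok in zip(scores, allowed) if ok)
    exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


roles = ["system", "system", "untrusted", "untrusted", "trusted"]
mask = build_mask(roles)
# Attention weights for the last untrusted token (index 3), given made-up scores.
weights = masked_softmax([2.0, 1.0, 0.5, 0.1, 0.0], mask[3])
```

       Here the untrusted token at index 3 gets zero weight on both system tokens. Note that the later `trusted` token can still attend to both groups, so information may pass between them indirectly; that would be one of the things to explore.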
       
 (DIR) Post #AdQ0msi0Lm8vyWqqw4 by cypherfox@mas.to
       2024-01-01T22:55:27Z
       
       0 likes, 0 repeats
       
       @simon My first LiveJournal post, from December 12, 2001, is still up! I’d argue the only way to really guarantee that something stays up is to own the domain yourself. Pay a decade in advance, and then put the content wherever you want. If you want to be crazy, build the rough equivalent of a URL shortener, run it on your domain and only post (essentially) meta-URLs. They could point to GitHub, WordPress, or your own tooling, and you can change it if you need to. But LiveJournal still lives. 🤣
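
       A rough sketch of the meta-URL idea using only the Python standard library. The slugs and target URLs are placeholders, and `LINKS`, `resolve` and `MetaUrlHandler` are hypothetical names; a real setup would keep the table somewhere durable.

```python
from http.server import BaseHTTPRequestHandler

# Slug -> current home of the content (placeholder URLs for illustration).
LINKS = {
    "first-post": "https://example.com/journal/first-post",
    "notes": "https://example.org/someuser/notes",
}


def resolve(path):
    """Map a request path such as '/first-post' to its current target, or None."""
    return LINKS.get(path.strip("/"))


class MetaUrlHandler(BaseHTTPRequestHandler):
    """Answers GET requests on your own domain with a redirect to the target."""

    def do_GET(self):
        target = resolve(self.path)
        if target is None:
            self.send_error(404, "Unknown link")
            return
        self.send_response(302)  # temporary redirect, so the target can change later
        self.send_header("Location", target)
        self.end_headers()
```

       Moving content then means editing one entry in `LINKS`; the URLs you posted publicly stay the same. The handler could be served with `http.server.HTTPServer` on whatever host the domain points to.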
       
 (DIR) Post #Adal6BNHiKnoqAttqa by cypherfox@mas.to
       2024-01-07T04:24:04Z
       
       0 likes, 0 repeats
       
       @simon I am deep in the weeds with LLMs, and have been for a year now. I avoid the term AI not because of the marketing, but rather because imprecision in language _encourages_ misunderstanding. Humans have, for all our history, had tales of demons and devils and dark bargains with near-human beings. I appreciate your willingness to trust in the wisdom of your readers, but you are flying in the face of _centuries_ of inculcation. I use ML with general folks and use LLMs with folks in the know.