[HN Gopher] Controlling Language and Diffusion Models by Transpo...
       ___________________________________________________________________
        
       Controlling Language and Diffusion Models by Transporting
       Activations
        
       Author : 2bit
       Score  : 56 points
       Date   : 2025-04-10 17:58 UTC (5 hours ago)
        
 (HTM) web link (machinelearning.apple.com)
 (TXT) w3m dump (machinelearning.apple.com)
        
       | turnsout wrote:
       | Super interesting. You can see why Apple would be interested in
       | strictly controlling output. I wonder if any of this work found
       | its way into the Image Playground.
        
       | scorps wrote:
       | It's amusing to me that humans seem to have this same problem
       | ("Do not think of a pink elephant!")
        
       | sampton wrote:
        | A multimodal LLM is the true solution, but Apple is probably
        | looking for something they can run on-device, at least on the
        | current generation of devices.
        
       | roro_7 wrote:
        | I could be wrong, but I feel this may partially contradict a
        | very basic fact about intelligence that Ilya recently stated
        | (but which is common sense): the more intelligent the model,
        | the harder it is to control. You can remove elephants and
        | force other basic behavioral changes, but the strength of
        | these models' artificial free will (so to speak) is correlated
        | with their intelligence, and such edits do not reduce it, so
        | it will come out in other ways. If you do manage to control a
        | model fully, you will have a model as dumb as a brick. The
        | whole point of intelligent machines is their independent
        | thought. The more intelligent they are, the more independent
        | thinking will emerge.
        
         | hiddencost wrote:
         | s/fact/hypothesis/
        
         | antonkar wrote:
         | The intelligence is just a static geometric shape in an LLM
         | file (only GPUs "choose" and "shape-change" in that shape).
         | 
          | So the maximal intelligence is actually not an agent at all
          | (it has zero agency itself); it's a place. You can imagine
          | the final direct democratic simulated multiverse; that's the
          | final absolute super-intelligence. It has all the agents
          | inside of it, while it itself is static spacetime. Agents
          | (like us and others) are 3D and dynamic, while the
          | multiverse is 4D static spacetime. Everything already
          | happened, so there is no future, only the past; you can
          | forget something to relive it.
         | 
         | While maximal agency (=shape-changing) is actually the Big
         | Bang, it has almost zero intelligence (it's a dot) but infinite
         | potential future intelligence (can become a multiversal
         | simulation).
        
       | imranq wrote:
        | This just seems like a fancy way of describing LoRA? At the
        | end of the day you are still learning weights based on a
        | described set of outputs and then applying them at inference.
        
       | antonkar wrote:
        | There is an idea for a unicorn AI-safety startup: gather the
        | currently almost 100% unprotected (from an AI botnet) consumer
        | GPUs into a cloud to get Google-level security. Each GPU can
        | bring $30-1500 in profit per month, which you can share with
        | the user; the user can play GPU games from any device and use
        | any free or paid AI model, so everything really becomes better
        | (you can even include a 5G modem). Here's the full proposal
        | (the author is probably dyslexic):
        | https://melonusk.substack.com/p/notes-on-euto-principles-and...
        
       | vessenes wrote:
        | OK - basic plan here, which I feel I may have read before
        | (called something like a concept LoRA on r/stablediffusion?):
        | 
        | 1. For any concept you're interested in, get inputs with and
        | without it. For images: 100 with, say, a pink elephant, 100
        | without.
        | 
        | 2. Calculate the difference between the two sets of resulting
        | activations, represented as an "Optimal Transport Map".
        | 
        | 3. Apply the map at the desired strength, and voila - you
        | don't have a pink elephant anymore. These can stack.
       | 
       | There are lots of obvious and interesting applications here in
       | LLMs - there's some research showing that LLMs have
       | honesty/dishonesty parameter groupings, for instance.
       | 
        | But, I can't really figure out what this OT map _is_. Is it a
        | single layer tensor? Is it multidimensional? If it's the size
        | of the original model (which they say it is not), then I
        | understand how to apply it - just add weights and rerun. If
        | it's not a copy, where and when is this map applied? Another
        | way to ask this is: how is this different from calculating the
        | average difference and storing it in a low-rank adapter? I
        | have no idea.
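To make the steps above concrete, here is a minimal sketch of per-feature optimal transport of activations (my reading of the idea, not the paper's actual code; all names are illustrative). Under a Gaussian assumption, the 1-D OT map between two distributions is just a shift and scale per feature, so the stored "map" is only a few vectors per intervened layer, far smaller than the model.

```python
import numpy as np

def linear_ot_map(src, tgt, strength=1.0):
    """Per-feature linear OT map from src activations toward tgt.

    For 1-D Gaussians, the optimal transport map from N(mu_s, sd_s^2)
    to N(mu_t, sd_t^2) is a(x) = mu_t + (sd_t / sd_s) * (x - mu_s).
    Applied independently to each feature (column).
    """
    mu_s, mu_t = src.mean(axis=0), tgt.mean(axis=0)
    sd_s = src.std(axis=0) + 1e-8   # small eps avoids division by zero
    sd_t = tgt.std(axis=0) + 1e-8

    def apply(x):
        mapped = mu_t + (sd_t / sd_s) * (x - mu_s)
        # strength=0 leaves activations untouched; 1 transports fully.
        return (1.0 - strength) * x + strength * mapped

    return apply

rng = np.random.default_rng(0)
with_concept = rng.normal(2.0, 1.0, size=(100, 4))     # "pink elephant" present
without_concept = rng.normal(0.0, 0.5, size=(100, 4))  # concept absent
steer = linear_ot_map(with_concept, without_concept)

x = rng.normal(2.0, 1.0, size=(8, 4))   # fresh "with concept" activations
y = steer(x)                            # pushed toward "without" statistics
```

Note what gets stored: only the per-feature means and standard deviations (four vectors of the layer's width here), which would answer the size question above, assuming this linear construction is close to what the paper does.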
        
       ___________________________________________________________________
       (page generated 2025-04-10 23:00 UTC)