[HN Gopher] A Multimodal Automated Interpretability Agent
       ___________________________________________________________________
        
       A Multimodal Automated Interpretability Agent
        
       Author : el_duderino
       Score  : 74 points
       Date   : 2024-07-24 12:42 UTC (1 day ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | empath75 wrote:
       | https://arxiv.org/pdf/2404.14394
       | 
        | Actual paper, to save you from having to read the press
        | release.
        
         | dang wrote:
         | Ok, we'll change the URL to that from
         | https://news.mit.edu/2024/mit-researchers-advance-
         | automated-.... Users may still want to read the latter for a
         | quick intro.
        
       | curious_cat_163 wrote:
        | > We think MAIA augments, but does not replace, human
        | oversight of AI systems. MAIA still requires human supervision
        | to catch mistakes such as confirmation bias and image
        | generation/editing failures. Absence of evidence (from MAIA)
        | is not evidence of absence: though MAIA's toolkit enables
        | causal interventions on inputs in order to evaluate system
        | behavior, MAIA's explanations do not provide formal
        | verification of system performance.
       | 
       | For folks who are more familiar with this branch of literature,
       | given the above, why is this a fruitful line of inquiry? Isn't
       | this akin to stacking turtles on top of each other?
        
         | yurimo wrote:
          | I think what the authors aimed for is a proof-of-concept:
          | they attempt to demonstrate that you can (to a degree)
          | automate interpretability. Mech interpretability is
          | challenging because it does not scale well at the moment,
          | and there is a debate about whether localized structural
          | discoveries on toy examples actually translate to patterns
          | in large networks. My guess is that if you could build an
          | automatic explainer system, it would let you flag problems
          | and find issues faster, basically as a sort of
          | meta-heuristic for further investigation.
         | 
          | Unfortunately, the title hypes it up, and as always, once
          | you read the paper the results are less impressive. But
          | that is the current state of AI research, speaking as a
          | researcher myself.
         | 
          | In a similar vein: https://openai.com/index/language-models-
         | can-explain-neurons...
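        (Editor's note: the explain-then-score loop from the linked
        OpenAI neuron-explanation work can be sketched as below. This
        is a hedged toy illustration, not the paper's code: every
        function here is a hypothetical stand-in for what would really
        be an LLM call or a model forward pass.)

```python
# Toy sketch of automated neuron explanation, per the linked OpenAI work:
# 1. collect a neuron's activations on sample inputs,
# 2. have an "explainer" propose a natural-language label,
# 3. score the label by simulating activations from it and
#    correlating them with the real ones.
# All functions below are toy stand-ins, not a real model or LLM API.

def neuron_activation(token: str) -> float:
    """Toy 'neuron' that fires on animal words."""
    animals = {"cat", "dog", "horse", "bird"}
    return 1.0 if token in animals else 0.0

def propose_explanation(top_tokens: list[str]) -> str:
    """Stand-in for an LLM explainer reading top-activating tokens."""
    return "fires on animal words"  # a real system would generate this

def simulate_activation(explanation: str, token: str) -> float:
    """Stand-in for an LLM simulator predicting activation from the label."""
    animals = {"cat", "dog", "horse", "bird"}
    if "animal" in explanation:
        return 1.0 if token in animals else 0.0
    return 0.0

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation, computed by hand to stay stdlib-only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def explain_and_score(tokens: list[str]) -> tuple[str, float]:
    real = [neuron_activation(t) for t in tokens]
    top = [t for t, a in zip(tokens, real) if a > 0.5]
    explanation = propose_explanation(top)
    simulated = [simulate_activation(explanation, t) for t in tokens]
    return explanation, pearson(real, simulated)

tokens = ["cat", "table", "dog", "run", "bird", "blue"]
label, score = explain_and_score(tokens)
print(label, round(score, 2))  # perfect agreement here, so score 1.0
```

        The scoring step is what makes the loop automatable: a bad
        label yields simulated activations that correlate poorly with
        the real ones, flagging that neuron for human follow-up.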
        
         | visarga wrote:
         | That's basically a known fact about LLMs, they need oversight.
         | But if they make the task 100x easier, it's still useful as a
         | starting point. This kind of neural net analysis is difficult
         | to do manually.
         | 
          | I am curious whether they will just start making
          | inventories of all neurons in all layers; then they could
          | compare models by neuron types, or even train them to
          | achieve the right mix of concepts.
        
       | benreesman wrote:
        | We uncritically accept extraordinary claims in this space.
        | They might even be valid claims, but they are so rarely
        | supported by evidence that is likewise extraordinary.
       | 
       | In my experience real, durable progress generally starts
       | happening once we come back down to Earth and start iterating.
       | 
       | Are modern large models crucial to transportation? Maybe? Waymo
       | is cool but it's not yet an economic reality at scale, and I
       | doubt there are 1.75T weight models running in cars. Are they
       | crucial to finance? I'm quite sure that machine learning plays an
       | important role in finance because I know people in finance who do
       | it all day for serious firms, but I'm very skeptical that finance
       | has been revolutionized in the last 18 months (unless you count
       | the NVDA HODL).
       | 
        | Can we push back a little on the breathless hyperventilation?
        | It was annoying a year ago; now that the AGI people have been
        | proven wrong, it's offensive. We got played for suckers.
       | 
       | "As artificial intelligence models become increasingly prevalent
       | and are integrated into diverse sectors like health care,
       | finance, education, transportation, and entertainment,
       | understanding how they work under the hood is critical.
       | Interpreting the mechanisms underlying AI models enables us to
       | audit them for safety and biases, with the potential to deepen
       | our understanding of the science behind intelligence itself."
        
         | ainoobler wrote:
         | Eventually both the hype and its criticism will be automated
         | with AI as well so that we can all go to the beach and relax.
        
       ___________________________________________________________________
       (page generated 2024-07-25 23:16 UTC)