[HN Gopher] Segment Anything Model (SAM) Visualized
       ___________________________________________________________________
        
Segment Anything Model (SAM) Visualized
        
       Author : nimsy
       Score  : 65 points
       Date   : 2023-12-01 08:26 UTC (14 hours ago)
        
 (HTM) web link (flowforward.simple.ink)
 (TXT) w3m dump (flowforward.simple.ink)
        
       | nimsy wrote:
        | Hi everyone, we have created this visualization of the SAM
        | model that lets you explore the architecture in an
        | interactive manner, along with the code. We made it while
        | trying to implement the SAM model ourselves, to understand
        | it better. Thought I'd share it, as some ML folks might find
        | it useful. Please let me know whether or not it helped you!
        | How do you usually go about understanding model
        | architectures?
       | 
       | https://flowforward.simple.ink/
        
         | arketyp wrote:
          | I'm fine-tuning the SAM encoder with LoRA at the moment,
          | so thanks for this. I must say, though, that the Segment
          | Anything code on its own is perhaps the most readable and
          | easily navigated DL code I've encountered. Or maybe that's
          | just me coming from a mostly TensorFlow background; I
          | remember struggling to understand even my own networks
          | when viewing them in TensorBoard. The code peek feature of
          | yours is great.
        
           | nimsy wrote:
            | Thanks for your comment! Glad it was a (small) help.
            | Yes, Meta research did a great job documenting their
            | code! Quick question: why don't you use Hugging Face
            | with their PEFT library to fine-tune SAM with LoRA?
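            | 
            | (For reference, a rough sketch of what the PEFT route
            | might look like; the target_modules value is an
            | assumption about the layer names in the transformers
            | SAM port, so verify with model.named_modules() before
            | relying on it:)
            | 
            |     from transformers import SamModel
            |     from peft import LoraConfig, get_peft_model
            | 
            |     model = SamModel.from_pretrained(
            |         "facebook/sam-vit-base")
            | 
            |     lora_config = LoraConfig(
            |         r=8,            # low-rank dimension
            |         lora_alpha=16,  # scaling factor
            |         # assumed name of the fused attention
            |         # projection in the vision encoder:
            |         target_modules=["qkv"],
            |         lora_dropout=0.05,
            |         bias="none",
            |     )
            | 
            |     model = get_peft_model(model, lora_config)
            |     # only the LoRA adapters should be trainable now
            |     model.print_trainable_parameters()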
        
             | arketyp wrote:
              | The work is based on the SAMed paper and repo, so I'm
              | not reinventing the wheel and am still leveraging best
              | practices. Generally, though, I see a point in keeping
              | things minimal, anticipating having to get gritty with
              | it.
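              | 
              | The minimal version isn't much code anyway; roughly
              | something like this (an illustrative sketch in the
              | spirit of SAMed, not the actual repo code, and the
              | attribute path into the image encoder is from memory,
              | so double-check it):
              | 
              |     import torch.nn as nn
              | 
              |     class LoRALinear(nn.Module):
              |         # Freeze a pretrained linear layer and learn
              |         # a low-rank additive update on top of it.
              |         def __init__(self, base: nn.Linear,
              |                      rank: int = 4):
              |             super().__init__()
              |             self.base = base
              |             for p in self.base.parameters():
              |                 p.requires_grad_(False)  # freeze
              |             self.lora_a = nn.Linear(
              |                 base.in_features, rank, bias=False)
              |             self.lora_b = nn.Linear(
              |                 rank, base.out_features, bias=False)
              |             # zero-init B so training starts from
              |             # the unmodified pretrained behavior
              |             nn.init.zeros_(self.lora_b.weight)
              | 
              |         def forward(self, x):
              |             return (self.base(x)
              |                     + self.lora_b(self.lora_a(x)))
              | 
              |     # Wrap the fused qkv projection of each ViT
              |     # block (path as in the segment-anything repo,
              |     # unverified):
              |     # for blk in sam.image_encoder.blocks:
              |     #     blk.attn.qkv = LoRALinear(blk.attn.qkv)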
        
         | zorgmonkey wrote:
         | I like the idea a lot, but I had two main UX problems:
         | 
          | * It is hard to know which of the green blocks can be
          | expanded by clicking; maybe use a different color or
          | border for the ones that can be expanded.
          | 
          | * I kept accidentally clicking the text, which links to
          | GitHub, though I realized that aiming for the edge of a
          | block works a lot more reliably.
        
       | nhinck wrote:
        | Hit detection for hovering and clicking is pretty off in
        | Firefox.
        
         | nimsy wrote:
          | Sorry to hear that. Will definitely try to address it in
          | the future. Other than the terrible UX, did this help you
          | in any way?
        
       | valine wrote:
        | This is a really neat idea. Would love to have a similar
        | view for llama-like models. I've been working with Mistral
        | 7B lately and it's annoying how many small differences there
        | are between it and llama. Having a view like this would be a
        | good time saver.
        
         | nimsy wrote:
          | Thanks for your comment. Will definitely try to work on
          | that too! Quick question: where do the differences between
          | Mistral and llama cost you time? Do you mean it could save
          | time when reading the paper, or during the actual
          | coding/building process?
        
           | valine wrote:
            | The coding process. I've been experimenting with
            | fine-tuning methods where I freeze various layers or use
            | different loss functions for attention vs. feed-forward
            | layers. It's random little things like the names of
            | layers that trip me up. For example, the attribute that
            | holds the name of the activation function in mistral is
            | called hidden_act, whereas in llama it's called
            | activation_function.
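            | 
            | One way around it is to read such attributes
            | defensively, e.g. something like this (a rough sketch;
            | the attribute names are the ones mentioned above, so
            | verify them against the actual config classes):
            | 
            |     from transformers import AutoConfig
            | 
            |     def get_activation_name(model_id: str) -> str:
            |         # try the attribute names used by different
            |         # llama-like configs, return the first hit
            |         config = AutoConfig.from_pretrained(model_id)
            |         for attr in ("hidden_act",
            |                      "activation_function"):
            |             if hasattr(config, attr):
            |                 return getattr(config, attr)
            |         raise AttributeError(
            |             "no activation attribute on "
            |             + type(config).__name__)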
        
       ___________________________________________________________________
       (page generated 2023-12-01 23:01 UTC)