[HN Gopher] Improving Accessibility Using Vision Models
       ___________________________________________________________________
        
       Improving Accessibility Using Vision Models
        
       Author : bearjaws
       Score  : 61 points
       Date   : 2024-10-03 18:48 UTC (1 day ago)
        
 (HTM) web link (myswamp.substack.com)
 (TXT) w3m dump (myswamp.substack.com)
        
       | bearjaws wrote:
       | Funny, Google just released gemini-1.5-flash-8b moments ago,
       | which scores slightly lower on vision. For clarity, the article
       | uses the "older" gemini-1.5-flash.
       | 
       | https://developers.googleblog.com/en/gemini-15-flash-8b-is-n...
        
       | gostsamo wrote:
       | Funnily enough, the images in the article do not actually have
       | useful alt text and, like every image on Substack I've
       | encountered so far, have no useful captions either.
        
         | bearjaws wrote:
         | How is the alt text not useful? I even went through the effort
         | of putting the data in the alt text for the bar chart. I tend
         | to think of alt text as providing the same context as the
         | image. For example, the line chart is meant to convey how
         | 1.5-flash outperforms 4o, but I am not going to embed each
         | discrete data point in the alt text.
        
           | gostsamo wrote:
           | Maybe something is lost in translation, but here is what my
           | screen reader makes of the article:
           | 
           | Along the way we realized some of our math courses had not
           | been updated in quite some time, and some schools were still
           | leveraging these courses to teach. Images for equations are
           | bad m'kay
           | 
           | It was immediately apparent was the use of images to
           | represent equations like this: https%3A%2F%2Fsubstack-post-
           | me... https%3A%2F%2Fsubstack-post-me... This is not great...
           | the font is a bit on the smaller side and the font itself is
           | not very legible, in my non-font expert opinion. Making
           | matters worse, there is no alt-text provided that can explain
           | the equation.
        
           | gostsamo wrote:
           | Checking the later pictures that you talk about, the alt text
           | is indeed there. My recommendation, though, would be to give a
           | summary of the data and not the conclusion, e.g. Gemini Flash
           | has an error rate of x% while the others have y% and z%.
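
The recommendation above (state the data, not the conclusion) can be sketched as a small helper that turns a chart's underlying numbers into alt text. This is only an illustration: the function name, the sentence template, and the model names and values are all placeholders, not the article's data or code.

```python
def summarize_error_rates(rates: dict[str, float]) -> str:
    """Build alt text that states each model's error rate, lowest first."""
    ordered = sorted(rates.items(), key=lambda item: item[1])
    parts = [f"{name}: {rate:.1f}% error rate" for name, rate in ordered]
    return "Bar chart of error rates by model. " + "; ".join(parts) + "."


# Placeholder numbers for illustration only, not results from the article.
example = {"model-a": 12.5, "model-b": 4.0, "model-c": 9.5}
print(summarize_error_rates(example))
```

A screen reader user then hears the same numbers a sighted reader would read off the bars and can draw the conclusion independently.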
        
           | SalmonSnarker wrote:
           | 3 out of 5 images on the post have empty alt text (alt="").
           | Most Substacks are pretty careless about alt text, so the
           | previous poster is just noting that your accessibility post
           | follows this trend. (It's worth noting that the post you made
           | before this one has 0 out of 4 images with alt text.)
        
             | bryanrasmussen wrote:
             | Looking through it, the images that are definitely content
             | controlled by the user have alt text, that is to say the
             | graphs. The first alt="" is inside a bit of content that is
             | display:none and thus not available to a screen reader (I
             | suppose the others are too), so it is not knowable whether
             | that alt text will be filled in when the area is rendered
             | (probably not). I didn't look for the other one, but I
             | expect it is the same situation, because all the images I
             | encountered that were in the writer's control had alt text.
             | 
             | I have not investigated the empty ones, but there are
             | numerous situations in which an empty alt text makes
             | perfect sense and is a better accessibility solution for
             | most screen reader users than the alternative. For example,
             | if they are inside something clickable that has an
             | aria-label on it telling you how to use that part of the
             | DOM, alt text on a child image just makes things overly
             | verbose and annoying in most circumstances.
             | 
             | I have an article in the works that touches on these issues
             | with proposed solutions, but unfortunately it would be too
             | big to cover in full here.
             | 
             | On edit: of course it is possible that, having been alerted
             | to the fact, the writer has since added the alt text.
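
The distinction drawn in this subthread (missing alt, empty alt, and empty alt inside an element that already carries an aria-label) can be checked mechanically. Below is a minimal sketch using only Python's standard `html.parser`; the class name, category names, and sample markup are made up for illustration. It does not evaluate CSS, so it cannot detect the `display:none` case mentioned above.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class AltAudit(HTMLParser):
    """Classify each <img> by the state of its alt attribute."""

    def __init__(self):
        super().__init__()
        self.stack = []  # (tag, has_aria_label) for every open element
        self.counts = {"described": 0, "empty_in_labelled": 0,
                       "empty": 0, "missing": 0}

    def _classify(self, attrs):
        if "alt" not in attrs:
            return "missing"
        if (attrs["alt"] or "").strip():
            return "described"
        # Empty alt: likely fine if an ancestor already has an aria-label.
        if any(labelled for _, labelled in self.stack):
            return "empty_in_labelled"
        return "empty"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            self.counts[self._classify(attrs)] += 1
        if tag not in VOID_TAGS:
            self.stack.append((tag, bool(attrs.get("aria-label"))))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def audit(html: str) -> dict:
    parser = AltAudit()
    parser.feed(html)
    parser.close()
    return parser.counts


sample = (
    '<a href="/profile" aria-label="Open profile"><img src="a.png" alt=""></a>'
    '<img src="chart.png" alt="Line chart of error rate by length">'
    '<img src="c.png">'
)
print(audit(sample))
```

Only the "empty" and "missing" categories are candidates for a real problem; whether an "empty" image is decorative still needs a human look.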
        
       | armoredkitten wrote:
       | What is the measurement on the x-axis in the graph?? The text is
       | talking about equations of 20 or 30 characters, but the graph
       | goes up to...6. Six what?? Characters? Terms? If it's characters,
       | why do we only get to see the performance from 1-6, when
       | apparently 7% of equations had more than 20?
        
         | bearjaws wrote:
         | That's a fair point. I bucketed them into lengths of 1-10,
         | 11-20, 21-30. I'll do a quick update.
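
For readers puzzling over the x-axis, the bucketing described above might look roughly like this. It is a sketch under assumptions: the thread does not say how length was measured, so `len()` of the equation string stands in, and the function names and sample strings are hypothetical.

```python
from collections import Counter


def bucket_label(length: int, width: int = 10) -> str:
    """Return a label such as '1-10' or '11-20' for a positive length."""
    if length < 1:
        raise ValueError("length must be at least 1")
    low = ((length - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


def bucket_counts(equations):
    """Count how many equation strings fall into each length bucket."""
    return Counter(bucket_label(len(eq)) for eq in equations)


# Made-up equation strings, only to show the shape of the output.
print(bucket_counts(["x+1=2", "a^2+b^2=c^2", "y=mx+b"]))
```

Labelling the axis with the bucket ranges rather than bucket indices would avoid the "six what?" confusion.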
        
       | pumanoir wrote:
       | I've had great success converting math pics to LaTeX using
       | qwen2-vl
        
       | jmull wrote:
       | I don't understand "The Results" graph.
       | 
       | The x-axis has integers, 0, 1, 2, 3, 4, 5, 6, but the text talks
       | about models struggling at the 30 character mark? On the graph
       | they all start getting bad around 3, depending on what you mean
       | by bad. Is the x-axis tens of characters??
       | 
       | Anyway...
       | 
       | > anything longer than 20 characters would tend to have more
       | issues, we flagged those for manual review.
       | 
       | Even though the failure rate was smaller, is it OK if several of
       | the shorter equations are wrong? Maybe they should have manually
       | reviewed all of them.
       | 
       | Edit: Now I see someone else brought up the x-axis issue. There's
       | a response that seems to say the x-axis is buckets of 10
       | characters. I guess the update hasn't gone through yet.
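
The quoted rule (flag anything longer than 20 characters for manual review) amounts to a simple partition. The sketch below assumes length means the character count of the equation string; the function and parameter names are hypothetical. Setting `max_auto_length=0` sends everything to manual review, which is the stricter option raised above.

```python
def split_for_review(equations, max_auto_length=20):
    """Separate equations into auto-accepted and manually reviewed lists."""
    auto, manual = [], []
    for eq in equations:
        # Anything longer than the threshold goes to a human reviewer.
        (manual if len(eq) > max_auto_length else auto).append(eq)
    return auto, manual
```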
        
       ___________________________________________________________________
       (page generated 2024-10-04 23:02 UTC)