[HN Gopher] Improving Accessibility Using Vision Models
___________________________________________________________________
Improving Accessibility Using Vision Models
Author : bearjaws
Score : 28 points
Date : 2024-10-03 18:48 UTC (4 hours ago)
(HTM) web link (myswamp.substack.com)
(TXT) w3m dump (myswamp.substack.com)
| bearjaws wrote:
| Funny, Google released gemini-1.5-flash-8b just moments ago, and it
| scores slightly lower on vision. For clarity, the article's results
| are for the "older" gemini-1.5-flash.
|
| https://developers.googleblog.com/en/gemini-15-flash-8b-is-n...
| gostsamo wrote:
| Funnily enough, the images in the article do not have actually
| useful alt text and, like every image on Substack I've encountered
| so far, have no useful captions either.
| bearjaws wrote:
| How is the alt-text not useful? I even went through the effort
| of putting the data in the alt text for the bar chart. I tend
| to think of alt text as providing the same context as the image,
| for example the line chart is meant to convey how 1.5-flash
| outperforms 4o, but I am not going to embed each discrete data
| point in the alt text.
| gostsamo wrote:
| Maybe something is lost in translation, but here is what my
| screen reader makes of the article:
|
| Along the way we realized some of our math courses had not
| been updated in quite some time, and some schools were still
| leveraging these courses to teach. Images for equations are
| bad m'kay
|
| It was immediately apparent was the use of images to
| represent equations like this: https%3A%2F%2Fsubstack-post-
| me... https%3A%2F%2Fsubstack-post-me... This is not great...
| the font is a bit on the smaller side and the font itself is
| not very legible, in my non-font expert opinion. Making
| matters worse, there is no alt-text provided that can explain
| the equation.
| gostsamo wrote:
| Checking the later pictures that you mention, the alt text is
| indeed there. My recommendation, though, would be to give a
| summary of the data rather than the conclusion, e.g. "Gemini Flash
| has an error rate of x%, while the others are at y% and z%."
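
A minimal sketch of the "summarize the data, not the conclusion" suggestion above: building alt text directly from a chart's underlying values. The function name `chart_alt_text`, the title, and the model names and numbers in the usage note are all hypothetical placeholders, not figures from the article.

```python
def chart_alt_text(title, unit, values):
    """Build alt text that states each data point of a simple chart.

    `values` maps a series label to its numeric value. Everything
    passed in is supplied by the caller; nothing here comes from the
    article's actual results.
    """
    # Format each value compactly (20.0 -> "20", 12.5 -> "12.5").
    parts = [f"{name}: {value:g}{unit}" for name, value in values.items()]
    return f"{title}. " + "; ".join(parts) + "."
```

For example, `chart_alt_text("Error rate by model", "%", {"model A": 12.5, "model B": 20})` yields a sentence listing both values, which a screen reader can read out in place of the image.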
| SalmonSnarker wrote:
| 3 out of 5 images in the post have empty alt text (alt="").
| Most Substacks are pretty careless about alt text, so the
| previous poster is just noting that your accessibility post
| follows this trend. (It's worth noting that your previous post
| has 0 out of 4 images with alt text.)
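
A count like the one above can be reproduced with a small audit script. This is only a sketch using Python's standard-library `html.parser`; the class and function names (`AltAudit`, `audit_alt_text`) are made up for illustration, and it assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Count <img> tags and how many lack usable alt text."""

    def __init__(self):
        super().__init__()
        self.total = 0    # every <img> seen
        self.missing = 0  # no alt attribute at all
        self.empty = 0    # alt="" or whitespace only

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        self.total += 1
        attr_map = dict(attrs)
        if "alt" not in attr_map:
            self.missing += 1
        elif not (attr_map["alt"] or "").strip():
            # A bare `alt` attribute has value None; treat it as empty.
            self.empty += 1


def audit_alt_text(html):
    """Return counts of total, missing-alt, and empty-alt images."""
    parser = AltAudit()
    parser.feed(html)
    parser.close()
    return {"total": parser.total,
            "missing": parser.missing,
            "empty": parser.empty}
```

Note that an empty `alt=""` is legitimate for purely decorative images, so the counts flag candidates for review rather than definite errors.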
| armoredkitten wrote:
| What is the measurement on the x-axis in the graph?? The text is
| talking about equations of 20 or 30 characters, but the graph
| goes up to...6. Six what?? Characters? Terms? If it's characters,
| why do we only get to see the performance from 1-6, when
| apparently 7% of equations had more than 20?
| bearjaws wrote:
| That's a fair point, I bucketed them into lengths of 1-10,
| 11-20, 21-30. I'll do a quick update.
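
The bucketing described above (1-10, 11-20, 21-30) could look roughly like this. It is a sketch, not the author's code: it assumes length means the character count of each equation string, and the names `length_bucket` and `bucket_counts` are hypothetical.

```python
from collections import Counter


def length_bucket(length, width=10):
    """Map a positive length to a label such as '1-10' or '11-20'."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # Lengths 1..width share a bucket, width+1..2*width the next, etc.
    start = ((length - 1) // width) * width + 1
    return f"{start}-{start + width - 1}"


def bucket_counts(equations, width=10):
    """Count equations per length bucket (length = number of characters)."""
    return Counter(length_bucket(len(eq), width) for eq in equations)
```

Labeling the chart's x-axis with these range strings, rather than bucket indices, would make the unit explicit.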
| pumanoir wrote:
| I've had great success converting math pics to LaTeX using
| qwen2-vl
___________________________________________________________________
(page generated 2024-10-03 23:00 UTC)