Post Apz3FV3qUvCxQa2IT2 by starsider@valenciapa.ws
 (DIR) Post #ApvWKBwN8Quvc3I2aW by bedast@beige.party
       2025-01-09T15:34:26Z
       
       7 likes, 7 repeats
       
       The enshittification of AI has led to groans at VLC's choice to use AI. I even saw a post cross my feed of someone looking for a replacement for VLC.
       VLC is working on on-device realtime captioning. This has nothing to do with generating images or video using AI. This has nothing to do with LLMs. This is not generative AI.
       While human-generated captions would be preferred for better accuracy, this is not always possible. That means a lot of video media is inaccessible to those with hearing impairment.
       What VLC is doing is something that will contribute to accessibility in a big way.
       AI transcription is still not perfect. It has its problems. But this is one of those things that we should be hoping to advance.
       I'm not looking to replace humans in creating captions. I think we're very far from ever being able to do this correctly without humans. But as I said, there's a ton of video content that simply does not have captions available, human-generated or not.
       So long as they're not trying to manipulate the transcription using GenAI means, this is the wrong one to demonize.
       #AI #Transcription #VLC #HearingImpaired #Deaf #Accessibility
       
 (DIR) Post #ApwLtIjexVCKU6qJFI by Suiseiseki@freesoftwareextremist.com
       2025-01-10T13:18:26.497266Z
       
       0 likes, 0 repeats
       
       @bedast The problem is that such subtitling software is not free software, as I really doubt the licenses of the training works were followed (most training is done on creative works with no license) and there is no complete source code - just an object-code form that nobody understands how it really works, thus such software doesn't have the 4 freedoms; https://www.gnu.org/philosophy/free-sw.en.html#four-freedoms
       Are you really confident such nonfree software is compatible with VLC's licenses?
       When I watch a video, I will never be content with slop subtitles - handcrafted .ass files are what I need.
       
 (DIR) Post #Apxza0eY9gISE2lhh2 by SamiMaatta@mementomori.social
       2025-01-10T12:00:14Z
       
       0 likes, 0 repeats
       
       @FediThing @bedast One of GenAI's well-poisoning effects has been to tarnish the term "AI". It has lost its meaning now.
       
 (DIR) Post #Apxza1feN4lrNkI62y by tyil@fedi.tyil.nl
       2025-01-11T08:17:43.503Z
       
       0 likes, 0 repeats
       
       @SamiMaatta@mementomori.social @FediThing@social.chinwag.org @bedast@beige.party Isn't VLC doing the same, though? It doesn't seem like there's much "intelligence" going on when all you're doing is voice recognition, so why refer to it as a form of "intelligence"? Maybe everyone should stop trying to create hype by calling any new feature "AI".
       
 (DIR) Post #ApyZWqh2KHB2ogivs8 by SuperDicq@minidisc.tokyo
       2025-01-11T15:00:39.135Z
       
       0 likes, 0 repeats
       
       @bedast@beige.party If it runs on the user's local device and is free software I'm all for it.
       
 (DIR) Post #Apz3EtdfdHHlMgVN20 by zanagb@mastodon.social
       2025-01-11T18:54:51Z
       
       0 likes, 0 repeats
       
       @mia @starsider @bedast I do not need to pretend, because realtime transcription and translation isn't going to happen on-device. Much less with an OpenAI dependency, with a dataset that is entirely a black box. (Which in itself is already a privacy nightmare, because who knows what it was trained on.)
       CES demos are always smoke and mirrors. It won't ever fully run offline (no one will download 3 GiB of mystery-sauce binaries and allow video playback to take 4 GiB of RAM, much less in consumer software).
       
 (DIR) Post #Apz3EuInAOB9QDkD7w by starsider@valenciapa.ws
       2025-01-11T18:58:28Z
       
       1 likes, 0 repeats
       
       @zanagb @mia @bedast What's your basis for saying "realtime transcription and translation isn't going to happen on-device"? I *already* do it regularly, both on my PC and on my phone. No connection required.
       
 (DIR) Post #Apz3FT5XpdNXJGocq0 by zanagb@mastodon.social
       2025-01-10T13:36:11Z
       
       0 likes, 0 repeats
       
       @bedast So long as the model is outsourced to OpenAI and the like, you can always be certain everything you ever watch on VLC will be beamed to a third party "for improvement". Auto-generated subtitles might be better than no subtitles, but not at the cost of constantly feeding third parties with your data.
       And of course, if we are talking about OpenAI's models, they were known to outright invent nonsense phrases when they tried audio transcription a few months ago. I'd not trust a hallucinating liar.
       
 (DIR) Post #Apz3FTpd4IF3cCNQfY by starsider@valenciapa.ws
       2025-01-10T23:07:23Z
       
       0 likes, 0 repeats
       
       @zanagb @bedast VLC will do it on-device, not sending anything anywhere.
       Whisper models are terrible at transcribing casual conversations between doctors and patients because the training data doesn't reflect that kind of speech and those environments. But they excel at transcribing movies etc., because a lot of their training data is closed captions. So this would actually work reasonably well. One can supply some text with the names of characters, places, etc. as context, and that makes it transcribe those names very well. (source: I've been using whisper models at work, and occasionally I've been pointing the mic at the speaker with some show I'm watching to test) (also: I haven't sent any data to openai nor paid them anything)
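       For illustration, the "context" trick described above corresponds to Whisper's initial prompt. A minimal sketch, assuming the open-source openai-whisper Python package running entirely locally; the model size, file name, and prompt text are placeholders, not details from this thread:

       # Minimal sketch: biasing Whisper toward proper names via an initial prompt.
       # Assumes the open-source "openai-whisper" package (pip install openai-whisper);
       # model size, file name, and the prompt text are illustrative placeholders.
       import whisper

       model = whisper.load_model("small")   # runs locally, no network calls
       context = "Characters: Arrakis, Muad'Dib, Bene Gesserit."  # names to spell correctly
       result = model.transcribe(
           "episode_audio.wav",
           language="en",
           initial_prompt=context,           # nudges decoding toward these spellings
       )
       for seg in result["segments"]:
           print(f"{seg['start']:7.2f} --> {seg['end']:7.2f}  {seg['text'].strip()}")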
       
 (DIR) Post #Apz3FUWWUoYLlERgWm by zanagb@mastodon.social
       2025-01-10T23:56:38Z
       
       0 likes, 0 repeats
       
       @starsider @bedast The CES demo makes it clear the transcription is **off-device**, i.e. siphoning data. And besides, there are already many built-in tools for that on macOS and Linux.
       If I wanted fucked-up nonsense on my videos I would watch a raunchy youtube poop from the early 2010s. I'd rather have a phoneme-based system where at least you can tell where the gibberish came from, you can tell it's an error, and you can even reconstruct the sentence back.
       We do not need this.
       
 (DIR) Post #Apz3FV3qUvCxQa2IT2 by starsider@valenciapa.ws
       2025-01-11T00:26:43Z
       
       1 likes, 0 repeats
       
       @zanagb @bedast What makes it clear that it's off-device? Can you provide a link?
       What tools are you talking about? I use Linux, what should I search for? I would like to compare them with the tool I'm building as part of my day job (for which I compile the *whole* source code incl. all dependencies, so I know for a fact that nothing is ever siphoned).
       About fucked-up nonsense, what I see on YouTube all the time: YouTube's automatic subtitles are beyond terrible. With automatic translations to my native language they're even worse. Family members use them and I can't fathom how they get anything out of it. No pauses, no punctuation, full of mistakes.
       Using whisper is a 1000x improvement over YouTube's. It adds all the correct punctuation and everything. It only fails with proper names (unless it's given a context) and with speech over a lot of background noise. In all four languages I've been testing it in.
       For regular casual speech it doesn't work _that_ well, but my work's project takes that into account by marking all the dubious words. It also discards whole sentences with too many dubious words, because they tend to be gibberish from random noise. Which makes me shudder when I read about the model being used as-is for conversations without regard for confidence levels, without using the context feature, and with naive stitching (since it can only transcribe 30 seconds at a time). The results are as awful as I would have expected.
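       For illustration, the "marking dubious words / discarding gibberish sentences" approach described above can be sketched with the confidence fields the open-source openai-whisper package already returns per segment and per word. The thresholds below are illustrative placeholders, not the values from the poster's project:

       # Minimal sketch of confidence-gated subtitle output, assuming the
       # open-source "openai-whisper" package; thresholds are placeholders.
       import whisper

       model = whisper.load_model("small")
       result = model.transcribe("clip.wav", language="en", word_timestamps=True)

       MIN_AVG_LOGPROB = -1.0   # segments decoded with lower average log-probability are suspect
       MAX_NO_SPEECH = 0.6      # segments likely to be noise rather than speech
       MIN_WORD_PROB = 0.5      # individual words below this get flagged

       for seg in result["segments"]:
           # Drop whole segments that look like hallucinated noise.
           if seg["avg_logprob"] < MIN_AVG_LOGPROB or seg["no_speech_prob"] > MAX_NO_SPEECH:
               continue
           # Mark low-confidence words instead of silently keeping them.
           words = [
               w["word"] if w["probability"] >= MIN_WORD_PROB else f"[{w['word'].strip()}?]"
               for w in seg["words"]
           ]
           print(f"{seg['start']:7.2f} --> {seg['end']:7.2f} {''.join(words)}")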