[HN Gopher] OpenAI quietly launched Whisper V2 in a GitHub commit
___________________________________________________________________
OpenAI quietly launched Whisper V2 in a GitHub commit
Author : fudged71
Score : 61 points
Date : 2022-12-06 18:24 UTC (4 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| nshm wrote:
| Looks like they plugged GPT-4 AI into speech recognition research
| and now they are going to release huge updates every month.
| dweekly wrote:
| What is the basis for this claim? How does their GPT-4 work
| intersect with the work on their ASR model?
| tpmx wrote:
| From the outside, and probably uninformed, it seems like OpenAI
| (120 employees?) is generally outperforming Alphabet (187k
| employees). How?
| amelius wrote:
| How can the Word Error Rate (WER) be larger than 100% for some
| languages?
| lunixbochs wrote:
| If the target is the words "A A A" and you produce "B B B B",
| you have more errors than there were words in the target: 3
| substitutions and 1 insertion, so 4 errors against 3 reference
| words, a WER of about 133%.
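To make the arithmetic concrete, here is a minimal sketch of WER as word-level edit distance divided by the number of reference words. The function name `wer` and the whitespace tokenization are assumptions for illustration; real evaluations usually normalize case and punctuation first.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,              # deletion (reference word missing)
                curr[j - 1] + 1,          # insertion (extra hypothesis word)
                prev[j - 1] + (r != h),   # substitution, or match at cost 0
            ))
        prev = curr
    return prev[-1] / len(ref)


print(wer("A A A", "B B B B"))  # 4 errors over 3 reference words
```

Because insertions count as errors but do not enlarge the denominator, nothing caps the result at 1.0.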
| iKlsR wrote:
| I've been using whisper to get transcripts from my local radio
| stations. I know it's out of scope for the original project but I
| hope someone can build a streaming input around it in the future.
| Currently I pipe the audio in and save 10-minute chunks that get
| sent off for processing.
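The chunking step could look roughly like the sketch below. It assumes the stream has already been captured to an uncompressed WAV file, which is not necessarily how the commenter does it; `split_wav` and the `chunk_NNNN.wav` naming are hypothetical. Only the standard-library `wave` module is used.

```python
import wave
from pathlib import Path


def split_wav(src: str, out_dir: str, chunk_seconds: int = 600) -> list[Path]:
    """Split a WAV file into consecutive chunks of at most chunk_seconds."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = reader.getframerate() * chunk_seconds
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # end of input
            path = out / f"chunk_{index:04d}.wav"
            with wave.open(str(path), "wb") as writer:
                # Copy channels, sample width and rate; the frame count
                # in the header is corrected when the chunk is written.
                writer.setparams(params)
                writer.writeframes(frames)
            paths.append(path)
            index += 1
    return paths
```

Cutting on a fixed clock can split a word across two chunks, so overlapping the chunks slightly or cutting on silence may give cleaner transcripts.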
| rexreed wrote:
| What processing / server / backend are you using to run the
| whisper model?
| galleywest200 wrote:
| I have been using Whisper to transcribe my audio notes. I just
| save my voice memo from my phone to my NAS and my little script
| does the rest on a loop.
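A "little script on a loop" of this kind might be sketched as follows. This is not the commenter's script: the function names, the suffix list, and the convention of writing a `.txt` next to each memo are all assumptions. The transcriber is passed in as a callable so the loop itself does not depend on any particular model.

```python
import time
from pathlib import Path
from typing import Callable

AUDIO_SUFFIXES = {".m4a", ".mp3", ".wav"}  # assumed memo formats


def transcribe_new_files(folder: Path, transcribe: Callable[[Path], str]) -> list[Path]:
    """Transcribe every audio file in folder that has no .txt beside it."""
    written = []
    for audio in sorted(folder.iterdir()):
        if audio.suffix.lower() not in AUDIO_SUFFIXES:
            continue
        transcript = audio.with_suffix(".txt")
        if transcript.exists():
            continue  # already processed on an earlier pass
        transcript.write_text(transcribe(audio), encoding="utf-8")
        written.append(transcript)
    return written


def watch(folder: Path, transcribe: Callable[[Path], str],
          interval_seconds: float = 60.0) -> None:
    """Poll the folder forever, transcribing whatever is new."""
    while True:
        transcribe_new_files(folder, transcribe)
        time.sleep(interval_seconds)
```

With the `openai-whisper` package installed, the callable could be something like `lambda p: model.transcribe(str(p))["text"]` after `model = whisper.load_model("base")`; the model size here is just an example.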
| chimineycricket wrote:
| How do you handle connectivity from outside your home network
| (when you're at the grocery store for example)? Do you have a
| VPN running?
| tehf0x wrote:
| Not OP but I use syncthing for such things.
| lunixbochs wrote:
| Nice catch. I'll run my test suite [1] on this and report back.
|
| [1] https://twitter.com/lunixbochs/status/1574848899897884672
| amelius wrote:
| Would be cool if you could run these speech models in tandem,
| and compute a new error-rate for the consensus of them.
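One very simple way to get a "consensus" is to keep the transcript that is closest, in word-level edit distance, to all the others, then score that one against the reference. The sketch below does this; it is a deliberately crude stand-in for word-level voting schemes such as NIST's ROVER, and the function names are hypothetical.

```python
def edit_distance(a: list[str], b: list[str]) -> int:
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def consensus(hypotheses: list[str]) -> str:
    """Return the hypothesis with the smallest total distance to the rest."""
    tokens = [h.split() for h in hypotheses]

    def total_distance(i: int) -> int:
        return sum(edit_distance(tokens[i], t)
                   for j, t in enumerate(tokens) if j != i)

    # Ties go to the earliest hypothesis in the list.
    return hypotheses[min(range(len(hypotheses)), key=total_distance)]


def consensus_wer(reference: str, hypotheses: list[str]) -> float:
    """WER of the consensus transcript against the reference."""
    ref = reference.split()
    return edit_distance(ref, consensus(hypotheses).split()) / len(ref)
```

This only ever selects a whole transcript, so it cannot combine the good parts of two different outputs the way an alignment-based vote can.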
| daemoens wrote:
| Is there a reason why Spanish and Italian have a lower WER than
| English? OpenAI is based in America, right? They probably focused
| on English more than anything.
| nshm wrote:
| Spanish and Italian are very easy to recognize due to their very
| simple phonetic structure.
___________________________________________________________________
(page generated 2022-12-06 23:00 UTC)