Post Ac66smh00RUIcnAPGy by tarkowski@101010.pl
2023-11-23T12:07:59Z
Rest of the World offers, as is often the case, a healthy antidote to some of the mainstream spins on tech development - in this case, on chatbots and #LLM. This short interview is with Asmelash Teka Hadgu, a developer of an LLM for Ethiopian languages.

"If you ask ChatGPT in Tigrinya or Amharic the simplest and most frequently asked questions, it gives you gibberish, a mix of Tigrinya and Amharic, or even made-up words".

Sounds obvious, but we forget this as we discuss #ChatGPT, a chatbot optimised for English and some other major languages. Here in Poland we lack a local, Polish LLM, but everyone loves talking about ChatGPT.

Here's the quote that I find most striking:

"Most of the data that powers them is basically internet data, and there is not enough data online for these languages."

Once again, the discussion about #AI needs to be one about data. And in this case there's a major digital / linguistic divide between the haves (major languages of countries where the majority of LLM development is located) and the have-nots, the rest: Ethiopia, Poland, you name it.

Kudos, by the way, to organizations like Eleuther.ai that try to bridge this divide.

https://restofworld.org/2023/3-minutes-with-asmelash-teka-hadgu/