[HN Gopher] EuroLLM: LLM made in Europe built to support all 24 ...
       ___________________________________________________________________
        
       EuroLLM: LLM made in Europe built to support all 24 official EU
       languages
        
       Author : NotInOurNames
       Score  : 471 points
       Date   : 2025-10-28 14:58 UTC (8 hours ago)
        
 (HTM) web link (eurollm.io)
 (TXT) w3m dump (eurollm.io)
        
       | adzm wrote:
       | For those curious, the 24 official languages are Bulgarian,
       | Croatian, Czech, Danish, Dutch, English, Estonian, Finnish,
       | French, German, Greek, Hungarian, Irish, Italian, Latvian,
       | Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak,
       | Slovenian, Spanish, and Swedish.
       | 
       | Maltese, interestingly, is the only Afro-Asiatic derived
       | language.
       | 
       | Hungarian, Finnish, and Estonian are the three Uralic languages.
       | 
       | All the others are Indo-European, Greek being the only Hellenic
       | one, Irish the only Celtic, the rest are Baltic, Slavic, Italic,
       | or Germanic.
       | 
       | (I originally used the term Balto-Slavic, though I was unaware of
       | some of the connotations of that term until just now. Baltic and
       | Slavic do share a common origin, but that was a very very long
       | time ago)
        
         | purrcat259 wrote:
         | I read, write and speak Maltese, AMA if you are curious about
         | the language.
        
           | ebb_earl_co wrote:
           | What is the name of Maltese in Maltese? Like "el español" in
           | Spanish, it's neat to know what languages call themselves
        
             | kridsdale3 wrote:
             | 'ish' is a pretty universal english suffix. So Spanish is
             | just "espan-ish".
        
             | ggsp wrote:
             | Wikipedia says it's "Malti"
        
               | arbuge wrote:
               | Il-Malti to be precise. Il- means "the" and changes its
               | meaning to that of the language. Malti alone would mean a
               | Maltese person.
               | 
               | Source: I'm also Maltese.
        
               | jll29 wrote:
               | The "Il" in Il-Malti is like "al" in Arabic, which
               | Maltese is closely related to as was pointed out above.
               | 
               | Arabic (language): al-'arabiyyah (العربية).
        
             | kwk1 wrote:
             | A term for that concept, by the way, is "endonym":
             | 
             | https://en.wikipedia.org/wiki/Endonym_and_exonym
        
           | Raed667 wrote:
           | Tunisians claim they can understand Maltese with minimum
           | effort. Is it reciprocal? How close is Maltese to Arabic /
           | Tunisian dialect?
        
             | arbuge wrote:
             | Not sure which Tunisians are claiming this but they'd
             | definitely need a lot more than minimum effort. Maltese
             | split off from Arabic around 1k years ago. The two
             | languages sound pretty different, and are written with
             | different alphabets.
        
               | cenamus wrote:
               | Also lots of influence from Italian and English.
        
               | findyoucef wrote:
               | As an Algerian, I can confirm that Maltese is
               | surprisingly easy to understand. I was genuinely shocked
               | the first time I heard it because the similarities are so
               | obvious. Many Arabic dialects are also written using the
               | Latin alphabet, especially online and on social media, so
               | the different writing systems aren't really a barrier at
               | all.
        
             | purrcat259 wrote:
             | I don't have much personal experience in attempting to
             | communicate with arabic speakers. From others I have heard
             | Lebanese arabic is the closest and you can have a passable
             | conversation.
        
           | adzm wrote:
           | I'm actually really curious about everyday usage of the
           | language; is code switching between English and Maltese more
           | common than Maltese on its own? I've seen a few online
           | communities where the vocabulary switches between Maltese and
           | English very often which is interesting but I wonder how much
           | of that is just online / written versus everyday speech.
        
             | purrcat259 wrote:
             | Depends on where you live and how you were brought up, but
             | for the most part code switching is default.
             | 
             | There was a point about 7 years ago when the overton window
             | shifted to "speak english to strangers first" because of a
             | large influx of foreigners who did not know the language.
             | Since then I've met foreigners who have better Maltese than
             | some natives.
             | 
             | Older folks & geriatrics will sometimes be surprised when
             | they assume someone is foreign and they turn out to be
             | Maltese. "int Malti??" is a statement I get often because I
             | don't look Mediterranean despite being born here.
        
           | nxor wrote:
           | How are loan words viewed? Do businesses work in Maltese? Are
           | monolingual speakers of the language regarded differently
           | than those fluent in English? Do young people in Malta listen
           | to Maltese music?
        
             | JAlexoid wrote:
             | Yes, there's plenty of Maltese spoken and listened to.
             | 
             | I was surprised to hear Maltese radio stations played in
             | taxis, while visiting Malta just a few weeks back
        
               | nxor wrote:
               | The point of my question was to ask someone who lives
               | there, not someone who visited
        
             | purrcat259 wrote:
             | Maltese has been loaded with loan words since forever. 5
             | points if you can guess where bongu, bravu and mappa come
             | from. At some point there was some literary council for the
             | language that decided that any new loan words should just
             | be spelled phonetically. Computer became kompjuter.
             | 
             | Businesses do work in Maltese and English. Both are
             | official languages. It's quite rare to encounter a business
             | that deals near exclusively in Maltese. Many prefer Maltese
             | but will fall back to English where necessary.
             | 
             | Regarding monolingual speakers, I think there are a lot of
             | stereotypes for Maltese-only, English-only and code
             | switchers. I think it's all a bit silly... So as long as
             | communication can happen I don't fuss.
             | 
             | On Maltese music... There's a lot of low-ish quality music,
             | then there are a few absolute gems. Look up The Travellers,
             | Lapes, Jon Mallia on YouTube/Spotify.
        
               | nxor wrote:
               | Interesting, but I get the impression that ubiquitous
               | English loan words in seemingly every language is a lot
               | different than loan word patterns of the past. Do you
               | think? Maybe not?
        
               | purrcat259 wrote:
               | I don't have much of an opinion. I suppose English
               | language cultural dominance has meant that newer words
               | are just imported rather than adapted.
        
               | lullu57 wrote:
               | I can concur. All older words (think any word that was
               | needed since the older generations) are Arabic-based.
               | All the numbers, all older verbs etc. 'Newer' words are
               | Latin-based.
        
           | cm2012 wrote:
           | Can you communicate with Maltese dogs more effectively?
        
             | purrcat259 wrote:
             | Only if we have a few Maltesers first
        
           | Tade0 wrote:
           | How is "Marsaxlokk" _really_ pronounced? I've heard that
           | word a few times, but never from a native. Google translate
           | can't help me here, as it doesn't seem to have Maltese text-
           | to-speech.
        
             | purrcat259 wrote:
             | Read with English pronunciation, closest would be mar-sa-
             | shlock.
        
               | cess11 wrote:
               | From my experience it will be understood by locals when
               | pronounced like that.
        
           | franklin_p_dyer wrote:
           | Not a question, but - Tatoeba could use your help! It is an
           | open source (both code and data) dataset of parallel
           | sentences and their Maltese data is very lacking. Also it's
           | pretty fun to just translate a bunch of random sentences into
           | a language you speak. :-)
           | 
           | https://tatoeba.org/
        
           | runarberg wrote:
           | Is there any dialect of Arabic which you can understand
           | without too much effort?
           | 
           | How much do you consider Maltese its own language (as opposed
           | to a dialect of Arabic)?
        
             | notahacker wrote:
             | I know that the reverse understanding isn't too bad from
             | chatting with a Saudi-born member of staff on holiday in
             | Malta.
             | 
             | I don't think anyone would seriously consider it a dialect
             | of Arabic though with its completely different alphabet and
             | half the vocabulary and morphology coming from Italian
             | languages/dialects, even if Malta hadn't spent the best
             | part of a millennium trying very hard _not_ to become part
             | of the Arab world
        
             | purrcat259 wrote:
             | From what I have heard, Lebanese Arabic is the closest, and
             | still pretty far. Passable conversation is possible.
             | 
             | Maltese is definitely its own language. Arabic roots are
             | there (there's a Semitic joke in there) but it isn't Arabic
             | anymore. It's written left to right with a variant of the
             | English alphabet.
        
           | barrell wrote:
           | I recently discovered Maltese existed, and started learning
           | it that day. I find it such an awesome language, and not just
           | because of the letter Ħ
           | 
           | I do wonder what natives think and feel about the longevity
           | of their language? What is taught in schools at what ages
           | (assuming English is in the mix somewhere)? Is there enough
           | media in Maltese for Maltese speakers to go about modern life
           | fully in Maltese? It's shockingly hard to find any information
           | on Maltese, and even harder to find content.
           | 
           | I'm not sure if it's dying out, or in danger thereof; if there
           | are preservation efforts, or if there is no need.
        
             | lullu57 wrote:
             | Native Maltese speaker here. It is taught in schools
             | alongside English, with both being official national
             | languages. Most people locally who are not foreign-born
             | or immigrants speak the language, and it is used in most
             | households as the main language. But everyone grows up
             | bilingual, as English is essential for most everything else
             | that we do as a nation.
        
         | jim180 wrote:
         | Lithuanian and Latvian are Baltic languages. Nothing to do with
         | Slavic...
        
           | Telaneo wrote:
           | https://en.wikipedia.org/wiki/Balto-Slavic_languages
        
             | asveikau wrote:
             | See the section "historical dispute".
             | 
             | I think some people get touchy about them being lumped
             | together if their last period of commonality (per the
             | article) was 1400 BCE. For comparison, I believe all the
             | Slavic languages were mutually intelligible around 1200 AD.
             | But much more recently than this, in the last few
             | centuries, there have been notable attempts by east slavs
             | to absorb the Baltic language cultures and deny them.
        
               | krzyk wrote:
               | I doubt that South Slavic and West/East Slavic were
               | mutually intelligible at 1200 AD.
               | 
               | I doubt West and East Slavic were. But inside those
               | geographic groups they probably were (Czech and Polish
               | AFAIR were around that time).
        
               | actionfromafar wrote:
               | Depends on your standards, too. Even today, any pair of
               | slavic speakers should have a head start in understanding
               | each other. Put them next to each other for a month and
               | they should be talking, at least about basic everyday
               | things.
        
               | asveikau wrote:
               | I may be off by 100-200 years, but this is what I read.
               | There were accents and regionalisms but they were all
               | mutually intelligible.
               | 
               | It is an example I think of often, about how quickly
               | languages can change. In the scale of 1000 years, a lot
               | changes. Most of the diversity in Romance languages is
               | from around that timescale too, it really started to
               | diverge substantially around 900ad-1100ad.
        
           | kaato137 wrote:
           | Balto-Slavic branch divides into Baltic and Slavic language
           | groups so nothing wrong here
        
             | kreetx wrote:
             | Yup, most of Eastern Europe are Balto-Slavic. While the
             | division from the Eastern Slavic languages (Russian,
             | Belarusian, Ukrainian, etc) is distant, they are still
             | Slavic. From Eastern Europe, only Estonian is not a Slavic
             | language.
        
               | d1sxeyes wrote:
               | Hungarian too, although there's a question about whether
               | Hungary is Eastern or Central Europe.
        
               | kreetx wrote:
               | Ah, yes, how could I forget! As a side note, though it is
               | also Finno-Ugric, its similarity in sound and appearance to
               | Finnish or Estonian at least appears very distant.
        
               | dragonwriter wrote:
               | "There's a question" implies that there is a ground truth
               | that might be discovered to resolve this rather than
               | simply a clash of different purely arbitrary definitions
               | of the same terms.
        
               | lo_zamoyski wrote:
               | The Visegrad 4 (Poland, Czechia, Slovakia, Hungary) are
               | generally taken to be "Central European". The strict
               | East/West division is largely a product of the Cold War
               | and the Iron Curtain.
        
               | NicuCalcea wrote:
               | > From Eastern Europe, only Estonian is not a Slavic
               | language.
               | 
               | Well, that and Romanian. And Hungarian. And outside the
               | EU, Albanian. And Georgian, Azeri and Armenian if you
               | consider those Eastern Europe.
        
               | ardit33 wrote:
               | Albania is not "East Europe", but South East. Same as
               | Greece.
        
               | NicuCalcea wrote:
               | That's just your opinion, and the UN would disagree:
               | https://www.un.org/dgacm/en/content/regional-
               | groups#:~:text=...
               | 
               | Some of my fellow Romanians will also claim they're
               | Central European, but in my mind, all the ones I listed
               | are Eastern European countries. I'd even include Turkey
               | and Kazakhstan in there, part of the latter is to the
               | West of the Urals, which is what we normally consider the
               | border between Europe and Asia.
        
               | kreetx wrote:
               | I regret being that loose with the designation :),
               | Romanian and Hungarian are valid counter arguments.
               | 
               | In my mind, I was thinking of the belt of countries
               | between Russia and Central Europe, starting from the
               | Baltics down to the Balkan (excluding Greece).
        
               | NicuCalcea wrote:
               | Even by your definition, I can count at least seven
               | countries where the official language is not Slavic. And
               | that's not even including all the Altaic, Romance and
               | other assortment of regional languages, many of which
               | have some sort of official status.
        
               | rich_sasha wrote:
               | Latvian and Lithuanian are not at all Slavic.
               | 
               | There is a branch that contains both Baltic and Slavic
               | languages, but there's also one that contains Albanian
               | and Greek.
        
               | ardit33 wrote:
               | Albanian and Greek are both completely separate branches,
               | and both unique on the tree (they don't have common
               | cousins like the others).
               | 
               | There have been some attempts to tie Albanian to
               | Germanic, or Greek, or other branches, but they all have
               | failed.
               | 
               | At some point they all are Indo-European, but they split
               | a long while ago.
        
               | pqtyw wrote:
               | > most of Eastern Europe are Balto-Slavic
               | 
               | and
               | 
               | > only Estonian is not a Slavic language.
               | 
               | So following this logic saying "in Eastern Europe, only
               | Estonian is not a Baltic language" would make as much
               | sense?
        
             | sublimefire wrote:
             | It is just one of the theories, there is no clear evidence
             | to suggest that Baltic and Slavic were the same language
             | thousands of years ago.
        
               | pqtyw wrote:
               | Well there is if you go far enough. It's just the
               | question when did they split off from each other. However
               | there is no question that Baltic and Slavic are more
               | closely related to each other than any other non extinct
               | Indo-European languages.
               | 
               | The fact that they are the closest surviving relatives on
               | its own doesn't mean it makes sense to group them together
               | (i.e. Italo-Celtic is also a theorized subgroup in a
               | similar way but nobody is disputing that Celtic and
               | Italic languages evolved into distinct groups).
               | 
               | Then there is a huge amount of missing links and unknown
               | unknowns. e.g. Thracian and Dacian probably were also
               | pretty close to Baltic or Slavic (maybe even closer to
               | Baltic than Slavic is but we don't know enough about them
               | to make any conclusive claims at all... but we at least
               | know these languages existed)
        
             | Tade0 wrote:
             | Plenty of wrong here, considering Lithuanian and Latvian
             | are utterly unintelligible to slavs, save for loanwords,
             | but Slavic languages between themselves retain some level
             | of intelligibility, which even spawned two competing
             | constructed languages.
        
           | adzm wrote:
           | I was thinking about separating the two groups when I was
           | writing this but was afraid of getting too verbose, though in
           | retrospect that probably would have made more sense
           | regardless of the historical lineage. My apologies if this
           | came off as inconsiderate.
           | 
           | I updated my original comment, and learned a good amount
           | about that dispute as a result, so thanks for calling it out.
        
         | Vinnl wrote:
         | Tomorrow there are elections in the Netherlands, and two
         | parties are proposing adding Frisian to that list:
         | https://neerlandistiek.nl/2025/10/kies-voor-taal/
         | 
         | Best get to retraining those models.
        
           | przemub wrote:
           | Each EU country nominates one official language for the EU,
           | otherwise we'd have Catalan, Breton, Kashubian and many more.
        
             | rsynnott wrote:
             | They could get Austria to do it, as it presumably has a
             | spare slot.
        
               | outside1234 wrote:
               | This raises an interesting question. Is there only one
               | dialect of German in the LLM? My understanding is that
               | the German German and Austrian German dialects are
               | significantly different.
        
               | hebelehubele wrote:
               | My German teacher always claimed that Swiss German and
               | German German (Hochdeutsch) were so different that she
               | needed subtitles to understand it, and she didn't
               | understand why they weren't considered separate
               | languages.
        
               | umanwizard wrote:
               | They are in fact considered separate languages.
        
               | geretnal wrote:
               | Try Dutch, it is a combination of German and English!
        
               | layer8 wrote:
               | If Switzerland was in the EU, it would certainly be made
               | a separate official language.
        
               | ipsi wrote:
               | They really are very, very different. Knowledge of one
               | helps with the other, but it's far more than just "a
               | couple of weeks to adjust to the accent", for example.
               | 
               | EDIT: It's worth noting that this is mostly a _spoken_
               | thing, AIUI - most formal/semi-formal writing would be
               | in Hochdeutsch rather than a local dialect.
        
               | lhoff wrote:
               | It depends. There is not one Swiss German but multiple
               | subdialects. The language spoken around the Bern region is
               | very far from German, while the one from Zurich or
               | Basel is much closer. Since there is no official written
               | form, they never really converged to a homogeneous
               | language.
        
               | ipsi wrote:
               | When spoken? Almost certainly. But I think they mostly
               | write in Hochdeutsch, especially in formal contexts, at
               | least that I've seen (private chats/etc are a totally
               | different matter), so I don't foresee any major issues
               | there.
        
               | lxgr wrote:
               | Austrian Standard German is slightly different from the
               | German variant, even when written. The differences are
               | pretty minor, though, so it's very possible to have a
               | relatively long text without being able to tell which one
               | it actually is (especially when potatoes are not
               | referenced in it).
        
             | piltdownman wrote:
             | Including the nasty political side-show that is Ulster
             | Scots - literally only brought in as a chilling effect
             | 'whataboutism' to diminish support when Irish speakers ask
             | for language rights in Northern Ireland.
             | 
             | https://www.reddit.com/r/northernireland/comments/1fivtob/n
             | o...
        
               | pqtyw wrote:
               | Well Scots is a real language. As much as English or any
               | other. Whether enough people speak it especially in NI to
               | justify it having an official status and such is another
               | matter.
        
               | AlecSchueler wrote:
               | This completely ignores the history of published writing
               | in Ulster Scots going back centuries.
        
               | wizzwizz4 wrote:
               | This is one of those topics where the Hacker News take is
               | unlikely to be correct. There's a _lot_ of strong feeling
               | here, and an outsider would need at least three books to
               | understand the historical context (one of which, afaict,
               | has not been written yet: it's oral tradition only).
               | 
               | People closer to the issue are better-placed to gather
               | the necessary information, but again: strong feeling.
               | Most people find it hard to get past that. The most
               | informed person I know is _so_ biased that I don't at
               | all trust their conclusions.
        
             | Levitz wrote:
             | Well, this was 4 days ago, Spain in talks with Germany
             | regarding the addition of official languages:
             | 
             | https://www.politico.eu/article/catalan-basque-galician-
             | boos...
        
             | runarberg wrote:
             | Is English a legacy official language then, from the time
             | the UK was a member? (I'm guessing Ireland nominated Irish
             | instead of English.) As an aside, it feels very un-EU to push
             | this limitation, as I was under the assumption that the EU
             | was all about celebrating (European) diversity.
        
               | handelaar wrote:
               | Still an official language, thankfully. Officially,
               | because of Cyprus.
        
               | Muvasa wrote:
               | Malta and Ireland
        
           | sigmar wrote:
           | Should be noted- the Netherlands can't unilaterally make
           | changes. Spain has been trying to push for languages to be
           | added and hasn't had luck.
        
             | Vinnl wrote:
             | Haha I just added it as a fun fact, I don't actually
             | believe folks will need to start retraining things, or that
             | this is likely to be at the top of the priorities list for
             | anyone. Party programmes are aspirational anyway.
        
           | mikrl wrote:
           | As a Brit I feel very at home when hearing/reading Dutch and
           | Frisian. It's a reminder that England and the Low Countries
           | share a lot of close history all the way back to Anglo-Saxon
           | times; of being fishers, traders, burghers and mercenaries
           | moving around the North Sea chasing opportunities, spreading
           | and augmenting languages.
           | 
           | "Brea, buter en griene tsiis is goed Ingelsk en goed Frysk"
        
             | RobotToaster wrote:
             | If you've ever read anything written in Old English, it's
             | even closer to Dutch.
        
               | lawlessone wrote:
               | Before the Dutch arrived would it have been something
               | like Welsh that was spoken in England?
        
               | rgblambda wrote:
               | Anglo-Saxons not Dutch. But the short answer is yes. The
               | word Welsh is derived from the Old English word for
               | foreigner.
               | 
               | Latin would have been spoken in towns and cities but as
               | Roman rule collapsed it was replaced by Brittonic
               | (ancestor of Welsh), unlike in the continent where it
               | developed into various Latin derived Romance languages.
        
             | tirant wrote:
             | Not only in the language but also in gastronomy and
             | architecture. When I see old towns in the UK I usually think
             | about Dutch towns but just without any biking
             | infrastructure.
        
             | tannhaeuser wrote:
             | > _However modern standard Dutch (Nederlands, Hollands) is
             | based upon Franconian, rather than Saxon dialects._
             | 
             | > _Some of these [Old Saxon] speakers took part in the
             | Germanic conquest of England in the fifth century AD. While
             | it is not true that English and Plattdeutsch derive
             | completely from the same source, the Old Saxon input into
             | Anglo-Saxon was of primary importance and this linguistic
             | group contributed greatly to the Anglo-Saxon dialects which
             | our English forefathers spoke._
             | 
             | [1]: http://www.plattmaster.de/plattoew.htm
        
           | tecleandor wrote:
           | AFAIK, they are trying to get Frisian added to the "European
           | Charter for Regional or Minority Languages", not the official
           | language list.
           | 
           | They get certain recognition, but they are not official in
           | Europe. For example, just from Spain there are 13 languages
           | on that list.
        
           | ginko wrote:
           | Just do a 50:50 mix of the German and Dutch model weights.
        
             | Vinnl wrote:
             | Oops, accidentally made the model speak Limburgish.
        
         | ChrisMarshallNY wrote:
         | Flemish? I remember watching a TV show in Flemish (_Hotel Beau
         | Sejour_ [0]), so it's prevalent enough to invest that kind of
         | money into.
         | 
         | What about Basque? Is that too controversial?
         | 
         | [0] https://en.wikipedia.org/wiki/Hotel_Beau_Sejour
        
           | td540 wrote:
           | Like British English vs US English, Flemish is a dialect of
           | Dutch.
        
             | ChrisMarshallNY wrote:
             | Ah. That makes sense.
             | 
             | It's all Greek, to me...
        
           | mytailorisrich wrote:
           | I think those 24 languages reflect all the languages that are
           | official languages at country level.
           | 
           | So for instance, Basque is not an official language of any
           | country (only French in France and Spanish/Castilian in
           | Spain). Belgium's official languages are French, Dutch, and
           | German, "Flemish" is only a local variant of Dutch (Belgian
           | French is also only a local variant of French).
        
             | ChrisMarshallNY wrote:
             | Thanks. That makes sense.
             | 
             | In the US, people will resort to fisticuffs, over variants
             | of Spanish. I usually translate into Castilian Spanish,
             | because that seems to be the equivalent of "Vanilla"
             | Spanish. No one is really happy (except the Spaniards), but
             | I'm not accused of favoritism.
        
             | contravariant wrote:
             | Official is a weird concept though. Turns out Dutch law
             | never really bothered to define an official language, Dutch
             | simply is the de facto standard and is required for a lot
             | of things making it effectively the standard. This makes
             | Dutch Sign Language the only language officially recognised
             | by law. An attempt to recognise Frisian and Dutch as
             | official languages in the constitution failed.
        
               | rags2riches wrote:
               | Sweden didn't have an "official" language before the
               | Language Law of 2009. Five minority languages (Finnish,
               | Meankieli, Romani, Sami, Yiddish) were officially
               | recognized as such since 1999.
        
             | tirant wrote:
             | Basque is an official language, declared as such in the
             | Spanish constitution, but restricted to the regions that
             | decide to apply it (Basque Country and Navarra).
        
               | mytailorisrich wrote:
               | If we want to go all legal, I believe that
               | Spanish/Castilian is the only official language of the
               | State, so at country level, with the other "Spanish
               | languages" only official in their respective areas:
               | 
               |  _Section 3
               | 
               | (1) Castilian is the official Spanish language of the
               | State. All Spaniards have the duty to know it and the
               | right to use it.
               | 
               | (2) The other Spanish languages shall also be official in
               | the respective Autonomous Communities in accordance with
               | their Statutes.
               | 
               | (3) The richness of the different linguistic modalities
               | of Spain is a cultural heritage which shall be specially
               | respected and protected._ [1]
               | 
               | [1] https://www.senado.es/web/conocersenado/normas/consti
               | tucion/...
        
           | tirant wrote:
           | Basque is not controversial, but it is spoken by very few
           | people.
        
             | embedding-shape wrote:
             | Not sure that should be the qualifier, there might be more
             | people able to speak Basque in the world than Danish,
             | doesn't stop Danish from being well supported.
        
               | Levitz wrote:
               | Quick google points to about 1M Basque speakers in the EU
               | against 5-6M Danish speakers, there's also the fact that
               | Basque is not the only official language in the country
               | it belongs to, and that it's in fact not spoken in the
               | vast majority of the country.
               | 
               | From https://european-union.europa.eu/principles-
               | countries-histor... we can find an excerpt relating to
               | the policy and its purpose:
               | 
               | >One of the EU's founding principles is multilingualism.
               | 
               | >This policy aims to:
               | 
               | >communicating with its citizens in their own languages
               | 
               | >protecting Europe's rich linguistic diversity
               | 
               | >promoting language learning in Europe
               | 
               | With this in mind, the first intention fails by an
               | enormous margin, given that 95%+ of Spain doesn't speak
               | an iota of Basque, the second is met handily, given the
               | long history of the language, and I'm not sure what to
               | think about the third, any language whatsoever would
               | serve that purpose.
        
           | yvdriess wrote:
           | Flemish is more of a political construct than a linguistic
           | one: it's a grouping of the Belgian-Dutch coastal, Brabant and
           | Limburg language groups, each having its own regional
           | dialects.
        
             | OptionOfT wrote:
             | It's more than political. In speaking Flemish is to Dutch
             | as UK English is to US English. In writing however there is
             | no difference in spelling, but there is a difference in
             | word choice.
             | 
             | Now, being from Belgium, even within that small part of the
             | country where everybody is supposed to speak Dutch, I
             | genuinely don't understand people from near the coast,
             | which was about 150 miles from where I used to live.
        
         | punnerud wrote:
         | Norwegian is also included, based on the model card:
         | https://huggingface.co/utter-project/EuroLLM-9B
        
         | arbuge wrote:
         | > Maltese, interestingly, is the only Afro-Asiatic derived
         | language.
         | 
         | It's Semitic, to be precise.
         | 
         | https://en.wikipedia.org/wiki/Semitic_languages
        
           | UebVar wrote:
           | Arabic, even. An outlier, as it is AFAIK the only Arabic
           | dialect that is not written with the Arabic alphabet. Also
           | it's far removed from other Arabic dialects.
        
             | findyoucef wrote:
             | It's not at all far removed from the North African dialects
             | of Arabic, from which it is derived. Tunisians and
             | Algerians can understand Maltese quite well.
        
         | fsckboy wrote:
         | Is Ireland the only country to bring in two languages,
         | Irish/Gaelic and English? Is English an official language of
         | any other EU countries?
        
           | JAlexoid wrote:
           | I believe Malta has English as an official language.
           | 
           | PS: Gaelic is a more general term for Irish and Scottish.
           | Ireland brings specifically Irish(Gaeilge in Irish) language.
        
           | rags2riches wrote:
           | Malta has Maltese and English as official languages. I don't
           | know what they bring to the EU list of official languages.
        
           | ginko wrote:
           | AFAIK Ireland only listed Gaelic as their official language
           | with UK having English. That caused a bit of a problem during
           | Brexit since technically English wasn't officially an EU
           | language anymore. I guess they resolved it somehow.
        
           | layer8 wrote:
           | English is an official EU language because Regulation 1
           | Article 1 says so [0] and hasn't been changed. In practice,
           | English is the most widely used language in EU institutions,
           | so it would have been silly to remove it after Brexit.
           | 
           | [0] https://eur-lex.europa.eu/legal-
           | content/EN/TXT/?uri=CELEX:01...
        
             | raattgift wrote:
             | That said, whenever there is a language selection UI (e.g.
             | at banking machines or institutional websites) in wider
             | Europe that uses flags to represent languages -- probably
             | not a good idea to start with, but very common -- the Irish
             | tricolour should be used to indicate English rather than
             | the UK or USA flags. (although cf Airteagal 8 of Bunreacht
             | na hEireann).
        
             | ChocolateGod wrote:
             | English at this point has stopped culturally belonging to
             | the United Kingdom and whilst one can discuss its not so
             | very moral way of getting there, it's become the bridge
             | language for people of different languages to communicate
             | in, further solidified by the internet.
        
             | rcbdev wrote:
             | It's a national language in Malta, making it a popular
             | destination for "language weeks" in European schools, where
             | English is usually a main subject.
        
         | threesmegiste wrote:
         | Turkish?
        
           | runarberg wrote:
           | Is official in Northern Cyprus. But as I understand it while
           | the whole island of Cyprus is in the EU, the state of
           | Northern Cyprus isn't.
        
         | sva_ wrote:
         | Seems like the model isn't limited to those though, from the
         | paper:
         | 
         | > as well as some additional relevant languages (Arabic,
         | Catalan, Chinese, Galician, Hindi, Japanese, Korean, Norwegian,
         | Russian, Turkish, and Ukrainian).
         | 
         | https://arxiv.org/pdf/2409.16235
         | 
         | The paper also goes into detail on training set sources; I feel
         | the curation thereof might be considered the main contribution
         | of this publication?
        
         | _kidlike wrote:
         | In Greek we call our language Hellenic, and our country Hellas.
         | "Greek" / "Greece" don't exist in the Hellenic language.
        
           | ranadomo wrote:
           | > Graikoi, Graikoi were an ancient Hellenic tribe
           | 
           | https://en.wikipedia.org/wiki/Graecians
        
           | 3836293648 wrote:
           | Yes it does: it was a Greek colony off the southern coast of
           | Italy, which was the primary Greek connection to the Romans,
           | which is how the name stuck.
        
         | ks2048 wrote:
         | From other comments, it seems many people don't realize that
         | there are 11 more languages than these 24 official (this is
         | mentioned in the paper):
         | 
         | Arabic, Catalan, Chinese, Galician, Hindi, Japanese, Korean,
         | Norwegian, Russian, Turkish, and Ukrainian.
        
           | jll29 wrote:
           | +1
        
         | amarant wrote:
         | I find it interesting that Norwegian isn't on the list.
         | 
         | I have often joked that Norwegian is just a dialect of Swedish,
         | but I never expected to get official validation like this!
        
           | bdhtu wrote:
           | Norway isn't in the EU.
        
           | emil-lp wrote:
           | Norway isn't in EU, though.
        
           | rcbdev wrote:
           | Norwegian is not on this list, because in fact no country
           | with Norwegian as their national language is part of the
           | European Union at the time of writing.
        
         | jenadine wrote:
         | No Luxembourgish?
        
         | cyfex wrote:
         | > Greek being the only Hellenic one
         | 
         | Are there really any other Hellenic languages besides Greek?
        
         | zhengiszen wrote:
         | Maltese is derived from dialectal Arabic
        
       | moralestapia wrote:
       | Benchmarks?
       | 
       | Edit: Thanks, @Bengalilol.
       | 
       | The 1.7B one looks meh.
       | 
       | But really solid numbers on the 9B! Props to the team!
        
         | Bengalilol wrote:
         | 1.7B
         | 
         | https://huggingface.co/utter-project/EuroLLM-1.7B#results
         | 
         | 9B
         | 
         | https://huggingface.co/utter-project/EuroLLM-9B#results
        
       | nellyspageli wrote:
       | Could you adjust the title from:
       | 
       | "all official 24 EU languages" to "all 24 official EU languages"
        
         | scoot wrote:
         | @dang
        
         | Philpax wrote:
         | The former is used on the website itself.
        
       | seydor wrote:
       | It's just another Horizon2020 grant, people. Don't be overly
       | harsh to a bunch of academics who are just earning their living.
        
         | giorgioz wrote:
         | I didn't know of the grant! https://research-and-
         | innovation.ec.europa.eu/funding/funding...
         | 
         | It seems the new version is called Horizon Europe
        
         | tonyhart7 wrote:
         | Yeah, comparing this to SOTA models is too harsh
        
         | oytis wrote:
         | I thought research grants were to make novel discoveries, not
         | to replicate what industry has long done. Unless we are at the
         | point where we study US as an alien civilization
        
       | srameshc wrote:
       | I was thinking the same: why are so many superior models coming
       | only from countries like the US and China? And why are European
       | countries not on the list, other than France with Mistral? Why
       | are so few companies in India, Japan, and South Korea even close
       | to a promising new model like what Chinese companies produced?
        
         | apples_oranges wrote:
         | Does it even make sense? Just use the American or Chinese ones,
         | adjust as needed. Where's the point in spending millions to
         | build the same thing or worse?
        
           | t43562 wrote:
           | Now that the big bets have been made, who wants to try to
           | compete with them?
        
         | loandbehold wrote:
         | Because training a frontier model is expensive and only the US
         | and China have the capital structure to raise tens of billions
         | of dollars to do it.
        
           | busssard wrote:
           | being able to train new frontier models is the new equivalent
           | to nuclear capabilities.
           | 
           | i predict at some point countries will get CIA'ed when they
           | publish plans to build a large data center.
           | 
           | Similar to the time when they got CIA'ed when announcing
           | plans for new nuclear plants.
        
             | henriquenunez wrote:
             | They are already CIA'ed on a regular basis for much less
             | than that.
        
           | lossolo wrote:
           | You can easily fit below 10 billion for the whole datacenter,
           | then you only pay for electricity + maintenance + staff. 100k
           | GPUs cost a few billion USD, that's more than enough to train
           | frontier models, run experiments, and serve models in the EU
           | to start. Look at what xAI did and how much it cost them and
           | it's more expensive to do in US than in EU.
        
         | nonethewiser wrote:
         | "Why" is a fair question but are you surprised? Europe is
         | consistently behind in tech.
         | 
         | Europe has about 1.3 times the population of the USA and about
         | 75% of the GDP yet EU tech output is a very small percentage of
         | US tech output. We are not talking about 70, 50, 30, or even
         | 20%. It's a drop in the bucket.
         | 
         | >The seven largest U.S. tech companies, Alphabet (Google),
         | Amazon, Apple, Meta, Microsoft, Nvidia, and Tesla, are 20 times
         | bigger than Europe's seven largest, and generate 10 times more
         | revenue.
         | 
         | https://eqtgroup.com/thinq/technology/why-is-europes-tech-in...
         | 
         | "Why" is a good question, but I definitely wouldn't expect
         | significant competition in LLMs from Europe based on the giant
         | tech disparity. Having 1 non-cutting edge model that isn't
         | really competitive is pretty much what I would expect.
        
           | emporas wrote:
           | Also, commercial software is consistently behind open
           | source.
           | 
           | I only use open source LLMs for writing (Qwen 32b from Groq)
           | and open source editor of course, Emacs.
           | 
           | If some people can write better using commercial LLMs (and
           | commercial editors), by all means, but they put themselves at
           | a disadvantage.
           | 
           | Next step for me, is to use something open source for
           | translation, I use Claude for the moment, and open source for
           | programming, I use GPT currently. In less than a year I will
           | find a satisfying solution to both of these problems. I
           | haven't looked deep enough.
        
             | neoromantique wrote:
             | What a weird comment.
             | 
             | llama-3.1-70b-versatile is pretty good at translating
             | though
        
           | InsideOutSanta wrote:
           | _> The seven largest U.S. tech companies (...) are 20 times
           | bigger than Europe's seven largest, and generate 10 times
           | more revenue._
           | 
           | I'm going to guess that this part is intentional. Europe
           | tends to be more aggressive in enforcing antitrust laws.
           | Economically, Europe's goal isn't to have the biggest
           | companies but to have more smaller companies.
           | 
           | So you're not going to get companies like Google, but you
           | will get companies like Proton, Spotify, Tuta, Hetzner,
           | Mistral, Threema, Filen, Babbel, Nextcloud, CryptPad, DeepL,
           | Vivaldi, and so on.
        
             | nonethewiser wrote:
             | >I'm going to guess that this part is intentional. Europe
             | tends to be more aggressive in enforcing antitrust laws.
             | Economically, Europe's goal isn't to have the biggest
             | companies but to have more smaller companies.
             | 
             | So is your hypothesis that the total market cap of EU tech
             | companies is something like 50,60,70, etc. % of total US
             | tech marketcap? Something significantly different than the
             | ~10% implied by that figure (largest US companies 10x
             | largest EU companies). And it's just more broadly
             | distributed?
             | 
             | Hard to find data on this but this is showing EU tech
             | market cap at 3.2T.
             | https://www.stateofeuropeantech.com/chapters/outcomes
             | 
             | Whereas this is saying the US "megacaps" ($200B+) are at
             | 21T. https://www.cnbc.com/2025/09/05/tech-megacaps-worth-
             | market-c...
             | 
             | Which puts the entire EU tech market at 15% of the US
             | megacaps. Not even the entire market.
        
               | layer8 wrote:
               | European companies are smaller on average and less likely
               | to go public in general, so market cap comparisons don't
               | show the whole picture. Growing big is less often seen as
               | a goal than in the US. "Megacaps" aren't necessarily
               | considered a healthy thing to have.
        
               | jimbokun wrote:
               | Yes, and this all but guarantees that Europe will stay
               | behind USA and China in their technology capabilities.
        
               | mjburgess wrote:
               | What are these capabilities?
               | 
               | I don't see any sense in which the EU has fewer
               | capabilities. It has, say, a smaller number of businesses
               | with smaller market dominance.
               | 
               | It isn't clear to me what capability the EU would gain by
               | having a monopolist social network, a monopolist search
               | engine, a monopolist advertising trader
        
               | jimbokun wrote:
               | Europe has all of those things, they just come from the
               | US.
        
         | sunaookami wrote:
         | EU made a >900 page law about AI and patted themselves on the
         | back for being "the first to regulate AI" (which was not even
         | true, China had an AI law before and it's two pages long).
        
           | sajithdilshan wrote:
           | This cannot be stressed enough. In my experience working in
           | multiple tech startups in Germany, the power that compliance,
           | legal and all other second-line functions have over
           | engineering is quite immense. Most of the time they act as a
           | hindrance to innovation rather than a supporting factor.
           | 
           | This AI law is a clear example of that. Pencil pushers
           | creating more obstacles for the sake of creating more
           | obstacles rather than actually taking a pragmatic approach.
        
             | isodev wrote:
             | It's strange, my real life experience is very different
             | than yours. Unless you're training AI to do something
             | shady, it's really no bother at all. In fact, most of what
             | the AI Act requires, you have to do anyway for a good model
             | card.
        
         | sublimefire wrote:
         | As a European citizen I think it boils down to access to the
         | capital. EU/EEA is not a country and the market is sort of
         | fragmented. The big players are UK, France, Germany, everyone
         | else does not have the same access to money as say in the US.
         | Folks want to do it but there is a glass ceiling. Hence you
         | have these collabs among large institutions to tap into funds
         | such as from Horizon which are academic in nature and do not
         | translate well into products.
        
         | isodev wrote:
         | Because the value of these models is (actually) yet to be
         | proven. Why saturate the market with something that we already
         | have at least one of and others are selling as a service? No
         | model provider (including the "big ones" like OpenAI) has been
         | able to produce a viable business case. They're all literally
         | running on government deals and investor money.
        
       | elias_t wrote:
       | Are there any benchmarks that exist for those 24 languages?
        
         | moralestapia wrote:
         | dupe of https://news.ycombinator.com/item?id=45733832
         | 
         | which sank to the bottom thanks to HN's invisible hand
         | 
         | Oh wait, one's not supposed to _notice_
        
           | morkalork wrote:
           | It's more like the default is to be ranked near the bottom
           | unless your comment gets traction during the brief window of
           | time it is ranked first for being new. Seeing your comments
           | go _splat_ after that window expires is not some nefarious
           | conspiracy..
        
             | moralestapia wrote:
             | Oh, you'd be surprised to know what's behind many of those
             | "conspiracies"!
        
         | nodja wrote:
         | It's on the huggingface readme
         | 
         | https://huggingface.co/utter-project/EuroLLM-9B#results
         | 
         | https://huggingface.co/utter-project/EuroLLM-9B#english
        
         | ks2048 wrote:
         | The detailed results are in appendix to the paper:
         | https://arxiv.org/abs/2506.04079
        
       | loandbehold wrote:
       | Aren't all frontier models already able to use all these
       | languages? Support for specific languages doesn't need to be
       | built in, LLMs support all languages because they are trained on
       | multilingual data.
        
         | melvinmelih wrote:
         | > because they are trained on multilingual data
         | 
         | But they were not trained on government-sanctioned homegrown EU
         | data.
        
           | saretup wrote:
           | The entirety of the internet vs government-sanctioned
           | homegrown EU data.
        
           | raverbashing wrote:
           | > But they were not trained on government-sanctioned
           | homegrown EU data.
           | 
           | If none of the LLM makers used the very big corpus of EU
           | multilingual data, I have an EU regulation bridge to sell
           | you
        
           | tonyhart7 wrote:
           | "But they were not trained on government-sanctioned homegrown
           | EU data."
           | 
           | ok what are you implying on this
        
           | sunaookami wrote:
           | Who in their right mind would use this?
        
             | tensor wrote:
             | I'd use a model trained on a targeted and curated data set
             | over one trained on all the crap on the internet any day.
        
               | loandbehold wrote:
               | I keep hearing that LLMs are trained on "Internet crap"
               | but is it true? For instance we know from the Anthropic
               | copyright case that they scanned millions of books to
               | make a training set. They certainly use Internet content
               | for training but I'm sure it's curated to a large degree.
               | They don't just scrape random pages and feed them into an
               | LLM.
        
               | nutjob2 wrote:
               | > I'm sure it's curated to a large degree. They don't
               | just scrape random pages and feed them into an LLM.
               | 
               | How would they curate it on that scale? Does page ranking
               | (popularity) produce interesting pages for this purpose?
               | I'm skeptical.
        
               | airspresso wrote:
               | > I keep hearing that LLMs are trained on "Internet crap"
               | but is it true?
               | 
               | Karpathy repeated this in a recent interview [0], that if
               | you'd look at random samples in the pretraining set you'd
               | mostly see a lot of garbage text. And that it's very
               | surprising it works at all.
               | 
               | The labs have focused a lot more on finetuning
               | (posttraining) and RL lately, and from my understanding
               | that's where all the desirable properties of an LLM are
               | trained into it. Pretraining just teaches the LLM the
               | semantic relations it needs as the foundation for
               | finetuning to work.
               | 
               | [0]: https://www.dwarkesh.com/p/andrej-karpathy
        
         | lm28469 wrote:
         | Meh, it depends a lot on the datasets, which are heavily skewed
         | towards the main languages. For example they almost always
         | confuse Czech and Slovak and often swap one for the other in
         | the middle of chats
        
           | mirekrusin wrote:
           | But the only way to unskew it is to remove main language data
           | because there isn't really any to add, no?
        
             | tensor wrote:
             | You can also correctly bias your sampling so that when
             | selecting new training instances each language is chosen
             | equally. Generally the diversity of data is good, unless
             | that data is "wrong" which, ironically, is probably most of
             | the internet, but I digress.
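
A minimal sketch of the language-balanced sampling idea described in the comment above, not the method of any particular lab or of EuroLLM. The function name `balanced_sample`, the toy corpus and the fixed seed are all assumptions made for illustration. Each draw first picks a language uniformly, then picks an example uniformly within that language.

```python
import random
from collections import Counter


def balanced_sample(corpus, n, seed=0):
    """Draw n (lang, text) pairs so every language is equally likely.

    corpus is a list of (lang, text) pairs; this helper is hypothetical.
    """
    rng = random.Random(seed)
    by_lang = {}
    for lang, text in corpus:
        by_lang.setdefault(lang, []).append(text)
    langs = sorted(by_lang)
    samples = []
    for _ in range(n):
        lang = rng.choice(langs)            # uniform over languages
        text = rng.choice(by_lang[lang])    # uniform within that language
        samples.append((lang, text))
    return samples


# Toy corpus (made up): 98 English examples, 1 Czech, 1 Slovak.
corpus = [("en", f"en-{i}") for i in range(98)] + [("cs", "cs-0"), ("sk", "sk-0")]
counts = Counter(lang for lang, _ in balanced_sample(corpus, 3000))
print(counts)
```

In this toy setup the single Czech and Slovak examples are repeated many times, which is the trade-off of equal-probability sampling when a language has little data.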
        
           | RobotToaster wrote:
           | Aren't they about as different as American English and
           | British English?
        
             | svobodovic wrote:
             | The difference is larger than, let's say, just a "dialect".
             | They really are different languages, even though we
             | generally understand each other quite well (younger
             | generations less so). I've heard it's about as different as
             | e.g. Danish and Swedish - not sure if that comparison is
             | helpful.
        
         | intended wrote:
         | Nope. Capability begins to degrade once you move away from
         | English.
         | 
         | Plus all your T&S/AI Safety is not solved with translation, you
         | need lexicons and data sets of examples.
         | 
         | Like, people use someone in Malaysia, to label the Arabic
         | spoken by someone playing a video game in Doha - the cultural
         | context is missing.
         | 
         | The best proxy to show the degree of lopsidedness was from this
         | : https://cdt.org/insights/lost-in-translation-large-
         | language-...
         | 
         | Which in turn had to base it on this:
         | https://stats.aclrollingreview.org/submissions/linguistic-di...
         | 
         | From what I am aware of, LLM capability degrades once you move
         | out of English, and many nation states are either building, or
         | considering the option of building their own LLMs.
        
         | tensor wrote:
         | No, that's not how training works. It's not just about having
         | an example in a given language, but also how many examples and
         | the _ratio_ of examples compared to other languages. English
         | hugely eclipses any other language on most US models and
         | that's why performance on other languages is subpar compared to
         | performance on English.
        
           | andy12_ wrote:
           | I have never noticed any major difference in performance of
           | ChatGPT between English and Spanish. The truth is that as
           | long as the amount of training data of a given language is
           | above some threshold, knowledge transfers between languages.
        
           | Byamarro wrote:
           | There's actually research showing that LLMs are more
           | accurate when questions are in Polish:
           | https://arxiv.org/pdf/2503.01996
        
           | voxgen wrote:
           | Ratio/quantity is important, but quality is even more so.
           | 
           | In recent LLMs, filtered internet text is at the low end of
           | the quality spectrum. The higher end is curated scientific
           | papers, synthetic and rephrased text, RLHF conversations,
           | reasoning CoTs, etc. English/Chinese/Python/JavaScript
           | dominate here.
           | 
           | The issue is that when there's a difference in training data
           | quality between languages, LLMs likely associate that
           | difference with the languages if not explicitly compensated
           | for.
           | 
           | IMO it would be far more impactful to generate and publish
           | high-quality data for minority languages for current model
           | trainers, than to train new models that are simply enriched
           | with a higher percentage of low-quality internet scrapings
           | for the languages.
        
         | numpad0 wrote:
         | Not natively, they all sound translated in languages other than
         | English. I occasionally come across French people complaining
         | about LLMs' use of non-idiomatic French, but it's probably not
         | a French problem at all, considering that this effort includes
         | so many _Indo-European_ languages.
        
           | FinnKuhn wrote:
           | I can at least also confirm this for German. Here is one
           | example that is quite annoying:
           | 
           | ChatGPT, for example, tends to start emails with "ich hoffe,
           | es geht dir gut!", which means "I hope you are well!". In
           | English (especially American) corporate emails this is a
           | really common way to start an email. In German it is not, as
           | "how are you" isn't a common phrase used here.
        
         | whazor wrote:
         | European governments have huge collections of digitalised
         | books, research, public data.
         | 
         | But also European culture could maybe make a difference? You
         | can already see big differences between Grok and ChatGPT in
         | terms of values.
        
           | pembrook wrote:
           | If it's publicly available data, books and research, I can
           | assure you the big models have already all been trained on
           | it.
           | 
           | European culture is already embedded in all the models,
           | unless the people involved in this project have some hidden
           | trove of private data that they're training on which diverges
           | drastically from things Europeans have published publicly
           | (I'm 99.9% positive they don't...especially given Europe's
           | alarmist attitude around anything related to data).
           | 
           | I think people don't understand a huge percentage of the
           | employees at OpenAI, Anthropic, etc. are non-US born.
        
         | charlieyu1 wrote:
         | Training is a very different thing. Can't speak for European
         | languages, but LLMs are often much worse in Japanese because
         | tokenisation operates on UTF-8 bytes and a single Japanese
         | character often has to be represented by more than one token.
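
A minimal sketch of the byte-level effect described above, assuming a tokenizer that starts from UTF-8 bytes (as byte-level BPE tokenizers do). It does not call any real tokenizer; the helper names `utf8_byte_counts` and `bytes_per_char` are hypothetical and only count encoded bytes. Merges learned during training can fold several bytes into one token, so the byte count is an upper bound on the per-character token cost, not a measurement of any particular model.

```python
def utf8_byte_counts(text: str) -> list[tuple[str, int]]:
    """Pair each character with the number of bytes in its UTF-8 encoding."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]


def bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per character in `text`."""
    return len(text.encode("utf-8")) / len(text)


if __name__ == "__main__":
    # ASCII letters take one byte each; common Japanese characters take three.
    for sample in ("token", "日本語"):
        print(sample, utf8_byte_counts(sample), bytes_per_char(sample))
```

If the tokenizer's training data contained little Japanese, few of those three-byte sequences get merged into single tokens, which is one plausible reason the same sentence costs more tokens in Japanese than in English.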
        
       | htrp wrote:
       | >The EuroLLM Team brings together some of the brightest minds in
       | AI including Unbabel, Instituto Tecnico Lisbon, the University of
       | Edinburgh, Instituto de Telecommunicacoes, Universite Paris-
       | Saclay, Aveni, Sorbonne University, Naver Labs, and the
       | University of Amsterdam.
       | 
       | >Europe is the only continent in the world to have a large public
       | network of supercomputers that are managed by the EuroHPC Joint
       | Undertaking (EuroHPC JU). As soon as we received the EuroHPC JU
       | access to the supercomputer, we were ready to roll up our sleeves
       | and get to work. We developed the small model right away and in
       | less than 6 months the second model was ready.
       | 
       | [1] https://www.eurohpc-ju.europa.eu/eurohpc-success-story-
       | speak...
       | 
       | Repurposing some of that physics sim compute
        
       | sorenjan wrote:
       | If I want to use an LLM to do translation, should I use a base
       | model or an instruction tuned version? I've had mixed results
       | using the chat models and a simple "Translate this to <language>:
       | "
        
         | wongarsu wrote:
         | For a 9B model like EuroLLM, fine tuning the base model is
         | pretty viable. You don't need a lot of samples, on the order of
         | 300 high quality examples can produce good results, and the GPU
         | time is pretty manageable with rented GPU instances
         | 
         | Just the base model and a template like "English:
         | {text}\n{language}:" can also work with a bit of filter and
         | retry logic
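
The base-model approach above can be sketched roughly as follows. The prompt template is the one from the comment; everything else is an assumption for illustration: `generate` stands in for whatever function calls the model, and `looks_valid` is a hypothetical filter with arbitrary thresholds, not a recommended set of checks.

```python
from typing import Callable, Optional

# Template quoted in the comment above.
PROMPT_TEMPLATE = "English: {text}\n{language}:"


def build_prompt(text: str, language: str) -> str:
    """Fill the completion-style prompt for a base (non-chat) model."""
    return PROMPT_TEMPLATE.format(text=text, language=language)


def looks_valid(source: str, candidate: str) -> bool:
    """Hypothetical filter: reject empty output, echoes, and runaway length."""
    candidate = candidate.strip()
    if not candidate:
        return False
    if candidate == source.strip():
        return False  # the model just repeated the input
    return len(candidate) <= 3 * len(source) + 20


def translate(
    text: str,
    language: str,
    generate: Callable[[str], str],  # assumed: prompt in, raw completion out
    max_attempts: int = 3,
) -> Optional[str]:
    """Sample up to `max_attempts` completions and return the first that passes."""
    prompt = build_prompt(text, language)
    for _ in range(max_attempts):
        raw = generate(prompt).strip()
        # Base models tend to keep writing, so keep only the first line.
        candidate = raw.splitlines()[0].strip() if raw else ""
        if looks_valid(text, candidate):
            return candidate
    return None  # caller decides what to do when every attempt fails
```

Retrying only helps if `generate` samples with some randomness; with greedy decoding every attempt returns the same text.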
        
       | rob_c wrote:
       | This, I hope, is close to multi-modal in lingual terms. There's
       | potentially a lot to learn from examining where this works/fails
       | :D
        
       | jagermo wrote:
       | looks cool, i hope kagi adds it to the assistant.
        
       | Stagnant wrote:
       | Title is missing "(2024)". The 9B model was released last
       | December[0].
       | 
       | 0: https://sites.google.com/view/eurollm/home
        
       | aurintex wrote:
       | Is it planned to have a VLM or something comparable to
       | Qwen3-VL in the future?
        
         | jug wrote:
         | A multimodal release is planned.
        
       | rvz wrote:
       | As expected, Europe finally catches up to 2024 and launches an
       | LLM that barely competes against the heavyweights.
       | 
       | The US and China are running rings around Europe.
       | 
       | Mistral is an exception as it was funded by US VCs and they are a
       | great example showing that without VC funding, Mistral would have
       | been begging the EU for a microscopic grant to train an LLM
       | worse than Llama.
        
         | laurentiurad wrote:
         | less exposure to a technology that doesn't bring that much
         | revenue and it's not projected to do so in the upcoming years.
        
           | whimsicalism wrote:
           | yep, Europe is demonstrating the same sort of strategic
           | thinking that economic behemoths like the Smithsonian use
        
           | oytis wrote:
           | Why waste money on trying to compete at all then?
        
             | t43562 wrote:
             | Every country needs a few plumbers and carpenters whether
             | or not they are at the forefront of technology. Some money
             | must be spent to give academics work to do so they can
             | sharpen up their skills and perhaps teach the next set of
             | students who might be more commercial
        
               | oytis wrote:
               | It would be a better use of the money to hire someone who
               | has worked on actual frontier models to teach at European
               | universities
        
               | t43562 wrote:
               | If you could find one for the money, if they were happy
               | to teach in the long term. If it wasn't better to have N
               | for the price of 1. In other situations of import
               | substitution I'm pretty sure people try to develop their
               | local talent in addition to buying in experts.
        
         | AJ007 wrote:
         | Mistral is pretty much toast? Their models perform poorly and
         | I'm not sure why anyone would use them. Maybe there is a
         | catching up point somewhere in the future, hopefully.
        
       | kreetx wrote:
       | I'm somewhat skeptical of taxpayer funded innovation. Seen a few
       | Horizon grants from the side, as a citizen I'd prefer to not pay
       | for them, but unfortunately can't opt out.
        
         | owisd wrote:
         | How about Tesla for taxpayer funded innovation?
         | https://www.energy.gov/lpo/tesla
        
           | kreetx wrote:
           | I wouldn't mind actually/visibly productive companies taking
           | these grants. But I've also seen mostly research-focused
           | (nominally) private companies who mostly live off of science
           | grants, who don't produce nor sell much - because they don't
           | have to.
        
         | bigbadfeline wrote:
         | > I'm somewhat skeptical of taxpayer funded innovation... as a
         | citizen I'd prefer to not pay for them, but unfortunately can't
         | opt out.
         | 
         | There are a few variables here but at this point in time,
         | private-funded innovation isn't different by much and all
         | things considered, the difference isn't in its favor.
        
         | tensor wrote:
         | The vast majority of US discoveries are by immigrants using
         | taxpayer money. AKA scientists at universities. Your media
         | likes to give credit to the companies, but generally the
         | companies only apply things, they rarely create new science
         | these days.
        
           | kreetx wrote:
           | The above is not a discovery though.
           | 
           | My experience with government funding is that they apply
           | something and won't even try to sell it because selling is
           | hard: you don't want to know that the thing you built is
           | lacking nor that the competition is better. Especially the
           | academic types don't. Yet I'm paying for these guys. Also, by
           | funding the academics they won't even need to go to the job
           | market. But as I paid for their education, I thought I was
           | buying people who create value.
           | 
           | Perhaps the above is rather harsh and it's "not that bad", my
           | subjective experience nevertheless.
        
             | tensor wrote:
             | Much of the neural network work was funded by Canadian
             | Universities, and commercialized by US companies. Even if
             | you look just at the "Attention is all you need" paper,
             | which is primarily by authors working at Google, most of
             | those authors come from academia and are immigrants.
             | 
             | Vaswani is an Indian-born computer scientist, Shazeer is
             | from the US, Parmar was born in India, Uszkoreit was born in
             | Germany, Jones was born in the UK, Gomez is British-
             | Canadian, Kaiser is a Polish computer scientist, and
             | Polosukhin is Ukrainian.
             | 
             | Almost all of these people have PhDs and master's degrees.
             | The ROI on academia is vast for society, including European
             | universities. The thing the US does well is capitalize on
             | that education, and sadly also try to steal credit for it
             | as "American exceptionalism." If Europe and other countries
             | learn how to keep their academics and get them working in
             | local industries, America's edge will evaporate overnight.
        
               | notahacker wrote:
               | A major factor in European academics moving to the US is
               | that top US institutes can charge a small fortune, and
               | some of that gets reflected in academic salaries.
               | Interesting move by the US government to try to put them
               | off...
               | 
               | The wider availability of capital is a bigger deal
               | though. "Attention is all you need" is available to
               | people on other continents to read, but a computer
               | scientist in Europe that understood exactly how big
               | transformers were going to be and why had less chance of
               | funding than a webdev in California with a pitch deck full
               | of cliches and a me-too GPT wrapper for an industry they'd
               | barely touched does today.
        
       | nonethewiser wrote:
       | How does this work?
       | 
       | It seems like, in most ways, it would be bad to train on 24
       | separate languages. That's just 24 partitions of the data. Seems
       | really inefficient and better to simply train in the biggest
       | (English) and translate.
       | 
       | I do think this will introduce some biases that correlate with
       | the English language. It would be interesting to see more
       | specifically what this means. But regardless, I don't think you
       | can produce a competitive model with such a large subdivision of
       | training data.
        
         | antiloper wrote:
         | If you train a model on multiple languages, you can use the
         | model itself for translation. As well as allowing the model to
         | naturally respond in the user's language.
        
         | whimsicalism wrote:
         | nah, it's better to train on all languages. 24 partitions? you
         | are gravely underestimating these models and how they represent
         | things in their latents... transfers easily
        
       | DrNosferatu wrote:
       | 1. It's a nice start, but the EU has to scale to Manhattan
       | Project levels in order to properly compete with the US and
       | China.
       | 
       | 2. A credible at-scale effort for the EU's own AI compute silicon
       | wouldn't hurt either.
       | 
       | 3. And this can only be achieved by vertical integration to
       | combat fragmentation.
        
         | fulafel wrote:
         | Good to distinguish between publicly funded research models
         | (like this one) and commercial ones (like Mistral in France).
         | What are the Chinese and US public research models like?
        
         | t43562 wrote:
         | The Germans do have some neuromorphic hardware. It might be
         | smarter to invest in that to avoid having to build a lot of new
         | power stations.
        
         | bean469 wrote:
         | > It's a nice start, but the EU has to scale to Manhattan
         | Project levels in order to properly compete with the US and
         | China.
         | 
         | Yep, the US-government sponsored, open-weight LLM is miles
         | ahead of EuroLLM
        
         | DrNosferatu wrote:
         | I propose a European AI-only "NASA" style agency that would
         | have a frontier LLM-"Apollo Program" goal. It would subcontract
         | the several blocks it needs across EU member states.
         | 
         | Would you prefer European AI sovereignty with 15% overhead
         | costs from geographic distribution, or 100% dependence on
         | Nvidia/OpenAI with zero European industrial base?
        
         | DrNosferatu wrote:
         | Allow me to elaborate,
         | 
         | EuroAI: Europe's Moonshot to AI Sovereignty
         | 
         | https://open.substack.com/pub/ifiwaspolitical/p/euroai-europ...
        
         | snek_case wrote:
         | 2. New state-funded joint venture: EuroNV, pronounced euro-
         | envy.
        
       | whimsicalism wrote:
       | Actually nuts to me the degree to which European policymakers do
       | not even begin to understand _how_ to kickstart technologically-
       | intensive industry. Anyone who has seen close-up the results of a
       | "pick the winners" grant-style approach to innovation knows what
       | will go wrong here.
       | 
       | Also funny to read this narrative of how access to the European
       | 'supercomputer' cluster is going.
       | https://x.com/levelsio/status/1981485945745788969
        
         | webdevver wrote:
         | EU grifting is so much worse than even the most brazen Trumpian
         | crypto pump n' dump.
         | 
         | Genuinely repugnant. At least the Trump admin has the decency to
         | pump everyone's 401k...
         | 
         | I'm trying to figure out why it bothers me so much. I think it's
         | because the EU are such unbelievable losers in everything they
         | do. They can't even grift, that's how useless they are. They
         | can't even steal properly. It's so undignified, and offensive to
         | the senses.
        
           | whimsicalism wrote:
           | Wouldn't go that far. EU policymakers have good intentions, I
           | believe - but ultimately are products of their environment
           | and cultural inclination.
           | 
           | The EU is such a bizarre place because they treat capital and
           | entrepreneurs with such massive distrust, but never really
           | bothered getting rid of the quasi-static entrenched
           | hierarchies from feudalism? Like I'll go to the UK or France
           | and there will just be massive swathes of land owned by the
           | nobility or 'former' nobility? Maybe start there but let your
           | high-value human capital earn a good wage?
        
             | sofixa wrote:
             | > France and there will just be massive swathes of land
             | owned by the nobility or 'former' nobility
             | 
             | Yeah, no, this isn't even remotely true.
        
               | whimsicalism wrote:
               | will cede that, you're right for France.
        
             | coolewurst3000 wrote:
             | You are wrong in that you think the hierarchies stem
             | specifically from feudalism, but you are absolutely correct
             | in that these hierarchies exist and are deeply entrenched.
             | Sweden and Germany have one of the lowest percentages of
             | self-made vs. inherited fortunes in the western world.
             | Actually some tax policies in the US enable much more
             | upward mobility, such as real estate taxation and 401k-like
             | vehicles.
        
         | deaux wrote:
         | > What's REALLY much more important though if you want to be a
         | part of the AI race and I've posted for years here with
         | @euaccofficial is to make Europe a really extremely attractive
         | place to start and run an AI business. Remove regulatory
         | obstructions and give tax discounts for startups. Let them
         | build a business first that can compete worldwide and once they
         | make enough money (let's say $100M/y), then slowly start adding
         | regulation.
         | 
         | When you talk to most EU business owners, even in tech, the
         | limiting factor isn't regulations. This being the #1 reason is
         | such a tired trope.
         | 
         | Ironically, China has in some ways a bigger regulatory burden
         | when it comes to software, as there, if the government doesn't
         | approve, the business is dead in the water. I doubt that Klarna
         | would've gotten off the ground there, for one, I could see them
         | being shut down much earlier there. In the EU only now very
         | slowly are some governments even starting to talk about some
         | weak measures around their business model. But I've never, not
         | once in my life, heard "Chinese software companies can't get
         | off the ground due to the regulatory burden".
         | 
         | The same people who clamor about the EU regulations are the
         | ones who hate on the EU for their protectionist measures
         | against US tech. Yet another bout of irony here - China's
         | software industry has flourished exactly thanks to 10 times
         | stronger protectionist measures against US tech. So has
         | Korea's, and their protectionism has never even been anywhere
         | on the China level, more in between EU and China. No, if there's
         | anything that would help, it's much _more_ tech protectionism
         | in the EU.
         | 
         | Pieter Levels is at the end of the day an influencer, not a
         | serious founder.
        
           | whimsicalism wrote:
           | > When you talk to most EU business owners, even in tech, the
           | limiting factor isn't regulations. This being the #1 reason
           | is such a tired trope.
           | 
           | Okay, what is the limiting factor? Because when I talk to EU
           | business owners (admittedly, very few) - they point to lack
           | of big EU capital markets, which is directly downstream of
           | the policy environment. And when I talk to top EU human
           | capital, they all point to the lack of competitive wages.
           | There's a real difficulty in allocating capital to talented
           | humans.
           | 
           | And, at least in Southern Europe, the income tax schedule is
           | so aggressive it's hard to justify continuing working in many
           | of these countries if you are highly talented.
           | 
           | Like, if you can tell me what the induced operator norm from
           | l_2 -> l_2 is - probably you should come to the US and work
           | at a biglab and make bank. What can you do in Portugal,
           | Italy, Spain, etc.??
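
For reference, the quantity alluded to above is the spectral norm. Assuming A is a finite real or complex matrix (the symbol A is introduced here for illustration), the operator norm induced by the Euclidean norm on both sides equals the largest singular value of A:

```latex
\|A\|_{2 \to 2}
  \;=\; \sup_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2}
  \;=\; \sigma_{\max}(A)
  \;=\; \sqrt{\lambda_{\max}\!\left(A^{*}A\right)}
```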
           | 
           | > Pieter Levels is at the end of the day an influencer, not a
           | serious founder.
           | 
           | Sure, agreed.
           | 
           | I think it is a complete misreading to point to protectionism
           | as the reason for Chinese success, but having a big unified
           | domestic market for consumers along with massive saving rates
           | and capital controls probably does help.
        
             | KaiserPro wrote:
             | Money.
             | 
             | Why work in the "europoor" countries when you can go to
             | america and earn megabucks.
        
               | miohtama wrote:
               | That's capital markets and the lack of capital markets is
               | because of not having a business-friendly environment.
               | Consumer protections are strong, pro-business policy not so
               | much. Companies like Spotify go to the US to IPO.
        
               | deaux wrote:
               | Are you saying that the other 199 non-US countries in the
               | world all have a business unfriendly environment, since
               | every one of them besides China has practically the same
               | amount of software VC funding compared to the US?
               | 
               | All of these purported EU-specific reasons completely
               | ignore that things are the same elsewhere. It's the US
               | that is the outlier.
        
             | actionfromafar wrote:
             | One fairly large factor is that even though English is much
             | more common today, you just can't operate (depending on the
             | product of course) in many countries without having
             | customer support, documentation etc in the local language.
        
             | deaux wrote:
             | > I think it is a complete misreading to point to
             | protectionism as the reason for Chinese success, but having
             | a big unified domestic market for consumers along with
             | massive saving rates and capital controls probably does
             | help.
             | 
             | Capital controls are protectionist measures, but anyway,
             | no.
             | 
             | > Okay, what is the limiting factor?
             | 
             | Let's look at which countries have a significant local
             | software industry compared to population size.
             | 
             | - China
             | 
             | - US
             | 
             | - Korea
             | 
             | - You can argue for Japan and India but that's already
             | starting to stretch.
             | 
             | - Yup, effectively nowhere else. Even in an "out of the
             | way" place like Myanmar everyone uses Meta, with a nice
             | little genocide to show for it. Sure, in Vietnam they use
             | Zalo, and other places have a few other local players. But
             | most of the famous US tech apps are dominant.
             | 
             | Is the EU the outlier here? No. _Everywhere else_ US tech
             | dominates. Meta, Netflix, Apple, Google, Uber, Spotify,
             | Microsoft, Match Group, Paypal, Amazon, and on and on. They
             | don't just dominate the EU, they dominate _the world_.
             | 
             | Except for the countries I named above, where at least
             | _some_ of the markets that US big tech competes in, instead
             | have bigger local players. And even there, guess what?
             | 
             | Their market share is almost 1:1 linearly correlated to the
             | degree of protectionism in those countries, all the way
             | from China, then Korea, then India/Japan, and then
             | everywhere else! Who woulda thought!
             | 
             | Why does Korea have much less US tech dominance than, say,
             | Germany? Despite German companies theoretically having a
             | big advantage: the German public is 100x more privacy
             | conscious than the Korean one, and much less trusting of US
             | companies.
             | 
             | I can tell you that it's not fewer regulations; Korea's GDPR
             | is much more onerous than the EU's and so are investment
             | regulations. On every single regulatory aspect, German
             | software startups have it easier. But they were never
             | protected. US tech was allowed to waltz in, dump their
             | products - that's what they did, it's hilarious how now
             | China "dumping" EVs and solar is suddenly an issue when
             | it's exactly the strategy that US tech continues to this
             | day; the AI companies are doing it right now! And the
             | Korean companies were protected. Both by the rules burden,
             | that local companies had to deal with too, along with
             | intentional protectionism.
             | 
             | When it comes to solar and EVs, we all understand that a
             | foreign country dumping their goods kills local industry.
             | It's the exact same with software.
             | 
             | But then half of HN has millions in the bank exactly thanks
             | to the above - this is where all those fat SV salaries have
             | come from - so I do get the lack of desire to understand
             | it.
        
               | whimsicalism wrote:
               | > Their market share is almost 1:1 linearly correlated to
               | the degree of protectionism in those countries
               | 
               | Seems like you actually believe this. I think our
               | starting points on reality are different enough that we
               | are not going to have a productive conversation, I wish
               | you and other Europeans the best of luck in your
               | protectionism-led growth strategy. Make sure to not
               | discuss it with any pesky macroeconomists who might lead
               | you astray. take care
        
               | deaux wrote:
               | I've provided very specific cases that directly support
               | this, you've so far provided nothing. This is a really
               | poor comment.
        
               | vanviegen wrote:
               | You seem to have accidentally left the actual content out
               | of your comment.
        
               | coolewurst3000 wrote:
               | Spotify is Swedish. Uber is irrelevant in many places in
               | the EU due to protectionism.
        
               | BDPW wrote:
               | Spotify is not a US company.
        
             | sofixa wrote:
             | > Okay, what is the limiting factor
             | 
             | A few.
             | 
             | A big part is that the EU is a collection of countries that
             | (with very few exceptions) have different languages and
             | laws. For a company to serve Spain and France, for
             | instance, it would need to translate everything, hire local
             | lawyers and customer support agents. Considering the much
             | smaller size of the countries (biggest one is 70 million vs
             | 330 million in the US), the opportunity for "unlimited"
             | growth is limited.
             | 
             | This also rebounds in the fact that when an American
             | company makes it big, they have the resources to flood
             | other EU markets and be cheaper/better than the local
             | competition due to economies of scale and money based on
             | their big successful US market. A French company making it
             | big is still small compared to a US equivalent.
             | 
             | Then, there's the capital markets, no denying that. The
             | money being thrown around the US is like nowhere else on
             | the planet. Some of it definitely a bubble / unrealistic,
             | but that doesn't matter. But _in part_ it's because of the
             | size of the total potential market that this is justified.
             | 
             | Education / national mythology also plays a part, I think
             | (this is pure conjecture now). In the US, the "American
             | Dream", "everyone can make it" etc is heavily ingrained. It
             | propagates through the world with the help of Hollywood and
             | other American cultural exports. In most EU countries,
             | there isn't such a heavy emphasis on independence and
             | "pulling yourself up by your bootstraps". "Hustle culture"
             | isn't a thing. So for most people, it isn't something that
             | comes naturally to them to start a company and work 100
             | hour weeks to be big and rich and successful and famous.
             | 
             | That's not to say there aren't such people, I went to 42
             | and have been to Station F and know some people in that
             | universe. A decent proportion of my classmates wanted to
             | make their startup and make it big, and some did end up
             | starting their own companies.
        
               | deaux wrote:
               | > This also rebounds in the fact that when an American
               | company makes it big, they have the resources to flood
               | other EU markets and be cheaper/better than the local
               | competition due to economies of scale and money based on
               | their big successful US market. A French company making
               | it big is still small compared to a US equivalent.
               | 
               | Ding ding ding! When China does it with solar and EVs we
               | call it "dumping". When Uber, OpenAI and Anthropic do it,
               | that term is never ever used. VC-funded US tech dumps
               | harder than any Chinese industry ever has.
        
               | carlosjobim wrote:
               | > Considering the much smaller size of the countries
               | (biggest one is 70 million vs 330 million in the US), the
               | opportunity for "unlimited" growth is limited.
               | 
               | If you manage to get 10 million customers, your business
               | is already successful on a gigantic scale, and you should
               | have all the know-how in taking on the world. The success
               | of other people is rarely the reason why you are failing
               | in your own life. Start somewhere, do something.
               | 
               | > The money being thrown around the US is like nowhere
               | else on the planet.
               | 
               | That's true and it's awesome. In Europe money is only
               | thrown to real estate owners and any enterprising people
               | with a dream are cordially invited to fucking forget
               | about it, shut up, and fall back in line. Even if they
               | already have a proven track record. They take their idea
               | to the United States and are treated incredibly well in
               | comparison. Even if their business will only be a niche
               | business with limited reach, like 99% of businesses.
        
           | clickety_clack wrote:
           | It's probably the people who didn't start a business in the
           | EU that you want to talk to. Like, I'm European, but I
           | started my company in the US because everything is so much
           | easier here.
        
             | lukan wrote:
             | What would you want to see changed to consider coming back?
        
               | clickety_clack wrote:
               | When I got here, I realized that things are so much
               | better here that the only thing that could get me back to
               | Europe is a decision not to renew my visa.
        
             | sofixa wrote:
             | > but I started my company in the US because everything is
             | so much easier here
             | 
             | Which part is easier? That you have 50 different states
             | with slightly varying laws to consider (e.g. Californian
             | Data protection)? That you have a byzantine system of
             | "benefits" to choose and manage?
             | 
             | And compared to where? Germany or Estonia or Sweden or
             | Spain? The complexities will vary wildly depending on the
             | country (kind of like in the US, where lots of companies
             | pick the state to base themselves in based on the
             | combination of favourable laws and precedents and taxes).
        
               | whimsicalism wrote:
               | "That you have 50 different states with slightly varying
               | laws to consider (e.g. Californian Data protection)?"
               | 
               | there are certain sentences you can just tell would never
               | be written by an American lol
        
               | sofixa wrote:
               | Got me, I'm not American, but isn't it true?
               | 
               | California Consumer Privacy Act is a thing you need to
               | take into account for Californian customers.
               | 
               | Illinois has a Biometric Privacy Act.
               | 
               | And who knows what Wyoming or South Dakota or Oregon have
               | that you might take into account if your business falls
               | under any of them.
        
               | whimsicalism wrote:
               | we might be somewhat trending in this direction, but the
               | reality is largely that the US states are pretty
               | identical and have very similar laws on the books. the
               | federal government is in charge of commerce usually.
               | 
               | most laws like CCPA also have some threshold where you
               | already need to be pretty successful for it to apply to
               | you.
               | 
               | for some select industries (biometrics & healthcare), yes
               | you have a patchwork of laws.
        
             | deaux wrote:
             | Where in Europe and where in the US? You probably started
             | one in the easiest US state to do so. Did you try starting
             | one in the easiest EU state? Otherwise we already can't
             | take things very seriously.
             | 
             | Secondly, what's easier besides VC funding? If it's VC
             | funding, the disparity there has nothing to do with
             | regulations - guess how much VC funding the non-EU rest of
             | the world gets.
        
               | clickety_clack wrote:
               | I'm actually bootstrapping, so the VC situation isn't
               | relevant to me.
               | 
               | It's a distant memory to me now, I'm building a company
               | and so much has happened that the details of this
               | decision have faded away. But, between the AI act and
               | GDPR, there's a set of potential traps laid out for you
               | to step into, along with reams of paperwork. All that
               | requires lawyers and compliance consultants to help you
               | figure it out, and that's way too much for a fledgling
               | startup.
               | 
               | I think it said it all that the AI regulations were
               | written before there was really anyone to regulate. Why
               | would I want to pour my heart and soul into a system
               | that's geared to find ways to stop me from building?
               | 
               | Anyway, it's no longer relevant to me: I'm gone and I
               | don't have to worry about it anymore.
        
               | neoromantique wrote:
               | Hi! EU Resident here, if anything, I'd want EU
               | protections to apply even more to US companies than they
               | do now.
               | 
               | I don't want to exchange my freedoms for your shareholder
               | value, thank you.
        
           | pier25 wrote:
           | > _When you talk to most EU business owners, even in tech,
           | the limiting factor isn't regulations._
           | 
           | I have a tech startup in Estonia and I agree. To me the
           | biggest limiting factor is lack of funding.
        
             | moffkalast wrote:
             | Yep, VCs don't exist here. Plus the absurd starting costs,
             | it's like what, 20k to set up a GmbH?
        
               | troupo wrote:
               | 2.5k EUR in starting capital, and two founders to start
               | a limited liability company (AB) in Sweden, and a 240 EUR
               | processing fee: https://verksamt.se/starta-foretag/valj-
               | foretagsform/aktiebo...
               | 
               | And you register online.
        
               | pier25 wrote:
               | Depends on the country.
               | 
               | Opening a company in Estonia is very cheap but in Spain
               | the manager/CEO needs to be an "autonomo" (like a self-
               | employed tax status). This costs thousands of Euros per
               | year. Something like 2,400-30,000 Euros per year, every
               | year, forever.
        
               | troupo wrote:
               | And that's probably one of the big obstacles in the EU:
               | there's no common ground for these things. At least this
               | will hopefully be addressed:
               | https://www.reuters.com/business/eu-propose-uniform-
               | rules-st...
        
               | vanviegen wrote:
               | What does it matter that the rules for establishing
               | differ per country? I'm only founding in one of them.
               | 
               | The article is unclear, but is probably referring to
               | making it easier for startups to offer products in other
               | EU countries.
        
               | troupo wrote:
               | The idea is to establish common rules to make it easier
               | to register and move startups between countries, among
               | other things.
               | 
               | It's in very early stages, so info is very scattered.
               | More info, for example, here:
               | https://www.loyensloeff.com/insights/news--
               | events/news/the-2...
        
           | greg_V wrote:
           | Tbh, a lot of EU protectionism vs. US tech seems not to keep
           | the competition out. In fact, with the amount of free press
           | US startups get and the size of their coffers, they can
           | simply roll over the local competition in EU markets most of
           | the time.
           | 
           | What it's terribly good at is adding burdens that the US
           | giants don't face early on, slowing down the early growth
           | between 28 fragmented markets. I don't know specifically
           | about how China works, but the question is proving product-
           | market fit, and for that, you need a lot of users fast.
           | 
           | In the EU, it's a different battle country to country as the
           | media environment, the markets, the regulation etc. are all
           | fractured.
        
         | dzikimarian wrote:
         | While the grant process in the EU isn't fun, I think Levels has a
         | bit of an ego issue. He mentioned that if he had issues like
         | that on e.g. X, he would see Elon himself in the replies.
         | 
         | While he is great at converting his influencer status to income
         | in his micro-SaaS projects, I don't think running ad-fueled
         | browser games on a state-sponsored supercomputer should really
         | be the aim of these grant programs.
        
           | whimsicalism wrote:
           | I'm actually no fan of his, so that's fine. That said, I went
           | to the actual website he was talking about (I'm also an EU
           | citizen) and in this case it is exactly as described and
           | bordering on comical.
        
             | troupo wrote:
             | It's not even close to how he described it.
        
               | drexlspivey wrote:
               | There's a screen recording at the bottom
        
               | troupo wrote:
               | I have the same answer as here:
               | https://news.ycombinator.com/item?id=45735738
        
           | alecco wrote:
           | He is 100% right on this one, speaking from personal experience
           | trying to figure out the EU. Lawyer bureaucrats manage funds
           | behind red tape clearly meant to be for their pals.
           | 
           | All this while the EU is running out of funds and in a
           | process of de-industrialization. There should be an
           | independent corruption investigation on Brussels.
        
             | dzikimarian wrote:
             | I took part in applications for EU grants a few times and
             | our company group did it many times over the years.
             | 
             | It's bureaucracy, often bordering on stupidity. You may
             | need advisors to navigate all their forms & processes. But
             | it certainly isn't a "pals-only" type of deal.
             | 
             | On the other hand - is it harder than getting VC funding?
             | For seasoned founder with reputation - probably. For fresh
             | startup - probably not.
        
               | whimsicalism wrote:
               | > For fresh startup - probably not.
               | 
               | highly doubt, the whole thing about the success of the US
               | west coast is that they are & were willing to fund unproven
               | upstarts.
        
               | array_key_first wrote:
               | Right but if we do this with public funds then the
               | narrative shifts to "OMG the EU is so corrupt and stupid,
               | look, they're pouring taxpayer dollars into unproven
               | stuff! They're deindustrializing!!"
               | 
               | The point being that, as soon as public dollars are on
               | the table, people expect perfection. Anything less is
               | waste, fraud, and abuse.
               | 
               | There's literally no winning. Want to make sure the money
               | is allocated right? Bureaucracy. Want to not do that?
               | Waste, fraud, and abuse.
        
               | carlosjobim wrote:
               | The winning move is that governments should do government
               | stuff and private capital should do private capital
               | stuff. Startups belong to the latter.
        
               | sealeck wrote:
               | > Startups belong to the latter.
               | 
               | Except that Apple, Intel, Tesla, etc have all received US
               | government investment [1]. TSMC is a product of the
               | Taiwanese state! Government investment can be done well,
               | and seeds excellent companies.
               | 
               | [1]: https://www.sba.gov/blog/2024/2024-02/white-house-
               | sba-announ...
        
               | carlosjobim wrote:
               | It doesn't matter if government funded startups have been
               | successful. It's not the government's job to provide
               | capital to high risk ventures. They should provide public
               | services for the people and regulate the private sector
               | according to the interest of the people.
        
               | jacobgorm wrote:
               | Denmark has a large hearing aids industry due to lots of
               | government funding for hearing aids, and a large wind
               | turbine industry due to funding for wind farms. So
               | stimulating demand can work to build or strengthen an
               | industry, but what Denmark and EU are doing with GPUs is
               | stimulating supply in Europe and demand in the US. I
               | would be surprised if that does not end up strengthening
               | US and not EU industry.
        
               | radarsat1 wrote:
               | That's exactly the problem in Europe though. It's quite
               | the opposite here.
        
               | alecco wrote:
               | Someone told me I needed to hire some expensive law firm
               | in Brussels. See:
               | 
               | https://www.politico.eu/article/ombudsman-slams-
               | commission-f...
        
               | jll29 wrote:
               | Reviewer (Scientific Expert) for the EU (since 2009)
               | here.
               | 
               | The probability of getting a Horizon Europe grant
               | allegedly (not official stats) is about 8.5% according to
               | some friends, which may seem low. You need to write 70
               | pages following a Word template and the key goal is to
               | cover answers to a large number of questions. Each
               | proposal gets various grades across a range of
               | dimensions, which get added up and if you obtain at least
               | 13 out of a possible 15 points, you are eligible to get
               | funded, read: "You will get funded if there is enough
               | money." Often, there are several proposals that justly
               | achieve 15/15, and because of that, many proposals that
               | have 14 points and all proposals that have fewer may not
               | get funded, simply because there just is not enough total
               | funding available to fund all the technically eligible
               | proposals. Having judged many proposals in AI / ML /
               | search / "big data" / language technology etc., I
               | recommend optimizing recall, i.e. aspiring to completeness.
               | 
               | The application process is not easy, but you can get
               | help: there are support agencies in each member country,
               | free online Webinars to help, hotline help desks as well
               | as an ecosystem of paid consultants that typically charge
               | about 3kEUR to vet a proposal for you if you need that
               | kind of service (I never used it).
               | 
               | The process is neutral and conducted professionally and
               | with external oversight (consultants are hired as
               | "rapporteurs" that report on process/procedural integrity
               | in addition to the actual reviewers). I value the
               | research officers of the EC as people of high competency,
               | integrity and motivation (research money is taxpayers'
               | money so it should be spent carefully).
               | 
               | In comparison, VC (and even more so business angel)
               | funding is achievable with much less formal apparatus,
               | often a short business plan and a convincing slide deck
               | and demo can get people to a partner meeting if the time
               | is right. But the criteria and process are much
               | different, and ideas ready for public research grants are
               | typically too early for VCs (but the EC wants to foster
               | the creation of VC-funded startups resulting from the
               | disseminated research).
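
The selection rule jll29 describes above (grades per dimension added up, a threshold of 13 out of 15, then funding "if there is enough money") can be sketched roughly as below. This is only an illustration under assumptions that are not in the comment: three dimensions graded 0-5 each, funding handed out in strict score order until the next proposal no longer fits, and made-up names, costs and budget.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    name: str                  # hypothetical label
    scores: tuple[float, ...]  # one grade per evaluation dimension (assumed 0-5 each)
    cost: float                # requested funding, arbitrary units (assumption)

    @property
    def total(self) -> float:
        return sum(self.scores)


def select_funded(proposals, budget, threshold=13.0):
    """Return the names of funded proposals.

    Eligible proposals (total >= threshold) are ranked best score first
    and funded in that order until the next one does not fit the budget.
    """
    eligible = [p for p in proposals if p.total >= threshold]
    # sorted() is stable, so proposals with equal totals keep input order.
    ranked = sorted(eligible, key=lambda p: p.total, reverse=True)
    funded = []
    remaining = budget
    for p in ranked:
        if p.cost > remaining:
            break  # money has run out; lower-ranked proposals go unfunded
        funded.append(p.name)
        remaining -= p.cost
    return funded


# Made-up example data: A scores 15, B 14, C 13, D 12.
proposals = [
    Proposal("A", (5, 5, 5), 4.0),
    Proposal("B", (5, 5, 4), 4.0),
    Proposal("C", (5, 4, 4), 4.0),
    Proposal("D", (4, 4, 4), 4.0),
]
funded = select_funded(proposals, budget=8.0)
print(funded)
```

In this toy run C clears the 13-point threshold but still goes unfunded because A and B use up the budget, which is the situation the comment describes for proposals that are eligible on paper.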
        
               | alecco wrote:
               | Can you confirm (or not) the mandatory female co-founder?
               | I could swear I read it. Or could it be another EU fund?
        
               | jacobgorm wrote:
               | Someone should build a startup that uses the EuroLLM to
               | generate EU funding proposals.
        
             | bjourne wrote:
             | Of course there is red tape. EU funding comes from taxpayer
             | money and we want it to be spent wisely. The red tape is
             | precisely to prevent it from being funneled to pals. EU has
             | funded quite a few free software projects so it's not like
             | the red tape is an insurmountable burden:
             | https://www.ri.se/en/news/blog/europes-digital-future-
             | spells...
        
               | notahacker wrote:
               | I'd also say that their grants _aren't_ unusually
               | burdensome and grantmaking is arm's length compared with a
               | lot of other bodies.
               | 
               | Yes, some of the questions are weird, but I'd really
               | rather write a bit confirming that the AI system being
               | developed isn't going to be racist or Skynet than jump
               | through some other hoops that exist (and that absolutely
               | includes VC due diligence). The actual biggest issue with
               | European funds is they get way more competent
               | applications than they can fund anyway.
        
         | tinco wrote:
         | Yeah no, it's just not how it works. They're trying to support
         | fundamental research and they have limited resources to
         | accomplish that. Some random dude who wants to build a company
         | that generates pretty AI pictures is just not the target
         | audience, and he rightly got rejected.
         | 
         | And frankly, the dream scenario that Pieter describes where he
         | somehow would qualify for these resources also wouldn't help
         | kickstart the tech industry, and it's also not how it works in
         | the states.
         | 
         | What does help, and what European governments (at least the one
         | in The Netherlands that Pieter is from) actually do, is more
         | funding for startups. If you're a startup founder in NL almost
         | every angel you talk to has a matched funding deal with the
         | government. That's such a smart way of keeping up with the US.
         | Do you think US startups get free compute from the government?
         | They don't even get subsidies most of the time. What they get
         | is better funding because there's more capital available, and
         | helping investors with that is exactly how you solve that.
        
           | logifail wrote:
           | > What does help, and what European governments (at least the
           | one in The Netherlands that Pieter is from) actually do, is
           | more funding for startups. If you're a startup founder in NL
           | almost every angel you talk to has a matched funding deal
           | with the government. That's such a smart way of keeping up
           | with the US.
           | 
           | Does government offering matched funding to investors
           | actually help startups who are struggling to find (any)
           | funding? If a startup can't find (any) funding, matching is
           | irrelevant.
           | 
           | > Do you think US startups get free compute from the
           | government? They don't even get subsidies most of the time.
           | What they get is better funding because there's more capital
           | available, and helping investors with that is exactly how you
           | solve that.
           | 
           | Umm. I'm not really convinced that the political elites in
           | Europe understand how to do any of this stuff well.
           | 
           | See also: https://www.eib.org/en/publications/online/all/the-
           | scale-up-...
        
           | whimsicalism wrote:
           | I don't think what you're saying is inconsistent with what
           | I'm saying. I think you are making a big deal out of the
           | difference between state investment funds and subsidized GPUs
           | but I think they basically work by similar mechanisms.
        
         | softwaredoug wrote:
         | Is the point of these policies to pick winners? Or to upskill
         | the creators and stimulate the economy by giving possible
         | entrepreneurs experience Europeans can't get in big tech?
         | 
         | In the US, some ex-Googler might found a startup. Europe
         | doesn't have the equivalent of FAANG. (Europe-wide companies
         | are not quite as easy as US-wide)
         | 
         | Even if the super computer itself "fails", is the goal actually
         | the secondary impacts to the economy?
         | 
         | (And in the US, we do our own fair share of picking winners /
         | losers, especially in the current regime)
        
         | troupo wrote:
         | Levels is engagement farming. Instead of uncritically reposting
         | him you could've gone ahead and read what the cluster is for:
         | https://x.com/dmitriid/status/1982927767286231403
         | 
         | Cluster: for public benefit, cutting edge research in biotech,
         | medical, robotics.
         | 
         | Levels: I want to create AI photos of people for my AI Slop
         | startup
        
           | whimsicalism wrote:
           | > Cluster: for public benefit, cutting edge research in
           | biotech, medical, robotics.
           | 
           | That's not what the quoted paragraph says and you can read
           | the whole release if you want: https://ec.europa.eu/commissio
           | n/presscorner/detail/en/ip_25_...
        
             | troupo wrote:
             | I literally quoted the paragraph from this link in the
             | tweet I provided: _Edit_: lol, I didn't, I quoted it from
             | a policy document, not from the press release. However, my
             | point stands:
             | 
             | --- start quote ---
             | 
             | Apply AI Strategy
             | 
             | The Apply AI Strategy aims to harness AI's transformative
             | potential by driving adoption of AI across strategic and
             | public sectors including healthcare, pharmaceuticals,
             | energy, mobility, manufacturing, construction, agri-food,
             | defence, communications and culture. It will also support
             | small and medium-sized enterprises (SMEs) with their
             | specific needs and help Industries integrate AI into their
             | operations.
             | 
             | --- end quote ---
             | 
             | I also quoted a paragraph from a document I will find when
             | I'm not on mobile.
             | 
             | Levels literally wants to train AI Slop:
             | https://x.com/levelsio/status/1981499900266193028
             | 
             | --- start quote ---
             | 
             | Train a foundational model for AI photos of people
             | 
             | --- end quote ---
        
               | IshKebab wrote:
               | Seems like your quote was very misleading to me, so no
               | your point doesn't stand.
        
               | troupo wrote:
               | > Seems like your quote was very misleading to me, so no
               | your point doesn't stand
               | 
               | My quote: Cluster: for public benefit, cutting edge
               | research in biotech, medical, robotics.
               | 
               | Literal quote from your link: The Apply AI Strategy aims
               | to harness AI's transformative potential by driving
               | adoption of AI across strategic and public sectors
               | including healthcare, pharmaceuticals, energy, mobility,
               | manufacturing, construction, agri-food, defence,
               | communications and culture.
               | 
               | You: your quote was misleading.
               | 
               | I'm sorry, I don't have the time or the patience with
               | willfully ignorant and blind people getting their
               | interpretations from AI slop engagement farmers.
               | 
               | Adieu
        
               | IshKebab wrote:
               | Yeah you just demonstrated how it was misleading - by
               | omitting half the categories, especially communications
               | and culture.
               | 
               | > I'm sorry, I don't have the time or the patience with
               | willfully ignorant and blind people getting their
               | interpretations from AI slop engagement farmers.
               | 
               | Riiight.
        
           | fvdessen wrote:
           | Unfortunately the AI Slop is probably the most effective way
           | to fund AI research right now
        
             | tsimionescu wrote:
             | But the point here isn't to fund AI research, it is to use
             | AI to benefit concrete fields.
        
             | troupo wrote:
             | By funding AI slop, you're funding AI slop, not AI
             | research, or, quote, "drive adoption of AI across strategic
             | and public sectors including healthcare, pharmaceuticals,
             | energy, mobility, manufacturing, construction, agri-food,
             | defence, communications and culture"
        
         | antman wrote:
         | What are the effects of a pick-the-winner strategy? Sounds
         | intriguing
        
         | saubeidl wrote:
         | This guy spreads FUD about the "unelected commission". What a
         | loon.
        
           | cbeach wrote:
           | The EU Commission is appointed, not elected. Only the
           | Parliament (MEPs) is elected.
           | 
           | What's worse, the parliament cannot originate law. Only the
           | unelected Commission can do so. And they can do it behind
           | closed doors. This is a setup that's ripe for corruption.
        
             | vanviegen wrote:
             | It's true that they are appointed.
             | 
             | However, they're appointed by the European Council (the heads
             | of state or government, most of them elected, some appointed
             | by a national parliament), and approved by the (elected)
             | European Parliament.
             | 
             | At the cost of some transparency, this does make it
             | possible to select a bit more for management skills instead
             | of just campaigning skills.
        
             | saubeidl wrote:
             | The Commission is appointed by elected officials. That's
             | the same way the US presidency works. It's also how the UK
             | PM role works, or any minister in pretty much any
             | democratic government. All of those are still referred to
             | as "elected" in common tongue.
        
       | mezod wrote:
       | Of course Catalan isn't in the list. 10 million speakers that
       | don't matter to the European Union. EU likes our productivity but
       | squanders our rights. We are 2nd class citizens.
       | 
       | Now let's wait for the people saying "Spain" could change this.
       | Hypocrites.
       | 
       | Cultural genocide at its best.
        
         | whimsicalism wrote:
         | yeah best to lean in more on national and linguistic
         | fragmentation, diversity has always been one of the EU's
         | strengths
        
           | mezod wrote:
           | if that's the argument, let's drop all the languages and
           | focus on English :)
        
             | jimbob45 wrote:
             | You've got to pick one as a lingua franca. English is
             | already popular but Spanish, French, or Esperanto would all
             | work just fine.
        
         | ks2048 wrote:
         | Catalan is included. It's called one of the "11 additional
         | languages" in the paper.
        
       | fulafel wrote:
       | See also Apertus: https://www.swiss-ai.org/apertus
        
       | sherinjosephroy wrote:
       | That's a cool idea -- training a multilingual model like that is
       | ambitious. But I'm curious how well it'll actually handle smaller
       | EU languages compared to English or French. If it truly nails
       | those, that's a big win for accessibility.
        
         | pembrook wrote:
         | All the models from all the big providers (even the Chinese
         | models!) support all of these languages already.
         | 
         | The big win for accessibility has already been won...3 years
         | ago.
        
           | viktorcode wrote:
           | I asked a Finnish person how good an answer about the
           | language example from ChatGPT was. It turned out to be a
           | hallucination, confident-sounding nonsense.
           | 
           | The quality of internet-trained models degrades very fast as
           | the amount of language material shrinks.
        
       | KronisLV wrote:
       | Here's the models: https://huggingface.co/utter-project/models
       | 
       | I used the 9B Instruct version, from the small models, it was the
       | one with the best Latvian knowledge out there, bar none. GPT-OSS
       | 20B and Qwen3 30B A3B and similar ones weren't even close.
       | 
       | That said, the model itself was a little bit dumb and not
       | something you'd really use for programming/autocomplete or tool
       | calling or anything like that, which also presented some problems
       | - even for processing text, if you need RAG or tool server calls,
       | you need to use something like Qwen3 for the actual logic and
       | then pass the contents to EuroLLM for translation/formatting with
       | the instructions, at which point your n8n workflow looks a bit
       | messy and also you have to run those two models instead of only
       | one.
       | 
       | Meanwhile, the best cloud model for Latvian that I've found so
       | far was Google Gemini 2.5 Pro, but obviously can't use cloud
       | models in certain on-prem use cases.
        
         | jim180 wrote:
         | If I ask something in Lithuanian, EuroLLM will reply in Latvian
         | lol.
         | 
         | I have to specifically tell it something like this: "do you know
         | the Lithuanian language", then it starts replying in Lithuanian
        
           | sublimefire wrote:
           | It seems there is some weird grouping of the language data
           | which the LLM cannot distinguish well. I wonder if it is the
           | same for other similar languages like Scandinavian or West
           | Slavic ones
        
       | Steen3S wrote:
       | If multi-lang is the goal, why not translate the output of the
       | big labs?
        
         | sublimefire wrote:
         | Surely that would need to be both input and output. But even
         | then you could easily get lost in translation as the intent in
         | one language might mean a slightly different thing in another.
         | Thus you could get subpar results.
        
         | layer8 wrote:
         | Because there is always something lost in translation.
        
       | bogtog wrote:
       | They report benchmarks on the huggingface page
       | (https://huggingface.co/utter-project/EuroLLM-9B)
       | 
       | They almost exclusively compare their model to prior models from
       | 2024 or older and brag about "results comparable to Gemma-2-9B".
       | I'm not sure what I expected. The eurollm.io homepage states
       | "EuroLLM outperforms similar-sized models", which just seems like
       | a lie for all practical purposes
       | 
       | An overly charitable interpretation is that EuroLLM isn't a
       | reasoning model and has minimal post-training, so they sought out
       | comparisons to such models (they're still ignoring reasoning
       | models that have non-reasoning modes)
        
         | aeontech wrote:
         | > They almost exclusively compare their model to prior models
         | from 2024
         | 
         | As another comment here noted, the title is missing (2024) -
         | this model was released almost a year ago, last December, so
         | it's not surprising that those are the models they compare to.
        
       | extraduder_ire wrote:
       | From the EuroLLM-9B page on Hugging Face:
       | 
       | >You need to agree to share your contact information to access
       | this model
       | 
       | Is this common? I've never seen it on the site before, and it
       | isn't on the smaller model. What are they collecting this
       | information for?
        
         | ks2048 wrote:
         | I'm not sure which models require this and why, but I've come
         | across it, e.g. the Llama models:
         | https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct
        
       | geretnal wrote:
       | Finally!
        
       | sireat wrote:
       | It is interesting how much traction this 9B model is getting,
       | which is good.
       | 
       | Still, two months earlier a 30B-parameter model covering 19
       | European languages got almost no mention:
       | 
       | https://huggingface.co/TildeAI/TildeOpen-30b
       | 
       | Mind you, that is another open model that is begging for
       | fine-tuning (it is not very good out of the box).
        
       | websku wrote:
       | I'm looking to try this for ActorDO
        
       | dostick wrote:
       | What good does it do to include only the official languages? For
       | example, there's no Russian, while there are now at least 8
       | million ethnic Russians living in Europe.
        
         | imcritic wrote:
         | Today's Russians are 1935's Jews: Nazis want to cancel Russians
         | and everything Russian as much as possible.
        
           | isodev wrote:
           | Off topic, but it's absolutely stunning how Russia once fought
           | the Nazis and now Russia are the Nazis.
        
             | Ylpertnodi wrote:
             | I thought it was Ukraine?
        
             | notahacker wrote:
             | tbf, the USSR fought the Nazis mainly because they didn't
             | have much choice after Nazis turned on them a little while
             | after they'd teamed up with those ideological enemies to
             | invade Poland, so it's not like they hadn't put the effort
             | into being on the wrong side of history :)
        
               | isodev wrote:
               | Indeed, we had a history teacher who used to joke about
               | Russia being a "historical bully" in every age since
               | they've been on the map.
        
           | simion314 wrote:
           | Oh, the typical Ruzzian victim complex. Their brains can't
           | understand why all their neighbors and "brother Slavs" hate
           | them. Brainwashing for generations made them think in an
           | unnatural logic where you need to negate anything a Ruzzian
           | says, and then you increase the probability 100 times that
           | it is the truth.
           | 
           | Ruzzian = a Russian Zed patriot; we use this notation to
           | acknowledge that there still exists a small percentage of
           | educated Russians that are not Ruscists.
        
         | ks2048 wrote:
         | From the paper:
         | 
         | As the aim of EuroLLM is to provide EU citizens with powerful
         | and useful AI tools, it is critical that the model can also
         | translate and answer questions in other European and non-
         | European languages. With this in mind, we added support for 11
         | additional languages (Arabic, Catalan, Chinese, Galician,
         | Hindi, Japanese, Korean, Norwegian, Russian, Turkish, and
         | Ukrainian).
        
         | layer8 wrote:
         | Perfect is the enemy of the good.
        
       | wildredkraut wrote:
       | Wow, this site, logo and everything are so ugly. But the
       | fax-styled photos fit well with Europe's deficit.
        
       | fodkodrasz wrote:
       | Excellent goal, I hope they make it a success!
        
       | memet_rush wrote:
       | Hopefully Albanian is added one day!
        
       | ks2048 wrote:
       | Their home page has a link labeled "Technical Report for
       | EuroLLM", but it goes to the same page as their other link, the
       | release article on Hugging Face.
       | 
       | I suppose that's a typo and I found a technical report here:
       | https://arxiv.org/abs/2506.04079
        
       | johnjames87 wrote:
       | I prefer proprietary LLMs that are actually good products -
       | byproducts of free market competition (capitalism), instead of
       | products created from govt initiatives that lead nowhere (good).
        
       | zoobab wrote:
       | Can we add Gaumais to the list? I asked Llama3 how to translate
       | French into Gaumais, and it was pretty good at it.
       | 
       | https://fr.wikipedia.org/wiki/Gaumais
        
         | Ylpertnodi wrote:
         | All the different Italian dialects, patois in French,
         | Swabian...
        
       | cess11 wrote:
       | In this vein there's also the recent Swiss Apertus.
       | 
       | https://www.swiss-ai.org/apertus
        
       | adt wrote:
       | The EuroLLM-9B model release is from Dec/2024, and scores just
       | above random chance for benchmarks like MMLU-Pro (17.6%, random
       | chance is 10%).
       | 
       | Comparison with similar EU models + 600 other highlights:
       | 
       | https://lifearchitect.ai/models-table/
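       | 
       | To make the "just above random chance" arithmetic concrete,
       | here is a minimal sketch. It assumes every question has 10
       | answer options (which is what the 10% figure implies) and
       | takes the 17.6% score from this comment. The helper names
       | and the chance-corrected score are illustrative, not
       | something the model card reports.

```python
from fractions import Fraction


def chance_accuracy(num_options: int) -> Fraction:
    """Expected accuracy of uniform random guessing on one multiple-choice item."""
    if num_options < 1:
        raise ValueError("num_options must be at least 1")
    return Fraction(1, num_options)


def margin_over_chance(score: float, num_options: int) -> float:
    """Absolute gap between a reported accuracy and the random-guess baseline."""
    return score - float(chance_accuracy(num_options))


def chance_corrected(score: float, num_options: int) -> float:
    """Rescale accuracy so that random guessing maps to 0 and a perfect score to 1."""
    chance = float(chance_accuracy(num_options))
    return (score - chance) / (1.0 - chance)


if __name__ == "__main__":
    reported = 0.176  # MMLU-Pro score quoted in the comment
    options = 10      # assumption: 10 answer options per question
    print(f"chance baseline:  {float(chance_accuracy(options)):.3f}")
    print(f"margin over it:   {margin_over_chance(reported, options):.3f}")
    print(f"chance-corrected: {chance_corrected(reported, options):.3f}")
```

       | Under those assumptions the model sits 7.6 points above the
       | baseline, which is about 8% of the way from guessing to a
       | perfect score.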
        
       | danielam wrote:
       | Curiously, just came across this paper [0].
       | 
       | [0] https://arxiv.org/abs/2503.01996
        
       | ph4evers wrote:
       | How does it compare to Mistral's model?
        
       | trilogic wrote:
       | Great job, thank you.
       | 
       | We support your work and offer backup and distribution. Here is
       | a copy just in case:
       | https://hugston.com/uploads/llm_models/EuroLLM-22B-Instruct-...
        
       | supermatt wrote:
       | > It is fully open source and available via Hugging Face.
       | 
       | This model was released in 2024, and I couldn't find any links to
       | the training data - is it just an open weights model?
        
       | rmoriz wrote:
       | Maybe we can call it "open weights" and not open source?
        
       | Zufriedenheit wrote:
       | EU officials should create an environment where abundant private
       | companies can afford to put out many great open models instead of
       | funding some selected individuals with taxpayer money.
        
       | hebejebelus wrote:
       | Some cursory clicking about didn't reveal to me the actual corpus
       | they used, only that it is several trillion tokens 'divided
       | across the languages'. I'm curious mainly because for Irish (and
       | some other similarly endangered languages on the list), any
       | large corpus typically comes from legal/governmental texts that
       | are required to be translated. There must surely be only a
       | relatively tiny amount of colloquial Irish in the corpus. It
       | would be interesting to see some evals in each language,
       | particularly with native speakers.
       | 
       | I think LLMs may be on the whole very positive for endangered
       | languages such as Irish, but before it becomes positive I think
       | there's an amount of danger to be navigated (see the Scots
       | Gaelic Wikipedia drama, for example).
       | 
       | In any case I think this is a great initiative.
        
       ___________________________________________________________________
       (page generated 2025-10-28 23:00 UTC)