[HN Gopher] EuroLLM: LLM made in Europe built to support all 24 ...
___________________________________________________________________
EuroLLM: LLM made in Europe built to support all 24 official EU
languages
Author : NotInOurNames
Score : 471 points
Date : 2025-10-28 14:58 UTC (8 hours ago)
(HTM) web link (eurollm.io)
(TXT) w3m dump (eurollm.io)
| adzm wrote:
| For those curious, the 24 official languages are Bulgarian,
| Croatian, Czech, Danish, Dutch, English, Estonian, Finnish,
| French, German, Greek, Hungarian, Irish, Italian, Latvian,
| Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak,
| Slovenian, Spanish, and Swedish.
|
| Maltese, interestingly, is the only Afro-Asiatic derived
| language.
|
| Hungarian, Finnish, and Estonian are the three Uralic languages.
|
| All the others are Indo-European, Greek being the only Hellenic
| one, Irish the only Celtic, the rest are Baltic, Slavic, Italic,
| or Germanic.
|
| (I originally used the term Balto-Slavic, though I was unaware of
| some of the connotations of that term until just now. Baltic and
| Slavic do share a common origin, but that was a very very long
| time ago)
| purrcat259 wrote:
| I read, write and speak Maltese, AMA if you are curious about
| the language.
| ebb_earl_co wrote:
| What is the name of Maltese in Maltese? Like "el espanol" in
| Spanish, it's neat to know what languages call themselves
| kridsdale3 wrote:
| 'ish' is a pretty universal english suffix. So Spanish is
| just "espan-ish".
| ggsp wrote:
| Wikipedia says it's "Malti"
| arbuge wrote:
| Il-Malti to be precise. Il- means "the" and changes its
| meaning to that of the language. Malti alone would mean a
| Maltese person.
|
| Source: I'm also Maltese.
| jll29 wrote:
| The "Il" in Il-Malti is like "al" in Arabic, which
| Maltese is closely related to as was pointed out above.
|
| Arabic (language): al-'arabiyyah (العربية).
| kwk1 wrote:
| A term for that concept, by the way, is "endonym":
|
| https://en.wikipedia.org/wiki/Endonym_and_exonym
| Raed667 wrote:
| Tunisians claim they can understand Maltese with minimum
| effort, is it reciprocal? How close is Maltese to arabic /
| tunisian dialect ?
| arbuge wrote:
| Not sure which Tunisians are claiming this but they'd
| definitely need a lot more than minimum effort. Maltese
| split off from Arabic around 1k years ago. The two
| languages sound pretty different, and are written with
| different alphabets.
| cenamus wrote:
| Also lots of influence from Italian and English.
| findyoucef wrote:
| As an Algerian, I can confirm that Maltese is
| surprisingly easy to understand. I was genuinely shocked
| the first time I heard it because the similarities are so
| obvious. Many Arabic dialects are also written using the
| Latin alphabet, especially online and on social media, so
| the different writing systems aren't really a barrier at
| all.
| purrcat259 wrote:
| I don't have much personal experience in attempting to
| communicate with arabic speakers. From others I have heard
| Lebanese arabic is the closest and you can have a passable
| conversation.
| adzm wrote:
| I'm actually really curious about everyday usage of the
| language; is code switching between English and Maltese more
| common than Maltese on its own? I've seen a few online
| communities where the vocabulary switches between Maltese and
| English very often which is interesting but I wonder how much
| of that is just online / written versus everyday speech.
| purrcat259 wrote:
| Depends on where you live and how you were brought up, but
| for the most part code switching is default.
|
| There was a point about 7 years ago when the overton window
| shifted to "speak english to strangers first" because of a
| large influx of foreigners who did not know the language.
| Since then I've met foreigners who have better Maltese than
| some natives.
|
| Older folks & geriatrics will sometimes be surprised when
| they assume someone is foreign and they turn out to be
| Maltese. "int Malti??" is a statement I get often because I
| don't look Mediterranean despite being born here.
| nxor wrote:
| How are loan words viewed? Do businesses work in Maltese? Are
| monolingual speakers of the language regarded differently
| than those fluent in English? Do young people in Malta listen
| to Maltese music?
| JAlexoid wrote:
| Yes, there's plenty of Maltese spoken and listened to.
|
| I was surprised to hear Maltese radio stations played in
| taxis, while visiting Malta just a few weeks back
| nxor wrote:
| The point of my question was to ask someone who lives
| there, not someone who visited
| purrcat259 wrote:
| Maltese has been loaded with loan words since forever. 5
| points if you can guess where bongu, bravu and mappa come
| from. At some point there was some literary council for the
| language that decided that any new loan words should just
| be spelled phonetically. Computer became kompjuter.
|
| Businesses do work in Maltese and English. Both are
| official languages. It's quite rare to encounter a business
| that deals near exclusively in Maltese. Many prefer Maltese
| but will fall back to english where necessary.
|
| Regarding monolingual speakers, I think there's a lot of
| stereotypes for maltese only, english only and code
| switchers. I think it's all a bit silly... So as long as
| communication can happen I don't fuss.
|
| On Maltese music... There's a lot of low ish quality music
| then there's a few absolute gems. Look up The Travellers,
| Lapes, Jon Mallia on YouTube/Spotify.
| nxor wrote:
| Interesting, but I get the impression that ubiquitous
| English loan words in seemingly every language is a lot
| different than loan word patterns of the past. Do you
| think? Maybe not?
| purrcat259 wrote:
| I don't have much of an opinion. I suppose english
| language cultural dominance has meant that newer words
| are just imported rather than adapted
| lullu57 wrote:
| I can concur. All older words (think any word that was
| needed since the older generations), are Arabic based.
| All the numbers, all older verbs etc. 'Newer' words are
| latin based.
| cm2012 wrote:
| Can you communicate with Maltese dogs more effectively?
| purrcat259 wrote:
| Only if we have a few Maltesers first
| Tade0 wrote:
| How is "Marsaxlokk" _really_ pronounced? I've heard that
| word a few times, but never from a native. Google translate
| can't help me here, as it doesn't seem to have Maltese text-
| to-speech.
| purrcat259 wrote:
| Read with English pronunciation, closest would be mar-sa-
| shlock.
| cess11 wrote:
| From my experience it will be understood by locals when
| pronounced like that.
| franklin_p_dyer wrote:
| Not a question, but - Tatoeba could use your help! It is an
| open source (both code and data) dataset of parallel
| sentences and their Maltese data is very lacking. Also it's
| pretty fun to just translate a bunch of random sentences into
| a language you speak. :-)
|
| https://tatoeba.org/
| runarberg wrote:
| Is there any dialect of Arabic which you can understand
| without too much effort?
|
| How much do you consider Maltese its own language (as opposed
| to a dialect of Arabic)?
| notahacker wrote:
| I know that the reverse understanding isn't too bad from
| chatting with a Saudi-born member of staff on holiday in
| Malta.
|
| I don't think anyone would seriously consider it a dialect
| of Arabic though with its completely different alphabet and
| half the vocabulary and morphology coming from Italian
| languages/dialects, even if Malta hadn't spent the best
| part of a millennium trying very hard _not_ to become part
| of the Arab world
| purrcat259 wrote:
| From what I have heard, Lebanese Arabic is the closest, and
| still pretty far. Passable conversation is possible.
|
| Maltese is definitely its own language. Arabic roots are
| there (there's a Semitic joke in there) but it isn't arabic
| anymore. It's written left to right with a variant of the
| english alphabet.
| barrell wrote:
| I recently discovered Maltese existed, and started learning
| it that day. I find it such an awesome language, and not just
| because of the letter H
|
| I do wonder what natives think and feel about the longevity
| of their language? What is taught in schools at what ages
| (assuming English is in the mix somewhere). Is there enough
| media in Maltese for Malti to go about modern life fully
| in Maltese? It's shockingly hard to find any information on
| Maltese, and even harder to find content.
|
| I'm not sure if it's dying out, or in danger thereof; if there
| are preservation efforts, or if there is no need.
| lullu57 wrote:
| Native Maltese speaker here. It is taught in schools
| alongside English, with both being official national
| languages. Most people locally, that are not foreign born
| or immigrants speak the language, and it is used in most
| households as the main language. But everyone grows up
| bilingual, as English is essential for most everything else
| that we do as a nation.
| jim180 wrote:
| Lithuanian and Latvian are Baltic languages. Nothing to do with
| Slavic...
| Telaneo wrote:
| https://en.wikipedia.org/wiki/Balto-Slavic_languages
| asveikau wrote:
| See the section "historical dispute".
|
| I think some people get touchy about them being lumped
| together if their last period of commonality (per the
| article) was 1400 BCE. For comparison, I believe all the
| Slavic languages were mutually intelligible around 1200 AD.
| But much more recently than this, in the last few
| centuries, there have been notable attempts by east slavs
| to absorb the Baltic language cultures and deny them.
| krzyk wrote:
| I doubt that South Slavic and West/East Slavic were
| mutually intelligible at 1200 AD.
|
| I doubt West and East Slavic were. But inside those
| geographic groups they probably were (Czech and Polish
| AFAIR were around that time).
| actionfromafar wrote:
| Depends on your standards, too. Even today, any pair of
| slavic speakers should have a head start in understanding
| each other. Put them next to each other for a month and
| they should be talking, at least about basic everyday
| things.
| asveikau wrote:
| I may be off by 100-200 years, but this is what I read.
| There were accents and regionalisms but they were all
| mutually intelligible.
|
| It is an example I think of often, about how quickly
| languages can change. In the scale of 1000 years, a lot
| changes. Most of the diversity in Romance languages is
| from around that timescale too, it really started to
| diverge substantially around 900ad-1100ad.
| kaato137 wrote:
| Balto-Slavic branch divides into Baltic and Slavic language
| groups so nothing wrong here
| kreetx wrote:
| Yup, most of Eastern Europe are Balto-Slavic. While the
| division from the Eastern Slavic languages (Russian,
| Belarusian, Ukrainian, etc) is distant, they are still
| Slavic. From Eastern Europe, only Estonian is not a Slavic
| language.
| d1sxeyes wrote:
| Hungarian too, although there's a question about whether
| Hungary is Eastern or Central Europe.
| kreetx wrote:
| Ah, yes, how could I forget! As a side note, though it is also
| Finno-Ugric, its similarity in sound and appearance to
| Finnish or Estonian at least appears very distant.
| dragonwriter wrote:
| "There's a question" implies that there is a ground truth
| that might be discovered to resolve this rather than
| simply a clash of different purely arbitrary definitions
| of the same terms.
| lo_zamoyski wrote:
| The Visegrad 4 (Poland, Czechia, Slovakia, Hungary) are
| generally taken to be "Central European". The strict
| East/West division is largely a product of the Cold War
| and the Iron Curtain.
| NicuCalcea wrote:
| > From Eastern Europe, only Estonian is not a Slavic
| language.
|
| Well, that and Romanian. And Hungarian. And outside the
| EU, Albanian. And Georgian, Azeri and Armenian if you
| consider those Eastern Europe.
| ardit33 wrote:
| Albania is not "East Europe", but South East. Same as
| Greece.
| NicuCalcea wrote:
| That's just your opinion, and the UN would disagree:
| https://www.un.org/dgacm/en/content/regional-
| groups#:~:text=...
|
| Some of my fellow Romanians will also claim they're
| Central European, but in my mind, all the ones I listed
| are Eastern European countries. I'd even include Turkey
| and Kazakhstan in there, part of the latter is to the
| West of the Urals, which is what we normally consider the
| border between Europe and Asia.
| kreetx wrote:
| I regret being that loose with the designation :),
| Romanian and Hungarian are valid counter arguments.
|
| In my mind, I was thinking of the belt of countries
| between Russia and Central Europe, starting from the
| Baltics down to the Balkan (excluding Greece).
| NicuCalcea wrote:
| Even by your definition, I can count at least seven
| countries where the official language is not Slavic. And
| that's not even including all the Altaic, Romance and
| other assortment of regional languages, many of which
| have some sort of official status.
| rich_sasha wrote:
| Latvian and Lithuanian are not at all Slavic.
|
| There is a branch that contains both Baltic and Slavic
| languages, but there's also one that contains Albanian
| and Greek.
| ardit33 wrote:
| Albanian and Greek are both completely separate branches,
| and both unique on the tree (they don't have common
| cousins like the others).
|
| There have been some attempts to tie Albanian to
| Germanic, or Greek, or other branches, but they all have
| failed.
|
| At some point they all are Indo-European, but they split
| a long way back.
| pqtyw wrote:
| > most of Eastern Europe are Balto-Slavic
|
| and
|
| > only Estonian is not a Slavic language.
|
| So following this logic saying "in Eastern Europe, only
| Estonian is not a Baltic language" would make as much
| sense?
| sublimefire wrote:
| It is just one of the theories, there is no clear evidence
| to suggest that Baltic and Slavic were the same language
| thousands of years ago.
| pqtyw wrote:
| Well there is if you go far enough. It's just the
| question when did they split off from each other. However
| there is no question that Baltic and Slavic are more
| closely related to each other than any other non extinct
| Indo-European languages.
|
| The fact that they are the closest surviving relatives on
| its own doesn't mean it makes sense to group them together
| (i.e. Italo-Celtic is also a theorized subgroup in a
| similar way but nobody is disputing that Celtic and
| Italic languages evolved into distinct groups).
|
| Then there is a huge amount of missing links and unknown
| unknowns. e.g. Thracian and Dacian probably were also
| pretty close to Baltic or Slavic (maybe even closer to
| Baltic than Slavic is but we don't know enough about them
| to make any conclusive claims at all... but we at least
| know these languages existed)
| Tade0 wrote:
| Plenty of wrong here, considering Lithuanian and Latvian
| are utterly unintelligible to slavs, save for loanwords,
| but Slavic languages between themselves retain some level
| of intelligibility, which even spawned two competing
| constructed languages.
| adzm wrote:
| I was thinking about separating the two groups when I was
| writing this but was afraid of getting too verbose, though in
| retrospect that probably would have made more sense
| regardless of the historical lineage. My apologies if this
| came off as inconsiderate.
|
| I updated my original comment, and learned a good amount
| about that dispute as a result, so thanks for calling it out.
| Vinnl wrote:
| Tomorrow there are elections in the Netherlands, and two
| parties are proposing adding Frisian to that list:
| https://neerlandistiek.nl/2025/10/kies-voor-taal/
|
| Best get to retraining those models.
| przemub wrote:
| Each EU country nominates one official language for the EU,
| otherwise we'd have Catalan, Breton, Kashubian and many more.
| rsynnott wrote:
| They could get Austria to do it, as it presumably has a
| spare slot.
| outside1234 wrote:
| This raises an interesting question. Is there only one
| dialect of German in the LLM? My understanding is that
| the German German and Austrian German dialects are
| significantly different.
| hebelehubele wrote:
| My German teacher always claimed that Swiss German and
| German German (Hochdeutsch) were so different that she
| needed subtitles to understand it, and she didn't
| understand why they weren't considered separate
| languages.
| umanwizard wrote:
| They are in fact considered separate languages.
| geretnal wrote:
| Try dutch, it is combination of German and English!
| layer8 wrote:
| If Switzerland was in the EU, it would certainly be made
| a separate official language.
| ipsi wrote:
| They really are very, very different. Knowledge of one
| helps with the other, but it's far more than just "a
| couple of weeks to adjust to the accent", for example.
|
| EDIT: It's worth noting that this is mostly a _spoken_
| thing, AIUI - most formal/semi-formal writing would be
| in Hochdeutsch rather than a local dialect.
| lhoff wrote:
| It depends. There is not one Swiss German but multiple
| subdialects. The language spoken around the Bern region is
| very far away from German while the one from Zurich or
| Basel is much closer. Since there is no official written
| form they never really converged to a homogeneous
| language.
| ipsi wrote:
| When spoken? Almost certainly. But I think they mostly
| write in Hochdeutsch, especially in formal contexts, at
| least that I've seen (private chats/etc are a totally
| different matter), so I don't foresee any major issues
| there.
| lxgr wrote:
| Austrian standard german is slightly different from the
| German variant, even when written. The differences are
| pretty minor, though, so it's very possible to have a
| relatively long text without being able to tell which one
| it actually is (especially when potatoes are not
| referenced in it).
| piltdownman wrote:
| Including the nasty political side-show that is Ulster
| Scots - literally only brought in as a chilling effect
| 'whataboutism' to diminish support when Irish speakers ask
| for language rights in Northern Ireland.
|
| https://www.reddit.com/r/northernireland/comments/1fivtob/n
| o...
| pqtyw wrote:
| Well Scots is a real language. As much as English or any
| other. Whether enough people speak it especially in NI to
| justify it having an official status and such is another
| matter.
| AlecSchueler wrote:
| This completely ignores the history of published writing
| in Ulster Scots going back centuries.
| wizzwizz4 wrote:
| This is one of those topics where the Hacker News take is
| unlikely to be correct. There's a _lot_ of strong feeling
| here, and an outsider would need at least three books to
| understand the historical context (one of which, afaict,
| has not been written yet: it's oral tradition only).
|
| People closer to the issue are better-placed to gather
| the necessary information, but again: strong feeling.
| Most people find it hard to get past that. The most
| informed person I know is _so_ biased that I don't at
| all trust their conclusions.
| Levitz wrote:
| Well, this was 4 days ago, Spain in talks with Germany
| regarding the addition of official languages:
|
| https://www.politico.eu/article/catalan-basque-galician-
| boos...
| runarberg wrote:
| Is English a legacy official language then from the time
| the UK was a member (I'm guessing Ireland nominated Irish
| instead of English). Aside it feels very un-EU to push this
| limitation, as I was under the assumption that EU was all
| about celebrating (European) diversity.
| handelaar wrote:
| Still an official language, thankfully. Officially,
| because of Cyprus.
| Muvasa wrote:
| Malta and ireland
| sigmar wrote:
| Should be noted- the Netherlands can't unilaterally make
| changes. Spain has been trying to push for languages to be
| added and hasn't had luck.
| Vinnl wrote:
| Haha I just added it as a fun fact, I don't actually
| believe folks will need to start retraining things, or that
| this is likely to be at the top of the priorities list for
| anyone. Party programmes are aspirational anyway.
| mikrl wrote:
| As a Brit I feel very at home when hearing/reading Dutch and
| Frisian. It's a reminder that England and the Low Countries
| share a lot of close history all the way back to Anglo-Saxon
| times; of being fishers, traders, burghers and mercenaries
| moving around the North Sea chasing opportunities, spreading
| and augmenting languages.
|
| "Brea, buter en griene tsiis is goed Ingelsk en goed Frysk"
| RobotToaster wrote:
| If you've ever read anything written in old English, it's
| even closer to Dutch.
| lawlessone wrote:
| Before the Dutch arrived would it have been something
| like Welsh that was spoken in England?
| rgblambda wrote:
| Anglo-Saxons not Dutch. But the short answer is yes. The
| word Welsh is derived from the Old English word for
| foreigner.
|
| Latin would have been spoken in towns and cities but as
| Roman rule collapsed it was replaced by Brittonic
| (ancestor of Welsh), unlike in the continent where it
| developed into various Latin derived Romance languages.
| tirant wrote:
| Not only on the language but also in gastronomy and
| architecture. When I see old towns in UK I usually think
| about Dutch towns but just without any biking
| infrastructure.
| tannhaeuser wrote:
| > _However modern standard Dutch (Nederlands, Hollands) is
| based upon Franconian, rather than Saxon dialects._
|
| > _Some of these [Old Saxon] speakers took part in the
| Germanic conquest of England in the fifth century AD. While
| it is not true that English and Plattdeutsch derive
| completely from the same source, the Old Saxon input into
| Anglo-Saxon was of primary importance and this linguistic
| group contributed greatly to the Anglo-Saxon dialects which
| our English forefathers spoke._
|
| [1]: http://www.plattmaster.de/plattoew.htm
| tecleandor wrote:
| AFAIK, they are trying to get Frisian added to the "European
| Charter for Regional or Minority Languages", not the official
| language list.
|
| They get certain recognition, but they are not official in
| Europe. For example, just from Spain there are 13 languages
| on that list.
| ginko wrote:
| Just do a 50:50 mix of the German and Dutch model weights.
| Vinnl wrote:
| Oops, accidentally made the model speak Limburgish.
| ChrisMarshallNY wrote:
| Flemish? I remember watching a TV show in Flemish ( _Hotel Beau
| Sejour_ [0]), so it's prevalent enough to invest that kind of
| money into.
|
| What about Basque? Is that too controversial?
|
| [0] https://en.wikipedia.org/wiki/Hotel_Beau_Sejour
| td540 wrote:
| like British English vs US English, Flemish is a dialect of
| dutch
| ChrisMarshallNY wrote:
| Ah. That makes sense.
|
| It's all Greek, to me...
| mytailorisrich wrote:
| I think those 24 languages reflect all the languages that are
| official languages at country level.
|
| So for instance, Basque is not an official language of any
| country (only French in France and Spanish/Castilian in
| Spain). Belgium's official languages are French, Dutch, and
| German, "Flemish" is only a local variant of Dutch (Belgian
| French is also only a local variant of French).
| ChrisMarshallNY wrote:
| Thanks. That makes sense.
|
| In the US, people will resort to fisticuffs, over variants
| of Spanish. I usually translate into Castilian Spanish,
| because that seems to be the equivalent of "Vanilla"
| Spanish. No one is really happy (except the Spaniards), but
| I'm not accused of favoritism.
| contravariant wrote:
| Official is a weird concept though. Turns out Dutch law
| never really bothered to define an official language, Dutch
| simply is the de facto standard and is required for a lot
| of things making it effectively the standard. This makes
| Dutch Sign Language the only language officially recognised
| by law. An attempt to recognise Frisian and Dutch as
| official languages in the constitution failed.
| rags2riches wrote:
| Sweden didn't have an "official" language before the
| Language Law of 2009. Five minority languages (Finnish,
| Meankieli, Romani, Sami, Yiddish) were officially
| recognized as such since 1999.
| tirant wrote:
| Basque is an official language and declared as such in the
| Spanish constitution however restricted only to the regions
| that decide to apply it (Basque Country and Navarra).
| mytailorisrich wrote:
| If we want to go all legal, I believe that
| Spanish/Castilian is the only official language of the
| State, so at country level, with the other "Spanish
| languages" only official in their respective areas:
|
| _Section 3
|
| (1) Castilian is the official Spanish language of the
| State. All Spaniards have the duty to know it and the
| right to use it.
|
| (2) The other Spanish languages shall also be official in
| the respective Autonomous Communities in accordance with
| their Statutes.
|
| (3) The richness of the different linguistic modalities
| of Spain is a cultural heritage which shall be specially
| respected and protected._ [1]
|
| [1] https://www.senado.es/web/conocersenado/normas/consti
| tucion/...
| tirant wrote:
| Basque is not controversial, but spoken by very few
| people.
| embedding-shape wrote:
| Not sure that should be the qualifier, there might be more
| people able to speak Basque in the world than Danish,
| doesn't stop Danish from being well supported.
| Levitz wrote:
| Quick google points to about 1M Basque speakers in the EU
| against 5-6M Danish speakers, there's also the fact that
| Basque is not the only official language in the country
| it belongs to, and that it's in fact not spoken in the
| vast majority of the country.
|
| From https://european-union.europa.eu/principles-
| countries-histor... we can find an excerpt relating to
| the policy and its purpose:
|
| >One of the EU's founding principles is multilingualism.
|
| >This policy aims to:
|
| >communicating with its citizens in their own languages
|
| >protecting Europe's rich linguistic diversity
|
| >promoting language learning in Europe
|
| With this in mind, the first intention fails by an
| enormous margin, given that 95%+ of Spain doesn't speak
| an iota of Basque, the second is met handily, given the
| long history of the language, and I'm not sure what to
| think about the third, any language whatsoever would
| serve that purpose.
| yvdriess wrote:
| Flemish is more of a political construct than linguistic,
| it's a grouping of the Belgian-Dutch coastal, Brabant and
| Limburg language groups, with each having their own regional
| dialects.
| OptionOfT wrote:
| It's more than political. In speaking Flemish is to Dutch
| as UK English is to US English. In writing however there is
| no difference in spelling, but there is a difference in
| word choice.
|
| Now, being from Belgium, even within that small part of the
| country where everybody is supposed to speak Dutch, I
| genuinely don't understand people from near the coast,
| which was about 150 miles from where I used to live.
| punnerud wrote:
| Norwegian is also included, based on the model card:
| https://huggingface.co/utter-project/EuroLLM-9B
| arbuge wrote:
| > Maltese, interestingly, is the only Afro-Asiatic derived
| language.
|
| It's Semitic, to be precise.
|
| https://en.wikipedia.org/wiki/Semitic_languages
| UebVar wrote:
| Arabic, even. An outlier, as it is AFAIK the only arabic
| dialect that is not written with the arabic alphabet. Also
| it's far removed from other arabic dialects.
| findyoucef wrote:
| It's not at all far removed from the North African dialects
| of arabic which is the dialect that it's derived from.
| Tunisians and Algerians can understand Maltese quite well.
| fsckboy wrote:
| Is Ireland the only country to bring in two languages,
| Irish/Gaelic and English? Is English an official language of
| any other EU countries?
| JAlexoid wrote:
| I believe Malta has English as an official language.
|
| PS: Gaelic is a more general term for Irish and Scottish.
| Ireland brings specifically Irish(Gaeilge in Irish) language.
| rags2riches wrote:
| Malta has Maltese and English as official languages. I don't
| know what they bring to the EU list of official languages.
| ginko wrote:
| AFAIK Ireland only listed Gaelic as their official language
| with UK having English. That caused a bit of a problem during
| Brexit since technically English wasn't officially an EU
| language anymore. I guess they resolved it somehow.
| layer8 wrote:
| English is an official EU language because Regulation 1
| Article 1 says so [0] and hasn't been changed. In practice,
| English is the most widely used language in EU institutions,
| so it would have been silly to remove it after Brexit.
|
| [0] https://eur-lex.europa.eu/legal-
| content/EN/TXT/?uri=CELEX:01...
| raattgift wrote:
| That said, whenever there is a language selection UI (e.g.
| at banking machines or institutional websites) in wider
| Europe that uses flags to represent languages -- probably
| not a good idea to start with, but very common -- the Irish
| tricolour should be used to indicate English rather than
| the UK or USA flags. (although cf Airteagal 8 of Bunreacht
| na hEireann).
| ChocolateGod wrote:
| English at this point has stopped culturally belonging to
| the United Kingdom and whilst one can discuss its not so
| very moral way of getting there, it's become the bridge
| language for people of different languages to communicate
| in, further solidified by the internet.
| rcbdev wrote:
| It's a national language in Malta, making it a popular
| destination for "language weeks" in European schools, where
| English is usually a main subject.
| threesmegiste wrote:
| Turkish?
| runarberg wrote:
| Is official in Northern Cyprus. But as I understand it while
| the whole island of Cyprus is in the EU, the state of
| Northern Cyprus isn't.
| sva_ wrote:
| Seems like the model isn't limited to those though, from the
| paper:
|
| > as well as some additional relevant languages (Arabic,
| Catalan, Chinese, Galician, Hindi, Japanese, Korean, Norwegian,
| Russian, Turkish, and Ukrainian).
|
| https://arxiv.org/pdf/2409.16235
|
| The paper also goes into detail on training set sources, which
| I feel like a curation thereof might be considered the main
| contribution of this publication?
| _kidlike wrote:
| In Greek we call our language Hellenic, and our country Hellas.
| "Greek" / "Greece" don't exist in the Hellenic language.
| ranadomo wrote:
| > Graikoi, Graikoi were an ancient Hellenic tribe
|
| https://en.wikipedia.org/wiki/Graecians
| 3836293648 wrote:
| Yes it does, it was a greek colony off the southern coast of
| Italy, which were the primary greek connection to the romans
| which is how the name stuck.
| ks2048 wrote:
| From other comments, it seems many people don't realize that
| there are 11 more languages than these 24 official (this is
| mentioned in the paper):
|
| Arabic, Catalan, Chinese, Galician, Hindi, Japanese, Korean,
| Norwegian, Russian, Turkish, and Ukrainian.
| jll29 wrote:
| +1
| amarant wrote:
| I find it interesting that Norwegian isn't on the list.
|
| I have often joked that Norwegian is just a dialect of Swedish,
| but I never expected to get official validation like this!
| bdhtu wrote:
| Norway isn't in the EU.
| emil-lp wrote:
| Norway isn't in EU, though.
| rcbdev wrote:
| Norwegian is not on this list, because in fact no country
| with Norwegian as their national language is part of the
| European Union at the time of writing.
| jenadine wrote:
| No Luxembourgish?
| cyfex wrote:
| > Greek being the only Hellenic one
|
| Are there really any other Hellenic languages besides Greek?
| zhengiszen wrote:
| Maltese is derived from dialectal Arabic
| moralestapia wrote:
| Benchmarks?
|
| Edit: Thanks, @Bengalilol.
|
| The 1.7B one looks meh.
|
| But really solid numbers on the 9B! Props to the team!
| Bengalilol wrote:
| 1.7B
|
| https://huggingface.co/utter-project/EuroLLM-1.7B#results
|
| 9B
|
| https://huggingface.co/utter-project/EuroLLM-9B#results
| nellyspageli wrote:
| Could you adjust the title from:
|
| "all official 24 EU languages" to "all 24 official EU languages"
| scoot wrote:
| @dang
| Philpax wrote:
| The former is used on the website itself.
| seydor wrote:
| It's just another Horizon2020 grant, people. Don't be overly
| harsh to a bunch of academics who are just earning their living.
| giorgioz wrote:
| I didn't know of the grant!
| https://research-and-innovation.ec.europa.eu/funding/funding...
|
| It seems the new version is called Horizon Europe
| tonyhart7 wrote:
| Yeah, comparing this to SOTA models is too harsh
| oytis wrote:
| I thought research grants were to make novel discoveries, not
| to replicate what industry has long done. Unless we are at the
| point where we study US as an alien civilization
| srameshc wrote:
| I was thinking the same: why are so many superior models coming
| only from countries like the US and China? And why are European
| countries not on the list, other than France with Mistral? Why are
| so few companies in India, Japan, or South Korea even close to a
| promising new model like what Chinese companies did?
| apples_oranges wrote:
| Does it even make sense? Just use the American or Chinese ones,
| adjust as needed. Where's the point in spending millions to
| build the same thing or worse?
| t43562 wrote:
| Now that the big bets have been made, who wants to try to
| compete with them?
| loandbehold wrote:
| Because training a frontier model is expensive and only the US and
| China have the capital structure to raise tens of billions of
| dollars to do it.
| busssard wrote:
| being able to train new frontier models is the new equivalent
| to nuclear capabilities.
|
| i predict at some point countries will get CIA'ed when they
| publish plans to build a large data center.
|
| Similar to the time when they got CIA'ed when announcing
| plans for new nuclear plants.
| henriquenunez wrote:
| They are already CIA'ed on a regular basis for much less
| than that.
| lossolo wrote:
| You can easily fit below 10 billion for the whole datacenter,
| then you only pay for electricity + maintenance + staff. 100k
| GPUs cost a few billion USD, that's more than enough to train
| frontier models, run experiments, and serve models in the EU
| to start. Look at what xAI did and how much it cost them and
| it's more expensive to do in US than in EU.
| nonethewiser wrote:
| "Why" is a fair question but are you surprised? Europe is
| consistently behind in tech.
|
| Europe has about 1.3 times the population of the USA and about
| 75% of the GDP yet EU tech output is a very small percentage of
| US tech output. We are not talking about 70, 50, 30, or even
| 20%. It's a drop in the bucket.
|
| >The seven largest U.S. tech companies, Alphabet (Google),
| Amazon, Apple, Meta, Microsoft, Nvidia, and Tesla, are 20 times
| bigger than Europe's seven largest, and generate 10 times more
| revenue.
|
| https://eqtgroup.com/thinq/technology/why-is-europes-tech-in...
|
| "Why" is a good question, but I definitely wouldn't expect
| significant competition in LLMs from Europe based on the giant
| tech disparity. Having 1 non-cutting edge model that isn't
| really competitive is pretty much what I would expect.
| emporas wrote:
| Also, commercial software is consistently behind open
| source.
|
| I only use open source LLMs for writing (Qwen 32b from Groq)
| and open source editor of course, Emacs.
|
| If some people can write better using commercial LLMs (and
| commercial editors), by all means, but they put themselves at
| a disadvantage.
|
| Next step for me, is to use something open source for
| translation, I use Claude for the moment, and open source for
| programming, I use GPT currently. In less than a year I will
| find a satisfying solution to both of these problems. I
| haven't looked deep enough.
| neoromantique wrote:
| What a weird comment.
|
| llama-3.1-70b-versatile is pretty good at translating
| though
| InsideOutSanta wrote:
| _> The seven largest U.S. tech companies (...) are 20 times
| bigger than Europe's seven largest, and generate 10 times
| more revenue._
|
| I'm going to guess that this part is intentional. Europe
| tends to be more aggressive in enforcing antitrust laws.
| Economically, Europe's goal isn't to have the biggest
| companies but to have more smaller companies.
|
| So you're not going to get companies like Google, but you
| will get companies like Proton, Spotify, Tuta, Hetzner,
| Mistral, Threema, Filen, Babbel, Nextcloud, CryptPad, DeepL,
| Vivaldi, and so on.
| nonethewiser wrote:
| >I'm going to guess that this part is intentional. Europe
| tends to be more aggressive in enforcing antitrust laws.
| Economically, Europe's goal isn't to have the biggest
| companies but to have more smaller companies.
|
| So is your hypothesis that the total market cap of EU tech
| companies is something like 50,60,70, etc. % of total US
| tech marketcap? Something significantly different than the
| ~10% implied by that figure (largest us companies 10x
| largest EU companies). And it's just more broadly
| distributed?
|
| Hard to find data on this but this is showing EU tech
| market cap at 3.2T.
| https://www.stateofeuropeantech.com/chapters/outcomes
|
| Whereas this is saying the US "megacaps" ($200B+) are at
| 21T.
| https://www.cnbc.com/2025/09/05/tech-megacaps-worth-market-c...
|
| Which puts the entire EU tech market at 15% of the US
| megacaps. Not even the entire market.
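|
| As a quick sanity check on the ~15% figure, the ratio can be
| computed directly. A minimal sketch using only the two numbers
| quoted above (3.2T and 21T); the variable names are illustrative
| and not taken from either source:

```python
# Figures as quoted in the comment above, in trillions of USD.
eu_tech_market_cap = 3.2   # EU tech market cap (State of European Tech)
us_megacap_total = 21.0    # US tech "megacaps" ($200B+), per the CNBC link

ratio = eu_tech_market_cap / us_megacap_total
print(f"EU tech is {ratio:.1%} of US megacaps")
# prints: EU tech is 15.2% of US megacaps
```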
| layer8 wrote:
| European companies are smaller on average and less likely
| to go public in general, so market cap comparisons don't
| show the whole picture. Growing big is less often seen as
| a goal than in the US. "Megacaps" aren't necessarily
| considered a healthy thing to have.
| jimbokun wrote:
| Yes, and this all but guarantees that Europe will stay
| behind USA and China in their technology capabilities.
| mjburgess wrote:
| What are these capabilities?
|
| I don't see any sense in which the EU has fewer
| capabilities. It has, say, a smaller number of businesses
| with smaller market dominance.
|
| It isn't clear to me what capability the EU would gain by
| having a monopolist social network, a monopolist search
| engine, a monopolist advertising trader
| jimbokun wrote:
| Europe has all of those things, they just come from the
| US.
| sunaookami wrote:
| EU made a >900 page law about AI and patted themselves on the
| back for being "the first to regulate AI" (which was not even
| true, China had an AI law before and it's two pages long).
| sajithdilshan wrote:
| This cannot be stressed enough. In my experience working in
| multiple tech startups in Germany, the power compliance,
| legal and all other 2nd line has over engineering is quite
| immense. Most of the time they act as a hindrance for
| innovation rather than a supporting factor.
|
| This AI law is a clear example of that. Pencil pushers
| creating more obstacles for the sake of creating more
| obstacles rather than actually taking a pragmatic approach.
| isodev wrote:
| It's strange, my real life experience is very different
| than yours. Unless you're training AI to do something
| shady, it's really no bother at all. In fact, most of what
| the AI Act requires, you have to do anyway for a good model
| card.
| sublimefire wrote:
| As a European citizen I think it boils down to access to the
| capital. EU/EEA is not a country and the market is sort of
| fragmented. The big players are UK, France, Germany, everyone
| else does not have the same access to money as say in the US.
| Folks want to do it but there is a glass ceiling. Hence you
| have these collabs among large institutions to tap into funds
| such as from Horizon which are academic in nature and do not
| translate well into products.
| isodev wrote:
| Because the value of these models is (actually) yet to be
| proven. Why saturate the market with something that we already
| have at least one of and others are selling as a service? No
| model provider (including the "big ones" like OpenAI) has been
| able to produce a viable business case. They're all literally
| running on government deals and investor money.
| elias_t wrote:
| Are there any benchmarks that exist for those 24 languages?
| moralestapia wrote:
| dupe of https://news.ycombinator.com/item?id=45733832
|
| which sank to the bottom thanks to HN's invisible hand
|
| Oh wait, one's not supposed to _notice_
| morkalork wrote:
| It's more like the default is to be ranked near the bottom
| unless your comment gets traction during the brief window of
| time it is ranked first for being new. Seeing your comments
| go _splat_ after that window expires is not some nefarious
| conspiracy..
| moralestapia wrote:
| Oh, you'd be surprised to know what's behind many of those
| "conspiracies"!
| nodja wrote:
| It's on the huggingface readme
|
| https://huggingface.co/utter-project/EuroLLM-9B#results
|
| https://huggingface.co/utter-project/EuroLLM-9B#english
| ks2048 wrote:
| The detailed results are in appendix to the paper:
| https://arxiv.org/abs/2506.04079
| loandbehold wrote:
| Aren't all frontier models already able to use all these
| languages? Support for specific languages doesn't need to be
| built in, LLMs support all languages because they are trained on
| multilingual data.
| melvinmelih wrote:
| > because they are trained on multilingual data
|
| But they were not trained on government-sanctioned homegrown EU
| data.
| saretup wrote:
| The entirety of the internet vs government-sanctioned
| homegrown EU data.
| raverbashing wrote:
| > But they were not trained on government-sanctioned
| homegrown EU data.
|
| If none of the LLM makers used the very big corpus of EU
| multilingual data I have an EU regulation bridge to sell it
| to you
| tonyhart7 wrote:
| "But they were not trained on government-sanctioned homegrown
| EU data."
|
| ok what are you implying on this
| sunaookami wrote:
| Who in their right mind would use this?
| tensor wrote:
| I'd use a model trained on a targeted and curated data set
| over one trained on all the crap on the internet any day.
| loandbehold wrote:
| I keep hearing that LLMs are trained on "Internet crap"
| but is it true? For instance we know from the Anthropic
| copyright case that they scanned millions of books to
| make a training set. They certainly use Internet content
| for training but I'm sure it's curated to a large degree.
| They don't just scrape random pages and feed them into an LLM.
| nutjob2 wrote:
| > I'm sure it's curated to a large degree. They don't
| just scrape random pages and feed them into an LLM.
|
| How would they curate it on that scale? Does page ranking
| (popularity) produce interesting pages for this purpose?
| I'm skeptical.
| airspresso wrote:
| > I keep hearing that LLMs are trained on "Internet crap"
| but is it true?
|
| Karpathy repeated this in a recent interview [0], that if
| you'd look at random samples in the pretraining set you'd
| mostly see a lot of garbage text. And that it's very
| surprising it works at all.
|
| The labs have focused a lot more on finetuning
| (posttraining) and RL lately, and from my understanding
| that's where all the desirable properties of an LLM are
| trained into it. Pretraining just teaches the LLM the
| semantic relations it needs as the foundation for
| finetuning to work.
|
| [0]: https://www.dwarkesh.com/p/andrej-karpathy
| lm28469 wrote:
| Meh, it depends a lot on the datasets, which are heavily skewed
| towards the main languages. For example they almost always
| confuse Czech and Slovak and often swap one for the other in
| the middle of chats
| mirekrusin wrote:
| But the only way to unskew it is to remove main language data
| because there isn't really any to add, no?
| tensor wrote:
| You can also correctly bias your sampling so that when
| selecting new training instances each language is chosen
| equally. Generally the diversity of data is good, unless
| that data is "wrong" which, ironically, is probably most of
| the internet, but I digress.
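|
| The idea of biasing the sampling can be sketched as follows. This
| is a minimal illustration, not how any particular lab does it:
| the function name sample_balanced and the toy corpus sizes are
| made up. The language is picked uniformly first, then a document
| within that language, so small languages are oversampled.

```python
import random
from collections import Counter

def sample_balanced(corpus_by_lang, n, rng):
    """Draw n examples: choose a language uniformly, then a document in it."""
    langs = sorted(corpus_by_lang)
    batch = []
    for _ in range(n):
        lang = rng.choice(langs)
        batch.append((lang, rng.choice(corpus_by_lang[lang])))
    return batch

# Toy corpus, heavily skewed towards English (hypothetical sizes).
corpus = {
    "en": [f"en-doc-{i}" for i in range(10_000)],
    "cs": [f"cs-doc-{i}" for i in range(100)],
    "sk": [f"sk-doc-{i}" for i in range(50)],
}

rng = random.Random(0)
batch = sample_balanced(corpus, 3000, rng)
counts = Counter(lang for lang, _ in batch)
print(counts)  # roughly 1000 per language despite the 200:2:1 size skew
```

| The trade-off is that small corpora get repeated many times. A
| softer reweighting (for example, sampling in proportion to corpus
| size raised to a power below 1) is a common middle ground.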
| RobotToaster wrote:
| Aren't they about as different as American English and
| British English?
| svobodovic wrote:
| The difference is larger than let's say just a "dialect".
| They really are different languages, even though we
| generally understand each other quite well (younger
| generations less so). I've heard it's about as different as
| e. g. Danish and Swedish - not sure if that comparison is
| helpful.
| intended wrote:
| Nope. Capability begins to degrade once you move away from
| English.
|
| Plus all your T&S/AI Safety is not solved with translation, you
| need lexicons and data sets of examples.
|
| Like, people use someone in Malaysia, to label the Arabic
| spoken by someone playing a video game in Doha - the cultural
| context is missing.
|
| The best proxy to show the degree of lopsidedness was from this:
| https://cdt.org/insights/lost-in-translation-large-language-...
|
| Which in turn had to base it on this:
| https://stats.aclrollingreview.org/submissions/linguistic-di...
|
| From what I am aware of, LLM capability degrades once you move
| out of English, and many nation states are either building, or
| considering the option of building their own LLMs.
| tensor wrote:
| No, that's not how training works. It's not just about having
| an example in a given language, but also how many examples and
| the _ratio_ of examples compared to other languages. English
| hugely eclipses any other language on most US models and that's
| why performance on other languages is subpar compared to
| performance on English.
| andy12_ wrote:
| I have never noticed any major difference in performance of
| ChatGPT between English and Spanish. The truth is that as
| long as the amount of training data of a given language is
| above some threshold, knowledge transfers between languages.
| Byamarro wrote:
| There's actually research showing that LLMs are more
| accurate when questions are in Polish:
| https://arxiv.org/pdf/2503.01996
| voxgen wrote:
| Ratio/quantity is important, but quality is even more so.
|
| In recent LLMs, filtered internet text is at the low end of
| the quality spectrum. The higher end is curated scientific
| papers, synthetic and rephrased text, RLHF conversations,
| reasoning CoTs, etc. English/Chinese/Python/JavaScript
| dominate here.
|
| The issue is that when there's a difference in training data
| quality between languages, LLMs likely associate that
| difference with the languages if not explicitly compensated
| for.
|
| IMO it would be far more impactful to generate and publish
| high-quality data for minority languages for current model
| trainers, than to train new models that are simply enriched
| with a higher percentage of low-quality internet scrapings
| for the languages.
| numpad0 wrote:
| Not natively, they all sound translated in languages other than
| English. I occasionally come across French people complaining
| about LLMs' use of non-idiomatic French, but it's probably not
| a French problem at all, considering that this effort includes
| so many _Indo-European_ languages.
| FinnKuhn wrote:
| I can at least also confirm this for German. Here is one
| example that is quite annoying:
|
| ChatGPT, for example, tends to start emails with "ich hoffe,
| es geht dir gut!", which means "I hope you are well!". In
| English (especially American) corporate emails this is a
| really common way to start an email. In German it is not, as
| "how are you" isn't a common phrase used here.
| whazor wrote:
| European governments have huge collections of digitalised
| books, research, public data.
|
| But also European culture could maybe make a difference? You
| can already see big differences between Grok and ChatGPT in
| terms of values.
| pembrook wrote:
| If it's publicly available data, books and research, I can
| assure you the big models have already all been trained on
| it.
|
| European culture is already embedded in all the models,
| unless the people involved in this project have some hidden
| trove of private data that they're training on which diverges
| drastically from things Europeans have published publicly
| (I'm 99.9% positive they don't...especially given Europe's
| alarmist attitude around anything related to data).
|
| I think people don't understand a huge percentage of the
| employees at OpenAI, Anthropic, etc. are non-US born.
| charlieyu1 wrote:
| Training is a very different thing. Can't speak for European
| languages, but LLMs are often much worse in Japanese because
| tokenisation uses Unicode and a single Japanese character often
| has to be represented by more than one token.
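|
| One way to see the effect, assuming a byte-level tokenizer that
| falls back to raw UTF-8 bytes for characters missing from its
| vocabulary (an assumption; real tokenizers differ and often have
| merged tokens for common characters). The helper name
| byte_fallback_cost is hypothetical and gives a worst-case count.

```python
def byte_fallback_cost(text: str) -> int:
    """Worst-case token count if every character falls back to UTF-8 bytes."""
    return len(text.encode("utf-8"))

# ASCII letters take 1 byte each; common Japanese characters take 3.
for ch in ["a", "\u00e9", "\u65e5", "\u672c", "\u8a9e"]:  # a, é, 日, 本, 語
    print(repr(ch), byte_fallback_cost(ch))

print(byte_fallback_cost("Japanese"))            # 8 characters -> 8 bytes
print(byte_fallback_cost("\u65e5\u672c\u8a9e"))  # 3 characters (日本語) -> 9 bytes
```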
| htrp wrote:
| >The EuroLLM Team brings together some of the brightest minds in
| AI including Unbabel, Instituto Tecnico Lisbon, the University of
| Edinburgh, Instituto de Telecommunicacoes, Universite Paris-
| Saclay, Aveni, Sorbonne University, Naver Labs, and the
| University of Amsterdam.
|
| >Europe is the only continent in the world to have a large public
| network of supercomputers that are managed by the EuroHPC Joint
| Undertaking (EuroHPC JU). As soon as we received the EuroHPC JU
| access to the supercomputer, we were ready to roll up our sleeves
| and get to work. We developed the small model right away and in
| less than 6 months the second model was ready.
|
| [1] https://www.eurohpc-ju.europa.eu/eurohpc-success-story-speak...
|
| Repurposing some of that physics sim compute
| sorenjan wrote:
| If I want to use an LLM to do translation, should I use a base
| model or an instruction tuned version? I've had mixed results
| using the chat models and a simple "Translate this to <language>:
| "
| wongarsu wrote:
| For a 9B model like EuroLLM, fine tuning the base model is
| pretty viable. You don't need a lot of samples, on the order of
| 300 high quality examples can produce good results, and the GPU
| time is pretty manageable with rented GPU instances
|
| Just the base model and a template like "English:
| {text}\n{language}:" can also work with a bit of filter and
| retry logic
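|
| A minimal sketch of the template plus "filter and retry" idea.
| Only the prompt template comes from the comment above; the names
| build_prompt, clean and translate are hypothetical, and generate
| is a placeholder for whatever call returns a base-model
| completion. The filter shown (keep the first line, reject empty
| output) is just one plausible choice.

```python
from typing import Callable, Optional

def build_prompt(text: str, target_language: str) -> str:
    # Template from the comment: the base model continues after "{language}:".
    return f"English: {text}\n{target_language}:"

def clean(completion: str) -> Optional[str]:
    """Keep only the first line; reject empty output or a leaked template label."""
    stripped = completion.strip()
    if not stripped:
        return None
    first_line = stripped.splitlines()[0].strip()
    if first_line.startswith("English:"):
        return None
    return first_line

def translate(text: str, target_language: str,
              generate: Callable[[str], str], max_retries: int = 3) -> Optional[str]:
    prompt = build_prompt(text, target_language)
    for _ in range(max_retries):
        result = clean(generate(prompt))
        if result is not None:
            return result
    return None  # give up after max_retries bad completions

# Stand-in for a real model call (hypothetical): fails once, then rambles on.
_outputs = iter(["", " Hallo Welt\nEnglish: next example"])

def fake_generate(prompt: str) -> str:
    return next(_outputs)

print(translate("Hello world", "German", fake_generate))  # prints: Hallo Welt
```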
| rob_c wrote:
| This, I hope, is close to multi-modal in lingual terms. There's
| potentially a lot to learn from examining where this works/fails
| :D
| jagermo wrote:
| looks cool, i hope kagi adds it to the assistant.
| Stagnant wrote:
| Title is missing "(2024)". The 9B model was released last
| December[0].
|
| 0: https://sites.google.com/view/eurollm/home
| aurintex wrote:
| Is it planned to have a VLM or something comparable like
| Qwen3-VL for the future?
| jug wrote:
| A multimodal release is planned.
| rvz wrote:
| As expected, Europe finally catches up to 2024 and launches an
| LLM that barely competes against the heavyweights.
|
| The US and China are running rings around Europe.
|
| Mistral is an exception as it was funded by US VCs and they are a
| great example showing that without VC funding, Mistral would have
| been begging the EU for a microscopic grant to train an LLM
| worse than Llama.
| laurentiurad wrote:
| less exposure to a technology that doesn't bring that much
| revenue and it's not projected to do so in the upcoming years.
| whimsicalism wrote:
| yep, Europe is demonstrating the same sort of strategic
| thinking that economic behemoths like the Smithsonian use
| oytis wrote:
| Why waste money on trying to compete at all then?
| t43562 wrote:
| Every country needs a few plumbers and carpenters whether
| or not they are at the forefront of technology. Some money
| must be spent to give academics work to do so they can
| sharpen up their skills and perhaps teach the next set of
| students who might be more commercial
| oytis wrote:
| It would be a better use of the money to hire someone who
| has worked on actual frontier models to teach at European
| universities
| t43562 wrote:
| If you could find one for the money, if they were happy
| to teach in the long term. If it wasn't better to have N
| for the price of 1. In other situations of import
| substitution I'm pretty sure people try to develop their
| local talent in addition to buying in experts.
| AJ007 wrote:
| Mistral is pretty much toast? Their models perform poorly and
| I'm not sure why anyone would use them. Maybe there is a
| catching up point somewhere in the future, hopefully.
| kreetx wrote:
| I'm somewhat skeptical of taxpayer funded innovation. Seen a few
| Horizon grants from the side, as a citizen I'd prefer to not pay
| for them, but unfortunately can't opt out.
| owisd wrote:
| How about Tesla for taxpayer funded innovation?
| https://www.energy.gov/lpo/tesla
| kreetx wrote:
| I wouldn't mind actually/visibly productive companies taking
| these grants. But I've also seen mostly research-focused
| (nominally) private companies who mostly live off of science
| grants, who don't produce nor sell much - because they don't
| have to.
| bigbadfeline wrote:
| > I'm somewhat skeptical of taxpayer funded innovation... as a
| citizen I'd prefer to not pay for them, but unfortunately can't
| opt out.
|
| There are a few variables here but at this point in time,
| private-funded innovation isn't different by much and all
| things considered, the difference isn't in its favor.
| tensor wrote:
| The vast majority of US discoveries are by immigrants using
| taxpayer money. AKA scientists at universities. Your media
| likes to give credit to the companies, but generally the
| companies only apply things, they rarely create new science
| these days.
| kreetx wrote:
| The above is not a discovery though.
|
| My experience with government funding is that they apply
| something and won't even try to sell it because selling is
| hard: you don't want to know that the thing you built is
| lacking nor that the competition is better. Especially the
| academic types don't. Yet I'm paying for these guys. Also, by
| funding the academics they won't even need to go to the job
| market.. But as I paid for their education I thought I was
| buying people who create value.
|
| Perhaps the above is rather harsh and it's "not that bad", my
| subjective experience nevertheless.
| tensor wrote:
| Much of the neural network work was funded by Canadian
| Universities, and commercialized by US companies. Even if
| you look just at the "Attention is all you need" paper,
| which is primarily by authors working at Google, most of
| those authors come from academia and are immigrants.
|
| Vaswani is an Indian born computer scientist, Shazeer is
| US, Parmar was born in India, Uszkoreit was born in
| Germany, Jones was born in the UK, Gomez is British-
| Canadian, Kaiser is a Polish computer scientist, and
| Polosukhin is Ukrainian.
|
| Almost all of these people have PhDs and Master degrees.
| The ROI on academia is vast for society, including European
| universities. The thing the US does well is capitalize on
| that education, and sadly also try to steal credit for it
| as "American exceptionalism." If Europe and other countries
| learn how to keep their academics and get them working in
| local industries, America's edge will evaporate overnight.
| notahacker wrote:
| A major factor in European academics moving to the US is
| that top US institutes can charge a small fortune, and
| some of that gets reflected in academic salaries.
| Interesting move by the US government to try to put them
| off...
|
| The wider availability of capital is a bigger deal
| though. "Attention is all you need" is available to
| people on other continents to read, but a computer
| scientist in Europe that understood exactly how big
| transformers were going to be and why had less chance of
| funding than a webdev in California with a pitchdeck full
| of cliches and me-too GPT wrapper for an industry they'd
| barely touched does today.
| nonethewiser wrote:
| How does this work?
|
| It seems like, in most ways, it would be bad to train on 24
| separate languages. That's just 24 partitions of the data. Seems
| really inefficient and better to simply train in the biggest
| (English) and translate.
|
| I do think this will introduce some biases that correlate with
| the English language. It would be interesting to see more
| specifically what this means. But regardless, I don't think you
| can produce a competitive model with such a large subdivision of
| training data.
| antiloper wrote:
| If you train a model on multiple languages, you can use the
| model itself for translation. As well as allowing the model to
| naturally respond in the user's language.
| whimsicalism wrote:
| nah, it's better to train on all languages. 24 partitions? you
| are gravely underestimating these models and how they represent
| things in their latents... transfers easily
| DrNosferatu wrote:
| 1. It's a nice start, but the EU has to scale to Manhattan
| Project levels in order to properly compete with the US and
| China.
|
| 2. A credible, at-scale effort for the EU's own silicon for AI
| compute wouldn't hurt either.
|
| 3. And this can only be achieved by vertical integration to
| combat fragmentation.
| fulafel wrote:
| Good to distinguish between publicly funded research models
| (like this one) and commercial ones (like Mistral in France).
| What are the Chinese and US public research models like?
| t43562 wrote:
| The Germans do have some neuromorphic hardware. It might be
| smarter to invest in that to avoid having to build a lot of new
| power stations.
| bean469 wrote:
| > It's a nice start, but the EU has to scale to Manhattan
| Project levels in order to properly compete with the US and
| China.
|
| Yep, the US-government sponsored, open-weight LLM is miles
| ahead of EuroLLM
| DrNosferatu wrote:
| I propose a European AI-only "NASA" style agency that would
| have a frontier LLM-"Apollo Program" goal. It would subcontract
| the several blocks it needs across EU member states.
|
| Would you prefer European AI sovereignty with 15% overhead
| costs from geographic distribution, or 100% dependence on
| Nvidia/OpenAI with zero European industrial base?
| DrNosferatu wrote:
| Allow me to elaborate,
|
| EuroAI: Europe's Moonshot to AI Sovereignty
|
| https://open.substack.com/pub/ifiwaspolitical/p/euroai-europ...
| snek_case wrote:
| 2. New state-funded joint venture: EuroNV, pronounced euro-
| envy.
| whimsicalism wrote:
| Actually nuts to me the degree to which European policymakers do
| not even begin to understand _how_ to kickstart technologically-
| intensive industry. Anyone who has seen close-up the results of a
| "pick the winners" grant-style approach to innovation knows what
| will go wrong here.
|
| Also funny to read this narrative of how access to the European
| 'supercomputer' cluster is going.
| https://x.com/levelsio/status/1981485945745788969
| webdevver wrote:
| EU grifting is so much worse than even the most brazen Trumpian
| crypto pump n' dump.
|
| Genuinely repugnant. At least the Trump admin has the decency to
| pump everyone's 401k...
|
| I'm trying to figure out why it bothers me so much. I think it's
| because the EU are such unbelievable losers in everything they
| do. They can't even grift, that's how useless they are. They
| can't even steal properly. It's so undignified, and offensive to
| the senses.
| whimsicalism wrote:
| Wouldn't go that far. EU policymakers have good intentions, I
| believe - but ultimately are products of their environment
| and cultural inclination.
|
| The EU is such a bizarre place because they treat capital and
| entrepreneurs with such massive distrust, but never really
| bothered getting rid of the quasi-static entrenched
| hierarchies from feudalism? Like I'll go to the UK or France
| and there will just be massive swathes of land owned by the
| nobility or 'former' nobility? Maybe start there but let your
| high-value human capital earn a good wage?
| sofixa wrote:
| > France and there will just be massive swathes of land
| owned by the nobility or 'former' nobility
|
| Yeah, no, this isn't even remotely true.
| whimsicalism wrote:
| will cede that, you're right for France.
| coolewurst3000 wrote:
| You are wrong in that you think the hierarchies stem
| specifically from feudalism, but you are absolutely correct
| in that these hierarchies exist and are deeply entrenched.
| Sweden and Germany have some of the lowest percentages of
| self-made vs. inherited fortunes in the western world.
| Actually some tax policies in the US enable much more
| upward mobility, such as real estate taxation and 401k-like
| vehicles.
| deaux wrote:
| > What's REALLY much more important though if you want to be a
| part of the AI race and I've posted for years here with
| @euaccofficial is to make Europe a really extremely attractive
| place to start and run an AI business. Remove regulatory
| obstructions and give tax discounts for startups. Let them
| build a business first that can compete worldwide and once they
| make enough money (let's say $100M/y), then slowly start adding
| regulation.
|
| When you talk to most EU business owners, even in tech, the
| limiting factor isn't regulations. This being the #1 reason is
| such a tired trope.
|
| Ironically, China has in some ways a bigger regulatory burden
| when it comes to software, as there, if the government doesn't
| approve, the business is dead in the water. I doubt that Klarna
| would've gotten off the ground there, for one; I could see them
| being shut down much earlier. In the EU only now very
| slowly are some governments even starting to talk about some
| weak measures around their business model. But I've never, not
| once in my life, heard "Chinese software companies can't get
| off the ground due to the regulatory burden".
|
| The same people who clamor about the EU regulations are the
| ones who hate on the EU for their protectionist measures
| against US tech. Yet another bout of irony here - China's
| software industry has flourished exactly thanks to 10 times
| stronger protectionist measures against US tech. So has
| Korea's, and their protectionism has never even been anywhere
| on the China level, more in between EU and China. No, if there's
| anything that would help, it's much _more_ tech protectionism
| in the EU.
|
| Pieter Levels is at the end of the day an influencer, not a
| serious founder.
| whimsicalism wrote:
| > When you talk to most EU business owners, even in tech, the
| limiting factor isn't regulations. This being the #1 reason
| is such a tired trope.
|
| Okay, what is the limiting factor? Because when I talk to EU
| business owners (admittedly, very few) - they point to lack
| of big EU capital markets, which is directly downstream of
| the policy environment. And when I talk to top EU human
| capital, they all point to the lack of competitive wages.
| There's a real difficulty in allocating capital to talented
| humans.
|
| And, at least in Southern Europe, the income tax schedule is
| so aggressive it's hard to justify continuing working in many
| of these countries if you are highly talented.
|
| Like, if you can tell me what the induced operator norm from
| l_2 -> l_2 is - probably you should come to the US and work
| at a biglab and make bank. What can you do in Portugal,
| Italy, Spain, etc.??
|
| > Pieter Levels is at the end of the day an influencer, not a
| serious founder.
|
| Sure, agreed.
|
| I think it is a complete misreading to point to protectionism
| as the reason for Chinese success, but having a big unified
| domestic market for consumers along with massive saving rates
| and capital controls probably does help.
| KaiserPro wrote:
| Money.
|
| Why work in the "europoor" countries when you can go to
| america and earn megabucks.
| miohtama wrote:
| That's capital markets, and the lack of capital markets is
| because of not having a business-friendly environment.
| Consumer protections are strong, pro-business not so much.
| Companies like Spotify go to the US to IPO.
| deaux wrote:
| Are you saying that the other 199 non-US countries in the
| world all have a business unfriendly environment, since
| every one of them besides China has practically the same
| amount of software VC funding compared to the US?
|
| All of these purported EU-specific reasons completely
| ignore that things are the same elsewhere. It's the US
| that is the outlier.
| actionfromafar wrote:
| One fairly large factor is that even though English is much
| more common today, you just can't operate (depending on the
| product of course) in many countries without having
| customer support, documentation etc in the local language.
| deaux wrote:
| > I think it is a complete misreading to point to
| protectionism as the reason for Chinese success, but having
| a big unified domestic market for consumers along with
| massive saving rates and capital controls probably does
| help.
|
| Capital controls are protectionist measures, but anyway,
| no.
|
| > Okay, what is the limiting factor?
|
| Let's look at which countries have a significant local
| software industry compared to population size.
|
| - China
|
| - US
|
| - Korea
|
| - You can argue for Japan and India but that's already
| starting to stretch.
|
| - Yup, effectively nowhere else. Even in an "out of the
| way" place like Myanmar everyone uses Meta, with a nice
| little genocide to show for it. Sure, in Vietnam they use
| Zalo, and other places have a few other local players. But
| most of the famous US tech apps are dominant.
|
| Is the EU the outlier here? No. _Everywhere else_ US tech
| dominates. Meta, Netflix, Apple, Google, Uber, Spotify,
| Microsoft, Match Group, Paypal, Amazon, and on and on. They
| don't just dominate the EU, they dominate _the world_.
|
| Except for the countries I named above, where at least
| _some_ of the markets that US big tech competes in, instead
| have bigger local players. And even there, guess what?
|
| Their market share is almost 1:1 linearly correlated to the
| degree of protectionism in those countries, all the way
| from China, then Korea, then India/Japan, and then
| everywhere else! Who woulda thought!
|
| Why does Korea have much less US tech dominance than, say,
| Germany? Despite German companies theoretically having a
| big advantage: the German public is 100x more privacy
| conscious than the Korean one, and much less trusting of US
| companies.
|
| I can tell you that it's not fewer regulations; Korea's GDPR
| is much more onerous than the EU's and so are investment
| regulations. On every single regulatory aspect, German
| software startups have it easier. But they were never
| protected. US tech was allowed to waltz in, dump their
| products - that's what they did, it's hilarious how now
| China "dumping" EVs and solar is suddenly an issue when
| it's exactly the strategy that US tech continues to this
| day; the AI companies are doing it right now! And the
| Korean companies were protected. Both by the rules burden,
| that local companies had to deal with too, along with
| intentional protectionism.
|
| When it comes to solar and EVs, we all understand that a
| foreign country dumping their goods kills local industry.
| It's the exact same with software.
|
| But then half of HN has millions in the bank exactly thanks
| to the above - this is where all those fat SV salaries have
| come from - so I do get the lack of desire to understand
| it.
| whimsicalism wrote:
| > Their market share is almost 1:1 linearly correlated to
| the degree of protectionism in those countries
|
| Seems like you actually believe this. I think our
| starting points on reality are different enough that we
| are not going to have a productive conversation, I wish
| you and other Europeans the best of luck in your
| protectionism-led growth strategy. Make sure to not
| discuss it with any pesky macroeconomists who might lead
| you astray. take care
| deaux wrote:
| I've provided very specific cases that directly support
| this, you've so far provided nothing. This is a really
| poor comment.
| vanviegen wrote:
| You seem to have accidentally left the actual content out
| of your comment.
| coolewurst3000 wrote:
| Spotify is Swedish. Uber is irrelevant in many places in
| the EU due to protectionism.
| BDPW wrote:
| Spotify is not a US company.
| sofixa wrote:
| > Okay, what is the limiting factor
|
| A few.
|
| A big part is that the EU is a collection of countries that
| (with very few exceptions) have different languages and
| laws. For a company to serve Spain and France, for
| instance, it would need to translate everything, hire local
| lawyers and customer support agents. Considering the much
| smaller size of the countries (biggest one is 70 million vs
| 330 million in the US), the opportunity for "unlimited"
| growth is limited.
|
| This also rebounds in the fact that when an American
| company makes it big, they have the resources to flood
| other EU markets and be cheaper/better than the local
| competition due to economies of scale and money based on
| their big successful US market. A French company making it
| big is still small compared to a US equivalent.
|
| Then, there's the capital markets, no denying that. The
| money being thrown around the US is like nowhere else on
| the planet. Some of it definitely a bubble / unrealistic,
| but that doesn't matter. But _in part_ it's because of the
| size of the total potential market that this is justified.
|
| Education / national mythology also plays a part, I think
| (this is pure conjecture now). In the US, the "American
| Dream", "everyone can make it" etc is heavily ingrained. It
| propagates through the world with the help of Hollywood and
| other American cultural exports. In most EU countries,
| there isn't such a heavy emphasis on independence and
| "pulling yourself up by your bootstraps". "Hustle culture"
| isn't a thing. So for most people, it isn't something that
| comes naturally to them to start a company and work 100
| hour weeks to be big and rich and successful and famous.
|
| That's not to say there aren't such people, I went to 42
| and have been to Station F and know some people in that
| universe. A decent proportion of my classmates wanted to
| make their startup and make it big, and some did end up
| starting their own companies.
| deaux wrote:
| > This also rebounds in the fact that when an American
| company makes it big, they have the resources to flood
| other EU markets and be cheaper/better than the local
| competition due to economies of scale and money based on
| their big successful US market. A French company making
| it big is still small compared to a US equivalent.
|
| Ding ding ding! When China does it with solar and EVs we
| call it "dumping". When Uber, OpenAI and Anthropic do it,
| that term is never ever used. VC-funded US tech dumps
| harder than any Chinese industry ever has.
| carlosjobim wrote:
| > Considering the much smaller size of the countries
| (biggest one is 70 million vs 330 million in the US), the
| opportunity for "unlimited" growth is limited.
|
| If you manage to get 10 million customers, your business
| is already successful on a gigantic scale, and you should
| have all the know-how in taking on the world. The success
| of other people is rarely the reason why you are failing
| in your own life. Start somewhere, do something.
|
| > The money being thrown around the US is like nowhere
| else on the planet.
|
| That's true and it's awesome. In Europe money is only
| thrown to real estate owners and any enterprising people
| with a dream are cordially invited to fucking forget
| about it, shut up, and fall back in line. Even if they
| already have a proven track record. They take their idea
| to the United States and are treated incredibly well in
| comparison. Even if their business will only be a niche
| business with limited reach, like 99% of businesses.
| clickety_clack wrote:
| It's probably the people who didn't start a business in the
| EU that you want to talk to. Like, I'm European, but I
| started my company in the US because everything is so much
| easier here.
| lukan wrote:
| What would you want to see changed to consider coming back?
| clickety_clack wrote:
| When I got here, I realized that things are so much
| better here that the only thing that could get me back to
| Europe is a decision not to renew my visa.
| sofixa wrote:
| > but I started my company in the US because everything is
| so much easier here
|
| Which part is easier? That you have 50 different states
| with slightly varying laws to consider (e.g. Californian
| Data protection)? That you have a byzantine system of
| "benefits" to choose and manage?
|
| And compared to where? Germany or Estonia or Sweden or
| Spain? The complexities will vary wildly depending on the
| country (kind of like in the US, where lots of companies
| pick the state to base themselves in based on the
| combination of favourable laws and precedents and taxes).
| whimsicalism wrote:
| "That you have 50 different states with slightly varying
| laws to consider (e.g. Californian Data protection)?"
|
| there are certain sentences you can just tell would never
| be written by an American lol
| sofixa wrote:
| Got me, I'm not American, but isn't it true?
|
| California Consumer Privacy Act is a thing you need to
| take into account for Californian customers.
|
| Illinois has a Biometric Privacy Act.
|
| And who knows what Wyoming or South Dakota or Oregon have
| that you might take into account if your business falls
| under any of them.
| whimsicalism wrote:
| we might be somewhat trending in this direction, but the
| reality is largely that the US states are pretty
| identical and have very similar laws on the books. the
| federal government is in charge of commerce usually.
|
| most laws like CCPA also have some threshold where you
| already need to be pretty successful for it to apply to
| you.
|
| for some select industries (biometrics & healthcare), yes
| you have a patchwork of laws.
| deaux wrote:
| Where in Europe and where in the US? You probably started
| one in the easiest US state to do so. Did you try starting
| one in the easiest EU state? Otherwise we already can't
| take things very seriously.
|
| Secondly, what's easier besides VC funding? If it's VC
| funding, the disparity there has nothing to do with
| regulations - guess how much VC funding the non-EU rest of
| the world gets.
| clickety_clack wrote:
| I'm actually bootstrapping, so the VC situation isn't
| relevant to me.
|
| It's a distant memory to me now, I'm building a company
| and so much has happened that the details of this
| decision have faded away. But, between the AI act and
| GDPR, there's a set of potential traps laid out for you
| to step into, along with reams of paperwork. All that
| requires lawyers and compliance consultants to help you
| figure it out, and that's way too much for a fledgling
| startup.
|
| I think it said it all that the AI regulations were
| written before there was really anyone to regulate. Why
| would I want to pour my heart and soul into a system
| that's geared to find ways to stop me from building?
|
| Anyway, it's no longer relevant to me: I'm gone and I
| don't have to worry about it anymore.
| neoromantique wrote:
| Hi! EU Resident here, if anything, I'd want EU
| protections to apply even more to US companies than they
| do now.
|
| I don't want to exchange my freedoms for your shareholder
| value, thank you.
| pier25 wrote:
| > _When you talk to most EU business owners, even in tech,
| the limiting factor isn't regulations._
|
| I have a tech startup in Estonia and I agree. To me the
| biggest limiting factor is lack of funding.
| moffkalast wrote:
| Yep, VCs don't exist here. Plus the absurd starting costs,
| it's like what, 20k to set up a GmbH?
| troupo wrote:
| 2.5k EUR in starting capital, and two founders to start a
| limited liability company (AB) in Sweden, and a 240 EUR
| processing fee: https://verksamt.se/starta-foretag/valj-
| foretagsform/aktiebo...
|
| And you register online.
| pier25 wrote:
| Depends on the country.
|
| Opening a company in Estonia is very cheap but in Spain
| the manager/CEO needs to be an "autonomo" (like a self-
| employed tax status). This costs thousands of Euros per
| year. Something like 2,400-30,000 Euros per year, every
| year, forever.
| troupo wrote:
| And that's probably one of the big obstacles in the EU:
| there's no common ground for these things. At least this
| will hopefully be addressed:
| https://www.reuters.com/business/eu-propose-uniform-
| rules-st...
| vanviegen wrote:
| What does it matter that the rules for establishing
| differ per country? I'm only founding in one of them.
|
| The article is unclear, but is probably referring to
| making it easier for startups to offer products in other
| EU countries.
| troupo wrote:
| The idea is to establish common rules to make it easier
| to register and move startups between countries, among
| other things.
|
| It's in very early stages, so info is very scattered.
| More info, for example, here:
| https://www.loyensloeff.com/insights/news--
| events/news/the-2...
| greg_V wrote:
| Tbh, a lot of EU protectionism vs. US tech seems not to keep
| the competition out. In fact, with the amount of free press
| US startups get and the size of their coffers, they can
| simply roll over the local competition in EU markets most of
| the time.
|
| What it's terribly good at is adding burdens that the US
| giants don't face early on, slowing down the early growth
| between 28 fragmented markets. I don't know specifically
| about how China works, but the question is proving product-
| market fit, and for that, you need a lot of users fast.
|
| In the EU, it's a different battle country to country as the
| media environment, the markets, the regulation etc. are all
| fractured.
| dzikimarian wrote:
| While the grant process in the EU isn't fun, I think Levels has
| a bit of an ego issue. He mentioned that if he had issues like
| that on e.g. X, he would see Elon himself in the replies.
|
| While he is great at converting his influencer status to income
| in his micro-SaaS projects, I don't think running ad-fueled
| browser games on a state-sponsored supercomputer should really
| be the aim of these grant programs.
| whimsicalism wrote:
| I'm actually no fan of his, so that's fine. That said, I went
| to the actual website he was talking about (I'm also an EU
| citizen) and in this case it is exactly as described and
| bordering on comical.
| troupo wrote:
| It's not even close to how he described it.
| drexlspivey wrote:
| There's a screen recording at the bottom
| troupo wrote:
| I have the same answer as here:
| https://news.ycombinator.com/item?id=45735738
| alecco wrote:
| He is 100% right on this one. From personal experience trying
| to figure out EU. Lawyer bureaucrats manage funds behind red
| tape clearly meant to be for their pals.
|
| All this while the EU is running out of funds and in a
| process of de-industrialization. There should be an
| independent corruption investigation on Brussels.
| dzikimarian wrote:
| I took part in application for EU grants a few times and
| our company group did it many times over the years.
|
| It's bureaucracy, often bordering on stupidity. You may
| need advisors to navigate all their forms & processes. But
| it certainly isn't "pals-only" type of deal.
|
| On the other hand - is it harder than getting VC funding?
| For a seasoned founder with a reputation - probably. For a
| fresh startup - probably not.
| whimsicalism wrote:
| > For fresh startup - probably not.
|
| highly doubt it, the whole thing about the success of the US
| west coast is that they are and were willing to fund unproven
| upstarts.
| array_key_first wrote:
| Right but if we do this with public funds then the
| narrative shifts to "OMG the EU is so corrupt and stupid,
| look, they're pouring taxpayer dollars into unproven
| stuff! They're deindustrializing!!"
|
| The point being that, as soon as public dollars are on
| the table, people expect perfection. Anything less is
| waste, fraud, and abuse.
|
| There's literally no winning. Want to make sure the money
| is allocated right? Bureaucracy. Want to not do that?
| Waste, fraud, and abuse.
| carlosjobim wrote:
| The winning move is that governments should do government
| stuff and private capital should do private capital
| stuff. Startups belong to the latter.
| sealeck wrote:
| > Startups belong to the latter.
|
| Except that Apple, Intel, Tesla, etc have all received US
| government investment [1]. TSMC is a product of the
| Taiwanese state! Government investment can be done well,
| and seeds excellent companies.
|
| [1]: https://www.sba.gov/blog/2024/2024-02/white-house-
| sba-announ...
| carlosjobim wrote:
| It doesn't matter if government funded startups have been
| successful. It's not the government's job to provide
| capital to high risk ventures. They should provide public
| services for the people and regulate the private sector
| according to the interest of the people.
| jacobgorm wrote:
| Denmark has a large hearing aids industry due to lots of
| government funding for hearing aids, and a large wind
| turbine industry due to funding for wind farms. So
| stimulating demand can work to build or strengthen an
| industry, but what Denmark and EU are doing with GPUs is
| stimulating supply in Europe and demand in the US. I
| would be surprised if that does not end up strengthening
| US and not EU industry.
| radarsat1 wrote:
| That's exactly the problem in Europe though. It's quite
| the opposite here.
| alecco wrote:
| Someone told me I needed to hire some expensive law firm
| in Brussels. See:
|
| https://www.politico.eu/article/ombudsman-slams-
| commission-f...
| jll29 wrote:
| Reviewer (Scientific Expert) for the EU (since 2009)
| here.
|
| The probability of getting a Horizon Europe grant
| allegedly (not official stats) is about 8.5% according to
| some friends, which may seem low. You need to write 70
| pages following a Word template and the key goal is to
| cover answers to a large number of questions. Each
| proposal gets various grades across a range of
| dimensions, which get added up and if you obtain at least
| 13 out of a possible 15 points, you are eligible to get
| funded, read: "You will get funded if there is enough
| money." Often, there are several proposals that justly
| achieve 15/15, and because of that, many proposals that
| have 14 points and all proposals that have fewer may not
| get funded, simply because there just is not enough total
| funding available to fund all the technically eligible
| proposals. Having judged many proposals in AI / ML /
| search / "big data" / language technology etc. I
| recommend optimizing recall, i.e. aspiring to completeness.
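|
| The threshold-then-budget logic described above can be sketched
| roughly as follows. This is an illustration only, not the
| Commission's actual procedure: the `Proposal` class, the
| three-criteria split of the 15 points, the cost figures and the
| strict walk down the ranking until the money runs out are all
| assumptions made for the example.

```python
from dataclasses import dataclass

THRESHOLD = 13  # minimum total out of a possible 15, as stated in the comment


@dataclass
class Proposal:
    name: str
    scores: tuple  # per-dimension grades; three dimensions of 0-5 is an assumption
    cost: float    # requested funding, in hypothetical units

    @property
    def total(self):
        return sum(self.scores)


def select_funded(proposals, budget):
    """Return names of funded proposals: eligible ones, best first, while money lasts."""
    eligible = [p for p in proposals if p.total >= THRESHOLD]
    eligible.sort(key=lambda p: p.total, reverse=True)  # stable sort keeps input order on ties
    funded = []
    for p in eligible:
        if p.cost > budget:
            break  # assumption: stop at the first proposal the remaining budget cannot cover
        funded.append(p.name)
        budget -= p.cost
    return funded
```

| With a budget that covers only the top two of three eligible
| proposals, the third stays unfunded despite clearing the
| threshold, which is the situation the comment describes.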
|
| The application process is not easy, but you can get
| help: there are support agencies in each member country,
| free online Webinars to help, hotline help desks as well
| as an ecosystem of paid consultants that typically charge
| about 3kEUR to vet a proposal for you if you need that
| kind of service (I never used it).
|
| The process is neutral and conducted professionally and
| with external oversight (consultants are hired as
| "rapporteurs" that report on process/procedural integrity
| in addition to the actual reviewers). I value the
| research officers of the EC as people of high competency,
| integrity and motivation (research money is taxpayers'
| money so it should be spent carefully).
|
| In comparison, VC (and even more so business angel)
| funding is achievable with much less formal apparatus,
| often a short business plan and a convincing slide deck
| and demo can get people to a partner meeting if the time
| is right. But the criteria and process are much
| different, and ideas ready for public research grants are
| typically too early for VCs (but the EC wants to foster
| the creation of VC-funded startups resulting from the
| disseminated research).
| alecco wrote:
| Can you confirm (or not) the mandatory female co-founder?
| I could swear I read it. Or could it be another EU fund?
| jacobgorm wrote:
| Someone should build a startup that uses the EuroLLM to
| generate EU funding proposals.
| bjourne wrote:
| Of course there is red tape. EU funding comes from taxpayer
| money and we want it to be spent wisely. The red tape is
| precisely to prevent it from being funneled to pals. EU has
| funded quite a few free software projects so it's not like
| the red tape is an insurmountable burden:
| https://www.ri.se/en/news/blog/europes-digital-future-
| spells...
| notahacker wrote:
| I'd also say that their grants _aren't_ unusually
| burdensome and grantmaking is arm's length compared with a
| lot of other bodies.
|
| Yes, some of the questions are weird, but I'd really
| rather write a bit confirming that the AI system being
| developed isn't going to be racist or Skynet than jump
| through some other hoops that exist (and that absolutely
| includes VC due diligence). The actual biggest issue with
| European funds is they get way more competent
| applications than they can fund anyway.
| tinco wrote:
| Yeah no, it's just not how it works. They're trying to support
| fundamental research and they have limited resources to
| accomplish that. Some random dude who wants to build a company
| that generates pretty AI pictures is just not the target
| audience, and he rightly got rejected.
|
| And frankly, the dream scenario that Pieter describes where he
| somehow would qualify for these resources also wouldn't help
| kickstart the tech industry, and it's also not how it works in
| the states.
|
| What does help, and what European governments (at least the one
| in The Netherlands that Pieter is from) actually do, is more
| funding for startups. If you're a startup founder in NL almost
| every angel you talk to has a matched funding deal with the
| government. That's such a smart way of keeping up with the US.
| Do you think US startups get free compute from the government?
| They don't even get subsidies most of the time. What they get
| is better funding because there's more capital available, and
| helping investors with that is exactly how you solve that.
| logifail wrote:
| > What does help, and what European governments (at least the
| one in The Netherlands that Pieter is from) actually do, is
| more funding for startups. If you're a startup founder in NL
| almost every angel you talk to has a matched funding deal
| with the government. That's such a smart way of keeping up
| with the US.
|
| Does government offering matched funding to investors
| actually help startups who are struggling to find (any)
| funding? If a startup can't find (any) funding, matching is
| irrelevant.
|
| > Do you think US startups get free compute from the
| government? They don't even get subsidies most of the time.
| What they get is better funding because there's more capital
| available, and helping investors with that is exactly how you
| solve that.
|
| Umm. I'm not really convinced that the political elites in
| Europe understand how to do any of this stuff well.
|
| See also: https://www.eib.org/en/publications/online/all/the-
| scale-up-...
| whimsicalism wrote:
| I don't think what you're saying is inconsistent with what
| I'm saying. I think you are making a big deal out of the
| difference between state investment funds and subsidized GPUs
| but I think they basically work by similar mechanisms.
| softwaredoug wrote:
| Is the point of these policies to pick winners? Or to upskill
| the creators and stimulate the economy by giving possible
| entrepreneurs experience Europeans can't get in big tech?
|
| In the US, some ex-Googler might found a startup. Europe
| doesn't have the equivalent of FAANG. (Europe-wide companies
| are not quite as easy as US-wide)
|
| Even if the super computer itself "fails", is the goal actually
| the secondary impacts to the economy?
|
| (And in the US, we do our own fair share of picking winners /
| losers, especially in the current regime)
| troupo wrote:
| Levels is engagement farming. Instead of uncritically reposting
| him you could've gone ahead and read what the cluster is for:
| https://x.com/dmitriid/status/1982927767286231403
|
| Cluster: for public benefit, cutting edge research in biotech,
| medical, robotics.
|
| Levels: I want to create AI photos of people for my AI Slop
| startup
| whimsicalism wrote:
| > Cluster: for public benefit, cutting edge research in
| biotech, medical, robotics.
|
| That's not what the quoted paragraph says and you can read
| the whole release if you want: https://ec.europa.eu/commissio
| n/presscorner/detail/en/ip_25_...
| troupo wrote:
| I literally quoted the paragraph from this link in the
| tweet I provided: _Edit_: lol, I didn't, I quoted it from
| a policy document, not from press release. However, my
| point stands:
|
| --- start quote ---
|
| Apply AI Strategy
|
| The Apply AI Strategy aims to harness AI's transformative
| potential by driving adoption of AI across strategic and
| public sectors including healthcare, pharmaceuticals,
| energy, mobility, manufacturing, construction, agri-food,
| defence, communications and culture. It will also support
| small and medium-sized enterprises (SMEs) with their
| specific needs and help Industries integrate AI into their
| operations.
|
| --- end quote ---
|
| I also quoted a paragraph from a document I will find when
| I'm not on mobile.
|
| Levels literally wants to train AI Slop:
| https://x.com/levelsio/status/1981499900266193028
|
| --- start quote ---
|
| Train a foundational model for AI photos of people
|
| --- end quote ---
| IshKebab wrote:
| Seems like your quote was very misleading to me, so no
| your point doesn't stand.
| troupo wrote:
| > Seems like your quote was very misleading to me, so no
| your point doesn't stand
|
| My quote: Cluster: for public benefit, cutting edge
| research in biotech, medical, robotics.
|
| Literal quote from your link: The Apply AI Strategy aims
| to harness AI's transformative potential by driving
| adoption of AI across strategic and public sectors
| including healthcare, pharmaceuticals, energy, mobility,
| manufacturing, construction, agri-food, defence,
| communications and culture.
|
| You: your quote was misleading.
|
| I'm sorry, I don't have the time or the patience with
| willfully ignorant and blind people getting their
| interpretations from AI slop engagement farmers.
|
| Adieu
| IshKebab wrote:
| Yeah you just demonstrated how it was misleading - by
| omitting half the categories, especially communications
| and culture.
|
| > I'm sorry, I don't have the time or the patience with
| willfully ignorant and blind people getting their
| interpretations from AI slop engagement farmers.
|
| Riiight.
| fvdessen wrote:
| Unfortunately the AI Slop is probably the most effective way
| to fund AI research right now
| tsimionescu wrote:
| But the point here isn't to fund AI research, it is to use
| AI to benefit concrete fields.
| troupo wrote:
| By funding AI slop, you're funding AI slop, not AI
| research, or, quote, "drive adoption of AI across strategic
| and public sectors including healthcare, pharmaceuticals,
| energy, mobility, manufacturing, construction, agri-food,
| defence, communications and culture"
| antman wrote:
| What are the effects of a pick-the-winner strategy? Sounds
| intriguing
| saubeidl wrote:
| This guy spreads FUD about the "unelected commission". What a
| loon.
| cbeach wrote:
| The EU Commission is appointed, not elected. Only the
| Parliament (MEPs) are elected.
|
| What's worse, the parliament cannot originate law. Only the
| unelected Commission can do so. And they can do it behind
| closed doors. This is a setup that's ripe for corruption.
| vanviegen wrote:
| It's true that they are appointed.
|
| However, they're appointed by the EU Council (the heads of
| state, most of them elected, some appointed by a national
| parliament), and approved by the (elected) European
| Parliament.
|
| At the cost of some transparency, this does make it
| possible to select a bit more for management skills instead
| of just campaigning skills.
| saubeidl wrote:
| The Commission is appointed by elected officials. That's
| the same way the US presidency works. It's also how the UK
| PM role works, or any minister in pretty much any
| democratic government. All of those are still referred to
| as "elected" in common parlance.
| mezod wrote:
| Of course Catalan isn't in the list. 10 million speakers that
| don't matter to the European Union. EU likes our productivity but
| squanders our rights. We are 2nd class citizens.
|
| Now let's wait for the people saying "Spain" could change this.
| Hypocrites.
|
| Cultural genocide at its best.
| whimsicalism wrote:
| yeah best to lean in more on national and linguistic
| fragmentation, diversity has always been one of the EU's
| strengths
| mezod wrote:
| if that's the argument, let's drop all the languages and
| focus on English :)
| jimbob45 wrote:
| You've got to pick one as a lingua franca. English is
| already popular but Spanish, French, or Esperanto would all
| work just fine.
| ks2048 wrote:
| Catalan is included. It's called one of the "11 additional
| languages" in the paper.
| fulafel wrote:
| See also Apertus: https://www.swiss-ai.org/apertus
| sherinjosephroy wrote:
| That's a cool idea -- training a multilingual model like that is
| ambitious. But I'm curious how well it'll actually handle smaller
| EU languages compared to English or French. If it truly nails
| those, that's a big win for accessibility.
| pembrook wrote:
| All the models from all the big providers (even the Chinese
| models!) support all of these languages already.
|
| The big win for accessibility has already been won...3 years
| ago.
| viktorcode wrote:
| I asked a Finnish person how good an answer about the
| language example from ChatGPT was. It turned out to be a
| hallucination, confident-sounding nonsense.
|
| The quality of internet-trained models degrades very fast as
| the amount of material in a language shrinks.
| KronisLV wrote:
| Here are the models: https://huggingface.co/utter-project/models
|
| I used the 9B Instruct version; among the small models, it was
| the one with the best Latvian knowledge out there, bar none.
| GPT-OSS 20B and Qwen3 30B A3B and similar ones weren't even
| close.
|
| That said, the model itself was a little bit dumb and not
| something you'd really use for programming/autocomplete or tool
| calling or anything like that, which also presented some
| problems. Even for processing text, if you need RAG or tool
| server calls, you need to use something like Qwen3 for the
| actual logic and then pass the contents to EuroLLM for
| translation/formatting along with the instructions. At that
| point your n8n workflow looks a bit messy, and you also have to
| run two models instead of only one.
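|
| The two-stage setup described above (one model for the logic, a
| second one for the language) might look roughly like this. It is
| only a sketch: `reason` and `localize` are hypothetical
| callables standing in for whatever actually talks to Qwen3 and
| EuroLLM (an n8n node, an HTTP client, a local runtime), and the
| prompt wording is made up for illustration.

```python
from typing import Callable


def answer_in_latvian(
    question: str,
    reason: Callable[[str], str],    # hypothetical: wraps the model doing RAG/tool logic
    localize: Callable[[str], str],  # hypothetical: wraps the model doing translation
) -> str:
    """Run the logic model first, then hand its draft to the translation model."""
    draft = reason(question)
    # The instructions travel together with the draft, as the comment describes.
    prompt = (
        "Translate the following text into Latvian, keeping the formatting:\n\n"
        + draft
    )
    return localize(prompt)
```

| Keeping both stages behind plain callables at least makes it
| easy to swap either model, though it does not remove the cost of
| running two of them.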
|
| Meanwhile, the best cloud model for Latvian that I've found so
| far was Google Gemini 2.5 Pro, but obviously you can't use cloud
| models in certain on-prem use cases.
| jim180 wrote:
| If I ask something in Lithuanian, EuroLLM will reply in Latvian
| lol.
|
| I have to specifically say something like "do you know the
| Lithuanian language", then it starts replying in Lithuanian
| sublimefire wrote:
| It seems there is some weird grouping of the language data
| which the LLM cannot distinguish well. I wonder if it is the
| same for other similar languages like Scandinavian or West
| Slavic
| Steen3S wrote:
| If multi-lang is the goal, why not translate the output of the
| big labs?
| sublimefire wrote:
| Surely that would need to be both input and output. But even
| then you could easily get lost in translation as the intent in
| one language might mean a slightly different thing in another.
| Thus you could get subpar results.
| layer8 wrote:
| Because there is always something lost in translation.
| bogtog wrote:
| They report benchmarks on the huggingface page
| (https://huggingface.co/utter-project/EuroLLM-9B)
|
| They almost exclusively compare their model to prior models from
| 2024 or older and brag about "results comparable to Gemma-2-9B".
| I'm not sure what I expected. The eurollm.io homepage states
| "EuroLLM outperforms similar-sized models", which just seems like
| a lie for all practical purposes
|
| An overly charitable interpretation is that EuroLLM isn't a
| reasoning model and has minimal post-training, so they sought out
| comparisons to such models (they're still ignoring reasoning
| models that have non-reasoning modes)
| aeontech wrote:
| > They almost exclusively compare their model to prior models
| from 2024
|
| As another comment here noted, the title is missing (2024) -
| this model was released almost a year ago, last December, so
| it's not surprising that those are the models they compare to.
| extraduder_ire wrote:
| From the EuroLLM-9B page on Hugging Face:
|
| >You need to agree to share your contact information to access
| this model
|
| Is this common? I've never seen it on the site before, and it
| isn't on the smaller model. What are they collecting this
| information for?
| ks2048 wrote:
| I'm not sure which models require this and why, but I've come
| across it, e.g. with the Llama models:
| https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct
| geretnal wrote:
| Finally!
| sireat wrote:
| It is interesting how much traction this 9B model is getting,
| which is good.
|
| Still, two months earlier a 30B-parameter model covering 19
| European languages got almost no mention:
|
| https://huggingface.co/TildeAI/TildeOpen-30b
|
| Mind you, that is another open model that is begging for
| fine-tuning (it is not very good out of the box).
| websku wrote:
| I'm looking to try this for ActorDO
| dostick wrote:
| What good does it do to include only official languages? For
| example there's no Russian, while there are now at least 8
| million ethnic Russians living in Europe.
| imcritic wrote:
| Today's Russians are 1935's Jews: Nazis want to cancel Russians
| and everything Russian as much as possible.
| isodev wrote:
| Off topic, but it's absolutely stunning how Russia once fought
| the Nazis and now Russia are the Nazis.
| Ylpertnodi wrote:
| I thought it was Ukraine?
| notahacker wrote:
| tbf, the USSR fought the Nazis mainly because they didn't
| have much choice after the Nazis turned on them a little while
| after they'd teamed up with those ideological enemies to
| invade Poland, so it's not like they hadn't put the effort
| into being on the wrong side of history :)
| isodev wrote:
| Indeed, we had a history teacher who used to joke about
| Russia being a "historical bully" in every age since
| they've been on the map.
| simion314 wrote:
| Oh, typical Ruzzian victim complex, their brain can't
| understand why all their neighbors and "brother Slavs" hate
| them, brainwashing for generations made them think in
| unnatural logic where you need to negate anything a Ruzzian
| says and then you increase the probability 100 times to be
| the truth.
|
| Ruzzian = a Russian Zed patriot; we use this notation to
| acknowledge that there still exists a small percentage of
| educated Russians that are not Ruscists.
| ks2048 wrote:
| From the paper:
|
| As the aim of EuroLLM is to provide EU citizens with powerful
| and useful AI tools, it is critical that the model can also
| translate and answer questions in other European and non-
| European languages. With this in mind, we added support for 11
| additional languages (Arabic, Catalan, Chinese, Galician,
| Hindi, Japanese, Korean, Norwegian, Russian, Turkish, and
| Ukrainian).
| layer8 wrote:
| Perfect is the enemy of the good.
| wildredkraut wrote:
| Wow, this site, logo and everything are so ugly. But the
| fax-styled photos fit Europe's deficit well.
| fodkodrasz wrote:
| Excellent goal, I hope they make it a success!
| memet_rush wrote:
| Hopefully Albanian is added one day!
| ks2048 wrote:
| Their home page has a link labeled "Technical Report for
| EuroLLM", but it points to the same page as their other link,
| the release article on Hugging Face.
|
| I suppose that's a typo and I found a technical report here:
| https://arxiv.org/abs/2506.04079
| johnjames87 wrote:
| I prefer proprietary LLMs that are actually good products -
| byproducts of free market competition (capitalism), instead of
| products created from govt initiatives that lead nowhere (good).
| zoobab wrote:
| Can we add Gaumais to the list? I asked Llama3 how to translate
| French to Gaumais, and it was pretty good at it.
|
| https://fr.wikipedia.org/wiki/Gaumais
| Ylpertnodi wrote:
| All the different Italian dialects, patois in French,
| Schwäbisch...
| cess11 wrote:
| In this vein there's also the recent Swiss Apertus.
|
| https://www.swiss-ai.org/apertus
| adt wrote:
| The EuroLLM-9B model release is from Dec/2024, and scores just
| above random chance for benchmarks like MMLU-Pro (17.6%, random
| chance is 10%).
|
| Comparison with similar EU models + 600 other highlights:
|
| https://lifearchitect.ai/models-table/
| danielam wrote:
| Curiously, just came across this paper [0].
|
| [0] https://arxiv.org/abs/2503.01996
| ph4evers wrote:
| How does it compare to Mistral's model?
| trilogic wrote:
| Great job, thank you.
|
| We support your work and offer backup and distribution. Here is
| a copy, just in case:
| https://hugston.com/uploads/llm_models/EuroLLM-22B-Instruct-...
| supermatt wrote:
| > It is fully open source and available via Hugging Face.
|
| This model was released in 2024, and I couldn't find any links to
| the training data - is it just an open weights model?
| rmoriz wrote:
| Maybe we can call it "open weights" and not open source?
| Zufriedenheit wrote:
| EU officials should create an environment where plenty of
| private companies can afford to put out great open models,
| instead of funding some selected individuals with taxpayer money.
| hebejebelus wrote:
| Some cursory clicking about didn't reveal to me the actual corpus
| they used, only that it is several trillion tokens 'divided
| across the languages'. I'm curious mainly because for Irish
| (among some other similarly endangered languages on the list)
| any large corpus typically comes from legal/governmental texts
| that are required to be translated. There must surely be only a
| relatively tiny amount of colloquial Irish in the corpus. It
| would be interesting to see some evals in each language,
| particularly with native speakers.
|
| I think LLMs may be on the whole very positive for endangered
| languages such as Irish, but before it becomes positive I think
| there's an amount of danger to be navigated (see the Scots Gaelic
| Wikipedia drama for example).
|
| In any case I think this is a great initiative.
___________________________________________________________________
(page generated 2025-10-28 23:00 UTC)