[HN Gopher] The missing catalogue: why finding books in translat...
___________________________________________________________________
The missing catalogue: why finding books in translation is still so
hard
Author : AusiasTsel
Score : 32 points
Date : 2026-04-14 11:35 UTC (3 days ago)
(HTM) web link (blogs.lse.ac.uk)
(TXT) w3m dump (blogs.lse.ac.uk)
| AusiasTsel wrote:
| Author here. The piece is about bibliographic infrastructure, but
| the finding that surprised me most while building the dataset was
| language-specific: Catalan/Valencian (~10M speakers) jumped from
| near-invisibility in commercial aggregators to 8th place globally
| once nine national library catalogues were cross-referenced.
| Bengali, Thai and Urdu -- all with substantial publishing
| industries -- remained near the bottom, not because translations
| don't exist but because the institutions documenting them haven't
| been connected yet. The 97% figure (editions appearing in only
| one of 14 sources) held across every sample I could run. Happy to
| answer questions about methodology, source coverage, or why ISBN
| metadata is such a mess.
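
To make concrete what a figure like the 97% measures, here is a minimal sketch of a single-source share calculation. It assumes each record is a (source, edition key) pair and that editions have already been matched across catalogues. The source names, edition keys, and the function name `single_source_share` are hypothetical and are not taken from the author's dataset or method.

```python
from collections import defaultdict


def single_source_share(records):
    """Return the fraction of distinct editions seen in exactly one source.

    `records` is an iterable of (source, edition_key) pairs. How editions are
    matched across catalogues is outside this sketch and is assumed to be
    encoded in `edition_key`.
    """
    sources_by_edition = defaultdict(set)
    for source, edition_key in records:
        sources_by_edition[edition_key].add(source)
    if not sources_by_edition:
        return 0.0
    single = sum(1 for s in sources_by_edition.values() if len(s) == 1)
    return single / len(sources_by_edition)


# Toy data, invented for illustration only.
toy_records = [
    ("catalogue_a", "ed-1"),
    ("catalogue_b", "ed-1"),
    ("catalogue_a", "ed-2"),
    ("catalogue_c", "ed-3"),
    ("catalogue_c", "ed-3"),  # a repeat within one source still counts once
    ("catalogue_b", "ed-4"),
]
share = single_source_share(toy_records)
print(f"{share:.0%} of editions appear in a single source")
```

The result depends heavily on how edition keys are built: a stricter matching rule pushes the share up, a looser one pulls it down.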
| btrettel wrote:
| Have you all considered adding scientific articles to your
| bibliographic database? Finding existing translations of
| scientific articles can be a real pain. I know because I spent
| a lot of time doing that during my PhD [1].
|
| For a while I was collaborating with Victor Venema in the
| volunteer organization Translate Science [2] to try to create a
| bibliographic database of scientific translations, but
| unfortunately Victor died, and I became too busy to continue.
|
| [1] https://academia.stackexchange.com/a/93209/31143
|
| [2] https://translate-science.codeberg.page/
| tjirrkkkk wrote:
| Proper ISBN registration is a lot of unpaid, expensive work. If
| you do small print runs, you may end up sending something like
| 10% of all your copies to libraries at your own expense. Putting
| an unregistered PDF on the web is free...
| shermantanktop wrote:
| I deal with similar issues. Translation is sometimes thought of
| as a mechanical process, but it is a creative process where the
| translator's approach varies from subtle to heavy-handed. At some
| point the translation can be thought of as a new creative work,
| and that line is hard to define.
|
| One of my parents was a translator who worked directly with
| authors, and in the review process the author would expand or
| refine the text in ways that were not present in the original. At
| that point, which work is the true representation of the author's
| intent, the fixed original or the updated translation?
| gobdovan wrote:
| It's so interesting to think about how there are fewer 'Le Petit
| Prince' versions in French (of which there seems to be only one)
| than in Chinese, where there seem to be at least 50. [0]
|
| You could argue that there's more experimentation and creation in
| other languages than in the original, just because it's socially
| acceptable to do 'yet another translation', but not a newer
| version in the same language (unless it's a manual or other
| technical material).
|
| [0] https://www.cjvlang.com/petitprince
| mysterypie wrote:
| > it's socially acceptable to do 'yet another translation', but
| not a newer version in the same language
|
| I wish they'd teach with modern English translations of
| Shakespeare in high schools. Maybe then kids would like it a
| lot more. But it seems like it's taboo to read Shakespeare in
| anything but the original.
| lamasery wrote:
| They do. One series often used is "No Fear Shakespeare".
| Facing-page "translation", relatively cheap.
|
| It's much better to watch it performed, though. The context
| the actors provide gets one past much of the difficulty with
| vocabulary or what have you. But yeah they do insist on
| reading them in school.
|
| > But it seems like it's taboo to read Shakespeare in
| anything but the original.
|
| You're definitely losing most of the sublimity in his actual
| words, if you don't read the original. Especially if the
| "translation" is into English at e.g. a 9th-grade reading
| level.
|
| In the case of Shakespeare in particular (and also certain
| archaic translations of the Bible, notably the King James)
| modernizing/simplifying it may alter the language enough that
| the reader may not recognize unacknowledged (because _of
| course_ your reader will know their Shakespeare) quotes from
| his works in other works, which quotes are _everywhere_ even
| in things like modern popular cinema or TV. A big part of why
| you read Shakespeare to begin with is that his influence is
| so extensive that you practically have to, or you'll be
| missing one of a very few not just helpful, but
| nigh-necessary, keys to understanding the rest of English
| literature (broadly, to include things like movies and video
| games and TV and so on).
| AusiasTsel wrote:
| You're right that version and edition aren't the same thing,
| and the catalogues I'm working with don't model "translation"
| as a first-class field -- translator credits live in free-text
| author fields and are wildly inconsistent across national
| libraries. The cleanest proxy I can offer is distinct
| publishers per language, read alongside the edition count. For
| Le Petit Prince, top languages by edition count:
|
|   Language  Publishers   Editions   Ed/Pub
|   English          518      1,245      2.4
|   Spanish          416      1,055      2.5
|   Japanese         204        965      4.7
|   French           312        928      3.0   (original)
|   German           199        666      3.3
|   Italian          184        641      3.5
|   Chinese          233        361      1.5
|   ...
|   Hebrew             3        138     46.0
|
| Two caveats are visible in the table. Publisher names aren't
| normalized across catalogues, so high counts in big markets
| (English, French) are inflated by imprint variants of a single
| house -- Gallimard, Gallimard Jeunesse, Editions Gallimard,
| Folio all show up as distinct. At the other extreme, Hebrew
| with 3 publishers on 138 editions is the proxy's other failure
| mode: one or two canonical translations reprinted repeatedly.
| So the number is directional, not absolute.
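
The imprint-variant caveat can be illustrated with a minimal sketch of publisher-name normalization, followed by the editions-per-publisher ratio recomputed on the normalized names. The alias map, the prefix list, and the function names are hypothetical. Deciding which imprints roll up to which house is an editorial judgment, and nothing here is taken from the author's pipeline.

```python
import re
import unicodedata

# Hypothetical alias map: imprint -> parent house (an editorial assumption).
IMPRINT_ALIASES = {
    "gallimard jeunesse": "gallimard",
    "folio": "gallimard",
}

# Generic leading words stripped before comparison (illustrative only).
GENERIC_PREFIX = re.compile(r"^(editions|edition|ed)\s+")


def normalize_publisher(name):
    """Map a raw publisher string to a rough house-level key."""
    # Drop accents so that accented and unaccented spellings compare equal.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Lowercase, replace punctuation with spaces, collapse whitespace.
    key = re.sub(r"[^a-z0-9 ]+", " ", ascii_name.lower())
    key = re.sub(r"\s+", " ", key).strip()
    key = GENERIC_PREFIX.sub("", key)
    return IMPRINT_ALIASES.get(key, key)


def editions_per_publisher(publisher_names, edition_count):
    """Return (distinct publishers after normalization, editions per publisher)."""
    distinct = {normalize_publisher(n) for n in publisher_names}
    if not distinct:
        raise ValueError("need at least one publisher name")
    return len(distinct), edition_count / len(distinct)


# The variants named in the comment, plus an accented spelling.
raw = [
    "Gallimard",
    "Gallimard Jeunesse",
    "Editions Gallimard",
    "Folio",
    "Éditions Gallimard",
]
# The edition count of 12 is invented for the example.
count, ratio = editions_per_publisher(raw, 12)
print(count, ratio)
```

Collapsing imprints lowers the publisher count and raises the ratio, so the choice of alias map moves the proxy in a predictable direction for large markets.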
|
| The Chinese row is the cjvlang pattern in distilled form: 233
| distinct publishers with an edition-to-publisher ratio of 1.5
| means most Chinese publishers hold their own translation and
| reprint it only a handful of times before being displaced.
| That's consistent with -- and probably a conservative reading
| of -- cjvlang's "at least 50 versions" figure.
|
| One extra wrinkle worth flagging: "Chinese" in that row isn't
| one language. National library catalogues collapse at least
| five Sinitic languages -- Mandarin, Cantonese, Wu, Min Nan,
| Hakka -- under a single "zh" tag. Wikidata records separate
| Petit Prince translations in Cantonese, Wu, Hakka, and Min Nan,
| each with its own transliterated title ("Seu-Vong-Chu" in
| Hakka, "Sio Ong-chu" in Min Nan), but no national catalogue I
| pull from surfaces them as distinct. The same kind of collapse
| applies to Arabic, where "ar" hides Modern Standard plus
| several regional varieties that have their own literary
| traditions. So the 361 Chinese figure is already aggregating
| over a hidden second axis of variation.
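
A minimal sketch of how the collapsed "zh" tag could be split when an external hint is available. The ISO 639-3 codes for the five languages named above are standard. Everything else is an assumption for illustration: the `variety_hint` field, the record layout, and the function names are hypothetical, and the hint would have to come from outside the catalogues (for example from Wikidata), since the catalogues do not provide it.

```python
from collections import Counter

# ISO 639-3 codes for the five Sinitic languages named in the comment.
SINITIC_CODES = {
    "Mandarin": "cmn",
    "Cantonese": "yue",
    "Wu": "wuu",
    "Min Nan": "nan",
    "Hakka": "hak",
}


def refine_language(catalogue_tag, variety_hint=None):
    """Return a finer-grained code when an external hint allows it.

    Falls back to the catalogue's own tag when there is no usable hint,
    so records without extra information stay under "zh".
    """
    if catalogue_tag == "zh" and variety_hint in SINITIC_CODES:
        return SINITIC_CODES[variety_hint]
    return catalogue_tag


def editions_by_language(records):
    """Count editions per refined code from (catalogue_tag, variety_hint) pairs."""
    return Counter(refine_language(tag, hint) for tag, hint in records)


# Toy records, invented for illustration only.
toy_records = [
    ("zh", None),
    ("zh", "Mandarin"),
    ("zh", "Cantonese"),
    ("zh", "Hakka"),
    ("ja", None),
]
counts = editions_by_language(toy_records)
print(dict(counts))
```

Keeping the unrefined "zh" bucket visible rather than guessing a default variety makes it clear how much of the total is still aggregated.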
|
| Japanese tells a different story: slightly fewer publishers
| (204) but almost five editions each, suggesting fewer distinct
| translations reprinted more widely. And the French baseline is
| dominated by one rights holder (Gallimard family), which is
| what you'd expect from an original-language market with a
| single canonical publisher.
|
| Retranslation within the source language is gated by copyright
| (Berne + 70 years post-mortem is a hard wall for most 20th-
| century work), the industry's default assumption that one
| canonical edition per language is enough, and reader
| expectation of fidelity when the original is in your own
| language. Saint-Exupery entered public domain in France in 2015
| and the French retranslation flow didn't materially open up --
| which I read as the publisher-economics side of your point
| dominating over the legal side. Retranslation into foreign
| languages has none of those brakes: every generation can argue
| its predecessor's Chinese / Japanese / Korean Petit Prince is
| dated or was done from English rather than French (often true),
| and a new translation is a lower-risk bet than trying to
| displace a domestic novel.
|
| Shakespeare is the visible English counterexample: "no-fear"
| modernizations and facing-page editions exist precisely because
| the original has drifted far enough from contemporary English to
| be partly opaque. The Bible is the other obvious case. So
| "retranslation-within-language is taboo" breaks down once the
| time distance gets large enough -- roughly when the original
| stops being read without friction.
___________________________________________________________________
(page generated 2026-04-17 23:01 UTC)