[HN Gopher] The missing catalogue: why finding books in translat...
       ___________________________________________________________________
        
       The missing catalogue: why finding books in translation is still so
       hard
        
       Author : AusiasTsel
       Score  : 32 points
       Date   : 2026-04-14 11:35 UTC (3 days ago)
        
 (HTM) web link (blogs.lse.ac.uk)
 (TXT) w3m dump (blogs.lse.ac.uk)
        
       | AusiasTsel wrote:
       | Author here. The piece is about bibliographic infrastructure, but
       | the finding that surprised me most while building the dataset was
       | language-specific: Catalan/Valencian (~10M speakers) jumped from
       | near-invisibility in commercial aggregators to 8th place globally
       | once nine national library catalogues were cross-referenced.
       | Bengali, Thai and Urdu -- all with substantial publishing
       | industries -- remained near the bottom, not because translations
       | don't exist but because the institutions documenting them haven't
       | been connected yet. The 97% figure (editions appearing in only
       | one of 14 sources) held across every sample I could run. Happy to
       | answer questions about methodology, source coverage, or why ISBN
       | metadata is such a mess.
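
A minimal sketch of how a single-source share like the 97% figure could be computed, assuming each edition has some stable key and each record says which catalogue listed it. The keys, catalogue names, and the `single_source_share` helper are hypothetical and are not taken from the author's pipeline.

```python
from collections import defaultdict


def single_source_share(records):
    """Return the fraction of editions that appear in exactly one source.

    `records` is an iterable of (edition_key, source_name) pairs.
    Duplicate listings within the same source are counted once.
    """
    sources_by_edition = defaultdict(set)
    for edition_key, source in records:
        sources_by_edition[edition_key].add(source)
    if not sources_by_edition:
        return 0.0
    singles = sum(1 for s in sources_by_edition.values() if len(s) == 1)
    return singles / len(sources_by_edition)


# Toy data, invented for illustration only.
records = [
    ("ed-1", "catalogue_a"),
    ("ed-1", "catalogue_b"),
    ("ed-2", "catalogue_a"),
    ("ed-3", "catalogue_c"),
    ("ed-3", "catalogue_c"),  # duplicate listing in the same source
    ("ed-4", "catalogue_b"),
]
share = single_source_share(records)
print(f"{share:.0%} of editions appear in only one source")
```

The result depends heavily on how the edition key is built: if two catalogues describe the same edition with different identifiers, the edition is counted twice as single-source.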
        
         | btrettel wrote:
         | Have you all considered adding scientific articles to your
         | bibliographic database? Finding existing translations of
         | scientific articles can be a real pain. I know because I spent
         | a lot of time doing that during my PhD [1].
         | 
         | For a while I was collaborating with Victor Venema in the
         | volunteer organization Translate Science [2] to try to create a
         | bibliographic database of scientific translations, but
         | unfortunately Victor died, and I became too busy to continue.
         | 
         | [1] https://academia.stackexchange.com/a/93209/31143
         | 
         | [2] https://translate-science.codeberg.page/
        
       | tjirrkkkk wrote:
       | Proper ISBN registration is a lot of unpaid, expensive work. If
       | you print in small runs, you may have sent something like 10% of
       | all your copies to libraries at your own expense. Putting an
       | unregistered PDF on the web is free...
        
       | shermantanktop wrote:
       | I deal with similar issues. Translation is sometimes thought of
       | as a mechanical process, but it is a creative process where the
       | translator's approach varies from subtle to heavy-handed. At some
       | point the translation can be thought of as a new creative work,
       | and that line is hard to define.
       | 
       | One of my parents was a translator who worked directly with
       | authors, and in the review process the author would expand or
       | refine the text in ways that were not present in the original. At
       | that point, which work is the true representation of the author's
       | intent, the fixed original or the updated translation?
        
       | gobdovan wrote:
       | It's so interesting to think about how there are fewer 'Le Petit
       | Prince' versions in French (of which there seems to be only one)
       | than in Chinese, where there seem to be at least 50 versions. [0]
       | 
       | You could argue that there's more experimentation and creation in
       | other languages than the original just because it's socially
       | acceptable to do 'yet another translation', but not a newer
       | version in the same language (unless it's a manual or technical
       | material).
       | 
       | [0] https://www.cjvlang.com/petitprince
        
         | mysterypie wrote:
         | > it's socially acceptable to do 'yet another translation', but
         | not a newer version in the same language
         | 
         | I wish they'd teach with modern English translations of
         | Shakespeare in high schools. Maybe then kids would like it a
         | lot more. But it seems like it's taboo to read Shakespeare in
         | anything but the original.
        
           | lamasery wrote:
           | They do. One series often used is "No Fear Shakespeare".
           | Facing-page "translation", relatively cheap.
           | 
           | It's much better to watch it performed, though. The context
           | the actors provide gets one past much of the difficulty with
           | vocabulary or what have you. But yeah they do insist on
           | reading them in school.
           | 
           | > But it seems like it's taboo to read Shakespeare in
           | anything but the original.
           | 
           | You're definitely losing most of the sublimity in his actual
           | words, if you don't read the original. Especially if the
           | "translation" is into English at e.g. a 9th-grade reading
           | level.
           | 
           | In the case of Shakespeare in particular (and also certain
           | archaic translations of the Bible, notably the King James)
           | modernizing/simplifying it may alter the language enough that
           | the reader may not recognize unacknowledged (because _of
           | course_ your reader will know their Shakespeare) quotes from
           | his works in other works, which quotes are _everywhere_ even
           | in things like modern popular cinema or TV. A big part of why
           | you read Shakespeare to begin with is that his influence is
           | so extensive that you practically have to, or you'll be
           | missing one of a very-few not just helpful, but nigh-
           | necessary, keys to understanding the rest of English
           | literature (broadly, to include things like movies and video
           | games and TV and so on).
        
         | AusiasTsel wrote:
         | You're right that version and edition aren't the same thing,
         | and the catalogues I'm working with don't model "translation"
         | as a first-class field -- translator credits live in free-text
         | author fields and are wildly inconsistent across national
         | libraries. The cleanest proxy I can offer is distinct
         | publishers per language, read alongside the edition count. For
         | Le Petit Prince, top languages by edition count:
         | 
         | Language    Publishers   Editions   Ed/Pub
         | English            518     1,245     2.4
         | Spanish            416     1,055     2.5
         | Japanese           204       965     4.7
         | French             312       928     3.0   (original)
         | German             199       666     3.3
         | Italian            184       641     3.5
         | Chinese            233       361     1.5
         | ...
         | Hebrew               3       138    46.0
         | 
         | Two caveats are visible in the table. Publisher names aren't
         | normalized across catalogues, so high counts in big markets
         | (English, French) are inflated by imprint variants of a single
         | house -- Gallimard, Gallimard Jeunesse, Editions Gallimard,
         | Folio all show up as distinct. At the other extreme, Hebrew
         | with 3 publishers on 138 editions is the proxy's other failure
         | mode: one or two canonical translations reprinted repeatedly.
         | So the number is directional, not absolute.
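
A rough sketch of the proxy and its first caveat, using the figures from the table. The `IMPRINT_ALIASES` map and the helper names are hypothetical; a real alias list would have to be curated per catalogue, and this is not a description of how the dataset was actually built.

```python
# Figures copied from the table above: language -> (publishers, editions).
ROWS = {
    "English": (518, 1245),
    "Spanish": (416, 1055),
    "Japanese": (204, 965),
    "French": (312, 928),
    "German": (199, 666),
    "Italian": (184, 641),
    "Chinese": (233, 361),
    "Hebrew": (3, 138),
}


def editions_per_publisher(publishers: int, editions: int) -> float:
    """Edition-to-publisher ratio, rounded to one decimal as in the table."""
    return round(editions / publishers, 1)


# Hypothetical alias map: the four imprint variants named in the comment
# are folded into one house. Any other name passes through unchanged.
IMPRINT_ALIASES = {
    "gallimard": "Gallimard",
    "gallimard jeunesse": "Gallimard",
    "editions gallimard": "Gallimard",
    "folio": "Gallimard",
}


def normalize_publisher(name: str) -> str:
    """Collapse whitespace and case, then look the name up in the alias map."""
    cleaned = " ".join(name.split())
    return IMPRINT_ALIASES.get(cleaned.casefold(), cleaned)


def distinct_publishers(names) -> int:
    """Count publishers after imprint variants are folded together."""
    return len({normalize_publisher(n) for n in names})


ratios = {lang: editions_per_publisher(p, e) for lang, (p, e) in ROWS.items()}
variants = ["Gallimard", "Gallimard Jeunesse", "Editions Gallimard", "Folio"]
print(ratios)
print(len(set(variants)), "raw names ->", distinct_publishers(variants), "house")
```

Folding imprints lowers the publisher count and therefore raises Ed/Pub, so normalized and unnormalized ratios are not directly comparable across languages.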
         | 
         | The Chinese row is the cjvlang pattern in distilled form: 233
         | distinct publishers with an edition-to-publisher ratio of 1.5
         | means most Chinese publishers hold their own translation and
         | reprint it only a handful of times before being displaced.
         | That's consistent with -- and probably a conservative reading
         | of -- cjvlang's "at least 50 versions" figure.
         | 
         | One extra wrinkle worth flagging: "Chinese" in that row isn't
         | one language. National library catalogues collapse at least
         | five Sinitic languages -- Mandarin, Cantonese, Wu, Min Nan,
         | Hakka -- under a single "zh" tag. Wikidata records separate
         | Petit Prince translations in Cantonese, Wu, Hakka, and Min Nan,
         | each with its own transliterated title ("Seu-Vong-Chu" in
         | Hakka, "Sio Ong-chu" in Min Nan), but no national catalogue I
         | pull from surfaces them as distinct. The same kind of collapse
         | applies to Arabic, where "ar" hides Modern Standard plus
         | several regional varieties that have their own literary
         | traditions. So the 361 Chinese figure is already aggregating
         | over a hidden second axis of variation.
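
To illustrate the collapse, here is a small sketch that counts the same records twice: once by fine-grained ISO 639-3 code, and once after folding the five Sinitic languages named above into "zh". The record list is invented toy data and `catalogue_tag` is a hypothetical helper, not any catalogue's actual behaviour.

```python
from collections import Counter

# ISO 639-3 codes for the five Sinitic languages named in the comment.
SINITIC = {
    "cmn": "Mandarin",
    "yue": "Cantonese",
    "wuu": "Wu",
    "nan": "Min Nan",
    "hak": "Hakka",
}


def catalogue_tag(code: str) -> str:
    """Mimic a catalogue that files every Sinitic language under 'zh'."""
    return "zh" if code in SINITIC else code


# Toy edition records tagged with fine-grained codes (invented data).
edition_codes = ["cmn", "cmn", "cmn", "yue", "wuu", "nan", "hak", "ja"]

fine_grained = Counter(edition_codes)
collapsed = Counter(catalogue_tag(c) for c in edition_codes)

print("fine-grained:", dict(fine_grained))
print("collapsed:   ", dict(collapsed))
```

The mapping only runs one way: once a record carries nothing but "zh", the variety cannot be recovered from the tag and has to come from another field or another source, such as Wikidata.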
         | 
         | Japanese tells a different story: slightly fewer publishers
         | (204) but almost five editions each, suggesting fewer distinct
         | translations reprinted more widely. And the French baseline is
         | dominated by one rights holder (Gallimard family), which is
         | what you'd expect from an original-language market with a
         | single canonical publisher.
         | 
         | Retranslation within the source language is gated by copyright
         | (Berne + 70 years post-mortem is a hard wall for most 20th-
         | century work), the industry's default assumption that one
         | canonical edition per language is enough, and reader
         | expectation of fidelity when the original is in your own
         | language. Saint-Exupery entered public domain in France in 2015
         | and the French retranslation flow didn't materially open up --
         | which I read as the publisher-economics side of your point
         | dominating over the legal side. Retranslation into foreign
         | languages has none of those brakes: every generation can argue
         | its predecessor's Chinese / Japanese / Korean Petit Prince is
         | dated or was done from English rather than French (often true),
         | and a new translation is a lower-risk bet than trying to
         | displace a domestic novel.
         | 
         | Shakespeare is the visible English counterexample: "no-fear"
         | modernizations, facing-page editions, precisely because the
         | original has drifted far enough from contemporary English to be
         | partly opaque. The Bible is the other obvious case. So
         | "retranslation-within-language is taboo" breaks down once the
         | time distance gets large enough -- roughly when the original
         | stops being read without friction.
        
       ___________________________________________________________________
       (page generated 2026-04-17 23:01 UTC)