[HN Gopher] Fast and secure translation on your local machine wi...
       ___________________________________________________________________
        
       Fast and secure translation on your local machine with a GUI
        
       Author : Intralexical
       Score  : 141 points
       Date   : 2024-04-14 01:22 UTC (21 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | Intralexical wrote:
       | Interestingly, I think this is actually related to the offline
       | translation features built into Firefox. Both are products of
       | "Project Bergamot", but the Mozilla-maintained version was later
       | merged into the Firefox application:
       | 
       | https://browser.mt/
       | 
       | https://blog.mozilla.org/en/mozilla/local-translation-add-on...
       | 
       | https://hacks.mozilla.org/2022/06/training-efficient-neural-...
       | 
       | https://github.com/mozilla/firefox-translations
       | 
       | https://firefox-source-docs.mozilla.org/toolkit/components/t...
       | 
       | Extra webpage with screenshot and links, impossible to search for
       | normally:
       | 
       | https://translatelocally.com/downloads/
       | 
       | Does one thing and does it well.
       | 
       | Oh-- For downloading models, it's much easier to pipe/`xargs`
       | `translateLocally --available-models` into `translateLocally -d`
       | than go through the GUI.
       | 
       | ---
       | 
       | Other self-hostable translation tools:
       | 
       | https://www.apertium.org/index.eng.html
       | 
       | - Traditional rule-based translation. Seems to work pretty well,
       | but no good desktop frontend.
       | 
       | https://www.argosopentech.com/
       | 
       | - Works, but crashy desktop app.
       | 
       | https://libretranslate.com/
       | 
       | - API wrapping Argos Translate.
       | 
       | https://lingva.thedaviddelta.com/
       | 
       | - Google Translate scraper/privacy frontend.
       | 
       | https://euroglot.com/
       | 
       | - Proprietary, subscription trialware.
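
The download tip in the comment above (piping `translateLocally --available-models` into `translateLocally -d`) can be sketched in Python. This is a minimal sketch, not the tool's documented workflow: it assumes each listing line ends with a `-d <model-id>` hint, and the sample listing plus the names `model_ids` and `download_commands` are hypothetical, made up for illustration. It only builds the commands and does not run them.

```python
import re
import shlex

# ASSUMPTION: each line of `translateLocally --available-models` output
# ends with a hint such as "To download do -d cs-en-tiny". The sample
# below is illustrative, not captured from a real run.
SAMPLE_LISTING = """\
Czech-English type: tiny version: 1; To download do -d cs-en-tiny
German-English type: tiny version: 2; To download do -d de-en-tiny
English-German type: base version: 2; To download do -d en-de-base
"""

# Matches a trailing "-d <model-id>" at the end of a line.
MODEL_HINT = re.compile(r"-d\s+(\S+)\s*$")


def model_ids(listing: str) -> list[str]:
    """Extract the model identifier from the end of each listing line."""
    ids = []
    for line in listing.splitlines():
        match = MODEL_HINT.search(line)
        if match:
            ids.append(match.group(1))
    return ids


def download_commands(listing: str, binary: str = "translateLocally") -> list[str]:
    """Build one shell-quoted `-d` invocation per model found in the listing."""
    return [shlex.join([binary, "-d", model]) for model in model_ids(listing)]


if __name__ == "__main__":
    for command in download_commands(SAMPLE_LISTING):
        print(command)
```

If the real listing format differs, only `MODEL_HINT` should need adjusting; the printed commands can be reviewed before being executed.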
        
       | eviks wrote:
       | How does this compare to Deepl in translation quality?
        
       | lbj wrote:
       | Anyone happen to know how this program got funded by Horizon?
        
         | DavidKarlas wrote:
         | https://cordis.europa.eu/project/id/825303 seems to have a lot
         | of info.
        
           | isoprophlex wrote:
           | Whoa, 3 million EUR! Nice!
           | 
           | I hope they'll fund more things that aim to break
           | cloud/vendor lock-in.
        
       | andrekandre wrote:
       | could this be used for locally translating between programming
       | languages in addition to natural languages?
       | 
       | any models for that available i wonder?
        
         | vertis wrote:
         | It depends on the scale you're talking about but the local LLMs
         | are fairly capable of translating between different programming
         | languages, particularly if you're not so concerned about
         | external library support.
         | 
         | Pasting a function in and asking for it in a different
         | programming language will get an implementation in the target
         | language. Using ollama run llama2:13b on my mac will allow
         | converting in this fashion.
         | 
         | It might not be the best code, but this is true of machine
         | translations as well.
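
The workflow described above (paste a function, ask for it in another language) might be scripted roughly as follows. This is a sketch under stated assumptions: the prompt wording and the helper names `build_prompt`, `ollama_command`, and `convert` are hypothetical, and `convert` needs a local ollama install with the `llama2:13b` model already pulled. The example run only prints the command it would execute.

```python
import subprocess

MODEL = "llama2:13b"  # model tag taken from the comment above


def build_prompt(source: str, src_lang: str, dst_lang: str) -> str:
    """Wrap a code snippet in a plain-language conversion request."""
    return (
        f"Rewrite the following {src_lang} function in {dst_lang}. "
        "Reply with code only.\n\n"
        f"{source.strip()}\n"
    )


def ollama_command(prompt: str, model: str = MODEL) -> list[str]:
    """Argument list for a one-shot `ollama run MODEL PROMPT` call."""
    return ["ollama", "run", model, prompt]


def convert(source: str, src_lang: str, dst_lang: str) -> str:
    """Run the model locally and return its reply (requires ollama)."""
    command = ollama_command(build_prompt(source, src_lang, dst_lang))
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    snippet = "def add(a, b):\n    return a + b"
    # Print the command instead of running it, so this works without ollama.
    print(ollama_command(build_prompt(snippet, "Python", "Go")))
```

As the comment notes, the reply is a draft to review, not guaranteed-correct code.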
        
           | andrekandre wrote:
           | thanks for the reply!
           | 
           | > Using ollama run llama2:13b on my mac will allow converting
           | > in this fashion
           | 
           | okay, i'll have to try this out
           | 
           | > It might not be the best code, but this is true of machine
           | > translations as well.
           | 
           | true, one thing i'm hoping for is some local version of
           | copilot/chatgpt for code where it's trained on local libraries
           | and the local project and such (and can translate them in
           | some scenarios)
        
         | Intralexical wrote:
         | Computer programming languages are already machine-parsable
         | though. ML does not seem like the appropriate solution for
         | converting between them.
         | 
         | Technically what you're describing is done by a
         | compiler/decompiler/transpiler, operating on the AST.
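
To make the AST point above concrete, here is a toy transpiler built on Python's standard `ast` module. It is a minimal sketch, not a real tool: it handles only names, numeric constants, and the four basic arithmetic operators, and the Lisp-style prefix output and the function names `to_prefix` and `transpile` are illustrative choices.

```python
import ast

# Map Python operator node types to the symbol used in the target syntax.
OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def to_prefix(node: ast.AST) -> str:
    """Recursively emit a Lisp-style prefix expression for an AST node."""
    if isinstance(node, ast.Expression):
        return to_prefix(node.body)
    if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
        left, right = to_prefix(node.left), to_prefix(node.right)
        return f"({OPERATORS[type(node.op)]} {left} {right})"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return repr(node.value)
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


def transpile(expression: str) -> str:
    """Parse a Python arithmetic expression and re-emit it in prefix form."""
    return to_prefix(ast.parse(expression, mode="eval"))


if __name__ == "__main__":
    print(transpile("a + b * 2"))  # (+ a (* b 2))
```

The conversion is deterministic and respects operator precedence because it walks the parsed tree, but every construct outside the handled subset is rejected, which is the per-language effort the reply below refers to.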
        
           | warkdarrior wrote:
           | Programming languages are machine-parseable _if_ you already
           | have a parser for them. The benefit of LLMs is that you do not
           | need a parser for each programming language.
        
       | malloc-0x90 wrote:
       | I was looking for something like this since I found the awesome
       | Firefox plugin, thank you!
        
       | hgyjnbdet wrote:
       | How does this compare to something like Whisper?
       | 
       | EDIT: this is a genuine question as I don't have a clue. Rather
       | than downvoting without comment, maybe downvote and let me know
       | why my question is dumb?
        
         | andrewcamel wrote:
         | It's translation (text -> text), not speech -> text.
        
           | hgyjnbdet wrote:
           | Thanks, much appreciated for the clarification. I clearly
           | overlooked that, which now it's pointed out seems entirely
           | obvious, my bad. Only took negative karma for it to click,
           | haha.
        
             | Intralexical wrote:
             | Ironically, the other link I posted at the same time is actually
             | speech to text. You want something like VOSK if you're
             | looking for local machine transcription:
             | 
             | https://news.ycombinator.com/item?id=40027675
             | 
             | As for quality, I think its models are, IDK, maybe around
             | the level that Youtube automatic captions were two or three
             | years ago? So well over 90% accurate, and serviceable for
             | getting something to search for or clean up, but expect it
             | to get a word wrong every now and then.
        
         | specproc wrote:
         | This post got downvoted, but there's a legit point here. I've
         | found whisper's translated speech to text to be pretty decent,
         | certainly compared to the reported quality of this bergamot-
         | tiny used in the OP.
         | 
         | FWIW, I like Helsinki opus on Huggingface, worth checking out
         | if you need machine translation and can deal with sub Google
         | Translate quality.
        
       | avodonosov wrote:
       | Good idea, I hope it works well.
       | 
       | Tried to translate on the official website https://private.mt/.
       | The phrase "PRIVATE MACHINE TRANSLATION, RUNNING LOCALLY ON YOUR
       | DEVICE" translates to Russian as "PRAKTRONATRATIVNAIa RANNEE
       | PEREDAZhA, POVESTKI DLIa VAShEGO USEDANIIa", which is an
       | incomprehensible combination of characters (although "vashego" is
       | a correct word).
       | 
       | Translating to Ukrainian also produced rubbish: "PRIVATNA
       | MAChINNA PEREKLAD, RUNUVANNIa LOKALIYi NA SVOYiKh DEVISNIKIV".
       | 
       | To German it translates as "PRIVATE MACHINE UBERSETZUNG, RUNNING
       | LOCALLY AUF IHREM DEVICE", compared to Google Translate's "PRIVATE
       | MASCHINENUBERSETZUNG, LAUFT LOKAL AUF IHREM GERAT".
        
         | KTibow wrote:
         | It might be because it's uppercased? I tried lowercasing it, and
         | it translated to "Chastnyi perevod mashin, rabotaet lokal'no na
         | sobstvennom ustroistve", which GPT claims is correct
        
           | fao_ wrote:
           | > which GPT says is correct
           | 
           | AI really has broken people's brains, huh :/
        
             | BolexNOLA wrote:
             | Either I'm missing a joke or you're being very
             | unnecessarily rude. Hoping it's the former.
        
               | prmoustache wrote:
               | He is right, you don't confirm that an LLM works well by
               | comparing it to the result of another LLM.
        
               | Onawa wrote:
               | He's basically trying to say don't fully put your trust
               | in any LLM, even one of the top ones. As you can see from
               | an adjacent comment from someone who speaks the language,
               | it's a closer translation but still not quite right.
               | 
               | > Seems so. I typed in a lowered version now, gives good
               | translation "chastnyi mashinnyi perevod, rabotaia
               | lokal'no na vashem ustroistve". (The one you got is a
               | little clumsy ~ "translation of machines")
        
               | Intralexical wrote:
               | If you don't know something, then just say you don't
               | know. Deferring to an LLM just comes across as low-effort
               | and irrelevant.
        
             | Intralexical wrote:
             | As with many things spread by the Internet, I think it's
             | just lowered the bar (and the effort) for participation.
             | But the brains are the same.
        
           | avodonosov wrote:
           | Seems so. I typed in a lowered version now, gives good
           | translation "chastnyi mashinnyi perevod, rabotaia lokal'no na
           | vashem ustroistve". (The one you got is a little clumsy ~
           | "translation of machines")
        
             | troupo wrote:
             | It's not a good translation. The participle is in the wrong
             | form, the first part of the sentence does not make sense in
             | this context and reads something like "this is the
             | machine's private business's translation".
        
               | avodonosov wrote:
               | No, it does not read like that. The translation is
               | (relatively) good.
        
               | troupo wrote:
               | It does read like that. While "chastnyi" is the direct
               | mechanical translation of "private", it does not work
               | like that in the context.
        
           | troupo wrote:
           | Second part of that sentence is correct. First one makes no
           | sense in the context, and is awkwardly constructed. First
           | part says something like "machines' private translation (as
           | in this is the private translation business run by machines)"
           | though there are other possible interpretations
        
         | Intralexical wrote:
         | It depends entirely on the specific model it's using, I guess.
         | I believe the current list it queries is here:
         | 
         | https://translatelocally.com/models.json
         | 
         | ...Oddly, I don't actually even _have_ Russian available at all
         | in my desktop install of this. And there is Ukrainian, but only
         | Ukrainian-to-English, so there's also no way that it could
         | even be using another language as a pivot as there aren't any
         | models that output Ukrainian. I guess the website might be
         | using old, known bad models or something?
         | 
         | With "English-German tiny", I get "Private maschinelle
         | Ubersetzung, lauft lokal auf Ihrem Gerat", and with "English-
         | German base", I get "Private maschinelle Ubersetzung, die lokal
         | auf Ihrem Gerat lauft", though I had to type it in lowercase.
         | 
         | I'd trust the translation quality enough to _read_ foreign
         | articles in it. Not enough to translate anything meant for
         | anyone else to read...
        
           | specproc wrote:
           | Yeah, it'll all be about the model. I do a fair bit of
           | machine translation and the Helsinki opus models are
           | generally good enough for grokking what a text is about.
           | Definitely better than the examples above.
        
         | roywiggins wrote:
         | Well, it did say it was fast and secure, nothing about
         | accuracy...
        
         | avodonosov wrote:
         | As we clarified under the sibling comment by KTibow (why is it
         | flagged now?), downcasing the input text results in OK
         | translations.
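
The lowercasing workaround discussed in this subthread could be applied automatically before text reaches a translation backend. This is a minimal sketch: `normalize_case` and `translate_normalized` are hypothetical names, and the `translate` argument stands in for whatever backend is used. Nothing here calls translateLocally itself.

```python
from typing import Callable


def normalize_case(text: str) -> str:
    """Lowercase text only when every cased character is uppercase."""
    return text.lower() if text.isupper() else text


def translate_normalized(text: str, translate: Callable[[str], str]) -> str:
    """Apply the case workaround, then hand off to any translation backend."""
    return translate(normalize_case(text))


if __name__ == "__main__":
    headline = "PRIVATE MACHINE TRANSLATION, RUNNING LOCALLY ON YOUR DEVICE"
    # `str` stands in for a real backend here; it returns its input unchanged.
    print(translate_normalized(headline, str))
```

Mixed-case input passes through untouched, so proper nouns and acronyms inside normal sentences keep their capitalisation.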
        
         | User23 wrote:
         | > To German it translates as "PRIVATE MACHINE UBERSETZUNG,
         | > RUNNING LOCALLY AUF IHREM DEVICE"
         | 
         | ATTENTION
         | 
         | This room is fullfilled mit special electronische equippment.
         | Fingergrabbing and pressing the cnoeppkes from the computers is
         | allowed for die experts only! So all the "lefthanders" stay
         | away and do not disturben the brainstorming von here working
         | intelligencies. Otherwise you will be out thrown and kicked
         | anderswhere! Also: please keep still and only watchen
         | astaunished the blinkenlights.
        
       | ngcc_hk wrote:
       | What languages does it support? How about Japanese and Chinese?
        
       | geococcyxc wrote:
       | In Firefox, you can get this by navigating to about:translations
        
       | underlines wrote:
       | The models used, without really trying them yet, seem to be much
       | older and much worse compared to seamless-m4t-v2 [1] which is
       | multi-modal and supports the tasks of:
       | 
       | - Speech-to-speech translation (S2ST)
       | - Speech-to-text translation (S2TT)
       | - Text-to-speech translation (T2ST)
       | - Text-to-text translation (T2TT)
       | - Automatic speech recognition (ASR)
       | 
       | across
       | 
       | - 101 languages for speech input
       | - 96 languages for text input/output
       | - 35 languages for speech output
       | 
       | I tried it for low-resource languages like Thai to German for
       | text and audio, and it works quite well.
       | 
       | [1] https://huggingface.co/facebook/seamless-m4t-v2-large
        
         | Intralexical wrote:
         | > https://huggingface.co/facebook/seamless-m4t-v2-large
         | 
         | Unfortunately, interpreting "CC-BY-NC" as a software license, I
         | think you'd be pirating if you used the linked models for
         | anything you might sell.
         | 
         | (Bergamot is BY-SA, but I think the virality would only apply
         | to derivative models and not model outputs, whereas Facebook's
         | NonCommercial clause might apply to usage of the original model
         | itself, as it usually does in software licenses.)
        
       ___________________________________________________________________
       (page generated 2024-04-14 23:02 UTC)