[HN Gopher] Kagi Translate
       ___________________________________________________________________
        
       Kagi Translate
        
       Author : lkellar
       Score  : 173 points
       Date   : 2024-11-07 19:32 UTC (3 hours ago)
        
 (HTM) web link (blog.kagi.com)
 (TXT) w3m dump (blog.kagi.com)
        
       | ziddoap wrote:
       | > _Quality ratings based on internal testing and user feedback_
       | 
       | I'd be interested in knowing more about the methodology here.
       | People who use Kagi tend to _love_ Kagi, so bias would certainly
       | get in the way if not controlled for. How rigorous was the
       | quality-rating process? How big of a difference is there between
       | "Average", "High" and "Very High"?
       | 
       | I'm also curious about the 1 additional language that Kagi supports
       | (Google is listed at 243, Kagi at 244)?
       | 
       | > _Kagi Translate is free for everyone._
       | 
       | That's nice!
        
         | lcnPylGDnU4H9OF wrote:
         | > I'm also curious about the 1 additional language that Kagi
         | supports (Google is listed at 243, Kagi at 244)?
         | 
         | I just copied all of the values from the select element on the
         | page (https://translate.kagi.com/) and there's only 243. Now I
         | genuinely wonder if it's Pig Latin.
         | https://news.ycombinator.com/item?id=42080562
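
Counting the entries of a language selector, as described above, can be sketched with Python's standard `html.parser`. This is only an illustration: the `SAMPLE` markup is hypothetical and not copied from translate.kagi.com, and `OptionCollector` and `option_labels` are names made up for this example.

```python
from html.parser import HTMLParser


class OptionCollector(HTMLParser):
    """Collect the visible text of every <option> element."""

    def __init__(self):
        super().__init__()
        self.in_option = False
        self.options = []

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            # Start a new label; text arrives later via handle_data.
            self.in_option = True
            self.options.append("")

    def handle_endtag(self, tag):
        if tag in ("option", "select"):
            self.in_option = False

    def handle_data(self, data):
        if self.in_option:
            self.options[-1] += data


def option_labels(html):
    """Return the stripped, non-empty option labels found in the markup."""
    parser = OptionCollector()
    parser.feed(html)
    parser.close()
    return [label.strip() for label in parser.options if label.strip()]


# Hypothetical markup for illustration only.
SAMPLE = """
<select name="to">
  <option value="af">Afrikaans</option>
  <option value="de">German</option>
  <option value="tr">Turkish</option>
</select>
"""

labels = option_labels(SAMPLE)
print(len(labels), labels)
```

Running the same function over the real page source would give the count to compare against the advertised number, assuming the selector is plain HTML rather than built by script.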
        
           | banana_giraffe wrote:
           | Also, notable, Google claims to support Inuktut and Tshiluba,
           | and I don't see those two in Kagi.
        
         | up6w6 wrote:
         | I am very suspicious of the results. A few months ago they
         | published an LLM benchmark, calling it "perfect" while it
         | actually contained like only 50 inputs (academic benchmark
         | datasets usually contain tens of thousands of inputs).
        
         | ks2048 wrote:
         | A quick scrape of the two sites gives (literally a diff of sets
         | of the strings used in language selection):
         | 
         | In Kagi, not Google:
         |   Crimean Tatar
         |   Santali
         | 
         | In Google, not Kagi:
         |   Crimean Tatar (Cyrillic)
         |   Crimean Tatar (Latin)
         |   French (Canada)
         |   Inuktut (Latin)
         |   Inuktut (Syllabics)
         |   Santali (Latin)
         |   Santali (Ol Chiki)
         |   Tshiluba
         | 
         | They really must have copied Google, because like I said this
         | was diffing exact strings, meaning that slight variations of
         | how the languages are presented don't exist.
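
The exact-string diff described in this comment amounts to two set differences. A minimal sketch, assuming the two lists of labels have already been scraped; the lists below are small hand-made subsets for illustration, not the full selectors, and `diff_language_lists` is a made-up helper name.

```python
def diff_language_lists(kagi, google):
    """Return (only_in_kagi, only_in_google), comparing exact strings."""
    kagi_set, google_set = set(kagi), set(google)
    return sorted(kagi_set - google_set), sorted(google_set - kagi_set)


# Illustrative subsets only; the real selectors hold a few hundred entries.
kagi = ["Afrikaans", "Crimean Tatar", "Santali", "Turkish"]
google = [
    "Afrikaans",
    "Crimean Tatar (Cyrillic)",
    "Crimean Tatar (Latin)",
    "Santali (Latin)",
    "Santali (Ol Chiki)",
    "Tshiluba",
    "Turkish",
]

only_kagi, only_google = diff_language_lists(kagi, google)
print("In Kagi, not Google:", only_kagi)
print("In Google, not Kagi:", only_google)
```

Because the comparison is on exact strings, any difference in spelling or punctuation would show up as an entry on both sides, which is why a short diff suggests the two lists share a source.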
        
       | krackers wrote:
       | How is this compared to using gpt-4 directly?
        
         | burkaman wrote:
         | I don't know how the translation quality compares, but the
         | advantages to this would be that it's free and it can translate
         | web pages in-place.
        
           | Aachen wrote:
           | And presumably the energy efficiency of a dedicated
           | translator compared to a generic language system, assuming
           | they didn't build this on top of a GPT. The blog post doesn't
           | say but I'm assuming (perhaps that's no longer accurate) that
           | it's prohibitively expensive for a small team without huge
           | funding to build such a model as a side project
        
         | elashri wrote:
         | It varies depending on the language, but I find GPT4o to be good
         | at picking up the context and sometimes going with the intent, not
         | just the grammar and rules of the language. But for most cases
         | it is overkill and you still have the chance of
         | hallucination (although it is less likely in these
         | use cases).
         | 
         | This is of course based on my experience using it between
         | Arabic, English and French, which are among the 5 most popular
         | languages. Things might be dramatically different with other
         | languages.
        
           | ilaksh wrote:
           | Have you compared gpt-4o to Kagi?
           | 
           | They might actually be the same thing in some cases.
        
       | gen3 wrote:
       | Has anyone seen info on how this works? "It's not revolutionary"
       | seems like an understatement when you can do better than DeepL
       | and support more languages than Google?
        
         | kouteiheika wrote:
         | I'm pretty sure it's just a finetuned LLM.
         | 
         | I have some experience experimenting in this space; it's not
         | actually that hard to build a model which surpasses DeepL, and
         | the wide language support is just a consequence of using an LLM
         | trained on the whole Internet, so the model picks up the
         | ability to use a bunch of languages.
        
           | ilaksh wrote:
           | I'm almost sure they did not fine-tune an LLM. They are using
           | existing LLMs because fine tuning to best the SOTA models at
           | translation is impractical unless you target very niche
           | languages and even then it would be very hard to get a better
           | dataset than what is already used for those models.
           | 
           | Probably all they are doing is like switching between some
           | Qwen model (for Chinese) and large Llama or maybe OpenAI or
           | Gemini.
           | 
           | So they just have a step (maybe also an LLM) to guess which
           | model is best or needed for the input. Maybe something really
           | short and simple just goes to a smaller simpler less
           | expensive model.
        
         | freediver wrote:
         | It uses a combination of LLMs, selecting the best output. (from
         | the blog post)
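
"A combination of LLMs, selecting the best output" could take many forms, and the thread does not say how Kagi does it. The sketch below is one hypothetical shape: every backend produces a candidate and a scoring function picks the winner. The backends are stub lambdas, and `pick_best` and `toy_score` are invented for illustration; a real scorer would be far more involved than a length heuristic.

```python
from typing import Callable, Dict, Tuple

Translator = Callable[[str], str]


def pick_best(
    text: str,
    translators: Dict[str, Translator],
    score: Callable[[str, str], float],
) -> Tuple[str, str]:
    """Run every backend and return (backend_name, best_candidate)."""
    candidates = {name: fn(text) for name, fn in translators.items()}
    best = max(candidates, key=lambda name: score(text, candidates[name]))
    return best, candidates[best]


def toy_score(source: str, candidate: str) -> float:
    """Hypothetical heuristic: reject empty output and obvious refusals,
    otherwise prefer candidates whose length is close to the source."""
    if not candidate.strip():
        return float("-inf")
    if "cannot assist" in candidate.lower():
        return float("-inf")
    return -abs(len(candidate) - len(source))


# Stub backends standing in for real models (assumption, not Kagi's setup).
backends = {
    "model_a": lambda text: "I cannot assist with that request.",
    "model_b": lambda text: "Die Katze schnurrt.",
    "model_c": lambda text: "",
}

winner, translation = pick_best("The cat purrs.", backends, toy_score)
print(winner, translation)
```

A scorer like this would also be a natural place to catch the refusals-as-translations that other commenters report, rather than returning them to the user.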
        
           | gen3 wrote:
           | Ah, I missed that. Thank you!
        
         | a2128 wrote:
         | It just uses LLMs, I've had it output a refusal in the target
         | language by entering stuff about nukes in the input
        
       | leipert wrote:
       | Kudos on the launch! Looking good!
       | 
       | One benefit of Google Translate is with languages like Hebrew and
       | Arabic, you can enter in those languages phonetically or with on-
       | screen keyboards.
        
       | ks2048 wrote:
       | > Limitations
       | 
       | > We do not translate dynamically created content ...
       | 
       | What does that mean?
        
         | agluszak wrote:
         | I would guess it's only able to translate the html content sent
         | on page load - so static webpages, but not SPAs etc.
        
         | jsheard wrote:
         | I assume it means they only translate what's in the HTML, not
         | anything that's added via Javascript later.
        
           | freedomben wrote:
           | Indeed, that's what would make most sense to me.
           | 
           | I also strongly suspect the way they're able to make it free
           | is by caching the results, so each translation only happens
           | one time regardless of how many requests for the page happen.
           | If they translated dynamic content, they couldn't (safely)
           | cache the results.
        
             | kevincox wrote:
             | I don't think JS vs HTML would make any difference to
             | caching.
             | 
             | If they are caching by URL you can have dynamic HTML
             | generation or a JS generated page that is the same on every
             | load.
             | 
             | If you are caching by the text then you can do the same for
             | HTML or JS generated (you are just reading the text out of
             | the DOM when the JS seems done).
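
Caching by text rather than by URL, as suggested here, can be sketched as a lookup keyed on a hash of the normalized text plus the target language. This is an assumption about how such a cache might look, not a description of Kagi's service; `TranslationCache` and the stub translator are hypothetical.

```python
import hashlib


class TranslationCache:
    """Cache translations by content, regardless of how the text was produced."""

    def __init__(self, translate):
        self._translate = translate  # callable: (text, target_lang) -> str
        self._store = {}
        self.misses = 0

    @staticmethod
    def key(text, target_lang):
        # Collapse whitespace so trivially different renderings share a key.
        normalized = " ".join(text.split())
        payload = f"{target_lang}\n{normalized}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get(self, text, target_lang):
        k = self.key(text, target_lang)
        if k not in self._store:
            self.misses += 1
            self._store[k] = self._translate(text, target_lang)
        return self._store[k]


# Stub translator used only to make the example runnable.
cache = TranslationCache(lambda text, lang: f"[{lang}] {text}")

first = cache.get("Hello   world", "de")
second = cache.get("Hello world", "de")   # same text after normalization
third = cache.get("Hello world", "fr")    # different target language
print(cache.misses)
```

With this keying it makes no difference whether the text came from static HTML or was read out of the DOM after scripts ran; what does matter is whether the text is personalized, since then it would rarely repeat.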
        
           | ks2048 wrote:
           | Ah, that makes sense. In my head it sounded like server-side
           | dynamic content OR not wanting to translate LLM outputs,
           | neither of which makes sense or is possible.
        
           | _kidlike wrote:
           | that's what I think too, which kinda makes sense since it's a
           | page, and not a browser plugin. If they implemented a browser
           | plugin that would do what Google recently removed from their
           | plugin, that would be a killer feature. (assuming they can
           | then translate all html as it comes in)
           | 
           | Brave browser does it already though, but sometimes it's
           | unusably slow.
        
           | Aachen wrote:
           | Is that a relevant username, or is J your initial? I can't
           | quite place what "JavaScript heard" would mean. I've wondered
           | before but there's no contact in your profile and now it felt
           | at least somewhat related to the comment itself, sorry for
           | being mostly off-topic
        
             | jsheard wrote:
             | It's an initial :p
        
               | Aachen wrote:
               | Mystery solved! Thanks for obliging my curiosity :)
        
       | ohmahjong wrote:
       | Disclaimer: I am already a Kagi customer.
       | 
       | At least for Afrikaans I'm not impressed here. There are some
       | inaccuracies, like "varktone" becoming "pork rinds" instead of
       | "pig toes" and also some censorship ("jou ma se poes" does NOT
       | mean "play with my cat"!). Comparing directly against Google
       | Translate, Google nails everything I threw at it.
       | 
       | I didn't see any option to provide feedback, suggested
       | translations, etc, but I'm hopeful that this service improves.
        
         | burkaman wrote:
         | This is the link they gave for feedback:
         | https://kagifeedback.org/d/5305-kagi-translate-feedback/4
        
         | wongarsu wrote:
         | Just tried translating your comment to German. Kagi took a very
         | literal approach, keeping sentence structure and word choice
         | mostly the same. Google Translate and DeepL both went for more
         | idiomatic translations.
         | 
         | However translating some other comments from this thread, there
         | are cases where Kagi outperforms others on correctness. For
         | example one comment below talks about "encountering multiple
         | second page loads". Google Translate misunderstands this as
         | "encountering a second page load multiple times" while DeepL
         | and Kagi both get it right with "encountering page loads of
         | multiple seconds" (with DeepL choosing a slightly more
         | idiomatic wording)
        
         | epoxia wrote:
         | I asked some inappropriate things and it was "translated" to I
         | cannot assist with that request. It definitely needs to be more
         | clear when it's refusing to translate. But, then again, I don't
         | even use kagi.
        
           | GaggiX wrote:
           | Maybe they are using Claude API for the translation, Claude
           | models are really good multilingual models.
           | 
           | EDIT: the "Limitations" section reports the use of LLMs
           | without specifying the models used.
        
         | FurkanKambay wrote:
         | "The game is my poem" when back-translated from the Turkish
         | translation, "oyun benim şiirimdir". And there's censorship too
         | when doing EN-TR for a few other profanities I tested. When you
         | add another particular word to the sentence, it outputs "play
         | with my cat, dad".
        
       | dlkmp wrote:
       | Just as a quick usability feedback: As long as Deepl translates
       | asynchronously as I type, while Kagi requires a full form send &
       | page refresh, I am not inclined to switch (translation quality is
       | also already too good for my language pairs to consider switching
       | for minor improvements, but the usability/ speed is the real
       | feature here).
       | 
       | This is coming from a user with existing Kagi Ultimate
       | subscription, so I'm generally very open to adopt another tool if
       | it fits my needs).
       | 
       | Slightly off-topic, slightly related: As already mentioned the last
       | time Kagi hit the HN front page when I saw it: the best
       | improvement I could envision for kagi is improved search
       | performance (page speed). I still encounter multiple second page
       | loads far too frequently that I didn't notice with other search
       | engines.
        
         | Aachen wrote:
         | Interesting, I'm actually annoyed that DeepL sends every
         | keystroke and I'm using idk how many resources on their end
         | when I'm just interested in the result at the end and for DeepL
         | to receive the final version I want to share with them
         | 
         | That it's fast, you don't have to wait much between finishing
         | typing and the result being ready, that's great and probably
         | better than any form system is likely to be. But if it could be
         | a simple enter press and then async loading the result, that
         | sounds great to me
        
         | czottmann wrote:
         | I uninstalled the DeepL extension because it would load all its
         | assets (fonts etc) into every. single. page. No matter the
         | host.
         | 
         | Unacceptable.
        
         | burkaman wrote:
         | This will be a paid feature apparently:
         | https://kagifeedback.org/d/5305-kagi-translate-feedback/9
        
         | freediver wrote:
         | > As long as Deepl translates asynchronously as I type, while
         | Kagi requires a full form send & page refresh,
         | 
         | This leads to increased cost and we wanted to keep service
         | free. But yes we will introduce translate as you type (will be
         | limited to paid Kagi members).
        
       | pentacent_hq wrote:
       | I recently noticed that Google Translate and Bing have trouble
       | translating the German word "Orgel" ("organ", as in "church
       | organ", not as in "internal organs") to various languages such as
       | Vietnamese or Hebrew. In several attempts, they would translate
       | the word to an equivalent of "internal organs" even though the
       | German word is, unlike the English "organ", unambiguous.
       | 
       | Kagi Translate seems to do a better job here. It correctly
       | translates "Orgel" to "đàn organ" (Vietnamese) and "עוגב"
       | (Hebrew).
        
         | ynoxinul wrote:
         | Google Translate often translates words through English.
        
           | Aachen wrote:
           | DeepL also, for the record (since it's being compared in the
           | submission)
           | 
           | It's pretty clear if you use the words out of context and
           | they're true friends but it gets you the German translation
           | of the English translation of whatever Dutch thing you put
           | in. I also heard somewhere, perhaps when interviewing with
           | DeepL, that they were working towards / close to not needing
           | to do that anymore, but so far no dice that I've noticed and
           | it has been a few years
        
       | o11c wrote:
       | If you write the input in Pig Latin, Kagi detects it as English
       | but translates it correctly.
       | 
       | Bing detects it as English but leaves it unchanged.
       | 
       | Google detects it as Telugu and gives a garbage translation.
       | 
       | ChatGPT detects it as Pig Latin and translates it correctly.
        
       | jabroni_salad wrote:
       | Looks like the page translator wants to use an iframe, so of
       | course the x-frame-options header of that page will be the
       | limiting factor.
       | 
       | > To protect your security, note.com will not allow Firefox to
       | display the page if another site has embedded it. To see this
       | page, you need to open it in a new window.
       | 
       | This is a super common setting and it's why I use a browser
       | extension instead.
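
Whether a page can be shown inside another site's iframe is decided by its response headers. The function below is a simplified sketch of that decision, assuming only the common `X-Frame-Options` values and a basic reading of the CSP `frame-ancestors` directive; real browsers implement fuller source matching (wildcards, schemes, ports). The origins are placeholders and `blocks_framing` is a made-up name.

```python
def blocks_framing(headers, page_origin, embedder_origin):
    """Return True if the response headers forbid embedding by embedder_origin.

    Simplified: exact-origin matching only, no wildcard hosts or schemes.
    """
    normalized = {name.lower(): value for name, value in headers.items()}

    # If frame-ancestors is present it takes precedence over X-Frame-Options.
    csp = normalized.get("content-security-policy", "")
    for directive in csp.split(";"):
        parts = directive.split()
        if parts and parts[0].lower() == "frame-ancestors":
            for source in parts[1:]:
                if source == "*":
                    return False
                if source == "'self'" and page_origin == embedder_origin:
                    return False
                if source == embedder_origin:
                    return False
            return True  # 'none', an empty list, or no matching source

    xfo = normalized.get("x-frame-options", "").strip().upper()
    if xfo == "DENY":
        return True
    if xfo == "SAMEORIGIN":
        return page_origin != embedder_origin
    return False


# Placeholder origins, not the real sites discussed above.
page = "https://blog.example.org"
translator = "https://translate.example.com"

print(blocks_framing({"X-Frame-Options": "SAMEORIGIN"}, page, translator))
print(blocks_framing({}, page, translator))
```

An iframe-based page translator has no way around a `True` result here, which is why an extension that rewrites the page in place does not hit the same limit.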
        
       | I_am_tiberius wrote:
       | I find it useless without an option to add context to the text I
       | want to translate.
        
         | Aachen wrote:
         | What do you mean? Does any other translator have such a
         | separate field that you could point to, or could you explain
         | what you're missing?
         | 
         | When I want to give DeepL context, I just write it in the
         | translation field (also, because it's exceptionally bad at
         | single word translations, I do it even if the word should be
         | unambiguous), so not type in "Katze" but "die Katze schnurrt"
         | (the cat purrs). Is that the kind of thing you mean?
        
       | Aachen wrote:
       | I can't use it because I'm not classified as "human" by a
       | computer. There is no captcha that I could get wrong, just a
       | checkbox that probably uses a black box model to classify me
       | automatically
       | 
       | Was curious after the post claimed that the quality is better
       | than Google and DeepL, but the current top comment showed
       | translations from Afrikaans that it got wrong but I could
       | understand as a Dutch person who doesn't even speak that language
       | (so it's not like seven levels of negation and colloquialisms
       | that they broke it on)
       | 
       | What do I do with this "Error Code: 600010"? I've submitted a
       | "report" but obviously they're not going to know if those reports
       | are from a bot author frustrated with the form or me, a paying
       | customer of Kagi's search engine. The feedback page linked in the
       | blog post has the same issue: requires you to log in before being
       | able to leave feedback, but "We couldn't verify if you're a robot
       | or not." The web is becoming more fragmented and unusable every
       | day...
        
         | kunwon1 wrote:
         | I had tons of issues with these Cloudflare checkboxes. I
         | finally figured out it was because I use this extension [1]
         | that disables HTML5 autoplay. I assume Cloudflare is doing some
         | kind of thing where they verify that the client can playback
         | media, as they assume that headless browsers or crawlers won't
         | have that capability
         | 
         | [1] https://addons.mozilla.org/en-US/firefox/addon/disable-
         | autop...
        
         | freediver wrote:
         | > I can't use it because I'm not classified as "human" by a
         | computer.
         | 
         | It uses Cloudflare Turnstile captcha.
         | 
         | The service shows no captcha to logged in Kagi users, so you
         | can just create a (trial) Kagi account.
        
           | Aachen wrote:
           | Thanks, but I am logged in and it still shows that. Clicking
           | log in at the top of the page leads me to the login page
           | which takes about 10 seconds to (while I'm typing) realise
           | that I'm already logged in and then redirects me to the
           | homepage (kagi search)
           | 
           | I don't have any site-specific settings and clearly HN works
           | fine (as well as other sites) so it's not that cookies are
           | disabled or such
           | 
           | Edit: come to think of it, I'm surprised that you find
           | translator data to be more sensitive (worth sticking behind a
           | gatekeeper) than user logins. Must have been a lot of work to
           | develop this intellectual property. There is no Cloudflare
           | check on the login page. Not that I'd want to give you ideas,
           | though! :-)
        
             | freediver wrote:
             | > come to think of it, I'm surprised that you find
             | translator data to be more sensitive (worth sticking behind
             | a gatekeeper) than user logins. Must have been a lot of
             | work to develop this intellectual property. There is no
             | Cloudflare check on the login page.
             | 
             | This is just a simple anti-bot measure so we do not get
             | hammered by them to death (kagi does not have an infinite
             | treasure chest). It is not needed for search, because you
             | can not use search for free anyway.
        
               | Aachen wrote:
               | I see, that makes sense!
        
         | ziddoap wrote:
         | > _What do I do with this "Error Code: 600010"?_
         | 
         | Cloudflare, the gatekeeper of the internet, strikes again.
         | 
         | The usual suspects are VPN or proxy, javascript, cookies, etc.
         | 
         | https://developers.cloudflare.com/turnstile/troubleshooting/...
         | 
         | Unfortunately, even with the error code, I doubt the above page
         | will help much.
        
         | baxtr wrote:
         | Interesting. Never had such an issue with Google. How do they
         | do it?
        
       | spiderfarmer wrote:
       | I would love to see an API to compete with DeepL.
        
       | erinnh wrote:
       | Kagi develops lots of features, but they seem to often be
       | quarter-baked.
       | 
       | Maps for example is basically unusable and has been for a while.
       | (at least in Germany)
       | 
       | Trying to search for an address often leads Kagi maps to go to a
       | different random address.
       | 
       | Still love the search, but I'd love for Kagi to concentrate on one
       | thing at a time.
        
         | Aachen wrote:
         | Where do I find the map feature?
         | 
         | I'm curious to see if I can identify what data source and
         | search software it is based on, since I've heard similar
         | complaints about Nominatim and it is indeed finicky if you made
         | a typo or don't know the exact address; it does no context
         | search based on the current view afaik. Google really does do
         | search _well_ compared to the open source software I'm partial
         | to, I gotta give them that
         | 
         | Edit: ah if you horizontally scroll on the homepage there's a
         | "search maps" thing. Putting in a street name near me that's
         | unique in the world, it comes up with a lookalike name in
         | another country. Definitely not any OpenStreetMap-based product
         | I know of then, they usually aren't unliteral like that. Since
         | the background map is Apple by default, I guess that's what the
         | search is as well
        
           | maronato wrote:
           | It's in Search. It's one of the types of search you can
           | perform. Below the search input is a bar with "Images",
           | "Videos", "News", and "Maps".
           | 
           | Can also be found here:
           | 
           | https://kagi.com/maps
        
         | freediver wrote:
         | We are focusing most of our resources on search (which I hope you
         | can agree, we are doing a pretty good job at). And it turns out
         | search is not enough and you need other things - like maps (or
         | a browser, because some browsers will not let you change search
         | engine and our paid users can not use the service). Both are
         | also incredibly hard to do right. If it appears quarter-baked
         | (and I am the first to say that we can and will definitely keep
         | improving our products), it is not for the lack
         | of trying or ambition but the lack of resources. Kagi is 100%
         | user-funded. So we need users, and we sometimes work on tools
         | that do not bring us money directly, but bring us users (like
         | Small Web, Universal Summarizer or Translate). It is all part
         | of the plan. And it is a decade-long plan.
        
       | exi1up wrote:
       | I could be missing something, but is there some sort of metric
       | for these comparisons to other software? Like the BLEU score
       | which I've seen in studies relating to comparing LLMs to Google
       | Translate. I find it difficult to believe it is better than DeepL
       | in a vacuum.
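
For readers unfamiliar with BLEU: it combines modified n-gram precisions (usually n = 1 to 4) with a brevity penalty. The function below is a bare-bones, single-reference, sentence-level sketch with whitespace tokenization and no smoothing, so it returns 0 whenever any n-gram order has no overlap. Published comparisons normally use corpus-level tooling; nothing here reflects how Kagi produced its ratings.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty punishes candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(ref) / len(cand))

    return brevity * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(bleu("the cat sat on the mat", reference))
print(bleu("the cat sat on a mat", reference))
```

A score like this only measures overlap with a reference translation, so it says little about idiomatic quality, which is part of why "Average", "High" and "Very High" ratings are hard to judge without knowing the method behind them.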
        
       | ninalanyon wrote:
       | That's odd. Clicking the switch languages icon swaps the
       | languages but not the texts.
        
       | eduction wrote:
       | Has anyone else noticed that Google Translate trips up a lot on
       | GDPR cookie consent dialogs in Europe? I've often had to
       | copy/paste the content of a web page because Google, when given
       | the URL, couldn't navigate past the dialog to get to the page
       | content (or couldn't allow me to dismiss it). Not sure if Kagi
       | has solved this.
        
       | unsupp0rted wrote:
       | This is good. I wish it handled you-singular vs. you-polite-
       | plural though.
       | 
       | It would be nice to say "use a casual tone". Or "the speaker is a
       | woman and the recipient is a man".
        
       | gagabity wrote:
       | Some bugs to iron out
       | 
       | "Document Too Long Document is too long to process. It contains
       | 158 chunks, but the maximum is 256. Please try again later or
       | contact support if the problem persists."
        
         | freediver wrote:
         | Fixed, thanks for reporting.
        
       | Decoy1008 wrote:
       | I doubt it is better than deepl or google. On some tests it
       | couldn't recognize the correct language.
        
       | somat wrote:
       | Added to my list, very nice.
       | 
       | One thing I like about Google Translate that neither DeepL nor this
       | does is tell me how to say the word. I mainly use it to add a
       | reading hint to an otherwise opaque Japanese title in a database.
        
       ___________________________________________________________________
       (page generated 2024-11-07 23:00 UTC)