[HN Gopher] LinguaCafe: Self-hosted software for language learne...
       ___________________________________________________________________
        
       LinguaCafe: Self-hosted software for language learners to read
       foreign languages
        
       Author : thunderbong
       Score  : 163 points
       Date   : 2024-01-08 12:36 UTC (10 hours ago)
        
 (HTM) web link (simjanos-dev.github.io)
 (TXT) w3m dump (simjanos-dev.github.io)
        
       | mionhe wrote:
       | This looks well thought out. It's honestly something I've
       | thought about trying to put together for myself due to my own
       | language learning effort. I'm looking forward to trying it out.
        
       | z3n0n wrote:
       | Looks really promising! Well done. Would love to host an
       | installation for our local learning community. Hopefully we'll
       | get multiple user accounts soon.
        
       | CyberDildonics wrote:
       | Wouldn't "self hosted software" just be software?
        
         | s0ss wrote:
         | Sure. But the clarifying words are useful in this context. It's
         | software; a web app that you can host yourself.
        
         | davely wrote:
         | I found the additional context to be helpful in quickly
         | understanding how / where I could run it.
        
         | Gormo wrote:
         | It specifically refers to client/server software where the
         | server-side component is run within one's own server
         | environment, and usually describes FOSS alternatives to SaaS
         | webapps.
         | 
         | Normal desktop applications aren't self-hosted, as they aren't
         | _hosted_ in this sense to begin with.
        
       | yellow_lead wrote:
       | Looks great. I would love to give it a try if it had Chinese
       | support. The Japanese support looks good though, maybe another
       | reason to try learning.
        
       | huimang wrote:
       | This is something I've thought about building for a while.
       | 
       | I would definitely use it if there were a Korean option!
        
       | seusscat wrote:
       | Looks amazing and I'm keen to try it out. However, I cannot find
       | the sources or setup instructions anywhere on the linked page.
       | Please add at least a link to the GitHub repository to the page
       | so people that stumble upon it can find their way.
        
       | burkaman wrote:
       | This is incredibly cool. I have been trying to do this exact
       | workflow manually by reading something in Kindle and copy/pasting
       | to DeepL and Anki and it sucks. If the author is here, I'm
       | wondering if you would be open to PRs for other languages? I'd
       | like to try this for French or Italian.
        
       | yurishimo wrote:
       | This looks sweet! The Jellyfin integration especially looks
       | awesome as I find watching videos an excellent way to actively
       | absorb new vocabulary.
       | 
       | My current self study centers around movies/tv and LingQ, which
       | this tool seems very similar to.
       | 
       | I'm learning Dutch, so it's a bummer that it's not supported
       | currently, but I'm keen to dig in and see how much effort it
       | takes to add a new language.
        
         | seusscat wrote:
         | I agree there. The Jellyfin integration here is the absolute
         | killer feature. I hope to see the first documentation on how to
         | set it up soon.
        
         | barrell wrote:
         | If you're shopping around for new ways to learn languages from
         | watching movies/tv, I'm working on another language learning
         | application. I just wrote up the basic features this weekend
         | [1]
         | 
         | We support many languages out of the box, would love to hear
         | what's making you consider LinguaCafe over LingQ :)
         | 
         | [1] https://blog.phrasing.app/phrasing-first-look/
        
           | BigElephant wrote:
           | Hello, when do you plan on launching the beta version?
        
       | anadem wrote:
       | I'd love to try it, but is there a way to get it? Maybe I'm
       | missing a link but I can't figure out a way to try it.
       | 
       | Ah, found, it's here: https://github.com/simjanos-dev/LinguaCafe
        
         | i_am_a_squirrel wrote:
         | +1 OP should add a link!
        
       | wahnfrieden wrote:
       | I made an iOS / macOS app with similar functionality, Manabi
       | Reader. It has its own flashcard companion app and also
       | integrates with Anki.
       | 
       | https://reader.manabi.io
       | 
       | Japanese only but I am expanding it to more languages early this
       | year.
        
       | outside1234 wrote:
       | Looks great! By the way, I use Apple Books (also works in Kindle)
       | to do something similar - if you press and highlight a section of
       | text, you can translate it, which has done wonders for building
       | my vocabulary in context.
        
       | tracnar wrote:
       | Looks good! I've been thinking about building something similar
       | but as a desktop app (and maybe browser extension) which would
       | work with whatever text you have on the screen. It seems doable
       | by (ab)using the OS accessibility APIs. I find it hard to stick
       | to importing texts, reading them in the app, and marking the
       | words. Having something which works in the background and can
       | tell you where you've previously seen words in different contexts
       | would be ideal for me.
        
         | bunderbunder wrote:
         | Are you familiar with Language Reactor or Migaku? I think there
         | are a couple others too. They're all implemented as browser
         | extensions, but that works out pretty well because most content
         | that's useful for language learning gets accessed through a
         | browser these days, anyway.
        
           | tracnar wrote:
           | I wasn't; they look interesting, thank you. I'll try them out!
           | Indeed the browser is the most important, even though it
           | would be nice to have something generic for any app.
        
             | wahnfrieden wrote:
             | My Manabi Reader app is a browser
             | 
             | https://reader.manabi.io
        
       | bunderbunder wrote:
       | Very nice.
       | 
       | This clearly takes a lot of inspiration from LingQ, but fixes
       | some of LingQ's more glaring shortcomings, for example by letting
       | you use a real dictionary instead of relying on definitions that
       | were crowdsourced from other learners using the app (and therefore
       | full of quality problems and inaccuracies). On the other hand, it
       | sounds like some nice features aren't implemented yet, or maybe
       | not even planned, so maybe LingQ is still a good option if you
       | don't want to hassle with self-hosting a webapp or hunting down
       | your own resources, and don't mind paying the subscription fee.
       | 
       | All in all, though, it looks very promising!
        
         | barrell wrote:
         | I'd be curious to hear what niceties you feel LingQ has that
         | it's missing, if you don't mind sharing.
        
           | tenaf0 wrote:
           | Last time I checked, it couldn't handle expressions that are
           | not just tokens one after the other. For example, German
           | separable verbs. I tried fixing it here:
           | https://news.ycombinator.com/item?id=38915786
        
             | tenaf0 wrote:
             | (Misunderstood the question, please ignore my above
             | comment)
        
           | bunderbunder wrote:
           | Easy importing of lessons from YouTube and Netflix, the
           | built-in libraries of lessons, guidance on what content might
           | be most appropriate to your current level based on known
           | vocabulary, the mobile apps with playlists and audio player,
           | things like that.
           | 
           | (Disclaimer: I haven't actually used LinguaCafe, but am a
           | longtime LingQ user, so I'm not really making a fair
           | comparison. I know LingQ's feature set much, much better.)
        
       | zerop wrote:
       | My method of learning new languages is always by starting to
       | learn everyday conversations in the language.
       | 
       | 1. Learn the translation of the commonly used everyday words
       | 
       | 2. Learn the rule to build sentences in different tenses (Verb
       | conjugation)
       | 
       | 3. Keep practising in everyday conversations, starting with the
       | simplest ones and gradually learning more.
        
         | Tepix wrote:
         | Regarding 2, I noticed that when learning French, once you know
         | how to form proper questions, conversations take a quantum
         | leap. It also helps to learn the 30ish verbs that are the most
         | important. Good luck.
        
       | simjanos-dev wrote:
       | Hey guys. I wrote LinguaCafe. I didn't know it was posted here,
       | I've just read it now.
       | 
       | I didn't think this many people would be interested. I'll write a
       | guide for Jellyfin, then add Italian, French and Dutch languages
       | tomorrow.
        
         | Beijinger wrote:
         | Chinese please.
        
         | nexawave-ai wrote:
         | This is very cool. Yes, please add French and Dutch. Dankjewel!
        
         | justinmayer wrote:
         | As a credited contributor to the EDICT[1] Japanese/English
         | dictionary, I am very pleased to see its successor JMdict[2]
         | actively supported by this project. Bravo!
         | 
         | And as someone who now also speaks Italian, I am even more
         | pleased to see that Italian support will be added tomorrow.
         | 
         | It is wonderful to see such a useful tool released as an open-
         | source, self-hosted project. (^_^)
         | 
         | [1] EDICT: http://edrdg.org/jmdict/edict_doc_2009.html
         | 
         | [2] JMdict: https://en.wikipedia.org/wiki/JMdict
        
           | simjanos-dev wrote:
           | Hi!
           | 
           | Thank you for helping me learn Japanese! :)
           | 
           | Can you please explain what you mean by actively
           | supporting JMdict? I hope I didn't make an attribution
           | mistake or misunderstand something. My understanding is that
           | I can use those files in my project as long as I follow the
           | license guidelines.
           | 
           | It makes me really happy that so many people are interested
           | in it. :)
        
             | justinmayer wrote:
             | Sorry for the confusing language choice on my part. I just
             | meant that I think it's great that your project supports
             | JMdict. I think how you are using JMdict is indeed totally
             | okay! :^)
        
               | simjanos-dev wrote:
               | Oh, okay. Thank you!
        
         | qnleigh wrote:
         | Very cool! How do you handle segmenting sentences into
         | individual words in Japanese? I've been building a similar app
         | for Android, but gave up on Japanese partly because segmenting
         | was so unreliable.
        
           | simjanos-dev wrote:
           | Hi! Thank you so much. I am using the spaCy tokenizer with
           | Python.
        
         | pm3003 wrote:
         | Seems great, I'll test it soon!
         | 
         | I know Christmas is over, but my letter to Santa would include:
         | 
         | - some Anki sync feature (over an external Anki sync server or
         |   any other solution)
         | - a non-docker install guide
         | - of course more languages!
         | 
         | I've been looking for a tool to study vocabulary this way,
         | especially in languages I'm already fluent in, to learn more
         | nuances or specific meanings to some words. Having tried
         | several things I settled on the bookmark feature of my
         | Wiktionary Android apps (Livio's, which are nice), and a small
         | sync/script chain that would let me review words, compare
         | definitions in different dictionaries, choose the best and
         | edit/complete it, and make an Anki card of it. The whole
         | process was still tedious.
        
         | jdeisenberg wrote:
         | I don't see a link on that page where I can download the
         | software. (I am exceptionally slow-thinking today, so it may be
         | in a very obvious place and I have overlooked it.)
        
           | BigElephant wrote:
           | https://github.com/simjanos-dev/LinguaCafe?tab=readme-ov-
           | fil...
        
         | kegs_ wrote:
         | How are you using the service on a Boox tablet? Follow-up: what
         | kind of battery draw does it have on the tablet?
        
           | simjanos-dev wrote:
           | I installed it on a PC, and access it from my tablet's
           | browser. I do not know how much battery draw it has.
        
         | ipsi wrote:
         | I think the Jellyfin integration could be more than just a
         | niche feature. I've used https://www.languagereactor.com/, but
         | that only supports Netflix & YouTube, which is a bit limiting.
         | 
         | Reasons it's useful:
         | 
         | * If you've got both Native & Target Language subtitles, you can
         |   see a natural translation if you're struggling to understand
         |   something
         | * If there isn't a Native translation, then you can
         |   machine-translate one - especially useful early on to catch
         |   common idioms/etc that aren't just the sum of each individual
         |   word.
         | * Jellyfin _also_ supports eBooks, although its reader isn't
         |   great - but if someone has already built their library, it
         |   would be nice to be able to re-use it somehow.
         | 
         | I would be very interested in seeing that particular feature
         | expand, but I don't imagine it's at all simple!
         | 
         | Tangentially related, but I could see some desire for Calibre
         | support as well, somehow. Calibre was very much designed to be
         | completely stand-alone and it doesn't really support other apps
         | trying to read its database, but it is possible.
         | 
         | I'd also really like some language-specific features, like
         | separable-verb handling for German (see this comment:
         | https://news.ycombinator.com/item?id=38915786) - it's
         | relatively important and lacking support really limits the
         | usefulness of vocab tools. It would also be a _nightmare_ to
         | handle for subtitles, since it's not always clear where a
         | sentence ends, but such is life - subtitles are sadly not aimed
         | at language learners. For books and not-terrible Podcast
         | transcripts, though, it wouldn't be so bad.
        
       | acheong08 wrote:
       | Nice! I was just trying to learn Japanese this week but Duolingo
       | is painful
        
       | tenaf0 wrote:
       | I have been working on a similar project on-and-off in my spare
       | time, the only remotely interesting feature that other similar
       | software may not have is that it actually tries to parse/analyze
       | sentences (with an NLP lib). It's made specifically for German,
       | and the reason why I wanted to make it is that no existing
       | software managed to handle separable verbs properly - for example
       | learning "Wir fangen jetzt an." is just wrong if you learn it as
       | 'fangen' and 'an' separately, you actually care about 'anfangen',
       | dictionary-wise.
       | 
       | It unfortunately does have false positives (a complete solution
       | would require LLMs, I believe, rather than the much less
       | complicated NLP algorithms - I just don't want to send whole books
       | to ChatGPT, as that would quickly become expensive), but I found
       | it usable, so I made it public now: https://github.com/tenaf0/lwt
       | 
       | I don't want to "advertise" it even more, as the NLP lib is run
       | by academia as a free service, and I don't want to overburden it
       | (I have been planning on hosting it myself, but didn't yet get
       | there).
        
         | jchook wrote:
         | You could potentially use an NLP library like spaCy, or even
         | bundle with a free fine-tuned LLM like Mistral 7B.
         | 
         | The fine-tuned Mistral models are known to out-perform GPT-4 on
         | their specific tasks.
        
         | ipsi wrote:
         | Interesting! I have a partially built, related tool to
         | extract "words" from e-books, so I could build flashcard lists
         | and make sure I knew the majority of words that were used -
         | most of them would be common words but every book has a
         | decently-sized selection of specialised vocabulary. I did think
         | about trying to get something fancy done with an LLM or an NLP
         | for figuring out the separable verbs, but in the end, I took a
         | very... brute-force approach, basically grabbing the final word
         | in the "phrase", then prepending that to every word in the
         | phrase one by one and asking "is this a known separable verb?"
         | - I'm not sure how _well_ it worked, but that's a different
         | story.
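
The brute-force check described in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions, not ipsi's actual code: the function name `find_separable_verbs` and the tiny word list `KNOWN_SEPARABLE_VERBS` are hypothetical, and a real tool would load a proper dictionary. Note that it only matches when the glued-together form appears in the list as-is, so conjugated forms such as "fängt ... an" are missed unless the list contains them or the words are lemmatized first.

```python
import re

# Hypothetical, tiny stand-in for a real list of separable verbs.
KNOWN_SEPARABLE_VERBS = {"anfangen", "aufstehen", "einkaufen", "mitkommen"}


def find_separable_verbs(phrase, known_verbs=KNOWN_SEPARABLE_VERBS):
    """Prepend the last word of the phrase to each earlier word and
    return every combination that is a known separable verb."""
    # Lowercase and keep only runs of letters (drops punctuation and digits).
    words = re.findall(r"[^\W\d_]+", phrase.lower())
    if len(words) < 2:
        return []
    prefix = words[-1]  # candidate separated prefix, e.g. "an"
    matches = []
    for word in words[:-1]:
        candidate = prefix + word  # e.g. "an" + "fangen"
        if candidate in known_verbs:
            matches.append(candidate)
    return matches


print(find_separable_verbs("Wir fangen jetzt an."))
```

With the sample list, "Wir fangen jetzt an." yields "anfangen", the dictionary form tenaf0 mentions earlier in the thread. The false positives and misses of this approach are exactly where a lemmatizer or dependency parser would come in.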
        
       | katspaugh wrote:
       | Nice work!
       | 
       | It's great that you can track your progress in this app!
       | 
       | When I was learning German, I used the dictionary lookup on
       | Kindle a lot and made a web app to extract that vocabulary as
       | Anki flashcards. It's available on https://fluentcards.com. The
       | code is open source on GitHub.
        
         | mdaniel wrote:
         | Because I didn't see any obvious link to said repo, for
         | convenience: https://github.com/katspaugh/fluentcards and
         | https://github.com/katspaugh/fluentcards-grammar
         | 
         | Being the resident licensing pedant, I'll point out that
         | neither of those repos has any licensing information aside
         | from package.json and I doubt gravely that's strong enough for
         | any contributor's comfort level.
        
       | mtalantikite wrote:
       | This looks great! One feature request I'd make is to load up two
       | versions of the same text in both your source and target
       | languages to have them displayed side by side. Bonus would be to
       | have the audiobook as well, or some sort of text to speech.
       | Basically, L-R method (Assimil) [1] but for any book!
       | 
       | [1] https://learnanylanguage.fandom.com/wiki/Listening-
       | Reading_M...
        
       ___________________________________________________________________
       (page generated 2024-01-08 23:00 UTC)