[HN Gopher] Show HN: Vocab Miner - find new words in Spanish fro...
       ___________________________________________________________________
        
       Show HN: Vocab Miner - find new words in Spanish from texts
        
       Author : jsjoeio
       Score  : 24 points
       Date   : 2023-12-28 15:17 UTC (7 hours ago)
        
 (HTM) web link (vocabminer.com)
        
       | eetus wrote:
       | I thought that this would look for unique words rather than just
       | cycle through all words. You should have a default ignore-word
       | list that includes things like el, en, de, etc.
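The ignore-list idea above can be sketched roughly as follows. This is a minimal illustration, not Vocab Miner's actual code: the name `mine_words` and the tiny `DEFAULT_IGNORE` set are hypothetical, and a real list of common Spanish function words would be much longer.

```python
import re

# Hypothetical, deliberately tiny ignore list (assumption, not from the app).
DEFAULT_IGNORE = {"el", "la", "los", "las", "en", "de", "y", "a", "que", "un", "una"}


def mine_words(text, ignore=DEFAULT_IGNORE):
    """Return the unique words of `text` in first-seen order, minus ignored ones."""
    seen = set()
    result = []
    # [^\W\d_]+ matches runs of letters, including accented ones such as "ñ".
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        if word in ignore or word in seen:
            continue
        seen.add(word)
        result.append(word)
    return result


if __name__ == "__main__":
    print(mine_words("El gato y el perro en la casa de Ana"))
```

Lowercasing before deduplicating means "El" and "el" count as the same word, which is probably what a learner expects.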
        
         | dbrueck wrote:
         | > I thought that this would look for unique words rather than
         | just cycle through all words
         | 
         | It appears to be cycling through all unique words.
        
       | hombre_fatal wrote:
       | The one-at-a-time UX is too slow, I think. I'd rather see N words
       | at a time and select the ones I don't know.
       | 
       | I'm also not sure how to use this tool in a learning workflow
       | since you have to be able to copy and paste a bunch of text. I
       | guess the use-case is when you're reading articles online on a
       | desktop device, but as you read the article you can already
       | pinpoint the words that you don't know, yet this tool makes you
       | paste the text back into the tool and reconsume all the words
       | again.
       | 
       | A better version of this tool might be a browser plugin that lets
       | you click words as you read content online and add to a vocab
       | list. This way as you practice reading social media or news in
       | Spanish online, you can accumulate words and then do something
       | with them. Maybe export Anki cards or whatever it was that you
       | had planned.
        
         | dbrueck wrote:
         | Your comment sent me down a rabbit hole that led me to
         | https://www.languagereactor.com/ and it's great so far. Thanks!
        
           | davidzweig wrote:
           | Heh that's mine and Ognjen's project. :)
        
             | dbrueck wrote:
             | Well, thank you very much, because I'm really enjoying it.
             | So far I've only tried Netflix and PhrasePump, but both are
             | very helpful.
             | 
             | I'm honestly surprised by how well the Netflix integration
             | works. I have a good friend from China who has more or less
             | perfect English now, and I once asked him how he first
             | became proficient and he said that he really got going by
             | watching every episode of "Friends". I feel like
             | LanguageReactor's Netflix tools are the same idea but on
             | steroids!
        
         | qnleigh wrote:
         | Huh, I literally built an app that does what you describe!
         | Basically an e-reader for language learners called Polyreader.
         | It's a stand-alone app, not a browser plugin, but it has lots
         | of additional features like in-line translation. Maybe I should
         | finally open-source it!
        
       | david_allison wrote:
       | I clicked "let's mine" without inputting anything and the app
       | shows "-1 words remaining"
        
       | nescioquid wrote:
       | One suggestion for reducing the burden on your users would be to
       | start making predictions about what vocabulary your user already
       | knows, just based on what's known about word frequency and a
       | short quiz pulled from the text.
       | 
       | I copied in a long poem and it looked like I was going to be
       | prompted about whether I knew 1500+ "words" (are you lemmatizing
       | the input at all, BTW?). If your user knows the most common
       | verbs, they probably already know the prepositions, pronouns, and
       | other closed lexical classes of words (and vice versa). If your
       | user is familiar with less common vocabulary (e.g. something at
       | C1), raise the word frequency threshold for checking if the user
       | is familiar with the word. If your user is less familiar with
       | basic vocabulary, don't overwhelm them with moderate and advanced
       | vocabulary.
       | 
       | That would make the prompting portion more interesting -- you
       | select the most discriminating words to zero in on estimating the
       | user's ability (this is really how adaptive testing works). You
       | could gamify this too, by essentially establishing the user's
       | "vocab Elo" rating based on word frequency.
       | 
       | Admittedly, maybe my suggestion misses the point of your app if
       | the objective is to make sure you don't miss any new-to-you
       | vocab in a text. On the other hand, if you could do something
       | along the lines of my suggestion, you wouldn't overwhelm
       | beginners or exasperate more advanced learners.
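A very rough sketch of the frequency-threshold idea above. It uses a plain binary search over a frequency-ranked word list, which is a much cruder stand-in for real adaptive testing or an Elo-style rating. It assumes, simplistically, that a user who knows a word also knows every more frequent word. The function names and the sample word list are hypothetical.

```python
def estimate_known_count(words_by_frequency, knows):
    """Estimate how many of the most frequent words the user knows.

    `words_by_frequency` is ordered from most to least common.
    `knows` is a callback that quizzes the user on one word.
    Assumes knowledge is monotone in frequency (a strong simplification).
    """
    lo, hi = 0, len(words_by_frequency)
    while lo < hi:
        mid = (lo + hi) // 2
        if knows(words_by_frequency[mid]):
            lo = mid + 1  # user knows this word: threshold is further down
        else:
            hi = mid      # user does not: threshold is at or above mid
    return lo


def words_to_prompt(text_words, words_by_frequency, known_count):
    """Keep only the words that fall outside the presumed-known range."""
    presumed_known = set(words_by_frequency[:known_count])
    return [w for w in text_words if w not in presumed_known]


if __name__ == "__main__":
    # Made-up ranking for illustration only, not real frequency data.
    ranked = ["de", "la", "que", "casa", "perro", "efímero", "alféizar"]
    user_vocab = set(ranked[:5])
    n = estimate_known_count(ranked, lambda w: w in user_vocab)
    print(n, words_to_prompt(["la", "casa", "alféizar", "murciélago"], ranked, n))
```

With a list of N ranked words this asks about log2(N) quiz questions instead of N, at the cost of sometimes skipping a common word the user happens not to know.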
        
       | bsnnkv wrote:
       | I created something similar to this a while ago, but perhaps
       | significantly more niche.
       | 
       | I like reading classical Dari poetry, but I'm not a native
       | speaker. Every now and then, I read a couplet which has a word
       | that I don't know, or that the dictionaries available to me
       | (either from Iran or Afghanistan) don't provide clear
       | explanations for.
       | 
       | I indexed a whole bunch of works from classical poets from across
       | Central Asia, South Asia and the Middle East and created
       | https://baytyab.com/ which lets me put in one of those words that
       | I've come across, and see other couplets that the word has been
       | used in to help me get a better, contextual understanding of its
       | meaning(s) and usage(s).
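The word-to-couplet lookup described above is essentially an inverted index. The sketch below shows the general technique only; it is not how baytyab.com is built. The function names are hypothetical and the sample lines are English placeholders, not real poetry. Real Dari text would also need script-aware normalization that this skips.

```python
import re
from collections import defaultdict


def build_index(couplets):
    """Map each lowercased word to the set of couplet positions containing it."""
    index = defaultdict(set)
    for i, couplet in enumerate(couplets):
        for word in re.findall(r"[^\W\d_]+", couplet.lower()):
            index[word].add(i)
    return index


def lookup(index, couplets, word):
    """Return every couplet containing `word`, in original order."""
    return [couplets[i] for i in sorted(index.get(word.lower(), ()))]


if __name__ == "__main__":
    # Placeholder lines for illustration only.
    couplets = [
        "the rose opens at dawn",
        "no rose without a thorn",
        "the night is long",
    ]
    index = build_index(couplets)
    for line in lookup(index, couplets, "rose"):
        print(line)
```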
        
       ___________________________________________________________________
       (page generated 2023-12-28 23:01 UTC)