[HN Gopher] Etymological Wordnet
       ___________________________________________________________________
        
       Etymological Wordnet
        
       Author : polm23
       Score  : 61 points
       Date   : 2021-06-16 15:09 UTC (2 days ago)
        
 (HTM) web link (etym.org)
 (TXT) w3m dump (etym.org)
        
       | captaindiego wrote:
       | Not really involved in the field, but this makes me wonder, would
       | training deep neural nets first on root languages such as Latin
       | and Ancient Greek improve speed of learning subsequent languages
       | that arose after? Essentially, make the steps that need to be
       | learned between languages smaller, helping to bridge larger
       | translation tasks between modern languages more quickly.
        
       | kevinpet wrote:
       | I can't help getting annoyed at anything showing "unexpected
       | etymology" that ignores key elements of the structure of words.
       | Fledermaus isn't a cognate with mouse, it's that the German word
       | for bat is "flying mouse" (compare English flying fox for a type
       | of big fruit bat).
        
       | bobcostas55 wrote:
       | I wanted to use this for a project but quickly discovered that
       | it's quite limited. Just trying out a bunch of easy words often
       | failed.
        
       | cesis wrote:
       | I often have trouble with claimed cognate words without any
       | references or analysis. This very often applies also to
       | Wiktionary.
       | 
       | E.g. even the given example - I would find it believable that
       | "muscle" cognates with Latvian "miesa", Russian "miasa" and
       | English "meat", but "mouse" seems sketchy.
        
         | kian wrote:
         | I had a teacher in high school, specializing in ancient Latin
         | and Greek, who told me about the musculus --> little mice
         | connection. He was also a very strong Austrian -- to
         | demonstrate the reason, he pulled up his shirt sleeve, made a
         | classic bicep curl motion, and rotated his fist rapidly from
         | inside to outside facing. It was pretty shocking to see the
         | 'little mice' running under the 'covers' of the skin of his arm
         | -- since then, I've never had trouble believing this particular
         | hypothesis.
        
           | arnsholt wrote:
           | To add to this, musculus is not only an entirely regular
           | diminutive of mus "mouse" (mus + -culus), it's even the
           | same pattern as is used for testicle: it's a small witness
           | (testis, see also latinate English words like testify) of
           | manhood.
        
             | dredmorbius wrote:
             | TIL:
             | 
             |  _testis (n.) (plural testes), 1704, from Latin testis
             | "testicle," usually regarded as a special application of
             | testis "witness" (see testament), presumably because it
             | "bears witness to male virility" [Barnhart]. Stories that
             | trace the use of the Latin word to some supposed swearing-
             | in ceremony are modern and groundless._
             | 
             | https://www.etymonline.com/word/testis
        
         | johtso wrote:
         | This is my go to site for etymology
         | https://www.etymonline.com/word/muscle
         | 
         | Has a good description of this derivation.
        
           | Ericson2314 wrote:
           | Yeah I usually cross-reference etymonline and wiktionary, and
           | I can't recall anything too suspicious.
        
       | wolverine876 wrote:
       | > The information is for the most part mined from Wiktionary.
       | 
       | It's not a popular opinion here to criticize a star of the open
       | Internet, but Wiktionary is not a reliable source of information
       | (unless I misunderstand its provenance - it is crowd-sourced?). I
       | love crowd-sourcing, but not for factual research.
       | 
       | And this is how misinformation spreads - now someone builds
       | another thing on top of Wiktionary.
       | 
       | The core of the problem, however, is that the reliable sources of
       | etymology, such as the Oxford English Dictionary, are not open
       | and free. You can't just build a visualization of the
       | relationships. Shame on the scholars for hiding the most
       | precious, valuable treasures of civilization behind walls, when
       | the miraculous opportunity finally came to share them with the
       | world globally and freely (i.e., the Internet).
        
         | jan_Inkepa wrote:
         | In my experience wiktionary is a pretty great+reliable source
         | for word etymology. I've corrected a few things, but generally
         | it gets it right far more often than wrong, is good about
         | citing sources, and it has an active + helpful community. It is
         | pretty reliable in my experience for
         | English/German/Latin/Chinese (in order of quantity of
         | experience).
         | 
         | When I was growing up, the etymologies in my dictionaries
         | always stopped at Greek/Latin/Old English, which is a shame in
         | some ways. Having easy access to wiktionary to plumb the depths
         | further back to older reconstructed languages is a treat :)
         | 
         | Also many etymological resources are open/free -
         | http://pielexicon.hum.helsinki.fi/ for instance.
         | 
         | [disclaimer: I can't compare it to closed resources, and I
         | don't work professionally with this data.]
        
           | wolverine876 wrote:
           | > In my experience wiktionary is a pretty great+reliable
           | source for word etymology. I've corrected a few things, but
           | generally it gets it right far more often than wrong ...
           | 
           | Serious question: How do you know if it's right or wrong?
           | 
           | How do I know the Oxford English Dictionary is accurate? I
           | trust it because experts trust it, because experts write it,
           | and because it's had over a century to mature and it has
           | retained its reputation for that long. Also, they show and
           | cite actual quotes.
           | 
           | > good about citing sources
           | 
           | Ever check cites on Wikipedia? Many of them do not at all
           | support what is written in the article.
        
             | jan_Inkepa wrote:
             | > Serious question: How do you know if it's right or wrong?
             | 
             | I don't know much linguistics, but sometimes the sound-
             | changes involved seem plausible to me which leads me to
             | trust it ("yeah that makes sense").
             | 
             | When I do follow the references they normally check out. In
             | some cases where there weren't references I did some
             | sleuthing myself and provided them (things checked out).
             | 
             | In another case it seemed that two words should be marked
             | as related, so I asked on the etymology scriptorium (
             | https://en.wiktionary.org/wiki/Wiktionary:Etymology_scriptor...
             | is their etymology messageboard - I find it highly
             | entertaining to browse through people figuring out a random
             | assortment of etymologies in the same place), and someone
             | took the time to explain to me (to my satisfaction) that no
             | it was just by random chance.
             | 
             | A few other times when I've found problems (last one was
             | some vowel-length inconsistencies for a Latin word entry)
             | I've asked about them, had them confirmed as problems, and
             | fixed them (in the case mentioned, it was reverting someone
             | else's erroneous change).
             | 
             | Also, some people I know who are a _lot_ more experienced
             | with linguistics [still amateurs, but...very highly skilled
             | amateurs] than I am contribute to it.
             | 
             | All of these experiences lead me to have a high level of
             | regard for wiktionary. As said, I'm not very experienced
             | with linguistics, but I haven't gotten the sense that the
             | people running the shop are anything other than competent.
             | 
             | >Ever check cites on Wikipedia? Many of them do not at all
             | support what is written in the article
             | 
             | Dictionary citations of the sort you get on wiktionary tend
             | to be less open to interpretation.
        
               | wolverine876 wrote:
               | Thanks for the thoughtful answer! IMHO:
               | 
               | > seem plausible to me
               | 
               | > someone took the time to explain to me (to my
               | satisfaction)
               | 
               | > I've asked about them, had them confirmed as problems,
               | and fixed them
               | 
               | Based on that, Wiktionary's standard of accuracy is what
               | will be accepted by a non-expert. That's a pretty low
               | standard; that's people talking in a college dorm room or
               | in a bar. It's nothing personal - my intuitions on
               | etymology are no better.
               | 
               | That's also how misinformation is created and spreads;
               | it's right out of the textbook. Research shows that our
               | intuitions about what's true are terrible if we don't
               | have expertise in the issue, and that is how we are most
               | easily fooled.
               | 
               |  _"The first principle is that you must not fool
               | yourself -- and you are the easiest person to fool."_ -
               | Richard Feynman
        
               | jan_Inkepa wrote:
               | Yeah as a non-expert I can only vouch so far. I can say
               | that the time I did my deepest dive (about the various
               | Latin words that look like 'pila' if you ignore vowel
               | length, and their etymology -
               | https://en.wiktionary.org/wiki/pila#Latin), using
               | wiktionary as a starting point for trawling 'real'
               | scholarly references I presented the work to my Latin
               | teacher, a university professor of Ancient Greek, who
               | said it checked out (though, he's a classicist and not a
               | general linguist, so maybe that doesn't count. And maybe
               | he was just humouring me).
               | 
               | What we need is a linguist to get on here and give their
               | take, and tell us how it fares compared to the
               | professional resources.
        
               | wolverine876 wrote:
               | > What we need is a linguist to get on here and give
               | their take, and tell us how it fares compared to the
               | professional resources.
               | 
               | Agreed. BTW, regardless of Wiktionary, I love the
               | research - it's such creative, intriguing, stimulating
               | work to explore those things. When people react, as
               | (IMHO) they are conditioned to, to intellectual things
               | with fear, hesitation and/or negativity, I think 'ugh,
               | you are missing so much, the most beautiful things in
               | this universe and many others.'
               | 
               | And see my other comment about the OED. It's the best
               | tool for it.
        
           | damenleeturks wrote:
           | What are your thoughts on etymonline.com?
        
             | jan_Inkepa wrote:
             | (To reiterate: I'm not very learned in this domain. This is
             | just an amateur impression).
             | 
             | For English it's pretty solid from what I've seen, and the
             | way it presents etymologies as coherently written readable
             | articles is more accessible than Wiktionary, which is a lot
             | rawer. It's nice that it goes back before written sources
             | to reconstructions of older languages (which OED doesn't do
             | IIRC, but Wiktionary does). On the Proto-Indo-European
             | language front Wiktionary is slightly more luxuriant in
             | this regard because you can search through non-English
             | languages as well and explore etymologies a bit more freely
             | because of this.
             | 
             | It seems to lack citations as to where the info comes
             | from, which is a bit unfortunate, but I guess it's part of
             | its friendly vibe? But it also obscures how people know
             | this stuff, and makes it harder to fact-check. (I'd be
             | surprised if they didn't have the references stored
             | somewhere that's not getting published - I imagine they're
             | something you'd want to keep track of as you're writing the
             | articles).
             | 
             | [I don't use Etymonline very much because I'm mostly
             | looking at relating etymologies between words of different
             | (Indo-) European languages I'm learning/know right now,
             | rather than plumbing the origins of single English words.]
        
         | jberkel wrote:
         | Why is it not reliable? The same has been said about Wikipedia
         | for a long time, but for some reason Wiktionary still does not get
         | the same level of trust as Wikipedia. A lot of the etymologies
         | on Wiktionary come from reputable sources such as the mentioned
         | OED. In some cases there might be multiple conflicting sources
         | and theories, and such complex cases are likely misrepresented
         | by the automatic extraction tools.
        
           | wolverine876 wrote:
           | > The same has been said about Wikipedia for a long time, but
           | for some reason Wiktionary still does not get the same level
           | of trust as Wikipedia.
           | 
           | I don't think Wikipedia is reliable at all. I can't speak for
           | others.
           | 
           | > A lot of the etymologies on Wiktionary come from reputable
           | sources such as the mentioned OED.
           | 
           | Wiktionary reuses OED content? I'm surprised the OED allows
           | that.
        
             | jberkel wrote:
             | > Wiktionary reuses OED content? I'm surprised the OED
             | allows that.
             | 
             | Facts (etymologies) are not copyrightable. The exact same
             | text can't be used of course, but it can be rephrased, or
             | to some extent quoted under "Fair use".
        
             | jan_Inkepa wrote:
             | >Wiktionary reuses OED content?
             | 
             | Not reuses wholesale, but cites as a source of information.
             | [Which is what you want, right?]
        
               | wolverine876 wrote:
               | > Not reuses wholesale, but cites as a source of
               | information. [Which is what you want, right?]
               | 
               | Yes! Thanks.
               | 
               | As an aside: I splurged on an OED subscription, which
               | isn't cheap (they had a sale recently, maybe still going
               | on, but usually it's something like $300/yr). If you care
               | about concepts and ideas, I can't recommend it enough:
               | It's a dictionary of every concept to which anyone has
               | ever assigned a word (or term) in English, and in minutes
               | you can see the concept from every perspective it's been
               | seen, in every time and place, and the OED brings that
               | together with the primary sources - the actual (brief)
               | quotes where the person introduced the term. IME, it's
               | also the best place to start for scientific and
               | mathematical terms. If you look up "relativity" you get
               | original quotes from Maxwell, Poincare, Einstein, etc.,
               | as well as several paragraphs succinctly and clearly
               | defining the Special Theory, General Theory, etc.
        
               | jan_Inkepa wrote:
               | Agreed; from my time using it at university, the OED's
               | early usage sources are A+++. Definitely blows
               | wiktionary's out of the water (in as much as you can blow
               | something out of the water that doesn't exist to begin
               | with).
        
       | jamespwilliams wrote:
       | I played with this data a while ago and made a little project
       | which could plot graphs of the etymologies of words:
       | https://github.com/jamespwilliams/etymology.
       | 
       | I was thinking of deploying it somewhere but never got around to
       | it.
        
       | danans wrote:
       | I love deeply traced etymologies, but sometimes shallow
       | etymologies are pretty interesting too, especially in very common
       | words, for example:
       | 
       | dog: http://www.lexvo.com/info/eng/dog
       | 
       | bird: http://www.lexvo.com/info/eng/bird
       | 
       | which can only be traced back to Anglo-Saxon and have no deeper
       | relatives in any known language.
        
       | mzs wrote:
       | This reminded me, there is more recent discussion than I was
       | previously aware of about the etymology of kobieta (Polish word
       | for woman), an outlier in Slavic for a word that one would
       | naively expect to be ancient.
       | 
       | https://forum.wordreference.com/threads/all-slavic-kobieta-w...
        
       ___________________________________________________________________
       (page generated 2021-06-18 23:01 UTC)