[HN Gopher] Etymological Wordnet
___________________________________________________________________
Etymological Wordnet
Author : polm23
Score : 61 points
Date : 2021-06-16 15:09 UTC (2 days ago)
(HTM) web link (etym.org)
(TXT) w3m dump (etym.org)
| captaindiego wrote:
| Not really involved in the field, but this makes me wonder, would
| training deep neural nets first on root languages such as Latin
| and Ancient Greek improve speed of learning subsequent languages
| that arose after? Essentially, make the steps that need to be
| learned between languages smaller, helping to bridge larger
| translation tasks between modern languages more quickly.
| kevinpet wrote:
| I can't help getting annoyed at anything showing "unexpected
| etymology" that ignores key elements of the structure of words.
| Fledermaus isn't a cognate with mouse, it's that the German word
| for bat is "flying mouse" (compare English flying fox for a type
| of big fruit bat).
| bobcostas55 wrote:
| I wanted to use this for a project but quickly discovered that
| it's quite limited. Just trying out a bunch of easy words often
| failed.
| cesis wrote:
| I often have trouble with claimed cognate words without any
| references or analysis. This very often applies also to
| Wiktionary.
|
| E.g. even the given example - I would find it believable that
| "muscle" cognates with Latvian "miesa", Russian "miasa" and
| English "meat", but "mouse" seems sketchy.
| kian wrote:
| I had a teacher in high school, specializing in ancient Latin
| and Greek, who told me about the musculus --> little mice
| connection. He was also a very strong Austrian -- to
| demonstrate the reason, he pulled up his shirt sleeve, made a
| classic bicep curl motion, and rotated his fist rapidly from
| inside to outside facing. It was pretty shocking to see the
| 'little mice' running under the 'covers' of the skin of his arm
| -- since then, I've never had trouble believing this particular
| hypothesis.
| arnsholt wrote:
| To add on this, musculus is not only an entirely regular
| diminutive derivation of mouse (mus + -culus), it's even the
| same pattern as is used for testicle: it's a small witness
| (testis, see also latinate English words like testify) of
| manhood.
| dredmorbius wrote:
| TIL:
|
| _testis (n.) (plural testes), 1704, from Latin testis
| "testicle," usually regarded as a special application of
| testis "witness" (see testament), presumably because it
| "bears witness to male virility" [Barnhart]. Stories that
| trace the use of the Latin word to some supposed swearing-
| in ceremony are modern and groundless._
|
| https://www.etymonline.com/word/testis
| johtso wrote:
| This is my go-to site for etymology
| https://www.etymonline.com/word/muscle
|
| Has a good description of this derivation.
| Ericson2314 wrote:
| Yeah I usually cross-reference etymonline and wiktionary, and
| I can't recall anything too suspicious.
| wolverine876 wrote:
| > The information is for the most part mined from Wiktionary.
|
| It's not a popular opinion here to criticize a star of the open
| Internet, but Wiktionary is not a reliable source of information
| (unless I misunderstand its provenance - it is crowd-sourced?). I
| love crowd-sourcing, but not for factual research.
|
| And this is how misinformation spreads - now someone builds
| another thing on top of Wiktionary.
|
| The core of the problem, however, is that the reliable sources of
| etymology, such as the Oxford English Dictionary, are not open
| and free. You can't just build a visualization of the
| relationships. Shame on the scholars for hiding the most
| precious, valuable treasures of civilization behind walls, when
| the miraculous opportunity finally came to share them with the
| world globally and freely (i.e., the Internet).
| jan_Inkepa wrote:
| In my experience wiktionary is a pretty great+reliable source
| for word etymology. I've corrected a few things, but generally
| it gets it right far more often than wrong, is good about
| citing sources, and it has an active + helpful community. It is
| pretty reliable in my experience for
| English/German/Latin/Chinese (in order of quantity of
| experience).
|
| Growing up, the etymologies in my dictionaries always stopped at
| Greek/Latin/Old English, which is a shame in some ways. Having
| easy access to wiktionary to plumb the depths further back to
| older reconstructed languages is a treat :)
|
| Also many etymological resources are open/free -
| http://pielexicon.hum.helsinki.fi/ for instance.
|
| [disclaimer: I can't compare it to closed resources, and I
| don't work professionally with this data.]
| wolverine876 wrote:
| > In my experience wiktionary is a pretty great+reliable
| source for word etymology. I've corrected a few things, but
| generally it gets it right far more often than wrong ...
|
| Serious question: How do you know if it's right or wrong?
|
| How do I know the Oxford English Dictionary is accurate? I
| trust it because experts trust it, because experts write it,
| and because it's had over a century to mature and it has
| retained its reputation for that long. Also, they show and
| cite actual quotes.
|
| > good about citing sources
|
| Ever check cites on Wikipedia? Many of them do not at all
| support what is written in the article.
| jan_Inkepa wrote:
| > Serious question: How do you know if it's right or wrong?
|
| I don't know much linguistics, but sometimes the sound-
| changes involved seem plausible to me which leads me to
| trust it ("yeah that makes sense").
|
| When I do follow the references they normally check out. In
| some cases where there weren't references I did some
| sleuthing myself and provided them (things checked out).
|
| In another case it seemed that two words should be marked
| as related, so I asked on the etymology scriptorium
| (https://en.wiktionary.org/wiki/Wiktionary:Etymology_scriptor...
| is their etymology messageboard - I find it highly
| entertaining to browse through people figuring out a random
| assortment of etymologies in the same place), and someone
| took the time to explain to me (to my satisfaction) that no
| it was just by random chance.
|
| A few other times when I've found problems (last one was
| some vowel-length inconsistencies for a Latin word entry)
| I've asked about them, had them confirmed as problems, and
| fixed them (in the case mentioned, it was reverting someone
| else's erroneous change).
|
| Also, some people I know who are a _lot_ more experienced
| with linguistics [still amateurs, but...very highly skilled
| amateurs] than I am contribute to it.
|
| All of these experiences lead me to have a high level of
| regard for wiktionary. As said, I'm not very experienced
| with linguistics, but I haven't gotten the sense that the
| people running the shop are anything other than competent.
|
| >Ever check cites on Wikipedia? Many of them do not at all
| support what is written in the article
|
| Dictionary citations of the sort you get on wiktionary tend
| to be less open to interpretation.
| wolverine876 wrote:
| Thanks for the thoughtful answer! IMHO:
|
| > seem plausible to me
|
| > someone took the time to explain to me (to my
| satisfaction)
|
| > I've asked about them, had them confirmed as problems,
| and fixed them
|
| Based on that, Wiktionary's standard of accuracy is what
| will be accepted by a non-expert. That's a pretty low
| standard; that's people talking in a college dorm room or
| in a bar. It's nothing personal - my intuitions on
| etymology are no better.
|
| That's also how misinformation is created and spreads;
| it's right out of the textbook. Research shows that our
| intuitions about what's true are terrible if we don't
| have expertise in the issue, and that is how we are most
| easily fooled.
|
| _"The first principle is that you must not fool
| yourself -- and you are the easiest person to fool."_ -
| Richard Feynman
| jan_Inkepa wrote:
| Yeah as a non-expert I can only vouch so far. I can say
| that the time I did my deepest dive (about the various
| Latin words that look like 'pila' if you ignore vowel
| length, and their etymology -
| https://en.wiktionary.org/wiki/pila#Latin), using
| wiktionary as a starting point for trawling 'real'
| scholarly references, I presented the work to my Latin
| teacher, a university professor of Ancient Greek, who
| said it checked out (though, he's a classicist and not a
| general linguist, so maybe that doesn't count. And maybe
| he was just humouring me).
|
| What we need is a linguist to get on here and give their
| take, and tell us how it fares compared to the
| professional resources.
| wolverine876 wrote:
| > What we need is a linguist to get on here and give
| their take, and tell us how it fares compared to the
| professional resources.
|
| Agreed. BTW, regardless of Wiktionary, I love the
| research - it's such creative, intriguing, stimulating
| work to explore those things. When people react, as
| (IMHO) they are conditioned to, to intellectual things
| with fear, hesitation and/or negativity, I think 'ugh,
| you are missing so much, the most beautiful things in
| this universe and many others.'
|
| And see my other comment about the OED. It's the best
| tool for it.
| damenleeturks wrote:
| What are your thoughts on etymonline.com?
| jan_Inkepa wrote:
| (To reiterate: I'm not very learned in this domain. This is
| just an amateur impression).
|
| For English it's pretty solid from what I've seen, and the
| way it presents etymologies as coherently written readable
| articles is more accessible than Wiktionary, which is a lot
| rawer. It's nice that it goes back before written sources
| to reconstructions of older languages (which OED doesn't do
| IIRC, but Wiktionary does). On the Proto-Indo-European
| language front Wiktionary is slightly more luxuriant in
| this regard because you can search through non-English
| languages as well and explore etymologies a bit more freely
| because of this.
|
| It seems to lack citations as to where the info comes
| from, which is a bit unfortunate, but I guess it's part of
| its friendly vibe? But it also obscures how people know
| this stuff, and makes it harder to fact-check. (I'd be
| surprised if they didn't have the references stored
| somewhere that's not getting published - I imagine they're
| something you'd want to keep track of as you're writing the
| articles).
|
| [I don't use Etymonline very much because I'm mostly
| looking at relating etymologies between words of different
| (Indo-) European languages I'm learning/know right now,
| rather than plumbing the origins of single English words.]
| jberkel wrote:
| Why is it not reliable? The same has been said about Wikipedia for
| a long time, but for some reason Wiktionary still does not get
| the same level of trust as Wikipedia. A lot of the etymologies
| on Wiktionary come from reputable sources such as the mentioned
| OED. In some cases there might be multiple conflicting sources
| and theories, and such complex cases are likely misrepresented
| by the automatic extraction tools.
| wolverine876 wrote:
| > The same has been said about Wikipedia for a long time, but
| for some reason Wiktionary still does not get the same level
| of trust as Wikipedia.
|
| I don't think Wikipedia is reliable at all. I can't speak for
| others.
|
| > A lot of the etymologies on Wiktionary come from reputable
| sources such as the mentioned OED.
|
| Wiktionary reuses OED content? I'm surprised the OED allows
| that.
| jberkel wrote:
| > Wiktionary reuses OED content? I'm surprised the OED
| allows that.
|
| Facts (etymologies) are not copyrightable. The exact same
| text can't be used of course, but it can be rephrased, or
| to some extent quoted under "Fair use".
| jan_Inkepa wrote:
| >Wiktionary reuses OED content?
|
| Not reuses wholesale, but cites as a source of information.
| [Which is what you want, right?]
| wolverine876 wrote:
| > Not reuses wholesale, but cites as a source of
| information. [Which is what you want, right?]
|
| Yes! Thanks.
|
| As an aside: I splurged on an OED subscription, which
| isn't cheap (they had a sale recently, maybe still going
| on, but usually it's something like $300/yr). If you care
| about concepts and ideas, I can't recommend it enough:
| It's a dictionary of every concept to which anyone has
| ever assigned a word (or term) in English, and in minutes
| you can see the concept from every perspective it's been
| seen, in every time and place, and the OED brings that
| together with the primary sources - the actual (brief)
| quotes where the person introduced the term. IME, it's
| also the best place to start for scientific and
| mathematical terms. If you look up "relativity" you get
| original quotes from Maxwell, Poincare, Einstein, etc.,
| as well as several paragraphs succinctly and clearly
| defining the Special Theory, General Theory, etc.
| jan_Inkepa wrote:
| Agreed; from my time using it at university, the OED's
| early usage sources are A+++. Definitely blows
| wiktionary's out of the water (in as much as you can blow
| something out of the water that doesn't exist to begin
| with).
| jamespwilliams wrote:
| I played with this data a while ago and made a little project
| which could plot graphs of the etymologies of words:
| https://github.com/jamespwilliams/etymology.
|
| I was thinking of deploying it somewhere but never got around to
| it.
| danans wrote:
| I love deeply traced etymologies, but sometimes shallow
| etymologies are pretty interesting too, especially in very common
| words, for example:
|
| dog: http://www.lexvo.com/info/eng/dog
|
| bird: http://www.lexvo.com/info/eng/bird
|
| which can only be traced back to Anglo-Saxon and have no deeper
| relatives in any known language.
| mzs wrote:
| This reminded me, there is more recent discussion than I was
| previously aware of about the etymology of kobieta (Polish word
| for woman), an outlier in Slavic for a word that one would
| naively expect to be ancient.
|
| https://forum.wordreference.com/threads/all-slavic-kobieta-w...
___________________________________________________________________
(page generated 2021-06-18 23:01 UTC)