hngopher.com

       [HN Gopher] Array Languages: R vs. APL (2023)
       ___________________________________________________________________
        
       Array Languages: R vs. APL (2023)
        
       Author : todsacerdoti
       Score  : 120 points
       Date   : 2024-03-20 20:52 UTC (2 days ago)
        
 (HTM) web link (jcarroll.com.au)
 (TXT) w3m dump (jcarroll.com.au)
        
       | olliej wrote:
       | I personally think APL is wonderful simply because of the
       | original APL specific keyboard [1]
       | 
       | I've looked briefly at R and found the syntax and semantics to be
       | less than stellar. Obviously there's going to be some bias in
       | that sentiment due to me not generally doing "array programming",
       | but I don't believe the things that irked me were entirely as a
       | result of that.
       | 
       | The more annoying stuff for R is entirely second hand. As far as
       | I can tell R (or at least RStudio) maintains implicit state
       | between runs, which means you can get to a position where the same
       | code works on some runs, and then not on later runs. My friend
       | was having to do a lot of bioinformatics processing (many of the
       | libraries for this are in R) and was constantly fighting to keep
       | the code she wrote to process the data or produce charts working
       | (publications in bioinformatics have an acceptance bias for
       | "looks like it came from R" that is similar to what CS [used to?]
       | have for gnuplot). But you could run the same scripts on the
       | same input and have it fail where previously it worked. This is
       | before you deal with inter-version compatibility problems which
       | also seemed frequent.
       | 
       | What was irksome to me looking at a lot of the stuff they were
       | doing is that it was fundamentally mostly basic scripting stuff
       | you could do in other languages trivially (and more cleanly imo)
       | but there were a bunch of functions (builtin or from libraries?)
       | that did the work, but those functions weren't in R, so the
       | claims that R was "necessary" seemed fairly bogus to me.
       | 
       | [1]
       | https://en.wikipedia.org/wiki/APL_(programming_language)#/me...
        
         | crispyambulance wrote:
         | > [R/RStudio] maintains implicit state between runs...
         | 
         | That can be turned off; in fact, it is widely recommended not
         | to keep one's workspace between runs.
         | 
         | > This is before you deal with inter-version compatibility
         | problems which also seemed frequent.
         | 
         | Yeah, that can be a problem with libraries (as it is with
         | python dependencies). It really afflicts long-running projects.
         | R has taken a cue from the python world there. renv is the best
         | way (IMHO) to maintain a reproducible environment in R
         | (https://rstudio.github.io/renv/articles/renv.html).
         | 
         | R is nicely cogent in syntax and largely "just works" once you
         | accept its idiosyncrasies.
        
         | goosedragons wrote:
         | You can save your workspace (state) in R. It's generally bad
         | practice to do so.
         | 
         | R is VERY VERY good at handling tabular data. Python can get
         | kind of close with Pandas but IMO, it's still more awkward than
         | base R data frames and way worse than data.table.
         | 
         | R also has a lot of built-ins geared for statistics and built
         | by statisticians. If you're doing statistics there's value in
         | not having to find a library or libraries that do that.
        
         | rzmmm wrote:
         | R has a lot of high-quality packages which implement e.g.
         | frequently used sophisticated regression analysis algorithms.
         | Python has these too but in my experience they are not that
         | well tested and suffer from bugs.
        
       | weinzierl wrote:
       | > _" So, would APL be "readable" if I was more familiar with it?
       | Let's find out!"_
       | 
       | An alternative test for this hypothesis might have been using the
       | language J, which is an array language based on APL and by the
       | designer of APL but only using ASCII characters.
        
         | nonfamous wrote:
         | R itself could be considered a test of this hypothesis, too.
         | It's been said that elegant, powerful Lisp would be more widely
         | adopted if it wasn't for all those gosh-darned parentheses.
         | 
         | Well, at its core R _is_ a Lisp (specifically, Scheme) but with
         | a more traditional syntax (infixed operators, function calls,
         | etc). And it's fair to say the adoption of R has, indeed, been
         | more widespread than that of Lisp.
        
           | anthk wrote:
           | J is standalone; it doesn't use APL in the background.
        
           | mik1998 wrote:
           | I'm not sure I would come to this conclusion. R has some
           | adoption, but it's also really not used as a generic
           | programming language, which most Lisp dialects are.
        
             | aydyn wrote:
             | That has more to do with (poor) performance, not syntax. At
             | its core, R's source code written in C is still very badly
             | optimized and not performant at all.
        
           | dan-robertson wrote:
           | I'm not totally convinced that being 'secretly a lisp' is
           | what was good about R. I think the easy vectorisation is
           | good, and the consequences of the bizarre function argument
           | evaluation are good. I don't know of lisps that do the
           | vectorisation stuff so naturally, and while I guess fexprs
           | are a thing, I think they are possibly too general in the
           | syntax they can accept - basically the simplicity of lisp
           | syntax allows macros to have more tree-structured input in a
           | way you wouldn't want for a language with non-lisp syntax
           | (where the head lives outside the list), and I think the
           | flexibility makes the syntax more confusingly non-uniform.
        
             | levocardia wrote:
             | A lot of R's popularity and usefulness has to do with the
             | libraries that are available in it. I would put up with
             | almost any amount of BS from base R to use ggplot and the
             | tidyverse, and ditto for a number of modern stats
             | algorithms. In many cases, Python implementations of the
             | same techniques are either woefully outdated or completely
             | nonexistent.
        
               | dan-robertson wrote:
               | When I've seen attempts at ggplot2 or dplyr in other
               | languages, one issue is bad performance or bugs, but it's
               | also been a problem that the language features seem to
               | allow those libraries to be much more ergonomic. eg I
               | found Julia much less nice to use for those sorts of
               | things despite it seeming like it ought to be well
               | suited (making a reasonably good claim to a lot of CL
               | heritage for example)
        
               | disgruntledphd2 wrote:
               | Plotnine is a pretty good implementation of ggplot2,
               | apart from the necessity to quote variable names it
               | seemed to work almost perfectly.
        
           | seanhunter wrote:
           | As someone who loved learning lisp and regrets that the long
           | course of my programming career has never led me to use it in
           | a professional capacity, I just don't buy it when people say
           | that parentheses are the reason people didn't adopt lisp more
           | widely. I would say the main reasons are:
           | 
           | 1) The language is so frikkin massive. Common lisp is a huge
           | language with hundreds and hundreds of built-in functions etc
           | and the standard came very late in its evolution so there is
           | a bunch of back compat cruft and junk that everyone has to
           | live with. The object system is a whole epic journey in
           | itself. You could probably kill or at least seriously injure
           | someone with the impact if they were lying down and you
           | dropped a copy of Guy Steele's excellent book[1] on them from
           | a standing height.
           | 
           | 2) The ecosystem is so fragmented. First you have Common
           | Lisp, which isn't very common at all. Then you have all the
           | vendor lisps. Then you have whether they have or don't have
           | clos to contend with. Elisp is a lisp but is not common lisp
           | and differs in some important ways that I don't quite
           | remember. Then there's scheme, and guile scheme (which isn't
           | quite the same) then clojure, etc etc.
           | 
           | 3) That meant that the tooling was basically all
           | simultaneously amazing and awful. As an example my uncle
           | wrote a tcp/ip stack in lisp for the symbolics lisp
           | machine[2] for a project when he worked at xerox. He told me
           | in the late 80s about features in the symbolics debugger that
           | just totally blew my mind and are only now available in IDEs
           | for other languages, like being able to step backwards, alter
           | variables, then step forward again, jump to any stack frame
           | and just resume execution from there etc etc. On the other
           | hand he had to write the TCP/IP stack himself because they
           | didn't have one. I think that perfectly encapsulates the lisp
           | experience for me around 2000 when I last used it - some
           | things worked amazingly and were way better than anything
           | else (eg I remember at the time the things you could do with
           | serialization being just extraordinary compared to other
           | languages) but a bunch of basic stuff was painful, janky or
           | just completely missing.
           | 
           | 4) Some of the concepts are very powerful but result in
           | programs that are incredibly hard to understand. Macros,
           | continuation passing, multiple dispatch.. etc etc. This puts
           | a lot of people off because they just hit the learning cliff
           | face-first and give up.
           | 
           | This is part of why python saw such wide adoption in my
           | opinion. Not because it was in any sense the best language,
           | but it was a very easy, practical choice for doing a bunch of
           | things.
           | 
           | [1] https://www.cs.cmu.edu/Groups/AI/html/cltl/cltl2.html .
           | Paul Graham (yes that Paul Graham) wrote a good lisp book
           | also, although for me Steele is the one.
           | 
           | [2] https://en.wikipedia.org/wiki/Symbolics
        
         | pavon wrote:
         | J primitives are easier to type, but they aren't any more
         | readable or familiar to newcomers than APL symbols.
        
           | anthk wrote:
           | Well at least you can define new tokens with ease.
        
       | bnprks wrote:
       | One of the wildest R features I know of comes as a result of lazy
       | argument evaluation combined with the ability to programmatically
       | modify the set of variable bindings. This means that functions
       | can define local variables that are usable by their arguments
       | (i.e. `f(x+1)` can use a value of `x` that is provided from
       | within `f` when evaluating `x+1`). This is used extensively in
       | practice in the dplyr, ggplot, and other tidyverse libraries.
       | 
       | I think software engineers often get turned off by the weird
       | idiosyncrasies of R, but there are surprisingly unique (arguably
       | helpful) language features most people don't notice. Possibly
       | because most of the learning material is data-science focused and
       | so it doesn't emphasize the bonkers language features that R has.
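A rough illustration of the feature described above, in Python rather than R. Python has no lazy arguments, so this sketch passes the expression as source text and lets the function decide which variable bindings it sees. The name `with_columns` and the sample table are hypothetical, and the sketch only approximates the behaviour; it is not how dplyr or ggplot are implemented.

```python
def with_columns(table, expr):
    """Evaluate `expr` (source text) using the table's columns as variables.

    Only the bindings supplied here are visible to the expression;
    the caller's own variables are deliberately not consulted.
    """
    bindings = dict(table)  # names provided by the function itself
    return eval(expr, {"__builtins__": {}}, bindings)


table = {"x": 10, "y": 5}
x = 1  # the caller's own x, ignored by with_columns
result = with_columns(table, "x + 1")  # evaluated with the table's x
```

In R the caller would write the bare expression `x + 1` instead of a string, because the argument is not evaluated until the function asks for it.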
        
         | VTimofeenko wrote:
         | Asking out of lack of experience with R: how does such
         | invocation handle case when `x` is defined with a different
         | value at call site?
         | 
         | In pseudocode:
         | 
         |       f =
         |         let x = 1 in # inner vars for f go here
         |         arg -> arg + 1 # function logic goes here
         | 
         |       # example one: no external value
         |       f (x+1) # produces 3 (arg := (x+1) = 2; return arg +1)
         | 
         |       # example two: x is defined in the outer scope
         |       let x = 4 in
         |       f (x+2) # produces 5 (arg := 4; return arg + 1)? Or 3 if
         |               # inner x wins as in example one?
        
           | hugh-avherald wrote:
           | Well the point is that the function can define its own logic
           | to determine the behaviour. Users can also (with some limits)
           | restrict the variable scope.
        
           | bnprks wrote:
           | If the function chooses to overwrite the value of a variable
           | binding, it doesn't matter how it is defined at the call site
           | (so inner x wins in your example). In the tidyverse
           | libraries, they often populate a lazy list variable (think
           | python dictionary) that allows disambiguating in the case of
           | name conflicts between the call site and programmatic
           | bindings. But that's fully a library convention and not
           | solved by the language.
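The disambiguation convention mentioned above can be sketched in Python. The handles `_data` and `_env` below are hypothetical stand-ins for the explicit "which source do I mean" objects (the tidyverse exposes `.data` and `.env` for this purpose); `eval_masked` is an invented name, and the expression is passed as text because Python lacks lazy arguments.

```python
from types import SimpleNamespace


def eval_masked(expr, data, env):
    """Evaluate `expr` with data columns shadowing caller variables.

    `_data` and `_env` give explicit access to each source when the
    same name exists in both.
    """
    scope = dict(env)        # caller's variables first...
    scope.update(data)       # ...then columns override on conflict
    scope["_data"] = SimpleNamespace(**data)
    scope["_env"] = SimpleNamespace(**env)
    return eval(expr, {"__builtins__": {}}, scope)


data = {"x": 10}
env = {"x": 4, "offset": 2}
implicit = eval_masked("x + offset", data, env)        # column x wins
explicit = eval_masked("_env.x + _data.x", data, env)  # both x values, named
```

As the comment says, this is purely a library convention: nothing in the language stops a function from choosing a different precedence.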
        
         | Avshalom wrote:
         | For those who haven't run into anything about this corner of R
         | before:
         | 
         | https://blog.moertel.com/posts/2006-01-20-wondrous-oddities-...
        
         | broomcorn wrote:
         | That sounds like asking for trouble. Someone coming from any
         | other programming language could easily forget that expression
         | evaluation is stateful. Better to be explicit and create an
           | object representing an expression. Tell me, at least, that the
         | variable is immutable in that context?
        
           | nerdponx wrote:
           | The whole magic is that expressions are in fact just objects
           | in the language. And no, there aren't any immutable bindings
           | here.
        
             | vharuck wrote:
             | It's crazy how literally R takes "Everything's an object."
             | While parentheses can be treated like syntax when writing
             | code, it's actually a function named `(`.
             | 
             | Of course, playing with magic sounds fun until you remember
             | you're trying to tell a computer to do a specific set of
             | steps. Then magic looks more like a curse.
        
           | bnprks wrote:
           | The good news is that most variables in R are immutable with
           | copy-on-write semantics. Therefore, most of the time
           | everything here will be side-effect-free and any weird
           | editing of the variable bindings is confined to within the
           | function. (The cases that would have side effects are very
           | uncommonly used in my experience)
        
         | empyrrhicist wrote:
         | I saw a funny presentation where Doug Bates said something
         | like: "This kind of evaluation opens the door to do many
         | strange and unspeakable things in R... for some reason Hadley
         | Wickham is very excited about this."
        
           | crest wrote:
           | Unspeakable horrors like changing `$[` in old Perl5 versions
           | to mess with someone's mind? Who doesn't like array indices
           | starting at 0, 1, ... or 42?
        
         | delusional wrote:
         | > I think software engineers often get turned off by the weird
         | idiosyncrasies of R
         | 
         | That was at least true when I was looking at it. I didn't get
         | it, but the data guys came away loving it. I came away from
         | that whole experience really appreciating how far you can get
         | with an "unclean" design if you persist, and how my gut feeling
         | of good (with all the heuristics for quality that entails) is
         | really very domain specific.
        
         | csimon80 wrote:
         | A lot of the time you're not actually using what is passed to
         | the function, but instead the name of the argument passed to
         | the function (f(x), instead of f('x')). Which, helps the user
         | with their query (dplyr) or configuration (ggplot2).
        
         | kkoncevicius wrote:
         | One of the stranger behaviours for me is that R allows you to
         | combine infix operators with assignments, even though there are
         | no implemented instances of it in R itself. For example:
         | 
         |       `%in%<-` <- function(x, y, value) { x[x %in% y] <- value; x}
         | 
         |       x <- c("a", "b", "c", "d")
         |       x %in% c("a", "c") <- "o"
         |       x
         |       [1] "o" "b" "o" "d"
         | 
         | Or slightly crazier:
         | 
         |       `<-<-` <- function(x, y, value) paste0(y, "_", value)
         | 
         |       "a" -> x <- "b"
         |       x
         |       [1] "a_b"
         | 
         | Antoine Fabri and I created a package that uses this
         | behaviour for some clever replacement operators [1], but beyond
         | that I don't see where this could be useful in real practice.
         | 
         | [1]: https://github.com/moodymudskipper/inops
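For readers unfamiliar with R replacement functions: an assignment of the form `f(x, y) <- value` is evaluated roughly as `x <- "f<-"(x, y, value)`, meaning the function returns a modified copy and R rebinds `x`. A minimal Python sketch of the first example above, with the rebinding written out by hand (`in_assign` is a hypothetical name):

```python
def in_assign(x, y, value):
    """Return a copy of x with every element that occurs in y replaced by value."""
    return [value if item in y else item for item in x]


x = ["a", "b", "c", "d"]
x = in_assign(x, ["a", "c"], "o")  # explicit rebinding mirrors R's implicit `x <- ...`
```

The only unusual part in R is that the rebinding is hidden behind the assignment syntax, which is what lets an infix operator such as `%in%` appear on the left-hand side.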
        
         | staplung wrote:
         | I had a colleague at Google who used to say: "The best thing
         | about R is that it was created by statisticians. The worst
         | thing about R was that it was created by statisticians."
        
         | HdS84 wrote:
         | I once needed to implement an API in R, just saying that having
         | three or four object oriented systems did not help at all.
        
       | adregan wrote:
       | I had been wanting to sign up for exercism, and I love APL, so
       | this was a good nudge. However, I'm browsing the language tracks,
       | and I don't see APL. I see this post is from July 2023--has the
       | APL track been removed since then? Or am I just looking in the
       | wrong place?
        
         | jasonpeacock wrote:
         | From the post:
         | 
         | > APL isn't on the list of [Exercism] languages but I've seen
         | it in codegolf (codegolf.stackexchange.com) solutions often
         | enough that it seemed worth a look.
        
           | adregan wrote:
           | Thanks. I think I parsed that as "isn't on the list of
           | languages that I had wanted to learn."
        
       | anthk wrote:
       | J is interesting too, without the non-ASCII mess:
       | 
       | https://www.jsoftware.com/indexno.html
       | 
       | https://code.jsoftware.com/wiki/System/Installation <- install
       | 
       | https://code.jsoftware.com/wiki/Guides/Getting_Started <- help
        
         | nmz wrote:
         | You can ignore this, given that I haven't used either APL/J
         | seriously, but if I were to truly dive in, I'd lean towards APL
         | exactly because of its non-ascii/symbolic leanings. the only
         | similitude I know of is operator overloading, and whenever that
         | is used, I have to relearn what each operator does in a certain
         | context. it is only if you use it regularly like regex which
         | while changing the meaning of the operators, since its an
         | entire DSL, is too different for me to think + means sum. If
         | another entirely different symbol was introduced, then I'm not
         | assigning any functionality to it, which is why I think it
         | should be easier.
        
       | bruturis wrote:
       | >> find the GCD (greatest common divisor) of the smallest and
       | largest numbers in an array
       | 
       | Just for a short comparison, in J the analogous code is
       | 
       |       </ +. >/
       | 
       | where / is for reduce, +. is for the GCD, and the LCM is *.
       | 
       | The basic idea of J notation is using some small change to mean
       | the contrary, for example {. for first and {: for last, {. for
       | take and }. for drop (one symbol can be used as a unary or binary
       | operator with different meanings). So if floor is <. you can guess
       | what will be the symbol for roof. For another example /:~ is for
       | sorting in ascending order and I imagine that you can guess what
       | is the symbol for sorting in descending order.
       | 
       | In a sense, J notation includes some semantic meaning, and an LLM
       | could use that notation to try to change an algorithm. So perhaps
       | someone could think about how to expand this idea so that an LLM
       | can generate new algorithms.
       | 
       | The matrix m, the sum of the rows, and the maximum of the sum of
       | the rows in J (separated by ;)
       | 
       |       m ; (+/ m) ; >./ +/ m
       |       +-----+-------+--+
       |       |0 1 2|9 12 15|15|
       |       |3 4 5|       |  |
       |       |6 7 8|       |  |
       |       +-----+-------+--+
        
         | KarlKode wrote:
         | I think you mistyped the J code. I don't know any J, but from
         | what I understood from your comment it should be something like
         | </ +. >/ *.
        
           | bruturis wrote:
           | You are right, the correct code is <./ +. >./
           | 
           | To understand this you need to know that <. and >. are the
           | min and max functions, and that in J three functions
           | separated by spaces, f g h, constitutes a new function
           | mathematically defined by (f g h)(x) = g(f(x), h(x)). An
           | example is (+/ % #) which applied to a list gives the mean of
           | the list. Here +/ gives the total, # gives the number of
           | elements and % is the quotient.
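The fork rule (f g h)(x) = g(f(x), h(x)) translates directly into a small higher-order function. A Python sketch, where `fork` is a hypothetical helper name and the two trains correspond to the J phrases `+/ % #` and `<./ +. >./` discussed above:

```python
from math import gcd
from operator import truediv


def fork(f, g, h):
    """Return the function x -> g(f(x), h(x)), like a monadic J fork (f g h)."""
    return lambda x: g(f(x), h(x))


mean = fork(sum, truediv, len)         # J: +/ % #
gcd_of_extremes = fork(min, gcd, max)  # J: <./ +. >./

mean_value = mean([1, 2, 3, 4])               # sum divided by count
extremes_gcd = gcd_of_extremes([12, 18, 30])  # gcd of smallest and largest
```

The middle function combines the results of the outer two, which is why the mean reads as "total, divided by, count".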
        
         | kqr wrote:
         | > So if floor is <. you can guess what will be the symbol for
         | roof.
         | 
         | Based on the examples, no, I cannot. It could be either of <:
         | and >.
        
           | bruturis wrote:
           | You are right, both are good options, the author of J chose
           | >. for ceiling and >: for greater than or equal.
        
       | AndyKluger wrote:
       | Not an array language (AFAIU), but here are some of the mentioned
       | problems solved in (glorious) Factor:
       | 
       |         : find-gcd ( nums -- gcd )
       |             [ infimum ] [ supremum ] bi gcd nip ;
       | 
       |         : max-wealth ( accounts -- n )
       |             [ sum ] map-supremum ;
       | 
       |         : which-max-wealth ( accounts -- i )
       |             [ sum ] supremum-by* drop ;
       | 
       |         primes-upto
        
       | cl3misch wrote:
       | > what if we just generate all products from the set of numbers
       | 2:n and exclude those as "not prime" from all the numbers up to
       | n?
       | 
       | It's fun to translate terse APL to somewhat terse numpy. The
       | result still can be very compact and you can parse it easily if
       | you're used to looking at numpy:
       | 
       |         from numpy import arange, outer
       |         s = arange(2, 50); p = outer(s, s).ravel(); sorted(set(s) - set(p))
        
         | jonocarroll wrote:
         | What's interesting there is that numpy is inspired (more than a
         | little) by APL and aims to bring that 'array' thinking to
         | python. I agree that thinking in this 'array' way helps to
         | better construct a solution in any language, so I'm leaning
         | towards 'designing' with APL glyphs, even if that's not the
         | language I'm implementing the thing in.
        
           | nerdponx wrote:
           | If it takes any inspiration from APL, it would be mostly
           | indirect, via Matlab.
        
             | jonocarroll wrote:
             | I've seen the connection made here
             | https://dev.to/bakerjd99/numpy-another-iverson-ghost-9mc
             | though the link to the source quote is broken. In all
             | fairness, Matlab is itself inspired by APL.
        
         | bruturis wrote:
         | Analogous code in J,
         | 
         |       /:~ s -. p [ p =: s*/s [ s=: 2+i.48
         | 
         | An exercise for numpy, test that GCD(x,y) * LCM(x,y) = x*y
         | using 1000 random numbers in the range 0..99 for x and y.
         | 
         |       test =:  (* = *. * +.) & ?
         |       *./ test~  1000 # 100
        
           | cl3misch wrote:
           | Thanks, that was fun.
           | 
           |         from numpy import arange
           |         from numpy.random import randint
           | 
           |         def d(x): N = arange(1, x + 1); return N[x % N == 0]
           |         def m(x, n): return x * arange(1, n + 1)
           |         def gcd(x, y): return max(set(d(x)) & set(d(y)))
           |         def lcm(x, y): return min(set(m(x, y)) & set(m(y, x)))
           |         def test(x, y): return gcd(x, y) * lcm(x, y) == x * y
           |         all([test(x, y) for (x, y) in randint(1, 100, (1000, 2))])  # True
           | 
           | I am not a good golfer. Now I want to look at the codegolf
           | stackexchange for this...
        
       | jonocarroll wrote:
       | (author here, still getting over the first time I've seen one of
       | my own posts on this site)
       | 
       | The many recommendations for J here are a great nudge for me to
       | give it a proper go. I've taken quite a liking to the traditional
       | APL glyphs ( see a photo of the stickers on my laptop keys in
       | this post https://jcarroll.com.au/2023/12/10/advent-of-array-
       | elegance/ ) so I'm not looking for a way to avoid them.
       | 
       | Another detraction I've seen around is about the ambivalence of
       | APL glyphs (taking either 1 or 2 arguments and doing something
       | different in each case). I don't particularly mind it because I
       | think it becomes more natural to "understand" how a function is
       | being used the more familiar you become with it, but without the
       | limitation on the number of glyphs, I can see the benefit of
       | separating those.
        
         | dan-robertson wrote:
         | Can't the second argmax example be written with a right tack?
         | Is it nicer then?
         | 
         |       (⊢⍳⌈/)+/x
        
           | jonocarroll wrote:
           | Yep, that makes for a nicer tacit solution
           | 
           |       maxrow←(⊢⍳⌈/)+/
           | 
           | but I find
           | 
           |       ⊃(⍒+/)
           | 
           | to be an even cleaner tacit solution.
        
       | segmondy wrote:
       | The thing about APL/J/K is "Notation as a tool of thought". Sure,
       | most folks would frown and claim that Kanji or Arabic script
       | looks like noise and must be difficult. Yet some people read it
       | just as easy as we read roman scripts, the idea behind APL/J is
       | to learn it well enough to read it as easily as you can read
       | python or javascript, with the added benefit that it's compact, so
       | it would be faster to read and reason about than the equivalent
       | python code. Of course python users that don't know APL are
       | rolling their eyes, yet when you make such a statement about
       | python vs Java they get it.
        
         | kstrauser wrote:
         | I saw a recent paper saying that Standard Chinese and English
         | are approximately as fast to read by their respective native
         | readers. One Chinese character holds much more information than
         | one Latin letter, but an English reader can process many
         | letters simultaneously whereas the Chinese reader takes longer
         | to ingest each character.
         | 
         | In other words, if you were to write the same text in Chinese
         | and English, the Chinese version would take much less room, but
         | native readers of each would take about as long to read them.
         | 
         | In CompSci terms, Chinese gets more done per instruction, but
         | English has a much higher IPC.
         | 
         | I think we'd find the same between APL and Python. You could
         | express ideas much more compactly in APL, but someone skilled
         | in APL would take about as long to interpret a bit of code as a
         | Pythonista would take to understand the Python equivalent.
        
           | Qem wrote:
           | > but someone skilled in APL would take about as long to
           | interpret a bit of code as a Pythonista would take to
           | understand the Python equivalent.
           | 
           | If we take into account the break of flow due to scrolling up
           | and down through text, APL probably has an advantage after a few
           | hundred Python-equivalent lines. One screen worth of APL
           | holds many screens worth of Python.
        
         | thaumasiotes wrote:
         | > Sure, most folks would frown and claim that Kanji or Arabic
         | script looks like noise and must be difficult. Yet some people
         | read it just as easy as we read roman scripts
         | 
         | This is folding several claims into each other, and they're not
         | all true.
         | 
         | I would tend to associate "looks like noise" with writing where
         | the units are not separated from each other. Arabic has this
         | feature and this makes it appear more forbidding to the
         | untrained.
         | 
         | Devanagari has it too: भारत गणराज्य
         | 
         | Ge'ez script doesn't: የኢትዮጵያ ፌዴራላዊ ዴሞክራሲያዊ ሪፐብሊክ
         | 
         | English does, if you choose to write in cursive.
         | 
         | By this standard that I just made up, kanji are clearly pretty
         | far toward the "organized" end of the spectrum. An untrained
         | person looking at them is going to be able to tell you what the
         | main principles of the script are. And while I suspect that
         | people complaining that a script "looks like noise" are mostly
         | just saying that they can't read it, I also think that if
         | people were forced to rank scripts based on how confusing they
         | look, Arabic would be rated a lot more confusing than kanji.
         | 
         | (For a parallel to the above examples: 大日本帝國 ;
         | الخلافة العباسية.)
         | 
         | This belies reality; in fact, reading kanji is so much more
         | difficult that your claim that some people can do it as easily
         | as we read Roman scripts is not defensible. (Whereas it's fine
         | for Arabic.) The problem isn't in the appearance of the
         | elements, it is that there are too many of them. This means
         | that (a) learning to read kanji is a multi-year process; and
         | (b) even those who are considered "fully educated" nevertheless
         | need help in doing things like reading museum plaques or
         | technical documents that use words which modern people are not
         | expected to encounter in day-to-day life. If all you have is
         | "full" training in the writing system, these words cannot be
         | read _at all_ , so there is a system of phonetic annotation for
         | just this purpose. (For Japanese kanji, ruby; for Taiwan,
         | zhuyin; for mainland China, pinyin.)
         | 
         | Kanji in specific has an additional problem that Chinese
         | characters do not have, which is that the Japanese interpreted
         | it as being primarily a logographic script rather than a
         | phonetic script. To predict how a kanji is supposed to be
         | pronounced, you need to see a particular use in a particular
         | sentence and know the Japanese language. This is not true of
         | Chinese characters - but it's not so much of a problem for your
         | claim that "some people find this just as easy as we find
         | reading Roman characters"; native speakers of Japanese don't
         | have a problem with knowing Japanese. It does mean that it's
         | hard to predict how personal names are supposed to be
         | pronounced.
        
           | kazinator wrote:
           | Someone seeing an unfamiliar kanji-based term in a Japanese
           | technical document would _often_ have little trouble reading
           | it phonetically.
           | 
           | Furigana helps if one or more of the kanji used is rare
           | (outside of the joyo kanji), or when the word follows some
           | common variation (guessing which is possible, but wastes the
           | reader's time) or when the kanji spelling is arbitrarily
           | assigned ("ateji"; impossible to guess).
           | 
           | E.g. suppose the reader sees 帰納. It is very likely they
           | can read it correctly as _kino_. The problem is not knowing
           | what it means, that it refers to (mathematical) induction, or
           | inductive reasoning. The dictionary lookup is trivial,
           | though.
           | 
           | Basically, someone fluent in Japanese can read unfamiliar
           | words like that most of the time. They will know a ton of
           | words in which 帰 is _ki_ and 納 is _no_. In situations
           | like that, it's not much different from someone encountering
           | an unfamiliar word in English or Spanish text.
        
           | plagiarist wrote:
           | Someone literate in English would also need a dictionary just
           | as much to read academic, esoteric, or old words from a
           | museum. They might be able to guess the meaning based on word
           | roots and context, but I assume someone literate with kanji
           | could do the same based on radicals and context.
           | 
           | I think their claim is still defensible, and I think to make
           | a fair argument you'd have to reduce the scope of kanji down
           | to something the size of APL's vocabulary in any
           | counterclaim.
        
             | thaumasiotes wrote:
             | > Someone literate in English would also need a dictionary
             | just as much to read academic, esoteric, or old words from
             | a museum.
             | 
             | This is plainly untrue; if you go to a museum and they're
             | displaying an astrolabe, you can read the word "astrolabe"
             | despite the fact that you've never learned it. In Japanese
             | and Chinese, this is not true,+ and the text describing the
             | exhibit must separately inform you of how the word is
             | pronounced.
             | 
             | Similarly, if you look over your Chinese medical record, it
             | will be full of unfamiliar characters that you need to look
             | up by shape if you want to read them. An English speaker
             | may not know what a biopsy _is_, but they won't have
             | trouble reading the word.
             | 
             | + It might be true; I have not looked into "astrolabe"
             | specifically. But for various premodern devices, it is not
             | true.
        
       | pocketsand wrote:
       | I've always wondered if the "hard productive way" as understood
       | by conventional wisdom (using VIM, using a language like APL) is
       | actually more productive.
       | 
       | For this topic, I have in mind two exercises:
       | 
       | First, get people to nominate "experts" in a language, and assign
       | them a straightforward task, and see who is actually faster using
       | their native environments with code completion, language servers,
       | etc. by systematically running these trials.
       | 
       | Second, get people of similar experience who are untutored in
       | each language, randomly assign them to a language, give them a
       | moderate amount of time to learn it, then do a similar exercise
       | as above.
       | 
       | The first has problems with ensuring there is skill balance
       | between the groups, but would, I think, tell you how
       | productive one can be in each language.
       | 
       | The second would have good causal inference properties, but
       | probably not tell you about the downstream effects of learning
       | something hard with great rewards (e.g., vim or emacs).
       | 
       | I'd bet there are marginal gains to APL (or vim) but probably
       | nothing that matters in long-term productivity.
       | 
       | However, I do think people underrate how fun it can be to "get
       | good" at things like vim or APL, and I think that has really
       | positive knock-on effects.
        
       ___________________________________________________________________
       (page generated 2024-03-22 23:02 UTC)