[HN Gopher] Learn R Through Examples (2020)
___________________________________________________________________
Learn R Through Examples (2020)
Author : diplodocusaur
Score : 129 points
Date : 2021-06-05 11:43 UTC (11 hours ago)
(HTM) web link (gexijin.github.io)
(TXT) w3m dump (gexijin.github.io)
| nojito wrote:
| R's such a fantastic language and is leagues ahead of all others
| for data work.
|
| I do however recommend picking up data.table along the way
| because that is easily one of the best reasons to continue using
| R today.
| AndyPatterson wrote:
| I work with R almost every day, and I think many of the problems
| that are unique to R could be solved by having a couple of
| experienced SWEs on the core team to point R in the right
| direction. As it stands, I think R will be left behind until it
| fixes performance and scalability (e.g. intuitive byref semantics
| and a faster runtime) and gets a consistent scoping model and a
| consistent OOP system.
|
| Apart from that, I think there's a bigger challenge that still
| needs to be addressed: analysis/modelling projects tend to be
| worked on by individuals and/or thrown away after the initial
| value is pulled out of them.
|
| Going forward, I think we need to start identifying design
| methodologies that would make collaborating on this sort of work
| pain-free and more agile. Doing so should give us more value,
| sooner and for longer.
| civilized wrote:
| Here is my take on R as a guy who does stats as well as some
| software engineering in more mainstream languages like Python.
|
| R is a fantastic DSL for data manipulation and statistical
| analysis, with both traditional and modern tools, on datasets up
| to the gigabyte scale. It has great, easy-to-use data structures
| and unparalleled APIs in the tidyverse. It is not the thing for
| the latest deep learning implementation on petascale data, but
| most data science work doesn't need or benefit from that.
| Surprisingly, some machine learning methods have their nicest
| APIs in R, because that's where the users of those methods are.
|
| It has its warts, but so do JavaScript and SQL, and I think few
| people dispute that these are very powerful DSLs. Statistical
| analysis is just as legitimate a computing task as building
| webpages or querying databases. It is not the same as general-
| purpose programming, and it needs a good DSL.
| ryando wrote:
| Agree with this. Nvim-R plus dplyr and the plotting libraries
| are the best tools I've found for manipulating and
| understanding the characteristics of a dataset quickly.
| Eventually I moved to Python for the specific reason that the R
| packages for interacting with cloud platforms couldn't really
| keep up with the development of those platforms, and I got
| tired of having projects that mashed together both languages. I
| haven't checked, maybe things have stabilized enough now that I
| would be happy going back to R.
| civilized wrote:
| That makes sense. I don't much like crossing streams between
| languages on a project. It's great if you can keep R for
| analysis and Python for production, but if R can't keep up
| with the wrangling part, you're a bit up a creek. I'm lucky
| in that most of my analysis only needs to query fairly well-
| behaved SQL databases.
| tylurp wrote:
| I'm an experienced R programmer who doesn't do much statistics
| and has no background in computer science. What language
| characteristics does R lack that make it a DSL instead of a GPL?
|
| Edit: I found an interesting quote from a guy named Martin
| Fowler about the subject.
|
| "Languages can have a domain focus but still be general-purpose
| languages. A good example of this is R, a language and platform
| for statistics; it is very much targeted at statistics work,
| but has all the expressiveness of a general-purpose programming
| language. Thus, despite its domain focus, I would not call it a
| DSL."
| specproc wrote:
| R is such a horrible language to learn. I gave up entirely and
| now just use rpy2 for the few things it can do that Python can't.
| vasili111 wrote:
| Have you tried reading an R book? Lots of people learn from
| videos, tutorials, etc., and that is not a good way to learn R.
| qntty wrote:
| If you already know another dynamic language and want to
| understand R, I would recommend skipping all the intros to R
| based around data analysis and start with Advanced R by Hadley
| Wickham. It will explain all the weirdness you'll encounter right
| up front before it confuses you. Then you can read the data
| analysis tutorials and focus on the content rather than the R
| weirdness.
| wodenokoto wrote:
| I was quite surprised how easy that book was to read and
| understand. I'd expected an "advanced" book on any language to
| be much more difficult.
| beforeolives wrote:
| The title is kind of misleading - the book is more of a list
| of exceptions, edge cases, unexpected behaviour and other
| gotchas.
| carljv wrote:
| That is not an accurate description of the book.
| CornCobs wrote:
| I agree that Advanced R is a fantastic resource; however, I do
| not think reading it is a good way to learn R. Advanced R
| explains the whys, not the whats, and someone coming into R who
| hasn't encountered any of the stuff Wickham gives intuitions
| for will likely find it an inexplicable truckload of concepts.
|
| Instead I would propose using R in some capacity, encountering
| its weird quirks (Why does evaluating an expression sometimes
| print something to the console and sometimes not? How is dplyr
| using my column names as variables? Why do I get warnings when
| using & instead of &&?), and then turning to Advanced R as a
| source of sanity.
|
| Another source I would recommend is in fact the R language
| definition. It's very approachable, and you quickly realize that
| R is pretty simple at the core, buried under piles of cruft.
| claytonjy wrote:
| It's also available online under a creative commons license:
| https://adv-r.hadley.nz/
| temp8964 wrote:
| I recently tried to migrate my R code to Julia. Even though I
| already knew R data.table is faster than DataFrames.jl, I was
| totally blown away by how slow Julia is. So I quickly gave up. I
| think I will have to write the unavoidable hard loops in C++,
| which I really don't want to do...
| ku-man wrote:
| My experience as well.
|
| To get those much-vaunted C-like speeds the Julia fanboys claim,
| you need lots of contortions and hacks. Off the bat, Julia speeds
| are mediocre.
| xiaodai wrote:
| yeah. time to first plot (ttfp) is a real issue.
| cbkeller wrote:
| There are a few tricks to getting Julia to be actually fast,
| and while it's not hard _per se_ if you know them all (at least
| for numerical work), it's definitely not trivial.
|
| IMHO, you really have to embrace dispatch-oriented programming,
| and that includes being scrupulous about avoiding _type
| instability_. You also have to be a bit conscious about
| allocations, since it's easy to write Julia code (especially
| if you're trying to write in a "vectorized" style as is common
| in R, Python, Matlab) that generates absurd numbers of
| allocations, which must then be garbage-collected. But it's also
| easy to avoid those allocations if you know how.
|
| It took about two years, but after picking up more of this, I
| was eventually able to switch everything my group does from a
| two-language solution of matlab for scripts and plotting and C
| (with MPI) for HPC to all-Julia. This [1] was originally
| targeted at academics making the same switch, but much of it
| could be relevant to those with an R background as well.
|
| [1]
| https://github.com/brenhinkeller/JuliaAdviceForMatlabProgram...
| nojito wrote:
| If it's grouping-related, check out the collapse package:
|
| https://sebkrantz.github.io/collapse/
| glial wrote:
| For whatever it's worth, I was pleasantly surprised at how easy
| Rcpp is to use.
| [deleted]
| montmorencie wrote:
| I am currently a data scientist. My educational background is in
| CS, with a few hobby web projects, and I'm currently updating my
| skills in Java/Kotlin with the idea of going into mobile dev.
|
| I use only Python in my work. I learnt R and honestly, it's the
| same thing as using Python's scientific packages. It's mostly
| vectorized operations and spaghetti functional code (if people
| know how to write functions at all), written just to get it done:
| to make graphs, web dashboards (no, we are not doing web dev, it's
| dark magic frameworks), build machine learning models and
| eventually some reports. Stuff like that.
|
| I do some software engineering but that's optional, and I do it
| because I can. Most data scientists/ML engineers can't. So you
| guys are not being fair. R and Python in these environments are
| not even being used for building stuff. The language is not built
| for that. Hence it's not good from the perspective you look at it
| from (software engineers).
|
| Unlike Python, R is solely for statistics, data science and
| probably some basic ML (I haven't tried, though). Also Shiny for
| building web dashboards. But don't look at the code for
| dashboards; it's bad, written with a "get it done and forget"
| approach.
|
| That being said, good luck scraping, mining and cleaning data with
| something not called R/Python. Good luck with data engineering.
| Exploring and visualizing trends. Creating dashboards even.
| Machine learning. Monitoring and reporting in a scientific manner.
|
| Try this type of work with your favorite languages. Then see how
| quickly and easily it's done with R/Python. Come back and say
| it's a bad language.
|
| It's the same thing as an embedded dev complaining about how bad JS
| is for his job. You just totally ignore the context.
| AuthorizedCust wrote:
| R _plus the tidyverse_ is what makes it a great language. Some
| tidyverse concepts are being baked into base R, like the pipe,
| but base R by itself feels hollow.
|
| R's future is inseparable from the tidyverse. We need to just
| lump them together in any serious discussion of R.
|
| I teach a graduate level R course mainly for economics and
| statistics majors. (My educational background and career are
| computer science and technical; while that may stereotype me into
| Python, I just love R.) I spend the first three weeks on base R,
| to convey language-essential concepts, like vectorized objects,
| then the rest of the course is tidyverse-centric.
| CalChris wrote:
| > I spend the first three weeks on base R, to convey language-
| essential concepts ...
|
| Awesome. When I was at Berkeley, Linear Systems had Matlab
| assignments. The real engineers (ME, CE, NE, ...) had taken a
| Matlab class and knew the language. We computer scientists
| hadn't and suffered horribly as a consequence. Your three weeks
| of learning base R instead of sink or swim will pay dividends.
| notagoodidea wrote:
| I had the reverse experience. Trained in Matlab, R and Python,
| we had to take a database/application class with computer
| scientists and software engineers where the big project was to
| make a basic Android application with a SQLite database. It
| was painful to be dropped into Android Studio without any
| Java knowledge. And because the class was focused on
| databases, we had no Java introduction or whatever. We were
| able to team up with students from the other program who
| already had multiple Java projects and classes under their
| belt, but it was a bitter pill to swallow.
| kgwgk wrote:
| Just for the record, many users are happy with base R. There
| are dozens of us!
|
| R by itself is nice. But the tidyverse is creeping in and
| bringing dependency hell with it.
|
| http://www.tinyverse.org/
| ProjectArcturis wrote:
| Personally I much prefer data.table. The syntax is a bit harder
| to get a handle on, but you can do just about anything with it,
| and it's much faster at runtime than tidyverse.
| acomjean wrote:
| For those that don't know, the "tidyverse" is a set of R
| packages that make using base R much easier/better (in my
| opinion).
|
| https://www.tidyverse.org/ There is also a free ebook that's a
| good reference.
|
| ggplot2, the included plotting package, is pretty awesome.
|
| I took a biostatistics class and, after the basic examples in R,
| using the tidyverse to analyze data for projects was very
| helpful.
| baron_harkonnen wrote:
| Lots of negative comments in here about learning R from
| experienced programmers. I've found this is largely because
| experienced programmers have this unjustified bias that R is some
| toy language that should be easy to learn and has nothing to
| teach them. If you approached a language like Rust in the same
| way you would likely be just as frustrated with it.
|
| Certainly R has its quirks, but most of them come from its being
| one of the oldest programming languages still in continuous use.
| It derives from S, which was written 46 years ago. Because of this
| it has multiple object/class systems reflecting the changing
| standards for OOP. Its most dominant one, S3, predates Java and
| therefore uses the generic function paradigm of OOP similar to
| Common Lisp's CLOS. If you're experienced but have never worked
| with non-Java-style OOP you're going to be a bit confused.
|
| R's most important feature, which is well worth studying and
| mastering for any serious programmer, is that it is a completely
| vectorized programming language. It borrows this style from APL
| (though it is a million times more readable). Every value in R is
| a vector, and for the vast majority of operations the best
| approach to solving your problem is to think in vector
| operations. This makes simple things like string formatting with
| `paste` seem like a confusing nightmare, but there is a real logic
| there. Functions like `ifelse` can seem strange, and writing
| C-style code in R, while possible, will result in horrible
| performance.
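|
| A rough sketch of that element-wise logic, written in plain Python
| rather than R so it runs without any packages. The helper names
| `recycle`, `ifelse` and `paste` here are illustrative stand-ins
| that only imitate the behaviour of the R functions (one result per
| element, shorter inputs repeated to match), not their real
| implementations or full argument lists.

```python
from itertools import cycle, islice


def recycle(values, n):
    """Repeat a sequence until it has n elements (R-style recycling)."""
    return list(islice(cycle(values), n))


def ifelse(test, yes, no):
    """Element-wise choice: one result per element of `test`."""
    n = len(test)
    yes, no = recycle(yes, n), recycle(no, n)
    return [y if t else x for t, y, x in zip(test, yes, no)]


def paste(*vectors, sep=" "):
    """Element-wise string join across vectors, recycling shorter ones."""
    n = max(len(v) for v in vectors)
    columns = [recycle(v, n) for v in vectors]
    return [sep.join(str(item) for item in row) for row in zip(*columns)]


scores = [35, 80, 62]
# The length-1 vectors ["pass"] and ["fail"] are recycled across all scores.
labels = ifelse([s >= 50 for s in scores], ["pass"], ["fail"])
# ["student"] is recycled too, so paste yields one string per student.
report = paste(["student"], [1, 2, 3], labels, sep="-")
print(report)
```

| Seen this way, `paste` returning several strings is not a quirk:
| it is the same "one output per input element" rule as `ifelse`.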
|
| Once you do learn to think in vectors you realize that R isn't
| just popular in the stats world because most statisticians
| haven't seen a "real" programming language, but because you can
| very rapidly iterate on models. Translating mathematical notation
| into R, for the experienced R programmer, is easier than in any
| other language I've worked with by a long shot.
|
| My advice to any experienced programmer approaching R is to have
| some respect for the language. Most of the frustrations you'll
| have aren't because R is a bad language, but because you have
| less experience than you think and learning R well can expand
| your programming views in a similar way to Haskell.
| jmcdl wrote:
| What is the advantage of "thinking in vectors" in R versus
| "thinking in vectors" using numpy in Python (for example)?
| carljv wrote:
| There's some overlap, but vectors are essential to the
| language. Every type of data in R is a vector. There are no
| scalars, just vectors of length 1. Instead of dictionaries,
| it's idiomatic in R to use "lists", which are vectors of
| vectors. Data frames are lists (vectors of vectors)
| constrained to have equal length element vectors (ie
| columns). Classes are defined as lists with some metadata
| (stored in a vector) to direct method dispatch.
|
| It's not just vectorizing mathematical operations a la numpy.
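|
| As a loose illustration of that layering (not how R stores things
| internally), here is a plain-Python sketch in which a "data frame"
| is a set of named columns forced to share one length, tagged with
| a class vector. `make_data_frame` and `nrow` are hypothetical
| helper names chosen for this example.

```python
def make_data_frame(**columns):
    """Build a toy 'data frame': named columns that must share one length."""
    lengths = {len(col) for col in columns.values()}
    if len(lengths) > 1:
        raise ValueError(f"columns differ in length: {sorted(lengths)}")
    # The 'class' entry plays the role of the metadata used for dispatch.
    return {"columns": dict(columns), "class": ["data.frame"]}


def nrow(df):
    """Number of rows = length of any column (0 if there are no columns)."""
    cols = list(df["columns"].values())
    return len(cols[0]) if cols else 0


df = make_data_frame(height=[150, 172], sex=["f", "m"])
scalar = [42]  # a "scalar" is just a vector of length 1
print(nrow(df), df["class"], len(scalar))
```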
| xiaodai wrote:
| NSE (non-standard evaluation) is what will do an experienced
| programmer's head in. It's an interesting feature.
| foxes wrote:
| I find it highly unlikely that learning R will expand your
| programming views anywhere near Haskell.
|
| Haskell is an advanced functional programming language. Most R
| stuff seems to be incoherent, hacky, and hard to verify for
| correctness. It does not seem built on a solid foundation like
| Haskell. Truly, everything being a vector is not a huge takeaway.
|
| As for "here's just a bunch of examples", well that seems sort
| of a brute force way to learn something. I agree examples are
| important, but they are usually to back something up. Having to
| reverse engineer some fundamental ideas out of just examples is
| more work. Seems like this is just promoting more hackiness.
| Seems like it's training a neural network instead of actually
| understanding something.
| datastoat wrote:
| R certainly expanded my programming views! Haskell did too,
| but the lessons of Haskell didn't stick the way that R's
| lessons did. Here are some of the things I learnt from R
| (though they can be found in other languages of course).
|
| * Multiple dispatch. Before learning R, I knew about
| polymorphism in Java and C++, and multiple dispatch in R
| broadened my mind and turned out to be very handy.
|
| * The idea of "frames". In R, when you invoke
| `lm(height~sex*age, data=mydataframe)`, the first argument
| (the formula) doesn't get evaluated until the lm command asks
| it to be evaluated, and lm can set up the "frame" for that
| evaluation, i.e. the place where variables are looked up,
| however it likes. In fact, lm sets it up to include variables
| from both the scope in which you invoked lm, and also from
| mydataframe. This is what makes R so wonderfully concise for
| modelling in data science, compared to e.g. Python + pandas.
| I knew about frames from interactive debuggers, but until R
| it never occurred to me that the programming language could
| manipulate them.
|
| * "Held" arguments. In R, when you invoke `plot(x, y1+y2)`,
| it doesn't just evaluate the arguments and then call the plot
| function -- it leaves the arguments unevaluated, and invokes
| plot. Plot then (1) decides when to evaluate them, (2) gets
| access to the language expression `y1+y2`, which means that
| it can print "y1+y2" on the plot label, (3) it can even
| define extra variables to include in the scope when y1+y2
| gets evaluated. (I knew about held arguments earlier, from
| Mathematica, but they only clicked when I read the R
| documentation.)
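|
| The "frames" and "held arguments" ideas above can be imitated in
| Python, with one big caveat: Python evaluates arguments eagerly,
| so in this sketch the caller has to pass the expression as text.
| `held_plot` is a hypothetical stand-in for `plot`, and the
| variable names are made up; the point is only that the callee
| keeps the expression for labelling and chooses where names are
| looked up (data columns first, then the caller's variables).

```python
from collections import ChainMap


def held_plot(expression, data, caller_scope):
    """Toy stand-in for plot(): returns (label, values) for a held expression."""
    n = len(next(iter(data.values())))
    code = compile(expression, "<held>", "eval")
    values = []
    for i in range(n):
        # Build the lookup "frame" for row i: columns shadow caller variables.
        row = {name: column[i] for name, column in data.items()}
        frame = ChainMap(row, caller_scope)
        values.append(eval(code, {"__builtins__": {}}, frame))
    # The unevaluated text is still available, e.g. for an axis label.
    return expression, values


offset = 10
label, values = held_plot(
    "y1 + y2 + offset",
    {"y1": [1, 2], "y2": [3, 4]},
    {"offset": offset},
)
print(label, values)
```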
|
| I've read that R is a descendant of Scheme, and that that's
| where it gets all its "manipulate language expressions" from.
| I don't know any Scheme, nor Lisp, and I should definitely
| learn them -- but in the meantime, my experience has been
| that R's ability to manipulate language expressions is what
| makes it such a wonderful sweet spot as a data modelling
| language. I mostly use Python + pandas nowadays, but it feels
| such a slog in comparison.
| bookofsand wrote:
| Syntactic forms ('frames', 'held arguments') are reasonably
| useful, but have two flaws:
|
| A. Understanding how to _implement_ functions using
| syntactic forms is a steep learning curve. I remember
| running out of dplyr and having to implement a udf. Fairly
| unpleasant experience (enquo, !!, perhaps other unusual
| constructs). Felt like programming C macros.
|
| B. "A function can decide where variables are looked up
| however it likes" is a significant obstacle in
| understanding how even basic constructs like function calls
| actually work. There is a non-trivial amount of hard-to-
| debug dark magic lurking behind every corner.
|
| A middle ground has never been achieved. For example,
| `plot(expr(x), expr(y1 + y2))`, where the system limits the
| dark magic to explicit uses of the `expr()` construct, and
| `expr(x)` always means `{vars: vars(x), expr: (vars(x)) =>
| x}`. Instead of patching interpreter environments, simply
| call a lambda function.
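|
| One way to read that proposal, sketched in Python: an explicitly
| quoted expression carries its text, the names it needs, and an
| ordinary function over those names. The `Expr` class, the `plot`
| function and the data inside it are all hypothetical and only
| mirror the `{vars, expr}` shape described above.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Expr:
    """An explicitly quoted expression: text, required variables, and a
    plain function that computes it from those variables."""
    text: str
    variables: Tuple[str, ...]
    compute: Callable[..., object]

    def evaluate(self, scope):
        # The callee only supplies values; no interpreter environment is patched.
        return self.compute(*(scope[name] for name in self.variables))


def plot(x, y):
    """Toy plot(): returns the axis labels and the evaluated point."""
    scope = {"x": 1, "y1": 2, "y2": 3}  # hypothetical data supplied by the callee
    return (x.text, y.text), (x.evaluate(scope), y.evaluate(scope))


labels, point = plot(
    Expr("x", ("x",), lambda x: x),
    Expr("y1 + y2", ("y1", "y2"), lambda y1, y2: y1 + y2),
)
print(labels, point)
```

| The cost is visible at the call site: every quoted argument has
| to be spelled out, which is exactly the verbosity R hides.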
| datastoat wrote:
| I completely agree about the steep learning curve and the
| feeling of dark magic -- how many times have I had to
| relearn what deparse(substitute(x)) means -- but oh the
| satisfaction of broadening my programming horizons. For
| me it didn't feel like C macros, it felt like "This must
| be what it feels like to have the power of Lisp"!
|
| That's the weird thing about R. All this dark magic is
| hiding under the hood, but the core R team hid it so
| deftly that to the casual statistician it's a
| straightforward data modelling language that "just
| works". I'm not sure that it's possible to get rid of the
| dark magic and retain that data-modeller friendliness.
| CornCobs wrote:
| This sounds like a highly biased perspective of both R, and
| what it means for a language to be respectable.
|
| If you define a language that "will expand your programming
| views" as one that can verify correctness easily then yeah, R
| is terrible at that. But so are many languages that are as
| flexible as R. Would you have the same opinion of FORTH? Or
| LISP? or TCL? I think these languages definitely count as
| "hacky" languages and yet they don't seem to draw the same
| derision as R (in my experience)
| jghn wrote:
| S4 is the one that is reminiscent of CLOS. Dylan was explicitly
| cited as an inspiration [0]
|
| [0] There's an old article from Robert Gentleman named
| something like "S4 objects in 5 pages, more or less" but I
| can't find it. However, there's a mention of Dylan and CLOS
| here:
| https://genomebiology.biomedcentral.com/articles/10.1186/gb-...
|
| EDIT: Here's the document I was looking to cite:
| https://www.stat.auckland.ac.nz/S-Workshop/Gentleman/S4Objec...
| kgwgk wrote:
| Arguably the S3 object system is also "functional" in spirit,
| even if it's single-dispatch.
|
| https://arxiv.org/pdf/1409.3531.pdf
|
| Object-Oriented Programming, Functional Programming and R
| (John M. Chambers)
|
| "Chambers and Hastie (1992), in the discussion of classes and
| methods, noted that S differed from other OOP languages
| because of its functional programming style. In fact, this
| version of functional OOP finessed the resulting distinction
| from encapsulated OOP in two ways. First, the methods were
| dispatched according to a single argument, the first formal
| argument of the generic function in principle. As a result,
| the methods were unambiguously associated with a single
| class, as they would be in encapsulated OOP. Methods were
| actually dispatched on either argument to the usual binary
| operators, but a number of encapsulated OOP languages do the
| same, under the euphemism of operator overloading.
|
| "Second, the question of whether methods belonged to a class
| or a function was avoided by not having them belong to
| either. Methods were assigned as ordinary functions and
| identified by the pattern of their name: "function.class". In
| any case, there were no class objects and generic functions
| were ordinary functions that invoked UseMethod() to select
| and call the appropriate method. Neither the function nor the
| class was able to own the methods."
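|
| A minimal sketch of the naming scheme the quote describes, in
| Python for the sake of a runnable example. `use_method` only
| imitates the idea behind UseMethod(): walk the object's class
| vector and pick the first function named "function.class". The
| `methods` dictionary is an assumption standing in for R's
| ordinary function lookup, and the objects are made up.

```python
def use_method(generic, obj, methods):
    """Pick the first method named '<generic>.<class>' along obj's class
    vector, falling back to '<generic>.default'."""
    for cls in list(obj.get("class", [])) + ["default"]:
        method = methods.get(f"{generic}.{cls}")
        if method is not None:
            return method(obj)
    raise TypeError(f"no applicable method for '{generic}'")


# Methods are ordinary functions identified only by the pattern of their name.
methods = {
    "summary.default": lambda obj: "generic object",
    "summary.lm": lambda obj: f"linear model with {len(obj['coef'])} coefficients",
}


def summary(obj):
    """The generic is itself an ordinary function that delegates dispatch."""
    return use_method("summary", obj, methods)


fit = {"coef": [1.5, 0.3], "class": ["glm", "lm"]}
print(summary(fit))       # no summary.glm here, so summary.lm is used
print(summary({"x": 1}))  # no class vector, so summary.default is used
```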
| mraza007 wrote:
| Okay, I'm not sure why R is getting a lot of hate. After all, it's
| a programming language that gets the job done, and it's very
| popular in the financial industry, especially among risk modelers
| and quants. I have even seen it being used in the analytics space
| within the financial industry.
| dwrodri wrote:
| I have used R a few times now, and I definitely agree with the
| statement that thinking in vectors is central to writing good R
| scripts. However, as a computer engineer and performance
| junkie, I find it unfortunate that it doesn't get as much
| attention as other "STEM DSLs" (Julia, MATLAB) when it comes to
| performance.
|
| The same could technically be argued for Python; the current
| approaches to dealing with high-performance compute workloads
| either rely on JITing (e.g. Numba, TensorFlow/JAX's XLA) or
| bridging over to giant binary blobs through CPython's
| well-supported C interop.
| ttz wrote:
| To add: in my experience, programmers who denigrate R think of
| it as a software engineering language. Not all programming
| languages are meant to be languages for building large scale
| applications. Programming is not just about building business
| applications, it's about getting a computer to do things.
|
| And R excels at doing statistics and data science. If you keep
| that in mind, I believe many will find that it's an excellent
| programming language.
| kgwgk wrote:
| > S3, predates Java and therefore uses the Generic Function
| paradigm of OOP similar to Common Lisp's CLOS
|
| S3 appeared a few years before Java but there were other OOP
| languages like C++ around at the time.
| auto wrote:
| Honestly, I'd argue that the issue I had in my experience in R
| wasn't that I myself wasn't giving it the respect it deserved,
| but rather that the course constructors for my degree didn't
| give it that respect.
|
| We were essentially told "Install R Studio, then just copy and
| paste these library imports and you're good to go". Your
| description of a vectorized language makes total sense, and
| that single paragraph is more of an intro to R than we ever got
| in class.
|
| That said, I think the reason this happened with this class in
| particular is the viewpoint of (what I perceive as) the
| majority of R's users: mathematics-focused researchers who never
| learned the language. They have just done enough to get by and
| don't _really_ appreciate the underpinnings, or the nuances of
| running an environment on a machine that isn't theirs.
|
| I can't entirely absolve myself of blame though, once I
| realized what was happening I should have gone and done some
| more foundational R learning, but at that point I just wanted
| to be done with the class.
| hdkrgr wrote:
| I wouldn't say this is true for "the majority of R's users" at
| all.
|
| But for "the majority of professors who tangentially use R
| code in classes on
| statistics/bioinformatics/economics/finance (anything not
| explicitly about R and/or Data Science best practices)"?
| Absolutely.
|
| The R code you see in industry (or academic labs where
| someone cares about modern R) looks vastly different from
| those script examples in college that are most people's first
| impression of the language.
| Closi wrote:
| I currently do lots of data analysis in Excel and know basic
| Python. I would be interested to get opinions on whether R is
| better suited to data analysis than Python, if that's all I was
| doing.
| trailrunner46 wrote:
| Python is certainly more popular, and for job prospects I always
| tell that to newer data folks. That being said, if you want to
| load in some data, do some SQL-like manipulation, run some stats
| and make a graph or output a report, I would argue R is a way
| better experience than Python, but that's much more about the
| package ecosystem and less a comment on the language. dplyr is
| just more friendly to use than pandas (which often has 3-5 ways
| to do something, and as a beginner this can be disorienting), and
| the same goes for ggplot2 vs matplotlib. For interactive graphs
| you are probably going to use plotly anyway from both languages.
|
| One other thing I would mention is that knowing SQL well is the
| most translatable skill. A lot of dplyr and pandas is doing
| SQL-like operations (in fact dbplyr will generate SQL equivalent
| commands for your dplyr code for various backends).
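|
| To make the "SQL-like operations" point concrete, here is a small
| sketch using Python's built-in sqlite3 module. The table and
| column names are invented for the example. The query is the kind
| of group-and-summarise step that a dplyr or pandas pipeline
| typically expresses; it is not output generated by dbplyr.

```python
import sqlite3

# In-memory database with a tiny, made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (sex TEXT, height REAL)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("f", 160.0), ("f", 170.0), ("m", 180.0)],
)

# Group rows by a key and compute per-group summaries.
rows = conn.execute(
    "SELECT sex, COUNT(*) AS n, AVG(height) AS mean_height "
    "FROM people GROUP BY sex ORDER BY sex"
).fetchall()
conn.close()

for sex, n, mean_height in rows:
    print(sex, n, mean_height)
```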
|
| In summary: know how to manipulate data in SQL, then pick a
| language (because you will need to do some IO/reporting stuff
| outside just data work) where the ecosystem of packages feels
| user-friendly to you and your workflow, and roll with that.
| scottmcdot wrote:
| I'm strong in R, Python and Excel. I'd say anyone transitioning
| from Excel would be better off using R first, because the R
| Integrated Development Environment (IDE), RStudio, is fantastic
| compared to any Python IDEs that you can actually figure out
| how to install. The IDE makes it easy to visualise what you're
| actually doing to your dataframes by using multiple table tabs.
| auto wrote:
| I consider myself a pretty experienced developer/software
| engineer/whatever. Decade into my career, started in iOS in
| Obj-C, learned Swift along the way, eventually migrated to the
| backend with Java/SQL, and finally found myself where I really
| wanted to be, embedded doing C/C++ work for a household name.
|
| That said, about halfway through my master's about two years ago,
| I found myself in an intro to data mining course that was sold as
| a "we will teach you R" course. I had heard non-programmer math
| friends talk about what they had accomplished in R, and was
| excited to dive in.
|
| Now, it didn't help that the class ended up being _heavy_ on the
| statistics side (despite a math/CS double major, stats was
| never my thing), but the actual learning-R part was 99% left as
| an exercise for us alongside the required classwork.
|
| I can say without a doubt, learning R is the worst programming
| experience I've ever had. Our assignments would give some high
| level direction on which libraries to use, but getting the right
| libraries set up and in the environment was just an absolute
| nightmare. All I remember from that class is hours every week
| googling unreadable python pukes from R studio (because
| apparently everything data mining/ML related in R is actually
| just python), and then spending an hour or less actually doing
| the statistics work.
|
| I feel bad because I feel like I was set up to not be able to give
| it a fair chance, but if that's what non-programmer math types
| are subjected to when told "you need to do some programming for
| your job", I can understand the apprehension.
| CapmCrackaWaka wrote:
| > All I remember from that class is hours every week googling
| unreadable python pukes from R studio (because apparently
| everything data mining/ML related in R is actually just python)
|
| I'm curious what libraries you were using. I've had to go into
| the source of quite a few popular libraries and I don't think
| I've ever encountered Python. Lots of C and its derivatives,
| lots of Stan and FORTRAN, but I don't think I've seen Python
| yet.
| auto wrote:
| Just going back and looking through some of the homework,
| here's what I'm seeing imported:
|
| readr, caret, lattice, ggplot2, RColorBrewer, mlbench,
| ElemStatLearn, klaR, dplyr, arules, arulesViz, tensorflow
| malshe wrote:
| In this list only tensorflow requires Python
| BoiledCabbage wrote:
| And I haven't used it, but there is Torch for R as the
| alternative which isn't supposed to have any dependency
| on Python.
|
| https://torch.mlverse.org/
| disgruntledphd2 wrote:
| Yeah, TensorFlow is Python, and is a nightmare to work
| with.
|
| That's not R's fault, though.
| temp8964 wrote:
| What you described is a common phenomenon in stats / data
| mining / psychometrics / econometrics. They all need students
| to use certain programming languages, such as SAS / Stata / R /
| Python, but they don't really spend time teaching those
| programming languages. I guess this could also be the same for
| MATLAB in math?
| pacbard wrote:
| The problem is that there are no incentives in learning how
| to program for people in those fields. Most people just get
| their code to "work" (i.e., output the analyses that they
| want) without really wanting to know how it works. Most of
| the time code is passed from grad student to grad student and
| modified to make it work for the specific analysis. As a
| result, you get Frankenstein code that somewhat works but
| that is good enough for writing a results section of a paper.
|
| There are people who know how to code, but they are few and
| far between. Usually they are pushed out of academic positions
| because there are very few ways to fund work to develop
| scientific code.
| warlog wrote:
| Dataframes...python has R to thank...too bad they suck compared
| to R.
| malshe wrote:
| > because apparently everything data mining/ML related in R is
| actually just python
|
| I think it is apparent only to you. The most popular package
| for ML in R is `caret` and it has nothing to do with Python.
| Similarly, `mlr` also has no Python. In fact, except for
| tensorflow and keras, I can't think of any major ML package
| that needs Python. Even torch package which brings pytorch to R
| doesn't need Python
| (https://cran.r-project.org/web/packages/torch/index.html).
|
| What confuses me even more is that you were learning statistics
| but using R packages that use Python as backend. In my
| experience, almost all the new statistics and econometrics
| methods are first released as R packages by the researchers.
| Can you name any data mining R packages that you used that
| required Python? I am really curious to know.
| gonzo41 wrote:
| I pretty much had the same experience with R. I've never been
| able to get really productive with it as a software developer.
| I feel like I know too much and can't break from old habits.
| It's very academic, which I feel really holds it back from
| software devs and also captures non-developers in its web.
|
| Whilst it's certainly got some runs on the board, I think
| having data science folk work in more standard languages would
| actually make supporting them and their needs easier.
| notagoodidea wrote:
| Funny, I'd never say that R itself is very academic. The
| environment maybe, the users sure, but the language is very
| much a mutated Lisp with vectorized operations. Same for the
| Python stuff: I never hit that problem while working a lot with
| it, but most of the time when I wanted to check a lib, I landed
| in C++, i.e. I am not sure I understand how to read the code.
|
| What in R made it feel academic for you?
| beforeolives wrote:
| > because apparently everything data mining/ML related in R is
| actually just python
|
| Everything? I'm just wondering what you've been doing exactly.
| I know that Tensorflow in R is just a wrapper on top of Python,
| not sure what else. If you're doing any deep learning, then
| going straight to Python is certainly much better than using R.
| For most other things it's not quite as clear of a decision.
| _Wintermute wrote:
| Hilariously there's an argparse library for R, which has
| python as a dependency.
___________________________________________________________________
(page generated 2021-06-05 23:01 UTC)