[HN Gopher] The counterintuitive rise of Python in scientific co...
       ___________________________________________________________________
        
       The counterintuitive rise of Python in scientific computing (2020)
        
       Author : leonry
       Score  : 142 points
       Date   : 2022-03-26 14:27 UTC (8 hours ago)
        
 (HTM) web link (cerfacs.fr)
 (TXT) w3m dump (cerfacs.fr)
        
       | jrm4 wrote:
       | Yet another hardcore programmy type discovers that usability is
       | infinitely more important than how many clock-cycles you save.
       | 
       | Programming languages aren't for computers, they're for people.
        
         | taeric wrote:
         | This feels misguided, too. It begs the question that python has
         | good usability. Anyone that has tried managing dependencies in
         | it will know that is mostly a lie.
         | 
         | What python had, was that it was preinstalled on many computers
         | and then had a large cohort of users that are insisting that
         | others use it. And mostly force proclaiming that it is easy and
         | readable.
         | 
         | I'll not claim that it is hard, per se. More that it is not
         | intrinsically easier than any other dynamic language.
         | 
         | For evidence, the main packages that are popular are often
         | clones of packages from other environments that were not widely
         | installed. Jupyter can be seen as a free version of many
         | scientific applications. Matlab, Mathematica, etc. Matplotlib
         | is rather direct in its copy. Pretty sure there are more
         | examples.
        
           | iainctduncan wrote:
           | Is managing deps in Python a pain? Sure is! Is it a pain in
           | the other contenders of easily available dynamic languages?
           | Yup. So that's a wash. Managing deps in dynamic languages is
           | not a simple problem, I can't say I've tried one that did it
           | super well yet.
        
             | taeric wrote:
             | I'm... Not sure insulting other languages is a path to
             | victory.
             | 
             | Npm, as much as it annoys me, is light-years ahead of
             | anything in python. Quicklisp is rather pleasant, now. Ruby
             | has had gems for a long time.
             | 
             | I grant that it is a hard problem. I am not griping that it
             | is not solved. More that the Python community has largely
             | failed to even pick a direction. The most used dependency
             | methods are, in the modern spirit of python, deprecated
             | already.
        
               | Tijdreiziger wrote:
               | > Npm, as much as it annoys me, is light-years ahead of
               | anything in python.
               | 
               | Hence PEP 582:
               | 
               | > This PEP proposes to add to Python a mechanism to
               | automatically recognize a __pypackages__ directory and
               | prefer importing packages installed in this location over
               | user or global site-packages. This will avoid the steps
               | to create, activate or deactivate "virtual environments".
               | Python will use the __pypackages__ from the base
               | directory of the script when present.
               | 
               | https://peps.python.org/pep-0582/
               | 
               | I haven't tried it yet, but there's already a PEP
               | 582-compatible dependency manager: PDM
               | https://pdm.fming.dev/
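A rough sketch of the lookup the quoted PEP text describes, emulated by hand in user code. The helper name `prefer_pypackages` is hypothetical, and the `__pypackages__/X.Y/lib` layout is an assumption taken from the PEP as drafted; the proposal itself would have done this inside the interpreter.

```python
import sys
from pathlib import Path


def prefer_pypackages(script_dir):
    """Put <script_dir>/__pypackages__/X.Y/lib at the front of sys.path.

    Hypothetical helper: it only imitates the import preference described
    in PEP 582 and does not install anything.
    """
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    lib_dir = Path(script_dir) / "__pypackages__" / version / "lib"
    if not lib_dir.is_dir():
        return None  # no local package directory: leave sys.path untouched
    entry = str(lib_dir)
    if entry not in sys.path:
        sys.path.insert(0, entry)  # searched before user or global site-packages
    return entry
```

Calling `prefer_pypackages(Path(__file__).parent)` at the top of a script would approximate the "no virtual environment to activate" workflow, though without any of the installer support a tool such as PDM provides.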
        
               | taeric wrote:
               | Right. I am confident they will get things in place to
               | solve this. That it took so long for them to want to
               | strengthens my assertion that the community piggybacked
               | on the machine's package management for a lot of its
               | initial popularity.
        
               | iainctduncan wrote:
               | I'm not insulting other languages - I'm pointing out they
               | have the same situation! Where in that post is there an
               | _insult_???
               | 
               | I have had many dep fights with Ruby and Node too, and I
               | would disagree that their solutions are in any way better
               | than pip and virtualenv.
        
               | taeric wrote:
               | It isn't that similar. Most languages actually have
               | solutions that the community are pushing together. Python
               | alone bungled a major version change and then refused to
               | endorse a package management system for so long.
        
         | da39a3ee wrote:
         | matplotlib doesn't score highly on usability.
         | 
         | jupyter notebooks encourage disorganized, unprincipled
         | programming; chaotic re-running of cells in the face of global
         | mutable state; and prevent budding programmers from learning to
         | use version control because the JSON format was designed
         | without version control in mind.
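To make the version-control complaint concrete: an .ipynb file is a JSON document in which code, outputs, and execution counters sit side by side, so re-running a cell changes the file even when the code did not change. Below is a minimal sketch, standard library only, of stripping that volatile state before committing. `strip_outputs` is a hypothetical helper, and the tiny notebook is a hand-written nbformat 4 example rather than output from a real session.

```python
import json


def strip_outputs(notebook):
    """Return a copy of an nbformat-4 notebook dict without outputs or counters."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        }
    ],
}

clean = strip_outputs(notebook)
print(json.dumps(clean, indent=1, sort_keys=True))
```

Existing tools such as nbstripout cover the same ground more thoroughly; the point here is only that the diff noise comes from the file format, not from the code.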
        
           | wiz21c wrote:
           | For Jupyter, it depends on the workflow. Especially with data
           | sciences. In data science, you spend a lot of time playing
           | with the data, testing things, drawing charts, computing,
           | etc. When you do that, the cost of starting a python
           | interpreter, loading the imports, loading the (usually big)
           | data becomes a real pain 'cos you iterate like hell. Working
           | in a REPL becomes _really_ important.
           | 
           | But even more, working with Jupyter allows you to work out a
           | very detailed explanation of your thought process, describing
           | your ideas, your experiments. Being able to mix code and
           | explanations is really important (and reminiscent of literate
           | programming). You got the same kind of flow with R.
           | 
           | As a data scientist, I'm concerned about data, statistics,
           | maths, understanding the problem (instead of the solution). I
           | don't care about code. Once I get my data understanding right
           | _then_ comes the time of turning all of that into software
           | that can be used. Before that, Jupyter really gives a
           | productivity boost.
           | 
           | For the code part, yep, you need other principles where
           | Jupyter may not be suitable.
        
             | patrick451 wrote:
             | It's interesting, I never feel like I get these exploratory
             | benefits from jupyter notebooks. I just end up feeling like
             | one hand and half my brain is tied behind my back. I'm most
             | productive iterating in a similar way to what you describe,
             | but in an ipython terminal, running a script and libraries
             | that I'm iterating on in a real editor. If there are
             | expensive computations that I want to checkpoint, I just
             | save and load them as pickle files.
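A minimal sketch of the pickle checkpointing described above. `checkpoint` is a hypothetical helper, not something from IPython or any library: it loads a cached result if the file exists and otherwise computes and saves it.

```python
import pickle
from pathlib import Path


def checkpoint(path, compute):
    """Load a pickled result from `path`, or call `compute()` and save it."""
    path = Path(path)
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)  # only unpickle files you created yourself
    result = compute()
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result


def expensive():
    # Stand-in for a long-running computation.
    return sum(i * i for i in range(1_000_000))


# First call computes and writes the file; later calls just load it:
# total = checkpoint("total.pkl", expensive)
```

Deleting the file is the way to force a recompute, which keeps the helper trivial at the cost of no automatic invalidation when the code changes.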
        
               | wiz21c wrote:
               | really interesting. I may have overlooked IPython a bit
               | (I just thought Jupyter was its improved version). For
               | the moment, maybe like you, I preprocess the data (which
               | takes minutes) into numpy arrays, which then take seconds
               | to load. But once I add imports, it takes about 5 or 6
               | seconds to load everything I need. So Jupyter
               | remains a good idea. Moreover, I love (and actually need)
               | to mix math and text, so markdown+latex maths is really a
               | great combo. I don't know if one can do that in IPython,
               | I'll sure look!
        
               | kzrdude wrote:
               | I have to say I think a jupyter notebook format is a 10x
               | improvement in productivity over ipython. It's just so
               | much easier to work with - and a step more reproducible
               | too, my scribbles are all there saved in the notebook, at
               | least!
        
           | patrick451 wrote:
           | > matplotlib doesn't score highly on usability.
           | 
           | The one place matplotlib sucks is any kind of interactivity.
           | But other than that, matplotlib has the best, most intuitive
           | interface of all the python plotting libraries I've tried.
           | It's also one of the few libraries that doesn't rely on
           | generating html for a webbrowser, which makes for a miserable
           | workflow.
           | 
           | I still think Matlab's plotting is untouched by open source
           | options.
        
             | taeric wrote:
             | The charts that R produces are typically better looking.
             | But...R. :(
        
             | cinntaile wrote:
             | Matplotlib hides a lot of complexity if you ask me. As soon
             | as you do something in a different way than intended you're
             | off searching stackoverflow for a post that did something
             | similar to what you want. Then you tweak it a little and
             | hope it works.
        
           | kzrdude wrote:
           | papermill is good and ploomber is a thing to watch.
           | 
           | Ploomber makes it systematic - store notebooks as .py
           | (py:percent files for example), parameterize them with
           | papermill and execute as a batch job. One can view the
           | resulting jupyter notebooks as .ipynb later and produce
           | reports as html if wanted. It's really good already, and
           | better if ploomber gets more development.
           | 
           | The whole reason it works is because it's easy to open the
           | .py notebook and work on it, interactively, in jupyter.
           | 
           | The main idea - jupytext for .py notebooks and papermill for
           | parameters & execution - that's already "stable" and easy for
           | anyone to use for their own purposes.
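For readers who have not seen the format, here is a sketch of what a notebook stored as a py:percent script can look like. The `# %%` markers are jupytext's percent format, and the cell tagged `parameters` is the one papermill overrides at execution time. The variable names and contents are made up for illustration; the file is also plain Python and runs as-is.

```python
# %% [markdown]
# # Daily report
# Text cells are ordinary comments in the py:percent format.

# %% tags=["parameters"]
# Defaults; papermill injects overriding values after the cell tagged "parameters".
n_samples = 100
scale = 2.0

# %%
values = [scale * i for i in range(n_samples)]
mean_value = sum(values) / len(values)

# %%
print(f"mean of {n_samples} samples: {mean_value}")
```

Because the file is a normal script, it diffs cleanly in version control, and the .ipynb with outputs becomes a build artifact rather than the source of truth.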
        
             | edublancas wrote:
             | (ploomber maintainer here)
             | 
             | Any feedback for us? What can we do to improve Ploomber?
        
           | analog31 wrote:
           | I've programmed in a number of languages over the past 40+
           | years, starting with BASIC, and every one of them encourages
           | sloppy coding. The good discipline always has to be taught,
           | learned, and willingly practiced. The closest I came to a
           | language designed for teaching good practices was Pascal.
           | 
           | I find it easier to read and understand bad code written in
           | Python, than good code written in the C family languages.
        
           | mistrial9 wrote:
           | .. and get more new users each year than four generations of
           | PCs combined
        
           | monkeybutton wrote:
           | Being able to hack out code to explore and experiment with
           | data while not having to reload and reprocess data (thanks to
           | that global mutable state!) saves a hell of a lot of time in
           | the long run.
        
         | rewq4321 wrote:
         | Not infinity, but yeah it's worth more than people generally
         | think. But in the end you don't really lose many clock cycles
         | anyway because everything actually runs in C/CUDA/etc. behind
         | the scenes
        
         | dang wrote:
         | "_Don't be snarky._"
         | 
         | "_Please don't post shallow dismissals, especially of other
         | people's work. A good critical comment teaches us something._"
         | 
         | https://news.ycombinator.com/newsguidelines.html
        
         | [deleted]
        
         | Yoric wrote:
         | I personally find numpy irritating, surprising, misleading and
         | definitely a wrong definition of "usability". YMMV.
        
           | Scene_Cast2 wrote:
           | What are your thoughts on Pandas?
           | 
           | As someone who uses numpy almost daily, I think that numpy is
           | "overextended" beyond its core niche, sure. So - making it
           | work with things outside that niche (e.g. streaming, non-
           | rectangular data, non-uniform data, nonhomogeneous data, etc)
           | is painful. However, 1) there's Pandas for that, and 2) I
           | disagree with "misleading" and "surprising". What makes you
           | think that?
        
             | m_mueller wrote:
             | Not GP, but I'm using pandas daily to build up a BI
             | platform within a financial institution. Compared to Matlab
             | and even Fortran it has some issues IMO:
             | 
             | * why distinguish between Series and DataFrame? just give
             | me an interface for m x n matrices or even higher
             | dimensions.
             | 
             | * pure vs. in-place operations. not such a big fan of
             | having multiple versions of the same function, e.g. a more
             | pythonic
             | 
             |     df["my_col"] = series
             | 
             | vs. a more functional
             | 
             |     df.assign(my_col=series)
             | 
             | I'd rather have everything like the latter to be able to
             | more easily have best practices in place.
             | 
             | That brings me to another point: if we keep everything
             | purely functional, then python's syntax is making things a
             | bit awkward. Where in something like JS you could just put
             | every function call with its dot on a new line without the
             | need to assign, in Python this requires putting line break
             | characters or wrapping it in round brackets. This is one
             | place where languages with explicit statement terminators
             | (semicolons) are a bit cleaner to work with.
             | 
             | All that being said scipy is still a great choice to have
             | both system programming and numerical business logic in one
             | language.
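A short sketch of the two styles contrasted above, including the parenthesized method chain that Python needs for one-call-per-line formatting. The column names and data are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# In-place style: item assignment mutates the frame it is applied to.
mutated = df.copy()
mutated["total"] = mutated["price"] * mutated["qty"]

# Functional style: assign returns a new frame and leaves df untouched.
# Wrapping the chain in parentheses allows one method call per line
# without backslash continuations.
result = (
    df
    .assign(total=lambda d: d["price"] * d["qty"])
    .query("total > 10")
    .reset_index(drop=True)
)
```

Note that `assign` takes the new column as a keyword argument; a callable value receives the intermediate frame, which is what makes it usable in the middle of a chain.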
        
             | J253 wrote:
             | In my opinion and experience, I think you're right about
             | "there's Pandas for that" and "that" can be almost
             | anything. It can do almost anything but making it do almost
             | anything requires constant reference to the docs. And I
             | find maintainability difficult. It seems like there's 50
             | kwargs for every method. Sometimes things happen in place
             | by default, other times they don't. Compound indexes still
             | confuse me. But I'm not a data scientist so I don't do much
             | ad-hoc analysis that seems typical with pandas users.
        
             | patrick451 wrote:
             | Not the OP but I agree with them:
             | 
             | Little things, like some functions want to be called with a
             | tuple of dimensions,
             | 
             |     np.zeros((rows, cols))
             | 
             | others just want to be called like
             | 
             |     np.random.randn(n, m)
             | 
             | The 1d array is a huge, fundamental design flaw in numpy.
             | It makes zero sense that I can do matrix-vector
             | multiplication against both an nx1 2d array as well as a 1d
             | array. The latter is complete nonsense.
             | 
             | When you slice a column from a matrix, and get not an nx1
             | vector, but a 1d array, it makes me want to shell out
             | $10,000 for matlab (yes, I know I can get a column vector
             | with the slice A[:, [2]], but I shouldn't have to).
             | 
             | This problem leaks out into the ecosystem. For example,
             | when you try to use scipy to integrate an ODE, and pass it
             | an initial condition vector that is nx1, the scipy
             | integrator will silently coerce your vector to a 1d array,
             | pass it to your RHS function, which then either blows up,
             | or more likely, produces silently wrong result because of
             | numpy's insane array broadcasting rules.
             | 
             | This problem further leaks into the ridiculous function
             | hstack. If you just used the function vstack, which made a
             | 2x3 matrix from 2 1d 3 element arrays, you might imagine
             | that hstack would produce a 3 x 2 matrix. But no. It
             | creates a 1d 6 element array. For what you wanted, you
             | actually need np.column_stack.
             | 
             | I think the way Eigen handles this is the most intuitive.
             | You do linear algebra with 2d objects, and cast to arrays
             | for elementwise operations.
             | 
             | There is also a huge inconsistency between what numpy
             | exposes as an object oriented interface vs a "functional"
             | interface. What I mean by this, is that I can call x.sum()
             | on an array, but not x.diff(). For that, I need np.diff(x).
             | There seems to be no pattern to what is exposed as a method
             | vs a function.
             | 
             | The array slicing api is also really inconsistent. For
             | instance, given a 3 element array x,
             | 
             |     a = x[5]
             | 
             | is an IndexError. However, this is perfectly fine:
             | 
             |     a = x[2:5]
             | 
             | I just can't forgive that this is not also an IndexError.
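The behaviours listed in the comment above can be checked directly. The sketch below only demonstrates the shapes and exceptions involved; it does not take a side on whether they are good design. The array contents are arbitrary.

```python
import numpy as np

A = np.arange(6).reshape(3, 2)

col_1d = A[:, 1]        # shape (3,): plain slicing drops the dimension
col_2d = A[:, [1]]      # shape (3, 1): list indexing keeps it

# Broadcasting a 1d array against an n x 1 array yields an n x n result
# without any warning.
diff = col_1d - col_2d  # shape (3, 3), not (3,)

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
stacked_v = np.vstack([a, b])        # shape (2, 3)
stacked_h = np.hstack([a, b])        # shape (6,)
stacked_c = np.column_stack([a, b])  # shape (3, 2)

# Some operations exist as methods, others only as module-level functions.
has_sum_method = hasattr(a, "sum")    # True
has_diff_method = hasattr(a, "diff")  # False: use np.diff(a)

# Out-of-range slices are clipped (the same rule as Python lists),
# while an out-of-range integer index raises.
tail = a[2:5]
try:
    a[5]
except IndexError as exc:
    error = exc
```

The slicing rule is inherited from Python sequences, where `[1, 2, 3][2:5]` also succeeds, so numpy is consistent with the language here even if the result is surprising coming from Matlab.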
        
             | Yoric wrote:
             | I actually don't remember the details, I haven't used numpy
             | in 4-5 years. I remember being bitten a few times by some
             | operators that had a different behavior based on how you
             | had arrived at what looked to be the same data. These were
             | issues I don't remember encountering with e.g. Mathematica,
             | MatLab or R, but then, I was manipulating different kinds
             | of data.
             | 
             | Next time I find myself manipulating numerical data, I'll
             | definitely take a look at Pandas!
        
             | civilized wrote:
             | Pandas is better than nothing, but I would look to R's
             | dplyr/tidyverse for a really well-designed tabular data
             | manipulation ecosystem. Compared to tidyverse, the pandas
             | API feels bloated, obscure, and inefficient. I often see
             | people using very slow apply-based solutions in pandas
             | because the faster solution is so non-obvious.
             | 
             | The tidyverse ironically ends up feeling more Pythonic,
             | with more of a "there is one obvious way to do it" vibe.
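An illustration of the "slow apply-based solution" pattern mentioned above, next to the vectorized form that gives the same result. The data and column names are invented, and no timing figures are claimed here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": np.arange(1_000, dtype=float), "y": 2.0})

# Row-wise apply: calls a Python function once per row.
slow = df.apply(lambda row: row["x"] * row["y"] + 1.0, axis=1)

# Vectorized: one expression over whole columns, evaluated in compiled code.
fast = df["x"] * df["y"] + 1.0
```

Both produce the same Series; the difference is that the first pays interpreter overhead per row, which tends to dominate as the frame grows.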
        
         | matsemann wrote:
         | I wonder how many days have been wasted on non-programmers
         | trying to get their Conda environment up and running or
         | similar. Half the data science stuff isn't reproducible, not
         | because of the science, but because getting the notebooks
         | running with its dependencies is almost impossible.
        
           | lysecret wrote:
           | have you heard of docker?
        
           | BeFlatXIII wrote:
           | I'm increasingly convinced that the majority of so-called
           | "data science" is pure scientism with little to no actual
           | science for exactly this reason. It's reading correlations
           | from digital tea leaves.
        
           | irrational wrote:
           | Which begs the question, why has nobody built anything
           | better?
        
             | taeric wrote:
             | I'd argue that they probably have. It just isn't free.
        
             | belval wrote:
             | Because it's a hard problem and people love hating on
             | Python because it doesn't come with a way to handle all the
             | compiled dependencies that work for every OS.
             | 
             | Data science has a ton of moving library parts, it is
             | genuinely difficult to distribute precompiled libraries for
             | everyone when you have 2-3 actively maintained CUDA versions
             | with 2 cuDNN versions for accelerators that change every 2
             | years. Most teams fail to standardize on an environment (say
             | Python 3.8, Ubuntu 20.04, CUDA 11.1, and cuDNN 8) and then
             | get hung up on a dependency not building as if it's
             | Python's fault that it does not have control of your entire
             | OS.
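One low-effort step toward standardizing on an environment is recording what a working machine actually has. The sketch below uses only the standard library; `describe_environment` is a hypothetical helper, and it deliberately does not try to detect CUDA or cuDNN, which live outside Python's view.

```python
import json
import platform
import sys
from importlib import metadata


def describe_environment(packages=()):
    """Collect interpreter, OS, and selected package versions as a dict."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.platform(),
        "executable": sys.executable,
        "packages": versions,
    }


print(json.dumps(describe_environment(["numpy", "pip"]), indent=2))
```

Committing that JSON next to the project gives a later reader something to compare against when a dependency refuses to build.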
        
               | matsemann wrote:
               | But why is it such a big problem in Python compared to
               | other stacks? Why do all python projects end up
               | depending on you having those exact tools of things
               | installed locally and the planets aligned a certain way,
               | when other stacks do not?
        
               | belval wrote:
               | Python is for all intents and purposes a "glue" language.
               | You don't do the heavy computing in Python, you just pull
               | in a C++ library that has a Python interface. This adds a
               | ton of friction because these dependencies will often not
               | be precompiled so you need to have the right system
               | libraries to build the module before using it.
               | 
               | It's not much of a problem for other stacks because they
               | either are fast enough that they have a library written
               | in the same language for problem X (C#/Java/Rust) or they
               | aren't targeting the same type of work (JS, Ruby,
               | etc...). C++ has the exact same problem as Python and I'd
               | argue that it's even worse.
        
               | m4x wrote:
               | I'm not sure that it _is_ a problem in Python more than
               | other languages.
               | 
               | It might look worse because many Python projects use
               | tools such as CUDA, which are notoriously dependent on
               | the specific OS, architecture, method of installation
               | etc. But that same issue will exist in most languages -
               | if you're linking against CUDA, you will sometimes have
               | problems with the package installation. Particularly if
               | you try to run the code on a different OS, CPU
               | architecture, using a different GPU, etc.
               | 
               | I don't think it really has anything to do with Python.
               | It just happens that most people doing work that depends
               | on tricky packages such as CUDA also happen to be using
               | Python.
        
               | analog31 wrote:
               | There are 365k projects in the "official" package index.
               | While not all of these are important, it's a tip-off to
               | the magnitude of the problem. The habit of blowing past a
               | problem by grabbing a random library and moving on to the
               | next problem leaves us with a mess of dependencies. And
               | many of those were either written by amateurs like us,
               | not maintained, etc.
               | 
               | Maybe other languages have fewer libraries, or maybe the
               | habit of grabbing libraries at random evolved
               | concurrently with the rise of Python.
               | 
               | My team has a rule that we don't let a project get past a
               | certain stage without proving that it can be installed
               | and run on a clean machine _and_ archiving all of the
               | necessary repos with the project. It's easily forgotten
               | that testing your installer is part of testing your
               | program.
        
               | NoThisIsMe wrote:
               | It's not a big problem in Python in general, only in
               | scientific computing / number crunching projects, because
               | of the dependencies on huge complex software, some of it
               | ancient, written in C, Fortran, and C++. So why do we
               | hear about this problem in Python a lot? Well, because
               | it's what's used for the glue/frontend, which is what
               | users work with directly. It's selection bias. Sure,
               | another language might fare somewhat better or worse for
               | this or that reason, but at the end of the day it's gonna
               | be a pain in the ass (at least until next-generation,
               | complete, deterministic, language-agnostic solutions like
               | Nix/GUIX really gain traction).
        
             | ryukafalz wrote:
             | They have: Nix and Guix for example can handle Python
             | dependencies as well as native dependencies, and can build
             | them reproducibly. They just haven't caught on yet.
        
             | nerdponx wrote:
             | There is the Mamba project, which is developing a drop-in
             | Conda replacement, along with a bunch of other related
             | tooling.
        
           | belval wrote:
           | > non-programmers trying to get their Conda environment up
           | and running
           | 
           | I see this issue brought up a lot, but I have yet to see a
           | language that addresses this reliably. By definition setting
           | up an environment for non-programmers is a tall order, what
           | language should they use?
        
             | matsemann wrote:
             | I'm just grumbling because even I as a professional dev can
             | sometimes spend days getting some python project up and
             | running correctly. Then I feel sorry for non-devs for whom
             | all this is only a tool.
        
             | mnw21cam wrote:
             | The simple and easy Java way to do it is to just bundle
             | everything into a Jar. Then it really is a single file
             | "environment". Then you only have the problem of different
             | Java versions rejecting the jar file because it is too new
             | _grumble_.
        
               | _Algernon_ wrote:
               | Wouldn't this be the perfect use case for docker
               | containers (or something equivalent)?
               | 
               | Once the initial Dockerfile is written it's very low
               | maintenance, though getting results out of it could
               | probably be made easier for scientific use.
        
             | smitty1e wrote:
             | One hopes that Python's packaging story can be streamlined
             | in the next few versions.
             | 
             | Never needed Conda, but the packaging tool proliferation is
             | an embarrassment, IMO.
        
           | tehnub wrote:
           | My advice: Docker.
        
             | m4x wrote:
             | Docker isn't a good fit in HPC environments, but you can
             | achieve a similar thing using Singularity.
        
           | nerdponx wrote:
           | I think a lot of this has to do with just how bad/incomplete
           | the docs are, how unnecessarily janky the shell integration
           | is, and how the Anaconda launcher itself makes a huge mess
           | and actively works against best practices.
           | 
           | The docs for building your own packages are even worse, to
           | the point where you basically are left copying snippets from
           | Conda Forge to build anything nontrivial.
           | 
           | Basically Conda is a tremendous engineering achievement, but
           | it's very much still a "first draft" in a lot of ways, and
           | Continuum/Anaconda made some weird decisions that work
           | against its user-friendliness. Imagine for example if third-
           | party repos on anaconda.org could have a description box,
           | link to a homepage, etc...
        
           | [deleted]
        
           | BeetleB wrote:
           | > Half the data science stuff isn't reproducible, not because
           | of the science, but because getting the notebooks running
           | with its dependencies is almost impossible.
           | 
           | As someone who did scientific programming in other languages
           | (Fortran/C++), I can assure you the nonreproducibility was
           | there in those projects as well. Not because of the tech
           | stack but because no one valued reproducibility.
           | 
           | The current situation with notebooks isn't worse. It's more
           | of the same. I think people criticize it more because
           | notebooks are advertised as reproducible research.
        
       | whatever1 wrote:
       | Julia is the next big thing. I am always blown away by its
       | readability and speed.
       | 
       | But it will take years to build a library ecosystem that can
       | rival the python one.
        
         | a9h74j wrote:
         | In this thread I am seeing a number of explanations, including:
         | 
         | Ecosystem; mind-share; readability and engineering mind-set;
         | history/Numpy/Matlab; teachability and academic focus.
         | 
         | There are also comments emphasizing the "dynamic" scientific
         | environment and need to just pick up code left by others.
         | 
         | In terms of the latter, could one apparent requirement be this:
         | The main contact should be with top-level code which at least
         | _looks_ like it is interpreted -- even if through compile-with-
         | run-combined and/or memoization? Need part of the user
         | interface, so to speak, be to hide all intermediate artifacts,
         | even the very thought of object code and executables? That such
         | stuff is for, say, "module creators" not primary users?
        
         | II2II wrote:
         | That is a big part of the author's point: the library ecosystem
         | is here for Python today. While there is a heavy penalty for
         | anything written in Python itself, it doesn't really matter
         | since there isn't much of a penalty once the data is passed to
         | highly optimized libraries and those libraries allow developers
         | to select efficient algorithms rather than implementing their
         | own algorithms (which are likely to be less efficient).
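A small sketch of the trade-off described above: the same sum of squares written as an interpreted loop and as a single call into a compiled routine. The array size and the function names are arbitrary choices for illustration, and the timings will vary by machine, so none are quoted.

```python
import timeit

import numpy as np

data = np.random.default_rng(0).random(200_000)


def python_sum_of_squares(values):
    total = 0.0
    for v in values:  # every iteration runs in the interpreter
        total += v * v
    return total


def numpy_sum_of_squares(values):
    return float(np.dot(values, values))  # one call into compiled code


loop_time = timeit.timeit(lambda: python_sum_of_squares(data), number=3)
numpy_time = timeit.timeit(lambda: numpy_sum_of_squares(data), number=3)
print(f"loop: {loop_time:.4f}s  numpy: {numpy_time:.4f}s")
```

The two functions agree up to floating-point rounding; the gap in run time is the "penalty for anything written in Python itself" that disappears once the data is handed to the library.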
        
         | a9h74j wrote:
         | Remind me if anyone might, is there any story for Julia making
         | use of existing python libraries?
        
           | sundarurfriend wrote:
           | If you mean whether it's possible, PyCall.jl has existed
           | since nearly the beginning of Julia, and PythonCall.jl [1] is
           | a more recent package for the same core functionality -
           | calling into Python code.
           | 
           | [1] https://github.com/cjdoris/PythonCall.jl
        
         | bee_rider wrote:
         | Do you think Julia will chip away at Python's marketshare, or
         | Fortran's? I thought it was aiming to be more of a replacement
         | for the latter, but I've never written a line of Julia in my
         | life, so I am very uninformed.
        
           | Bromeo wrote:
           | Julia advertises itself as solving the "two-language
           | problem". This assumes that people first write exploratory
           | code in python or something similar, and then rewrite it in
           | Fortran etc. So in this scenario, Julia takes marketshare
           | from both.
           | 
           | Personally, I find that many Fortran codes are still used
           | because they have been built for many years, and they can't
           | be rewritten easily. On the other hand, new data science
           | projects start all the time, and the transition to Julia is
           | easy (and worth it in my opinion). That means that in my
           | experience, Julia is mostly competing for marketshare with
           | NumPy/SciPy/SKLearn/Pandas/R/Matlab.
        
           | adgjlsfhk1 wrote:
           | Imo, it eats away at both. Julia makes it relatively easy to
           | meet or exceed Fortran performance, but also gives you the
           | high level abstractions and ease of use of a language like
           | python. I think the biggest problem for Julia currently is
           | the difficulty of AOT compilation and the lack of tiered
           | compilation (like Java/JavaScript). Making the story for
           | either of these better would be a significant quality of life
           | improvement for Julia, and would make it pretty much
           | unrivaled for scientific computing in my opinion.
        
         | pwnna wrote:
         | Python is sufficiently readable, and with the right extension,
         | it is sufficiently fast for the vast majority of purposes. For
         | Julia to truly gain momentum, I think it needs a "killer
         | app/library". However, I'm not sure what it would be that would
         | not already be built for Python.
         | 
         | My personal killer app would be a significantly revamped
         | plotting library/app. While matplotlib is great, it is
         | fundamentally based on image-based plotting. The next
         | generation of data visualization, imo, will likely be
         | interactive. Having an interactive plotting library that allows
         | you to produce publication-quality plots faster and simpler
         | (think of all the time spent aligning text manually..) could be
         | a big deal, but it could also not matter as no one else wants
         | the same things I do.
        
           | nicolaskruchten wrote:
           | What are your thoughts on PlotlyJS.jl?
           | https://plotly.com/julia/
        
           | hpcjoe wrote:
           | Have a look at Makie.jl[1] in Julia. I've been using it for
           | exploring large data sets recently. Ticks your boxes. Jupyter
           | version is image based though, as Jupyter is inherently
           | static. You could use Pluto.jl[2] to build a reactive page.
           | 
           | [1] https://github.com/JuliaPlots/Makie.jl
           | 
           | [2] https://github.com/fonsp/Pluto.jl
        
       | wheelerof4te wrote:
       | One thing that I don't like with Python's scientific libraries is
       | how they change the overall Python syntax.
       | 
       | There are so many ways to slice an array or a dataframe, and only
       | a few of them are valid Python code.
       | 
       | Keeping the language API should have been a priority, but that is
       | a consequence of operator overloading features.
        
         | canjobear wrote:
         | I'm not sure what you're referring to. Nothing you import into
         | Python changes its syntax.
         | 
         | Maybe you're thinking of things like x[:, np.newaxis] where x
         | is a numpy array? This is valid Python code outside of numpy as
         | well, although the built-in data structures like lists and
         | dicts won't know what to do with the :.
        
         | lvass wrote:
         | What language wouldn't suffer from this, besides APL? Even very
         | recent and well designed libraries like Elixir's Nx look like
         | another APL-like language bolted on. Pipe syntax helps but not
         | much.
        
         | kzrdude wrote:
         | Can you explain what you mean more in detail? Libraries can't
         | change the syntax of the Python language, not in the formal
         | sense.
         | 
         | Is this about things you want to be able to express in syntax
         | but can't? Or the other way around - stuff that uses
         | syntax/operators but should really be methods?
        
           | ohyoutravel wrote:
           | Numpy syntax comes to mind. The extra commas often aren't
           | valid pure Python but are required for some operations on
           | numpy arrays. I don't know how this works under the hood, but
           | expect it's a state machine under the numpy ndarray looking
           | for the extra commas and such.
           | 
           | i.e. some_array[0:5,0] which isn't valid pure Python
           | notation.
        
             | kzrdude wrote:
             | Extra commas are "valid in pure python" in the following
             | sense that I can demonstrate.
             | 
             | Open ipython3
             | 
             |     In [3]: class Test:
             |        ...:     def __getitem__(self, index):
             |        ...:         print(index)
             |        ...:
             | 
             |     In [4]: Test()[1, 2, 1:3, ..., :]
             |     (1, 2, slice(1, 3, None), Ellipsis, slice(None, None, None))
             | 
             | It's valid and we get the complicated tuple of integers,
             | slices, ellipsis etc as printed.
             | 
             | Numpy has existed for a long time. Its needs have been
             | taken care of in upstream Python, to a big extent, and
             | other libraries can use the same features.
        
               | ohyoutravel wrote:
               | Interesting! Neither I nor my coworkers could get
               | the snippet I posted working outside the context of an
               | ndarray, so I had speculated at that time that there
               | was something else going on under the hood.
               | 
               | You seem to have a much better grasp of Python than us,
               | would you mind posting an example where the snippet I
               | posted successfully accesses data from an array in pure
               | Python? That way I can not only take the L, but correct
               | the record and learn something in the process.
        
               | kzrdude wrote:
               | This program is quick & lazy but it uses a 1D python list
               | and pretends it's a 2D list. It implements 2D slicing,
               | giving you a square subset just like ndarray. It doesn't
               | intend to be all correct or nice or useful.
               | 
               | https://www.pythonmorsels.com/p/2rk5t/
               | 
               | Is laziness a virtue? I reused the slicing implementation
               | in `range(foo)[index]` so I didn't need to do that logic
               | myself.
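               | 
               | A minimal sketch of the kind of program described above,
               | for readers who cannot open the link. It is not the linked
               | code: the class name Grid2D and its row-major layout are
               | assumptions made for illustration. It stores a flat list,
               | accepts the tuple that `grid[0:5, 0]` passes to
               | `__getitem__`, and leans on `range(n)[index]` to resolve
               | slices and negative indices.

```python
class Grid2D:
    """Hypothetical 2D view over a flat, row-major Python list."""

    def __init__(self, data, n_rows, n_cols):
        data = list(data)
        if len(data) != n_rows * n_cols:
            raise ValueError("data length must equal n_rows * n_cols")
        self.data = data
        self.n_rows = n_rows
        self.n_cols = n_cols

    def __getitem__(self, index):
        # grid[r, c] arrives here as the tuple (r, c); a:b arrives as a slice.
        row_index, col_index = index
        # Indexing a range reuses Python's own slice and negative-index logic:
        # an int gives back an int, a slice gives back a (clamped) range.
        rows = range(self.n_rows)[row_index]
        cols = range(self.n_cols)[col_index]

        def at(r, c):
            return self.data[r * self.n_cols + c]

        if isinstance(rows, int) and isinstance(cols, int):
            return at(rows, cols)                      # single element
        if isinstance(rows, int):
            return [at(rows, c) for c in cols]         # part of one row
        if isinstance(cols, int):
            return [at(r, cols) for r in rows]         # part of one column
        return [[at(r, c) for c in cols] for r in rows]  # rectangular block


grid = Grid2D(range(12), n_rows=3, n_cols=4)
print(grid[0:5, 0])      # first column; the slice is clamped to 3 rows
print(grid[1:3, 1:3])    # 2x2 block
```

               | No syntax beyond plain Python is involved; the comma
               | simply builds a tuple that the class chooses to interpret.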
        
       | nanochad wrote:
        
       | iainctduncan wrote:
       | If you know actual scientists, this isn't counterintuitive at
       | all. My partner is a scientist, so now I know tons of them, and I
       | have done a bunch of Python coding and support for scientists,
       | have been a Python programmer (as well as other languages) since
       | 2005-ish. I saw this coming (as did many) 15 years ago.
       | 
       | Most scientists, and their grad students, are trying to do a
       | whole bunch of things in their research, and programming is just
       | one of them. Field work, experiments, data wrangling, writing
       | papers, defending papers, teaching, etc. And most of them do not
       | have access to budgets for programmers or when they do, it's for
       | a limited amount of time and work, meaning they need to be able
       | to pick up and run with whatever the programmer did. So the fact
       | that with Python they and their grad students (who might be there
       | for only 2 years) can be working productively, and figure out
       | what the hell the code did when they come back to it months
       | later, is HUGE. As in, literally blows every other consideration
       | to smithereens. This has meant that over the last 20 years the
       | scientific libraries in Python got mature faster than in any
       | other language, and this in turn has had a snowball effect. And
       | when speed is necessary, C++ extensions can be written. But
       | honestly, most of the time speed is not the main factor.
       | 
       | The downside of Python in my experience is that junior teams can
       | make heinous atrocities when a project gets really big (I have
       | had to step in as CTO to one of those messes, so much as I love
       | Python, I must admit this is true!) But the stuff the scientists
       | are doing is very rarely that big. It's tools programming,
       | scripting, making utilities, data analysis and so on.
       | 
       | Readability counts. In some fields, it counts more than anything.
       | I've worked in about 10 languages now over the last 20 years, and
       | Python is still the easiest to read when you come back to some
       | old code or have to pick up code for a small job, or hand it to a
       | beginner to extend without having them create an unreadable mess.
       | This is what scientists need to do all the time.
       | 
       | Re other people's comments on Python packaging and setup being
       | hard, well honestly I've had just as much pain with Ruby or Node.
       | The shining exception there is R, which is giving Python a run
       | for its money in many scientific areas. R Studio has the best
       | "hit the ground running" experience out there and is really slick
       | for data programming.
        
         | analog31 wrote:
         | My partner's partner is a scientist too. ;-)
         | 
         | In addition to not having budgets for programmers, we also
         | don't know how to manage them, for instance how to communicate
         | our needs, decide if their implementation plans make sense, or
         | gauge their progress. Nearly half a century after _The Mythical
         | Man-Month_, managing software development is still generally
         | acknowledged to be an unsolved problem.
         | 
         | The other two obstacles are that most programmers _hate_ the
         | scientific work environment, with its ever-changing
         | requirements and frequent dead ends. And, the programmers who
         | can work on math related stuff are in the highest demand.
        
           | protomolecule wrote:
           | "the programmers who can work on math related stuff are in
           | the highest demand."
           | 
           | Could you suggest the best way for finding a remote job for
           | such a programmer?
        
             | iainctduncan wrote:
             | Network with scientists. Doing some small jobs or favours
             | for scientists who will tell other scientists about you is
             | the way to go. Universities are a good source of
             | connections.
        
               | Icathian wrote:
               | If labs are really struggling to find math-literate
               | programmers, I would imagine it's in part because the
               | process for matching them with the work is so terrible.
               | Generally speaking, skilled programmers do not want to
               | (and certainly don't have to) shake hands and do favors
               | to find work.
               | 
               | I wonder if there's any concerted effort to fix that for
               | academia, or if the "shortage" of math-literate
               | programmers just isn't a problem worth fixing.
        
               | iainctduncan wrote:
               | That was kind of my point. What skilled programmers do to
               | find work is one thing, what scientists, who just need
               | some help for a short project, do to find people is
               | another. Assuming you are a programmer who wants to do
               | work for scientists for some reason, you need to go where
               | they are - they won't find you in your regular tech
               | recruiting circles, which tend to be all about full time
               | jobs. I happen to like doing some work for scientists so
               | that my career isn't entirely about making private equity
               | companies richer, but I don't expect them to pay my
               | enterprise rates or find me on LinkedIn.
        
               | sjackso wrote:
               | To make matters worse, university staff software
               | engineering jobs usually pay 1/3 to 1/2 of comparable
               | jobs in industry (even after excluding FAANG-level
               | outlier salaries), and in most cases offer no meaningful
               | career progression.
               | 
               | I think universities will never be able to compete for
               | engineering talent until they can create attractive
               | career paths for people who aren't professors.
        
             | chasely wrote:
             | Early stage startups founded by scientists? At least that's
             | my use case.
             | 
             | - get more traction
             | 
             | - I'm less able to focus on the software and have to focus
             | on business development
             | 
             | - I'm going to need to hire someone who is a competent
             | programmer but can also deal with the mathy bits
        
           | iainctduncan wrote:
           | Spot on with my experience! Much of our work was helping them
           | manage the project and figure out how to work with us. And
           | someone went on sabbatical, and then someone dropped their
           | program, and someone else left for another school, and
           | someone was stuck managing the program for a semester who had
           | literally no time or experience doing that, etc. It's a
           | Dynamic Environment. lol.
           | 
           | There is no other language I have used that makes it as easy
           | to read code from somebody else, especially where that
           | contributor is likely to be a domain expert with very limited
           | programming experience. It's not actually my favourite
           | language anymore (hello Scheme!) but if you want me to do
           | work in that environment, I'll reach for Python first.
        
         | blunte wrote:
         | > Python is still the easiest to read when you come back to
         | some old code
         | 
         | Lucky you. You must not have seen the "pythonic" monstrosities
         | I've seen.
         | 
         | Python has such a low barrier for entry that one can "get stuff
         | done" with absolutely atrocious and often very overly
         | complicated OOP-ish code.
         | 
         | Ruby is not my favorite language, but I would bet real money
         | that without dependence on libraries, nobody could show me
         | Python code which I could not show more logical, consistent,
         | and readable Ruby code which solves the same problem. I say
         | Ruby because it's of the same "type" and follows similar
         | methodologies.
         | 
         | Python suffers from far too many years under the leadership of
         | one odd person. It has a cult-like following, whereby anyone
         | who disagrees is an outcast. Where else could you hear comments
         | like, "why would you ever need a switch statement? if/if else
         | works fine!" That's just the tip of the iceberg.
         | 
         | Python is great for integration glue code, but only because of
         | the libraries it has. But now it is becoming more like
         | JavaScript, and the dependencies are multiplying to the point
         | where you're better off writing your own left-pad (or even
         | re-evaluating your approach) instead of taking on new duct tape
         | like django-database-view.
         | 
         | Sometimes the bar needs to be high enough to force the juniors
         | to actually learn something before they start building "MVP"
         | startups. On the other hand, who cares if the MVP is a horror
         | show as long as you get that IPO and take your f-u money and
         | leave.
        
           | iainctduncan wrote:
           | So my real job is technical due diligence on companies being
           | purchased. I get the keys to the kingdom when we do a
           | diligence and trust me, there are just as many people making
           | unmaintainable monstrosities that get bogged down in tech
           | debt in Ruby. Looking at this scenario is literally my job,
           | and the company I work for does more of these than anyone in
           | the world.
           | 
           | Bad coders can make terrible stuff in any language, and with
           | two as similar as Python and Ruby, the minor differences are
           | a drop in the bucket in the grand scheme of things. Both
           | Django-database code and RoR's Active Record have bogged down
           | many a startup when they got big enough that DB size and
           | query performance mattered.
           | 
           | None of which, as I pointed out, is relevant to the vast
           | majority of scientists writing code.
        
         | ip26 wrote:
         | Not to mention, what they are working on is often very abstract
         | compared to the math many programmers are used to doing. I
         | write a lot of boolean; my scientist partner writes
         | regressions, surface transformations, eigenmaps, linear
         | algebra, and so on. Imagine being something _other_ than a
         | programmer by trade, and trying to apply linear algebra to your
         | problem without good tools or libraries.
        
         | liamwestray wrote:
         | Literally can't add anything to this.
         | 
         | You nailed it.
         | 
         | I suspect MicroPython is going to do the same thing to
         | Arduino/C as Python just did here in Academia as well.
        
         | zozbot234 wrote:
         | > Readability counts. In some fields, it counts more than
         | anything. I've worked in about 10 languages now over the last
         | 20 years, and Python is still the easiest to read when you come
         | back to some old code or have to pick up code for a small job,
         | or hand it to a beginner to extend without having them create
         | an unreadable mess. This is what scientists need to do all the
         | time.
         | 
         | Meh. Python might be readable at the smallest scale, but then
         | COBOL is even more readable. What matters is large-scale
         | development, and your implied point that large Python projects
         | turn into unstructured big-ball-of-mud monstrosities is well
         | taken. A big ball of mud is _not_ surveyable, or "readable".
         | 
         | Which is where other modern languages (e.g. Julia in the
         | scientific programming domain, heck even Go or Rust) will
         | probably have an advantage.
        
           | bigbillheck wrote:
           | > What matters is large-scale development,
           | 
           | Not for the kind of computing being talked about.
        
             | zozbot234 wrote:
             | In a relatively terse language like Python, anything beyond
             | a few screenfuls of code is already "large scale"
             | development. It's unwise to keep it all in a single module.
        
           | icedchai wrote:
           | Some of the best code I've worked on has been python.
           | Unfortunately, also some of the worst. 5000 line, single file
           | "modules", with spaghetti class hierarchies (5+ levels deep)
           | and dynamic method calls making it nearly impossible to
           | debug.
        
           | iainctduncan wrote:
           | Think what you will, the scientists disagree. Which was the
           | point. Not holding my breath to find many scientists matching
           | my description who would rather learn Go or Rust...
        
             | shpongled wrote:
             | I'll be the counterexample. PhD in life sciences, but I've
             | also been programming since I was a teen. Rust is by far my
             | most used language for both general fun projects and in my
             | role as a programmer in the life-sciences. Python is OK for
             | ad-hoc analyses, but I cannot stand to use dynamically
             | typed languages for anything "real" given how much
             | difficulty dynamic typing imposes on reading and
             | understanding code.
        
               | iainctduncan wrote:
               | Sure, but by your description, you aren't really the
               | people I'm describing. If you've been doing this since
               | you were a teen, you're a "Real Programmer". My point was
               | that people who have to do this as item 7 of 10 things in
               | their job description are very much less likely to learn
               | something like Rust than Python. That is undeniably a
               | bigger lift to a non-programmer. Python's success in the
               | sciences is in large part due to how good a fit as a
               | language it is for part-time occasional programmers.
               | 
               | I like all kinds of languages, but the only ones I would
               | encourage my partner to bother with as tools for her
               | science work would be R and Python.
        
       | uoaei wrote:
       | Python is an API to efficient scientific computing code. It's
       | good for that, assuming you're using old and more verbose
       | languages.
       | 
       | Look into Julia as a promising alternative -- the language itself
       | is superbly fast (aside from initial compilation) and there's an
       | impressive scicomp ecosystem to say the least, all written in
       | native Julia. This allows for program rewriting / metaprogramming
       | more broadly and is insanely powerful once you get a feel for it.
        
       | hulitu wrote:
       | I try to love micropython. However, its UI is at ed level. It
       | only says "Syntax error".
        
         | the__alchemist wrote:
         | Python excels in several domains. For example, the non-speed-
         | critical numerical computing this article is about. It's also
         | nice for backend web development, and scripting. Embedded isn't
         | one of its strengths, and I'm suspicious micropython was an
         | attempt at bringing embedded programming to people who don't
         | want to learn more than one language.
        
       | derbOac wrote:
       | C, C++, Fortran are still used, most Python users just don't see
       | it because it's hidden away underneath the calling function.
       | 
       | I've been surprised by the rise of Python in some ways although
       | not at all in others. Languages like C, C++, Fortran, and dare I
       | say it Rust are too low-level in their raw state for numerical
       | computing. You had the US federal government funding language
       | competitions because of this (see: Chapel). Languages like Python
       | and R (and before that things like Lisp) came along and gave
       | people a taste of something different, and it's obvious what
       | people migrated to.
       | 
       | Part of it is timing: multivariate computational statistics
       | (ML/data science/DL/whatever you want to call it) just sort of
       | started taking off in computer science communities before LLVM-
       | based languages like Julia or Nim could get a foothold. OCaml
       | might have fit that niche but never got there because of a desire
       | to take a different path, or take the path more slowly.
       | 
       | So people looked for a nice expressive language, found it in
       | Python, and buried all the messy stuff behind wrapper functions
       | and called it a day. It was furthered along by Matlab being
       | another comparison on the other side -- Python looks kludgy
       | compared to modern Fortran or C, but not compared to Matlab.
       | 
       | All that wrapper time in Python has its costs, so I suspect as
       | limits get pushed further we'll eventually see a migration to
       | something else like Julia or Nim, or something else not on
       | anyone's radar.
       | 
       | One moral to this story is that expressiveness matters. People
       | will go out of their way to avoid talking directly to machines at
       | a low level.
        
         | nneonneo wrote:
         | > Python looks kludgy compared to modern Fortran or C
         | 
         | I'm not sure I can agree with this. Both Python and Matlab
         | provide very nice, high level ways to interact with
         | multidimensional data using simple syntax. Under the hood, both
         | will wind up using fast algorithms to implement the operations.
         | C and Fortran require much more low-level considerations like
         | manually managing memory, futzing with pointers or indices, and
         | generally writing a lot more boilerplate code to shuffle data
         | around.
         | 
         | Matlab, despite all its quirks, could probably have won if it
         | was open source. It's got a very long history of use in
         | scientific computation and a large user base despite its high
         | price.
        
           | auxym wrote:
           | Matlab works fine for anything purely "numerical" but fails
           | hard as soon as you need to do more "general computing". Just
           | string handling for example. Or, as far as I know, it's still
           | not possible to implement a custom CLI interface in a matlab
           | script, like you would with argparse in python.
           | 
           | Matlab also historically was really bad for abstraction and
           | code architecture in general. For example, the hard "1
           | function per file" rule, which encouraged people to not use
           | functions at all, or if you really had to, write 2 or 3
           | really huge functions (in separate files). Only in recent
           | years (the past 5 or 10 years) did matlab get OOP stuff
           | (classes) and the option for multiple (private) functions in
           | a single script file (still only one public/exported function
           | is possible per file, because the file name is the function
           | name and matlab uses path-based resolution).
        
           | leephillips wrote:
           | Fortran does not require (nor has much available for) manual
           | memory management, and its array syntax is more convenient
           | than Numpy (and far more convenient than Python without
           | Numpy), obviating any futzing around with pointers or
           | indices.
        
         | cb321 wrote:
         | You may know this, but since you always mentioned Nim & Julia
         | together, it might confuse passers by. Nim does not, in fact,
         | need LLVM (though there is a hobby side project using that).
         | Mainline Nim compiles directly to C (or C++ or Javascript) and
         | people even use it on embedded systems.
         | 
         | What seems to attract scientists is the REPL and/or notebook UI
         | style/focus of Matlab/Mathematica/Python/Julia/R/... As
         | projects migrate from exploratory to production, optimizing for
         | interactivity becomes a burden -- whether it is Julia Time To
         | First Plots or dynamic typing causing performance and
         | stability/correctness problems in Python code or even just more
         | careful unit tests. They are just very different mindsets -
         | "show me an answer pronto" vs. "more care".
         | 
         | "Gradually typed" systems like Cython or Common Lisp's
         | `declare` can sometimes ease the transition, but often it's a
         | lot of work to move code from everything-is-a-generic-object to
         | articulated types, and often exploratory code written by
         | scientists is...really rough proof of concept stuff.
        
           | leecommamichael wrote:
           | Having taught a number of scientists both pre and post grad,
           | I agree with your take on notebooks/REPLs. Data-scientists
           | are not generalist programmers; in some cases they are
           | hardly more advanced than some plain end-users of operating
           | systems. They shy away from the terminal, they have fuzzy
           | mental models of how the machine operates.
           | 
           | Being a generalist programmer that sometimes deploys the work
           | that data-scientists craft, I'd really like an environment
           | for this that can compile to a static binary.
           | 
           | Having to compile a whole machine with all the right versions
           | of shared libraries is a terrible experience.
        
           | nextos wrote:
           | The time to first plots in Julia is drastically lower now.
           | And still, it was something you only paid once per session,
           | due to JIT.
           | 
           | Julia is the first language I find truly pleasant to use in
           | this domain. I am more than happy to pay a small initial JIT
           | overhead in exchange for code that looks like Ruby but runs
           | at 1/2 the speed of decent C++.
           | 
           | Plus, lots of libraries are really high quality and
           | _composable_. Python has exceptionally good libraries, but
           | they tend to be big monoliths. This makes me feel Julia or
           | something like Julia will win in the long run.
        
             | exdsq wrote:
             | Julia runs 2x the speed of decent C++?!
        
               | nextos wrote:
               | Sorry I meant 1/2 the speed or 2x the time, edited :)
               | 
               | Consider that BLAS written in _pure_ Julia has very
               | decent performance. If you are into numerical computing,
               | you will quickly understand this is crazy.
               | 
               | Carefully written Julia tends to be surprisingly fast.
               | Excessive allocations tend to be a bigger performance
               | problem than raw speed. Of course excessive allocations
               | eventually have an impact on speed as well. There are
               | some idiomatic ways to avoid this.
        
           | derbOac wrote:
           | That's a good point about Nim. Nim has a nice set of
           | compilation targets, which I tend to forget.
           | 
           | You might be right about the REPL aspect of things. On the
           | other hand, R took off with a pretty minimal REPL, and my
           | first memories of Python didn't involve a REPL. I think as
           | the runtime increases a REPL becomes less relevant, and it
           | seems like most languages with significant numerical use
           | eventually get a REPL/notebook style environment even if it
           | wasn't there initially.
        
         | kzrdude wrote:
         | Python was pragmatic and adopted changes that numpy needed and
         | advocated for. Maybe Julia is the only other worthy comparison?
         | 
         | Also, dynamic typing is a boon - and default & keyword
         | arguments are a great feature for complicated, versatile, useful
         | algorithm implementations and interfaces to them. Both of these
         | features have a cost in bigger programs, but they really make
         | Python stand out.
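
As a small illustration of the default and keyword-argument point above, here is a minimal sketch. The function `moving_average` and its parameters `window` and `pad` are made up for this example and do not come from any library.

```python
def moving_average(values, window=3, *, pad=False):
    """Moving average with a sensible default and one keyword-only option."""
    if window < 1:
        raise ValueError("window must be at least 1")
    data = list(values)
    if pad and data:
        # Repeat the edge values so the output is as long as the input.
        half = window // 2
        data = [data[0]] * half + data + [data[-1]] * (window - 1 - half)
    return [sum(data[i:i + window]) / window
            for i in range(len(data) - window + 1)]


# The simple call stays simple; options are named at the call site.
print(moving_average([1, 2, 3, 4, 5]))            # [2.0, 3.0, 4.0]
print(moving_average([1, 2, 3, 4, 5], window=2))  # [1.5, 2.5, 3.5, 4.5]
print(len(moving_average([1, 2, 3, 4, 5], pad=True)))  # 5
```

Because `pad` is keyword-only, a call such as `moving_average(xs, 3, True)` is rejected, which keeps long argument lists readable as an interface grows.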
        
         | ip26 wrote:
         | _People will go out of their way to avoid talking directly to
         | machines at a low level_
         | 
         | I would put it differently. At 30 bugs per kLOC, I'd prefer my
         | codebase expresses a problem & its solution, and as little
         | below that level as possible.
         | 
         | Each well-vetted layer of abstraction between a scientific
         | programmer and the machine's low level interface eliminates
         | whole classes of bugs that are irrelevant to the problem that
         | user is actually working on.
        
           | DasIch wrote:
           | > At 30 bugs per kLOC[...]
           | 
           | Where does that number come from?
        
             | arthurcolle wrote:
             | The context suggests to me that this was a self-reported
             | approximation from GP
        
             | ip26 wrote:
             | It's a median-ish of various studies. You can google "bugs
             | per 1000 lines of code".
             | 
             | The important part wasn't the exact number, but rather the
             | discovery that the ratio is pretty stable.
        
           | catlifeonmars wrote:
           | I believe an appropriate term for that low level in this
           | context is _undifferentiated lifting_
        
         | Robotbeat wrote:
         | It's because Matlab (and Mathematica, etc) is proprietary, and
         | therefore you always have to pay the Danegeld. So we use numpy
         | instead because it's extensible, it uses all the super fast
         | C/C++/FORTRAN stuff on the backend, and is fairly easy to
         | learn.
         | 
         | I actually still would prefer Matlab as the syntax is more
         | compact and natural than numpy (which is like a matlabified
         | Python), but that's probably just due to more experience in
         | Matlab.
        
           | adgjlsfhk1 wrote:
           | I'd definitely recommend checking out Julia for this use case.
           | You get code that looks pretty much like matlab, but which
           | runs like fortran/C++. (Also there is very solid and fast
           | interop with python, so you can call anything you need from
           | the python side).
        
             | Robotbeat wrote:
             | What does a Julia environment look like, in practice? Is it
             | anything like the Matlab environment, where not only is
             | there a console, an integrated editor, and super easy to use
             | debugging/performance measurement, but also all the
             | variables are visible in the GUI?
             | 
             | If so, I'd consider switching (as Matlab does that better
             | than vanilla numpy). Julia is pretty great in theory. It is
             | still a very new language for my uses, which means the
             | documentation and community are orders of magnitude smaller
             | than Matlab or Numpy/python.
        
               | ForHackernews wrote:
               | Yes(-ish). You can use Julia in a Jupyter (the Ju- is
               | Julia) notebook, just like Python. This is a pretty user-
               | friendly experience for students, academics and data
               | scientists.
               | 
               | https://towardsdatascience.com/how-to-best-use-julia-with-ju...
        
           | kragen wrote:
           | Octave is free software with Matlab syntax and Matlab-style
           | interactivity (autoreload, etc.) I'm not a huge fan of the
           | language (Matlab/Octave) but it certainly does make it quick
           | to whip things up.
        
             | Robotbeat wrote:
             | It sucks compared to Matlab, though. Unfortunately. (Scilab
             | is better, altho not compatible.) But I have also used
             | it in a pinch. Size of the community means Matlab or Numpy
             | are your best options. If you aren't happy with Matlab due
             | to cost or licensing stuff, numpy is really good. Also
             | integrates with a lot of Python stuff like machine vision,
             | machine learning, etc, which have expensive or nonexistent
             | packages in Matlab.
        
               | kragen wrote:
               | Interesting! What are the drawbacks of Octave?
        
           | pm90 wrote:
           | Yep. When I was in grad school all the labs were furiously
           | migrating away from matlab because of its costs and confusing
           | licensing around running multiple replicas.
        
           | aimor wrote:
           | I have the same experience, but it's more than just syntax.
           | The Matlab IDE pulls together so much in a polished and
           | robust product. Python notebooks and IDEs (Spyder, Jupyter,
           | PyCharm, VSCode among a few others I've tried) are
           | frustrating to use in comparison.
        
             | Robotbeat wrote:
             | Yup, I agree 100%. I've been trying to use just vanilla
             | python because of interdependency hell (and changing terms
             | of service for anaconda), and I've been succeeding, but
             | it's a LOT more work and less clear what's going on.
        
         | agumonkey wrote:
         | Isn't it the usual "dynlang as prototyping clay" story? python
         | (with FFI -> native libs) lets you iterate over ideas faster
         | and leaner.
        
         | zozbot234 wrote:
         | > You had the US federal government funding language
         | competitions because of this (see: Chapel).
         | 
         | Wait, weren't they supposed to be using Ada for everything
         | anyway? What's wrong with Ada?
        
           | tormeh wrote:
           | Different design objectives, I believe.
        
         | _dain_ wrote:
         | Nim is not LLVM based, it compiles to plain old C/C++.
        
           | adenozine wrote:
           | Nim actually has LLVM support via nlvm
           | 
           | https://github.com/arnetheduck/nlvm
           | 
           | It's not officially blessed, but it does work.
        
         | pdonis wrote:
         | _> C, C++, Fortran are still used, most Python users just
         | don't see it because it's hidden away underneath the calling
         | function._
         | 
         | Yes, the article talks about this: Python is a glue language
         | and the actual heavy duty computation is being done inside an
         | extension module like numpy that's written in a faster
         | language.
        
           | joshuamorton wrote:
           | OpenBLAS and LAPACK are mostly Fortran and numpy will use
           | them if present on the system.
        
       | oh_my_goodness wrote:
       | This article expresses the ancient Python(/Matlab) v Fortran
       | argument beautifully ... but it's kind of shocking that the
       | argument is still going on at all. My generation came out of
       | school happy to use FORTRAN indirectly, via a scripting language,
       | for rapid prototyping. That was 30 years ago.
        
       | scythe wrote:
       | As a physicist, having spent eight years in academia, Python did
       | not win by beating Fortran. Nor did it beat C++. It didn't really
       | compete with Ruby or Lisp, although Lua (Torch) was a briefly
       | serious competitor before everyone realized that a language
       | developed by four people, one of whom doesn't get along with the
       | others, couldn't be responsive to users' needs.
       | 
       | Python defeated Matlab. I know because I cheered it on. I was
       | there. I watched my roommates and friends struggle with
       | introductory scientific computing in Matlab and I joined the
       | chorus that was practically begging for Python, even though I
       | didn't really like it. I can't even begin to explain how awful it
       | is to try to teach programming concepts in Matlab. But something
       | like Python or Matlab had to be the choice because the schools
       | wanted to teach programming through a language where you could
       | just call "graph" and the computer would display a graph.
       | 
       | Python's team, unlike Lua's, aggressively courted educational
       | institutions by offering scientific, numerical and graphical
       | libraries within a programming language that works like a
       | programming language, not a glorified computer algebra system.
       | They even added a dedicated operator for matrix multiplication.
       | It's a great example of finding a niche and filling it: I still
       | don't like _using_ Python, but I can't dispute that no other
       | language/ecosystem comes close to offering what we need to teach
       | programming to physics students.
       | 
       | You want to beat Python? Build a type system that can capture
       | dimensional analysis. Warning: it won't be easy.
        
         | poleguy wrote:
         | I'm in engineering at a major engineering company historically
         | using simulink and matlab. Python took over here in large part
         | because matlab licensing caused so much friction, and we wanted
         | to scale the simulink and matlab models up to run on a cluster
         | of machines. We wanted to give scripts to people without matlab
         | licenses quickly. etc. It was not the cost per-se, but the red
         | tape.
         | 
         | We also ditched simulink because it is very difficult to
         | version control and collaborate with a graphical interface.
         | 
         | Matlab is pushed heavily in the schools so all the engineers
         | knew it and were comfortable with it. Matplotlib and numpy
         | mimicking matlab very closely allowed the transition to be easy.
         | We're not looking back. Only a handful of people still use
         | matlab for their individual work because the python camp hit
         | critical mass and the transition is not hard.
         | 
         | Python working to control serial ports, ethernet, visa/gpib
         | instruments, all without the friction of getting extra licenses
         | was icing on the cake. Matlab has a buy the cadillac model: the
         | wheels, doors, hood, gas cap, mirrors are all optional add-ons.
         | Each point causes friction, as only a few people had the whole
         | tool, and therefore nobody could reliably share code.
        
         | _aavaa_ wrote:
         | Oh god, the atrocities I have seen colleagues do with MATLAB
         | scripts...
        
         | analog31 wrote:
         | I think a factor in Python vs Matlab is that Python grew into
         | areas where Matlab was not entrenched. Also, students with an
         | aptitude for programming and an eye for the market want to
         | learn languages that are used by software developers. Very few
         | engineers actually want to _program_ in Matlab. If they can
         | program, then they want to market themselves as programmers.
         | 
         | A benefit of Matlab remains that it all comes from one place,
         | with one installer, meaning that you can get a classroom full
         | of students up and running almost instantly. And it offers some
         | relief for students who will never grasp programming, through
         | its collection of pre-written apps.
        
         | adw wrote:
         | As a physicist who spent a decade in academia, including a PhD
         | where all the new work was done in Python, it absolutely won in
         | some fields by beating - or rather, by conveniently wrapping -
         | Fortran.
         | 
         | (In particular, that's how things have gone in the materials
         | physics/solid-state/quantum chemistry field. It absolutely beat
         | out Matlab in other fields. One of the underrated benefits was
         | being a lingua franca across more of physics!)
        
         | dboreham wrote:
         | Always nice to hear an authentic telling of history from
         | someone who was there and had the necessary insight to
         | interpret events and motivations. So much of what we read is
         | "the victors' written revision".
        
         | rsfern wrote:
         | > You want to beat Python? Build a type system that can capture
         | dimensional analysis. Warning: it won't be easy.
         | 
         | Curious about your thoughts on pint and Unitful.jl -- pint
         | doesn't really go all the way to a full type system, and
         | Unitful.jl doesn't work with everything (autograd is a problem
         | still I think). But Unitful.jl is super cool.
         | 
         | https://pint.readthedocs.io/en/stable/wrapping.html#checking...
         | 
         | https://painterqubits.github.io/Unitful.jl/stable/
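
To make the dimensional-analysis idea concrete, here is a minimal sketch of the runtime flavour of unit checking. It is not pint's or Unitful.jl's API: the `Quantity` class and the `metres`/`seconds` helpers are invented for illustration, and errors surface when the code runs rather than in a static type checker, which is the gap the challenge above is about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float
    dims: tuple  # exponents of (length, mass, time)

    def __add__(self, other):
        # Addition only makes sense for identical dimensions.
        if self.dims != other.dims:
            raise TypeError(f"cannot add dimensions {self.dims} and {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other):
        # Dividing quantities subtracts their dimension exponents.
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)


def metres(x):
    return Quantity(x, (1, 0, 0))


def seconds(x):
    return Quantity(x, (0, 0, 1))


speed = metres(100.0) / seconds(9.58)
print(speed.dims)  # (1, 0, -1), i.e. length per time

try:
    metres(1.0) + seconds(1.0)
except TypeError as exc:
    print("rejected:", exc)
```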
        
         | BeetleB wrote:
         | > Python defeated Matlab.
         | 
         | This is _the_ answer. Scientific Python was originally an
         | alternative to MATLAB. When I was in grad school, I did most of
         | my research in MATLAB. Then we had a visiting student who was
         | doing very similar computations in SciPy, and he assured me
         | performance was not a problem. I migrated my MATLAB scripts to
         | Python and never looked back.
         | 
         | Only _after_ it became a viable alternative to MATLAB did
         | people decide it could be used for much more than what you
         | typically get with MATLAB.
        
       | [deleted]
        
       | jrochkind1 wrote:
       | As a rubyist, it makes me sad that python ended up here rather
       | than ruby. And I sometimes wonder why.
       | 
       | > As the name suggests, numeric data is manipulated through this
       | package, not in plain Python, and behind the scenes all the heavy
       | lifting is done by C/C++ or Fortran compiled routines.
       | 
       | So I wonder, was it easier to write C/C++ or fortran compiled
       | extensions in python than it was in ruby?
        
         | belval wrote:
         | I don't know how easy it is in Ruby so I cannot give you a
         | comparison.
         | 
         | However it is very very easy to write Python bindings for a
         | C/C++ library with minimal work. Solutions range from "just
         | works" like ctypes to "actually integrates with the language"
         | like Cython. You also have automated tools for wrapping like
         | pybind11 which does a lot of the heavy lifting for you.
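
For a sense of what the ctypes route mentioned above looks like, here is a minimal sketch that calls `cos` from the system C math library. It assumes a platform where `ctypes.util.find_library("m")` can locate that library (typical on Linux and macOS); where it cannot, the helper returns `None` instead of failing. The name `load_c_cos` is made up for this example.

```python
import ctypes
import ctypes.util
import math


def load_c_cos():
    """Return the C library's cos() as a Python callable, or None."""
    path = ctypes.util.find_library("m")  # lookup is platform-dependent
    if path is None:
        return None
    try:
        libm = ctypes.CDLL(path)
    except OSError:
        return None
    # Declare the C signature: double cos(double).
    libm.cos.argtypes = [ctypes.c_double]
    libm.cos.restype = ctypes.c_double
    return libm.cos


c_cos = load_c_cos()
if c_cos is None:
    print("C math library not found on this platform")
else:
    # The C result should agree with Python's own math.cos.
    print(c_cos(0.5), math.cos(0.5))
```

No compiler or build step is involved, which is the appeal; the cost is that the argument and return types have to be declared by hand.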
        
         | bsder wrote:
         | > And I sometimes wonder why.
         | 
         | David Beazley talks about this in a YouTube video somewhere.
         | (Can't find it right now, maybe someone will in the comments.)
         | 
         | It was a lot of serendipity. Python was up and running when the
         | US national labs wanted to collaborate and their tools all
         | sucked. Since they wanted visualization this left only Tcl/Tk
         | or Python/Tk. And Beazley was hanging around as a grad student
         | in a national lab with a connection machine, no real boss, no
         | real oversight, and very little budget. He built stuff out of
         | Python, and it snowballed to other labs.
         | 
         | (Found it: see jasode's response)
         | https://www.youtube.com/watch?v=riuyDEHxeEo&t=1804s
        
         | mountainriver wrote:
         | It's all about the community. As soon as a language gets
         | attached to a profession it's hard to break. Ruby has primarily
         | been a web dev language, also the syntax is bad =P
        
         | jltsiren wrote:
         | From what I remember, people were actively promoting Python as
         | the first programming language already in the 90s. Many
         | universities started teaching Python, creating a steady supply
         | of non-CS majors who were familiar with Python but no other
         | language. And because the community was there, people started
         | building the ecosystem.
         | 
         | In contrast, I've never really encountered anyone advocating
         | for Ruby outside web development.
        
           | zozbot234 wrote:
           | Python got its _start_ as a pure teaching language. It's
           | what the language was designed for in the first place, a
           | modern alternative to old BASIC.
        
         | wwfn wrote:
         | Perl was my horse in the race. I attribute its, lisp's,
         | ruby's, etc loss to 1. "There should be one-- and preferably
         | only one --obvious way to do it" being part of python's ethos.
         | 2. ipython repl
         | 
         | 1. pairs with jaimebuelta's artistic vs engineering dichotomy,
         | but also plays into the scientist wearing many more hats than
         | just programmer. Code can be two or more degrees removed from
         | the published paper -- code isn't the passion. There isn't
         | reason, time, or motivation to think deeply about syntax.
         | 
         | 2. For a lot of academic work, the programming language is
         | primarily an interface to an advanced plotting calculator. Or
         | at least that's how I think about the popularity of SPSS and
         | Stata. Ipython and then jupyter made this easy for python.
         | 
         | For what it's worth, the lab I work for is mostly using shell,
         | R, matlab, and a tiny bit of python. For numerical analysis, I
         | like R the best. It has a leg up on the interactive interface
         | and feels more flexible than the other two. R also has better
         | stats libraries. But when we need to interact with external
         | services or file formats, python is the place to look (why PyPI
         | beat out CPAN is a similar question).
         | 
         | Total aside: Perl's built in regexp syntax is amazing and a
         | thing I reach for often, but regular expressions as a DSL are
         | supported almost everywhere (like using languages other than
         | shell to launch programs and pipes -- totally fine but misses
         | all the ergonomics of using the right tool for the job). I'd
         | love to explore APL as an analogous numerical DSL across
         | scripting languages. APL.jl [0] and, less practically april[1],
         | are exciting.
         | 
         | [0] https://github.com/shashi/APL.jl
         | [1] https://github.com/phantomics/april
        
         | prpl wrote:
         | It was multiple things, really. I would attribute some of
         | it to Swig, Perl attrition, SCons/Software Carpentry,
         | integration with GUI libraries, good documentation, and various
         | other efforts in the mid 2000s. A lot of those things were
         | solving research problems simply, and Python's use just kept
         | expanding.
         | 
         | Python was already taking over in many use cases by late 2000s.
         | 
         | Ruby was known, but it didn't have the following at multiple
         | levels in academia like Python did.
        
           | jrochkind1 wrote:
           | You describe what happened, which I saw happen too. The
           | question I have is _why_ though. Right, _why_ did python's
           | use in scientific computing keep expanding, and not ruby's?
           | _Why_ was python already taking over many use cases by the
           | 2000s, but not ruby? _Why_ did python develop the following
           | at multiple levels in academia, and not ruby? (Why is Perl
           | attrition relevant, when ruby was in fact explicitly based on
           | Perl?)
           | 
           | That's the question, not the answer!
           | 
           | It seems like a lot of the answer is NumPy, which makes the
           | question -- why did NumPy happen on python, not ruby?
           | 
           | Certainly one answer could be "nothing having to do with the
           | features of the language, it's just a coincidence, they chose
           | to write it in Python, if those working on numpy had chosen
           | to use ruby instead, history would be different."
           | 
           | But one hypothesis is that maybe NumPy wouldn't have been as
           | easy in ruby as python.
           | 
           | Someone else suggested the first numpy release happened
           | before the first ruby release, so that could also be an
           | answer.
        
             | [deleted]
        
             | calmdown13 wrote:
             | This episode of the lex fridman podcast gives a good
             | overview of how python's scientific computing community
             | developed. https://youtu.be/gFEE3w7F0ww
        
             | beagle3 wrote:
             | Nitpick: Numpy is the newest, revised and reconciled vector
             | library for Python; The first one was called "Numeric";
             | then there was "Numarray" which was not fully compatible,
             | which caused a bifurcated ecosystem; and then IIRC it was
             | Travis Oliphant who decided enough is enough, created Numpy
             | which was somehow magically backward compatible with both,
             | and reunited the community.
        
             | dalke wrote:
             | I was using Perl and Python in the 1990s for scientific
             | work.
             | 
             | Around 1993 I got hooked on Perl. I read the Perl book and
             | it was great. But 1) I couldn't figure out how to handle
             | complex data structures (this was Perl 4), and 2) I
             | couldn't embed it into other projects.
             | 
             | More specifically, I worked on a molecular visualization
             | program called VMD. It had its own scripting language. I
             | wanted a language to embed in VMD that was usable by my
             | grad student users. This is when I first learned about
             | Python, but I chose Tcl because it fit the existing command
             | language almost perfectly.
             | 
             | At around the same time, UCSF started embedding Python for
             | their molecular visualization package, Chimera, so it was
             | already making in-roads in structural biology.
             | 
             | I later (1997) went into more bioinformatics-oriented work,
             | where I did a lot of Perl. I tried out one implementation
             | (a Prosite pattern matcher) in Perl - which took me reading
             | an advanced Perl book to learn how Perl 5 objects worked. I
             | then tried the same in Python, a language I wasn't as
             | familiar with. And it was just so much easier!
             | 
             | At this time Perl was THE language for bioinformatics, but
             | I thought it was a difficult language for complex data
             | structures. (Bioinformatics at that time was mostly string
             | related, plus CGI and databases - Perl was a great fit.)
             | 
             | I then moved over (1998) to cheminformatics, working more
             | directly on molecular graphs. Python was a much better fit
             | for those data structures than Perl. I started using Python
             | full-time, and it's been that way since.
             | 
             | We used a third-party commercial package for the underlying
             | cheminformatics called the Daylight toolkit. It had C and
             | Fortran bindings. Someone else had already written the SWIG
             | configuration to generate Perl, Python, and Tcl bindings,
             | but these still meant manual garbage collection.
             | 
             | I was able to use __getattr__, __setattr__, and __del__ to
             | turn these into a natural-feeling high-level API, hooked
             | into (C)Python's reference-counted garbage collector.
             | 
             | I presented a couple of talks about this work, got an
             | article in Dr. Dobb's (!) and got consulting work helping
             | companies which either had existing Python work, or were
             | moving to Python.
             | 
             | By contrast, I don't think I heard about Ruby until 2000 or
             | so, years after Python started entering structural
             | biology/cheminformatics. [1]
             | 
             | I wasn't particularly cutting edge - others had already
             | developed tools like SWIG, which was because Beazley and
             | others were using Python at LANL. Numeric Python started in
             | part because of work at LLNL and other research
             | organizations. The concept already firmly established was
             | that Python would be used to "steer" a high-performance
             | kernel.
             | 
             | And Python in turn changed, to better reflect the needs of
             | numeric computing, in particular, the "..." notation in
             | array slices was added to make matrix operations easier.
             | (This was 20 years before '@' was added to simplify matrix
             | multiplication.) I believe the needs of numeric computing
             | also influenced the change to "rich" comparisons.
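
Both pieces of syntax mentioned above are plain language features that any class can hook into, which the following sketch shows without NumPy. `Probe` and `Mat` are made-up classes for illustration only.

```python
class Probe:
    """Returns whatever Python passes to __getitem__ for a subscript."""

    def __getitem__(self, key):
        return key


p = Probe()
print(p[..., 0])    # (Ellipsis, 0)
print(p[1:3, ::2])  # (slice(1, 3, None), slice(None, None, 2))


class Mat:
    """Tiny matrix type that supports the @ operator via __matmul__."""

    def __init__(self, rows):
        self.rows = rows

    def __matmul__(self, other):
        # Row-by-column products; zip(*other.rows) iterates over columns.
        return Mat([[sum(a * b for a, b in zip(row, col))
                     for col in zip(*other.rows)]
                    for row in self.rows])


product = Mat([[1, 2], [3, 4]]) @ Mat([[0, 1], [1, 0]])
print(product.rows)  # [[2, 1], [4, 3]]
```

Array libraries implement these same hooks, so the language only had to supply the syntax, not the numerics.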
             | 
             | This all took place around the time Matz started developing
             | Ruby. Python had a clear head-start. And except for
             | bioinformatics, Perl never had much presence in the fields
             | I worked in.
             | 
             | So:
             | 
             | > why did python's use in scientific computing keep
             | expanding, and not ruby's?
             | 
             | Because Python was in-use several years before Ruby, and
             | already rather visible as one of the three main languages
             | to consider in that space (Tcl and Perl being the other
             | two).
             | 
             | > Why was python already taking over many use cases by the
             | 2000s, but not ruby?
             | 
             | Because people didn't really know about Ruby, while Python
             | already had a pretty large user community. Probably also
             | because Python's work was all in English, while a lot of
             | the Ruby community was using Japanese.
             | 
             | > Why is Perl attrition relevant, when ruby was in fact
             | explicitly based on Perl?
             | 
             | Perl attrition started before Ruby was much known. The
             | complexity of the language, and the cumbersome need to
             | roll-your-own OO, made it difficult for me to recommend to
             | the typical software developers I work with - grad students
             | and researchers in the physical sciences with little formal
             | training in CS. Python by comparison was easier to pick up.
             | 
             | So a language which is explicitly based on Perl also picks up
             | that negative impression.
             | 
             | (FWIW, I think Tcl is an easier language to start with than
             | Python.)
             | 
             | > why did NumPy happen on python, not ruby?
             | 
             | Numeric computing in Python started before Ruby was much
             | known. Quoting https://en.wikipedia.org/wiki/NumPy
             | 
             | """In 1995 the special interest group (SIG) matrix-sig was
             | founded with the aim of defining an array computing
             | package; among its members was Python designer and
             | maintainer Guido van Rossum, who extended Python's syntax
             | (in particular the indexing syntax[8]) to make array
             | computing easier."""
             | 
             | Quoting
             | https://en.wikipedia.org/wiki/Ruby_(programming_language)
             | 
             | """The first public release of Ruby 0.95 was announced on
             | Japanese domestic newsgroups on December 21, 1995. ... In
             | 1997, the first article about Ruby was published on the
             | Web. ... In 1999, the first English language mailing list
             | ruby-talk began, which signaled a growing interest in the
             | language outside Japan."""
             | 
             | [1] Ha! I found a comment I made in 2003 saying I had
             | looked into Ruby "a few years ago", at
             | https://groups.google.com/g/comp.lang.python/c/xBWUWWWV5RE/m...
             | I also wrote:
             | 
             | """ I think my criteria for selecting Python over Perl is
             | still true for Python over Ruby, in that it has too many
             | special characters (like @ and the built-in regexpes),
             | features (like continuations and code blocks) which are
             | hard to explain well (I didn't understand continuations
             | until the Houston IPC), and 'best practices' (like
             | modifying base classes like strings and numbers) which
             | aren't appropriate for large-scale software development."""
        
             | jaimebuelta wrote:
             | I think the difference is in the community. I've used both
             | Python (extensively) and Ruby (a little bit). While the
             | capacities of the languages are relatively similar, the
             | people around the languages, at least the ones creating
             | packages and driving the discussion in conferences are
             | actually quite different, for some reason.
             | 
             | People attracted to Ruby are mostly of an "artistic
             | mindset", they want to be expressive, write code that
             | doesn't look like programming code and using "magic" like
             | dynamically created methods, monkey-patching, etc is
             | accepted or even encouraged.
             | 
             | On the other hand, Python attracts more people with
             | "engineering mindset", they like straightforward code
             | that's readable, clear and understandable, even if it's not
             | as expressive. "Magic" elements are frowned upon: for
             | example, imports are explicit and always included in each
             | file.
             | 
             | Obviously, I'm exaggerating it, but I think there is a clear
             | differentiation between the communities.
             | 
             | My guess is that the "Python mindset" got into creating
             | better integrations for "engineering applications", like
             | NumPy or SciPy, and that created some positive feedback in
             | certain environments. The main strength of Python is its
             | rich ecosystem of third party packages. There's a
             | compounding effect, making it grow faster and faster.
        
               | kstrauser wrote:
               | I think that's exactly it, and that there's much less
               | understanding required to start reading and writing
               | Python code. Ruby has some beautiful features, but they
               | make it much less clear to newbies who are trying to
               | figure out what on earth's going on.
        
         | masklinn wrote:
         | > As a rubyist, it makes me sad that python ended up here
         | rather than ruby. And I sometimes wonder why.
         | 
         | Work on numerical packages and scientific computing started
         | almost as soon as the language did, for instance the origins of
         | Numpy lie in the Numeric package which was introduced in 1995.
         | 
         | And the core team introduced several niceties at the behest of
         | the scientific community (advanced slicing for instance, more
         | recently the matmul operator).
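         | 
         | A minimal sketch of what those two hooks look like from the
         | library side, using only the standard library. The Grid class
         | and its methods are hypothetical, written for illustration;
         | NumPy implements the same protocols in compiled code.

```python
class Grid:
    """Toy 2-D container (hypothetical, for illustration only)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __getitem__(self, key):
        # grid[a:b, c:d] arrives as a tuple of slice objects; this
        # multi-dimensional form is what numeric libraries build on.
        row_sel, col_sel = key
        return Grid(r[col_sel] for r in self.rows[row_sel])

    def __matmul__(self, other):
        # The @ operator (PEP 465, Python 3.5+) dispatches here.
        cols = list(zip(*other.rows))
        return Grid(
            [sum(a * b for a, b in zip(row, col)) for col in cols]
            for row in self.rows
        )


a = Grid([[1, 2], [3, 4]])
b = Grid([[0, 1], [1, 0]])

product = (a @ b).rows      # matrix product via __matmul__
corner = a[0:1, 1:2].rows   # row 0, column 1 via a tuple of slices
```

         | The point of the sketch is that both notations are plain
         | language protocols, so a numeric library can adopt them without
         | any special support from the interpreter.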
        
         | jasode wrote:
         | _> So I wonder, was it easier to write C/C++ or fortran
         | compiled extensions in python than it was in ruby?_
         | 
         | Don't know about technical aspects of "easier" but it may have
         | simply been an accident of history.
         | 
         | E.g. in 1995 (before Ruby 1.0 December 1996[0]), David Beazley
         | was already wrapping C Language code to Python. Deep link to
         | presentation: https://youtu.be/riuyDEHxeEo?t=52m27s
         | 
         | So DB's Python code for scientific code was released in Feb
         | 1996 and presented in July 1996. Python being released in 1991
         | was already talked about in magazines in 1995. David's
         | presentation also references Jim Hugunin[1] and he authored the
         | _1995 Numeric package_ which was the ancestor to NumPy. Once an
         | ecosystem gets started, it can attract more mindshare and
         | snowball into an insurmountable lead that neither Ruby nor
         | Julia will ever catch up to.
         | 
         | In other words... If the opposite timeline happened and Ruby
         | was released earlier in 1991 and Python later in 1996, things
         | may have played out differently.
         | 
         | So folks like David Beazley and Jim Hugunin chose Python as the
         | scripting host language for their C Language code probably
         | because Ruby wasn't mature and well-known back in 1995.
         | Apparently, Ruby didn't widely spread outside of Japan until
         | 1998 when the first documentation in English appeared.[2]
         | 
         | [1] https://youtu.be/riuyDEHxeEo?t=30m04s
         | 
         | [2] http://blog.nicksieger.com/articles/2006/10/20/rubyconf-
         | hist...
        
         | civilized wrote:
         | Timing could be a factor. Python was released in 1991. Numeric,
         | the ancestor of NumPy, followed in 1995, the same year Ruby was
         | released. So Python already had its hooks into scientific
         | computing before Ruby even started.
        
         | adw wrote:
         | Fortran interop (f2py in particular) was a significant factor,
         | and as soon as you get one thing (in this case LAPACK and BLAS
         | bindings) it snowballs. Also, Python is significantly more
         | initially familiar for informal programmers and that's
         | critical; the hard part of learning a language is often
         | believing that you can - and Ruby looks weirder than Python, so
         | it makes people doubt themselves.
        
         | guidoism wrote:
         | In 2009 I began writing code for a new company doing natural
         | language processing. I was _the_ engineer at the time and got
         | to pick my tools. I started with Ruby because I was sick of C++
         | and Perl and Ruby looked like the future. But I soon discovered
         | the NLTK and then Numpy and so I started playing around in
         | Python. I never again wrote a line of Ruby... until the front
         | end devs we hired later threw a fit over not being able to use
         | Rails.
         | 
         | It was clear at the time that there basically was no non-web
         | Ruby community. Ruby was Rails and Rails was Ruby. Ruby had a
         | nice little niche in 2009 but Python had Numpy and there
         | were a lot of ML people doing lots of math and Ruby wasn't
         | going to cut it unless they wrote their own libraries, which
         | wasn't worth the effort since Python and Numpy already existed
         | and already had a growing community behind them.
        
         | it_does_follow wrote:
         | > And I sometimes wonder why.
         | 
         | Numpy.
         | 
         | I honestly think it all boils down to numpy being developed
         | long before matrix libraries became a standard part of software
         | development.
         | 
         | Ruby's early "killer app" (remember that term?) was Rails. Even
         | to this day there is almost no major code out there built in
         | Ruby that isn't ultimately related to building CRUD web apps.
         | While Ruby may be losing popularity now, it moved the web-
         | development ecosystem ahead in the same way that Python has
         | moved the scientific computing world ahead.
         | 
         | 20 years ago, if you wanted to use open source tools to write
         | performant vector code, there was Python and a handful of OSS
         | clones of commercial products. Given that Python was also
         | useful for other programming tasks in a way that, say,
         | Matlab/Octave is not, it was the choice for more sophisticated
         | programmers who wanted an OSS solution and needed to do
         | scientific computing. This created a positive feedback loop
         | that persists to this day.
         | 
         | Given that Python remains a decent language relative to its
         | contemporary peers and it has a massive and still growing
         | library of numerical computing software it is extremely
         | unlikely to be dethroned, even by promising new languages like
         | Julia.
         | 
         | Even to this day there is nothing even close to numpy in Ruby.
         | I do DS work in an org that is almost entirely Ruby, but we
         | still use Python without question because we know that
         | re-implementing all of our numeric code in Ruby would be a
         | fool's errand.
         | 
         | If Ruby had had early support for matrix math, it wouldn't have
         | surprised me if it had replaced Python.
        
           | jrochkind1 wrote:
           | I think it's clear numpy is a huge part of it.
           | 
           | But that begs the question -- why did numpy develop in python
           | and not ruby?
           | 
           | The rest of the thread offers some suggestions though. One is
           | simply that python was born first, and got the numpy
           | precursor before ruby 1.0 even happened. Which seems like a
           | thing.
        
             | jasonwatkinspdx wrote:
             | Ruby had a numpy style library since the early 00's, I
             | forget exactly when. But it never got the kind of momentum
             | numpy and the Python ecosystem surrounding it did.
             | 
             | Lots of comments in this thread from people whose Ruby
             | experience is only from the post Rails era after ~2008, and
             | don't understand that the post Rails culture wasn't really
             | a thing when Python was first gaining momentum for
             | scientific computing.
        
         | largbae wrote:
         | Readability, 100%. I have programmed in large projects in both
         | Python and Ruby.
         | 
         | Ruby is very productive to write, because everything and the
         | kitchen sink is at your fingertips at all times.
         | 
         | But because of Ruby's many ways to skin a cat, everyone's code
         | is very different. Add to that the penchant for domain-specific
         | sub-languages in Ruby: new syntaxes that you might have to
         | learn half a dozen of to integrate a large project, all of
         | which end up being more limiting than if you could just, you
         | know, write Ruby.
         | 
         | Contrast with Python, which goes so far in normalizing as to
         | have a language-wide coding standard in PEP8. Python has its
         | problems; package management and distribution are still ugly,
         | for example. But I can read any project I find and understand
         | it without loads of context.
        
           | pwnna wrote:
           | A language like Ruby can be very productive for someone who
           | has climbed the learning curve to learn all its ins and outs.
           | However, in my experience, this turns into a large
           | productivity drain the moment someone else (who is less of an
           | expert) has to touch it.
           | 
           | For large projects with multiple developers, readability
           | should win over writability every time: most code is read
           | more than it is written (my hypothesis). You can see
           | evidence of this, given the success of languages like Python
           | and Go.
           | 
           | That said, for scientific compute, in a lot of cases,
           | writability matters way more, as your job as a scientist is
           | to produce results as fast as possible, code quality be
           | damned. However, only a small number of scientists are expert
           | developers who have climbed the learning curve and can write
           | code with ease. The vast majority of them are junior at best,
           | and Python's approachability (which is rooted in its
           | readability, of course) wins. With most of the people using
           | Python, the ecosystem develops and there are no other viable
           | alternatives. In the long run, I suspect even languages like
           | MATLAB and Mathematica will die out as the open source stack
           | becomes more mature and (eventually, if not already)
           | significantly more capable. Julia might be a wildcard due to
           | its (potential) performance advantages, but the aesthetics of
           | the programming language are simply not on the minds of 99% of
           | the scientific compute users out there.
        
           | civilized wrote:
           | My impression is that Perl, Ruby, and Lisp all suffer from
           | this issue.
           | 
           | Even proponents will say things like "this language is so
           | expressive, I feel so productive in it, but people do such
           | idiosyncratic and clever things that it's hard for anyone
           | else but the author to understand".
           | 
           | That sort of "solo rockstar" programming culture doesn't
           | really lend itself to large-scale FOSS projects, which need
           | to be inviting to wide participation.
        
           | jrochkind1 wrote:
           | I am totally interested in hearing opinions from people who
           | have done serious hours of programming in both ruby and
           | python, as to readability comparison.
           | 
           | If it's just people who have done a lot of work in ruby and
           | little in python saying ruby is more readable, and vice
           | versa, I don't find it very useful even anecdotally.
        
             | pwnna wrote:
             | As someone who has spent a lot of time in both Ruby and
             | Python (and specifically a lot of hours in the Python
             | scientific compute stack), I would say that Python is
             | significantly more readable. Python is also significantly
             | easier to teach as opposed to Ruby, especially if the
             | target audience already has a bit of programming experience
             | (from MATLAB, or other courses).
             | 
             | I suspect the main reasons are:
             | 
             | 1. Python's guiding philosophy of "There should be one--
             | and preferably only one --obvious way to do it". With later
             | additions to the language, this is getting less true (3
             | different ways to format strings, asyncio, type hinting,
             | etc). Some libraries also don't conform to this
             | (matplotlib). That said, it's a lot better than the Ruby
             | code I've encountered, which is like the wild west.
             | 
             | 2. Python's syntax is reasonably simple to teach. The
             | object model could be condensed into something very simple
             | if you don't need a lot. With very basic knowledge, you can
             | go a long way. Ruby's a bit more chaotic with things like
             | inheritance, extend, and include; proc, block, lambda;
             | having to use attr_accessor; syntax things like a = b could
             | be a function call or not; if/unless; and many more things
             | that are confusing.
             | 
             | 3. Even basic things like loops are not idiomatic in Ruby,
             | as it wants to apply a function/block instead. Beginners,
             | especially those with a bit of background, like their loops
             | better than functional programming.
             | 
             | Even after many years working on Ruby code bases, I still
             | get lost all the time. Python in my experience has been a
             | lot better, although recent Python versions have regressed
             | a bit as it introduced more syntax to do the same things.
        
         | blondin wrote:
         | not sure. there are many factors that contributed to python's
         | success.
         | 
         | i discovered the language in 98 or 99. it came with some
         | obscure linux distribution and the tkinter module stood out for
         | me. it showed pretty scientific graphs and charts. but the
         | language has had to reinvent its community many times since then.
         | 
         | my intuition is that it was popular in europe in the scientific
         | community. not sure i can say the same for ruby.
        
       | sega_sai wrote:
       | As many already noticed, the rise of Python is not counter-
       | intuitive at all. (I'm a scientist myself).
       | 
       | Basically, modern Python offers you a spectrum, from Python
       | programs that are easy to understand and quick to write (those
       | will be slow) to pure glue code that connects a lot of
       | high-performance C/C++/Fortran code. Many scientists will start
       | from pure Python code with the help of numpy. In many cases it
       | will be good enough. But if needed you can always interface with
       | other libraries, or write high-performance C/C++/Fortran code
       | yourself for the most performance-critical bit, and use Python
       | to glue it together. That flexibility, where you can trade speed
       | of writing the code against speed of execution, is very valuable.
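       | 
       | A rough sketch of the two ends of that spectrum, assuming NumPy
       | is installed. The function name dot_pure and the sample data
       | are made up for illustration; both versions compute the same
       | dot product.

```python
import numpy as np


def dot_pure(xs, ys):
    # Plain Python: easy to read, but every multiply and add runs
    # in the interpreter.
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


xs = [0.5 * i for i in range(1000)]
ys = [2.0 for _ in range(1000)]

slow = dot_pure(xs, ys)

# Same computation, with Python only gluing the inputs to NumPy's
# compiled routine.
fast = float(np.dot(np.asarray(xs), np.asarray(ys)))
```

       | The call site barely changes between the two, so the loop can
       | be swapped for compiled code only where profiling shows it
       | matters.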
        
         | chalst wrote:
         | At this point we can say that against the two criteria of a
         | spectrum from prototyping to heavy lifting and ease of
         | embedding external high-performance libraries, Julia is simply
         | better than Python. Julia does have two drawbacks of being tied
         | to the one, rather heavy metal, implementation and lacking the
         | wealth of libraries outside scientific computing.
        
       | StreamBright wrote:
       | Python is the common scripting language of C, C++, Fortran.
        
       | keskival wrote:
       | I don't think Python displaced Fortran in HPC as much as it
       | displaced Matlab (and Octave) and R in scientific computing.
       | 
       | Displacing Fortran was a side-effect of that trend, as now it
       | wasn't about productionizing Matlab code into Fortran, but Python
       | could do general purpose computing adequately as well.
        
       | d--b wrote:
       | Yeah it's counter-intuitive, and it's because it does not make
       | much sense.
       | 
       | Slowness is one thing, but the tooling is also clearly subpar
       | compared to languages of the same popularity, the dynamic typing
       | makes things difficult to maintain, the 2.7 vs 3 shit show etc.
       | etc.
       | 
       | The very fact that many smart people have been saying for years
       | that Python is a fairly bad tool for data analysis should at
       | least raise some people's eyebrows. But no, the entire field of
       | data science has decided that it knows better...
       | 
       | Good for them.
        
       | smitty1e wrote:
       | > Of course, If the best algorithm is known beforehand or the
       | manpower is not a problem, a lower level-language is probably
       | faster, but this is seldom the case in real life.
       | 
       | One is wary of one-dimensional analysis of anything in a software
       | context.
       | 
       | Who cares if the Fortran library runs like the blue blaze, if it
       | cannot be readily maintained?
        
         | Bostonian wrote:
         | It is possible to write maintainable modern Fortran without
         | gotos with small functions and subroutines. OOP with
         | inheritance and dynamic polymorphism is possible since the
         | Fortran 2003 standard.
        
       | fancyfredbot wrote:
       | I feel like python acts like a kind of bus in scientific
       | computing, connecting various high performance libraries and DSLs
       | together.
       | 
       | That said, this article's story of someone using the wrong
       | algorithm is a bad example in my view. Python hasn't succeeded
       | because people are more likely to use more efficient algorithms
       | due to easier experimentation; it has succeeded because of the
       | size of the ecosystem and the fact that such algorithms are
       | easily available.
        
       | photochemsyn wrote:
       | Python displaced a lot of very expensive proprietary software in
       | the biosciences arena. Ease of use was also a major factor, as
       | many bioscientists have relatively little background in
       | programming, but the ability to escape the world of expensive
       | restrictive software licenses was very attractive to the
       | scientific community, whose historical norms emphasize the open
       | sharing of methods and results:
       | 
       | > "A program that performs a useful task can (and, arguably,
       | should) be distributed to other scientists, who can then
       | integrate it with their own code. Free software licenses
       | facilitate this type of collaboration, and explicitly encourage
       | individuals to enhance and share their programs. This flexibility
       | and ease of collaborating allows scientists to develop software
       | relatively quickly, so they can spend more time integrating and
       | mining, rather than simply processing, their data."
       | 
       | https://journals.plos.org/ploscompbiol/article?id=10.1371/jo...
       | 
       | Now there isn't any area of molecular biology and biochemistry
       | that doesn't have a host of Python libraries available to assist
       | researchers with tasks like designing PCR strategies or searching
       | for nearest matches on up to x-ray crystallography of proteins.
        
       | jackjackk0 wrote:
       | I recommend one of the recent videos by David Beazley [1]. He
       | lived through and contributed to the rise of Python in scientific
       | computing firsthand in the 90s, and offers some interesting
       | insights. Plus he's always quite an entertainer.
       | 
       | [1] https://youtu.be/4RSht_aV7AU
        
       | amelius wrote:
       | New languages should always provide bindings to call into Python
       | modules, so you get the immediate benefit of the largest
       | ecosystem on the planet.
        
       | elil17 wrote:
       | It's amazing how often the author's point of "agility" arises in
       | real world circumstances. I'm not a programmer, but I use Python
       | a lot in my engineering job. There have been 3 times in the past
       | month where I got an order of magnitude speed up because SciPy
       | implements a very complex but highly efficient algorithm which I
       | would never have had time to deploy.
        
         | JuettnerDistrib wrote:
         | > There have been 3 times in the past month where I got an
         | order of magnitude speed up because SciPy implements a very
         | complex but highly efficient algorithm which I would never have
         | had time to deploy.
         | 
         | Yes. I feel like the author conflates the language with the
         | package ecosystem. Pure Python is pretty horrible for
         | scientific computing (3*[3]=[3,3,3] is about as
         | counterproductive to scientific computations as it gets), but
         | Numpy changes the semantics of those operations.
         | 
         | In other words, Python has an absolutely stellar package
         | ecosystem. There have been attempts to bring a package
         | ecosystem to C, but it never took off. However, I do wonder how
         | C would fare if it had.
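         | 
         | A small sketch of that change in semantics, assuming NumPy is
         | installed. The variable names are only for illustration.

```python
import numpy as np

# Built-in lists treat * as repetition and + as concatenation.
repeated = 3 * [3]
joined = [1, 2] + [10, 20]

# NumPy arrays overload the same operators to act elementwise.
scaled = 3 * np.array([3, 3, 3])
summed = np.array([1, 2]) + np.array([10, 20])
```

         | Here repeated is [3, 3, 3] and joined is [1, 2, 10, 20], while
         | scaled holds 9, 9, 9 and summed holds 11, 22.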
        
       | blunte wrote:
       | Python won because people who knew math/science domains only knew
       | Python (or it was the best they knew). And so they made libraries
       | for Python. And it propagated like many other bad ideas based on
       | ignorance.
       | 
       | Python is a miserably bad language for modern times. If you know
       | any of half a dozen other languages, then you understand.
       | 
       | There was a good essay, from Paul Graham?, about the ladder of
       | awareness of programming languages. Unfortunately I can't find it
       | now.
       | 
       | The point is, Python has won and is frankly terrible. It has
       | inconsistent features and an awkward OOP approach (in a time
       | when OOP is finally being recognized as bad itself), and it
       | seriously lacks basic language features, which are only
       | appearing as of 3.9 and 3.10.
       | 
       | Frameworks like Django and Django Rest Framework expand on these
       | bad ideas, creating monstrosities which make the PHP code of yore
       | look arguably decent.
       | 
       | Sadly, I don't think there's any way to kill this. The only
       | option is to vastly outperform the Python people and produce
       | reliable, readable, performant solutions in half the time and
       | beat them to market. Perhaps someday they will die off.
        
         | travisjungroth wrote:
         | > Python won because people who knew math/science domains only
         | knew Python.
         | 
         | This doesn't explain why they knew Python in the first place, a
         | pretty critical step. It reached popularity without a platform
         | mandate (JS, Swift) or corporate backing (Java, Go) so there's
         | something going on.
        
       | dekhn wrote:
       | Counter-intuitive? I picked it because it was the closest
       | scripting language to C (see the select and socket APIs for good
       | examples). And it had numeric array support early-on (making it
       | an attractive replacement for matlab).
        
       ___________________________________________________________________
       (page generated 2022-03-26 23:00 UTC)