https://www.matecdev.com/posts/numpy-julia-fortran.html

# Julia: faster than Fortran, cleaner than Numpy

*Assembling a large matrix of complex exponentials.* I'm switching to Julia after running this comparison of Julia vs Numpy and Fortran, for performance and code simplicity.

Scientific Computing | Julia | Python

Last updated: 2021-06-15

Julia is a pretty new language which, among other things, aims to solve the so-called "two-language problem" in scientific computing. That is, we usually test ideas in a rapid-prototyping language like Matlab or Python, but when the testing is done and it's time to do some serious computation, we need to rely on a different (compiled) programming language.

Many tools exist to ease the transition, and wrapping Fortran libraries into Python has been my preference so far. For example, wrapping up some Fortran with F2PY seems like a very convenient way to use (and distribute) efficient Fortran code that anybody can run. I also keep track of various ways of using Fortran in Python in this post.

Now, Julia aims to solve this issue in a radical way. The idea is to use a single programming language that offers an interactive mode suitable for rapid prototyping, but that can also be compiled and executed at C/Fortran performance.

Some of the most basic characteristics of the alternatives under consideration in this article are:

| Language | Vectorization | Code readability | Multithreaded |
|----------|---------------|------------------|---------------|
| Numpy    | Manual        | Regular          | No            |
| Fortran  | Automatic     | Good             | Yes           |
| Julia    | Automatic     | Excellent        | Yes           |

## Julia vs Fortran vs Numpy: speed and code clarity summary

Here are my conclusions after running this benchmark:

* Numpy is restricted to single-threading in many cases, is hard to code and read, and can be much slower than the two alternatives.
* Fortran offers great performance with pretty simple code, but some wrapping is required to call Fortran from high-level languages like Python.
* Julia is faster and easier to write than Numpy, and is even able to beat gfortran's performance.

Honestly, I was shocked by what I found in my performance comparison. Julia is amazing. I didn't expect Julia to beat gfortran with compiler optimizations in my test, and it currently does so by a wide margin.

## Code readability: the impact of manual vectorization vs for loops

Opinions on code readability are, of course, subjective, but I want to explain why I think Numpy offers the worst experience of the three. So far, interpreted languages have required manual rewriting of loops as vector operations. With some practice, this becomes an easy task, but after having done so extensively for many years (first in Matlab, then in Python), I came to the conclusion that I hate to manually vectorize code. I prefer to write for loops and have the compiler vectorize them for me!

Consider the following Numpy code, for example:

```python
n = len(a)
M = np.exp(1j*k*np.sqrt(np.tile(a,(n,1))**2 + np.tile(a.reshape(n,1),(1,n))**2))
```

Can you tell what it does, at first sight? Can you reason about the code's performance? Can you even guess the dimensions of the output M?

Now, here is a nested for loop (in Fortran) which does the same as the above code:

```fortran
do j=1,n
  do i=1,n
    M(i,j) = exp( (0,1)*k * sqrt(a(i)**2 + a(j)**2) )
  end do
end do
```

And here is the mathematical formula it represents:

$$ M_{ij} = e^{i k \sqrt{a_i^2 + a_j^2}} $$

No wonder they named Fortran "FORmula TRANslator" in the first place. Interestingly, the Julia syntax is essentially the same in this regard. So even though I agree with the idea that Python is a beautiful language to work with, I don't think the same holds for Numpy. Yes, Numpy allows us to stay within a Python framework and get things done with simple one-liners (which is pretty neat anyway), but the resulting Numpy code becomes hard to read afterward.
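To make the comparison concrete, here is a small sketch of my own (not code from the original post) that checks the tile-based one-liner against a plain double loop. The function names and the value of `k` are illustrative choices:

```python
import numpy as np

# The tile-based one-liner from the text (with the sqrt from the formula),
# next to the equivalent explicit double loop, to confirm they agree.
def eval_exp_tiled(a, k):
    n = len(a)
    return np.exp(1j * k * np.sqrt(np.tile(a, (n, 1))**2
                                   + np.tile(a.reshape(n, 1), (1, n))**2))

def eval_exp_loop(a, k):
    n = len(a)
    M = np.empty((n, n), dtype=complex)
    for i in range(n):
        for j in range(n):
            # Direct translation of M_ij = exp(i*k*sqrt(a_i^2 + a_j^2))
            M[i, j] = np.exp(1j * k * np.sqrt(a[i]**2 + a[j]**2))
    return M

a = np.linspace(0, 2 * np.pi, 50)
print(np.allclose(eval_exp_tiled(a, 2.0), eval_exp_loop(a, 2.0)))  # prints True
```

As an aside, modern Numpy broadcasting (`a[None, :]**2 + a[:, None]**2`) avoids `np.tile` entirely and tends to be both faster and somewhat easier to read, though it is still a far cry from writing the plain loop.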
## My personalized micro-benchmark

I believe there is no such thing as a definitive benchmark. Running a series of completely unrelated tests on toy problems, and claiming that a certain language is overall the best because it does well in most of the tests, is meaningless if we are interested in certain specific and more realistic cases.

Rather than trying to rely on big, all-purpose benchmarks, I believe it is preferable to run, from time to time, some micro-benchmarks suited to our own projects. If we can predict what the major performance bottlenecks of our code will be, we can come up with a minimalistic but spot-on example, and create a specific performance test that suits our needs.

In my numerical projects, for example, I do a lot of array operations and I evaluate a lot of complex exponentials, so that's what this test is about. My benchmark was already described above; it's just: given a vector a, form the matrix

$$ M_{ij} = e^{i k \sqrt{a_i^2 + a_j^2}} $$

### Benchmark details and code

All tests were run on an Intel i7-4700MQ CPU (3.40 GHz, 6 MB of cache) with 8 GB of DDR3 RAM. The complete source code of the tests can be found here.
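For reference, the Numpy side of the benchmark can be sketched as follows. This is a minimal harness of my own, not the author's published script: it uses broadcasting rather than `np.tile`, the constant `100 + 1j` mirrors the `(100+im)` factor in the Julia code below, and the sizes are kept small so it runs quickly:

```python
import time
import numpy as np

# Build M_ij = exp(i*k*sqrt(a_i^2 + a_j^2)) over a grid a in [0, 2*pi]
# and time each evaluation for a range of sizes N.
def eval_exp(N, k=100 + 1j):
    a = np.linspace(0, 2 * np.pi, N)
    return np.exp(1j * k * np.sqrt(a[None, :]**2 + a[:, None]**2))

for N in (250, 500, 1000):
    t0 = time.perf_counter()
    A = eval_exp(N)
    dt = time.perf_counter() - t0
    print(f"N={N}: {dt:.4f} s, shape={A.shape}")
```

Note that with a complex k, the matrix entries decay in magnitude away from the origin, since the imaginary part of k contributes a real, negative exponent.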
For a quick reference, here is the complete Julia code:

```julia
function eval_exp(N)
    a = range(0, stop=2*pi, length=N)
    A = Matrix{ComplexF64}(undef, N, N)
    @inbounds Threads.@threads for j in 1:N
        for i in 1:N
            A[i,j] = exp((100+im)*im*sqrt(a[i]^2 + a[j]^2))
        end
    end
    return A
end

eval_exp(5)  # warm-up call, so compilation time is not measured below

print(string("running loop on ", Threads.nthreads(), " threads \n"))
for N in 1000:1000:10000
    @time begin
        A = eval_exp(N)
    end
end
```

[Figure: wall-clock times for the Numpy, Fortran, and Julia versions of the benchmark.]

In case the above figure is hard to read, here are the actual numbers (times in seconds):

| N     | Numpy | Fortran ST | Julia ST | Fortran MT | Julia MT |
|-------|-------|------------|----------|------------|----------|
| 1000  | 0.114 | 0.048      | 0.035    | 0.035      | 0.007    |
| 2000  | 0.204 | 0.191      | 0.126    | 0.054      | 0.031    |
| 3000  | 0.448 | 0.421      | 0.289    | 0.117      | 0.062    |
| 4000  | 0.778 | 0.756      | 0.554    | 0.205      | 0.116    |
| 5000  | 1.229 | 1.173      | 0.831    | 0.317      | 0.215    |
| 6000  | 1.741 | 1.686      | 1.108    | 0.456      | 0.274    |
| 7000  | 2.671 | 2.283      | 1.489    | 0.614      | 0.304    |
| 8000  | 4.190 | 2.994      | 1.948    | 0.884      | 0.402    |
| 9000  | 5.319 | 3.797      | 2.460    | 1.022      | 0.505    |
| 10000 | 6.847 | 4.895      | 3.039    | 1.353      | 0.616    |

Note: ST and MT stand for single-threaded and multithreaded, respectively.

Note that in all cases the fast-math optimization flag was used, which allows compilers to reorder mathematical expressions for speed, in a way that could break IEEE compliance.

### Multithreading efficiency

Interestingly, from the above table we can compute the threading efficiency, that is, the speed-up factor when moving from single-threaded to multithreaded execution: the single-threaded time divided by the multithreaded time. On a quad-core CPU, Julia's multithreading efficiency is about 4.9, whereas relying on gfortran and OpenMP we get about 3.6. So Julia is effectively taking advantage of hyperthreading, something which is rarely seen with OpenMP.

| N     | Fortran | Julia |
|-------|---------|-------|
| 1000  | 1.389   | 4.897 |
| 2000  | 3.520   | 3.976 |
| 3000  | 3.601   | 4.660 |
| 4000  | 3.682   | 4.769 |
| 5000  | 3.701   | 3.866 |
| 6000  | 3.697   | 4.036 |
| 7000  | 3.717   | 4.890 |
| 8000  | 3.386   | 4.844 |
| 9000  | 3.714   | 4.866 |
| 10000 | 3.618   | 4.928 |

## Additional performance tweaks in Julia

Surprisingly, the Julia performance story doesn't end here. It turns out that Julia has further tools to increase performance.
One such tool is LoopVectorization.jl. As suggested by Mason in this thread on the JuliaLang forum, at least the single-threaded version can be sped up quite significantly by resorting to that library.

I'll continue to dig deeper into how to squeeze the last bit of performance out of computer hardware by resorting to Julia. Stay tuned!

## Related posts

* Why is Fortran still used: I have a fair degree of experience with Fortran, and I believe there are solid reasons to use Fortran for scientific computing today.
* Will Julia Replace Fortran in HPC? After comparing Fortran with Julia on various grounds, I tend to think that, over time, Julia is on track to supersede Fortran, even in the HPC niche.

Copyright (c) Martin D. Maas PhD, 2021