[HN Gopher] Arraymancer - Deep learning Nim library
___________________________________________________________________
Arraymancer - Deep learning Nim library
Author : archargelod
Score : 185 points
Date : 2024-03-29 03:26 UTC (19 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| angusturner wrote:
| I would love for a non-python based deep learning framework to
| gain traction.
|
| My initial impression though is that the scope is very broad.
| Trying to be scikit-learn, numpy, and torch all at once seems like
| a recipe for doing none of these things very well.
|
| It's interesting to contrast this with the visions/aspirations of
| other new-ish deep learning frameworks. Starting with my
| favorite, Jax offers "composable function transformations +
| autodiff". Obviously there is still a tonne of work to do this
| well, support multiple accelerators etc. etc. But notably I think
| they made the right call to leave high level abstractions (like
| fully-fledged NN libraries or optimisation libraries) out of the
| Jax core. It does what it says on the box. And it does it really
| really well.
|
| TinyGrad seems like another interesting case study, in the sense
| that it is aggressively pushing to reduce complexity and LOC
| while still providing the relevant abstractions to do ML on
| multiple accelerators. It is quite young still, and I have my
| doubts about how much traction it will gain. Still a cool project
| though, and I like to see people pushing in this direction.
|
| PyTorch obviously still has a tonne of mind-share (and I love
| it), but it is interesting to see the complexity of that project
| grow beyond what is arguably necessary. (e.g. having a
| "MultiHeadAttention" implementation in PyTorch is a mistake in my
| opinion).
| ezquerra wrote:
| I am one of the Arraymancer contributors. I believe that what
| mratsim (Arraymancer's creator) has done is pretty amazing but
| I agree that the scope is quite ambitious. There's been some
| talk about separating the deep learning bits into its own
| library (which I expect would be done in a backwards compatible
| way). Recently we worked on adding FFT support but instead of
| adding it to Arraymancer it was added to "impulse"
| (https://github.com/SciNim/impulse) which is a separate, signal
| processing focused library. There is also Vindaar's datamancer
| (a pandas like dataframe library) and ggplotnim (a plotting
| library inspired by R's ggplot). The combination of all of
| these libraries makes nim a very compelling language for signal
| processing, data science and ML.
|
| Personally I'd like Arraymancer to be a great tensor library
| (basically a very good and ideally faster alternative to numpy
| and base Matlab). Frankly I think that it's nearly there
| already. I've been using Arraymancer to port a 5G physical
| layer simulator from Matlab to nim and it's been a joy. It's
| not perfect by any means but it's already very good. And given
| how fast nim's scientific ecosystem keeps improving it will
| only get much better.
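|
| For a sense of what that numpy-style usage looks like, here is a
| minimal sketch (proc and operator names as I understand the
| Arraymancer docs; treat the details as assumptions to verify):
|
|     import arraymancer
|
|     # build a 2x3 tensor from a nested array literal
|     let a = [[1.0, 2.0, 3.0],
|              [4.0, 5.0, 6.0]].toTensor()
|     echo a.shape            # [2, 3]
|     echo a * a.transpose()  # matrix product, 2x2
|     echo a.sum()            # 21.0
|     echo a[1, _]            # second row, shape [1, 3]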
| pjmlp wrote:
| I see Julia and eventually Mojo gaining more adoption than
| anything else, and that is without taking into account that JIT
| efforts have finally started to be taken more seriously by the
| Python community, after PyPy felt quixotic for so many years.
|
| There are also the JIT GPU efforts from Intel and NVIDIA into
| their APIs.
|
| Personally I would like to see more Java and .NET love, but the
| dynamic languages loved by the research community are where the
| game is at, which is also the reasoning behind Mojo after the
| Swift for Tensorflow failure.
|
| Naturally kudos to the Arraymancer effort, the more the better.
| machinekob wrote:
| Mojo is AOT, not JIT, or is there some sort of JIT in Mojo?
| (I'm not up to date with new developments there.) I feel like
| JIT is just kinda bad for deep learning: in most cases you
| want to "compile and optimise" your graph before the run and
| load it fast for running/training.
| ubj wrote:
| Mojo supports both AOT and JIT compilation:
|
| https://docs.modular.com/mojo/faq#is-mojo-interpreted-or-com...
| pjmlp wrote:
| You're mixing up my mention of Julia / Mojo as language
| alternatives with my reference to Python JITs.
| gcr wrote:
| Jax is so complex though! Autograd using bytecode reflection to
| inspect CPython's interpreter state to emit Jax's own front-end
| IR (jaxpr), the Jax-specific compiler (XLA) that lowers HLO
| down to at least three different implementation backends (CPU,
| TPU, GPU via CUDA, maybe more...). Then there's the JIT that
| Jax also brings to the table. All of that to make something
| that seems simple on the surface.
|
| You could say that Jax is trying to be numpy, Theano/sympy,
| PyPy/numba, and pyCUDA all at the same time.
|
| Both systems are trying to be so much. Perhaps the difference
| is Jax's focus on a narrower developer interface.
| CornCobs wrote:
| What Nim syntax is the `network: ...` block used to declaratively
| construct the neural networks? Is it a macro? Looks really neat!
| supakeen wrote:
| Yes, Nim macros can fiddle with the AST:
| https://nim-lang.org/docs/macros.html
|
| You can also see another (I think) neat example in `npeg`:
| https://github.com/zevv/npeg?tab=readme-ov-file#quickstart
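|
| A toy sketch of that idea (not Arraymancer's actual `network`
| implementation, just an illustration of compile-time AST
| rewriting): each `name: expr` entry in the block reaches the
| macro as an AST node it can transform.
|
|     import std/macros
|
|     # rewrite each `name: expr` entry into an echo statement
|     macro describe(body: untyped): untyped =
|       result = newStmtList()
|       for entry in body:
|         # `width: 3 + 4` arrives as a Call: ident "width" + a StmtList
|         let name = newLit(entry[0].strVal)
|         let value = entry[1][0]
|         result.add quote do:
|           echo `name`, " = ", `value`
|
|     describe:
|       width: 3 + 4    # prints "width = 7"
|       height: 2 * 5   # prints "height = 10"
|
| Arraymancer's `network` block works on the same principle, just
| generating real layer and forward-pass code instead of echo calls.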
| warangal wrote:
| It is a small DSL written using macros at
| https://github.com/mratsim/Arraymancer/blob/master/src/array....
|
| Nim has pretty great meta-programming capabilities, and
| Arraymancer employs some cool features like emitting CUDA
| kernels on the fly using standard templates, depending on the
| backend!
| logicchains wrote:
| Interesting that it "Supports tensors of up to 6 dimensions". Is
| it difficult to support an arbitrary number of dimensions, e.g.
| does Nim lack variadic generics?
| ElegantBeef wrote:
| It does not formally have variadics, but since it has tuples
| you can have pretend variadic generics.
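|
| A minimal sketch of that tuple trick (my own toy helper, not
| Arraymancer code): the tuple stands in for a variadic pack of
| dimensions.
|
|     proc numElements[T: tuple](shape: T): int =
|       ## product of the dimensions packed into the tuple
|       result = 1
|       for d in shape.fields:
|         result *= d
|
|     echo numElements((2, 3, 4))   # 24
|     echo numElements((10, 20))    # 200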
| mratsim wrote:
| Author here,
|
| Nim supports variadic generics; the 6-dimension cap is an
| arbitrary limitation so that the small shape and stride vectors
| that describe a tensor can be stack-allocated and fit in a cache
| line.
|
| Also at the time, Nim's default heap allocator was not compatible
| with OpenMP.
|
| Edit: it can be configured via a compile-time flag to 8 or 10
| or anything:
| https://github.com/mratsim/Arraymancer/blob/master/src%2Farr...
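|
| The general Nim pattern for such a flag looks roughly like this
| (hypothetical names, not Arraymancer's actual symbols):
|
|     # compile-time constant, overridable with -d:MaxRank=8
|     const MaxRank {.intdefine.} = 6
|
|     type TensorMetadata = object
|       len: int
|       # fixed-size array => stack-allocated and cache-friendly
|       data: array[MaxRank, int]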
| jononor wrote:
| 6 dimensions is sufficient for a dataset of 3D hyperspectral
| video (batch, time, x, y, z, channels). I think it will cover
| the vast majority of use cases :D
| lucidrains wrote:
| I've never worked with a project with more than 7 dimensions,
| yet
| wodenokoto wrote:
| Having grown up with JavaScript, Python and R, I'm kinda looking
| towards learning a compiled language.
|
| I've given a bit of thought to Rust, since Polars is native to
| it and I want to move away from pandas.
|
| Is nim a good place to go?
| FireInsight wrote:
| Nim as a language is a good place to go. The ecosystem is
| another story entirely. I suggest you search for the kinds of
| libraries you'd need and check their maintenance status, maybe
| do some example project to get a feel for the compiler and
| `nimble`.
| arc619 wrote:
| Native Nim libs are definitely nicer, but being able to
| output C/C++/JS/LLVM-IR with nice FFI means you can access
| those ecosystems natively too. It's one reason the language
| has been so great for me, as I can write shared Nim code that
| uses both C and JS libs (even Node) in the same project.
| ezquerra wrote:
| Definitely. Nim is a great language and coming from Python it
| might be the easiest compiled language for you to get into.
| aquova wrote:
| Nim is probably my favorite language for personal projects at
| the moment. I love the syntax and the tools available in the
| STL.
|
| However, the things I'm interested in don't require much use of
| 3rd party packages, but I'm told this is its current weakness.
| Granted, that can only be fixed if more people adopt it.
| miki123211 wrote:
| IMO, no language without a Jupyter kernel can ever be a serious
| contender in the machine learning research space.
|
| I was pretty skeptical of Jupyter until recently (because of
| accessibility concerns), but I just can't imagine my life without
| it any more. Incidentally, this gave me a much deeper
| appreciation and understanding of why people loved Lisp so much.
| An overpowered REPL is a useful tool indeed.
|
| Fast compilation times are great and all, but the ability to
| modify a part of your code while keeping variable values intact
| is invaluable. This is particularly true if you have large
| datasets that are somewhat slow to load or models that are
| somewhat slow to train. When you're experimenting, you don't want
| to deal with two different scripts, one for training the model
| and one for loading and experimenting with it, particularly when
| both of them need to do the same dataset processing operations.
| Doing all of this in Jupyter is just so much easier.
|
| With that said, this might be a great framework for deep learning
| on the edge. I can imagine this thing, coupled with a nice
| desktop GUI framework, being used in desktop apps that run such
| models. Things like LLM Studio, Stable Diffusion, voice changers
| utilizing RVC (as virtual sound cards and/or VST plugins), or
| even internal, proprietary models, to be used by company
| employees. Use cases where the model is already trained, you
| already know the model architecture, but you want a binary that
| can be distributed easily.
| pietroppeter wrote:
| Jupyter notebook is indeed very important. It mainly provides
| data scientists with two things: a literate programming
| environment (mixing text, code and outputs) and a way to hold
| the state of your data in memory (so that you can perform
| computations interactively).
|
| As a different take on literate programming, we have created a
| library and an ecosystem around it:
| https://github.com/pietroppeter/nimib
|
| For holding state, a Nim REPL (which is on the roadmap as a
| secondary priority after completing incremental compilation) is
| definitely an option.
|
| Another option could be to create a library/framework for
| caching large data and objects (or for serializing and
| deserializing them quickly). One way to see it could be to
| build something similar to Streamlit's cache (Streamlit indeed
| provides great interactivity).
| jononor wrote:
| I just had a look, and there does seem to be a Jupyter kernel
| at https://github.com/stisa/jupyternim
| pokipoke wrote:
| Elixir has https://livebook.dev/
| freedomben wrote:
| Elixir beating python in the machine learning wars, or at
| least becoming a competitive option, is something I dream of.
|
| Is anybody using Elixir for ML who could comment on the state
| of it? How usable is it now?
|
| Last I heard, for new projects/models/etc it was great, but
| so much existing stuff (that you want to reuse or expand on)
| is dependent on python, making it hard unless you are
| starting from scratch.
| jononor wrote:
| As someone who uses ML on embedded devices, it is great to see
| good alternatives in compiled languages. Nim seems like a very
| useful and pragmatic language in this regard. Certainly a huge
| step up from the C and C++ that are still very entrenched. I
| think that solid libraries for deep learning are something we
| will see in practically all programming languages. In 10 years a
| library covering the core use cases (of today) will be as
| standard as a JSON parser and a web server, for almost any
| ecosystem.
| elcritch wrote:
| Nim is also great on embedded devices, both Linux, RTOS, and
| barebone! Though ML on embedded would require the "Lazer" pure
| Nim backend the author has been working on. Well unless you can
| compile Blas for the embedded device.
___________________________________________________________________
(page generated 2024-03-29 23:01 UTC)