[HN Gopher] Awkward: Nested, jagged, differentiable, mixed type,...
       ___________________________________________________________________
        
       Awkward: Nested, jagged, differentiable, mixed type, GPU-enabled,
       JIT'd NumPy
        
       Author : pizza
       Score  : 132 points
       Date   : 2021-12-16 08:45 UTC (14 hours ago)
        
 (HTM) web link (awkward-array.org)
 (TXT) w3m dump (awkward-array.org)
        
       | mathgenius wrote:
       | How does this work? Is it a C++ lib with python interface? Does
       | it use llvm for the JIT? ...
        
         | jpivarski wrote:
         | It's a C++ library with a Python interface. How much C++ and
         | how much Python is currently in flux
         | (https://indico.cern.ch/event/855454/contributions/4605044/):
         | we're moving toward having the majority of the library be in
         | Python with only the speed-critical parts in C++ behind a C
         | interface.
         | 
         | The main library has no JIT, so no LLVM, but there is a
         | secondary method of access through Numba, which has JIT and
         | LLVM.
         | 
         | The main method of access is to call array-at-a-time
         | ("vectorized") functions on the arrays, like NumPy. The types
         | of the data in the arrays can be defined at runtime, and are
         | stored in a columnar way, like Apache Arrow. This format allows
         | for operations to be performed on dynamically typed data at
         | compile-time speeds without specialized compilation--it can be
         | done with a suite of precompiled functions. The reason that's
         | possible is because these data structures are never
         | instantiated in the normal "record-oriented" way, they're sets
         | of integer arrays that are only interpreted as data structures,
         | and it's not hard to precompile a suite of functions that only
         | operate on integer arrays.
         | 
         | A 3.5 minute explanation of the columnar format and what a
         | columnar operation looks like can be found here:
         | https://youtu.be/2NxWpU7NArk?t=661
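A minimal illustration in plain NumPy (not the actual Awkward internals) of what "sets of integer arrays that are only interpreted as data structures" means: a jagged list-of-lists is stored as one flat content array plus an integer offsets array, and precompiled flat-array operations do all the work.

```python
import numpy as np

# Logical data: [[1.1, 2.2, 3.3], [], [4.4, 5.5]]
# stored columnar: list i spans content[offsets[i]:offsets[i+1]]
content = np.array([1.1, 2.2, 3.3, 4.4, 5.5])
offsets = np.array([0, 3, 3, 5])

# An elementwise operation touches only the flat content array; the
# structure (offsets) is untouched, so no per-object code is needed
# and no list/record objects are ever instantiated.
squared_content = content ** 2

# Nested lists are materialized only when a human wants to look:
nested = [squared_content[offsets[i]:offsets[i + 1]].tolist()
          for i in range(len(offsets) - 1)]
```

Because every operation is phrased as arithmetic on these integer and float arrays, one suite of precompiled functions covers data types that are only defined at runtime.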
        
       | radarsat1 wrote:
       | This looks incredibly useful for my current project! We have lots
       | of arrays of objects having trajectories of different lengths to
       | deal with, and I often end up having to use numpy with
       | dtype=object, which feels, well, "awkward". It can be very hard
       | to convince numpy arrays to have items that are actually lists,
       | but I like using the vectorized syntax. I've found it sometimes
       | necessary to have Pandas dataframes containing lists too, which I
       | also find awkward.
       | 
       | Just a note, I noticed that there is an unfortunate error in the
       | page describing the bike route calculations. Right in the box
       | where it's supposed to show how much faster things are after JIT,
       | instead a traceback is displayed ending with:
       | 
       | > ImportError: Numba needs NumPy 1.20 or less
       | 
       | I don't think this is intentional. This is in the output box just
       | above the phrase "But it runs 250x faster than the pure Python
       | code:". The box below it shows no timings.
       | 
       | I guess this is a danger of having a "live" backend to render and
       | serve documentation.
        
       | mlajtos wrote:
       | We are getting closer and closer to differentiable APL running on
       | GPU. I like this direction :)
        
         | FridgeSeal wrote:
         | It'd be a lot cooler if so much of the work wasn't so
         | inextricably linked to Python.
        
           | mlajtos wrote:
           | Well, Python was at the right place at the right time. I am
           | actually happy that array-oriented programming (via Numpy)
           | got mainstream. A lot of great ideas came from sheer number
           | of people doing tinkering and great engineering in this
           | domain. If only people caught on to Iverson's 'Notation as a
           | Tool of Thought' [0] and Backus' 'Can programming be
           | liberated from the von Neumann style? A functional style and
            | its algebra of programs' [1].

           | 
            | [0] https://dl.acm.org/doi/pdf/10.1145/358896.358899
            | [1] https://dl.acm.org/doi/pdf/10.1145/359576.359579
        
         | cl3misch wrote:
         | Not sure if you mean exactly this, but JAX is pretty close to
         | "differentiable APL running on GPU" I'd say.
        
           | mlajtos wrote:
           | Yes, I agree. Only if we could have a thin syntactic sugar on
           | top and start using algebra to compose stuff.
        
         | N1H1L wrote:
          | But jax.numpy can also do quite a bit of this.
        
         | ByThyGrace wrote:
         | Which GPU? On whose drivers? On which systems?
        
         | Certhas wrote:
         | Julia is there now. It seems absurd to me to build this on top
         | of Python. You are building everything as a DSL that needs to
         | be carefully crafted to never touch the data structures of the
         | host language. Changing from Python to Julia and suddenly being
         | able to use datastructures other than numpy arrays, and writing
         | things that should be for loops as for loops, was such a
          | relief. But I guess, as JavaScript has shown, with an infinite
         | amount of engineering resources dedicated to it you can build
         | anything in any language.
        
           | bertr4nd wrote:
           | I'm extremely sympathetic to this viewpoint. I spent the last
           | couple years building a JIT compiler for PyTorch, and a
           | constant battle was that people want to write "normal"
           | idiomatic Python, but those idioms work to defeat
           | optimizations, unless you go to really crazy lengths. Some of
           | the recent work in this space (torch.fx in particular) has
           | made things easier by allowing the user to "clean up" their
           | model, but it was definitely an eye-opening experience.
        
             | mpweiher wrote:
             | This sounds very interesting! Did you ever write something
             | up about the issues encountered?
        
               | bertr4nd wrote:
               | I have not, but I will certainly consider doing so!
        
         | servytor wrote:
         | I think Futhark[0] is close to that, no? Or what about Co-
         | DFNS[1]?
         | 
         | [0]: https://futhark-lang.org/
         | 
         | [1]: https://github.com/Co-dfns/Co-dfns
        
           | rowanG077 wrote:
           | Futhark doesn't support AD yet. But it's pretty close afaik.
        
       | visarga wrote:
       | How fast is it at loading and saving compared to Python objects
       | with JSON?
        
         | jpivarski wrote:
         | I've been doing some tests with JSON recently, so I have some
         | exact numbers for a particular sample. Suppose you have JSON
          | like the following:
          | 
          |     MULTIPLIER = int(10e6)
          |     json_string = b"[" + b", ".join([
          |         b'[{"x": 1.1, "y": [1]}, {"x": 2.2, "y": [1, 2]}, {"x": 3.3, "y": [1, 2, 3]}],' +
          |         b'[],' +
          |         b'[{"x": 4.4, "y": [1, 2, 3, 4]}, {"x": 5.5, "y": [1, 2, 3, 4, 5]}]'
          |     ] * MULTIPLIER) + b"]"
         | 
         | It's a complex structure with many array elements (30 million),
         | but those elements have a common type. It's 1.4 GB of
         | uncompressed JSON. If I convert it to a Parquet file (a natural
         | format for data like these), it would be 1.6 GB of uncompressed
         | Parquet. It can get much smaller with compression, but since
         | the same numbers are repeating in the above, compressing it
         | would not be a fair comparison. (Note that I'm using 3 bytes
         | per float in the JSON; uncompressed Parquet uses 8 bytes per
         | float. I should generate something like the above with random
         | numbers and then compress the Parquet.)
         | 
         | Reading the JSON into Python dicts and lists using the standard
         | library `json` module takes 70 seconds and uses 20 GB of RAM
         | (for the dicts and lists, not counting the original string).
         | 
         | Reading the Parquet file into an Awkward Array takes 3.3
         | seconds and uses 1.34 GB of RAM (for just the array).
         | 
         | Reading the JSON file into an Awkward Array takes 39 seconds
         | and uses the same 1.34 GB of RAM. If you have a JSONSchema for
         | the file, an experimental method (https://github.com/scikit-
         | hep/awkward-1.0/pull/1165#issuecom...) would reduce the reading
         | time to 6.0 seconds, since most of that time is spent
         | discovering the schema of the JSON data dynamically.
         | 
         | The main thing, though, is that you can compute the already-
         | loaded data in faster and more concise ways. For instance, if
         | you wanted to slice and call a NumPy ufunc on the above data,
          | like
          | 
          |     output = np.square(array["y", ..., 1:])
          | 
          | the equivalent Python would be
          | 
          |     output = []
          |     for sublist in python_objects:
          |         tmp1 = []
          |         for record in sublist:
          |             tmp2 = []
          |             for number in record["y"][1:]:
          |                 tmp2.append(np.square(number))
          |             tmp1.append(tmp2)
          |         output.append(tmp1)
         | 
         | The Python code runs in 140 seconds and the array expression
         | runs in 2.4 seconds (current version; in the unreleased version
         | 2 it's 1.5 seconds).
         | 
         | For both loading data and performing operations on them, it's
         | orders of magnitude faster than pure Python--for the same
         | reasons the same can be said of NumPy. What we're adding here
         | is the ability to do this kind of thing on non-rectangular
         | arrays of numbers.
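To make the slice-and-square comparison concrete, here is the pure-Python loop from the comment applied to a single repetition of the sample structure (plain Python and NumPy only, no Awkward required), so the semantics of `np.square(array["y", ..., 1:])` can be checked by hand:

```python
import numpy as np

# One repetition of the sample structure from the comment above.
python_objects = [
    [{"x": 1.1, "y": [1]}, {"x": 2.2, "y": [1, 2]}, {"x": 3.3, "y": [1, 2, 3]}],
    [],
    [{"x": 4.4, "y": [1, 2, 3, 4]}, {"x": 5.5, "y": [1, 2, 3, 4, 5]}],
]

# The loop equivalent of output = np.square(array["y", ..., 1:]):
# drop the first element of each "y" list, then square what remains.
output = []
for sublist in python_objects:
    tmp1 = []
    for record in sublist:
        tmp2 = [int(np.square(number)) for number in record["y"][1:]]
        tmp1.append(tmp2)
    output.append(tmp1)

# output == [[[], [4], [4, 9]], [], [[4, 9, 16], [4, 9, 16, 25]]]
```

The array expression produces the same nested result in one call, operating on the columnar buffers instead of looping over Python objects.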
        
       | rich_sasha wrote:
       | It would be nice to see some comparison to JAX.
       | 
       | Is it mostly that arrays in Awkward are, sort of, unstructured?
       | JIT-ed, ADed JAX on JSON-like structures, on GPU?
        
         | ivirshup wrote:
         | It's a ragged array that you can work with in python code, but
         | also Jax and numba code.
        
         | nmca wrote:
          | This isn't really comparable to JAX afaict - JAX offers only
          | masked dense computation, whereas this is first-class ragged
          | support with no mention of autodiff or JIT (outside of Numba).
        
           | hyperbovine wrote:
           | There is definitely a comparison to be made, because JAX gets
           | a big speed gain by embracing that limitation.
        
       | ivirshup wrote:
       | Don't be too scared by all the TODO's in the docs, most of that
       | stuff actually works - the docs just haven't been written yet.
       | Admittedly those docs have had TODOs on them for over a year.
       | 
       | To see how it works, you've got to look at the repo or any of the
       | talks. The developers are also quite active on the tracker, so do
       | just ask if something is unclear.
        
       | karjudev wrote:
       | Can I define a deep neural network as an Awkward array of
       | matrices of different sizes? It looks very promising to compute a
       | backpropagation step without ping-ponging with the Python
       | interpreter.
        
         | jpivarski wrote:
         | I hadn't considered an Awkward array being used _as_ a neural
         | network, and I would expect specialized neural network
          | frameworks to optimize it better: don't they fuse the matrices
         | representing each layer into a single outside-of-Python
         | computation? If they don't, I wonder why not.
         | 
         | Nevertheless, my ears perked up at "matrices of different
         | sizes," since that was something I implemented a while back
         | thinking it would find a use-case if it were well advertised.
         | 
         | You can matrix-multiply two arrays of matrices, in which all
         | the matrices have different shapes, as long as matrix "i" in
         | the first array is compatible with matrix "i" in the second
          | array. Like this:
          | 
          |     import awkward1 as ak
          | 
          |     lefts = ak.Array([
          |         [[1, 2],
          |          [3, 4],
          |          [5, 6]],
          |         [[1, 2, 3, 4],
          |          [5, 6, 7, 8]],
          |         [[1],
          |          [2],
          |          [3],
          |          [4]],
          |     ])
          | 
          |     rights = ak.Array([
          |         [[7, 8, 9],
          |          [10, 11, 12]],
          |         [[8, 10],
          |          [11, 12],
          |          [13, 14],
          |          [15, 16]],
          |         [[5, 6, 7]],
          |     ])
          | 
          | Matrix-multiplying them results in this:
          | 
          |     >>> (lefts @ rights).tolist()
          |     [
          |         [[ 27,  30,  33],
          |          [ 61,  68,  75],
          |          [ 95, 106, 117]],
          |         [[129, 140],
          |          [317, 348]],
          |         [[ 5,  6,  7],
          |          [10, 12, 14],
          |          [15, 18, 21],
          |          [20, 24, 28]]
          |     ]
         | 
         | The left[0]'s 3x2 times right[0]'s 2x3 results in the output's
         | 3x3, etc for i=1 and i=2. This is the kind of operation you
         | wouldn't even be able to talk about if you didn't have jagged
         | arrays.
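The per-pair semantics can be checked with plain NumPy (no Awkward needed): each `lefts[i] @ rights[i]` is an ordinary matrix multiplication, with no requirement that the shapes agree across pairs.

```python
import numpy as np

# The same pairs of matrices as above, as plain nested lists.
lefts = [
    [[1, 2], [3, 4], [5, 6]],
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    [[1], [2], [3], [4]],
]
rights = [
    [[7, 8, 9], [10, 11, 12]],
    [[8, 10], [11, 12], [13, 14], [15, 16]],
    [[5, 6, 7]],
]

# What the jagged (lefts @ rights) computes, spelled out pairwise:
products = [(np.array(l) @ np.array(r)).tolist()
            for l, r in zip(lefts, rights)]
```

The jagged version does this in a single vectorized call; the Python loop here is only to verify the expected output.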
        
           | jpivarski wrote:
           | I meant to add that autodiff is a feature we'll be working on
           | this upcoming spring, by using JAX arrays as the buffers.
           | Until we have autodiff, backpropagation won't be possible.
        
       | dheera wrote:
       | Is there a comparison to Numba?
        
         | stevesimmons wrote:
         | Awkward is a complement to Numba, not an alternative.
         | 
         | Awkward is the container for data. Like Numpy, it supports
         | storing data as low-level arrays rather than everything being a
         | "boxed" Python object. While Numpy is for regular arrays,
         | Awkward adds irregular arrays of JSON-like objects.
         | 
         | Numba is the computation layer. Its JIT compiler builds faster
         | code by specialising for specific concrete data types, rather
         | than Python allowing everything to be dynamically changed.
         | 
         | If Numba code is fed arrays of Numpy/Awkward data, your
         | computations get much faster.
         | 
         | So they are complementary, not alternatives.
        
       | moelf wrote:
       | glad it's well funded:
       | https://nsf.gov/awardsearch/showAward?AWD_ID=2103945
        
       | jpivarski wrote:
       | Hi! I'm the original author of Awkward Array (Jim Pivarski),
       | though there are now many contributors with about five regulars.
       | Two of my colleagues just pointed me here--I'm glad you're
       | interested! I can answer any questions you have about it.
       | 
       | First, sorry about all the TODOs in the documentation: I laid out
       | a table of contents structure as a reminder to myself of what
       | ought to be written, but haven't had a chance to fill in all of
       | the topics. From the front page (https://awkward-array.org/), if
       | you click through to the Python API reference (https://awkward-
       | array.readthedocs.io/), _that_ site is 100% filled in. Like
       | NumPy, the library consists of one basic data type, `ak.Array`,
       | and a suite of functions that act on it, `ak.this` and `ak.that`.
       | All of those functions are individually documented, and many have
       | examples.
       | 
       | The basic idea starts with a data structure like Apache Arrow
       | (https://arrow.apache.org/)--a tree of general, variable-length
       | types, organized in memory as a collection of columnar arrays--
       | but performs operations on the data without ever taking it out of
       | its columnar form. (3.5 minute explanation here:
       | https://youtu.be/2NxWpU7NArk?t=661) Those columnar operations are
       | compiled (in C++); there's a core of structure-manipulation
       | functions suggestively named "cpu-kernels" that will also be
       | implemented in CUDA (some already have, but that's in an
       | experimental stage).
       | 
       | A key aspect of this is that structure can be manipulated just by
       | changing values in some internal arrays and rearranging the
       | single tree organizing those arrays. If, for instance, you want
       | to replace a bunch of objects in variable-length lists with
       | another structure, it never needs to instantiate those objects or
       | lists as explicit types (e.g. `struct` or `std::vector`), and so
       | the functions don't need to be compiled for specific data types.
       | You can define any new data types at runtime and the same
       | compiled functions apply. Therefore, JIT compilation is not
       | necessary.
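A sketch of that idea in plain NumPy (a simplification of the `starts`/`stops` representation discussed further down, not the real implementation): slicing every variable-length list can be done by editing only the small structure arrays, never touching or copying the content.

```python
import numpy as np

# Logical data: [[0, 1, 2], [], [3, 4]], stored columnar:
# list i is content[starts[i]:stops[i]]
content = np.array([0, 1, 2, 3, 4])
starts = np.array([0, 3, 3])
stops = np.array([3, 3, 5])

# "Drop the first element of every list" (i.e. array[:, 1:]) only
# bumps the starts array, clamped so empty lists stay empty.
new_starts = np.minimum(starts + 1, stops)

# Materialize for inspection only:
sliced = [content[a:b].tolist() for a, b in zip(new_starts, stops)]
# sliced == [[1, 2], [], [4]]
```

Because only integer arrays change, the same precompiled routine works no matter what types the list elements have.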
       | 
       | We do have Numba extensions so that you can iterate over runtime-
       | defined data types in JIT-compiled Numba, but that's a _second_
       | way to manipulate the same data. By analogy with NumPy, you can
        | compute many things using NumPy's precompiled functions, as long
       | as you express your workflow in NumPy's vectorized way. Numba
       | _additionally_ allows you to express your workflow in imperative
        | loops without losing performance. It's the same way with Awkward
       | Array: unpacking a million record structures or slicing a million
       | variable-length lists in a single function call makes use of some
       | precompiled functions (no JIT), but iterating over them at scale
       | with imperative for loops requires JIT-compilation in Numba.
       | 
       | Just as we work with Numba to provide both of these programming
       | styles--array-oriented and imperative--we'll also be working with
       | JAX to add autodifferentiation (Anish Biswas will be starting on
       | this in January; he's actually continuing work from last spring,
       | but in a different direction). We're also working with Martin
       | Durant and Doug Davis to replace our homegrown lazy arrays with
       | industry-standard Dask, as a new collection type
       | (https://github.com/ContinuumIO/dask-awkward/). A lot of my time,
       | with Ianna Osborne and Ioana Ifrim at my university, is being
       | spent refactoring the internals to make these kinds of
       | integrations easier
       | (https://indico.cern.ch/event/855454/contributions/4605044/). We
       | found that we had implemented too much in C++ and need more, but
       | not all, of the code to be in Python to be able to interact with
       | third-party libraries.
       | 
       | If you have any other questions, I'd be happy to answer them!
        
         | pizza wrote:
         | Do you have some examples of slicing, masking, un-padding, and
         | (I suppose) "Haskell-like" ops, eg fmap, but also eg treemap,
         | vmap, pmap that are in Jax? Also grouping, cutting, and
         | interweaving, and also.. this is kind of a weird ask but
         | suppose I had an operation in pure assembly that is extremely
         | fast w two int64 parameters outputting one int64, what's the
         | easiest path for me to get awkward to apply that to two Arrays
         | and give me one Array back as output?
        
           | jpivarski wrote:
           | Okay, that's a lot of questions.
           | 
           | There are slice examples here, for all the different ways
           | these arrays can be sliced: https://awkward-
           | array.readthedocs.io/en/latest/_auto/ak.Arra...
           | 
           | That includes what I think you mean by "masking." (You mean
           | keeping only the array elements that are `true` in a boolean
           | array? There's another function we call ak.mask that keeps
           | all array elements, but replaces the ones that line up with
           | `false` with a missing value: https://awkward-
           | array.readthedocs.io/en/latest/_auto/ak.mask...)
           | 
           | If you have irregular-length lists and you want to make them
           | all the same length, that's padding, ak.pad_none:
           | https://awkward-
           | array.readthedocs.io/en/latest/_auto/ak.pad_... What's "un-
           | padding"?
           | 
           | Mapping is implicit, as it is in NumPy. If you use an Awkward
           | Array in any NumPy ufunc, including binary operators like
           | `+`, `-`, `*`, `==`, `<`, `&`, etc., then all the arrays will
           | be broadcasted and computed element-by-element. This is true
           | whether the data structure is flat or a deep tree. ("Array-
           | oriented" is a different style from "functional.")
           | 
           | There hasn't been much call for grouping yet--Awkward Array
           | is more like a NumPy extension than a Pandas extension--but
           | there is a way to do it by combining a few functions, which
           | is described in the ak.run_lengths documentation:
           | https://awkward-
           | array.readthedocs.io/en/latest/_auto/ak.run_...
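The run-lengths trick behind that grouping recipe can be sketched in plain NumPy (this illustrates the idea, not `ak.run_lengths` itself): in a sorted key array, the positions where the key changes are the group boundaries, which become the offsets of a jagged result.

```python
import numpy as np

keys = np.array([3, 3, 3, 5, 5, 9])     # sorted group keys, one per element
values = np.array([10, 20, 30, 40, 50, 60])

# Indices where the key changes mark group boundaries.
change = np.flatnonzero(keys[1:] != keys[:-1]) + 1
offsets = np.concatenate(([0], change, [len(keys)]))

# Materialize the groups for inspection:
groups = [values[offsets[i]:offsets[i + 1]].tolist()
          for i in range(len(offsets) - 1)]
# groups == [[10, 20, 30], [40, 50], [60]]
```

The `offsets` array is exactly the kind of structure array a jagged grouping produces, so no per-group Python objects are needed.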
           | 
           | For wrapping a function with an ABI interface, I think the
           | easiest way to do that would be to use Numba and ctypes.
            | 
            |     import ctypes
            |     import awkward as ak
            |     import numba as nb
            | 
            |     libm = ctypes.cdll.LoadLibrary("/lib/x86_64-linux-gnu/libm.so.6")
            |     libm_exp = libm.exp
            |     libm_exp.argtypes = (ctypes.c_double,)
            |     libm_exp.restype = ctypes.c_double
            | 
            |     libm_exp(0)    # 1.0
            |     libm_exp(1)    # 2.718281828459045
            |     libm_exp(10)   # 22026.465794806718
            | 
            |     @nb.vectorize([nb.float64(nb.float64)])
            |     def ufunc_exp(x):
            |         return libm_exp(x)
            | 
            |     array = ak.Array([[0.0, 1.1, 2.2], [], [3.3, 4.4], [5.5],
            |                       [6.6, 7.7, 8.8, 9.9]])
            | 
            |     ufunc_exp(array)   # calls libm_exp on every value in array,
            |                        # returns an array of the same structure
           | 
           | Numba's @vectorize decorator (https://numba.pydata.org/numba-
           | doc/latest/user/vectorize.htm...) makes a ufunc, and Awkward
           | Array knows how to implicitly map ufuncs. (It is necessary to
           | specify the signature in the @vectorize argument; otherwise,
           | it won't be a true ufunc and Awkward won't recognize it.)
           | 
           | When Numba's JIT encounters a ctypes function, it goes to the
           | ABI source and inserts a function pointer in the LLVM IR that
           | it's generating. Unfortunately, that means that there is
           | function-pointer indirection on each call, and whether that
           | matters depends on how long-running the function is. If you
           | mean that your assembly function is 0.1 ns per call or
           | something, then yes, that function-pointer indirection is
           | going to be the bottleneck. If you mean that your assembly
           | function is 1 ms per call and that's fast, given what it
           | does, then I think it would be alright.
           | 
           | If you need to remove the function-pointer indirection and
           | still run on Awkward Arrays, there are other things we can
           | do, but they're more involved. Ping me in a GitHub Issue or
           | Discussion on https://github.com/scikit-hep/awkward-1.0
        
         | riskneutral wrote:
         | Was there no way to do this in Apache Arrow, or with some
         | modifications to Arrow?
        
           | jpivarski wrote:
           | Naturally, we considered this! :) We needed a larger set of
           | tree node types than Apache Arrow in order to perform some of
           | these operations without descending all the way down a
           | subtree. For example, the implementation of the slice
           | described in the video I linked above requires list offsets
           | to be described as two separate arrays, which we call
           | `starts` and `stops`, rather than a single set of `offsets`.
           | So we have a ListArray (most general, uses `starts` and
           | `stops`) and a ListOffsetArray (for the special case with
           | `offsets`) and some operations produce one, other operations
           | produce the other (as an internal detail, hidden from high-
           | level users). Arrow's ListType is equivalent to the
           | ListOffsetArray. If we had been forced to only use
           | ListOffsetArrays, then the slice described in the video would
           | have to propagate down to all of a tree node's children, and
           | we want to avoid that because a wide record can have a lot of
           | children.
           | 
           | So Awkward Array has a superset of Arrow's node types.
           | However, one of the great things about columnar data is that
           | transformation between formats can share memory and be
           | performed in constant time. In the ak.to_arrow/ak.from_arrow
           | functions (https://awkward-
           | array.readthedocs.io/en/latest/_auto/ak.to_a...),
           | ListOffsetArrays are replaced with Arrow's list node type
           | with shared memory (i.e. we give it to Arrow as a pointer).
           | Our ListArrays are rewritten as ListOffsetArrays, propagating
           | down the tree, before giving it to Arrow. If you're doing
           | some array slicing and your goal is to end up with Arrow
           | arrays, what we've effectively done is delayed the evaluation
           | of the changes that have to happen in the list node's
           | children until they're needed to fit Arrow's format. You
           | might do several slices in your workflow, but the expensive
           | propagation into the list node's children happens just once
           | when the array is finally being sent to Arrow.
           | 
           | As far as what matters for users, Awkward Arrays are 100%
           | compatible with Arrow through the ak.to_arrow/ak.from_arrow
            | functions, and usually shares memory at O(1) cost, i.e.
            | constant rather than growing with the array length "n". When
            | it isn't shared memory
           | with O(1) conversion time, it's because it's doing
           | evaluations that you were saved from having to do earlier.
        
             | riskneutral wrote:
             | That sounds great! I'm not working in Python though, but I
             | am interested in the Awkward Array data structure. I'm
             | trying to find example code that just uses libawkward from
             | C++.
        
               | jpivarski wrote:
               | Then I need to warn you that libawkward.so is going away
               | --it didn't serve the purposes that we had for it. The
               | decision to downsize the C++ part is described in detail
               | here: https://indico.cern.ch/event/855454/contributions/4
               | 605044/
               | 
               | I think it would be great to develop this concept in
               | other languages, and I even tried to do some
               | standardization before implementation at the start of the
               | project, but when it came down to it, it had to be
               | implemented in some language. Even then, as users needed
               | specific functions, we added them directly in Python
               | whenever possible, since the development time of doing it
               | in C++ and only exposing it in Python was prohibitive.
               | Clement Helsens found this out when he tried to use the
               | C++ layer without Python (https://github.com/clementhelse
                | ns/FCCAnalyses/blob/awkward/a...); a lot of what he needed
               | wasn't there.
               | 
               | But the ideas are simple, and if you have a language in
               | mind, building up a similar implementation could be
               | pretty quick for a specific application--it's generality
               | that takes a long time. This page: https://awkward-
               | array.readthedocs.io/en/latest/ak.layout.Con... has links
               | to all the node types that we've found useful and each
               | link has a minimal, demo-style implementation--not the
               | real implementation with all its baggage. They show what
               | it looks like to _start_ implementing such a thing.
               | 
               | I should also point out that Joosep Pata implemented an
               | alternative Awkward Array:
               | https://arxiv.org/abs/1906.06242 and
               | https://github.com/hepaccelerate/hepaccelerate , not with
               | a different interface language, but using CUDA for a real
               | application faster than my group could get to it.
               | 
               | If the language you're talking about is Julia, I'm doubly
               | interested. In high-energy physics, there's a group of us
               | who think that Julia would be a great alternative to the
               | mix of Python and C++ we currently have. Despite the
               | comments above about Julia having this already, Awkward
               | Array is a rather different thing from JIT-compiled data
               | structures, and it would be as valuable in Julia as it is
               | in Python, least of all because GPUs have to be addressed
               | in a vectorized way.
        
               | ChrisRackauckas wrote:
               | > Despite the comments above about Julia having this
               | already, Awkward Array is a rather different thing from
               | JIT-compiled data structures, and it would be as valuable
               | in Julia as it is in Python, least of all because GPUs
               | have to be addressed in a vectorized way.
               | 
               | This is just done through multiple dispatch with type
               | parameters in Julia. Vector{Vector{Float64}} is an "array
               | of lists". Vector{Dict{Symbol,Float64}} is an "array of
               | records". Vector{Union{Missing,Vector{Float64}} is an
               | array of lists that can have missings, where the memory
               | and generated code is optimized for this union.
               | Vector{String} is a vector of strings. LazyArrays.jl
               | provides lazy array implementations, which satisfy the
               | AbstractArray interface so they can be used pretty much
               | anywhere a standard Array can be. Primitives like
               | flattening and grouping are just the built in
               | `reduce(hcat,x)` etc. which have dispatches to optimize
               | the operation to only allocate once. Etc.
               | 
               | As standard arrays, you can do all of the `sum`, `prod`,
                | broadcasting, linear algebra, `\\`, `transpose`, etc. on
               | these objects because of dispatches on `::AbstractArray`.
               | You can put CuArrays into these kinds of objects, which
               | is one of the building blocks that Flux.jl uses to
               | represent the neural network weights BTW. The entire
               | DataFrames.jl is built on such generically typed arrays,
               | so data filtering operations like Query.jl support these
               | types along with their unions with missings. And these
               | are the objects used across the whole language, so the
               | JIT of course applies to these objects.
               | 
               | Then if you want heterogeneous arrays that don't mix
               | implementation with representation, my favorite is
               | https://github.com/jonniedie/ComponentArrays.jl which are
               | arrays of arrays that are stored contiguously which
               | thusly work well with linear algebra. https://jonniedie.g
               | ithub.io/ComponentArrays.jl/dev/examples/... that's an
               | example of quickly implementing some neural networks and
               | using it with AD, and it all works with broadcast etc. to
               | build optimized GPU kernels.
               | 
               | I don't get what's missing? Seems like that's the list in
               | the documentation.
               | 
               | I think the bigger piece is the reversal of structure. In
               | Julia, additions to arrays are allowed by the user and
               | not hardcoded into the Base library. "Allowing GPUs" or
               | "complex numbers", or "dictionaries" as elements is not
               | special cased. If you create a type for numbers with
               | error bounds
               | (https://github.com/JuliaPhysics/Measurements.jl),
               | everything here still applies.
               | Vector{Dict{Symbol,Measurements}} generates the efficient
               | code. So by design it's impossible to enumerate all of
               | the ways a Julia array can be used, and that's why it's
               | built with multiple dispatch on parametric types. So
               | while I listed out the examples going down the
                | Awkward Array docs, note that by composability you also
               | get all of the combinations of the above, along with
               | combinations of any object a user builds in the language.
               | 
               | I commend the effort of hardcoding a lot of cases people
               | might use in Python, but man this really makes clear how
               | a general solution is usually the right idea after the
               | 3rd specialization.
        
       ___________________________________________________________________
       (page generated 2021-12-16 23:01 UTC)