[HN Gopher] Pandas 2.0 and the Arrow revolution
       ___________________________________________________________________
        
       Pandas 2.0 and the Arrow revolution
        
       Author : ZeroCool2u
       Score  : 256 points
       Date   : 2023-02-28 12:59 UTC (10 hours ago)
        
 (HTM) web link (datapythonista.me)
 (TXT) w3m dump (datapythonista.me)
        
       | hopfenspergerj wrote:
       | Obviously this improves interoperability and the handling of
       | nulls and strings. My naive understanding is that polars columns
       | are immutable because that makes multiprocessing faster/easier.
       | I'm assuming pandas will not change their api to make columns
       | immutable, so they won't be targeting multiprocessing like
       | polars?
        
         | ZeroCool2u wrote:
          | I think if anything pandas may get additional vectorized
          | operations. But from what I understand, Polars is almost
          | completely Rust code under the hood, which makes
          | multiprocessing much easier compared to dealing with all the
          | extensions and marshaling of data back and forth between Python
          | and C/C++ that pandas does.
        
         | kylebarron wrote:
         | pandas added a copy-on-write API in 2.0, so under the hood the
         | Arrow columns are still effectively immutable.
         | 
         | https://pandas.pydata.org/docs/dev/whatsnew/v2.0.0.html#copy...
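          | 
          | For reference, opting in looks roughly like this (a minimal
          | sketch, assuming pandas 2.0; the option is described in the
          | linked changelog):
          | 
          |     import pandas as pd
          | 
          |     pd.set_option("mode.copy_on_write", True)
          | 
          |     df = pd.DataFrame({"a": [1, 2, 3]})
          |     s = df["a"]               # shares memory with df (lazily)
          |     s.iloc[0] = 99            # with CoW this triggers a copy
          |     print(df["a"].tolist())   # [1, 2, 3] -- df is untouched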
        
       | [deleted]
        
       | est wrote:
       | Now can mysql/pg provide binary Arrow protocols directly?
        
         | lidavidm wrote:
         | For PostgreSQL: https://github.com/apache/arrow-flight-sql-
         | postgresql
         | 
         | Context:
         | https://lists.apache.org/thread/sdxr8b0lj82zd0ql7zhk9472opq3...
         | 
         | Also see ADBC, which aims to provide a unified API on top of
         | various Arrow-native and non-Arrow-native database APIs:
         | https://arrow.apache.org/blog/2023/01/05/introducing-arrow-a...
         | 
         | (I am an Arrow contributor.)
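          | 
          | For a flavor of ADBC from Python, a rough sketch (database URI
          | and table name are made up; see the blog post above for the
          | exact API):
          | 
          |     import adbc_driver_postgresql.dbapi as pg_dbapi
          | 
          |     with pg_dbapi.connect("postgresql://localhost/mydb") as conn:
          |         with conn.cursor() as cur:
          |             cur.execute("SELECT * FROM my_table")
          |             table = cur.fetch_arrow_table()  # results as a pyarrow.Table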
        
       | Gepsens wrote:
       | Pandas is still way too slow, if only there was an integration
       | with datafusion or arrow2
        
       | markers wrote:
       | I highly recommend checking out polars if you're tired of pandas'
       | confusing API and poor ergonomics. It takes some time to make the
       | mental switch, but polars is so much more performant and has a
       | much more consistent API (especially if you're used to SQL). I
        | find the code is much more readable and it takes fewer lines of
        | code to do things.
       | 
       | Just beware that polars is not as mature, so take this into
       | consideration if choosing it for your next project. It also
       | currently lacks some of the more advanced data operations, but
       | you can always convert back and forth to pandas for anything
       | special (of course paying a price for the conversion).
        
         | y1zhou wrote:
          | On a related note, if you come from the R community and are
          | already familiar with the `tidyverse` suite, `tidypolars` would
          | be a great addition.
         | 
         | https://github.com/markfairbanks/tidypolars
        
         | falcor84 wrote:
         | >... of course paying a price for the conversion
         | 
         | As the blog mentions, once Pandas 2.0 is released, if you use
          | Arrow types, converting between Pandas and Polars will
         | thankfully be almost immediate (requiring only metadata
         | adjustments).
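          | 
          | A minimal sketch of that round trip, assuming pandas 2.0 with
          | the pyarrow-backed dtypes and a recent polars (the
          | use_pyarrow_extension_array flag may differ by version):
          | 
          |     import pandas as pd
          |     import polars as pl
          | 
          |     pdf = pd.DataFrame({"x": [1, 2, 3]}, dtype="int64[pyarrow]")
          |     pldf = pl.from_pandas(pdf)   # pandas -> polars
          |     back = pldf.to_pandas(use_pyarrow_extension_array=True)
          |     # polars -> pandas while keeping Arrow-backed columns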
        
       | vegabook wrote:
       | I do a lot of...                 some_pandas_object.values
       | 
        | to get to the raw numpy array, because dealing with the raw np
        | buffer is often more efficient or ergonomic. Hopefully losing the
        | numpy foundations will not affect (the efficiency of) code which
        | does this.
        
         | hannibalhorn wrote:
         | I do the same. Sounds like arrow is being implemented as an
         | alternative, and you'll have to explicitly opt in via a dtype
         | or a global setting, so numpy isn't going away.
        
         | bsdz wrote:
         | Looks like they're recommending using ".to_numpy()" instead of
         | ".values" in dev (2.*) docs:
         | 
         | https://pandas.pydata.org/docs/dev/reference/api/pandas.Data...
        
           | nerdponx wrote:
           | .values has been deprecated for a long time now, with
           | .to_numpy() as its recommended replacement for just as long.
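            | 
            | The swap is mechanical, e.g.:
            | 
            |     import pandas as pd
            | 
            |     df = pd.DataFrame({"a": [1.0, 2.0]})
            |     arr_old = df["a"].values      # still works, but discouraged
            |     arr_new = df["a"].to_numpy()  # recommended spelling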
        
             | eigenvalue wrote:
             | Wow, had no idea about this. Hopefully they can just
             | automatically interpret .values as .to_numpy or else a ton
             | of old code is going to break.
        
               | nerdponx wrote:
               | That's what they've been doing for a long time as well!
        
         | __mharrison__ wrote:
         | I often use Numba for complicated looping code for Pandas
         | manipulation. Will be interesting to see how Arrow works with
         | Numba.
        
         | PaulHoule wrote:
         | ... and what about interoperability with scikit-learn, PyTorch,
         | etc?
        
           | nerdponx wrote:
           | None of that has changed.
        
       | jimmyechan wrote:
        | Yes! Swapping out NumPy for Arrow underneath! Excited to have
        | the performance of Arrow with the API of Pandas. Huge win for the
        | data community!
        
       | v8xi wrote:
       | Finally! I use pandas all the time particularly for handling
       | strings (dna/aa sequences), and tuples (often nested). Some of
       | the most annoying bugs I encounter in my code are a result of
       | random dtype changes in pandas. Things like it auto-converting
       | str -> np.string (which is NOT a string) during pivot operations.
       | 
        | There are also all sorts of annoying workarounds you have to do
        | when using tuples as indexes, because pandas converts them to a
        | MultiIndex. For example
       | 
       | srs = pd.Series({('a'):1,('b','c'):2})
       | 
       | is a len(2) Series. srs.loc[('b','c')] throws an error while
       | srs.loc[('a')] and srs.loc[[('b','c')]] do not. Not to vent my
       | frustrations, but this maybe gives an idea of why this change is
       | important and I very much look forward to improvements in the
       | area!
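        | 
        | For what it's worth, a small sketch of the Arrow-backed string
        | dtype that should help with the first issue (assuming pandas with
        | pyarrow installed; whether it fixes every pivot corner case is
        | another question):
        | 
        |     import pandas as pd
        | 
        |     seqs = pd.Series(["ACGT", "TTAG"], dtype="string[pyarrow]")
        |     print(seqs.dtype)      # string[pyarrow], not object
        |     print(seqs.str.len())  # vectorized string ops on the Arrow backend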
        
         | boringg wrote:
         | Oh yeah dealing with random dtype changes is a total PITA.
        
       | agolio wrote:
       | > As mentioned earlier, one of our top priorities is not breaking
       | existing code or APIs
       | 
        | This is doomed then. The Pandas API is already extremely bloated.
        
         | __mharrison__ wrote:
          | How so? Pandas is one of the most popular tools among folks
          | doing data work.
          | 
          | I admit that the API has issues (if/else? being the most
          | glaring to me); notwithstanding that, Pandas has mass adoption
          | because the benefits outweigh the warts.
         | 
         | (I happen to wish that 2.0 deprecated some of the API, but
         | Python 3 burned a deep scar that many don't wish to relive.)
        
           | kelipso wrote:
            | The only reason is that it's the de facto default for
            | Python, not the result of any cost-benefit analysis. You
            | see the same thing with matplotlib and numpy.
        
           | agolio wrote:
            | There are too many ways of doing the same thing (which I
            | assume is itself a relic of maintaining backward
            | compatibility), there are inconsistencies within the API,
            | there's "deprecated" stuff which isn't really deprecated, et
            | cetera.
            | 
            |     dataframe.column
            | 
            | vs
            | 
            |     dataframe['column']
            | 
            | is one example that comes to mind, but there is surely much
            | more.
           | 
           | I am of the philosophy of 'The Zen of Python'
           | There should be one-- and preferably only one --obvious way
           | to do it.
           | 
           | Pandas is a powerful library, but when I have to use it in a
           | workplace it usually gives me a feeling of dread, knowing I
           | am soon to face various hacks and dataframes full of NaNs
           | without them being handled properly, etc.
        
             | 0cf8612b2e1e wrote:
             | Which column format would you prefer? You need the latter
             | to address random loaded data which may contain illegal
             | identifiers. You need the former to stifle the rumblings
             | about Pandas verbosity.
             | 
             | I would get rid of the .column accessor, but you will see a
             | lot of pushback. Notably from the R camp.
        
               | agolio wrote:
                | I would also get rid of the .column accessor; it has the
                | potential to collide with function members of the
                | DataFrame, so it shouldn't have been added in the first
                | place IMO.
        
               | naijaboiler wrote:
                | Um nah. I can't imagine R folks being against .column
                | disappearing forever. There's no equivalent in R.
        
         | Wonnk13 wrote:
          | This is one of my largest pet peeves with Pandas. There are (or
          | were) like three APIs. Half the stuff on Stack Overflow or blogs
          | is from 2013-2015 and deprecated. I feel like I have to relearn
          | Pandas every four years since Wes launched it almost a decade
          | ago.
        
       | sireat wrote:
       | The changes to handling strings and Python data types are
       | welcome.
       | 
        | However, I am curious how Arrow beats NumPy on regular ints and
        | floats.
        | 
        | For the last 10 years I've been under the impression that int and
        | float columns in Pandas are basically NumPy ndarrays with extra
        | methods.
       | 
       | Then NumPy ndarrays are basically C arrays with well defined
       | vector operations which are often trivially parallelizable.
       | 
        | So how does Arrow beat NumPy when calculating
        | 
        |     mean (int64)    2.03 ms   1.11 ms   1.8x
        |     mean (float64)  3.56 ms   1.73 ms   2.1x
       | 
       | What is the trick?
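        | 
        | (For anyone who wants to reproduce this, a rough sketch of the
        | comparison, assuming pandas 2.0 with pyarrow installed; exact
        | numbers will vary by machine:)
        | 
        |     import timeit
        |     import numpy as np
        |     import pandas as pd
        | 
        |     s_np = pd.Series(np.random.rand(10_000_000))   # NumPy float64 backend
        |     s_pa = s_np.astype("float64[pyarrow]")         # Arrow backend
        | 
        |     print(timeit.timeit(s_np.mean, number=100))
        |     print(timeit.timeit(s_pa.mean, number=100))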
        
         | kylebarron wrote:
         | Maybe due to SIMD?
        
         | fred123 wrote:
         | SIMD, multithreading, better handling of NaN, maybe other
         | factors.
        
       | college_physics wrote:
       | My feeling is that the pandas community should be bold and
       | consider also overhauling the API besides the internals. Maybe
       | keep the existing API for backward compatibility but rethink what
       | would be desirable for the next decade of pandas so to speak.
       | Borrowing liberally from what works in other ecosystem API's
        | would be the idea. E.g. R, while far from beautiful, can be more
        | concise, etc.
        
         | stdbrouw wrote:
         | Yeah, especially for a toolkit that is aimed at one-off
         | analyses, I don't really see the point in API conservatism. R
         | went through a period where many of the tools in the
         | hadleyverse/tidyverse drastically changed from one year to the
         | next, but the upshot is that it now has by far the nicest
         | interface to do things like pivoting data into longer or wider
         | formats (reshape, reshape2, tidyr), or doing operations on
         | grouped data (plyr, dplyr, dplyr2) or data frames in general
         | (built-in, data.table, tibble).
        
           | _Wintermute wrote:
           | Anecdotally, the API instability of those R packages is what
           | stopped me using them. Try running any tidyverse code after
           | more than 6 months, it's almost guaranteed to be broken.
           | 
           | If I can't rely on them in my packages or long-running
           | projects, then I don't see the point in learning them to use
           | in interactive work.
        
             | civilized wrote:
             | > Try running any tidyverse code after more than 6 months,
             | it's almost guaranteed to be broken.
             | 
             | You are entitled to your opinion (which I see in every
             | thread that discusses the tidyverse), but in my opinion
             | this is a considerable and outdated exaggeration which will
             | mislead the less experienced. Let's balance it out with
             | some different perspective.
             | 
             | 1. The core of the tidyverse, the data manipulation package
             | dplyr, reached version 1.0 in May 2020 and made a promise
             | to keep the API stable from that point on. To the best of
             | my knowledge, they have done so, and anyone who wants to
             | verify this can look at the changelogs. That's nearly 3
             | years of stability.
             | 
             | 2. For several years, functions in the tidyverse have had a
             | lifecycle indicator that appears in the documentation. It
             | tells you if the function has reached a mature state or is
             | still experimental. To the best of my knowledge, they have
             | kept the promises from the lifecycle indicator.
             | 
             | 3. I have been a full-time R and tidyverse user since dplyr
             | was first released in 2014, and my personal experience is
             | consistent with the two observations above. I agree with
             | the parent commenter that the tidyverse API used to be
             | unstable, but this was mainly in 2019 or earlier, before
             | dplyr went to 1.0. And even back then, they were always
             | honest about when the API might change. So now that the
             | tidyverse maintainers are saying dplyr and other tidyverse
             | packages are stable, I see no rational basis to doubt them.
             | 
             | 4. Finally, even during the early unstable API period of
             | the tidyverse, I personally did not find it such a great
             | burden to upgrade my code as the tidyverse improved. It was
             | actually quite thrilling to watch Hadley's vision develop
             | and incrementally learn and use the concepts as he built
             | them out. To use the tidyverse is to be part of something
             | greater, part of the future, part of a new way of thinking
             | that makes you a better data analyst.
             | 
             | IMHO, the functionality and ergonomics of the tidyverse are
             | light-years ahead of any other* data frame package, and
             | anyone who doesn't try it because of some negative
             | anecdotes is missing out.
             | 
             | *No argument from me if you prefer data.table. It's got
             | some performance advantages on large data and a different
             | philosophy that may appeal more to some. Financial time
             | series folks often prefer it. YMMV.
        
               | kgwgk wrote:
               | > To use the tidyverse is to be part of something
               | greater, part of the future, part of a new way of
               | thinking that makes you a better data analyst.
               | 
               | Wow.
        
               | _Wintermute wrote:
                | Yeah, I'm not even going to respond to their points. I
               | see little chance of a balanced discussion if they view
               | an R package as a quasi-religion.
        
         | MrMan wrote:
         | the api is miserable
        
         | 323 wrote:
         | > consider also overhauling the API besides the internals
         | 
         | This never happens. Angular does not become React, React gets
          | created instead. CoffeeScript does not become TypeScript, etc.
         | 
         | There is too much baggage and backward compatibility that
         | prevents such radical transformations.
         | 
         | Instead a new thing is created.
        
           | aflag wrote:
           | Which I think is a great way of doing things. Having a
           | project that keeps breaking compatibility is very annoying. I
           | actually think pandas changes way too much on each version.
           | When you have millions of lines of code depending on a
           | library, you want new features, faster speed, but you
           | definitely don't want backwards incompatible changes.
        
       | yamrzou wrote:
       | The submission link points to the blog instead of the specific
       | post. It should be: https://datapythonista.me/blog/pandas-20-and-
       | the-arrow-revol...
        
         | ZeroCool2u wrote:
         | You're right, something weird happened when I pasted the link
         | in to submit it. Maybe a bug in Chrome on Android, because I
         | just tested it again and found it only copied the link to the
         | blog and not the specific post.
         | 
         | If Dang or any other mods see this, please correct the link.
        
           | yamrzou wrote:
            | I think it's an HN bug. I tried submitting another post from
            | the same blog, and the HN software seems to rewrite the URL to
            | the homepage instead.
        
             | eloisius wrote:
             | I'm guessing it's because the page has this tag:
             | <link href="/blog/" rel="canonical" />
             | 
             | They're probably also screwing up their SEO this way.
        
           | bobbylarrybobby wrote:
           | I think it's their own site. When I shared the link from my
           | phone to my computer the same thing happened -- the link got
           | changed to the main blog page. As someone else mentioned I
           | think it's the <link rel="canonical"> in the header.
        
       | Wonnk13 wrote:
       | So where does this place Polars? My perhaps incorrect
        | understanding was that this (Arrow integration) was a key
       | differentiator of Polars vs Pandas.
        
         | ritchie46 wrote:
         | Polars author here. Polars adheres to arrow's memory format,
         | but is a complete vectorized query engine written in rust.
         | 
          | Some other key differentiators:
         | 
         | - multi-threaded: almost all operations are multi-threaded and
         | share a single threadpool that has low contention (not
          | multiprocessing!). Polars is often able to completely saturate
          | all your CPU cores with useful work.
         | 
         | - out-of-core: polars can process datasets much larger than
         | RAM.
         | 
         | - lazy: polars optimizes your queries and materializes much
         | less data.
         | 
          | - completely written in Rust: polars controls every
          | performance-critical operation and doesn't have to defer to
          | third parties, which allows it to have tight control over
          | performance and memory.
         | 
          | - zero required dependencies. This greatly reduces latency. A
          | pandas import takes >500ms, a polars import ~70-80ms.
         | 
         | - declarative and strict API: polars doesn't adhere to the
         | pandas API because we think it is suboptimal for a performant
         | OLAP library.
         | 
         | Polars will remain a much faster and more memory efficient
         | alternative.
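          | 
          | Roughly what the lazy API looks like (a sketch; the file name is
          | made up, and exact method names and streaming options vary by
          | polars version):
          | 
          |     import polars as pl
          | 
          |     lazy = (
          |         pl.scan_csv("events.csv")          # nothing is read yet
          |           .filter(pl.col("value") > 0)
          |           .groupby("user_id")
          |           .agg(pl.col("value").sum())
          |     )
          |     df = lazy.collect(streaming=True)      # optimized, can exceed RAM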
        
           | aflag wrote:
           | Was GPU acceleration considered? I know there's cudf which
           | tries to offer dataframes for GPUs already. But, in my naive
            | mind, it feels like dataframes would be a great fit for GPUs.
            | I'm curious why there seems to be little interest in that.
        
       | MrPowers wrote:
       | The Arrow revolution is particularly important for pandas users.
       | 
        | pandas DataFrames are persisted in memory. The rule of thumb as of
        | 2017 was to have RAM capacity 5-10x the in-memory size of your
        | dataset. Let's assume that pandas has made improvements and it's
        | more like 2x now.
       | 
        | That means you can process datasets that take up 8GB of RAM on a
        | 16GB machine. But with pandas, a dataset that takes 8GB in memory
        | is a lot smaller on disk than you might expect.
       | 
       | pandas historically persisted string columns as objects, which
       | was wildly inefficient. The new string[pyarrow] column type is
       | around 3.5 times more efficient from what I've seen.
       | 
       | Let's say a pandas user can only process a string dataset that
       | has 2GB of data on disk (8GB in memory) on their 16GB machine for
       | a particular analysis. If their dataset grows to 3GB, then the
       | analysis errors out with an out of memory exception.
       | 
       | Perhaps this user can now start processing string datasets up to
       | 7GB (3.5 times bigger) with this more efficient string column
       | type. This is a big deal for a lot of pandas users.
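        | 
        | A quick way to see the difference on your own data (a sketch,
        | assuming pandas with pyarrow installed; the ~3.5x figure will vary
        | with string contents):
        | 
        |     import pandas as pd
        | 
        |     s_obj = pd.Series([f"sequence_{i}" for i in range(1_000_000)])
        |     s_arrow = s_obj.astype("string[pyarrow]")
        | 
        |     print(s_obj.memory_usage(deep=True))     # object dtype
        |     print(s_arrow.memory_usage(deep=True))   # Arrow-backed strings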
        
         | tootie wrote:
         | Isn't this also addressed by using Dask?
        
           | 0cf8612b2e1e wrote:
           | I am always tripping over my feet when I use Dask. Definitely
           | due to inexperience (I only reach for it when my usual Pandas
           | tricks fail me), but I never have a good time there. Pandas
           | idiom X will still blow up memory, you have to know Dask
           | idiom Y, etc.
           | 
           | I am glad Dask exists, but it is not a seamless solution
           | where you can just swap in a Dask dataframe.
        
           | TrapLord_Rhodo wrote:
            | Dask can utilize pandas dataframes and make them run across
            | different cores. It's also great for tasks that take up too
            | much memory, as discussed by the OP. This may remove the need
            | to use Dask in a lot of projects, since you can effectively
            | work with 3.5x the data in the same RAM now.
        
           | MrPowers wrote:
           | Dask sidesteps the problem by splitting up the data into
           | smaller pandas DataFrames and processing them individually,
           | rather than all at once. But the fundamental issue of pandas
           | storing strings as objects in DataFrames (which is really
           | inefficient) is also problematic with Dask (because a Dask
           | DataFrame is composed of a lot of smaller pandas DataFrames).
           | The technologies that really address this problem are
           | Polars/PySpark - technologies that are a lot more memory
           | efficient.
        
         | taeric wrote:
          | The memory expectations of so many programmers going into
          | pandas baffle me. In particular, no one was batting an eye that
          | a 700 MB CSV file would take gigs to hold in memory. Just
          | convincing them to specify the dtypes and to use categoricals
          | where appropriate has cut much of our memory usage by over 70%.
          | Not shockingly, things go faster, too.
         | 
         | If there are efforts to help this be even better, I heartily
         | welcome them.
        
           | __mharrison__ wrote:
           | When I'm teaching Pandas, the first thing we do after loading
           | the data is inspect the types. Especially if the data is
           | coming from a CSV. A few tricks can save 90+% of the memory
           | usage for categorical data.
           | 
           | This should be a step in the right direction, but it will
           | probably still require manually specifying types for CSVs.
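            | 
            | The usual trick, as a sketch (file and column names are made
            | up):
            | 
            |     import pandas as pd
            | 
            |     df = pd.read_csv(
            |         "data.csv",
            |         dtype={"state": "category", "status": "category"},
            |         parse_dates=["created_at"],
            |     )
            |     print(df.memory_usage(deep=True))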
        
             | taeric wrote:
             | Yeah, I expect most efforts to just help make the pain not
             | as painful. And specifying the data types is not some
             | impossible task and can also help with other things.
        
         | ingenieroariel wrote:
          | To your point: if you have your data as parquet files and load
          | it into memory using DuckDB before passing the Arrow object to
          | Pandas, you can work with arbitrarily large datasets as long as
          | you aggregate or filter them somehow.
         | 
         | In my case I run analysis on an Apple M1 with 16GB of RAM and
         | my files on disk are thousands of parquet files that add up to
         | hundreds of gigs.
         | 
          | Apart from that: being able to go from duckdb to pandas very
          | quickly, do operations that make more sense on the other end,
          | and come back without having to change the format is super
          | powerful. The author talks about this with Polars as an
          | example.
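          | 
          | Roughly what that workflow looks like (a sketch; file paths are
          | made up, and only the aggregated result ever lands in pandas):
          | 
          |     import duckdb
          | 
          |     tbl = duckdb.query("""
          |         SELECT region, sum(amount) AS total
          |         FROM 'data/*.parquet'
          |         GROUP BY region
          |     """).arrow()           # pyarrow.Table
          | 
          |     df = tbl.to_pandas()   # hand the already-aggregated result to pandas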
        
           | IanCal wrote:
           | > Being able to go from duckdb to pandas very quickly to make
           | operations that make more sense on the other end and come
           | back while not having to change the format is super powerful
           | 
           | I can't stress enough how much I think this is truly
           | transformative. It's generally nice as a working pattern, but
           | much more importantly it lets the _scale_ of a problem that a
            | tool needs to solve shrink dramatically. Pandas doesn't need
            | to do everything, nor does DuckDB, nor does some niche tool
            | designed to perform very specific forecasting - any of them
            | can be slotted into an in-memory set of processes with no
            | overhead.
           | This lowers the barrier to entry for new tools, so they
           | should be quicker and easier to write for people with
           | detailed knowledge just in their area.
           | 
           | It extends beyond this too, as you can then also get free
           | data serialisation. I can read data from a file with duckdb,
           | make a remote gRPC call to a flight endpoint written in a few
           | lines of python that performs whatever on arrow data, returns
           | arrow data that gets fed into something else... in a very
           | easy fashion. I'm sure there's bumps and leaky abstractions
           | if you do deep work here but I've absolutely got remote
           | querying of dataframes & files working with a few lines of
           | code, calling DuckDB on my local machine through ngrok from a
           | colab instance.
        
           | MrPowers wrote:
            | Yes, I agree with your assessment that technologies that can
            | deal with larger-than-memory datasets (e.g. Polars) can be
            | used to filter data so there are fewer rows left for
            | technologies that can only handle a fraction of the data
            | (e.g. pandas).
           | 
           | Another way to solve this problem is using a Lakehouse
           | storage format like Delta Lake so you only read in a fraction
           | of the data to the pandas DataFrame (disclosure: I am on the
           | Delta Lake team). I've blogged about this and think predicate
            | pushdown filtering / Z-ordering data is more straightforward
            | than adding an entire new tech like Polars / DuckDB to the
           | stack.
           | 
           | If you're using Polars of course, it's probably best to just
           | keep on using it rather than switching to pandas. I suppose
           | there are some instances when you need to switch to pandas
            | (perhaps to access a library), but I think it'll normally be
            | better to just stick with the more efficient tech.
        
           | mr337 wrote:
           | This sounds very neat, time to look into DuckDB.
        
       | loveparade wrote:
       | There is also Polars [0], which is backed by arrow and a great
       | alternative to pandas.
       | 
       | [0] https://www.pola.rs/
        
         | nerdponx wrote:
         | Polars is less ergonomic than Pandas in my experience for basic
         | data analysis, and has less functionality for time series work.
         | But it's better than Pandas for working with larger datasets
         | and it's now my preferred choice for batch data processing in
         | Python eg in ETL jobs. They both have their own places in the
         | world.
        
           | pletnes wrote:
           | Yes, and with both supporting arrow, you can zero-copy share
           | data from polars into pandas once you've completed your
           | initial data load+filtering.
        
             | nerdponx wrote:
             | Does zero-copy work currently, or is that still
             | theoretical/todo?
             | 
             | I would also be curious about Numpy, since I know you can
             | transparently map data to a Numpy array, but that's just
             | "raw" fixed-width binary data and not something more
             | structured like Arrow.
        
               | ec109685 wrote:
               | Once this is fixed: https://github.com/pola-
               | rs/polars/pull/7084
        
               | nerdponx wrote:
               | I meant to ask about the other way, Polars to Pandas, but
               | this is also useful to know.
        
               | rihegher wrote:
               | It looks like it has already been merged, or did I miss
               | something?
        
           | 323 wrote:
            | Polars is more verbose, but simpler. Its core operations
           | "fit the brain". I've used pandas for years, but I still have
           | to regularly google basic indexing stuff.
        
         | berkle4455 wrote:
          | There's also SQL: when paired with optimized column-store
          | databases that can run embedded or locally (duckdb or
          | clickhouse), you can avoid the extra hop into python libraries
          | entirely.
        
         | eternalban wrote:
          | Do you have an opinion on N5? My second-hand (somewhat dated)
          | info is that Pandas is not a good fit for unstructured (CV)
          | data or ML.
        
         | ZeroCool2u wrote:
         | Yes, Marc talks extensively about the new interop options
         | between Pandas and Polars enabled by the pyarrow backend.
        
       | [deleted]
        
       | pama wrote:
       | Polars can do a lot of useful processing while streaming a very
       | large dataset without ever having to load in memory much more
       | than one row at a time. Are there any simple ways to achieve such
       | map/reduce tasks with pandas on datasets that may vastly exceed
       | the available RAM?
        
         | davesque wrote:
         | Not currently. But I imagine that, if Pandas does adopt Arrow
         | in its next version, it should be able to do something like
         | that through proper use of the Arrow API. Arrow is built with
         | this kind of processing in mind and is continually adding more
         | compute kernels that work this way when possible. The Dataset
         | abstraction in Arrow allows for defining complex column
         | "projections" that can execute in a single pass like this.
         | Polars may be leveraging this functionality in Arrow.
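          | 
          | The Dataset API in question, roughly (a sketch; path and column
          | names are made up):
          | 
          |     import pyarrow.dataset as ds
          | 
          |     dataset = ds.dataset("data/", format="parquet")
          |     table = dataset.to_table(
          |         columns=["user_id", "value"],
          |         filter=ds.field("value") > 0,   # pushed down, scanned in batches
          |     )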
        
         | nerdponx wrote:
         | Only by writing your own routine to load and process one chunk
         | at a time.
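          | 
          | e.g. with read_csv's chunksize parameter (a sketch; file and
          | column names are made up):
          | 
          |     import pandas as pd
          | 
          |     total = 0
          |     for chunk in pd.read_csv("big.csv", chunksize=1_000_000):
          |         total += chunk["value"].sum()   # reduce each chunk, then discard it
          |     print(total)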
        
       | nathan_compton wrote:
       | I recently switched to Polars because Pandas is so absurdly
       | weird. Polars is much, much better. I'll be interested in seeing
       | how Pandas 2 is.
        
       | Kalanos wrote:
       | if you were facing memory issues, then why not use numpy memmap?
       | it's effing incredible
       | https://stackoverflow.com/a/72240526/5739514
       | 
       | pandas is just for 2D columnar stuff; it's sugar on numpy
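        | 
        | For the unfamiliar, a memmap sketch (the dtype and shape must
        | match whatever was written to the file):
        | 
        |     import numpy as np
        | 
        |     arr = np.memmap("big_array.dat", dtype="float64",
        |                     mode="r", shape=(100_000_000,))
        |     print(arr[:10].mean())   # only the touched pages are read from disk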
        
       ___________________________________________________________________
       (page generated 2023-02-28 23:02 UTC)