[HN Gopher] A 100x speedup with unsafe Python
___________________________________________________________________
A 100x speedup with unsafe Python
Author : ingve
Score : 289 points
Date : 2024-05-05 08:08 UTC (2 days ago)
(HTM) web link (yosefk.com)
(TXT) w3m dump (yosefk.com)
| idkdotcom wrote:
| Safety is one of Python's greatest advantages over C or C++. Why
| would anyone use unsafe Python when Python doesn't have other
| features such as type safety and all the debugging tools that
| have been built for C and C++ over time is beyond me.
|
| Python is a great language, but just as the generation of kids
| who got out of computer science programs in the 2000s were
| clueless about anything that wasn't Java, it seems this
| generation is clueless about anything that isn't Python.
|
| There was life before Python and there will be life after Python!
| jandrese wrote:
| Seems to me if you could do most of the work in Python and then
| just make the critical loop unsafe and 100x faster then that
| would certainly have some appeal.
| gibolt wrote:
| Plenty of people would gladly not have to learn another
| language (especially C).
|
| You could also benefit from testing blocks of code with
| safety enabled to have more confidence when safety is
| removed.
| KeplerBoy wrote:
| It was explained pretty well in the blog: Installing the
| OpenCV python package is easy, fast and painless as long as
| you're happy with the default binary. Building OpenCV from
| source to get it into a C/C++ program can quickly turn into
| a multi-hour adventure, if you're not familiar with
| building sizeable C++ projects.
| TylerE wrote:
| Or if you're going to learn another language might as well
| learn nim. Keep most of the python syntax, ditch the
| performance problems and the packaging insanity.
| eru wrote:
| > Why would anyone use unsafe Python when Python doesn't have
| other features such as type safety and all the debugging tools
| that have been built for C and C++ over time is beyond me.
|
| C and C++ 'type safety' is barely there (compared to more sane
| languages like OCaml or Rust or even Java etc). As to why anyone
| would do that: the question is 'why not?' It's fun.
| gibolt wrote:
| "Why would anyone use" -> "when" usage generally means lots of
| use cases are being ignored / swept under the rug.
|
| >1 billion people exist. Each has a unique opinion / viewpoint.
| bongodongobob wrote:
| I don't use Python because of type safety, I use it because
| it's easy to write. I couldn't give a single fuck about some
| missed pointers that weren't cleaned up in the 2.5 seconds my
| program ran. I'm not deploying public code.
|
| I tend to prototype in Python and then just rewrite the whole
| thing in C with classes if I need the 10000x speedup.
| SunlitCat wrote:
| That 'easy to write' part is something I am not so sure about.
| It took me ages to understand why my first try at writing a
| blender plugin didn't work.
|
| It was because of white spaces not lining up. I was like,
| really? Using white spaces as a way to denote the body of a
| function? Really?
| nottorp wrote:
| I'm curious... what rock have you been living under in the
| past 33 years since python was launched?
| SunlitCat wrote:
| Let's see, I was living under the:
|
| - Amiga E
| - Assembler (on Amiga)
| - perl
| - tcl/tk
| - javascript
| - vbscript
| - java
| - vba
| - c
| - c++
| - python
|
| rock. Granted at some of those rocks, I just took a quick
| peek (more like a glance), but a rock with whitespaces
| being important was, at that moment, new to me!
| nottorp wrote:
| Oh cmon, I can still write some z80 assembly from memory
| and remember the zx spectrum memory layout somewhat, but
| I check new programming languages now and then :)
| SunlitCat wrote:
| :)
|
| I just meant that for someone coming from languages where
| different kinds of delimiters are used to denote function
| bodies, a language that puts such a huge emphasis on
| whitespace was kinda throwing me off: I got an error message
| slapped at me even though I was replicating the example
| letter by letter (although, obviously, I missed the
| importance of whitespace).
| nottorp wrote:
| It's all fuzzy, but I'm sure I knew about the importance
| of whitespace in python long before I actually tried
| python.
|
| Even Raymond's article, which was one of the first, whined
| about it [1]. I definitely remember reading that one and
| thinking I have to check out that language later.
|
| [1] https://www.linuxjournal.com/article/3882
| SunlitCat wrote:
| Thank you for the article. I have to put it on my "to
| read" pile! :)
| omoikane wrote:
| The "unsafe" in the title appears to be used in the sense of
| "exposing memory layout details", but not in the sense of
| "direct unbounded access to memory". It's probably not the
| memory safety issue you are thinking of.
| yosefk wrote:
| You can absolutely get direct unbounded access to memory with
| ctypes, with all the bugs that come from this. I just
| think/hope the code I show in TFA happens to have no such
| bugs.
| SunlitCat wrote:
| The thing about academia (computer science and the like) is,
| you need a programming language you can get across easily and
| quickly, and that can get people to produce satisfying results
| in no time.
|
| Back then it might have been java, then python, till the next
| language comes around.
|
| The java thing was so apparent, that when you looked at c++
| code, you were able to spot the former java user pretty easily.
| :)
|
| About python: some time ago, I was looking into arduino
| programming in c and found someone offering a library (I think)
| in c (or c++, can't remember). What I remember is that this
| person stopped working on it because they got many requests
| and questions about being able to use their library in
| python.
| pjmlp wrote:
| > The java thing was so apparent, that when you looked at c++
| code, you were able to spot the former java user pretty
| easily. :)
|
| Well, Java was designed based on C++ GUI development patterns
| typical in the 10 years of C++ GUI frameworks that predated
| Java.
|
| Yet people keep making these kinds of remarks.
| kragen wrote:
| while python is eminently criticable, yosef kreinin has
| designed his own cpu instruction set and gotten it fabbed in a
| shipping hardware product, and is the author of the c++fqa,
| which is by far the best-informed criticism of c++; criticizing
| him with 'this generation is clueless about anything that isn't
| Python' seems off the mark
| intelVISA wrote:
| Isn't all Python, by design, unsafe?
| eru wrote:
| Is this supposed to be a joke?
|
| Have a look at the linked article to see what meaning of
| 'unsafe' the author has in mind.
| intelVISA wrote:
| Safety is catching (more) errors ahead of time ...for which
| Python is grossly unsuitable imo.
|
| Fun lang though, just not one that ever comes to mind when I
| hear 'safe'.
| eru wrote:
| That's one definition of safety. But it's not the one the
| author uses in this case.
|
| The generic snide remark about Python that has nothing to do
| with the article ain't all that informative.
|
| You could make the same remark you just made about all
| Python being unsafe about Rust: 'Isn't all Rust, by design,
| unsafe?'. And compared to eg Agda or Idris that would be
| true, but it wouldn't be a very useful or interesting
| comment when talking specifically about 'unsafe' Rust vs
| normal Rust.
| intelVISA wrote:
| I would agree, Rust is not safe. We need to encourage
| more formal rigor in our craft and avoid misconceptions
| like 'safe Rust' or 'safe Python'. Thus my original
| comment :P
| eru wrote:
| Eh, even Agda is only safe in the sense of 'does what
| you've proven it to do'. That doesn't mean that if you eg
| write a machine learning library in Agda, your AI won't
| come and turn us all into paperclips.
|
| So it doesn't really make sense to pretend there's some
| single meaning of 'safe' vs 'unsafe' that's appropriate
| in all contexts.
| dahart wrote:
| It's a term of art in this case. Or, as someone proposed, an
| "improper noun":
| https://news.ycombinator.com/item?id=32673100
|
| Fine and good to advocate rigor, whether or not it's
| specifically relevant to this post, but maybe be careful
| with the interpretation and commentary lest people decide
| the misconception is on your part?
| recursive wrote:
| Memory corruption vs type safety.
|
| "Safety" is an overloaded term, but that happens a lot in
| software. You'll probably get best results if you try to
| understand what people are talking about, rather than just
| assuming everyone else is an idiot.
| westurner wrote:
| ctypes, c extensions, and CFFI are unsafe.
|
| Python doesn't have an unsafe keyword like Rustlang.
|
| In Python, you can dereference a null pointer and cause a
| Segmentation fault given a ctypes import.
|
| lancedb/lance and/or pandas' dtype_backend="pyarrow" might work
| with pygame
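|
| (Re the ctypes point above: a minimal sketch, not from TFA. Note
| that ctypes itself catches a literal NULL index with ValueError,
| so the sketch pokes another address that is almost certainly
| unmapped.)
          import ctypes

          bad = ctypes.cast(0xDEAD, ctypes.POINTER(ctypes.c_int))
          print(bad[0])  # reads 4 bytes from an unmapped address -> segfault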
| Retr0id wrote:
| You don't even need a ctypes import for it, you'll get a
| segfault from this:
| eval((lambda:0).__code__.replace(co_consts=()))
|
| cpython is not memory safe.
| westurner wrote:
| Static analysis of Python code should include review of
| "unsafe" things like exec(), eval(), ctypes, c strings,
| memcpy (*),.
|
| Containers are considered nearly sufficient to sandbox
| Python, which cannot be effectively sandboxed using Python
| itself. Isn't that actually true for all languages though?
|
| There's a RustPython, but it does support CFFI and
| __builtins__.eval, so
| maple3142 wrote:
| The example given by parent does not need eval to trigger
| though. Just create a function and replace its code
| object, then call it; it will easily segfault.
| jwilk wrote:
| Complete example without eval:
          def f():
              pass
          f.__code__ = f.__code__.replace(co_consts=())
          f()
| Retr0id wrote:
| yup, eval was just there for golfing purposes
| emmanueloga_ wrote:
| As others commented "safety" is a heavily overloaded term, for
| instance there's "type safety" [1] and "memory safety" [2], and
| then there's "static vs dynamic typing" [3], and "weak vs
| strong typing" [4], so talking about types and safety offered
| by a language can be very nuanced.
|
| I suspect when you say "unsafe by design" you may be referring
| to the dynamic type checking aspect of Python, although Python
| has supported type annotations for a while, which can be
| statically checked by linter-like tools like PyRight.
|
| --
|
| 1: https://en.wikipedia.org/wiki/Type_safety
|
| 2: https://en.wikipedia.org/wiki/Memory_safety
|
| 3: https://en.wikipedia.org/wiki/Type_system#Type_checking
|
| 4: https://en.wikipedia.org/wiki/Strong_and_weak_typing
| d0mine wrote:
| Python is both type and memory safe, e.g. ""+1 in Python is a
| TypeError and [][0] is an IndexError.
|
| But you can perform unsafe operations, e.g.:
          import ctypes
          ctypes.string_at(0)  # access address 0 in memory -> segfault
| comex wrote:
| Do you actually need ctypes here, or can you just use np.reshape?
| That would cut out the unsafety.
| Retr0id wrote:
| You can exploit memory unsafety in cpython using only built-in
| methods, no imports needed at all:
| https://github.com/DavidBuchanan314/unsafe-python/
| yosefk wrote:
| You... probably shouldn't, but, very interesting stuff in
| that link! I didn't know CPython had no bounds checks in the
| load_const bytecode op
| im3w1l wrote:
| I think what you want is a combination of swapaxes and flip
| (reshape does not turn rows into columns, it turns m input rows
| into n output rows), but yeah.
|
| Actually let me include a little figure
          original   reshape   swapaxes
          abc        ab        ad
          def        cd        be
                     ef        cf
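|
| In numpy terms, a quick check of that figure (a sketch):
          import numpy as np

          a = np.array([['a', 'b', 'c'],
                        ['d', 'e', 'f']])
          print(a.reshape(3, 2))   # rows: ab, cd, ef
          print(a.swapaxes(0, 1))  # rows: ad, be, cf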
| yosefk wrote:
| I think you'll have a problem with BGR data absent ctypes,
| since you have a numpy array with the base pointing at the
| first red channel value, and you want an array with the base
| pointing 2 bytes _before_ the first red channel value. This is
| almost definitely "unsafe"?.. or does numpy have functions
| knowing that due to the negative z stride, the red value before
| the blue value is within the bounds of the original array? I
| somehow doubt it though it would be very impressive. And I
| think the last A value is hopelessly out of reach absent ctypes
| since a safe API has no information that this last value is
| within the array bounds; more strictly speaking all the alpha
| values are out of bounds according to the shape and strides of
| the BGRA array.
| qaq wrote:
| Well with mojo it looks like you will have both safety and
| performance
| grandma_tea wrote:
| Closed source and not relevant.
| ianbutler wrote:
| Mojo was open sourced months ago iirc
|
| https://github.com/modularml/mojo
| viraptor wrote:
| Those are examples and docs. Mojo is still closed.
| ianbutler wrote:
| Their standard lib is in there, seems more open to me
|
| There's the announcement where they open sourced the
| modules in their standard lib:
| https://www.modular.com/blog/the-next-big-step-in-mojo-
| open-...
|
| There's the standard lib:
| https://github.com/modularml/mojo/blob/nightly/stdlib/README...
| viraptor wrote:
| Yeah, that's some minimal progress, but really not that
| interesting. (As in, it's cool that it exists, but given
| we've got the Python stdlib and numpy already open, it's not
| really new/exciting.) And it doesn't allow you to port to
| platforms they don't care enough about.
| qaq wrote:
| It might not be interesting to you. Having a lot of Rust
| features with much smarter compiler and Python syntax is
| pretty interesting to me.
| viraptor wrote:
| No, mojo itself is interesting. The stdlib that's posted
| isn't. I mean, it's not very interesting code beyond "what
| are the implementation details". Any decent programmer
| could redo that based on python's stdlib.
| Animats wrote:
| Oh, array striding.
|
| This is a classic bikeshedding issue. When Go and Rust were first
| being designed, I brought up support for multidimensional arrays.
| For both cases, that became lost in discussions over what
| features arrays should have. Subarrays, like slices but
| multidimensional? That's the main use case for striding.
| Flattening in more than one dimension? And some people want
| sparse arrays. Stuff like that. So the problem gets pushed off
| onto collection classes, there's no standard, and everybody using
| such arrays spends time doing format conversion. This is why
| FORTRAN and Matlab still have strong positions in number-
| crunching.
| lifthrasiir wrote:
| But format conversion is inevitable because C and FORTRAN have
| a different axis ordering anyway, isn't it?
| blt wrote:
| Not really, you can always change the indexing to account for
| it. For example, the GEMM matrix multiplication subroutines
| from BLAS can transpose their arguments [1]. So if you have A
| (m x n) and B (n x p) stored row-major, but you want to use a
| column-major BLAS to compute A*B, you can instead tell BLAS
| that your buffers hold A' (n x m) and B' (p x n) and ask it
| to compute B' * A'; that result is (A*B)', which in
| column-major memory is exactly A*B laid out row-major.
|
| As the article mentions, NumPy can handle both and do all the
| bookkeeping. So can Eigen in C++.
|
| [1] https://www.math.utah.edu/software/lapack/lapack-
| blas/dgemm....
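|
| For what it's worth, the bookkeeping is easy to sanity-check in
| numpy (a sketch; transposes are zero-copy views, so no data
| actually moves):
          import numpy as np

          A = np.arange(6, dtype=float).reshape(2, 3)   # m x n, row-major
          B = np.arange(12, dtype=float).reshape(3, 4)  # n x p, row-major
          assert np.array_equal((B.T @ A.T).T, A @ B)   # (A*B)' = B' * A'
          assert not A.T.flags['OWNDATA']               # A.T reuses A's memory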
| lifthrasiir wrote:
| That's conceptually still a format conversion, though the
| actual conversion might not happen. Users have to track
| which format is being used for which matrix and I believe
| that was what the GP was originally complaining about.
| yosefk wrote:
| Agreed, realistically you're going to either convert the
| data or give up on using the function rather than
| reimplementing it to support another data layout
| efficiently; while converting the code instead of the
| data, so to speak, is more efficient from the machine
| resource use POV, it can be very rough on human
| resources.
|
| (The article is basically about not having to give up _in
| some cases_ where you can tweak the input parameters and
| make things work without rewriting the function; but it's
| not always possible)
| semi-extrinsic wrote:
| Talking about rough on human resources, I take it you've
| never written iterative Krylov methods where matrices are
| in compressed sparse row (CSR) format?
|
| Because that's the kind of mental model capacity
| scientific programmers needed to have back in the day.
| People still do similarly brain-twisting things today. In
| this context, converting your logic between C and Fortran
| to avoid shuffling data is trivial.
| sa-code wrote:
| What kind of support would you have hoped for?
| infogulch wrote:
| A way to reinterpret a slice of size N as a multidimensional
| array with strides that are a factorization of N, including
| optional reverse order strides. Basically, do the stride
| bookkeeping internally so I can write an algorithm only
| considering the logic and optimize the striding order
| independently.
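|
| Numpy's as_strided is pretty close to that, for what it's worth
| (a sketch):
          import numpy as np
          from numpy.lib.stride_tricks import as_strided

          flat = np.arange(6, dtype=np.int64)  # itemsize is 8 bytes
          by_rows = as_strided(flat, shape=(2, 3), strides=(24, 8))
          by_cols = as_strided(flat, shape=(3, 2), strides=(8, 24))
          print(by_rows)  # rows: [0 1 2], [3 4 5]
          print(by_cols)  # rows: [0 3], [1 4], [2 5] -- same bytes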
| Animats wrote:
| That's where you end up after heavy bikeshedding. Lots of
| features, terrible performance, as the OP points out.
| infogulch wrote:
| I agree with you on sparse arrays and multidimensional
| slices, but this is basically the same as what you'd do
| manually. Saying that "track strides for me" is "lots of
| features" is a bit uncharitable.
| yosefk wrote:
| From everything I'm seeing, it follows that Matlab & Fortran
| have been decimated by Python and C/C++ around 1990s and 2010s,
| respectively. Of course I could be wrong; any evidence of their
| still strong position will be greatly appreciated, doubly so
| evidence that this position is due to stride issues.
|
| (Of course Python provides wrappers to Fortran code, eg
| FITPACK, but this code specifically was mostly written in the
| 80s, with small updates in this century, and is probably used
| more thru the numpy wrappers, stride conversion issues and all,
| than thru its native Fortran interface)
| MobiusHorizons wrote:
| I think pretty much all linear algebra libraries are still
| Fortran and are unlikely to ever be C because Fortran is
| legitimately faster than C for this stuff. I don't know if it
| has anything to do with strong opinions about how values are
| represented, I think it has more to do with lower overhead of
| function calling, but that is just repeating what someone
| told me about Fortran vs C in general, not necessarily
| applicable to BLAS libraries.
|
| Fortran at least used to be very common in heavy scientific
| computing, but I would bet that relies on GPUs or other
| accelerators these days.
| yosefk wrote:
| The question is how much new numeric code is written in
| Fortran vs C/C++. My guess is way below 10%, certainly if
| we measure by usage as opposed to LOC and I would guess by
| LOC as well.
|
| Is Fortran legitimately faster than C with the restrict
| keyword? Regardless of the function call cost diffs between
| the two - meaning, even if we assume Fortran is somehow
| faster here - there's no way numeric production code does
| enough function calls for this to matter. If Fortran was
| faster than C at any point in time I can only imagine
| pointer aliasing issues to have been the culprit and I
| can't imagine it still being relevant, but I am of course
| ready to stand corrected.
| eyegor wrote:
| I don't think fortran is faster by language virtue, but
| it's certainly easier to scrape together high performance
| numeric fortran code. And ifort/aocc are amazing at
| making code that runs well on clusters, which is not a
| priority for any other domain. Fortran is absolutely
| still on top for modern numerics research code that
| involves clusters, talk to anyone who works in
| simulations at a national lab. If your code is mostly
| matrix math, modern fortran is very nice to work with.
|
| Emphasis on modern because maintaining 70s-80s numeric
| mathematician cowboy code is a ninth circle of hell. It
| has a bad rep for that reason.
| semi-extrinsic wrote:
| When you speak about code from the 1970s in particular,
| you need to appreciate the extremely limited language
| features that were available.
|
| They did not even have while-loops in the language until
| Fortran 77. While-loop functionality used to be
| implemented using DO-loops mixed with GOTOs. You can't
| fault the people of that era for using the only tools
| that were available to them.
| ant6n wrote:
| At the core, Fortran uses multidimensional numerical
| arrays, with the shapes being defined in the code. So the
| compiler knows much more about the data
| that's being operated on, which in theory allows better
| optimization.
|
| I thought blas/lapack is still written in Fortran, so
| most numerical code would still be built on top of
| Fortran.
| josefx wrote:
| Restrict probably helps. I can't say much about fortran
| but C still has warts that can significantly impact its
| math library. For example, almost every math function may
| set errno; that is a side effect that isn't easy for the
| compiler to eliminate and might bloat the code
| significantly. With gcc, a single call to sqrt turns into a
| sqrt instruction, followed by several checks to see whether
| it succeeded, followed by a call to the sqrt library
| function just to set errno correctly. I just started
| disabling math-errno completely once I realized that C
| allows several alternative ways to handle math errors, which
| basically makes any code relying on it non-portable.
| Galanwe wrote:
| > I think pretty much all linear algebra libraries are
| still Fortran and are unlikely to ever be C
|
| That is more of an urban legend than reality in 2024. Fact
| is, although the original BLAS implementation was fortran,
| it has been at least a decade since every Linux
| distribution started shipping either OpenBLAS, ATLAS or the
| MKL, all of which are written in a mix of C and assembly.
| Support for modern processors is only available in these
| implementations.
|
| LAPACK itself is still often built from the Fortran code
| base, but it's just glue over BLAS, and it gets performance
| solely from its underlying BLAS. Fortran doesn't bring any
| value to LAPACK, it's just that rewriting millions of
| algebraic tricks and heuristics for no performance gain is
| not enticing.
| chillfox wrote:
| If everyone is just using the Fortran libraries instead of
| reimplementing it in a modern language, then that's evidence
| that it's still being used for that purpose.
| aragilar wrote:
| Matlab has the issue of not being open source (similarly
| IDL), but I still see it popping up (though naturally, if
| I've got any choice in the matter, it's going to be ported to
| Python). I've also seen new Fortran codebases as well, mainly
| where the (conceptual, not computational) overhead of using
| helper libraries isn't worth it.
|
| I'd suggest Python (via the various memoryview-related PEPs,
| plus numpy) does provide most of the required information
| (vs. e.g. Go or Rust or Ruby), so I'm not sure that proves
| much, other than that if the combined value of a more
| general language plus a library for multidimensional arrays
| beats Matlab or Fortran (likely due to other ecosystem or
| licensing effects), people will switch.
| RugnirViking wrote:
| we still get new Matlab codebases at my work, also R. In my
| case it comes from academics in the power systems and
| renewables fields who move into industry and keep using the
| tools they used there. We try to gently ease them into
| python and version control and sensible collaboration but
| we get new hires all the time
| samatman wrote:
| Julia has gained a lot of mindshare in array-heavy coding,
| with a lot of hard work put into making it possible to get
| comparable speeds to C++ and Fortran. The manual[0] offers a
| good introduction to what's possible: this includes sparse
| and strided arrays, as well as the sort of zero-copy
| reinterpretation you accomplish with pointer wizardry in the
| Fine Article.
|
| [0]: https://docs.julialang.org/en/v1/manual/interfaces/#man-
| inte...
| StableAlkyne wrote:
| > any evidence of their still strong position will be greatly
| appreciated
|
| Fortran still dominates certain areas of scientific computing
| / HPC, primarily computational chemistry and CFD. -
| https://fortran-lang.org/packages/scientific - you don't hear
| about most of them because they're generally run on HPC
| centers by scientists in niche fields. But you do get their
| benefit if you buy things that have the chemical sector in
| their supply chain.
|
| The common thread is generally historical codes with a lot of
| matrix math. Fortran has some pretty great syntax for arrays
| and their operations. And the for-loop parallelization syntax
| in parallel compilers (like openmp) is also easy to use. The
| language can even enforce function purity for you, which
| removes some of the foot guns from parallel code that you get
| in other languages.
|
| The kinds of problems those packages solve tend to bottleneck
| at matrix math, so it's not surprising a language that is
| very ergonomic for vector math found use there.
|
| Same for Matlab, it's mostly used by niche fields and
| engineers who work on physical objects (chemical, mechanical,
| etc). Their marketing strategy is to give discounts to
| universities to encourage classes that use them. Like
| Fortran, it has good syntax for matrix operations. Plus it
| has a legitimately strong standard library. Great for
| students who aren't programmers and who don't want to be
| programmers. They then only know this language and ask their
| future employer for a license. If you don't interact with a
| lot of engineers at many companies, you aren't going to see
| Matlab.
| embwbam wrote:
| I work for the National Solar Observatory, creating Level 2
| data for the DKIST Telescope's observations of the sun.
| (For example, an image of the Temperature that lines up
| with the observation)
|
| Just the other day, the solar physicist I work with said
| "yeah, that code that runs on the supercomputer needs to be
| rewritten in Fortran" (from C, I think).
|
| He's nearing retirement, but it's not that he's behind the
| times. He knows his stuff, and has a lot of software
| experience in addition to physics.
| moregrist wrote:
| While it's not as ubiquitous as it used to be, Matlab is
| still very heavily used within shops that do a lot of
| traditional engineering (ME, EE, Aero, etc.).
|
| This surprises people who just write software, but
| consider:
|
| - The documentation and support is superb. This alone can
| justify the cost for many orgs.
|
| - Did I mention the support? MathWorks has teams of
| application support engineers to help you use the tool. The
| ability to pay to get someone on the phone can also justify
| the price.
|
| - The toolkits tend to do what specific fields want, and
| they tend to have a decent api. In contrast, you might end
| up hacking together something out of scipy and 3-4 weirdly
| incompatible data science packages. That's fine for me, but
| your average ME/EE would rather just have a tool that does
| what they need.
|
| - It also has SIMULINK, which can help engineers get from
| spec to simulation very quickly, and seems deeply embedded
| in certain areas (eg: it's pretty common for a job ad with
| control systems work to want SIMULINK experience).
|
| Is Python gradually displacing it? Probably.
|
| (Honestly, I wish it would happen faster. I've written one
| large program in Matlab, and I have absolutely no desire to
| write another.)
| xdavidliu wrote:
| > Matlab & Fortran have been decimated by Python and C/C++
| around 1990s and 2010s, respectively.
|
| Nit: this failed to be respective; the years are the other
| way around
| kbelder wrote:
| Interesting... maybe the respective comparison was Matlab
| to Python and Fortran to C/C++? This sentence actually had
| three parallel clauses.
|
| But that was a great nit to find.
| gugagore wrote:
| Technically, image resizing does in general care about the color
| channel ordering, because color spaces are in general not linear.
| https://www.alanzucconi.com/2016/01/06/colour-interpolation/
| quietbritishjim wrote:
| That article fails to touch on the fundamental issue with its
| title "The Secrets of Colour Interpolation": RGB values are a
| nonlinear function of the light emitted (because we are better
| at distinguishing dark colours, so it's better to allow
| representing more of those), so to interpolate properly you
| need to invert that function, interpolate, then reapply. The
| difference that makes to colour gradients is really striking.
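|
| E.g., a rough sketch using a plain 2.2 power curve rather than
| the exact sRGB transfer function:
          import numpy as np

          def lerp_srgb(a, b, t, gamma=2.2):
              a_lin = (np.asarray(a, dtype=float) / 255.0) ** gamma
              b_lin = (np.asarray(b, dtype=float) / 255.0) ** gamma
              mixed = (1 - t) * a_lin + t * b_lin
              return np.round(255 * mixed ** (1 / gamma)).astype(np.uint8)

          # midpoint of pure red and pure green comes out around
          # [186, 186, 0], visibly brighter than the naive [128, 128, 0]
          print(lerp_srgb([255, 0, 0], [0, 255, 0], 0.5))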
| topherclay wrote:
| Do you mean converting the RGB value to LAB values in the
| CIELAB color space and doing the interpolation there?
|
| Is there a better way to do it?
| CarVac wrote:
| No, you just need to linearize the brightness.
| drjasonharrison wrote:
| Typically this is done by using a look-up table to
| convert the 8-bit gamma encoded intensities to 10-bit (or
| more) linear intensities. You can use the same look-up
| table for R, G, and B. Alpha should be linear.
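|
| Roughly like so (with a plain 2.2 power curve standing in for
| the exact sRGB curve):
          import numpy as np

          # 8-bit gamma-encoded value -> 10-bit linear value
          lut = np.round(1023 * (np.arange(256) / 255.0) ** 2.2)
          lut = lut.astype(np.uint16)

          encoded = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
          linear = lut[encoded]  # fancy indexing applies the LUT per pixel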
| yosefk wrote:
| While this is a valid point, cv2.resize doesn't actually
| implement color conversion in a way addressing this issue; at
| least in my testing I get identical results whether I interpret
| the data as RGB or BGR. So if you want to use cv2.resize, AFAIK
| you can count on it not caring which channel is which. And if
| you need fast resizing, you're quite likely to settle for the
| straightforward interpolation implemented by cv2.resize.
| planede wrote:
| Even if the RGB components correspond to sRGB, to linearize you
| apply the same non-linear function to each component value,
| independently. So even if you do the interpolation in a linear
| colorspace, the order of the sRGB components does not matter.
| ggm wrote:
| _Why_ does numpy do column order data?
|
| Is it because in much of the maths domain it turns out you
| manipulate columns more often than rows?
|
| (not a mathematician, but I could believe that if the columns
| represent "dimensions" or "qualities" of some dataset and you
| want to apply a function to a given dimension, having data in
| column-natural order makes that faster.)
|
| Obviously you want to believe naively there is no difference
| between the X and Y in the X,Y plane, but machines are not naive
| and sometimes there IS a difference between doing things to the
| set of verticals and to the set of horizontals.
| sdeer wrote:
| Probably because Fortran stores matrices and other
| multidimensional arrays in column order. Traditionally most
| numerical computation software was written in Fortran and numpy
| calls those under the hood. Storing in row order would have
| meant copying the data to column major order and back for any
| call to Fortran.
| Thrymr wrote:
| NumPy can _store_ arrays in either row-major (C-style) or
| column-major (Fortran-style) order. Row-major is the default.
| Index order is always row, column.
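|
| For example:
          import numpy as np

          c = np.array([[1, 2, 3], [4, 5, 6]])  # C order (the default)
          f = np.asfortranarray(c)              # column-major copy
          print(np.ravel(c, order='K'))  # memory order: [1 2 3 4 5 6]
          print(np.ravel(f, order='K'))  # memory order: [1 4 2 5 3 6]
          print(c[1, 2], f[1, 2])        # both indexed [row, column]: 6 6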
| thayne wrote:
| Numpy was originally designed for mathematicians and
| scientists, and in those domains convention often lines up
| better with column major order. For example, vectors are a
| single column, and the column index comes before the row index
| in many notations. So using column major order meant familar
| formulas and algorithms (including translating from fortran
| code which is column major, or matlab for that matter) are easy
| to translate to numpy code.
|
| Also, numpy is built on some fortran libraries, like BLAS and
| LAPACK that assume a column major order. If it took row major
| input, it would need to transpose matrices, in some cases
| change the order, or rewrite those libraries to use row major
| form.
| kolbusa wrote:
| Copying is usually not necessary. Often times you can swap
| data and/or shape arguments and get a transposed result out.
| While it is true that Fortran BLAS only supports col-major,
| CBLAS supports both row- and col-major. Internally, all the
| libraries I worked on use col-major, but that is just a
| convention.
| bee_rider wrote:
| It is possible that my brain had been damaged by convention,
| but it looks much tidier to have your vector multiplicand on
| the right of the matrix when doing a matrix vector
| multiplication y=Ax, where A is the matrix and x is the vector,
| y is the result.
|
| So, x and y must be columns. So, it is nice if our language has
| column-order as the default, that way a vector is also just a
| good old 1d array.
|
| If we did y=xA, x and y would be rows, the actual math would be
| the same but... I dunno, isn't it just hideous? Ax is like a
| sentence, it should start with an upper case!
|
| It also fits well with how we tend to learn math. First we
| learn about functions, f(x). Then we learn that a function can
| be an operator. And a matrix can represent a linear operator.
| So, we tend to have a long history of stuff sitting to the left
| of x getting ready to transform it.
| SailorJerry wrote:
| I think the reason I prefer columns is I do the mental
| expansion into large bracketed expressions. If x is a row and
| kept inline, then the expansion gets really wide. To keep it
| compact and have the symbols oriented the same as their
| expansion, you'd have to put the x above A and that's just
| silly.
| Joker_vD wrote:
| Well, there used to be a tradition in universal algebra (and
| category theory) of putting functions/operators at the right
| side of the arguments but it seems to have ultimately died
| out. And "y = xA" is the standard notation in the linear
| coding theory, even to this day: message vectors are bit
| strings, not bit columns.
| bee_rider wrote:
| I think there's also some other pretty big field that tends
| to put the vector on the left... statistics? Economics? For
| some reason I couldn't find any links, so maybe I just got
| this from, like, one random statistician. If only they'd
| told me about sample sizes.
| planede wrote:
| Your reasoning starts out nicely with y=Ax, but you get the
| wrong conclusion. The layout of x and y are not affected at
| all by row or column order, they are just consecutive numbers
| in either case. So you have to look at the layout of A.
|
| For the row-major layout the multiplication Ax ends up being
| more cache-efficient than for the column major layout, as for
| calculating each component of y you need to scan x and a row
| of A.
| bee_rider wrote:
| I'm pretty sure BLAS will tile out the matrix
| multiplication internally anyway, so it doesn't matter. At
| least for matmat it would; is matvec special?
| planede wrote:
| However matmat is done, row-major vs column-major for
| both matrices shouldn't make a difference (for square
| matrices at least).
|
| I don't know if tiling is done for matvec. I don't think
| it makes sense there, but I didn't think about it too
| hard.
| montebicyclelo wrote:
| y = x @ W + b
|
| Is how you'd write it in NumPy and most (/all?) deep learning
| libraries, with x.shape=[batch, ndim]
|
| I'd personally prefer that to be the convention, for
| consistency.
| rdtsc wrote:
| I think in math columns as vectors is a more common
| representation. Especially if we talk about matrix
| multiplication and linear systems.
| Pinus wrote:
| Hang on, doesn't numpy use C (row major) array ordering by
| default? _checks docs_ Seems like it does. However, numpy array
| indexing also follows the maths convention where 2D arrays are
| indexed by row, column (as does C, by the way), so to access
| pixel x, y you need to say im[y, x]. And the image libraries
| where I have toyed with with numpy integration (only Pillow, to
| be honest) seem to work just fine like this -- a row of pixels
| is stored contiguously in memory. So I don't quite see why the
| author claims that numpy stores a _column_ of pixels
| contiguously, but I have only glanced at the article, so quite
| probably I have missed something.
| KeplerBoy wrote:
| You're right. Numpy stores arrays in row-major order by
| default. One can always just have a look at the flags
| (ndarray.flags returns some information about the order and
| underlying buffer).
| ruined wrote:
| oh THIS is why image byte order and dimensions are so confusing
| every time i fuck with opencv and pygame
|
| well, half of why. for some reason i keep doing everything
| directly on /dev/fb0
| TeMPOraL wrote:
| Image byte order, axis directions, coordinate system handedness
| when in 3D... after enough trying, you eventually figure out
| the order of looping at any given stage of your program, and
| then it Just Works, and then you _never touch it again_.
| Too wrote:
| Clickbait title. The speedup only applies to one particular
| interaction between SDL and numpy.
| C4stor wrote:
| All of this seems unnecessary, and easily replaced in the
| provided benchmark by :
|
| i2 = np.ascontiguousarray(pg.surfarray.pixels3d(isurf))
|
| Which does the 100x speedup too and is a "safe" way to adjust
| memory access to numpy strides.
|
| Whether the output is correct or not is left as an exercise to
| the author; since the provided benchmark only uses np.zeros(),
| it's kind of hard to verify.
| yosefk wrote:
| I just measured this with the np.ascontiguousarray call (inside
| the loop of course, if you do it outside the loop when i2 is
| assigned to, then ofc you don't measure the overhead of
| np.ascontiguousarray) and the runtime is about the same as
| without this call, which I would expect given everything TFA
| explains. So you still have the 100x slowdown.
|
| TFA links to a GitHub repo with code resizing non-zero images,
| which you could use both to easily benchmark your suggested
| change, and to check whether the output is still correct.
| londons_explore wrote:
| > What can we do about this? We can't change the layout of pygame
| Surface data. And we seriously don't want to copy the C++ code of
| cv2.resize, with its various platform-specific optimizations,
|
| Or... you could have sent a ~25 line pull request to opencv to
| fix this performance problem not just for you, but for thousands
| of other developers and millions of users.
|
| I think your fix would go here:
|
| https://github.com/opencv/opencv/blob/ba65d2eb0d83e6c9567a25...
|
| And you could have tracked down that slow code easily by running
| your slow code in gdb, hitting Ctrl+C while it's doing the slow
| thing, and then "bt" to get a stack trace of what it's doing and
| you'd see it constructing this new image copy because the format
| isn't correct.
| yosefk wrote:
| 25 line pull request doing what? Supporting this format
| efficiently is probably more LOC and unlikely to get merged as
| few need this and it complicates the code. Doing the thing I
| did in Python inside OpenCV C++ code instead (reinterpreting
| the data) is not really possible since you have less knowledge
| about the input data at this point (like, you're going to
| assume it's an RGBA image and access past the area defined as
| the allocated array data range for the last A value? The user
| gives a 3D array and you blithely work on the 4th dimension?)
|
| And what about the other examples I measured, including pure
| numpy code? More patches to submit?
| londons_explore wrote:
| inside the python opencv bindings?
|
| Or just special case resize, since there the channel order
| doesn't matter. If you interpret R as A you'll still get the
| right answer.
| yosefk wrote:
| The pygame API gives you a 3D RGB array and a separate 2D
| alpha array, which happen to reference the same RGBA/BGRA
| memory. Are you suggesting that OpenCV Python bindings (in
| most of their functions, and likewise other libraries
| including numpy operators themselves) should have code
| assuming that a WxHx3 RGB array really references the
| memory of a WxHx4 RGBA array?..
|
| If you really want a solution which is a patch rather than
| a function in your own code, I guess the way to go would be
| to persuade the maintainers of pygame and separately, the
| maintainers of its fork, pygame-ce, to add a
| pixels4d_maybe_rgba_and_maybe_bgra() function returning a
| WxHx4 array. Whether they would want to add something like
| that I don't know. But in any case, I think that since
| closed source projects, open source projects, and code by
| sister teams exist that will not accept patches, or will
| not accept them in a timely manner, it's interesting what
| you can do without changing someone else's code.
| antirez wrote:
| MicroPython allows you to do even more than that; it is fantastic
| and I wish it were part of standard Python as well. Using a
| decorator you can write functions in Viper, a subset of Python
| that has pointers to bytearrays() and only fast integer types
| that have a behavior similar to C (fixed size, they wrap around
| and so forth). It's a very nice way to speedup things 10 or 100x
| without resorting to C extensions.
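|
| Roughly like this, if memory serves (MicroPython only, so treat
| it as a sketch; ptr8 gives a raw, unchecked byte pointer into
| the bytearray):
          import micropython

          @micropython.viper
          def fill(buf, n: int, value: int):
              p = ptr8(buf)  # raw byte pointer, no bounds checks
              for i in range(n):
                  p[i] = value

          data = bytearray(1024)
          fill(data, 1024, 0x7F)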
| yosefk wrote:
| I dunno if it's "more than that" since they're not directly
| comparable? What you describe sounds more like numba or Cython
| than what TFA describes which is a different use case?..
| antirez wrote:
| Indeed, I don't mean they are directly comparable, but it's a
| great incarnation of the "Unsafe Python" idea, because it
| does not go too far with the subset of Python that you can
| use, like introducing a complete different syntax or alike.
| There are strict rules, but if you follow them you still
| kinda write Python that goes super fast.
| Joker_vD wrote:
| Yep, welcome to the world of RGBA and ARGB storage and memory
| representation formats, with little- and big-endianness thrown in
| for the mix, it's all very bloody annoying for very little gain.
| sylware wrote:
| I wonder how it would perform with the latest openstreetmap tile
| renderer which was released not long ago.
| jofer wrote:
| I'm confused on the need for ctypes/etc here. You can directly
| modify somearr.strides and somearr.shape. And if you need to set
| them both together, then there's
| numpy.lib.stride_tricks.as_strided. Unless I'm missing something,
| you can do the same assignment that ctypes is used for here
| directly in numpy. But I might also be missing a lot - I'm still
| pre-coffee.
|
| On a different note, I'm surprised no one has mentioned Cython in
| this context. This is distinct from, but broadly related to
| things like @cython.boundscheck(False). Cython is _really_ handy,
| and it's a shame that it's kinda fallen out of favor in some ways
| lately.
| yosefk wrote:
| You need to modify the base pointer in this example, specifically
| to point 2 bytes before the current base pointer (moving back
| from the first red pixel value to the first blue value.) I
| don't think you can do it without ctypes, maybe I'm wrong.
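|
| (For the record, a sketch of the kind of ctypes move I mean,
| with a plain numpy array standing in for the surface memory:)
          import ctypes
          import numpy as np

          h, w = 4, 3
          bgra = np.zeros((h, w, 4), np.uint8)  # stand-in for surface memory
          rgb = bgra[:, :, 2::-1]               # RGB view, negative channel stride
          addr = rgb.__array_interface__['data'][0]  # points at the first R
          buf = (ctypes.c_uint8 * bgra.nbytes).from_address(addr - 2)
          bgr = np.frombuffer(buf, np.uint8).reshape(h, w, 4)[:, :, :3]
          # bgr aliases the same memory, with the base moved back to
          # the first B value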
|
| What does Cython do better than numbs except static
| compilation? Honest q, I know little about both
| jofer wrote:
| You actually can do the "offset by 2 bytes back" with a
| reshape + indexing + reshape back. But I suspect I'm still
| missing something. I need to read things in more depth and
| try it out locally.
|
| By "numbs" here, I'm assuming you mean numba. If that's not
| correct, I apologize in advance!
|
| There are several things where Cython is a better fit than
| numba (and to be fair, several cases where the opposite is
| true). There are two that stick out in my mind:
|
| First off, the optimizations for cython are explicit and a
| matter of how you implement the actual code. It's a separate
| language (technically a superset of python). There's no
| "magic". That's often a significant advantage in and of
| itself. Personally, I often find larger speedups with Cython,
| but then again, I'm more familiar with it, and understand
| what settings to turn on/off in different situations. Numba
| is much more of a black box given that it's a JIT compiler.
| With that said, it can also do things that Cython can't
| _because_ it's a JIT compiler.
|
| If you want to run python code as-is, then yeah, numba is the
| better choice. You won't see a speedup at all with Cython
| unless you change the way you've written things.
|
| The second key thing is one that's likely the most
| overlooked. Cython is arguably the best way to write a python
| wrapper around C code where you want to expose things as
| numpy arrays. It takes a _huge_ amount of the boilerplate
| out. That alone makes it worth learning, i.m.o.
| jofer wrote:
| Ah, right, you mean _outside_ the memory block of the
| array! Sorry, my mind was just foggy this morning.
|
| That's not strictly possible, but "circular" references are
| possible with "stride tricks". Those can accomplish similar
| things
| in some circumstances. But with that said, I don't think
| that would work in this case.
___________________________________________________________________
(page generated 2024-05-07 23:02 UTC)