[HN Gopher] Rewrite the VP9 codec library in Rust
       ___________________________________________________________________
        
       Rewrite the VP9 codec library in Rust
        
       Author : marcodiego
       Score  : 173 points
       Date   : 2024-02-28 13:39 UTC (9 hours ago)
        
 (HTM) web link (lore.kernel.org)
 (TXT) w3m dump (lore.kernel.org)
        
       | dureuill wrote:
       | > Much has been spoken at various occasions about drivers and I
       | feel that the consensus is to wait for now.
       | 
       | Interesting, I did not follow that development. I thought the
       | plan was to use Rust for some out-of-tree/optional drivers. What
       | changed?
        
         | charcircuit wrote:
         | It's already used for an intree driver, but a lot of the
         | infrastructure for making Rust drivers has not been upstreamed.
        
         | Macha wrote:
         | There was opposition to building interfaces for toy drivers and
         | the last thread had suggestions for a Rust filesystem interface
         | rebuffed by saying they should try to rewrite the ext2 driver in
         | rust to prove that it was usable for real filesystems rather
         | than toy ones. I'd guess similar thought processes fuelled this
         | decision.
        
       | Havoc wrote:
       | Seems like a reasonable way forward.
        
         | kramerger wrote:
         | Android started this way too with media libraries being updated
         | first.
        
       | cyber_kinetist wrote:
       | > These algorithms use the data received from userspace in order
       | to index into a lot of arrays and thus benefit from Rust's memory
       | safety.
       | 
       | If everything is done with arrays and indices (apparently, from
       | looking at the code:
       | https://gitlab.collabora.com/dwlsalmeida/for-upstream/-/blob...),
       | it seems like Rust's borrow checker doesn't really help at all,
       | and the only thing Rust really does for you is just bounds checks
       | on arrays (with additional runtime overhead)... So I'm not sure
       | how this can really improve the state of things compared to C
       | with the equivalent bounds checks.
        
         | phkahler wrote:
         | >> So I'm not sure how this can really improve the state of
         | things compared to C _with the equivalent bounds checks_.
         | [emphasis added]
         | 
         | Like all things Rust, you _can_ do the same thing in C but that
         | requires extra effort and more source code. If Rust will add
         | bounds checking for you automatically, that's a step up from
         | C.
        
           | bayindirh wrote:
           | If you don't need to do bounds checking, and doing it anyway,
           | then that's a step down from C.
        
             | mcfedr wrote:
             | But isn't the point that the number of times a C programmer
             | thought they didn't need bounds checks and the number of
             | times they really didn't are very different? Also off-by-one
             | errors and such. Rust won't let you make these mistakes.
        
               | bayindirh wrote:
               | I personally don't think that a programming language
               | should save me from myself unless I explicitly ask for
               | it. For every case I first prove to myself that I can get
               | away without bounds checking, otherwise if I'm in doubt,
               | I put it in and profile.
               | 
               | I think Rust is a nice language, but when it's presented
               | as an "antidote" to "evils of C/C++", and is "indeed made
               | to kill those" I lose my interest.
               | 
               | Also, while I think learning Rust is worthwhile, I think
               | promoting Rust's barriers as saviors is funny. We say
               | that Apple's against general purpose computing with all
               | its walled gardens. Then Rust is against "General purpose
               | programming". I want to be able to write programs which
               | crash and burn like meteors entering the atmosphere,
               | because this allows me to understand the hardware under
               | me. Promoting a walled programming language as "the one
               | and only" is wrong.
               | 
               | Why should I embrace a programming language as one and
               | only if that doesn't allow me to do what I want with my
               | computer?
               | 
               | Edit: No, unsafe doesn't count, because it doesn't remove
               | _all_ checks.
        
               | chatmasta wrote:
               | > Edit: No, unsafe doesn't count
               | 
               | Why not? You can basically write C with FFI and unsafe.
               | In fact that's how many "port to Rust" projects start.
               | 
               | It seems directly analogous to the fact that you can
               | disable SIP on macOS to escape their walled garden, into
               | an "I know what I'm doing" mode.
        
               | bayindirh wrote:
               | Because the borrow checker is always watching, even if
               | you're in an unsafe block. It's that paranoid.
        
               | bombela wrote:
               | Unsafe unlocks dereferencing raw pointers, which aren't
               | subject to the borrow checker.
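A minimal sketch of the distinction made above, assuming a hypothetical helper named `bump_twice`: raw pointers can alias freely because the borrow checker does not track them, whereas references are still checked as usual inside an `unsafe` block.

```rust
/// Hypothetical helper: increments `*value` through two aliasing raw pointers.
fn bump_twice(value: &mut u32) {
    // Casting a reference to a raw pointer is a safe operation. Copying the
    // pointer creates an alias that the borrow checker does not track.
    let p1 = value as *mut u32;
    let p2 = p1;

    // SAFETY: both pointers derive from a live, exclusive reference and are
    // only dereferenced inside this function.
    unsafe {
        *p1 += 1;
        *p2 += 1;
    }
}

fn main() {
    let mut v = 10;
    bump_twice(&mut v);
    println!("{}", v);
}
```

Writing the same thing with two simultaneous `&mut` references would be rejected at compile time, with or without `unsafe`; only the raw-pointer dereference needs the `unsafe` block.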
        
               | pjc50 wrote:
               | Memory safety has become important enough to be a matter
               | of national security:
               | https://www.whitehouse.gov/oncd/briefing-room/2024/02/26/pre...
               | 
               | > I want to be able to write programs which crash and
               | burn like meteors entering the atmosphere
               | 
               | Sure, fine, whatever, just as long as it's not used by
               | anyone else. Production code needs to be held to
               | standards.
        
               | bayindirh wrote:
               | I'm a big proponent of choosing the right tool for the
               | job at hand. On the other hand, I'm a big opponent of
               | throwing stones because of emotions.
               | 
               | C/C++ can be made memory safe. You need to use a couple
               | of data structures and need to be a little bit more
               | vigilant and make periodic tests. If the developers can't
               | bother to learn them, that's fine.
               | 
               | However, saying something is impossible and being adamant
               | about that without research is harmful as it is. I
               | sometimes say things about Rust, people correct me, and I
               | learn. Some people see the evidence contrary to their
               | beliefs and get triggered because they were wrong in the
               | first place. This is what I'm bothered about.
               | 
               | I don't use C or C++ only. I'm not against Rust either.
               | I'm against positioning of Rust and C/C++ from Rust
               | community's perspective. That's all.
               | 
               | Lastly, I'm not a proponent of reckless coding either.
               | Totally contrary. I take pride in writing robust code.
               | However, I want to be able to write code which breaks on
               | purpose to see what hardware does, to understand the
               | failure modes or gotchas of the architecture I'm running
               | on.
        
               | Too wrote:
               | > C/C++ can be made memory safe.
               | 
               | In theory, in small toy projects or with a massive NASA
               | budget - Yes. In any other project, with normal (aka too
               | short) time constraints and average skilled developers,
               | it's not.
               | 
               | 35 years of trying has proven that to us humans, several
               | times over.
        
               | bayindirh wrote:
               | I think it's much simpler than that, because I did it
               | myself, on a HPC scale high performance materials
               | simulation code. Is it simple? Yes. Is it easy? No,
               | because you need to design for that and be mindful during
               | implementation (const correctness, guarantees by design,
               | valgrind tests, unit sealing, etc.). I think it can be
               | made much simpler with smart pointers, etc. if speed is
               | not that important.
               | 
               | We push humans too much to develop things fast. C++ is
               | not very conducive to that, yet it's the only tool which
               | works in some cases.
               | 
               | I won't retype my views about Rust because it's all over
               | this thread. I'll just say that I'm against vilifying
               | C/C++ as evil because they can be held wrong. I believe
               | things can and should be able to be held wrong. Knowing
               | failure modes and shortcomings is a plus. Because you can
               | then hold dangerous things right, and appreciate things
               | which promote holding things right.
        
               | shakow wrote:
               | > C/C++ can be made memory safe
               | 
               | Famous last words.
        
               | pjc50 wrote:
               | > C/C++ can be made memory safe
               | 
               | .. but it's much harder to _prove_ your work is memory
               | safe. sel4 is memory safe C, for example. The safety is
               | achieved by a large external theorem prover and a synced
               | copy written in Haskell. https://github.com/seL4/l4v
               | 
               | Typechecks are a form of proof. It's easier to write
               | provably safe Rust than provably safe C because the
               | proofs and checker are integrated.
        
               | bayindirh wrote:
               | I have never claimed that it's easy. I said doing it is
               | simple, and prone to errors, and needs to be verified
               | either by design or tests, ideally both. I'm aware of
               | seL4. They're doing an amazing job of verifying what they
               | have written.
               | 
               | However, the reason I'm so adamant is that I have done
               | something similar myself, albeit in a weaker form, but at
               | least I verified that every part of my code is not doing
               | funny things by vigorously testing it in valgrind in
               | different scenarios both in units and end to end.
               | 
               | Again, I'm not against Rust. I'm against vilifying
               | languages.
        
               | dmos62 wrote:
               | Which do you think is more efficient in terms of CPU
               | resources, user-time, developer-time, money? Both
               | development and the final program taken into account. On
               | average.
        
               | bayindirh wrote:
               | Depends on what you're building. If the code you're
               | running is not resource intensive and relatively short-
               | running, developer time is more expensive.
               | 
               | However, if your program is long running and requires
               | high performance (number crunching, simulations, HPC in
               | general), %1 difference in tight loops affect your total
               | runtime by hours, if not days. Then, user-time is much
               | more expensive, hence you need more speed. Also, in this
               | case you're maxing out ~100 servers in terms of power and
               | TDP, so a shorter runtime has a bigger impact on your
               | energy bill, and global warming.
               | 
               | If I can run more users' code for the same power and time
               | budget, and conclude more research, developer time be
               | damned. They can spend as much time as they like.
               | 
               | Tech people tend to say developers are expensive and
               | hardware is cheap. No it's not, if you're using it at its
               | max capacity.
        
               | shakow wrote:
               | > %1 difference in tight loops
               | 
               | I'd love to see how you can reach this number.
               | 
               | I hear a lot of people complaining about supposedly
               | degraded performances due to bound checking; but IME,
               | even on number crunching HPC code, I have never been able
               | to get a signal greater than noise regarding bound
               | checks, which can be explained by: (i) the prediction
               | pipeline doing its job, (ii) iterators eliding bound-
               | checks at compile time, (iii) bound checking being
               | dwarfed by the actual computations within the tight loop.
               | 
               | Remember to measure what you optimize for first before
               | going on an intuition.
        
               | bayindirh wrote:
               | Disclaimer: I'm an HPC admin and both develop code on
               | these things and manage them.
               | 
               | The code I have written was doing ~1.7M iterations per
               | core, per second when I implemented it w/o bounds
               | checking and locks. It was designed to be fast from the
               | start, so I never tried bounds checking.
               | 
               | I'm restarting the work on the code soon-ish, so I'll be
               | writing a benchmark module for the thing. If you can
               | provide me an e-mail address, I'll implement both, do the
               | tests, and provide you the results, and we can discuss
               | them, too.
               | 
               | Also, I'll see whether GCC-14 (or whatever comes next) is
               | intelligent enough to eliminate bounds checks in these
               | cases.
               | 
               | The following part of the code [0], was running with much
               | higher iteration numbers inside the "tight loop", but I
               | never benchmarked it, because its iteration count is both
               | inconsistent (due to adaptive nature), and was
               | meaningless in the bigger picture (where 1.7M/sec/core
               | number comes in).
               | 
               | That code was never optimized before measurement, and the
               | biggest bottleneck was memory controller at the end. I
               | needed to reorder matrices to pass that hurdle, yet the
               | Ph.D. was complete, and speed was adequate, so we didn't
               | bother, TBH.
               | 
               | [0]:
               | https://journals.tubitak.gov.tr/elektrik/vol29/iss2/45/
        
               | sunshowers wrote:
               | I do think, when it comes to professional-grade software,
               | programming languages should save programmers from
               | themselves -- even the ones who don't want to be saved.
        
               | jeroenhd wrote:
               | Very little Rust code actually does all the safety checks
               | that you would expect a debug build of that same program
               | to do, especially in the kernel.
               | 
               | You can write safe rust (check the Option<T> returned by
               | vec.get(i)) but code like `p[0] =
               | update_prob(d[0].into(), p[0].into()) as u8;`
               | (https://gitlab.collabora.com/dwlsalmeida/for-upstream/-/comm...)
               | can panic at three different places.
               | Such a panic would become a kernel oops, which wouldn't
               | be the end of the world but it would probably kill
               | whatever program was trying to decode video. With
               | additional optimisation options, the bounds checking may
               | even be omitted entirely.
               | 
               | Rust does generate more accurate bounds checking warnings
               | thanks to all the metadata it has, but that should not be
               | solely relied upon. Rust will let you make those
               | mistakes, but only sometimes, not usually like in old C
               | or C++.
               | 
               | I think it's important to know the difference, because
               | feeling invulnerable to these bugs may lead you to write
               | buggy code because you stopped thinking about common C
               | bugs entirely.
               | 
               | Also worthy of note is that because of a compiler bug,
               | it's possible to leak memory and cause other weird memory
               | bugs in perfectly safe Rust at the moment. It involves
               | messing with lifetimes and semi-unsafe code so I doubt
               | that bug would just sneak in, but the language doesn't
               | make your code completely bullet proof.
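The checked-access style mentioned above (an `Option` returned instead of a panicking index expression) might look like the sketch below. `update_first` is a hypothetical stand-in and is not taken from the patch; it only illustrates how `first`/`first_mut` (equivalent to `get(0)`/`get_mut(0)`) turn an out-of-range access into a `None` that the caller has to handle.

```rust
/// Hypothetical helper: adds the first delta to the first probability.
/// Returns `None` instead of panicking when either slice is empty.
fn update_first(p: &mut [u8], d: &[u8]) -> Option<u8> {
    // `?` returns early with `None` if the element does not exist.
    let delta = *d.first()?;
    let slot = p.first_mut()?;
    // Wrapping arithmetic so the addition itself cannot panic either.
    *slot = slot.wrapping_add(delta);
    Some(*slot)
}

fn main() {
    let mut probs = [10u8, 20];
    let deltas = [5u8];
    match update_first(&mut probs, &deltas) {
        Some(updated) => println!("updated to {}", updated),
        None => println!("nothing to update"),
    }
}
```

Whether this is preferable to plain indexing depends on the call site: with fixed-size arrays and constant indices, the index expression cannot fail in the first place.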
        
               | KerrAvon wrote:
               | > can panic at three different places
               | 
               | Remember that some of the point here is that it _will_
               | reliably kill the program if that happens; in C, you
               | might be silently reading or writing to the wrong address
               | 3 times.
        
               | Mateon1 wrote:
               | The `single_ref` field is a fixed-size array in both of
               | the objects referenced in this line, so this line can't
               | panic, and no bounds checks are involved (since the
               | compiler sees the index < length at compile time and
               | doesn't even need to emit one -- although I think it
               | still does, and it's LLVM that gets rid of it actually)
               | 
               | Causing memory leaks is possible in safe Rust even
               | without any arcane invocations, you can construct a cycle
               | of Rc<T> counted objects. There's even a perfectly safe
               | Box::leak in the standard library that gives you a
               | &'static reference to any object by leaking it.
               | Preventing leaks is outside of the scope of Rust's safety
               | system.
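Both leak routes described above can be reproduced in a few lines of safe Rust. This is a minimal sketch; `Node`, `make_cycle` and `leak_static` are names made up for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical node type that can point at another node.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

/// Builds a two-node reference cycle and returns the strong count of the
/// first node just before the local handles are dropped.
fn make_cycle() -> usize {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    // Close the cycle: a -> b -> a.
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    // `a` is now owned by the local variable and by `b.next`. When the
    // locals go out of scope each count only drops to 1, so neither node
    // is ever freed.
    Rc::strong_count(&a)
}

/// Deliberately leaks a heap allocation to obtain a `'static` reference.
fn leak_static(x: u32) -> &'static mut u32 {
    Box::leak(Box::new(x))
}

fn main() {
    println!("strong count inside the cycle: {}", make_cycle());
    let r = leak_static(42);
    *r += 1;
    println!("leaked value: {}", r);
}
```

No `unsafe` appears anywhere, which is the point being made: leaking is considered memory safe, so the type system does not try to prevent it.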
        
             | estebank wrote:
             | If you can _assure_ that bounds checks are not necessary
             | (either by construction, because it's a statically sized
             | array, or by runtime check because you do a length check
             | once at runtime), then doing those same things will tell
             | rustc enough to know that bounds checks aren't
             | needed[1][2]. If you _think_ you don't need bounds checks,
             | but can't communicate that in code, such as with an
             | assertion (or if rustc had a bug that misses those checks,
             | unlikely but could happen), then yes, you'll end up with
             | bounds checks unless you use get_unchecked in an unsafe
             | block.
             | 
             | I'm failing to see how this is an onerous difference.
             | 
             | 1: https://nnethercote.github.io/perf-book/bounds-checks.html
             | 
             | 2: https://github.com/Shnatsel/bounds-check-cookbook/
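The two options described above, sketched under the assumption of a hypothetical function that sums the first four elements of a slice. The first variant does one length check up front so the optimizer has the fact it needs; the second opts out explicitly with `get_unchecked`. Whether the per-index checks are really removed in the first variant depends on the compiler version and optimization level, so it is worth confirming in the generated assembly.

```rust
/// One up-front check. After the assert, the optimizer can usually prove
/// that indices 0..4 are in bounds and drop the per-index checks.
fn sum_first_four(data: &[u32]) -> u32 {
    assert!(data.len() >= 4, "need at least four elements");
    data[0] + data[1] + data[2] + data[3]
}

/// Explicit opt-out. The length test is still written by hand; the
/// compiler simply trusts it.
fn sum_first_four_unchecked(data: &[u32]) -> Option<u32> {
    if data.len() < 4 {
        return None;
    }
    // SAFETY: the length was checked above, so indices 0..4 are in bounds.
    let sum = unsafe {
        *data.get_unchecked(0)
            + *data.get_unchecked(1)
            + *data.get_unchecked(2)
            + *data.get_unchecked(3)
    };
    Some(sum)
}

fn main() {
    let data = [1, 2, 3, 4, 5];
    println!("{}", sum_first_four(&data));
    println!("{:?}", sum_first_four_unchecked(&data[..3]));
}
```

The first form stays entirely in safe Rust, so a later refactor that breaks the invariant turns into a panic rather than an out-of-bounds read.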
        
             | Aurornis wrote:
             | > If you don't need to do bounds checking, and doing it
             | anyway, then that's a step down from C.
             | 
             | The Rust compiler tries to optimize away unnecessary bounds
             | checks.
             | 
             | In practice, it works well. The real-world cost of Rust
             | bounds checking isn't very significant in most benchmarks,
             | aside from some synthetic micro-benchmarks designed to
             | emphasize the issue.
             | 
             | If you come across a hot loop in Rust where bounds checking
             | is an actual overhead, you can manually optimize it out if
             | you so desire. It's important to really check first,
             | though, because it's often surprising that it makes such
             | little difference or has been optimized out already.
        
             | jcranmer wrote:
             | Who would you rather trust to do bounds checking, computers
             | who are zealously good at doing what they are told to do to
             | the point of absurdity or humans who are notoriously bad at
             | following rigorous procedures? I mean, we've seen from
             | several other fields that the only way to get the safety
             | standards of human procedures up is to introduce checklists
             | and get people to rigorously follow them [1].
             | 
             | If you want to elide bounds checks for performance reasons,
             | which is easier: manually verifying for yourself that every
             | single array access is guarded by a bounds check somewhere
             | and ensuring that no subsequent code changes break this
             | verification, or getting the compiler to prove for you that
             | every bounds check can be safely elided?
             | 
             | [1] And of course we still have several issues in fields
             | like medicine where practitioners refuse to adopt this
             | methodology because they find checklists to be an insult to
             | their intelligence.
        
               | bayindirh wrote:
               | > getting the compiler to prove for you that every bounds
               | check can be safely elided?
               | 
               | I'd prefer to delegate that to the compiler if it can do
               | that for that piece of code at hand. I've written about a
               | case where it'd be very hard for a compiler to eliminate
               | a bounds check because the guarantees are made elsewhere
               | in the code.
               | 
               | On the other hand, I'd rather add my bounds check
               | voluntarily (it's very simple in C++ vectors for example.
               | use ".at()" instead of "[]", that's all), because I
               | generally design my code so that it doesn't need bounds
               | checks: it fails hard and early at the places where I
               | build/fill the arrays/vectors, which are the places prone
               | to malformation. So, the data needs to be well-formed to
               | pass these checks, and these data structures are not
               | modified _ever_ down the pipe. If they are modified,
               | they'll be bounds checked, of course.
               | 
               | What I'm saying is, I'm not naive enough to believe that
               | I'm perfect, but I'm not naive enough to believe that
               | compiler is perfect, either. So, I do my part, and leave
               | the parts I can't be sure of to the compiler.
               | 
               | I'm not a hard-liner. I just want finer control on my
               | code, and take full responsibility if it crashes and
               | burns in a way it shouldn't, so plan and implement
               | accordingly.
        
         | bayindirh wrote:
         | I also don't think that we should be writing everything in Rust
         | blindly. If you can guarantee that you won't be accessing
         | outside of an array before entering a critical section, not
         | having bounds checking is actually a plus.
         | 
         | I have similar code where I can guarantee that I won't be
         | ever accessing outside the boundaries of arrays and vectors,
         | and that gives me a great performance boost.
        
           | __s wrote:
           | You can skip bounds checks in Rust using `unsafe`
        
           | ParetoOptimal wrote:
           | > If you can guarantee that you won't be accessing outside of
           | an array before entering a critical section
           | 
           | That is a huge IF though.
           | 
           | People getting those things wrong either initially or a
           | refactor invalidating this invariant is a huge source of
           | bugs.
        
             | bayindirh wrote:
             | Yes, I'm aware. However in most cases I know the size of
             | the array in the beginning and it's not modified by any
             | means (which is guarded by const correctness throughout the
             | code).
             | 
             | If I can't guarantee that, I use vectors and ".at()", which
             | does bound checking at runtime.
             | 
             | Generally I'm developing solo, so people mucking with what I do
             | is very rare, however I don't blindly believe myself
             | either.
        
               | mlsu wrote:
               | It makes sense that you don't see the value of Rust if
               | you spend most of your time developing solo.
               | 
               | The value of these checks is not just to reduce bugs in
               | production code. It's nice that that happens, but that's
               | not even the primary value of the borrow checker. The
               | primary value is that having these checks (lifetimes,
               | bounds checks, -- everything that makes Rust annoying to
               | write) makes refactoring on a shared codebase
               | significantly easier. This means that you are not
               | introducing bugs in a refactor, so you can refactor
               | faster, which means you can ditch bad architectures
               | sooner, which compounds and saves enormous amounts of
               | developer time and $$. And it's a knock-on effect that
               | _increases_ in value as the team grows larger.
               | 
               | Keeping track of lifetimes and bounds for a solo dev is
               | quite easy as you say. Keeping track of lifetimes and
               | bounds for the other dozen devs on my team? In all
               | external dependencies? Extremely difficult. Impossible in
               | a large codebase, actually, given the number of mem
               | safety bugs that appear even in mature C codebases. It's
               | collaboration that is the source of these bugs, that
               | these checks work to mitigate.
               | 
               | I sort of wondered why C did not have a package
               | management system, until I started working on a large C
               | codebase. There is a reason Rust has cargo and C does
               | not; it has nothing to do with whether or not someone
               | decided to write cargo and everything to do with Rust's
               | language features.
        
               | bayindirh wrote:
               | That's a different and refreshing perspective to look
               | from, thanks.
               | 
               | I'm aware that my view is somewhat biased because I'm a
               | solo dev who works on small to large projects by
               | myself, and things get exponentially harder as more
               | people mangle the same code base. That's very true.
               | 
               | What I was trying to highlight is basically that neither
               | C nor C++ is "free for all without recourse". Esp. C++ has
               | many features, but they're opt-in, where in Rust they're
               | opt-out.
               | 
               | Also many newer developers don't understand that
               | compilation used to take way longer in the olden days,
               | even with simpler languages and compilers. Hence, the thing
               | Rust is doing today was "impossible" in the older days.
               | 
               | For the last time, I think Rust is a nice language, and I
               | won't be annoyed by its limitations. What bothers me to
               | no end is vilifying other languages and pushing Rust as a
               | silver bullet and savior. Other than that, Rust is just
               | another tool which works for some things very well, and
               | not very well for others.
        
         | tinco wrote:
         | The bounds checks in Rust are implicit (i.e. part of the std
         | implementation), and get removed by the compiler if they're
         | unnecessary. I think that's a pretty great improvement over the
         | state of things in C.
         | 
         | And if you are convinced you don't need a bounds check and the
         | compiler does not remove it you can explicitly remove the
         | bounds check, provided you mark the access as unsafe. So Rust
         | is a strict improvement over C in this regard.
        
           | bayindirh wrote:
           | Asking out of interest: how does the compiler decide whether
           | it needs BC or not?
        
             | tinco wrote:
             | Through value tracking. It's actually LLVM that does this,
             | GCC probably does it as well, so in theory explicit bounds
             | checks in regular C code would also be removed by the
             | compiler.
             | 
             | How it works exactly I don't know, and apparently it's so
             | complex that it requires over 9000 lines of C++ to express:
             | 
             | https://github.com/llvm/llvm-project/blob/main/llvm/lib/Anal...
        
               | jcranmer wrote:
               | > How it works exactly I don't know
               | 
               | The idea is pretty simple. You can build a list of known
               | facts based on control flow, explicit __builtin_assumes,
               | and undefined behavior relations. For example, if you've
               | got this code:
               | 
               |     if (x < N) {
               |       // In this block, we know that x < N
               |     } else {
               |       // ... and in this block we know that x >= N!
               |     }
               | 
               | And on top of that, we can do some basic algebra. If we
               | know that x < N and N < 5, then we can infer that x < 5.
               | So if we see a comparison x < 5, we can then rewrite that
               | to true.
               | 
               | > and apparently it's so complex that it requires over
               | 9000 lines of C++ to express
               | 
               | The two main reasons for that are that a) there are a lot
               | of rules covering cases like "we know the result of
               | count_leading_zeroes can be no more than the number of
               | bits in an integer" and so forth, and b) this is doing a
               | lot more logic than just tracking integer comparisons:
               | there's tracking known-bits of integers, maximum possible
               | value, floating-point comparisons, pointer object
               | references.
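The same fact-tracking idea, carried over to a Rust bounds check as a rough sketch. `lookup` is a hypothetical function: inside the `if`, the known facts `x < n` and `n < 5` imply `x < 5`, which is exactly the condition the implicit bounds check on a five-element array would test, so the optimizer is in a position to fold that check away. Whether it actually does so is up to the optimizer and is not guaranteed by the language.

```rust
/// Hypothetical lookup guarded by two comparisons.
fn lookup(table: &[u8; 5], x: usize, n: usize) -> Option<u8> {
    if n < 5 && x < n {
        // Known here: x < n and n < 5, hence x < 5 == table.len().
        // The implicit check `x < table.len()` is therefore always true.
        Some(table[x])
    } else {
        None
    }
}

fn main() {
    let table: [u8; 5] = [10, 20, 30, 40, 50];
    println!("{:?}", lookup(&table, 2, 4));
    println!("{:?}", lookup(&table, 4, 4));
}
```

If the optimizer fails to make the inference, the only cost is a redundant comparison; the program's behavior is the same either way.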
        
               | jakubadamw wrote:
               | > And on top of that, we can do some basic algebra. If we
               | know that x < N and N < 5, then we can infer that x < 5.
               | So if we see a comparison x < 5, we can then rewrite that
               | to true.
               | 
               | Even better: if `x` and `N` are integers, then we can
               | infer that `x` < 4. :)
        
               | KMag wrote:
               | I think you mean x <= 4, right?
        
               | karamanolev wrote:
               | No, x < 4, within integers. If N < 5, then N <= 4. If x <
               | N, then x < 4.
        
             | orlp wrote:
             | An example of when it's not necessary:
             | 
             |     for i in 0..v.len() {
             |         v[i] += 1;
             |     }
             | 
             | Because the compiler can prove that i < v.len() due to the
             | loop condition, the bounds check gets eliminated.
        
               | bayindirh wrote:
               | Thanks!
               | 
               | My case is a bit outside that, so I don't think the
               | compiler can deduce that. I have a file format which
               | tells me the expected number of fields about a category,
               | and I throw an error & abort if the number is not exactly
               | that.
               | 
               | Also, these data structure fields are always sent in as
               | const variables, so they are never modified (making them
               | "sealed" in a sense), hence I don't need to bounds check
               | on arrays and vectors storing them.
        
               | ihattendorf wrote:
               | That sounds trivial enough that the compiler would remove
               | the bounds checks, assuming I'm understanding correctly
               | that you have a condition that validates the number of
               | fields at some point before an invalid access would
               | occur.
               | 
               | But if it's possible for someone to muck with the file
               | contents and lie about the number of fields which would
               | cause a bounds error, that's exactly what bounds checking
               | is supposed to avoid. So either bounds checks will be
               | removed, or they're necessary.
        
               | bayindirh wrote:
               | I think it won't be able to because the creation of these
               | data structures and consuming them is 3 files apart.
               | 
               | > But if it's possible for someone to muck with the file
               | contents and lie about the number of fields.
               | 
               | You can't. You can say you'll have 7, but provide 8. But
               | as soon as I encounter the 8th one during parsing,
               | everything aborts. Same for saying 7 and providing 6. If
               | the file ends after parsing 6th one, I say there's an
               | error in your file and abort. Everything has to check out
               | and has to be sane to be able to start. Otherwise you'll
               | get file format errors all day.
               | 
               | The rest of the pipeline is unattended completely. It's
               | bona fide number crunching (material simulation to be
               | exact), so speed is of the essence. Talking about >1.5
               | million iterations per second per core.
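
A minimal sketch of the "validate the field count once while parsing" approach described above, in Rust. The record format (whitespace-separated numbers), the constant `FIELDS`, and the names `ParseError`, `parse_record`, and `first_plus_last` are all assumptions for illustration, not the poster's actual format or code. Converting to a fixed-size array makes the validated length part of the type, so later code does not have to re-derive it.

```rust
use std::convert::TryInto;

/// Number of fields every record must carry (hypothetical value).
const FIELDS: usize = 7;

#[derive(Debug, PartialEq)]
enum ParseError {
    WrongFieldCount { expected: usize, found: usize },
    BadNumber,
}

/// Parse one whitespace-separated record, rejecting any count other than FIELDS.
fn parse_record(line: &str) -> Result<[f64; FIELDS], ParseError> {
    let values: Vec<f64> = line
        .split_whitespace()
        .map(|tok| tok.parse::<f64>().map_err(|_| ParseError::BadNumber))
        .collect::<Result<_, _>>()?;
    let found = values.len();
    // Vec<T> -> [T; N] succeeds only when the length is exactly N.
    let fixed: Result<[f64; FIELDS], Vec<f64>> = values.try_into();
    fixed.map_err(|_| ParseError::WrongFieldCount { expected: FIELDS, found })
}

/// The array length is part of the type, so the compiler knows that
/// every index below FIELDS is in range here.
fn first_plus_last(record: &[f64; FIELDS]) -> f64 {
    record[0] + record[FIELDS - 1]
}
```

A record declaring too few or too many fields is rejected at the parse step, and everything downstream only ever sees `[f64; FIELDS]`.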
        
               | aw1621107 wrote:
               | > I think it won't be able to because the creation of
               | these data structures and consuming them is 3 files
               | apart.
               | 
               | Strictly speaking I don't think the distance between
               | creation and consumption matters. It all comes down to
               | what the compiler is able to prove at the site where the
               | bounds check may go.
               | 
               | For example, if you're iterating over a Vec using `for i
               | in 0..vec.len() { ... }` then the amount of code between
               | the creation and consumption of that Vec doesn't matter,
               | as the compiler has all the information it needs to
               | eliminate the bounds check right there.
        
               | bayindirh wrote:
               | If that's a vector which you basically iterate, yes.
               | However, thinking what I developed, I have offset or
               | formula determined indexes I hit constantly, and not
               | strictly in a loop. They might prove harder. I need to
               | implement these and see what the compiler(s) do in these
               | cases.
               | 
               | The code I have written is 3D materials software which
               | works on >(3000x3000) matrices, and I do a lot of tricks
               | with these to get what I need from them. However, since
               | everything that creates them is validated during their
               | creation, nothing breaks and nothing requires checks,
               | because most of the data is read-only (enforced by
               | const correctness throughout the code).
        
               | aw1621107 wrote:
               | > However, thinking what I developed, I have offset or
               | formula determined indexes I hit constantly, and not
               | strictly in a loop. They might prove harder.
               | 
               | I think at that point it'll come down to the compiler's
               | value range analysis as well as how other parts of the
               | program affect inlining/etc. Hard to say exactly what
               | will happen.
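
For offset- or formula-determined indexes, one common pattern is to state the invariant once before the hot loop. The Rust sketch below shows two variants; `sum_window` and `column_sum` are hypothetical names, and whether the optimizer actually drops the remaining per-element checks depends on the compiler version and the surrounding code, so the generated assembly is the only reliable evidence.

```rust
/// Sum `len` elements starting at a computed offset.
/// Slicing once up front performs a single range check; the loop then
/// runs over a slice iterator, which needs no per-element index check.
fn sum_window(data: &[f64], offset: usize, len: usize) -> Option<f64> {
    let end = offset.checked_add(len)?;
    let window = data.get(offset..end)?; // one check for the whole window
    Some(window.iter().sum())
}

/// Sum one column of a row-major `rows x cols` matrix stored in a flat slice.
fn column_sum(matrix: &[f64], rows: usize, cols: usize, col: usize) -> f64 {
    // State the invariants once; they guarantee r * cols + col < matrix.len()
    // for every r < rows, which the optimizer may use to drop later checks.
    assert!(col < cols);
    assert_eq!(Some(matrix.len()), rows.checked_mul(cols));
    let mut total = 0.0;
    for r in 0..rows {
        total += matrix[r * cols + col];
    }
    total
}
```

Both functions stay fully safe: if the stated invariant is wrong, the result is `None` or a panic rather than an out-of-bounds read.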
        
           | faitswulff wrote:
           | shnatsel wrote a post on bounds checking performance
           | implications in Rust. Money quote:
           | 
           | > The real-world performance impact of bounds checks is
           | surprisingly low.
           | 
           | > The greatest impact I've ever seen on real-world code from
           | removing bounds checks alone was *15%,* but the typical gains
           | are in *1% to 3% range,* and even that only happens in code
           | that does a lot of number crunching.
           | 
           | > You can occasionally see greater impact (as we'll see
           | soon!) if removing bounds checks allows the compiler to
           | perform other optimizations.
           | 
           | > Still, performance of code that's not doing large amounts
           | of number crunching will probably [not be impacted by bounds
           | checks](https://blog.readyset.io/bounds-checks/) at all.
           | 
           | It is, of course, not universally applicable, so read the
           | post for full details:
           | https://shnatsel.medium.com/how-to-avoid-bounds-checks-in-ru...
        
         | woodruffw wrote:
         | > So I'm not sure how this can really improve the state of
         | things compared to C with the equivalent bounds checks.
         | 
         | The simplest answer here is that a compiler-introduced bounds
         | check is almost always better than a human one. Humans make
         | bounds errors, compilers generally don't.
         | 
         | The longer answer is that the current state of the code does
         | not guarantee its future state. The fact that bounds checks
         | constitute the current majority of safety guardrails does not
         | mean that future refactors won't benefit from Rust's temporal
         | memory safety guarantees. Or more abstractly: it's easier to
         | perform _safe_ refactors when your safety properties compose
         | natively, rather than having to bolt another layer of checks
         | onto a pre-existing language that doesn't support them natively.
         | 
         | Edit: Forgot to mention: another benefit of bounds checking in
         | the language itself is optimization: when humans bounds-check,
         | the compiler needs to recognize human patterns to safely remove
         | or merge redundant bounds checks. When the language specifies
         | its own bounds checks, the compiler knows exactly what they'll
         | look like and can optimize accordingly. Modern optimizing
         | compilers are _very_ good at detecting human-written bounds,
         | but a fully compiler-controlled optimization is going to beat a
         | human-augmented optimization >95% of the time.
        
         | pton_xd wrote:
         | > If everything is done with arrays and indices (apparently
         | from looking at the code:
         | (https://gitlab.collabora.com/dwlsalmeida/for-
         | upstream/-/blob...) it seems like Rust's borrow checker doesn't
         | really help at all
         | 
         | That's the Rust "secret" in many high-performance computing
         | applications, like games. You write everything using arenas and
         | handles, which effectively side-steps the borrow checker.
         | Everyone sane has been doing that for decades in C and C++.
         | 
         | Obviously Rust has far better guarantees in general, but the
         | pervasive usage of this borrow checker anti-pattern suggests
         | that perhaps we need a more comprehensive way to guarantee
         | memory safety.
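
A minimal sketch of the arena-and-handle pattern mentioned above, in Rust. `Arena` and `Handle` are hypothetical names, and this version never removes items. Because a handle is just an index, it carries no borrow, so the borrow checker has nothing to say about it; the only remaining protection is the bounds check in `get`.

```rust
/// Handle into an arena: just an index, so it carries no borrow.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle(usize);

/// Minimal arena; items live as long as the arena and are never removed.
struct Arena<T> {
    items: Vec<T>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { items: Vec::new() }
    }

    /// Store a value and return the handle that refers to it.
    fn alloc(&mut self, value: T) -> Handle {
        self.items.push(value);
        Handle(self.items.len() - 1)
    }

    /// Bounds-checked lookup: an out-of-range handle yields None.
    fn get(&self, handle: Handle) -> Option<&T> {
        self.items.get(handle.0)
    }

    /// Bounds-checked mutable lookup.
    fn get_mut(&mut self, handle: Handle) -> Option<&mut T> {
        self.items.get_mut(handle.0)
    }
}
```

Note that a handle taken from one arena and used with another is still accepted if it happens to be in range, which is the kind of logic error this pattern cannot rule out.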
        
           | giovannibonetti wrote:
           | I remember Zig is able to convert arrays of structs to structs
           | of arrays at compile time [1], which effectively sidesteps
           | all the need for the user to worry about array indices and
           | having them in the right range.
           | 
           | [1] https://zig.news/kristoff/struct-of-arrays-soa-in-zig-easy-i...
        
             | aw1621107 wrote:
             | > which effectively sidesteps all the need for the user to
             | worry about array indices and having them in the right
             | range.
             | 
             | How does converting AoS to SoA eliminate the need to worry
             | about array indices? If you have an array of structs with N
             | entities and convert that to a struct of arrays each array
             | would also have N entities, so out-of-bounds accesses in
             | one would be equally out-of-bounds in the other.
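
To make the point above concrete, here is a small Rust sketch with hand-written AoS and SoA layouts. `Point`, `Points`, and `to_soa` are hypothetical names, and this is not how Zig's compile-time conversion works. Both layouts hold N entries, so an index that is out of bounds for one is out of bounds for the other.

```rust
/// Array-of-structs element.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: f32,
    y: f32,
}

/// Struct-of-arrays layout for the same data.
struct Points {
    xs: Vec<f32>,
    ys: Vec<f32>,
}

/// Convert AoS to SoA; each resulting array has exactly `aos.len()` entries.
fn to_soa(aos: &[Point]) -> Points {
    Points {
        xs: aos.iter().map(|p| p.x).collect(),
        ys: aos.iter().map(|p| p.y).collect(),
    }
}

impl Points {
    /// Reassemble entry `i`; None when `i` is out of bounds in either array.
    fn get(&self, i: usize) -> Option<Point> {
        Some(Point {
            x: *self.xs.get(i)?,
            y: *self.ys.get(i)?,
        })
    }
}
```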
        
         | HackerThemAll wrote:
         | > So I'm not sure
         | 
         | Then sit back and watch others try it out. Let's see what comes
         | out of it, the Rust code can always be discarded if it turns
         | out to be inferior to the current implementation. It's not like
         | breaking a glass, we can reverse it.
        
           | pjc50 wrote:
           | Oh, they're not afraid of this failing, they're afraid of it
           | succeeding.
        
         | gattr wrote:
         | As already mentioned, bounds checks won't necessarily cause
         | that much overhead. When I rewrote my small image processing
         | library from C to Rust ([1]), I only had to use unchecked array
         | access in one hot loop to get overall performance equivalent to
         | C code.
         | 
         | [1] https://github.com/GreatAttractor/libskry_r
        
         | pornel wrote:
         | Vulnerabilities are where programmers thought the bounds checks
         | were redundant, and they weren't. This overprotectiveness by
         | default turns out to be useful:
         | 
         | https://github.com/rust-fuzz/trophy-case
         | 
         | Look how many of the crashes are panics and unwraps that could
         | have been buffer overflows or wild pointer dereferences
         | otherwise. And there are plenty of arithmetic overflows that
         | are much less dangerous when they can't cause out of bounds
         | access.
         | 
         | The code in this particular codec seems to be a direct
         | translation of C code. Idiomatic Rust code would use iterators
         | more, which work better for optimizing out redundant checks.
         | It's easily fixable.
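
A small Rust sketch of the iterator point above, using a made-up operation (element-wise add) rather than anything from the VP9 code. `add_indexed` and `add_iter` are hypothetical names. `wrapping_add` is used only so the example does not panic on overflow in debug builds.

```rust
/// Index-based version, as in a direct translation from C.
/// Each `dst[i]` and `src[i]` is a potential bounds check that the
/// optimizer has to prove redundant.
fn add_indexed(dst: &mut [i32], src: &[i32]) {
    let n = dst.len().min(src.len());
    for i in 0..n {
        dst[i] = dst[i].wrapping_add(src[i]);
    }
}

/// Iterator-based version: `zip` stops at the shorter slice, so there
/// is no index to check in the loop body at all.
fn add_iter(dst: &mut [i32], src: &[i32]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d = d.wrapping_add(*s);
    }
}
```

Both functions compute the same result; the difference is only in how much the compiler has to prove to avoid per-element checks.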
        
       | xmichael909 wrote:
       | if only there were a standalone player for linux, osx or
       | windows...
        
       | trimbo wrote:
       | > This patch ports the VP9 library written by Andrzej into Rust
       | as a proof-of-concept ... > this library will not need any
       | further updates for the same reason we have never touched its C
       | counterpart
       | 
       | What was being proof-of-concepted? What's the metric of success
       | for introducing Rust in a case where no one was doing any sort of
       | active work anyway?
        
         | uo21tp5hoyg wrote:
         | From what I understand, the need for a "proof-of-concept" comes
         | from the fact that these codecs/drivers often use memory-unsafe
         | "tricks" to increase performance and therefore need to be
         | properly tested on a myriad of hardware to make sure the
         | conversion to memory-safe code doesn't have a significant
         | performance impact.
        
       | vladimirralev wrote:
       | The VP9 codec I imagine would heavily benefit from SIMD? Just
       | SIMD should massively outperform any advantage gained by putting
       | this into the kernel. Why does the kernel need an unoptimized
       | codec?
        
         | ZeroCool2u wrote:
         | SIMD can be pretty architecture specific. For example, does the
         | CPU support AVX-512 or SSE3? So, you have to have a few code
         | paths if you're going to support a wide variety of hardware.
         | 
         | I don't have an answer to your question; it just occurs to me
         | that maybe the Linux kernel doesn't allow a lot of SIMD for
         | this reason or maybe they require a fall back/slow code path to
         | be available if you're submitting a patch that includes SIMD
         | operations?
         | 
         | Curious if anyone knows the answer to this.
        
           | __s wrote:
           | Kernel avoids SIMD because it doesn't have a kernel keeping
           | its registers coherent https://stackoverflow.com/a/46677815
        
             | ZeroCool2u wrote:
             | Thank you for the answer! Very interesting.
        
           | brundolf wrote:
           | Fwiw, Rust's standard library has a cross-platform Simd
           | abstraction: https://doc.rust-lang.org/std/simd/index.html
           | 
           | It's nightly-only for now, but I've used it and it's lovely.
           | I've added Simd to projects that I never otherwise would
           | have, just because this made it so easy and accessible
        
             | ZeroCool2u wrote:
             | That is metal af ;)
        
         | pavon wrote:
         | I don't think that the driver implements VP9 in software. It
         | uses hardware acceleration to perform the actual encode/decode,
         | but is managing preparing of coefficients and DMA of data
         | buffers in the driver.
        
       | briantakita wrote:
       | I have a kernel: BUG: kernel NULL pointer dereference, address:
       | 0000000000000027 error on the latest kernel v6.7.6
       | 
       | I wonder if this conversion to Rust is bringing up these sorts
       | of issues...having to deal with a C/Rust hybrid. Or is this
       | another reason why Rust should be used for the kernel.
       | 
       | https://bbs.archlinux.org/viewtopic.php?id=293042
        
       | renewiltord wrote:
       | Interesting. Use case is so that there is some code exercising
       | this path.
       | 
       | While the algorithm is cool, of course, the code itself is quite
       | straightforward as it is. An interesting thing I didn't know is
       | that the kernel code avoids recursion (this one has a depth
       | parameter to prevent recursion past some point).
       | 
       | I can see why this was picked as a candidate. Straightforward
       | implementation. Good test suite. Self-contained and not a moving
       | target.
       | 
       | Lots of the code is using the coefficients array against the
       | framecontext but I don't know how to enforce the bounds invariant
       | that the two are the same. For the fixed size arrays I could see
       | how it's done, but otherwise it seems like the bounds checker
       | will trigger. But I'm reading on my phone so maybe that's just a
       | misread.
       | 
       | Not a big performance hit even if it does. But perhaps it
       | doesn't, and I'd be curious why.
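
On the question of enforcing that two arrays have the same length, here is a hedged Rust sketch of two general options. It is not taken from the patch: `merge_fixed`, `merge_dynamic`, and `LengthMismatch` are hypothetical names, and the element-wise `max` is an arbitrary stand-in for whatever the real code does with the coefficients.

```rust
/// Error returned when two slices were expected to have the same length.
#[derive(Debug, PartialEq)]
struct LengthMismatch {
    left: usize,
    right: usize,
}

/// Fixed-size case: the shared length N is part of both types, so passing
/// arrays of different lengths is a compile error, and every `i < N` is
/// known to be in range for both arrays.
fn merge_fixed<const N: usize>(ctx: &mut [u8; N], coeffs: &[u8; N]) {
    for i in 0..N {
        ctx[i] = ctx[i].max(coeffs[i]);
    }
}

/// Dynamic case: check the invariant once, then iterate in lockstep
/// so the loop body contains no indexing.
fn merge_dynamic(ctx: &mut [u8], coeffs: &[u8]) -> Result<(), LengthMismatch> {
    if ctx.len() != coeffs.len() {
        return Err(LengthMismatch {
            left: ctx.len(),
            right: coeffs.len(),
        });
    }
    for (c, k) in ctx.iter_mut().zip(coeffs) {
        *c = (*c).max(*k);
    }
    Ok(())
}
```

The fixed-size variant matches the "for the fixed size arrays I could see how it's done" case; the dynamic variant trades the compile-time guarantee for a single run-time check.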
        
       | revskill wrote:
       | I think the downside of Rust vs C could be "over-checking"?
        
       ___________________________________________________________________
       (page generated 2024-02-28 23:01 UTC)