[HN Gopher] Rewrite the VP9 codec library in Rust
___________________________________________________________________
Rewrite the VP9 codec library in Rust
Author : marcodiego
Score : 173 points
Date : 2024-02-28 13:39 UTC (9 hours ago)
(HTM) web link (lore.kernel.org)
(TXT) w3m dump (lore.kernel.org)
| dureuill wrote:
| > Much has been spoken at various occasions about drivers and I
| feel that the consensus is to wait for now.
|
| Interesting, I did not follow that development. I thought the
| plan was to use Rust for some out-of-tree/optional drivers. What
| changed?
| charcircuit wrote:
| It's already used for an in-tree driver, but a lot of the
| infrastructure for making Rust drivers has not been upstreamed.
| Macha wrote:
| There was opposition to building interfaces for toy drivers and
| the last thread had suggestions of a rust filesystem interface
| rebuffed by saying they should try to rewrite the ext2 driver in
| rust to prove that it was usable for real filesystems rather
| than toy ones. I'd guess similar thought processes fuelled this
| decision.
| Havoc wrote:
| Seems like a reasonable way forward.
| kramerger wrote:
| Android started this way too with media libraries being updated
| first.
| cyber_kinetist wrote:
| > These algorithms use the data received from userspace in order
| to index into a lot of arrays and thus benefit from Rust's memory
| safety.
|
| If everything is done with arrays and indices (apparently from
| looking at the code:
| (https://gitlab.collabora.com/dwlsalmeida/for-upstream/-/blob...)
| it seems like Rust's borrow checker doesn't really help at all,
| and the only thing Rust really does for you is just bounds checks
| on arrays (with additional runtime overhead)... So I'm not sure
| how this can really improve the state of things compared to C
| with the equivalent bounds checks.
| phkahler wrote:
| >> So I'm not sure how this can really improve the state of
| things compared to C _with the equivalent bounds checks_.
| [emphasis added]
|
| Like all things Rust, you _can_ do the same thing in C but that
| requires extra effort and more source code. If Rust will add
| bounds checking for you automatically, that's a step up from
| C.
| bayindirh wrote:
| If you don't need to do bounds checking, and doing it anyway,
| then that's a step down from C.
| mcfedr wrote:
| But isn't the point that the number of times a C programmer
| thought they didn't need bounds checks and reality are very
| different. Also off by one errors and such. Rust won't let
| you make these mistakes
| bayindirh wrote:
| I personally don't think that a programming language
| should save me from myself unless I explicitly ask for
| it. For every case I first prove to myself that I can get
| away without bounds checking, otherwise if I'm in doubt,
| I put it in and profile.
|
| I think Rust is a nice language, but when it's presented
| as an "antidote" to "evils of C/C++", and is "indeed made
| to kill those" I lose my interest.
|
| Also, while I think learning Rust is worthwhile, I think
| promoting Rust's barriers as saviors is funny. We say
| that Apple's against general purpose computing with all
| its walled gardens. Then Rust is against "General purpose
| programming". I want to be able to write programs which
| crash and burn like meteors entering the atmosphere,
| because this allows me to understand the hardware under
| me. Promoting a walled programming language as "the one
| and only" is wrong.
|
| Why should I embrace a programming language as one and
| only if that doesn't allow me to do what I want with my
| computer?
|
| Edit: No, unsafe doesn't count, because it doesn't remove
| _all_ checks.
| chatmasta wrote:
| > Edit: No, unsafe doesn't count
|
| Why not? You can basically write C with FFI and unsafe.
| In fact that's how many "port to Rust" projects start.
|
| It seems directly analogous to the fact that you can
| disable SIP on macOS to escape their walled garden, into
| an "I know what I'm doing" mode.
| bayindirh wrote:
| Because the borrow checker is always watching. Even when
| you're in an unsafe block. It's that paranoid.
| bombela wrote:
| Unsafe unlocks dereferencing raw pointers, which aren't
| subject to the borrow checker.
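A minimal sketch of the distinction being argued here, assuming a hypothetical helper named `bump_twice`: inside `unsafe`, references are still borrow-checked (two live `&mut` to the same value would still be rejected), but raw pointers can be copied and dereferenced freely, with the programmer taking over the proof obligation.

```rust
/// Hypothetical helper: increments a value twice through two aliasing raw
/// pointers. Two live `&mut i32` to the same value would not compile, even
/// inside an `unsafe` block.
fn bump_twice(value: &mut i32) {
    let p1: *mut i32 = value;
    let p2: *mut i32 = p1; // raw pointers are Copy; aliasing copies are allowed
    // SAFETY: both pointers come from one live, exclusive `&mut i32`, and the
    // writes happen sequentially on a single thread.
    unsafe {
        *p1 += 1;
        *p2 += 1;
    }
}

fn main() {
    let mut x = 40;
    bump_twice(&mut x);
    println!("{}", x); // prints 42
}
```

The compiler no longer checks the aliasing here; the `SAFETY` comment is the only thing standing in for that check.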
| pjc50 wrote:
| Memory safety has become important enough to be a matter
| of national security:
| https://www.whitehouse.gov/oncd/briefing-
| room/2024/02/26/pre...
|
| > I want to be able to write programs which crash and
| burn like meteors entering the atmosphere
|
| Sure, fine, whatever, just as long as it's not used by
| anyone else. Production code needs to be held to
| standards.
| bayindirh wrote:
| I'm a big proponent of choosing the right tool for the
| job at hand. On the other hand, I'm a big opponent of
| throwing stones because of emotions.
|
| C/C++ can be made memory safe. You need to use a couple
| of data structures and need to be a little bit more
| vigilant and make periodic tests. If the developers can't
| bother to learn them, that's fine.
|
| However, saying something is impossible and being adamant
| about that without research is harmful as it is. I
| sometimes say things about Rust, people correct me, and I
| learn. Some people see the evidence contrary to their
| beliefs and get triggered because they were wrong in the
| first place. This is what I'm bothered about.
|
| I don't use C or C++ only. I'm not against Rust either.
| I'm against positioning of Rust and C/C++ from Rust
| community's perspective. That's all.
|
| Lastly, I'm not a proponent of reckless coding either.
| Totally contrary. I take pride in writing robust code.
| However, I want to be able to write code which breaks on
| purpose to see what hardware does, to understand the
| failure modes or gotchas of the architecture I'm running
| on.
| Too wrote:
| > C/C++ can be made memory safe.
|
| In theory, in small toy projects or with a massive NASA
| budget - Yes. In any other project, with normal (aka too
| short) time constraints and average skilled developers,
| it's not.
|
| 35 years of trying has proven that to us humans several
| times over.
| bayindirh wrote:
| I think it's much simpler than that, because I did it
| myself, on an HPC-scale high performance materials
| simulation code. Is it simple? Yes. Is it easy? No,
| because you need to design for that and be mindful during
| implementation (const correctness, guarantees by design,
| valgrind tests, unit sealing, etc.). I think it can be
| made much simpler with smart pointers, etc. if speed is
| not that important.
|
| We push humans too much to develop things fast. C++ is
| not very conducive to that, yet it's the only tool which
| works in some cases.
|
| I won't retype my views about Rust because it's all over
| this thread. I'll just say that I'm against vilifying
| C/C++ as evil because they can be held wrong. I believe
| things can and shall be able to be held wrong. Knowing
| failure modes and shortcomings is a plus. Because you can
| then hold dangerous things right, and appreciate things
| which promote holding things right.
| shakow wrote:
| > C/C++ can be made memory safe
|
| Famous last words.
| pjc50 wrote:
| > C/C++ can be made memory safe
|
| .. but it's much harder to _prove_ your work is memory
| safe. sel4 is memory safe C, for example. The safety is
| achieved by a large external theorem prover and a synced
| copy written in Haskell. https://github.com/seL4/l4v
|
| Typechecks are a form of proof. It's easier to write
| provably safe Rust than provably safe C because the
| proofs and checker are integrated.
| bayindirh wrote:
| I have never claimed that it's easy. I said doing it is
| simple, and prone to errors, and needs to be verified
| either by design or tests, ideally both. I'm aware of
| seL4. They're doing an amazing job of verifying what they
| have written.
|
| However, the reason I'm so adamant is that I have done
| something similar myself, albeit in a weaker form, but at
| least I verified that every part of my code is not doing
| funny things by vigorously testing it in valgrind in
| different scenarios both in units and end to end.
|
| Again, I'm not against Rust. I'm against vilifying
| languages.
| dmos62 wrote:
| Which do you think is more efficient in terms of CPU
| resources, user-time, developer-time, money? Both
| development and the final program taken into account. On
| average.
| bayindirh wrote:
| Depends on what you're building. If the code you're
| running is not resource intensive and relatively short-
| running, developer time is more expensive.
|
| However, if your program is long running and requires
| high performance (number crunching, simulations, HPC in
| general), a %1 difference in tight loops affects your total
| runtime by hours, if not days. Then, user-time is much
| more expensive, hence you need more speed. Also, in this
| case you're maxing out ~100 servers in terms of power and
| TDP, so a shorter runtime has a bigger impact on your
| energy bill, and global warming.
|
| If I can run more users' code for the same power and time
| budget, and conclude more research, developer time be
| damned. They can spend as much time as they like.
|
| Tech people tend to say developers are expensive and
| hardware is cheap. No it's not, if you're using it at its
| max capacity.
| shakow wrote:
| > %1 difference in tight loops
|
| I'd love to see how you can reach this number.
|
| I hear a lot of people complaining about supposedly
| degraded performances due to bound checking; but IME,
| even on number crunching HPC code, I have never been able
| to get a signal greater than noise regarding bound
| checks, which can be explained by: (i) the prediction
| pipeline doing its job, (ii) iterators eliding bound-
| checks at compile time, (iii) bound checking being
| dwarfed by the actual computations within the tight loop.
|
| Remember to measure what you optimize for first before
| going on an intuition.
| bayindirh wrote:
| Disclaimer: I'm an HPC admin and both develop code on
| these things and manage them.
|
| The code I have written was doing ~1.7M iterations per
| core, per second when I implemented it w/o bounds
| checking and locks. It was designed to be fast from the
| start, so I never tried bounds checking.
|
| I'm restarting the work on the code soon-ish, so I'll be
| writing a benchmark module for the thing. If you can
| provide me an e-mail address, I'll implement both, do the
| tests, and provide you the results, and we can discuss on
| it, too.
|
| Also, I'll see whether GCC-14 (or whatever comes next) is
| intelligent enough to eliminate bounds checks in these
| cases.
|
| The following part of the code [0], was running with much
| higher iteration numbers inside the "tight loop", but I
| never benchmarked it, because its iteration count is both
| inconsistent (due to adaptive nature), and was
| meaningless in the bigger picture (where 1.7M/sec/core
| number comes in).
|
| That code was never optimized before measurement, and the
| biggest bottleneck was memory controller at the end. I
| needed to reorder matrices to pass that hurdle, yet the
| Ph.D. was complete, and speed was adequate, so we didn't
| bother, TBH.
|
| [0]:
| https://journals.tubitak.gov.tr/elektrik/vol29/iss2/45/
| sunshowers wrote:
| I do think, when it comes to professional-grade software,
| programming languages should save programmers from
| themselves -- even the ones who don't want to be saved.
| jeroenhd wrote:
| Very little Rust code actually does all the safety checks
| that you would expect a debug build of that same program
| to do, especially in the kernel.
|
| You can write safe rust (check the Option<T> returned by
| vec.get(i)) but code like `p[0] =
| update_prob(d[0].into(), p[0].into()) as u8;`
| (https://gitlab.collabora.com/dwlsalmeida/for-
| upstream/-/comm...) can panic at three different places.
| Such a panic would become a kernel oops, which wouldn't
| be the end of the world but it would probably kill
| whatever program was trying to decode video. With
| additional optimisation options, the bounds checking may
| even be omitted entirely.
|
| Rust does generate more accurate bounds checking warnings
| thanks to all the metadata it has, but that should not be
| solely relied upon. Rust will let you make those
| mistakes, but only sometimes, not usually like in old C
| or C++.
|
| I think it's important to know the difference, because
| feeling invulnerable to these bugs may lead you to write
| buggy code because you stopped thinking about common C
| bugs entirely.
|
| Also worthy of note is that because of a compiler bug,
| it's possible to leak memory and cause other weird memory
| bugs in perfectly safe Rust at the moment. It involves
| messing with lifetimes and semi-unsafe code so I doubt
| that bug would just sneak in, but the language doesn't
| make your code completely bullet proof.
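A short sketch of the checked-access style mentioned above, using a hypothetical `update_at` helper that is not taken from the patch: `get` and `get_mut` return an `Option`, so an out-of-range index becomes a value the caller must handle rather than a panic.

```rust
/// Hypothetical helper (not from the patch): adds `deltas[i]` to `probs[i]`,
/// returning `None` instead of panicking when either index is out of range.
fn update_at(probs: &mut [u8], deltas: &[u8], i: usize) -> Option<u8> {
    let delta = *deltas.get(i)?; // checked read, no panic path
    let slot = probs.get_mut(i)?; // checked mutable access, no panic path
    *slot = slot.saturating_add(delta); // saturate instead of overflowing
    Some(*slot)
}

fn main() {
    let mut probs = [250u8, 7];
    let deltas = [10u8, 1];
    println!("{:?}", update_at(&mut probs, &deltas, 0)); // Some(255)
    println!("{:?}", update_at(&mut probs, &deltas, 5)); // None
}
```

Plain indexing (`probs[i]`) is still memory safe, but its failure mode is a panic, which in kernel context is the oops described above.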
| KerrAvon wrote:
| > can panic at three different places
|
| Remember that some of the point here is that it _will_
| reliably kill the program if that happens; in C, you
| might be silently reading or writing to the wrong address
| 3 times.
| Mateon1 wrote:
| The `single_ref` field is a fixed-size array in both of
| the objects referenced in this line, so this line can't
| panic, and no bounds checks are involved (since the
| compiler sees the index < length at compile time and
| doesn't even need to emit one -- although I think it
| still does, and it's LLVM that gets rid of it actually)
|
| Causing memory leaks is possible in safe Rust even
| without any arcane invocations, you can construct a cycle
| of Rc<T> counted objects. There's even a perfectly safe
| Box::leak in the standard library that gives you a
| &'static reference to any object by leaking it.
| Preventing leaks is outside of the scope of Rust's safety
| system.
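Both leak routes described above can be sketched in a few lines of safe Rust. The `Node` type and function names are illustrative only; `Rc`, `RefCell` and `Box::leak` are standard library items.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Illustrative node type that can point at another node.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

/// Builds a two-node reference cycle and returns the strong count of the
/// first node. Neither node is freed when the locals go out of scope,
/// because each keeps the other's count above zero.
fn make_cycle() -> usize {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    Rc::strong_count(&a) // 2: the local `a` plus the clone held by `b`
}

/// Deliberately leaks a heap allocation to obtain a `'static` reference.
fn leaked_greeting() -> &'static str {
    Box::leak(String::from("leaked").into_boxed_str())
}

fn main() {
    println!("strong count inside the cycle: {}", make_cycle());
    println!("{}", leaked_greeting());
}
```

No `unsafe` appears anywhere, which is the point: leaking is considered safe, just wasteful.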
| estebank wrote:
| If you can _assure_ that bounds checks are not necessary
| (either by construction, because it's a statically sized
| array, or by runtime check because you do a length check
| once at runtime), then doing those same things will tell
| rustc enough to know that bounds checks aren't
| needed[1][2]. If you _think_ you don't need bounds checks,
| but can't communicate that in code, such as with an
| assertion (or if rustc had a bug that misses those checks,
| unlikely but could happen), then yes, you'll end up with
| bounds checks unless you use get_unchecked in an unsafe
| block.
|
| I'm failing to see how this is an onerous difference.
|
| 1: https://nnethercote.github.io/perf-book/bounds-
| checks.html
|
| 2: https://github.com/Shnatsel/bounds-check-cookbook/
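A rough sketch of the two options described above, with hypothetical function names. The first does one length check up front (by slicing) so that the per-element checks become provably redundant; whether the optimizer actually removes them depends on the compiler version and optimization level. The second opts out explicitly with `get_unchecked`.

```rust
/// Sums the first `n` elements. Slicing performs a single bounds check
/// (it panics if `n > values.len()`); after that, `i < n == head.len()`
/// is visible to the optimizer, so the per-element checks can usually go.
fn sum_first(values: &[u32], n: usize) -> u32 {
    let head = &values[..n];
    let mut total = 0u32;
    for i in 0..n {
        total = total.wrapping_add(head[i]);
    }
    total
}

/// Same result, but the per-element check is removed by hand.
fn sum_first_unchecked(values: &[u32], n: usize) -> u32 {
    assert!(n <= values.len());
    let mut total = 0u32;
    for i in 0..n {
        // SAFETY: i < n <= values.len(), guaranteed by the assertion above.
        total = total.wrapping_add(unsafe { *values.get_unchecked(i) });
    }
    total
}

fn main() {
    let data = [1, 2, 3, 4];
    println!("{} {}", sum_first(&data, 3), sum_first_unchecked(&data, 3)); // 6 6
}
```

If a later refactor drops the `assert!`, the second version becomes undefined behavior on bad input while the first still panics, which is the trade-off under discussion.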
| Aurornis wrote:
| > If you don't need to do bounds checking, and doing it
| anyway, then that's a step down from C.
|
| The Rust compiler tries to optimize away unnecessary bounds
| checks.
|
| In practice, it works well. The real-world cost of Rust
| bounds checking isn't very significant in most benchmarks,
| aside from some synthetic micro-benchmarks designed to
| emphasize the issue.
|
| If you come across a hot loop in Rust where bounds checking
| is an actual overhead, you can manually optimize it out if
| you so desire. It's important to really check first,
| though, because it's often surprising that it makes such
| little difference or has been optimized out already.
| jcranmer wrote:
| Who would you rather trust to do bounds checking, computers
| who are zealously good at doing what they are told to do to
| the point of absurdity or humans who are notoriously bad at
| following rigorous procedures? I mean, we've seen from
| several other fields that the only way to get the safety
| standards of human procedures up is to introduce checklists
| and get people to rigorously follow them [1].
|
| If you want to elide bounds checks for performance reasons,
| which is easier: manually verifying for yourself that every
| single array access is guarded by a bounds check somewhere
| and ensuring that no subsequent code changes break this
| verification, or getting the compiler to prove for you that
| every bounds check can be safely elided?
|
| [1] And of course we still have several issues in fields
| like medicine where practitioners refuse to adopt this
| methodology because they find checklists to be an insult to
| their intelligence.
| bayindirh wrote:
| > getting the compiler to prove for you that every bounds
| check can be safely elided?
|
| I'd prefer to delegate that to the compiler if it can do
| that for the piece of code at hand. I've written about a
| case where it'd be very hard for a compiler to eliminate
| a bounds check because the guarantees are made elsewhere
| in the code.
|
| On the other hand, I'd rather add my bounds check
| voluntarily (it's very simple in C++ vectors for example.
| use ".at()" instead of "[]", that's all), because I
| generally design my code in a way which doesn't need
| bounds checks by failing hard and early at places where I
| build/fill the arrays/vectors and prone to malformation.
| So, you need to be well-formed to pass these checks, and
| these data structures are not modified _ever_ down the
| pipe. If they are modified, they'll be bound checked of
| course.
|
| What I'm saying is, I'm not naive enough to believe that
| I'm perfect, but I'm not naive enough to believe that
| compiler is perfect, either. So, I do my part, and leave
| the parts I can't be sure to the compiler.
|
| I'm not a hard-liner. I just want finer control on my
| code, and take full responsibility if it crashes and
| burns in a way it shouldn't, so plan and implement
| accordingly.
| bayindirh wrote:
| I also don't think that we should be writing everything in Rust
| blindly. If you can guarantee that you won't be accessing
| outside of an array before entering a critical section, not
| having bounds checking is actually a plus.
|
| I have similar code where I can guarantee that I won't
| ever be accessing outside the boundaries of arrays and vectors,
| and that gives me a great performance boost.
| __s wrote:
| You can skip bounds checks in Rust using `unsafe`
| ParetoOptimal wrote:
| > If you can guarantee that you won't be accessing outside of
| an array before entering a critical section
|
| That is a huge IF though.
|
| People getting those things wrong either initially or a
| refactor invalidating this invariant is a huge source of
| bugs.
| bayindirh wrote:
| Yes, I'm aware. However in most cases I know the size of
| the array in the beginning and it's not modified by any
| means (which is guarded by const correctness throughout the
| code).
|
| If I can't guarantee that, I use vectors and ".at()", which
| does bound checking at runtime.
|
| Generally I'm developing solo, so people mucking what I do
| is very rare, however I don't blindly believe myself
| either.
| mlsu wrote:
| It makes sense that you don't see the value of Rust if
| you spend most of your time developing solo.
|
| The value of these checks is not just to reduce bugs in
| production code. It's nice that that happens, but that's
| not even the primary value of the borrow checker. The
| primary value is that having these checks (lifetimes,
| bounds checks, -- everything that makes Rust annoying to
| write) makes refactoring on a shared codebase
| significantly easier. This means that you are not
| introducing bugs in a refactor, so you can refactor
| faster, which means you can ditch bad architectures
| sooner, which compounds and saves enormous amounts of
| developer time and $$. And it's a knock-on effect that
| _increases_ in value as the team grows larger.
|
| Keeping track of lifetimes and bounds for a solo dev is
| quite easy as you say. Keeping track of lifetimes and
| bounds for the other dozen devs on my team? In all
| external dependencies? Extremely difficult. Impossible in
| a large codebase, actually, given the number of mem
| safety bugs that appear even in mature C codebases. It's
| collaboration that is the source of these bugs, that
| these checks work to mitigate.
|
| I sort of wondered why C did not have a package
| management system, until I started working on a large C
| codebase. There is a reason Rust has cargo and C does
| not; it has nothing to do with whether or not someone
| decided to write cargo and everything to do with Rust's
| language features.
| bayindirh wrote:
| That's a different and refreshing perspective to look
| from, thanks.
|
| I'm aware that my view is somewhat biased because I'm a
| solo dev which works on small to large projects by
| myself, and things get exponentially harder as more
| people mangle the same code base. That's very true.
|
| What I was trying to highlight is basically that neither C nor
| C++ is "free for all without recourse". Esp. C++ has
| many features, but they're opt-in, where in Rust they're
| opt-out.
|
| Also many newer developers don't understand that
| compilation used to take way longer in olden times even
| with simpler languages and compilers. Hence, what Rust is
| doing today was "impossible" in the older days.
|
| For the last time, I think Rust is a nice language, and I
| won't be annoyed by its limitations. What bothers me to
| no end is vilifying other languages and pushing rust as a
| silver bullet and savior. Other than that, Rust is just
| another tool which works for some things very well, and
| not very well for others.
| tinco wrote:
| The bounds checks in Rust are implicit (i.e. part of the std
| implementation), and get removed by the compiler if they're
| unnecessary. I think that's a pretty great improvement over the
| state of things in C.
|
| And if you are convinced you don't need a bounds check and the
| compiler does not remove it you can explicitly remove the
| bounds check, provided you mark the access as unsafe. So Rust
| is a strict improvement over C in this regard.
| bayindirh wrote:
| Asking out of interest: How does the compiler decide whether
| it needs BC or not?
| tinco wrote:
| Through value tracking. It's actually LLVM that does this,
| GCC probably does it as well, so in theory explicit bounds
| checks in regular C code would also be removed by the
| compiler.
|
| How it works exactly I don't know, and apparently it's so
| complex that it requires over 9000 lines of C++ to express:
|
| https://github.com/llvm/llvm-
| project/blob/main/llvm/lib/Anal...
| jcranmer wrote:
| > How it works exactly I don't know
|
| The idea is pretty simple. You can build a list of known
| facts based on control flow, explicit __builtin_assumes,
| and undefined behavior relations. For example, if you've
| got this code:
|
|     if (x < N) {
|       // In this block, we know that x < N
|     } else {
|       // ... and in this block we know that x >= N!
|     }
|
| And on top of that, we can do some basic algebra. If we
| know that x < N and N < 5, then we can infer that x < 5.
| So if we see a comparison x < 5, we can then rewrite that
| to true.
|
| > and apparently it's so complex that it requires over
| 9000 lines of C++ to express
|
| The two main reasons for that are that a) there are a lot
| of rules covering cases like "we know the result of
| count_leading_zeroes can be no more than the number of
| bits in an integer" and so forth, and b) this is doing a
| lot more logic than just tracking integer comparisons:
| there's tracking known-bits of integers, maximum possible
| value, floating-point comparisons, pointer object
| references.
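The same chain of known facts can be written out in Rust. This is only an illustration of the reasoning, with a hypothetical `lookup` function; it assumes nothing about what LLVM will actually do for a given build.

```rust
/// Hypothetical lookup into a fixed table of five entries.
fn lookup(table: &[u8; 5], x: usize, n: usize) -> Option<u8> {
    if n <= 5 && x < n {
        // Facts known in this block: x < n and n <= 5, hence x < 5.
        // The implicit bounds check on `table[x]` is the comparison
        // `x < 5`, which the optimizer can therefore fold to `true`.
        Some(table[x])
    } else {
        // Here at least one of the two facts failed, so nothing is known
        // about x relative to the table length.
        None
    }
}

fn main() {
    let table = [10, 20, 30, 40, 50];
    println!("{:?}", lookup(&table, 4, 5)); // Some(50)
    println!("{:?}", lookup(&table, 7, 9)); // None
}
```

Even if the optimizer fails to prove the fact, the code stays correct: the check simply remains in place.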
| jakubadamw wrote:
| > And on top of that, we can do some basic algebra. If we
| know that x < N and N < 5, then we can infer that x < 5.
| So if we see a comparison x < 5, we can then rewrite that
| to true.
|
| Even better: if `x` and `N` are integers, then we can
| infer that `x` < 4. :)
| KMag wrote:
| I think you mean x <= 4, right?
| karamanolev wrote:
| No, x < 4, within integers. If N < 5, then N <= 4. If x <
| N, then x < 4.
| orlp wrote:
| An example of when it's not necessary:
| for i in 0..v.len() { v[i] += 1; }
|
| Because the compiler can prove that i < v.len() due to the
| loop condition, the bounds check gets eliminated.
| bayindirh wrote:
| Thanks!
|
| My case is a bit outside that, so I don't think the
| compiler can deduce that. I have a file format which
| tells me the expected number of fields about a category,
| and I throw an error & abort if the number is not exactly
| that.
|
| Also, these data structure fields are always sent in as
| const variables, so they are never modified (making them
| "sealed" in a sense), hence I don't need to bounds check
| on arrays and vectors storing them.
| ihattendorf wrote:
| That sounds trivial enough that the compiler would remove
| the bounds checks, assuming I'm understanding correctly
| that you have a condition that validates the number of
| fields at some point before an invalid access would
| occur.
|
| But if it's possible for someone to muck with the file
| contents and lie about the number of fields which would
| cause a bounds error, that's exactly what bounds checking
| is supposed to avoid. So either bounds checks will be
| removed, or they're necessary.
| bayindirh wrote:
| I think it won't be able to because the creation of these
| data structures and consuming them is 3 files apart.
|
| > But if it's possible for someone to muck with the file
| contents and lie about the number of fields.
|
| You can't. You can say you'll have 7, but provide 8. But
| as soon as I encounter the 8th one during parsing,
| everything aborts. Same for saying 7 and providing 6. If
| the file ends after parsing 6th one, I say there's an
| error in your file and abort. Everything has to check out
| and be sane to be able to start. Otherwise you'll
| get file format errors all day.
|
| The rest of the pipeline is unattended completely. It's
| bona fide number crunching (material simulation to be
| exact), so speed is of the essence. Talking about >1.5
| million iterations per second per core.
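One way the "validate once at parse time" design could look in Rust, sketched under assumptions: the record format (whitespace-separated numbers), the `FIELDS` count of 7, and the function names are all invented for illustration. The length is checked once and then carried in the type, so later indexing with a constant needs no runtime check.

```rust
use std::convert::TryFrom;

/// Assumed number of fields per record (taken from the "7" in the comment).
const FIELDS: usize = 7;

/// Parses one whitespace-separated record, failing on a malformed number
/// or on any field count other than exactly `FIELDS`.
fn parse_record(line: &str) -> Result<[f64; FIELDS], String> {
    let values = line
        .split_whitespace()
        .map(|tok| tok.parse::<f64>().map_err(|e| format!("bad field {:?}: {}", tok, e)))
        .collect::<Result<Vec<f64>, String>>()?;
    let found = values.len();
    // Vec -> fixed-size array succeeds only when the lengths match.
    <[f64; FIELDS]>::try_from(values)
        .map_err(|_| format!("expected {} fields, got {}", FIELDS, found))
}

/// Downstream code receives a `[f64; FIELDS]`; a constant index below
/// `FIELDS` is verified at compile time.
fn third_field(record: &[f64; FIELDS]) -> f64 {
    record[2]
}

fn main() {
    match parse_record("1 2 3 4 5 6 7") {
        Ok(rec) => println!("{}", third_field(&rec)), // prints 3
        Err(e) => eprintln!("file format error: {}", e),
    }
}
```

Indexes computed from offsets or formulas, as opposed to constants, would still fall back to value-range analysis or a runtime check.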
| aw1621107 wrote:
| > I think it won't be able to because the creation of
| these data structures and consuming them is 3 files
| apart.
|
| Strictly speaking I don't think the distance between
| creation and consumption matters. It all comes down to
| what the compiler is able to prove at the site where the
| bounds check may go.
|
| For example, if you're iterating over a Vec using `for i
| in 0..vec.len() { ... }` then the amount of code between
| the creation and consumption of that Vec doesn't matter,
| as the compiler has all the information it needs to
| eliminate the bounds check right there.
| bayindirh wrote:
| If that's a vector which you basically iterate, yes.
| However, thinking what I developed, I have offset or
| formula determined indexes I hit constantly, and not
| strictly in a loop. They might prove harder. I need to
| implement these and see what the compiler(s) do in these
| cases.
|
| The code I have written is 3D materials software which
| works on >(3000x3000) matrices, and I do a lot of tricks
| with these to get what I need from them. However, since
| everything creating them is validated during their
| creation, nothing breaks and nothing requires checks.
| Because most of the data is read-only (and forced by
| const correctness throughout the code).
| aw1621107 wrote:
| > However, thinking what I developed, I have offset or
| formula determined indexes I hit constantly, and not
| strictly in a loop. They might prove harder.
|
| I think at that point it'll come down to the compiler's
| value range analysis as well as how other parts of the
| program affect inlining/etc. Hard to say exactly what
| will happen.
| faitswulff wrote:
| shnatsel wrote a post on bounds checking performance
| implications in Rust. Money quote:
|
| > The real-world performance impact of bounds checks is
| surprisingly low.
|
| > The greatest impact I've ever seen on real-world code from
| removing bounds checks alone was *15%,* but the typical gains
| are in *1% to 3% range,* and even that only happens in code
| that does a lot of number crunching.
|
| > You can occasionally see greater impact (as we'll see
| soon!) if removing bounds checks allows the compiler to
| perform other optimizations.
|
| > Still, performance of code that's not doing large amounts
| of number crunching will probably [not be impacted by bounds
| checks](https://blog.readyset.io/bounds-checks/) at all.
|
| It is, of course, not universally applicable, so read the
| post for full details: https://shnatsel.medium.com/how-to-
| avoid-bounds-checks-in-ru...
| woodruffw wrote:
| > So I'm not sure how this can really improve the state of
| things compared to C with the equivalent bounds checks.
|
| The simplest answer here is that a compiler-introduced bounds
| check is almost always better than a human one. Humans make
| bounds errors, compilers generally don't.
|
| The longer answer is that the current state of the code does
| not guarantee its future state. The fact that bounds checks
| constitute the current majority of safety guardrails does not
| mean that future refactors won't benefit from Rust's temporal
| memory safety guarantees. Or more abstractly: it's easier to
| perform _safe_ refactors when your safety properties compose
| natively, rather than having to bolt another layer of checks
| onto a pre-existing language that doesn't support them natively.
|
| Edit: Forgot to mention: another benefit of bounds checking in
| the language itself is optimization: when humans bounds-check,
| the compiler needs to recognize human patterns to safely remove
| or merge redundant bounds checks. When the language specifies
| its own bounds checks, the compiler knows exactly what they'll
| look like and can optimize accordingly. Modern optimizing
| compilers are _very_ good at detecting human-written bounds,
| but a fully compiler-controlled optimization is going to beat a
| human-augmented optimization >95% of the time.
| pton_xd wrote:
| > If everything is done with arrays and indices (apparently
| from looking at the code:
| (https://gitlab.collabora.com/dwlsalmeida/for-
| upstream/-/blob...) it seems like Rust's borrow checker doesn't
| really help at all
|
| That's the Rust "secret" in many high-performance computing
| applications, like games. You write everything using arenas and
| handles, which effectively side-steps the borrow checker.
| Everyone sane has been doing that for decades in C and C++.
|
| Obviously Rust has far better guarantees in general, but the
| pervasive usage of this borrow checker anti-pattern suggests
| that perhaps we need a more comprehensive way to guarantee
| memory safety.
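For readers unfamiliar with the arena-and-handle pattern, here is a bare-bones sketch. `Arena`, `Handle`, `Entity` and `attack` are made-up names, and real implementations usually add generation counters to catch stale handles.

```rust
/// An index into the arena, used in place of a reference.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Handle(usize);

/// All objects live in one Vec; the arena owns them.
struct Arena<T> {
    items: Vec<T>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { items: Vec::new() }
    }
    fn alloc(&mut self, value: T) -> Handle {
        self.items.push(value);
        Handle(self.items.len() - 1)
    }
    fn get(&self, h: Handle) -> Option<&T> {
        self.items.get(h.0)
    }
    fn get_mut(&mut self, h: Handle) -> Option<&mut T> {
        self.items.get_mut(h.0)
    }
}

/// Entities refer to each other by handle, so no lifetimes are involved.
struct Entity {
    hp: i32,
    target: Option<Handle>,
}

/// Applies 10 damage to the attacker's target, if it has a valid one.
fn attack(arena: &mut Arena<Entity>, attacker: Handle) -> Option<i32> {
    let target = arena.get(attacker)?.target?; // copy the handle out
    let victim = arena.get_mut(target)?; // bounds-checked lookup
    victim.hp -= 10;
    Some(victim.hp)
}

fn main() {
    let mut arena = Arena::new();
    let a = arena.alloc(Entity { hp: 100, target: None });
    let b = arena.alloc(Entity { hp: 50, target: Some(a) });
    println!("{:?}", attack(&mut arena, b)); // Some(90)
    println!("{:?}", attack(&mut arena, a)); // None
}
```

The borrow checker only sees short-lived borrows of the arena. A handle that points at the wrong (but in-range) slot is a logic bug it cannot detect, which is the gap the comment is pointing at.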
| giovannibonetti wrote:
| I remember Zig is able to convert an array of structs to a struct
| of arrays at compile time [1], which effectively sidesteps
| all the need for the user to worry about array indices and
| having them in the right range.
|
| [1] https://zig.news/kristoff/struct-of-arrays-soa-in-zig-
| easy-i...
| aw1621107 wrote:
| > which effectively sidesteps all the need for the user to
| worry about array indices and having them in the right
| range.
|
| How does converting AoS to SoA eliminate the need to worry
| about array indices? If you have an array of structs with N
| entities and convert that to a struct of arrays each array
| would also have N entities, so out-of-bounds accesses in
| one would be equally out-of-bounds in the other.
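The objection can be made concrete with a hand-written conversion. This sketch is in Rust rather than Zig and uses made-up `Particle` types: after converting N structs into a struct of arrays, every array has N entries, so an index that was out of range before is out of range after.

```rust
/// Array-of-structs element.
struct Particle {
    x: f32,
    mass: f32,
}

/// Struct-of-arrays layout holding the same data, one Vec per field.
struct Particles {
    x: Vec<f32>,
    mass: Vec<f32>,
}

/// Converts AoS to SoA; each resulting Vec has exactly `aos.len()` entries.
fn to_soa(aos: &[Particle]) -> Particles {
    Particles {
        x: aos.iter().map(|p| p.x).collect(),
        mass: aos.iter().map(|p| p.mass).collect(),
    }
}

fn main() {
    let aos = vec![
        Particle { x: 0.0, mass: 1.0 },
        Particle { x: 2.0, mass: 3.0 },
    ];
    let soa = to_soa(&aos);
    // Index 2 is past the end in both layouts.
    println!("{:?} {:?}", aos.get(2).map(|p| p.x), soa.x.get(2)); // None None
    println!("{:?}", soa.mass.get(1)); // Some(3.0)
}
```

The layout change affects cache behavior and vectorization, not which indices are valid.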
| HackerThemAll wrote:
| > So I'm not sure
|
| Then sit back and watch others try it out. Let's see what comes
| out of it, the Rust code can always be discarded if it turns
| out to be inferior to the current implementation. It's not like
| breaking a glass, we can reverse it.
| pjc50 wrote:
| Oh, they're not afraid of this failing, they're afraid of it
| succeeding.
| gattr wrote:
| As already mentioned, bounds checks won't necessarily cause
| that much overhead. When I rewrote my small image processing
| library from C to Rust ([1]), I only had to use unchecked array
| access in one hot loop to get overall performance equivalent to
| C code.
|
| [1] https://github.com/GreatAttractor/libskry_r
| pornel wrote:
| Vulnerabilities are where programmers thought the bounds checks
| were redundant, and they weren't. This overprotectiveness by
| default turns out to be useful:
|
| https://github.com/rust-fuzz/trophy-case
|
| Look how many of the crashes are panics and unwraps that could
| have been buffer overflows or wild pointer dereferences
| otherwise. And there are plenty of arithmetic overflows that
| are much less dangerous when they can't cause out of bounds
| access.
|
| The code in this particular codec seems to be a direct
| translation of C code. Idiomatic Rust code would use iterators
| more, which work better for optimizing out redundant checks.
| It's easily fixable.
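|
| To illustrate the indexing-versus-iterators remark, here is a small
| sketch with invented function names (not taken from the codec). The
| indexed loop has a bounds check on `src[i]` that can panic; the
| `zip` version has no index to check at all.

```rust
// Hypothetical illustration: C-style indexing vs. iterator-based loops.

/// Direct translation of a C loop. `src[i]` panics if `src` is shorter
/// than `dst`, so the check is not redundant here.
fn add_indexed(dst: &mut [i32], src: &[i32]) {
    for i in 0..dst.len() {
        dst[i] += src[i];
    }
}

/// Iterator version. `zip` stops at the shorter slice, so there is no
/// out-of-bounds case and no per-element index check.
fn add_iter(dst: &mut [i32], src: &[i32]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d += *s;
    }
}

fn main() {
    let mut a = [1, 2, 3];
    add_indexed(&mut a, &[10, 20, 30]);

    let mut b = [1, 2, 3];
    add_iter(&mut b, &[10, 20, 30]);

    assert_eq!(a, b);
    println!("{:?}", b);
}
```

| The two are not strictly equivalent: with a short `src`, the indexed
| loop panics while the `zip` loop silently leaves the tail of `dst`
| untouched, so the right choice depends on which behaviour is wanted.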
| xmichael909 wrote:
| if only there were a standalone player for linux, osx or
| windows...
| trimbo wrote:
| > This patch ports the VP9 library written by Andrzej into Rust
| as a proof-of-concept ...
|
| > this library will not need any further updates for the same
| reason we have never touched its C counterpart
|
| What was being proof-of-concepted? What's the metric of success
| for introducing Rust in a case where no one was doing any sort of
| active work anyway?
| uo21tp5hoyg wrote:
| From what I understand, the need for a "proof-of-concept" comes
| from the fact that these codecs/drivers often use memory-unsafe
| "tricks" to increase performance, and therefore need to be
| properly tested on a myriad of hardware to make sure the
| conversion to memory-safe code doesn't have a significant
| performance impact.
| vladimirralev wrote:
| The VP9 codec I imagine would heavily benefit from SIMD? Just
| SIMD should massively outperform any advantage gained by putting
| this into the kernel. Why does the kernel need an unoptimized
| codec?
| ZeroCool2u wrote:
| SIMD can be pretty architecture specific. For example, does the
| CPU support AVX-512 or SSE3? So, you have to have a few code
| paths if you're going to support a wide variety of hardware.
|
| I don't have an answer to your question; it just occurs to me
| that maybe the Linux kernel doesn't allow a lot of SIMD for
| this reason, or maybe they require a fallback/slow code path to
| be available if you're submitting a patch that includes SIMD
| operations?
|
| Curious if anyone knows the answer to this.
| __s wrote:
| The kernel avoids SIMD because, unlike userspace, it has no
| kernel beneath it keeping its SIMD registers coherent:
| https://stackoverflow.com/a/46677815
| ZeroCool2u wrote:
| Thank you for the answer! Very interesting.
| brundolf wrote:
| Fwiw, Rust's standard library has a cross-platform Simd
| abstraction: https://doc.rust-lang.org/std/simd/index.html
|
| It's nightly-only for now, but I've used it and it's lovely.
| I've added Simd to projects that I never otherwise would
| have, just because this made it so easy and accessible
| ZeroCool2u wrote:
| That is metal af ;)
| pavon wrote:
| I don't think that the driver implements VP9 in software. It
| uses hardware acceleration to perform the actual encode/decode;
| the driver itself manages the preparation of coefficients and
| the DMA of data buffers.
| briantakita wrote:
| I have a kernel: BUG: kernel NULL pointer dereference, address:
| 0000000000000027 error on the latest kernel v6.7.6
|
| I wonder if this conversion to Rust is bringing up these sorts
| of issues...having to deal with a C/Rust hybrid. Or is this
| another reason why Rust should be used for the kernel?
|
| https://bbs.archlinux.org/viewtopic.php?id=293042
| renewiltord wrote:
| Interesting. Use case is so that there is some code exercising
| this path.
|
| While the algorithm is cool, of course, the code itself is quite
| straightforward as it is. An interesting thing I didn't know is
| that the kernel code avoids recursion (this one has a depth
| parameter to prevent recursion past some point).
|
| I can see why this was picked as a candidate. Straightforward
| implementation. Good test suite. Self-contained and not a moving
| target.
|
| Lots of the code is using the coefficients array against the
| framecontext but I don't know how to enforce the bounds invariant
| that the two are the same. For the fixed size arrays I could see
| how it's done, but otherwise it seems like the bounds checker
| will trigger. But I'm reading on my phone so maybe that's just a
| misread.
|
| Not a big performance hit even if it does. But perhaps it
| doesn't, and I'd be curious why.
| revskill wrote:
| I think the downside of Rust vs. C could be "over-checking"?
___________________________________________________________________
(page generated 2024-02-28 23:01 UTC)