[HN Gopher] OpenBSD: Malloc leak detection available in -current
       ___________________________________________________________________
        
       OpenBSD: Malloc leak detection available in -current
        
       Author : peter_hansteen
       Score  : 139 points
       Date   : 2023-04-17 07:49 UTC (15 hours ago)
        
 (HTM) web link (www.undeadly.org)
 (TXT) w3m dump (www.undeadly.org)
        
       | anthk wrote:
       | OpenBSD already had malloc.conf options for stricter checking.
       | For instance, programs like detox might crash on complex input.
       | I wonder how OpenBSD could work with both enabled.
        
       | bruce343434 wrote:
       | Why duplicate the efforts of valgrind and address sanitizer?
        
         | avar wrote:
         | Both of those are very useful, but _slow_, especially
         | valgrind.
         | 
         | I don't know OpenBSD, but presumably this is analogous to
         | glibc's relatively faster memory sanity checking.
         | 
         | For GCC and clang you'd also want LSAN for this, not ASAN. ASAN
         | is more accurate for edge cases, but much slower.
         | 
         | "Slow" here means that even for development the runtime can be
         | prohibitively expensive.
         | 
         | E.g. I regularly run git's test suite,
         | optimized/LSAN/ASAN/valgrind runtime is on the order of
         | 3m/15m/30m/24 hours. The basic glibc sanity checking only adds
         | a minute or two to the optimized run.
         | 
         | An advantage of any malloc based detection is also that you can
         | run it on any existing binary. Whereas the likes of LSAN and
         | ASAN require a custom debugging build (or to have that tracing
         | overhead present in your production build).
        
         | irdc wrote:
         | This is available by default and integrates with existing tools
         | (notably ktrace), making it easier to detect memory leaks on
         | all platforms OpenBSD supports.
        
         | Someone wrote:
         | In addition to what others said, Valgrind is GPL-licensed. That
         | conflicts with the OpenBSD copyright policy
         | (https://www.openbsd.org/policy.html), which says:
         | 
         |  _"The GNU Public License and licenses modeled on it impose the
         | restriction that source code must be distributed or made
         | available for all works that are derivatives of the GNU
         | copyrighted code.
         | 
         | While this may superficially look like a noble strategy, it is
         | a condition that is typically unacceptable for commercial use
         | of software. So in practice, it usually ends up hindering free
         | sharing and reuse of code and ideas rather than encouraging it.
         | As a consequence, no additional software bound by the GPL terms
         | will be considered for inclusion into the OpenBSD base
         | system."_
         | 
         | As to clang's Address Sanitizer, that's under the Apache
         | License v2.0 with LLVM Exceptions
         | (https://github.com/google/sanitizers/blob/master/LICENSE.TXT),
         | of which the same page says:
         | 
         |  _"The original Apache license was similar to the Berkeley
         | license, but source code published under version 2 of the
         | Apache license is subject to additional restrictions and cannot
         | be included into OpenBSD. In particular, if you use code under
         | the Apache 2 license, some of your rights will terminate if you
         | claim in court that the code violates a patent."_
        
           | alberth wrote:
           | I had no idea that OpenBSD didn't accept Apache 2 code
           | 
           | https://www.openbsd.org/policy.html
        
             | asveikau wrote:
             | They maintained apache 1.x in base for a long time. For a
             | small number of releases they switched to nginx and then I
             | think they wrote their own.
        
               | maximilianburke wrote:
               | Parent is referring to the Apache 2 license, not version
               | 2 of the Apache HTTPD server.
        
               | asveikau wrote:
               | It's not coincidental. It was early in the Apache 2.0
               | lifecycle that the Apache License 2.0 came about (google
               | says 2.0.49). OpenBSD's continued maintenance of a 1.x
               | fork from 2004-2011 or so was mainly about licensing.
        
               | nequo wrote:
               | I think that's what your parent is saying too. The last
               | non-Apache-2-licensed version of the Apache HTTP server
               | was version 1.3.x, and because the OpenBSD project did
               | not accept the Apache 2 license, they forked the Apache
               | HTTP server's 1.3.x code base.
        
               | whydoyoucare wrote:
               | Yes, httpd. https://man.openbsd.org/httpd.8
        
           | josephcsible wrote:
           | > _it is a condition that is typically unacceptable for
           | commercial use of software_
           | 
           | This is entirely untrue, though. Linux is GPL and it gets way
           | more use than any of the pushover-licensed BSDs do.
        
             | sbuk wrote:
             | They're describing libraries. By 'Linux', I'm assuming that
             | you're referring to the OS distributions and not the kernel
             | on its own. And please drop the pejoratives.
        
               | ninjin wrote:
               | There is seemingly a movement or line of thought that
               | blames the success of "big tech" on permissive licenses.
               | You do not see it as much on Hacker News, but it
               | certainly has spread across IRC and many other forums.
               | What is even odder is that a subset of it uses alt-right
               | terminology to refer to both permissive licenses and
               | their proponents. It is all _very_ weird to experience as
               | someone that entered the FLOSS community in the early
               | 00s. Not to mention having had a personal conversation
               | with rms bemoaning the license schisms and him explicitly
               | expressing gratitude for the BSDs contributing to the
               | larger FLOSS movement.
        
               | sbuk wrote:
               | I'm beginning to see and hear it more from younger folks
               | --mainly free-speech absolutists. It is indeed a worrying
               | development.
        
             | stonogo wrote:
             | Minix uses a BSD-alike license and is built into every
             | Intel processor sold these days.
        
             | Someone wrote:
             | IMO, even if they hadn't included 'typically' to weaken
             | their claim, one counterexample doesn't make that
             | _entirely_ untrue.
             | 
             | Also, why use 'pushover'? OpenBSD has strong principles
             | that they're willing to give up things for, so implying
             | they're weak is derogatory and unfair.
        
               | josephcsible wrote:
               | I feel like Linux is so incredibly popular that it does
               | make it untrue, similar to the punchline of
               | <https://what-if.xkcd.com/49/>.
               | 
               | I use "pushover" because that's what the FSF uses:
               | <https://www.gnu.org/licenses/license-compatibility.en.html>
               | 
               | > we call them "pushover licenses" because they can't say
               | "no" when one user tries to deny freedom to others.
        
               | sbuk wrote:
               | The _point_ that Theo de Raadt is making is that this
               | line of thinking is wrong*. Most of the FSF/GPL
               | advocates never seem to look at the _whys_ around
               | people disliking their approach. See
               | https://lkml.org/lkml/2007/9/1/102 as an example. OpenBSD
               | helped the Linux project dual license the ath5k driver.
               | They were thanked by GPL advocates _illegally_ trying to
               | strip the BSD license after the fact. Leaves a bad taste,
               | and one that many of us won't forget. You can have your
               | forced freedom.
               | 
               | * "_GPL fans said the great problem we would face is
               | that companies would take our BSD code, modify it, and
               | not give back. Nope--the great problem we face is that
               | people would wrap the GPL around our code, and lock us
               | out in the same way that these supposed companies would
               | lock us out. Just like the Linux community, we have many
               | companies giving us code back, all the time.
               | 
               | But once the code is GPL'd, we cannot get it back._"
               | 
               | EDIT: The pejoratives are why the FSF message gets lost
               | on an _awful_ lot of people. I admire what they are
               | trying to achieve, but the manner in which they go about it
               | makes me not want to engage.
        
               | josephcsible wrote:
               | Theo de Raadt is the one who's wrong. When code goes
               | closed source, you're locked out and can't have it at all
               | anymore. When it goes GPL, you can still have it just by
               | going GPL yourself.
        
               | sbuk wrote:
               | > "When it goes GPL, you can still have it just by going
               | GPL yourself."
               | 
               | Which in some people's opinion, is no better. Just a
               | different set of handcuffs.
        
             | asguy wrote:
             | Linux makes intentional exceptions in the application of
             | the GPLv2 to accomplish this e.g. vDSOs. There's a reason
             | Linux refused to move to GPLv3.
             | 
             | Now apply these exceptions to user space, and you've
             | basically reinvented the LGPL.
        
             | NexRebular wrote:
             | Way more visible use, maybe. BSD is otherwise everywhere
             | (e.g. Sony).
        
           | jmclnx wrote:
           | True, but I think having this check in the kernel _may_ work
           | better than valgrind on Linux.
           | 
           | With that said, valgrind works great and I like its output,
           | time will tell if I will like the output from this OpenBSD
           | change.
        
           | notaplumber1 wrote:
           | OpenBSD begrudgingly made an exception for LLVM/Clang, after
           | vocal opposition to the re-licencing. It currently uses
           | LLVM/Clang 13 and has been making progress towards 15.
           | Licensing is not the problem here. Most of the sanitizers are
           | simply not enabled in the version shipped in base, and
           | require runtime libraries that have not been ported to
           | OpenBSD.
           | 
           | Valgrind exists in ports, but it is ancient and broken. It
           | does not play well with various security mitigations.
        
             | Someone wrote:
             | > OpenBSD begrudgingly made an exception for LLVM/Clang
             | [...] Licensing is not the problem here
             | 
             | If it isn't a problem, why do you say "begrudgingly"?
             | 
             | I think they are pragmatic but also do find it a problem.
             | Why else would they say _"source code published under
             | version 2 of the Apache license is subject to additional
             | restrictions and cannot be included into OpenBSD"_?
        
       | j16sdiz wrote:
       | To quote the GNU libc manual:
       | 
       | > There is no point in freeing blocks at the end of a program,
       | because all of the program's space is given back to the system
       | when the process terminates.
       | 
       | https://www.gnu.org/software/libc/manual/html_node/Freeing-a...
       | 
       | I think many GNU tools just never free any memory.
       | 
       | For example, GCC :
       | https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66339
       | 
       | -- edit: added GCC as example.
        
         | matvore wrote:
         | The last time I used a leak finder, a memory allocation which
         | still has a ptr to it is not a leak.
         | 
         | You are conflating unfreed memory with unreferencable memory.
        
           | TazeTSchnitzel wrote:
           | Reachability is one strategy for detecting leaks but it's not
           | the only one. Checking for unfreed memory is easier to
           | implement.
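
The distinction drawn above between unfreed and unreachable memory can be sketched in C. This is only an illustration, not how OpenBSD's malloc or any particular leak finder works: `counted_malloc`, `counted_free`, `live_blocks`, `g_cache`, `init_cache`, `lose_block` and `unfreed_after_demo` are all hypothetical names invented here, and the sketch assumes the two small allocations succeed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counter of blocks allocated but not yet freed. */
size_t live_blocks;

/* Global pointer: whatever it points to stays reachable until exit. */
char *g_cache;

/* malloc wrapper that counts successful allocations. */
void *counted_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        live_blocks++;
    return p;
}

/* free wrapper that keeps the counter in sync. */
void counted_free(void *p)
{
    if (p != NULL) {
        assert(live_blocks > 0);
        live_blocks--;
        free(p);
    }
}

/* Unfreed but reachable: the only pointer lives in a global. */
void init_cache(void)
{
    g_cache = counted_malloc(64);
}

/* Unfreed and unreachable: the only pointer is dropped on return. */
void lose_block(void)
{
    char *p = counted_malloc(64);
    (void)p;
}

/* Returns how many blocks a plain "unfreed memory" check would report. */
size_t unfreed_after_demo(void)
{
    init_cache();
    lose_block();
    return live_blocks;
}
```

Calling `unfreed_after_demo()` from `main` reports both blocks as unfreed. A reachability-based checker would be expected to flag only the one allocated in `lose_block`, since `g_cache` still points at the other.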
        
         | systems_glitch wrote:
         | "I run gmake and gcc, And I ain't never called malloc without
         | calling free."
        
         | colonwqbang wrote:
         | > There is no point in freeing blocks at the end of a program
         | 
         | ...unless you are trying to find memory leaks in your program.
         | In that case it would be very helpful if the program, and the
         | libraries it uses, were written to actively free all allocated
         | memory.
         | 
         | > I think many GNU tools just never free any memory
         | 
         | It is absolutely true that some GNU software (including
         | libraries like glib) allocate memory that they never intend to
         | free. This causes leak analysis of programs that link with
         | these libraries to be more painful than necessary.
        
           | miohtama wrote:
           | To work around leaks, just don't run your GNU programs for
           | very long (:
        
             | Maken wrote:
             | Maybe this is the intended use case of timeout(1).
        
             | DominoTree wrote:
             | Many GNU programs will helpfully terminate periodically
             | with SIGSEGV to help prevent memory leaks from becoming a
             | problem
        
         | anthk wrote:
         | On GCC, the memory usage on compiling C++ vs Clang can be night
         | and day.
         | 
         | You can compile large C++ projects in a GB of RAM based netbook
         | with Clang and ZRAM.
        
         | bjackman wrote:
         | Well yeah if your program does something and then terminates
         | then memory leaks are not an issue.
         | 
         | If your process is a server or a daemon or something then
         | restarting each instance every N hours is a nice backstop but
         | memory leaks are still a frightening spanner in the works!
        
         | baggy_trough wrote:
         | One point would be to be able to detect memory leaks with
         | tools.
        
       | saagarjha wrote:
       | Neat! Although, I'm curious why the tool doesn't just run
       | addr2line for you?
        
         | notaplumber1 wrote:
         | Are you asking why doesn't it execv(2) addr2line deep within
         | the libc malloc implementation? Because calling execv(2) within
         | libraries is frowned upon.. ;-)
         | 
         | The leak report is being generated internally by malloc. It is
         | then logged via utrace(2) when a process is traced through
         | ktrace(1).
         | 
         | The kdump utility simply dumps the report, strvis(3) escaping
         | any potentially unsafe characters. As this is untrusted user
         | data, passing it as the input/args to another command is
         | unwise. Also kdump(1) uses pledge(2) and cannot execute
         | commands.
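
The idea of escaping untrusted report data before printing it can be sketched in C. This is a simplified stand-in and does not reproduce the encoding strvis(3) actually uses; `escape_untrusted` is a hypothetical helper written for this illustration, and the choice of three-digit octal escapes is an assumption made for simplicity.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not strvis(3): copy src into dst, writing every
 * non-printable byte and every backslash as a \ooo octal escape.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if dst is too small.
 */
int escape_untrusted(char *dst, size_t dstsize, const char *src)
{
    size_t used = 0;

    assert(dst != NULL && src != NULL);
    if (dstsize == 0)
        return -1;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        int safe = isprint(c) && c != '\\';
        size_t need = safe ? 1 : 4;

        /* Leave room for this byte's encoding plus the final NUL. */
        if (used + need + 1 > dstsize)
            return -1;
        if (safe)
            dst[used] = (char)c;
        else
            snprintf(dst + used, need + 1, "\\%03o", (unsigned int)c);
        used += need;
    }
    dst[used] = '\0';
    return (int)used;
}
```

With this sketch, an embedded terminal escape byte comes out as visible text rather than being interpreted by the terminal, which is the property that matters when dumping data a traced process controls.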
        
           | charcircuit wrote:
           | Just link to the code instead of executing it as a separate
           | binary
        
             | notaplumber1 wrote:
             | I'm pretty sure parsing ELF binaries is out of scope for
             | kdump(1), sorry, but I don't think that's going to happen.
             | 
             | It's not that difficult to run addr2line yourself with the
             | information provided, and that's really for the best.
        
               | charcircuit wrote:
               | You are arguing for a worse UX because of an arbitrary
               | reason. Just link against the code in addr2line.
               | Providing a good UX should always be in scope for a
               | project.
        
               | sigjuice wrote:
               | How does one "just link against the code in addr2line" ?
        
               | charcircuit wrote:
               | You rename the main symbol to something else, then you
               | just call it like any other library.
        
             | tedunangst wrote:
             | Why would openbsd want to taint kdump with GPL code?
        
               | charcircuit wrote:
               | Then find or write an alternative. Asking people to run a
               | command to see the file and line number is a joke.
        
               | yakubin wrote:
               | LLVM provides llvm-addr2line and llvm-symbolizer. Of
               | course I understand not wanting to link with LLVM code,
               | even when the licence is ok. :)
        
       | jmclnx wrote:
       | This is great news to me, it is the one thing I was hoping for.
       | 
       | I use OpenBSD to test objects I create and testing there
       | discovered issues that Linux and AIX happily ignored. But I used
       | valgrind on Linux to look for leaks. With this I can now test for
       | all "my issues" on OpenBSD :)
        
         | pjmlp wrote:
         | Did you use xlC tooling on AIX?
         | 
         | Just curious how they have changed since RS/6000 days.
        
           | jmclnx wrote:
           | Yes, but I doubt it changed that much. It did work better for
           | me than gcc on AIX. But I expect the admins messed up gcc
           | when they installed it.
        
       | Karellen wrote:
       | > The null "f" values (call sites) are due to the sampling nature
       | of small allocations. Recording all call sites of all potential
       | leaks introduces too much overhead.
       | 
       | Hmmm.... "too much" feels like a trade-off or value judgement
       | that won't apply to all cases, and some people would probably
       | like to be able to take the performance hit in exchange for a
       | complete trace. Seems a bit odd that that's not even available as
       | an option, as well as the current behaviour.
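
The trade-off discussed above, where only a sample of allocations gets its call site recorded, can be sketched in C. This is a toy model under stated assumptions, not OpenBSD's implementation: `sampled_malloc`, `unsampled_records`, the `records` table and the `SAMPLE_EVERY` rate are all hypothetical, and the call site is passed in explicitly rather than captured from the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SAMPLE_EVERY 4  /* hypothetical rate: record 1 in 4 call sites */
#define MAX_RECORDS 64  /* hypothetical fixed-size bookkeeping table */

struct record {
    void *ptr;        /* allocated block */
    const void *site; /* call site, or NULL when not sampled */
};

struct record records[MAX_RECORDS];
size_t nrecords;
unsigned long alloc_counter;

/* Allocate and remember the block; store the call site only for samples. */
void *sampled_malloc(size_t n, const void *site)
{
    void *p = malloc(n);

    assert(nrecords <= MAX_RECORDS);
    if (p == NULL || nrecords == MAX_RECORDS)
        return p; /* allocation failed or table full: nothing recorded */

    records[nrecords].ptr = p;
    records[nrecords].site =
        (alloc_counter++ % SAMPLE_EVERY == 0) ? site : NULL;
    nrecords++;
    return p;
}

/* Count the recorded blocks whose call site was not captured. */
size_t unsampled_records(void)
{
    size_t i, n = 0;

    for (i = 0; i < nrecords; i++)
        if (records[i].site == NULL)
            n++;
    return n;
}
```

In this model, setting `SAMPLE_EVERY` to 1 would record every call site, which is the kind of opt-in complete trace the comment asks for, at the cost of doing the extra bookkeeping on every allocation.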
        
       ___________________________________________________________________
       (page generated 2023-04-17 23:01 UTC)