[HN Gopher] Understanding thread stack sizes and how Alpine is d...
___________________________________________________________________
Understanding thread stack sizes and how Alpine is different
Author : notacoward
Score : 93 points
Date : 2021-06-26 11:49 UTC (11 hours ago)
(HTM) web link (ariadne.space)
(TXT) w3m dump (ariadne.space)
| ohazi wrote:
| The distinction that's being characterized as "GNU/Linux" vs.
| just "Alpine" is confusing.
|
| Is Alpine using some new kernel? No, it's a Linux distribution
| that uses the Linux kernel, albeit with some unusual defaults.
|
| Does Alpine not have any of the GNU userspace tools? Also no,
| there are plenty in the Alpine package repository.
|
| Look, I get that GNU/Linux and "GnU pLuS LiNuX" is a loaded term
| and has a lot of baggage, and that everyone would like to just be
| rid of the whole mess, but the characterization used here had me
| thinking that there was some other "Alpine kernel" experimental
| OS project that I had missed that had nothing to do with Alpine
| Linux.
|
| The word "Linux" never once follows the word "Alpine" in this
| article, and it discusses overcommit mode as if it's a uniquely
| "GNU/Linux" thing. WTF does kernel overcommit have to do with
| GNU?
|
| Please just call it what it is.
| lonjil wrote:
| So in your view, having even a single GNU tool installed, or
| even available for installation, means you're using
| "GNU/Linux"? Alpine uses musl and busybox rather than the more
| common GNU equivalents.
|
| Kernel overcommit has nothing to do with GNU, but default stack
| size has a lot to do with which libc you use. Musl has a
| different default than glibc. Overcommit is mentioned because
| it is the justification for glibc having a large stack size by
| default. Musl has defaults that make fewer assumptions about
| how the system is configured.
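| Concretely, a minimal sketch (not from the article) of how to see
| the difference: ask the libc for the stack size of a default-
| created thread, via the non-portable pthread_getattr_np()
| extension that both glibc and musl provide.
|
|   #define _GNU_SOURCE
|   #include <pthread.h>
|   #include <stdio.h>
|
|   /* Runs inside a thread created with default attributes and
|      reports the stack size the libc actually gave it. */
|   static void *report(void *arg)
|   {
|       (void)arg;
|       pthread_attr_t attr;
|       size_t stacksize = 0;
|       if (pthread_getattr_np(pthread_self(), &attr) == 0) {
|           pthread_attr_getstacksize(&attr, &stacksize);
|           pthread_attr_destroy(&attr);
|       }
|       printf("default thread stack: %zu bytes\n", stacksize);
|       return NULL;
|   }
|
|   int main(void)
|   {
|       pthread_t t;
|       pthread_create(&t, NULL, report, NULL);
|       pthread_join(t, NULL);
|       return 0;
|   }
|
| On a typical glibc system this prints a value on the order of the
| 8 MiB default; on musl it is on the order of 128 KiB.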
| ohazi wrote:
| Okay, that's fair. I was confused and responded in
| frustration. Maybe glibc vs. musl would have been more clear
| than "GNU/Linux" vs. "Alpine".
|
| I still think only referring to it as "Alpine" and not _once_
| calling it "Alpine Linux" is weird.
| stefan_ wrote:
| Stop being cringe and adopt the same stack size as everyone else.
| Oh my god. What on earth are you saving?
| swiley wrote:
| IMO: it's nice to have weird platforms. I've caught lurking UB
| and memory corruption in my programs by trying to run them on
| weird OSes.
| stefan_ wrote:
| There is no UB or memory corruption in exceeding the stack
| size. It means your platform is too small to run the program.
| Of course, Alpine doesn't run on platforms that are too
| small, they just make nonsensical changes like these that
| cause incompatibilities but have zero benefits.
| ariadneconill wrote:
| Alpine runs on all sorts of small platforms. It is possible
| to run it on OpenWRT type devices.
| lmz wrote:
| https://github.com/yaegashi/muslstack seems to indicate the
| limit is coming from musl (the libc) rather than Alpine,
| which may legitimately have a hope of running on small
| platforms.
| eqvinox wrote:
| > Thread-local variables are referenced with the thread_local
| keyword. You must include threads.h in order to use it:
|
|   #include <threads.h>
|
|   void some_function(void) {
|       thread_local char scratchpad[500000];
|       memset(scratchpad, 'A', sizeof scratchpad);
|   }
|
| As an important note, thread-local storage through this keyword
| _still_ isn't supported on OpenBSD. It's a serious PITA.
|
| [------ also, copied from a reply I posted below: ------]
|
| The autofree macro is wrong since __attribute__((cleanup))
| expects a function that takes an additional level of pointer. In
| this case, it'll call "free(&scratchpad);". Which doesn't get you
| a compiler warning in C because passing a char ** as a void * is
| perfectly fine. But your heap is f*cked after this.
|
| Correct way to do it is:
|
|   void free2(char **p) { free(*p); }
|   #define autofree __attribute__((cleanup(free2)))
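| For completeness, a usage sketch of the corrected macro (the
| cleanup attribute is a GCC/Clang extension, and the names here
| are just illustrative):
|
|   #include <stdlib.h>
|   #include <string.h>
|
|   static void free2(char **p) { free(*p); }
|   #define autofree __attribute__((cleanup(free2)))
|
|   void some_function(void)
|   {
|       autofree char *scratchpad = malloc(500000);
|       if (scratchpad == NULL)
|           return;              /* free(NULL) in free2 is a no-op */
|       memset(scratchpad, 'A', 500000);
|   }   /* free2(&scratchpad) runs here and frees the buffer */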
| mjw1007 wrote:
| Another complication is that in glibc's implementation TLS
| variables come out of the same space as the per-thread stack.
|
| As far as I can tell this is something they think in principle
| should be changed.
|
| https://sourceware.org/bugzilla/show_bug.cgi?id=11787
| kstenerud wrote:
| Yeah, it sucks that programs crash on your system, but this is
| the way of things: The popular systems get targeted and tested
| in-depth, the less popular systems not so much. This is _NOT_
| the developer's fault; this is _pragmatism_.
|
| And so the mountain must come to Mohamed. Increase Alpine's
| default stack size to something more in line with the big boys.
| api wrote:
| I would at least make it equal to macOS, another very popular
| target where things are tested a lot. That's 512 KiB. 128 KiB
| is teeny.
| arghwhat wrote:
| No, they should keep their low stack size to the benefit of
| everyone.
|
| Diversity helps discover what is fundamentally a broken and
| fragile assumption that a dynamic property will always have
| some value. An assumption that _can_ fail anywhere, including
| on the OS that was initially targeted, and _will_ fail the
| moment another OS is targeted.
|
| The developer _should_ fix their broken assumption, but is
| entirely free to do so by taking control of the value at link
| time.
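| (For reference, and hedged since toolchains differ: with a GNU-
| compatible linker the whole-binary default that musl reads from
| the PT_GNU_STACK header can be raised at link time, e.g.
|
|   cc -Wl,-z,stack-size=0x100000 -o prog prog.c
|
| which requests a 1 MiB default thread stack instead of 128 KiB;
| individual threads can still override it via pthread attributes.)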
| choeger wrote:
| I absolutely agree with your point about diversity. But we
| advocates need to understand this "pragmatic" view and
| counter it properly. Normally, the argument that works is:
| "Do you monitor the thread stack size on $POPULAR_PLATFORM so
| that if _that_ changes, you won't be bitten?"
| torh wrote:
| I'm annoyed by the fact that I have to visit "new" reddit to be
| able to accept cookies, and then go back to the old design. This
| is on desktop btw.
| a1369209993 wrote:
| Wrong thread. You probably want
| https://news.ycombinator.com/item?id=27641366.
| nemetroid wrote:
| > As most threads only need a small amount of stack memory, other
| platforms use smaller limits, such as OpenBSD using only 64 KiB
| and Alpine using at most 128 KiB by default.
|
| By your own table, OpenBSD uses 512 KiB unless you're on an
| ancient version. Among the listed, Alpine is the lone outlier.
| mjw1007 wrote:
| I remember reading about arguments over whether Algol should
| permit recursive procedures, where one side of the argument was
| apparently claiming that they wouldn't be possible to implement.
|
| That seems pretty strange to modern ears, but maybe the
| underlying point was that it isn't possible to statically know
| how much stack size would be required.
|
| I suppose it wouldn't have been obvious then that if you fudge
| the issue for the first thirty years or so, everyone will just
| accept that this is the way the world is.
|
| Still, it's a bit of a shame that there are still widely-used
| systems where if you exceed the available stack space you're
| likely to face a "weird crash" rather than a clean error message
| at runtime.
| mananaysiempre wrote:
| Books contemporary with that era (incl. TAoCP Vol. 1 IIRC)
| don't place return addresses on a stack or in a register at
| all: you're generally told to modify the operand of the jump
| instruction at the end of the procedure you want to call, then
| jump to its entry point.
| It took a while before reentrancy was even recognized as a
| possibility, let alone used (by humans or compilers) by
| default.
|
| I remember reading that the call stack was _invented_ as an
| implementation device for recursion as introduced in Algol, but
| I can't recall how that claim was sourced.
| jerf wrote:
| "but maybe the underlying point was that it isn't possible to
| statically know how much stack size would be required."
|
| Given the time frame you're talking about, remember to be
| thinking in terms of kilobytes, not gigabytes. And potentially
| low-single-digit numbers of kilobytes. It could be in the range
| of hundreds of bytes dedicated to stacks at the time. Even if
| you could compute your maximum size it's easy to imagine people
| balking at the results of such a computation and thinking it's not
| worth it to even consider the possibility because you'd blow
| your stack so quickly it's not like there'd be any benefit to
| it.
| a1369209993 wrote:
| > it isn't possible to statically know how much stack size
| would be required.
|
| It's not just that - if it were merely that you couldn't know
| the size statically, you could use dynamic memory allocation.
| The problem with _that_, though, is that now every (not provably
| nonrecursive) function call can fail with a memory
| allocation error, and if your language doesn't surface that to
| the caller, and (correctly) doesn't allow spurious errors to
| appear out of nowhere (cough every modern programming language
| cough cough), there's no way to handle that error.
| totorovirus wrote:
| I think allocating something very large on the stack should have
| been caught in peer review and fixed to use the heap.
| sys_64738 wrote:
| In other words, use heap space and not stack space. This is
| pretty elementary in C programming.
| kstenerud wrote:
| That might have been true in the old days when memory wasn't
| the bottleneck, but in today's world where a cache miss is
| catastrophic, it makes MUCH more sense to use stack space where
| you can. This also has the side effect of facilitating
| idempotent functions and function purity in general.
|
| No sense in clinging to old world ideals when they no longer
| make sense.
| jlokier wrote:
| I agree that using the stack instead of heap makes sense for
| cache reasons.
|
| But in the contemporary world, the trend is increasingly to
| transform functions to "async" forms where much of the
| functions' local state _including return address_ is stored
| in heap-allocated space instead.
| ariadneconill wrote:
| That is certainly a take.
|
| Storing small data, like function-local ints, pointers, etc,
| on the stack is beneficial due to L1$ prefetching semantics,
| but storing a 512KB scratchpad on the stack (which is what
| the article is about) will totally trash your L1$ and you'll
| have MORE cache misses than you would if that scratchpad was
| not on the stack.
| dnautics wrote:
| this is too reductive. Usually stack space is faster because
| you will have fewer cache misses.
|
| future languages will be able to suss this out at compile-time:
| https://github.com/ziglang/zig/blob/2ac769eab9b7dba4cd38e5de...
| sys_64738 wrote:
| If speed is the issue, then it sounds like this is a critical
| code path that is sensitive to time. You'd be looking for
| alternative data structures at that point, so this point would
| be moot.
| IshKebab wrote:
| Sure, but 128 kB is really small even if you do that properly.
|
| Seems like it would be more sensible if the stack space could
| just grow when required. Surely not that difficult?
| pjmlp wrote:
| Enough for a full COM binary plus one overlay section. :)
| aidenn0 wrote:
| A typical stack frame is around a couple dozen words. Let's
| round that up to 32 words (256 bytes). 128K is enough for a
| 500 deep stack at that size. 128K is huge.
| CodesInChaos wrote:
| Since standard OS stacks are contiguous and unmovable, you
| can't grow them once they run out of space. However while the
| address space for the maximum stack size gets reserved, each
| page only requires backing memory once it's first used. So as
| far as physical memory consumption is concerned, typical
| stacks act as growable with a fixed maximum size. Since
| address space is huge on 64-bit systems, choosing a large
| stack size is cheap on such systems (at least if they allow
| overcommit).
|
| For the main thread, a system can also try to keep other
| allocations far from the stack without committing to any
| particular size (heap grows upwards from the bottom of the
| address space, stack downwards from the top). But this
| doesn't scale to multiple threads and leads to an
| unpredictable maximum stack size, so I prefer the fixed
| reserved space approach.
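| A rough illustration of that reserve-vs-commit behaviour on
| Linux (a sketch, not from the comment above): reserve a large
| anonymous mapping, touch only a little of it, and watch RSS
| stay small.
|
|   #include <stdio.h>
|   #include <string.h>
|   #include <sys/mman.h>
|
|   int main(void)
|   {
|       size_t reserve = 64ul << 20;   /* 64 MiB of address space */
|       char *region = mmap(NULL, reserve, PROT_READ | PROT_WRITE,
|                           MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE,
|                           -1, 0);
|       if (region == MAP_FAILED) {
|           perror("mmap");
|           return 1;
|       }
|       /* Touch only the top 16 KiB: roughly four pages become
|          resident, the rest remains a pure reservation. */
|       memset(region + reserve - 16384, 0xAA, 16384);
|       getchar();   /* inspect the process's RSS from another shell */
|       munmap(region, reserve);
|       return 0;
|   }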
| viraptor wrote:
| The main thread already does that. The thread stack does not.
| It probably made sense on 32b with relatively limited address
| space... But I'm curious why we're not applying that to all
| threads on 64b. Reserving a few tens of MB of address space per
| thread shouldn't be a big issue, right? (Without actually
| mapping those pages)
| megous wrote:
| Mapping a lot of pages has some overhead, too. So if you
| have large stacks per thread you'll take a hit.
|
| If you use huge pages to alleviate that, you'll waste a lot
| of physical memory.
| megous wrote:
| Wait until you realize OS kernel thread stack size is 8 KiB
| or in that range, if you think 128 KiB is small. :))
| tyingq wrote:
| It is easy for a single-threaded program, but since each
| thread has its own stack, it would be non-trivial for a
| multithreaded program.
| viraptor wrote:
| I'd say "be aware of stack sizes". If you can get away with
| just the appropriately sized stack in a thread, that's a nice
| performance gain over dealing with multithreading heap
| allocators.
| ynik wrote:
| It's a bit ridiculous to complicate recursive algorithms just
| because the stack sizes haven't been increased in the past 3
| decades.
|
| Nowadays we have at least a 48-bit virtual address space
| available; what's the harm in giving each thread a full GB of
| stack?
| lonjil wrote:
| If you need a bigger stack, you can get one. This stuff is
| merely the default.
| sys_64738 wrote:
| Generally recursion is not something you want in production
| code. It's cute for academia when studying algorithms, but where
| there's an iterative alternative, that should be used.
| moring wrote:
| > In general, it is my opinion that if your program is crashing
| on Alpine, it is because your program is dependent on behavior
| that is not guaranteed to actually exist, which means your
| program is not actually portable. When it comes to this kind of
| dependency, the typical issue has to deal with the thread stack
| size limit.
|
| The wording sounds as if it is trying to assign blame for the
| problem. What, then, _is_ the guaranteed thread stack size? A
| developer would obviously need to know this (and other things
| such as the amount of stack size required by variables,
| parameters and frames) to not fall into this trap of writing non-
| portable programs.
| lonjil wrote:
| > What, then, is the guaranteed thread stack size?
|
| If you _need_ a big stack, you can just ask pthread to give you
| one.
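| A minimal sketch of exactly that (error handling trimmed):
|
|   #include <pthread.h>
|
|   static void *deep_work(void *arg) { (void)arg; return NULL; }
|
|   int main(void)
|   {
|       pthread_attr_t attr;
|       pthread_t t;
|
|       pthread_attr_init(&attr);
|       /* Request an 8 MiB stack instead of the libc default. */
|       pthread_attr_setstacksize(&attr, 8u * 1024 * 1024);
|       pthread_create(&t, &attr, deep_work, NULL);
|       pthread_attr_destroy(&attr);
|       pthread_join(t, NULL);
|       return 0;
|   }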
| nsajko wrote:
| > The wording sounds as if it is trying to assign blame for the
| problem.
|
| Yes. Sadly, people often write incorrect programs.
|
| > What, then, is the guaranteed thread stack size?
|
| I can't be bothered to look up the POSIX guarantees; the gist
| of it, anyway, is that it depends on the application developer
| and system administrator.
|
| > A developer would obviously need to know this (and other
| things such as the amount of stack size required by variables,
| parameters and frames) to not fall into this trap of writing
| non-portable programs.
|
| If you're not going to calculate the exact requirements
| (probably unnecessary), guess/measure a number of bytes and
| allocate a stack that's 50 or 100 times greater than that.
| That's better than ignoring the existence of the stack, anyway.
| tyingq wrote:
| Guessing PTHREAD_STACK_MIN
| formerly_proven wrote:
| > Minimum Acceptable Value: 0
|
| Thanks, posix.
| ClumsyPilot wrote:
| It would be nice if the article explained why the current size
| was chosen and what the benefit of doing so is.
| kosinus wrote:
| This is one of the reasons why I'm no longer using Alpine as a
| base in Docker images. I ran into this limit specifically with
| node-sass.
|
| But in general, the difference in image size is negligible
| because of shared layers, and I just don't think enough testing
| happens on Alpine / musl in any given stack. Even if your app
| runtime is tested this way, how many dependencies are?
|
| Come to think of it, I'm not even sure why there was a push for
| Alpine-based Docker images at some point. Maybe it was just hype.
| aecay wrote:
| At $WORK, there's a process for automatically scanning docker
| images for packages that have CVEs against them. Any docker
| image that includes glibc instantly shoots to the top of the
| charts, mostly because of a boatload of high or critical
| severity CVEs relating to bugs in asm-implemented functions on
| platforms like ARM, POWER9, etc. Everything in our company runs
| on x86, but the CVE scanning tool is dumb, so a switch to
| alpine was heavily encouraged.
|
| This broke teams that rely on python and on node, but the
| docker image guidelines come from a team whose ideal language
| is now go (and most of whose legacy code is in java), so they
| are not really sensitive to those concerns. Ironically we tried
| to move to distroless as implemented by google[1], but that's
| based on debian which includes glibc, so the un-nuanced CVE
| checker freaks out again. That effort was quietly dropped.
|
| (I'm not actually disputing the proposition that alpine is
| better for security under certain circumstances, but I think a
| lot of "the push" comes from what might uncharitably be
| described as cargo culting, or with more insight as
| interpretations that make sense in one context [everything is a
| static binary, little to no reliance on traditional userland
| tools] being unquestioningly extended to other contexts.)
|
| [1] https://github.com/GoogleContainerTools/distroless
| arghwhat wrote:
| > Come to think of it, I'm not even sure why there was a push
| for Alpine-based Docker images at some point.
|
| The continuing push is due to the smaller footprint and better
| security properties. And no amount of sharing makes up for the
| difference between a single-MB image and a GB image.
|
| Any application can just dictate its own thread stack size.
| What is discussed here is a default.
| bradleyjg wrote:
| A slimmer image is better from an attack surface point of view.
| "Distroless" with its tree shaking takes this to its logical
| conclusion but when images on alpine started getting popular
| that wasn't available (at least to the general public).
| ldoughty wrote:
| It took some time before Ubuntu and Debian offered official
| slim containers... Before that, many programs in containers
| defaulted to Ubuntu or Debian, and it was 600MB to run
| something like Nginx or Apache, while Alpine was 40MB.
| rburhum wrote:
| Why would you overcomplicate your life and use something like the
| autofree example, that is not even portable, if you can use the
| heap which is simple to understand and do? I understand that if
| it is a hot function you may run into memory
| fragmentation/performance issues, but there are so many ways to
| deal with that with custom allocators _if it truly is a problem_.
| This is one of those perfect examples where simple is better IMHO.
| eqvinox wrote:
| The autofree example uses the heap. It just makes calling
| free() automatic when the function returns, regardless of where
| it does so. It's leak protection.
|
| It's also wrong since __attribute__((cleanup)) expects a
| function that takes an additional level of pointer. In this
| case, it'll call "free(&scratchpad);". Which doesn't get you a
| compiler warning in C because passing a char ** as a void * is
| perfectly fine. But your heap is f*cked after this.
|
| Correct way to do it is:
|
|   void free2(char **p) { free(*p); }
|   #define autofree __attribute__((cleanup(free2)))
| chrisseaton wrote:
| How can you write a program that runs without at least some
| guaranteed stack size? Are you at fault if your program doesn't
| run in a 1 KiB stack? And how do you work out what stack size your
| program takes from looking at the source code?
|
| I guess make sure your required stack size is not a function of
| input, and test against a minimum stack size.
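| One way to do that last part (a sketch, with run_workload standing
| in for whatever you want to test): run the code under test in a
| thread whose stack is deliberately tight, so overflows show up in
| CI rather than only on platforms with small defaults.
|
|   #include <pthread.h>
|
|   extern void *run_workload(void *arg);   /* hypothetical test entry */
|
|   int run_with_small_stack(void)
|   {
|       pthread_attr_t attr;
|       pthread_t t;
|       pthread_attr_init(&attr);
|       /* 128 KiB, roughly the musl/Alpine default, as the budget. */
|       pthread_attr_setstacksize(&attr, 128 * 1024);
|       int rc = pthread_create(&t, &attr, run_workload, NULL);
|       pthread_attr_destroy(&attr);
|       if (rc == 0)
|           pthread_join(t, NULL);
|       return rc;
|   }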
| simias wrote:
| If you have to work in an environment where stack size is very
| limited (typically a few KiB) you have to pay attention to
| certain things that you can brush away in more generous
| environments. In particular you need to be very careful with
| recursive functions and you probably want to use the heap or
| static storage for any object bigger than a couple dozen
| bytes.
|
| But in my experience you don't really compute a "guaranteed"
| stack size, you use your experience and knowledge of the
| program to make an educated guess, and then you apply a
| reasonable multiplier to give you some security margin.
|
| If you don't use (or severely limit) recursive calls you can
| usually just check that your deepest call stack fits within the
| bounds. Although finding the deepest call stack in the first
| place can be tricky given that compilers can aggressively
| inline function calls.
| viraptor wrote:
| Unless you do alloca() or dynamically sized local arrays, you
| can measure your stack usage in the deepest call stack. Add
| some space in each frame for potential instrumentation and you
| have your minimum.
|
| Keep in mind that this is just for thread stacks - you can set
| the size for them yourself, so ideally you'd always do it. Then
| a guaranteed minimum size becomes irrelevant.
| chrisseaton wrote:
| > Unless you do alloca() or dynamically sized local arrays,
| you can measure your stack usage in the deepest call stack.
|
| How does a normal working programmer calculate the size of
| each of their stack frames? I'm a compiler researcher and I'd
| struggle to do that. How are application developers going to
| do it?
|
| And how do you design a program to have a deterministic
| maximum call stack depth?
|
| I don't think these things are as easy as you're making out.
| viraptor wrote:
| You have 2 options: either your functions are recursive and
| you can hope and pray, or they're not and you can figure
| out which of your functions are the bottom of the call
| graph.
|
| In those leaf functions you can check &local_var and
| compare it to pthread_attr_getstack(pthread_getattr_np()).
| (Of course that's not precise for many reasons.)
|
| > And how do you design a program to have a deterministic
| maximum call stack depth?
|
| If you're running only your code - don't use recursion, or
| alloca. If you use external libraries, you have to research
| what they do and add some extra in case of updates.
|
| Bounded stack size is also a common issue if you're
| targeting small microprocessors.
|
| For non-critical apps it should be pretty easy to figure
| out the needed stack size. For cases when you want to
| guarantee it... that gets more tricky.
|
| Edit: just learned that clang has the option -fstack-usage
| which should help a lot.
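| The &local_var comparison above, spelled out with the real
| (non-portable) calls (glibc and musl both provide them; the
| result is approximate, as noted):
|
|   #define _GNU_SOURCE
|   #include <pthread.h>
|   #include <stdio.h>
|
|   /* Call from a deep/leaf function to see how much stack is left. */
|   void report_stack_headroom(void)
|   {
|       pthread_attr_t attr;
|       void *stack_base;
|       size_t stack_size;
|       char marker;   /* its address approximates the stack pointer */
|
|       if (pthread_getattr_np(pthread_self(), &attr) != 0)
|           return;
|       pthread_attr_getstack(&attr, &stack_base, &stack_size);
|       pthread_attr_destroy(&attr);
|
|       /* Stacks grow downwards on the usual targets, so this is the
|          distance left before hitting the low end of the mapping. */
|       size_t headroom = (size_t)(&marker - (char *)stack_base);
|       printf("stack: %zu KiB total, ~%zu KiB still unused\n",
|              stack_size / 1024, headroom / 1024);
|   }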
| CodesInChaos wrote:
| Dynamic dispatch (e.g. function pointers/delegates,
| virtual methods) is another case that makes figuring out
| the call graph and thus the maximum stack size difficult
| (via whole-program data-flow analysis) to impossible
| (function pointers come from outside your codebase or are
| constructed in ways analysis can't handle).
| creata wrote:
| > You have 2 options: either your functions are recursive
| and you can hope and pray
|
| Or you can try to figure out the maximum number of times
| it'll recurse: for example, the height of a red-black
| tree with less than 2^64 nodes is less than 128, iirc.
| lanstin wrote:
| One rather quickly runs into halting-problem type issues,
| especially in the function dispatch method. Imagine a DSL
| that does stuff, and is implemented by function pointers
| in the parser/interpreter, and then the question becomes
| one of program inputs. In any case, having such a small
| limit is crazy, and defending it with references to
| correctness smells of Ulrich Drepper and the memmove
| issue. The whole "sucks less" movement is a little too
| focused on purity for my taste. The only time I'm sad
| when I look at my memory usage on my personal laptop is
| when I have unused memory. Please, pre-fetch some
| news.ycombinator, cache some more inodes for my next ncdu
| or find command; I have already paid for the memory, not
| using it is silly. Sure, we software engineers get lazy,
| but, except in AWS, use all your memory, all your
| processors up to throttling, all the time. Why not?
|
| I remember one day in the 90s counting out like max
| address len and max zip code len and so on, trying to
| figure out how long to make my target stack-allocated
| buffer, and I was like fuck it, I have more important
| things to do, all my stack buffers are henceforth 65536
| bytes long.
| MaxBarraclough wrote:
| If we forbid things like recursion (including mutual
| recursion), function pointers, dynamic dispatch, and
| unbounded use of _alloca_, doesn't it then follow from the
| call graph and the per-function worst-case stack-usage
| numbers (which the compiler presumably knows)? Is that
| mistaken, or is the difficulty in generalising this
| approach to where those restrictions are lifted?
|
| I tried googling for how SPARK Ada provides assurances
| against exceeding stack-size limits, but I couldn't find a
| decent answer. I presume it does so, though.
|
| _edit: forgot about alloca_
|
| _edit 2: Turns out the AdaCore folks have a tool
| specifically for static analysis of stack-space
| requirements of Ada/C/C++ code:_
| https://www.adacore.com/gnatpro/toolsuite/gnatstack
| mjw1007 wrote:
| Not easy at all.
|
| I know that in the small-embedded world, people do work on
| such things.
|
| Eg https://github.com/japaric/cargo-call-stack
| megous wrote:
| You ask a compiler, since it knows the max stack
| requirements of every function it compiled, if it's fixed.
| If it's not fixed it may give you at least the minimum.
|
| For total depth, keeping your program simple and predictable
| helps. People certainly manage to do it even for large
| programs like Linux itself, where stack size is like 16KiB
| or so. https://elixir.bootlin.com/linux/v5.2/source/arch/x8
| 6/includ... and less on other archs. 8 KiB on arm https://e
| lixir.bootlin.com/linux/v5.13-rc7/source/arch/arm/i...
| chrisseaton wrote:
| But the compiler may not compile simple 'functions' as
| the user understands them - it may compile loop bodies,
| functions with other functions compiled in them,
| individual branches of functions, multiple versions of
| functions based on where they're called from...
|
| If I tell you as a compiler writer that this loop body
| from this function, but with this branch and this branch
| outlined, but only when called from this context, takes n
| bytes... I don't get what most working programmers are
| going to usefully do with that information.
| megous wrote:
| Where's the problem? The compiler will tell you what amount
| of stack a function will use; if it's inlined, it may not
| tell you for that function, but it will tell you for the
| function it was inlined into, which is what
| matters.
|
| If the language is complicated and has generics or
| whatever, the programmer will have to do more work to
| understand it.
|
| It's not a huge issue in C.
| chrisseaton wrote:
| > Compiler will tell you what amount of stack a function
| will use
|
| If you ask a compiler how much stack a function will use
| the answer for a non-trivial compiler for a complicated
| language is always going to be 'it depends...'
| MaxBarraclough wrote:
| As I mentioned in my other comment, AdaCore's _GNATstack_
| tool appears to be capable of reporting this information
| conservatively but with enough accuracy to be useful.
|
| https://www.adacore.com/gnatpro/toolsuite/gnatstack
| bsdetector wrote:
| > Add some space in each frame for potential instrumentation
| and you have your minimum.
|
| On exit just scan from the maximum stack to minimum looking
| for non-zero.
|
| If you have tests it should be easy to get within a few bytes
| of max stack used, which is probably just as good as
| instrumenting everything.
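| A sketch of the "paint, run, then scan" variant of that idea
| (hypothetical helper names, Linux/glibc/musl extensions assumed):
|
|   #define _GNU_SOURCE
|   #include <pthread.h>
|   #include <string.h>
|
|   #define PAINT 0xA5
|   #define GUARD (16 * 1024)   /* keep clear of the painting frames */
|
|   static void get_stack(char **base, size_t *size)
|   {
|       pthread_attr_t attr;
|       void *addr = NULL;
|       pthread_getattr_np(pthread_self(), &attr);
|       pthread_attr_getstack(&attr, &addr, size);
|       pthread_attr_destroy(&attr);
|       *base = addr;
|   }
|
|   /* Call near the top of the thread, before the workload: fill
|      the unused part of the stack with a known pattern. */
|   void stack_paint(void)
|   {
|       char *base; size_t size; char here;
|       get_stack(&base, &size);
|       if (&here - base > GUARD)
|           memset(base, PAINT, (size_t)(&here - base) - GUARD);
|   }
|
|   /* Call at the end: how deep did the workload ever reach? */
|   size_t stack_high_water(void)
|   {
|       char *base; size_t size, untouched = 0;
|       get_stack(&base, &size);
|       while (untouched < size &&
|              (unsigned char)base[untouched] == PAINT)
|           untouched++;
|       return size - untouched;   /* worst-case stack bytes used */
|   }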
| viraptor wrote:
| It's possible, but you need to watch out for some cases.
| For example let's say your furthest function declares char
| foo[4096], but uses only a few bytes of it in your testing.
| Your measurement will be 4k short.
| saagarjha wrote:
| In general, you just can't. This means that any function call
| in C can bust the stack, unfortunately. You can use
| heuristics to try to avoid using up large amounts of space
| (avoid alloca and large stack arrays, be careful about
| recursion) but other than that there isn't much you can do.
| sesuximo wrote:
| GCC has "-fstack-usage"
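| For reference, a tiny example of what that reports (hedged: the
| exact .su output format differs between compiler versions).
| Given something like:
|
|   /* demo.c -- deliberately oversized frame */
|   void some_function(volatile char *out)
|   {
|       char scratchpad[500000];
|       for (int i = 0; i < 500000; i++)
|           scratchpad[i] = 'A';
|       *out = scratchpad[499999];   /* keep the buffer live */
|   }
|
| building with "gcc -c -fstack-usage demo.c" writes a demo.su file
| with one line per function giving its frame size in bytes plus a
| static/dynamic qualifier, which makes frames like the one above
| easy to spot.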
___________________________________________________________________
(page generated 2021-06-26 23:03 UTC)