https://devblogs.microsoft.com/oldnewthing/20240401-00/?p=109599 Skip to main content [RE1Mu3b] Microsoft The Old New Thing The Old New Thing The Old New Thing * Home * DevBlogs * Developer + Visual Studio + Visual Studio Code + Visual Studio for Mac + DevOps + Windows Developer + Developer support + ISE Developer + Engineering@Microsoft + Azure SDK + Command Line + Perf and Diagnostics + Dr. International + Math in Office + React Native * Technology + DirectX + Semantic Kernel + SurfaceDuo + Windows AI Platform * Languages + C++ + C# + F# + TypeScript + PowerShell Community + PowerShell Team + Python + JavaScript + Java + Java Blog in Chinese + Go * .NET + All .NET posts + .NET MAUI + ASP.NET Core + Blazor + Entity Framework + AI + Machine Learning + Servicing + Xamarin + .NET Blog in Chinese * Platform Development + #ifdef Windows + Azure Government + Azure VM Runtime Team + Bing Dev Center + Microsoft Edge Dev + Microsoft Azure + Microsoft 365 Developer + Microsoft Entra Identity Developer Blog + Old New Thing + Power Platform + Windows MIDI and Music dev * Data Development + Azure Cosmos DB + Azure Data Studio + Azure SQL Database + OData + Revolutions R + SQL Server Data Tools * More [ ] Search Search * No results Cancel Subroutine calls in the ancient world, before computers had stacks or heaps [png] Raymond Chen April 1st, 202412 5 We take stacks and heaps for granted nowadays, but back in the very old days of computing, computers operated without a stack or a heap. Tell a recent college graduate this, and you may as well tell them that there was a time when you didn't have instant access to millions of cat videos. It's not too hard to imagine computing without dynamic memory allocation. You just have to use fixed-size memory buffers for everything. If you have to operate on variable-sized data, you reserved a fixed-size buffer of some capacity that is large enough to accommodate any data you would reasonably be expected to process, and if somebody asked for more, you just exited the program with a fatal error. If you were really nice, you would provide a compile-time configuration so your clients could adjust the maximum capacity to suit their datasets. And if you were really fancy, you wrote a custom allocator that operated on that fixed-size buffer so people could "allocate" and "free" memory from the buffer. But operating without a stack? How did you call a function if you didn't have a stack for the return address or local variables? Here's how it worked. First, the compiler defined a secret global variable for each inbound function parameter, plus another secret global variable for each function to hold the return address. It also defined a secret global variable for each of the function's local variables. To generate a function call, the compiler assigned the parameter values to the corresponding secret global variables, assigned the return address to the function's secret "return address variable", and then jumped to the start of the function. The function read its parameters from its secret global variables, and used the pre-defined secret global variables that corresponded to its logically local variables. When the function was finished, it jumped to the address held in the function's secret "return address variable." For example, suppose you had code like this, written in a C-like language: int add_two_values(int a, int b) { int c = a + b; return c; } void sample() { int x = add_two_values(31415, 2718); } This would be transformed by the compiler into something like this: int a2v_a; int a2v_b; int a2v_c; void* a2v_retaddr; int add_two_values() { a2v_c = a2v_a + a2v_b; return_value_register = a2v_c; goto a2v_retaddr; } int sample_x; void sample() { a2v_a = 31415; a2v_b = 2718; a2v_retaddr = &resume; goto add_two_values; resume: sample_x = return_value_register; } Check it out: We did a function call and return without a stack! Now, you can optimize the ABI by, say, passing some of these values in registers rather than globals. For example, most processors had a special "link" register and a special instruction "branch with link" that automatically set the link register equal to the address of the instruction after the "branch with link" instruction, And maybe you optimize the calling convention to pass the first two parameters in registers, resulting in this: int a2v_a; int a2v_b; int a2v_c; void* a2v_retaddr; int add_two_values() { a2v_a = argument_register_1; a2v_b = argument_register_2; a2v_retaddr = link_register; a2v_c = a2v_a + a2v_b; return_value_register = a2v_c; goto a2v_retaddr; } int sample_x; void sample() { argument_register_1 = 31415; argument_register_2 = 2718; branch_with_link add_two_values; sample_x = return_value_register; } There was just one catch: You can't do recursion. Recursion doesn't work because a recursive call would overwrite the return-address variable with the return address of the recursive call, and when the outer call completed, it would jump to the wrong place. The programming languages of the day solved this problem by simply declaring it illegal: They didn't support recursion.1 Bonus chatter: Some compilers were even sneakier and used self-modifying code: The special return-address variable was really the address field of the jump instruction at the end of the function! This was occasionally not so much a sneaky trick as a practical necessity: The processor might not support indirect jumps either! After the practical value of subroutines was recognized, quite a few processors added a subroutine call instruction that worked by storing the return address at the first word of the subroutine, and beginning execution at the second word of the subroutine. To return from a subroutine, you execute an indirect jump through the subroutine start label. (As I recall, some processors stored the return address at the word before the first instruction of the subroutine.) Here's what it looked like using a made-up assembly language: add_two_values: nop ; return address goes here add r1 = r1, r2 ; actual subroutine begins here jmp @add_two_values ; indirect jump to return address sample: mov r1 = 31415 ; first parameter mov r2 = 2718 ; second parameter bsr add_two_values ; call subroutine st sample_x = r1 ; save return value When the CPU executed the bsr branch-to-subroutine instruction, it stored the return address into the first word of add_two_values (overwriting the sacrificial nop) and began execution at the following instruction, the add r1 = r1, r2. 1 FORTRAN initially didn't even support subroutines! Those were added in 1958. And support in FORTRAN for recursion didn't become standard until 1991, and even then, you had to explicitly declare your subroutine as RECURSIVE. [png] Raymond Chen Follow Tagged History Read next What is a hard error, and what makes it harder than an easy error? A throwback to the early days of 16-bit Windows. [png]Raymond Chen January 16, 2024 4 comments That time the Word team sent presents to the children of WordPerfect's executive vice president No, it wasn't creepy. [png]Raymond Chen December 25, 2023 3 comments 12 comments Leave a commentCancel reply Log in to join the discussion or edit/delete existing comments. * [png] Thomas Harte April 1, 2024 2:18 pm 1 collapse this comment copy link to this comment The world's most minor gripe: I repeatedly read the title of this post as subject-verb-object, i.e. the subject "The ancient world before computers", the verb "had" and the object "stacks or heaps". So I kept expecting the article to tack back to an unexpected pre-computing ancestor of heaps and stacks. Albeit that for a stack the antecedent would just be a stack of something. I don't know what you'd pick for a pre-computing heap. Maybe office space allocation in a commercial building? That's not a strong suggestion but it's all I have. Log in to Vote or Reply + [png] Yuri Khan April 2, 2024 12:29 am 0 collapse this comment copy link to this comment Hey, I'm pretty sure cafes existed before computers and they had both heaps (of dishes before washing) and stacks (after drying). Log in to Vote or Reply + [png] Raymond ChenMicrosoft employee April 3, 2024 12:33 pm 0 collapse this comment copy link to this comment I tweaked the title, should be less ambiguous now. Log in to Vote or Reply * [png] Peter Raynham April 1, 2024 3:31 pm 0 collapse this comment copy link to this comment All too often, problems come along for which there are no solutions, and it's sad (a war flares up, a sentimental coffee mug shatters). Almost as often, solutions come along for which there are no problems, and its amusing (my social media is full of ads for such gizmos). But once in a rare while, a problem and its solution come along just at the same time, and it's a beautiful sight to see. One of these rare events was in the late 1950s when block-structured languages like ALGOL were invented and the call-return stack was devised. A new generation of compilers needed recursion to compile nested code blocks, and the call-return stack happened to be just the solution for that problem. Log in to Vote or Reply * [png] Ian Boyd April 1, 2024 7:58 pm 3 collapse this comment copy link to this comment It's still amazing to me that even as recently as Windows 95, which had to run in 4 MB of RAM, that we had to scrimp and save every byte, and every cycle. Where every page fault was a nightmare. Whereas now the CPU is executing two or three *hundred* instructions ahead, across multple loop iterations, across multiple function calls, and returns. Where a 2 GB address limit now feels confining, and our PCs regularly have more RAM than we could ever use, so Windows uses the other half of it as a cache (SuperFetch). Where we don't have to have a mini-COM, or declare memory as discardable or not-discardable, or squeeze every byte out of an image with palettes. And you can have a full 3D recreation of the planet, with every tree and building, running on Javascript inside a browser. What a wonderous age we live in. Log in to Vote or Reply * [png] Dave Gzorple April 1, 2024 8:35 pm 1 collapse this comment copy link to this comment You didn't really need to make up some assembly language, you could have used the most famous instruction set that did this, and one which is still used today, the IBM 360 with balr et al. Log in to Vote or Reply + [png] Mark out West April 2, 2024 8:56 am 0 collapse this comment copy link to this comment Yup. For reentrancy/recursion/multithreading IBM utilized a "dynamic storage area" protocol in lieu of a hardware stack. A main memory save area was dynamically allocated and chained together in essentially a doubly linked list that lengthened as the calls deepened. Each DSA saved the entire register set and the set of passed parameters. Because the save area was heap allocated, the executable code was considered read-only and thus thread-safe. Simpler routines simply dumped the register set into a statically allocated save area within the code and restored its contents on exit. Log in to Vote or Reply * [png] Neil Rashbrook April 2, 2024 5:24 am 1 collapse this comment copy link to this comment Ooh, FORTRAN eventually got recursion? I remember it didn't back in the day when I had a course on it. (I think my year was the last year where FORTRAN was a mandatory course for first-year students. It was also the first year that they actually taught C to first year students.) Log in to Vote or Reply * [png] Hong Lou Tou April 2, 2024 8:20 am 0 collapse this comment copy link to this comment Stackless coroutines in C++ support inline allocation, if the compiler can prove lifetime shenanigans, can see the definition of the callee coroutine (so it knows how many bytes its coroutine frame consumes) and, of course, if the call isn't recursive. This process can then continue on for the transitive callees of the callee, and the end result is a big, fixed size contiguous buffer that's just big enough to provide memory for the execution of the entire coroutine call hierarchy. It's the compiler taking code that you wrote assuming the omnipresence of the stack, analyzing it, and emitting code in a way as if you coded like the pre-stack programmers did. Note how the compiler doing this optimization has nothing to do with coroutines; because most functions used in programming do not recurse directly or indirectly, the compiler could, in theory, compile such functions into machine code that operated within a static blob of scratch memory, using a special "stackless" calling convention that prohibited accessing the stack pointer in any way; and if you launched a thread with one of those stackless functions as its code, then the thread wouldn't need to have a stack (well it needs, because there are devils such as interrupts, APCs, POSIX signals, etc, but in embedded programming there's a good chance you'll one day find yourself creating multiple threads running such stackless functions, and in a world where stackless functions were real, it would save a huge amount of memory to be able to create threads without having to give each thread hundreds of bytes of stack, and while interrupts exist, you now need a stack per processor for handling them, instead of one per thread). Log in to Vote or Reply * [png] aidtopia April 3, 2024 9:22 am 0 collapse this comment copy link to this comment It's not just the ancient world! There are still domains today where dynamic memory allocation is not allowed and you have to guarantee an upper bound on the stack size (which often implies no recursion). If your embedded system is a critical safety component, you probably have to conform to guidelines, like MISRA, that prohibit such perilous luxuries. "[J]ust [exit] the program with a fatal error." Ha! not unless you can guarantee the program will be immediately restarted in a known good state. Log in to Vote or Reply * [png] Mark Yagnatinsky April 3, 2024 12:40 pm 0 collapse this comment copy link to this comment Aside from not supporting recursion, this sounds like it would not cope well with function pointers, right? Log in to Vote or Reply + [png] Raymond ChenMicrosoft employee April 3, 2024 1:37 pm 0 collapse this comment copy link to this comment Programming languages of the day generally didn't have the concept of "function pointers". But if they did, you can make function pointers be, say, a pointer to a structure that has the code address (read-only), the return address (read-write), and the parameters (read-write). Log in to Vote or Reply Archive * April 2024 * March 2024 * February 2024 * January 2024 * December 2023 * November 2023 * October 2023 * September 2023 * August 2023 * July 2023 * June 2023 * May 2023 * April 2023 * March 2023 * February 2023 * January 2023 * December 2022 * November 2022 * October 2022 * September 2022 * August 2022 * July 2022 * June 2022 * May 2022 * April 2022 * March 2022 * February 2022 * January 2022 * December 2021 * November 2021 * October 2021 * September 2021 * August 2021 * July 2021 * June 2021 * May 2021 * April 2021 * March 2021 * February 2021 * January 2021 * December 2020 * November 2020 * October 2020 * September 2020 * August 2020 * July 2020 * June 2020 * May 2020 * April 2020 * March 2020 * February 2020 * January 2020 * December 2019 * November 2019 * October 2019 * September 2019 * August 2019 * July 2019 * June 2019 * May 2019 * April 2019 * March 2019 * February 2019 * January 2019 * December 2018 * November 2018 * October 2018 * September 2018 * August 2018 * July 2018 * June 2018 * May 2018 * April 2018 * March 2018 * February 2018 * January 2018 * December 2017 * November 2017 * October 2017 * September 2017 * August 2017 * July 2017 * June 2017 * May 2017 * April 2017 * March 2017 * February 2017 * January 2017 * December 2016 * November 2016 * October 2016 * September 2016 * August 2016 * July 2016 * June 2016 * May 2016 * April 2016 * March 2016 * February 2016 * January 2016 * December 2015 * November 2015 * October 2015 * September 2015 * August 2015 * July 2015 * June 2015 * May 2015 * April 2015 * March 2015 * February 2015 * January 2015 * December 2014 * November 2014 * October 2014 * September 2014 * August 2014 * July 2014 * June 2014 * May 2014 * April 2014 * March 2014 * February 2014 * January 2014 * December 2013 * November 2013 * October 2013 * September 2013 * August 2013 * July 2013 * June 2013 * May 2013 * April 2013 * March 2013 * February 2013 * January 2013 * December 2012 * November 2012 * October 2012 * September 2012 * August 2012 * July 2012 * June 2012 * May 2012 * April 2012 * March 2012 * February 2012 * January 2012 * December 2011 * November 2011 * October 2011 * September 2011 * August 2011 * July 2011 * June 2011 * May 2011 * April 2011 * March 2011 * February 2011 * January 2011 * December 2010 * November 2010 * October 2010 * September 2010 * August 2010 * July 2010 * June 2010 * May 2010 * April 2010 * March 2010 * February 2010 * January 2010 * December 2009 * November 2009 * October 2009 * September 2009 * August 2009 * July 2009 * June 2009 * May 2009 * April 2009 * March 2009 * February 2009 * January 2009 * December 2008 * November 2008 * October 2008 * September 2008 * August 2008 * July 2008 * June 2008 * May 2008 * April 2008 * March 2008 * February 2008 * January 2008 * December 2007 * November 2007 * October 2007 * September 2007 * August 2007 * July 2007 * June 2007 * May 2007 * April 2007 * March 2007 * February 2007 * January 2007 * December 2006 * November 2006 * October 2006 * September 2006 * August 2006 * July 2006 * June 2006 * May 2006 * April 2006 * March 2006 * February 2006 * January 2006 * December 2005 * November 2005 * October 2005 * September 2005 * August 2005 * July 2005 * June 2005 * May 2005 * April 2005 * March 2005 * February 2005 * January 2005 * December 2004 * November 2004 * October 2004 * September 2004 * August 2004 * July 2004 * June 2004 * May 2004 * April 2004 * March 2004 * February 2004 * January 2004 * December 2003 * November 2003 * October 2003 * September 2003 * August 2003 * July 2003 Relevant Links I wrote a book Ground rules Disclaimers and such My necktie's Twitter Categories Code History Tips/Support Other Non-Computer Stay informed [ ] [Subscribe] By subscribing you agree to our Terms of Use and Privacy Policy Share on Social media * * * Login Theme * light-theme-iconLight * dark-theme-iconDark Insert/edit link Close Enter the destination URL URL [ ] Link Text [ ] [ ] Open link in a new tab Or link to existing content Search [ ] No search term specified. Showing recent items. Search or use up and down arrow keys to select an item. Cancel [Add Link] Code Block x Paste your code snippet [ ] Cancel Ok Feedback usabilla icon What's new * Surface Laptop Studio 2 * Surface Laptop Go 3 * Surface Pro 9 * Surface Laptop 5 * Microsoft Copilot * Copilot in Windows * Microsoft 365 * Windows 11 apps Microsoft Store * Account profile * Download Center * Microsoft Store support * Returns * Order tracking * Certified Refurbished * Microsoft Store Promise * Flexible Payments Education * Microsoft in education * Devices for education * Microsoft Teams for Education * Microsoft 365 Education * How to buy for your school * Educator training and development * Deals for students and parents * Azure for students Business * Microsoft Cloud * Microsoft Security * Dynamics 365 * Microsoft 365 * Microsoft Power Platform * Microsoft Teams * Copilot for Microsoft 365 * Small Business Developer & IT * Azure * Developer Center * Documentation * Microsoft Learn * Microsoft Tech Community * Azure Marketplace * AppSource * Visual Studio Company * Careers * About Microsoft * Company news * Privacy at Microsoft * Investors * Diversity and inclusion * Accessibility * Sustainability Your Privacy Choices Your Privacy Choices Consumer Health Privacy * Sitemap * Contact Microsoft * Privacy * Manage cookies * Terms of use * Trademarks * Safety & eco * Recycling * About our ads * (c) Microsoft 2024