https://lemire.me/blog/2021/07/29/measuring-memory-usage-virtual-versus-real-memory/ Skip to content Daniel Lemire's blog Daniel Lemire is a computer science professor at the University of Quebec (TELUQ) in Montreal. His research is focused on software performance and data engineering. He is a techno-optimist. Menu and widgets * My home page * My papers * My software Subscribe Email Address [ ] [ ] [Subscribe by email] Where to find me? I am on Twitter and GitHub: Follow @lemire You can also find Daniel Lemire on * on Google Scholar with 4k citations and over 75 peer-reviewed publications, * on Facebook, * and on LinkedIn. Before the pandemic of 2020, you could meet Daniel in person, as he was organizing regular talks open to the public in Montreal: tribalab and technolab . Search for: [ ] [Search] Support my work! I do not accept any advertisement. However, you can support the blog with donations through paypal. Please consider getting in touch if you are a supporter so that I can thank you. Recent Posts * Science and Technology links (July 31st 2021) * Measuring memory usage: virtual versus real memory * Faster sorted array unions by reducing branches * Science and Technology links (July 10th 2021) * Compressing JSON: gzip vs zstd Recent Comments * Vincent Bernat on Science and Technology links (July 31st 2021) * Damian Vuleta on Fast random shuffling * Sergey Nikolaev on Decoding over 4 billion integers per second in C * George Spelvin on Faster sorted array unions by reducing branches * Daniel Lemire on Fast unary decoding Pages * A short history of technology * About me * Book recommendations * Cognitive biases * Interviews and talks * My bets * My favorite articles * My favorite quotes * My readers * My sayings * Predictions * Recommended video games * Terms of use * Write good papers Archives Archives [Select Month ] Boring stuff * Log in * Entries feed * Comments feed * WordPress.org Measuring memory usage: virtual versus real memory Software developers are often concerned with the memory usage of their applications, and rightly so. Software that uses too much memory can fail, or be slow. Memory allocation will not work the same way under all systems. However, at a high level, most modern operating systems have virtual memory and physical memory (RAM). When you write software, you read and write memory at addresses. On modern systems, these addresses are 64-bit integers. For all practical purposes, you have an infinite number of these addresses: each running program could access hundreds of terabytes. However, this memory is virtual. It is easy to forget what virtual mean. It means that we simulate something that is not really there. So if you are programming in C or C++ and you allocate 100 MB, you may not use 100 MB of real memory at all. The following line of code may not cost any real memory at all: constexpr size_t N = 100000000; char *buffer = new char[N]; // allocate 100MB Of course, if you write or read memory at these 'virtual' memory addresses, some real memory will come into play. You may think that if you allocate an object that spans 32 bytes, your application might receive 32 bytes of real memory. But operating systems do not work with such fine granularity. Rather they allocate memory in units of "pages". How big is a page depends on your operating system and on the configuration of your running process. On PCs, a page might often be as small as 4 kB, but it is often larger on ARM systems. Operating systems allow you to request large pages (e.g., one gigabyte). Your application receives "real" memory in units of pages. You can never just get "32 bytes" of memory from the operating system. It means that there is no sense micro-optimizing the memory usage of your application: you should think in terms of pages. Furthermore, receiving pages of memory is a relative expensive process. So you probably do not want to constantly grab and release memory if efficiency is important to you. Once you have allocated virtual memory, can we predict the actual (real) memory usage within the following loop? for (size_t i = 0; i < N; i++) { buffer[i] = 1; } The result will depend on your system. But a simple model is as follows: count the number of consecutive pages you have accessed, assuming that your pointer begins at the start of a page. The memory used by the pages is a lower-bound on the memory usage of your process, assuming that the system does not use other tricks (like memory compression or other heuristics). I wrote a little C++ program under Linux which prints out the memory usage at regular intervals within the loop. I use about 100 samples. As you can see in the following figure, my model (indicated by the green line) is an excellent predictor of the actual memory usage of the process. [plot] Thus a reasonable way to think about your memory usage is to count the pages that you access. The larger the pages, the higher will be the cost in this model. It may thus seem that if you want to be frugal with memory usage, you would use smaller pages. Yet a mobile operating system like Apple's iOS has relatively larger pages (16 kB) than most PCs (4 kb). Given a choice, I would almost always opt for bigger pages because they make memory allocation and access cheaper. Furthermore, you should probably not worry too much about virtual memory. Do not blindly count the address ranges that your application has requested. It might have little to no relation with your actual memory usage. Modern systems have a lot of memory and very clever memory allocation techniques. It is wise to be concerned with the overall memory usage of your application, but you are more likely to fix your memory issues at the software architecture level than by micro-optimizing the problem. Published by [4b7361] Daniel Lemire A computer science professor at the University of Quebec (TELUQ). View all posts by Daniel Lemire Posted on July 29, 2021July 29, 2021Author Daniel LemireCategories Leave a Reply Cancel reply Your email address will not be published. Required fields are marked * To create code blocks or other preformatted text, indent by four spaces: This will be displayed in a monospaced font. The first four spaces will be stripped off, but all other whitespace will be preserved. Markdown is turned off in code blocks: [This is not a link](http://example.com) To create not a block, but an inline code span, use backticks: Here is some inline `code`. For more help see http://daringfireball.net/projects/markdown/syntax [ ] [ ] [ ] [ ] [ ] [ ] [ ] Comment [ ] Name * [ ] Email * [ ] Website [ ] [ ] Save my name, email, and website in this browser for the next time I comment. Receive Email Notifications? [no, do not subscribe ] [instantly ] Or, you can subscribe without commenting. [Post Comment] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] Post navigation Previous Previous post: Faster sorted array unions by reducing branches Next Next post: Science and Technology links (July 31st 2021) Proudly powered by WordPress