https://www.sandordargo.com/blog/2025/04/16/raw-loops-for-performance

Raw loops for performance?

Sandor Dargo, Apr 16, 2025 (8 min read)

To my greatest satisfaction, I've recently joined a new project. I started to read through the codebase before joining and at that stage, whenever I saw a possibility for a minor improvement, I raised a tiny pull request.

One of my pet peeves is rooted in Sean Parent's 2013 talk at GoingNative, C++ Seasoning, where he advocated for no raw loops.

When I saw this loop, I started to think about how to replace it:

```cpp
#include <iostream>
#include <list>
#include <string>
#include <vector>

struct FromData {
  // ...
  std::string title;
  int amount;
};

struct Widget {
  // ...
  std::list<FromData> data;
};

struct ToData {
  // ...
  std::string title;
  int amount;
};

struct Response {
  // ...
  std::vector<ToData> data;
};

Response foo(Widget widget) {
  std::vector<ToData> transformed_data;
  for (const auto& element : widget.data) {
    transformed_data.push_back(
        {.title = element.title, .amount = element.amount * 42});
  }
  Response response;
  // ...
  response.data = transformed_data;

  return response;
}

int main() {
  Widget widget{.data = {
                    {"a", 1},
                    {"b", 2},
                    {"c", 1},
                }};

  auto r = foo(widget);

  for (const auto& element : r.data) {
    std::cout << "title: " << element.title << ", amount " << element.amount
              << '\n';
  }
}

/*
title: a, amount 42
title: b, amount 84
title: c, amount 42
*/
```

Please note that the example is simplified and slightly changed so that it compiles on its own. Let's focus on foo; the rest is there just to make the example compilable.

It seems that we could use std::transform.
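As a rough sketch of that first idea (this is not code from the actual pull request), the loop inside foo could be expressed with the classic iterator-based std::transform and std::back_inserter. The helper name transform_data is hypothetical, and the structs are reduced to the two fields shown above:

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <string>
#include <vector>

struct FromData {
  std::string title;
  int amount;
};

struct ToData {
  std::string title;
  int amount;
};

// Hypothetical helper, not part of the original post: performs the same
// conversion as the raw loop, but through std::transform.
std::vector<ToData> transform_data(const std::list<FromData>& input) {
  std::vector<ToData> result;
  // std::back_inserter calls push_back on result for every element
  // produced by the lambda.
  std::transform(input.begin(), input.end(), std::back_inserter(result),
                 [](const FromData& element) {
                   return ToData{element.title, element.amount * 42};
                 });
  return result;
}
```

With this shape, foo would only need to call transform_data(widget.data) and store the result in response.data. The destination vector still grows through push_back here, so it is mostly a change in how the loop is spelled.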
But heck, we use C++20, so we have ranges at our hands. Let's go with std::views::transform!

```cpp
#include <iostream>
#include <list>
#include <ranges>
#include <string>
#include <vector>

struct FromData {
  // ...
  std::string title;
  int amount;
};

struct Widget {
  // ...
  std::list<FromData> data;
};

struct ToData {
  // ...
  std::string title;
  int amount;
};

struct Response {
  // ...
  std::vector<ToData> data;
};

Response foo(Widget widget) {
  // Building the view does no work yet; elements are produced on iteration.
  const auto transformed_data = widget.data
      | std::views::transform([](const auto& element) {
          return ToData{
              .title = element.title,
              .amount = element.amount * 42
          };
        });
  Response response;
  // ...
  response.data = {transformed_data.begin(), transformed_data.end()};

  return response;
}

int main() {
  Widget widget{.data = {
                    {"a", 1},
                    {"b", 2},
                    {"c", 1},
                }};

  auto r = foo(widget);

  for (const auto& element : r.data) {
    std::cout << "title: " << element.title << ", amount " << element.amount
              << '\n';
  }
}
/*
title: a, amount 42
title: b, amount 84
title: c, amount 42
*/
```

We have no more raw loops, no more initialized-then-modified vectors, and the result is the same.

Is this better? We don't have to modify a vector, and that's definitely better.

But when I proposed such a change, one of my colleagues asked a question. transformed_data became a view. When we populate response.data, do we actually copy all the elements of the view?

I couldn't answer the question with confidence, hence this article.

I slightly updated both examples, and changed ToData to this:

```cpp
// Needs <iostream>, <string> and <utility> (for std::exchange).
struct ToData {
  // ...
  std::string title;
  int amount;

  ToData() {
    std::cout << "ToData()\n";
  }

  ToData(std::string title, int amount) : title(title), amount(amount) {
    std::cout << "ToData(std::string title, int amount)\n";
  }

  ToData(const ToData& other): title(other.title), amount(other.amount) {
    std::cout << "ToData(const ToData& other)\n";
  }

  ToData& operator=(const ToData& other) {
    std::cout << "ToData& operator=(const ToData& other)\n";
    title = other.title;
    amount = other.amount;
    return *this;
  }

  // Note: the move operations are not declared noexcept.
  ToData(ToData&& other)
      : title(std::exchange(other.title, "")),
        amount(std::exchange(other.amount, 0)) {
    std::cout << "ToData(ToData&& other)\n";
  }

  ToData& operator=(ToData&& other) {
    std::cout << "ToData& operator=(ToData&& other)\n";
    title = std::exchange(other.title, "");
    amount = std::exchange(other.amount, 0);
    return *this;
  }
};
```

I also had to remove the usage of designated initializers as ToData is no longer an aggregate.

The output for the original version using push_back is not particularly surprising.

```
ToData(std::string title, int amount)
ToData(ToData&& other)
ToData(std::string title, int amount)
ToData(ToData&& other)
ToData(const ToData& other)
ToData(std::string title, int amount)
ToData(ToData&& other)
ToData(const ToData& other)
ToData(const ToData& other)
writing response.data
ToData(const ToData& other)
ToData(const ToData& other)
ToData(const ToData& other)
wrote response.data
title: a, amount 42
title: b, amount 84
title: c, amount 42
```

In the for loop, we construct each ToData and move it into the vector. There are also copy constructions whenever the vector reallocates: because the move constructor is not noexcept, the existing elements are copied rather than moved into the new storage. All of this happens before the data is actually copied into response.data.

On the other hand, for the version using ranges, the output is shorter and different!
```
filling response.data
ToData(std::string title, int amount)
ToData(ToData&& other)
ToData(std::string title, int amount)
ToData(ToData&& other)
ToData(const ToData& other)
ToData(std::string title, int amount)
ToData(ToData&& other)
ToData(const ToData& other)
ToData(const ToData& other)
filled1 response.data
title: a, amount 42
title: b, amount 84
title: c, amount 42
```

Nothing actually happens within the transformation pipeline! Everything happens lazily when we use the results of the pipeline and actually construct a vector. As a result, we have fewer calls than we had in the original version.

So far, the ranges version seems to have an advantage!

But we all know that the original version is not optimal, even for a raw loop. Let's use emplace_back! Oh, and we also forgot about calling std::vector::reserve to avoid reallocations!

The updated raw loop produces the following output.

```
ToData(std::string title, int amount)
ToData(std::string title, int amount)
ToData(std::string title, int amount)
writing response.data
ToData(const ToData& other)
ToData(const ToData& other)
ToData(const ToData& other)
wrote response.data
title: a, amount 42
title: b, amount 84
title: c, amount 42
```

Now the raw loop version has an advantage! In this version, for each item, we have a constructor call and a copy, while in the ranges version we also have an extra move!

Note that since C++23, you can also use std::ranges::to<std::vector<ToData>> to construct the final vector, but it didn't result in any difference in terms of the number of special member function calls.

Is that so bad?

Probably not. Move operations are cheap; that's why they were introduced! This is probably just an acceptable price to pay for more readable code.

But our "more readable code" also features a lambda, so let's just say that these are assumptions. Let's also run benchmarks.
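Before the numbers, here is a minimal sketch of what the reserve plus emplace_back loop discussed above could look like. It is not the code the post measured: the names fill_reserved, Counters and counters are hypothetical, and plain counters replace the printed messages so that the number of special member function calls can be checked.

```cpp
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bookkeeping, standing in for the std::cout messages.
struct Counters {
  int constructed = 0;
  int copied = 0;
  int moved = 0;
};

inline Counters& counters() {
  static Counters instance;
  return instance;
}

struct FromData {
  std::string title;
  int amount;
};

struct ToData {
  std::string title;
  int amount;

  ToData(std::string t, int a) : title(std::move(t)), amount(a) {
    ++counters().constructed;
  }
  ToData(const ToData& other) : title(other.title), amount(other.amount) {
    ++counters().copied;
  }
  ToData(ToData&& other) noexcept
      : title(std::move(other.title)), amount(other.amount) {
    ++counters().moved;
  }
};

// Hypothetical helper: the raw loop with reserve and emplace_back.
std::vector<ToData> fill_reserved(const std::list<FromData>& input) {
  std::vector<ToData> result;
  result.reserve(input.size());  // single allocation, no reallocation later
  for (const auto& element : input) {
    // Constructs the ToData directly inside the vector's storage, so no
    // temporary ToData has to be moved or copied in.
    result.emplace_back(element.title, element.amount * 42);
  }
  return result;
}
```

Calling fill_reserved on the three-element list from the examples should leave the counters at three constructions with no copies or moves. Copying the result into response.data afterwards would still add one copy per element, as in the output above.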
[Figure: Benchmarks from Quick Bench]

Based on Quick Bench, the enhanced raw loop version is about 20% faster on Clang than the ranges version. The results are slightly different with GCC, but the raw loop version is still 10% faster.

It's also worth noting that the original version, with push_back and without reserve, is 20-30% slower than the other two versions! With reserve added but push_back kept, the code lands between the ranges version and the raw loop with emplace_back.

What does this mean in real life?

It depends. You must measure. Don't forget about Amdahl's law, which says that "the overall performance improvement gained by optimizing a single part of a system is limited by the fraction of time that the improved part is actually used."

If this happens to be a bottleneck, use the emplace_back version without hesitation, and don't forget to reserve enough space in memory for all the elements. I think you have no reason to use the push_back version, and definitely not without calling reserve.

Otherwise, if you write code where you also do some network calls or read from the database or from the filesystem, these differences are negligible and you should go with the version that you find the most readable.

That's up to you.

Conclusion

Using ranges or algorithms has several advantages over raw loops, notably readability. On the other hand, as we've just seen, sheer performance is not necessarily among those advantages. Using ranges can be slightly slower than a raw loop version. But that's not necessarily a problem; it really depends on your use case. Most probably it won't make a big difference.

This post is licensed under CC BY 4.0 by the author.