[HN Gopher] Replace std:find_if in 80% of the cases
___________________________________________________________________
Replace std:find_if in 80% of the cases
Author : todsacerdoti
Score : 32 points
Date : 2021-10-01 18:27 UTC (4 hours ago)
(HTM) web link (www.sandordargo.com)
(TXT) w3m dump (www.sandordargo.com)
| [deleted]
| quietbritishjim wrote:
| I disagree with the article's conclusion. The evidence in it
| proves that it's better _not_ to use the specialised functions
| (like any_of).
|
| It's immediately clear to me what each of the find_if snippets
| does, but I haven't heard of the other functions - and the
| evidence from the article is that the vast majority of other
| programmers haven't either (in fact that's out of the subset that
| bother to use find_if, which I would say means even less of
| general coders). Sure, it's fairly obvious what they do from the
| name, but I'd still bother to check their signature in case
| there's a subtlety. In other words, there may be fewer lines of
| code, but it's still more cognitive overhead.
|
| In fact I'd go one step further and say it's better to use raw
| primitives, like for loops (range based where appropriate), where
| they don't need too much extra code.
| leetcrew wrote:
| it seems like your argument boils down to "I prefer a for loop
| because I'm more familiar with it". if that's true of you and
| most of the people you work with, that's certainly a valid
| argument. the opposite happens to be true on my team. we
| understand and prefer the functions from the algorithms header
| in most cases.
|
| one argument in favor of functions like any_of is that, once
| you get familiar with them, they provide additional
| documentation of intent. I'll concede this is a pretty minor
| benefit; most uses of any_of replace a pretty simple loop.
| yellow_lead wrote:
| > So I went through all our usages of find_if and I found that it
| was only used in a proper way in about 20% of all the cases.
|
| > This means that the Pareto principle applies here too. In 80%
| of the cases, std::find_if should not have been used.
|
| I don't think this is the pareto principle. Its a rule of
| probabilities, i.e 1-X.
| tomjakubowski wrote:
| The Pareto principle can be stated thus: When
| the percentage sign's 89 minus nine That's
| Pareto
| kragen wrote:
| Agreed. 80% of the `find_if` cases require 20% of the what,
| exactly? The functions? But he's using four times as many
| functions now.
| toxik wrote:
| Marginally more readable, but is it performing better?
| secondcoming wrote:
| No, the codegen is the same
|
| https://godbolt.org/z/eebE9eYP6
| twoodfin wrote:
| Indeed, outside of corner cases where iterators are expensive
| to copy or compare, I'd be surprised if there were any
| measurable advantage beyond clarity.
| dataflow wrote:
| [edit: whoops, never mind]
| kragen wrote:
| There's no difference in how many times the predicate is
| called or with which elements.
| felixguendling wrote:
| When reading the title I thought that it is about complexity of
| linear search vs. better alternatives like hash sets/maps, or at
| least sorted data structures (binary search).
|
| But if having linear complexity is fine,
| std::find_if/any_of/none_of/all_of/etc. are of course fine and
| you should prefer the version that's most expressive.
| kragen wrote:
| Of course; as Larry Wall said, doing a sequential search over
| the keys of a hash table is like clubbing someone to death with
| a loaded Uzi.
| mywittyname wrote:
| It's safer, more reliable, and easier to clean up after?
| [deleted]
| malkia wrote:
| Wondering if something like this can catch these -
| https://twitter.com/_fel1x/status/1443613096820543492
| sillysaurusx wrote:
| Is "using namespace std" still considered heresy? I rather liked
| it. It'd make a lot of these examples a little more readable.
| eska wrote:
| Definitely don't use it in headers, because it pollutes the
| namespace of files including it. In .cpp files it's more
| subjective. You can also resort to only using it in functions,
| to limit it's scope to code where you make heavy use of std.
| riskneutral wrote:
| I don't like it, clarity is better in C++. Things are
| complicated enough as it is.
| kragen wrote:
| How about `using std::find_if`?
| dataflow wrote:
| Unqualified calls change the behavior. You end up doing ADL
| instead of a normal lookup and could potentially end up
| calling a different function than the one you intended
| entirely. Sometimes that's intentional (as with swap), but
| often it's a bug.
| sillysaurusx wrote:
| I've been a C++ dev for longer than I care to admit. The
| idea that you could call the wrong function after putting
| "using std::string" at the top of your .cpp file doesn't
| reflect my experience.
|
| You might have a point if people are putting that in
| header files. But even in header files, "using
| std::string" is often a standard thing to do in e.g. game
| engines. It's far easier to change out your string
| implementation if you're writing "string" instead of
| "std::string".
| dataflow wrote:
| > using std::string
|
| The question was about find_if, which is a function, not
| string.
| [deleted]
| kragen wrote:
| Reading these I'm struck by the non-orthogonality and "mental
| compilation" on display here. In Python instead of
| return numbers.end() !=
| std::find_if(numbers.begin(), numbers.end(),
| [](int number) { return number % 2 == 1; });
|
| I would write return any(number % 2 == 1 for
| number in numbers)
|
| which, similarly, stops examining numbers once it finds an odd
| one, but works by transforming the sequence of numbers into a
| lazy sequence of booleans, then applying the generic boolean-
| sequence-reducer `any` to it. I think this more clearly says what
| is desired (the C++ version strikes me as something a compiler
| might output given the Python version, but in this case you had
| to run the compiler in your mind), and additionally is a better
| orthogonal decomposition of the problem than computing a first-
| hit iterator from a sequence of numbers and a number-to-boolean
| transformation, then comparing it against the end of the
| sequence.
|
| Sandor's suggested replacement solves the first problem of
| clearly saying what is desired: return
| std::any_of(numbers.begin(), numbers.end(),
| [](int number) { return number % 2 == 1; });
|
| But the penalty is even worse orthogonality!
|
| Because of Python's better orthogonality, you can factor out the
| lazy boolean sequence generator into a function of its own if
| that's a useful thing to do: def
| oddnesses(numbers): return (number % 2 == 1 for
| number in numbers) return any(oddnesses(numbers))
|
| In this particular case the extra degree of composability you get
| from it is not very useful, but then again, neither is the
| original code. You could do things like this with that sequence
| with equal plausibility to the original code:
| all(oddnesses(numbers)) # addressed by all_of in OP
| zip(numbers, oddnesses(numbers)) sum(oddnesses(numbers))
| # punning booleans as ints sum(map(operator.mul, numbers,
| oddnesses(numbers)))
|
| Of course, Python is not really a viable alternative to C++ in
| most of the places where people are using C++, mostly because it
| uses 40 times as much CPU and energy and 10 times as much memory,
| but also because large Python codebases get unmaintainable.
|
| But we could wish for a language with the efficiency and in-the-
| large maintainability of C++ and the clarity and generality of
| Python.
| eska wrote:
| Are you aware of ranges?
| std::ranges::any_of(numbers, [](auto i){ return i % 2 == 0; }))
|
| There's also stuff like this for orthogonality
| https://en.cppreference.com/w/cpp/ranges/transform_view
| kragen wrote:
| Yes, which is why I complained about how nonorthogonal the
| C++ was instead of just how verbose :)
| eska wrote:
| See my edit. You're probably interested in views, with
| which you can do stuff similar to lazy functional languages
| v | std::views::reverse | std::views::drop(2);
|
| Here's an intro
| https://hannes.hauswedell.net/post/2019/11/30/range_intro/
| blt wrote:
| Sadly the C++20 ranges library does not include any
| facility to create views by writing Python-like generator
| loops (coroutine loops with "yield"). Creating new views
| is about as difficult as creating iterators before - i.e.
| too difficult.
| jrimbault wrote:
| Because I love Python, and Rust is applicable is many places
| where C++ is, here's a Rust version : //
| assuming `numbers` is an iterator numbers.any(|i| i % 2
| == 1)
|
| Rust's iterators generally compile down very well, and I think
| this exactly as expressive as Python. And I also wished C#'s
| Enumerable were as performant as Rust's iterators.
| Doctor_Fegg wrote:
| That's basically borrowed off Ruby. (Which isn't a bad
| thing.)
| kragen wrote:
| It's not that simple.
|
| Ruby's Enumerable is a concrete class; Rust's
| std::iter::Iterator is a _trait_ , which has some aspects
| in common with Ruby classes and other aspects in common
| with Python protocols such as the Python iterator protocol.
| std::iter::Iterator is (primarily, though not in this case)
| an _external_ iterator like Python iterators, STL
| iterators, and ranges in C++20 or D, not an _internal_
| iterator like (the original version of) Ruby Enumerable.
| Rust 's type-checking strategy is more like ML than like
| Ruby, Python, C++, or D, while its compilation strategy is
| more like C++'s or D's compilation strategy than those of
| Ruby, Python, or ML.
| mbrubeck wrote:
| And if you write the equivalent to the `find_if` version in
| Rust, then the standard linter Clippy will automatically
| suggest the "good" version: warning: called
| `is_some()` after searching an `Iterator` with `find`
| --> src/lib.rs:2:9 | 2 | numbers.find(|i| i
| % 2 == 1).is_some() |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use `any()` instead:
| `any(|i| i % 2 == 1)` |
| kragen wrote:
| IMHO it still suffers from less orthogonality; it's precisely
| equivalent to the C++ std::ranges::any_of version, though
| pleasantly less verbose.
|
| I think I'm going to have to get serious about Rust. I had a
| dismaying epiphany writing
| https://news.ycombinator.com/item?id=28660097 the other day:
| like Perl, Python _just isn 't saving me very much effort_
| except for quick hacks, maybe a factor of 2-4 over C (if we
| discount CVEs, heh). It's not the order-of-magnitude
| development speed improvement I was hoping for, and now that
| I've been using it for 20 years I know it never will be.
| Python starts to buckle under the strain of large projects,
| they totally fucked up Unicode handling to an epic degree
| never seen in any other programming language, and it
| typically costs you orders of magnitude in performance.
|
| And--this is the worst of all--at best a library written in
| Python is only usable from Python and maaybe the JVM
| languages, and you can probably call it from a shell script.
| (I don't know, maybe the CIL languages too.) By contrast, if
| I write a library in Rust or C or C++ I can use it in Rust,
| C, C++, Python, Perl, Ruby, LuaJIT, Julia, PHP, node.js,
| Scheme, Common Lisp, OCaml, Golang, the JVM languages, the
| CIL, and maybe even Arduino. So the effort is greater
| (though, as you point out in this case, not necessarily
| _much_ greater, though you didn 't mention compile time,
| which sucks pretty bad for both Rust and C++) but the benefit
| is _much_ greater.
|
| So I think from now on I'll try to start writing prototypes
| in Python or OCaml but, if the prototype seems worthwhile,
| writing the Real Thing in Rust. Which is gonna be pretty
| rough since I don't know Rust. :)
| jackewiehose wrote:
| in C++ I would write for (int i: numbers)
| if (i % 2 == 1) return true; return false;
|
| find_if() etc. is fine if you already have the right predicate
| function at hand but it becomes quite ridiculous with lambda
| expressions.
| kragen wrote:
| I probably would too, and composability would suffer further.
| The concern you raise about lambda expressions is basically
| one of syntax rather than semantics. Syntax is important, but
| as an unrepentant user and occasional implementor of Lisps
| and Forths, I am in no position to criticize C++ :)
| secondcoming wrote:
| This interestingly generates shorter code as the compiler
| doesn't attempt any loop unrolling
|
| https://godbolt.org/z/eebE9eYP6
| tomsmeding wrote:
| > But we could wish for a language with the efficiency and in-
| the-large maintainability of C++ and the clarity and generality
| of Python.
|
| What about Haskell? Maintainability yes; efficiency mostly. It
| has a GC that can be a pain for high-memory scenarios if you
| don't spend a lot of effort optimising, but on the whole
| Haskell is surprisingly fast for its expressivity.
|
| Any odd? `any odd numbers`. Or equivalently, `or (map odd
| numbers)`, which is compositional in that `map odd numbers` is
| just a list again -- no distinction between lazy and strict
| lists in Haskell. Your other four examples are `and (map odd
| numbers)`, `zip numbers (map odd numbers)`, `sum (map (fromEnum
| . odd) numbers)`, `sum (zipWith (*) numbers (map (fromEnum .
| odd) numbers))`. Note I used `map odd` instead of defining
| `oddnesses = map odd` and using `oddnesses`, because the
| definition is shorter than the name. :)
|
| This changes somewhat when you use arrays instead of lists:
| Haskell's lists are in terms of behaviour mostly "generators
| that are implicitly stored as a lazy linked list if necessary".
| So if you want to store stuff more compactly, or you want to
| index into the list (as opposed to looping over it), you need
| arrays. But even for arrays the story doesn't change much!
| Using the 'vector' library, actually all of the above examples
| work just fine, with no unnecessary intermediate vectors
| because of some intelligent fusion rules, if you put `V.`
| before the list-related function names.
| kragen wrote:
| Haskell is #1 for generality but #0 for clarity, though I
| have to admit that some of my Python examples aren't exactly
| shining exemplars either. But if I write code in Haskell I
| can only call it from Haskell (which wasn't something I
| mentioned in my original comment, but is important).
| secondcoming wrote:
| > Because of Python's better orthogonality, you can factor out
| the lazy boolean sequence generator into a function of its own
| if that's a useful thing to do:
|
| You can do the same in C++: static bool
| isEven(int x) { return x % 2 == 0; } bool
| foo(const std::vector<int>& v) { return
| v.end() != std::find_if(v.begin(), v.end(),
| isEven); }
| kragen wrote:
| Not the same; no sequence.
| mywittyname wrote:
| Shouldn't this be titled, "use std::find_if where appropriate"?
|
| The whole point of this is to find every element in the list
| which matches the lambda. Which is why it returns an iterator:
| you're supposed to use that value, increment it, then call
| std::find_if again with it as the first parameter.
|
| Granted, the documentation is failing here, since it doesn't
| provide an example of this use case. for(auto i
| = begin(v), e = end(v); find_if(i, e, lambda_func) != e; ++i) {
| cout << "Found a match: " << i }
|
| If you don't need a list of matching elements, but instead want
| to know if there was any match, then use a different function
| designed for that specific use.
| josephcsible wrote:
| tl;dr: if the only thing you do with the result of find_if or
| find_if_not is compare it to .end(), then you should use all_of,
| any_of, or none_of instead.
___________________________________________________________________
(page generated 2021-10-01 23:02 UTC)