[HN Gopher] Programming Idioms
       ___________________________________________________________________
        
       Programming Idioms
        
       Author : mkl95
       Score  : 132 points
       Date   : 2021-08-15 18:47 UTC (4 hours ago)
        
 (HTM) web link (programming-idioms.org)
 (TXT) w3m dump (programming-idioms.org)
        
       | maxk42 wrote:
       | Some of the examples don't do what the idiom says. Some aren't
       | even close to good answers.
       | 
       | Too many to rely upon this.
        
       | blunte wrote:
       | I see a lot of people taking issue with the idioms presented, and
       | rightfully so in many cases.
       | 
       | Add the ability for people to improve or debate the solutions.
       | Ultimately we should have a large curated cookbook (with
       | additional variant selections and associated recipe variants).
       | 
       | The most important human element of programming is knowing what
       | to build (and what pieces to build to make the bigger thing). How
       | often do I have to look up ways to read a file in Ruby?... most
       | times I need to read a file, I have to refer to the different
       | approaches. I just don't do that often enough to remember
       | everything. What I do know is, "this will be a lot of data, so I
       | need to read it line by line or in chunks". That should be all
       | you need to know, and then you pull up a recipe.
        
       | wmu wrote:
       | There's a similar project
       | http://rosettacode.org/wiki/Rosetta_Code.
        
         | diogenesjunior wrote:
         | I made something similar myself, although it's only printing.
         | 
         | https://github.com/FormerlyChucks/jello
        
           | PrincessJas wrote:
           | Your github username is a nazi, alt-right dogwhistle. Mods,
           | please handle this.
        
       | thom wrote:
       | Surprisingly many failures of reading comprehension in the
       | implementations here:
       | 
       | https://programming-idioms.org/idiom/184/tomorrow
       | 
       | I've always found it interesting to consider the simplest
       | possible spec you could give 100 programmers and receive no bugs
       | in return.
        
         | still_grokking wrote:
         | Hmm. There are more wrong implementations than correct ones as
         | of now.
         | 
         | The core of the problem was noted already: Obviously most
         | people can't read.
         | 
         | That's especially "funny" when thinking about all the fuss that
         | is made about teaching children programming in school. They
         | should start with teaching them reading.
         | 
         | I don't even mean this to be snarky. The state of affairs is
         | actually depressing, and I would welcome it very much if more
         | people were able to read and understand what's written. Would
         | make a lot of things easier for everybody, I guess.
        
           | jpxw wrote:
           | Also writing. I don't believe that it's possible for someone
           | to be a clear and concise programmer unless they're able to
           | write their native language clearly and concisely.
           | 
           | This would likely require teaching students at least basic
           | logic. This was entirely absent from my school experience.
           | 
           | It's a hugely undertaught and undervalued skill, IMO. Before
           | we start shoehorning CS into high school curriculums, we
           | should consider laying these foundations first.
        
         | ghoward wrote:
         | > I've always found it interesting to consider the simplest
         | possible spec you could give 100 programmers and receive no
         | bugs in return.
         | 
         | There are only one or two:
         | https://pubs.opengroup.org/onlinepubs/9699919799/utilities/t...
         | and maybe
         | https://pubs.opengroup.org/onlinepubs/9699919799/utilities/f...
         | .
         | 
         | Even "Hello, World!" is done wrong more often than not. An
         | example: you should check for errors from `printf()`, as in
         | [1].
         | 
         | [1]: https://stackoverflow.com/questions/12355758/proper-hello-
         | wo...
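
A minimal sketch of the error-checked "Hello, World!" described above. The helper name say_hello is hypothetical and not taken from the linked answer; the sketch only assumes that fprintf() returns a negative value on failure and that buffered output can still fail at flush time.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the greeting to `out`.
   Returns 0 on success, -1 if writing or flushing failed. */
int say_hello(FILE *out)
{
    assert(out != NULL);            /* precondition: an open stream */

    /* fprintf returns a negative value on an output error. */
    if (fprintf(out, "Hello, World!\n") < 0)
        return -1;

    /* Buffered output may only fail once it is flushed (e.g. a full disk). */
    if (fflush(out) == EOF)
        return -1;

    return 0;
}
```

A main() could then end with `return say_hello(stdout) == 0 ? EXIT_SUCCESS : EXIT_FAILURE;` so that the failure reaches the shell.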
        
           | remexre wrote:
           | Not even both of those; I thought there famously existed some
           | way to make GNU true return non-zero.
           | 
           | EDIT: Yep, https://github.com/coreutils/coreutils/blob/master
           | /src/true....
        
             | ghoward wrote:
             | Ha! So it is impossible!
             | 
             | Seriously, though, thank you for the link. That is good to
             | know!
        
             | Someone wrote:
             | But that's a complex program, with command-line parsing,
             | locale usage, etc. No wonder it has bugs :-)
             | 
             | https://en.wikipedia.org/wiki/IEFBR14 was a one-byte
             | program, but that had a bug, so it had to be doubled in
             | size.
             | 
             | And yes, it still exists (https://www.ibm.com/docs/en/zos-
             | basic-skills?topic=utilities...). I wouldn't know how large
             | it is nowadays.
        
         | duped wrote:
         | What do you call a spec that any programmer can understand and
         | correctly describes the problem and its desired outcome
         | exactly?
         | 
         | A program
        
           | still_grokking wrote:
           | That description doesn't apply to most programs though.
           | 
           | * Not all programmers can understand all programs
           | 
           | * Most programs don't describe the problem exactly
           | 
           | * Almost no program describes the desired outcome exactly
        
             | jstanley wrote:
             | I think the point is that the implementation of the program
             | exactly describes the behaviour of the program. If you take
             | the program as a description of what the program is
             | supposed to do, then what it is supposed to do is pretty
             | unambiguous!
             | 
             | (Also debatable, for example if correct operation of the
             | program depends on some property of its environment that
             | can't always be relied upon.)
        
               | still_grokking wrote:
               | It's trivial to say that a program describes its behavior
               | exactly (so it's its own spec).
               | 
               | But the whole point of a spec is to describe how a
               | program should behave _before_ writing it.
               | 
               | Also you usually want some verification that the program
               | matches the spec. When the program IS the spec what do
               | you match?
        
             | duped wrote:
             | It's a joke - the point behind it is that most of
             | programming around specs is refining the spec until it is
             | sufficiently well defined as a program.
        
               | still_grokking wrote:
               | It's frankly only a kind of joke.
               | 
               | People take it seriously, and you can hear it here and
               | there as "argument" against proper specs.
        
       | kukx wrote:
       | I found out that a good way to learn idioms for a given language
       | is to do some simple katas on codewars.com and then review the
       | most upvoted solutions.
        
       | Miiko wrote:
       | The random idiom I got was:
       | 
       | > Idiom #120 Read integer from stdin
       | 
       | > Read an integer value from the standard input into variable n
       | 
       |     int n[15];
       |     fgets(n, 15, stdin);
       | 
       | Really?
        
         | akomtu wrote:
         | Yes. You asked for "an integer" from stdin? Here's your
         | integer. Specify constraints better next time. (That's probably
         | what the upcoming AI-assisted code gen tools will look like).
        
         | blunte wrote:
         | Just wait until this gets rolled into GitHub Copilot...
        
         | WJW wrote:
         | The longer I look at this example, the more weirdness I spot:
         | 
         | - There are no standard integer types that take 15 (decimal)
         | digits to represent.
         | 
         | - The array contains ints instead of chars
         | 
         | - Why would you use fgets() instead of just gets()? (Though I
         | don't touch C very often so perhaps that is considered proper
         | style)
         | 
         | - Obviously no conversion of the digits into anything else, let alone
         | specifying a base or handling a `0x` prefix for hexadecimal or
         | a minus sign for negative numbers.
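
One way to address the points in the list above, as a sketch rather than the site's intended answer: read a line into a char buffer with fgets() and convert it with strtol(). The names parse_int and read_int, the 64-byte buffer, and the choice to accept only base 10 are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse one base-10 int from a line of text.
   Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_int(const char *text, int *out)
{
    assert(text != NULL && out != NULL);

    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);   /* handles a leading '-' or '+' */

    if (end == text)
        return 0;                           /* no digits at all */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                           /* does not fit in an int */
    if (*end != '\n' && *end != '\0')
        return 0;                           /* trailing junk such as "12abc" */

    *out = (int)value;
    return 1;
}

/* Read one line from `in` and parse it; returns 1 on success, 0 otherwise. */
int read_int(FILE *in, int *out)
{
    char line[64];                          /* a char buffer, not an int array */
    if (fgets(line, sizeof line, in) == NULL)
        return 0;                           /* EOF or read error */
    return parse_int(line, out);
}
```

Calling `read_int(stdin, &n)` fills n only when the whole line is a valid integer; passing 0 or 16 as the base to strtol() would be the place to start if a `0x` prefix should be accepted.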
        
           | Someone wrote:
           | > There are no standard integer types that take 15 (decimal)
           | digits to represent
           | 
           | Nitpick: that is irrelevant. The code reads in at most 14
           | characters.
           | 
           | > Why would you use fgets() instead of just gets()?
           | 
           | You don't use _gets_ because it doesn't exist anymore. It got
           | removed in C11 (it rightfully was deemed so bad that
           | backwards compatibility was sacrificed). You can use
           | 
           |     char *gets_s( char *str, rsize_t n )
           | 
           | though.
           | 
           | (https://en.cppreference.com/w/c/io/gets)
        
           | leetcrew wrote:
           | gets is inherently unsafe unless the input is guaranteed
           | (externally) to never overflow the buffer.
        
           | mpweiher wrote:
           | man gets
           | ...
           | SECURITY CONSIDERATIONS
           |      The gets() function cannot be used securely.  Because of its
           |      lack of bounds checking, and the inability for the calling
           |      program to reliably determine the length of the next incoming
           |      line, the use of this function enables malicious users to
           |      arbitrarily change a running program's functionality through a
           |      buffer overflow attack.  It is strongly suggested that the
           |      fgets() function be used in all cases.  (See the FSA.)
        
           | jhgb wrote:
           | > Why would you use fgets() instead of just gets()?
           | 
           | I assume it's because gets() ranks as "-10: It's impossible
           | to get right" on Rusty's API Design Manifesto?
           | (http://sweng.the-davies.net/Home/rustys-api-design-
           | manifesto)
        
             | WJW wrote:
             | TIL. I have been leading a sheltered life in languages with
             | garbage collectors that obscured the true horror of gets()
             | from me.
        
         | jstanley wrote:
         | The random idiom I got was:
         | 
         | > Idiom #137 Check if string contains only digits
         | 
         | > Set boolean b to true if string s contains only characters in
         | range '0'..'9', false otherwise.
         | 
         |     char b = 0;
         |     for (int i = 0; i < strlen(s); i++) {
         |         if (! (b = (s[i] >= '0' && s[i] <= '9')))
         |             break;
         |     }
         | 
         | I appreciate the funny assignment-and-test-and-early-break in
         | one (although I'd hardly say it's idiomatic), but I could do
         | without the quadratic strlen().
        
           | Miiko wrote:
           | Not to mention that the proper idiom for this task would be:
           | 
           |     int n = strspn(s,"0123456789");
           |     BOOL b = (s[n] == 0);
        
             | Stratoscope wrote:
             | That is nice and simple, but it makes ten comparisons for
             | each character in s, where only two are needed. Of course
             | it would be a good approach if the set of characters you're
             | testing against is not contiguous, unlike 0..9.
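
As a sketch of the non-contiguous case mentioned above, the same strspn() approach can be wrapped in a function that takes the allowed set as a parameter. The name only_chars_from is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if every character of s appears in `allowed`.
   An empty s yields true, because strspn returns 0 and s[0] is '\0'. */
bool only_chars_from(const char *s, const char *allowed)
{
    assert(s != NULL && allowed != NULL);

    /* Length of the leading run of characters taken from `allowed`. */
    size_t n = strspn(s, allowed);

    /* If that run reaches the terminator, nothing else was in the string. */
    return s[n] == '\0';
}
```

For example, `only_chars_from(s, "0123456789abcdefABCDEF")` checks for hexadecimal digits, a set that two range comparisons cannot cover.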
        
               | toxik wrote:
               | Good news then, it's also linear time like the marginally
               | faster but enormously grotesque for loop provided
               | previously. I would take that clarity of intent a hundred
               | times over squeezing a couple of comparisons out.
        
               | Stratoscope wrote:
               | > _it's also linear time_
               | 
               | You raise an interesting point. It got me thinking about
               | how big-O notation has failed us in some ways: it teaches
               | us to ignore constant factors.
               | 
               | In big-O, an algorithm that makes 1000 comparisons per
               | element is no different from one that makes a single
               | comparison per element. They are both linear time. But
               | you can't deny that one of these will likely take 1000
               | times as long as the other.
               | 
               | Of course, like you, I favor simple and readable code
               | over grotesque code that is hard to understand and
               | mentally verify.
        
           | foxfluff wrote:
           | It's not unreasonable to assume the compiler will optimize it
           | to a single call. Though I guess people who are capable of
           | making that judgement won't need to look this idiom up on the
           | internet.
        
             | andrepd wrote:
             | But still, it will be two passes through the string when it
             | could be only one.
        
             | Stratoscope wrote:
             | We don't write code only for compilers, but for human
             | readers as well. Why write code that makes a smart human
             | wonder "Is that going to be quadratic? I'd better make sure
             | the compiler optimizes it out!"
        
             | sokoloff wrote:
             | Is there something in the C spec that allows optimizing to
             | a single strlen call?
        
               | foxfluff wrote:
               | Absolutely. The gist of it:
               | 
               | "In the abstract machine, all expressions are evaluated
               | as specified by the semantics. An actual implementation
               | need not evaluate part of an expression if it can deduce
               | that its value is not used and that no needed side
               | effects are produced (including any caused by calling a
               | function or accessing a volatile object)."
        
               | sokoloff wrote:
               | As I understand it, the strlen implementation ("calling a
               | function") is typically going to come from another object
               | file (at link time), so it's not clear that when
               | compiling this file that "calling strlen has no side
               | effects" is information available to the compiler.
        
               | foxfluff wrote:
               | strlen is a standard function (in a hosted environment).
               | So it must do exactly what the standard says it does, and
               | the standard doesn't say it has side-effects. The
               | compiler could very well use a built-in implementation of
               | strlen, or even omit the call entirely if it had another
               | way to deduce its would-be return value.
               | 
               | Object files are an implementation detail not known by
               | the C standard.
        
               | sokoloff wrote:
               | > the standard doesn't say it has side-effects
               | 
               | The relevant question is "does the standard say that it
               | _does not have_ side-effects?" (is a pure function).
               | @skissane's sibling comment to yours provides the
               | explanation of how the compiler can deduce that it's a
               | pure function.
        
               | foxfluff wrote:
               | The compiler can deduce that it is a pure function using
               | the same logic by which it can deduce it is free to
               | replace the call with a builtin: the function is defined
               | in the standard. Everything @skissane said is
               | implementation details and not particularly relevant (the
               | compiler can do the optimization without the attribute
               | they mentioned).
               | 
               | I think the standard would fall apart if you read it
               | under the assumption that anything not explicitly
               | forbidden can happen. Instead, you should read and find
               | out what the side effects are (and same for undefined
               | behavior, unspecified behavior, implementation defined
               | behavior, etcetera).
               | 
               | "Accessing a volatile object, modifying an object,
               | modifying a file, or calling a function that does any of
               | those operations are all side effects, which are changes
               | in the state of the execution environment." (There are
               | more details if you care to dig in)
               | 
               | Of course nothing stops you or me from making extensions
               | to the standard, but analysing things from the
               | perspective that some implementation might extend strlen
               | to have visible side effects goes too far into
               | whataboutism for my taste, unless there are real world
               | examples to make it a relevant point.
        
               | skissane wrote:
               | > so it's not clear that when compiling this file that
               | "calling strlen has no side effects" is information
               | available to the compiler.
               | 
               | It is because strlen is a #define to __builtin_strlen
               | 
               | And __builtin_strlen is declared (internally) as
               | __attribute__((pure)).
               | 
               | The fact that it is a compiler builtin isn't that
               | important; the compiler will perform the exact same loop
               | optimisation on a non-builtin function if you declare it
               | as __attribute__((pure)). So, in practice, it is not the
               | object file which tells the compiler it can do this, it
               | is the function declaration in the header file.
               | 
               | __attribute__((pure)) is of course a GCC extension, not
               | standard C or C++ - but Clang supports it and so do many
               | proprietary compilers, MSVC appears to be the main
               | exception. This isn't part of the standard but is a
               | compatible extension. C++11 standardises attributes (with
               | a different syntax) and C202x is going to do the same.
               | Neither is yet including "pure" among the standardised
               | attributes but a future standard revision could always
               | add it. There is a proposal to add pure to the C++
               | standard [0]. If it successfully makes it into C++, don't
               | be surprised if it makes it into the C standard as well
               | at some point. It is the kind of C++ feature which the C
               | standard is likely to borrow, and is already widely
               | implemented (just under different syntax) anyway.
               | 
               | [0] http://www.open-
               | std.org/jtc1/sc22/wg21/docs/papers/2015/p007...
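
A sketch of the non-builtin case described above, assuming GCC or Clang. The function my_strlen is hypothetical and is defined in the same file only to keep the example self-contained; in the scenario discussed it would be declared in a header and defined elsewhere. Whether a given compiler actually hoists the call depends on the compiler and its optimisation level.

```c
#include <assert.h>
#include <stddef.h>

/* GCC/Clang extension, not standard C: promises that the result depends
   only on the arguments and on memory that is read, with no side effects. */
__attribute__((pure)) size_t my_strlen(const char *s);

/* As written, the loop condition calls my_strlen on every iteration.
   The pure attribute on the declaration is what permits the compiler to
   treat the call as loop-invariant and evaluate it once. */
int only_digits_pure(const char *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < my_strlen(s); i++) {
        if (s[i] < '0' || s[i] > '9')
            return 0;
    }
    return 1;
}

/* Plain definition; normally this would live in another translation unit. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```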
        
               | Someone wrote:
               | Nitpick: I don't think the compiler needs that
               | ___attribute__((pure))_ annotation to move _strlen_
               | around.
               | 
               | __builtin_strlen is a reserved name, so if the compiler
               | sees it, it may assume it came from an _#include
               | <string.h>_ and that it does what the standard says
               | _strlen_ does.
               | 
               | That is what allows compilers to do much more than move
               | that function call out of the loop. Because they know
               | what it does (or rather: what the programmer promises it
               | will do, once the program gets linked) they can inline its
               | code or even, if the length of the string is known up
               | front, replace the function call by a constant.
        
               | foxfluff wrote:
               | It doesn't even need to see __builtin_strlen (try modifying
               | your string.h or use musl, which has neither the
               | attribute nor __builtin_strlen). It just needs to see
               | strlen in a hosted environment (with the right header
               | included?). If you want gcc to treat it otherwise, you
               | also need -fno-builtin-strlen.
        
               | ratww wrote:
               | Don't know about the C spec, but this is a very popular
               | optimisation. It's called _Loop-invariant code motion_.
        
           | thethirdone wrote:
           | I just checked godbolt [0]. gcc only calls strlen once even
           | with -O0.
           | 
           | [0]:https://godbolt.org/z/j4o1915vE
        
             | josefx wrote:
             | For me the strlen call appears directly before the loop's
             | backward jump when set to -O0, resulting in a call every
             | iteration as far as I can tell. However -O1 already seems
             | to optimize it to a single call at the start of the
             | function.
        
               | still_grokking wrote:
               | I always find it funny that when using languages that
               | want to give you "total control" over the execution of
               | your code (mostly C/C++) you actually almost never know
               | what code gets executed in the end.
               | 
               | It depends on the compiler, its version, its flags, and
               | likely "the position of the moon".
               | 
               | Of course the compiler is only allowed to do
               | transformations that the spec permits. But it's
               | impossible for a human being to anticipate the exact
               | outcome. It's more like: "Compiler, do something that has
               | the same outcome as this code I show you here". The
               | output can then be something that doesn't resemble the
               | input even slightly!
               | 
               | There's obviously nothing wrong when the compiler is so
               | smart that it sees some patterns and transforms your code
               | into something much more efficient. Only that there's not
               | much difference to what happens when you use a high level
               | language. In both cases you in fact don't control the
               | exact code that gets executed, and in both cases you rely
               | on the smartness of your compiler to produce some
               | efficient code, "whatever" you've written.
               | 
               | That's why I think it's mostly a function of the _code-
               | style_ how performant or efficient some language can be
               | (to some extent of course). When you write low-level
               | _style_ code (even in a high level language) a smart
               | compiler will (hopefully) create something like what you
               | would get from writing your code in C/C++.
        
               | bombela wrote:
               | I think what matters is the intent. Here the intent is to
               | compare against strlen at every turn of the loop.
               | 
               | When the compiler is optimizing it might very well
               | realize that the parameter to strlen doesn't change and
               | the output can be saved first.
               | 
               | If the intent is to compute the length only once, then we
               | can save the length in a variable.
               | 
               | Of course; as others have pointed out; there is no need
               | to call strlen at all in this example.
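
A sketch of the "save the length in a variable" variant described above, for the case where the intent is to compute the length only once. The function name only_digits_len is hypothetical. It still walks the string twice (once in strlen(), once in the loop).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s contains only '0'..'9', else 0. */
int only_digits_len(const char *s)
{
    assert(s != NULL);

    size_t len = strlen(s);          /* computed once, before the loop */
    for (size_t i = 0; i < len; i++) {
        if (s[i] < '0' || s[i] > '9')
            return 0;
    }
    return 1;
}
```

Written this way, the single call is stated in the source and does not depend on what the optimiser decides.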
        
           | Stratoscope wrote:
           | That is one of the most confusing pieces of C code I have
           | seen lately. And it fails on an empty string: I would expect
           | it to set b to 1 for an empty string, but it sets it to 0. Of
           | course that could easily be fixed by setting b to 1 at the
           | top.
           | 
           | Also, code like this should always be put inside a function
           | that returns a value, not just written inline. Making it a
           | function allows simpler and more understandable code too.
           | 
           | The funniest part is that it is not necessary to call
           | strlen() at all! The whole thing can be written in a single
           | pass over the string. Here is how I would code it in C:
           | 
           |     int OnlyDigits( char str[] ) {
           |         for( int i = 0;  str[i] != '\0';  ++i ) {
           |             if( str[i] < '0'  ||  str[i] > '9' ) return 0;
           |         }
           |         return 1;
           |     }
           | 
           | Try it here:
           | 
           | https://replit.com/@geary/OnlyDigitsC
        
             | pvtmert wrote:
             | You do realize it returns 1 for an empty string right? I
             | mean it doesn't have any digits in it...
             | 
             | What about adding a check: str[0] == 0 -> return 0? Also,
             | giving char str[] will make it char* str, which can be
             | null. This may cause reading a random memory location
             | (possibly a segfault or use-after-free).
             | 
             | edit: I get the comments, but an empty string still contains
             | no digits. The regex would be ^[0-9]+ (+ instead of *). What
             | I want to say is whether the string has a numerical value or
             | not. An empty string is NaN.
        
               | Someone wrote:
               | "contains only digits" isn't synonymous with "not
               | (doesn't have any digits)"
               | 
               | It is synonymous with "none of the characters is a non-
               | digit", which is true for the empty string.
        
               | Stratoscope wrote:
               | > _You do realize it returns 1 for an empty string right?
               | I mean it doesn't have any digits in it..._
               | 
               | Yes, and that was deliberate on my part, as it meets my
               | expectation of what such a function should do in this
               | edge case.
               | 
               | The problem statement was "Set boolean b to true if
               | string s contains only characters in range '0'..'9',
               | false otherwise."
               | 
               | To my mind, the question "is every character in the
               | string a digit" should be equivalent to "are there any
               | non-digits in the string" (with the answer inverted, of
               | course).
               | 
               | Returning 0 (false) for the empty string makes those
               | questions not equivalent. It makes the empty string a
               | special case.
               | 
               | Of course the real problem is that the problem is under-
               | specified. It should call out specifically what should
               | happen for an empty string, because as illustrated here,
               | this is something where reasonable people may disagree.
               | 
               | > _Also giving char str[] will make it char* str. Which
               | can be null. This may cause reading a random memory
               | location (possibly segfault or use-after-free)_
               | 
               | Well yes, of course. The point of my comment wasn't to
               | write bullet-proof library-ready code, it was only to
               | illustrate two things: code like this should always go in
               | a function, and the entire task can be accomplished in a
               | single pass through the string.
               | 
               | Thanks for keeping me on my toes!
        
               | zeven7 wrote:
               | Empty string contains no non-digits. Therefore it is only
               | digits.
        
         | besnn00 wrote:
         | as far as I saw they were pretty useless and the way they were
         | presented wasn't that great (same language variants could be
         | grouped in one commented section)
        
       | worik wrote:
       | No Swift?
       | 
       | I like these projects. Fun
        
         | nearmuse wrote:
         | People make these for autodidactic purposes, you can't expect
         | them to always be interesting or even meet any expectations.
        
       | reaperducer wrote:
       | Is it problematic that the same link was submitted by the same
       | user 49 days ago?
        
         | dang wrote:
         | This is in the FAQ:
         | https://news.ycombinator.com/newsfaq.html#reposts.
        
         | mdp2021 wrote:
         | Official practice (see FAQ): please also consider page:
         | 
         | https://news.ycombinator.com/invited
         | 
         | as reported by the unofficial
         | 
         | https://github.com/minimaxir/hacker-news-undocumented
        
         | besnn00 wrote:
         | reposting is permitted while unnoticed if not done too often
         | (once in two weeks would be ok imo)
        
         | mkl95 wrote:
         | HN invited me to repost it. I thought it was a good idea.
        
           | Shared404 wrote:
           | I concur, I didn't see it last time and am glad I did this
           | time.
        
       ___________________________________________________________________
       (page generated 2021-08-15 23:00 UTC)