[HN Gopher] Programming Idioms
___________________________________________________________________
Programming Idioms
Author : mkl95
Score : 132 points
Date : 2021-08-15 18:47 UTC (4 hours ago)
(HTM) web link (programming-idioms.org)
(TXT) w3m dump (programming-idioms.org)
| maxk42 wrote:
| Some of the examples don't do what the idiom says. Some aren't
| even close to good answers.
|
| Too many to rely upon this.
| blunte wrote:
| I see a lot of people taking issue with the idioms presented, and
| rightfully so in many cases.
|
| Add the ability for people to improve or debate the solutions.
| Ultimately we should have a large curated cookbook (with
| additional variant selections and associated recipe variants).
|
| The most important human element of programming is knowing what
| to build (and what pieces to build to make the bigger thing). How
| often do I have to look up ways to read a file in Ruby? Most
| times I need to read a file, I have to refer to the different
| approaches. I just don't do that often enough to remember
| everything. What I do know is, "this will be a lot of data, so I
| need to read it line by line or in chunks". That should be all
| you need to know, and then you pull up a recipe.
| wmu wrote:
| There's a similar project
| http://rosettacode.org/wiki/Rosetta_Code.
| diogenesjunior wrote:
| I made something similar myself, although it's only printing.
|
| https://github.com/FormerlyChucks/jello
| PrincessJas wrote:
| Your github username is a nazi, alt-right dogwhistle. Mods,
| please handle this.
| thom wrote:
| Surprisingly many failures of reading comprehension in the
| implementations here:
|
| https://programming-idioms.org/idiom/184/tomorrow
|
| I've always found it interesting to consider the simplest
| possible spec you could give 100 programmers and receive no bugs
| in return.
| still_grokking wrote:
| Hmm. There are more wrong implementations than correct ones by
| now.
|
| The core of the problem was noted already: Obviously most
| people can't read.
|
| That's especially "funny" when thinking about all the fuss that
| is made about teaching children programming in school. They
| should start with teaching them reading.
|
| I don't even mean this as snark. The state of affairs is actually
| depressing and I would welcome it very much if more people
| were able to read and understand what's written. It would
| make a lot of things easier for everybody, I guess.
| jpxw wrote:
| Also writing. I don't believe that it's possible for someone
| to be a clear and concise programmer unless they're able to
| write their native language clearly and concisely.
|
| This would likely require teaching students at least basic
| logic. This was entirely absent from my school experience.
|
| It's a hugely undertaught and undervalued skill, IMO. Before
| we start shoehorning CS into high school curriculums, we
| should consider laying these foundations first.
| ghoward wrote:
| > I've always found it interesting to consider the simplest
| possible spec you could give 100 programmers and receive no
| bugs in return.
|
| There are only one or two:
| https://pubs.opengroup.org/onlinepubs/9699919799/utilities/t...
| and maybe
| https://pubs.opengroup.org/onlinepubs/9699919799/utilities/f...
| .
|
| Even "Hello, World!" is done wrong more often than not. An
| example: you should check for errors from `printf()`, as in
| [1].
|
| [1]: https://stackoverflow.com/questions/12355758/proper-hello-wo...
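|
| A minimal sketch of what checking for errors from `printf()` can
| look like, assuming the only goal is to report a failed write to
| the caller. The function name `say_hello` is made up for this
| example and is not taken from the linked answer.

```c
#include <assert.h>
#include <stdio.h>

/* Write the greeting to `out`. Returns 0 on success, -1 if any write
 * failed. `say_hello` is a hypothetical name for this sketch. */
int say_hello(FILE *out)
{
    assert(out != NULL);
    if (fprintf(out, "Hello, World!\n") < 0)
        return -1;
    /* stdio buffers output, so an error such as a full disk may only
     * show up when the buffer is flushed. */
    if (fflush(out) == EOF)
        return -1;
    return 0;
}
```

| A `main` built on this would return EXIT_FAILURE whenever
| `say_hello(stdout)` is non-zero.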
| remexre wrote:
| Not even both of those; I thought there famously existed some
| way to make GNU true return non-zero.
|
| EDIT: Yep,
| https://github.com/coreutils/coreutils/blob/master/src/true....
| ghoward wrote:
| Ha! So it is impossible!
|
| Seriously, though, thank you for the link. That is good to
| know!
| Someone wrote:
| But that's a complex program, with command-line parsing,
| locale usage, etc. No wonder it has bugs :-)
|
| https://en.wikipedia.org/wiki/IEFBR14 was a one-byte
| program, but that had a bug, so it had to be doubled in
| size.
|
| And yes, it still exists
| (https://www.ibm.com/docs/en/zos-basic-skills?topic=utilities...).
| I wouldn't know how large it is nowadays.
| duped wrote:
| What do you call a spec that any programmer can understand and
| correctly describes the problem and its desired outcome
| exactly?
|
| A program
| still_grokking wrote:
| That description doesn't apply to most programs though.
|
| * Not all programmers can understand all programs
|
| * Most programs don't describe the problem exactly
|
| * Almost no program describes the desired outcome exactly
| jstanley wrote:
| I think the point is that the implementation of the program
| exactly describes the behaviour of the program. If you take
| the program as a description of what the program is
| supposed to do, then what it is supposed to do is pretty
| unambiguous!
|
| (Also debatable, for example if correct operation of the
| program depends on some property of its environment that
| can't always be relied upon.)
| still_grokking wrote:
| It's trivial to say that a program describes its behavior
| exactly (so it's its own spec).
|
| But the whole point of a spec is to describe how a
| program should behave _before_ writing it.
|
| Also you usually want some verification that the program
| matches the spec. When the program IS the spec what do
| you match?
| duped wrote:
| It's a joke - the point behind it is that most of
| programming around specs is refining the spec until it is
| sufficiently well defined as a program.
| still_grokking wrote:
| It's frankly only a kind of joke.
|
| People take it seriously, and you can hear it here and
| there as "argument" against proper specs.
| kukx wrote:
| I found out that a good way to learn idioms for a given language
| is to do some simple katas on codewars.com and then review the
| most upvoted solutions.
| Miiko wrote:
| The random idiom I got was:
|
| > Idiom #120 Read integer from stdin
|
| > Read an integer value from the standard input into variable n
|
|     int n[15];
|     fgets(n, 15, stdin);
|
| Really?
| akomtu wrote:
| Yes. You asked for "an integer" from stdin? Here's your
| integer. Specify constraints better next time. (That's probably
| what the upcoming AI-assisted code gen tools will look like).
| blunte wrote:
| Just wait until this gets rolled into GitHub Copilot...
| WJW wrote:
| The longer I look at this example, the more weirdness I spot:
|
| - There are no standard integer types that take 15 (decimal)
| digits to represent.
|
| - The array contains ints instead of chars
|
| - Why would you use fgets() instead of just gets()? (Though I
| don't touch C very often so perhaps that is considered proper
| style)
|
| - Obviously no conversion of the digits into anything else, let alone
| specifying a base or handling a `0x` prefix for hexadecimal or
| a minus sign for negative numbers.
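|
| For comparison, a hedged sketch of one way to read an integer that
| addresses the points above: a char buffer, fgets() with the buffer
| size, and strtol() for the conversion and error detection. The
| names `parse_int` and `read_int`, the 64-byte buffer, and the choice
| of base 10 only are assumptions made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a base-10 int from `line`. Returns 1 and stores the value in
 * *out on success, 0 on malformed or out-of-range input. */
int parse_int(const char *line, int *out)
{
    assert(line != NULL && out != NULL);
    char *end;
    errno = 0;
    long v = strtol(line, &end, 10);   /* accepts leading spaces and a sign */
    if (end == line)                   /* no digits were consumed */
        return 0;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                      /* does not fit in an int */
    if (*end != '\0' && *end != '\n')  /* trailing junk such as "12x" */
        return 0;
    *out = (int)v;
    return 1;
}

/* Read one line from `in` and parse it as an int. */
int read_int(FILE *in, int *out)
{
    char buf[64];
    if (fgets(buf, sizeof buf, in) == NULL)
        return 0;                      /* EOF or read error */
    return parse_int(buf, out);
}
```

| Usage would be along the lines of `int n; if (read_int(stdin, &n))
| ...`. Hexadecimal input is rejected here on purpose; passing base 0
| to strtol() would accept a `0x` prefix instead.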
| Someone wrote:
| > There are no standard integer types that take 15 (decimal)
| digits to represent
|
| Nitpick: that is irrelevant. The code reads in at most 14
| characters.
|
| > Why would you use fgets() instead of just gets()?
|
| You don't use _gets_ because it doesn't exist anymore. It got
| removed in C11 (it rightfully was deemed so bad that
| backwards compatibility was sacrificed). You can use gets_s,
| though:
|
|     char *gets_s( char *str, rsize_t n )
|
| (https://en.cppreference.com/w/c/io/gets)
| leetcrew wrote:
| gets is inherently unsafe unless the input is guaranteed
| (externally) to never overflow the buffer.
| mpweiher wrote:
| man gets
| ...
| SECURITY CONSIDERATIONS
|   The gets() function cannot be used securely. Because of its lack
|   of bounds checking, and the inability for the calling program to
|   reliably determine the length of the next incoming line, the use
|   of this function enables malicious users to arbitrarily change a
|   running program's functionality through a buffer overflow attack.
|   It is strongly suggested that the fgets() function be used in all
|   cases. (See the FSA.)
| jhgb wrote:
| > Why would you use fgets() instead of just gets()?
|
| I assume it's because gets() ranks as "-10: It's impossible
| to get right" on Rusty's API Design Manifesto?
| (http://sweng.the-davies.net/Home/rustys-api-design-manifesto)
| WJW wrote:
| TIL. I have been leading a sheltered life in languages with
| garbage collectors that obscured the true horror of gets()
| from me.
| jstanley wrote:
| The random idiom I got was:
|
| > Idiom #137 Check if string contains only digits
|
| > Set boolean b to true if string s contains only characters in
| range '0'..'9', false otherwise.
|
|     char b = 0;
|     for (int i = 0; i < strlen(s); i++) {
|         if (! (b = (s[i] >= '0' && s[i] <= '9')))
|             break;
|     }
|
| I appreciate the funny assignment-and-test-and-early-break in
| one (although I'd hardly say it's idiomatic), but I could do
| without the quadratic strlen().
| Miiko wrote:
| Not to mention that the proper idiom for this task would be:
|
|     int n = strspn(s,"0123456789");
|     BOOL b = (s[n] == 0);
| Stratoscope wrote:
| That is nice and simple, but it makes ten comparisons for
| each character in s, where only two are needed. Of course
| it would be a good approach if the set of characters you're
| testing against is not contiguous, unlike 0..9.
| toxik wrote:
| Good news then, it's also linear time like the marginally
| faster but enormously grotesque for loop provided
| previously. I would take that clarity of intent a hundred
| times over squeezing a couple of comparisons out.
| Stratoscope wrote:
| > _it's also linear time_
|
| You raise an interesting point. It got me thinking about
| how big-O notation has failed us in some ways: it teaches
| us to ignore constant factors.
|
| In big-O, an algorithm that makes 1000 comparisons per
| element is no different from one that makes a single
| comparison per element. They are both linear time. But
| you can't deny that one of these will likely take 1000
| times as long as the other.
|
| Of course, like you, I favor simple and readable code
| over grotesque code that is hard to understand and
| mentally verify.
| foxfluff wrote:
| It's not unreasonable to assume the compiler will optimize it
| to a single call. Though I guess people who are capable of
| making that judgement won't need to look this idiom up on the
| internet.
| andrepd wrote:
| But still, it will be two passes through the string when it
| could be only one.
| Stratoscope wrote:
| We don't write code only for compilers, but for human
| readers as well. Why write code that makes a smart human
| wonder "Is that going to be quadratic? I'd better make sure
| the compiler optimizes it out!"
| sokoloff wrote:
| Is there something in the C spec that allows optimizing to
| a single strlen call?
| foxfluff wrote:
| Absolutely. The gist of it:
|
| "In the abstract machine, all expressions are evaluated
| as specified by the semantics. An actual implementation
| need not evaluate part of an expression if it can deduce
| that its value is not used and that no needed side
| effects are produced (including any caused by calling a
| function or accessing a volatile object)."
| sokoloff wrote:
| As I understand it, the strlen implementation ("calling a
| function") is typically going to come from another object
| file (at link time), so it's not clear that when
| compiling this file that "calling strlen has no side
| effects" is information available to the compiler.
| foxfluff wrote:
| strlen is a standard function (in a hosted environment).
| So it must do exactly what the standard says it does, and
| the standard doesn't say it has side-effects. The
| compiler could very well use a built-in implementation of
| strlen, or even omit the call entirely if it had another
| way to deduce its would-be return value.
|
| Object files are an implementation detail not known by
| the C standard.
| sokoloff wrote:
| > the standard doesn't say it has side-effects
|
| The relevant question is "does the standard say that it
| _does not have_ side-effects?" (is it a pure function?).
| @skissane's sibling comment to yours provides the
| explanation of how the compiler can deduce that it's a
| pure function.
| foxfluff wrote:
| The compiler can deduce that it is a pure function using
| the same logic by which it can deduce it is free to
| replace the call with a builtin: the function is defined
| in the standard. Everything @skissane said is
| implementation details and not particularly relevant (the
| compiler can do the optimization without the attribute
| they mentioned).
|
| I think the standard would fall apart if you read it
| under the assumption that anything not explicitly
| forbidden can happen. Instead, you should read and find
| out what the side effects are (and same for undefined
| behavior, unspecified behavior, implementation defined
| behavior, etcetera).
|
| "Accessing a volatile object, modifying an object,
| modifying a file, or calling a function that does any of
| those operations are all side effects, which are changes
| in the state of the execution environment." (There are
| more details if you care to dig in)
|
| Of course nothing stops you or me from making extensions
| to the standard, but analysing things from the
| perspective that some implementation might extend strlen
| to have visible side effects goes too far into
| whataboutism for my taste, unless there are real world
| examples to make it a relevant point.
| skissane wrote:
| > so it's not clear that when compiling this file that
| "calling strlen has no side effects" is information
| available to the compiler.
|
| It is because strlen is a #define to __builtin_strlen
|
| And __builtin_strlen is declared (internally) as
| __attribute__((pure)).
|
| The fact that it is a compiler builtin isn't that
| important; the compiler will perform the exact same loop
| optimisation on a non-builtin function if you declare it
| as __attribute__((pure)). So, in practice, it is not the
| object file which tells the compiler it can do this, it
| is the function declaration in the header file.
|
| __attribute__((pure)) is of course a GCC extension, not
| standard C or C++ - but Clang supports it and so do many
| proprietary compilers, MSVC appears to be the main
| exception. This isn't part of the standard but is a
| compatible extension. C++11 standardises attributes (with
| a different syntax) and C202x is going to do the same.
| Neither is yet including "pure" among the standardised
| attributes but a future standard revision could always
| add it. There is a proposal to add pure to the C++
| standard [0]. If it successfully makes it into C++, don't
| be surprised if it makes it into the C standard as well
| at some point. It is the kind of C++ feature which the C
| standard is likely to borrow, and is already widely
| implemented (just under different syntax) anyway.
|
| [0] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p007...
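|
| A small sketch of the attribute described above, assuming GCC or
| Clang. `my_strlen`, `all_digits` and the `PURE` macro are
| hypothetical names for this example. In real code the definition of
| `my_strlen` would live in another file, so the declaration would be
| all the compiler sees; it is defined here only to keep the example
| self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* `pure` is a GCC/Clang extension; expand to nothing elsewhere. */
#if defined(__GNUC__)
#define PURE __attribute__((pure))
#else
#define PURE
#endif

/* The attribute promises that the function has no side effects and
 * that its result depends only on its arguments and on memory it reads. */
PURE size_t my_strlen(const char *s);

int all_digits(const char *s)
{
    assert(s != NULL);
    /* The loop body never writes to memory, so with the `pure` promise
     * the compiler is allowed (not required) to call my_strlen(s) once
     * instead of on every iteration. */
    for (size_t i = 0; i < my_strlen(s); i++) {
        if (s[i] < '0' || s[i] > '9')
            return 0;
    }
    return 1;
}

size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

| Whether the call is actually hoisted depends on the compiler and
| the optimisation level, so the attribute is a permission and not a
| guarantee.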
| Someone wrote:
| Nitpick: I don't think the compiler needs that
| __attribute__((pure)) annotation to move _strlen_
| around.
|
| __builtin_strlen is a reserved name, so if the compiler
| sees it, it may assume it came from an _#include
| <string.h>_ and that it does what the standard says
| _strlen_ does.
|
| That is what allows compilers to do much more than move
| that function call out of the loop. Because they know
| what it does (or rather: what the programmer promises it
| will do, once the program gets linked) they can inline its
| code or even, if the length of the string is known up
| front, replace the function call by a constant.
| foxfluff wrote:
| It doesn't even need to see __builtin_strlen (try modifying
| your string.h or use musl, which has neither the
| attribute nor __builtin_strlen). It just needs to see
| strlen in a hosted environment (with the right header
| included?). If you want gcc to treat it otherwise, you
| also need -fno-builtin-strlen.
| ratww wrote:
| Don't know about the C spec, but this is a very popular
| optimisation. It's called _Loop-invariant code motion_.
| thethirdone wrote:
| I just checked godbolt [0]. gcc only calls strlen once even
| with -O0.
|
| [0]:https://godbolt.org/z/j4o1915vE
| josefx wrote:
| For me the strlen call appears directly before the loop's
| backward jump when set to -O0, resulting in a call every
| iteration as far as I can tell. However -O1 already seems
| to optimize it to a single call at the start of the
| function.
| still_grokking wrote:
| I find it funny every time that when using languages that
| want to give you "total control" over the execution of
| your code (mostly C/C++) you actually almost never know
| what code gets executed in the end.
|
| It depends on the compiler, its version, its flags, and
| likely "the position of the moon".
|
| Of course the compiler is only allowed to do
| transformations that the spec permits. But it's
| impossible for a human being to anticipate the exact
| outcome. It's more like: "Compiler, do something that has
| the same outcome as this code I show you here". The
| output can then be something that doesn't resemble the
| input even slightly!
|
| There's obviously nothing wrong when the compiler is so
| smart that it sees some patterns and transforms your code
| into something much more efficient. Only that there's not
| much difference to what happens when you use a high level
| language. In both cases you in fact don't control the
| exact code that gets executed, and in both cases you rely
| on the smartness of your compiler to produce some
| efficient code, "whatever" you've written.
|
| That's why I think it's mostly a function of the _code-style_
| how performant or efficient some language can be (to some
| extent, of course). When you write low-level _style_ code (even
| in a high level language) a smart compiler will (hopefully)
| create something like what you would get from writing your
| code in C/C++.
| bombela wrote:
| I think what matters is the intent. Here the intent is to
| compare against strlen at every turn of the loop.
|
| When the compiler is optimizing it might very well
| realize that the parameter to strlen doesn't change and
| the output can be saved first.
|
| If the intent is to compute the length only once, then we
| can save the length in a variable.
|
| Of course, as others have pointed out, there is no need
| to call strlen at all in this example.
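|
| A sketch of "save the length in a variable" applied to the idiom
| under discussion. It keeps the two-pass shape of the original
| snippet (strlen, then the loop) but makes the single strlen call
| explicit instead of relying on the optimiser. The function name
| `only_digits_two_pass` is an assumption for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if every character of s is in '0'..'9', 0 otherwise.
 * An empty string yields 1 because the loop body never runs. */
int only_digits_two_pass(const char *s)
{
    assert(s != NULL);
    const size_t len = strlen(s);   /* computed exactly once, by construction */
    for (size_t i = 0; i < len; i++) {
        if (s[i] < '0' || s[i] > '9')
            return 0;
    }
    return 1;
}
```

| This states the intent directly, so a reader does not have to
| wonder whether the loop is quadratic.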
| Stratoscope wrote:
| That is one of the most confusing pieces of C code I have
| seen lately. And it fails on an empty string: I would expect
| it to set b to 1 for an empty string, but it sets it to 0. Of
| course that could easily be fixed by setting b to 1 at the
| top.
|
| Also, code like this should always be put inside a function
| that returns a value, not just written inline. Making it a
| function allows simpler and more understandable code too.
|
| The funniest part is that it is not necessary to call
| strlen() at all! The whole thing can be written in a single
| pass over the string. Here is how I would code it in C:
|
|     int OnlyDigits( char str[] ) {
|         for( int i = 0; str[i] != '\0'; ++i ) {
|             if( str[i] < '0' || str[i] > '9' ) return 0;
|         }
|         return 1;
|     }
|
| Try it here:
|
| https://replit.com/@geary/OnlyDigitsC
| pvtmert wrote:
| You do realize it returns 1 for an empty string right? I
| mean it doesn't have any digits in it...
|
| What about adding a check: if str[0] == 0, return 0? Also,
| giving char str[] will make it char* str. Which can be
| null. This may cause reading a random memory location
| (possibly segfault or use-after-free)
|
| edit: I get the comments, but the empty string still contains no
| digits. The regex I have in mind would be ^[0-9]+ (+ instead of
| *). What I want to know is whether the string has a numerical
| value or not. The empty string is NaN.
| Someone wrote:
| "contains only digits" isn't synonymous with "not
| (doesn't have any digits)"
|
| It is synonymous with "none of the characters is a non-
| digit", which is true for the empty string.
| Stratoscope wrote:
| > _You do realize it returns 1 for an empty string right?
| I mean it doesn't have any digits in it..._
|
| Yes, and that was deliberate on my part, as it meets my
| expectation of what such a function should do in this
| edge case.
|
| The problem statement was "Set boolean b to true if
| string s contains only characters in range '0'..'9',
| false otherwise."
|
| To my mind, the question "is every character in the
| string a digit" should be equivalent to "are there any
| non-digits in the string" (with the answer inverted, of
| course).
|
| Returning 0 (false) for the empty string makes those
| questions not equivalent. It makes the empty string a
| special case.
|
| Of course the real problem is that the problem is under-
| specified. It should call out specifically what should
| happen for an empty string, because as illustrated here,
| this is something where reasonable people may disagree.
|
| > _Also giving char str[] will make it char* str. Which
| can be null. This may cause reading a random memory
| location (possibly segfault or use-after-free)_
|
| Well yes, of course. The point of my comment wasn't to
| write bullet-proof library-ready code, it was only to
| illustrate two things: code like this should always go in
| a function, and the entire task can be accomplished in a
| single pass through the string.
|
| Thanks for keeping me on my toes!
| zeven7 wrote:
| Empty string contains no non-digits. Therefore it is only
| digits.
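|
| The two readings debated in this subthread can be written side by
| side. This is only a sketch: `only_digits` follows the "no
| non-digits" reading (true for the empty string) and
| `is_digit_string` follows the ^[0-9]+$ reading (false for the empty
| string). Both function names are made up for this example.

```c
#include <assert.h>
#include <string.h>

/* True when s contains no non-digit character; vacuously true for "". */
int only_digits(const char *s)
{
    assert(s != NULL);
    /* strspn returns the length of the leading run of digits; if that
     * run reaches the terminator, nothing else is in the string. */
    return s[strspn(s, "0123456789")] == '\0';
}

/* True when s is one or more digits, i.e. it would match ^[0-9]+$ . */
int is_digit_string(const char *s)
{
    assert(s != NULL);
    return s[0] != '\0' && only_digits(s);
}
```

| The functions differ only on the empty string, which is exactly the
| case the problem statement leaves unspecified.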
| besnn00 wrote:
| as far as I saw they were pretty useless and the way they were
| presented wasn't that great (same language variants could be
| grouped in one commented section)
| worik wrote:
| No Swift?
|
| I like these projects. Fun
| nearmuse wrote:
| People make these for autodidactic purposes, you can't expect
| them to always be interesting or even meet any expectations.
| reaperducer wrote:
| Is it problematic that the same link was submitted by the same
| user 49 days ago?
| dang wrote:
| This is in the FAQ:
| https://news.ycombinator.com/newsfaq.html#reposts.
| mdp2021 wrote:
| Official practice (see FAQ): please also consider page:
|
| https://news.ycombinator.com/invited
|
| as reported by the unofficial
|
| https://github.com/minimaxir/hacker-news-undocumented
| besnn00 wrote:
| reposting is permitted while unnoticed if not done too often
| (once in two weeks would be ok imo)
| mkl95 wrote:
| HN invited me to repost it. I thought it was a good idea.
| Shared404 wrote:
| I concur, I didn't see it last time and am glad I did this
| time.
___________________________________________________________________
(page generated 2021-08-15 23:00 UTC)