[HN Gopher] The hardest program I've ever written (2015)
___________________________________________________________________
The hardest program I've ever written (2015)
Author : graderjs
Score : 230 points
Date : 2022-03-05 10:06 UTC (12 hours ago)
(HTM) web link (journal.stuffwithstuff.com)
(TXT) w3m dump (journal.stuffwithstuff.com)
| 734129837261 wrote:
| The day job interviews for programmers ask "write me a language
| formatter, you have 3 hours" I'll probably end up in jail. Those
| things are way beyond my skillset and I'm glad smarter people
| than me exist. If you're one of those people: thank you. I love
| you.
| viginti_tres wrote:
| Why jail? Cause you'll lose your mind and do harmful things?
| Asking for a friend
| andai wrote:
| The result will be so bad, it will be considered a violation
| of the Geneva Convention.
| dang wrote:
| Related:
|
| _The Hardest Program I've Ever Written - How a code formatter
| works (2015)_ - https://news.ycombinator.com/item?id=22706242 -
| March 2020 (125 comments)
|
| _The Hardest Program I've Ever Written (2015)_ -
| https://news.ycombinator.com/item?id=17271963 - June 2018 (76
| comments)
|
| _The Hardest Program I've Ever Written (2015)_ -
| https://news.ycombinator.com/item?id=15063193 - Aug 2017 (48
| comments)
|
| _The Hardest Program I've Ever Written_ -
| https://news.ycombinator.com/item?id=10195091 - Sept 2015 (76
| comments)
| paxys wrote:
| If I ever have to write a code formatter, it will strictly
| enforce one line per statement and disallow artificial line
| breaks. Devs who end up writing 5000-character function chains
| better have a wide monitor.
| LudwigNagasena wrote:
| Long statements are one of the reasons I dislike "fluent
| interfaces". To me long statements feel like a problem of bad
| language design. And a super smart formatter feels like a crutch
| when what you really want is a leg.
| cosmiccatnap wrote:
| [deleted]
| amelius wrote:
| Does it find a true optimum, or just some approximation?
| algon33 wrote:
| Approximation
| munificent wrote:
| As long as it doesn't hit a built-in hard limit for search
| space exploration (which is in practice only encountered on
| pathological generated code), it will find the optimally scored
| set of line breaks.
| coliveira wrote:
| I don't understand why people have this fetish for automatic
| formatters. If you really want this, you should be using old
| style FORTRAN or something similar. The good thing about modern
| languages is that you don't depend on the location of code in the
| page for it to work. If you start worrying too much about exact
| formatting, you throw away this big advantage. I really prefer
| code in the location where I put it, not where a machine thinks
| it is best.
|
| And if you think that formatting is a problem to understand the
| code, let's get real: this is the smallest of the problems. There
| are tons of other things that make code complicated to read, like
| variable and function names, the particular style of your code,
| how you split it into classes and files, the algorithm you're
| using, and so many other, more important things. I can guarantee
| you that if a piece of code is well written, you can understand
| it independent of where you put braces or the number of spaces
| you're using.
| dahart wrote:
| In addition to what @derefr said, in order to not want
| automatic formatting, first you have to get to the place where
| zero people in your team/company care about formatting &
| whitespace at all. Disagreements over whitespace consume people's
| time, and those disagreements go away when automated formatting
| is used. This is the strongest reason in my experience to use
| automatic formatting: to eliminate time spent talking about
| formatting.
|
| Auto-formatting tools in editors exist, and they're very
| common, and they're not always configured properly, so people
| change formatting by accident. Sometimes formatting changes can
| cause code reviews to take more time than necessary. Having
| tabs in code can cause actual problems, for example, since tabs
| aren't the same size everywhere.
|
| This is not just a code understanding problem, and shouldn't be
| written off as trivial, IMO.
| [deleted]
| avl999 wrote:
| Auto Format is not for the machine, it is for other humans who
| work with you.
| paxys wrote:
| If you truly think that, you have never worked on a codebase
| with a team size > 3.
| umanwizard wrote:
| I like automatic formatters (if they're a deterministic
| function of AST to text) because I think of what I'm writing as
| a syntax tree, and the fact that it's stored as text as a
| historical accident.
|
| I just want to write the tokens without ever thinking about
| where they go on the page, periodically save and let my
| formatter deal with it.
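
The idea of a formatter as a deterministic function from AST to text can be sketched with Python's standard `ast` module (Python 3.9 or later). This is only an illustration under stated assumptions, not how any formatter discussed in this thread works: `ast.unparse` drops comments and knows nothing about line length, and the sample inputs `messy` and `tidy` are made up.

```python
import ast


def canonical(source: str) -> str:
    """Parse source into an AST, then print the tree back as text.

    The layout of the input is discarded by the parser, so the output
    depends only on the tree.
    """
    return ast.unparse(ast.parse(source))


# Two hypothetical layouts of the same program.
messy = "x=[1,\n   2 ,3]\ndef f( a,b ):\n        return a+b"
tidy = "x = [1, 2, 3]\n\ndef f(a, b):\n    return a + b\n"

if __name__ == "__main__":
    print(canonical(messy))
    print(canonical(messy) == canonical(tidy))
```

Because the output depends only on the tree, running `canonical` on its own output changes nothing further for inputs like these.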
| derefr wrote:
| I take it that you don't work on a large corporate code-base /
| don't have to code-review other people's code?
|
| Auto-formatting (esp. when used as a pre-commit hook) means
| that _changes_ people make to the style are ignored/reverted
| (and/or, that places where people introduce a different style
| in new code, are auto-formatted back into the existing style
| immediately, rather than that needing to be an additional
| commit later on.) Thus, no spurious diff lines from formatting.
| Thus, not having to wade through a bunch of "noise" diff-lines,
| to get to the "signal" of semantic changes at code-review time.
|
| Also, having auto-formatting on both your main branch +
| development branches, makes merge/rebase conflicts less likely
| to happen. (Which basically boils down to "fewer noise diff-
| lines" again.)
|
| In other words: auto-formatting makes code more machine-legible
| to _syntax-blind_ parsers; which in turn allows tooling like
| diff(1) to be more helpful.
|
| (Yes, we _could_ just have language-syntax-aware semantic-level
| diff/merge/etc. tools. Not sure why nobody ever made these. I
| bet this is one of those things where Lisp users have had it
| for ages but using their own parallel world of abstractions
| that doesn't exist in C/POSIX.)
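
The "noise versus signal" point can be made concrete with a small Python sketch. It assumes Python 3.9 or later and uses `ast.unparse` as a stand-in for a real formatter; the `area` function and both versions of it are invented for illustration. A line-based diff of the hand-formatted versions flags every line, while a diff of the normalized versions flags only the semantic change.

```python
import ast
import difflib


def fmt(source: str) -> str:
    """Stand-in formatter (an assumption): round-trip through the AST."""
    return ast.unparse(ast.parse(source))


def changed_lines(before: str, after: str) -> int:
    """Count added and removed lines in a unified diff, skipping headers."""
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm=""
    )
    return sum(
        1
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )


# Hypothetical edit: one semantic change (the "* 2") plus restyling.
before = "def area(w,h):\n  return w*h\n"
after = "def area( w, h ):\n    return w * h * 2\n"

if __name__ == "__main__":
    print("raw diff lines:      ", changed_lines(before, after))
    print("formatted diff lines:", changed_lines(fmt(before), fmt(after)))
```

Here the raw diff reports four changed lines and the formatted diff reports two (one removal and one addition for the single line that really changed).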
| coliveira wrote:
| This has nothing to do with formatting. When you create a
| change to a code base you should be submitting only the lines
| that are new/changed. If someone is submitting purely
| formatting changes, he/she's just wrong and you should reject
| that during review.
| derefr wrote:
| > If someone is submitting purely formatting changes,
| he/she's just wrong and you should reject that during
| review.
|
| If you add a line between two existing lines, and then
| insert after it a new _blank_ line to serve as a sort of
| "paragraph marker", is that a "pure formatting" change?
|
| If you add a constant in a group of constants, whose name
| is longer than the existing ones, do you pad the spacing of
| the values of those constants so they line up with one-
| another?
|
| For that matter, if you fully-qualify a previously-
| unqualified and potentially-ambiguous identifier, is _that_
| a "pure formatting" change? Some auto-formatter tools do
| this, after all.
|
| These are things that people may or may not do in code-
| bases, that "fly under the radar" of even the most
| stringent of human code-reviewers, because they're so
| irrelevant to _understanding_ the code. They're "fluff."
| But because of this, how people introduce that fluff is
| essentially random, and so the cause of a lot of diff
| noise. These are the things that auto-formatters can "lock
| down" to only happen a certain way.
|
| But I think you're missing the forest for the trees, as I
| mostly wasn't talking about _pure_ formatting changes. What
| I'm talking about is more like:
|
| You add a formal parameter to a function. Before, the
| function's clause head was less than 80 characters. Now
| it's more than 80 characters. Do you break the formal
| parameter list onto the next line? If so, how far do you
| indent it? Do you split the formal-parameter list up so
| that _each_ parameter is now on its own line? Etc.
|
| Done by humans with no strict standard, these sort of one-
| off judgements made arbitrarily will add up to "syntax rot"
| -- not something you observe with your eyes, but a sort of
| "potential energy" of un-made formatting changes, that
| means that any given _semantic_ change by a sufficiently-
| motivated human might become the impetus for a manual
| reformatting _during_ that semantic change, such that that
| reformatting will happen at a _random_ time, inflating a
| patch where affecting that additional code _wasn't_
| strictly necessary. (If you ask the programmer why they did
| it, they'll say they needed to "clean up the code they were
| working on" so that they could understand it well enough to
| apply the fix.) Which is horrible for both code review
| _and_ merge predictability.
|
| On the other hand, an auto-formatting tool will apply that
| transformation exactly _when_ it becomes necessary; and
| will pick some way of formatting the additional lines and
| stick to it. There's no "potential energy" there. At all
| times, the codebase is "at rest", with no chance of anyone
| introducing "arbitrary" (but actually _left-over_)
| formatting changes.
|
| Human formatting is like a sequence of DML statements in an
| RDBMS. Auto-formatting is like a sequence of operations
| against a CRDT. Given a bunch of changes run in a _random
| order_, the output of human formatters will be arbitrary,
| while the output of auto-formatting will be deterministic.
| Which is what you want, if you're doing complex things
| involving e.g. long-maintained stable branches for 1.x that
| cherry-pick changes from 2.x.
| brtmr wrote:
| > The good thing about modern languages is that you don't
| depend on the location of code in the page for it to work. If
| you start worrying too much about exact formatting, you throw
| away this big advantage.
|
| Counterpoint: When using a formatter, I stop worrying about
| formatting. It's a job for a computer, done by a computer.
| Humans are bad at consistency and discipline, computers are
| great at it. I want to concentrate on the things that matter,
| and formatting isn't one of those.
|
| Especially in larger teams, consistent formatting is just nice.
| No conflicting styles in the same file, and more meaningful
| diffs.
| coliveira wrote:
| If you really want to stop worrying about code formatting,
| just stop doing it. It is not really that important. I have
| never spent any time worrying about it, and I don't see why
| people would be upset about formatting.
|
| Moreover, using an automatic formatter will not fix it,
| because, guess what, there is no universal code formatter.
| All of them have different results and a long list of
| parameters. Determining the best way to use one will create
| more work for you as you manage your team, and will
| inevitably add a new step to your already complex building
| process. Just stop worrying and use that time in more
| productive ways.
| RussianCow wrote:
| > Determining the best way to use one will create more work
| for you as you manage your team, and will inevitably add a
| new step to your already complex building process.
|
| I don't know. I write JavaScript at $DAY_JOB and setting up
| Prettier on our repos took all of ~30 minutes, with an
| additional ~15 to determine which options to use. (There
| aren't many because Prettier is fairly opinionated.) I have
| seen far more time wasted quibbling about code styling in
| code reviews.
| jan_Inkepa wrote:
| I've been thinking of working on an automatic formatter for one
| particular programming language in order to easily be able to
| guarantee consistency of the documentation examples for it. (I
| get occasional bug reports about stylistic inconsistency or
| inconsistent spacing in them.)
| yashap wrote:
| Strongly disagree:
|
| - Not having to format my code manually at all, just letting
| the formatter do it for me, is a significant productivity win.
| I write code as fast as I can, with the minimum number of key
| strokes, in a way that would normally be super ugly, and it
| comes out the same. I have my editor set up to auto-format the
| current file on save, so it's just type a bit of code with zero
| formatting, cmd+s, then it's instantly perfectly formatted
|
| - For a codebase with 10s or 100s of devs working on it,
| uniform formatting does significantly help readability. Sure I
| can still read it if there's dozens of different formatting
| styles going on, but I can read it faster if the formatting is
| always consistent
|
| - Re: the above, yes you can keep consistent formatting without
| a code formatter, with a style guide that everyone learns, and
| that you enforce in code reviews. But that's a waste of time
| both for on-boarding new devs, and a basically neverending
| waste of time during code reviews. Also a waste of time writing
| and maintaining the style guide itself
|
| The first point helps me write faster, the second helps me read
| faster, and the third keeps code reviews and the like quicker.
|
| Code formatters are such a clear, easy win, especially with
| large teams, that it's hard for me to understand why anyone
| would opt out of them. It's not a MASSIVE win, but IMO it
| clearly makes for a more productive development environment,
| and they're generally dead simple to set up.
| coliveira wrote:
| I have worked on teams that do automatic formatting and
| others that didn't. I have never seen any advantage of
| automatic formatting. In my experience, people who like to
| complain about simple things like where to put braces or
| where to break a line will move the goal posts and start to
| complain about particular parameters of the formatter, or try
| to change the formatter to something "more powerful". People
| who don't care about location of braces will continue working
| without problems, and everything will be the same as before,
| just with the added complexity.
| yashap wrote:
| The point is not that any one code formatting style is
| best, the point is that consistency in formatting across a
| codebase helps you read code faster. Our eyes and brains
| are good at picking up patterns - consistent patterns let
| our brains parse code faster than if every file is written
| in a different style.
|
| Furthermore, not worrying at all about indentation,
| spacing, brace placement, semicolons or not, etc. lets me
| write code faster, not just read it faster. Type it out
| with zero effort expended on formatting, save, editor auto-
| formats.
|
| It's not that any of this saves crazy amounts of time, but
| it does make all of code writing, code reading and code
| reviews slightly faster. When it's so easy to set up, why
| not do it?
|
| The only argument I can see against auto-formatters is that
| people like to put their own artistic touch on the code
| they write. I get that, but it wastes time, especially when
| everyone starts doing things in their own style.
|
| I've been working professionally as a dev for 9 years, also
| on teams that use auto-formatters, and teams that don't. I
| think they're a small but clear productivity booster.
| oneeyedpigeon wrote:
| I have a particular fondness for well-formatted html that I can
| read via 'view source' - in contrast to the div-overloaded soup
| that I usually encounter. Periodically, I toy with automated
| formatting for html until I remember that it's essentially
| impossible - two html sources can have a single space character
| difference and simultaneously produce the same output and
| different output, depending on an external CSS file. This kind of
| stuff is tricky.
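
A very rough Python model of the point about whitespace and CSS. It assumes only two values of the CSS `white-space` property and ignores everything else a browser does (line boxes, trimming at line edges, inline versus block context), so treat it as a sketch of the idea and not of real rendering. The function name `rendered_text` and the sample strings are hypothetical.

```python
import re


def rendered_text(text: str, white_space: str) -> str:
    """Approximate how runs of whitespace are shown under a CSS setting."""
    if white_space == "pre":
        # Whitespace is preserved exactly as written in the source.
        return text
    if white_space == "normal":
        # Runs of whitespace collapse into a single space.
        return re.sub(r"\s+", " ", text)
    raise ValueError(f"unsupported white-space value: {white_space!r}")


# Two sources that differ by a single space character.
source_a = "Hello  world"
source_b = "Hello world"

if __name__ == "__main__":
    for mode in ("normal", "pre"):
        same = rendered_text(source_a, mode) == rendered_text(source_b, mode)
        print(f"white-space: {mode:6} -> same output: {same}")
```

Under `normal` the two sources look identical, under `pre` they do not, so a formatter that only sees the HTML cannot know whether adding or removing that space is safe.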
| [deleted]
| ameliaquining wrote:
| I'm confused, don't the browser dev tools do exactly this? That
| seems better than sending pretty-printed HTML on the wire,
| which is a bunch of unnecessary bytes that users have to pay
| for.
| oneeyedpigeon wrote:
| The browser dev tools display something very different to the
| source of the page - it's a live view of the document, for a
| start.
| ameliaquining wrote:
| That's addressed easily enough: disable JavaScript, so that
| the document can't change.
| lesam wrote:
| It's interesting that the article specifically mentions the go
| formatter, but fails to notice that the go formatter sidesteps
| this problem entirely by not setting a line length constraint:
| https://news.ycombinator.com/item?id=16434566
| xphx wrote:
| > _If every statement fit within the column limit of the page,
| yup. It's a piece of cake. (I think that's what gofmt does.)
| But our formatter also keeps your code within the line length
| limit._
| CodesInChaos wrote:
| As a programmer I prefer formatters that don't introduce those
| heuristic line-breaks based on line length.
|
| I'm still hoping that Rust will eventually get such a formatter.
| Unfortunately the people responsible for rustfmt seem to have a
| strong preference for the "ignore line-breaks the user inserted"
| approach.
| boardwaalk wrote:
| This grinds my gears. Especially when you have multiple similar
| lines and one is a character longer and the formatter breaks
| that line.
|
| It's like, I could scan the code easier and understand it
| better without you doing that, thank you!
|
| A similar thing is match statements -- some arms using braces
| vs statements.
|
| There's also a bunch of heuristics in rustfmt that are complex
| to the point that I literally couldn't format the code the way
| it does without building some sort of decision tree annotated
| with uneven column limits (e.g. "70% of the column limit"). If
| I can't and therefore wouldn't format the code the way the
| formatter does, there's an issue.
|
| Formatters are for consistency, I think, and sometimes they
| work against that.
| jynelson wrote:
| It's not the maintainers - rustfmt has an official style guide
| it's not allowed to break without an RFC:
| https://github.com/rust-dev-tools/fmt-rfcs/blob/master/guide...
| mbrock wrote:
| Zig's formatter has no notion of line length. If you want an
| argument list to be stacked vertically, you insert a trailing
| comma after the last argument, otherwise the formatter will put
| it all on a single line. I was a bit bothered by this at first
| but I came to really like it.
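
The trailing-comma rule described above can be modeled in a few lines of Python. This is a toy written from the description in the comment, not Zig's actual implementation; `format_call` and its parameters are hypothetical names.

```python
def format_call(
    name: str, args: list[str], trailing_comma: bool, indent: str = "    "
) -> str:
    """Lay out a call based only on whether the author wrote a trailing comma.

    No line-length limit is consulted: the comma is the sole signal.
    """
    if trailing_comma and args:
        # One argument per line, each followed by a comma.
        body = "".join(f"{indent}{arg},\n" for arg in args)
        return f"{name}(\n{body})"
    # Otherwise everything goes on a single line, however long it gets.
    return f"{name}({', '.join(args)})"


if __name__ == "__main__":
    print(format_call("connect", ["host", "port", "timeout"], trailing_comma=False))
    print(format_call("connect", ["host", "port", "timeout"], trailing_comma=True))
```

The design choice is that the author, not a column limit, decides between the two layouts, and the formatter only makes each layout consistent.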
| umvi wrote:
| Agreed. Formatters are supposed to remove all cognitive burden
| related to formatting. But formatters like Black (for python)
| will do line-length based formatting which reintroduces the
| cognitive burden again ("oh crap, my variable names are too
| long, better shorten them so this line stops getting broken
| up"). I like gofmt better for this reason. It doesn't break up
| your lines based on some arbitrary line length.
| sixstringtheory wrote:
| I usually set the line length limit to somewhere around 80 so
| I can have several columns visible at once without wrapping
| or truncation.
|
| So far my magic number is three columns: three code files, or
| one/two code files with a terminal and/or browser window
| thrown in the mix. Or any of these columns can be split into
| two vertically stacked boxes for a total of six things.
|
| I'm also a (nonultrawide) single screen coder (after many
| forays into the multi screen world) which has undoubtedly
| guided my preference.
| munificent wrote:
| Ultimately code is a visual medium (for most users). It is a
| data format consumed primarily by human eyeballs, so there is
| no escaping the reality that things like identifier length,
| line length, wrapping, etc. matter.
|
| Any other strategy is like trying to design a chair without
| thinking about butts. You may come up with some sort of
| elegant Bauhaus mathematically perfect work of art, but no
| one will want to sit in it.
| jll29 wrote:
| (2015)
| svalorzen wrote:
| I'm not sure I understand why dynamic programming wouldn't work
| (and the author explicitly mentioned Knuth). TeX's main job is
| literally doing line breaks, which is the exact same problem
| being tackled here. I would expect a similar approach
| (progressively build a graph of the most promising breaking
| points) to be effective. Why wouldn't it be the case here?
| mirekrusin wrote:
| Yes, strange, it looks like that's his solution plus some ad hoc
| logic. At the same time he's more knowledgeable than I am so
| dunno.
| pantsforbirds wrote:
| There are some nice clarifications to the problems he ran into
| with his DP implementation with the little skulls at the bottom
| of the article.
| svat wrote:
| As someone very familiar with the Knuth-Plass line-breaking
| algorithm (https://tex.stackexchange.com/a/423578/48), an
| important difference I see here is that for paragraphs (the
| domain of TeX), there is no "state" that needs to be preserved
| across lines: if you know that your paragraph is going to
| choose a certain break-point, then you can pretty much typeset
| the "before" and "after" parts independently, each optimally.
| (With one exception: there is a penalty for hyphens being on
| successive lines, so we need to track whether the previous line
| was hyphenated.) This is the "optimal substructures" property
| that makes it so amenable to dynamic programming.
|
| With the code formatter, to format the part after a certain
| character, you need to keep track of the indentation depth of
| all the expressions that have not yet terminated at this point
| -- because you presumably want parallel expressions to be
| formatted with the same indentation depth, for closing
| parentheses to match their corresponding opening parentheses,
| etc.
|
| For example, in this case:
|
|     experimental = document.querySelectorAll('link').any((link) =>
|         link.attributes['rel'] == 'import' &&
|         link.attributes['href'] == POLYMER_EXPERIMENTAL_HTML);
|
| and, say (I'm making this up):
|
|     experimental = document.querySelectorAll('link').any(
|         (link) => link.attributes['rel'] == 'import' &&
|             link.attributes['href'] == POLYMER_EXPERIMENTAL_HTML);
|
| -- knowing that there's a break after the `&&` is not enough;
| you also need to know the indentation of the previous
| expressions, to decide how you're going to format the part
| after the `&&`.
|
| This is what the author alludes to in the post:
|
| > _A line break changes the indentation of the remainder of the
| statement, which in turn affects which other line breaks are
| needed. Sorry, Knuth. No dynamic programming this time. [...]
| For most of the time, the formatter_ did _use dynamic
| programming and memoization. [...] It worked fairly well, but
| was a nightmare to debug._
|
| > _It was_ highly _recursive, and ensuring that the keys to the
| memoization table were precise enough to not cause bugs but
| not_ so _precise that the cache lookups always fail was a very
| delicate balancing act. Over time, the amount of data needed to
| uniquely identify the state of a subproblem grew, including
| things like the entire expression nesting stack at a point in
| the line, and the memoization table performed worse and worse._
|
| In TeX, paragraphs have each line of the same width (simple
| case) or can have a \parshape (in general), but these are
| "global" constraints that don't depend on what breaks you
| choose.
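
The "optimal substructure" argument for paragraphs can be sketched in Python. This is a simplified stand-in, not the Knuth-Plass algorithm itself: it scores each line except the last by squared leftover space and ignores hyphenation, stretch, and penalties. The function name and the sample words are made up. The detail to notice is that the memoization key is just the index `i` where the next line starts.

```python
from functools import lru_cache


def break_paragraph(words: list[str], width: int) -> list[str]:
    """Choose line breaks minimizing total squared slack (last line is free)."""
    n = len(words)

    @lru_cache(maxsize=None)
    def best(i: int) -> tuple[int, tuple[int, ...]]:
        """Return (cost, break indices) for the cheapest layout of words[i:]."""
        if i == n:
            return 0, ()
        best_cost, best_breaks = None, ()
        length = -1  # length of words[i:j] joined by single spaces
        for j in range(i + 1, n + 1):
            length += 1 + len(words[j - 1])
            if length > width and j > i + 1:
                break  # line too long; a lone overlong word is still allowed
            slack = max(width - length, 0)
            line_cost = 0 if j == n else slack ** 2
            rest_cost, rest_breaks = best(j)
            total = line_cost + rest_cost
            if best_cost is None or total < best_cost:
                best_cost, best_breaks = total, (j,) + rest_breaks
        return best_cost, best_breaks

    _, breaks = best(0)
    lines, start = [], 0
    for end in breaks:
        lines.append(" ".join(words[start:end]))
        start = end
    return lines


if __name__ == "__main__":
    for line in break_paragraph("aaa bb cc ddddd".split(), width=6):
        print(line)
```

For a code formatter, as the comment explains, the key would also have to carry the indentation of every still-open expression, which is why the article's memoization table kept growing.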
| erichocean wrote:
| > _I would expect a similar approach (progressively build a
| graph of the most promising breaking points) to be effective.
| Why wouldn't it be the case here?_
|
| That's...how he does it.
| raverbashing wrote:
| Whenever I read something like this, I'm struck that current
| languages (even the higher-level ones) are poor at expressing
| higher-level concepts like that in a practical way and capturing
| that complexity in an easily manageable form.
|
| One of the hardest parts of programming is understanding what's
| happening from reading code. And if you abstract too much "the
| traditional way" then it is just even harder to understand.
| r34 wrote:
| That's something that I was thinking about yesterday, while
| writing my code in PHPStorm. I thought how much easier those
| modern tools make programmer life, how intuitive they are and how
| hard it must have been to get to the current state of the art. Thanks
| for that, creators!
| Shadonototra wrote:
| language formatter/server with semantic analysis are the hardest
| thing ever
| thedatamonger wrote:
| Bravo! Well written and informative! And as someone who's
| obsessed with Dart at the moment, timely! Thanks!
| tgv wrote:
| Heuristics? That sounds like a job for machine learning, and I'm
| not being frivolous. I think that it is doable, and when it gets
| it wrong, the consequences are almost nil. It would at least make
| a decent graduation project.
|
| Let's not think of the spin-off: FaaS.
| munificent wrote:
| There is research into using ML for automated formatting.
| Personally, I'm not a fan. The heuristics are relatively simple
| and when hand-authored can be _explained_. Throwing ML at it
| discards explainability and risks really weird formatting
| decisions on edge cases for relatively little upside.
|
| My experience is that people prefer formatting that is:
|
| 1. Unsurprising.
|
| 2. Nice looking.
|
| 3. Simple.
|
| In roughly that order. Using ML might increase 2 but at the
| expense of 1 and certainly 3.
| iudqnolq wrote:
| Part of the complexity is if I format my code and commit it,
| and then you checkout and make a change, we don't want your
| formatter to have different opinions on how to format my code.
| For this you need (partial) stability at least across minor
| versions, which is harder with a less-explainable algorithm.
| dorianmariefr wrote:
| Related for JavaScript, Ruby, HTML and many more
| https://prettier.io
|
| And the creator of prettier/plugin-ruby worked on a pure-Ruby
| implementation https://github.com/ruby-syntax-tree/syntax_tree
| knorker wrote:
| Shouldn't this be marked 2015?
| checker659 wrote:
| This is the second or third article on the difficulty of line
| breaking I've read on HN just this week. Why isn't there a
| good *exhaustive* tome on the art of text editors / line breaking
| / text shaping / text rendering / text on the GPU etc.? I'd pay
| good money.
| xarope wrote:
| is this /s, in reference to latex/knuth and the precursor to
| yak shaving?
| bombcar wrote:
| There's not a single reference that I know of, at least not
| covering all aspects. An interesting place to start may be this
| PDF and its references:
| https://mirror.math.princeton.edu/pub/CTAN/info/memdesign/me...
|
| TeX famously does line breaking in a perhaps decent way - but
| it and text shaping become more of an art than a definite list
| of rules to follow.
| wheelerof4te wrote:
| So many formatters popping up. If the old-timers could do without
| them, I wonder about the usefulness of them.
|
| The best code formatter is _you_.
| sixstringtheory wrote:
| Just because they did without them doesn't mean they wouldn't
| have preferred to have had them, given the opportunity. And I
| really doubt you've surveyed them all to make such an
| authoritative call on this.
| runarberg wrote:
| It's not that formatters are _essential_ but they are extremely
| convenient. They don't just save you from formatting your own
| code, they also:
|
| * Prevent arguments on which formatting convention to adopt,
|
| * Save a peer reviewer from shallow comments if formatting
| conventions are broken,
|
| * Prevent whitespace-only commits from filling up the git
| history, and
|
| * Decrease the risk of senior developers imposing weird styles
| on the code base.
|
| Formatters _might_ help you write better code by freeing you
| from worrying about one aspect of coding, but--much more
| importantly--they help create and maintain a better culture
| around your code.
| Hendrikto wrote:
| Formatting is NOT the purely stylistic choice many code
| formatter authors make it out to be, though.
|
| It can greatly affect readability, lead to/prevent
| unnecessary merge conflicts, and aid in/stand in the way of
| using nice VCS features (blame, revert, bisect, cherry-pick,
| ...).
|
| Many automatic code formatters ignore and do not optimize for
| these metrics.
| sixstringtheory wrote:
| I've never seen a formatter cause merge or blame issues,
| but that could be because I always have them run on a
| precommit hook and even then always squash all commits on a
| PR so there is never a commit in mainline history solely
| for code format. I would not admit a PR solely for
| formatting, either.
|
| Git history trumps code style IMO, I would not e.g. add a
| formatter to a legacy codebase and then reformat
| everything, or do so after changing a rule. Only diffs get
| formatted.
| majkinetor wrote:
| They also:
|
| 1. Make programming more boring
|
| 2. Prevent me from organizing related thoughts on single lines
|
| 3. Prevent me from doing meaningful and more readable indentation
| in specific contexts (such as align on equal sign etc.)
|
| I don't like them, nor do I like any of the style checkers. It is
| as if somebody wrote a book and you gave it to GPT-3 to make it
| more readable for the entire world. Fascinating BS.
| sixstringtheory wrote:
| > _Make programming more boring_
|
| This is a terrible argument for anything to do with
| programming/code. It is pure opinion and preference and
| therefore is not falsifiable.
|
| If you're on a team of one, go wild. But if you're on an
| actual team trying to get things done, please don't bore
| the other people to death with the "interminably soul
| crushing debates over code formatting" as the article aptly
| puts it.
|
| The team wants to see exciting results, not "exciting"
| code. Code is a means, not an end.
|
| And especially if you're on the job, you are not being
| compensated with excitement, but money. Go seek excitement
| in your personal time.
|
| Formatters are not comparable to GPT because the code
| semantics have not changed, only the form. You just don't
| want to retrain yourself to read code that isn't written
| exactly to your liking. That's laziness.
| hhmc wrote:
| Most code formatters have pragmas to allow you to break out
| of the autoformatter, for when you really do need to
| override (your case 3).
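
As one concrete instance of such pragmas: Black, the Python formatter mentioned elsewhere in this thread, leaves code between `# fmt: off` and `# fmt: on` comments untouched. The sketch below shows the "align on equal sign" case; the variable names and values are invented for illustration.

```python
# fmt: off
# Hand-aligned on the equals sign; without the pragma, a formatter such as
# Black would normally rewrite these to single spaces around "=".
width      = 80
tab_size   = 4
max_depth  = 12
# fmt: on


def describe(limit: int, tab: int, depth: int) -> str:
    """Ordinary code outside the pragma region is formatted as usual."""
    return f"limit={limit} tab={tab} depth={depth}"


if __name__ == "__main__":
    print(describe(width, tab_size, max_depth))
```

Other formatters use different comment syntax for the same purpose, so check the documentation of whichever tool is in use.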
| cinntaile wrote:
| I can see why it is preferred for teams, if you don't work
| in teams you can decide for yourself of course.
| hhmc wrote:
| Will you also forgo antibiotics the next time you have a raging
| infection, or do you only appeal to antiquity _sometimes_?
| wheelerof4te wrote:
| Keep your straws, strawman. I don't want them, be they short
| or long.
| junga wrote:
| But you are not me and here we are.
| lordpankake wrote:
| Old-timers did without memory safety too, which is why I'm
| switching our company's stack over to good ol' C
| wheelerof4te wrote:
| Some of the older programming languages had memory safety. C
| is not the only old programming language.
| tomjen3 wrote:
| Do you also wonder about the usefulness of git?
| wheelerof4te wrote:
| Sometimes, yes.
|
| But I never worked in a big professional team before, where
| git's features shine.
|
| And git != code formatter.
| unfocussed_mike wrote:
| Preaching to the choir here I suspect, but if other lone
| developers are reading this:
|
| Git's features shine just as brightly, IMO, if you work for
| yourself.
|
| Granted a good number of them are often not very relevant
| to a lone developer, but if you work for or by yourself on
| any projects over the long term, you should build git (or
| some other distributed source control, but probably git)
| into the way you work.
|
| Not to excess, by any means. I use a very small subset of
| git's features and in practice many of my simpler projects
| don't even use branches, because it's not in the nature of
| the changes I am making to those projects or the time
| management I do.
|
| But for example you should consider git an essential step
| between dev and live -- using it to deploy -- and you
| should look at how you could use it to facilitate staging
| and testing.
|
| Combined with a changelog and relentless use of comments
| and notes, git helps "structured forgetting", which as a
| freelancer is pretty crucial; sometimes you work
| frantically on a thing for a month, get paid and then it
| comes back to you years later.
|
| > And git != code formatter.
|
| No, obviously, but the needs of the former are supported by
| the benefits of the latter.
|
| That said, I use one for golang but not, in general, for
| PHP. I should find a code-formatter I can bend to my will
| for PHP, but after 17 years of increasingly complex lone
| PHP development, I know what I need from my own formatting
| requirements in order to manage projects.
| scq wrote:
| In a team is also where code formatters shine, IMO.
|
| The biggest advantages (to me at least) are that they
| almost entirely eliminate formatting from code review, and
| the consistent style makes it easy to read and edit code
| written by different people.
| ZephyrBlu wrote:
| Git is insanely useful regardless of whether you're in a
| team or not. Having a history of your changes, being able
| to define an atomic change, branches, etc are all very
| useful even when working solo.
| tomjen3 wrote:
| I have never regretted issuing a git init command, but I
| have regretted not doing that.
|
| Granted when I am developing for myself, most/all commits
| are just going to master but even so.
| bombcar wrote:
| Git is how we store all the white space changes as the
| formatting wars rage.
|
| Apparently some actual code changes are in there but we've
| never found them.
| munificent wrote:
| There's two ways to look at how people of the past compare to
| people of today:
|
| 1. They didn't have X, so obviously we don't need X now.
|
| 2. They were the ones who _created_ X, so clearly they felt
| the absence of X was a problem to be solved.
|
| I'm inclined to believe 2 comes into play more often than 1.
| The present was created by those living in the past.
| maerch wrote:
| So many hours have been saved thanks to formatters. Not only
| writing it, but also countless hours in PR-reviews without
| senseless nitpicking.
|
| One of the best trends recently.
| judofyr wrote:
| > Not only writing it, but also countless hours in PR-reviews
| without senseless nitpicking.
|
| I've never understood this point. Programmers will _always_
| find something they can nitpick about. Ultimately you want a
| culture which focuses on the critical parts: Correctness of
| code, good test coverage, big-picture architecture which has
| long-term impact. Bringing an auto-formatter into the picture
| may reduce some senseless nitpicking, but you haven't
| actually done anything to solve the _real_ culture problem.
| If your team was getting blocked because people were arguing
| about _formatting_ you have bigger problems that won't be
| magically solved by adding an auto-formatter.
|
| To give some examples:
|
| A while back I worked with one person on a project where we
| had both Prettier and super strict ESLint, and I would still
| get PRs rejected because they wanted the code to be slightly
| refactored in a way which was entirely subjective and had no
| impact on the correctness (e.g. "flip this negation").
|
| And right now I'm working on a team where we explicitly tag
| some PR comments with "nitpick". This will _not_ block the PR
| from getting merged, but instead it's a way of saying "I
| prefer it this way, but it's not that important in the bigger
| scheme of things". This is also a signal that it's not
| something that we want to start a bigger discussion around.
|
| (We use auto-formatters and linters as they are very useful.)
| [deleted]
| mjevans wrote:
| The Dart formatter sounds really advanced, and reflects potential
| complexity of the language.
|
| I think I'd still prefer to see a formatter attempt to preserve
| any formatting which is already 'good enough' to pass as an
| output threshold. Code isn't just a recipe for a computer to do
| something, it's a language for explaining to other programmers
| what that thing is and what's important to the structure of
| accomplishing it. The choice of where to place a break can matter
| for cognition and can be almost as important as the printed
| characters for organizing thoughts.
| munificent wrote:
| _> reflects potential complexity of the language._
|
| The language is fairly complex syntactically and that
| definitely adds some cost to formatting.
|
| But I think much of the complexity comes from two things:
|
| 1. A lot of idiomatic Dart uses function literals in a
| block-like way, as in:
|
|     test(() {
|       expect(1 + 2, 3);
|     });
|
| 2. But Dart doesn't actually have trailing block argument
| syntax like Smalltalk, Ruby, and Kotlin. So the formatter has
| to look at the closures passed to an argument list and decide
| which ones look better using block formatting, versus regular
| argument list formatting like:
|
|     someFunction(
|         () { expect(1 + 2, 3); });
|
| Also, at the time I first wrote dartfmt, there was a lot of
| very nicely hand-formatted code in the wild that used different
| subtle layout choices to make different argument lists look
| nice. In order to persuade people to adopt the formatter at
| all, it had to be sophisticated enough to figure out many of
| those patterns and apply them automatically.
|
| It's not as good as a human (mainly because it doesn't have
| semantic context) but it had to be pretty close or people
| wouldn't have tried it.
|
| Now that it's well established, I think it would probably be
| possible to simplify how it formats while still making users
| happy. Possibly _happier_ because the results would be a little
| easier to predict.
|
| _> Code isn't just a recipe for a computer to do something,
| it's a language for explaining to other programmers what that
| thing is and what's important to the structure of accomplishing
| it._
|
| An automated formatter will never be as good as carefully
| crafted artisanal formatting. In particular, automated
| formatters don't know what stuff _means_. A good human might
| choose to line break a function call like so:
|
|     setColor(red: 123, green: 54, blue: 26,
|         alpha: 45);
|
| Because they know that "RGB" is a single coherent concept and
| alpha is less closely related. An automated formatter doesn't
| (and probably shouldn't) have that domain knowledge.
|
| But the value proposition of automated formatting is not just
| "how nice is the resulting code to read". You have to look at
| the total value proposition of completely yielding formatting
| to a tool versus allowing human control over it. When it's
| completely automated:
|
| 1. You can run it on generated code that contains absolutely no
| whitespace and still get nice output.
|
| 2. Humans can do large-scale refactorings, format, and get
| output that is consistent with the existing state of the
| codebase without having to understand any local style
| preferences.
|
| 3. Humans never have to spend time deciding how to format.
| Further, they don't even have to spend time deciding _if_ they
| should format.
|
| 4. When reading a random codebase, it is likely to be formatted
| in a style you are used to even if you have zero communication
| with that team. This is particularly important in open source.
|
| 5. The code looks familiar to you wherever you encounter it:
| IDEs, plain text editors, code review tools, blog posts,
| StackOverflow answers. As opposed to letting everyone pick
| their own style and relying on users to apply their preferred
| style locally, it's just always in a familiar style.
|
| 6. Like any automation, the tool doesn't make mistakes. Even
| very careful humans hand-formatting make more mistakes than
| they realize. (I know because I've looked at their code). Those
| mistakes can be distracting for readers.
|
| 7. It gets people out of the mindset of being nitpicky about
| style. It encourages them to stay focused on the structure and
| naming of their code, which is what really matters.
|
| 8. It eliminates style arguments in code reviews. Those take
| time and, worse, cause disharmony, for next to no benefit.
|
| I think it's very much worth accepting some small loss of overall
| formatting quality to get those in return.
| ramraj07 wrote:
| It might be true but as OP says in the article, the moment you
| have potential multiple ways to show the same piece of code,
| you're going to surface engineers with each opinion on the code
| review. Getting an opinionated formatter is the best way to
| bring engineers back to do what they really need to be doing in
| code review, which is review the code not its formatting. I'll
| never go back to Python without Black and isort!
| fxtentacle wrote:
| I feel like enforcing a specific text style is the wrong
| approach.
|
| In my opinion, different people have different formatting
| preferences and forcing someone to use the "wrong" one will
| lead to slower reading speed and the risk of overlooking
| errors.
|
| That's why I believe we should treat it like font or color
| choices. The IDE should display source code in the viewer's
| preferred style, so that each person sees what they expect.
| And then the actual source code formatting becomes
| irrelevant.
|
| Go already takes a good step in this direction by making the
| language's AST part of its core libraries. And opening
| Go source code in JetBrains' GoLand will show you additional
| annotations, spacing, etc. based on the parsed source code
| tree (and not based on the source code's text).
| coldtea wrote:
| > _In my opinion, different people have different
| formatting preferences and forcing someone to use the
| "wrong" one will lead to slower reading speed and the risk
| of overlooking errors._
|
| Not if everybody using the language is strongly forced (as
| with Go), since then those with "different formatting
| preferences" will eventually (and soon) just get used to
| the enforced style.
| duped wrote:
| The AST usually isn't enough; you want a CST (or what a few
| sources call a "full" syntax tree, preserving white space,
| comments, etc). I think .NET has the best implementation of
| this out there; unsurprisingly, they have incredible tooling
| support.
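The AST-versus-CST distinction above can be seen with Python's standard library (Python is used here only for illustration; the thread itself is about Dart, Go and .NET). A minimal sketch, assuming a one-line sample `SOURCE` and two hypothetical helpers, `ast_text` and `comment_tokens`: the `ast` module drops the comment, while the `tokenize` module still sees it.

```python
import ast
import io
import tokenize

# Sample input, made up for this illustration.
SOURCE = "x = 1  # the answer\n"


def ast_text(source: str) -> str:
    """Dump the abstract syntax tree; comments and spacing are not in it."""
    return ast.dump(ast.parse(source))


def comment_tokens(source: str) -> list[str]:
    """Collect comment tokens, which the lower-level token stream keeps."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]


if __name__ == "__main__":
    print("the answer" in ast_text(SOURCE))  # comment is gone from the AST
    print(comment_tokens(SOURCE))            # comment survives as a token
```

A formatter built only on the first view could not put the comment back, which is roughly why a tree that preserves comments and whitespace is wanted.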
| eyelidlessness wrote:
| For anyone interested in CST tooling outside of the .NET
| ecosystem, tree-sitter[1] is general purpose, _quite_
| fast, supports a wide range of grammars, and has bindings
| in quite a lot of environments (including WASM, so likely
| can be used anywhere with some effort).
|
| 1: https://tree-sitter.github.io/tree-sitter/
| duped wrote:
| Tree-sitter is absurdly complicated to use in a real
| project; a hand-written parser might be slower but it's
| way easier to implement and build.
| ramraj07 wrote:
| This seems like the arguments devs stuck on vim and Emacs
| keep making. "Ohhh I'm tired of moving my fingers from one
| key to another!!" [1] It's just code, if you're in a good
| team each PR is just a small diff, it's hard to believe
| that's somehow too complicated for an ostensibly good
| engineer.
|
| [1] Incidentally the only people I know with carpal tunnel
| are folks who are entrenched in command line text editors.
| Maybe there's some benefit to moving your arms around?
| Similarly maybe there's benefit in reading code in a
| different format, you might actually read it slower and
| hence comprehend it better.
| awild wrote:
| From experience I can tell that one can get used to other
| people's styles and just work with that oneself. Agreeing
| on an enforced style and sticking to it really reduces
| the amount of unnecessary discussions.
|
| I genuinely feel that focusing a lot on the detriment of
| someone else's or an established code style is a sign of a
| lack of team work.
| davemp wrote:
| Having a strictly (formatter) enforced style is actually
| how you allow people the freedom to use their preferred
| style.
|
| Everyone can just set up a pre-commit hook to automatically
| format the code back into the official style, then everyone
| is free to put the codebase back into their preferred
| style.
|
| Otherwise, if developers reformatted code to their
| preferred style regularly, it would create massive diffs.
| hiccuphippo wrote:
| And this is how git works with the different newline
| styles. It can convert them to a single enforced style on
| commit and convert them back when you pull changes.
| delusional wrote:
| I've never had that work correctly. Mostly it just gets
| in the way when the line endings actually matter (because
| my javascript has to load in IE7).
| fxtentacle wrote:
| I feel like I might have explained that badly. In my
| suggestion, the source code would be diffed as a
| machine-readable AST. That means all source code
| reformatting actions which do not change the meaning of
| the source code also do not appear in the diff.
|
| In this explanation about Go:
|
| https://golangdocs.com/golang-ast-package
|
| the text following "and we get a nice structure" is what
| the source code management tools would be working on.
| It's an abstract representation of the source code's
| meaning, but not tied to how things are formatted or
| indented.
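The idea of diffing on the tree rather than on the text can be sketched in Python with the standard `ast` module (an illustration only, not how any existing source control tool works). `same_meaning` is a hypothetical helper name; it treats two sources as equal when their parsed trees match, so pure reformatting produces no difference. Note that comments are not part of this tree, so a comment-only change would also go unnoticed.

```python
import ast


def same_meaning(a: str, b: str) -> bool:
    """True when both sources parse to the same abstract syntax tree.

    ast.dump() leaves out line and column numbers by default, so layout
    differences do not affect the comparison.
    """
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


# Three made-up versions of one statement.
compact = "total = add(1,2)\n"
spaced = "total = add(\n    1,\n    2,\n)\n"
changed = "total = add(1, 3)\n"

if __name__ == "__main__":
    print(same_meaning(compact, spaced))   # reformatted only
    print(same_meaning(compact, changed))  # an argument actually changed
```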
| ameliaquining wrote:
| I agree it would (maybe) be better if things had been
| designed this way from the beginning, but as it is it
| would be totally incompatible with all existing source
| code management tools, which is a nonstarter. Even if you
| can switch to a new editor or IDE, you still have to
| worry about the GitHub web UI and your error reporting
| tooling and who even knows what else.
| rowanG077 wrote:
| On paper I like this idea. But I can't shake the feeling
| that every non-text based format I have ever encountered
| sucks.
| chrisweekly wrote:
| Agreed! I use Prettier (driven by ESLint) in all my web
| projects (including enterprise clients', where I've helped
| introduce / improve / standardize tooling). IME it's a
| mistake not to use it.
| rowanG077 wrote:
| I'd really hate a formatter to attempt to preserve formatting
| if it's "close enough"(tm). Because what is "close enough"(tm)
| is hugely subjective, like formatting style itself. A formatter
| should just format to a common style. No funny business.
| CodesInChaos wrote:
| > The Dart formatter sounds really advanced, and reflects
| potential complexity of the language.
|
| I would not be surprised if a lisp formatter following the same
| philosophy as the dart formatter would be similarly complex,
| since the complexity comes from optimizing line length, not
| parsing.
| mbrock wrote:
| Incidentally I spent yesterday implementing a Lisp formatter
| with the algorithm described by Jean-Philippe Bernardy in the
| simple and elegant paper "A Pretty But Not Greedy Printer."
|
| https://jyp.github.io/pdf/Prettiest.pdf
|
| It's based on introducing choice between vertical and
| horizontal stacking. It avoids combinatorial explosion by
| pruning away strictly suboptimal choices. With just a few
| extra rules for Lisp forms, the results are quite good.
|
| I don't handle comments though, since I only need it for
| pretty printing values so far.
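For readers who want a feel for the horizontal-versus-vertical choice, here is a much simpler greedy sketch in Python. It is not the algorithm from the paper: it does not explore or prune alternative layouts, it just puts a form on one line if it fits and otherwise stacks the arguments. The names `flat` and `pretty`, the nested-list representation and the two-space indent are all assumptions made for illustration.

```python
def flat(expr) -> str:
    """Render a nested list as a single-line Lisp form."""
    if isinstance(expr, list):
        return "(" + " ".join(flat(e) for e in expr) + ")"
    return str(expr)


def pretty(expr, width: int = 40, indent: int = 0) -> str:
    """Greedy layout: horizontal if it fits in `width`, else vertical."""
    one_line = flat(expr)
    if not isinstance(expr, list) or not expr or indent + len(one_line) <= width:
        return one_line
    # Vertical layout: head stays on the first line, the rest is stacked.
    child_indent = indent + 2
    lines = ["(" + pretty(expr[0], width, indent + 1)]
    for item in expr[1:]:
        lines.append(" " * child_indent + pretty(item, width, child_indent))
    return "\n".join(lines) + ")"


if __name__ == "__main__":
    form = ["define", ["square", "x"], ["*", "x", "x"]]
    print(pretty(form, width=40))
    print(pretty(form, width=20))
```

One known shortcut in this sketch: closing parentheses appended to the last line are not counted against the width, which a real printer would have to handle.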
| virtualwhys wrote:
| > The Dart formatter sounds really advanced, and reflects
| potential complexity of the language.
|
| You got that right -- check out this epic GitHub thread on
| optional semi-colons [1], and the author's own comment on the
| subject in an HN thread [2]
|
| [1] https://github.com/dart-lang/language/issues/69
|
| [2] https://news.ycombinator.com/item?id=22706645
| mushyhammer wrote:
| > I think I'd still prefer to see a formatter attempt to
| preserve any formatting which is already 'good enough' to pass
| as an output threshold.
|
| That feels like Prettier sometimes. I think it leaves some
| objects on multiple lines or one line depending on how you
| leave them. But I'm not too sure.
| rendall wrote:
| There is tension between having a single source of truth for
| source code, including an opinionated formatter; and allowing
| individual programmers expressive facility to format and
| structure code in a way that makes most sense to them.
|
| I've been thinking lately of a formatter that would resolve
| this tension. It would be a local, individual formatter,
| complementary to the global, opinionated, "Prettier" formatter.
|
| When the source code was committed to remote for review, the
| opinionated formatter would do its thing, making sure the code
| was formatted properly according to whatever the team agreed.
| But locally, the code would remain however the developer liked
| it.
| hprotagonist wrote:
| a recipe for merge conflicts.
|
| i think maybe your only hope is to somehow "version the AST",
| and let formatting be a style sheet or something. But i'm 1.
| talking out my butt and 2. sure i'm missing something.
| fragmede wrote:
| It's a decent idea!
|
| One problem though is all of the non-code elements, aka
| comments and how they intentionally align up with the code
| (mixed with spaces and tabs), so the ideal format is the
| one the original programmer wrote it in, and the second
| programmer using a different stylesheet can't edit comments
| and have them end up pretty for the first programmer. (The
| solution, obviously, is to not comment any code >_< )
| Cullinet wrote:
| I agree with the decency of the idea
|
| But isn't this the nearest argument why we should start
| publishing negative results papers?
|
| The most important thing that I have ever learned was an
| announcement made by Google about how their industry
| consortium seeking to trade print magazine advertising
| inventory, failed with reference to the nature of this
| failure. No matter how lacking in detail this notice was,
| no matter that I was sent down a seven years solitary and
| very lonely path of mercifully ultimate discovery, on the
| heels of my startups exit collapse due to my cofounder
| tragically dying but we'd have been sunk by the problems
| indirectly revealed in that Advertising Age news item.
| However painful, and I'm talking about therapy for years
| after I emerged from reclusion, and real health issues
| giving up running and my diet turning to junk energy
| hits...I have learned more from being told "no can do Z
| because of y" and I seriously think that we /seriously
| have to start already getting over the reasons why we
| don't talk about our failures/. I mean holy smoke
| wouldn't we ever run out of good conversation ever again
| if we could have a good old banter and brew over our
| cockups? I'm thinking that this is how women are so much
| more successful in reproduction if they actually want to.
| What could we do if we tried?
| VBprogrammer wrote:
| Was this GPT-2 trying to write a HN comment?
| olvy0 wrote:
| I'm not sure. Their posting history has multiple posts of
| this type with disconnected sentences. With GPT 2 at
| least there is usually some continuity and a semblance of
| a shared context between the sentences.
| freedomben wrote:
| After reading several of the comments, I think OP is not
| a native English speaker, and that makes for some awkward
| grammar/sentence structure.
| andai wrote:
| I am reminded of Terry Davis.
| javbit wrote:
| If the display of the code is removed from its
| representation, I think the same could be done for
| comments. Comments could be kept as part of the AST and
| rendered how you like.
|
| E.g.
|
|     (+ 1 2) ; Add two numbers.
|
| Would become
|
|     (comment (+ 1 2) "Add two numbers")
|
| With semantics like const.
|
| Another would be
|
|     (+ ; Adding
|      1 ; one
|      2 ; and two.
|      )
|
| To
|
|     ((comment + "add")
|      (comment 1 "one")
|      (comment 2 "two"))
|
| You could display comments as popups, marginalia, or even
| in a traditional fashion (since some intent is captured
| by the comment scoping). You could also have different
| types of comments like annotation to have different kinds
| of display types.
| anchpop wrote:
| This is exactly what Unison is trying to do
| rendall wrote:
| > _let formatting be a style sheet_
|
| That's a good expression of the idea. If merge conflicts
| happen, we're doing it all wrong. Version control shouldn't
| even really be aware of this.
| Cullinet wrote:
| the most extreme validation of typographical precision is
| the banknote. Discuss?...
| hprotagonist wrote:
| tree-sitter and git: now kith!
| MeteorMarc wrote:
| Maybe a code formatter should be just brutally simple and
| predictable. Fearing the look of long, complicated statements,
| coders will shorten their statements and just do one thing per
| statement.
| criddell wrote:
| It also encourages shorter and often less descriptive names.
| That's not necessarily a good thing.
| eyelidlessness wrote:
| This really depends a lot on the language syntax, naming of
| built-ins, and common idioms. Like Java is notorious for its
| long lines because it's so verbose and long naming is the
| common convention. Lisps are notoriously far off to the other
| extreme. There's quite a lot of room between those extremes,
| occupied by languages like Python and Ruby. And then there's
| languages like JavaScript and especially TypeScript which
| span most of the range depending on preference.
| egberts1 wrote:
| Sheesh. That is hard. And that does NOT pale in comparison to me
| polishing the "tiny" Bash regex for removal of inline comments (as
| denoted by hash, semicolon or double-slash) in an INI-format
| (version 1.4 2009) file ... while ... and while permitting those
| same inline comment characters inside a quoted string (along with
| the rest of the string up to its matching closing single/double
| quote).
|
| I have a working regex (passed by many online regex testers) but
| in Bash? Not yet!
|
| To graderjs of HN, the author of the Dart code formatter, you have
| mad respect from me.
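One way around the regex, sketched here in Python rather than Bash, is a small character scanner that tracks whether it is inside quotes. This is an illustration under stated assumptions, not a full INI parser: `strip_inline_comment` is a hypothetical helper, backslash escapes are ignored, and `#`, `;` or `//` anywhere outside quotes starts a comment (so an unquoted URL would be cut at `//`).

```python
def strip_inline_comment(line: str) -> str:
    """Remove a trailing inline comment, leaving quoted strings alone."""
    quote = ""  # the quote character we are currently inside, if any
    for i, ch in enumerate(line):
        if quote:
            # Inside a quoted string: only the matching quote ends it.
            if ch == quote:
                quote = ""
        elif ch in ("'", '"'):
            quote = ch
        elif ch in ("#", ";") or line.startswith("//", i):
            # First comment marker outside quotes: drop the rest.
            return line[:i].rstrip()
    return line.rstrip()


if __name__ == "__main__":
    print(strip_inline_comment('name = "a;b#c" ; trailing note'))
    print(strip_inline_comment("path = 'x//y' // another note"))
    print(strip_inline_comment("key = value"))
```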
| jb3689 wrote:
| I've been the primary maintainer for a vim plug-in for a number
| of years. Dealing with indentation expectations of a modern,
| complex programming language without an AST parser (you need to
| be able to deal with code that doesn't compile) is one of the
| hardest problems I've had to work on. Dealing with string and
| comment detection, dealing with the constant influx of new
| features, keeping it performant and maintainable in a dedicated
| scripting language with bespoke debugging tools. The best
| approach I've come up with is to be ruthless about testing and
| use strict TDD for everything.
|
| Take a look at how complex the code is for something like
| vim-ruby to get a feel for what I'm talking about.
| axydlbaaxr wrote:
| So wise to share the failures and pitfalls, along with the
| successes.
___________________________________________________________________
(page generated 2022-03-05 23:00 UTC)