[HN Gopher] PAGNIs: Probably Are Gonna Need Its (2021)
       ___________________________________________________________________
        
       PAGNIs: Probably Are Gonna Need Its (2021)
        
       Author : thunderbong
       Score  : 133 points
       Date   : 2022-10-17 09:18 UTC (13 hours ago)
        
 (HTM) web link (simonwillison.net)
 (TXT) w3m dump (simonwillison.net)
        
       | bena wrote:
       | I like the YAGNI principle.
       | 
       | It doesn't mean "never do it". It means "don't try to anticipate
       | all your needs".
       | 
       | Because you can sit around and come up with scenario after
       | scenario to defend including almost everything.
       | 
       | Get it working. Worry about everything else after that is
       | accomplished. Will you eventually need a kill switch? Maybe. Do
       | you need it to build a proof of concept? You Ain't Gonna Need It.
       | 
       | Do you have enough results for pagination? Eventually. To build
       | the API for your first test? You Ain't Gonna Need It.
       | 
       | Because the rest of the sentence is "until you do". Eventually
       | you will need a lot of this stuff, but sometimes you actually
       | don't need it. Don't waste time fulfilling non-existent needs.
        
       | 0xbadcafebee wrote:
       | PAGNI is basically a form of Shift Left.
       | https://www.dynatrace.com/news/blog/what-is-shift-left-and-w...
       | https://devopedia.org/shift-left
        
       | hdjjhhvvhga wrote:
       | > A kill-switch for your mobile apps
       | 
       | Really? I understand the rationale but this is one of the main
       | reasons I prefer desktop apps where kill switches are relatively
       | rare.
        
         | georgyo wrote:
         | Are they actually that rare?
         | 
         | Both Slack and Discord at the very least have one, but that is
         | a small sample size.
         | 
         | Both Windows and Mac apps are very eager to self update. A
         | feature they don't have on mobile devices, which likely hides
         | these kill switches.
        
           | kiawe_fire wrote:
           | I think the reason they might appear rare is because the
           | number of desktop apps that depend on a remote server for
           | functionality are less common.
           | 
           | For example, a photo editing suite, word processor, or code
           | editor all have the expectation that they be usable entirely
           | offline, and any additional cloud-based features are coded to
           | fail gracefully.
           | 
           | As you point out, apps like Slack and Discord don't fall into
           | this camp. Not only is their core functionality expected to
           | be online, they are also Electron apps (basically, embedded
           | Chrome browser) that likely have a thin layer of client
           | logic, and load most resources remotely, just like accessing
           | a web page.
           | 
           | Sure, word processors and code editors exist on mobile, and
           | many of them likely work just like their desktop counterparts
           | when offline [1]. But that's not what most mobile apps
           | are. Most mobile apps would be useless without a remote
           | server, and if that remote server needs to make a breaking
           | API change, then you need a way to communicate that.
           | 
           | [1] I assume. It's been a while since I've used mobile MS
           | Word. Even then, I suspect these apps will prompt the user to
           | update or else lose cloud service at some point, even if they
           | otherwise continue to work fine locally.
        
         | sethhochberg wrote:
         | As much as any sane development team tries to avoid breaking
         | changes, they're virtually inevitable for a long-enough-lived
         | client/server platform.
         | 
         | A nice poison pill built in up front lets the ancient clients
         | that for whatever reason aren't being upgraded at least get a
         | nice informational prompt when those breaking changes happen,
         | instead of a potentially uglier failure mode
         | 
         | It's one of those things that you think you'll never need
         | because of course you wouldn't make breaking changes, but then
         | at some point an unstoppable force hits an immovable object and
         | this becomes the least-bad way out for user experience
        
           | ryandrake wrote:
           | From the user's point of view, I can't think of a _good_
           | reason to force users to update software, besides maybe a
           | security/privacy issue. As a user, I expect software that
           | works today to work tomorrow--forever. Every time I update, I
           | risk getting a new UI shoved down my throat, or my favorite
           | use case removed or broken, or the app has tripled in size
           | for no discernible reason, or it's now too slow for my
           | computer. I've been trained by software companies' repeated
           | bad behavior to not update voluntarily unless I have some
           | kind of assurance that it's purely a security update and no
           | functionality has changed.
           | 
           | Version your API, figure out how to make maintaining the old
           | versions zero cost, and don't throw users under the bus just
           | because they are (understandably) unwilling to update.
        
             | alkonaut wrote:
             | A mobile app is the same as a web page. I view them as much
             | more related to web apps, than to traditional desktop apps.
             | 
             | It's tied to a backend version. I do get annoyed with apps
             | that want to update every single time, but with auto
             | updates it's not that bad.
             | 
             | As many mobile apps are merely wrappers around a web app,
             | it's easy to see how this happens. But I think the end game
             | will be even more web-like: apps that indicate they must be
             | up to date and can't even be manually set to stick to one
             | version.
        
             | simonw wrote:
             | "besides maybe a security/privacy issue"
             | 
             | Right - that's enough of a reason for me to include a kill
             | switch!
        
         | thefreeman wrote:
         | It's not to kill the app entirely, it's to force the user to
         | upgrade to a maintained version rather than just having the app
         | crash if backwards compatibility is required to be broken. I
         | tend to do it with two flags, one to enable an upgrade nag
         | notification and another to hard kill switch the version.
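
The two-flag approach described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the flag table, the version strings, and the names `VersionFlags`, `FLAGS_BY_VERSION`, and `action_for` are all hypothetical, and a real service would load the flags from its own configuration rather than a hard-coded dictionary.

```python
from dataclasses import dataclass
from enum import Enum


class ClientAction(Enum):
    OK = "ok"        # run normally
    NAG = "nag"      # show a dismissible "please upgrade" notice
    BLOCK = "block"  # hard kill switch: show an upgrade prompt and stop


@dataclass(frozen=True)
class VersionFlags:
    nag: bool = False   # flag 1: enable the upgrade nag
    kill: bool = False  # flag 2: hard-kill this version


# Hypothetical server-side table, keyed by client version string.
FLAGS_BY_VERSION = {
    "1.4.0": VersionFlags(nag=True, kill=True),
    "2.0.0": VersionFlags(nag=True),
    "2.1.0": VersionFlags(),
}


def action_for(version: str) -> ClientAction:
    """Decide what a client should do; the kill flag takes priority."""
    # Assumption: versions missing from the table are allowed to run.
    flags = FLAGS_BY_VERSION.get(version, VersionFlags())
    if flags.kill:
        return ClientAction.BLOCK
    if flags.nag:
        return ClientAction.NAG
    return ClientAction.OK
```

A client would call something like `action_for("2.0.0")` at startup and show either a dismissible notice or a blocking upgrade screen, instead of crashing on an incompatible API.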
        
           | 1-more wrote:
           | Mad annoying when I cracked open an ancient iPad figuring I
           | could watch Netflix on it and oops nope. No indication of
           | why, but I know the answer because I'm the type of person who
           | comments on HN.
        
       | solatic wrote:
       | Controversial opinion:
       | 
       | Don't write tests first. If you're not writing tests, you don't
       | need CI to run tests, and CD isn't safe.
       | 
       | Write tests only:
       | 
       |   (a) if the algorithm is so complicated, that you need unit
       |       testing to ensure that you're writing it correctly in the
       |       first place
       |   (b) if, when, and after you earn revenue from the code
       | 
       | Why would you waste your time proving that something you wrote
       | works, if you don't have product-market fit, and you need to
       | throw it all out next week to try to satisfy the market with
       | something different? Just make sure you have the discipline to go
       | back and actually add edge-to-edge / integration testing after
       | you make the sale, and not go straight to the next feature,
       | leaving the lack of testing as technical debt.
        
         | yamtaddle wrote:
         | Some amount of testing can help you move faster.
         | 
         | ... _especially_ with some of the most-popular "let's shit out
         | a quick MVP" languages/frameworks, since they often lack static
         | typing, or _kinda_ have it but support is uneven.
        
         | avgcorrection wrote:
         | Why does having tests necessitate (in turn) getting a CI? You
         | can just run the tests yourself.
        
         | ParetoOptimal wrote:
         | > Why would you waste your time proving that something you
         | wrote works, if you don't have product-market fit,
         | 
         | What if your poor execution and untested software simply makes
         | it appear you don't have product-market fit?
        
           | solatic wrote:
           | You know, I really wish that a lack of automated tests
           | translated to poor execution and short-term business failure,
           | but having worked for multiple successful startups where that
           | is not the case, sadly, I was proven wrong. In the face of
           | evidence to the contrary, I changed my opinion.
           | 
           | Testing _needs_ to have business justification for it; "best
           | practice" is not good enough. That means doing it when
           | absolutely required for a feature's development, or tying it
           | to revenue on the books.
        
         | jerf wrote:
         | "Why would you waste your time proving that something you wrote
         | works, if you don't have product-market fit, and you need to
         | throw it all out next week to try to satisfy the market with
         | something different?"
         | 
         | Because somewhere around the 3-6 week mark, having decently-
         | tested code means I move _faster_, not slower.
         | 
         | And to be honest, I'm too effing old anymore to put up with the
         | endless loop of "oh, this change can't break anything -> ship
         | it -> oh crap, that broke like three things -> fix them -> oh
         | crap, that caused a regression over here -> fix it -> oh crap,
         | that fixed that but now this other thing has regressed -> oh
         | crap, this means my entire architecture is wrong which I would
         | have figured out much earlier if I hadn't been so busy putting
         | out fires". I've wondered how much people complaining about
         | burnout is from them living in that loop. I _far_ prefer my
         | tests to tell me about this than my customers.
         | 
         | Yeah, if you're literally bashing together code in one week, to
         | heck with tests. But I think even in the "startup" space that's
         | a minority. Anything that can be solved in a solid month of
         | coding is already commercially available with half-a-dozen
         | plausible open source alternatives anyhow.
        
           | solatic wrote:
           | > I far prefer my tests to tell me about this than my
           | customers.
           | 
           | Exactly, if your code is used by customers, then it's tied to
           | revenue, and you should write tests for it. See: (b)
        
             | AnimalMuppet wrote:
             | You missed at least part of jerf's point. If your program
             | is non-trivial, you're going to come out ahead from tests
             | _even before you ship_.
        
         | hinkley wrote:
         | I worked on a project where the architect kept talking about
         | the rewrite they were going to do next year. That was cute
         | until it wasn't.
         | 
         | It wasn't cute when he started pushing back on anyone making
         | material architecture improvements to the 'legacy' system. How
         | do you expect people to write better code on the new system
         | instead of the same bad code they're writing now? People don't
         | work that way.
         | 
         | Shortly before my contract was up, the person who approved his
         | rewrite got promoted out of the building. I'd bet anything that
         | his new boss didn't give him his rewrite.
         | 
         | Starting a project with no culture of testing, refactoring,
         | whatever, makes it harder to add it on later. And if you fail
         | to get it to stick, now you're in Sunk Cost territory, because
         | knowing everyone was going to say no early on could have let
         | you bow out before you got invested in a project that was going
         | to torture you.
        
           | solatic wrote:
           | > Starting a project
           | 
           | I have a feeling you're working from a different set of base
           | assumptions. If you're working for a company that is paying
           | multiple people's worth of salaries to deliver a project, a
           | project whose definition of done is "delivered to market" and
           | not "earning $X revenue", then yeah, if people are going to
           | be de-tasked after the project is up and migrated to other
           | projects, I absolutely appreciate that whatever project
           | manager is running that project should ensure that it has a
           | nice testing suite to prove that the project works. Behavior-
           | Driven-Development, ideally.
           | 
           | I just think that's a monumentally expensive way to develop
           | software, one that executives should have relegated to the
           | trash bin twenty years ago, after the Agile Manifesto. If
           | you're a private sector business then you want to develop
           | quality software _at the lowest cost._ Guess what? Automated
           | tests, in and of themselves, don't tell you whether you have
           | quality software! The only way you know whether your software
           | has any quality at all is whether _people pay money for it!_
           | Not whether software fits an arbitrary spec determined at the
           | beginning of the "project".
        
             | simonw wrote:
             | It sounds to me like you're arguing from a position where
             | tests are expensive and time consuming to write.
             | 
             | If you have the right habits and testing tools in place,
             | this doesn't need to be the case.
             | 
             | I write tests for even my tiniest projects: I estimate I
             | spend less than 20% of my coding time working on the tests
             | - and they give me more than a 20% boost in productivity
             | VERY quickly.
             | 
             | But... that's because every one of my projects starts with
             | a pytest test suite in place. Adding new tests to an
             | existing test suite is massively less work than adding
             | tests if there's no test suite and you have to spin one up
             | from scratch.
        
               | hinkley wrote:
               | I've seen it go off the rails though. I've seen people
               | exhibit traumatic responses to it.
               | 
               | Pain is information, but there are less helpful ways to
               | interpret it.
               | 
               | There are some architectural decisions I resist
               | energetically because they create a lot of consequences
               | for testing, with not a lot of value added compared to
               | other alternatives. If you introduce something that makes
               | writing code 30% harder you'd better be getting a lot of
               | profit margin out of the deal. Otherwise you're just
               | torturing people for no good reason
        
             | AnimalMuppet wrote:
             | You're using a different definition of "quality software"
             | that is orthogonal to the one that the rest of us are
             | using. Your definition is the one that isn't helped much by
             | automated tests. But for an actual product, you really want
             | both kinds of quality, including the kind that _is_ helped
             | by automated tests.
        
             | hinkley wrote:
             | Detasked is a definite problem, but I'm more concerned
             | about the reverse, which is being overwhelmed by customer
             | input. Once the party starts it's too late to change some
             | things. Everything is a priority queue and louder more
             | charismatic voices will push out your quality of life
             | concerns.
             | 
             | Edit: but in this thread what I'm talking about is skill.
             | The longer you wait to practice a skill that you will need
             | in abundance, the more likelihood that you will fail
             | outright. You need a long runway for operational
             | excellence.
        
         | beltsazar wrote:
         | > Why would you waste your time proving that something you
         | wrote works
         | 
         | How do you know it works if you don't test it?
        
           | solatic wrote:
           | Manually.
           | 
           | Writing automated tests with sufficient coverage takes, as a
           | rule of thumb, three times as many lines of code as the code
           | itself, not including test data. (I don't remember exactly
           | where I first learned this, I think this is out of
           | Microsoft?).
           | 
           | For small-enough codebases, with few enough engineers working
           | on those codebases, and without dedicated QA resources
           | attempting to break your code on a thousand device
           | permutations, simple manual checks, when needed, are _orders
           | of magnitude faster_ compared to writing and maintaining
           | automated test suites for code that _nobody is paying you
           | money for yet_ and therefore is _worse than worthless_ from
           | an accounting perspective, due to ongoing maintenance costs
           | of the code itself.
           | 
           | Once you have paying customers, then you need the tests to
           | protect company revenue and reputation. Once you decide to
           | scale, then pay for QA to try to break your code on a
           | thousand end-user devices. But not before!
        
             | hinkley wrote:
             | So the question is, are you going to test it manually?
             | 
             | Case study 1: The help functionality in the UI was broken
             | for months before anyone noticed, because people on the
             | team don't need the help. Integration tests were written to
             | exercise the help functionality. Problem solved.
             | 
             | Case study 2: 'We' insisted on having extensive unit tests
             | for our login logic. Getting code coverage on that
             | functionality ended up being easier said than done, and a
             | bunch of refactoring was necessary to make the tests not
             | suck balls. Said refactoring introduced a number of
             | regressions in the login logic. Every time the login
             | functionality broke, two engineers came to tell me or my
             | teammate that login was broken before the CI build had even
             | completed.
             | 
             | If the human is willing, faster than the automation, and
             | failures are unambiguous, what's ultimately the value of
             | those tests? Goodhart's Law says it's less than or equal to
             | zero.
        
               | solatic wrote:
               | > broken for months before anyone noticed
               | 
               | Great case in point. If nobody noticed, can you really
               | tie it to revenue? Did sales fail because prospects
               | couldn't use the help? Did customers churn because they
               | didn't want to pay for something broken? No? Then it
               | doesn't matter that it's "broken". Unless it affects the
               | bottom line somehow, even in the most tenuous way, then
               | from a business/financial perspective, arguably nothing
               | is actually "broken".
               | 
               | > Every time the login functionality broke, two engineers
               | came to tell me or my teammate that login was broken
               | before the CI build had even completed.
               | 
               | So, technically, somebody manually tested it, right?
               | Ideally, the executives should have been pressuring you
               | to do a better job and not force other teams to catch
               | your bad work, but it's fine if it's a rare-ish
               | occurrence.
        
               | hinkley wrote:
               | That depends on your delivery pipeline. At the time this
               | product was only shipping between teams on a new
               | initiative, so all of the users were experts. But things
               | like this cost you status once louder more influential
               | people start noticing.
               | 
               | The login one I've seen on two separate projects, only one
               | of which I was directly involved in. What was needed were
               | negative tests to make sure authorization or
               | authentication don't succeed when they shouldn't. I won't
               | begrudge those tests at all, and may insist on them
               | personally. But some parts of the code are dogfooded
               | hourly. If you can test it quickly and cheaply, by all
               | means do. But if the test system is slower and less
               | reliable, if chasing a metric makes you break things that
               | already "worked" you need to tap the brakes and think
               | about your strategy. Not this way, or maybe not at all.
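
A negative test of the kind mentioned above checks that login fails when it should, which dogfooding rarely exercises. The sketch below is a toy under stated assumptions: the in-memory `USERS` store, the `authenticate` function, and the credentials are hypothetical stand-ins for whatever the real login code exposes.

```python
import hashlib
import hmac


def _hash(password: str, salt: bytes) -> bytes:
    """Derive a password hash (low iteration count, for the demo only)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 1000)


# Hypothetical in-memory user store: username -> (salt, password hash).
_SALT = b"demo-salt"
USERS = {"alice": (_SALT, _hash("correct horse", _SALT))}


def authenticate(username: str, password: str) -> bool:
    """Return True only for a known user with the matching password."""
    record = USERS.get(username)
    if record is None:
        return False
    salt, expected = record
    # Constant-time comparison of the derived hashes.
    return hmac.compare_digest(_hash(password, salt), expected)


def test_login_rejects_bad_credentials():
    # Negative cases: each of these must NOT succeed.
    assert not authenticate("alice", "wrong password")
    assert not authenticate("alice", "")
    assert not authenticate("mallory", "correct horse")


def test_login_accepts_good_credentials():
    # One positive case so the negative tests cannot pass vacuously.
    assert authenticate("alice", "correct horse")
```

Written as plain `test_*` functions, these can be collected by pytest or simply called by hand.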
        
           | im3w1l wrote:
           | Testing it manually, probably.
        
         | layer8 wrote:
         | Your unit tests shouldn't need CI to run, period. Only
         | integration tests _might_ need to, but also not necessarily
         | (the "continuous" part, that is).
        
         | AYBABTME wrote:
         | The most obvious missing one:
         | 
         |   (c) You start wasting a lot of time fixing the features you
         |       previously shipped but broke later.
         | 
         | That is, when you start to waste so much time manually testing
         | stuff, and realize that something you implemented worked two
         | weeks ago but now doesn't, because someone forgot to check that
         | their new work didn't break the old one.
        
           | solatic wrote:
           | Why are you keeping around features that aren't tied to
           | revenue? Either write tests for the features, or delete the
           | features. The easiest code to test and maintain is no code at
           | all.
        
             | AYBABTME wrote:
             | Okay? Some products take more than just a few lines of
             | code and a few days to build. Absolutist opinions aren't
             | useful.
             | 
             | Also, just about everything can be construed as being tied
             | to revenue. Not a useful heuristic.
        
           | hbrn wrote:
           | Honestly, in most cases testing is a band-aid on top of a
           | bigger problem. IME things usually break due to complexity,
           | not due to lack of tests.
           | 
           | But of course, tackling complexity is hard. It's much easier
           | to sprinkle the big ball of mud with tests and set it in
           | stone.
           | 
           | If you feel like you're wasting time manually testing, it
           | probably means that the blast radius of your changes is too
           | big. You should prefer smaller rockets over your head, not
           | bigger bunkers to hide in.
        
             | AYBABTME wrote:
             | Some stuff just takes time to manually test, e.g. if you
             | provision multiple cloud resources and must coordinate
             | them. Or anything involving some sort of multi step
             | process. If you want to ship quickly, you want a test suite
             | to save you the trouble of doing an hour of manual
             | testing for each change you're adding. Delegating the
             | verification to a computer, so that my small team can move
             | faster on the other things we need to build, is a no-
             | brainer. At least, once the "cost to manually test" becomes
             | greater than "cost to implement some tests and setup CI".
        
         | postalrat wrote:
         | It's a bit disappointing that the general consensus seems to be:
         | more tests is always better.
         | 
         | A lot of people are fooling themselves about the time needed to
         | build and maintain tests vs. the benefits of having those
         | tests.
         | 
         | And anyone questioning their value is seen as some lazy
         | outsider who just doesn't get it.
        
           | hinkley wrote:
           | Nothing is more expensive than unmaintainable tests.
           | 
           | What I tend to want is unit tests that are cheap. Cheap to
           | run, cheap to read, cheap to write. If they're properly
           | simple I can rewrite them. If the requirements changed I can
           | delete them and write new ones. If you get your testing
           | pyramid right you also know that _most_ of your tests are cheap.
           | 
           | The more business logic that leaks up into your functional
           | and integration tests, the more important it is that you are
           | confident that the rules you have are the correct rules.
           | Which leads into the problem space of GP.
           | 
           | More information is better, but more tests often give you the
           | same information, or conflicting and therefore less
           | information. Tests are about confidence. Can we ship this,
           | did we break stupid things? Repeating an action for the sole
           | purpose of soothing yourself is a form of OCD. It's not
           | healthy.
        
             | postalrat wrote:
             | There are a lot of things I'd love to see change in the
             | testing world.
             | 
             | One would be a way to write a second implementation that is
             | used to test or generate tests. For example if I want to
             | test a function that adds two numbers ("add") it would be
             | really nice to be able to write: test(add, (a, b) => a + b)
             | 
             | IMO tests kinda boil down to write it twice and hope you
             | got it right once.
        
               | hinkley wrote:
               | You might get a kick out of property based testing.
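
The `test(add, (a, b) => a + b)` idea above, checking an implementation against a second reference implementation on generated inputs, can be sketched with only the standard library. Everything named here (`add`, `check_against_reference`, `check_commutative`, the input range, the trial count) is a hypothetical illustration. Property-based testing libraries such as Hypothesis automate the input generation and also shrink failing cases; this sketch does neither.

```python
import random


def add(a: int, b: int) -> int:
    """Implementation under test, deliberately written differently from a + b."""
    return a - (-b)


def check_against_reference(impl, reference, trials=1000, seed=0):
    """Compare impl with a reference implementation on random integer pairs."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        a = rng.randint(-10**6, 10**6)
        b = rng.randint(-10**6, 10**6)
        got, expected = impl(a, b), reference(a, b)
        assert got == expected, f"mismatch for ({a}, {b}): {got} != {expected}"
    return trials


def check_commutative(impl, trials=1000, seed=1):
    """A property that needs no reference: impl(a, b) == impl(b, a)."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.randint(-10**6, 10**6)
        b = rng.randint(-10**6, 10**6)
        assert impl(a, b) == impl(b, a), f"not commutative for ({a}, {b})"
    return trials


check_against_reference(add, lambda a, b: a + b)
check_commutative(add)
```

The reference comparison is the "write it twice" approach; the commutativity check shows the other half of property-based testing, where you assert a rule that must hold for all inputs without knowing the exact answer.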
        
         | john-tells-all wrote:
         | Absolutely -- Tests are a _business_ investment, and a
         | _business_ expense.
         | 
         | A question is: "what is the _business_ consequence if this code
         | doesn't work as expected?" If the code is frontend stuff, then
         | maybe tests won't pay off.
         | 
         | If the code is "calculate total and taxes for user buying
         | products", then yes, you probably _do_ want to write tests. If
         | you don't write them first, then they might not get done until
         | there's a "big bang" of "gee we need to write tests for the
         | last 10k lines of code over the weekend". Or worse: code might
         | or might not be correct in the past, but you can't tell how
         | many users were affected.
         | 
         | Tests are a business investment, and it's worth a discussion to
         | find out risk tolerance.
        
       | wgerard wrote:
       | My personal addition: Code isolation
       | 
       | I've "prematurely" split up code across multiple classes/functions
       | so many times before these huge god functions became
       | set-in-stone and unwieldy.
       | 
       | I had a co-worker once who was extremely heavy on YAGNI--and I
       | don't mean that as a pejorative, it was a really helpful check
       | against code solving non-existent problems. They once called me
       | out on a refactor I'd done surreptitiously for a payments flow to
       | split payment states into multiple classes. Serendipity had
       | struck, and I was able to point to a submitted (but not accepted)
       | diff that would've broken the entire payments flow (as opposed to
       | just breaking one state of the flow).
       | 
       | I always think about that every time I question whether I'm
       | isolating code prematurely.
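
As a rough illustration of splitting payment states into separate classes, the sketch below gives each state its own class that handles only its own transitions. The state names, event names, and the `run` helper are all hypothetical; the comment above does not describe the actual payment flow.

```python
class PaymentState:
    """Base class: each concrete state handles only its own transitions."""
    name = "base"

    def on_event(self, event: str) -> "PaymentState":
        # Default: any event a state does not handle is an error.
        raise ValueError(f"{self.name}: unexpected event {event!r}")


class Pending(PaymentState):
    name = "pending"

    def on_event(self, event: str) -> PaymentState:
        if event == "authorize":
            return Authorized()
        if event == "decline":
            return Failed()
        return super().on_event(event)


class Authorized(PaymentState):
    name = "authorized"

    def on_event(self, event: str) -> PaymentState:
        if event == "capture":
            return Captured()
        if event == "void":
            return Failed()
        return super().on_event(event)


class Captured(PaymentState):
    name = "captured"  # terminal state: inherits the "reject everything" default


class Failed(PaymentState):
    name = "failed"  # terminal state


def run(events):
    """Feed a sequence of events through the flow, starting from Pending."""
    state = Pending()
    for event in events:
        state = state.on_event(event)
    return state.name
```

With this layout, a faulty change to `Authorized.on_event` can only break transitions out of the authorized state, whereas the same mistake inside one large function could take down every state at once.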
        
         | zzbzq wrote:
         | To be fair, what you're championing for PAGNI is the poster
         | child for YAGNI, so you're basically against YAGNI.
        
         | quietbritishjim wrote:
         | My rule for this is that:
         | 
         | * If breaking up the code makes it easier to understand _with
         | its current functionality_ then absolutely go ahead.
         | 
         | * If breaking up the code makes it slightly less easy to
         | understand right now, but will make it easier to add some
         | feature later that you know you're definitely going to
         | need, then hold off.
         | 
         | Most of the time you "know" you're going to add a feature, you
         | turn out to be wrong, and (worse than that) some other slightly
         | different feature is needed instead that's actually harder to
         | add because you broke up the logic along the wrong axis.
         | Whereas, breaking code up purely based on what makes it easier
         | to understand now, ironically, usually does a much better job
         | at making it easy to add a feature later - it often almost
         | feels like an accident (but it's not, really).
         | 
         | I bet your payment flow refactoring made things a bit easier to
         | understand, even if it had never been worked on again.
        
           | jsbg wrote:
           | > If breaking up the code makes it slightly less easy to
           | understand right now, but will make it easier to add some
           | feature later that you know you're definitely going
           | to need, then hold off.
           | 
           | This. Your code shouldn't be documenting what it will do in
           | the future: it should be clear what it's doing right now.
           | Otherwise some other engineer maybe from another team will
           | come upon it, misread it, and bugs ensue. Trying to save
           | yourself time in the future ends up wasting time for others,
           | and likely for yourself as well.
        
             | WorldMaker wrote:
             | This relates to "Rule of 3 Before Abstracting": One
             | concrete instance of a proposed abstraction is YAGNI
             | (adding an abstraction blurs the concrete code and makes it
             | harder to understand), two concrete instances are
             | _coincidence_ (it's maybe a sign that you need an
             | abstraction between the two, but you don't have enough
             | scientific data to consider), at least three concrete
             | instances is (finally) _a pattern_ (abstracting a known
             | pattern is right and good because it becomes a document
             | of the high level pattern itself and the differences and
             | distinctions between the concrete implementations become
             | more distinct and obvious).
             | 
             | That "Rule of 3" is sometimes a good reminder itself not to
             | build for some possible future of the codebase, but to
             | build for the patterns in your code base as they exist
             | _today_.
        
         | 3pt14159 wrote:
         | Premature class isolation: Usually bad.
         | 
         | Some notable exceptions are things like addresses where many
         | times users have multiple, but usually people splitting up the
         | user table and the "user profile" table are just causing
         | headaches. A wide user table is fine.
         | 
         | Premature method or function isolation: Usually good.
         | 
         | Even if there isn't reuse, the code is usually more readable.
         | 
         |     def let_in_bar?
         |       return self.is_of_age_in_region? && self.is_sober?
         |     end
         | 
         | Is a perfectly fine isolation with very little downside.
        
           | quietbritishjim wrote:
           | I know your example is just a toy, but it definitely reminds
           | me of heavy handed over-refactoring into tiny one-line
           | functions.
           | 
           | There are two problems with that approach:
           | 
           | The first is when you're trying to understand what a single
           | function does. If it's 20 lines long but has self-contained
           | logic, that's often easier to understand than referring back
           | to the tiny pieces it's assembled from (even if they're
           | supposedly so self contained you don't need to look at them
           | individually - that's rarely true in practice). On balance,
           | even if a function is slightly too long, that's usually
           | less bad than one that's in slightly too many little pieces.
           | 
           | The second is when you're looking at the file level. When
           | faced with a file with dozens of tiny functions, it's much
           | harder to get a top level understanding than if it has a
           | smaller number of longer functions.
           | 
           | The second one is the bigger problem, because understanding a
           | project at the file (and higher) level is typically much
           | harder than understanding what a single function does. If
           | breaking up a function into smaller pieces does actually
           | make it easier to understand, but makes the overall project
           | harder to parse, then that's usually a loss overall. Of
           | course, it depends very much on the specifics of the
           | situation though, and certainly a long function broken down
           | into logical pieces can actually make the overall project
           | easier to understand.
        
             | jsbg wrote:
             | > over-refactoring into tiny one-line functions
             | 
             | It's relatively rare to refactor a single line of code into
             | its own function, unless it's really hard to read, e.g.
             | some nasty regex.
             | 
             | > even if they're supposedly so self contained you don't
             | need to look at them individually - that's rarely true in
             | practice
             | 
             | If a long function is broken up into smaller functions only
             | for the sake of having smaller functions, then you end up
             | with functions that don't really make sense on their own
             | and, sure, the bigger function is better. But if the broken
             | up functions are named well enough like in the example
             | above, then it shouldn't be necessary to see how it's
             | implemented, unless you're tracking down a bug. After all,
             | it's very rare to look at e.g. how a library function is
             | implemented, and for some libraries/languages it's not even
             | possible.
             | 
             | > When faced with a file with dozens of tiny functions,
             | it's much harder to get a top level understanding than if
             | it has a smaller number of longer functions.
             | 
             | Most languages have facilities to help with that, e.g.
             | function visibility. Your smaller number of large functions
             | can remain the public API for your module while the
             | "dozens" of tiny functions can be private. In either case,
             | long files are harder to take in no matter how many
             | functions they're composed of.
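A minimal sketch of the "public API plus private helpers" idea, assuming Python, where privacy is a naming convention (leading underscore plus `__all__`) rather than an enforced rule. The function names echo the toy bar example upthread; the age table and the drink threshold are made up for illustration.

```python
"""Sketch: one public function built from private, well-named helpers."""

__all__ = ["let_in_bar"]  # the only name callers are meant to import

# Illustrative values only (assumptions, not taken from the thread).
LEGAL_AGE_BY_REGION = {"US": 21, "UK": 18}
MAX_DRINKS = 3  # made-up sobriety threshold


def _is_of_age(age, region):
    # Private by convention: the leading underscore marks an implementation detail.
    return age >= LEGAL_AGE_BY_REGION[region]


def _is_sober(drinks_so_far):
    # Private helper with a hypothetical rule.
    return drinks_so_far < MAX_DRINKS


def let_in_bar(age, region, drinks_so_far):
    """Public entry point; reads like the sentence it implements."""
    return _is_of_age(age, region) and _is_sober(drinks_so_far)
```

Someone skimming the file only has to read `let_in_bar`; the helpers are there when a bug needs tracking down.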
        
       | z3c0 wrote:
       | I'd add that this extends to data processes as well. Lean
       | pipelines with rigid schemas are great and all, but you're really
       | going to regret not having something more malleable once the
       | system has grown and you have an onslaught of requests for new
       | fields to be added to X solution.
       | 
       | Blob storage is cheap. Just dump the whole database and sift
       | through it later.
        
       | swyx wrote:
       | my submission for PAGNI: Preemptive Pluralizations
       | https://www.swyx.io/preemptive-pluralization
       | 
       | Before you write any code -- ask if you could ever possibly want
       | multiple kinds of the thing you are coding. If yes, just do it.
       | It is a LOT easier to scale code from a cardinality of 2 to 3
       | than it is to refactor from a cardinality of 1 to 2.
       | 
       | Example:
       | 
       | - You assume that one team has many users, and correspondingly,
       | one user belongs to one team.
       | 
       | - Eventually, you find that a user may need multiple teams. This
       | is actually fantastic for your business!
       | 
       | - But you are depressed because you now have to spend 2 months
       | refactoring every line of code and database schema that assumed
       | the one-to-one mapping
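One way to picture the pluralization advice, as a rough sketch: model team membership as a join table from day one, even while every user has exactly one team. The table and function names here are hypothetical, and `sqlite3` is only a stand-in for whatever database is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Join table instead of a users.team_id column:
    -- a user may belong to many teams later without a schema change.
    CREATE TABLE memberships (
        user_id INTEGER NOT NULL REFERENCES users(id),
        team_id INTEGER NOT NULL REFERENCES teams(id),
        PRIMARY KEY (user_id, team_id)
    );
""")


def teams_for_user(conn, user_id):
    """Always returns a list, even while each user has exactly one team."""
    rows = conn.execute(
        "SELECT t.name FROM teams t "
        "JOIN memberships m ON m.team_id = t.id "
        "WHERE m.user_id = ? ORDER BY t.name",
        (user_id,),
    ).fetchall()
    return [name for (name,) in rows]


conn.execute("INSERT INTO teams (id, name) VALUES (1, 'core'), (2, 'infra')")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'ada')")
conn.execute("INSERT INTO memberships VALUES (1, 1)")  # today: one team
conn.execute("INSERT INTO memberships VALUES (1, 2)")  # later: a second team, no refactor
```

Callers that already loop over `teams_for_user(...)` keep working when the second membership row appears.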
        
         | vaidhy wrote:
         | The more general rule I follow is that there are only three
         | cardinalities in software - 0, 1, and many. Default should be
         | many. Any other limit you impose is likely to be incorrect.
        
           | swyx wrote:
           | also a list of known things that are cardinality 1 could be
           | valuable, since people coming across this advice might
           | swing too far over to the always-many side
        
           | kragen wrote:
           | This is appealing, but taking this advice seriously means
           | that all your relationships are many-to-many, until you
           | optimize some of them to be one-to-many or one-to-one.
           | 
           | This clashes extremely strongly with the currently popular
           | approaches to software development (OO, procedural, Lisp/ML-
           | style functional, SQL, etc.), which have an extremely strong
           | default of cardinality 1, inflicting lots of extra work on
           | you for many-to-many relationships. You'd never have a member
           | of a struct or a class start out as an int or a string; they
           | would always be int lists, string lists, etc. Points would
           | have a list of X coordinates rather than a single X
           | coordinate; interpreters would interpret a list of programs;
           | 3-D objects would have a list of transform matrices rather
           | than a single transform matrix; files would have a list of
           | owners.
           | 
           | (And of course in languages like C and Golang when you nest a
           | Foo object inside a Bar object, you are foreclosing the
           | possibility of that Foo having the same relationship with
           | some other Bar object as well; you need pointers.)
           | 
           | Systems that do support that kind of thing in one way or
           | another include Prolog, miniKANREN, the APL family, and Pick.
           | But Prolog and APL only support it to a very limited extent,
           | and I've never used Pick or miniKANREN.
        
             | vaidhy wrote:
             | I am not sure I understood your arguments because of where
             | these get applied. I generally mean to apply these when we
             | talk about business constraints - in my field of supply
             | chain optimizations, these kinds of things come up often.
             | How many trucks do you plan between nodes? If the answer is
             | 2, it means many. Being explicit about 0 and 1 means that
             | if that constraint is broken, it calls for a more thorough
             | code analysis, as it would lead to some unexpected breaks
             | in various systems.
        
               | kragen wrote:
               | I certainly agree that 2 should be "many".
        
           | swyx wrote:
           | what does a cardinality of 0 mean? i dont even think an
           | attribute should be on the schema if the cardinality is 0? or
           | am i experiencing a brain fart and missing something obvious
        
             | Sohcahtoa82 wrote:
             | This comment reads like it's trying to disguise pedantry as
             | confusion, which is an obnoxious trend on HN.
             | 
             | Keeping with the original "user on a team" example,
             | allowing a cardinality of zero means that you support a
             | user not being on a team at all. This may be a
             | bastardization of the definition of "cardinality", but I
             | think you understood that.
        
               | swyx wrote:
               | no i didnt, and it was a sincere question. im the guy who
               | wrote the post and i didnt understand cardinality zero
               | lol.
        
               | vaidhy wrote:
               | It is what swyx said.. cardinality not in the
               | mathematical sense.. but the fact it does not exist,
               | there is exactly one instance of it or there are many
               | instances of it. Going from one to another means
               | refactoring/thorough testing as there would be side
               | effects all through the system.
        
       | kotlin2 wrote:
       | I don't think you'll need most of these things. My metric would
       | be whether or not your app would continue to function without
       | these things. Using this metric:
       | 
       | Kill switch: may be required, but unlikely.
       | 
       | Automated deploys: definitely not necessary. Your app can still
       | function if you deploy manually.
       | 
       | CI: definitely not necessary. Your app can still function without
       | automatically run integration tests. It can run without any
       | integration tests at all.
       | 
       | API pagination: depends on the API.
       | 
       | Detailed API logs: not necessary. You can add logging as problems
       | arise. Even when I add detailed logging ahead of time, I often
       | find I missed the one crucial piece of information needed to make
       | using the logs possible during an investigation.
       | 
       | SQL query dashboard: not necessary. You can easily use a script
       | interface or CLI that stores history.
        
         | jaywalk wrote:
         | The article describes precisely why proactively adding these
         | things is (probably) necessary. For example, the kill switch:
         | yeah, you hope you won't need it. But if the time comes that
         | you do, it's too late to add it.
        
           | kotlin2 wrote:
           | You can use that argument when adding anything.
        
         | xapata wrote:
         | > I often find I missed the one crucial piece of information
         | needed to make using the logs possible during an investigation.
         | 
         | This may be a bit of confirmation bias, or ... I'm not sure
         | what the right term is. The times you didn't have the necessary
         | information stick out in your memory, because they were serious
         | problems. The times you did have the right information logged
         | you solved the problem quickly and it never became a fire.
        
       | muglug wrote:
       | > having a mechanism that provides detailed logs--including the
       | POST bodies passed to the API--is invaluable.
       | 
       | He includes a caveat later to avoid logging PII, but this should
       | have a flashing red warning sign attached.
        
         | adamisom wrote:
         | Is a logging service/pipeline that removes* PII the only real
         | high-level solution?
         | 
         | (* or 'does its best' to remove PII; I feel like there's an
         | immovable object / unstoppable force thing if the answer is "0%
         | PII _ever_!" because that conflicts with the need to log as
         | many things as you can)
        
           | bob1029 wrote:
           | We do explicit redaction of logs forwarded to us so that we
           | don't have to see any end-customer information. The redaction
           | process occurs automatically in the customer's secure
           | environment, so theoretically it is clean from our
           | perspective.
           | 
           | This is indeed a very tricky process, but we have it close
           | enough to make regulators happy. Most of our log information
           | is stored in SQLite and XML, so we can do a lot of parsing
           | magic to achieve determinism.
        
           | 0xbadcafebee wrote:
           | You shouldn't log as many things as you can. You should only
           | log useful things, when they're needed, along with context.
           | Otherwise you have too much noise and not enough signal, and
           | it's easier to leak PII.
           | 
           | Applications should have a logging interface that can
           | identify and conceal sensitive information, and data stored
           | in the app/database/etc should have a type that can be
           | denoted as sensitive. You will eventually also want to reduce
           | the amount of logs you generate, so it's useful to have
           | logging levels the same way operating systems do.
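A rough sketch of the "sensitive type" idea, assuming Python's standard `logging` module: wrap sensitive values in a type whose string form is always redacted, so they cannot leak through ordinary log formatting. The `Sensitive` class and `log_signup` function are hypothetical names for illustration, not part of any library.

```python
import logging


class Sensitive:
    """Wraps a value so that formatting it never reveals the contents."""

    def __init__(self, value):
        self._value = value

    def reveal(self):
        # The only way to get the raw value back; easy to grep for in review.
        return self._value

    def __repr__(self):
        return "<redacted>"

    __str__ = __repr__


logging.basicConfig(level=logging.INFO)  # log levels: DEBUG lines are dropped here
log = logging.getLogger("api")


def log_signup(email):
    """Log a signup event; `email` is expected to be a Sensitive wrapper."""
    line = f"signup email={email}"
    log.debug("full request context would go here")  # suppressed at INFO level
    log.info(line)
    return line
```

The point of the design is that the safe behaviour is the default: a developer has to call `reveal()` on purpose to see the raw value.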
        
           | [deleted]
        
         | rfrey wrote:
         | I wonder about this a bit. If I'm already storing all the PII
         | in my database, probably on other people's computers via a
         | managed database or at least a cloud server, why are logs
         | different? Is it because they don't receive the same security
         | scrutiny, e.g. any dev can view them versus having special
         | permissions for the database?
         | 
         | Or is there more there than authorization that I should be
         | thinking about?
         | 
         | (Obviously I'm not talking about logging cleartext passwords,
         | which don't belong in the database either.)
        
           | michaelmior wrote:
           | I believe this also plays into "right to be forgotten
           | legislation" such as in the GDPR. Suppose you log PII and
           | then a user requests all their PII be deleted. I believe you
           | would then also be required to scrub those logs instead of
           | just dropping records from a database.
        
           | throwaway74829 wrote:
           | Permissions is the big one.
           | 
           | Second is potential for DDoS (either intentionally or
           | unintentionally).
           | 
           | Third is possibility of "oopsies" via me intentionally or
           | unintentionally including my passwords/sensitive info/what
           | have you in the POST body. Now you have to add a branch to
           | look for and scrub sensitive info in your logger -- otherwise my
           | PII has now been logged (and if I were a massive asshole
           | looking for a quick payday, I could throw up a fuss).
           | 
           | It's fine in dev, but shouldn't be in prod. Too much
           | liability.
        
           | oandhjakk wrote:
           | I believe the main concern here, from dealing with clients
           | that have mandated no PII in logs, is both authorisation as
           | well as control. If it's in your service logs, then it could
           | be in your Splunk logs, it could be in a storage repository
           | and it could be in your requests that you send to the service
           | provider to troubleshoot some issue.
           | 
           | Unless there is a valid use case for logging PII (and I can't
           | think of any which can't be engineered around) then I think
           | it's best to avoid it in principle.
           | 
           | I think of it as the same as logging passwords, keys or
           | tokens.
        
       | jjice wrote:
       | > A bookmarkable interface for executing read-only SQL queries
       | against your database
       | 
       | We use Metabase at my work and I love it. It handles
       | visualizations out of the box and I can write SQL like my heart
       | desires, while still being able to help out our designers,
       | customer success, and sales needs. Their "questions" (what they
       | call a query) can be parameterized and added to dashboards and
       | everything works so well.
       | 
       | Also one of the few large projects I see that's written in
       | Clojure.
        
         | jmull wrote:
         | This wouldn't be possible in a lot of the projects I work on.
         | 
         | The problem is privileges -- for example, the data you can see
         | depends on whether you're logged in or not, and your group and
         | role when logged in.
         | 
         | What I've done instead is provided an interface (bookmarkable)
         | to a set of "rowsets", where the rowset has a name and set of
         | columns, where the columns are filterable, sortable, and/or
         | selectable. Directly behind the rowset is SQL, of course
         | (and/or a view or table), so it's not that much less flexible
         | than SQL directly, but it provides a point where additional
         | filter conditions can be reliably added to enforce privileges
         | (there's also a dev option to see the SQL, if that's needed).
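A minimal sketch of what such a "rowset" layer might look like, under assumptions of my own: each rowset names its backing SQL and the columns callers may use, and a privilege filter is always appended. The `Rowset` class, the `owner_group` column, and the `orders` table are all hypothetical; `sqlite3` is only a stand-in.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Rowset:
    name: str
    base_sql: str    # the SQL (or view) directly behind the rowset
    columns: tuple   # columns a caller may select or sort on


def query_rowset(conn, rowset, viewer_group, select, sort_by=None):
    """Run a rowset query; only whitelisted columns ever reach the SQL text."""
    requested = list(select) + ([sort_by] if sort_by else [])
    for col in requested:
        if col not in rowset.columns:
            raise ValueError(f"unknown column for {rowset.name}: {col}")
    sql = (
        f"SELECT {', '.join(select)} FROM ({rowset.base_sql}) "
        "WHERE owner_group = ?"  # privilege filter, always appended
    )
    if sort_by:
        sql += f" ORDER BY {sort_by}"
    return conn.execute(sql, (viewer_group,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total INTEGER, owner_group TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 50, "sales"), (2, 20, "sales"), (3, 99, "ops")],
)
ORDERS = Rowset("orders", "SELECT id, total, owner_group FROM orders", ("id", "total"))
```

Because the viewer's group is bound as a parameter and column names are checked against a fixed list, the caller never supplies raw SQL.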
        
       | jerf wrote:
       | A few more in my experience:
       | 
       | Database abstractability. Here I don't mean "type of database"
       | like Postgres vs. MySQL vs. Mongo, but making sure in your system
       | you have the ability to target and isolate the entire DB.
       | Integration testing is both vital and slow; as you grow you will
       | be glad you have the ability to run multiple of them at once
       | because they don't interact. With MySQL & Postgres this can be as
       | easy as making sure your code takes the database name as a
       | _parameter_ instead of hard coding it. If your database lacks
       | this feature, take the time to abstract out table names so you
       | can use a certain prefix/suffix for a given test.
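A small sketch of "database name as a parameter", assuming Python with `sqlite3` standing in for MySQL or Postgres. The helper names are hypothetical. Each test opens its own database, so tests cannot interfere with each other and can run in parallel.

```python
import sqlite3

SCHEMA = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);"


def open_db(database=":memory:"):
    """The database name is a parameter, never a hard-coded constant."""
    conn = sqlite3.connect(database)
    conn.executescript(SCHEMA)  # recreate the schema quickly for each test
    return conn


def add_user(conn, name):
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def count_users(conn):
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


# Two "tests" with fully isolated databases.
db_a = open_db()
db_b = open_db()
add_user(db_a, "ada")
```

With a server database the same shape applies: the test harness would pass a per-test database name instead of `":memory:"`.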
       | 
       | If you can't quickly recreate the schema and populate with some
       | known set of test data, take the time to write code that will
       | take an existing DB and truncate it back to start or whatever.
       | You need those integration tests, and you need them to not take
       | two hours to run. Get that going at the beginning. I've not yet
       | gotten to the point I need prepopulated schema which is handed
       | out by a central manager, and then the cleanup is run out of the
       | actual test cycle, but I'd be willing to go that far. In 2022 it
       | is likely not a problem at all to have literally a thousand
       | copies of your schema lying around on some test DB server, just
       | to make the tests run fast. Use some versioning solution to make
       | sure they're the right version.
       | 
       | There is a _LOT_ of variation in databases here; you will need to
       | tune your process to your DB. In some, schema creation is wicked
       | fast and you can just write all your tests to write out new ones
       | without a problem. In others it is slow but very parallelizable.
       | In others it may be just plain slow, in which case the code to
       | take existing schema and reset them is more valuable. In yet
       | others truncation may be difficult and/or slow due to loose
       | serialization requirements. Do whatever it takes to make your
       | integration tests fast, or at least, as fast as possible. Do not
       | just settle for even simple tests taking 10 minutes because you
       | naively do a slow thing. You really need these to go fast.
       | 
       | Authentication and authorization; the earlier you work in _some_
       | solution, the happier you will be. Sometimes this is easy to
       | retrofit, sometimes it really isn't. Even if you start with
       | authentication and a list of who can do what in the very early
       | phases it'll still prevent you from accidentally depending on
       | having no auth at all.
       | 
       | Write your code to be thread/async safe from the beginning, even
       | if you have no use for it yet. You don't want to be in a position
       | where you need it but you've got a 2-year-old code base now built
       | on the deep assumption of single-threading, and it only gets
       | worse as the code base gets larger. It's a slight marginal
       | expense to keep it in mind, it's a project-destroying nightmare
       | to add it back in later.
       | 
       | In static languages, get the types declared as early as possible.
       | Get away from (int, int, int, string, string, bool, bool)
       | function signatures as quickly as you can. It's not exactly
       | impossible to retrofit better types later, but it's at the very
       | least a pain in the bum disproportional to the cost of having
       | declared them correctly in the first place. Even if your first
       | stab at real types is wrong, it's _still_ easier to fix them this
       | way because you have the type name as a hook for your IDE, or in
       | the worst case, for sheer grep-based refactoring. (Even in
       | dynamic languages I'd recommend this, but they have more options
       | and the analysis is more complicated. Still, I'd suggest having a
       | "user" that knows how to render itself as a string is better than
       | passing around the user's name or something as a string
       | everywhere and expecting every bit of code that needs to do some
       | user-object-method-y thing to have to look it up themselves.)
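As a hedged illustration of the last point, here is a sketch in Python of a `User` type that knows how to render itself, instead of passing bare name strings around. The field names and the `greeting` function are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    age: int

    def __str__(self):
        # One place decides how a user is rendered as text.
        return f"{self.name} ({self.age})"


def greeting(user: User) -> str:
    # Takes the whole object; no caller has to look the user up from a name.
    return f"Welcome back, {user}!"
```

If the rendering rule changes later, only `User.__str__` changes, and the type name gives an IDE or grep something to hook onto.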
        
         | schwartzworld wrote:
         | > Still, I'd suggest having a "user" that knows how to render
         | itself as a string is better than passing around the user's
         | name or something as a string everywhere and expecting every
         | bit of code that needs to do some user-object-method-y thing to
         | have to look it up themselves.
         | 
         | For some reason this seems to be the best kept secret in
         | software development. Type-driven development lives somewhere
         | at the intersection of OOP and FP. Here on HN every time
         | TypeScript comes up, people claim to prefer code comments or
         | some other nonsense that doesn't give you any of the benefits,
         | because they see types as (userName: string, userAge: number)
         | and not User { name: string, age: number }
        
           | kroltan wrote:
           | Yes, type-driven development saves so much hassle, both in
           | ingesting user-provided data (parsing request bodies etc) as
           | well as limiting the API surface of what you can do to
           | things.
           | 
           | Sadly, one very useful feature is missing from most
           | languages, and that is, the ability to make opaque or
           | partially transparent wrapper types. For example in C# before
           | record types, you'd need about 30 lines of boilerplate if you
           | wanted to make a wrapper for an identifier that can only be
           | compared for equality.
           | 
           | Wrapper and rich types allow writing shared behaviours for
           | those specific kinds of information, and greatly reduce the
           | "accidental API surface" of primitive types that plagues us
           | with so many bugs and data leaks.
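A rough sketch of an identifier wrapper that can only be compared for equality, assuming Python's `dataclasses`. The `UserId` and `OrderId` names are hypothetical. A frozen dataclass gets equality and hashing for free, while ordering and string operations are simply not defined on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserId:
    _raw: str  # underlying value; not meant to be used directly


@dataclass(frozen=True)
class OrderId:
    _raw: str


def same_user(a: UserId, b: UserId) -> bool:
    # Equality is the only operation the wrapper is meant to support.
    return a == b
```

Python does not enforce the opacity (the `_raw` field is still reachable), so this narrows the accidental API surface by convention rather than by the type system.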
        
       | AndreasHae wrote:
       | I'm interested in the reasoning why API pagination is hard to add
       | later, or rather what the benefit is of "leaving space" in the
       | API response while in reality returning all items at once.
       | 
       | Wouldn't you have to adapt the way the frontend consumes the API
       | either way, keeping track of the current list state? Or is that
       | what the author meant, that you should always write your list UIs
       | as if they consumed a paginated API?
        
         | justin_oaks wrote:
         | I agree that the author wasn't quite clear on this point.
         | 
         | If you control both the client and the API, then pagination is
         | something you can add in later. But setting up pagination early
         | may make sense if the client isn't in your control.
         | 
         | The author gave the example of an old mobile app that users
         | don't update. The client is also out of your control if it's a
         | public API that users call with whatever client code they
         | choose.
         | 
         | That said, I'm not sure how setting up "fake" pagination is
         | helpful if the client isn't already coded to use pagination
         | responses from the server. So I'll have to assume the author
         | means that any client you provide already has pagination logic
         | built-in.
        
         | michaelmior wrote:
         | This is a narrow view, but one example is that if you're using
         | JSON and your API response looks like this:
         | 
         | [{id: 1, ...},...]
         | 
         | Then you can't easily add pagination without changing the
         | structure of the response. Whereas if you have something like
         | this:
         | 
         | {results: [{id: 1, ...},...]}
         | 
         | then you can add in whatever other properties you need and this
         | change can be made without immediately breaking any existing
         | applications. Of course, it's still true that you'll have to
         | rewrite the frontend to actually implement pagination, but it
         | does do something for backward compatibility. You could
         | probably still have old versions of the application at least
         | display the first page of results without changes.
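The envelope argument can be sketched in Python as follows. The function names and the `next_offset` key are hypothetical choices for illustration; the point is only that an old client reading `results` keeps working once paging keys are added.

```python
def list_items_v1(items):
    """Original response: everything at once, but wrapped in an object."""
    return {"results": list(items)}


def list_items_v2(items, page_size, offset=0):
    """Later response: same envelope, plus a hypothetical paging key."""
    page = items[offset:offset + page_size]
    next_offset = offset + page_size
    return {
        "results": page,
        # None signals that there are no further pages.
        "next_offset": next_offset if next_offset < len(items) else None,
    }


def old_client_ids(body):
    # An old client only knows about "results" and ignores unknown keys.
    return [item["id"] for item in body["results"]]


items = [{"id": i} for i in range(5)]
first_page = list_items_v2(items, page_size=2)
```

Note the trade-off: the old client does not break, but it silently sees only the first page, which may or may not be acceptable.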
        
           | jmull wrote:
           | That's the best interpretation I could come up with as well.
           | 
           | I think the general idea is to leave room in distributed APIs
           | to add metadata later. It's not really just about potential
           | pagination, but leaving yourself a way to add/change
           | functionality later without updating all the clients in
           | lockstep with the server (always awkward and there's never a
           | way to pull it off 100% cleanly).
        
             | simonw wrote:
             | Yeah, that's exactly what I was getting at here.
        
           | tdeck wrote:
           | In this case, it would often be better to make a breaking
           | change than to have a client that thinks it's fetching all
           | the items but isn't. For a frontend application sure, it
           | might be fine that the user only sees one page, but it might
           | also be a big problem. For backend APIs you absolutely don't
           | want to add a new pagination mechanism that clients don't
           | know about.
        
         | eyelidlessness wrote:
         | There are reasons not to return a bare array (eg they're
         | executable JavaScript), but this one isn't as strong IMO. It's
         | a tiny bit more work, one time, for clients to parse Link
         | headers, but once done they're an excellent way to convey
         | pagination (and a whole lot of other neat metadata about
         | related resources!). See, for example, GitHub's API[1].
         | 
         | 1: https://docs.github.com/en/rest/guides/traversing-with-
         | pagin...
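For a sense of how little client work the Link header approach needs, here is a simplified parsing sketch in Python. It assumes each link looks like `<url>; rel="name"` with a quoted `rel` as the first parameter; a full parser for the Link header format would handle more cases. The example URLs are made up.

```python
import re

# Matches one `<url>; rel="name"` pair (simplified; assumes rel comes first).
_LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value):
    """Map each rel (e.g. "next", "last") to its URL."""
    return {rel: url for url, rel in _LINK_RE.findall(value)}


header = (
    '<https://api.example.com/items?page=2>; rel="next", '
    '<https://api.example.com/items?page=5>; rel="last"'
)
links = parse_link_header(header)
```

A client would keep requesting `links["next"]` until the key is absent, while the response body stays a plain list.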
        
       ___________________________________________________________________
       (page generated 2022-10-17 23:01 UTC)