[HN Gopher] PAGNIs: Probably Are Gonna Need Its (2021)
___________________________________________________________________
PAGNIs: Probably Are Gonna Need Its (2021)
Author : thunderbong
Score : 133 points
Date : 2022-10-17 09:18 UTC (13 hours ago)
(HTM) web link (simonwillison.net)
(TXT) w3m dump (simonwillison.net)
| bena wrote:
| I like the YAGNI principle.
|
| It doesn't mean "never do it". It means "don't try to anticipate
| all your needs".
|
| Because you can sit around and come up with scenario after
| scenario to defend including almost everything.
|
| Get it working. Worry about everything else after that is
| accomplished. Will you eventually need a kill switch? Maybe. Do
| you need it to build a proof of concept? You Ain't Gonna Need It.
|
| Do you have enough results for pagination? Eventually. To build
| the API for your first test? You Ain't Gonna Need It.
|
| Because the rest of the sentence is "until you do". Eventually
| you will need a lot of this stuff, but sometimes you actually
| don't need it. Don't waste time fulfilling non-existent needs.
| 0xbadcafebee wrote:
| PAGNI is basically a form of Shift Left.
| https://www.dynatrace.com/news/blog/what-is-shift-left-and-w...
| https://devopedia.org/shift-left
| hdjjhhvvhga wrote:
| > A kill-switch for your mobile apps
|
| Really? I understand the rationale but this is one of the main
| reasons I prefer desktop apps where kill switches are relatively
| rare.
| georgyo wrote:
| Are they actually that rare?
|
| Both Slack and Discord at the very least have one, but that is
| a small sample size.
|
| Both Windows and Mac apps are very eager to self-update, a
| feature they don't have on mobile devices, which likely hides
| these kill switches.
| kiawe_fire wrote:
| I think the reason they might appear rare is that desktop apps
| that depend on a remote server for functionality are less
| common.
|
| For example, a photo editing suite, word processor, or code
| editor all have the expectation that they be usable entirely
| offline, and any additional cloud-based features are coded to
| fail gracefully.
|
| As you point out, apps like Slack and Discord don't fall into
| this camp. Not only is their core functionality expected to
| be online, they are also Electron apps (basically, embedded
| Chrome browser) that likely have a thin layer of client
| logic, and load most resources remotely, just like accessing
| a web page.
|
| Sure, word processors and code editors exist on mobile, and
| many of them likely work just like their desktop counterparts
| when offline [1]. But that's not what most mobile apps
| are. Most mobile apps would be useless without a remote
| server, and if that remote server needs to make a breaking
| API change, then you need a way to communicate that.
|
| [1] I assume. It's been a while since I've used mobile MS
| Word. Even then, I suspect these apps will prompt the user to
| update or else lose cloud service at some point, even if they
| otherwise continue to work fine locally.
| sethhochberg wrote:
| As much as any sane development team tries to avoid breaking
| changes, they're virtually inevitable for a long-enough-lived
| client/server platform.
|
| A nice poison pill built in up front lets the ancient clients
| that for whatever reason aren't being upgraded at least get a
| nice informational prompt when those breaking changes happen,
| instead of a potentially uglier failure mode.
|
| It's one of those things that you think you'll never need
| because of course you wouldn't make breaking changes, but then
| at some point an unstoppable force hits an immovable object and
| this becomes the least-bad way out for user experience.
| ryandrake wrote:
| From the user's point of view, I can't think of a _good_
| reason to force users to update software, besides maybe a
| security/privacy issue. As a user, I expect software that
| works today to work tomorrow--forever. Every time I update, I
| risk getting a new UI shoved down my throat, or my favorite
| use case removed or broken, or the app has tripled in size
| for no discernible reason, or it's now too slow for my
| computer. I've been trained by software companies' repeated
| bad behavior to not update voluntarily unless I have some
| kind of assurance that it's purely a security update and no
| functionality has changed.
|
| Version your API, figure out how to make maintaining the old
| versions zero cost, and don't throw users under the bus just
| because they are (understandably) unwilling to update.
| alkonaut wrote:
| A mobile app is the same as a web page. I view them as much
| more related to web apps, than to traditional desktop apps.
|
| It's tied to a backend version. I do get annoyed with apps
| that want to update every single time, but with auto
| updates it's not that bad.
|
| As many mobile apps are merely wrappers around a web app,
| it's easy to see how this happens. But I think the end game
| will be even more web-like: apps that indicate they must be
| up to date and can't even be manually set to stick to one
| version.
| simonw wrote:
| "besides maybe a security/privacy issue"
|
| Right - that's enough of a reason for me to include a kill
| switch!
| thefreeman wrote:
| It's not to kill the app entirely, it's to force the user to
| upgrade to a maintained version rather than just having the app
| crash if backwards compatibility is required to be broken. I
| tend to do it with two flags, one to enable an upgrade nag
| notification and another to hard kill switch the version.
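|
| A minimal sketch of that two-flag pattern (the config shape and
| names here are just illustrative, not anyone's actual
| implementation):
|
|     # Remote config carries a "nag" minimum version and a "kill"
|     # minimum version; the client compares its own version at startup.
|     CURRENT_VERSION = (2, 3, 0)  # baked into the build
|
|     def check_version(config: dict) -> str:
|         if CURRENT_VERSION < tuple(config.get("kill_below", (0,))):
|             return "blocked"  # hard kill switch: upgrade screen only
|         if CURRENT_VERSION < tuple(config.get("nag_below", (0,))):
|             return "nag"      # dismissible upgrade prompt
|         return "ok"
|
|     assert check_version(
|         {"nag_below": [2, 4, 0], "kill_below": [2, 0, 0]}) == "nag"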
| 1-more wrote:
| Mad annoying when I cracked open an ancient iPad figuring I
| could watch Netflix on it and oops nope. No indication of
| why, but I know the answer because I'm the type of person who
| comments on HN.
| solatic wrote:
| Controversial opinion:
|
| Don't write tests first. If you're not writing tests, you don't
| need CI to run tests, and CD isn't safe.
|
| Write tests only (a) if the algorithm is so complicated that
| you need unit testing to ensure that you're writing it
| correctly in the first place, or (b) if, when, and after you
| earn revenue from the code.
|
| Why would you waste your time proving that something you wrote
| works, if you don't have product-market fit, and you need to
| throw it all out next week to try to satisfy the market with
| something different? Just make sure you have the discipline to go
| back and actually add edge-to-edge / integration testing after
| you make the sale, and not go straight to the next feature,
| leaving the lack of testing as technical debt.
| yamtaddle wrote:
| Some amount of testing can help you move faster.
|
| ... _especially_ with some of the most-popular "let's shit out
| a quick MVP" languages/frameworks, since they often lack static
| typing, or _kinda_ have it but support is uneven.
| avgcorrection wrote:
| Why does having tests necessitate (in turn) getting a CI? You
| can just run the tests yourself.
| ParetoOptimal wrote:
| > Why would you waste your time proving that something you
| wrote works, if you don't have product-market fit,
|
| What if your poor execution and untested software simply makes
| it appear you don't have product-market fit?
| solatic wrote:
| You know, I really wish that a lack of automated tests
| translated to poor execution and short-term business failure,
| but having worked for multiple successful startups where that
| is not the case, sadly, I was proven wrong. In the face of
| evidence to the contrary, I changed my opinion.
|
| Testing _needs_ to have business justification for it; "best
| practice" is not good enough. That means doing it when
| absolutely required for a feature's development, or tying it
| to revenue on the books.
| jerf wrote:
| "Why would you waste your time proving that something you wrote
| works, if you don't have product-market fit, and you need to
| throw it all out next week to try to satisfy the market with
| something different?"
|
| Because somewhere around the 3-6 week mark, having decently-
| tested code means I move _faster_, not slower.
|
| And to be honest, I'm too effing old anymore to put up with the
| endless loop of "oh, this change can't break anything -> ship
| it -> oh crap, that broke like three things -> fix them -> oh
| crap, that caused a regression over here -> fix it -> oh crap,
| that fixed that but now this other thing has regressed -> oh
| crap, this means my entire architecture is wrong which I would
| have figured out much earlier if I hadn't been so busy putting
| out fires". I've wondered how much people complaining about
| burnout is from them living in that loop. I _far_ prefer my
| tests to tell me about this than my customers.
|
| Yeah, if you're literally bashing together code in one week, to
| heck with tests. But I think even in the "startup" space that's
| a minority. Anything that can be solved in a solid month of
| coding is already commercially available with half-a-dozen
| plausible open source alternatives anyhow.
| solatic wrote:
| > I far prefer my tests to tell me about this than my
| customers.
|
| Exactly, if your code is used by customers, then it's tied to
| revenue, and you should write tests for it. See: (b)
| AnimalMuppet wrote:
| You missed at least part of jerf's point. If your program
| is non-trivial, you're going to come out ahead from tests
| _even before you ship_.
| hinkley wrote:
| I worked on a project where the architect kept talking about
| the rewrite they were going to do next year. That was cute
| until it wasn't.
|
| It wasn't cute when he started pushing back on anyone making
| material architecture improvements to the 'legacy' system. How
| do you expect people to write better code on the new system
| instead of the same bad code they're writing now? People don't
| work that way.
|
| Shortly before my contract was up, the person who approved his
| rewrite got promoted out of the building. I'd bet anything that
| his new boss didn't give him his rewrite.
|
| Starting a project with no culture of testing, refactoring,
| whatever, makes it harder to add it on later. And if you fail
| to get it to stick, now you're in Sunk Cost territory, because
| knowing everyone was going to say no early on could have let
| you bow out before you got invested in a project that was going
| to torture you.
| solatic wrote:
| > Starting a project
|
| I have a feeling you're working from a different set of base
| assumptions. If you're working for a company that is paying
| multiple people's worth of salaries to deliver a project, a
| project whose definition of done is "delivered to market" and
| not "earning $X revenue", then yeah, if people are going to
| be de-tasked after the project is up and migrated to other
| projects, I absolutely appreciate that whatever project
| manager is running that project should ensure that it has a
| nice testing suite to prove that the project works. Behavior-
| Driven-Development, ideally.
|
| I just think that's a monumentally expensive way to develop
| software, one that executives should have relegated to the
| trash bin twenty years ago, after the Agile Manifesto. If
| you're a private sector business then you want to develop
| quality software _at the lowest cost._ Guess what? Automated
| tests, in and of themselves, don't tell you whether you have
| quality software! The only way you know whether your software
| has any quality at all is whether _people pay money for it!_
| Not whether software fits an arbitrary spec determined at the
| beginning of the "project".
| simonw wrote:
| It sounds to me like you're arguing from a position where
| tests are expensive and time consuming to write.
|
| If you have the right habits and testing tools in place,
| this doesn't need to be the case.
|
| I write tests for even my tiniest projects: I estimate I
| spend less than 20% of my coding time working on the tests
| - and they give me more than a 20% boost in productivity
| VERY quickly.
|
| But... that's because every one of my projects starts with
| a pytest test suite in place. Adding new tests to an
| existing test suite is massively less work than adding
| tests if there's no test suite and you have to spin one up
| from scratch.
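|
| For anyone curious what "a pytest test suite in place" can look
| like at the very start, a minimal sketch (slugify is a
| hypothetical stand-in for whatever the project actually does):
|
|     import pytest
|
|     def slugify(text: str) -> str:  # stand-in for real project code
|         return "-".join(text.lower().split())
|
|     @pytest.mark.parametrize("text,expected", [
|         ("Hello World", "hello-world"),
|         ("  spaces  everywhere ", "spaces-everywhere"),
|     ])
|     def test_slugify(text, expected):
|         assert slugify(text) == expected
|
| Once a file like this exists, each new test is just another
| function added to it.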
| hinkley wrote:
| I've seen it go off the rails though. I've seen people
| exhibit traumatic responses to it.
|
| Pain is information, but there are less helpful ways to
| interpret it.
|
| There are some architectural decisions I resist
| energetically because they create a lot of consequences
| for testing, with not a lot of value added compared to
| other alternatives. If you introduce something that makes
| writing code 30% harder you'd better be getting a lot of
| profit margin out of the deal. Otherwise you're just
| torturing people for no good reason.
| AnimalMuppet wrote:
| You're using a different definition of "quality software"
| that is orthogonal to the one that the rest of us are
| using. Your definition is the one that isn't helped much by
| automated tests. But for an actual product, you really want
| both kinds of quality, including the kind that _is_ helped
| by automated tests.
| hinkley wrote:
| Detasked is a definite problem, but I'm more concerned
| about the reverse, which is being overwhelmed by customer
| input. Once the party starts it's too late to change some
| things. Everything is a priority queue, and louder, more
| charismatic voices will push out your quality-of-life
| concerns.
|
| Edit: but in this thread what I'm talking about is skill.
| The longer you wait to practice a skill that you will need
| in abundance, the more likely it is that you will fail
| outright. You need a long runway for operational
| excellence.
| beltsazar wrote:
| > Why would you waste your time proving that something you
| wrote works
|
| How do you know it works if you don't test it?
| solatic wrote:
| Manually.
|
| Writing automated tests with sufficient coverage takes, as a
| rule of thumb, three times as many lines of code as the code
| itself, not including test data. (I don't remember exactly
| where I first learned this, I think this is out of
| Microsoft?).
|
| For small-enough codebases, with few enough engineers working
| on those codebases, and without dedicated QA resources
| attempting to break your code on a thousand device
| permutations, simple manual checks, when needed, are _orders
| of magnitude faster_ compared to writing and maintaining
| automated test suites for code that _nobody is paying you
| money for yet_ and therefore is _worse than worthless_ from
| an accounting perspective, due to ongoing maintenance costs
| of the code itself.
|
| Once you have paying customers, then you need the tests to
| protect company revenue and reputation. Once you decide to
| scale, then pay for QA to try to break your code on a
| thousand end-user devices. But not before!
| hinkley wrote:
| So the question is, are you going to test it manually?
|
| Case study 1: The help functionality in the UI was broken
| for months before anyone noticed, because people on the
| team don't need the help. Integration tests were written to
| exercise the help functionality. Problem solved.
|
| Case study 2: 'We' insisted on having extensive unit tests
| for our login logic. Getting code coverage on that
| functionality ended up being easier said than done, and a
| bunch of refactoring was necessary to make the tests not
| suck balls. Said refactoring introduced a number of
| regressions in the login logic. Every time the login
| functionality broke, two engineers came to tell me or my
| teammate that login was broken before the CI build had even
| completed.
|
| If the human is willing, faster than the automation, and
| failures are unambiguous, what's ultimately the value of
| those tests? Goodhart's Law says it's less than or equal to
| zero.
| solatic wrote:
| > broken for months before anyone noticed
|
| Great case in point. If nobody noticed, can you really
| tie it to revenue? Did sales fail because prospects
| couldn't use the help? Did customers churn because they
| didn't want to pay for something broken? No? Then it
| doesn't matter that it's "broken". Unless it affects the
| bottom line somehow, even in the most tenuous way, then
| from a business/financial perspective, arguably nothing
| is actually "broken".
|
| > Every time the login functionality broke, two engineers
| came to tell me or my teammate that login was broken
| before the CI build had even completed.
|
| So, technically, somebody manually tested it, right?
| Ideally, the executives should have been pressuring you
| to do a better job and not force other teams to catch
| your bad work, but it's fine if it's a rare-ish
| occurrence.
| hinkley wrote:
| That depends on your delivery pipeline. At the time this
| product was only shipping between teams on a new
| initiative, so all of the users were experts. But things
| like this cost you status once louder, more influential
| people start noticing.
|
| The login one I've seen on two separate projects; only
| one was I directly involved in. What was needed were
| negative tests to make sure authorization or
| authentication don't succeed when they shouldn't. I won't
| begrudge those tests at all, and may insist on them
| personally. But some parts of the code are dogfooded
| hourly. If you can test it quickly and cheaply, by all
| means do. But if the test system is slower and less
| reliable, if chasing a metric makes you break things that
| already "worked" you need to tap the brakes and think
| about your strategy. Not this way, or maybe not at all.
| im3w1l wrote:
| Testing it manually, probably.
| layer8 wrote:
| Your unit tests shouldn't need CI to run, period. Only
| integration tests _might_ need to, but also not necessarily
| (the "continuous" part, that is).
| AYBABTME wrote:
| The most obvious missing one: (c) You start
| wasting a lot of time fixing the features you previously
| shipped but broke later.
|
| When you start to waste so much time manually testing stuff,
| and realize that something you implemented that worked two
| weeks ago now doesn't, because someone forgot to check that
| their new work didn't break the old one.
| solatic wrote:
| Why are you keeping around features that aren't tied to
| revenue? Either write tests for the features, or delete the
| features. The easiest code to test and maintain is no code at
| all.
| AYBABTME wrote:
| Okay? Some products take more than just a few lines of
| code and a few days to build. Absolutist opinions aren't
| useful.
|
| Also, just about everything can be construed as being tied
| to revenue. Not a useful heuristic.
| hbrn wrote:
| Honestly, in most cases testing is a band-aid on top of a
| bigger problem. IME things usually break due to complexity,
| not due to lack of tests.
|
| But of course, tackling complexity is hard. It's much easier
| to sprinkle the big ball of mud with tests and set it in
| stone.
|
| If you feel like you're wasting time manually testing, it
| probably means that the blast radius of your changes is too
| big. You should prefer smaller rockets over your head, not
| bigger bunkers to hide in.
| AYBABTME wrote:
| Some stuff just takes time to test manually, e.g. if you
| provision multiple cloud resources and must coordinate
| them, or anything involving some sort of multi-step
| process. If you want to ship quickly, you want a test suite
| to save you the trouble of doing an hour of manual
| testing for each change you're adding. Delegating the
| verification to the computer, so that my small team can move
| faster on the other things we need to build, is a no-
| brainer. At least, once the "cost to manually test" becomes
| greater than the "cost to implement some tests and set up CI".
| postalrat wrote:
| It's a bit disappointing that the general consensus seems to be:
| more tests is always better.
|
| A lot of people are fooling themselves about the time needed to
| build and maintain tests vs. the benefits of having those
| tests.
|
| And anyone questioning their value is seen as some lazy
| outsider who just doesn't get it.
| hinkley wrote:
| Nothing is more expensive than unmaintainable tests.
|
| What I tend to want is unit tests that are cheap. Cheap to
| run, cheap to read, cheap to write. If they're properly
| simple I can rewrite them. If the requirements changed I can
| delete them and write new ones. If you get your testing
| pyramid you also know that _most_ of your tests are cheap.
|
| The more business logic that leaks up into your functional
| and integration tests, the more important it is that you are
| confident that the rules you have are the correct rules.
| Which leads into the problem space of GP.
|
| More information is better, but more tests often give you the
| same information, or conflicting and therefore less
| information. Tests are about confidence. Can we ship this,
| did we break stupid things? Repeating an action for the sole
| purpose of soothing yourself is a form of OCD. It's not
| healthy.
| postalrat wrote:
| There are a lot of things I'd love to see change in the
| testing world.
|
| One would be a way to write a second implementation that is
| used to test or generate tests. For example if I want to
| test a function that adds two numbers ("add") it would be
| really nice to be able to write: test(add, (a, b) => a + b)
|
| IMO tests kinda boil down to write it twice and hope you
| got it right once.
| hinkley wrote:
| You might get a kick out of property based testing.
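|
| The gist, in a rough Python sketch using the Hypothesis library
| (the add function is the same toy example as above):
|
|     from hypothesis import given, strategies as st
|
|     def add(a, b):  # toy implementation under test
|         return a + b
|
|     @given(st.integers(), st.integers())
|     def test_add_matches_reference(a, b):
|         assert add(a, b) == a + b  # the "write it twice" reference
|
|     @given(st.integers(), st.integers())
|     def test_add_is_commutative(a, b):
|         assert add(a, b) == add(b, a)  # a property needing no reference
|
| The framework generates the inputs, so you only state the
| properties the function must satisfy.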
| john-tells-all wrote:
| Absolutely -- Tests are a _business_ investment, and a
| _business_ expense.
|
| A question is: "what is the _business_ consequence if this code
| doesn't work as expected?" If the code is frontend stuff, then
| maybe tests won't pay off.
|
| If the code is "calculate total and taxes for user buying
| products", then yes, you probably _do_ want to write tests. If
| you don't write them first, then they might not get done until
| there's a "big bang" of "gee we need to write tests for the
| last 10k lines of code over the weekend". Or worse: code might
| or might not be correct in the past, but you can't tell how
| many users were affected.
|
| Tests are a business investment, and it's worth a discussion to
| find out risk tolerance.
| wgerard wrote:
| My personal addition: Code isolation
|
| I've "prematurely" split up code across multiple/classes
| functions so many times before these huge god functions became
| set-in-stone and unwieldy.
|
| I had a co-worker once who was extremely heavy on YAGNI--and I
| don't mean that as a pejorative, it was a really helpful check
| against code solving non-existent problems. They once called me
| out on a refactor I'd done surreptitiously for a payments flow to
| split payment states into multiple classes. Serendipity had
| struck, and I was able to point to a submitted (but not accepted)
| diff that would've broken the entire payments flow (as opposed to
| just breaking one state of the flow).
|
| I always think about that every time I question whether I'm
| isolating code prematurely.
| zzbzq wrote:
| To be fair, what you're championing for PAGNI is the poster
| child for YAGNI, so you're basically against YAGNI.
| quietbritishjim wrote:
| My rule for this is that:
|
| * If breaking up the code makes it easier to understand _with
| its current functionality_ then absolutely go ahead.
|
| * If breaking up the code makes it slightly less easy to
| understand right now, but will make it easier to add some
| feature later that you know you're definitely going to
| need, then hold off.
|
| Most of the time you "know" you're going to add a feature, you
| turn out to be wrong, and (worse than that) some other slightly
| different feature is needed instead that's actually harder to
| add because you broke up the logic along the wrong axis.
| Whereas, breaking code up purely based on what makes it easier
| to understand now, ironically, usually does a much better job
| at making it easy to add a feature later - it often almost
| feels like an accident (but it's not, really).
|
| I bet your payment flow refactoring made things a bit easier to
| understand, even if it had never been worked on again.
| jsbg wrote:
| > If breaking up the code makes it slightly less easy to
| understand right now, but will make it easier to add some
| feature later that you know you're definitely going
| to need, then hold off.
|
| This. Your code shouldn't be documenting what it will do in
| the future: it should be clear what it's doing right now.
| Otherwise some other engineer maybe from another team will
| come upon it, misread it, and bugs ensue. Trying to save
| yourself time in the future ends up wasting time for others,
| and likely for yourself as well.
| WorldMaker wrote:
| This relates to "Rule of 3 Before Abstracting": One
| concrete instance of a proposed abstraction is YAGNI
| (adding an abstraction blurs the concrete code and makes it
| harder to understand), two concrete instances are
| _coincidence_ (it's maybe a sign that you need an
| abstraction between the two, but you don't have enough
| data to be sure), and at least three concrete
| instances are (finally) _a pattern_ (abstracting a known
| pattern is right and good, because it becomes a document
| of the high-level pattern itself, and the differences and
| distinctions between the concrete implementations become
| more distinct and obvious).
|
| That "Rule of 3" is sometimes a good reminder itself not to
| build for some possible future of the codebase, but to
| build for the patterns in your code base as they exist
| _today_.
| 3pt14159 wrote:
| Premature class isolation: Usually bad.
|
| Some notable exceptions are things like addresses, where users
| often have more than one, but people splitting up the user
| table and the "user profile" table are usually just causing
| headaches. A wide user table is fine.
|
| Premature method or function isolation: Usually good.
|
| Even if there isn't reuse, the code is usually more readable.
|
|     def let_in_bar?
|       return self.is_of_age_in_region? and self.is_sober?
|     end
|
| Is a perfectly fine isolation with very little downside.
| quietbritishjim wrote:
| I know your example is just a toy, but it definitely reminds
| me of heavy handed over-refactoring into tiny one-line
| functions.
|
| There are two problems with that approach:
|
| The first is when you're trying to understand what a single
| function does. If it's 20 lines long but has self-contained
| logic, that's often easier to understand than referring back
| to the tiny pieces it's assembled from (even if they're
| supposedly so self contained you don't need to look at them
| individually - that's rarely true in practice). On balance,
| even if a function is slightly too long, that's usually
| less bad than one that's in slightly too many little pieces.
|
| The second is when you're looking at the file level. When
| faced with a file with dozens of tiny functions, it's much
| harder to get a top level understanding than if it has a
| smaller number of longer functions.
|
| The second one is the bigger problem, because understanding a
| project at the file (and higher) level is typically much
| harder than understanding what a single function does. If
| breaking up a function into smaller pieces does actually
| make it easier to understand, but makes the overall project
| harder to parse, then that's usually a loss overall. Of
| course, it depends very much on the specifics of the
| situation though, and certainly a long function broken down
| into logical pieces can actually make the overall project
| easier to understand.
| jsbg wrote:
| > over-refactoring into tiny one-line functions
|
| It's relatively rare to refactor a single line of code into
| its own function, unless it's really hard to read, e.g.
| some nasty regex.
|
| > even if they're supposedly so self contained you don't
| need to look at them individually - that's rarely true in
| practice
|
| If a long function is broken up into smaller functions only
| for the sake of having smaller functions, then you end up
| with functions that don't really make sense on their own
| and, sure, the bigger function is better. But if the broken
| up functions are named well enough like in the example
| above, then it shouldn't be necessary to see how it's
| implemented, unless you're tracking down a bug. After all,
| it's very rare to look at e.g. how a library function is
| implemented, and for some libraries/languages it's not even
| possible.
|
| > When faced with a file with dozens of tiny functions,
| it's much harder to get a top level understanding than if
| it has a smaller number of longer functions.
|
| Most languages have facilities to help with that, e.g.
| function visibility. Your smaller number of large functions
| can remain the public API for your module while the
| "dozens" of tiny functions can be private. In either case,
| long files are harder to take in no matter how many
| functions they're composed of.
| z3c0 wrote:
| I'd add that this extends to data processes as well. Lean
| pipelines with rigid schemas are great and all, but you're really
| going to regret not having something more malleable once the
| system has grown and you have an onslaught of requests for new
| fields to be added to X solution.
|
| Blob storage is cheap. Just dump the whole database and sift
| through it later.
| swyx wrote:
| my submission for PAGNI: Preemptive Pluralizations
| https://www.swyx.io/preemptive-pluralization
|
| Before you write any code -- ask if you could ever possibly want
| multiple kinds of the thing you are coding. If yes, just do it.
| It is a LOT easier to scale code from a cardinality of 2 to 3
| than it is to refactor from a cardinality of 1 to 2.
|
| Example:
|
| - You assume that one team has many users, and correspondingly,
| one user belongs to one team.
|
| - Eventually, you find that a user may need multiple teams. This
| is actually fantastic for your business!
|
| - But you are depressed because you now have to spend 2 months
| refactoring every line of code and database schema that assumed
| the one-to-one mapping
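|
| A minimal sketch of what pluralizing up front looks like at the
| schema level (sqlite3 here purely for illustration): a membership
| join table instead of a users.team_id column, so going from "one
| team" to "many teams" is an INSERT rather than a migration.
|
|     import sqlite3
|
|     db = sqlite3.connect(":memory:")
|     db.executescript("""
|         CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT);
|         CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
|         -- join table instead of users.team_id
|         CREATE TABLE memberships (
|             user_id INTEGER REFERENCES users(id),
|             team_id INTEGER REFERENCES teams(id),
|             PRIMARY KEY (user_id, team_id)
|         );
|     """)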
| vaidhy wrote:
| The more general rule I follow is that there are only three
| cardinalities in software - 0, 1, and many. Default should be
| many. Any other limits you put is likely to be incorrect.
| swyx wrote:
| also a list of known things that are cardinality 1 could be
| valuable, since someone coming across this advice
| might swing too far over to the always-many side
| kragen wrote:
| This is appealing, but taking this advice seriously means
| that all your relationships are many-to-many, until you
| optimize some of them to be one-to-many or one-to-one.
|
| This clashes extremely strongly with the currently popular
| approaches to software development (OO, procedural, Lisp/ML-
| style functional, SQL, etc.), which have an extremely strong
| default of cardinality 1, inflicting lots of extra work on
| you for many-to-many relationships. You'd never have a member
| of a struct or a class start out as an int or a string; they
| would always be int lists, string lists, etc. Points would
| have a list of X coordinates rather than a single X
| coordinate; interpreters would interpret a list of programs;
| 3-D objects would have a list of transform matrices rather
| than a single transform matrix; files would have a list of
| owners.
|
| (And of course in languages like C and Golang when you nest a
| Foo object inside a Bar object, you are foreclosing the
| possibility of that Foo having the same relationship with
| some other Bar object as well; you need pointers.)
|
| Systems that do support that kind of thing in one way or
| another include Prolog, miniKANREN, the APL family, and Pick.
| But Prolog and APL only support it to a very limited extent,
| and I've never used Pick or miniKANREN.
| vaidhy wrote:
| I am not sure I understood your argument, because of where
| these get applied. I generally mean to apply these when we
| talk about business constraints - in my field of supply
| chain optimization, these kinds of things come up often.
| How many trucks do you plan between nodes? If the answer is
| 2, it means many. Being explicit about 0 and 1 means that
| if that constraint is broken, it calls for a more thorough
| code analysis, as it would lead to some unexpected breaks in
| various systems.
| kragen wrote:
| I certainly agree that 2 should be "many".
| swyx wrote:
| what does a cardinality of 0 mean? i dont even think an
| attribute should be on the schema if the cardinality is 0? or
| am i experiencing a brain fart and missing something obvious
| Sohcahtoa82 wrote:
| This comment reads like it's trying to disguise pedantry as
| confusion, which is an obnoxious trend on HN.
|
| Keeping with the original "user on a team" example,
| allowing a cardinality of zero means that you support a
| user not being on a team at all. This may be a
| bastardization of the definition of "cardinality", but I
| think you understood that.
| swyx wrote:
| no i didnt, and it was a sincere question. im the guy who
| wrote the post and i didnt understand cardinality zero
| lol.
| vaidhy wrote:
| It is what swyx said... cardinality not in the
| mathematical sense, but whether it does not exist,
| there is exactly one instance of it, or there are many
| instances of it. Going from one to another means
| refactoring/thorough testing, as there would be side
| effects all through the system.
| kotlin2 wrote:
| I don't think you'll need most of these things. My metric would
| be whether or not your app would continue to function without
| these things. Using this metric:
|
| Kill switch: may be required, but unlikely.
|
| Automated deploys: definitely not necessary. Your app can still
| function if you deploy manually.
|
| CI: definitely not necessary. Your app can still function without
| automatically run integration tests. It can run without any
| integration tests at all.
|
| API pagination: depends on the API.
|
| Detailed API logs: not necessary. You can add logging as problems
| arise. Even when I add detailed logging ahead of time, I often
| find I missed the one crucial piece of information needed to make
| using the logs possible during an investigation.
|
| SQL query dashboard: not necessary. You can easily use a script
| interface or CLI that stores history.
| jaywalk wrote:
| The article describes precisely why proactively adding these
| things is (probably) necessary. For example, the kill switch:
| yeah, you hope you won't need it. But if the time comes that
| you do, it's too late to add it.
| kotlin2 wrote:
| You can use that argument when adding anything.
| xapata wrote:
| > I often find I missed the one crucial piece of information
| needed to make using the logs possible during an investigation.
|
| This may be a bit of confirmation bias, or ... I'm not sure
| what the right term is. The times you didn't have the necessary
| information stick out in your memory, because they were serious
| problems. The times you did have the right information logged
| you solved the problem quickly and it never became a fire.
| muglug wrote:
| > having a mechanism that provides detailed logs--including the
| POST bodies passed to the API--is invaluable.
|
| He includes a caveat later to avoid logging PII, but this should
| have a flashing red warning sign attached.
| adamisom wrote:
| Is a logging service/pipeline that removes* PII the only real
| high-level solution?
|
| (* or 'does its best' to remove PII; I feel like there's an
| immovable object / unstoppable force thing if the answer is "0%
| PII _ever_!" because that conflicts with the need to log as
| many things as you can)
| bob1029 wrote:
| We do explicit redaction of logs forwarded to us so that we
| don't have to see any end-customer information. The redaction
| process occurs automatically in the customer's secure
| environment, so theoretically it is clean from our
| perspective.
|
| This is indeed a very tricky process, but we have it close
| enough to make regulators happy. Most of our log information
| is stored in SQLite and XML, so we can do a lot of parsing
| magic to achieve determinism.
| 0xbadcafebee wrote:
| You shouldn't log as many things as you can. You should only
| log useful things, when they're needed, along with context.
| Otherwise you have too much noise and not enough signal, and
| it's easier to leak PII.
|
| Applications should have a logging interface that can
| identify and conceal sensitive information, and data stored
| in the app/database/etc should have a type that can be
| denoted as sensitive. You will eventually also want to reduce
| the amount of logs you generate, so it's useful to have
| logging levels the same way operating systems do.
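|
| A minimal sketch of such an interface using Python's standard
| logging module (the sensitive field names are illustrative
| assumptions, not a complete list):
|
|     import logging
|
|     SENSITIVE = {"password", "email", "ssn"}  # assumed sensitive keys
|
|     class RedactingFilter(logging.Filter):
|         def filter(self, record):
|             # Conceal sensitive keys in any structured payload
|             # attached to the record via extra={"payload": {...}}.
|             payload = getattr(record, "payload", None)
|             if isinstance(payload, dict):
|                 record.payload = {
|                     k: "[REDACTED]" if k in SENSITIVE else v
|                     for k, v in payload.items()
|                 }
|             return True
|
|     logger = logging.getLogger("api")
|     logger.addFilter(RedactingFilter())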
| [deleted]
| rfrey wrote:
| I wonder about this a bit. If I'm already storing all the PII
| in my database, probably on other people's computers via a
| managed database or at least a cloud server, why are logs
| different? Is it because they don't receive the same security
| scrutiny, e.g. any dev can view them versus having special
| permissions for the database?
|
| Or is there more there than authorization that I should be
| thinking about?
|
| (Obviously I'm not talking about logging cleartext passwords,
| which don't belong in the database either.)
| michaelmior wrote:
| I believe this also plays into "right to be forgotten"
| legislation such as in the GDPR. Suppose you log PII and
| then a user requests all their PII be deleted. I believe you
| would then also be required to scrub those logs instead of
| just dropping records from a database.
| throwaway74829 wrote:
| Permissions is the big one.
|
| Second is potential for DDoS (either intentionally or
| unintentionally).
|
| Third is possibility of "oopsies" via me intentionally or
| unintentionally including my passwords/sensitive info/what
| have you in the POST body. Now you have to add a branch to look
| for and scrub sensitive info in your logger -- otherwise my
| PII has now been logged (and if I were a massive asshole
| looking for a quick payday, I could throw up a fuss).
|
| It's fine in dev, but shouldn't be in prod. Too much
| liability.
| oandhjakk wrote:
| I believe the main concern here, from dealing with clients
| that have mandated no PII in logs, is both authorisation as
| well as control. If it's in your service logs, then it could
| be in your Splunk logs, it could be in a storage repository
| and it could be in your requests that you send to the service
| provider to troubleshoot some issue.
|
| Unless there is a valid use case for logging PII (and I can't
| think of any which can't be engineered around) then I think
| it's best to avoid it in principle.
|
| I think of it as the same as logging passwords, keys or
| tokens.
| jjice wrote:
| > A bookmarkable interface for executing read-only SQL queries
| against your database
|
| We use Metabase at my work and I love it. It handles
| visualizations out of the box and I can write SQL like my heart
| desires, while still being able to help out our designers,
| customer success, and sales needs. Their "questions" (what they
| call a query) can be parameterized and added to dashboards and
| everything works so well.
|
| Also one of the few large projects I see that's written in
| Clojure.
| jmull wrote:
| This wouldn't be possible in a lot of the projects I work on.
|
| The problem is privileges -- for example, the data you can see
| depends on whether you're logged in or not, and your group and
| role when logged in.
|
| What I've done instead is provided an interface (bookmarkable)
| to a set of "rowsets", where the rowset has a name and set of
| columns, where the columns are filterable, sortable, and/or
| selectable. Directly behind the rowset is SQL, of course
| (and/or a view or table), so it's not that much less flexible
| than SQL directly, but it provides a point where additional
| filter conditions can be reliably added to enforce privileges
| (there's also a dev option to see the SQL, if that's needed).
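|
| Roughly, in sketch form (Python here, hypothetical names; the
| real thing obviously has more to it):
|
|     class Rowset:
|         def __init__(self, base_sql, columns):
|             self.base_sql = base_sql     # SELECT ... FROM ... (no WHERE)
|             self.columns = set(columns)  # filterable/sortable whitelist
|
|         def query(self, filters, privilege):
|             priv_sql, params = privilege  # always-appended conditions
|             clauses, params = [priv_sql], list(params)
|             for col, value in filters.items():
|                 if col not in self.columns:
|                     raise ValueError(f"cannot filter on {col}")
|                 clauses.append(f"{col} = ?")
|                 params.append(value)
|             return (self.base_sql + " WHERE " + " AND ".join(clauses),
|                     params)
|
|     orders = Rowset("SELECT id, total FROM orders", ["status", "user_id"])
|     sql, params = orders.query({"status": "paid"}, ("org_id = ?", [42]))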
| jerf wrote:
| A few more in my experience:
|
| Database abstractability. Here I don't mean "type of database"
| like Postgres vs. MySQL vs. Mongo, but making sure in your system
| you have the ability to target and isolate the entire DB.
| Integration testing is both vital and slow; as you grow you will
| be glad you have the ability to run multiple of them at once
| because they don't interact. With MySQL & Postgres this can be as
| easy as making sure your code takes the database name as a
| _parameter_ instead of hard coding it. If your database lacks
| this feature, take the time to abstract out table names so you
| can use a certain prefix/suffix for a given test.
|
| If you can't quickly recreate the schema and populate with some
| known set of test data, take the time to write code that will
| take an existing DB and truncate it back to start or whatever.
| You need those integration tests, and you need them to not take
| two hours to run. Get that going at the beginning. I've not yet
| gotten to the point I need prepopulated schema which is handed
| out by a central manager, and then the cleanup is run out of the
| actual test cycle, but I'd be willing to go that far. In 2022 it
| is likely not a problem at all to have literally a thousand
| copies of your schema lying around on some test DB server, just
| to make the tests run fast. Use some versioning solution to make
| sure they're the right version.
|
| There is a _LOT_ of variation in databases here; you will need to
| tune your process to your DB. In some, schema creation is wicked
| fast and you can just write all your tests to write out new ones
| without a problem. In others it is slow but very parallelizable.
| In others it may be just plain slow, in which case the code to
| take existing schema and reset them is more valuable. In yet
| others truncation may be difficult and/or slow due to loose
| serialization requirements. Do whatever it takes to make your
| integration tests fast, or at least, as fast as possible. Do not
| just settle for even simple tests taking 10 minutes because you
| naively do a slow thing. You really need these to go fast.
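|
| A rough sketch of the "database name as a parameter" idea with
| pytest and Postgres (assumes psycopg and a local superuser
| connection; adapt to your own setup):
|
|     import uuid
|
|     import psycopg
|     import pytest
|
|     ADMIN_DSN = "postgresql://localhost/postgres"
|
|     @pytest.fixture
|     def test_db():
|         # Each test gets its own throwaway database, so integration
|         # tests can run in parallel without interacting.
|         name = f"test_{uuid.uuid4().hex}"
|         with psycopg.connect(ADMIN_DSN, autocommit=True) as admin:
|             admin.execute(f"CREATE DATABASE {name}")
|         try:
|             yield f"postgresql://localhost/{name}"  # pass this DSN in
|         finally:
|             with psycopg.connect(ADMIN_DSN, autocommit=True) as admin:
|                 admin.execute(f"DROP DATABASE {name}")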
|
| Authentication and authorization; the earlier you work in _some_
| solution, the happier you will be. Sometimes this is easy to
| retrofit, sometimes it really isn't. Even if you start with
| authentication and a list of who can do what in the very early
| phases it'll still prevent you from accidentally depending on
| having no auth at all.
|
| Write your code to be thread/async safe from the beginning, even
| if you have no use for it yet. You don't want to be in a position
| where you need it but you've got a 2-year-old code base now built
| on the deep assumption of single-threading, and it only gets
| worse as the code base gets larger. It's a slight marginal
| expense to keep it in mind, it's a project-destroying nightmare
| to add it back in later.
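|
| Even something as small as this (a toy Python sketch) is the
| kind of habit that's cheap to keep now and miserable to retrofit:
|
|     import threading
|
|     class Counter:
|         def __init__(self):
|             self._lock = threading.Lock()
|             self._value = 0
|
|         def increment(self):
|             with self._lock:  # guard shared mutable state from day one
|                 self._value += 1
|                 return self._value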
|
| In static languages, get the types declared as early as possible.
| Get away from (int, int, int, string, string, bool, bool)
| function signatures as quickly as you can. It's not exactly
| impossible to retrofit better types later, but it's at the very
| least a pain in the bum disproportional to the cost of having
| declared them correctly in the first place. Even if your first
| stab at real types is wrong, it's _still_ easier to fix them this
| way because you have the type name as a hook for your IDE, or in
| the worst case, for sheer grep-based refactoring. (Even in
| dynamic languages I'd recommend this, but they have more options
| and the analysis is more complicated. Still, I'd suggest having a
| "user" that knows how to render itself as a string is better than
| passing around the user's name or something as a string
| everywhere and expecting every bit of code that needs to do some
| user-object-method-y thing to have to look it up themselves.)
| schwartzworld wrote:
| > Still, I'd suggest having a "user" that knows how to render
| itself as a string is better than passing around the user's
| name or something as a string everywhere and expecting every
| bit of code that needs to do some user-object-method-y thing to
| have to look it up themselves.
|
| For some reason this seems to be the best kept secret in
| software development. Type-driven development lives somewhere
| at the intersection of OOP and FP. Here on HN every time
| TypeScript comes up, people claim to prefer code comments or
| some other nonsense that doesn't give you any of the benefits,
| because they see types as (userName: string, userAge: number)
| and not User { name: string, age: number }
| kroltan wrote:
| Yes, type-driven development saves so much hassle, both in
| ingesting user-provided data (parsing request bodies etc) as
| well as limiting the API surface of what you can do to
| things.
|
| Sadly, one very useful feature is missing from most
| languages, and that is the ability to make opaque or
| partially transparent wrapper types. For example in C# before
| record types, you'd need about 30 lines of boilerplate if you
| wanted to make a wrapper for an identifier that can only be
| compared for equality.
|
| Wrapper and rich types allow writing shared behaviours for
| those specific kinds of information, and greatly reduce the
| "accidental API surface" of primitive types that plagues us
| with so many bugs and data leaks.
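|
| In Python terms (just a sketch; the C# record version is
| analogous), a frozen dataclass already gets you most of the way:
|
|     from dataclasses import dataclass
|
|     @dataclass(frozen=True)
|     class UserId:            # illustrative wrapper for an identifier
|         value: int
|
|     a, b = UserId(1), UserId(1)
|     assert a == b            # equality works
|     assert len({a, b}) == 1  # hashable, usable as a dict/set key
|     # a + 1 -> TypeError: the accidental int API surface is gone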
| AndreasHae wrote:
| I'm interested in the reasoning why API pagination is hard to add
| later, or rather what the benefit is of "leaving space" in the
| API response while in reality returning all items at once.
|
| Wouldn't you have to adapt the way the frontend consumes the API
| either way, to keep track of the current list state? Or is that
| what the author meant, that you should always write your list UIs
| as if they consumed a paginated API?
| justin_oaks wrote:
| I agree that the author wasn't quite clear on this point.
|
| If you control both the client and the API, then pagination is
| something you can add in later. But setting up pagination early
| may make sense if the client isn't in your control.
|
| The author gave the example of an old mobile app that users
| don't update. The client is also out of your control if it's a
| public API that users call with whatever client code they
| choose.
|
| That said, I'm not sure how setting up "fake" pagination is
| helpful if the client isn't already coded to use pagination
| responses from the server. So I'll have to assume the author
| means that any client you provide already has pagination logic
| built-in.
| michaelmior wrote:
| This is a narrow view, but one example is that if you're using
| JSON and your API response looks like this:
|
| [{id: 1, ...},...]
|
| Then you can't easily add pagination without changing the
| structure of the response. Whereas if you have something like
| this:
|
| {results: [{id: 1, ...},...]}
|
| then you can add in whatever other properties you need and this
| change can be made without immediately breaking any existing
| applications. Of course, it's still true that you'll have to
| rewrite the frontend to actually implement pagination, but it
| does do something for backward compatibility. You could
| probably still have old versions of the application at least
| display the first page of results without changes.
| jmull wrote:
| That's the best interpretation I could come up with as well.
|
| I think the general idea is to leave room in distributed APIs
| to add metadata later. It's not really just about potential
| pagination, but leaving yourself a way to add/change
| functionality later without updating all the clients in
| lockstep with the server (always awkward and there's never a
| way to pull it off 100% cleanly).
| simonw wrote:
| Yeah, that's exactly what I was getting at here.
| tdeck wrote:
| In this case, it would often be better to make a breaking
| change than to have a client that thinks it's fetching all
| the items but isn't. For a frontend application sure, it
| might be fine that the user only sees one page, but it might
| also be a big problem. For backend APIs you absolutely don't
| want to add a new pagination mechanism that clients don't
| know about.
| eyelidlessness wrote:
| There are reasons not to return a bare array (eg they're
| executable JavaScript), but this one isn't as strong IMO. It's
| a tiny bit more work, one time, for clients to parse Link
| headers, but once done they're an excellent way to convey
| pagination (and a whole lot of other neat metadata about
| related resources!). See, for example, GitHub's API[1].
|
| 1: https://docs.github.com/en/rest/guides/traversing-with-
| pagin...
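|
| With the Python requests library, for example, following Link-
| header pagination is a short loop (the repository path here is
| just an example):
|
|     import requests
|
|     url = "https://api.github.com/repos/simonw/datasette/issues"
|     while url:
|         response = requests.get(url)
|         for issue in response.json():
|             print(issue["number"], issue["title"])
|         # requests parses the Link header into response.links
|         url = response.links.get("next", {}).get("url")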
___________________________________________________________________
(page generated 2022-10-17 23:01 UTC)