[HN Gopher] Eradicating N+1s: The Two-Phase Data Load and Render...
___________________________________________________________________
Eradicating N+1s: The Two-Phase Data Load and Render Pattern in Go
Author : gmcabrita
Score : 21 points
Date : 2024-05-28 20:23 UTC (2 hours ago)
(HTM) web link (brandur.org)
(TXT) w3m dump (brandur.org)
| andrscyv wrote:
| So a super complicated workaround instead of just writing SQL
| queries or using a query builder?
| lainga wrote:
| Or else, if the post is talking about a "public-facing API
| resource", can someone tell me why the API wouldn't implement
| querying for multiple records of the same type at once? It just
| seems to me that choosing between getting 1 owner and "get
| _ALL_ owners" (as TFA puts it) is like a law of the excluded
| middle.
| metadat wrote:
| It's tricky because in some cases you might be able to batch
| all the queries up front, but in others you will only know
| the IDs you need to fetch after you get one or more
| intermediate results back and apply business logic to arrive
| at a decision.
|
| As of today there's no silver bullet beyond having a solid,
| first-principles understanding of your database and related
| infrastructure.
| Groxx wrote:
| For detecting rather than preventing duplicates, I'm fond of this
| pattern in Go:
|
| At "entry" to your code, add a mutable "X was called" tracker to
| your context. Anywhere you want to track duplicates,
| insert/increment something in that tracker in the context. And
| then when you exit, log the duplicates (at the place where you
| put the tracker in).
|
| It's reasonably implicit, works for both tracking and implicitly
| deduplicating (turn it into a cache rather than a counter and
| voila, lazy deduplication*), and it's the sort of thing that all
| your middle layers of code don't need to know anything about. As
| long as they forward contexts correctly, which you REALLY want to
| do all the time anyway, it Just Works(tm).
|
| *: Obviously you can go overboard with this; there are
| read-after-write concerns in many cases, etc., which this
| article's "prevent by design" structure generally handles better
| by making the phases of behavior explicit. But when it works,
| it's quite easy.
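
A minimal Go sketch of the context-tracker idea described above. Every name here (callTracker, withTracker, track, loadOwner, handleRequest) is hypothetical and chosen for illustration; it is one way to do this under those assumptions, not code from the article or from the commenter.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// callTracker counts how often each key was requested during one request.
// All names in this sketch are hypothetical.
type callTracker struct {
	mu     sync.Mutex
	counts map[string]int
}

// trackerKey is an unexported context key type, so it cannot collide
// with keys defined by other packages.
type trackerKey struct{}

// withTracker attaches a fresh tracker to ctx at the entry point.
func withTracker(ctx context.Context) (context.Context, *callTracker) {
	t := &callTracker{counts: make(map[string]int)}
	return context.WithValue(ctx, trackerKey{}, t), t
}

// track records one call for key. It is a no-op when no tracker is
// attached, so middle layers only need to forward ctx.
func track(ctx context.Context, key string) {
	t, ok := ctx.Value(trackerKey{}).(*callTracker)
	if !ok {
		return
	}
	t.mu.Lock()
	defer t.mu.Unlock()
	t.counts[key]++
}

// duplicates returns the keys that were tracked more than once.
func (t *callTracker) duplicates() map[string]int {
	t.mu.Lock()
	defer t.mu.Unlock()
	dups := make(map[string]int)
	for k, n := range t.counts {
		if n > 1 {
			dups[k] = n
		}
	}
	return dups
}

// loadOwner stands in for a real database call.
func loadOwner(ctx context.Context, id int) {
	track(ctx, fmt.Sprintf("owner:%d", id))
}

// handleRequest is the "entry": it installs the tracker, runs the work,
// and reports duplicates on the way out.
func handleRequest(ownerIDs []int) map[string]int {
	ctx, t := withTracker(context.Background())
	for _, id := range ownerIDs {
		loadOwner(ctx, id)
	}
	return t.duplicates()
}

func main() {
	// Owner 1 is loaded three times, so it is the only duplicate reported.
	fmt.Println(handleRequest([]int{1, 2, 1, 3, 1}))
}
```

Swapping the counter map for a map of loaded values would turn the same structure into the lazy cache the comment mentions, with the read-after-write caveats it notes.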
| hinkley wrote:
| I've always been sort of fond of 1 + 1. It's too often the case
| that there's a popular query that doesn't even need the child
| data to function, and unless you have some elaborate caching
| mechanism it would be a shame to pay the full cost of the join or
| however you want to implement it.
|
| Making one query that returns the base data and a second that
| pulls all of the associated data works often enough.
|
| It's only when you need to pull M individual records and their
| associated data that you _might_ end up with M + 1 queries, if
| you can't work out client-side grouping for some esoteric reason.
| But you've reduced the exponent of the fanout by 1, which can
| hold you for a long time. Years, even.
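
The "1 + 1" approach with client-side grouping might look roughly like the following Go sketch. The Post and Comment types and the fakeDB (an in-memory stand-in that only counts queries) are assumptions made for illustration; a real version would issue one SELECT for the parents and one `WHERE post_id IN (...)` style query for the children.

```go
package main

import "fmt"

// Post and Comment are hypothetical parent and child records.
type Post struct {
	ID    int
	Title string
}

type Comment struct {
	PostID int
	Body   string
}

// fakeDB stands in for a real database and counts the queries issued.
type fakeDB struct {
	posts    []Post
	comments []Comment
	queries  int
}

// listPosts simulates "SELECT id, title FROM posts".
func (db *fakeDB) listPosts() []Post {
	db.queries++
	return db.posts
}

// commentsForPosts simulates
// "SELECT post_id, body FROM comments WHERE post_id IN (...)".
func (db *fakeDB) commentsForPosts(postIDs []int) []Comment {
	db.queries++
	wanted := make(map[int]bool, len(postIDs))
	for _, id := range postIDs {
		wanted[id] = true
	}
	var out []Comment
	for _, c := range db.comments {
		if wanted[c.PostID] {
			out = append(out, c)
		}
	}
	return out
}

// loadPostsWithComments issues exactly two queries regardless of how
// many posts there are, then groups the children client side.
func loadPostsWithComments(db *fakeDB) ([]Post, map[int][]Comment) {
	posts := db.listPosts() // query 1: base data
	ids := make([]int, 0, len(posts))
	for _, p := range posts {
		ids = append(ids, p.ID)
	}
	byPost := make(map[int][]Comment)
	for _, c := range db.commentsForPosts(ids) { // query 2: all children
		byPost[c.PostID] = append(byPost[c.PostID], c)
	}
	return posts, byPost
}

// newFakeDB returns a small fixed data set for the demo.
func newFakeDB() *fakeDB {
	return &fakeDB{
		posts:    []Post{{1, "first"}, {2, "second"}, {3, "third"}},
		comments: []Comment{{1, "a"}, {2, "b"}, {1, "c"}},
	}
}

func main() {
	db := newFakeDB()
	posts, byPost := loadPostsWithComments(db)
	for _, p := range posts {
		fmt.Printf("%s: %d comments\n", p.Title, len(byPost[p.ID]))
	}
	fmt.Println("queries:", db.queries)
}
```

A caller that only needs the base data can stop after the first query, which is the cost saving the comment is pointing at.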
| gregwebs wrote:
| Jet can automatically load joined objects into embedded Go
| structs: https://github.com/go-jet/jet/wiki/Query-Result-
| Mapping-(QRM...
|
| Depending on what you are doing there might be some duplication
| that you could remove by creating hash lookups as in this post,
| but I would reach for Jet first.
|
| sqlc supports embedding but not embedded slices?
| sigmonsays wrote:
| everything about this screams wtf to me
|
| engineering around deficiencies sometimes yields interesting
| results but this isn't one of them.
|
| I'd say this code is spaghetti.
| dfee wrote:
| This seems to be the dataloader pattern. There are
| implementations in many languages, but the idea is that you have
| a bunch of threads which declare their I/O needs, and then you 1)
| debounce and merge the requests (uniform access) and 2) cache the
| results so that later in the graph of calls you don't need to
| fetch already loaded data.
|
| Here's one impl: https://github.com/graphql/dataloader
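
A deliberately simplified Go sketch of the pattern described above: callers declare the keys they need, duplicate requests are merged into one batch, and results are cached for later calls. The loader type, Want, Dispatch, Get, and runDemo are hypothetical names, not the API of graphql/dataloader, and this version is synchronous and single-goroutine, whereas real implementations typically collect keys from concurrent callers.

```go
package main

import "fmt"

// batchFunc fetches many keys in one round trip (hypothetical signature).
type batchFunc func(keys []int) map[int]string

// loader collects keys, fetches them in one batch, and caches results.
// It is not safe for concurrent use; this is a single-goroutine sketch.
type loader struct {
	fetch   batchFunc
	cache   map[int]string
	queued  map[int]bool
	pending []int
}

func newLoader(fetch batchFunc) *loader {
	return &loader{fetch: fetch, cache: map[int]string{}, queued: map[int]bool{}}
}

// Want declares a need for key without fetching anything yet.
// Keys that are already cached or already queued are skipped.
func (l *loader) Want(key int) {
	if _, ok := l.cache[key]; ok || l.queued[key] {
		return
	}
	l.queued[key] = true
	l.pending = append(l.pending, key)
}

// Dispatch issues a single batched fetch for everything still pending.
func (l *loader) Dispatch() {
	if len(l.pending) == 0 {
		return
	}
	for k, v := range l.fetch(l.pending) {
		l.cache[k] = v
	}
	l.pending = nil
	l.queued = map[int]bool{}
}

// Get returns a previously loaded value, if any.
func (l *loader) Get(key int) (string, bool) {
	v, ok := l.cache[key]
	return v, ok
}

// runDemo returns the key list of each batch that reached the backend.
func runDemo() [][]int {
	var batches [][]int
	l := newLoader(func(keys []int) map[int]string {
		batches = append(batches, append([]int(nil), keys...))
		out := make(map[int]string, len(keys))
		for _, k := range keys {
			out[k] = fmt.Sprintf("owner-%d", k)
		}
		return out
	})
	for _, k := range []int{1, 2, 1, 3} { // key 1 is requested twice
		l.Want(k)
	}
	l.Dispatch() // one batch: 1, 2, 3
	for _, k := range []int{2, 4} { // key 2 is already cached
		l.Want(k)
	}
	l.Dispatch() // one batch: 4
	l.Want(1)
	l.Dispatch() // nothing pending, so no backend call
	return batches
}

func main() {
	fmt.Println(runDemo()) // prints [[1 2 3] [4]]
}
```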
___________________________________________________________________
(page generated 2024-05-28 23:00 UTC)