[HN Gopher] How not to blow up the production database
___________________________________________________________________
How not to blow up the production database
Author : pyrolistical
Score : 43 points
Date : 2021-10-15 18:23 UTC (4 hours ago)
(HTM) web link (blog.battlefy.com)
(TXT) w3m dump (blog.battlefy.com)
| tnorgaard wrote:
| Answer: Materalized Views.
|
| On a unrelated note: Still hoping for those automatically
| refreshed materalized views in PostgreSQL, ala what VoltDB has.
| marcosdumay wrote:
| Given how well they work on any non-specialised DBMS, I prefer
| Postgres to take their time and do it right (AKA, differently
| from everybody else).
| aledalgrande wrote:
| Another thing I'm waiting for in Postgres is lifting and
| decoupling from the connection limit...
| sojournerc wrote:
| TimescaleDB (psql extension) has these, specific to time-series
| however.
|
| https://docs.timescale.com/timescaledb/latest/how-to-guides/...
| BikiniPrince wrote:
| Yes, views are nice, but there is also a fair concept of not
| needlessly bogging down a table. Sure, they were making up
| data, but a flat table with stats, profile data and other
| easily external data is just bloat. Once you have an id then
| static fields can be retrieved from other services/data stores.
| tomrod wrote:
| I'm not sure I am following. Aren't materialized views just
| formal, cached results of a query? That wouldn't bog down a
| table.
| lazide wrote:
| I think their point is more 'don't store all that junk in
| your primary database and then do all your work on it there
| too if you can just stuff it somewhere else'. Which has
| pros and cons and depends a lot on various scaling factors.
| manquer wrote:
| Perhaps he means it will bog down on refresh.
| tomrod wrote:
| Maybe? Not sure.
| ericHosick wrote:
| > Still hoping for those automatically refreshed materialized
| views in PostgreSQL, ala what VoltDB has.
|
| Not exactly what you're hoping for and you probably already
| follow this pattern. pg_cron can help (and is now available in
| AWS RDS).
|
| ```sql CREATE EXTENSION IF NOT EXISTS pg_cron;
|
| CREATE MATERIALIZED VIEW IF NOT EXISTS
| activeschema.some_thing_cached AS ...;
|
| SELECT cron.schedule('some_thing_cached', ' _/ 5 _ * * *',
| $CRON$ REFRESH MATERIALIZED VIEW some_thing_cached; $CRON$ );
| ```
| tibiapejagala wrote:
| I think that the problem is when you have a materialized view
| which takes hours to refresh. We are lucky that 99% of our
| traffic is during 7-19 on weekdays, so we can just refresh at
| night, but that won't work for others.
|
| I don't know much about how postgresql works internally, so I
| just probably don't understand the constraints. Anyway as I
| understand, there are two ways to refresh. You either refresh
| a view concurrently or not.
|
| If not, then postgres rebuilds the view from its definition
| on the side and at the end some internal structures are
| switched from the old to the new query result. Seems
| reasonable, but for some reason, which I don't understand due
| to my limited knowledge, an exclusive access lock is held for
| the entire duration of the refresh and all read queries are
| blocked, what doesn't work for us.
|
| If you refresh concurrently, postgres rebuilds the view from
| its definition and compares the old and the new query result
| with a full outer join to compute a diff. The diff is then
| applied to the old data (like regular table
| INSERT/UPDATE/DELETE I assume), so I think you get away with
| just an exclusive lock and read access still works. There are
| two downsides to this, first that it requires a UNIQUE
| constraint for the join, second that the full outer join is a
| lot of additional work.
|
| I never had the time to test Materialize, but it seems to do
| what I want with its continuous refresh.
|
| I also thought about splitting the materialized view into
| two, one for rarely changing data and another one for smaller
| part of the data which changes daily. Then I would only have
| to refresh the smaller view and UNION ALL both materialized
| views in a regular view. Not sure how well will that work
| with postgres query planner.
| nkozyra wrote:
| If it's a one shot data compilation, you could use
| something like postgres' NOTIFY to trigger a listening
| external app.
| kccqzy wrote:
| I've been looking at Materielize for a while
| (https://materialize.com/). It can handle automatically
| refreshed materialized views. Last time I checked, it didn't
| support some Postgres SQL constructs that I use often, but I'm
| really looking forward to it.
___________________________________________________________________
(page generated 2021-10-15 23:00 UTC)