[HN Gopher] Features I'd Like in PostgreSQL
___________________________________________________________________
Features I'd Like in PostgreSQL
Author : todsacerdoti
Score : 73 points
Date : 2023-01-28 18:51 UTC (4 hours ago)
(HTM) web link (gilslotd.com)
(TXT) w3m dump (gilslotd.com)
| Alir3z4 wrote:
| I love PostgreSQL; it's the database I reach for even for really
| tiny projects and data needs where SQLite would do, but the one
| thing that annoys me is upgrading.
| captrb wrote:
| A built-in timestamp type that stores the UTC time, the
| originating IANA timezone name, and the timezone offset. Without
| loss.
|
| "it was 2:01am on November 5th, in America/Los_Angeles and the
| offset was -7"
| doctor_eval wrote:
| That's a weird use case but you could just create a UDT to do
| it. You need the timestamptz value, the timestamp and the
| timezone. Super expensive but ¯\_(ツ)_/¯
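| Something along these lines, maybe (untested sketch; type and
| column names are made up):
|
|     CREATE TYPE full_ts AS (
|         utc        timestamptz,  -- the absolute instant
|         tz_name    text,         -- IANA name, e.g. 'America/Los_Angeles'
|         utc_offset interval      -- offset in effect at that instant
|     );
|
|     CREATE TABLE events (occurred_at full_ts);
|
|     INSERT INTO events VALUES (
|         ROW('2023-11-05 02:01-07', 'America/Los_Angeles',
|             interval '-7 hours')::full_ts
|     );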
| crazygringo wrote:
| Why?
|
| You can do this yourself with two fields. And it would be
| extremely redundant to be constantly storing timezone together
| with offset (violates database normalization) -- really you'd
| want the second field to be a key pointing to an index of
| timezones with offsets in a separate table.
|
| But most of all it pretty much destroys normal assumptions
| regarding sorting and collation, and introduces an
| idiosyncratic "logic" of its own. Is 2:01am in
| America/Los_Angeles the same moment in time as 2:01am in
| America/Vancouver, or not? (It's easy to imagine several more
| edge cases regarding how equality and sorting would be
| calculated.)
| derefr wrote:
| 1. What would you use this for?
|
| 2. What is the collation (sorting algorithm) for this datatype?
| svieira wrote:
| "Without loss" requires the version of the timezone data too.
|
| "it was 2:01am on November 5th, in America/Los_Angeles and the
| offset was -7 (retrieved from ICU v97)"
| captrb wrote:
| Interesting point. I don't think I have that particular use
| case, but I can definitely imagine it. I'd think it matters
| especially with dates in the far future, where the timezone
| rules are altered between the time of recording and the time of
| the value itself?
| giraffe_lady wrote:
| Individual timezones are pretty stable but a handful change
| every year in some way, often switching how they observe
| DST or something similar. If you have a truly global
| userbase and this actually matters, you'll definitely hit
| them.
| valzam wrote:
| I hope Postgres adds a way to easily disable an index instead of
| deleting it. MySQL has this feature and it's very useful for
| testing whether an index can be deleted under production load.
| Transactional DDL is already amazing; disabling indexes would be
| a great addition for low-risk performance tuning.
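| What we can do today with transactional DDL, roughly (index and
| table names are made up, and note that DROP INDEX holds a heavy
| lock on the table until the transaction ends):
|
|     BEGIN;
|     DROP INDEX idx_orders_customer;
|     EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
|     ROLLBACK;  -- the index is back as if nothing happened
|
| MySQL 8's invisible indexes are closer to the real thing:
|
|     ALTER TABLE orders ALTER INDEX idx_orders_customer INVISIBLE;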
| whitepoplar wrote:
| I'd love to see:
|
| 1) _Simple_, easy to use, high-availability features.
|
| 2) Horizontal scalability without having to resort to an
| extension like Citus.
|
| 3) Built-in connection pooler.
|
| 4) Query pinning.
|
| 5) Good auto-tuning capability.
| seedless-sensat wrote:
| Points 1-3 exactly describe my pains operating Postgres at
| scale
| threeseed wrote:
| It is kind of ridiculous that the first three haven't been
| sorted by now.
|
| And over time it will increasingly relegate PostgreSQL to being
| for development only with production use being handled by wire-
| compatible databases e.g. Aurora.
| ahachete wrote:
| > 2) Horizontal scalability without having to resort to an
| extension like Citus.
|
| Just curious, what would having the solution in-core save you?
| Installation, sure, but that's a one-off, possibly in your
| deployment code. "CREATE EXTENSION citus" and adding it to
| postgresql.conf? Sure, but not too much work for me. The rest
| (commands to actually create the nodes, do the sharding itself)
| is something I cannot imagine being different or simpler with
| an in-core solution.
|
| What am I missing?
| threeseed wrote:
| It means that when you upgrade PostgreSQL you don't have to
| worry if the extensions are compatible and have been fully
| tested not just for functionality but security etc as well.
|
| And most importantly it means you don't have to worry if that
| extension will move to a freemium model which (a) often has
| important features out of your price range and (b) is
| generally unacceptable in enterprise environments.
| morrbo wrote:
| Never agreed with something so much in my life. Points 1-3
| in particular. I understand the logic of not wanting to
| "pollute" the core product with these features... but it's
| sort of past the point now where you're starting to expect this
| stuff in a modern database product (instead of add-ons, hacks,
| and outdated blog posts).
| LtWorf wrote:
| What would the proprietary vendors sell then?
| dudeinjapan wrote:
| Array foreign keys and indexes on arrays/JSON nested fields. When
| using MongoDB, having this eliminates most of the need for join
| tables. (*This may have already been implemented, I haven't used
| SQL recently.)
| aidos wrote:
| I think they have indexes on json (or any selectable thing?).
|
| You could probably implement the foreign key using generated
| columns these days. Ah I guess you mean instead of having the
| association table? That would definitely be nice in some cases.
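| For the index part, GIN and expression indexes already cover
| arrays and jsonb fields, roughly (table and column names
| invented):
|
|     CREATE INDEX ON posts USING gin (tag_ids);                -- int[] column
|     CREATE INDEX ON posts USING gin ((payload -> 'labels'));  -- jsonb sub-field
|
| The missing piece is a foreign key on each array element, i.e.
| something like `tag_ids int[] REFERENCES tags (id)`, which isn't
| valid today.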
| tezza wrote:
| 1) "NOTIFY" should be eligible for prepared statements to assist
| in the case where an untrusted value is part of the "channel" or
| "payload".
|
| I had to write some ugly stored procedures to attempt the
| sanitising myself (see the sketch below).
|
| 2) Query batching where queries are submitted to a database
| together and then results are returned. If this could be done
| without a connection per query that would be awesome. The use
| case is a global truth database needing 1 write to 20 reads... If
| the server is in the UK and the client is in Fiji then batching
| the 20 reads together is a massive latency win.
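| (Re 1: pg_notify() at least takes the channel and payload as
| ordinary arguments, so it can be parameterised where the NOTIFY
| statement can't; a sketch, with made-up names:
|
|     PREPARE notify_job (text, text) AS SELECT pg_notify($1, $2);
|     EXECUTE notify_job('jobs', 'payload built from untrusted input');
|
| but a first-class parameterisable NOTIFY would still be nicer.)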
| ntarora wrote:
| Can't you do number 2 with Postgres pipelining? Here's an
| article from bit.io (serverless Postgres w/ data repos); their
| example uses inserts but it also works for read ops.
|
| https://innerjoin.bit.io/the-distance-dilemma-measuring-and-...
| https://www.postgresql.org/docs/current/libpq-pipeline-mode....
| petepete wrote:
| I'd love to see native temporal tables at some point. Being able
| to step back in time without relying on extra code or a plugin
| would be great.
|
| https://learn.microsoft.com/en-us/sql/relational-databases/t...
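| The SQL Server flavour linked above looks roughly like this (from
| memory, not checked against a live server):
|
|     CREATE TABLE accounts (
|         id         int PRIMARY KEY,
|         balance    numeric NOT NULL,
|         valid_from datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
|         valid_to   datetime2 GENERATED ALWAYS AS ROW END NOT NULL,
|         PERIOD FOR SYSTEM_TIME (valid_from, valid_to)
|     ) WITH (SYSTEM_VERSIONING = ON);
|
|     SELECT * FROM accounts
|     FOR SYSTEM_TIME AS OF '2023-01-01';
|
| Something equivalent in Postgres without triggers or extensions
| would be lovely.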
| eliaspro wrote:
| Temporal tables are IMHO a profound gamechanger. Once you
| understand how to use them, you'll have a completely different
| view of your data models/db schemas and your data lifecycle.
|
| Having a temporal relation for every record makes so many things
| far easier and saves a lot of headaches that would otherwise
| have to be dealt with at the application level.
| captrb wrote:
| Half my job is doing this in reverse by accumulating deltas
| over streams and materializing the current state in various
| caching tiers. I'd love this to be a native feature in PG, but
| for any of my workloads it would need to support a lower-cost
| archival-grade storage tier a la a blob store.
| cosmotic wrote:
| I'd like to see easier upgrades between major versions, or at
| least published docker images that can do the upgrade. As it
| stands, it's a huge hassle to get both binaries installed on the
| same system. This is a task almost everyone has to do
| periodically; it should be easier.
| doctor_eval wrote:
| Yes, this is a huge problem. I wish PG could contain whatever it
| needs to do in-place upgrades from supported versions.
| Statically compiled builds of previous versions of pg_upgrade?
| foreigner wrote:
| Allow trailing commas in SELECT and IN. Pretty please?
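| For anyone wondering what the pain looks like, both of these
| fail today with a syntax error:
|
|     SELECT id, name, FROM users;
|     SELECT * FROM users WHERE id IN (1, 2, 3,);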
| valzam wrote:
| Omg yes please. I get that it might be slightly harder to parse,
| but any language that doesn't support it inevitably annoys me.
| In pg and JSON you end up with runtime errors; in programming
| languages it leads to bigger diffs than necessary (adding an
| entry to a static list is a 2-line change instead of 1 line).
| defanor wrote:
| > programming languages lead to bigger diffs than necessary
| (adding an entry to a static list is a 2 line change instead
| of 1 line)
|
| I think in ML-family languages it's fairly common to format
| like this, which avoids bigger diffs:
|
|     foo = [
|         1
|       , 2
|       ]
|
| In C-style languages something similar goes as well, though
| perhaps used less commonly:
|
|     int x[] = { 1
|               , 2 };
| giraffe_lady wrote:
| This is how I format sql code when I have control over it,
| have for years. People often recoil in horror, sometimes
| they ponder for a minute then switch over themselves.
|
| It works very easily and consistently to center around the
| space and lead with the comma:
|
|     select id
|          , name
|          , address
|          , ts
|       from table
|      where condition
|        and etc;
| lgas wrote:
| Doesn't this just move the problem to when you add a
| longer statement like "INNER JOIN"? Then the entire SQL
| statement becomes the diff?
| sergiotapia wrote:
| I wish there were an easy way for Postgres to tell me "hey you've
| been doing a ton of queries with this field that has no index."
| thinkingkong wrote:
| You can use the slow query log, which is very helpful. Lots of
| logging tools will ingest it properly and you can aggregate
| queries that are run frequently. Having an index or not is just
| one optimization.
| ahachete wrote:
| Many of those queries may run sub-second (typical lower-end
| value for log_min_duration_statement), so they won't get
| logged. Yet, if called at high frequency, they may represent a
| notable % of your CPU and I/O. The slow query log is not
| enough in many cases.
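| pg_stat_statements is the usual way to catch these even when
| they're individually fast (it has to be loaded via
| shared_preload_libraries; column names are per recent versions):
|
|     CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
|
|     SELECT query, calls, total_exec_time, mean_exec_time
|       FROM pg_stat_statements
|      ORDER BY total_exec_time DESC
|      LIMIT 20;
|
| It still won't tell you "this needs an index", though.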
| richbell wrote:
| I would also like to see "friendly" SQL syntaxes like what DuckDB
| offers, but I doubt they'd add it without an update to the
| standards themselves.
|
| https://duckdb.org/2022/05/04/friendlier-sql.html
| aftbit wrote:
| Oh thanks for sharing this! I love all of those, and most seem
| like they'd be easy sugar over the existing syntax. The biggest
| missing feature from those that I would really enjoy in data
| exploration tasks (though not in PROD) would be automatic JOIN
| ON selection based on foreign keys.
|
| Example:
|
|     SELECT users.id, COUNT(*)
|       FROM users
|       JOIN orders ON AUTO
|      WHERE orders.created_at > NOW() - '7 day'::interval
|      GROUP BY ALL
|
| This would only work if there was an obvious path to do the
| join. In this case, I'm imagining that the `orders` table might
| have a `user_id` column which is a foreign key into the `users`
| table.
| SigmundA wrote:
| >Reduce the memory usage of prepared queries
|
| Yes, query plan reuse like every other db. It still blows me
| away that PG replans every time unless you explicitly prepare,
| and even then it's per connection.
|
| Better full-text scoring is one that's missing from that list
| for me: TF/IDF or BM25 please. See:
| https://github.com/postgrespro/rum
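| What we get today, roughly (statement and table names made up):
|
|     PREPARE recent_orders (int) AS
|         SELECT * FROM orders
|          WHERE customer_id = $1
|          ORDER BY created_at DESC LIMIT 10;
|
|     EXECUTE recent_orders(42);  -- a generic plan may be reused
|                                 -- after a few executions
|
| and it all evaporates when the connection closes, which is part
| of why it interacts so badly with transaction-level poolers.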
| doctor_eval wrote:
| As I understand it this is supposed to be done in the client
| libraries rather than in the server. It's not that it doesn't
| reuse query plans, it just doesn't do it in core.
| thdc wrote:
| For running queries that modify data, I will start with BEGIN,
| run the query, maybe run some checks, then COMMIT or ROLLBACK
| depending on whether it did what I hoped it would do.
|
| Are there any downsides to that outside of more typing?
|
| Ah, I see the `--i-am-a-dummy` flag also affects queries with
| large result set sizes.
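| The manual version of that safety net, for reference (the
| statements are made up):
|
|     BEGIN;
|     UPDATE accounts SET plan = 'pro' WHERE team_id = 7;
|     SELECT count(*) FROM accounts WHERE plan = 'pro';  -- sanity check
|     COMMIT;  -- or ROLLBACK if the numbers look wrong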
| silverlyra wrote:
| Your comment led me to realize that I could run psql with -v
| AUTOCOMMIT=0 (or add that to ~/.psqlrc) to achieve most of the
| safety net I've been wanting. My fear has been forgetting the
| BEGIN.
| lordgilman wrote:
| (Author here) I think the purpose of the large result limiting
| is to avoid swamping the terminal with output. The whole set of
| --i-am-a-dummy features is to give a fighting chance to the
| hackers who are typing away without any forethought into a
| prompt in autocommit mode. Using transactions, copying and
| pasting complete queries, having your coworker double-check
| your work, etc. gives you enough foresight that you won't need
| it.
| doctor_eval wrote:
| I agree with the GP - rename "commit" to "save" if you have
| to, disable autocommit, and use a paginator. Voila! Dummy
| mode.
|
| I'm definitely not a PG dummy but that's what I do anyway.
| x4m wrote:
| > Reduce the memory usage of prepared queries
|
| FWIW Odyssey supports prepared statements in transaction pooling.
| t-writescode wrote:
| Loose index scan. Its lack is why your "select distinct" is slow.
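| The usual workaround is a recursive CTE that walks the index by
| hand, emulating a skip scan; roughly (table and column names
| invented):
|
|     WITH RECURSIVE vals AS (
|         (SELECT category FROM items ORDER BY category LIMIT 1)
|         UNION ALL
|         SELECT (SELECT i.category FROM items i
|                  WHERE i.category > v.category
|                  ORDER BY i.category LIMIT 1)
|           FROM vals v
|          WHERE v.category IS NOT NULL
|     )
|     SELECT category FROM vals WHERE category IS NOT NULL;
|
| It works, but it's the kind of thing the planner should be doing
| for us.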
| bornfreddy wrote:
| SHOW DATABASES; and SHOW TABLES;, please. DESC would be nice too.
| And let it work everywhere, not just in CLI (like \d & co. do).
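| The catalog equivalents we have to remember today (these do work
| from any client):
|
|     SELECT datname FROM pg_database WHERE NOT datistemplate;
|
|     SELECT table_schema, table_name
|       FROM information_schema.tables
|      WHERE table_schema NOT IN ('pg_catalog', 'information_schema');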
| leo8 wrote:
| This a thousand times. I have to ALWAYS google that one big
| query for showing me all the databases. (I almost always
| connect with pgAdmin)
| derefr wrote:
| > VALIDATE 1:m
|
| Rather than this, why not do more with existing table REFERENCES
| metadata? For example, why can't I have a covering index across
| data from two tables, joined through a foreign-key column with a
| pre-established REFERENCES foreign-key constraint against the
| other table, where the REFERENCES constraint then keeps
| everything in place to make that work (i.e. ensuring that
| vacuuming/clustering either table also vacuums/clusters the other
| together in a transaction in order to rewrite the LSNs in the
| combined index correctly, etc.)
| lordgilman wrote:
| (Author here) This is also a neat idea but wouldn't help with
| subqueries, views or other computed relations getting joined on
| each other which is a major motivation of the validation
| proposal.
| derefr wrote:
| I could see that being a motivation, but isn't that exactly
| the use-case where it'd also be a bad idea, overhead-wise?
|
| With a static DDL constraint on join shape, an assertion
| about such shape can be pre-validated at insert time, to be
| always-valid at query time, such that queries can then
| pass/fail such assertions during their "compilation" (query
| planning) step.
|
| Without such a static constraint, you have to instead insert
| a merge node in the query plan (same as what a DISTINCT
| clause does) in order to normalize the input row-tuple-stream
| and fail on the first non-normalized row-tuple seen.
|
| You _could_ pre-guarantee success in limited cases (e.g. it's
| going to be a 1:N join if the LHS of the join has a UNIQUE
| constraint on the single column being joined against); but
| you're not going to have those guarantees in most cases of
| "computed relations."
| jayski wrote:
| I'd like to have per-database WALs, instead of the WAL being for
| everything.
|
| I have no idea what this entails or if it's even possible; I only
| know it would make my life easier.
| forinti wrote:
| That's interesting. I have a separate cluster setup without
| archiving because some databases don't need it. It's so simple
| to run various clusters side-by-side on the same host that I
| don't think having compartmentalised WALs would be easier to
| manage.
| derefr wrote:
| Yeah, but I want cheap in-memory joins between the WAL-
| isolated datasets. I.e. "multi-world" MVCC concurrency, where
| a TX is locking in a separate min_xid for dataset A, B, C,
| etc. for the lifetime of the TX -- but without this being a
| big distributed-systems vector-clocks problem, because it's
| all happening in a single process.
|
| Why? Being able to run a physical replica that loads WAL from
| multiple primaries that each have independent data you want
| to work with, for one thing. (Yes, logical replication
| "solves" this problem -- but then you can't offload index
| building to the primary, which is often half the point of a
| physical replication setup.)
| bratao wrote:
| I'm waiting for a compact data structure such as RocksDB. I store
| TBs of data, and MySQL and MyRocks are miles ahead in this
| department compared to Postgres. I prefer to stay away from
| sharding and distributed solutions for now.
| riku_iki wrote:
| I am curious what specifically makes RocksDB compact compared
| to Postgres? How did you decide it is more compact?
| nightfly wrote:
| Postgres on top of ZFS with compression on gives about 4x more
| efficient storage for the numerically/time-series centric
| database I help manage
|
| Edit: Current on-disk size is 5.34TiB and logical size is
| 20.3TiB so closer to 3.8x savings really
| masklinn wrote:
| Join and index hints.
|
| I know the postgres devs don't like them, and that the query
| planner _should_ be good enough that they're not needed, but
| it's not, and it regularly fucks up.
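| Out of core, the pg_hint_plan extension gets you most of the way
| there with comment hints, something like this (table, index and
| column names made up):
|
|     /*+ NestLoop(o c) IndexScan(o idx_orders_created_at) */
|     SELECT c.name, count(*)
|       FROM orders o
|       JOIN customers c ON c.id = o.customer_id
|      WHERE o.created_at > now() - interval '7 days'
|      GROUP BY c.name;
|
| But of course that's yet another extension to install and keep
| compatible.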
| riku_iki wrote:
| You can usually manage this by dividing your query into
| subqueries, each creating some temp table, so you have control
| over how the joining actually happens.
| aeyes wrote:
| If you do this be careful what your temp_buffers is set to so
| that your temp tables don't spill to disk. If you are on a
| network file system, for example on AWS RDS, writing big (a
| few hundred MB) temp tables to disk will stall all
| transactions.
|
| The same is of course true when you have big joins that don't
| fit in work_mem, but the default size of this will be much
| larger.
|
| I can usually fix bad plans with CTEs, no need to get much
| fancier. And the problem is often caused by schema design
| where you have a mapping table of two tables in the middle
| and your join is N-M-M where the planner has no information
| about the relationship between the two outer tables.
| SigmundA wrote:
| Or there could just be a hint that says do it in source code
| order:
|
| https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-
| tr...
|
| There could also be hints that tell what kind of join to use
| and which indexes to use!
| derefr wrote:
| It's a shame, then, that there's no way to define at-first-
| purely-in-memory tables, which only "spill" to disk if they
| cause your query to exceed work_mem.
|
| Within PL/pgSQL, CREATE TEMPORARY TABLE is still (sometimes a
| lot!) slower than just SELECTing an array_agg(...) INTO a
| variable, and then `SELECT ... FROM unnest(that_variable) AS
| t` to scan over it. (And CTEs with MATERIALIZED are really
| just CREATE TEMPORARY TABLE in disguise, so that's no help.)
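| The trick in PL/pgSQL looks roughly like this (table and
| variable names invented):
|
|     DO $$
|     DECLARE
|         ids bigint[];
|     BEGIN
|         SELECT array_agg(id) INTO ids
|           FROM orders
|          WHERE created_at > now() - interval '7 days';
|
|         PERFORM count(*)
|            FROM unnest(ids) AS t(id)
|            JOIN order_items oi ON oi.order_id = t.id;
|     END $$;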
| taffer wrote:
| > It's a shame, then, that there's no way to define at-
| first-purely-in-memory tables, which only "spill" to disk
| if they cause your query to exceed work_mem.
|
| But isn't this exactly how temp tables work? A temp table
| lives in memory and only spills to disk if it exceeds
| _temp_buffers_ [1].
|
| [1] https://www.postgresql.org/docs/current/runtime-config-
| resou...
| riku_iki wrote:
| that one is for some access buffers and not actual temp
| tables?
| taffer wrote:
| No, _temp_buffers_ is for temp tables and materialized
| CTEs. Maybe you are confusing it with _shared_buffers_?
| riku_iki wrote:
| I am trying to say that your link says it is a buffer for
| accessing temp tables; it doesn't say the actual temp
| table is stored in that buffer and never flushed to
| disk.
| derefr wrote:
| Huh, I think you're right... but it's still slower! I've
| definitely measured this effect in practice.
|
| Just spitballing here -- I think the difference _might_
| come from where the _metadata_ required to treat the
| table "as a table" in queries has to be entered into,
| and the overhead (esp. in terms of locking) required to
| do so.
|
| Or, perhaps, it _might_ come from the serialization
| overhead of converting "view" row-tuples (whose contents
| might be merged together from several actual material
| tables / function results) into flattened fully-
| materialized row-tuples... which, presumably, emitting
| data into an array-typed PL/pgSQL variable might get to
| skip, since the handles to the constituent data can be
| held inside the array and thunked later on when needed.
| taffer wrote:
| Temp tables are per session, there shouldn't be any
| locking involved.
| derefr wrote:
| I would go further: give me a PL/ language (or an equivalent
| bytecode-abstract-machine abstraction) that lets me program --
| at least in a read-only capacity -- directly against the
| access-method handles, such that the DB's table heap-file
| pages, index B-tree nodes, locks, etc. are "objects" I can
| manipulate, pass to functions, and navigate graph-wise (i.e.
| ask a table for its partitions; ask a partition for its access-
| method; type-assert the access-method as a heap file; ask for
| the pages as a random-access array; probe some pages for row-
| tuples; iterate those, de-serializing and de-toasting them as a
| type-cast; and then implement some efficient search or sampling
| on those tuples. Basically the code you'd write in a PG
| extension to interact with the storage engine on that level,
| but limited to only "safe" actions against the storage, and so
| able to be directly user-exposed.)
|
| The closest analogy I know of to that, is how you work with ETS
| tables in Erlang. I want to send the RDBMS code that operates
| at that level!
|
| Actually, I presume that SQLite would necessarily have some
| low-level C interface that works like this -- but few people
| seem to talk about it/be aware of it compared to its high-level
| SQL-level interface.
| jiggawatts wrote:
| This is vaguely how the Microsoft "Jet" database engines
| work. Microsoft Exchange uses this low-level query authoring
| technique to achieve its scalability and performance goals.
| This generally makes its performance consistent and
| predictable.
|
| There were some attempts back in the early 2000s to move
| Exchange over to use the SQL Server RDBMS engine, and they
| added a bunch of features to enable this kind of low-level
| control. Not just join hints: you could _force_ specific
| query plans by specifying the plan XML document for that
| query. See: https://learn.microsoft.com/en-
| us/sql/t-sql/queries/hints-tr...
|
| This wasn't good enough however, and Exchange still uses the
| Jet database.
|
| Something that might be interesting is an RDBMS "as a
| library", where instead of poking it with ASCII text queries,
| you get a full programming API surface where you can do
| exactly the type of thing you propose: perform arbitrary
| walks through data structures, develop custom indexes, or
| whatever.
| cdcarter wrote:
| SQLite compiles SQL to bytecode, and then executes that
| bytecode against the database. However, there's no public
| interface for creating/running bytecode directly, instead of
| as a result of a compiled statement. You almost certainly
| COULD do what you're trying to achieve, but the SQLite
| authors have specifically called it out as a bad idea -
| https://sqlite.org/forum/info/c695cbe47b955076 - since
| the bytecode representation can change from release to release
| in a way that would only matter to the compiler (or to your
| weird hacked-in interface). Meaning, non-portable.
| SigmundA wrote:
| +1, planner is great and all, but I know better sometimes, let
| me tell it so and have it listen.
|
| Nothing like the planner deciding it knows better at some
| random time in production because of data changes.
| forinti wrote:
| I would like to see packages and tools to make migration from
| Oracle easier.
| alexisread wrote:
| Good luck with that! Oracle have never made anything that they
| can't extract money from. In this case, look at their attempts
| to kill LogMiner (which most CDC systems, e.g. Debezium, use) and
| force you to pay for GoldenGate hub/microservices/cloud.
|
| Not tried it, but https://github.com/bersler/OpenLogReplicator
| might work. Redpanda is easy to set up (Kafka-compatible
| alternative).
| GregDavidson wrote:
| Inline SQL functions, so I can modularize my code without a
| massive performance penalty. Foreign key references to system
| tables, so I can reliably store, e.g., function OIDs.
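| (The planner can already inline simple SQL functions when they
| meet a fairly long list of restrictions; a sketch with made-up
| names:
|
|     CREATE FUNCTION active_users() RETURNS SETOF users
|     LANGUAGE sql STABLE AS $$
|         SELECT * FROM users WHERE deleted_at IS NULL
|     $$;
|
|     SELECT count(*) FROM active_users();
|
| The wish, presumably, is for that to be dependable rather than a
| planner heuristic.)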
| Ralfp wrote:
| I would love to see more languages in their full-text search by
| default. But it looks like currently it's an "if you want a
| language, contribute it" kind of deal, and I've found this to be
| outside of my depth.
| ceeam wrote:
| Reordering columns.
| muhehe wrote:
| I don't know. I mean... I'd like that too, but is this really
| more important than the other features mentioned (in this thread
| or the article)? What's your reason? (I'm just curious)
___________________________________________________________________
(page generated 2023-01-28 23:00 UTC)